[DRBD-user] Secondary server works harder than primary

Wed Jun 8 02:06:57 CEST 2011

I have a two-machine DRBD setup, supposedly with identical hardware in
both machines. Each machine has 3 drives in software RAID-0, with DRBD
on the RAID partition. When I write large amounts of data via NFS to the
DRBD partition, the hard drive LED on the primary machine blinks slowly.
However, on the secondary machine, the LED is on solidly. I also ran
iostat on the drives:

Primary machine
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    4.93   25.52    0.00   69.30

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             349.00     34960.00       680.00      34960        680
sdb             330.00     35112.00       544.00      35112        544
sdc               6.00         0.00       144.00          0        144
md0            2517.00    105736.00      1920.00     105736       1920

Secondary machine
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    2.51    0.00    0.00   97.49

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             937.00         0.00     16316.00          0      16316
sdb             986.00         0.00     16248.00          0      16248
sdc             905.00         0.00     16324.00          0      16324
md0           12116.00         0.00     48464.00          0      48464

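Notice that the secondary machine's tps is much higher than the primary
machine's (md0: 12116 vs. 2517). Note also that the primary output is
reported in blocks and the secondary output in kB.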
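
As a rough cross-check of the figures above, the average size of each
transfer can be estimated by dividing throughput by tps. The sketch below
does this for the md0 rows only. It assumes the iostat "Blk" columns count
512-byte blocks; the function name is made up for illustration.

```python
# Figures copied from the md0 rows of the two iostat outputs above.
BLOCK_BYTES = 512   # assumption: iostat "Blk" columns are 512-byte blocks
KB_BYTES = 1024     # the secondary output is already in kB


def kb_per_transfer(tps, read_per_s, write_per_s, unit_bytes):
    """Average kB moved per transfer (reads plus writes divided by tps)."""
    total_kb_per_s = (read_per_s + write_per_s) * unit_bytes / 1024
    return total_kb_per_s / tps


primary_md0 = kb_per_transfer(2517.00, 105736.00, 1920.00, BLOCK_BYTES)
secondary_md0 = kb_per_transfer(12116.00, 0.00, 48464.00, KB_BYTES)

print(f"primary md0:   {primary_md0:.1f} kB per transfer")    # 21.4
print(f"secondary md0: {secondary_md0:.1f} kB per transfer")  # 4.0
```

Under that assumption the secondary's md0 averages about 4 kB per
transfer, while the primary's averages about 21 kB, so the higher tps on
the secondary goes together with much smaller transfers.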

The two machines have identical configurations:

resource hnlcsv {
	protocol	B;

	handlers {
		pri-on-incon-degr	"/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		pri-lost-after-sb	"/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		local-io-error	"/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
	}

	net {
		max-epoch-size	16000;
		max-buffers	16000;
		sndbuf-size	0;
	}

	disk {
		use-bmbv;
	}

	syncer {
		al-extents	3389;
		rate 50M;
	}

	on hnlcsv4 {
		device	/dev/drbd0;
		disk	/dev/md0;
		address	192.168.1.2:7789;
		flexible-meta-disk	internal;
	}

	on hnlcsv3 {
		device	/dev/drbd0;
		disk	/dev/md0;
		address	192.168.1.1:7789;
		flexible-meta-disk	internal;
	}
}

Does anyone know what could cause the secondary to work so hard? I
suspect the secondary machine is limiting throughput on my DRBD
partition, and that I could get much higher throughput if it did not
have to work this hard.

Thanks,
Chris Gouveia
