[DRBD-user] drbd 0.7 vs 8 latency

Lars Ellenberg lars.ellenberg at linbit.com
Wed Dec 31 16:18:38 CET 2008

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On Wed, Dec 31, 2008 at 12:52:41PM +0100, X LAci wrote:
> Dear all,
> 
> I ran into high latency with drbd 8.3.0, on the same hardware where drbd
> 0.7 was OK.
> 
> I created a test system yesterday: two Lenovo ThinkCentre PCs, P4
> 3.0 GHz, Intel chipset, 1 SATA disk in each PC, Gigabit Ethernet, one
> cable between them. I installed Debian Etch on each of them, with the
> drbd 0.7 that came with Etch. Drbd 8.3.0 was compiled by me.
> 
> One goal was to verify that a 0.7 -> 8 version upgrade goes without
> data loss. That succeeded; there was no data loss.
> 
> I have tested the performance as described in the performance tuning
> webinar.
> 
> Hdparm and dd showed that the disks are capable of reading and writing
> at about 62 MByte/sec. 
> 
> The disk latency for each node:
> 
> dd if=/dev/zero of=/dev/sda3 bs=512 count=1000 oflag=direct
> 
> node1: 512000 bytes (512 kB) copied, 0.158945 seconds, 3.2 MB/s
> node2: 512000 bytes (512 kB) copied, 0.158367 seconds, 3.2 MB/s
> 
> For initial sync, I put resync rate to 60 MByte/sec so that it finishes
> quickly. The actual resync rate was around 48 MByte/sec.
> 
> Throughput was OK with 8.3.0, bs=300M count=1, 60.3 MByte/sec.
> 
> Latency with drbd 0.7 was acceptable also:
> 
> dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
> 512000 bytes (512 kB) copied, 0.308893 seconds, 1.7 MB/s
> 
> Latency with drbd 8.3.0 was very bad:
> 
> dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
> 512000 bytes (512 kB) copied, 8.36032 seconds, 61.2 kB/s
> 
> It was not resyncing, there was no other activity on the PCs. 
> 
> I've also tried with drbd 8.2.6, same results. Also tried with Debian
> Etch 2.6.18 and 2.6.24-etchnhalf kernel, same results. 
> 
> drbd.conf was basically default; I only put in the disk and network
> parameters, changed the resync rate to 20 MByte/sec, and set
> al-extents 1201.
> 
> What could cause this problem?

drbd 0.7 has no idea about disk flushes or barriers,
so with volatile caches it may cause data loss on power outage.

drbd 8 by default cares very much about explicitly flushing
or inserting barriers into the data stream to avoid this problem.
however when running on huge (controller) caches, and if the
implementation of "flush" is indeed a full cache flush,
these frequent flushes can degrade performance considerably.

they are not even necessary when your cache is battery backed.

so if you are running on a "safe" device
(either a NON-volatile, battery-backed cache, or no cache at all),
consider turning these additional flushes off:

no-disk-barrier;
no-disk-flushes;
no-md-flushes;
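for example, in drbd 8.3 configuration syntax these go into the disk
section of drbd.conf (the resource name "r0" is a placeholder; merge
them into your existing resource definition):

```
resource r0 {
  disk {
    no-disk-barrier;
    no-disk-flushes;
    no-md-flushes;
  }
  # ... your existing disk/net/syncer settings ...
}
```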



-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


