[DRBD-user] Ultra slowness

Steven Maddox s.maddox at lantizia.me.uk
Tue May 9 08:31:39 CEST 2017

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hey,

We use DRBD at work in a very simple way... but since rebuilding the
setup we've found it runs much slower than before.

Setup is as follows...

Sites A and B each have a Debian 8 VM, and each site also has a physical
SAN.

These VMs use Open-iSCSI / multipath to connect to their respective
SANs (where 12 physical disks in RAID 6 make up one 30TB virtual disk,
and that virtual disk has a single partition with an ext4 filesystem).

The VMs replicate that partition between them using DRBD.  Site A is
always primary and Site B always secondary (unless a dragon takes out
Site A - long story).

Protocol 'A' (asynchronous) was chosen because I want writes to complete
at Site A immediately, with the data trailing over to Site B across the
WAN link.

The only reason the config below doesn't look like a WAN link is that
there's a VPLS between the sites (think of it as a layer 2 tunnel).
Using iperf3 I can see general throughput is very good, in the 700 to
900 Mbps region (roughly 85-110 MB/s)... but we only get about 800 K/sec
showing in drbd-overview.
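
In case it helps anyone suggest tuning, this is the direction I've been
poking at so far - the option names are from the DRBD 8.4 man pages, but
the values are my guesses for a high-latency gigabit link, not tested
recommendations:

resource archive1 {
        disk {
                # fixed resync rate; the rule of thumb seems to be
                # ~30% of the slower of network and disk bandwidth
                resync-rate 33M;
        }
        net {
                # larger TCP send buffer for a high-latency link
                sndbuf-size 10M;
                # commonly raised together for throughput, apparently
                max-buffers 8000;
                max-epoch-size 8000;
        }
}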

Also, I've chosen the DRBD 8.9.2 that comes with stock Debian 8.  I'm
willing to change that, but only if someone points out some massive
performance-hampering bug in it.  I prefer to stick with what Debian
stable ships unless there's a reason not to.

Also also... you'll notice there's no security on this (e.g. a shared
secret) as it's an entirely private link... I've only put the crc32c in
there because I'm under the impression you're supposed to have at least
some minimal way of checking the data was sent correctly?
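
From my reading of the docs (and I may well have this wrong), verify-alg
only sets the checksum used by a manual 'drbdadm verify' run - it
doesn't check replication traffic as it's sent.  If I wanted a
per-packet check, it looks like data-integrity-alg is the separate
option for that:

        net {
                verify-alg crc32c;          # used by 'drbdadm verify' online checks
                data-integrity-alg crc32c;  # checksums every packet on the wire (CPU cost)
        }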

I've kept the stock config files the way they come with Debian 8, which
is to say there is nothing in drbd.conf or global_common.conf
(everything is commented out by default, except for 'usage-count no').

I've just added my own as drbd.d/sanvirtualdisk.res ...

resource archive1 {
        protocol A;
        device /dev/drbd1;
        meta-disk internal;
        startup {
                wfc-timeout 0;
                degr-wfc-timeout 60;
                become-primary-on site-a;
        }
        net {
                verify-alg crc32c;
        }
        on site-a {
                disk /dev/mapper/sanvirtualdisk-part1;
                address 10.0.0.1:7789;
        }
        on site-b {
                disk /dev/mapper/sanvirtualdisk-part1;
                address 10.0.0.2:7789;
        }
}
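
For completeness, this is how I apply config changes between retests
(standard drbd-utils commands):

        drbdadm adjust archive1    # push the edited .res into the running resource
        drbd-overview              # then watch connection state and sync speed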

Any suggestions for performance/integrity improvements to this config
would be greatly appreciated - keep in mind my scenario.

One last thing as well (something else I've been researching for weeks
without finding a concrete answer): I'm wondering how I can tell Site A
to tell Site B which parts of the filesystem are marked as empty, so as
to speed up the initial sync.  I keep changing my config and
invalidating Site B to retest sync speeds - but it always seems to want
to sync the whole 30TB.  The virtual disks (on both sides) were
completely zeroed recently - so in theory DRBD should realise there are
large amounts of emptiness it could summarise (assuming it uses some
kind of compression) to help it copy quicker?  I've read TRIM would also
help - but I'm not sure that ability survives when you're using an iSCSI
disk.
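
The closest thing I've found in the docs so far is the "skip the initial
sync" trick for brand-new disks that are known to be identical (e.g. all
zeroes).  If I've read the 8.4 user guide right, it's something like
this, run once on the node that is to become primary, with both nodes
connected and freshly-created metadata (corrections welcome):

        # both backing disks are known identical (all zeroes),
        # so declare the initial sync already done:
        drbdadm new-current-uuid --clear-bitmap archive1

And for later resyncs, csums-alg in the net section apparently makes the
resync checksum-based, so blocks that already match don't get re-sent
over the WAN:

        net {
                csums-alg crc32c;   # only ship blocks whose checksums differ
        }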

Thanks for any advice you might have!

-- 
Steven Maddox
Lantizia


