[DRBD-user] slow sync speed of 12TB over 10GE

Lars Täuber taeuber at bbaw.de
Mon Jan 12 16:44:05 CET 2009


Hi Lars.

Lars Ellenberg <lars.ellenberg at linbit.com> schrieb:
> On Mon, Jan 12, 2009 at 03:58:17PM +0100, Lars Täuber wrote:
> > The metadata doesn't seem to be the problem. I hope this config puts the meta data onto a ramdisk:
> > # drbdadm dump
> > # /etc/drbd.conf
> > common {
> >     protocol               C;
> >     syncer {
> >         rate             1150M;
> >     }
> >     handlers {
> >         split-brain      "/usr/lib/drbd/notify-split-brain.sh root";
> >     }
> > }
> > 
> > # resource RAID61 on monosan: not ignored, not stacked
> > resource RAID61 {
> >     on monosan {
> >         device           /dev/drbd0;
> >         disk             /dev/md4;
> >         address          ipv4 10.9.8.7:7788;
> >         flexible-meta-disk /dev/ram0;
> >     }
> >     on duosan {
> >         device           /dev/drbd0;
> >         disk             /dev/md4;
> >         address          ipv4 10.9.8.6:7788;
> >         flexible-meta-disk /dev/ram0;
> 
> are you _sure_ that drbd uses that?
> (drbdsetup /dev/drbd0 show)
> you are aware that you'll get a full sync again?

Yes, I'm aware. It's for testing purposes only.
I just tested a ramdisk only setup:

# cat /proc/mdstat 
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] 
md5 : active raid6 ram3[3] ram2[2] ram1[1] ram0[0]
      1023872 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]

# drbdsetup /dev/drbd0 show
disk {
	size            	0s _is_default; # bytes
	on-io-error     	pass_on _is_default;
	fencing         	dont-care _is_default;
	no-disk-flushes ;
	no-md-flushes   ;
	max-bio-bvecs   	0 _is_default;
}
net {
	timeout         	60 _is_default; # 1/10 seconds
	max-epoch-size  	2048 _is_default;
	max-buffers     	2048 _is_default;
	unplug-watermark	128 _is_default;
	connect-int     	10 _is_default; # seconds
	ping-int        	10 _is_default; # seconds
	sndbuf-size     	131070 _is_default; # bytes
	ko-count        	0 _is_default;
	after-sb-0pri   	disconnect _is_default;
	after-sb-1pri   	disconnect _is_default;
	after-sb-2pri   	disconnect _is_default;
	rr-conflict     	disconnect _is_default;
	ping-timeout    	5 _is_default; # 1/10 seconds
}
syncer {
	rate            	1177600k; # bytes/second
	after           	-1 _is_default;
	al-extents      	127 _is_default;
}
protocol C;
_this_host {
	device			"/dev/drbd0";
	disk			"/dev/md5";
	meta-disk		internal;
	address			ipv4 10.9.8.7:7788;
}
_remote_host {
	address			ipv4 10.9.8.6:7788;
}

but the speed is still frustrating:

# cat /proc/drbd 
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by root at monosan, 2009-01-09 15:15:29
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r---
    ns:0 nr:781524 dw:781448 dr:0 al:0 bm:44 lo:20 pe:9847 ua:19 ap:0 ep:1 wo:b oos:242356
        [==============>.....] sync'ed: 76.4% (242356/1023804)K
        finish: 0:00:01 speed: 130,240 (130,240) K/sec



> >     }
> >     disk {
> 
> try add (introduced more recently:)
> 	no-disk-barrier;

OK, I'll test this on the disk RAID again.
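For the record, this is roughly what the disk section would look like with the suggested option added (a sketch only; option availability depends on the DRBD version, so check the drbd.conf man page for your build):

	# drbd.conf fragment -- barriers/flushes disabled, as suggested above
	disk {
	    no-disk-barrier;    # the newly suggested option
	    no-disk-flushes;    # already active, per the drbdsetup output
	    no-md-flushes;      # already active, per the drbdsetup output
	}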


> > What are the other possible bottlenecks?
> 
> md resyncing at the same time?  ;)

No, I'm sure it isn't.

> misaligned device offset?

I'm not sure what you mean, but the disk array uses whole disks (/dev/sd[c-r]), no partitions.

> secondary device performance?

Do you mean the secondary DRBD site? Both machines are built identically, run the same distribution, and are up to date.

> device latencies?

How do I measure this? The drives are Seagate SATA-II server disks (ST31000340NS).

> max-buffers (and max-epoch);
> al-size;

I'll have a look.
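A sketch of where those knobs live, with illustrative starting values only (the defaults are visible in the drbdsetup output above; suitable values depend on the hardware):

	# drbd.conf fragment -- buffer and activity-log tuning (illustrative)
	net {
	    max-buffers     8000;   # default 2048
	    max-epoch-size  8000;   # usually raised together with max-buffers
	}
	syncer {
	    al-extents      3389;   # default 127; larger means fewer metadata updates
	}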

> your sync-rate is set _much_ too high.  unit is _bytes_ not bits, we are
> storage guys.  and it is meant to be a _throttle_. increasing it way
> beyond physical limits does hurt, not help.

The 10GE network connection is dedicated to the synchronisation between the two DRBD nodes. There is no other load on this network (it really is only a crosslink).
So why is it too high?
> >     syncer {
> >         rate             1150M;
> >     }

That is 1.15 GB/s, or 9.2 Gbit/s. That should fit a 10G Ethernet link, shouldn't it? Or is there more overhead? BTW, I use jumbo frames of 9000 bytes.

Thanks
Lars


