[DRBD-user] [Q] What is causing drbd to be slow and cycle between a little fast and very slow?

Maurice Volaski mvolaski at aecom.yu.edu
Wed Feb 1 02:50:06 CET 2006



Thank you for your detailed response.

>Maurice Volaski wrote:
>>This was never answered. I'm beginning to wonder if this is how it 
>>is supposed to work....
>>
>
>It does seem a bit slow.
>

I'm also seeing this behavior on our "fast" set of servers, whose 
syncer line in drbd.conf is
syncer  { rate 16M; group 2; }

Given it is as slow as the slower server pair, I think that qualifies 
it as slow. =-O

>>My drbd's syncing speed is cycling between a little fast and very 
>>slow spending most of its time being slow.
>>
>>It is running on two near identical computers (IBM x340s).
>>Both are running Gentoo, which is completely up-to-date with kernel 
>>2.6.15 and drbd 0.7.14.
>>One has gigabit interface and the other 100 Mbit. Both show full 
>>duplex on their respective interfaces and on the switch. The 
>>underlying drives are SCSI via a ServeRAID adapter.
>>One computer has 1 GB real RAM; the other, only 256 MB.
>>There is no LVM. The filesystem is ext3.
>>The performance is not affected by whether iptables is running with 
>>any rules or not.
>
>Q1: you did not mention, Is DRBD transferring over a dedicated 
>network cable or  the one shared with the rest of your LAN? For my 
>test setup I have it shared with the LAN and get some variance, but 
>I think the network switch prevents me from being hit as hard as 
>you, i.e., on 100Mb connection I am seeing ~8500K/sec +-500K/sec.

Our network should be operating at wire speed, which for the "slow" 
pair, is full-duplex 100BaseT. The fast pair is dual, full-duplex 
gigabit on the primary (using adaptive load-balancing) and single 
full-duplex gigabit on the other.

>Q2: how hard can you drive the hard drives (on both systems) without DRBD?
>on my systems `dd if=/dev/zero of=/dev/hda13 bs=4k` has `iostat -x 
>/dev/hda 2` showing ~30000kB/s on both systems. With DRBD, proto C, 
>`dd if=/dev/zero of=/drbd0mnt/testzone/junk bs=4k` gets an iostat of 
>~8300KB/s.

iostat gives me values that look bogus. For example, on the non-DRBD 
disk, it reports 28.49 rkB/s, which, if I understand correctly, means 
28.49 KB read per second, and that can't be right.

Using time with dd, I found the DRBD disk to be about 13 MB/second 
and the non-DRBD disk to be 29 MB/second on the slow set of servers.

On the fast servers, the non-DRBD disk tested via dd yields an 
astounding 182 MB/second and the DRBD disk, a mere 16 MB/second.

Actually, 16 MB/second is in line with the "rate 16M" in drbd.conf. 
To be clear, I have seen the slow numbers only when the secondary has 
been offline for some time and there is quite a bit of data to be 
resynced. The servers' functions are disabled during that time, so 
nobody is actively using them to write new data during the resync.

>Q3: I would not expect it to cause the variance, but does adding
>al-extents 257; to the syncer settings help? Philipp mentioned it in 
>his "The need for Speed 2" thread.

I need to research this...

>Q4: what kind of network speed do you see using a tool like ttcp?
>i.e. `ttcp -t -n 16384 -s recivemachineIP` &
>`ttcp -r -n 16384 -s sendmachineIP`
>yields for me
>ttcp-t: 134217728 bytes in 14.54 real seconds = 9013.14 KB/sec +++
>ttcp-t: 16384 I/O calls, msec/call = 0.91, calls/sec = 1126.64
>So apparently, I am using almost all the speed the network can give 
>me while syncing.

On the slow server pair:
ttcp-t: 134217728 bytes in 28.13 real seconds = 4659.27 KB/sec +++
ttcp-t: 16384 I/O calls, msec/call = 1.76, calls/sec = 582.41

I wonder whether the fact that the secondary of the slow pair has 
only 256 MB of RAM affects this result.

On the fast server pair:
ttcp-t: 134217728 bytes in 1.88 real seconds = 69817.52 KB/sec +++
ttcp-t: 16384 I/O calls, msec/call = 0.12, calls/sec = 8727.19
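
To put the ttcp figures next to the nominal link speeds, here is a 
minimal sketch in Python. It assumes ttcp's "KB" means 1024 bytes 
(which matches bytes divided by seconds in the output above) and that 
the limiting links are 100 Mbit/s for the slow pair and 1000 Mbit/s 
for the fast pair. The function names are made up for illustration.

```python
# Convert ttcp throughput (KB/sec, assuming 1 KB = 1024 bytes) to Mbit/s
# and compare it with the nominal speed of the link.

def kb_per_sec_to_mbit(kb_per_sec):
    """Return throughput in Mbit/s (1 Mbit = 1,000,000 bits)."""
    return kb_per_sec * 1024 * 8 / 1_000_000


def link_utilization(kb_per_sec, link_mbit):
    """Return the fraction of the nominal link speed actually achieved."""
    return kb_per_sec_to_mbit(kb_per_sec) / link_mbit


# (label, ttcp KB/sec from the runs above, assumed nominal link in Mbit/s)
results = [
    ("slow pair", 4659.27, 100),
    ("fast pair", 69817.52, 1000),
]

for name, kb, link in results:
    mbit = kb_per_sec_to_mbit(kb)
    share = link_utilization(kb, link)
    print(f"{name}: {mbit:.1f} Mbit/s, {share:.0%} of {link} Mbit/s")
```

By this arithmetic neither pair reaches wire speed in ttcp, and the 
slow pair gets well under half of its 100 Mbit/s link.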



>>
>>The computers are not doing much else during this time.
>>
>>As far as I can tell, this is the only aspect of the machines that 
>>are not running at full speed.
>>
>>Here is output of /proc/drbd showing the performance start out 
>>reasonable, then slow down dramatically, sometimes even stopping, 
>>only to jump back to full speed. It spends most of its time at 
>>suboptimal speed though.
>>
>>version: 0.7.14 (api:77/proto:74)
>>SVN Revision: 1989 build by root at kennedy1, 2006-01-05 20:07:14
>>  0: cs:SyncSource st:Secondary/Secondary ld:Consistent
>>     ns:123136 nr:0 dw:0 dr:123136 al:0 bm:13 lo:0 pe:135 ua:0 ap:0
>>         [>...................] sync'ed:  4.8% (410680/426656)K
>>         finish: 0:02:51 speed: 2,240 (1,452) K/sec
>>
>
><SNIP current and average sync speed varying wildly>
>
>>Here is the relevant drbd.conf, which should allow drbd to move up 
>>to 4 MB per second.
>>
>>resource database {
><SNIP>
>>         syncer  { rate 4M; group 1; } # sync when r0 and r1 are finished
><SNIP>
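
For a sense of scale, here is a rough sketch in Python of how long 
this resync would take at the configured rate versus the average speed 
shown in /proc/drbd. It assumes that the first number in 
"(410680/426656)K" is the amount still to be synced, and that 
"rate 4M" means 4096 K/sec. The function name is hypothetical.

```python
# Estimate resync duration from the amount left and a sync speed.

def resync_seconds(remaining_kb, rate_kb_per_sec):
    """Return the seconds needed to sync remaining_kb at the given rate."""
    return remaining_kb / rate_kb_per_sec


remaining_kb = 410680          # assumed: K still to sync, from /proc/drbd
configured_rate = 4 * 1024     # assumed: "rate 4M" taken as 4096 K/sec
observed_average = 1452        # K/sec, the value in parentheses above

at_configured = resync_seconds(remaining_kb, configured_rate)
at_observed = resync_seconds(remaining_kb, observed_average)

print(f"at configured rate: {at_configured:.0f} s")
print(f"at observed average: {at_observed:.0f} s")
print(f"slowdown factor: {at_observed / at_configured:.1f}x")
```

Under those assumptions the resync runs at a bit more than a third of 
the speed that drbd.conf allows.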


-- 

Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University


