Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
I am having trouble getting drbd to perform well. I've measured the
speed of both the raw disk and the network connection, and drbd only
reaches about 30% of those limits. Even at best, I can't get over
about 50% of the MB/second of a netcat transfer between the 2 nodes.
*** Hardware and Configuration ***
2 identical Dell PowerEdge 2950s (called 2950-22 and 2950-23). 16GB
RAM. 2 dual-core 1.6 GHz Xeon processors. Dedicated ethernet link for
drbd using Broadcom BCM5708 Gigabit Ethernet adapters. The 2 nodes
are in separate data centers (maybe 40km apart), but there are dual
redundant fiber links between the 2 centers. The drbd link is on a
dedicated VLAN created just for these 2 boxes.
/dev/sda is hardware RAID 10: a PERC 5/i controller with battery-backed
cache and 6 15K RPM 68GB drives, for 271GB of usable storage in RAID
10. Read policy is 'no read ahead', and write policy is 'write back'.
/dev/drbd0 sits on top of /dev/sda7, which is 255GB in size.
/dev/drbd0 is mounted at /db with an ext3 filesystem. All other
filesystems are also on partitions of /dev/sda.
Both machines run Red Hat Enterprise Linux 5.
[alexd@dellpe2950-23 ~]$ uname -a
Linux dellpe2950-23.azcentral.com 2.6.18-8.1.15.el5 #1 SMP Thu Oct 4
04:06:39 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
Tests
*** Disk (w/o network) ***
dd if=/dev/zero of=/db/tmp/testfile bs=1G count=1 oflag=dsync
I've tried to establish the performance of the raw disk + filesystem
by writing 1GB from /dev/zero to a file on 2 non-drbd partitions, and
by writing to the drbd partition in StandAlone mode. I see fairly
consistent performance, between 115 MB/sec and 125 MB/sec. The drbd
partition in StandAlone mode has the highest average; I'm guessing
that's because it's the largest partition and so is spread across the
most disks.
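As a possible cross-check (not one of the runs above), the same write
could be done with O_DIRECT to take the page cache out of the picture;
the 1MB block size here is just an illustrative choice:

dd if=/dev/zero of=/db/tmp/testfile bs=1M count=1024 oflag=direct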
*** Network (w/o disk) ***
node1 (server): /usr/local/bin/iperf -s -B 10.99.210.34
node2 (client): /usr/local/bin/iperf -c 10.99.210.34
Running this 10 times, I get min/mean/max of 97/104/107 MB/second.
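Since the 2 nodes are ~40km apart, round-trip latency may matter as
much as raw bandwidth. A rough sanity check (illustrative numbers, not
measured) is to take the RTT from ping and multiply by the link speed
to estimate the TCP window needed to keep a gigabit pipe full:

ping -c 10 10.99.210.34
# e.g. 1 Gbit/s = 125 MB/s; at a 1 ms RTT that is 125 MB/s * 0.001 s
# = ~125 KB in flight, so socket buffers must be at least that large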
*** Network + Disk (w/o drbd) ***
node1 (server): nc -v -l 1234 > testfile.out
node2 (client): time nc -v 10.99.210.34 1234 < testfile
testfile was created with 'dd', using the same command as in the disk
tests above (1GB from /dev/zero). Running this 10 times on 2 non-drbd
partitions (5 each) gives min/mean/max of 68/71/72 MB/second.
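A variant that takes the sending disk out of the picture entirely
would be to stream zeros straight into nc; dd then prints its own
MB/second figure when it finishes (same addresses as above, not one
of the 10 runs):

node1 (server): nc -v -l 1234 > testfile.out
node2 (client): dd if=/dev/zero bs=1M count=1024 | nc -v 10.99.210.34 1234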
*** DRBD performance ***
drbd.conf attached below...
[alexd@dellpe2950-23 ~]$ cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by
alexd@dellpe2950-23.azcentral.com, 2008-05-01 09:44:22
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:2197684 nr:0 dw:1267670012 dr:45858705 al:517265 bm:40270 lo:0
pe:0 ua:0 ap:0
resync: used:0/61 hits:138942 misses:219 starving:0 dirty:0
changed:219
act_log: used:0/577 hits:248624541 misses:518844 starving:0
dirty:1579 changed:51726
On the drbd primary: dd if=/dev/zero of=/db/tmp/testfile bs=1G count=1 oflag=dsync
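While that runs, it can help to watch the device and the drbd state
from another terminal, e.g.:

iostat -x sda 1            # per-device utilization (sysstat)
watch -n1 cat /proc/drbd   # do pe:/ua: (pending/unacked requests) pile up?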
My drbd.conf is pasted at the end. I used that as a baseline and
varied one value at a time, up and down, to see what effect each would
have on performance. I've tried tuning al-extents, max-buffers,
unplug-watermark, sndbuf-size, and max-epoch-size. I've also tried
the 'deadline' and 'noop' I/O schedulers, though the majority of the
tests were done with 'deadline'. In all cases, performance is pretty
consistent.
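(For reference, settings like these can be applied without a full
restart by editing /etc/drbd.conf identically on both nodes and then
running:

drbdadm adjust drbd-resource-0
cat /proc/drbd    # confirm cs:Connected again before the next dd run)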
min/mean/max of 3/36/39 MB/second, from 32 test runs of 1GB each.
The 3 MB/second was with a max-buffers of 32, which I believe is the
minimum allowable value. Leaving that run out gives min/mean/max of
30/37/39 MB/second. (This also excludes some tests I previously
emailed about, where I had sndbuf-size set to 128; that was basically
unusable and never finished.)
No matter what I tinker with, drbd performance is almost the same.
This made me suspect something outside of drbd was a limiting factor,
but given the other tests I'd run above I'm not sure what that could be.
Can anyone help me spot flaws in my test plans, or suggest other
things to try? I'm at a loss at this moment.
thanks for your input,
alex
[alexd@dellpe2950-23 ~]$ cat /etc/drbd.conf
#
# please have a look at the example configuration file in
# /usr/share/doc/drbd/drbd.conf
#
# Our MySQL Share
resource drbd-resource-0 {
  protocol C;
  #incon-degr-cmd "halt -f";

  startup {
    degr-wfc-timeout 5;
  }

  net {
    #on-disconnect reconnect;
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    max-epoch-size 8192;
    max-buffers 8192;
    unplug-watermark 128;
  }

  disk {
    on-io-error detach;
  }

  syncer {
    rate 12M;
    al-extents 577;
  }

  on dellpe2950-22 {
    device    /dev/drbd0;
    disk      /dev/sda7;           # db partition
    address   10.99.210.33:7789;   # Private subnet IP
    meta-disk internal;
  }

  on dellpe2950-23 {
    device    /dev/drbd0;
    disk      /dev/sda7;           # db partition
    address   10.99.210.34:7789;   # Private subnet IP
    meta-disk internal;
  }
}
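For completeness, 'drbdadm dump' re-prints the configuration as
drbdadm parses it, which makes it easy to diff the two nodes' configs
and confirm they agree:

[alexd@dellpe2950-23 ~]$ drbdadm dump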