I am having trouble getting DRBD to perform well. I've measured the speed of both the raw disk and the network connection, and DRBD only reaches about 30% of those limits. At best, I can't get over about 50% of the MB/second of a netcat transfer between the 2 nodes.

*** Hardware and Configuration ***

2 identical Dell PowerEdge 2950s (called 2950-22 and 2950-23). 16GB RAM. 2 dual-core 1.6 GHz Xeon processors. Dedicated Ethernet link for DRBD using Broadcom BCM5708 Gigabit Ethernet adapters. The 2 nodes are in separate data centers (maybe 40km apart), but there are dual redundant fiber links between the 2 centers. The DRBD link is on a dedicated VLAN which was created only for those 2 boxes.

/dev/sda is hardware RAID 10: a PERC 5/i controller with battery-backed cache and 6 15K RPM 68GB drives, for 271GB of usable storage in RAID 10. Read policy is 'no read ahead', and write policy is 'write back'. /dev/drbd0 sits on top of /dev/sda7, which is 255GB in size. /dev/drbd0 is mounted as /db, using the ext3 filesystem. All other filesystems are also on partitions of /dev/sda.

Both machines run Red Hat 5.

[alexd at dellpe2950-23 ~]$ uname -a
Linux dellpe2950-23.azcentral.com 2.6.18-8.1.15.el5 #1 SMP Thu Oct 4 04:06:39 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

Tests

*** Disk (w/o network) ***

dd if=/dev/zero of=/db/tmp/testfile bs=1G count=1 oflag=dsync

I've tried to establish the performance of the raw disk + filesystem by writing 1GB of /dev/zero to a file on 2 non-DRBD partitions, and by writing to the DRBD partition in StandAlone mode. I see pretty consistent performance between 115 MB/sec and 125 MB/sec. The DRBD partition in StandAlone mode has the highest average; I'm guessing that's because it's the largest partition and so is spread over the most disks.

*** Network (w/o disk) ***

node1 (server): /usr/local/bin/iperf -s -B 10.99.210.34
node2 (client): /usr/local/bin/iperf -c 10.99.210.34

Running this 10 times, I get min/mean/max of 97/104/107 MB/second.
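For what it's worth, here is roughly how I collect the min/mean/max numbers quoted above. This is a sketch, scaled down (64MB per run, 3 runs) so it's cheap to try anywhere; for the real measurement set TARGET=/db/tmp/testfile, BS=1G, COUNT=1, MB=1024, RUNS=10.

```shell
#!/bin/sh
# Repeat the dd throughput test and report min/mean/max in MB/s.
# Defaults are deliberately small; override via environment for real runs.
TARGET=${TARGET:-/tmp/ddtest.$$}
BS=${BS:-1M}; COUNT=${COUNT:-64}; MB=${MB:-64}; RUNS=${RUNS:-3}
RESULT=$(
  for i in $(seq 1 "$RUNS"); do
    t0=$(date +%s.%N)
    dd if=/dev/zero of="$TARGET" bs="$BS" count="$COUNT" oflag=dsync 2>/dev/null
    t1=$(date +%s.%N)
    # MB written divided by elapsed seconds
    awk -v s="$t0" -v e="$t1" -v mb="$MB" 'BEGIN {printf "%.1f\n", mb/(e-s)}'
  done | sort -n | awk '{a[NR]=$1; sum+=$1}
        END {printf "min/mean/max = %.1f/%.1f/%.1f MB/s", a[1], sum/NR, a[NR]}'
)
rm -f "$TARGET"
echo "$RESULT"
```

oflag=dsync forces each write through the cache, which is what makes the numbers comparable across the raw, StandAlone and Connected cases.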
*** Network + Disk (w/o drbd) ***

node1 (server): nc -v -l 1234 > testfile.out
node2 (client): time nc -v 10.99.210.34 1234 < testfile

testfile was created with 'dd', using the same command as in the disk tests above: 1GB of /dev/zero. Running this 10 times across 2 non-DRBD partitions (5 each) gives min/mean/max of 68/71/72 MB/second.

*** DRBD performance ***

drbd.conf attached below...

[alexd at dellpe2950-23 ~]$ cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by alexd at dellpe2950-23.azcentral.com, 2008-05-01 09:44:22
 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:2197684 nr:0 dw:1267670012 dr:45858705 al:517265 bm:40270 lo:0 pe:0 ua:0 ap:0
    resync: used:0/61 hits:138942 misses:219 starving:0 dirty:0 changed:219
    act_log: used:0/577 hits:248624541 misses:518844 starving:0 dirty:1579 changed:51726

On the DRBD primary:

dd if=/dev/zero of=/db/tmp/testfile bs=1G count=1 oflag=dsync

My drbd.conf is pasted at the end. I used it as a baseline and varied one value at a time, up and down, to see what effect each would have on performance. I've tried tuning al-extents, max-buffers, unplug-watermark, sndbuf-size, and max-epoch-size. I've also tried the 'deadline' and 'noop' I/O schedulers, though the majority of the tests were done with 'deadline'.

In all cases, performance is pretty consistent: min/mean/max of 3/36/39 MB/second across 32 test runs of 1GB each. The 3 MB/second outlier was with max-buffers set to 32, which I believe is the minimum allowable value. Leaving that run out gives min/mean/max of 30/37/39 MB/second. (This also excludes some tests I previously emailed about where I had sndbuf-size set to 128, which was basically unusable and never finished.)

No matter what I tinker with, DRBD performance is almost the same. This made me suspect something outside of DRBD is the limiting factor, but given the other tests above, I'm not sure what that could be.
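One factor outside the tunables that I keep coming back to: with protocol C, every write has to be acknowledged by the peer's disk before it completes, so the round-trip latency of the 40km link can cap throughput no matter what the bandwidth is. A purely illustrative back-of-the-envelope check (the 2 ms RTT and 64 KB in-flight figures here are assumptions, not measurements; the real RTT should be measured with ping on the replication VLAN):

```shell
# Latency-bound throughput ceiling: if only 'inflight' bytes can be
# outstanding per round trip, throughput <= inflight / RTT.
# Both numbers below are illustrative assumptions, not measured values.
LIMIT=$(awk 'BEGIN {
    rtt = 0.002          # seconds: assumed 2 ms round trip over the 40km link
    inflight = 64*1024   # bytes outstanding per round trip (assumption)
    printf "%.0f MB/s", inflight/rtt/1024/1024
}')
echo "$LIMIT"
```

If numbers in that ballpark come out of a measured RTT, it would explain why none of the buffer tunables move the needle much.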
Can anyone help me spot flaws in my test plan, or suggest other things to try? I'm at a loss at the moment.

thanks for your input,
alex

[alexd at dellpe2950-23 ~]$ cat /etc/drbd.conf
#
# please have a look at the example configuration file in
# /usr/share/doc/drbd/drbd.conf
#

# Our MySQL Share
resource drbd-resource-0 {
  protocol C;
  #incon-degr-cmd "halt -f";

  startup {
    degr-wfc-timeout 5;
  }

  net {
    #on-disconnect reconnect;
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    max-epoch-size 8192;
    max-buffers 8192;
    unplug-watermark 128;
  }

  disk {
    on-io-error detach;
  }

  syncer {
    rate 12M;
    al-extents 577;
  }

  on dellpe2950-22 {
    device    /dev/drbd0;
    disk      /dev/sda7;          # db partition
    address   10.99.210.33:7789;  # Private subnet IP
    meta-disk internal;
  }

  on dellpe2950-23 {
    device    /dev/drbd0;
    disk      /dev/sda7;          # db partition
    address   10.99.210.34:7789;  # Private subnet IP
    meta-disk internal;
  }
}
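In case it helps anyone reproduce the sweep: this is a sketch of the one-value-at-a-time procedure I used, shown here against a throwaway copy of the net{} section so it can be run anywhere. On the real nodes you would point CONF at /etc/drbd.conf and, after each edit, run 'drbdadm adjust drbd-resource-0' (which re-applies the on-disk config to the running resource) followed by the dd test above.

```shell
#!/bin/sh
# Sweep max-buffers through a set of candidate values, one at a time.
# Edits a scratch copy of the net{} section; swap CONF for /etc/drbd.conf
# (and uncomment the drbdadm/dd lines) to run it for real.
CONF=${CONF:-/tmp/drbd.conf.sweep}
cat > "$CONF" <<'EOF'
net {
    max-epoch-size 8192;
    max-buffers 8192;
    unplug-watermark 128;
}
EOF
for buf in 32 512 2048 8192 16384; do
    sed -i "s/max-buffers *[0-9]*;/max-buffers $buf;/" "$CONF"
    grep 'max-buffers' "$CONF"
    # drbdadm adjust drbd-resource-0
    # dd if=/dev/zero of=/db/tmp/testfile bs=1G count=1 oflag=dsync
done
```

The same sed pattern works for the other net{} tunables (max-epoch-size, unplug-watermark), keeping all but one value fixed per run.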