Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hello,

I'm experiencing a strange and interesting issue. I have a two-node cluster with a write speed of ~500MB/s on a local partition and a network throughput of ~220MB/s, but when writing to the DRBD partition I get only ~60MB/s. Another interesting thing is that the initial sync speed reported by "watch cat /proc/drbd" is roughly 215MB/s. The seeks/s are also a lot slower on the DRBD partition, which worries me because this cluster is also meant to run a database.

Since the network and disk benchmarks show the expected performance and only the DRBD partition has slow write speed, I suspect the issue lies within DRBD or its configuration. Below are my DRBD configuration, a netcat-based benchmark, and bonnie benchmarks against the local partition and the DRBD partition. The backing device for DRBD is an LVM partition formatted with ext4; that same LV (freshly formatted) was used for the netcat-based benchmark. In all cases iotop shows "jbd2/sda2-8" as the I/O bottleneck. I also tried xfs - same write speed. The firewall was switched off during the tests. I have left out the bonding and bridge details of the other interfaces as I don't think they are relevant for this issue. I also monitored the benchmarks with atop, and it reflects the same disk and network stats.

Some details:
- two nodes: hydra04 and hydra05 (same hardware)
- 48 CPU threads, 48GB RAM, RAID6 on SAS drives
- distro: debian-testing (wheezy); all software is stock Debian packages
- kernel: 3.2.0-3-amd64
- DRBD: 8.3.13
- networking: 2x Gigabit interfaces back-to-back in bonding mode rr; MTU: 9000

I have a second two-node DRBD cluster which is very similar to this one (in particular, the networking, all software versions and the DRBD configuration are the same). There I get about the same I/O performance on a local partition and the same raw network throughput, but when I run bonnie on the DRBD partition I get the full ~235MB/s write speed instead of the 60MB/s on this cluster.

I also played around with some DRBD configuration settings which are supposed to help with performance; with some of them I got the throughput from 60MB/s to 70MB/s, which is still a lot less than the ~220MB/s I expect. A sketch of the kind of settings I tried is below.

Any help appreciated.
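For reference, this is roughly the kind of tuning I experimented with - a sketch with illustrative values, not my exact settings; the options themselves are the ones documented for DRBD 8.3:

resource test-io {
    syncer {
        rate             200M;
        al-extents       3389;
    }
    net {
        max-buffers      8000;
        max-epoch-size   8000;
        sndbuf-size      512k;
        unplug-watermark 16;
    }
    disk {
        # only safe with a battery-backed write cache on the controller
        no-disk-barrier;
        no-disk-flushes;
    }
    ...
}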
warm regards,
Tom

------------------------- drbd-configuration ---------------------------
tom at hydra04 [1501]:~$ sudo drbdadm dump
# /etc/drbd.conf
common {
    protocol               C;
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error   "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
}

# resource leela on hydra04: not ignored, not stacked
resource leela {
    on hydra04 {
        device           minor 0;
        disk             /dev/vg0/leela;
        address          ipv4 10.0.0.1:7788;
        meta-disk        internal;
    }
    on hydra05 {
        device           minor 0;
        disk             /dev/vg0/leela;
        address          ipv4 10.0.0.2:7788;
        meta-disk        internal;
    }
}

# resource gelb on hydra04: not ignored, not stacked
resource gelb {
    on hydra04 {
        device           minor 1;
        disk             /dev/vg0/gelb;
        address          ipv4 10.0.0.1:7789;
        meta-disk        internal;
    }
    on hydra05 {
        device           minor 1;
        disk             /dev/vg0/gelb;
        address          ipv4 10.0.0.2:7789;
        meta-disk        internal;
    }
}

# resource cyan on hydra04: not ignored, not stacked
resource cyan {
    on hydra04 {
        device           minor 2;
        disk             /dev/vg0/cyan;
        address          ipv4 10.0.0.1:7790;
        meta-disk        internal;
    }
    on hydra05 {
        device           minor 2;
        disk             /dev/vg0/cyan;
        address          ipv4 10.0.0.2:7790;
        meta-disk        internal;
    }
}

# resource test-io on hydra04: not ignored, not stacked
resource test-io {
    on hydra04 {
        device           minor 5;
        disk             /dev/vg0/test-io;
        address          ipv4 10.0.0.1:7793;
        meta-disk        internal;
    }
    on hydra05 {
        device           minor 5;
        disk             /dev/vg0/test-io;
        address          ipv4 10.0.0.2:7793;
        meta-disk        internal;
    }
}
----------------------------------------------------------------------
--- testing bandwidth with netcat (test.dd resides on local partition) ---
root at hydra05:~# nc -l 1234 | pipebench > /mnt/test-io/test.dd

tom at hydra04 [1522]:~$ sudo dd if=/dev/zero of=/mnt/test-io/test.dd bs=1M count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 13.8065 s, 380 MB/s

tom at hydra04 [1523]:~$ sudo nc 10.0.0.2 1234 < /mnt/test-io/test.dd
# Summary:
# Piped   4.88 GB in 00h00m22.37s:  223.45 MB/second
----------------------------------------------------------------------
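(Note: the 223.45 MB/s above still includes the ext4 write on hydra05. To cross-check the raw link speed with the receiving disk taken out of the path, the same test could also be run as below - a sketch only, I did not capture its output here:)

root at hydra05:~# nc -l 1234 | pipebench > /dev/null
tom at hydra04 [1524]:~$ dd if=/dev/zero bs=1M count=5000 | nc 10.0.0.2 1234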
----- bonnie on local partition on hydra04 - hydra05 is the same ---------
tom at hydra04 [1498]:~$ sudo mount /dev/vg0/test-io /mnt/test-io/; cd /mnt/test-io/; time sudo bonnie -f -u root; cd -; sudo umount /mnt/test-io/
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
hydra04     96704M           495967  68 159106  20           546991  28 727.7  11
Latency                        203ms     940ms               183ms     186ms
Version  1.96       ------Sequential Create------ --------Random Create--------
hydra04             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  5425   6 +++++ +++ 20766  23 +++++ +++ +++++ +++ +++++ +++
Latency               478us     707us     685us     339us      18us     724us
1.96,1.96,hydra04,1,1350881245,96704M,,,,495967,68,159106,20,,,546991,28,727.7,11,16,,,,,5425,6,+++++,+++,20766,23,+++++,+++,+++++,+++,+++++,+++,,203ms,940ms,,183ms,186ms,478us,707us,685us,339us,18us,724us
----------------------------------------------------------------------------
------------------ bonnie on drbd-partition on hydra04 ---------------------
tom at hydra04 [1501]:~$ sudo mount /dev/drbd/by-res/test-io /mnt/test-io/; cd /mnt/test-io/; time sudo bonnie -f -u root; cd -; sudo umount /mnt/test-io/
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
hydra04     96704M            65471  10  58394   8           527477  32 162.4   3
Latency                        101ms    5002ms               208ms     176ms
Version  1.96       ------Sequential Create------ --------Random Create--------
hydra04             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  2614   3 +++++ +++ +++++ +++ 22167  33 +++++ +++ +++++ +++
Latency               315us     960us     754us     372us      41us     759us
1.96,1.96,hydra04,1,1350685155,96704M,,,,65471,10,58394,8,,,527477,32,162.4,3,16,,,,,2614,3,+++++,+++,+++++,+++,22167,33,+++++,+++,+++++,+++,,101ms,5002ms,,208ms,176ms,315us,960us,754us,372us,41us,759us
----------------------------------------------------------------------------
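PS: since iotop points at jbd2 (the ext4 journal thread) as the bottleneck in every run, one more thing I plan to try is taking journal flushes out of the picture by mounting the DRBD device with barriers disabled. Sketch only - I have not captured such a run above, and barrier=0 is only safe with a battery-backed write cache:

tom at hydra04 [1525]:~$ sudo mount -o barrier=0,data=writeback /dev/drbd/by-res/test-io /mnt/test-io/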