Hello,

I am trying to set up drbd on some HP ProLiant servers but have performance problems. With most Linux kernels the sync speed is very slow; there are just a few exceptions.

Software:
  Both nodes run Debian 3.1 ("sarge").

Hardware:
  DL 380 G4 with a 440GB hardware RAID5 (4 144GB disks) on /dev/cciss/c0d1.
  A dual Intel Gigabit controller configured as a bonding device for drbd use, connected to a separate VLAN.

Sending data via tcpspray looks like this:

    linbackup-1:~# tcpspray -n 1000000 192.168.53.2
    Transmitted 1024000000 bytes in 9.006240 seconds (111034.127 kbytes/s)

    linbackup-2:~# tcpspray -n 1000000 192.168.53.1
    Transmitted 1024000000 bytes in 12.658490 seconds (78998.364 kbytes/s)

(There is a significant speed difference between the two directions, but I am not sure whether it explains the effect described below.)

All of the following results were obtained with drbd-0.7.18 (SVN Revision: 2186M) and the same drbd.conf:

    global {
        minor-count 5;
    }

    resource drbd0 {
        protocol C;
        incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";

        startup {
            wfc-timeout 0;
            degr-wfc-timeout 120; # 2 minutes.
        }

        disk {
            on-io-error detach;
        }

        net {
            max-buffers 131072;
            ko-count 4;
        }

        syncer {
            rate 90M;
            group 1;
            al-extents 257;
        }

        on linbackup-1 {
            device /dev/drbd0;
            disk /dev/vgdrbd/drbd;
            address 192.168.53.1:7788;
            meta-disk /dev/vg00/lvol5[0];
        }

        on linbackup-2 {
            device /dev/drbd0;
            disk /dev/vgdrbd/drbd;
            address 192.168.53.2:7788;
            meta-disk /dev/vg00/lvol5[0];
        }
    }

(There was no speed difference between using drbd on top of lvm2 volumes and using it directly on the device. The meta-disk is on a separate disk.)

When syncing, I get something like the following with most kernel versions (example with both nodes running linux-2.6.6):

    SVN Revision: 2186M build by root at linbackup-1, 2006-05-06 19:52:29
     0: cs:SyncSource st:Secondary/Secondary ld:Consistent
        ns:263444 nr:0 dw:0 dr:272268 al:0 bm:45 lo:0 pe:192 ua:2206 ap:0
        [>...................] sync'ed: 4.6% (5511/5768)M
        finish: 0:47:02 speed: 1,672 (3,280) K/sec

With some versions the sync is about 10 times faster, for example (linux-2.6.5):

    SVN Revision: 2186M build by root at linbackup-2, 2006-05-06 20:01:39
     0: cs:SyncTarget st:Secondary/Secondary ld:Inconsistent
        ns:0 nr:1090348 dw:1090344 dr:0 al:0 bm:60 lo:3524 pe:1721 ua:3524 ap:0
        [===>................] sync'ed: 19.6% (4349/5400)M
        finish: 0:01:41 speed: 43,868 (33,632) K/sec

In all tests (except where noted) linbackup-1 was the primary and linbackup-2 the secondary.

The following kernel configurations were slow:

  - both nodes running 2.6.16.14, 2.6.15.7, 2.6.14.7, 2.6.13.5, 2.6.12.6, 2.6.11.12, 2.6.10 or 2.6.6
  - linbackup-1 (primary) running 2.6.5 and linbackup-2 (secondary) running 2.6.16.14

Only the following configurations gave the better speed:

  - both nodes running 2.6.5
  - linbackup-1 (primary) running 2.6.16.14 and linbackup-2 (secondary) running 2.6.5
  - and vice versa when switching primary and secondary: linbackup-1 (secondary) with 2.6.5 and linbackup-2 (primary) with 2.6.16.14

Other combinations of kernel versions were tested too (unfortunately not all of them documented). Whenever neither node was running 2.6.5, the sync was always slow.

Any ideas how I can track down this problem? I would not like to run an ancient kernel revision that I can never update on a production system. I am not even sure whether it is a hardware or a software problem.

Is any more information needed, or is there anything else I should test to make it work reliably?

Regards
Torsten
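
Since not every kernel combination was written down, one way to record results consistently would be to pull the sync speed straight out of /proc/drbd. The following is only a minimal Python sketch, assuming the progress line keeps the "speed: current (average) K/sec" layout shown in the output above. The names SPEED_RE and parse_sync_speed are made up for illustration and are not part of drbd.

```python
import re

# Matches the resync progress fragment as it appears in the /proc/drbd output
# quoted above, e.g. "speed: 1,672 (3,280) K/sec". The layout is an assumption
# based on those two samples only.
SPEED_RE = re.compile(r"speed:\s*([\d,]+)\s*\(([\d,]+)\)\s*K/sec")


def parse_sync_speed(text):
    """Return (current, average) sync speed in K/sec, or None if no speed is shown."""
    match = SPEED_RE.search(text)
    if match is None:
        return None
    # Strip the thousands separators before converting to integers.
    current, average = (int(group.replace(",", "")) for group in match.groups())
    return current, average


if __name__ == "__main__":
    # The two progress lines quoted in the message above.
    slow = "sync'ed: 4.6% (5511/5768)M finish: 0:47:02 speed: 1,672 (3,280) K/sec"
    fast = "sync'ed: 19.6% (4349/5400)M finish: 0:01:41 speed: 43,868 (33,632) K/sec"

    slow_speed = parse_sync_speed(slow)
    fast_speed = parse_sync_speed(fast)
    print("slow (current, average):", slow_speed)
    print("fast (current, average):", fast_speed)
    # Ratio of the average speeds, i.e. the "about 10 times faster" comparison.
    print("average speed ratio: %.1f" % (fast_speed[1] / slow_speed[1]))
```

On a live node the same function could be fed the contents of /proc/drbd during a resync and the result logged together with the output of uname -r on both nodes, so that each kernel combination ends up with a recorded figure.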