Hi!

I'm using DRBD as a backend for PostgreSQL in a two-node HA setup and I'm
experiencing severe slowdowns. An analysis follows below. All results were
obtained with Linux-HA switched off and nothing resource-intensive running on
the servers. I ran each benchmark at least three times to make sure I'm not
falling prey to statistical outliers.

Could someone please give me some advice about how to proceed to determine
the root cause of the slowdown? I suspect it could be the write cache of the
RAID controller not being used, but unfortunately my version of DRBD does not
seem to support the no-disk-flushes and no-md-flushes options.

Please note the following two performance figures:

Speed for: sudo dd if=/dev/zero of=/dev/drbd0 bs=512
Disconnected: 3.5 MB/s
Connected: 3.5 kB/s

= Network latency =

asterix02 at asterix02:/usr/lib/lmbench/bin/i686-pc-linux-gnu$ ./lat_tcp 192.168.1.1
TCP latency using 192.168.1.1: 0.2479 microseconds

asterix01 at asterix01:/usr/lib/lmbench/bin/i686-pc-linux-gnu$ ./lat_tcp 192.168.1.2
TCP latency using 192.168.1.2: 0.2463 microseconds

= Network throughput =

asterix01 at asterix01:/usr/lib/lmbench/bin/i686-pc-linux-gnu$ iperf -f M -c 192.168.1.2
------------------------------------------------------------
Client connecting to 192.168.1.2, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.1 port 39381 connected with 192.168.1.2 port 5001
[ 3] 0.0-10.0 sec 1116 MBytes 112 MBytes/sec

= Local disk (IBM Serveraid RAID 1) =

asterix01 at asterix01:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 9.97766 s, 108 MB/s

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 9.84385 s, 109 MB/s

asterix01 at asterix01:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.136389 s, 3.8 MB/s

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.140179 s, 3.7 MB/s

= DRBD default configuration =

asterix02 at asterix02:/$ sudo drbdsetup /dev/drbd0 show
disk {
    size            0s _is_default; # bytes
    on-io-error     detach;
    fencing         dont-care _is_default;
}
syncer {
    rate            33792k; # bytes/second
    after           -1 _is_default;
    al-extents      127 _is_default;
}
_this_host {
    device          "/dev/drbd0";
    disk            "/dev/sda5";
    meta-disk       internal;
}

== Disconnected DRBD ==

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 43.7656 s, 24.5 MB/s

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.14615 s, 3.5 MB/s

== Connected DRBD (no resync happening) ==

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 53.9678 s, 19.9 MB/s

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 144.54 s, 3.5 kB/s

= Optimised DRBD =

disk {
    size            0s _is_default; # bytes
    on-io-error     detach;
    fencing         dont-care _is_default;
}
net {
    timeout         20; # 1/10 seconds
    max-epoch-size  2048 _is_default;
    max-buffers     8192;
    unplug-watermark 8192;
    connect-int     10 _is_default; # seconds
    ping-int        1; # seconds
    sndbuf-size     131070 _is_default; # bytes
    ko-count        0 _is_default;
    after-sb-0pri   disconnect _is_default;
    after-sb-1pri   disconnect _is_default;
    after-sb-2pri   disconnect _is_default;
    rr-conflict     disconnect _is_default;
    ping-timeout    5 _is_default; # 1/10 seconds
}
syncer {
    rate            33792k; # bytes/second
    after           -1 _is_default;
    al-extents      2129;
}
protocol C;
_this_host {
    device          "/dev/drbd0";
    disk            "/dev/sda5";
    meta-disk       internal;
    address         192.168.1.2:7788;
}
_remote_host {
    address         192.168.1.1:7788;
}

asterix02 at asterix02:/$ cat /proc/drbd
version: 8.0.11 (api:86/proto:86)
GIT-hash: b3fe2bdfd3b9f7c2f923186883eb9e2a0d3a5b1b build by phil at mescal, 2008-02-12 11:56:43
 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:1048576 nr:0 dw:32129175 dr:66621614 al:2934 bm:578 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:196447 misses:193 starving:0 dirty:0 changed:193
        act_log: used:0/2129 hits:218622 misses:256 starving:0 dirty:0 changed:256

== Disconnected DRBD ==

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 8.48373 s, 127 MB/s

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.145206 s, 3.5 MB/s

== Connected DRBD (no resync happening) ==

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 20.0519 s, 53.5 MB/s

asterix02 at asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 144.431 s, 3.5 kB/s

--
DI Florian Hackenberger
florian at hackenberger.at
www.hackenberger.at
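
One way to read the bs=512 figures above is as per-write latency rather than throughput, since each run issues 1000 direct writes one after another. The following is only a rough sketch of that arithmetic: the elapsed times are copied from the dd output in this post, while the function name `per_write_ms` and the run labels are illustrative and not part of any tool used here.

```python
# Average per-write latency implied by the dd runs quoted in the post.
# All elapsed times come from the "bs=512 count=1000 oflag=direct" runs;
# the labels and the helper name are illustrative assumptions.

WRITES = 1000  # count=1000 in every small-block run


def per_write_ms(elapsed_s: float, writes: int) -> float:
    """Return the average time per write request in milliseconds."""
    return elapsed_s / writes * 1000.0


# Elapsed seconds as reported by dd on asterix02.
RUNS = {
    "local disk /dev/sda6": 0.140179,
    "DRBD disconnected (default)": 0.14615,
    "DRBD connected (default)": 144.54,
    "DRBD disconnected (optimised)": 0.145206,
    "DRBD connected (optimised)": 144.431,
}

for label, elapsed in RUNS.items():
    print(f"{label:32s} {per_write_ms(elapsed, WRITES):10.3f} ms/write")
```

Read this way, a connected 512-byte write averages about 144.5 ms against about 0.146 ms when disconnected, close to a thousandfold difference, and the tuned configuration does not change it. The calculation does not identify the cause; it only suggests that the problem is a fixed delay per request rather than a bandwidth limit, which fits the much smaller gap seen in the bs=1G runs.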