Hi!
I'm using DRBD as a backend for PostgreSQL in a two-node HA setup and I'm
experiencing severe slowdowns. An analysis follows below. All results
were obtained with Linux-HA switched off and nothing resource-intensive
running on the servers. I ran each benchmark at least three times to
make sure I'm not falling prey to statistical outliers. Could someone
please give me some advice on how to determine the root cause of the
slowdown? I suspect the write cache of the RAID controller is not being
used, but unfortunately my version of DRBD (8.0.11) does not seem to
support the no-disk-flushes and no-md-flushes options.
Please note the following two performance figures in particular:
Speed for: sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
Disconnected: 3.5 MB/s
Connected: 3.5 kB/s
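
To put these two figures in terms of per-request latency, here is a small Python sketch. It is only arithmetic on the dd timings from the default-configuration runs further down, and the helper name per_write_ms is made up for illustration. It assumes that with oflag=direct dd waits for each 512-byte write to complete before issuing the next one, so elapsed time divided by the write count is roughly the average latency of a single write.

```python
# Timings copied from the dd runs in this message
# (bs=512 count=1000 oflag=direct on /dev/drbd0, default configuration).
WRITES = 1000


def per_write_ms(elapsed_s, writes=WRITES):
    """Average time per write in milliseconds (hypothetical helper)."""
    return elapsed_s / writes * 1000.0


disconnected_ms = per_write_ms(0.14615)  # DRBD disconnected
connected_ms = per_write_ms(144.54)      # DRBD connected, no resync

print(f"disconnected: {disconnected_ms:.3f} ms per write")
print(f"connected:    {connected_ms:.2f} ms per write")
print(f"slowdown:     {connected_ms / disconnected_ms:.0f}x")
```

That works out to about 0.146 ms per write when disconnected versus about 144.5 ms when connected, a factor of roughly 1000.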
= Network latency =
asterix02@asterix02:/usr/lib/lmbench/bin/i686-pc-linux-gnu$ ./lat_tcp 192.168.1.1
TCP latency using 192.168.1.1: 0.2479 microseconds
asterix01@asterix01:/usr/lib/lmbench/bin/i686-pc-linux-gnu$ ./lat_tcp 192.168.1.2
TCP latency using 192.168.1.2: 0.2463 microseconds
= Network throughput =
asterix01@asterix01:/usr/lib/lmbench/bin/i686-pc-linux-gnu$ iperf -f M -c 192.168.1.2
------------------------------------------------------------
Client connecting to 192.168.1.2, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.1 port 39381 connected with 192.168.1.2 port 5001
[ 3] 0.0-10.0 sec 1116 MBytes 112 MBytes/sec
= Local disk (IBM ServeRAID RAID 1) =
asterix01@asterix01:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 9.97766 s, 108 MB/s
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 9.84385 s, 109 MB/s
asterix01@asterix01:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.136389 s, 3.8 MB/s
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/sda6 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.140179 s, 3.7 MB/s
= DRBD default configuration =
asterix02@asterix02:/$ sudo drbdsetup /dev/drbd0 show
disk {
    size 0s _is_default; # bytes
    on-io-error detach;
    fencing dont-care _is_default;
}
syncer {
    rate 33792k; # bytes/second
    after -1 _is_default;
    al-extents 127 _is_default;
}
_this_host {
    device "/dev/drbd0";
    disk "/dev/sda5";
    meta-disk internal;
}
== Disconnected DRBD ==
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 43.7656 s, 24.5 MB/s
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.14615 s, 3.5 MB/s
== Connected DRBD (no resync happening) ==
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 53.9678 s, 19.9 MB/s
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 144.54 s, 3.5 kB/s
= Optimised DRBD =
disk {
    size 0s _is_default; # bytes
    on-io-error detach;
    fencing dont-care _is_default;
}
net {
    timeout 20; # 1/10 seconds
    max-epoch-size 2048 _is_default;
    max-buffers 8192;
    unplug-watermark 8192;
    connect-int 10 _is_default; # seconds
    ping-int 1; # seconds
    sndbuf-size 131070 _is_default; # bytes
    ko-count 0 _is_default;
    after-sb-0pri disconnect _is_default;
    after-sb-1pri disconnect _is_default;
    after-sb-2pri disconnect _is_default;
    rr-conflict disconnect _is_default;
    ping-timeout 5 _is_default; # 1/10 seconds
}
syncer {
    rate 33792k; # bytes/second
    after -1 _is_default;
    al-extents 2129;
}
protocol C;
_this_host {
    device "/dev/drbd0";
    disk "/dev/sda5";
    meta-disk internal;
    address 192.168.1.2:7788;
}
_remote_host {
    address 192.168.1.1:7788;
}
asterix02@asterix02:/$ cat /proc/drbd
version: 8.0.11 (api:86/proto:86)
GIT-hash: b3fe2bdfd3b9f7c2f923186883eb9e2a0d3a5b1b build by phil@mescal, 2008-02-12 11:56:43
 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:1048576 nr:0 dw:32129175 dr:66621614 al:2934 bm:578 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:196447 misses:193 starving:0 dirty:0 changed:193
        act_log: used:0/2129 hits:218622 misses:256 starving:0 dirty:0 changed:256
== Disconnected DRBD ==
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 8.48373 s, 127 MB/s
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.145206 s, 3.5 MB/s
== Connected DRBD (no resync happening) ==
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 20.0519 s, 53.5 MB/s
asterix02@asterix02:/$ sudo dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 144.431 s, 3.5 kB/s
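
For comparison, the large-block (bs=1G) results on /dev/drbd0 can be lined up the same way. The following Python sketch only recomputes throughput from the byte counts and timings dd printed above. The names RUNS and mb_per_s are illustrative, and it assumes 1 MB = 10^6 bytes, which matches how dd reports its figures here.

```python
# (bytes, seconds) for the bs=1G count=1 oflag=direct runs on /dev/drbd0,
# copied from the dd output in this message.
RUNS = {
    ("default", "disconnected"): (1073741824, 43.7656),
    ("default", "connected"): (1073741824, 53.9678),
    ("optimised", "disconnected"): (1073741824, 8.48373),
    ("optimised", "connected"): (1073741824, 20.0519),
}


def mb_per_s(nbytes, seconds):
    """Throughput in MB/s, with 1 MB = 10**6 bytes (hypothetical helper)."""
    return nbytes / seconds / 1e6


for config in ("default", "optimised"):
    off = mb_per_s(*RUNS[(config, "disconnected")])
    on = mb_per_s(*RUNS[(config, "connected")])
    # Fraction of the disconnected throughput that survives once connected.
    print(f"{config}: {off:.1f} MB/s disconnected, "
          f"{on:.1f} MB/s connected, {on / off:.0%} retained")
```

Large sequential writes lose somewhere between a fifth and three fifths of their throughput once the peer is connected, whereas the 512-byte writes drop by about three orders of magnitude in both configurations. That would be consistent with a per-request latency problem rather than a bandwidth limit, although these numbers alone do not prove it.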
--
DI Florian Hackenberger
florian at hackenberger.at
www.hackenberger.at