On Wednesday 09 April 2008 17:02:51 Joris van Rooij wrote:
> Hi,
>
> I'm using two identical machines, both packing 32GB of RAM, 16 Xeon
> cores and a battery-backed hardware RAID-6 setup (SUN STK using Adaptec
> AAC). Both machines are running Debian, Linux 2.6.24.3 and DRBD 8.0.11.
> The two machines are connected using a dedicated gigabit ethernet link
> (using an Intel 82571EB) with MTU set to 9000.
Nice setup. :-)
About your dd tests: while I admire your efforts, all of them are slightly
misguided. Let me explain.
> Streamed dd to the local filesystem:
>
> # dd if=/dev/zero of=/tmp/testfile bs=4096 count=10000
> 10000+0 records in
> 10000+0 records out
> 40960000 bytes (41 MB) copied, 0.0920567 s, 445 MB/s
You're testing your memory and page cache here, not your I/O subsystem.
> Synced dd to the local filesystem:
>
> # dd if=/dev/zero of=/tmp/testfile bs=4096 count=10000 oflag=dsync
> 10000+0 records in
> 10000+0 records out
> 40960000 bytes (41 MB) copied, 5.39007 s, 7.6 MB/s
This is better (oflag=dsync). However, with the block size and count you
selected, you're mixing up a throughput measurement with a latency
measurement: 10000 synchronous 4 KB writes mostly tell you how long a single
small write takes. Can you re-run with bs=1G and count=1, then repeat that
3 times to get a reasonable average?
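To illustrate with the numbers from your dsync run, here is a rough sketch
that turns elapsed time, write count and block size into per-write figures.
The function name per_write_stats is made up for this example; the inputs
are taken straight from your dd output above.

```shell
#!/bin/sh
# per_write_stats SECONDS COUNT BLOCK_BYTES
# Prints average latency per write, writes per second, and the resulting
# throughput in MB/s (10^6 bytes per second, the unit dd reports).
per_write_stats() {
    awk -v s="$1" -v n="$2" -v b="$3" 'BEGIN {
        printf "latency_ms=%.3f writes_per_s=%.0f mb_per_s=%.1f\n",
            s / n * 1000, n / s, n * b / s / 1000000
    }'
}

# Values from the synced dd run: 5.39007 s for 10000 writes of 4096 bytes.
per_write_stats 5.39007 10000 4096
# prints: latency_ms=0.539 writes_per_s=1855 mb_per_s=7.6
```

So the 7.6 MB/s is simply about 0.54 ms per synchronous 4 KB write, which
is a latency figure rather than a throughput figure.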
Sadly, you repeated these errors in all your other test runs, so I'm
afraid you'll have to re-run those as well.
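If it helps with the re-runs, something along these lines could do the
repetition and averaging for you. Take it as a sketch only: the function
names (parse_dd_summary, average_mbps, bench_dd) are invented for this
example, it assumes GNU dd (for oflag=dsync and the "copied" summary line),
and it reports the mean of the per-run throughput values.

```shell
#!/bin/sh
# Read dd's summary on stdin and print "<bytes> <seconds>".
# Works on lines like:
#   40960000 bytes (41 MB) copied, 0.0920567 s, 445 MB/s
parse_dd_summary() {
    awk '/copied/ {
        n = split($0, part, ", ")      # last part is the rate, the one before is the time
        split(part[n - 1], t, " ")
        print $1, t[1]
    }'
}

# Read "<bytes> <seconds>" lines on stdin and print the mean throughput
# in MB/s (10^6 bytes per second), one decimal place.
average_mbps() {
    awk '$2 > 0 { sum += $1 / $2 / 1000000; runs++ }
         END { if (runs) printf "%.1f\n", sum / runs }'
}

# bench_dd TARGET [BS] [COUNT] [RUNS]
# Defaults follow the suggestion above: bs=1G, count=1, three runs.
bench_dd() {
    target=$1
    bs=${2:-1G}
    count=${3:-1}
    runs=${4:-3}
    i=0
    while [ "$i" -lt "$runs" ]; do
        # LC_ALL=C keeps the decimal point and wording predictable.
        LC_ALL=C dd if=/dev/zero of="$target" bs="$bs" count="$count" \
            oflag=dsync 2>&1 | parse_dd_summary
        i=$((i + 1))
    done | average_mbps
}
```

Usage would be e.g. "bench_dd /tmp/testfile" for the local filesystem, and
the same call with a file on the DRBD-backed filesystem for comparison.
Note that bs=1G makes dd allocate a 1 GiB buffer, which is no problem with
your 32GB of RAM.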
> Some snippets of configuration (from drbdsetup show):
> disk {
> size 0s _is_default; # bytes
> on-io-error detach;
> fencing resource-only;
> }
> net {
> timeout 60 _is_default; # 1/10 seconds
> max-epoch-size 16000;
> max-buffers 16000;
> unplug-watermark 128 _is_default;
> connect-int 10 _is_default; # seconds
> ping-int 10 _is_default; # seconds
> sndbuf-size 524288; # bytes
> ko-count 0 _is_default;
> cram-hmac-alg "sha1";
> shared-secret :);
> after-sb-0pri disconnect _is_default;
> after-sb-1pri disconnect _is_default;
> after-sb-2pri disconnect _is_default;
> rr-conflict disconnect _is_default;
> ping-timeout 5 _is_default; # 1/10 seconds
> }
> syncer {
> rate 51200k; # bytes/second
> after -1 _is_default;
> al-extents 257;
This is _extremely_ low for your I/O subsystem. Try 1801, or even 3389.
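To put those numbers in perspective, here is a small sketch of how much of
the device the activity log can keep "hot" at once. It assumes the 4 MiB
per extent stated in the drbd.conf documentation; the function name
al_hot_area_mib is made up for this example.

```shell
#!/bin/sh
# al_hot_area_mib AL_EXTENTS
# Size of the active set in MiB, assuming each activity log extent
# covers 4 MiB of the backing storage (assumption, see above).
al_hot_area_mib() {
    echo $(( $1 * 4 ))
}

for n in 257 1801 3389; do
    echo "al-extents $n -> $(al_hot_area_mib "$n") MiB"
done
# prints:
# al-extents 257 -> 1028 MiB
# al-extents 1801 -> 7204 MiB
# al-extents 3389 -> 13556 MiB
```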
Please re-run your tests considering the suggestions I made above, and we'll
go from there.
Cheers,
Florian
--
: Florian G. Haas
: LINBIT Information Technologies GmbH
: Vivenotgasse 48, A-1120 Vienna, Austria
When replying, there is no need to CC my personal address.
I monitor the list on a daily basis. Thank you.