Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
I'm currently testing DRBD over Infiniband/SDP vs Infiniband/IP.
My configuration is as follows:
DRBD 8.3.11 (Protocol C)
Linux kernel 2.6.39
OFED 1.5.4
Infiniband: Mellanox Technologies MT26428
My baseline test was a resync of the secondary node using Infiniband over IP, and I noted the sync rate. Once that completed, I ran some other very rudimentary tests using 'dd' and 'mkfs' to get a sense of actual performance. Then I shut down DRBD on both primary and secondary, modified the config to use SDP, started it back up, and retried all of the tests.
original:
address 10.0.99.108:7790 ;
to use SDP:
address sdp 10.0.99.108:7790 ;
No other config changes were made.
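For context, that address line lives in each host section of the resource definition; mine looks roughly like this (resource and host names below are placeholders, not my real ones; device, disk and addresses match the drbdsetup output further down):

resource r0 {
  on node-a {
    device    /dev/drbd0;
    disk      /dev/sdc1;
    address   sdp 10.0.99.108:7790;
    meta-disk internal;
  }
  on node-b {
    device    /dev/drbd0;
    disk      /dev/sdc1;
    address   sdp 10.0.99.107:7790;
    meta-disk internal;
  }
}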
After this, I issued "drbdadm invalidate-remote all" on the primary to force a re-sync. I noted my sync rate almost doubled, which was excellent.
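(For reference, I was reading the sync rate from the standard /proc/drbd status output on the primary; the "speed:" field there is the number I'm quoting:

# watch -n1 cat /proc/drbd
)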
Once the sync was complete I re-attempted my other tests. Amazingly, every test using Infiniband over SDP performed significantly worse than Infiniband over IP.
Is there anything that can explain this?
Here are my actual tests/results for each config:
=============================================================================
Infiniband over IP
=============================================================================
# dd if=/dev/zero of=/dev/drbd0 bs=512M count=4 oflag=direct
4+0 records in
4+0 records out
2147483648 bytes (2.1 GB) copied, 5.1764 s, 415 MB/s
# dd if=/dev/zero of=/dev/drbd0 bs=4k count=100 oflag=direct
100+0 records in
100+0 records out
409600 bytes (410 kB) copied, 0.0232504 s, 17.6 MB/s
# time mkfs.ext4 /dev/drbd0
real 3m54.848s
user 0m4.272s
sys 0m37.758s
=============================================================================
Infiniband over SDP
=============================================================================
# dd if=/dev/zero of=/dev/drbd0 bs=512M count=4 oflag=direct
4+0 records in
4+0 records out
2147483648 bytes (2.1 GB) copied, 12.507 s, 172 MB/s <--- (2.4x slower)
# dd if=/dev/zero of=/dev/drbd0 bs=4k count=100 oflag=direct
100+0 records in
100+0 records out
409600 bytes (410 kB) copied, 19.6418 s, 20.9 kB/s <--- (844x slower)
# time mkfs.ext4 /dev/drbd0
real 10m12.337s <--- (2.6x slower)
user 0m4.336s
sys 0m39.866s
=============================================================================
At the same time, I used the NetPIPE benchmark to test Infiniband SDP performance on its own, and it looks good. The NetPIPE runs were:
nodeA# LD_PRELOAD=libsdp.so NPtcp
nodeB# LD_PRELOAD=libsdp.so NPtcp -h 10.0.99.108
It consistently outperforms Infiniband/IP, as I would expect. So this leaves me thinking that either there is a problem with my DRBD config, or DRBD uses SDP differently for resync than for ongoing replication, or my testing is flawed.
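One sanity check I can think of, assuming the sdpnetstat utility from OFED's libsdp package is installed here, is to confirm the DRBD connection is actually sitting on an SDP socket rather than silently falling back to TCP:

# sdpnetstat -S | grep 7790

If nothing shows up on the DRBD port there, the resync speedup would need another explanation.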
Here is what my config looks like:
# drbdsetup /dev/drbd0 show
disk {
	size 0s _is_default; # bytes
	on-io-error pass_on _is_default;
	fencing dont-care _is_default;
	no-disk-flushes ;
	no-md-flushes ;
	max-bio-bvecs 0 _is_default;
}
net {
	timeout 60 _is_default; # 1/10 seconds
	max-epoch-size 8192;
	max-buffers 8192;
	unplug-watermark 16384;
	connect-int 10 _is_default; # seconds
	ping-int 10 _is_default; # seconds
	sndbuf-size 0 _is_default; # bytes
	rcvbuf-size 0 _is_default; # bytes
	ko-count 4;
	after-sb-0pri disconnect _is_default;
	after-sb-1pri disconnect _is_default;
	after-sb-2pri disconnect _is_default;
	rr-conflict disconnect _is_default;
	ping-timeout 5 _is_default; # 1/10 seconds
	on-congestion block _is_default;
	congestion-fill 0s _is_default; # byte
	congestion-extents 127 _is_default;
}
syncer {
	rate 524288k; # bytes/second
	after -1 _is_default;
	al-extents 3833;
	cpu-mask "15";
	on-no-data-accessible io-error _is_default;
	c-plan-ahead 0 _is_default; # 1/10 seconds
	c-delay-target 10 _is_default; # 1/10 seconds
	c-fill-target 0s _is_default; # bytes
	c-max-rate 102400k _is_default; # bytes/second
	c-min-rate 4096k _is_default; # bytes/second
}
protocol C;
_this_host {
	device minor 0;
	disk "/dev/sdc1";
	meta-disk internal;
	address sdp 10.0.99.108:7790;
}
_remote_host {
	address ipv4 10.0.99.107:7790;
}
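One variation I have not tried yet, and this is purely a guess on my part: sndbuf-size and rcvbuf-size are at their 0/auto defaults above, so I could try pinning them explicitly in the net section, e.g.:

net {
	sndbuf-size 512k;  # guess, not a tested value
	rcvbuf-size 512k;  # guess, not a tested value
}

I don't know whether libsdp/SDP honours the same buffer autotuning as TCP, so this may or may not matter.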
Any insight would be greatly appreciated.
--
Aj Mirani
Operations Manager, Tucows Inc.
416-535-0123 x1294