Hello,

Have you seen my post on (quite) the same subject:
http://lists.linbit.com/pipermail/drbd-user/2011-July/016598.html ?

Based on your experiments and mine, it would seem that SDP does not like transferring small bits of data (not being a TCP/SDP guru, I don't know how to put it more appropriately). This would somehow correlate with my finding that 'sndbuf-size' needs to be increased as much as possible. It also correlates with the fact that the initial sync, or a "dd" test with a large block size, uses SDP very efficiently, while operations involving smaller "data bits" don't.

I'm curious whether playing with the 'sndbuf-size' and ib_sdp's 'recv_poll' parameters would affect your setup the same way they did mine (a rough sketch of both settings follows the quoted message below).

Cheers,
Cédric

On 19/08/11 21:45, Aj Mirani wrote:
> I'm currently testing DRBD over Infiniband/SDP vs Infiniband/IP.
>
> My configuration is as follows:
> DRBD 8.3.11 (Protocol C)
> Linux kernel 2.6.39
> OFED 1.5.4
> Infiniband: Mellanox Technologies MT26428
>
> My baseline test was a resync of the secondary node using Infiniband over IP, and I noted the sync rate. Once complete, I performed some other very rudimentary tests using 'dd' and 'mkfs' to get a sense of actual performance. Then I shut down DRBD on both primary and secondary, modified the config to use SDP, and started it back up to re-try all of the tests.
>
> original:
>     address 10.0.99.108:7790;
> to use SDP:
>     address sdp 10.0.99.108:7790;
>
> No other config changes were made.
>
> After this, I issued "drbdadm invalidate-remote all" on the primary to force a re-sync. My sync rate almost doubled, which was excellent.
>
> Once the sync was complete I re-attempted my other tests. Amazingly, every test using Infiniband over SDP performed significantly worse than Infiniband over IP.
>
> Is there anything that can explain this?
>
> Here are my actual tests/results for each config:
>
> =============================================================================
> Infiniband over IP
> =============================================================================
> # dd if=/dev/zero of=/dev/drbd0 bs=512M count=4 oflag=direct
> 4+0 records in
> 4+0 records out
> 2147483648 bytes (2.1 GB) copied, 5.1764 s, 415 MB/s
>
> # dd if=/dev/zero of=/dev/drbd0 bs=4k count=100 oflag=direct
> 100+0 records in
> 100+0 records out
> 409600 bytes (410 kB) copied, 0.0232504 s, 17.6 MB/s
>
> # time mkfs.ext4 /dev/drbd0
> real    3m54.848s
> user    0m4.272s
> sys     0m37.758s
>
> =============================================================================
> Infiniband over SDP
> =============================================================================
> # dd if=/dev/zero of=/dev/drbd0 bs=512M count=4 oflag=direct
> 4+0 records in
> 4+0 records out
> 2147483648 bytes (2.1 GB) copied, 12.507 s, 172 MB/s      <--- (2.4x slower)
>
> # dd if=/dev/zero of=/dev/drbd0 bs=4k count=100 oflag=direct
> 100+0 records in
> 100+0 records out
> 409600 bytes (410 kB) copied, 19.6418 s, 20.9 kB/s        <--- (844x slower)
>
> # time mkfs.ext4 /dev/drbd0
> real    10m12.337s                                        <--- (2.6x slower)
> user    0m4.336s
> sys     0m39.866s
>
> =============================================================================
>
> At the same time I've used the netpipe benchmark to test Infiniband SDP performance, and it looks good.
>
> netpipe benchmark using:
> nodeA# LD_PRELOAD=libsdp.so NPtcp
> nodeB# LD_PRELOAD=libsdp.so NPtcp -h 10.0.99.108
>
> It consistently outperforms Infiniband/IP, as I would expect.
> So this leaves me thinking that either there is a problem with my DRBD config, or DRBD uses SDP differently for re-sync than for keeping in sync, or my testing is flawed.
>
> Here is what my config looks like:
>
> # drbdsetup /dev/drbd0 show
> disk {
>     size                    0s _is_default; # bytes
>     on-io-error             pass_on _is_default;
>     fencing                 dont-care _is_default;
>     no-disk-flushes ;
>     no-md-flushes   ;
>     max-bio-bvecs           0 _is_default;
> }
> net {
>     timeout                 60 _is_default; # 1/10 seconds
>     max-epoch-size          8192;
>     max-buffers             8192;
>     unplug-watermark        16384;
>     connect-int             10 _is_default; # seconds
>     ping-int                10 _is_default; # seconds
>     sndbuf-size             0 _is_default; # bytes
>     rcvbuf-size             0 _is_default; # bytes
>     ko-count                4;
>     after-sb-0pri           disconnect _is_default;
>     after-sb-1pri           disconnect _is_default;
>     after-sb-2pri           disconnect _is_default;
>     rr-conflict             disconnect _is_default;
>     ping-timeout            5 _is_default; # 1/10 seconds
>     on-congestion           block _is_default;
>     congestion-fill         0s _is_default; # byte
>     congestion-extents      127 _is_default;
> }
> syncer {
>     rate                    524288k; # bytes/second
>     after                   -1 _is_default;
>     al-extents              3833;
>     cpu-mask                "15";
>     on-no-data-accessible   io-error _is_default;
>     c-plan-ahead            0 _is_default; # 1/10 seconds
>     c-delay-target          10 _is_default; # 1/10 seconds
>     c-fill-target           0s _is_default; # bytes
>     c-max-rate              102400k _is_default; # bytes/second
>     c-min-rate              4096k _is_default; # bytes/second
> }
> protocol C;
> _this_host {
>     device          minor 0;
>     disk            "/dev/sdc1";
>     meta-disk       internal;
>     address         sdp 10.0.99.108:7790;
> }
> _remote_host {
>     address         ipv4 10.0.99.107:7790;
> }
>
> Any insight would be greatly appreciated.
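For reference, here is a minimal sketch of the two settings mentioned above, assuming DRBD 8.3 configuration syntax and the OFED ib_sdp module. The resource name 'r0', the 2M buffer, and the recv_poll value of 700 are placeholders for illustration only, not values taken from either post.

drbd.conf fragment (both nodes) to enlarge the DRBD send buffer on the SDP-backed resource:

    resource r0 {                  # 'r0' is a placeholder resource name
        net {
            sndbuf-size 2M;        # 0 (the default shown above) means auto-tune; try progressively larger values
        }
    }

then apply the new net options to the running resource on both nodes:

    # drbdadm adjust all

The ib_sdp receive-poll setting can go in /etc/modprobe.d/ib_sdp.conf, taking effect the next time the module is loaded:

    options ib_sdp recv_poll=700

or, if the loaded module exposes the parameter as writable, at runtime:

    # echo 700 > /sys/module/ib_sdp/parameters/recv_poll

Re-running the bs=4k 'dd' and the mkfs test after each change should show fairly quickly whether the small-write penalty moves.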