[DRBD-user] DRBD over Infiniband (SDP) performance oddity

Aj Mirani aj at tucows.com
Fri Aug 19 21:45:58 CEST 2011



I'm currently testing DRBD over Infiniband/SDP vs Infiniband/IP.  

My configuration is as follows:
DRBD 8.3.11 (Protocol C)
Linux kernel 2.6.39 
OFED 1.5.4
Infiniband: Mellanox Technologies MT26428

My baseline test was a resync of the secondary node using Infiniband over IP, noting the sync rate.  Once complete, I performed some other very rudimentary tests using 'dd' and 'mkfs' to get a sense of actual performance.  Then I shut down DRBD on both primary and secondary, modified the config to use SDP, and started it back up to re-run all of the tests.

original:
    address   10.0.99.108:7790 ;
to use SDP:
    address   sdp 10.0.99.108:7790 ;

No other config changes were made.
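For context, here is roughly what the resource section looks like after the change.  This is a sketch only: the resource name r0 and hostnames nodeA/nodeB are placeholders; the addresses and disk are taken from my drbdsetup output below.

```
resource r0 {
    protocol C;
    on nodeA {
        device    /dev/drbd0;
        disk      /dev/sdc1;
        address   sdp 10.0.99.108:7790;   # "sdp" prefix selects the SDP transport
        meta-disk internal;
    }
    on nodeB {
        device    /dev/drbd0;
        disk      /dev/sdc1;
        address   sdp 10.0.99.107:7790;
        meta-disk internal;
    }
}
```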

After this, I issued "drbdadm invalidate-remote all" on the primary to force a re-sync.  I noted my sync rate almost doubled, which was excellent.

Once the sync was complete I re-ran my other tests.  Amazingly, every test using Infiniband over SDP performed significantly worse than Infiniband over IP.  

Is there anything that can explain this? 


Here are my actual tests/results for each config:
=============================================================================
Infiniband over IP
=============================================================================
# dd if=/dev/zero of=/dev/drbd0 bs=512M count=4 oflag=direct
4+0 records in
4+0 records out
2147483648 bytes (2.1 GB) copied, 5.1764 s, 415 MB/s

# dd if=/dev/zero of=/dev/drbd0 bs=4k count=100 oflag=direct
100+0 records in
100+0 records out
409600 bytes (410 kB) copied, 0.0232504 s, 17.6 MB/s

# time mkfs.ext4 /dev/drbd0
real    3m54.848s
user    0m4.272s
sys     0m37.758s


=============================================================================
Infiniband over SDP
=============================================================================
# dd if=/dev/zero of=/dev/drbd0 bs=512M count=4 oflag=direct
4+0 records in
4+0 records out
2147483648 bytes (2.1 GB) copied, 12.507 s, 172 MB/s    <--- (2.4x slower)

# dd if=/dev/zero of=/dev/drbd0 bs=4k count=100 oflag=direct
100+0 records in
100+0 records out
409600 bytes (410 kB) copied, 19.6418 s, 20.9 kB/s      <--- (844x slower)

# time mkfs.ext4 /dev/drbd0
real    10m12.337s                                      <--- (4.25x slower)
user    0m4.336s
sys     0m39.866s


=============================================================================

In parallel, I used the netpipe benchmark to test Infiniband SDP performance, and it looks good.  

netpipe benchmark using:
    nodeA# LD_PRELOAD=libsdp.so NPtcp 
    nodeB# LD_PRELOAD=libsdp.so NPtcp -h 10.0.99.108

It consistently outperforms Infiniband/IP, as I would expect.  So this leaves me thinking there is either a problem with my DRBD config, or DRBD is using SDP differently for resync vs. ongoing replication, or my testing is flawed.


Here is what my config looks like:
# drbdsetup /dev/drbd0 show
disk {
        size                    0s _is_default; # bytes
        on-io-error             pass_on _is_default;
        fencing                 dont-care _is_default;
        no-disk-flushes ;
        no-md-flushes   ;
        max-bio-bvecs           0 _is_default;
}
net {
        timeout                 60 _is_default; # 1/10 seconds
        max-epoch-size          8192;
        max-buffers             8192;
        unplug-watermark        16384;
        connect-int             10 _is_default; # seconds
        ping-int                10 _is_default; # seconds
        sndbuf-size             0 _is_default; # bytes
        rcvbuf-size             0 _is_default; # bytes
        ko-count                4;
        after-sb-0pri           disconnect _is_default;
        after-sb-1pri           disconnect _is_default;
        after-sb-2pri           disconnect _is_default;
        rr-conflict             disconnect _is_default;
        ping-timeout            5 _is_default; # 1/10 seconds
        on-congestion           block _is_default;
        congestion-fill         0s _is_default; # byte
        congestion-extents      127 _is_default;
}
syncer {
        rate                    524288k; # bytes/second
        after                   -1 _is_default;
        al-extents              3833;
        cpu-mask                "15";
        on-no-data-accessible   io-error _is_default;
        c-plan-ahead            0 _is_default; # 1/10 seconds
        c-delay-target          10 _is_default; # 1/10 seconds
        c-fill-target           0s _is_default; # bytes
        c-max-rate              102400k _is_default; # bytes/second
        c-min-rate              4096k _is_default; # bytes/second
}
protocol C;
_this_host {
        device                  minor 0;
        disk                    "/dev/sdc1";
        meta-disk               internal;
        address                 sdp 10.0.99.108:7790;
}
_remote_host {
        address                 ipv4 10.0.99.107:7790;
}


Any insight would be greatly appreciated.


-- 
Aj Mirani  
Operations Manager, Tucows Inc.
416-535-0123 x1294


