Cédric Dufour - Idiap Research Institute cedric.dufour at idiap.ch
Fri Jul 29 13:50:14 CEST 2011

Hello DRBD users,

I wanted to share my experience running DRBD over (InfiniBand) SDP,
since I obtained results that may be interesting to some of you.

I started out very frustrated: while SDP (re-)synchronization was
slightly faster than TCP's (~5%), general write operations (e.g.
bonnie++ or dd if=/dev/zero ...) were horrible - ~50MB/s (SDP) vs.
~170MB/s (TCP) - the relevant DRBD configuration being:

  net {
    max-buffers 8192;
    max-epoch-size 8192;
    sndbuf-size 1024k;
  }
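
For anyone wanting to reproduce the raw write figures, here is a minimal
sketch of a dd-based measurement. The exact dd arguments are not given
in this post, so the block size, the conv=fdatasync flag and the target
path are assumptions, and both function names are hypothetical. It
needs GNU dd and GNU date.

```shell
# Convert a byte count and an elapsed time in nanoseconds to MB/s
# (1 MB = 10^6 bytes, as dd itself reports).
throughput_mbs() {
    awk -v b="$1" -v ns="$2" 'BEGIN { printf "%.1f\n", (b / 1e6) / (ns / 1e9) }'
}

# Sequentially write SIZE_MIB mebibytes of zeros to TARGET and print MB/s.
# TARGET and the dd options are assumptions, not the original command line.
# conv=fdatasync makes dd flush the data before exiting, so the page cache
# does not inflate the result.
write_throughput() {
    target=$1
    size_mib=$2
    start=$(date +%s%N)
    dd if=/dev/zero of="$target" bs=1M count="$size_mib" conv=fdatasync 2>/dev/null || return 1
    end=$(date +%s%N)
    throughput_mbs "$((size_mib * 1048576))" "$((end - start))"
}
```

Something like `write_throughput /mnt/drbd0/testfile 4096` (the path is
hypothetical) would then print a single MB/s figure for a file on the
DRBD-backed filesystem.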

After much playing around, I found that two parameters affect write
performance drastically:

1. DRBD's 'sndbuf-size' parameter
While 'sndbuf-size' does not affect TCP performance when set above
1024k, it noticeably affects SDP connections.
The more you increase 'sndbuf-size', the more SDP performance
improves: writes (cf. dd if=/dev/zero ...) jumped from ~50MB/s
(sndbuf-size=1024k) to ~120MB/s (sndbuf-size=10240k; the maximum allowed).
Fortunately, increasing this parameter does not affect small-write
performance (cf. bonnie++ -n ... ), with either TCP or SDP.
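
To make the 'sndbuf-size' change concrete, here is a small sketch that
prints the net section shown earlier with a different 'sndbuf-size'.
The function name is hypothetical; the 10240k ceiling and the other two
settings are simply the values quoted in this post, not something
checked against the DRBD sources.

```shell
# Print a DRBD 8.3-style net section for a given sndbuf-size in KiB.
# Rejects values above 10240k, the maximum reported in this post.
drbd_net_section() {
    sndbuf_kib=$1
    max_kib=10240
    case $sndbuf_kib in
        ''|*[!0-9]*)
            echo "sndbuf-size must be a whole number of KiB" >&2
            return 1 ;;
    esac
    if [ "$sndbuf_kib" -gt "$max_kib" ]; then
        echo "sndbuf-size ${sndbuf_kib}k exceeds ${max_kib}k" >&2
        return 1
    fi
    cat <<EOF
  net {
    max-buffers 8192;
    max-epoch-size 8192;
    sndbuf-size ${sndbuf_kib}k;
  }
EOF
}
```

Calling `drbd_net_section 10240` prints the section with the largest
value tested here; the output still has to be merged into the resource
definition by hand and applied (e.g. with drbdadm adjust).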

2. SDP module's (ib_sdp) 'recv_poll' parameter
The 'recv_poll' parameter (default: 700usec) also noticeably affects SDP
performance.
I lowered it (step by step) down to 100usec and finally got ~230MB/s raw
write performance (cf. dd if=/dev/zero ...) and bonnie++ write
performance 10-15% above TCP's.
Fortunately again, lowering this parameter does not affect small-write
performance (cf. bonnie++ -n ... ) nor general TCP-via-SDP performance
(iperf being stable at 9.1GBit/s via SDP vs. 3.2GBit/s via TCP).
Also, CPU usage remains the same.
Finally, nothing amiss shows up in the kernel log.
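
As a sketch of how 'recv_poll' can be set and checked: the first helper
prints a line in the standard modprobe.d "options" syntax, the second
reads the value back from sysfs. Both function names are hypothetical,
and the sysfs path only follows the generic kernel convention for module
parameters; whether ib_sdp exposes 'recv_poll' there (or allows changing
it at runtime) is an assumption.

```shell
# Print a modprobe.d "options" line setting ib_sdp's recv_poll (in usec).
sdp_recv_poll_option() {
    case $1 in
        ''|*[!0-9]*)
            echo "recv_poll must be a whole number of usec" >&2
            return 1 ;;
    esac
    printf 'options ib_sdp recv_poll=%s\n' "$1"
}

# Print the value the loaded module is currently using, if sysfs exposes it.
# The path follows the usual /sys/module/<module>/parameters/<name> layout.
sdp_recv_poll_current() {
    param=/sys/module/ib_sdp/parameters/recv_poll
    if [ -r "$param" ]; then
        cat "$param"
    else
        echo "recv_poll not readable at $param" >&2
        return 1
    fi
}
```

The output of `sdp_recv_poll_option 100` can be saved to a file under
/etc/modprobe.d/ (any *.conf name) so that the value is applied the next
time the module is loaded.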

In the end, I obtained performance that is consistent with what the
backing RAID array allows me to expect.

I hope this may be of interest to some of you.
If someone knowledgeable about DRBD and/or InfiniBand module parameters
reads this post, I would welcome feedback on why those parameters
affect DRBD (write) performance (but not (re-)synchronization).
Of course, those results may only be valid for the particular
hardware and/or software versions I have been playing with.

Speaking of...

Software versions:
 - Debian Squeeze 6.0.2 64-bit (kernel 2.6.32)
 - Custom-built OFED kernel modules (including patch related to
 - Custom-built DRBD 8.3.11 kernel module (along with re-packaged DRBD
8.3.11 user-space utilities)

Hardware:
 - Motherboard: Supermicro X7DWN+
 - CPU: Intel Xeon L5410 (2x) [HT disabled, VT enabled]
 - RAM: 32GB DDR2-667 (16x2GB)
 - NIC: Mellanox MT25204 [InfiniHost III Lx HCA] (ib_rdma_bw ~1.5GB/s;
ib_rdma_lat ~2usec)
 - HD/RAID: ICP 5165BR with 16x1TB HDs (RAID-6 - LVM2 - DRBD)



Cédric Dufour @ Idiap Research Institute
"No complaint is enough praise"

