[DRBD-user] Drbd and network speed

Diego Remolina diego.remolina at physics.gatech.edu
Wed Sep 16 15:17:14 CEST 2009


> 
> linux bonding of _two_ nics in "balance-rr" mode, after some tuning
> of the network stack sysctls, should give you about 1.6 to 1.8 x
> the throughput of a single link.
> For a single TCP connection (as DRBD's bulk data socket is),
> bonding more than two will degrade throughput again,
> mostly due to packet reordering.

I've tried several bonding modes, and with balance-rr the most I got
was about 1.2 Gbps in netperf tests. IIRC, the other issue with
balance-rr is that packet reordering can trigger retransmissions, which
slows down the transfers.
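
For the record, my understanding of "two NICs in balance-rr plus network
stack sysctl tuning" is roughly the setup below (interface names, the
address and the buffer values are only examples, not a recommendation):

  # load bonding in balance-rr mode and enslave two GigE NICs
  modprobe bonding mode=balance-rr miimon=100
  ifconfig bond0 192.168.100.1 netmask 255.255.255.0 up
  ifenslave bond0 eth1 eth2

  # bigger TCP buffers so a single DRBD socket can fill ~2 Gbit/s, and
  # a higher reordering threshold so out-of-order segments are not
  # treated as losses
  sysctl -w net.core.rmem_max=16777216
  sysctl -w net.core.wmem_max=16777216
  sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
  sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
  sysctl -w net.ipv4.tcp_reordering=127

IIRC DRBD's own sndbuf-size (in the net section) may also need to be
raised to benefit from the larger kernel buffers.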

Any specific information on how to accomplish the 1.6 to 1.8x would be
really appreciated.

I am currently replicating two DRBD devices over separate bonds in
active-backup mode (two bonds with two Gigabit interfaces each, using
mode=1 miimon=100).
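
In case it is useful to anyone, that layout looks roughly like this
(module options, host and resource names, and addresses are only
illustrative; device/disk/meta-disk lines are omitted):

  # /etc/modprobe.conf -- two active-backup bonds from one module load
  alias bond0 bonding
  options bonding mode=1 miimon=100 max_bonds=2

  # bring up each bond and enslave two GigE ports
  ifconfig bond0 192.168.10.1 netmask 255.255.255.0 up
  ifenslave bond0 eth1 eth2
  ifconfig bond1 192.168.20.1 netmask 255.255.255.0 up
  ifenslave bond1 eth3 eth4

  # drbd.conf -- each resource replicates over its own bond/subnet
  resource r0 {
    on nodeA { address 192.168.10.1:7788; }   # goes over bond0
    on nodeB { address 192.168.10.2:7788; }
  }
  resource r1 {
    on nodeA { address 192.168.20.1:7789; }   # goes over bond1
    on nodeB { address 192.168.20.2:7789; }
  }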

My peak replication speed is ~120 MB/s and, as I stated before, my
backend is about 5 times faster. So if I could really accomplish the
1.6 to 1.8x with a few tweaks, that would be great.

OTOH, 10GbE copper NICs have reached decent pricing; the Intel cards
are ~US $600. Keep in mind you will also need a special cable (SFP+
Direct Attach, which is around US $50 for a 2-meter cable; I am sure
you can get better pricing on those).

http://www.intel.com/Products/Server/Adapters/10-Gb-AF-DA-DualPort/10-Gb-AF-DA-DualPort-overview.htm

Diego

> 
>> I guess the only other configuration that may help speed would be to
>> have a separate NIC per drbd device, if your backend is capable of
>> reading and writing from different locations on disk and feeding
>> several gigabit replication links. I think it should be able to with
>> SAS drives.
>>
>> e.g
>>
>> /dev/drbd0 uses eth1 for replication
>> /dev/drbd1 uses eth2 for replication
>> /dev/drbd2 uses eth3 for replication
>>
>> ... you get the idea...
> 
> right.
> 
> or, as suggested, go for 10GBit.
> 
> or "supersockets" (Dolphin nics).
> 
> or infiniband, which can also be used back-to-back (if you don't need
> an infiniband switch). Two options there:
>   IPoIB (use "connected" mode!),
>   or "SDP" (you need drbd 8.3.3 and OFED >= 1.4.2).
> 
> 
> but, long story short: DRBD cannot write faster than your bottleneck.
> 
> This is how DATA modifications flow in a "water hose" picture of DRBD.
> (view with fixed width font, please)
> 
>             \    WRITE    /
>              \           /
>          .---'           '---.
>     ,---'     REPLICATION     '---.
>    /              / \              \
>    \    WRITE    /   \    WRITE    /
>     \           /     \           /
>      \         /       \         /
>       |       |         '''| |'''
>       |       |            |N|
>       |       |            |E|
>       |       |            |T|
>       | LOCAL |            | |
>       |  DISK |            |L|
>       |       |            |I|
>       +-------+            |N|
>                            |K|
>                            | |
>                      .-----' '---------.
>                      \      WRITE      /
>                       \               /
>                        \             /
>                         |  REMOTE   |
>                         |   DISK    |
>                         +-----------+
> 
> 
> REPLICATION basically doubles the data (though, of course, DRBD uses
> zero copy for that, if technically possible).
> 
> Interpretation and implications for throughput should be obvious.
> You want the width of all those things as broad as possible.
> 
> For the latency aspect, consider the height of the vertical bars.
> You want all of them to be as short as possible.
> 
> Unfortunately, you sometimes cannot have it both short and wide ;)
> 
> But you of course knew all of that already.
> 
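
Putting my own numbers into that picture as a sanity check: sustained
write throughput is bounded by the narrowest part of the hose, i.e.
roughly

  min(local disk, replication link, remote disk)
    = min(~600 MB/s, ~120 MB/s, ~600 MB/s)
    = ~120 MB/s

which matches the ceiling I am seeing, so in my case the link really is
the limit.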


