Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
> linux bonding of _two_ nics in "balance-rr" mode, after some tuning
> of the network stack sysctls, should give you about 1.6 to 1.8 x
> the throughput of a single link.
> For a single TCP connection (as DRBD's bulk data socket is),
> bonding more than two will degrade throughput again,
> mostly due to packet reordering.

I've tried several bonding modes, and with balance-rr the most I got
was about 1.2 Gbps in netperf tests. IIRC, the other issue with
balance-rr is that it can cause TCP retransmissions, which slow down
the transfers. Any specific information or howto on accomplishing the
1.6 to 1.8 x would be really appreciated.

I am currently replicating two DRBD devices over separate bonds in
active-backup mode (two bonds with two Gigabit interfaces each, using
mode=1 miimon=100). My peak replication speed is ~120 MB/s, and as I
stated before, my backend is about five times faster. So if I could
really accomplish the 1.6 to 1.8 x with a few tweaks, that would be
great.

OTOH, 10GbE copper NICs have reached decent pricing; the Intel cards
are ~US $600. Please keep in mind you will need a special cable (SFP+
Direct Attach, which is around US $50 for a 2 meter cable; I am sure
you can get better pricing on those).

http://www.intel.com/Products/Server/Adapters/10-Gb-AF-DA-DualPort/10-Gb-AF-DA-DualPort-overview.htm

Diego

>> I guess the only other configuration that may help speed would be to
>> have a separate NIC per drbd device, if your backend is capable of
>> reading and writing from different locations on disk and feeding
>> several gigabit replication links. I think it should be able to with
>> SAS drives.
>>
>> e.g.
>>
>> /dev/drbd0 uses eth1 for replication
>> /dev/drbd1 uses eth2 for replication
>> /dev/drbd2 uses eth3 for replication
>>
>> ... you get the idea...
>
> right.
>
> or, as suggested, go for 10GBit.
>
> or "supersockets" (Dolphin nics).
>
> or infiniband, which can also be used back-to-back (if you don't have
> the need for an infiniband switch)
> two options there:
> IPoIB (use "connected" mode!).
> or "SDP" (you need drbd 8.3.3 and OFED >= 1.4.2).
>
> but, long story short: DRBD cannot write faster than your bottleneck.
>
> This is how DATA modifications flow in a "water hose" picture of DRBD.
> (view with fixed width font, please)
>
>          \  WRITE  /
>           \       /
>        .---'     '---.
>   ,---'  REPLICATION  '---.
>  /  /                  \   \
>  \  WRITE  /      \  WRITE  /
>   \       /        \       /
>    \     /          \     /
>   |       |      '''| |'''
>   |       |         |N|
>   |       |         |E|
>   |       |         |T|
>   | LOCAL |         | |
>   | DISK  |         |L|
>   |       |         |I|
>   +-------+         |N|
>                     |K|
>                     | |
>              .-----' '---------.
>               \     WRITE     /
>                \             /
>                 \           /
>                 |  REMOTE   |
>                 |   DISK    |
>                 +-----------+
>
> REPLICATION basically doubles the data (though, of course, DRBD uses
> zero copy for that, if technically possible).
>
> Interpretation and implications for throughput should be obvious.
> You want the width of all those things as broad as possible.
>
> For the latency aspect, consider the height of the vertical bars.
> You want all of them to be as short as possible.
>
> Unfortunately, you sometimes cannot have it both short and wide ;)
>
> But you of course knew all of that already.
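
For reference, a minimal sketch of the two-NIC balance-rr setup
discussed above, using the kernel's sysfs bonding interface. The
interface names (eth1, eth2), the address, and the sysctl values are
illustrative assumptions, not tested recommendations. Note that
~120 MB/s is already close to the ~125 MB/s theoretical payload of a
single Gigabit link, which is consistent with the active link of a
mode=1 bond being the bottleneck in the water-hose picture:

    # create a two-slave balance-rr bond (bond0 appears when the
    # module loads); miimon=100 polls link state every 100 ms
    modprobe bonding mode=balance-rr miimon=100

    # slaves must be down before they can be enslaved
    ip link set eth1 down
    ip link set eth2 down
    echo +eth1 > /sys/class/net/bond0/bonding/slaves
    echo +eth2 > /sys/class/net/bond0/bonding/slaves

    ip addr add 10.0.0.1/24 dev bond0   # example address
    ip link set bond0 up

    # let TCP tolerate the packet reordering balance-rr causes,
    # instead of treating it as loss (default is 3; 127 is a guess
    # to be tuned against netperf results)
    sysctl -w net.ipv4.tcp_reordering=127

    # larger socket buffers help keep both striped links busy
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"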
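
The "separate NIC per drbd device" idea from the quoted part could
look roughly like this in drbd.conf (8.3-style syntax). Hostnames,
disks, and addresses are made up; each resource's address simply sits
on a subnet bound to a different NIC, and each resource gets its own
port:

    # /etc/drbd.conf fragment -- one gigabit link per resource.
    # 10.0.1.0/24 lives on eth1, 10.0.2.0/24 on eth2 (example
    # addressing).
    resource r0 {
      device    /dev/drbd0;
      disk      /dev/sdb1;
      meta-disk internal;
      on nodea { address 10.0.1.1:7788; }
      on nodeb { address 10.0.1.2:7788; }
    }

    resource r1 {
      device    /dev/drbd1;
      disk      /dev/sdc1;
      meta-disk internal;
      on nodea { address 10.0.2.1:7789; }
      on nodeb { address 10.0.2.2:7789; }
    }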
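
And for the IPoIB option, "connected" mode is toggled per interface
through sysfs; the interface name ib0 and the MTU value are again
assumptions:

    # switch IPoIB from datagram to connected mode
    echo connected > /sys/class/net/ib0/mode
    # connected mode allows a much larger MTU, which helps throughput
    ip link set ib0 mtu 65520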