Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Monday 04 October 2010 20:45:30 J. Ryan Earl wrote:
> On Mon, Oct 4, 2010 at 11:13 AM, Bart Coninckx <bart.coninckx at telenet.be> wrote:
> > JR,
> >
> > thank you for this very elaborate and technically rich reply. I will
> > certainly look into your suggestions about using Broadcom cards. I have
> > one dual port Broadcom card in this server, but I was using one port
> > combined with one port on an Intel e1000 dual port NIC in balance-rr to
> > provide for backup in the event a NIC goes down. Dual port NICs usually
> > share one chip for both ports, so in case of a problem with that chip the
> > complete DRBD link would be out.
> >
> > Reality shows this might be a bad idea though: a bonnie++ test against
> > the backend storage (RAID5 on 15K rpm disks) gives me 255 MB/sec write
> > performance, while the same test on the DRBD device drops this to
> > 77 MB/sec, even with the MTU set to 9000. It would be nice to get as
> > close as possible to the theoretical maximum, so a lot needs to be done
> > to get there. Step 1 would be moving everything to the Broadcom NIC.
> > Any other suggestions?
>
> 77 MB/sec is low for a single GigE link if your backing store can do
> 250 MB/sec. I think you should test on your hardware with a single GigE
> (no bonding) and work on getting close to the 110-120 MB/sec range before
> pursuing bonding optimization. Did you go through
> http://www.drbd.org/users-guide-emb/p-performance.html ?
>
> I use the following network sysctl tuning:
>
> # Tune TCP and network parameters
> net.ipv4.tcp_rmem = 4096 87380 16777216
> net.ipv4.tcp_wmem = 4096 65536 16777216
> net.core.rmem_max = 16777216
> net.core.wmem_max = 16777216
> vm.min_free_kbytes = 65536
> net.ipv4.tcp_max_syn_backlog = 8192
> net.core.netdev_max_backlog = 25000
> net.ipv4.tcp_no_metrics_save = 1
> net.ipv4.route.flush = 1
>
> This gives me up to 16 MB TCP windows and a considerable backlog to
> tolerate latency at high throughput. It is tuned for 40 Gbit IPoIB; you
> could reduce some of these numbers for slower connections.
>
> Anyway, what NICs are you using? Older interrupt-based NICs like the
> e1000/e1000e (older Intel) and tg3 (older Broadcom) will not perform as
> well as newer RDMA-based hardware, but they should be well above the
> 77 MB/sec range. Does your RAID controller have a battery-backed write
> cache? Have you tried RAID10?
>
> -JR

JR,

I have finalized some testing, and it seems there is quite a bit of difference
between what a "dd" test reports and what a bonnie++ test reports. dd shows me
an average of about 180 MB/sec, which I think is decent on two bonded gigabit
NICs, while bonnie++ shows me 100 MB/sec for block writes on ext3 on DRBD. My
conclusion probably needs to be that my hardware is performing acceptably, but
that my test tool is the wrong one.

B.
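For anyone reproducing the setup above, a balance-rr bond with jumbo frames on a
2010-era distribution is typically configured roughly as sketched here; the
interface name bond0, the miimon value and the MTU step are assumptions, not
details given in the thread:

# /etc/modprobe.conf (or a file under /etc/modprobe.d/)
alias bond0 bonding
options bond0 mode=balance-rr miimon=100

# raise the MTU on the bond once it is up (slaves usually inherit it)
ifconfig bond0 mtu 9000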
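JR's advice to baseline a single GigE link before tuning the bond can be done
with iperf against the replication address. A minimal sketch, assuming 10.0.0.2
is the peer's replication IP (a placeholder, not from the thread):

# on the peer (secondary) node
iperf -s

# on the local node: 30-second throughput test over the replication link
iperf -c 10.0.0.2 -t 30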
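The performance chapter JR links also covers DRBD-side knobs. For a DRBD
8.3-era configuration (current at the time of this thread), the relevant
section looks roughly like the sketch below; the resource name r0 and the
values are illustrative and would need benchmarking per setup:

resource r0 {
  net {
    max-buffers     8000;   # more receive buffers on the peer side
    max-epoch-size  8000;   # more write requests between barriers
    sndbuf-size     512k;   # larger TCP send buffer toward the peer
  }
  syncer {
    al-extents 3389;        # larger activity log, fewer metadata updates
  }
}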
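A footnote on the dd versus bonnie++ gap in the last message: a plain dd run
measures mostly page-cache writes, while bonnie++'s block-write phase is
dominated by data that actually reaches the DRBD device. A sketch of a fairer
dd comparison, assuming /mnt/drbd is the mounted ext3-on-DRBD filesystem (the
path and sizes are placeholders):

# buffered dd: numbers are inflated by the page cache
dd if=/dev/zero of=/mnt/drbd/testfile bs=1M count=4096

# force data to stable storage before dd reports throughput
dd if=/dev/zero of=/mnt/drbd/testfile bs=1M count=4096 conv=fdatasync

# or bypass the page cache entirely
dd if=/dev/zero of=/mnt/drbd/testfile bs=1M count=4096 oflag=direct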