Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
See
http://www.drbd.org/users-guide/re-drbdconf.html
http://fghaas.wordpress.com/2007/06/22/performance-tuning-drbd-setups/
and the comments below.

Good luck.
Cristian Mammoli - Apra Sistemi wrote:
> I have 2 drbd resources shared between 2 IBM x3500 M2 servers.
> Storage is composed of 6x 146GB 15k RPM SAS drives connected to a
> ServeRAID MR10i (with BBU and write-back enabled).
> Each resource has a dedicated dual-gigabit bond (balance-rr).
>
> Testing the network speed with iperf, I get ~1.9 Gbit/s on both links.
> Testing the drbd resources in disconnected mode, I get:
>
> [root@srvha01 ~]# dd if=/dev/zero of=/datastore1/test bs=1024k
> count=1000 oflag=direct
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 3.6059 seconds, 291 MB/s
>
> Doing the same with the drbd device connected, I barely reach 100 MB/s.
>
> So where's the bottleneck?
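Quick back-of-the-envelope before digging into the config: with
protocol C every write has to reach the peer before it completes, and
1.9 Gbit/s is roughly 1.9 / 8 = 0.24 GB/s, i.e. about 237 MB/s. So the
network, not the 291 MB/s array, is the hard ceiling here, and 100 MB/s
means less than half of it is being used; that points at tuning rather
than hardware.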
>
> OS is CentOS 5.3 x86_64 and drbd version is 8.2
>
> drbd.conf follows:
> global {
> usage-count no;
> }
>
> common {
> syncer {
> rate 700000K;
Try setting ^^ a little lower for starters, just to see; start with
150M and work your way up. This can also be adjusted on the fly with:
drbdsetup /dev/drbd1 syncer -r 150M
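To make it stick across restarts, the equivalent in drbd.conf would
look something like this ("drbdadm adjust <resource>" then applies it
without a restart; <resource> is your resource name):

    common {
      syncer {
        rate 150M;  # leave headroom for application I/O, then
                    # raise and re-test
      }
    }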
> al-extents 257;
Try increasing ^^ to a larger prime number; 1801 is a good starting
point.
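(Background: al-extents sizes DRBD's activity log; a larger value
means fewer meta-data updates during long sequential writes, at the
cost of a longer resync after a primary crash.) As a sketch:

    common {
      syncer {
        al-extents 1801;  # bigger activity log, fewer meta-data writes
      }
    }

Again, "drbdadm adjust <resource>" should pick it up without downtime.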
> verify-alg md5;
> }
>
> protocol C;
>
> handlers {
> pri-on-incon-degr "echo b > /proc/sysrq-trigger ; reboot -f";
> pri-lost-after-sb "echo b > /proc/sysrq-trigger ; reboot -f";
> pri-lost "echo b > /proc/sysrq-trigger ; reboot -f";
> local-io-error "echo b > /proc/sysrq-trigger ; reboot -f";
> outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater -t 5";
> }
>
> startup {
> # wfc-timeout 0;
> degr-wfc-timeout 120; # 2 minutes.
> # wait-after-sb;
> # become-primary-on both;
> }
>
> disk {
> on-io-error detach;
> fencing resource-only;
> no-disk-flushes;
> no-md-flushes;
> }
>
> net {
> # max-buffers 20480;
> # max-epoch-size 16384;
> # unplug-watermark 128;
Depending on your RAID controller, try ^^ at both extremes, high and
low, and benchmark each.
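If I remember the 8.x limits right, the allowed range is 16 to 131072
with a default of 128, so the two extremes would look something like:

    net {
      unplug-watermark 16;      # kick the backing device early
      # ...or:
      unplug-watermark 131072;  # let requests queue up; sometimes
                                # better with BBU write-back controllers
    }

Re-run the dd test above after each change and see which one your
MR10i prefers.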
> sndbuf-size 1M;
> after-sb-0pri discard-zero-changes;
> after-sb-1pri discard-secondary;
> after-sb-2pri call-pri-lost-after-sb;
> rr-conflict call-pri-lost;
{cut...}