[DRBD-user] Speeding up sync rate on fast links and storage

Gennadiy Nerubayev parakie at gmail.com
Thu Dec 18 18:33:20 CET 2008

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On Thu, Dec 18, 2008 at 7:24 AM, Lars Ellenberg
<lars.ellenberg at linbit.com>wrote:

> On Wed, Dec 17, 2008 at 04:17:00PM -0500, Parak wrote:
> > Hi all,
> >
> > I'm currently playing with DRBD (8.2.7) on 20Gb/s Infiniband, and it
> > seems that I'm running into the sync rate as the limiting speed
> > factor. The local storage on both nodes is identical (SAS array), and
> > has been benchmarked at about 650MB/s (or higher, depending on the
> > benchmark) to the native disk, and at about 550MB/s when writing to
> > it through a disconnected DRBD device. The network link for DRBD is
> > Infiniband as well (IPoIB), which has been benchmarked with netperf
> > at ~800MB/s.
> >
> > The fastest speed that I'm able to get from the DRBD sync with this
> > configuration is ~340MB/s, which limits the speed from my initiator
> > to that as well. Interestingly, I was also able to benchmark DRBD
> > sync speed over 10GbE, which, despite my repeated attempts to tweak
> > drbd.conf, the MTU, and the TCP kernel parameters, produced the same
> > ~340MB/s as the aforementioned IPoIB run.
> >
> > Here's the drbd.conf:
> >
> > global {
> >     usage-count yes;
> > }
> >
> > common {
> >   syncer {
> >      rate 900M;
>
> check if
>        cpu-mask 3;
> or      cpu-mask 7;
> or      cpu-mask f;
> or something like that
> has any effect.


No effect for these.
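
For reference, a rough sketch of how those variants fit into the config,
assuming the 8.2.x syncer syntax (cpu-mask is a hex bitmask of the CPUs
the DRBD threads may run on):

  common {
    syncer {
       rate 900M;
       cpu-mask 3;   # 3 = CPU0+CPU1; also tried 7 (CPU0-2) and f (CPU0-3)
    }
  }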


> >          }
> > }
> >
> > resource drbd0 {
> >
> >   protocol C;
> >
> >   handlers {
> >   }
> >
> >   startup {
> >     degr-wfc-timeout 30;
> >   }
> >
> >   disk {
> >     on-io-error   detach;
> >     fencing dont-care;
> >     no-disk-flushes;
> >     no-md-flushes;
> >     no-disk-drain;
> >     no-disk-barrier;
> >   }
> >
> >   net {
> >     ko-count 2;
> >     after-sb-1pri discard-secondary;
> >     sndbuf-size 1M;
>
> you can try sndbuf-size 0; (auto-tuning)


Slightly slower by about 20-30MB/s.


> and check whether tweaking
> /proc/sys/net/ipv4/tcp_rmem
> /proc/sys/net/ipv4/tcp_wmem
> /proc/sys/net/core/optmem_max
> /proc/sys/net/core/rmem_max
> /proc/sys/net/core/wmem_max
> and the like has any effect.


Tweaking these did have a positive effect, but they were already applied
in my case (per the recommendations from the Infiniband vendor and the
ixgb readme):

net.ipv4.tcp_timestamps=0
net.ipv4.tcp_sack=0
net.ipv4.tcp_rmem='10000000 10000000 10000000'
net.ipv4.tcp_wmem='10000000 10000000 10000000'
net.ipv4.tcp_mem='10000000 10000000 10000000'
net.core.rmem_max=524287
net.core.wmem_max=524287
net.core.rmem_default=524287
net.core.wmem_default=524287
net.core.optmem_max=524287
net.core.netdev_max_backlog=300000
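
For completeness, these can be set at runtime with sysctl -w, or kept in
/etc/sysctl.conf and reloaded with sysctl -p, roughly like so:

  sysctl -w net.ipv4.tcp_rmem='10000000 10000000 10000000'
  sysctl -w net.core.rmem_max=524287
  # ...and likewise for the remaining keys above
  sysctl -p   # reload /etc/sysctl.conf after editing it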


> check whether the drbd option
>         no-tcp-cork;
> has any positive/negative effect.


This one has a negative effect - about 70MB/s slower.
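
For reference, both of the variants above go in the net section of the
resource (or common) config, roughly like this; the comments note what I
measured:

  net {
    ko-count 2;
    after-sb-1pri discard-secondary;
    sndbuf-size 0;     # auto-tuning; ~20-30MB/s slower than 1M here
    no-tcp-cork;       # ~70MB/s slower here
  }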

> >   }
> >
> >   on srpt1 {
> >     device     /dev/drbd0;
> >     disk       /dev/sdb;
> >     address    10.0.0.2:7789;
> >     flexible-meta-disk  internal;
> >   }
> >
> >   on srpt2 {
> >     device     /dev/drbd0;
> >     disk       /dev/sdb;
> >     address    10.0.0.3:7789;
> >     flexible-meta-disk  internal;
> >   }
> > }
> >
> > Any advice/thoughts would be highly appreciated; thanks!
>
> cpu utilization during benchmarks?
> "wait state"?
> memory bandwidth?
> interrupt rate?


The CPU utilization during the sync for the top tasks looks like this (it
fluctuates, and is typically lower), and is similar on both nodes. I have
not seen any iowait:
Cpu(s):  1.2%us, 43.9%sy,  0.0%ni, 13.6%id,  0.0%wa,  0.5%hi, 40.9%si,
0.0%st

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
29513 root      16   0     0    0    0 R   69  0.0   7:31.92 drbd0_receiver
   32 root      10  -5     0    0    0 S   39  0.0  44:32.93 kblockd/0
29518 root      -3   0     0    0    0 S   18  0.0   1:55.06 drbd0_asender
21392 root      15   0     0    0    0 S    1  0.0   0:36.02 drbd0_worker

The memory bandwidth, benchmarked with ramspeed, is ~2500-2700MB/s on one
node and ~2200MB/s on the other, the latter being lower because that node
has fewer memory modules and less total memory.

Interrupt rate is ~13500-14000/sec on the primary and ~11500/sec on the
secondary during a sync.


> maybe bind or unbind NIC interrupts to cpus?
>  /proc/interrupts
>  /proc/irq/*/smp_affinity


They are on CPU0 currently, but would it help to move them if that CPU is
not being overly taxed?
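
If it would, this is roughly how I'd go about it (the IRQ number below is
just a placeholder taken from /proc/interrupts):

  # find the IRQ(s) used by the IB/10GbE interface
  grep -i -e mlx -e ib0 -e eth /proc/interrupts

  # pin e.g. IRQ 90 to CPU1 instead of CPU0
  # (smp_affinity is a hex CPU bitmask; 2 = CPU1)
  echo 2 > /proc/irq/90/smp_affinity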

Thanks,

-Gennadiy