Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Thu, Dec 18, 2008 at 12:33:20PM -0500, Gennadiy Nerubayev wrote:
> > check if
> >     cpu-mask 3;
> > or  cpu-mask 7;
> > or  cpu-mask f;
> > or something like that
> > has any effect.
>
> No effect for these.

> > you can try
> >     sndbuf-size 0; (auto-tuning)
>
> Slightly slower by about 20-30MB/s.

I did not expect that.

> > and check whether tweaking
> >     /proc/sys/net/ipv4/tcp_rmem
> >     /proc/sys/net/ipv4/tcp_wmem
> >     /proc/sys/net/core/optmem_max
> >     /proc/sys/net/core/rmem_max
> >     /proc/sys/net/core/wmem_max
> > and the like has any effect.
>
> These did have a positive effect, but they were already applied in my case
> (as per recommendations from the Infiniband vendor and ixgb readme):
>
>     net.ipv4.tcp_timestamps=0
>     net.ipv4.tcp_sack=0
>     net.ipv4.tcp_rmem='10000000 10000000 10000000'
>     net.ipv4.tcp_wmem='10000000 10000000 10000000'
>     net.ipv4.tcp_mem='10000000 10000000 10000000'
>     net.core.rmem_max=524287
>     net.core.wmem_max=524287
>     net.core.rmem_default=524287
>     net.core.wmem_default=524287
>     net.core.optmem_max=524287
>     net.core.netdev_max_backlog=300000

> > check whether the drbd option
> >     no-tcp-cork;
> > has any positive/negative effect.
>
> This one has a negative effect - about 70MB/s slower.

ok. this is expected to reduce latency somewhat,
and latency vs throughput is a typical tradeoff.

> > cpu utilization during benchmarks?
> > "wait state"?
> > memory bandwidth?
> > interrupt rate?
>
> The cpu utilization during the sync for the top tasks looks like so
> (fluctuates, and typically lower), and is similar on both nodes.
> I have not seen any iowait:
>
> Cpu(s):  1.2%us, 43.9%sy,  0.0%ni, 13.6%id,  0.0%wa,  0.5%hi, 40.9%si,  0.0%st
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 29513 root      16   0     0    0    0 R   69  0.0   7:31.92 drbd0_receiver
>    32 root      10  -5     0    0    0 S   39  0.0  44:32.93 kblockd/0
> 29518 root      -3   0     0    0    0 S   18  0.0   1:55.06 drbd0_asender
> 21392 root      15   0     0    0    0 S    1  0.0   0:36.02 drbd0_worker
>
> The memory bandwidth I've benchmarked with ramspeed to be ~2500-2700Mb/s on
> one node, and ~2200Mb/s on the other, due to it having fewer memory modules
> and less memory in total.
>
> Interrupt rate is ~13500-14000/sec on the primary and ~11500/sec on the
> secondary during a sync.

> > maybe bind or unbind NIC interrupts to cpus?
> >     /proc/interrupts
> >     /proc/irq/*/smp_affinity
>
> They are on CPU0 currently, but would it help to move it if the CPU is not
> being overly taxed?

try and you'll find out.
if you allow more CPUs, try the drbd cpu-mask setting again.

just for the record, iirc,
I have benchmarked Connected DRBD streaming writes at ~600 MByte/sec
in our lab with 10GbE as well as Dolphin Interconnect.
so the drbd code should be able to handle this.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting    http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

__
please don't Cc me, but send to list -- I'm subscribed
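For readers following along: the cpu-mask values tried above (3, 7, f) are
hexadecimal CPU bitmasks, so 3 allows CPUs 0-1, 7 allows CPUs 0-2, and f
allows CPUs 0-3. As a rough sketch only, this is roughly where those knobs
sat in an 8.x-era drbd.conf; the resource name, resync rate and omitted
host sections are placeholders, and option placement has moved around
between DRBD versions:

    resource r0 {                   # resource name is a placeholder
        syncer {
            rate     300M;          # placeholder resync rate
            cpu-mask 7;             # hex bitmask: allow drbd0_* threads on CPUs 0-2
        }
        net {
            sndbuf-size 0;          # 0 = let the kernel auto-tune the TCP send buffer
            # no-tcp-cork;          # lowers latency, but cost ~70MB/s of throughput here
        }
        # on <host> { device/disk/address/meta-disk } sections omitted
    }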
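The sysctl values Gennadiy lists can be made persistent in /etc/sysctl.conf
or set on the fly with sysctl -w; a minimal sketch using a few of the values
from the mail (the three-value form of tcp_rmem/tcp_wmem is min/default/max
in bytes):

    # /etc/sysctl.conf fragment, applied with `sysctl -p`
    net.ipv4.tcp_rmem = 10000000 10000000 10000000
    net.ipv4.tcp_wmem = 10000000 10000000 10000000
    net.core.rmem_max = 524287
    net.core.wmem_max = 524287

    # or one key at a time while experimenting
    sysctl -w net.core.netdev_max_backlog=300000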
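For the interrupt affinity suggestion, the usual procedure looks like the
following; the NIC name (eth2) and IRQ number (24) are made up for
illustration:

    # find the IRQ line of the 10GbE interface
    grep eth2 /proc/interrupts

    # show the current CPU bitmask, then pin that IRQ to CPU1 (mask 0x2)
    cat /proc/irq/24/smp_affinity
    echo 2 > /proc/irq/24/smp_affinity    # needs root; irqbalance may rewrite it later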