On Thu, Dec 18, 2008 at 7:24 AM, Lars Ellenberg
<lars.ellenberg at linbit.com> wrote:
> On Wed, Dec 17, 2008 at 04:17:00PM -0500, Parak wrote:
> > Hi all,
> >
> > I'm currently playing with DRBD (8.2.7) on 20Gb/s Infiniband, and it seems
> > that I'm running into the sync rate as the limiting speed factor. The local
> > storage on both nodes is identical (SAS array), and has been benchmarked at
> > about 650MB/s (or higher, depending on benchmark) to native disk, and about
> > 550MB/s when writing to it through a disconnected DRBD device. The network
> > link for DRBD is Infiniband as well (IPoIB), which has been benchmarked with
> > netperf at ~800MB/s.
> >
> > The fastest speed that I'm able to get from the DRBD sync with this
> > configuration is ~340MB/s, which limits the speed from my initiator to that
> > as well. Interestingly, I was also able to benchmark DRBD sync speed over
> > 10GbE, which despite my repeated attempts to tweak drbd.conf, MTU, and TCP
> > kernel parameters, has produced the same speed as the aforementioned
> > 340MB/s over IPoIB.
> >
> > Here's the drbd.conf:
> >
> > global {
> > usage-count yes;
> > }
> >
> > common {
> > syncer {
> > rate 900M;
>
> check if
> cpu-mask 3;
> or cpu-mask 7;
> or cpu-mask f;
> or something like that
> has any effect.
No effect for these.
> > }
> > }
> >
> > resource drbd0 {
> >
> > protocol C;
> >
> > handlers {
> > }
> >
> > startup {
> > degr-wfc-timeout 30;
> > }
> >
> > disk {
> > on-io-error detach;
> > fencing dont-care;
> > no-disk-flushes;
> > no-md-flushes;
> > no-disk-drain;
> > no-disk-barrier;
> > }
> >
> > net {
> > ko-count 2;
> > after-sb-1pri discard-secondary;
> > sndbuf-size 1M;
>
> you can try sndbuf-size 0; (auto-tuning)
Slightly slower by about 20-30MB/s.
> and check whether tweaking
> /proc/sys/net/ipv4/tcp_rmem
> /proc/sys/net/ipv4/tcp_wmem
> /proc/sys/net/core/optmem_max
> /proc/sys/net/core/rmem_max
> /proc/sys/net/core/wmem_max
> and the like has any effect.
These did have a positive effect, but they were already applied in my case
(as per recommendations from the Infiniband vendor and ixgb readme):
net.ipv4.tcp_timestamps=0
net.ipv4.tcp_sack=0
net.ipv4.tcp_rmem='10000000 10000000 10000000'
net.ipv4.tcp_wmem='10000000 10000000 10000000'
net.ipv4.tcp_mem='10000000 10000000 10000000'
net.core.rmem_max=524287
net.core.wmem_max=524287
net.core.rmem_default=524287
net.core.wmem_default=524287
net.core.optmem_max=524287
net.core.netdev_max_backlog=300000
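
To double-check that the values listed above are really the ones the running kernel uses (and not just what is written in a config file), something like the following could be run on each node. This is only a sketch: the helper names parse_sysctl, proc_path and mismatches are made up for illustration, and the only thing assumed about the system is the usual mapping of a sysctl key to a file under /proc/sys.

```python
import os

def parse_sysctl(text):
    """Parse 'key=value' lines (sysctl.conf style) into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding quotes and normalise whitespace between fields.
        settings[key.strip()] = " ".join(value.strip().strip("'\"").split())
    return settings

def proc_path(key):
    """Map a sysctl key such as net.core.rmem_max to its /proc/sys file."""
    return "/proc/sys/" + key.replace(".", "/")

def mismatches(wanted):
    """Return {key: (wanted, actual)} for keys whose live value differs."""
    diff = {}
    for key, value in wanted.items():
        path = proc_path(key)
        if not os.path.exists(path):
            continue  # not Linux, or the key does not exist on this kernel
        with open(path) as fh:
            actual = " ".join(fh.read().split())
        if actual != value:
            diff[key] = (value, actual)
    return diff

if __name__ == "__main__":
    wanted = parse_sysctl("""
    net.ipv4.tcp_rmem='10000000 10000000 10000000'
    net.ipv4.tcp_wmem='10000000 10000000 10000000'
    net.core.rmem_max=524287
    net.core.wmem_max=524287
    """)
    for key, (want, have) in mismatches(wanted).items():
        print(f"{key}: wanted {want!r}, kernel has {have!r}")
```

An empty result means the live values match; running it on both nodes would also show whether the two sides are tuned identically.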
> check whether the drbd option
> no-tcp-cork;
> has any positive/negative effect.
This one has a negative effect - about 70MB/s slower.
> }
> >
> > on srpt1 {
> > device /dev/drbd0;
> > disk /dev/sdb;
> > address 10.0.0.2:7789;
> > flexible-meta-disk internal;
> > }
> >
> > on srpt2 {
> > device /dev/drbd0;
> > disk /dev/sdb;
> > address 10.0.0.3:7789;
> > flexible-meta-disk internal;
> > }
> > }
> >
> > Any advice/thoughts would be highly appreciated; thanks!
>
> cpu utilization during benchmarks?
> "wait state"?
> memory bandwidth?
> interrupt rate?
The CPU utilization of the top tasks during the sync looks like this (it
fluctuates, and is typically lower), and is similar on both nodes. I have not
seen any iowait:
Cpu(s):  1.2%us, 43.9%sy,  0.0%ni, 13.6%id,  0.0%wa,  0.5%hi, 40.9%si,  0.0%st
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
29513 root      16   0     0    0    0 R   69  0.0   7:31.92 drbd0_receiver
   32 root      10  -5     0    0    0 S   39  0.0  44:32.93 kblockd/0
29518 root      -3   0     0    0    0 S   18  0.0   1:55.06 drbd0_asender
21392 root      15   0     0    0    0 S    1  0.0   0:36.02 drbd0_worker
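
For what it's worth, the summary line above can be split into its fields to see how much of the time is spent on the kernel side (system, hardware interrupt and softirq time together). A minimal sketch, where parse_cpu_line is just an illustrative helper and the input is the Cpu(s) line quoted above:

```python
import re

def parse_cpu_line(line):
    """Turn a top 'Cpu(s):' summary line into {'us': 1.2, 'sy': 43.9, ...}."""
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}

line = ("Cpu(s): 1.2%us, 43.9%sy, 0.0%ni, 13.6%id, 0.0%wa, "
        "0.5%hi, 40.9%si, 0.0%st")
cpu = parse_cpu_line(line)

# System time plus hardware and software interrupt time.
kernel_side = cpu["sy"] + cpu["hi"] + cpu["si"]
print(f"kernel-side time: {kernel_side:.1f}%  (softirq alone: {cpu['si']:.1f}%)")
print(f"idle: {cpu['id']:.1f}%  iowait: {cpu['wa']:.1f}%")
```

With these numbers roughly 85% of the sampled time is kernel-side and almost half of that is softirq, while iowait stays at zero. That is only a reading of one sample, not a diagnosis.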
I've benchmarked the memory bandwidth with ramspeed at ~2500-2700MB/s on
one node and ~2200MB/s on the other, which has fewer memory modules and less
total memory.
Interrupt rate is ~13500-14000/sec on the primary and ~11500/sec on the
secondary during a sync.
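
To put those figures next to each other, here is a rough back-of-the-envelope calculation using only the numbers from this thread. It assumes decimal megabytes, and it treats every interrupt as if it belonged to the sync traffic, which overstates things since the counts include all devices.

```python
MB = 1_000_000  # assumption: benchmarks report decimal megabytes

sync_rate = 340 * MB  # observed DRBD sync throughput, bytes/s
irq_rates = {"primary": 14_000, "secondary": 11_500}  # interrupts/s, upper figures

# Average amount of sync data moved per interrupt on each node.
for node, irqs in irq_rates.items():
    per_irq_kib = sync_rate / irqs / 1024
    print(f"{node}: ~{per_irq_kib:.1f} KiB per interrupt")

# How far the sync rate is from the other measured ceilings.
ceilings = {
    "netperf over IPoIB": 800 * MB,
    "write through disconnected DRBD": 550 * MB,
}
for name, limit in ceilings.items():
    print(f"sync rate is {sync_rate / limit:.0%} of {name}")
```

Both ceilings are well above the sync rate, so neither the raw link nor the local disk explains the 340MB/s by itself.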
> maybe bind or unbind NIC interrupts to cpus?
> /proc/interrupts
> /proc/irq/*/smp_affinity
They are all on CPU0 currently, but would it help to move them if the CPU is
not being overly taxed?
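
For reference, both the drbd cpu-mask option and /proc/irq/*/smp_affinity take a hexadecimal bitmask in which bit n selects CPU n, so "3" means CPUs 0 and 1 and "f" means CPUs 0 to 3. A small sketch for converting in both directions (cpus_to_mask and mask_to_cpus are illustrative names, not part of any tool):

```python
def cpus_to_mask(cpus):
    """Return the hex bitmask (no 0x prefix) selecting the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def mask_to_cpus(mask_hex):
    """Return the sorted CPU numbers selected by a hex bitmask string."""
    # smp_affinity on large machines prints comma-separated 32-bit groups.
    mask = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]

for m in ("1", "3", "7", "f"):
    print(m, "->", mask_to_cpus(m))

# Mask that would pin something to CPU 2 only.
print(cpus_to_mask([2]))
```

Moving an interrupt then means writing such a mask to /proc/irq/<number>/smp_affinity as root. If irqbalance is running it may change the setting again.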
Thanks,
-Gennadiy