Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Thu, Dec 18, 2008 at 12:33:20PM -0500, Gennadiy Nerubayev wrote:
> > check if
> >     cpu-mask 3;
> > or  cpu-mask 7;
> > or  cpu-mask f;
> > or something like that
> > has any effect.
>
> No effect for these.

> > you can try
> >     sndbuf-size 0; (auto-tuning)
>
> Slightly slower by about 20-30MB/s.

I did not expect that.

> > and check whether tweaking
> >     /proc/sys/net/ipv4/tcp_rmem
> >     /proc/sys/net/ipv4/tcp_wmem
> >     /proc/sys/net/core/optmem_max
> >     /proc/sys/net/core/rmem_max
> >     /proc/sys/net/core/wmem_max
> > and the like has any effect.
>
> These did have a positive effect, but they were already applied in my case
> (as per recommendations from the Infiniband vendor and ixgb readme):
>
>     net.ipv4.tcp_timestamps=0
>     net.ipv4.tcp_sack=0
>     net.ipv4.tcp_rmem='10000000 10000000 10000000'
>     net.ipv4.tcp_wmem='10000000 10000000 10000000'
>     net.ipv4.tcp_mem='10000000 10000000 10000000'
>     net.core.rmem_max=524287
>     net.core.wmem_max=524287
>     net.core.rmem_default=524287
>     net.core.wmem_default=524287
>     net.core.optmem_max=524287
>     net.core.netdev_max_backlog=300000

> > check whether the drbd option
> >     no-tcp-cork;
> > has any positive/negative effect.
>
> This one has a negative effect - about 70MB/s slower.

ok. this is expected to reduce latency somewhat,
and latency vs throughput is a typical tradeoff.

> > cpu utilization during benchmarks?
> > "wait state"?
> > memory bandwidth?
> > interrupt rate?
>
> The cpu utilization during the sync for the top tasks looks like so
> (fluctuates, and typically lower), and is similar on both nodes.
> I have not seen any iowait:
>
> Cpu(s):  1.2%us, 43.9%sy,  0.0%ni, 13.6%id,  0.0%wa,  0.5%hi, 40.9%si,  0.0%st
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 29513 root      16   0     0    0    0 R   69  0.0   7:31.92 drbd0_receiver
>    32 root      10  -5     0    0    0 S   39  0.0  44:32.93 kblockd/0
> 29518 root      -3   0     0    0    0 S   18  0.0   1:55.06 drbd0_asender
> 21392 root      15   0     0    0    0 S    1  0.0   0:36.02 drbd0_worker
>
> The memory bandwidth I've benchmarked with ramspeed to be ~2500-2700Mb/s on
> one node, and ~2200Mb/s on the other, due to it having fewer memory modules
> and less memory in total.
>
> Interrupt rate is ~13500-14000/sec on the primary and ~11500/sec on the
> secondary during a sync.

> > maybe bind or unbind NIC interrupts to cpus?
> >     /proc/interrupts
> >     /proc/irq/*/smp_affinity
>
> They are on CPU0 currently, but would it help to move it if the CPU is not
> being overly taxed?

try and you'll find out.
if you allow more CPUs, try the drbd cpu-mask setting again.

just for the record, iirc,
I have benchmarked Connected DRBD streaming writes at ~600 MByte/sec
in our lab with 10GbE as well as Dolphin Interconnect.
so the drbd code should be able to handle this.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting    http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

__
please don't Cc me, but send to list -- I'm subscribed
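For readers following along: the cpu-mask values tried above (3, 7, f) are
hexadecimal CPU bitmasks, so 3 allows CPUs 0-1, 7 allows CPUs 0-2, and f
allows CPUs 0-3. As a rough sketch only, this is roughly where those knobs
sat in an 8.x-era drbd.conf; the resource name, resync rate and omitted
host sections are placeholders, and option placement has moved around
between DRBD versions:

    resource r0 {                   # resource name is a placeholder
        syncer {
            rate     300M;          # placeholder resync rate
            cpu-mask 7;             # hex bitmask: allow drbd0_* threads on CPUs 0-2
        }
        net {
            sndbuf-size 0;          # 0 = let the kernel auto-tune the TCP send buffer
            # no-tcp-cork;          # lowers latency, but cost ~70MB/s of throughput here
        }
        # on <host> { device/disk/address/meta-disk } sections omitted
    }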
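The sysctl values Gennadiy lists can be made persistent in /etc/sysctl.conf
or set on the fly with sysctl -w; a minimal sketch using a few of the values
from the mail (the three-value form of tcp_rmem/tcp_wmem is min/default/max
in bytes):

    # /etc/sysctl.conf fragment, applied with `sysctl -p`
    net.ipv4.tcp_rmem = 10000000 10000000 10000000
    net.ipv4.tcp_wmem = 10000000 10000000 10000000
    net.core.rmem_max = 524287
    net.core.wmem_max = 524287

    # or one key at a time while experimenting
    sysctl -w net.core.netdev_max_backlog=300000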
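For the interrupt affinity suggestion, the usual procedure looks like the
following; the NIC name (eth2) and IRQ number (24) are made up for
illustration:

    # find the IRQ line of the 10GbE interface
    grep eth2 /proc/interrupts

    # show the current CPU bitmask, then pin that IRQ to CPU1 (mask 0x2)
    cat /proc/irq/24/smp_affinity
    echo 2 > /proc/irq/24/smp_affinity    # needs root; irqbalance may rewrite it later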