Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Tilo Kaltenecker wrote:
>Hi,
>
>I have two 2.6.11 systems with an LSI Logic SATA RAID controller and drbd 0.7.11.
>When I set up the first node as the primary and sync it to the second node, the sync
>runs at 40-50 MBit/sec over the gigabit connection (crossover).
>I have tested the network and disk bandwidth by mounting over NFS and copying a big
>file, and I get 600 MBit/sec.
>
Hi Tilo,
I would do two things:
1. Make sure you don't have the TCP BIC congestion bug
2. Increase the MTU to 9000 on your replication NICs (a rough sketch follows this list)
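For point 2, something like this on each node, assuming the replication link is
eth1 (interface name and peer address are placeholders for your setup):

   # Raise the MTU to 9000 (jumbo frames); with a crossover cable only the two
   # NICs need to support it, otherwise every switch in the path does as well.
   ifconfig eth1 mtu 9000
   # or, with iproute2:
   ip link set dev eth1 mtu 9000
   # Verify end-to-end with a non-fragmenting ping (8972 = 9000 - 20 IP - 8 ICMP bytes):
   ping -M do -s 8972 <peer-ip>

Make the change on both nodes, otherwise the larger frames simply get dropped.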
The 1st point really bit me for 5 days: large transfers were suddenly very
slow, while iperf(8) could reach 900 Mbps on eth1 (e1000) with a 9000 MTU.
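If you want to check quickly whether BIC is involved, a rough sketch, assuming a
stock 2.6.11 kernel that exposes the tcp_bic sysctl (later kernels with pluggable
congestion control use net.ipv4.tcp_congestion_control instead):

   # Show whether BIC is enabled (1 = on); 2.6.x enables it by default.
   sysctl net.ipv4.tcp_bic
   # Temporarily fall back to plain Reno, then re-run the sync or an iperf test:
   sysctl -w net.ipv4.tcp_bic=0
   iperf -s                   # on the peer
   iperf -c <peer-ip> -t 30   # on this node

The real fix is of course a kernel that carries the patch below.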
Here's a bit of the kernel Changelog:
><davem at davemloft.net>
> [PATCH] Fix BIC congestion avoidance algorithm error
>
> Since BIC is the default congestion control algorithm
> enabled in every 2.6.x kernel out there, fixing errors
> in it becomes quite critical.
>
> A flaw in the loss handling caused it to not perform
> the binary search regimen of the BIC algorithm
> properly.
>
> The fix below from Stephen Hemminger has been heavily
> verified.
>
> [TCP]: BIC not binary searching correctly
>
> While redoing BIC for the split up version, I discovered that the existing
> 2.6.11 code doesn't really do binary search. It ends up being just a slightly
> modified version of Reno. See attached graphs to see the effect over simulated
> 1mbit environment.
>
> The problem is that BIC is supposed to reset the cwnd to the last loss value
> rather than ssthresh when loss is detected. The correct code (from the BIC
> TCP code for Web100) is in this patch.
>
> Signed-off-by: Stephen Hemminger <shemminger at osdl.org>
> Signed-off-by: David S. Miller <davem at davemloft.net>
> Signed-off-by: Chris Wright <chrisw at osdl.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh at suse.de>
>
Off-topic: I just installed my 2nd highly available drbd cluster - it's
running very well using an Areca controller:
ARECA RAID: 64BITS PCI BUS DMA ADDRESSING SUPPORTED
scsi0 : ARECA ARC1220 PCI-EXPRESS 8 PORTS SATA RAID CONTROLLER (RAID6-ENGINE Inside)
Driver Version 1.20.00.07
Vendor: Areca Model: ARC-1220-VOL#00 Rev: R001
Type: Direct-Access ANSI SCSI revision: 03
SCSI device sda: 2929686528 512-byte hdwr sectors (1500000 MB)
on Red Hat Enterprise Linux 4 with Update 1 - kernel 2.6.9-11smp - which gives
around 70 MB/sec sustained writes over drbd0.
(iozone -s 512m -r 64k -t 8 -i 0 -i 1 finished in 1 min 25 sec.)
Good luck,
Leroy