[DRBD-user] Slow initial full sync

Ivan ivan at c3i.bg
Wed Oct 29 09:10:23 CET 2014

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi

On 10/29/2014 06:34 AM, aurelien panizza wrote:
> Hi all,
>
> Here is my problem:
> I've got two servers connected via a switch with a dedicated 1Gb NIC. We
> have an SSD RAID 1 on both servers (software RAID on Red Hat 6.5).
> Using the dd command I can reach ~180MB/s (dd if=/dev/zero
> of=/home/oracle/output conv=fdatasync bs=384k count=1k; rm -f
> /home/oracle/output).

you're likely hitting the page cache with that, unless it's run right 
after boot; use oflag=direct to get meaningful results.
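
for instance (same path and sizes as in your dd above, adjust as needed; 
with oflag=direct the conv=fdatasync isn't needed):

   dd if=/dev/zero of=/home/oracle/output bs=384k count=1k oflag=direct
   rm -f /home/oracle/output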

> Using scp to copy a 3GB file through the network uses all the bandwidth
> (~80-90 MB/s)

scp is not a good tool for network testing since it loads the cpu with 
encryption/decryption. Use a dedicated tool, or rsync/ftp.
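
e.g. with iperf (iperf3 takes the same flags; assuming it's installed on 
both nodes, <secondary-ip> is a placeholder):

   # on the secondary
   iperf -s
   # on the primary
   iperf -c <secondary-ip> -t 30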


> Full sync between the primary and the secondary runs at 3MB/s to 12MB/s
> (it starts at around 60MB/s for less than a minute and then drops
> to 8-9MB/s).

those are signs of a cache filling up and then slowing everything down 
once it's flushed to disk. That said, the speeds you mention are quite 
low by today's standards, and the flush delay would be huge, so that 
might not be what's happening.

but first you should redo the dd and bandwidth tests as explained above 
and check whether it isn't simply the disk on the secondary that can't 
handle the I/O load.
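
for example, on the secondary (the md device name is a guess, use your 
actual array; reading the backing device is safe, but do NOT write to it 
directly while it holds DRBD data -- write to a scratch file on a 
filesystem on the same SSDs instead):

   # sequential read from the backing array, bypassing the page cache
   dd if=/dev/md0 of=/dev/null bs=384k count=10k iflag=direct
   # direct write to a scratch file on the same disks (path is an example)
   dd if=/dev/zero of=/path/on/the/same/ssds/testfile bs=384k count=1k oflag=direct
   rm -f /path/on/the/same/ssds/testfile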


> My drbd partition is 200GB and I use drbd 8.4.4.
> My config file is as follows:
>
>    net {
>           cram-hmac-alg "sha1";
>           shared-secret "secret";
>           sndbuf-size 512k;
>           max-buffers 8000;
>           max-epoch-size 8000;
>           verify-alg sha1;
>           after-sb-0pri disconnect;
>           after-sb-1pri disconnect;
>           after-sb-2pri disconnect;
>    }
>
>    disk {
>          al-extents 3389;
>          fencing resource-only;
>    }
>
>    syncer {
>          rate 60M;
>    }
>
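
side note: with 8.4 the "syncer" section is deprecated; if I remember the 
syntax right, the fixed rate now goes into the disk section as 
"resync-rate" (the dynamic controller knobs, c-plan-ahead and friends, 
live there as well). roughly, untested, reusing your values:

   disk {
         al-extents 3389;
         fencing resource-only;
         resync-rate 60M;
   }
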
> I tried protocols A and C, increasing then decreasing the rate, and adding
> "c-plan-ahead 0", but got the same sync speed.
> How can I find where the bottleneck is?
> iostat reports 100% util on /dev/drbd0 on the secondary.
> vmstat and mpstat report all the CPUs as idle (secondary).
> The flag "n" -> "network socket contention" regularly shows up on
> the primary (/proc/drbd).
> ethtool reports 1Gb on both hosts (I also tried replicating on another NIC).
>
> Regards,
>


