[DRBD-user] DRBD terrible sync performance on 10GigE

Emmanuel Florac eflorac at intellique.com
Fri Dec 4 19:19:22 CET 2015


On Fri, 4 Dec 2015 17:04:26 +0100
Lars Ellenberg <lars.ellenberg at linbit.com> wrote:

> You are not supposed to disable the resync controller,
> you are supposed to correctly use it.
> 
> https://blogs.linbit.com/p/443/drbd-sync-rate-controller-2/
> 
> ... you should:
> 
>     * set c-plan-ahead to 20 (default with 8.4),
> 	or more if there’s a lot of latency on the connection (WAN
> 	link with protocol A); or less, if you want to have it
> 	react faster to changes

This is a dedicated, direct 10GigE connection with a redundant 1G link
(just in case). Latency is less than 100ms; throughput is more than
750 MB/s. The required minimum transfer speed is 500 MB/s.

From the documentation, the adequate setting would be 1, but values
under 5 shouldn't be used. I tried 1 and 10, but got very poor results
(a very slow resync; I can't wait 2 weeks to 2 months for the sync to
finish, as the system enters production next week).

In my testing, setting c-plan-ahead to 0 was precisely the most
important tweak to get decent sync/resync throughput.

>     * leave the fixed resync rate (the initial guess for the
> controller) at about 30% or less of what your hardware can handle;
> 
>     * set c-max-rate to 100% (or slightly more) of
>       what your hardware can handle;

I actually set it at 95% or so :)

>       (the default is 100M, which was the effective limitation in
> this case)
> 
>     * set c-fill-target to the minimum (just as high as necessary)
>       that gets your hardware saturated, if the system is otherwise
> idle.

Apparently something around 24M is OK with my current hardware.

> Respectively, figure out the maximum possible resync rate in your
> setup while the system is idle, then set c-fill-target to the
> minimum setting that still reaches that rate.
> 
> And finally, while checking application request
> latency/responsiveness, tune c-min-rate to the maximum that still
> allows for acceptable responsiveness.
> 
> You may need to adjust max-buffers and/or tcp send/receive buffer
> sizes as well.

Here's my current configuration, which gives satisfactory
results (cluster.res unchanged from previous post).

global {
	usage-count no;
	# minor-count dialog-refresh disable-ip-verification
}

common {
	handlers {
	}

	startup {
		# wfc-timeout degr-wfc-timeout outdated-wfc-timeout
		wait-after-sb; # wfc-timeout 20;
	}

	disk {
		on-io-error detach;
		no-disk-flushes;
		no-disk-barrier;
		c-plan-ahead 0;
		c-fill-target 24M;
		c-min-rate 80M;
		c-max-rate 720M;
	}

	net {
		# max-epoch-size 20000;
		max-buffers 36k;
		sndbuf-size 1024k;
		rcvbuf-size 2048k;
	}

	syncer {
		rate 4194304k; # bytes/second
		al-extents 6433;
	}
}
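
As a rough cross-check of the rate settings above against the rules of
thumb quoted earlier (fixed resync rate at about 30% or less, c-max-rate
at 100% of what the hardware can handle), here is a minimal Python
sketch. It assumes 750 MB/s as the hardware figure, taken from the
"more than 750 MB/s" throughput mentioned above, and it treats DRBD's
"M" suffix and MB/s as interchangeable, which is only approximately
true. The function name suggested_rates is made up for illustration.

```python
# Assumption: 750 MB/s is the throughput quoted in this message
# ("more than 750 MB/s"); it is a lower bound, not a benchmark result.
MEASURED_MB_S = 750


def suggested_rates(hardware_mb_s):
    """Return starting points derived from the quoted rules of thumb."""
    return {
        # fixed resync rate: about 30% or less of what the hardware handles
        "resync_rate_max": round(hardware_mb_s * 0.30),
        # c-max-rate: 100% (or slightly more) of what the hardware handles
        "c_max_rate": round(hardware_mb_s * 1.00),
    }


rates = suggested_rates(MEASURED_MB_S)
configured_c_max = 720  # c-max-rate from the configuration above
share = configured_c_max / MEASURED_MB_S

print(rates)
print(f"c-max-rate 720M is {share:.0%} of {MEASURED_MB_S} MB/s")
```

These numbers are only starting points; the values that matter are the
ones that hold up when measured on the actual hardware.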

Using these settings, the idle resync rate is ~700MB/s. Sequential write
speed is over 500MB/s during resync, and sequential read is over 700MB/s
during resync. I'll test again when the systems are synced, but it looks
good enough.
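
To put the resync rates in perspective, here is a small Python sketch
of how long a full sync takes at a given rate. The device size is
purely hypothetical (100 TB, not a figure from this thread), and the
helper name resync_hours is made up for illustration; only the 700 MB/s
and 80 MB/s rates come from the numbers above.

```python
def resync_hours(device_tb, rate_mb_s):
    """Hours needed to copy device_tb terabytes at rate_mb_s MB/s."""
    seconds = device_tb * 1_000_000 / rate_mb_s  # 1 TB = 1,000,000 MB
    return seconds / 3600


DEVICE_TB = 100  # hypothetical size, NOT taken from this thread

for label, rate in [
    ("idle resync rate (~700 MB/s)", 700),
    ("c-min-rate value (80 MB/s)", 80),
]:
    hours = resync_hours(DEVICE_TB, rate)
    print(f"{label}: {hours:.1f} h ({hours / 24:.1f} days)")
```

Substitute the real device size to get a usable estimate.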


-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac at intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------


