[DRBD-user] [CASE-18] Performance of Async congestion mode too slow?

Lars Ellenberg lars.ellenberg at linbit.com
Mon Feb 15 15:42:21 CET 2016

On Sun, Feb 14, 2016 at 07:34:55PM +0900, 김재헌 wrote:
> Hi,
> With the async congestion mode, local disk I/O performance is much slower
> than with sync replication mode.
> 1. version
>   -  V9.0.1-1, GIT-hash: f57acfc22d29a95697e683fb6bbacd9a1ad4348e
>   - VM: CentOS 7
> 2. conf
>     protocol A;
>     sndbuf-size 256K;
>     on-congestion pull-ahead;
>     congestion-fill 128K;

That is a nonsense configuration.

These congestion parameters are intended to be used
with a "DRBD-Proxy" in between, or long fat pipes.

Useful values would be several hundred megabytes,
with a proxy memlimit of those several hundred megabytes
plus some.

If it does not behave "nicely" with very low values,
then that's expected.
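
To put rough numbers on "very low": a back-of-the-envelope sketch of how
quickly the backlog reaches congestion-fill. The write rate and link rate
below are made-up assumptions, not measurements from this setup, and the
helper name is hypothetical.

```python
# Illustrative arithmetic only; both rates are assumed, not measured.
KIB = 1024
MIB = 1024 * KIB

def seconds_until_congested(congestion_fill, write_rate, link_rate):
    """Time until the unsent backlog reaches congestion_fill.

    congestion_fill is in bytes, the rates are in bytes per second.
    """
    surplus = write_rate - link_rate
    if surplus <= 0:
        return float("inf")  # the link keeps up, so no backlog builds
    return congestion_fill / surplus

write_rate = 100 * MIB  # assumed application write rate
link_rate = 10 * MIB    # assumed replication link throughput

small = seconds_until_congested(128 * KIB, write_rate, link_rate)
large = seconds_until_congested(400 * MIB, write_rate, link_rate)
print(f"128K: {small * 1000:.2f} ms, 400M: {large:.2f} s")
```

Under these assumed rates the 128K threshold is hit after about a
millisecond of sustained writes, while a threshold of a few hundred
megabytes absorbs a burst of several seconds.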

In any case, even with protocol A,
IO completion occurs only once we have both
 a) local disk completion, and
 b) successfully sent the data (or the out-of-sync information)

> I think there may be a problem in the following area:
>  - Before congestion, completion of local disk I/O is handled in the
> complete_master_bio function in the drbd_sender thread.
>  - But even after congestion has occurred, I think it may be handled in
> the same place.
>  - In other words, although the local disk write has already finished, the
> copy application does not receive this completion signal and stays pending.
>  - The application waits for this completion until got_BarrierAck receives
> the just-requested block from the peer.
>  - I think local I/O completion should be done as soon as congestion is
> detected, without waiting for the peer ack.
> Have I misunderstood something about the DRBD congestion mechanism?

Completion is not supposed to wait for any actual DRBD protocol
ACKs, not even "barrier acks".

But it waits at least until send() returns.
send() of data while not yet switched to congested, or
send() of "this block is now out-of-sync" when already congested.
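
As a toy model of that rule (not DRBD's actual code; the class and field
names are hypothetical, chosen only for illustration): completion needs
the local disk and a returned send(), and does not look at any peer ack.

```python
from dataclasses import dataclass

@dataclass
class WriteRequest:
    # a) the local disk has completed the write
    local_disk_done: bool = False
    # b) send() has returned, either for the data itself or for the
    #    "this block is now out-of-sync" notice when already congested
    net_send_done: bool = False
    # protocol-level ack from the peer; tracked here only to show that
    # completion does not depend on it
    peer_acked: bool = False

def can_complete(req: WriteRequest) -> bool:
    """Completion to the upper layer needs both a) and b), nothing more."""
    return req.local_disk_done and req.net_send_done
```

In this model a request with the local write finished but send() still
blocked stays pending, which matches the stall described above when the
send buffer is tiny.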

If that is too much for your network, then *disconnect*.
Or use periodic file level rsync instead of DRBD.
Or something like that.

: Lars Ellenberg
: http://www.LINBIT.com | Your Way to High Availability
: DRBD, Linux-HA  and  Pacemaker support and consulting

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
please don't Cc me, but send to list   --   I'm subscribed
