[DRBD-user] ko-count still in effect when unset?

Thu Nov 13 09:39:56 CET 2014

On Wed, Nov 12, 2014 at 12:33:49PM -0600, Zev Weiss wrote:
> Hi,
> 
> I recently had the following occur on the primary node of a DRBD resource, running DRBD 8.4.5 on CentOS 6.6 (kernel 2.6.32-504.el6.x86_64):
> 
> Nov 11 05:34:54 kernel: block drbd5: Remote failed to finish a request within ko-count * timeout
> Nov 11 05:34:54 kernel: block drbd5: peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
> 
> Being unfamiliar with ko-count, I looked at the documentation and found:
> 
>     ko-count number
>         In case the secondary node fails to complete a single write request for count times the timeout, it is expelled from the cluster. (I.e. the primary node goes into StandAlone mode.) The default value is 0, which disables this feature.
> 
> The thing is -- nowhere in my config was ko-count set.  So seeing it apparently kick in was an unwelcome surprise.  I have since set ko-count and timeout to "large" values in the hope that it doesn't happen again.

So documentation is NEVER EVER wrong, right?

ko-count 0 (disabled) was the 8.3 default.
ko-count 7 (iirc) is the 8.4 default.
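
To put a number on "ko-count * timeout" from the log message, here is a rough sketch of the arithmetic. It assumes ko-count 7 (the 8.4 default as recalled above) and a timeout of 60, which DRBD counts in tenths of a second; both values are assumptions and should be checked against what the running resource actually reports.

```shell
# Assumed values, not read from a live system:
#   ko_count       = 7   (8.4 default, "iirc")
#   timeout_tenths = 60  (DRBD timeout is in units of 0.1 s, i.e. 6 s)
ko_count=7
timeout_tenths=60

# The peer is dropped once a single request has been outstanding
# for ko-count * timeout.
window_tenths=$((ko_count * timeout_tenths))
echo "peer dropped after roughly $((window_tenths / 10)) seconds"
```

With those assumed values, a write that the secondary does not complete within about 42 seconds is enough to trigger the disconnect.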

To see what is actually in effect, defaults included:

    drbdsetup 5 show --show-default

How about explicitly configuring ko-count 0, if you mean it?
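
A minimal sketch of what that could look like in the resource configuration. The resource name "r0" is hypothetical, and the rest of the resource definition is assumed to stay as it is; ko-count is a net option.

```
resource r0 {              # "r0" is a placeholder, use your own resource name
    net {
        ko-count 0;        # stated explicitly, so a changed default cannot surprise you
    }
    # existing device/disk/address settings stay unchanged
}
```

After editing the config, something like "drbdadm adjust r0" should apply the change to the running resource.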

> Is this a DRBD bug, or expected behavior?  If it's somehow the latter,
> I think the combination of the documentation and error messages is
> quite misleading and should be fixed.

-- 
: Lars Ellenberg
: http://www.LINBIT.com | Your Way to High Availability
: DRBD, Linux-HA  and  Pacemaker support and consulting

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.