[DRBD-user] Understanding degr-wfc-timeout

Lars Ellenberg lars.ellenberg at linbit.com
Fri Dec 3 09:43:11 CET 2010

On Fri, Dec 03, 2010 at 02:13:09AM +0000, Andrew Gideon wrote:
> On Thu, 02 Dec 2010 13:36:25 -0600, J. Ryan Earl wrote:
> > If you "gracefully" stop DRBD on one
> > node, it's not "degraded."  Degraded results from a non-graceful
> > separation due to a crash, power outage, network issue, etc., where one
> > end detects the other is gone instead of being told to gracefully close
> > the connection between nodes.
> I issued a "stop" (a graceful shutdown) only after I broke the DRBD 
> connection by blocking the relevant packets.  So before the stop, the 
> cluster was in a degraded state:
>  1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
> Does using "stop" still cause a clean shutdown, which then avoids
> degr-wfc-timeout?
> Is there any way that a network issue, or anything else short of a crash 
> of the system, can invoke degr-wfc-timeout?  I've even tried 'kill -9' of 
> the drbd processes, but they seem immune to this.
> I can force a system crash if I have to, but that's something of a pain in
> the neck so I'd prefer another option if one is available.
> Or have I misunderstood?  I've been assuming that degr-wfc-timeout 
> applies only to the WFC at startup (because the timeout value is in the 
> startup block of the configuration file).  Is this controlling some other 
> WFC period?
> When I break the connection (and with no extra fencing logic specified), 
> I see that both nodes go into a WFC state.  But this is lasting well 
> longer than the 60 seconds I have defined in degr-wfc-timeout.

See if my post
"[DRBD-user] DRBD Failover Not Working after Cold Shutdown of Primary",
dated Tue Jan 8 11:56:00 CET 2008, helps.
The list archives have other threads on this as well.

BTW, that setting only affects drbdadm/drbdsetup wait-connect, as used
for example by the init script, if used without an explicit timeout.
It does not affect anything else.
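
For orientation, a minimal sketch of where the setting lives, assuming a
resource named r0 (a hypothetical name) and the 60 seconds mentioned above.
The wfc-timeout value is only an illustration, not a recommendation:

    resource r0 {
      startup {
        wfc-timeout       0;   # 0 means wait-connect waits indefinitely
        degr-wfc-timeout  60;  # seconds; used instead of wfc-timeout when
                               # this node was already degraded before it
                               # went down
      }
    }

Both values are only consulted when wait-connect runs without an explicit
timeout. They do not limit how long a running node stays in WFConnection.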

What is it you are trying to prove/trying to achieve?

: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
please don't Cc me, but send to list   --   I'm subscribed
