On Mon, Jan 07, 2008 at 01:16:16PM -0800, Art Age Software wrote:
> Hi all,
>
> I've asked this question before and have still not figured it out.
>
> Either the degr-wfc-timeout setting is not working as documented, or
> I just don't understand how it is supposed to work.
>
> Here's the scenario:
>
> 1) Both primary and secondary nodes (servers) are running. DRBD is
> primary/connected/uptodate on Node1 and secondary/connected/uptodate
> on Node2.
>
> 2) Shut down Node2. This takes DRBD on Node1 into primary/disconnected state.
>
> 3) Reboot Node1. (Do **not** start up Node2. It remains shut down.)
>
> According to my understanding, what I now have is a "degraded
> cluster." However, when Node1 reboots, the init script waits forever,
> ignoring the degr-wfc-timeout setting. It is as if DRBD does not think
> the cluster is degraded.
>
> Another DRBD user on the list has confirmed seeing this behavior as
> well in his setup.
>
> So, is this a DRBD bug? Or am I misunderstanding the use of the
> degr-wfc-timeout setting?

If I am currently not Primary, but meta data primary indicator is set,
I just now recover from a hard crash, and have been Primary before
that crash.

Now, if I had no connection before that crash (have been degraded
Primary), chances are that I won't find my peer now either.

In that case, and _only_ in that case, we use the degr-wfc-timeout
instead of the default, so we can automatically recover from a crash of
a degraded but active "cluster" after a certain timeout.

Which means that if you _reboot_ a degraded node, it will not use
the "degr-wfc-timeout".

The idea is: if you intentionally reboot it, you apparently "logged in"
anyways (well, reboot will kick you off, but you can immediately log in
again). Maybe you fixed some hardware thing, and the reboot is supposed
to pick that up. If not, because you are sitting in front of the
console anyways, you can confirm/kill that wfc-thing if necessary.

If it crashed while being Primary, and then later boots up again, it
will use degr-wfc-timeout.

-- 
: Lars Ellenberg                            http://www.linbit.com :
: DRBD/HA support and consulting             sales at linbit.com :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
: Vivenotgasse 48, A-1120 Vienna/Europe    Fax +43-1-8178292-82  :
__
please use the "List-Reply" function of your email client.
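
The rule described in the reply above can be sketched in a few lines of Python. This is only an illustration of the decision as worded in the message, not DRBD's actual code: `BootState`, its field names, and `pick_wfc_timeout` are hypothetical, and the sketch assumes (as the reply implies) that a clean shutdown or reboot clears the primary indicator in the meta data while a hard crash leaves it set. The value 0 stands for "wait forever" here.

```python
from dataclasses import dataclass


@dataclass
class BootState:
    """Hypothetical snapshot of what a node knows when it starts up."""
    currently_primary: bool         # role at boot time
    meta_primary_flag: bool         # primary indicator still set in meta data
    was_connected_before_stop: bool  # had a peer connection before going down


def pick_wfc_timeout(state: BootState, wfc_timeout: int,
                     degr_wfc_timeout: int) -> int:
    """Return the timeout (seconds) to wait for the peer at startup."""
    # "Not Primary now, but the indicator is set" means we are coming
    # back from a hard crash and were Primary before that crash.
    crashed_as_primary = (not state.currently_primary
                          and state.meta_primary_flag)
    # Only a crashed Primary that was already degraded (no peer
    # connection) gets the degraded timeout.
    if crashed_as_primary and not state.was_connected_before_stop:
        return degr_wfc_timeout
    return wfc_timeout


# The scenario from the question: Node2 is down, Node1 is rebooted cleanly.
# Assumption: the clean reboot cleared the primary indicator.
clean_reboot = BootState(currently_primary=False,
                         meta_primary_flag=False,
                         was_connected_before_stop=False)

# Same degraded setup, but Node1 crashed hard instead of being rebooted.
hard_crash = BootState(currently_primary=False,
                       meta_primary_flag=True,
                       was_connected_before_stop=False)

print(pick_wfc_timeout(clean_reboot, wfc_timeout=0, degr_wfc_timeout=120))  # 0
print(pick_wfc_timeout(hard_crash, wfc_timeout=0, degr_wfc_timeout=120))    # 120
```

With these example values, the clean reboot falls back to the normal timeout (the init script keeps waiting), and only the crashed, previously degraded Primary gets the 120-second degr-wfc-timeout.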