Note: "permalinks" may not be as permanent as we would like;
direct links from old sources may well be a few messages off.
/ 2006-09-16 20:13:52 +0100
\ Tim Jackson:
> Usman Ahmad wrote:
>
> Hi, it would be easier if you didn't top-post. Thanks.
>
> >No, there is no firewall setup, but same question again, it may be stupid. How long does a drbd node wait for
> >changing the other node to unknown state (I guess this is done through different timeout parameters etc.), and if
> >the other server comes back up again, is it changed normally or not?
>
> First, just in case it wasn't clear, DRBD doesn't change the status of the *local* node; only an external program
> (e.g. Heartbeat) does that. As for what it shows for the status of the *remote* node, well, I haven't read the code
> but it normally seems to me to be pretty fast. Again, read the logs: if the local node is primary (and writes are
> taking place, I think), then you should start seeing "ko" warnings quickly, and how long before the other node is
> declared dead depends on the "ko-count" configuration option. You will see all kinds of warnings from DRBD when the
> nodes can't communicate. See also my bug report on this list the other day though, about DRBD not reporting a
> transition of the remote node to "Unknown" status in the logs.

Thanks, Tim!
almost correct...

"ko-count" is for the case when the peer still seems to be alive
(it answers drbd ping packets in time), but no longer accepts data packets.
this is to detect a half-broken secondary, where the local io subsystem
is broken. such a half-broken secondary would otherwise block the primary
as well, which is obviously not desirable in an "HA" setup.

if the peer is _dead_ (or the network is broken; "does not answer at all"),
drbd detects this (worst case) within
  timeout (caution: the unit is tenths of a second) + ping-int
(defaults: 60 tenths of a second + 10 seconds == 16 seconds;
both parameters are configured within the net {} section).
typically "dead peer" detection is faster.

the heartbeat deadtime has to be larger than that.
the heartbeat init-deadtime should be much larger than the drbd connect-int.
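
A rough sketch of the worst-case arithmetic above, in Python. It assumes
the drbd "timeout" value is given in tenths of a second (which is what
makes the defaults add up to 16 seconds) and "ping-int" in seconds. The
function names are made up for this illustration and are not part of
DRBD or Heartbeat.

```python
# Illustrative only: these helper names are hypothetical, not a DRBD API.


def worst_case_dead_peer_seconds(timeout=60, ping_int=10):
    """Worst-case dead-peer detection time in seconds.

    timeout  -- drbd net {} "timeout", assumed to be in tenths of a second
    ping_int -- drbd net {} "ping-int", in seconds
    """
    return timeout / 10.0 + ping_int


def heartbeat_deadtime_ok(deadtime, timeout=60, ping_int=10):
    """True if the heartbeat deadtime (seconds) is larger than the
    worst-case drbd dead-peer detection time."""
    return deadtime > worst_case_dead_peer_seconds(timeout, ping_int)


print(worst_case_dead_peer_seconds())  # 16.0 with the defaults
print(heartbeat_deadtime_ok(30))       # True: 30 s is larger than 16 s
```

With the defaults, a heartbeat deadtime of 16 seconds or less would not
pass this check, since it has to be strictly larger than the drbd
worst case.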
--
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.