AW: [DRBD-user] Some weird behaviour

Martin Bene martin.bene at icomedias.com
Fri May 14 07:15:02 CEST 2004



> > If you use inittimeout, it should be as a last resort kind of thing with
> > a timeout of -600 or something similar - this will still allow your
> > system to start if just one system comes up after a power fail.
> 
> For testing purposes, -600 seems a lot to me!
> Anyway, what are the alternatives? "load-only"? 

Inittimeout only comes into play when:
 - both systems are down at the same time
 - only one system comes up on startup.

So setting inittimeout to a fairly high value shouldn't impact operation
of your cluster unless you've got a fairly improbable failure situation
- and you can still override the inittimeout by typing "yes" on the
local keyboard during startup.
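
To make the two conditions above concrete, here is a toy Python model. It is
not drbd code; the function and parameter names are made up for illustration.

```python
def inittimeout_matters(both_nodes_were_down: bool, peer_came_up: bool) -> bool:
    """Toy model: True only when the inittimeout setting would affect startup.

    Hypothetical helper, not part of drbd. It just encodes the two
    conditions: both systems were down, and only one of them came back.
    """
    return both_nodes_were_down and not peer_came_up


# Walk through all four situations.
for both_down in (False, True):
    for peer_up in (False, True):
        print(both_down, peer_up, inittimeout_matters(both_down, peer_up))
```

Only one of the four combinations returns True, which is why a long timeout
rarely gets in the way.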

> As a matter of fact, I've considered and commented out inittimeout,
> and uncommented load-only, since my DRBD is heartbeat-managed.
> Is this the right way to do it?

I'd say no. Heartbeat just knows what nodes are up and what nodes should
preferably run services. It's got no idea about which node holds the
most up-to-date drbd data. Drbd works around this limitation by
connecting drbd devices before heartbeat startup and waiting for sync to
finish on the drbd-selected secondary. After a power failure, there'll
always be a full sync.

The machine with the newest data gets to load heartbeat and start services;
the drbd secondary waits until sync is finished and only then starts
heartbeat.
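
The resulting start order can be sketched as a small Python model. This is
only an illustration under assumptions: the Node class, the data_generation
field, and the startup_plan function are hypothetical, and a single integer
stands in for whatever bookkeeping drbd really uses to decide which node has
the newest data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    data_generation: int  # hypothetical stand-in for drbd's own bookkeeping


def startup_plan(a: Node, b: Node) -> List[str]:
    """Return the order of startup steps for two nodes (toy model)."""
    # The node with the newer data is treated as the drbd primary.
    if a.data_generation >= b.data_generation:
        primary, secondary = a, b
    else:
        primary, secondary = b, a
    return [
        f"connect drbd devices on {a.name} and {b.name}",
        f"{primary.name}: start heartbeat and services",
        f"{secondary.name}: wait for sync from {primary.name}",
        f"{secondary.name}: start heartbeat",
    ]


for step in startup_plan(Node("alpha", 7), Node("beta", 5)):
    print(step)
```

The point of the ordering is that heartbeat only ever sees the secondary once
its data has caught up.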
 
> I'm just afraid that DRBD's algorithm contradicts what heartbeat thinks,
> so I didn't really care about DRBD choosing WHO should be Primary.
> Heartbeat will decide.

Yep, it can well be that drbd contradicts heartbeat config - that's why
you let drbd make the decision and force heartbeat to also honor the
decision by starting heartbeat first on the drbd-designated primary. 

> > At a guess this short inittimeout is also the cause of the
> > "predetermined states are in contradiction to GC's" messages: the box
> > that used to be secondary decided to start on its own because of
> > inittimeout and subsequently was made primary by heartbeat, thus causing
> > the message when the former primary connected.
> 
> Ok, it's just a warning then. The old data is still consistent,
> right?

Consistent: yes. Complete: no. You may be losing a couple of updates
made to the drbd device.
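
As a rough illustration of "consistent but not complete", here is a toy
Python example. The write labels and the lost_updates helper are invented for
the example and say nothing about how drbd tracks writes internally.

```python
def lost_updates(former_primary_writes, replicated_to_secondary):
    """Writes the former primary accepted that never reached the secondary."""
    return [w for w in former_primary_writes if w not in replicated_to_secondary]


# Hypothetical history: the former primary took four writes, but only the
# first two had reached the secondary before it started on its own.
old_primary = ["w1", "w2", "w3", "w4"]
old_secondary = ["w1", "w2"]

print(lost_updates(old_primary, old_secondary))
```

The former secondary's copy is a valid older state of the device, it just
lacks the later writes.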

Bye, Martin


