AW: [DRBD-user] Some weird behaviour

Nuno Tavares nunotavares at hotmail.com
Fri May 14 01:09:20 CEST 2004



On Thu, 13 May 2004 23:03:56 +0200, Lars Ellenberg wrote:

> / 2004-05-13 20:51:52 +0100
> \ Nuno Tavares:
>> On Thu, 13 May 2004 08:17:25 +0200, Martin Bene wrote:
>> 
>> > Hi,
>> > 
>> >> resource drbd0 {
>> >>   protocol = C
>> >>   fsckcmd  = /bin/true
>> >>   inittimeout=-10
>> > 
>> > Eep, what are you doing here? With this inittimeout you don't give drbd
>> > sufficient time to connect with the 2nd node before trying to continue
>> > startup on its own.
>> > 
>> > If you use inittimeout, it should be as a last resort kind of thing with
>> > a timeout of -600 or something similar - this will still allow your
>> > system to start if just one system comes up after a power fail.
>> 
>> For testing purposes, -600 seems a lot to me!
>> Anyway, what are the alternatives? "load-only"? 
>> As a matter of fact, I've considered and commented out inittimeout, and
>> uncommented load-only, since my DRBD is heartbeat-managed.
>> Is this the right way to do it?
> 
> the point is: heartbeat does not know where the "most recent" data
> lives. it only knows whether a node is up or not.
> it will happily tell a node with outdated data to become primary.
> thus you risk data corruption.
> so only do this if you care more about availability than data integrity,
> i.e. it is more important to be online with *some* data,
> than to be online with the most *up-to-date* transactions.

That *is* my case.

> this situation only happens after total failure anyways, so you should
> give your nodes enough time to boot, and fsck, and whatnot, and give DRBD 
> time to connect and start to sync in the right direction.

Thanks for the explanation. It will be documented.
From what I understand, after a power failure a DRBD device (even one
that was *NOT* primary) may well think that it has the most recent
data. So inittimeout gives it a chance to boot, fsck (/ and the other
mounts), and sync itself.
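
To make that concrete, here is a toy model in Python. It is only a
sketch of my understanding, not DRBD's actual startup logic: the Node
class, the "generation" counter, boot_seconds and choose_primary are all
made up for illustration, and the wait is a plain number of seconds
rather than the drbd.conf inittimeout syntax.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Node:
    name: str
    generation: int    # hypothetical counter: higher means more recent data
    boot_seconds: int  # hypothetical time until the node is up and reachable


def choose_primary(local: Node, peer: Node, wait_seconds: int) -> Tuple[str, bool]:
    """Return (name of the node that ends up primary, True if both nodes were compared)."""
    # How long the local node has to wait before the peer becomes reachable.
    peer_arrival = peer.boot_seconds - local.boot_seconds
    if peer_arrival <= wait_seconds:
        # The peer showed up in time, so the node with the newer data can win.
        newest = local if local.generation >= peer.generation else peer
        return newest.name, True
    # The wait expired first: the local node continues alone, stale or not.
    return local.name, False


stale = Node("alpha", generation=3, boot_seconds=30)
fresh = Node("beta", generation=7, boot_seconds=120)

print(choose_primary(stale, fresh, wait_seconds=10))
print(choose_primary(stale, fresh, wait_seconds=600))
```

With the short wait, alpha goes ahead alone although beta holds newer
data; with the long wait the two nodes meet and beta wins.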

The problem arises when DRBD switches its state. I think heartbeat has
no way of knowing about that...

>> I'm just afraid that DRBD's algorithm contradicts what heartbeat thinks,
>> so I didn't really care about DRBD choosing WHO should be Primary.
>> Heartbeat will decide.
> 
> but heartbeat does not know.
> you have been warned.

Yes, I'm aware of it. I meant that Heartbeat will decide, without regard
to which node has the most recent data. I need to:
1) minimize DRBD switching
2) be available as much of the time as possible, without interruption.


-- 
-
Nuno Tavares
http://nthq.cjb.net/




