/ 2006-10-03 09:12:15 -0700
\ Robinson, Eric:
> Lars said:
> <snip>
> just make sure that your heartbeat won't decide to make a node primary
> that happens to have long-since outdated data
>   cluster fine
>   secondary crash [first spike of a brown out]
>   time passes
>   primary crash [well, now its a real black out]
>
>   ... [power back]
>
>   previously secondary comes up, heartbeat decides to make it primary
>   *** you are primary with outdated data ***
>   previously primary needs a lot longer (recounts its scsi devices,
>   thinks it needs to fsck its root, whatever)...
>
>   same effect as split brain: diverging data sets.
> </snip>
>
> In the above case, I assume the new primary would update the new
> secondary and you would not have diverging data sets, just old data from
> before the first brownout, no?
>
> From all of this the question arises: is there a good general
> configuration (read "silver bullet") that covers most typical failure
> scenarios and doesn't block when one node is down?

No. It always depends.
Different deployments have different requirements.
Some might rather be non-operational than working with even slightly
outdated data, some might prefer to just have _any_ data online,
just so long as they _are_ online...

> By the way, does drbd include a timestamp in its metadata, kind of like
> a watchdog timer? It seems like a timestamp could be combined with the
> primary/secondary metadata field to gracefully handle most failures, but
> I'm probably being naive.

no, it does not. for obvious reasons.
but, drbd 8 tags its data generations with uuid, and keeps a short
history of those uuids, which helps a lot in detecting various kinds of
bad things...

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.
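
To illustrate the generation-uuid idea from the reply above, here is a minimal sketch in Python. It is not drbd's actual metadata layout or decision logic; the names Node, new_generation and compare are hypothetical, and the rule "start a new generation whenever a node runs as primary without its peer" is an assumption made for the example. It only shows how a current uuid plus a short history lets two nodes tell "peer is merely outdated" apart from "data sets have diverged" without relying on timestamps.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical per-node metadata: one current generation plus a short history."""
    name: str
    current: str
    history: List[str] = field(default_factory=list)  # newest first

    def new_generation(self, keep: int = 2) -> None:
        # Assumption for this sketch: called when the node starts
        # accepting writes as primary while the peer is unreachable.
        self.history = ([self.current] + self.history)[:keep]
        self.current = uuid.uuid4().hex


def compare(a: Node, b: Node) -> str:
    """Classify the relationship between the data on two reconnecting nodes."""
    if a.current == b.current:
        return "in sync"
    if b.current in a.history:
        return f"{b.name} outdated"   # a moved on from b's generation
    if a.current in b.history:
        return f"{a.name} outdated"   # b moved on from a's generation
    if set(a.history) & set(b.history):
        return "diverged"             # both moved on independently
    return "unrelated"                # no common generation within the kept history


# The brown-out scenario from the quoted text:
alpha = Node("alpha", "g0")           # primary
beta = Node("beta", "g0")             # secondary
alpha.new_generation()                # secondary crashed, primary keeps writing
print(compare(alpha, beta))           # beta is only outdated so far
beta.new_generation()                 # power back, heartbeat promotes beta alone
print(compare(alpha, beta))           # both wrote independently since g0
```

In this sketch, as long as the stale node is not promoted, the comparison reports it as outdated and a resync from the newer node would be the safe direction. Once the stale node has been made primary on its own, both sides carry generations the other has never seen, which is the "same effect as split brain" case described above. Because the history is short, a node that has fallen far enough behind would show up as "unrelated" here.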