[DRBD-user] Automatic split-brain recovery strategies configuration

Fri Aug 10 19:27:48 CEST 2007

On Fri, Aug 10, 2007 at 12:57:03PM +0000, paddy at panici.net wrote:
> On Fri, Aug 10, 2007 at 04:05:02AM -0500, Abraham olivares Varela wrote:
> > Hi everybody,
> > 
> > 
> > Does anybody knows how can i configurate the Automatic split-brain
> > recovery strategies, in order to avoid a "split brain situation". ?
> > 
> > any example or any idea to do it that.
> > 
> > please help me
> >
> 
> On the one hand you talk about recovery and on the other hand you
> talk about avoiding it.  
> 
> I fear you may already be incurable ;-)
> 
> 
> Do you *ever* want to go split brain.  are there scenarios for you where
> that would be preferable and you want to think about what you are going 
> to do afterwards, or would you prefer to avoid it ever happening ?

the basic problem is that currently, with drbd 8 and two-primaries (as
necessary for cluster file systems), drbd will _always_ run into a
resource-internal (drbd-specific) split brain as soon as you lose
the replication link, even if it is only a very short network hiccup,
and even if there was no io in flight.

that is because we have not yet implemented any freeze-io on loss of
write-quorum for drbd8.

so as of now, if you want to use a cluster file system with drbd8,
and you expect network hiccups, you will run into "split-brain".

then you either always need to recover from it by hand,

or you can configure some non-intrusive after-split-brain handler,
like "discard-zero-changes", which would have nothing to "discard",
and would "feature" auto-rejoin when there had been no in-flight io,
or no changes on one side.
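
as a rough, untested sketch of what such a non-intrusive setup could
look like in the net section of drbd.conf (the resource name "r0" is
only a placeholder, and the after-sb-1pri / after-sb-2pri values are
assumptions chosen to stay on the non-destructive side; please check
the drbd.conf man page of your version before relying on any of it):

    resource r0 {
      net {
        # needed for cluster file systems (both nodes primary)
        allow-two-primaries;

        # split brain detected while no node is primary:
        # if one side has no changes at all, sync it from the other
        after-sb-0pri discard-zero-changes;

        # split brain detected while exactly one node is primary:
        # throw away the changes made on the secondary
        after-sb-1pri discard-secondary;

        # split brain detected while both nodes are primary:
        # no automatic recovery, just drop the connection
        after-sb-2pri disconnect;
      }
    }

which of the three policies gets consulted depends on how many nodes
are primary at the moment the split brain is detected, so with both
nodes still primary it is the after-sb-2pri setting that counts.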

once you start using destructive settings,
"auto-recovery strategies" get very ugly very quickly, though.
and they are not a solution to the problem,
but only a work-around for those who commonly run into these problems,
e.g. people who configure two-primaries, but usually access the device
exclusively from one node (xen images). if you have a network hiccup
there, you will only have changes on one node, so you are fine with the
"discard-zero-changes" option.

the real solution is related to the implementation of (dynamically
reconfigurable) write-quorum: suspend IO as soon as we lose
quorum, then timeout, arbitrate, retransmit, resume ...
sorry, no time frame on that, besides "as soon as possible".
we are very busy with a lot of things around here :)

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :