[DRBD-user] Heartbeat & DRBD.SplitBrain.Auto recovering

Lars Ellenberg lars.ellenberg at linbit.com
Mon Nov 24 19:05:52 CET 2008

On Mon, Nov 24, 2008 at 06:01:07PM +0100, Gianluca Cecchi wrote:
> On Mon, Nov 24, 2008 at 4:17 PM, Lars Ellenberg
> <lars.ellenberg at linbit.com> wrote:
> >
> > So for the NOT recommended way of automatically recovering after split
> > brain, throwing away all transactions done on one of the nodes,
> >
> > doing something along the lines of:
> >  after-sb-0pri discard-least-changes;
> >  after-sb-1pri call-pri-lost-after-sb;
> >  after-sb-2pri call-pri-lost-after-sb;
> >
> > while having a handler
> >  pri-lost-after-sb "echo b > /proc/sysrq-trigger; reboot -f"
> >
> > might do what you ask for.
> >
> Keeping in mind the NOT recommended situation, but to better
> understand drbd behaviour, what is the relation between what above and
> the rr-conflict directive?
> in particular:
> 1) what is the default for rr-conflict if not explicit?

> 2) if answer of 1) is disconnect, does this prevent or not from the
> reboot of the node with least changes?

no drbd setting prevents reboot.
I don't quite understand your question?

first, drbd tries to find a re-sync decision that does not hurt anyone.
(are you familiar with version control systems? that would be a "fast forward")
if that is not possible, it defaults to disconnect (let someone else
sort out the mess).

if you tell it to try harder to automatically sort out the mess,
then it tries those auto-recovery strategies.

if one of those is "call userspace handler",
then that user space handler is supposed to reboot the node.

if that helper fails to do so
(helper returns, and the in kernel drbd code resumes from that point),
drbd tries some more.  if drbd now still comes to the conclusion that,
to follow your directions given in the configuration, it has to make a
currently Primary node SyncTarget (causing all sorts of serious cache
coherency trouble), only _then_ is it time to look at that "rr-conflict"
setting.

the rr-conflict setting again can call a (different) user space helper,
or refuse the re-sync (disconnect),
or -- DONT DO THAT -- allow a consistent Primary to become SyncTarget.
I'm not sure when we last tested that code path. even if it works as
expected, it would very likely still crash the box: users of the device
get seriously confused when what they thought they just wrote there
suddenly looks completely different because of the resync process.
it was added to cover a very special case in a very special setup
I forgot about.
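
for illustration, the three rr-conflict choices described above map onto
the drbd.conf policy names roughly as sketched below. this is only a
fragment (the rest of the resource definition is omitted), the resource
name "r0" is a placeholder, and the pri-lost command is simply borrowed
from the pri-lost-after-sb example quoted earlier; check drbd.conf(5)
for your version before relying on any of it.

 resource r0 {
   net {
     # disconnect    : refuse the re-sync, drop the connection (default)
     # call-pri-lost : call the "pri-lost" handler, which is expected
     #                 to take the node down
     # violently     : allow a Primary to become SyncTarget. DONT DO THAT
     rr-conflict disconnect;
   }
   handlers {
     # only consulted if rr-conflict is set to call-pri-lost
     pri-lost "echo b > /proc/sysrq-trigger; reboot -f";
   }
 }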

if you want any more details on what/if/when/while,
ask for consulting.
or use the code ;)
 --> drbd_receiver.c,
   drbd_sync_handshake, drbd_uuid_compare, drbd_asb_recover_*

btw, if you feel like it,
you could also hook into the "split-brain" handler and use drbd in its
default (after split brain: disconnect) setting. that handler was
intended for sending an email to, or paging, an admin.
but it could do all sorts of things: collect information,
trigger STONITH or other reset operations...
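
as a rough sketch, the default behaviour plus a notification hook could
look like the fragment below. the resource name "r0" and the script path
are placeholders, not something drbd ships, and the three after-sb-*
lines only spell out the defaults explicitly. the rest of the resource
definition is omitted.

 resource r0 {
   net {
     # defaults: no automatic recovery, just disconnect
     after-sb-0pri disconnect;
     after-sb-1pri disconnect;
     after-sb-2pri disconnect;
   }
   handlers {
     # hypothetical script: mail or page the admin, collect
     # information, trigger STONITH, ...
     split-brain "/usr/local/sbin/notify-split-brain-admin.sh";
   }
 }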

> 3) suppose no node has changes (for example nfs read partition synced
> with drbd) would it be better

you, and *only* you can tell what is "better" in this context.
this is a policy decision.

I doubt that there is any "always better", optimal, choice,
for _automatic_ merge of arbitrary divergent data sets.

> to use discard-younger-primary, supposed the time of the nodes is kept
> in sync with ntp?

that "younger" or "older" has absolutely nothing to do with
wall clock (or system) time, but only with the generation UUIDs
generated by drbd, and their history.

: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
please don't Cc me, but send to list   --   I'm subscribed