[Linux-ha-dev] Re: [DRBD-user] drbd peer outdater: higher level implementation?

Mon Sep 15 09:59:53 CEST 2008

On 2008-09-15T08:28:18, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:

> > > > The restart of the secondary is not just "spurious" though. It might
> > > > actually help "fix" (or at least "reset") things. Restarts are amazingly
> > > > simple and effective.
> > > hmm.
> > You've got to admit that it's a valid point ;-)
> that was more a disagreeing grumble.
> it may also break things.

How so?

> if we set aside confused admins for the moment,
> and assume CRM is the only entity promoting/demoting drbd.
> 
> would it not be enough for a Primary on connection loss to
> set some constraint pinning the master role on that node/node group?
> 
> the DRBD after-resync handler can then remove that constraint again.

The idea is interesting. An RA modifying its own constraints ...
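
To make sure I read the proposal right, here is a rough toy model of
it. This is only a sketch: `ToyCluster`, its method names, and the
resource/node names are made up for illustration, and nothing in it
talks to a real CRM or to DRBD. A set of (resource, node) pairs stands
in for the "master role" constraints.

```python
class ToyCluster:
    """Hypothetical stand-in for the CRM's view of master-role pins."""

    def __init__(self):
        # Each entry (resource, node) means: master role for `resource`
        # may only run on `node`.
        self.master_pins = set()

    def on_connection_loss(self, resource, primary_node):
        # The Primary lost its peer: pin the master role to itself.
        self.master_pins.add((resource, primary_node))

    def after_resync(self, resource, primary_node):
        # Resync finished, the peer is up to date again: drop the pin.
        self.master_pins.discard((resource, primary_node))

    def may_promote(self, resource, node):
        # Promotion is allowed if the resource is not pinned at all,
        # or if it is pinned to this very node.
        pinned = {n for (r, n) in self.master_pins if r == resource}
        return not pinned or node in pinned


cluster = ToyCluster()
cluster.on_connection_loss("drbd0", "node-a")
print(cluster.may_promote("drbd0", "node-b"))
cluster.after_resync("drbd0", "node-a")
print(cluster.may_promote("drbd0", "node-b"))
```

As long as both nodes see the same set of pins, the stale secondary
cannot be promoted between connection loss and the end of the resync.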

However, it wouldn't work for a true split-brain. If the primary does
that before being fenced by the secondary (which, given awkward
circumstances for the split-brain, is possible), and the partition
heals, the master would be pinned to the "wrong" node briefly.

Also, given that it is a split-brain and the constraint exists on only
one side, nothing on the secondary's side would stop it from being
promoted. Okay, the cluster would never promote it before the primary
has been fenced, but by the same token the master must not continue
before the secondary has been fenced ...
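
Spelled out as a toy sketch (all names here are hypothetical, and the
two sets merely stand in for the diverged constraint views of the two
partitions, not for any real CIB):

```python
def constraint_allows(pins, node):
    # A partition only knows the pins it has seen itself.
    return not pins or node in pins


def may_act_as_master(pins, node, peer_fenced):
    # In a split-brain the local constraint view proves nothing about
    # the other side, so a successful fence of the peer is required too.
    return constraint_allows(pins, node) and peer_fenced


view_primary_side = {"node-a"}   # node-a pinned the master role to itself
view_secondary_side = set()      # node-b never saw that constraint

# The pin alone does not hold node-b back ...
print(constraint_allows(view_secondary_side, "node-b"))
# ... only the fencing requirement does, and it applies to both sides.
print(may_act_as_master(view_secondary_side, "node-b", peer_fenced=False))
print(may_act_as_master(view_primary_side, "node-a", peer_fenced=False))
```

In this model the constraint adds nothing during the split-brain
itself; what keeps either side safe is the fencing condition.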

Does that make sense?

Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde