[DRBD-user] Resolving split brain

Tue Jun 7 22:09:04 CEST 2016

On Tue, Jun 07, 2016 at 12:24:48PM -0400, Digimer wrote:
> On 07/06/16 08:46 AM, David Pullman wrote:
> > Digimer,
> > 
> > Thanks for the direction, this sounds right for sure. So to actually do
> > this:
> > 
> > 1. Because we're running RHEL 6.7, I think I need to use the 1.1-pcs
> > version of the docs, chapter 8, Configure STONITH? Our nodes are
> > Supermicro based with an IPMI BMC controller, but with common
> > non-controllable power. So I think I need to use the IPMI fencing agent?
> 
> Yup, fence_ipmilan should work just fine. Try this from the command line:
> 
> fence_ipmilan -a <ipmi_ip> -l <ipmi_user> -p <ipmi_passwd> -o status
> 
> If you can check the state of both nodes from both nodes, then it's a
> simple matter of adding it to pacemaker. Note that you will want to
> configure pacemaker with 'delay="15"' for the fence method for the
> primary node.
> 
> This way, if comms break but both nodes are up, node 2 will look up how
> to fence node 1, see the delay and sleep for 15 seconds. Node 1 looks up
> how to fence node 2, sees no delay and shoots immediately. That ensures
> that node 1 (assuming it's the primary node) always wins.
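
As a rough illustration of the delay setup described above, here is a
dry-run sketch that only prints the pcs commands. The node names,
addresses and credentials are placeholders, and the parameter names
(ipaddr, login, passwd, delay) are assumed from the fence_ipmilan shipped
with RHEL 6; they differ in newer fence-agents releases, so check
"pcs stonith describe fence_ipmilan" before using any of it.

```shell
#!/bin/sh
# Dry run: prints the pcs commands rather than executing them.
# Node names, addresses and credentials below are placeholders.
print_stonith_cmd() {
    node=$1
    ip=$2
    delay=$3
    cmd="pcs stonith create fence_$node fence_ipmilan"
    cmd="$cmd pcmk_host_list=$node ipaddr=$ip login=IPMI_USER passwd=IPMI_PASSWD"
    # Only the device that fences the primary node gets a delay.
    if [ -n "$delay" ]; then
        cmd="$cmd delay=$delay"
    fi
    echo "$cmd op monitor interval=60s"
}

print_stonith_cmd node1 192.0.2.1 15   # fencing node1 is delayed, so node1 shoots first
print_stonith_cmd node2 192.0.2.2 ""   # fencing node2 happens immediately
```

The delay sits on the device that fences node 1, because it is node 2
that has to wait before shooting.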
> 
> > 2. Would the correct approach for the DRBD fencing and handlers be the
> > guidance in users-guide84 under 8.3.2. Resource-level fencing using the
> > Cluster Information Base (CIB)?
> 
> No need to worry about the CIB directly. The pcs tool makes configuring
> fencing in pacemaker pretty easy. Once you have fencing working in
> pacemaker, you can hook DRBD into it by setting 'fencing
> resource-and-stonith' and setting the fence handlers to
> crm-{un,}fence-peer.sh.
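
For reference, a minimal sketch of what that DRBD configuration might look
like in 8.4 syntax, wrapped in a shell function that just prints the
fragment. The resource name "r0" is a placeholder, and the handler paths
are the usual packaged locations, which may differ on your system.

```shell
#!/bin/sh
# Prints a DRBD 8.4-style resource fragment; nothing is written to disk.
# "r0" is a placeholder resource name; verify the handler paths locally.
drbd_fencing_fragment() {
    cat <<'EOF'
resource r0 {
  disk {
    fencing resource-and-stonith;
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
EOF
}

drbd_fencing_fragment
```

Merge the disk and handlers sections into the existing resource
definition rather than adding a second one.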
> 
> With that, when DRBD loses the peer, it will block
> (resource-and-stonith) and call the fence handler (crm-fence-peer.sh).
> In turn, crm-fence-peer.sh asks pacemaker to shoot the lost node.

Not exactly.  The crm-fence-peer.sh script runs some heuristics
on the cib content, with the help of crmadmin, and "figures out"
whether "it is only me", or whether pacemaker/"the cluster communication"
also does not see that peer anymore.

If (heuristics say that) cluster comm to that node is still up,
we place some constraint telling pacemaker to NOT try to promote
anyone without access to *my* data, then continue.

If (heuristics say that) cluster comm to that node is also down,

... and it looks clean down (or already successfully shot),
    we place that same constraint anyways, then continue

... and you don't have pacemaker fencing enabled, there are scenarios
    where you might end up with data divergence anyways.  That can only
    be avoided with fencing configured at both the DRBD and the
    pacemaker level.

... and it looks as if pacemaker will "soon" shoot that node
    (or is already in the process of doing so), 
    but it has not been successfully shot yet,
    we periodically poll the cib, until that is the case
    or we hit a timeout.

As of now, this script never asks pacemaker to shoot any peer.
In specific scenarios, if called with --suicide-on-failure-if-primary,
it may ask pacemaker to have *this* node shot, and it even tries to fall
back to other methods of suicide.

More details in said script,
it is heavily commented and tries to be descriptive
about not only the what, but also the why.
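
To summarize the cases listed above, here is a toy model of the decision
flow.  It is NOT taken from crm-fence-peer.sh; the function name and the
state strings are made up for illustration, and the real script derives
these states from the cib itself.

```shell
#!/bin/sh
# Toy model of the cases described in this mail, not the real script.
# decide COMM PEER STONITH
#   COMM:    "up" or "down" (does cluster comm still see the peer?)
#   PEER:    "clean-down", "fencing-pending" or "unknown"
#   STONITH: "yes" or "no" (is pacemaker fencing enabled?)
decide() {
    comm=$1
    peer=$2
    stonith=$3
    if [ "$comm" = "up" ]; then
        # Only the replication link is gone.
        echo "place constraint, continue"
    elif [ "$peer" = "clean-down" ]; then
        # Peer is cleanly down or was already shot successfully.
        echo "place constraint, continue"
    elif [ "$stonith" = "no" ]; then
        echo "no pacemaker fencing: data divergence possible"
    elif [ "$peer" = "fencing-pending" ]; then
        echo "poll cib until peer is shot or timeout"
    else
        echo "undetermined"
    fi
}

decide down fencing-pending yes   # prints "poll cib until peer is shot or timeout"
```

Note that no branch asks pacemaker to shoot the peer; the model only
waits for pacemaker to do so on its own.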

> DRBD will stay blocked until that succeeds (which is why stonith has to
> work in pacemaker before you set up fencing in DRBD).

> >     Fencing. 100% required, and will prevent split brains entirely.

Yes :-)

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
