[DRBD-user] Resolving split brain

Digimer lists at alteeve.ca
Tue Jun 7 22:27:49 CEST 2016

On 07/06/16 04:09 PM, Lars Ellenberg wrote:
> On Tue, Jun 07, 2016 at 12:24:48PM -0400, Digimer wrote:
>> On 07/06/16 08:46 AM, David Pullman wrote:
>>> Digimer,
>>> Thanks for the direction, this sounds right for sure. So to actually do
>>> this:
>>> 1. Because we're running RHEL 6.7, I think I need to use 1.1-pcs version
>>> of the docs, chapter 8 Configure STONITH? Our nodes are supermicro based
>>> with an IPMI BMC controller, but with common non-controllable power. So
>>> I think I need to use the IPMI fencing agent?
>> Yup, fence_ipmilan should work just fine. Try this from the command line:
>> fence_ipmilan -a <ipmi_ip> -l <ipmi_user> -p <ipmi_passwd> -o status
>> If you can check the state of both nodes from both nodes, then it's a
>> simple matter of adding it to pacemaker. Note that you will want to
>> configure pacemaker with 'delay="15"' for the fence method for the
>> primary node.
>> This way, if comms breaks but both nodes are up, node 2 will look up how
>> to fence node 1, see the delay and sleep for 15 seconds. Node 1 looks up
>> how to fence node 2, sees no delay and shoots immediately. This way, you
>> can ensure that node 1 (assuming it's the primary node) always wins.
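
To make the delay idea above concrete, here is a rough sketch that only prints the pcs commands rather than running them. The node names, BMC addresses, and credentials are made up for illustration, and the attribute names (ipaddr, login, passwd, delay) are what I would expect from the fence_ipmilan shipped with RHEL 6; check them against `pcs stonith describe fence_ipmilan` on your own nodes before using any of this.

```shell
#!/bin/sh
# Hypothetical node names; substitute your own.
PRIMARY=node1
SECONDARY=node2

# Print the pcs command that would create an IPMI stonith device.
# $1 = node to be fenced, $2 = that node's BMC address,
# $3 = delay in seconds (0 means no delay attribute at all)
stonith_cmd() {
    cmd="pcs stonith create fence_$1 fence_ipmilan pcmk_host_list=$1"
    cmd="$cmd ipaddr=$2 login=admin passwd=secret"
    if [ "$3" -gt 0 ]; then
        cmd="$cmd delay=$3"
    fi
    echo "$cmd op monitor interval=60s"
}

# The delay goes on the device that fences the primary. Whoever wants to
# shoot the primary sleeps 15 seconds first, while the primary can shoot
# the secondary immediately.
stonith_cmd "$PRIMARY"   10.20.0.1 15
stonith_cmd "$SECONDARY" 10.20.0.2 0
```

Review the two printed lines, then run them by hand once the status check works from both nodes.
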
>>> 2. Would the correct approach for the DRBD fencing and handlers be the
>>> guidance in users-guide84 under 8.3.2. Resource-level fencing using the
>>> Cluster Information Base (CIB)?
>> No need to worry about the CIB directly. The pcs tool makes configuring
>> fencing in pacemaker pretty easy. Once you have fencing working in
>> pacemaker, then you can hook DRBD into it by setting 'fencing
>> resource-and-stonith' and set the fence handlers to crm-{un,}fence-peer.sh.
>> With that, when DRBD loses the peer, it will block
>> (resource-and-stonith) and call the fence handler (crm-fence-peer.sh).
>> In turn, crm-fence-peer.sh asks pacemaker to shoot the lost node.
> Not exactly.  The crm-fence-peer.sh script tries some heuristics
> on the cib content and using crmadmin, and "figures out"
> if "it is only me", or if pacemaker/"the cluster communication" also
> does not see that peer anymore.
> If (heuristics say that) cluster comm to that node is still up,
> we place some constraint telling pacemaker to NOT try to promote
> anyone without access to *my* data, then continue.
> If (heuristics say that) cluster comm to that node is also down,
> ... and it looks clean down (or already successfully shot),
>     we place that same constraint anyways, then continue
> ... and you don't have pacemaker fencing enabled, there are scenarios
>     where you might end up with data divergence anyways.  That can only
>     be avoided with fencing configured on both DRBD and pacemaker level.
> ... and it looks as if pacemaker will "soon" shoot that node
>     (or is already in the process of doing so), 
>     but it has not been successfully shot yet,
>     we periodically poll the cib, until that is the case
>     or we hit a timeout.
> As of now, this script never asks pacemaker to shoot any peer.
> It may, in specific scenarios, if called with --suicide-on-failure-if-primary,
> ask pacemaker to have *this* node shot, and even tries to fall back to
> other methods of suicide.
> More details in said script,
> it is heavily commented and tries to be descriptive
> about not only the what, but also the why.

Oh wow, it's a lot smarter than I thought. Thanks for clarifying!
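
For my own notes, the cases Lars lists can be written down as a small decision table. This is only a paraphrase of his description in shell form, not the logic of crm-fence-peer.sh itself; the function name, the state names, and the outcome strings are all made up here, and the real script is the authority.

```shell
#!/bin/sh
# Illustrative only: mirrors the cases described in the mail above.
# $1 = cluster comm to the peer:        up | down
# $2 = peer as pacemaker sees it:       online | clean-down | shot | pending-shoot
# $3 = pacemaker fencing enabled:       yes | no
fence_peer_outcome() {
    comm=$1
    peer=$2
    stonith=$3
    if [ "$comm" = up ]; then
        # Only the replication link is gone: constrain promotion, continue.
        echo place-constraint
    elif [ "$peer" = clean-down ] || [ "$peer" = shot ]; then
        # Peer is known to be down: same constraint, continue.
        echo place-constraint
    elif [ "$stonith" = no ]; then
        # Without pacemaker fencing, data divergence cannot be ruled out.
        echo divergence-possible
    elif [ "$peer" = pending-shoot ]; then
        # Pacemaker is about to shoot (or is shooting) the peer: wait for it.
        echo poll-cib-until-shot-or-timeout
    else
        # Combination not covered by the description in this thread.
        echo not-described
    fi
}

fence_peer_outcome up online yes
fence_peer_outcome down pending-shoot yes
```

Note that none of the outcomes is "ask pacemaker to shoot the peer", which is the point of the correction.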

>> [DRBD] will stay blocked until that succeeds (which is why stonith has to work
>> in pacemaker before you set up fencing in DRBD).
>>>     Fencing. 100% required, and will prevent split brains entirely.
> Yes :-)

That statement I was confident in. :P
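
For completeness, the DRBD side mentioned earlier ('fencing resource-and-stonith' plus the crm-{un,}fence-peer.sh handlers) would look roughly like the fragment printed below. This assumes DRBD 8.4 syntax and the usual /usr/lib/drbd/ handler path; the resource name r0 is just a placeholder, and the fragment has to be merged into your existing resource definition rather than used on its own.

```shell
#!/bin/sh
# Print a DRBD 8.4-style fencing fragment for one resource.
# $1 = resource name (placeholder; use your own)
drbd_fencing_fragment() {
    cat <<EOF
resource $1 {
    disk {
        fencing resource-and-stonith;
    }
    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
}
EOF
}

drbd_fencing_fragment r0
```

Only add this after stonith has been tested in pacemaker.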

Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
