[DRBD-user] Resolving split brain

Tue Jun 7 14:46:18 CEST 2016

Digimer,

Thanks for the direction, this sounds right for sure. So to actually do
this:

1. Because we're running RHEL 6.7, I think I need to use the 1.1-pcs version
of the docs, chapter 8, Configure STONITH? Our nodes are Supermicro-based with
an IPMI BMC, but they share common power that can't be controlled. So I think
I need to use the IPMI fencing agent?

2. Would the correct approach for the DRBD fencing and handlers be the
guidance in users-guide84 under 8.3.2. Resource-level fencing using the
Cluster Information Base (CIB)?
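
If I'm reading that section together with your 'fencing' suggestion
correctly, the DRBD side would look roughly like the sketch below. The
resource name r0 is only a placeholder for ours, and the handler paths
are where I'd expect the DRBD 8.4 packages to put the scripts, so I'd
verify both on the nodes before using it:

```
resource r0 {
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost
    fencing resource-and-stonith;
  }
  handlers {
    # Adds a constraint to the Pacemaker CIB so the outdated peer
    # cannot be promoted
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Removes that constraint once the peer has resynced
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

As far as I can tell, the guide's own example in that section uses
'fencing resource-only;', so the only change here is the stricter
policy you recommended.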

Thanks!

David

On Sat, Jun 4, 2016 at 2:26 AM, Digimer <lists at alteeve.ca> wrote:

> On 03/06/16 12:38 PM, David Pullman wrote:
> > We have a two node cluster that has Pacemaker and DRBD 8.4 on RHEL 6
> > configured in HA. The application is a monitoring system that has a web
> > front end and uses Postgresql as a backend. The Postgres db uses the
> > DRBD device for its datastore.
> >
> > I'm trying to get a configuration that will deterministically resolve
> > split brain when it happens. The first priority is to have one and only
> > one node running the HA stack and have that updating the database with
> > monitoring data. The second priority is, in the event of a split brain,
> > to resolve to the most recent content.
>
> A *much* better approach is to build your system to not split-brain in
> the first place. Configure fencing/stonith in your resource manager
> (pacemaker or cman), then configure DRBD to hook into it via the
> fence-handler scripts (and set 'fencing resource-and-stonith;').
>
> > I've looked at the automatic split brain recovery in the docs, and tried
> > a couple of things but I think I'm missing something to get the
> > resolution in the case of two standalones. Also, I'm thinking based on
> > some other list entries that fencing is the way to go, but I'm not sure
> > how to proceed.
>
> There is no way to determine what node has "better" data.
>
> Consider:
>
> Critical but small data is written to node 1, say accounting data or
> something. Few inodes change but the value of that data is great. Later,
> on the other node, someone uploads a distro ISO. Many inodes change but
> they are easily replaced and have no real value.
>
> Do you discard the node with the older changes?
>
> Do you discard the node with the fewest changes?
>
> Both would result in important data being lost.
>
> > I'm getting split brain on occasion when the I/O between the nodes goes
> > down. We have two switches in a redundant configuration connecting the
> > nodes. For unrelated reasons I can't change the interconnect.
> >
> > Any suggestions, referrals to docs, etc., would be greatly appreciated.
>
> Fencing. 100% required, and will prevent split brains entirely.
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
>