<div dir="ltr">Digimer,<div><br></div><div>Thanks for the direction, this sounds right for sure. So to actually do this:</div><div><br></div><div>1. Because we're running RHEL 6.7, I think I need to use 1.1-pcs version of the docs, chapter 8 Configure STONITH? Our nodes are supermicro based with an IPMI BMC controller, but with common non-controllable power. So I think I need to use the IPMI fencing agent?</div><div><br></div><div>2. Would the correct approach for the DRBD fencing and handlers be the guidance in users-guide84 underĀ 8.3.2. Resource-level fencing using the Cluster Information Base (CIB)?</div><div><br></div><div>Thanks!</div><div><br></div><div>David</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jun 4, 2016 at 2:26 AM, Digimer <span dir="ltr"><<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 03/06/16 12:38 PM, David Pullman wrote:<br>
<div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jun 4, 2016 at 2:26 AM, Digimer <span dir="ltr"><<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 03/06/16 12:38 PM, David Pullman wrote:<br>
> We have a two node cluster that has Pacemaker and DRBD 8.4 on RHEL 6<br>
> configured in HA. The application is a monitoring system that has a web<br>
> front end and uses Postgresql as a backend. The Postgres db uses the<br>
> DRBD device for its datastore.<br>
><br>
> I'm trying to get a configuration that will deterministically resolve<br>
> split brain when it happens. The first priority is to have one and only<br>
> one node running the HA stack and have that updating the database with<br>
> monitoring data. The second priority is, in the event of a split brain,<br>
> to resolve to the most recent content.<br>
<br>
</span>A *much* better approach is to build your system to not split-brain in<br>
the first place. Configure fencing/stonith in your resource manager<br>
(pacemaker or cman), then configure DRBD to hook into it via the<br>
fence-handler scripts (and set 'fencing resource-and-stonith;').<br>
<span class=""><br>
> I've looked at the automatic split brain recovery in the docs, and tried<br>
> a couple of things but I think I'm missing something to get the<br>
> resolution in the case of two standalones. Also, I'm thinking based on<br>
> some other list entries that fencing is the way to go, but I'm not sure<br>
> how to proceed.<br>
<br>
</span>There is no way to determine what node has "better" data.<br>
<br>
Consider;<br>
<br>
Critical but small data is written to node 1, say accounting data or<br>
something. Few inodes change but the value of that data is great. Later,<br>
on the other node, someone uploads a distro ISO. Many inodes change but<br>
they are easily replaced and have no real value.<br>
<br>
Do you discard the node with the older changes?<br>
<br>
Do you discard the node with the fewest changes?<br>
<br>
Both would result in important data being lost.<br>
<span class=""><br>
> I'm getting split brain on occasion when the I/O between the nodes goes<br>
> down. We have two switches in a redundant configuration connecting the<br>
> nodes. For unrelated reasons I can't change the interconnect.<br>
><br>
> Any suggestions, referrals to docs, etc., would be greatly appreciated.<br>
<br>
</span>Fencing. 100% required, and will prevent split brains entirely.<br>
<span class="HOEnZb"><font color="#888888"><br>
--<br>
Digimer<br>
Papers and Projects: <a href="https://alteeve.ca/w/" rel="noreferrer" target="_blank">https://alteeve.ca/w/</a><br>
What if the cure for cancer is trapped in the mind of a person without<br>
access to education?<br>
</font></span></blockquote></div><br></div>