[DRBD-user] DRBD resource fenced by crm-fence-peer.sh with exit code 5

Lars Ellenberg lars.ellenberg at linbit.com
Fri Jun 13 16:46:12 CEST 2014



On Wed, Jun 11, 2014 at 03:50:37PM -0500, Andrew Martin wrote:
> Hello,
> 
> I am in the process of testing a 3 node (2 real nodes and 1 quorum node) cluster
> with Pacemaker 1.1.11 + Corosync 2.3.3 and DRBD 8.3.11 on Ubuntu 12.04. I have
> backported most of these packages in this PPA:
> https://launchpad.net/~xespackages/+archive/clustertesting
> 
> I have configured a one-primary DRBD resource and configured it to run on either
> node (node0 or node1):
> primitive p_drbd_drives ocf:linbit:drbd \
>         params drbd_resource="r0" \
>         op start interval="0" timeout="240" \
>         op stop interval="0" timeout="100" \
>         op monitor interval="10" role="Master" timeout="90" \
>         op monitor interval="20" role="Slave" timeout="60"
> ms ms_drbd_drives p_drbd_drives \
>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
> colocation c_drbd_fs_services inf: g_store ms_drbd_drives:Master
> order o_drbd_fs_services inf: ms_drbd_drives:promote g_store:start
> 
> As you can see, it is colocated with a group of other resources (g_store) and the
> above order constraint makes it promote the DRBD resource before starting the 
> other resources. Due to this bug, I am stuck at DRBD 8.3.11:
> https://bugs.launchpad.net/ubuntu/+source/drbd8/+bug/1185756

No.
You are stuck with 8.3.11 because you *chose* to be stuck there.

If you wanted to, you'd simply use an 8.4.5 module
and corresponding userland.
Should be easy enough seeing that you chose to "backport"
all the other packages.

> However, this version of DRBD's crm-fence-peer.sh doesn't support newer versions 
> of pacemaker which no longer use ha="active" as part of the <node_state> tag:
> http://lists.linbit.com/pipermail/drbd-user/2012-October/019204.html
> 
> Therefore, I updated the copy of /usr/lib/drbd/crm-fence-peer.sh on all nodes to 
> use the latest version in the DRBD 8.3 series (2013-09-09):
> http://git.linbit.com/gitweb.cgi?p=drbd-8.3.git;a=history;f=scripts/crm-fence-peer.sh;h=6c8c6a4eda870b506b175d9833fea94761237d20;hb=HEAD
> 
> During testing, I've tried shutting down the currently-active node. When doing
> so, the fence peer handler inserts the constraint correctly, but it exits with 
> exit code 5:
> INFO peer is not reachable, my disk is UpToDate: placed constraint 'drbd-fence-by-handler-ms_drbd_drives'

"Shutting down" as in how?
Do you first cut the replication link, while still being primary?
Well, that *of course* will prevent the other node from being promoted.
That's exactly what this is supposed to do if a Primary loses the
replication link.

> crm-fence-peer.sh exit codes:
> http://www.drbd.org/users-guide-8.3/s-fence-peer.html
> 
> I can see this constraint in the CIB, however, the remaining (still secondary)
> node fails to promote.

Yes. Because that constraint tells it to not become Master.
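
To illustrate: the constraint placed by the handler is, roughly, a location rule that scores the Master role at -INFINITY on every node whose name differs from the node that placed it. Below is a minimal Python sketch of that logic, assuming this rule shape. The node names and the helpers `master_score` and `may_promote` are made up for the example; this is not Pacemaker's actual policy engine.

```python
# Hypothetical model of the constraint placed by crm-fence-peer.sh:
# the Master role gets -INFINITY on every node where #uname != <fencing node>.
NEG_INF = float("-inf")


def master_score(node, fencing_node):
    """Score the fence constraint contributes to the Master role on `node`."""
    return NEG_INF if node != fencing_node else 0


def may_promote(node, fencing_node):
    """A node may be promoted only if the constraint does not ban it."""
    return master_score(node, fencing_node) != NEG_INF


if __name__ == "__main__":
    # Assumed scenario: node0 was Primary, lost its peer, placed the constraint.
    for name in ("node0", "node1"):
        print(name, "may be promoted:", may_promote(name, "node0"))
```

As long as the constraint is in the CIB, only the node that placed it can hold the Master role, no matter which node is currently up.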

> Moreover, when the original node is powered back on, it 
> repeatedly attempts to remove the constraint by calling crm-unfence-peer.sh, 

Is that so?
I don't see why it would do that.
crm-unfence-peer.sh should be called only by the after-resync-target handler,
so you would need to have a resync, be sync target, and finish that
resync successfully.

> which exits with exit code 0, removing the constraint. However it doesn't seem to 
> recognize this and repeatedly keeps calling crm-unfence-peer.sh. 

I don't think that is what happens.
Please double check the logs.

> How can I resolve these problems with crm-fence-peer.sh? Is exit code 5 an 
> acceptable state to allow DRBD to promote the resource on the remaining node?

No.
It is an acceptable exit code for *this* node to continue operating as Primary,
and to *prevent* the other node from being promoted, because the other node
has stale data.
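
For reference, a small Python sketch of how the fence-peer exit codes read. The meanings are paraphrased from the DRBD 8.3 user's guide page linked above; the table and helper names are hypothetical, and the code only describes the codes rather than reproducing what DRBD does internally.

```python
# Fence-peer handler exit codes, paraphrased from the DRBD 8.3 user's guide.
# The dictionary and helper names are illustrative, not part of DRBD.
FENCE_PEER_EXIT_CODES = {
    3: "peer's disk state was already Inconsistent",
    4: "peer's disk state was set to Outdated (or already was Outdated)",
    5: "connection to the peer failed, peer could not be reached",
    6: "peer refused to be outdated because it is in the Primary role",
    7: "peer was fenced off the cluster",
}


def describe_fence_exit(code):
    """Return a human-readable meaning for a fence-peer handler exit code."""
    return FENCE_PEER_EXIT_CODES.get(code, "unexpected exit code")


if __name__ == "__main__":
    # Exit code 5 is the case from this thread: the handler placed the
    # constraint but could not reach the peer.
    print(5, "->", describe_fence_exit(5))
```

In every one of these cases the code is reported to the node that called the handler; none of them is a signal for the *other* node to promote.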

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed
