[DRBD-user] Fw: DRBD STONITH - how is Pacemaker constraint cleared?

Jake Smith jsmith at argotec.com
Thu Aug 11 20:04:49 CEST 2011

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Comments in-line.  Also in-line with Pacemaker config at the bottom.

HTH

Jake

----- Original Message ----- 

> From: "Bob Schatz" <bschatz at yahoo.com>
> To: drbd-user at lists.linbit.com
> Sent: Thursday, August 11, 2011 1:09:56 PM
> Subject: [DRBD-user] Fw: DRBD STONITH - how is Pacemaker constraint
> cleared?

> Hi,

> Does anyone know the answer to the question below about DRBD STONITH
> setting Pacemaker location constraints?

> Thanks!

> Bob

> ----- Forwarded Message -----
> From: Bob Schatz <bschatz at yahoo.com>
> To: "drbd-user at lists.linbit.com" <drbd-user at lists.linbit.com>
> Sent: Tuesday, August 2, 2011 12:21 PM
> Subject: [DRBD-user] DRBD STONITH - how is Pacemaker constraint
> cleared?

> Hi,

> I setup DRBD and Pacemaker using STONITH for DRBD and for Pacemaker.
> (Configs at bottom of email)

> When I reboot the PRIMARY DRBD node (cnode-1-3-6), Pacemaker shows
> this location constraint:

> location drbd-fence-by-handler-ms-glance-drbd ms-glance-drbd \
>   rule $id="drbd-fence-by-handler-rule-ms-glance-drbd" $role="Master" -inf: #uname ne cnode-1-3-5

> and transitions the SECONDARY to PRIMARY. This makes sense to me.

> However, when I restart cnode-1-3-6 (cnode-1-3-5 still up as PRIMARY)
> the location constraint is not cleared as I would have expected.
> Also, DRBD is not started (I assume because of the location
> constraint). I would expect that since cnode-1-3-5 is still up the
> constraint would be moved and DRBD would change to SECONDARY.

The location constraint would only prevent glance-drbd from being promoted to Master on cnode-1-3-6.  Basically it says that the ms-glance-drbd:Master role can only run on the node named cnode-1-3-5.  It doesn't care about ms-glance-drbd:Secondary, and it would not prevent DRBD from starting either (though your ordering could cause it not to start...).  Could you clarify what you mean by "DRBD is not started"?
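A quick way to check what the cluster actually sees (just a sketch, using the resource names from your config below):

crm_mon -1                            # current role of ms-glance-drbd on each node
crm configure show | grep drbd-fence  # is the fencing constraint still in the CIB?
cat /proc/drbd                        # connection/disk state as DRBD sees it on cnode-1-3-6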

> Am I correct that this location constraint should be cleared?

> I assumed this would be cleared by the DRBD handler
> after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh" script but I
> do not believe it is called.

That is the handler that would clear the location constraint, and you should see it cleared after the resync is complete.  If DRBD isn't running it will never resync, which means it will never run the after-resync-target handler.  Have you checked that cnode-1-3-6 is UpToDate (cat /proc/drbd)?  Here's an excerpt of how it should look in the logs as the constraint is removed (it should show up on cnode-1-3-6):

kernel: [   77.131564] block drbd4: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
kernel: [   77.131573] block drbd4: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
kernel: [   77.131585] block drbd4: helper command: /sbin/drbdadm after-resync-target minor-4
crm-unfence-peer.sh[3024]: invoked for bind <-- drbd4
kernel: [   77.261360] block drbd4: helper command: /sbin/drbdadm after-resync-target minor-4 exit code 0 (0x0)
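
If the resync did complete and the constraint is still in place, you can check the state and, as a last resort, remove the constraint by hand (a sketch with your resource names; normally the unfence handler does this for you):

drbdadm cstate glance-repos-drbd   # should report Connected once the resync is done
drbdadm dstate glance-repos-drbd   # should report UpToDate/UpToDate
# if both nodes are UpToDate but the constraint is stuck, delete it manually:
crm configure delete drbd-fence-by-handler-ms-glance-drbd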


> BTW, I am pretty sure I have ordering duplications in my Pacemaker
> configuration (pointed out by Andrew on the Pacemaker mailing list)
> but I am not sure if that is the problem.

> Thanks,

> Bob

> drbd.conf file:

> global {
>   usage-count yes;
> }

> common {
>   protocol C;
> }

> resource glance-repos-drbd {
>   disk {
>     fencing resource-and-stonith;
>   }
>   handlers {
>     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>   }
>   on cnode-1-3-5 {
>     device /dev/drbd1;
>     disk /dev/glance-repos/glance-repos-vol;
>     address 10.4.1.29:7789;
>     flexible-meta-disk /dev/glance-repos/glance-repos-drbd-meta-vol;
>   }
>   on cnode-1-3-6 {
>     device /dev/drbd1;
>     disk /dev/glance-repos/glance-repos-vol;
>     address 10.4.1.30:7789;
>     flexible-meta-disk /dev/glance-repos/glance-repos-drbd-meta-vol;
>   }
>   syncer {
>     rate 40M;
>   }
> }

> Pacemaker configuration:

> node cnode-1-3-5
> node cnode-1-3-6

> primitive glance-drbd-p ocf:linbit:drbd \
>   params drbd_resource="glance-repos-drbd" \
>   op start interval="0" timeout="240" \
>   op stop interval="0" timeout="100" \
>   op monitor interval="59s" role="Master" timeout="30s" \
>   op monitor interval="61s" role="Slave" timeout="30s"

> primitive glance-fs-p ocf:heartbeat:Filesystem \
>   params device="/dev/drbd1" directory="/glance-mount" fstype="ext4" \
>   op start interval="0" timeout="60" \
>   op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" \
>   op stop interval="0" timeout="120"

> primitive glance-ip-p ocf:heartbeat:IPaddr2 \
>   params ip="10.4.0.25" nic="br100" \
>   op monitor interval="5s"

> primitive glance-lvm-p ocf:heartbeat:LVM \
>   params volgrpname="glance-repos" exclusive="true" \
>   op start interval="0" timeout="30" \
>   op stop interval="0" timeout="30" \
>   meta target-role="Started"

Why do you have this primitive?

> primitive node-stonith-5-p stonith:external/ipmi \
>   op monitor interval="10m" timeout="1m" target_role="Started" \
>   params hostname="cnode-1-3-5 cnode-1-3-6" ipaddr="172.23.8.99" userid="ADMIN" passwd="foo" interface="lan"

> primitive node-stonith-6-p stonith:external/ipmi \
>   op monitor interval="10m" timeout="1m" target_role="Started" \
>   params hostname="cnode-1-3-5 cnode-1-3-6" ipaddr="172.23.8.100" userid="ADMIN" passwd="foo" interface="lan"

> group group-glance-fs glance-fs-p glance-ip-p \
>   meta target-role="Started"

> ms ms-glance-drbd glance-drbd-p \
>   meta master-node-max="1" clone-max="2" clone-node-max="1" globally-unique="false" notify="true" target-role="Master"

> clone cloneLvm glance-lvm-p

> location drbd-fence-by-handler-ms-glance-drbd ms-glance-drbd \
>   rule $id="drbd-fence-by-handler-rule-ms-glance-drbd" $role="Master" -inf: #uname ne cnode-1-3-5

> location loc-node-stonith-5 node-stonith-5-p \
>   rule $id="loc-node-stonith-5-rule" -inf: #uname eq cnode-1-3-5

> location loc-node-stonith-6 node-stonith-6-p \
>   rule $id="loc-node-stonith-6-rule" -inf: #uname eq cnode-1-3-6

> colocation coloc-drbd-and-fs-group inf: ms-glance-drbd:Master group-glance-fs

I believe this is backwards... group-glance-fs runs on the ms-glance-drbd:Master, correct?
A colocation reads "x on y", so as written this says that ms-glance-drbd:Master has to run where group-glance-fs is running.  That means if group-glance-fs isn't running on a node, ms-glance-drbd:Master can never be promoted there.  (See the corrected statement after the docs quote below.)

Quote from Pacemaker Docs:
 <rsc_colocation id="colocate" rsc="resource1" with-rsc="resource2" score="INFINITY"/>
Remember, because INFINITY was used, if resource2 can't run on any of the cluster nodes (for whatever reason) then resource1 will not be allowed to run.
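
So in your case it should probably read the other way around (a sketch, keeping your resource names):

colocation coloc-drbd-and-fs-group inf: group-glance-fs ms-glance-drbd:Master

i.e. group-glance-fs gets placed on whichever node currently holds ms-glance-drbd:Master; if the Master can't run anywhere, it is the group that stays stopped, not the other way around.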


> order order-glance-drbd-demote-before-stop-drbd inf: ms-glance-drbd:demote ms-glance-drbd:stop

Not needed

> order order-glance-drbd-promote-before-fs-group inf: ms-glance-drbd:promote group-glance-fs:start

Ordering statements are applied in reverse when stopping, so the statement above also covers the stop/demote direction, which makes the ordering statements that mention demote unneeded (a reduced set is sketched below, after your last order statement).

> order order-glance-drbd-start-before-drbd-promote inf: ms-glance-drbd:start ms-glance-drbd:promote

Not needed - Pacemaker has to start an ms resource (as Slave) before it can promote it, so the start-before-promote ordering is implied.

> order order-glance-fs-stop-before-demote-drbd inf: group-glance-fs:stop ms-glance-drbd:demote

Not needed

> order order-glance-lvm-before-drbd 0: cloneLvm ms-glance-drbd:start
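
Putting that together, the ordering section could probably be trimmed down to just these two statements (a sketch with your resource names; the reverse stop/demote ordering is implied):

order order-glance-lvm-before-drbd 0: cloneLvm ms-glance-drbd:start
order order-glance-drbd-promote-before-fs-group inf: ms-glance-drbd:promote group-glance-fs:start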

> property $id="cib-bootstrap-options" \
>   dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>   cluster-infrastructure="openais" \
>   expected-quorum-votes="2" \
>   stonith-enabled="true" \
>   no-quorum-policy="ignore" \
>   last-lrm-refresh="1311899021"

> rsc_defaults $id="rsc-options" \
>   resource-stickiness="100"


