Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Extremely late reply... yay vacations! \o/

(replies in-line)

On 26/06/14 06:11 AM, Lars Ellenberg wrote:
>
> I already relayed an answer through Matt, off list.
> But here is for the benefit of the archives:

Ditto. Some of my comments here are ones I made off list, but for the
record...

> On Thu, Jun 19, 2014 at 12:54:29AM -0400, Digimer wrote:
>> Hi all,
>>
>> This is something of a repeat of my questions on the Pacemaker
>> mailing list, but I think I am dealing with a DRBD issue, so allow
>> me to ask again here.
>>
>> I have drbd configured thus:
>>
>> ====
>> # /etc/drbd.conf
>> common {
>>   protocol C;
>>   net {
>>     allow-two-primaries;
>>   disk {
>>     fencing resource-and-stonith;
>
>>     fence-peer /usr/lib/drbd/crm-fence-peer.sh;
>
> You need to unfence it, too...
> after-resync-target ... -unfence- ...

I noticed that in 8.4, there is a before-resync-target option. Does
such an option exist in 8.3?

>> I've setup pacemaker thus:
>>
>> ====
>> Cluster Name: an-anvil-04
>> Corosync Nodes:
>>
>> Pacemaker Nodes:
>>  an-a04n01.alteeve.ca an-a04n02.alteeve.ca
>>
>> Resources:
>>  Master: drbd_r0_Clone
>>   Meta Attrs: master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>>   Resource: drbd_r0 (class=ocf provider=linbit type=drbd)
>>    Attributes: drbd_resource=r0
>>    Operations: monitor interval=30s (drbd_r0-monitor-interval-30s)
>>  Master: lvm_n01_vg0_Clone
>>   Meta Attrs: master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>>   Resource: lvm_n01_vg0 (class=ocf provider=heartbeat type=LVM)
>>    Attributes: volgrpname=an-a04n01_vg0
>>    Operations: monitor interval=30s (lvm_n01_vg0-monitor-interval-30s)
>>
>> Stonith Devices:
>>  Resource: fence_n01_ipmi (class=stonith type=fence_ipmilan)
>>   Attributes: pcmk_host_list=an-a04n01.alteeve.ca ipaddr=an-a04n01.ipmi action=reboot login=admin passwd=Initial1 delay=15
>>   Operations: monitor interval=60s (fence_n01_ipmi-monitor-interval-60s)
>>  Resource: fence_n02_ipmi (class=stonith type=fence_ipmilan)
>>   Attributes: pcmk_host_list=an-a04n02.alteeve.ca ipaddr=an-a04n02.ipmi action=reboot login=admin passwd=Initial1
>>   Operations: monitor interval=60s (fence_n02_ipmi-monitor-interval-60s)
>> Fencing Levels:
>>
>> Location Constraints:
>>   Resource: drbd_r0_Clone
>>     Constraint: drbd-fence-by-handler-r0-drbd_r0_Clone
>>       Rule: score=-INFINITY role=Master (id:drbd-fence-by-handler-r0-rule-drbd_r0_Clone)
>>         Expression: #uname ne an-a04n01.alteeve.ca (id:drbd-fence-by-handler-r0-expr-drbd_r0_Clone)
>> Ordering Constraints:
>>   promote drbd_r0_Clone then start lvm_n01_vg0_Clone (Mandatory) (id:order-drbd_r0_Clone-lvm_n01_vg0_Clone-mandatory)
>> Colocation Constraints:
>>
>> Cluster Properties:
>>  cluster-infrastructure: cman
>>  dc-version: 1.1.10-14.el6_5.3-368c726
>>  last-lrm-refresh: 1403147476
>>  no-quorum-policy: ignore
>>  stonith-enabled: true
>> ====
>>
>> Note the -INFINITY rule, I didn't add that, crm-fence-peer.sh did on
>> start. This brings me to the question;
>>
>> When pacemaker starts the DRBD resource, immediately the peer is
>> resource fenced. Note that only r0 is configured at this time to
>> simplify debugging this issue.
>>
>> Here is the DRBD related /var/log/messages entries on node 1
>> (an-a04n01) from when I start pacemaker:
>
>> ====
>
> *please* don't wrap log lines.

Blame thunderbird... heh. I set it to plain text and hope for the best.
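Coming back to Lars' point above about needing to unfence as well: for
the benefit of the archives, this is roughly the handlers setup I read
him (and the docs) to mean. I have not tested this exact snippet on
these nodes yet, so treat it as a sketch rather than gospel:

====
# /etc/drbd.conf (sketch, untested here)
resource r0 {
    disk {
        # freeze I/O and call the fence-peer handler when the peer is lost
        fencing resource-and-stonith;
    }
    handlers {
        # adds the -INFINITY constraint forbidding the Master role
        # anywhere but on this node
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # removes that constraint again once this node (the resync
        # target) is UpToDate
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
}
====

As far as I can tell, that matches what the docs describe for the
resource-and-stonith plus Pacemaker setup.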
>> Jun 19 00:14:24 an-a04n01 kernel: block drbd0: disk( Diskless -> Attaching )
>> Jun 19 00:14:24 an-a04n01 kernel: block drbd0: disk( Attaching -> Consistent )
>> Jun 19 00:14:24 an-a04n01 kernel: block drbd0: attached to UUIDs 561F3328043888C0:0000000000000000:052A1A6B59936EC5:05291A6B59936EC5
>> Jun 19 00:14:24 an-a04n01 kernel: block drbd0: conn( StandAlone -> Unconnected )
>> Jun 19 00:14:24 an-a04n01 kernel: block drbd0: conn( Unconnected -> WFConnection )
>> Jun 19 00:14:24 an-a04n01 pengine[16894]: notice: LogActions: Promote drbd_r0:0#011(Slave -> Master an-a04n01.alteeve.ca)
>> Jun 19 00:14:24 an-a04n01 pengine[16894]: notice: LogActions: Promote drbd_r0:1#011(Slave -> Master an-a04n02.alteeve.ca)
>> Jun 19 00:14:24 an-a04n01 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0
>
> There. Since DRBD is promoted while still "only Consistent",
> it has to call the fence-peer handler.

Given that the system was still starting, is there a way to have things
pause (perhaps for wfc-timeout or whatever) to see if the node goes
UpToDate? It seems like this is a race.

>> Jun 19 00:14:25 an-a04n01 kernel: block drbd0: Handshake successful: Agreed network protocol version 97
>> Jun 19 00:14:25 an-a04n01 crm-fence-peer.sh[17156]: INFO peer is reachable, my disk is Consistent: placed constraint 'drbd-fence-by-handler-r0-drbd_r0_Clone'
>> Jun 19 00:14:25 an-a04n01 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 4 (0x400)
>> Jun 19 00:14:25 an-a04n01 kernel: block drbd0: fence-peer helper returned 4 (peer was fenced)
>
> Which is successful,
> which allows this state transition:
>
>> Jun 19 00:14:25 an-a04n01 kernel: block drbd0: role( Secondary -> Primary ) disk( Consistent -> UpToDate ) pdsk( DUnknown -> Outdated )
>
> Meanwhile, the other node did the same,
> and cib arbitrates:
>
>> Jun 19 00:14:25 an-a04n01 cib[16890]: warning: update_results: Action cib_create failed: Name not unique on network (cde=-76)
>> Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB Update failures <failed>
>> Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB Update failures <failed_update id="drbd-fence-by-handler-r0-drbd_r0_Clone" object_type="rsc_location" operation="cib_create" reason="Name not unique on network">
>> Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB Update failures <rsc_location rsc="drbd_r0_Clone" id="drbd-fence-by-handler-r0-drbd_r0_Clone">
>> Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB Update failures <rule role="Master" score="-INFINITY" id="drbd-fence-by-handler-r0-rule-drbd_r0_Clone">
>> Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB Update failures <expression attribute="#uname" operation="ne" value="an-a04n02.alteeve.ca" id="drbd-fence-by-handler-r0-expr-drbd_r0_Clone"/>
>
> "Works as designed."

Understood. Now I am wondering if we can make it better. :)

>> Jun 19 00:14:26 an-a04n01 kernel: block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
>> Jun 19 00:14:26 an-a04n01 kernel: block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
>
> And there the after-resync-target handler on the peer should remove
> the constraint again.

This I find odd... Setting aside the "bad idea" of promoting an
Inconsistent/SyncTarget-but-Primary resource, it is possible to use a
node that is Primary regardless of its backing disk state (heck, even
Diskless still works). So I think, as it seems someone else did with
before-resync-target, that it should be feasible to unfence before the
resync completes.
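To make that concrete, this is the sort of thing I had in mind. It is
completely untested and quite possibly a bad idea for exactly the
reason above (the SyncTarget becomes promotable before it actually has
good data); the before-resync-target handler itself is real, but
pointing it at crm-unfence-peer.sh is just my own notion:

====
# sketch only, not tested: unfence when the resync *starts*
# instead of when it finishes
handlers {
    fence-peer           "/usr/lib/drbd/crm-fence-peer.sh";
    before-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}
====

I would only consider that together with something that stops pacemaker
from promoting a node that is not UpToDate, otherwise the constraint is
doing nothing.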
>> ====
>>
>> This causes pacemaker to declare the resource failed, though
>> eventually the crm-unfence-peer.sh is called, clearing the -INFINITY
>> constraint and node 2 (an-a04n02) finally promotes to Primary.
>> However, the drbd_r0 resource remains listed as failed.
>
> A failed promote should result in "recovery",
> and that is supposed to be demote (?) -> stop -> start
>
> Do you need to manually do a cleanup there?
> Is it enough to wait for "cluster recheck interval"?

If I delete the constraint, it recovers. Though I suspect that is a
different way of answering "yes".

> Or does it recover by itself anyways,
> and only records that it has failed once?

*If* I use the after-resync-target handler to call crm-unfence-peer.sh,
it cleans up. However, my request here is to not have the fence applied
at all during initial startup (at least not until some small timeout
expires). It seems messed up to fence/unfence when nothing is wrong,
just slow.

>> I have no idea why this is happening, and I am really stumped. Any
>> help would be much appreciated.
>
> Because pacemaker is allowed to try to promote an "only Consistent" DRBD,
> and does so sufficiently simultaneously, before DRBD is done with its
> handshake.

Yup, and I would love to have a startup delay configurable to give DRBD
more time.

>> digimer
>>
>> PS - RHEL 6.5 fully updated, DRBD 8.3.15, Pacemaker 1.1.10
>
> Upgrade to 8.4
> (which also gives you much improved (random) write performance).

A) I meant 8.3.16.

B) Done, on 8.4.4 and, so far (two rounds of tests), no fences have
been called. Though I did have a hang -> fence when trying to stop
pacemaker and DRBD refused to stop. That could very well have been
another mistake, so I will test more.

> Or at least use the resource agent that comes with 8.4
> (which is supposed to be backward compatible with 8.3;
> if it is not, let me know)

Nah... I was trying to keep 8.3 on my RHEL/CentOS 6 boxes for
consistency, but that is a hard argument to maintain when I am already
changing cman+rgmanager -> pacemaker. :) So 8.4 it is.

> It has a parameter "adjust_master_scores",
> which is a list of four numbers, explained there.
> Set the first to 0 (that's the master score for a Consistent drbd),
> which makes pacemaker not even try to promote a resource that
> does not know whether it has "good" data.

I saw this before. I will need to re-read things to find it again. Once
I do, I will test and report.

> Alternatively, a small negative location constraint on the master role
> (on any location; so "defined #uname" might be a nice location)
> should offset the master score just enough to have the same effect.

I will try the first. That option feels a little hack~ish.

Thanks!!

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?