Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Bob -

np, that's what we're all out here for. A few baseline questions I
missed earlier:

- What OS?
- What versions of DRBD/Pacemaker/LVM2?
- Not that it should matter, but Heartbeat or Corosync?

I'm thinking the LVM stuff is unnecessary, so I put some notes in your
config below to remove it. What's missing to get the startup order
working is in there too. Let me know if you have more questions and how
you make out.

Jake

----- Original Message -----
> From: "Bob Schatz" <bschatz at yahoo.com>
> To: "Jake Smith" <jsmith at argotec.com>
> Cc: drbd-user at lists.linbit.com
> Sent: Monday, August 15, 2011 6:25:22 PM
> Subject: Re: [DRBD-user] Fw: DRBD STONITH - how is Pacemaker
> constraint cleared?
>
> Jake,
>
> Thanks for your help!
>
> Answers to questions:
>
> 1. (Q) Why do you have LVM defined in the configuration?
> (A) I wanted to make sure the LVM volumes were started before I start
> DRBD (I have DRBD configured on top of LVM). I assume that this
> should be okay.

My config is also DRBD on top of LVM, but I don't "start" LVM; I
believe it starts on boot. I don't have anything in my Pacemaker config
for LVM - I just start with the DRBD primitive and build out from
there. I believe the purpose of the LVM resource agent is for LVM on
top of DRBD, not the other way around...

> 2. (Q) Can you clarify what you mean by "DRBD is not started"?
> (A) If I do a "cat /proc/drbd", I see "unconfigured". The DRBD agent
> START routine is never called. I believe this problem will be fixed
> once I work through my other problems.

Got it.

> 3. (Q) Colocation appears to be backwards per the documentation.
> (A) Thanks! I changed it per your suggestion. However, the Filesystem
> agent START routine is now called before the DRBD resource enters
> the MASTER state.

There is no order constraint between FS and DRBD - see below in the
config.

> I made the changes you suggested. (I assumed I should not have to
> specify the stopping/demoting sequences but it was the only way I
> could get it to work.)
>
> After these changes, a timeline of the behavior I see is this
> sequence logged by agent entry point calls:
>
> 1. Call LVM start and before LVM start finishes
> 2. Call Filesystem start

That matches your grouping: the Filesystem is the first member of the
group, so its start is attempted as soon as the group starts, and
nothing there waits for DRBD.

> This fails since DRBD volume is readonly
> 3. LVM start completes
> 4. Filesystem stop (called because Filesystem start fails)
> 5. DRBD start called
> 6. DRBD promote called
>
> My expectation was that the Filesystem start routine would not be
> called until DRBD was MASTER.
>
> My configuration is:
>
> node cnode-1-3-5
> node cnode-1-3-6
> primitive glance-drbd-p ocf:linbit:drbd \
>     params drbd_resource="glance-repos-drbd" \
>     op start interval="0" timeout="240" \
>     op stop interval="0" timeout="100" \
>     op monitor interval="59s" role="Master" timeout="30s" \
>     op monitor interval="61s" role="Slave" timeout="30s"

(Curiosity) why 59/61 for the intervals?

> primitive glance-fs-p ocf:heartbeat:Filesystem \
>     params device="/dev/drbd1" directory="/glance-mount" fstype="ext4" \
>     op start interval="0" timeout="60" \
>     op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" \
>     op stop interval="0" timeout="120"
> primitive glance-ip-p ocf:heartbeat:IPaddr2 \
>     params ip="10.4.0.25" nic="br100" \
>     op monitor interval="5s"
> primitive glance-lvm-p ocf:heartbeat:LVM \
>     params volgrpname="glance-repos" exclusive="true" \
>     op start interval="0" timeout="30" \
>     op stop interval="0" timeout="30"

Remove the primitive for LVM.
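(If you want to reassure yourself that LVM really is up before DRBD
without Pacemaker managing it, a quick check on either node should do -
the volume group name here is just taken from your config:

  # the VG and its LVs should already be active straight from boot
  vgs glance-repos
  lvs -o lv_name,lv_attr glance-repos

An 'a' in the attr string means the LV is active, i.e. DRBD's backing
device is there without any help from the cluster.)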
> primitive node-stonith-5-p stonith:external/ipmi \
>     op monitor interval="10m" timeout="1m" target_role="Started" \
>     params hostname="cnode-1-3-5 cnode-1-3-6" ipaddr="172.23.8.99" \
>     userid="ADMIN" passwd="foo" interface="lan"

Did you mean to have both node names in the hostname param? Not
relevant to the problem...

> primitive node-stonith-6-p stonith:external/ipmi \
>     op monitor interval="10m" timeout="1m" target_role="Started" \
>     params hostname="cnode-1-3-5 cnode-1-3-6" ipaddr="172.23.8.100" \
>     userid="ADMIN" passwd="foo" interface="lan"
> group group-glance-fs glance-fs-p glance-ip-p \
>     meta target-role="Started"

This group says glance-ip-p has to start after glance-fs-p and run on
the same node as glance-fs-p... but nothing in it waits for DRBD, so
your fs will try to start as soon as the group does, regardless of the
state of your DRBD resource. You probably know this already, but

  group my_group_resx_resy resx resy

is basically saying resy must colocate with, and order its start after,
resx (group members start in the order listed). What does the IP have
to do with the FS? If the IP should only come up once the filesystem is
mounted, the group as written (FS first, then IP) is fine - it just
needs to be tied to the DRBD master, see below. If instead you want the
IP up before the FS, you need to remove the group and do a colo and
order like this:

  order order-glance-drbd-master-before-ip-before-fs inf: ms-glance-drbd:promote glance-ip-p:start glance-fs-p:start
  colocation coloc-fs-and-ip-and-drbd inf: glance-fs-p glance-ip-p ms-glance-drbd:Master

Did I confuse you yet? ;-)

> ms ms-glance-drbd glance-drbd-p \
>     meta master-node-max="1" clone-max="2" clone-node-max="1" \
>     globally-unique="false" notify="true" target-role="Master"
> clone cloneLvm glance-lvm-p

Remove this clone.

> location loc-node-stonith-5 node-stonith-5-p \
>     rule $id="loc-node-stonith-5-rule" -inf: #uname eq cnode-1-3-5
> location loc-node-stonith-6 node-stonith-6-p \
>     rule $id="loc-node-stonith-6-rule" -inf: #uname eq cnode-1-3-6
> colocation coloc-fs-group-and-drbd inf: group-glance-fs \
>     ms-glance-drbd:Master

You need an order constraint to match this colo, to make sure DRBD is
promoted before the FS starts, like:

  order order-glance-drbd-master-before-fs inf: ms-glance-drbd:promote group-glance-fs:start

> order order-glance-lvm-before-drbd inf: cloneLvm:start \
>     ms-glance-drbd:start

Remove this order.

> property $id="cib-bootstrap-options" \
>     dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="true" \
>     no-quorum-policy="ignore" \
>     last-lrm-refresh="1313440611"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100"

*snip*
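To pull it together, the resources involved in the problem would end up
looking roughly like this (a sketch only: your existing primitives with
the LVM pieces dropped, the group kept with FS before IP, plus the
missing order; the stonith, location and property sections stay as you
have them):

  primitive glance-drbd-p ocf:linbit:drbd \
      params drbd_resource="glance-repos-drbd" \
      op start interval="0" timeout="240" \
      op stop interval="0" timeout="100" \
      op monitor interval="59s" role="Master" timeout="30s" \
      op monitor interval="61s" role="Slave" timeout="30s"
  group group-glance-fs glance-fs-p glance-ip-p \
      meta target-role="Started"
  ms ms-glance-drbd glance-drbd-p \
      meta master-node-max="1" clone-max="2" clone-node-max="1" \
      globally-unique="false" notify="true" target-role="Master"
  colocation coloc-fs-group-and-drbd inf: group-glance-fs ms-glance-drbd:Master
  order order-glance-drbd-master-before-fs inf: ms-glance-drbd:promote group-glance-fs:start

With the colo and order pair in place Pacemaker won't call Filesystem
start until ms-glance-drbd has been promoted on that node, which should
get rid of the readonly-mount failure in your timeline.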