just realized that the email went in private. resending to the list, sorry

On Thu, 29 Sep 2011 09:17:48 -0700, Digimer <linux at alteeve.com> wrote:
> On 09/29/2011 02:55 AM, Kaloyan Kovachev wrote:
>> Hi list,
>> i am about to upgrade DRBD on a RHCM cluster where GFS2 is used (dual
>> primary mode). Previously i was using the outdate-peer script modified to
>> call fence_node in case the peer can not be reached over SSH. In the new
>> version i can see the outdate-peer handler is replaced by fence-peer and
>> the script executed is crm-fence-peer, but the problem is, the cluster is
>> not using Pacemaker. So here are the questions:
>>
>> 1. when using resource-and-stonith should the script always exit with 7
>> or is it OK to keep using the modified outdate-peer and return 7 only if
>> the peer was fenced, which happens only if it can't be contacted via SSH
>> i.e. at the end of the script the RV is still 5 and the cluster is
>> quorate, the node status is Offline or fence_node was executed?

actually the answer is in the drbd.conf manual:

"resource-and-stonith
    If a node becomes a disconnected primary, it freezes all its IO
    operations and calls its fence-peer handler. The fence-peer handler is
    supposed to reach the peer over alternative communication paths and
    call 'drbdadm outdate res' there. In case it cannot reach the peer it
    should stonith the peer. IO is resumed as soon as the situation is
    resolved. In case your handler fails, you can resume IO with the
    resume-io command."

so if nothing has changed (except the handler name), it should be OK (and
expected) to return other exit codes if 'drbdadm outdate res' succeeds

maybe i should have asked 'are there any changes except the name'

>>
>> 2. If any code in addition to 7 is allowed - which codes will lead to
>> unfreezing the IO and which will keep blocking it, because in case of
>> Inquorate cluster status or fence failure it is preferable to keep it
>> blocked. Will returning 6 in this case lead to calling some of the
>> pri-lost handlers i.e. commit suicide?
>
> I don't know about the exit codes, but I use Lon's obliterate-peer.sh
> script in both DRBD 8.3.9 on EL5 (RHCS stable2) and DRBD 8.3.11 on EL6
> (RHCS stable 3) to protect my dual-primary setups. It works great.
>

Thank you for the links. Yes i have looked previously (when building the
cluster) at the obliterate-peer.sh script, but it works for two nodes only
and does not have the option to outdate a single res: if there is more than
one resource and just one has failed, there is no need to fence the peer and
drop all resources. That is why i opted to use the outdate-peer.sh script and
execute fence_node at the end (just like obliterate-peer.sh does) only if
the outdate was not successful.

    if [ $RV -eq 5 ]; then
        fence_node $DRBD_PEER
        if [ $? -eq 0 ]; then RV=7; fi
    fi

now i would like to improve it a bit, so my second question was actually
about the proper exit code in the case when fencing failed (or the cluster is
not quorate), which currently is 5. Is it considered "In case your handler
fails" or should it be 1 or 6?

> Here is how I use it. The doc is for an older version but it works the
> same. If you have the latest version of drbd, just replace
> 'outdate-peer' with 'fence-peer'.
>
> *note* - This link is part of an *incomplete* tutorial. The DRBD section
> is finished though.
>
> https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Configuring_DRBD_Global_and_Common_Options
>
> I keep a copy of the obliterate-peer.sh script here;
>
> https://alteeve.com/files/an-cluster/sbin/obliterate-peer.sh
>
> I install it with;
>
> wget -c https://alteeve.com/files/an-cluster/sbin/obliterate-peer.sh -O /sbin/obliterate-peer.sh
> chmod a+x /sbin/obliterate-peer.sh
> ls -lah /sbin/obliterate-peer.sh
>
> If you want to find the source, do a search for "obliterate-peer.sh Lon
> Hohberger".
>
> Sorry that this doesn't answer your question directly, but hopefully it
> will help. :)
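
For reference, the escalation logic of the `if [ $RV -eq 5 ]` snippet earlier in this message (fence only when the outdate attempt left RV at 5, and report 7 only when fencing succeeded) can be sketched as a small function. This is a minimal sketch, not the actual outdate-peer.sh: the function name `escalate_to_fence` is hypothetical, and the fencing command is passed in as arguments so the logic can be exercised without a cluster. On a real node it would presumably be `fence_node "$DRBD_PEER"`.

```shell
#!/bin/sh
# Sketch only: escalate_to_fence is a hypothetical helper name.
# $1      result of the outdate attempt (5 = peer could not be reached,
#         as used in the snippet quoted in this thread)
# $2...   fencing command to run, e.g. fence_node "$DRBD_PEER"
escalate_to_fence() {
    rv=$1
    shift
    if [ "$rv" -eq 5 ]; then
        # The peer could not be outdated: try to fence it instead.
        if "$@"; then
            # Fencing succeeded: report "peer was fenced".
            rv=7
        fi
        # On fencing failure rv stays 5, as in the original snippet.
    fi
    # Print the resulting code; a real handler would exit with it.
    echo "$rv"
}
```

For example, `escalate_to_fence 5 fence_node "$DRBD_PEER"` prints 7 if fencing succeeded and 5 otherwise. Which code is appropriate after a failed fence is exactly the open question in this thread, so the sketch leaves that value unchanged.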