Hello Lars and Everyone

Please look below.....

| -----Original Message-----
| From: Lars Ellenberg [mailto:lars.ellenberg at linbit.com]
| Sent: Monday, March 16, 2009 12:38 PM
| To: drbd-user at lists.linbit.com
| Subject: Re: [DRBD-user] pingAck failed - help to avoid it?
|
| On Mon, Mar 16, 2009 at 01:48:12AM +0200, Theophanis Kontogiannis wrote:
| > Hello all
| >
| > I have a two nodes cluster on Centos 5.2, kernel
| > 2.6.18-92.1.22.el5.centos.plus, drbd-8.3.0-3 and
| > drbd-km-2.6.18_92.1.22.el5.centos.plus-8.3.0-3 compiled and installed as
| > rpm by myself.
| >
| > Though I do have two GigabitEth NICs connected back-to-back for DRBD and
| > clustering, from time to time, especially during heavy traffic on the
| > public GigEth interfaces of the cluster nodes, I get the following:
| >
| > drbd0: PingAck did not arrive in time.
| > drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
| > drbd0: asender terminated
| > drbd0: Terminating asender thread
| > drbd0: short read expecting header on sock: r=-512
| > drbd0: Creating new current UUID
| > drbd0: Connection closed
| > drbd0: helper command: /sbin/drbdadm fence-peer minor-0
| > drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 2 (0x200)
| > drbd0: fence-peer helper broken, returned 2
|
| hm?? what is that about?
| what did you configure for fencing?
| why does it return 2?
|

I am using the obliterate script (http://people.redhat.com/lhh/obliterate)
for fencing. However, I realized that the originally posted script was
leading to exit code 2.
I had to change the script to the following:

[root@tweety-2 ~]# diff /sbin/obliterate /sbin/obliterate~
29,37c29
< for NODETAG in `(cman_tool nodes 2>/dev/null | grep -v '^Node' | awk '{print $1 ":" $6}')` ; do
<
< #while read nid nodename; do
<
< #echo $NODETAG
< nid=`echo $NODETAG|cut -d : -f 1`
< nodename=`echo $NODETAG|cut -d : -f 2`
<
<
---
> while read nid nodename; do
48c40
< done #< <(cman_tool nodes 2>/dev/null | grep -v '^Node' | awk '{print $1,$6}')
---
> done < <(cman_tool nodes 2>/dev/null | grep -v '^Node' | awk '{print $1,$6}')

Now it returns exit code 0 and it performs as it should.

|
| > Fencing is working since the node that failed to send the PingAck gets
| > fenced (and rebooted).
|
| hm... see above.

You were right. It was not the DRBD fencing that worked; cman was the one
doing the fencing.

|
| > However, any ideas why this is happening since there is private link for
| > DRBD?
|
| > Also I fail to identify on the man pages and the on-line tutorial/manual,
| > the parameters that will make me fine tune this behavior, so I would also
| > appreciate some help on that too.
|
| increase the ping-timeout in drbd.conf.
|

I did so and now it looks OK.

Thank you Lars and All for your time.

| --
| : Lars Ellenberg
| : LINBIT | Your Way to High Availability
| : DRBD/HA support and consulting http://www.linbit.com
|
| DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
| __
| please don't Cc me, but send to list -- I'm subscribed
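
For anyone following the diff above, here is a minimal, self-contained sketch of the reworked loop. It is not the obliterate script itself. `fake_cman_nodes` is a hypothetical stand-in for `cman_tool nodes`, and its sample rows are invented; the only thing assumed from the diff is the column layout the awk expression relies on (node id in field 1, node name in field 6). The idea is to join id and name with a colon so that a plain `for` loop can walk the pairs without bash process substitution (`< <(...)`). The thread does not say why the original form returned 2, so treat this only as an illustration of the workaround.

```shell
#!/bin/sh
# Hypothetical stand-in for `cman_tool nodes`. The rows below are invented
# sample data that only mimic a layout with the id in column 1 and the
# node name in column 6.
fake_cman_nodes() {
    cat <<'EOF'
Node  Sts   Inc   Joined               Name
   1   M    348   2009-03-16 01:00:00  tweety-1
   2   M    352   2009-03-16 01:00:05  tweety-2
EOF
}

# Drop the header line, then join id and name with ":" so that each pair
# becomes a single whitespace-free word.
list_node_tags() {
    fake_cman_nodes | grep -v '^Node' | awk '{print $1 ":" $6}'
}

# Iterate with a plain for loop and split each pair again with cut.
print_nodes() {
    for NODETAG in $(list_node_tags); do
        nid=$(echo "$NODETAG" | cut -d : -f 1)
        nodename=$(echo "$NODETAG" | cut -d : -f 2)
        echo "nid=$nid nodename=$nodename"
    done
}

print_nodes
```

Note that this splitting relies on node names containing neither whitespace nor a colon.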