[DRBD-user] pingAck failed - help to avoid it?

Theophanis Kontogiannis theophanis_kontogiannis at yahoo.gr
Fri Mar 20 15:35:17 CET 2009



Hello Lars and Everyone

Please see my replies inline below.

| -----Original Message-----
| From: Lars Ellenberg [mailto:lars.ellenberg at linbit.com]
| Sent: Monday, March 16, 2009 12:38 PM
| To: drbd-user at lists.linbit.com
| Subject: Re: [DRBD-user] pingAck failed - help to avoid it?
| 
| On Mon, Mar 16, 2009 at 01:48:12AM +0200, Theophanis Kontogiannis wrote:
| > Hello all
| >
| >
| >
| > I have a two-node cluster on CentOS 5.2, kernel
| > 2.6.18-92.1.22.el5.centos.plus, drbd-8.3.0-3 and
| > drbd-km-2.6.18_92.1.22.el5.centos.plus-8.3.0-3, compiled and installed
| > as rpm by myself.
| >
| >
| >
| > Though I do have two GigabitEth NICs connected back-to-back for DRBD and
| > clustering, from time to time, especially during heavy traffic on the
| > public GigEth interfaces of the cluster nodes, I get the following:
| >
| >
| >
| > drbd0: PingAck did not arrive in time.
| 
| > drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
| > drbd0: asender terminated
| > drbd0: Terminating asender thread
| > drbd0: short read expecting header on sock: r=-512
| > drbd0: Creating new current UUID
| > drbd0: Connection closed
| > drbd0: helper command: /sbin/drbdadm fence-peer minor-0
| > drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 2 (0x200)
| > drbd0: fence-peer helper broken, returned 2
| 
| hm?? what is that about?
| what did you configure for fencing?
| why does it return 2?
| 

I am using the obliterate script (http://people.redhat.com/lhh/obliterate)
for fencing.

However, I realized that the script as originally posted was exiting with
code 2.

I had to change the script as follows (lines marked < are my modified
version, lines marked > are the original, kept as /sbin/obliterate~):

[root@tweety-2 ~]# diff /sbin/obliterate /sbin/obliterate~

29,37c29
< for NODETAG in `(cman_tool nodes 2>/dev/null | grep -v '^Node' | awk '{print $1 ":" $6}')` ; do
<
< #while read nid nodename; do
<
< #echo $NODETAG
< nid=`echo $NODETAG|cut -d : -f 1`
< nodename=`echo $NODETAG|cut -d : -f 2`
<
<
---
> while read nid nodename; do
48c40
< done #< <(cman_tool nodes 2>/dev/null | grep -v '^Node' | awk '{print $1,$6}')
---
> done < <(cman_tool nodes 2>/dev/null | grep -v '^Node' | awk '{print $1,$6}')

Now it returns exit code 0 and it performs as it should.
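
The for-loop form from the diff above can be tried on its own with the
minimal sketch below. It is not the obliterate script itself. The
sample_nodes function is a hypothetical stand-in for
`cman_tool nodes 2>/dev/null`, and the host name tweety-1 and the column
values are made up; only the field positions ($1 = node id, $6 = node name)
are taken from the awk call in the diff. One plausible reason the original
`done < <(...)` form failed (an assumption, not something verified in this
thread) is that process substitution is a bash feature and is not available
under a plain POSIX sh.

```shell
#!/bin/sh
# Hypothetical stand-in for `cman_tool nodes 2>/dev/null` (sample data only).
sample_nodes() {
    printf '%s\n' \
        'Node  Sts   Inc   Joined               Name' \
        '   1   M    120   2009-03-16 01:00:00  tweety-1' \
        '   2   M    124   2009-03-16 01:00:05  tweety-2'
}

# Emit one "id:name" token per node, dropping the header line.
node_tags() {
    sample_nodes | grep -v '^Node' | awk '{print $1 ":" $6}'
}

# Walk the tokens with a plain for loop; no process substitution is needed.
list_nodes() {
    for NODETAG in $(node_tags); do
        nid=$(echo "$NODETAG" | cut -d : -f 1)
        nodename=$(echo "$NODETAG" | cut -d : -f 2)
        echo "node $nid is $nodename"
    done
}

list_nodes
```

Joining id and name with a colon keeps each node in a single
whitespace-free token, which is what lets an unquoted `$(...)` be split
safely by the for loop.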

| 
| > Fencing is working since the node that failed to send the PingAck gets
| > fenced (and rebooted).
| 
| hm... see above.

You were right.
It was not the DRBD fencing that worked; cman was the one doing the fencing.

| 
| > However, any ideas why this is happening, since there is a private link
| > for DRBD?
| 
| > Also, I cannot find in the man pages or the on-line tutorial/manual the
| > parameters that would let me fine-tune this behavior, so I would
| > appreciate some help on that too.
| 
| increase the ping-timeout in drbd.conf.
| 


I did so and now it looks OK.
Thank you Lars and All for your time.
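
For anyone finding this in the archive: the value that was finally used is
not recorded in this thread. As a rough sketch only, assuming a resource
named r0 (hypothetical) and an illustrative timeout of 2 seconds, the
setting lives in the net section of the resource in drbd.conf. According to
the DRBD 8.3 drbd.conf man page the unit is tenths of a second and the
default is 5 (500 ms).

```
resource r0 {                # "r0" is a placeholder resource name
  net {
    ping-timeout 20;         # tenths of a second; 20 = 2 s (illustrative value)
  }
}
```

The change has to be present on both nodes, and it is picked up with
`drbdadm adjust r0` or when the resource is next brought up.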


| --
| : Lars Ellenberg
| : LINBIT | Your Way to High Availability
| : DRBD/HA support and consulting http://www.linbit.com
| 
| DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
| __
| please don't Cc me, but send to list   --   I'm subscribed





