[DRBD-user] testing crm-fence-peer.sh

Lars Ellenberg lars.ellenberg at linbit.com
Fri Sep 24 00:03:49 CEST 2010

On Mon, Sep 20, 2010 at 10:18:49PM +0200, Pavlos Parissis wrote:
> Hi,
> I was testing crm-fence-peer.sh on a heartbeat/pacemaker cluster
> and I was wondering whether the message "Remote node did not
> respond", which I got 3 times, is normal.
> For the simulation I used iptables on the slave to break the
> communication link between the master and the slave. DRBD noticed
> the broken link immediately and invoked crm-fence-peer.sh.
> 
> Here is the log on the master. I broke the communication link for
> only one of my DRBD resources (drbd_pbx_service_1):
> Sep 20 22:07:22 node-01 kernel: block drbd1: PingAck did not arrive in time.
> Sep 20 22:07:22 node-01 kernel: block drbd1: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> Sep 20 22:07:22 node-01 kernel: block drbd1: asender terminated
> Sep 20 22:07:22 node-01 kernel: block drbd1: Terminating asender thread
> Sep 20 22:07:22 node-01 kernel: block drbd1: short read expecting header on sock: r=-512
> Sep 20 22:07:22 node-01 kernel: block drbd1: Creating new current UUID
> Sep 20 22:07:22 node-01 kernel: block drbd1: Connection closed
> Sep 20 22:07:22 node-01 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1
> Sep 20 22:07:22 node-01 crm-fence-peer.sh[14877]: invoked for drbd_pbx_service_1
> Sep 20 22:07:22 node-01 cibadmin: [14881]: info: Invoked: cibadmin -Ql
> Sep 20 22:07:22 node-01 cibadmin: [14890]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:24 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:24 node-01 cibadmin: [14905]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:25 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:25 node-01 cibadmin: [14913]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:27 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:27 node-01 cibadmin: [14958]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:29 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:29 node-01 cibadmin: [14966]: info: Invoked: cibadmin -Q -t 2
> Sep 20 22:07:31 node-01 cibadmin: [14992]: info: Invoked: cibadmin -C
> 	-o constraints -X <rsc_location rsc="ms-drbd_01"
> 	id="drbd-fence-by-handler-ms-drbd_01">   <rule role="Master"
> 	score="-INFINITY" id="drbd-fence-by-handler-rule-ms-drbd_01">
> 	<expression attribute="#uname" operation="ne" value="node-01"
> 	id="drbd-fence-by-handler-expr-ms-drbd_01"/>   </rule> </rsc_location>
> Sep 20 22:07:33 node-01 crm-fence-peer.sh[14877]: INFO peer is reachable, my disk is UpToDate: placed constraint 'drbd-fence-by-handler-ms-drbd_01'


Yes.
http://git.linbit.com/?p=drbd-8.3.git;a=blob;f=scripts/crm-fence-peer.sh;h=ea461f884963e7fe9c1d21ca97d74cdc4fb27285;hb=68ee998421a014e931b398ed21fd738c9e9a5d12#l322

(That url is ugly, I meant to say look at lines around 322 in that script.)

We start with a timeout of 1 second;
apparently that is not enough in your setup to get an answer.

If the message disturbs you, feel free to increase the initial timeout there.
- local cibtimeout=10
+ local cibtimeout=29

(or something like that).
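
To see why the log shows four calls with -t 1 before the first -t 2,
here is a small sketch of the retry arithmetic. It assumes, rather than
quotes, that the script keeps cibtimeout in tenths of a second, passes
the integer part of cibtimeout/10 to cibadmin -t, and multiplies the
value by 5/4 after each failed query. timeout_sequence is a hypothetical
helper for illustration only; it is not part of crm-fence-peer.sh.

```shell
#!/bin/bash
# Hypothetical helper, NOT taken from crm-fence-peer.sh.
# Assumptions: the timeout variable is in tenths of a second, cibadmin
# gets int(cibtimeout/10), and each failure scales the value by 5/4.
timeout_sequence() {
    local cibtimeout=$1   # initial value, e.g. 10
    local attempts=$2     # how many attempts to list
    local i
    for (( i = 0; i < attempts; i++ )); do
        # value that would be handed to "cibadmin -Q -t" on this attempt
        echo $(( cibtimeout / 10 ))
        # try again with a longer timeout (integer arithmetic)
        cibtimeout=$(( cibtimeout * 5 / 4 ))
    done
}

# Print the first five -t values for two starting points.
echo "start 10:" $(timeout_sequence 10 5)
echo "start 29:" $(timeout_sequence 29 5)
```

Under those assumptions a starting value of 10 yields the -t values
1 1 1 1 2, which is the sequence visible in the log above, while a
starting value of 29 begins at 2 seconds and grows from there.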

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


