Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
On Mon, Sep 20, 2010 at 10:18:49PM +0200, Pavlos Parissis wrote:
> Hi,
> I was testing the testing crm-fence-peer.sh on heartbeat/pacemaker
> cluster and I was wondering if the message " Remote node did not
> respond" which I got 3 times, is normal.
> For the simulation I used iptables on the slave to break the
> communication link between the master and slave. The drbd noticed
> immediately the broken link and invoked the crm-fence-peer.sh.
>
> here is the log on the master and I broke the communication link only
> for one of my drbd resources (drbd_pbx_service_1)
> Sep 20 22:07:22 node-01 kernel: block drbd1: PingAck did not arrive in time.
> Sep 20 22:07:22 node-01 kernel: block drbd1: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> Sep 20 22:07:22 node-01 kernel: block drbd1: asender terminated
> Sep 20 22:07:22 node-01 kernel: block drbd1: Terminating asender thread
> Sep 20 22:07:22 node-01 kernel: block drbd1: short read expecting header on sock: r=-512
> Sep 20 22:07:22 node-01 kernel: block drbd1: Creating new current UUID
> Sep 20 22:07:22 node-01 kernel: block drbd1: Connection closed
> Sep 20 22:07:22 node-01 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1
> Sep 20 22:07:22 node-01 crm-fence-peer.sh[14877]: invoked for drbd_pbx_service_1
> Sep 20 22:07:22 node-01 cibadmin: [14881]: info: Invoked: cibadmin -Ql
> Sep 20 22:07:22 node-01 cibadmin: [14890]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:24 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:24 node-01 cibadmin: [14905]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:25 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:25 node-01 cibadmin: [14913]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:27 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:27 node-01 cibadmin: [14958]: info: Invoked: cibadmin -Q -t 1
> Sep 20 22:07:29 node-01 crm-fence-peer.sh[14877]: Call cib_query failed (-41): Remote node did not respond
> Sep 20 22:07:29 node-01 cibadmin: [14966]: info: Invoked: cibadmin -Q -t 2
> Sep 20 22:07:31 node-01 cibadmin: [14992]: info: Invoked: cibadmin -C -o constraints -X <rsc_location rsc="ms-drbd_01" id="drbd-fence-by-handler-ms-drbd_01"> <rule role="Master" score="-INFINITY" id="drbd-fence-by-handler-rule-ms-drbd_01"> <expression attribute="#uname" operation="ne" value="node-01" id="drbd-fence-by-handler-expr-ms-drbd_01"/> </rule> </rsc_location>
> Sep 20 22:07:33 node-01 crm-fence-peer.sh[14877]: INFO peer is reachable, my disk is UpToDate: placed constraint 'drbd-fence-by-handler-ms-drbd_01'

Yes.

http://git.linbit.com/?p=drbd-8.3.git;a=blob;f=scripts/crm-fence-peer.sh;h=ea461f884963e7fe9c1d21ca97d74cdc4fb27285;hb=68ee998421a014e931b398ed21fd738c9e9a5d12#l322
(That url is ugly, I meant to say look at lines around 322 in that script.)

We start with a timeout of 1 second, apparently that is not enough in
your setup to get an answer.

If the message disturbs you, feel free to increase the initial timeout there.
-	local cibtimeout=10
+	local cibtimeout=29
(or something like that).

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed
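
For readers wondering how an initial value of 10 ends up as "-t 1" in the log, here is a minimal sketch of one retry schedule that would produce the logged sequence (four calls with -t 1, then one with -t 2). It rests on assumptions that are inferred from the log and not quoted from crm-fence-peer.sh: that cibtimeout is counted in tenths of a second, that it is divided by 10 (integer division) to get the -t argument, and that it grows by a factor of 5/4 after each failed query. The function name timeout_schedule is hypothetical and exists only for this illustration; check the script itself around the line referenced above before relying on any of this.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of crm-fence-peer.sh.
# Prints the whole-second timeout used for each query attempt.
# ASSUMPTIONS (inferred from the log, not confirmed by the script):
#   - the initial value is in tenths of a second
#   - the -t argument is the value divided by 10, rounded down
#   - the value grows by 5/4 (integer arithmetic) after every attempt
timeout_schedule() {
    cibtimeout=$1     # initial timeout, tenths of a second
    attempts=$2       # how many attempts to list
    out=""
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # append the per-attempt timeout, space separated
        out="$out${out:+ }$((cibtimeout / 10))"
        cibtimeout=$((cibtimeout * 5 / 4))
        i=$((i + 1))
    done
    echo "$out"
}

timeout_schedule 10 5   # initial value shown in the diff above
timeout_schedule 29 5   # suggested replacement value
```

Under those assumptions the first call prints "1 1 1 1 2", matching the -t values in the log, and the second prints "2 3 4 5 7", so raising the initial value mainly gives the first query more time before the "Remote node did not respond" message can appear.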