[DRBD-user] Dual-Primary DRBD node fenced after other node reboots UP

Lars Ellenberg lars.ellenberg at linbit.com
Wed May 10 15:31:30 CEST 2017


On Wed, May 10, 2017 at 02:07:45AM +0530, Raman Gupta wrote:
> Hi,
> 
> In a Pacemaker 2 node cluster with dual-Primary DRBD(drbd84) with
> GFS2/DLM/CLVM setup following issue happens:
> 
> Steps:
> ---------
> 1) Successfully created Pacemaker 2 node cluster with DRBD master/slave
> resources integrated.
> 2) Cluster nodes: server4 and server7
> 3) The server4 node is rebooted.
> 4) When server4 comes Up the server7 is stonith'd and is lost! The node
> server4 survives.
> 
> Problem:
> -----------
> Problem is #4 above, when server4 comes up why server7 is stonith'd?
> 
> From surviving node server4 the DRBD logs seems to be OK: DRBD has moved to
> Connected/UpToDate state. Suddenly server7 is rebooted (stonithd/fenced)
> between time  00:47:35 <--> 00:47:42 in below logs.

I don't think this has anything to do with DRBD, because:

> 
> /var/log/messages at server4
> ------------------------------------------------

> May 10 00:47:41 server4 kernel: tg3 0000:02:00.1 em4: Link is down
> May 10 00:47:42 server4 kernel: tg3 0000:02:00.0 em3: Link is down
> May 10 00:47:42 server4 corosync[12570]: [TOTEM ] A processor failed,
> forming new configuration.
> May 10 00:47:43 server4 stonith-ng[12593]:  notice: Operation 'reboot'
> [13018] (call 2 from crmd.13562) for host 'server7ha' with device
> 'vCluster-Stonith-server7ha' returned: 0 (OK)


There.

Apparently, something took down the NICs used for corosync communication
(em4 and em3 both log "Link is down" right before the TOTEM failure),
which then led to fencing.
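
To check that correlation over a longer log, something like the following
rough sketch might help. It only looks for the two message patterns seen in
the excerpt above; the function names, the 5 second window, and the year
argument (syslog timestamps carry no year) are my own assumptions, not
anything DRBD or corosync ship.

```python
import re
from datetime import datetime

# Sample lines taken from the log excerpt quoted above (wrapped lines rejoined).
SAMPLE_LOG = """\
May 10 00:47:41 server4 kernel: tg3 0000:02:00.1 em4: Link is down
May 10 00:47:42 server4 kernel: tg3 0000:02:00.0 em3: Link is down
May 10 00:47:42 server4 corosync[12570]: [TOTEM ] A processor failed, forming new configuration.
"""

# "<Mon> <day> <HH:MM:SS> <host> <rest of message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) (.*)$")
# Kernel link-down message; captures the interface name (e.g. em4).
LINK_DOWN_RE = re.compile(r"kernel: .*\b(\w+): Link is down")


def parse_time(stamp, year):
    """Parse a syslog timestamp; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def link_downs_before_totem_failure(text, window_seconds=5, year=2017):
    """Return (interface, seconds_before_failure) for each link-down event
    logged at most window_seconds before a corosync "A processor failed"."""
    link_downs = []
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        when = parse_time(match.group(1), year)
        message = match.group(3)
        down = LINK_DOWN_RE.search(message)
        if down:
            link_downs.append((when, down.group(1)))
        elif "A processor failed" in message:
            for down_time, iface in link_downs:
                delta = (when - down_time).total_seconds()
                if 0 <= delta <= window_seconds:
                    results.append((iface, delta))
    return results


if __name__ == "__main__":
    for iface, delta in link_downs_before_totem_failure(SAMPLE_LOG):
        print(f"{iface} went down {delta:.0f}s before the membership change")
```

If every membership change in /var/log/messages is preceded by link-down
events like this, the network is the place to look, not DRBD.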

Maybe you should double check your network configuration,
and any automagic reconfiguration of the network,
and only start corosync once your network is "stable"?
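
One possible way to approximate "stable" is sketched below: poll the link
state and only proceed once all cluster interfaces have stayed up for a
while. This is an illustration under assumptions, not a tested recipe: it
assumes Linux sysfs (/sys/class/net/<iface>/operstate reporting "up"), that
em3 and em4 really are the corosync interfaces, and the helper names and
timings are made up.

```python
import sys
import time
from pathlib import Path


def link_is_up(iface, sysfs_root="/sys/class/net"):
    """True if the kernel reports operstate "up" for iface (Linux sysfs).
    Some drivers report "unknown"; this sketch treats that as not up."""
    try:
        state = (Path(sysfs_root) / iface / "operstate").read_text().strip()
    except OSError:
        return False
    return state == "up"


def wait_for_stable_links(ifaces, stable_seconds=10.0, timeout=120.0,
                          poll=1.0, is_up=link_is_up,
                          clock=time.monotonic, sleep=time.sleep):
    """Return True once all ifaces have been up continuously for
    stable_seconds; return False if that does not happen within timeout."""
    deadline = clock() + timeout
    stable_since = None
    while clock() < deadline:
        now = clock()
        if all(is_up(iface) for iface in ifaces):
            if stable_since is None:
                stable_since = now
            if now - stable_since >= stable_seconds:
                return True
        else:
            # Any flap resets the stability timer.
            stable_since = None
        sleep(poll)
    return False


if __name__ == "__main__":
    # Interface names from the log above; adjust to the real ring interfaces.
    ok = wait_for_stable_links(["em3", "em4"])
    sys.exit(0 if ok else 1)
```

A script like this could be run before corosync is started at boot, so
that corosync is only launched when it exits 0. How to hook it in depends
on the distribution and init system.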


-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed

