[DRBD-user] Dual-Primary DRBD node fenced after other node reboots UP

Lars Ellenberg lars.ellenberg at linbit.com
Fri May 12 17:00:29 CEST 2017


On Fri, May 12, 2017 at 02:04:57AM +0530, Raman Gupta wrote:
> > I don't think this has anything to do with DRBD, because:
> OK.
> 
> > Apparently, something downed the NICs for corosync communication.
> > Which then leads to fencing.
> No problem with NICs.
> 
> > Maybe you should double check your network configuration,
> > and any automagic reconfiguration of the network,
> > and only start corosync once your network is "stable"?
> As another manifestation of similar problem of dual-Primary DRBD integrated
> with stonith enabled Pacemaker: When server7 goes down, the DRBD resource
> on surviving node server4 is attempted to be demoted as secondary.

*Why?*

DRBD would not do that by itself,
so likely pacemaker decided to do that,
and you have to figure out *why*.
Pacemaker will have logged the reasons somewhere.

Seeing that you have different "uname -n" and "pacemaker node names",
that may well be the source of all your troubles.

"crm-fence-peer.sh" assumes that the result of "uname -n"
is the local node's "pacemaker node name".

If "uname -n" and "crm_node -n" do not return the same thing for you,
the defaults will not work for you.
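
To make that check concrete, here is a minimal sketch that compares the
two names on the local node. The function names are made up for
illustration, and the exact, case-sensitive comparison is an assumption:
it is not taken from crm-fence-peer.sh itself.

```python
import subprocess


def node_names_match(uname_name: str, pacemaker_name: str) -> bool:
    """Return True if both names are identical after trimming whitespace.

    Assumption: an exact, case-sensitive match is what is required.
    """
    return uname_name.strip() == pacemaker_name.strip()


def read_command(argv):
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def check_local_node() -> bool:
    """Compare "uname -n" with "crm_node -n" on this node and report."""
    uname_name = read_command(["uname", "-n"])
    pacemaker_name = read_command(["crm_node", "-n"])
    same = node_names_match(uname_name, pacemaker_name)
    verdict = "match" if same else "MISMATCH"
    print(f"uname -n:    {uname_name.strip()}")
    print(f"crm_node -n: {pacemaker_name.strip()}")
    print(f"result:      {verdict}")
    return same
```

Call check_local_node() on each cluster node; it needs Pacemaker
installed so that crm_node is available. A mismatch on any node means
the defaults mentioned above do not apply there.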

> The
> demotion fails because DRBD is hosting a GFS2 volume and Pacemaker complains
> of this failure as an error.

Then in addition to all your other trouble,
you have missing dependency constraints.
IF pacemaker decides it needs to "demote" DRBD,
it should know that it has a file system mounted,
and should know that it needs to first unmount,
and that it needs to first stop services accessing that mount,
and so on.

If it did not attempt to do that, your pacemaker config is broken.
If it did attempt to do that and failed,
you will have to look into why, which, again, should be in the logs.

Double check constraints, and also double check if GFS2/DLM fencing is
properly integrated with pacemaker.
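
The dependency chain described above can be sketched as a toy model.
This is not how Pacemaker's scheduler is implemented, and the resource
names (drbd-master, gfs2-mount, app) are hypothetical. It only shows
why a missing ordering constraint leads to a demote attempt while the
file system is still mounted, assuming each "first, then" start order
also implies the reverse order on stop.

```python
def dependents_to_stop_first(target, order_pairs):
    """List everything that starts (transitively) after `target`,
    in the order it has to be stopped: most dependent first.

    order_pairs is a list of (first, then) tuples, meaning
    "`first` must be active before `then` is started".
    """
    children = {}
    for first, then in order_pairs:
        children.setdefault(first, []).append(then)

    seen = {target}   # guards against revisiting nodes (and cycles)
    stop_order = []

    def visit(node):
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                visit(child)              # stop the child's dependents first
                stop_order.append(child)  # then the child itself

    visit(target)
    return stop_order


# Hypothetical configuration with the full chain of constraints.
COMPLETE = [
    ("drbd-master", "gfs2-mount"),  # promote DRBD before mounting GFS2
    ("gfs2-mount", "app"),          # mount GFS2 before starting the service
]

# Same configuration with the DRBD -> GFS2 constraint missing.
MISSING = [
    ("gfs2-mount", "app"),
]

print(dependents_to_stop_first("drbd-master", COMPLETE))
print(dependents_to_stop_first("drbd-master", MISSING))
```

With the complete chain, the service and the mount are listed as things
to stop before the demote. With the constraint missing, the list is
empty, so nothing tells the cluster to unmount first, and the demote
fails because the device is still in use.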

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
