[DRBD-user] DRBD+Pacemaker: Won't promote with only one node

Dan Frincu df.cluster at gmail.com
Thu Jan 5 09:34:28 CET 2012


Hi,

On Wed, Jan 4, 2012 at 10:10 PM, William Seligman
<seligman at nevis.columbia.edu> wrote:
> I'll give the technical details in a moment, but I thought I'd start with a
> description of the problem.
>
> I have a two-node active/passive cluster, with DRBD controlled by Pacemaker. I
> upgraded to DRBD 8.4.x about six months ago (probably too soon); everything was
> fine. Then last week we did some power-outage tests on our cluster.
>
> Each node in the cluster is attached to its own uninterruptible power supply;
> the STONITH mechanism is to turn off the other node's UPS. In the event of an
> extended power outage (this happens 2-3 times a year at my site), it's likely
> that one node will STONITH the other when the other node's UPS runs out of power
> and shuts it down. This means that when power comes back on, only one node will
> come back up, since the STONITHed UPS won't turn on again without manual
> intervention.
>
> The problem is that with only one node, Pacemaker+DRBD won't promote the DRBD
> resource to primary; it just sits there at secondary and won't start up any
> DRBD-dependent resources. Only when the second node comes back up will Pacemaker
> assign one of them the primary role. I've confirmed this by shutting down
> corosync on both nodes, then bringing it up again on just one of them.
>

Could you also post your Pacemaker configuration?

Also, you might want to check
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#id890288
for no-quorum-policy. In a two-node cluster, losing one node means you
no longer have quorum, and unless you have something else acting as a
quorum device, the policy defaults to stop, so the remaining node will
not run (or promote) any resources.
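
To make the quorum arithmetic concrete, here is a minimal sketch in
Python. It is only a simplified model of the majority rule, not
Pacemaker's actual code; the function names are made up for
illustration, and only the "stop" and "ignore" policy values are
modelled.

```python
def has_quorum(total_nodes: int, online_nodes: int) -> bool:
    """Majority rule: a partition has quorum only if it holds
    strictly more than half of the configured nodes."""
    return 2 * online_nodes > total_nodes


def may_start_resources(total_nodes: int, online_nodes: int,
                        no_quorum_policy: str = "stop") -> bool:
    """Simplified view of what the cluster does without quorum.

    Hypothetical helper: models only "stop" (the default) and "ignore".
    """
    if no_quorum_policy not in ("stop", "ignore"):
        raise ValueError("only 'stop' and 'ignore' are modelled here")
    if has_quorum(total_nodes, online_nodes):
        return True
    # Without quorum, "stop" keeps resources down; "ignore" carries on.
    return no_quorum_policy == "ignore"


if __name__ == "__main__":
    # Two-node cluster with a single surviving node.
    print(has_quorum(2, 1))                      # 1 of 2 is not a majority
    print(may_start_resources(2, 1))             # default policy: stop
    print(may_start_resources(2, 1, "ignore"))   # policy overridden
```

With two nodes, one node alone never holds a majority, which matches
the symptom of DRBD staying Secondary until the peer returns. In the
crm shell the property is usually set with
"crm configure property no-quorum-policy=ignore"; whether that is safe
depends on having working STONITH.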

HTH,
Dan

> I'm pretty sure that this is due to a mistake I've made in my DRBD
> configuration when I fiddled with it during the 8.4.x upgrade. I've attached the
> files. Can one of you kind folks spot the error?
>
> Technical details:
>
> Two-node configuration: hypatia and orestes
> OS: Scientific Linux 5.5, kernel 2.6.18-238.19.1.el5xen
> Packages:
> drbd-8.4.1-1
> corosync-1.2.7-1.1.el5
> pacemaker-1.0.12-1.el5.centos
> openais-1.1.3-1.6.el5
>
> Attached: global_common.conf, nevis-admin.res
>
> --
> Bill Seligman             | Phone: (914) 591-2823
> Nevis Labs, Columbia Univ | mailto://seligman@nevis.columbia.edu
> PO Box 137                |
> Irvington NY 10533 USA    | http://www.nevis.columbia.edu/~seligman/
>



-- 
Dan Frincu
CCNA, RHCE


