[DRBD-user] examples of dual primary DRBD

Florian Haas florian at hastexo.com
Mon Oct 10 12:12:43 CEST 2011

On 2011-10-08 15:55, Bart Coninckx wrote:
> On 10/08/11 00:25, Lars Ellenberg wrote:
>> On Fri, Oct 07, 2011 at 10:21:08PM +0200, Bart Coninckx wrote:
>>> On 10/06/11 22:03, Florian Haas wrote:
>>>> On 2011-10-06 21:43, Bart Coninckx wrote:
>>>>> Hi all,
>>>>> would you mind sending me examples of your crm config for a dual
>>>>> primary DRBD resource?
>>>>> I used the one on
>>>>> http://www.drbd.org/users-guide/s-ocfs2-pacemaker.html
>>>>> and on
>>>>> http://www.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2
>>>>> and they both result in split brain, except when I start drbd
>>>>> manually first.
>>>> They clearly should not. Rather than soliciting other people's
>>>> configurations and then trying to adapt yours based on that, why don't
>>>> you upload _your_ CIB (not just a "crm configure dump", but a full
>>>> "cibadmin -Q") and your DRBD configuration to your
>>>> pastebin/pastie/fpaste and let people tell you where your problem is?
>>> OK, I posted the drbd.conf on http://pastebin.com/SQe9YxhY
>>> cibadmin -Q is on http://pastebin.com/gTZqsACq
>>> The split brain logging is on http://pastebin.com/7unKKkdi .
>> I somehow think you added some "--force" or "--overwrite-data-of-peer"
>> to some drbdadm/drbdsetup primary invocation?
>>> Could this be some sort of timing issue? Manually things are fine,
>>> but there are some seconds in between the primary promotions.
> OK, seems to be some sort of timing issue. I "fixed" this by adding a
> "sleep 1" in the RA right before the "do_drbdadm primary $DRBD_RESOURCE"
> line.
> I'm surprised though that I'm the first one to run into this.

Er, wait. I'm cross-posting this to the Pacemaker list on a hunch.

Andrew, in Boston last year you mentioned you were planning to implement
a change to Master/Slave sets in which, iirc, startup and promotion
would happen in one fell swoop (I believe the NTT folks made a
compelling case for this). Has that change ever been implemented? And if
so, at which Pacemaker version? Is there a configuration option to
revert to the old behavior, where the resource would be started
first, and then promotion would occur some time after that?

