[DRBD-user] examples of dual primary DRBD

Bart Coninckx bart.coninckx at telenet.be
Mon Oct 10 11:45:16 CEST 2011


On 10/10/11 11:39, Bart Coninckx wrote:
> On 10/10/11 11:30, Lars Ellenberg wrote:
>> On Sat, Oct 08, 2011 at 01:31:28PM +0200, Bart Coninckx wrote:
>>> On 10/08/11 00:25, Lars Ellenberg wrote:
>>>> On Fri, Oct 07, 2011 at 10:21:08PM +0200, Bart Coninckx wrote:
>>>>> On 10/06/11 22:03, Florian Haas wrote:
>>>>>> On 2011-10-06 21:43, Bart Coninckx wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> would you mind sending me examples of your crm config for a dual
>>>>>>> primary
>>>>>>> DRBD resource?
>>>>>>>
>>>>>>> I used the one on
>>>>>>>
>>>>>>> http://www.drbd.org/users-guide/s-ocfs2-pacemaker.html
>>>>>>>
>>>>>>> and on
>>>>>>>
>>>>>>> http://www.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2
>>>>>>>
>>>>>>> and they both result in split brain, except when I start drbd
>>>>>>> manually first.
>>>>>>
>>>>>> They clearly should not. Rather than soliciting other people's
>>>>>> configurations and then try to adapt yours based on that, why
>>>>>> don't you
>>>>>> upload _your_ CIB (not just a "crm configure dump", but a full
>>>>>> "cibadmin
>>>>>> -Q") and your DRBD configuration to your pastebin/pastie/fpaste
>>>>>> and let
>>>>>> people tell you where your problem is?
>>>>>
>>>>> OK, I posted the drbd.conf on http://pastebin.com/SQe9YxhY
>>>>>
>>>>> cibadmin -Q is on http://pastebin.com/gTZqsACq
>>>>>
>>>>> The split brain logging is on http://pastebin.com/7unKKkdi .
>>>>
>>>> I somehow think you added some "--force" or "--overwrite-data-of-peer"
>>>> to some drbdadm/drbdsetup primary invocation?
>>>
>>> Hi,
>>>
>>> I re-created the metadata to start all over, and the
>>>
>>> drbdadm -- --overwrite-data-of-peer primary r_test
>>>
>>> command has to be run, according to the SLES docs, for the initial sync.
>>>
>>> So if that particular command is the problem, we either have faulty
>>> documentation or me wrongly interpreting the docs.
>>
>> Sure. On _one_ node only.
>> Not on both, which I think you did.
>>
>> If you did not, you'd need to post your log from _before_, from where
>> the drbd was last connected before it then detected the data divergence
>> (aka "split-brain").
>>
>
> Spot on Lars. I did it on both. Looking for a big heavy hammer to hit me
> with, as the documentation clearly states this should happen on just one
> node. The rationale probably is that the metadata gets synced along with
> the normal data, correct?
>
> thx,
>
> B.
>
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

Wait a sec, I'm mixing up things. I created the metadata on both, the

drbdadm -- --overwrite-data-of-peer primary all

happened on just one node.

I will gather the necessary logfiles by reproducing the problem.
Mind you, the problem is temporarily "fixed" by adding a short delay in
the resource agent on one of the nodes. It seems as if DRBD needs to do
a quick sync to get the UUIDs straight.
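
To illustrate the idea (this is a sketch, not the change actually made to
the resource agent): instead of a fixed delay, a wrapper could poll the
connection state and only promote once the peers have completed their
handshake. The function name wait_for_drbd_connected, the 30 second default
timeout, and the decision to treat Connected, SyncSource and SyncTarget as
"handshake done" are all assumptions made for this example. "drbdadm cstate"
is the standard subcommand for printing the connection state.

```shell
#!/bin/sh
# Sketch only: poll the DRBD connection state rather than sleeping blindly.
# $1 = resource name, $2 = timeout in seconds (default 30, arbitrary choice).
wait_for_drbd_connected() {
    res="$1"
    timeout="${2:-30}"
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # "drbdadm cstate" prints e.g. WFConnection, Connected, SyncTarget
        state="$(drbdadm cstate "$res" 2>/dev/null)"
        case "$state" in
            # Assumption: these states mean the peers have exchanged UUIDs.
            Connected|SyncSource|SyncTarget)
                return 0
                ;;
        esac
        sleep 1
        waited=$((waited + 1))
    done
    # Peer never showed up within the timeout; let the caller decide.
    return 1
}

# Hypothetical usage, not taken from any shipped resource agent:
# wait_for_drbd_connected r_test 30 && drbdadm primary r_test
```

If the peer never connects, the function returns non-zero and nothing gets
promoted, which is probably safer than promoting after an arbitrary sleep.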

B.





