[DRBD-user] DRBD RA and dual primary problem with Master Score

Thu Oct 2 21:27:35 CEST 2014

Hello list,

I described this problem earlier on the Pacemaker mailing list. I am
currently experimenting in my test lab with setting up a DRBD
dual-primary environment.

I am running DRBD 8.4.5 on top of Debian Wheezy with Pacemaker 1.1.7.

The problem I see is that in some situations Pacemaker promotes my DRBD
resources to Primary so quickly that the nodes are not yet connected.
Once they do connect, both are Primary, so they detect a split brain
and disconnect immediately.

I have done some research on this problem and came to the conclusion 
that it is related to the Master Score the drbd RA is reporting.

For example, I stop a resource via Pacemaker (the resource is initially
in Primary/Primary state). Afterwards I try to start the resource again
and run into a split brain. I can avoid this by setting a location
constraint for the Master role with a rather low score of <-1000. That
works because the DRBD RA on both nodes reports a score of 1000 as soon
as the resource is started on both sides.
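
To show the arithmetic I have in mind, here is a minimal Python sketch.
It is only my simplified model, not Pacemaker's actual allocation code:
I assume the RA's master score and the Master-role location constraint
score are simply added, and that a negative total blocks promotion. The
function names are made up for illustration.

```python
# Simplified model (an assumption, not Pacemaker's real logic): the
# promotion preference of a node is the master score reported by the RA
# plus the score of a location constraint for the Master role.

RA_MASTER_SCORE = 1000  # what the DRBD RA reports on both nodes after start


def effective_promotion_score(ra_score, constraint_score):
    """Add the RA-reported master score and the constraint score."""
    return ra_score + constraint_score


def may_promote(ra_score, constraint_score):
    """Assumption: only a negative total keeps the node from being promoted."""
    return effective_promotion_score(ra_score, constraint_score) >= 0


if __name__ == "__main__":
    # Without a constraint the node is promotable; a score below -1000
    # pushes the total under zero.
    for constraint in (0, -1001):
        total = effective_promotion_score(RA_MASTER_SCORE, constraint)
        print(constraint, total, may_promote(RA_MASTER_SCORE, constraint))
```

In this model a constraint score has to be below -1000 to outweigh the
1000 coming from the RA, which matches what I observe.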

So here comes my question.

The RA meta data description states:

adjust_master_score (string, [5 10 1000 10000]): master score adjustments
     Space separated list of four master score adjustments for different scenarios:
      - only access to 'consistent' data
      - only remote access to 'uptodate' data
      - currently Secondary, local access to 'uptodate' data, but remote is unknown

So by default, 1000 is reported if the resource has uptodate data.
Is it intended behaviour that a resource which was disconnected
gracefully reports uptodate data after being restarted, even if the
other node was still there at the time of the disconnect? I would
expect at least one node to assume that the other most probably has
newer data.
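
To make my reading of the quoted description concrete, here is a small
Python sketch that pairs the default values with the scenarios. It is a
hypothetical helper, not code from the DRBD RA, and it assumes the
values map to the scenarios in the order they are listed. The fourth
scenario is not part of my quote above, so it only gets a placeholder
label.

```python
# Hypothetical helper, not part of the DRBD RA: pair the four
# adjust_master_score values with the scenarios from the RA description.
# Assumption: the values apply to the scenarios in the listed order.

DEFAULT_ADJUST_MASTER_SCORE = "5 10 1000 10000"  # default from the RA meta data

SCENARIOS = (
    "only access to 'consistent' data",
    "only remote access to 'uptodate' data",
    "currently Secondary, local access to 'uptodate' data, but remote is unknown",
    "fourth scenario (not quoted in this mail)",
)


def parse_adjust_master_score(value):
    """Split the space-separated parameter into a scenario -> score dict."""
    scores = [int(part) for part in value.split()]
    if len(scores) != len(SCENARIOS):
        raise ValueError(
            "expected %d values, got %d" % (len(SCENARIOS), len(scores))
        )
    return dict(zip(SCENARIOS, scores))


if __name__ == "__main__":
    mapping = parse_adjust_master_score(DEFAULT_ADJUST_MASTER_SCORE)
    for scenario, score in mapping.items():
        print("%6d  %s" % (score, scenario))
```

Read this way, the 1000 I am seeing belongs to the "remote is unknown"
case, which is exactly the state a freshly started, not yet connected
resource is in.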

This also happens if I, for example, put one node into standby, reboot
it and let it rejoin the cluster. Again Pacemaker promotes to Primary
almost instantly, while DRBD is still in WFConnection state, and
afterwards a split brain is detected. Again the RA reports a master
score of 1000 here.

This seems a bit odd to me, as I cannot seriously use the "only remote
access to 'uptodate' data" state: it is scored between two options that
kill my cluster.

I also tried the stop_outdates_secondary="true" option, which I assumed
would outdate the data on the secondary on any stop action, so that the
RA would afterwards report a master score of 5 according to the
documentation. This seems to have no effect for me either. I know it is
called "outdates SECONDARY", but for a short moment during the stop the
resource should be Secondary too, if I understand this correctly.

I can provide reference to log files if needed.

Thank you in advance for any hints,

regards, Felix