Hello list,
I described this problem before on the Pacemaker mailing list. I am
currently experimenting in my test lab with a DRBD dual-primary
setup.
I am running DRBD 8.4.5 on top of Debian Wheezy with Pacemaker 1.1.7.
The problem I see is that in some situations Pacemaker promotes my
DRBD resources to Primary so quickly that they are not yet connected.
Once they connect, both are Primary, so they detect a split brain
and disconnect immediately.
I have done some research on this problem and came to the conclusion
that it is related to the Master Score the drbd RA is reporting.
For example, I stop a resource via Pacemaker (the resource is initially
in Primary/Primary state). Afterwards I try to start the resource again
and run into a split brain. I can avoid this by setting a location
constraint for the Master role with a rather low score of <-1000. This
is necessary because the DRBD RA on both nodes reports a score of 1000
as soon as the resource is started on both sides.
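
To illustrate why the constraint has to be below -1000, here is a rough
sketch of the arithmetic. It assumes a simplified model in which the
score reported by the RA and the location constraint score are simply
added, and a negative sum prevents promotion. The function names are
made up for illustration and are not part of Pacemaker or the DRBD RA.

```python
def effective_promotion_score(ra_master_score: int, location_score: int) -> int:
    """Add the RA-reported master score and the Master-role location score.

    Simplified model (an assumption): ignores INFINITY handling and any
    other constraints or stickiness that may also contribute.
    """
    return ra_master_score + location_score


def may_promote(ra_master_score: int, location_score: int) -> bool:
    """Assume promotion is blocked only when the combined score is negative."""
    return effective_promotion_score(ra_master_score, location_score) >= 0


# The RA reports 1000 on both nodes right after start.
print(may_promote(1000, -1000))  # combined score 0, not negative
print(may_promote(1000, -1001))  # combined score -1, promotion blocked
```

Under this model a constraint of exactly -1000 only cancels the RA's
score; anything lower pushes the sum below zero.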
So here comes my question.
The RA metadata description states:
  adjust_master_score (string, [5 10 1000 10000]): master score adjustments
  Space separated list of four master score adjustments for different
  scenarios:
  - only access to 'consistent' data
  - only remote access to 'uptodate' data
  - currently Secondary, local access to 'uptodate' data, but remote
    is unknown
So by default 1000 is reported if the resource has uptodate data.
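
As a small sketch of how the four values line up with the scenarios
quoted above: the parser below is purely illustrative and is not the
RA's actual code. The field names are my own, and the fourth scenario
is left unnamed because it is not part of the quoted description.

```python
from typing import NamedTuple


class MasterScoreAdjustments(NamedTuple):
    """Hypothetical container for the four adjust_master_score values."""
    consistent_only: int
    remote_uptodate_only: int
    secondary_local_uptodate_remote_unknown: int
    fourth_scenario: int  # not quoted in the description above


def parse_adjust_master_score(value: str = "5 10 1000 10000") -> MasterScoreAdjustments:
    """Split the space separated parameter into its four integer scores."""
    parts = value.split()
    if len(parts) != 4:
        raise ValueError(f"expected four scores, got {len(parts)}: {value!r}")
    return MasterScoreAdjustments(*(int(p) for p in parts))


scores = parse_adjust_master_score()
# Score for "currently Secondary, local 'uptodate' data, remote unknown"
print(scores.secondary_local_uptodate_remote_unknown)  # 1000
```

With the defaults, the "remote is unknown" case already yields 1000,
which is the value I see on both nodes.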
Is it intended that a resource that was disconnected gracefully
reports uptodate data after being restarted, even if the other node
was still there at the time of the disconnect? I would expect at least
one node to assume that the other may well have newer data.
This also happens if I, for example, put one node into standby, reboot
it and let it rejoin the cluster. Again Pacemaker promotes to Primary
almost instantly, while DRBD is still in WFConnection state, and
afterwards a split brain is detected. Again I get a master score of
1000 from the RA here.
This seems a bit odd to me, as I cannot seriously use the "only remote
access to 'uptodate' data" state: it is scored between two options
which kill my cluster.
I also tried the stop_outdates_secondary="true" option, which I
assumed would outdate the data on the secondary on any stop action, so
that afterwards the RA would report a master score of 5 according to
the documentation. This does not seem to do anything for me either. I
know it is called outdates SECONDARY, but for a short moment while
stopping, the resource should be Secondary too, if I see this correctly.
I can provide log files if needed.
Thank you in advance for any hints,
regards, Felix