[DRBD-user] Using quorum in three node cluster results in split brain

Markus Hochholdinger Markus at hochholdinger.net
Tue Mar 21 15:45:19 CET 2023


Hi,

I still have no solution for this. I tried different versions of drbd:
9.2.2
9.2.1
9.1.12
I also tried the third node as a diskful node, with no success; the result is 
still two primaries and a split brain.

Nevertheless, the documentation clearly describes how it SHOULD work:
https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-configuring-quorum
4.21.3. Using a Diskless Node as a Tiebreaker
The connection between the primary and secondary has failed, and the 
application is continuing to run on the primary, when the primary suddenly 
loses its connection to the diskless node.
In this case, no node can be promoted to primary and the cluster cannot 
continue to operate.

But I never achieved that; after the above situation I ended up with two 
primaries. I'm still wondering if I missed an important configuration option.
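
For reference, this is roughly the kind of quorum configuration I mean, as a 
minimal sketch using the option names from the linked guide (the exact 
on-no-quorum policy and the node sections are placeholders here, not 
necessarily what is in my store1.res):

  resource store1 {
    options {
      # a partition needs a majority of the three nodes (i.e. two of
      # them) to have quorum
      quorum majority;
      # policy when a node loses quorum while i/o is ongoing;
      # io-error fails further writes, suspend-io blocks them
      on-no-quorum io-error;
    }
    # ... on perf1 / perf2 / perf3 sections with node-id, address and
    # backing disk; perf3 is configured without a backing disk ...
  }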

I watched the status with "drbdsetup events2"; before and after the secondary 
became primary, it reports "may_promote:no" (my comments start with #):
exists resource name:store1 role:Secondary suspended:no force-io-failures:no 
may_promote:no promotion_score:10102
exists connection name:store1 peer-node-id:2 conn-name:perf3 
connection:Connected role:Secondary
exists connection name:store1 peer-node-id:0 conn-name:perf1 
connection:Connected role:Primary
exists device name:store1 volume:0 minor:3 backing_dev:/dev/nvme/store1 
disk:UpToDate client:no quorum:yes
exists peer-device name:store1 peer-node-id:2 conn-name:perf3 volume:0 
replication:Established peer-disk:Diskless peer-client:yes resync-suspended:no
exists path name:store1 peer-node-id:2 conn-name:perf3 
local:ipv4:100.80.3.2:7792 peer:ipv4:100.80.2.3:7792 established:yes
exists peer-device name:store1 peer-node-id:0 conn-name:perf1 volume:0 
replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:store1 peer-node-id:0 conn-name:perf1 
local:ipv4:100.80.3.2:7792 peer:ipv4:100.80.3.1:7792 established:yes
exists -
# starting point, store1 is synced, no primary
# mount /dev/drbd3 on perf1:
# writing data to mountpoint
# perf1 loses connection to perf2:
change resource name:store1 may_promote:no promotion_score:0
change connection name:store1 peer-node-id:0 conn-name:perf1 
connection:NetworkFailure role:Unknown
change device name:store1 volume:0 minor:3 backing_dev:/dev/nvme/store1 
disk:Consistent client:no quorum:yes
change peer-device name:store1 peer-node-id:0 conn-name:perf1 volume:0 
replication:Off peer-disk:DUnknown peer-client:no
- skipped 3
change path name:store1 peer-node-id:0 conn-name:perf1 
local:ipv4:100.80.3.2:7792 peer:ipv4:100.80.3.1:7792 established:no
call helper name:store1 peer-node-id:0 conn-name:perf1 helper:disconnected
response helper name:store1 peer-node-id:0 conn-name:perf1 helper:disconnected 
status:0
change device name:store1 volume:0 minor:3 backing_dev:/dev/nvme/store1 
disk:Outdated client:no quorum:yes
change connection name:store1 peer-node-id:0 conn-name:perf1 
connection:Unconnected
change connection name:store1 peer-node-id:0 conn-name:perf1 
connection:Connecting
# mount /dev/drbd3 on perf2 fails
# perf1 is still writing and loses connection to perf3:
# mount /dev/drbd3 on perf2 succeeds:
- skipped 4
change resource name:store1 role:Primary may_promote:no promotion_score:10101
change device name:store1 volume:0 minor:3 backing_dev:/dev/nvme/store1 
disk:UpToDate client:no quorum:yes
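
To rule out a configuration mismatch between the nodes, the settings DRBD 
actually applies can be compared on each node; these should show the 
quorum-related options as DRBD sees them (store1 is the resource from above):

  # configuration currently active in the kernel for this resource
  drbdsetup show store1

  # configuration as drbdadm parsed it from the config files
  drbdadm dump store1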



The next step I could try is a monitoring script that watches the drbd state 
on the quorum node and, whenever it detects a secondary with outdated data, 
disconnects the quorum node from that secondary. That would prevent the 
outdated secondary from becoming primary, roughly along the lines of the 
sketch below. But I'm sure the intention wasn't that you need scripting 
around this....
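
A rough, untested sketch of that idea, to be run on the diskless quorum node 
(perf3). The field parsing is based on the events2 output shown above; the 
assumption is that drbdsetup disconnect can be given the resource and the 
peer-node-id of the outdated peer:

  #!/bin/bash
  # Watch the DRBD event stream on the quorum node and, as soon as a
  # peer reports an Outdated disk, disconnect from that peer so it can
  # no longer use this node as a tiebreaker for promotion.
  RES=store1

  drbdsetup events2 "$RES" | while read -r line; do
      case "$line" in
          *" peer-device "*"peer-disk:Outdated"*)
              # pull the peer-node-id out of the event line
              peer=$(printf '%s\n' "$line" | grep -o 'peer-node-id:[0-9]*' | cut -d: -f2)
              echo "peer-node-id $peer is Outdated, disconnecting"
              drbdsetup disconnect "$RES" "$peer"
              ;;
      esac
  done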

Any ideas?

Many thanks in advance,

-- 
Kind regards

Markus Hochholdinger


On Friday, 17 March 2023 at 14:21:30 CET, Markus Hochholdinger wrote:
> Hi,
> 
> I'm using drbd 9.2.2 with the quorum feature together with the diskless
> node feature.
> I've three nodes, A (perf1) and B (perf2) with a disk, C (perf3) without a
> disk.
> 
> Most of the time, the setup behaves as expected. But in the following
> scenario, I get two primaries:
> 1. A is primary, B secondary, C is the diskless quorum.
> 2. A loses its connection to B; i/o stops for around 10 seconds until A
> recognizes it can continue as primary, then i/o resumes. B can't be promoted
> to primary. C sees both nodes. B's disk state becomes Outdated. All fine up
> to this point.
> 3. A, now primary with i/o continuing, loses its connection to C.
> Nevertheless, A stays primary (knowing that B's outdated disk can't become
> primary).
> 4. B tries to become primary (mounting the drbd device) and succeeds!
> B sees C. Why doesn't C prevent this?
> => Now I have two primaries!
> 
> In my opinion, the Outdated secondary B shouldn't become primary; the quorum
> node C knows that A was primary and has changed data relative to the outdated
> secondary B.
> 
> Attached is store1.res, the config of my resource store1.
> 
> Where is my error? Is this expected with a diskless quorum? Why can an
> outdated secondary become primary? Any ideas?
> 
> 
> Many thanks in advance,




