[DRBD-user] Using quorum in three node cluster results in split brain

Philipp Reisner philipp.reisner at linbit.com
Thu Mar 23 10:21:22 CET 2023


Hi Markus,

Thanks for sending this bug report, including instructions on
reproducing it.  At first, I ignored your report because I could not
reproduce the issue.  Thanks to your persistence, I realized that this
issue only reproduces on the versions you reported. So it is something
that is already fixed in the drbd-9.1 branch.

Here is a test that reproduces your steps in an automated way:
---
#! /usr/bin/env python3
#
from python import drbdtest
from python.drbdtest import connections, log, peer_devices

resource = drbdtest.setup_resource(nodes=3)
resource.resource_options = 'quorum majority;'
A, B, C = resource.nodes
resource.add_disk('1M', diskful_nodes=[A, B])
resource.up_wait()

log('* Make up-to-date data available.')
resource.skip_initial_sync()

A.primary()
connections(to_node=A).event(r'connection .* role:Primary')
# wait=False -> 'peer-disk:Outdated' observable:
connections(A, B).disconnect(wait=False)
peer_devices(A, B).event(r'peer-device .* peer-disk:Outdated')
connections(A, C).disconnect()
try:
    B.primary()
except Exception:  # expected: B is Outdated, so promotion must fail
    pass
else:
    raise RuntimeError('B promoted!')

resource.down()
resource.cluster.teardown()
---
It runs with the drbd9-tests suite (https://github.com/LINBIT/drbd9-tests).

With that, I was able to identify which of the recent changes fixes
that issue. It is
https://github.com/LINBIT/drbd/commit/057d17f455e909a75827948cd1fa932e58793a66

It will be released with drbd-9.1.14 and drbd-9.2.3 in about two weeks.
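
For anyone who wants to check an installed version against this thread, here
is a rough sketch. It assumes the issue affects every release from the first
failing version Markus reported (9.1.7, and 9.2.0 on the 9.2 branch) up to,
but not including, the fixed releases named above. Releases that nobody
tested (for example 9.1.9 to 9.1.12) are assumed to behave like their
neighbours. The names parse_version, AFFECTED_RANGES and is_affected are
made up for this example and are not part of DRBD or drbd9-tests.

```python
def parse_version(text):
    """Turn a version string like '9.1.13' into a comparable tuple (9, 1, 13)."""
    return tuple(int(part) for part in text.split("."))


# Per branch: (first version reported as failing, first release with the fix).
# Assumption: everything in between is affected, including untested releases.
AFFECTED_RANGES = {
    (9, 1): ((9, 1, 7), (9, 1, 14)),
    (9, 2): ((9, 2, 0), (9, 2, 3)),
}


def is_affected(version_text):
    """Return True/False for the 9.1 and 9.2 branches, None for other branches."""
    version = parse_version(version_text)
    bounds = AFFECTED_RANGES.get(version[:2])
    if bounds is None:
        return None  # branch not discussed in this thread
    first_bad, first_fixed = bounds
    return first_bad <= version < first_fixed


if __name__ == "__main__":
    for candidate in ("9.1.6", "9.1.7", "9.1.13", "9.1.14", "9.2.2", "9.2.3"):
        print(candidate, is_affected(candidate))
```

The half-open range (first failing version included, first fixed version
excluded) matches the test results quoted below: 9.1.6 was ok, 9.1.7 failed.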

I will also add this test snippet to one of the larger quorum-* test cases.

best regards,
 Philipp

On Wed, Mar 22, 2023 at 7:50 PM Markus Hochholdinger
<Markus at hochholdinger.net> wrote:
>
> Hi,
>
> On Tuesday, 21 March 2023, 17:59:37 CET, Markus Hochholdinger wrote:
> > Still wondering why the diskless quorum is not working for me.
>
> I've tested the following drbd versions:
> 9.1.0 ok
> 9.1.5 ok
> 9.1.6 ok
> 9.1.7 fail
> 9.1.8 fail
> 9.1.13 fail
> 9.2.0 fail
> 9.2.1 fail
> 9.2.2 fail
>
> Between 9.1.6 and 9.1.7 something changed, which resulted in:
> An Outdated Secondary can become Primary (without --force)....
>
> Now I'll have a look at the diff between those two versions....
>
>
> --
> Best regards
>
> Markus Hochholdinger