[DRBD-user] allow-two-primaries problem
Siniša Bandin
sinisa at 4net.rs
Fri May 24 19:10:20 CEST 2019
Hello!
My DRBD cluster is back to full power, all three nodes up and running.
I am still able to promote more than one node to Primary, although
allow-two-primaries is not configured anywhere. I was even able to
promote all three nodes:
Before:
node0:~ # drbdstat md8
md8 node-id:0 role:Primary suspended:no
write-ordering:flush
volume:0 minor:8 disk:UpToDate quorum:yes
size:314255868 read:18959644 written:2048 al-writes:1 bm-writes:0
upper-pending:0 lower-pending:0 al-suspended:no blocked:no
node1 node-id:1 connection:Connected role:Secondary congested:no
ap-in-flight:0 rs-in-flight:18446744073709088768
volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no
received:316872 sent:231964 out-of-sync:0 pending:0 unacked:0
node2 node-id:2 connection:Connected role:Secondary congested:no
ap-in-flight:0 rs-in-flight:0
volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no
received:21909399 sent:2048 out-of-sync:0 pending:0 unacked:0
After "drbdadm primary md8" on the other two nodes:
node0:~ # drbdstat md8
md8 node-id:0 role:Primary suspended:no
write-ordering:flush
volume:0 minor:8 disk:UpToDate quorum:yes
size:314255868 read:18959644 written:2048 al-writes:1 bm-writes:0
upper-pending:0 lower-pending:0 al-suspended:no blocked:no
node1 node-id:1 connection:Connected role:Primary congested:no
ap-in-flight:0 rs-in-flight:18446744073709088768
volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no
received:316872 sent:231964 out-of-sync:0 pending:0 unacked:0
node2 node-id:2 connection:Connected role:Primary congested:no
ap-in-flight:0 rs-in-flight:0
volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no
received:21909399 sent:2048 out-of-sync:0 pending:0 unacked:0
I don't think this is expected or correct behavior.
I have an XFS filesystem there, and it would not survive being mounted
on more than one node at a time.
Have I stumbled across a bug?
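For reference, dual-primary operation would normally have to be enabled
explicitly in the resource's net section. A minimal sketch of what such a
configuration would look like in drbd.conf syntax (md8 is the resource from
the output above; this option is NOT present in my configuration, which is
exactly why a second Primary should be refused):

```
resource md8 {
    net {
        allow-two-primaries yes;  # absent in my config, so DRBD should
                                  # reject promoting a second Primary
    }
    # ... device/disk/connection definitions omitted ...
}
```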
---
Srdačan pozdrav/Best regards/Freundliche Grüße/Cordialement,
Siniša Bandin
On 13.05.2019 12:20, Sinisa wrote:
> Thanks for a quick reply. Here are the details of my configuration:
>
> # uname -a
> Linux node0 5.1.1-1.g65f0348-default #1 SMP Sat May 11 17:16:51 UTC
> 2019 (65f0348) x86_64 x86_64 x86_64 GNU/Linux
> (latest stable kernel from opensuse repository)
>
>
> before - node2 is Primary, node0 is Secondary (node1 is currently down
> for repair - can this be the reason why the remaining two don't work as
> expected?)
>
> node0 # drbdstat md5
> md5 node-id:0 role:Secondary suspended:no
> write-ordering:flush
> volume:0 minor:5 disk:UpToDate quorum:yes
> size:345725692 read:86972 written:8623059 al-writes:2332
> bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:no
> node1 node-id:1 connection:Connecting role:Unknown congested:no
> ap-in-flight:0 rs-in-flight:0
> volume:0 replication:Off peer-disk:DUnknown resync-suspended:no
> received:0 sent:0 out-of-sync:318772620 pending:0 unacked:0
> node2 node-id:2 connection:Connected role:Primary congested:no
> ap-in-flight:0 rs-in-flight:0
> volume:0 replication:Established peer-disk:UpToDate
> resync-suspended:no
> received:8623059 sent:82940 out-of-sync:0 pending:0 unacked:0
>
> node2# drbdstat md5
> md5 node-id:2 role:Primary suspended:no
> write-ordering:flush
> volume:0 minor:5 disk:UpToDate quorum:yes
> size:345725692 read:681961962 written:665922273 al-writes:19990
> bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:no
> node0 node-id:0 connection:Connected role:Secondary congested:no
> ap-in-flight:0 rs-in-flight:18446744073709544568
> volume:0 replication:Established peer-disk:UpToDate
> resync-suspended:no
> received:668957943 sent:328667978 out-of-sync:0 pending:0
> unacked:0
> node1 node-id:1 connection:Connecting role:Unknown congested:no
> ap-in-flight:0 rs-in-flight:0
> volume:0 replication:Off peer-disk:DUnknown resync-suspended:no
> received:0 sent:0 out-of-sync:345725692 pending:0 unacked:0
>
>
> after
> node0# drbdadm primary md5
> both node0 and node2 are Primary
>
> node0# drbdstat md5
> md5 node-id:0 role:Primary suspended:no
> write-ordering:flush
> volume:0 minor:5 disk:UpToDate quorum:yes
> size:345725692 read:86972 written:8633116 al-writes:2332
> bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:no
> node1 node-id:1 connection:Connecting role:Unknown congested:no
> ap-in-flight:0 rs-in-flight:0
> volume:0 replication:Off peer-disk:DUnknown resync-suspended:no
> received:0 sent:0 out-of-sync:318772620 pending:0 unacked:0
> node2 node-id:2 connection:Connected role:Primary congested:no
> ap-in-flight:0 rs-in-flight:0
> volume:0 replication:Established peer-disk:UpToDate
> resync-suspended:no
> received:8633116 sent:82940 out-of-sync:0 pending:0 unacked:0
>
> dmesg output:
> on node0:
> [Mon May 13 12:09:53 2019] drbd md5 node2: Split-brain handler
> configured, rely on it.
> [Mon May 13 12:09:53 2019] drbd md5: Preparing cluster-wide state
> change 3262357452 (0->-1 3/1)
> [Mon May 13 12:09:53 2019] drbd md5: State change 3262357452:
> primary_nodes=5, weak_nodes=FFFFFFFFFFFFFFFA
> [Mon May 13 12:09:53 2019] drbd md5 node2: Split-brain handler
> configured, rely on it.
> [Mon May 13 12:09:53 2019] drbd md5: Committing cluster-wide state
> change 3262357452 (0ms)
> [Mon May 13 12:09:53 2019] drbd md5 node2: Split-brain handler
> configured, rely on it.
> [Mon May 13 12:09:53 2019] drbd md5: role( Secondary -> Primary )
>
>
> on node2:
> [Mon May 13 12:10:04 2019] drbd md5 node0: Preparing remote state
> change 3262357452
> [Mon May 13 12:10:04 2019] drbd md5 node0: Split-brain handler
> configured, rely on it.
> [Mon May 13 12:10:04 2019] drbd md5 node0: Committing remote state
> change 3262357452 (primary_nodes=5)
> [Mon May 13 12:10:04 2019] drbd md5 node0: Split-brain handler
> configured, rely on it.
> [Mon May 13 12:10:04 2019] drbd md5 node0: peer( Secondary -> Primary )
>
> Srdačan pozdrav / Best regards / Freundliche Grüße / Cordialement,
> Siniša Bandin
>
> On 5/13/19 11:04 AM, Robert Altnoeder wrote:
>> On 5/11/19 3:30 PM, Siniša Bandin wrote:
>>> I have a 3-node DRBD9 (9.0.17) cluster.
>>>
>>> The problem is: although the option "allow-two-primaries" is NOT set,
>>> I am able to set two nodes as primary, even worse, to mount XFS file
>>> system on both.
>> Tested right now, 3 nodes, DRBD 9.0.17-1
>> (b9abab2dd27313922797d026542b399870bfd13e), Linux 4.8.11 amd64
>> I cannot reproduce the problem.
>>
>> What is the exact status of the resources on those three nodes?
>>
>> br,
>> Robert
>>
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user
>