[Drbd-dev] Bug Report: unexpected WFBitMapS status after restarting the primary

Duan Zhang duan.zhang at easystack.cn
Wed Feb 5 11:12:55 CET 2020


Version: drbd-9.0.21-1

Layout: drbd.res across 3 nodes -- node-1 (Secondary), node-2 (Primary), node-3 (Secondary)

Description:
a. Reboot node-2 while the cluster is working.
b. Bring drbd.res up again on node-2 after it has restarted.
c. The expected resync from node-3 to node-2 happens. When the resync is done, however,
   node-1 enters an unexpected WFBitMapS replication state and never recovers.

Status output:

node-1: drbdadm status

drbd6 role:Secondary
  disk:UpToDate
  hotspare connection:Connecting
  node-2 role:Primary
    replication:WFBitMapS peer-disk:Consistent
  node-3 role:Secondary
    peer-disk:UpToDate


node-2: drbdadm status

drbd6 role:Primary
  disk:UpToDate
  hotspare connection:Connecting
  node-1 role:Secondary
    peer-disk:UpToDate
  node-3 role:Secondary
    peer-disk:UpToDate

Based on my reading of the source code for this version, I assume the
sequence of events is as follows:

  node-2: restarted with CRASHED_PRIMARY
  node-2: start sync with node-3 as target
  node-3: start sync with node-2 as source
  … …
  node-2: end sync with node-3
  node-3: end sync with node-2
  node-2: w_after_state_change
  node-2: loop 1 of the for loop, against node-1:
          send uuid with UUID_FLAG_GOT_STABLE&CRASHED_PRIMARY to node-1 (a)
  node-1: receive_uuids10: receive uuid of node-2 with CRASHED_PRIMARY
  node-2: loop 2 of the for loop, against node-3:
          clear CRASHED_PRIMARY (b)
  node-1: send uuid to node-2 with UUID_FLAG_RESYNC
  node-2: receive_uuids10
  node-1: sync_handshake to SYNC_SOURCE_IF_BOTH_FAILED
  node-2: sync_handshake to NO_SYNC
  node-1: change repl state to WFBitMapS

The key problem is the order of steps (a) and (b): node-2 still sends
CRASHED_PRIMARY to node-1 even though it is no longer a crashed primary
after syncing with node-3.

So I have the following questions:
a. Is this really a bug, or is it expected behavior?
b. Is there already a fix for this in the newest version?
c. Is there a workaround for this kind of unexpected state? I have run
   into many similar problems :(
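
To make the ordering issue between steps (a) and (b) concrete, here is a toy Python model. It is not DRBD code: `ToyNode`, `toy_after_resync`, `toy_handshake` and `simulate` are hypothetical names, and the handshake rule is a deliberate simplification that only reproduces the two outcomes reported above (SYNC_SOURCE_IF_BOTH_FAILED on node-1, NO_SYNC on node-2). It only shows that the result depends on whether the flag is cleared before or after the UUIDs are sent to node-1.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Hypothetical stand-in for node-2's local state (not a DRBD structure)."""
    name: str
    crashed_primary: bool = False


def advertised_flags(node):
    """Flags the node would put into a UUID packet at this instant."""
    return frozenset({"CRASHED_PRIMARY"}) if node.crashed_primary else frozenset()


def toy_after_resync(local, peers, resync_source):
    """Toy version of the per-peer loop that runs after the resync ends.

    For the resync source the crashed-primary flag is cleared (step b);
    for every other peer the current flags are sent (step a).
    Returns the flags each peer received.
    """
    sent = {}
    for peer in peers:
        if peer == resync_source:
            local.crashed_primary = False          # step (b)
        else:
            sent[peer] = advertised_flags(local)   # step (a)
    return sent


def toy_handshake(peer_is_crashed_primary):
    """Assumed, simplified decision rule; the real sync handshake is more involved."""
    return "SYNC_SOURCE_IF_BOTH_FAILED" if peer_is_crashed_primary else "NO_SYNC"


def simulate(peer_order):
    """Return (node-1's decision, node-2's decision) for a given loop order."""
    node2 = ToyNode("node-2", crashed_primary=True)
    sent = toy_after_resync(node2, peer_order, resync_source="node-3")
    # node-1 decides from the (possibly stale) flags it received.
    node1_view = toy_handshake("CRASHED_PRIMARY" in sent["node-1"])
    # node-2 decides from its own current state.
    node2_view = toy_handshake(node2.crashed_primary)
    return node1_view, node2_view


if __name__ == "__main__":
    print(simulate(["node-1", "node-3"]))  # (a) before (b): the two sides disagree
    print(simulate(["node-3", "node-1"]))  # (b) before (a): the two sides agree
```

In this model, when node-1 is visited first the two sides reach different decisions, which would match node-1 waiting in WFBitMapS for a bitmap exchange that node-2 never starts. When the flag is cleared first, both sides agree on NO_SYNC.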

-- 
Sincerely Yours,
Zhang Duan
