[Drbd-dev] DRBD8: failed to complete sync due to receiving bitmap
in unexpected state
Montrose, Ernest
Ernest.Montrose at stratus.com
Mon Dec 11 23:16:50 CET 2006
Hi all,
We are seeing a case where a sync happened, data is marked consistent
on both sides, and the target went to the Connected
state, but the source DID NOT CHANGE FROM the WFBitMapS state. The clocks on the
two systems seem not to be quite synchronized, but it seems that:
1. The two nodes connected, realised they needed to resync and worked
out that one node had the
good data.
2. Because other syncing was going on, the sync process was paused.
3. Later on, sync resumed; the good side's connection went to WFBitMapS, the bad
side's to WFBitMapT.
4. Sync happened, data was marked consistent on both sides, and the target went
to the Connected
state, but the source DID NOT CHANGE FROM WFBitMapS.
Now, the only oddity I see is on the target side where we see:
Dec 10 04:52:52 george kernel: drbd1: unexpected cstate (PausedSyncT) in
receive_bitmap
This did NOT stop the resync, but I suspect it means that a
critical message was never sent, which left the source side in WFBitMapS.
Presumably there is a window where one side is out of the paused state
before the other.
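
The suspected window can be illustrated with a toy model. This is not DRBD
code: the function deliver_bitmap() and the rule that the target replies only
while in WFBitMapT are assumptions made for illustration. Only the state names
and the log text come from the report above.

```python
from enum import Enum


class CState(Enum):
    WF_BITMAP_S = "WFBitMapS"
    WF_BITMAP_T = "WFBitMapT"
    PAUSED_SYNC_T = "PausedSyncT"
    SYNC_SOURCE = "SyncSource"


def deliver_bitmap(source_state, target_state):
    """Hypothetical model of the bitmap handshake (an assumption, not DRBD's logic).

    The target is assumed to answer the source's bitmap only while it is in
    WFBitMapT. Returns the source's resulting state and the target's log lines.
    """
    log = []
    if target_state is CState.WF_BITMAP_T:
        # Reply sent: the source can leave WFBitMapS.
        if source_state is CState.WF_BITMAP_S:
            source_state = CState.SYNC_SOURCE
    else:
        # No reply sent: the source keeps waiting in its current state.
        log.append(f"unexpected cstate ({target_state.value}) in receive_bitmap")
    return source_state, log


# Expected ordering: target already in WFBitMapT when the bitmap arrives.
normal = deliver_bitmap(CState.WF_BITMAP_S, CState.WF_BITMAP_T)
# Suspected race: bitmap arrives while the target is still PausedSyncT.
raced = deliver_bitmap(CState.WF_BITMAP_S, CState.PAUSED_SYNC_T)

print(normal[0].value, normal[1])
print(raced[0].value, raced[1])
```

In this model the second case leaves the source in WFBitMapS with the same
log line we saw on the target. It covers only the handshake reply, not the
resync itself.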
Simon Graham actually did a bit of analysis of this and thinks that the
problem might be a race condition in drbd_receiver.c:receive_bitmap().
Any ideas? I cannot reproduce this reliably at this time.
EM--