[Drbd-dev] DRBD8: failed to complete sync due to receiving bitmap in unexpected state

Montrose, Ernest Ernest.Montrose at stratus.com
Mon Dec 11 23:16:50 CET 2006


Hi all,
We are seeing a case where a sync happened, the data was marked consistent
on both sides, and the target went to the Connected state, but the source
DID NOT CHANGE FROM the WFBitMapS state. The clocks on the two systems seem
to be not quite synchronized, but it seems that:

1. The two nodes connected, realised they needed to resync, and worked
   out that one node had the good data.
2. Because other syncing was going on, the sync process was paused.
3. Later on, sync resumed; the good side's connection went to WFBitMapS,
   the bad side's to WFBitMapT.
4. Sync happened, the data was marked consistent on both sides, and the
   target went to the Connected state, but the source DID NOT CHANGE FROM
   WFBitMapS.

Now, the only oddity I see is on the target side where we see:

Dec 10 04:52:52 george kernel: drbd1: unexpected cstate (PausedSyncT) in receive_bitmap

This did NOT stop the resync, but I suspect it means that a critical
message was never sent, which left the source side in WFBitMapS.

Presumably there is a window where one side is out of the paused state
before the other.
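
To make the suspected window concrete, here is a toy model of it. This is
not DRBD code: the Node class, the resume() helper, and the rule that the
target only answers the bitmap once it has already reached WFBitMapT are
assumptions made purely for illustration. Only PausedSyncT, WFBitMapS and
WFBitMapT come from the observations above; PausedSyncS, SyncSource and
SyncTarget are used as plausible counterpart states.

```python
# Toy model of the suspected race; NOT DRBD code.

class Node:
    def __init__(self, name, cstate):
        self.name = name
        self.cstate = cstate
        self.log = []


def receive_bitmap(target, source):
    """Hypothetical handler: the target replies only if it has already
    left the paused state (an assumption, not confirmed behaviour)."""
    if target.cstate != "WFBitMapT":
        target.log.append(
            f"unexpected cstate ({target.cstate}) in receive_bitmap")
        return False                 # assumed: no reply, source keeps waiting
    source.cstate = "SyncSource"     # assumed effect of the reply
    target.cstate = "SyncTarget"
    return True


def resume(source, target, target_resumed_first):
    """Resume a paused sync; the flag picks which side of the window we hit."""
    source.cstate = "WFBitMapS"
    if target_resumed_first:
        target.cstate = "WFBitMapT"
    return receive_bitmap(target, source)


# Source leaves the paused state before the target does.
src = Node("source", "PausedSyncS")
tgt = Node("target", "PausedSyncT")
resume(src, tgt, target_resumed_first=False)
print(src.cstate, tgt.log)
```

In this model the source is left in WFBitMapS whenever the bitmap arrives
before the target has left PausedSyncT, which matches the symptom but does
not prove that this is what the real code does.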
 
Simon Graham actually did a bit of analysis of this and thinks that the
problem might be a race condition in drbd_receiver.c:receive_bitmap().
Any ideas? I cannot reproduce this reliably at this time.
 
EM--