[Drbd-dev] DRBD8: incorrect state transition Connected ->WFBitMapS and UpToDate->Inconsistent

Fri Nov 16 03:36:19 CET 2007

Phil,
I tested the patch and unfortunately it does not fix the race condition though I believe it fixes
the ASSERT issues.  
Essentially, when the resync is done and we are in drbd_resync_finished() if we pause the device
then we send the state to the peer.  The peer is done syncing at that point.  He does a
sync_hanshake() that sends its state to WBItMapS and Pdsk=Inconsistent (since the target
has not changed its state to connected and UptoDate yet. When resync_finished is done we go 
Uptodate and connected and we're stuck.

I tested yet another idea which seems to close the racy window.  I turned drbd_resync_finished
into two parts.  A cleanup part that the worker can schedule to do the clean up and a done part
that
changes the state right away when the resync is done.  I include an untested patch to illustrate
that idea.

Thanks.

EM--
--- Philipp Reisner <philipp.reisner at linbit.com> wrote:

> On Monday 12 November 2007 14:41:10 Montrose, Ernest wrote:
> > Hi,
> > We have been struggling with a problem where one side gets stuck in
> > WFBitMapS and Inconsistent State. Consider two nodes (Node0 and node1).
> >
> >
> > * Device r5 on node0 starts syncing as the synctarget.
> > * Device r5 is done syncing and on node0 we call drbd_resync_finished()
> > this gets delayed for a bit in drbd_rs_del_all()
> > * During this delay, device R0   wants to resync.  So the lower priority
> > devices like R5 gets paused.  This is were the trouble starts.
> 
> Right. But Something else happens...
> 
> [...]
> > Oct  4 14:56:01 node0 kernel: drbd60: Syncer continues.
> > Oct  4 14:56:01 node0 kernel: drbd60: ASSERT(
> > !test_bit(STOP_SYNC_TIMER,&mdev->flags) ) in
> > /sandbox/sgraham/sn/trunk/platform/drbd/src/drbd/drbd_main.c:786
> 
> That assert caught my attention, and this is my understanding what
> went wrong...
> 
> r5 was already finished with its resync timer and calling 
> w_make_resync_request(), but due to the continue event after the
> pause the timer got restarted...
> 
> Unfortunately the drbd_bm_find_next() searched through all the
> bitmap and found those bits near the end that where not yet 
> cleared, and so resync requests where resent...
> 
> Therefore...
> 
> [...]
> > Oct  4 14:56:09 node0 kernel: drbd60: Resync done (total 2 sec; paused 0
> > sec; 384 K/sec)
> [...]
> > Oct  4 14:56:09 node0 kernel: drbd60: Resync done (total 2 sec; paused 0
> > sec; 0 K/sec)
> > Oct  4 14:56:09 node0 kernel: drbd60: Resync done (total 2 sec; paused 0
> > sec; 0 K/sec)
> > Oct  4 14:56:09 node0 kernel: drbd60: Connected in w_make_resync_request
> > Oct  4 14:56:09 node0 kernel: drbd60: Resync done (total 2 sec; paused 0
> > sec; 0 K/sec)
> 
> ... we got multiple calls to drbd_resync_finished().
> 
> Here is my suggestion to fix that.
> 
> 1) Do not restart the timer after a syncpause, when the timer is no 
>    longer needed.
> 
> 2) To make the whole thing more robust against such bugs, 
>    drbd_bm_find_next() should not reset the find_offset back to 0
>    after it hit the end of the bitmap once.
> 
> I have not tested it.... but I think this should do...
> 
> -Phil
> -- 
> : Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
> : LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
> : Vivenotgasse 48, 1120 Vienna, Austria        http://www.linbit.com :
> > diff --git a/drbd/drbd_bitmap.c b/drbd/drbd_bitmap.c
> index 015421a..7e118a6 100644
> --- a/drbd/drbd_bitmap.c
> +++ b/drbd/drbd_bitmap.c
> @@ -954,7 +954,7 @@ unsigned long drbd_bm_find_next(drbd_dev *mdev)
>  	}
>  	if (i >= b->bm_bits) {
>  		i = -1UL;
> -		b->bm_fo = 0;
> +		/* leave b->bm_fo unchanged. */
>  	} else {
>  		b->bm_fo = i+1;
>  	}
> diff --git a/drbd/drbd_main.c b/drbd/drbd_main.c
> index fe8f66d..e25bb3a 100644
> --- a/drbd/drbd_main.c
> +++ b/drbd/drbd_main.c
> @@ -786,9 +786,13 @@ int _drbd_set_state(drbd_dev* mdev, drbd_state_t ns,enum chg_state_flags
> flags)
>  		INFO("Syncer continues.\n");
>  		mdev->rs_paused += (long)jiffies-(long)mdev->rs_mark_time;
>  		if( ns.conn == SyncTarget ) {
> -			D_ASSERT(!test_bit(STOP_SYNC_TIMER,&mdev->flags));
> -			clear_bit(STOP_SYNC_TIMER,&mdev->flags);
> -			mod_timer(&mdev->resync_timer,jiffies);
> +			if (!test_bit(STOP_SYNC_TIMER,&mdev->flags)) {
> +				mod_timer(&mdev->resync_timer,jiffies);
> +			}
> +			/* This if (!test_bit is only needed for the case
> +			   that a device that has ceased to used its timer,
> +			   i.e. it is already in drbd_resync_finished() gets
> +			   paused and resumed. */
>  		}
>  	}
>  
> > _______________________________________________
> drbd-dev mailing list
> drbd-dev at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-dev
> 

      ____________________________________________________________________________________
Be a better pen pal. 
Text or chat with friends inside Yahoo! Mail. See how.  http://overview.mail.yahoo.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: my_3230.patch
Type: text/x-patch
Size: 2254 bytes
Desc: 552174691-my_3230.patch
Url : http://lists.linbit.com/pipermail/drbd-dev/attachments/20071115/cf699b9e/my_3230.bin