[DRBD-cvs] r1859 - branches/drbd-0.7/drbd

www-data www-data at linbit.com
Wed Jul 13 10:24:00 CEST 2005


Author: phil
Date: 2005-07-13 10:23:59 +0200 (Wed, 13 Jul 2005)
New Revision: 1859

Modified:
   branches/drbd-0.7/drbd/drbd_actlog.c
Log:
Very rarely, the ERR in drbd_actlog.c:607 triggered. The reason
was that the rs_left member of one of the bm_extents went negative.

How could this happen:

Two parallel threads of execution are in __drbd_set_in_sync(). Both of
them have already cleared some bits, so each holds a positive count
[named cleared in drbd_try_clear_on_disk_bm()].
  Before they can enter drbd_try_clear_on_disk_bm() they get serialized
  by the al_lock spin lock. In drbd_try_clear_on_disk_bm(), thread 1 has
  to recount the bits worth a BM_EXT. While counting the bits, it sees,
  of course, that both bits were already cleared!
  Then thread 1 leaves drbd_try_clear_on_disk_bm() and releases the
  al_lock.
  Now the other thread finds the bm_ext in the cache and subtracts
  its count [aka cleared] from rs_left, although the recount already
  accounted for its bit.
  
  => So with this race condition, rs_left is one too low.
  
Fixed the race condition by taking al_lock before clearing bits in
the bitmap, so that clearing and accounting happen in one critical
section.
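
The interleaving described above can be replayed deterministically in a
small user-space model. This is only a sketch under stated assumptions:
the names toy_dev, toy_clear_bit, toy_weight, toy_account and EXT_BITS
are hypothetical and are not DRBD code, the al_lock is modelled by the
order of the calls rather than by a real spin lock, and the "recount on
cache miss, subtract on cache hit" rule is taken from the log text.

```c
#include <assert.h>
#include <stdbool.h>

#define EXT_BITS 8 /* hypothetical: bits per extent in this toy model */

/* Toy stand-in for one bm_extent plus its slice of the bitmap. */
struct toy_dev {
    bool bitmap[EXT_BITS]; /* true = bit still out of sync */
    bool cached;           /* is the extent already in the cache? */
    int  rs_left;          /* cached number of bits still set */
};

static void toy_init(struct toy_dev *d, int nset)
{
    for (int i = 0; i < EXT_BITS; i++)
        d->bitmap[i] = (i < nset);
    d->cached = false;
    d->rs_left = 0;
}

/* Clear one bit; return 1 if it was set before, else 0. */
static int toy_clear_bit(struct toy_dev *d, int bit)
{
    assert(bit >= 0 && bit < EXT_BITS);
    int was_set = d->bitmap[bit] ? 1 : 0;
    d->bitmap[bit] = false;
    return was_set;
}

static int toy_weight(const struct toy_dev *d)
{
    int w = 0;
    for (int i = 0; i < EXT_BITS; i++)
        w += d->bitmap[i] ? 1 : 0;
    return w;
}

/* Accounting step, assumed to run with the lock held:
 * cache miss -> recount from the bitmap, cache hit -> subtract. */
static void toy_account(struct toy_dev *d, int cleared)
{
    if (!d->cached) {
        d->rs_left = toy_weight(d);
        d->cached = true;
    } else {
        d->rs_left -= cleared;
    }
}

/* Old ordering: both threads clear their bit before either accounts. */
int simulate_racy(void)
{
    struct toy_dev d;
    toy_init(&d, 2);
    int c1 = toy_clear_bit(&d, 0); /* thread 1, lock not held */
    int c2 = toy_clear_bit(&d, 1); /* thread 2, lock not held */
    toy_account(&d, c1);           /* thread 1: recount sees 0 bits */
    toy_account(&d, c2);           /* thread 2: cache hit, 0 - 1 */
    return d.rs_left;
}

/* New ordering: clear and account inside one critical section each. */
int simulate_serialized(void)
{
    struct toy_dev d;
    toy_init(&d, 2);
    int c1 = toy_clear_bit(&d, 0);
    toy_account(&d, c1);           /* thread 1: recount sees 1 bit */
    int c2 = toy_clear_bit(&d, 1);
    toy_account(&d, c2);           /* thread 2: cache hit, 1 - 1 */
    return d.rs_left;
}
```

With two bits set at the start, simulate_racy() returns -1 and
simulate_serialized() returns 0, which matches the negative rs_left
reported in the log and the effect of moving the lock.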


Modified: branches/drbd-0.7/drbd/drbd_actlog.c
===================================================================
--- branches/drbd-0.7/drbd/drbd_actlog.c	2005-07-13 08:07:02 UTC (rev 1858)
+++ branches/drbd-0.7/drbd/drbd_actlog.c	2005-07-13 08:23:59 UTC (rev 1859)
@@ -671,7 +671,7 @@
 	unsigned long sbnr,ebnr,lbnr,bnr;
 	unsigned long count = 0;
 	sector_t esector, nr_sectors;
-	int strange_state;
+	int strange_state,wake_up=0;
 
 	strange_state = (mdev->cstate <= Connected) ||
 	                test_bit(DISKLESS,&mdev->flags) ||
@@ -717,23 +717,24 @@
 	 * ok, (capacity & 7) != 0 sometimes, but who cares...
 	 * we count rs_{total,left} in bits, not sectors.
 	 */
+	spin_lock_irq(&mdev->al_lock);
 	for(bnr=sbnr; bnr <= ebnr; bnr++) {
 		if (drbd_bm_clear_bit(mdev,bnr)) count++;
 	}
 	if (count) {
 		// we need the lock for drbd_try_clear_on_disk_bm
-		spin_lock_irq(&mdev->al_lock);
 		if(jiffies - mdev->rs_mark_time > HZ*10) {
 			/* should be roling marks, but we estimate only anyways. */
 			mdev->rs_mark_time = jiffies;
 			mdev->rs_mark_left = drbd_bm_total_weight(mdev);
 		}
 		drbd_try_clear_on_disk_bm(mdev,sector,count);
-		spin_unlock_irq(&mdev->al_lock);
 		/* just wake_up unconditional now,
 		 * various lc_chaged(), lc_put() in drbd_try_clear_on_disk_bm(). */
-		wake_up(&mdev->al_wait);
+		wake_up=1;
 	}
+	spin_unlock_irq(&mdev->al_lock);
+	if(wake_up) wake_up(&mdev->al_wait);
 }
 
 /*


