[DRBD-user] First fix for drbd-0.7.12

Wed Aug 31 10:12:17 CEST 2005

Hi,

Here is a preliminary patch to fix drbd-0.7.12 . But I am not yet happy
with that patch, because I do not really understand where the bug is.

To Andre and Jeff,
Could you please verify that the patch fixes the problem ?

For people who think that they know something about kernel programming:

Problem description

the function _drbd_process_ee() [ in drbd_receiver.c ] is called with
the mdev->ee_lock held. Thus there are never two parallel threads
of execution entering that function at the same time.

But that function has to release the spinlock in oder to call the
callbacks on those 'epoch_entries'.

So a second CPU could enter that function if the first is just
processing a callback. 
To avoid this there is this PROCESS_EE_RUNNING bit, and the ee_wait
wait queue. An execution context has to grab that bit first, before
it procedes with the body of the function, otherwise it sleeps on
ee_wait, until the other CPU releases the bit and calls wake_up(ee_wait).

Here is the code:

STATIC int _drbd_process_ee(drbd_dev *mdev, int be_sleepy)
{
	struct Tl_epoch_entry *e;
	struct list_head *head = &mdev->done_ee;
	struct list_head *le;
	int ok=1;
	int got_sig;

	MUST_HOLD(&mdev->ee_lock);

	reclaim_net_ee(mdev);

	if( test_and_set_bit(PROCESS_EE_RUNNING,&mdev->flags) ) {
		if(!be_sleepy) {
			return 3;
		}
		spin_unlock_irq(&mdev->ee_lock);
		got_sig = wait_event_interruptible(mdev->ee_wait,
		       test_and_set_bit(PROCESS_EE_RUNNING,&mdev->flags) == 0);
		spin_lock_irq(&mdev->ee_lock);
		if(got_sig) return 2;
	}

	while(!list_empty(head)) {
		le = head->next;
		list_del(le);
		spin_unlock_irq(&mdev->ee_lock);
		e = list_entry(le, struct Tl_epoch_entry, w.list);
		ok = ok && e->w.cb(mdev,&e->w,0);
		spin_lock_irq(&mdev->ee_lock);
		drbd_put_ee(mdev,e);
	}

	clear_bit(PROCESS_EE_RUNNING,&mdev->flags);
	wake_up(&mdev->ee_wait);

	return ok;
}

But for unknown reasons, after some time the PROCESS_EE_RUNNING bit is set
but no CPU is in the function, therefore it never gets cleared again ->
the receiver thereads blocks forever -> the whole resync process is stalled.

Further facts:
* This happens only on SMP machines (up to now tested on x86 only)
* As soon as the PROCESS_EE_RUNNING bit is moved to an other long word 
  it works !?!?!

Changes that do _not_ fix the problem, but I have already tried:
* smp_mb__after_clear_bit()   after the clear_bit()
* wake_up_all() instead of wake_up()

What are your conclusions from this ?

-Philipp
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :
-------------- next part --------------
A non-text attachment was scrubbed...
Name: drbd-0.7.12-syncer-fix.diff
Type: text/x-diff
Size: 1334 bytes
Desc: not available
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20050831/fdc9a21c/attachment.diff>