[DRBD-user] drbd on virtio: WARNING: at block/blk-core.c

Lars Ellenberg lars.ellenberg at linbit.com
Sat Nov 13 20:24:31 CET 2010


Sorry this post spent so much time in some moderation queue,
apparently you don't post from your subscription address,
or you are not subscribed. Anyways, see below.

On Tue, Nov 09, 2010 at 05:40:02PM +0100, Thomas Vögtle wrote:
> Lars Ellenberg wrote:
> > 
> > If your kernel source looks like mine, then this would indicate something in
> > between spin_lock_irqsave and spin_unlock_irqrestore above would enable
> > spinlocks again, where is must not.
> 
> 
> My kernel source is 2.6.32.25 (vanilla).
> 
> 
> > If that something is some part of DRBD, then that would be a serious bug.
> > 
> > If you run with spin lock debug enabled, that may provide some more insight.
> > We'll try to reproduce here anyways.
> 
> Switched on:
> CONFIG_DEBUG_SPINLOCK=y
> CONFIG_DEBUG_SPINLOCK_SLEEP=y
> 
> Hope this helps:

No it does not...
no new information than in your previous post.


Anyways,
we do it wrong...

commit 7f9c6c210158d212cc2c7be6d6b4d289078ab735
Author: Lars Ellenberg <lars.ellenberg at linbit.com>
Date:   Wed Nov 10 10:33:21 2010 +0100

    drbd: use irqsave in bio endio callback
    
    We used spin_lock_irq, spin_unlock_irq.  The later may re-enable irq too
    early if we have been called with irq disabled, opening up a window
    for all sorts of problems.

diff --git a/drbd/drbd_req.h b/drbd/drbd_req.h
index 2260e4f..f759b05 100644
--- a/drbd/drbd_req.h
+++ b/drbd/drbd_req.h
@@ -338,18 +338,21 @@ static inline int _req_mod(struct drbd_request *req, enum drbd_req_event what)
 	return rv;
 }
 
-/* completion of master bio is outside of spinlock.
- * If you need it irqsave, do it your self! */
+/* completion of master bio is outside of our spinlock.
+ * We still may or may not be inside some irqs disabled section
+ * of the lower level driver completion callback, so we need to
+ * spin_lock_irqsave here. */
 static inline int req_mod(struct drbd_request *req,
 		enum drbd_req_event what)
 {
+	unsigned long flags;
 	struct drbd_conf *mdev = req->mdev;
 	struct bio_and_error m;
 	int rv;
 
-	spin_lock_irq(&mdev->req_lock);
+	spin_lock_irqsave(&mdev->req_lock, flags);
 	rv = __req_mod(req, what, &m);
-	spin_unlock_irq(&mdev->req_lock);
+	spin_unlock_irqrestore(&mdev->req_lock, flags);
 
 	if (m.bio)
 		complete_master_bio(mdev, &m);


only, we do it wrong for a long time already.
So I don't really see, why it would only show up in 8.3.9...

Hm. Wait.
No, we used to do it correct. My bad.
in
commit 9b7f76dc37919ea36caa9680a3f765e5b19b25fb
Author: Lars Ellenberg <lars.ellenberg at linbit.com>
Date:   Wed Aug 11 23:40:24 2010 +0200

    drbd: new configuration parameter c-min-rate

    We now track the data rate of locally submitted resync related requests,
    and can thus detect non-resync activity on the lower level device.

    If the current sync rate is above c-min-rate, and the lower level device
    appears to be busy, we throttle the resyncer.


a bad chunk slipped through, replacing the correct
spin_lock_irqsave;__req_mod; etc..
with a plain req_mod(), which only does spin_lock_irq.
Sorry.

I'll revert req_mod to plain spin_lock_irq,
and revert the endio callback to use the spin_lock_irqsave.

I think that should do it.

Thanks.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed



More information about the drbd-user mailing list