[Drbd-dev] Transaction log related assert messages running DRBD 8 trunk

Philipp Reisner philipp.reisner at linbit.com
Wed Jul 26 10:11:22 CEST 2006


On Tuesday, 25 July 2006 20:56, Graham, Simon wrote:
> Running some failover stress testing with the latest DRBD 8, I have
> started to notice assert failures like this:
>
> Jul 24 17:36:22 peer kernel: drbd1: ASSERT( b->br_number == barrier_nr )
> in drbd/drbd_main.c:280
> Jul 24 17:36:22 peer kernel: drbd1: ASSERT( b->n_req == set_size ) in
> drbd/drbd_main.c:281
>
> I'm not quite sure what these mean, but I do note that the code releases
> the spin lock before the assert and it occurs to me that perhaps the
> D_ASSERTs should also be done with the lock held (see below)?
>
> Simon
>
> --- code from drbd_main.c ---
>
> void tl_release(drbd_dev *mdev,unsigned int barrier_nr,
> 		       unsigned int set_size)
> {
> 	struct drbd_barrier *b;
>
> 	spin_lock_irq(&mdev->tl_lock);
>
> 	b = mdev->oldest_barrier;
> 	mdev->oldest_barrier = b->next;
>
> 	list_del(&b->requests);
> 	/* There could be requests on the list waiting for completion
> 	   of the write to the local disk, to avoid corruptions of
> 	   slab's data structures we have to remove the lists head */
>
> 	spin_unlock_irq(&mdev->tl_lock);
>
> 	D_ASSERT(b->br_number == barrier_nr);
> 	D_ASSERT(b->n_req == set_size);
> ...
>

Hi Simon,

Currently the code looks like this:

void tl_release(drbd_dev *mdev,unsigned int barrier_nr,
		       unsigned int set_size)
{
	struct drbd_barrier *b;

	spin_lock_irq(&mdev->tl_lock);

	b = mdev->oldest_barrier;
	mdev->oldest_barrier = b->next;

	list_del(&b->requests);
	/* There could be requests on the list waiting for completion
	   of the write to the local disk, to avoid corruptions of
	   slab's data structures we have to remove the lists head */

	spin_unlock_irq(&mdev->tl_lock);

	D_ASSERT(b->br_number == barrier_nr);
	D_ASSERT(b->n_req == set_size);

#ifdef DBG_ASSERTS
	if(b->br_number != barrier_nr) {
		DUMPI(b->br_number);
		DUMPI(barrier_nr);
	}
	if(b->n_req != set_size) {
		DUMPI(b->n_req);
		DUMPI(set_size);
	}
#endif

	kfree(b);
}


If they differ, you should also see the numbers.
BTW, the spinlock only protects the linked lists. Looking at
the contents of the barrier object without the lock is OK.

PS: I have recently been quite active in these parts of the code;
    with the current SVN head, these ASSERTs should not trigger.

BTW: The asserts mean the following: we send a number of write
     requests between two barriers. When the peer's barrier ACK
     comes in, we verify that the peer counted the same number of
     writes between those two barriers.
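
To illustrate the barrier accounting described above, here is a minimal
user-space sketch in C. It is not the DRBD code: struct epoch, struct tlog
and the tlog_* functions are hypothetical names made up for this example,
and locking is left out entirely. It only shows the idea of counting the
writes per barrier interval and comparing that count with what the peer
reports in its barrier ACK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for one barrier interval.
 * This is NOT DRBD's struct drbd_barrier. */
struct epoch {
	struct epoch *next;
	unsigned int br_number;	/* number of the barrier closing this interval */
	unsigned int n_req;	/* writes sent in this interval */
};

/* Hypothetical transaction log: a singly linked list of epochs. */
struct tlog {
	struct epoch *oldest;	/* next epoch expected to be acknowledged */
	struct epoch *newest;	/* epoch currently collecting writes */
};

static struct epoch *epoch_new(unsigned int nr)
{
	struct epoch *e = calloc(1, sizeof(*e));
	if (e)
		e->br_number = nr;
	return e;
}

/* Start with one open epoch, numbered 1 (an arbitrary choice). */
bool tlog_init(struct tlog *tl)
{
	tl->oldest = tl->newest = epoch_new(1);
	return tl->oldest != NULL;
}

/* Account for one write request sent in the open epoch. */
void tlog_add_write(struct tlog *tl)
{
	tl->newest->n_req++;
}

/* Close the open epoch and open a new one. On success, *nr is the
 * barrier number that would be sent to the peer. */
bool tlog_send_barrier(struct tlog *tl, unsigned int *nr)
{
	struct epoch *e = epoch_new(tl->newest->br_number + 1);
	if (!e)
		return false;
	*nr = tl->newest->br_number;
	tl->newest->next = e;
	tl->newest = e;
	return true;
}

/* Handle a barrier ACK: unlink the oldest closed epoch and check that
 * the peer reports the same barrier number and the same write count.
 * Returns false on a mismatch or if no closed epoch is outstanding. */
bool tlog_release(struct tlog *tl, unsigned int barrier_nr,
		  unsigned int set_size)
{
	struct epoch *b = tl->oldest;
	bool ok;

	if (b == tl->newest)
		return false;	/* nothing waiting for an ACK */

	tl->oldest = b->next;
	ok = (b->br_number == barrier_nr) && (b->n_req == set_size);
	free(b);
	return ok;
}

/* Free all remaining epochs. */
void tlog_destroy(struct tlog *tl)
{
	while (tl->oldest) {
		struct epoch *e = tl->oldest;
		tl->oldest = e->next;
		free(e);
	}
	tl->newest = NULL;
}
```

In this sketch a false return from tlog_release() corresponds to one of
the two asserts firing: either the ACK carried an unexpected barrier
number, or the peer counted a different number of writes than we sent.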

-Phil
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :
