Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
/ 2004-05-12 16:09:15 +0200
\ Philipp Reisner:
> > The other problem is that the module seems to crash the machine when I try
> > to reload it after it has been unloaded. After having unloaded the module I
> > get:
> >
> > drbd0: short read expecting header on sock: r=-512
> > drbd0: worker terminated
> > drbd0: asender terminated
> > drbd0: Connection lost.
> > drbd0: receiver terminated
> > drbd0: worker terminated
> > drbd0: ASSERT( mdev->ee_vacant==0 )
> > in /root/src/drbd-0.7_pre7/drbd/drbd_main.c:1417
> > slab error in kmem_cache_destroy(): cache `drbd_ee_cache': Can't free all
> > objects
> > Call Trace:
> >  [<c0147595>] kmem_cache_destroy+0xd5/0x120
> >  [<e1108ab8>] drbd_destroy_mempools+0x58/0x90 [drbd]
> >  [<e1115f15>] drbd_cleanup+0x215/0x4b5 [drbd]
> >  [<c01383db>] sys_delete_module+0x15b/0x1b0
> >  [<c015264e>] do_munmap+0x16e/0x1f0
> >  [<c01062db>] syscall_call+0x7/0xb
> >
> > drbd: kmem_cache_destroy(drbd_ee_cache) FAILED
> >
>
> This is interesting! Only the assertion in drbd_main.c:1417 fires,
> but not the ERR() statements above. Where is this ee ?
>
> Could you please retry with this patch applied ?
>
> RCS file: /var/lib/cvs/drbd/drbd/drbd/drbd_main.c,v
> retrieving revision 1.73.2.171
> diff -u -p -u -r1.73.2.171 drbd_main.c
> --- drbd/drbd_main.c 12 May 2004 10:00:47 -0000 1.73.2.171
> +++ drbd/drbd_main.c 12 May 2004 14:07:57 -0000
> @@ -1417,6 +1417,7 @@ ONLY_IN_26(
>   if(rr) ERR("%d: %d EEs in read list found!\n",i,rr);
>
>   D_ASSERT(mdev->ee_vacant==0);
> + D_ASSERT(list_empty(&mdev->data.work.q));
>
>   if (mdev->md_io_page)
>     __free_page(mdev->md_io_page);
>
>
> If this new assertion triggers, then at least we know where this
> missing ee is.

Yes, I guess it is there.
That's why I put this "goto again;" in the worker cleanup path,
which is already in CVS now. But still, there seems to be something
"on the fly" somewhere; otherwise the ASSERT in the worker thread
would have triggered...
unless the root of the problem was that unbalanced dec_ap_pending on
failed barrier send...

So try the above, and/or retry with CVS...

	Lars
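
To illustrate the idea behind the proposed assertion: if an object from a
cache is still linked on a work queue at cleanup time, destroying the cache
must fail, and checking the queue first tells you where the missing object is.
The following is only a userspace sketch of that reasoning, not DRBD code.
All names (toy_cache, toy_ee, toy_dev, toy_cleanup and the list helpers) are
hypothetical, and the counter merely stands in for a real slab cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct list_node { struct list_node *next, *prev; };

static void list_init(struct list_node *h) { h->next = h->prev = h; }

static bool list_is_empty(const struct list_node *h) { return h->next == h; }

static void list_add_tail_node(struct list_node *n, struct list_node *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;      /* leave the node self-linked */
}

/* Hypothetical stand-in for a slab cache: it only counts live objects. */
struct toy_cache { int in_use; };

/* Hypothetical stand-in for an "ee" that may sit on a work queue. */
struct toy_ee { struct list_node w; };

struct toy_dev {
    struct toy_cache ee_cache;
    struct list_node work_q;
};

static void toy_dev_init(struct toy_dev *d)
{
    d->ee_cache.in_use = 0;
    list_init(&d->work_q);
}

/* Account for one object taken from the cache. */
static void toy_ee_get(struct toy_dev *d, struct toy_ee *e)
{
    list_init(&e->w);
    d->ee_cache.in_use++;
}

/* Unlink the object from whatever list it is on and give it back. */
static void toy_ee_put(struct toy_dev *d, struct toy_ee *e)
{
    assert(d->ee_cache.in_use > 0);
    list_del_node(&e->w);
    d->ee_cache.in_use--;
}

/*
 * Returns 0 if the cache could be destroyed,
 * -1 if something is still on the work queue (the new check),
 * -2 if objects are outstanding somewhere else.
 */
static int toy_cleanup(struct toy_dev *d)
{
    if (!list_is_empty(&d->work_q))
        return -1;
    if (d->ee_cache.in_use != 0)
        return -2;
    return 0;
}
```

In this model, a result of -1 corresponds to the new assertion triggering,
while -2 corresponds to the "still on the fly somewhere" case where the
queue is empty but the cache nevertheless cannot be freed.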