/ 2004-06-05 01:34:54 -0400
\ T. Howell-Cintron:
>
> I'm running 0.7pre7 2004/06/01. The (initial) primary is an FC2 2.6.5
> and runs fine, save for eventually noting that the connection to the
> secondary was lost. The secondary is a FC1 2.4.22 machine and loads the
> module fine, but shortly after starting to sync I get several oops (see
> attached).

we did not test drbd 0.7 on a 2.4 kernel in a while, only made sure it
still compiles. it is likely that there is some strange
typo/misconception in the 2.4 compat wrappers.

in case I have time, or it occurs to me that it's obvious, I'll dig into
this, but don't expect a patch too soon.

meanwhile, please use 2.6 kernels only.

> #
> # initial module load, sync, etc.
> #
>
> drbd: initialised. Version: 0.7-pre7 cvs $Date: 2004/06/01 07:33:19 $ (api:73/proto:72)
> drbd0: size = 59713605 KB
> drbd0: 56879176 KB marked out-of-sync by on disk bit-map.
> drbd0: No usable activity log found.
> drbd0: Connection established.
> drbd0: Peer switched to Primary state
> drbd0: Resync started as target (need to sync 56879173 KB).
> Unable to handle kernel NULL pointer dereference at virtual address 00000004
> printing eip:
> d08f85f7
> *pde = 0b9a1067
> *pte = 00000000
> Oops: 0002
> drbd parport_pc lp parport 3c59x keybdev mousedev hid input usb-uhci usbcore ext3 jbd aic7xxx sd_mod scsi_mod
> CPU: 0
> EIP: 0060:[<d08f85f7>] Not tainted
> EFLAGS: 00010086
>
> EIP is at finish_wait [drbd] 0x27 (2.4.22-1.2188.nptl)

aha. it just occurred to me that it's obvious.
first, your kernel already contains a backport of those functions,
second, our finish_wait backport was wrong.

ok, this should be fixed by the following patch, which should be in cvs soonish...

diff -u -p -r1.97.2.165 drbd_receiver.c
=======================
--- drbd_receiver.c	1 Jun 2004 14:29:07 -0000	1.97.2.165
+++ drbd_receiver.c	5 Jun 2004 07:54:52 -0000
@@ -248,11 +248,11 @@ STATIC void prepare_to_wait(wait_queue_h
 {
 	unsigned long flags;
 
+	__set_current_state(state);
 	wait->flags &= ~WQ_FLAG_EXCLUSIVE;
 	spin_lock_irqsave(&q->lock, flags);
 	if (list_empty(&wait->task_list))
 		__add_wait_queue(q, wait);
-	set_current_state(state);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
 
@@ -261,10 +261,11 @@ STATIC void finish_wait(wait_queue_head_
 	unsigned long flags;
 
 	__set_current_state(TASK_RUNNING);
-
-	spin_lock_irqsave(&q->lock, flags);
-	list_del_init(&wait->task_list);
-	spin_unlock_irqrestore(&q->lock, flags);
+	if (!list_empty(&wait->task_list)) {
+		spin_lock_irqsave(&q->lock, flags);
+		list_del_init(&wait->task_list);
+		spin_unlock_irqrestore(&q->lock, flags);
+	}
 }
 
 #define DEFINE_WAIT(name) DECLARE_WAITQUEUE(name,current)
=======================

> #
> # after running 'drbdadm disconnect..'
> #
> <1>Unable to handle kernel NULL pointer dereference at virtual address 00000004
> printing eip:
> c012731f
> *pde = 00000000
> Oops: 0000
> drbd parport_pc lp parport 3c59x keybdev mousedev hid input usb-uhci usbcore ext3 jbd aic7xxx sd_mod scsi_mod
> CPU: 0
> EIP: 0060:[<c012731f>] Not tainted
> EFLAGS: 00010006
>
> EIP is at force_sig_info [kernel] 0x2f (2.4.22-1.2188.nptl)
> eax: 00000014 ebx: cfa47b9c ecx: cb4c4000 edx: 00000000
> esi: cfa47800 edi: 00000001 ebp: cfa47940 esp: cb0abed0
> ds: 0068 es: 0068 ss: 0068
> Process drbdsetup (pid: 2711, stackpage=cb0ab000)
> Stack: 00002b00 00000001 00000001 00000282 cfa47b9c cfa47800 00000001 cfa47940
> c0127bef 00000001 00000001 cb4c4000 d08f1d0d 00000001 cb4c4000 d08f19db
> 00000000 c01452f4 00000296 00000000 cfa47800 d08f0a27 cfa47b9c 00000000
> Call Trace: [<c0127bef>] force_sig [kernel] 0x1f (0xcb0abef0)

hm. this is not that easy.
we don't register signal handlers, but utilize force_sig nevertheless.
RH 2.4 force_sig_info seems to not like this, and chokes somewhere in
kernel/signal.c ...

sorry, don't see the exact cause now, and no quick fix either.

	Lars Ellenberg
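
For readers who want to see what the list_empty() guard in the patched finish_wait buys, here is a minimal user-space sketch. It is not the drbd or kernel code: the list helpers are simplified re-implementations written for this illustration, and finish_wait_sketch and wait_demo are hypothetical names. Locking and task state are left out entirely; only the "unlink only if still queued" logic is shown.

```c
#include <assert.h>

/* User-space stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialised, unqueued entry points at itself. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink the entry and leave it self-linked, so it can be tested again. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/*
 * Hypothetical analogue of the patched finish_wait(): unlink only when
 * the entry is still queued. Returns 1 if it unlinked, 0 if there was
 * nothing to do. The real code takes q->lock around the unlink.
 */
static int finish_wait_sketch(struct list_head *entry)
{
	if (list_empty(entry))
		return 0;
	list_del_init(entry);
	return 1;
}

/* Walks through the three cases; returns 0 when all checks hold. */
int wait_demo(void)
{
	struct list_head queue, waiter;

	init_list_head(&queue);
	init_list_head(&waiter);

	assert(finish_wait_sketch(&waiter) == 0);	/* never queued: no-op */

	list_add(&waiter, &queue);
	assert(!list_empty(&queue));
	assert(finish_wait_sketch(&waiter) == 1);	/* queued: unlinked */
	assert(list_empty(&queue) && list_empty(&waiter));

	assert(finish_wait_sketch(&waiter) == 0);	/* second call: harmless */
	return 0;
}
```

Calling wait_demo() from a small main() exercises all three cases. The guard only helps when the entry was properly initialised to point at itself, which is an assumption of this sketch.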