[DRBD-user] kernel oops drbd 8.0_pre2 on Fedora Core 5 and RHEL4

Langemeyer, Werner (IBW) Werner.Langemeyer at de.bp.com
Tue Apr 11 16:51:18 CEST 2006


Lars,

the patched module really is in use; the details are below. What would
you like me to do next?

Module                  Size  Used by
drbd                  144884  1
ipv6                  225569  16
...

[root at emdc-ha1 ~]# modinfo drbd
filename:
/lib/modules/2.6.16-1.2080_FC5/kernel/drivers/block/drbd.ko
author:         Philipp Reisner <phil at linbit.com>, Lars Ellenberg
<lars at linbit.com>
description:    drbd - Distributed Replicated Block Device v8.0-pre2
license:        GPL
alias:          block-major-147-*
vermagic:       2.6.16-1.2080_FC5 686 REGPARM 4KSTACKS gcc-4.1
depends:
srcversion:     ED8B4814AD573FE5BC1348D
parm:           disable_bd_claim:DONT USE! disables block device
claiming (bool)
parm:           minor_count:Maximum number of drbd devices (1-255) (int)

[root at emdc-ha1 ~]# cat /proc/drbd
version: 8.0-pre2 (api:81/proto:80)
SVN Revision: 2143M build by root at emdc-devel.in-geseke.de, 2006-04-11
16:28:04
 0: cs:StandAlone st:Secondary/Unknown ds:Attaching/DUnknown r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:1 pe:0 ua:0 ap:0
        resync: used:0/7 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0

see /var/log/messages:
**********************
Apr 11 16:37:26 emdc-ha1 kernel: drbd: initialised. Version: 8.0-pre2
(api:81/proto:80)
Apr 11 16:37:26 emdc-ha1 kernel: drbd: SVN Revision: 2143M build by
root at emdc-devel.in-geseke.de, 2006-04-11 16:28:04
Apr 11 16:37:26 emdc-ha1 kernel: drbd: registered as block device major
147
Apr 11 16:37:26 emdc-ha1 kernel: drbd0: disk( Diskless -> Attaching )
Apr 11 16:37:26 emdc-ha1 kernel: klogd 1.4.1, ---------- state change
----------
Apr 11 16:37:26 emdc-ha1 kernel: drbd0: drbd_bm_resize called with
capacity == 786336
Apr 11 16:37:26 emdc-ha1 kernel: drbd0: bits = 98292 in
/usr/src/redhat/BUILD/drbd-0.8/drbd/drbd_bitmap.c:369
Apr 11 16:37:26 emdc-ha1 kernel: drbd0: resync bitmap: bits=98292
words=3072
Apr 11 16:37:26 emdc-ha1 kernel: drbd0: size = 383 MB (393168 KB)
Apr 11 16:37:29 emdc-ha1 kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000004
Apr 11 16:37:29 emdc-ha1 kernel:  printing eip:
Apr 11 16:37:29 emdc-ha1 kernel: c01c4d3c
Apr 11 16:37:29 emdc-ha1 kernel: *pde = 00000000
Apr 11 16:37:29 emdc-ha1 kernel: Oops: 0000 [#1]
Apr 11 16:37:29 emdc-ha1 kernel: last sysfs file: /block/hdc/size
Apr 11 16:37:29 emdc-ha1 kernel: Modules linked in: drbd(U) ipv6 autofs4
hidp rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT
xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables
x_tables acpi_cpufreq video button battery ac lp parport_pc parport
floppy nvram pcnet32 mii i2c_piix4 i2c_core dm_snapshot dm_zero
dm_mirror dm_mod ext3 jbd
Apr 11 16:37:29 emdc-ha1 kernel: CPU:    0
Apr 11 16:37:29 emdc-ha1 kernel: EIP:    0060:[<c01c4d3c>]    Not
tainted VLI
Apr 11 16:38:29 emdc-ha1 kernel: EFLAGS: 00010082   (2.6.16-1.2080_FC5
#1)
Apr 11 16:38:29 emdc-ha1 kernel: EIP is at _raw_spin_lock+0x5/0xd3
Apr 11 16:38:29 emdc-ha1 kernel: eax: 00000000   ebx: 00000000   ecx:
00000000   edx: 00000003
Apr 11 16:38:29 emdc-ha1 kernel: esi: 00000000   edi: cc58be80   ebp:
00000000   esp: cc85cc64
Apr 11 16:38:29 emdc-ha1 kernel: ds: 007b   es: 007b   ss: 0068
Apr 11 16:38:29 emdc-ha1 kernel: Process drbdsetup (pid: 1805,
threadinfo=cc85c000 task=c6efcaa0)
Apr 11 16:38:29 emdc-ha1 kernel: Stack: <0>00000282 00000000 cc58be80
c02dc5cc cf92902c c01b7892 c7bec800 00000000
Apr 11 16:38:29 emdc-ha1 kernel:        d09db7e3 00000001 c7bec800
cda333c0 00000003 ffffa79f cc58be80 cecc68c0
Apr 11 16:38:29 emdc-ha1 kernel:        d098d000 c1000000 00000003
00000000 000bffa0 d09eb98e 00000000 00000101
Apr 11 16:38:29 emdc-ha1 kernel: Call Trace:
Apr 11 16:38:29 emdc-ha1 kernel:  [<c02dc5cc>]
_spin_lock_irqsave+0x9/0xd     [<c01b7892>] blk_run_queue+0xf/0x37
Apr 11 16:38:29 emdc-ha1 kernel:  [<d09db7e3>] drbd_bm_rw+0x173/0x3e5
[drbd]     [<d09eb98e>] drbd_al_shrink+0x19e/0x1a6 [drbd]
Apr 11 16:38:29 emdc-ha1 kernel:  [<c011ac86>] printk+0x19/0xa8
[<d09dba62>] drbd_bm_write+0xd/0x37 [drbd]
Apr 11 16:38:29 emdc-ha1 kernel:  [<d09decd4>]
drbd_determin_dev_size+0x2b3/0x338 [drbd]     [<c02dc5cc>]
_spin_lock_irqsave+0x9/0xd
Apr 11 16:38:29 emdc-ha1 kernel:  [<c01213a7>] lock_timer_base+0x15/0x2f
[<c0121aac>] __mod_timer+0x8a/0x92
Apr 11 16:38:29 emdc-ha1 kernel:  [<d09dbd00>]
drbd_bm_set_lel+0x214/0x23a [drbd]     [<d09df1c1>]
drbd_ioctl_set_disk+0x468/0x700 [drbd]
Apr 11 16:38:29 emdc-ha1 kernel:  [<d09df753>] drbd_ioctl+0x2fa/0x1358
[drbd]     [<c01c065a>] _atomic_dec_and_lock+0x22/0x2c
Apr 11 16:38:29 emdc-ha1 kernel:  [<c01a0154>] avc_has_perm+0x3a/0x44
[<d09df459>] drbd_ioctl+0x0/0x1358 [drbd]
Apr 11 16:38:29 emdc-ha1 kernel:  [<c01b938d>]
blkdev_driver_ioctl+0x39/0x3f     [<c01b99e6>] blkdev_ioctl+0x62a/0x665
Apr 11 16:38:29 emdc-ha1 kernel:  [<c01a0154>] avc_has_perm+0x3a/0x44
[<c01a06ef>] inode_has_perm+0x54/0x5c
Apr 11 16:38:29 emdc-ha1 kernel:  [<c01a0776>] file_has_perm+0x7f/0x88
[<c015898a>] block_ioctl+0x0/0x16
Apr 11 16:38:29 emdc-ha1 kernel:  [<c015899d>] block_ioctl+0x13/0x16
[<c0161776>] do_ioctl+0x16/0x48
Apr 11 16:38:29 emdc-ha1 kernel:  [<c01619a7>] vfs_ioctl+0x1ff/0x216
[<c0161a06>] sys_ioctl+0x48/0x62
Apr 11 16:38:29 emdc-ha1 kernel:  [<c0102bc1>] syscall_call+0x7/0xb
<0>Code: 08 00 74 0c ba 06 48 30 c0 89 d8 e8 ab fe ff ff c7 43 0c ff ff
ff ff c7 43 08 ff ff ff ff c7 03 01 00 00 00 5b c3 57 56 53 89 c3 <81>
78 04 ad 4e ad de 74 0a ba f0 47 30 c0 e8 7d fe ff ff b8 00
Continuing in 120 seconds.
Continuing in 85 seconds.
Continuing in 84 seconds.
Continuing in 48 seconds.
Continuing in 47 seconds.
Continuing in 11 seconds.
Continuing in 10 seconds.
Continuing in 1 seconds.
Apr 11 16:38:29 emdc-ha1 kernel:  <3>Debug: sleeping function called
from invalid context at include/linux/rwsem.h:43
Apr 11 16:38:30 emdc-ha1 kernel: in_atomic():0, irqs_disabled():1
Apr 11 16:38:30 emdc-ha1 kernel:  [<c011b557>]
profile_task_exit+0x13/0x3e
Apr 11 16:38:30 emdc-ha1 kernel:  [<c011cdd5>] do_exit+0x1c/0x6c8
[<c0104022>] register_die_notifier+0x0/0x2f
Apr 11 16:38:30 emdc-ha1 kernel:  [<c02dd43e>] do_page_fault+0x375/0x51d
[<c01c4d3c>] _raw_spin_lock+0x5/0xd3
Apr 11 16:38:30 emdc-ha1 kernel:  [<d08523c1>] dm_request+0x11a/0x12e
[dm_mod]     [<c02dd0c9>] do_page_fault+0x0/0x51d
Apr 11 16:38:30 emdc-ha1 kernel:  [<c010367b>] error_code+0x4f/0x54
[<c01c4d3c>] _raw_spin_lock+0x5/0xd3
Apr 11 16:38:30 emdc-ha1 kernel:  [<c02dc5cc>]
_spin_lock_irqsave+0x9/0xd     [<c01b7892>] blk_run_queue+0xf/0x37
Apr 11 16:38:30 emdc-ha1 kernel:  [<d09db7e3>] drbd_bm_rw+0x173/0x3e5
[drbd]     [<d09eb98e>] drbd_al_shrink+0x19e/0x1a6 [drbd]
Apr 11 16:38:30 emdc-ha1 kernel:  [<c011ac86>] printk+0x19/0xa8
[<d09dba62>] drbd_bm_write+0xd/0x37 [drbd]
Apr 11 16:38:30 emdc-ha1 kernel:  [<d09decd4>]
drbd_determin_dev_size+0x2b3/0x338 [drbd]     [<c02dc5cc>]
_spin_lock_irqsave+0x9/0xd
Apr 11 16:38:30 emdc-ha1 kernel:  [<c01213a7>] lock_timer_base+0x15/0x2f
[<c0121aac>] __mod_timer+0x8a/0x92
Apr 11 16:38:30 emdc-ha1 kernel:  [<d09dbd00>]
drbd_bm_set_lel+0x214/0x23a [drbd]     [<d09df1c1>]
drbd_ioctl_set_disk+0x468/0x700 [drbd]
Apr 11 16:38:30 emdc-ha1 kernel:  [<d09df753>] drbd_ioctl+0x2fa/0x1358
[drbd]     [<c01c065a>] _atomic_dec_and_lock+0x22/0x2c
Apr 11 16:38:30 emdc-ha1 kernel:  [<c01a0154>] avc_has_perm+0x3a/0x44
[<d09df459>] drbd_ioctl+0x0/0x1358 [drbd]
Apr 11 16:38:30 emdc-ha1 kernel:  [<c01b938d>]
blkdev_driver_ioctl+0x39/0x3f     [<c01b99e6>] blkdev_ioctl+0x62a/0x665
Apr 11 16:38:30 emdc-ha1 kernel:  [<c01a0154>] avc_has_perm+0x3a/0x44
[<c01a06ef>] inode_has_perm+0x54/0x5c
Apr 11 16:38:30 emdc-ha1 kernel:  [<c01a0776>] file_has_perm+0x7f/0x88
[<c015898a>] block_ioctl+0x0/0x16
Apr 11 16:38:30 emdc-ha1 kernel:  [<c015899d>] block_ioctl+0x13/0x16
[<c0161776>] do_ioctl+0x16/0x48
Apr 11 16:38:30 emdc-ha1 kernel:  [<c01619a7>] vfs_ioctl+0x1ff/0x216
[<c0161a06>] sys_ioctl+0x48/0x62
Apr 11 16:38:30 emdc-ha1 kernel:  [<c0102bc1>] syscall_call+0x7/0xb
<3>drbd0: Process vol_id[1807] tried to READ; since we are not in
Primary state, we cannot allow this
Apr 11 16:38:30 emdc-ha1 kernel: Buffer I/O error on device drbd0,
logical block 98272
Apr 11 16:38:30 emdc-ha1 kernel: drbd0: Process vol_id[1807] tried to
READ; since we are not in Primary state, we cannot allow this
Apr 11 16:38:30 emdc-ha1 kernel: Buffer I/O error on device drbd0,
logical block 98272
Apr 11 16:38:30 emdc-ha1 kernel: drbd0: Process vol_id[1807] tried to
READ; since we are not in Primary state, we cannot allow this
Apr 11 16:38:30 emdc-ha1 kernel: Buffer I/O error on device drbd0,
logical block 98291
Apr 11 16:38:30 emdc-ha1 kernel: drbd0: Process vol_id[1807] tried to
READ; since we are not in Primary state, we cannot allow this
Apr 11 16:38:30 emdc-ha1 kernel: Buffer I/O error on device drbd0,
logical block 98291
Apr 11 16:38:30 emdc-ha1 kernel: drbd0: Process vol_id[1807] tried to
READ; since we are not in Primary state, we cannot allow this
Apr 11 16:38:30 emdc-ha1 kernel: Buffer I/O error on device drbd0,
logical block 98291
Apr 11 16:38:30 emdc-ha1 last message repeated 3 times
Apr 11 16:38:30 emdc-ha1 kernel: Buffer I/O error on device drbd0,
logical block 98284
Apr 11 16:38:30 emdc-ha1 kernel: Buffer I/O error on device drbd0,
logical block 98284
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_timer_expiry: dma status ==
0x26
Apr 11 16:38:30 emdc-ha1 kernel: hda: DMA interrupt recovery
Apr 11 16:38:30 emdc-ha1 kernel: hda: lost interrupt
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_intr: error=0xc0 { BadSector
UncorrectableError }, LBAsect=6631661, sector=6631661
Apr 11 16:38:30 emdc-ha1 kernel: ide: failed opcode was: unknown
Apr 11 16:38:30 emdc-ha1 kernel: end_request: I/O error, dev hda, sector
6631661
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_timer_expiry: dma status ==
0x26
Apr 11 16:38:30 emdc-ha1 kernel: hda: DMA interrupt recovery
Apr 11 16:38:30 emdc-ha1 kernel: hda: lost interrupt
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_intr: error=0xc0 { BadSector
UncorrectableError }, LBAsect=6631669, sector=6631669
Apr 11 16:38:30 emdc-ha1 kernel: ide: failed opcode was: unknown
Apr 11 16:38:30 emdc-ha1 kernel: end_request: I/O error, dev hda, sector
6631669
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_timer_expiry: dma status ==
0x26
Apr 11 16:38:30 emdc-ha1 kernel: hda: DMA interrupt recovery
Apr 11 16:38:30 emdc-ha1 kernel: hda: lost interrupt
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Apr 11 16:38:30 emdc-ha1 kernel: hda: dma_intr: error=0xc0 { BadSector
UncorrectableError }, LBAsect=6631677, sector=6631677
Apr 11 16:38:30 emdc-ha1 kernel: ide: failed opcode was: unknown
Apr 11 16:38:30 emdc-ha1 kernel: end_request: I/O error, dev hda, sector
6631677


-----Original Message-----
From: drbd-user-bounces at linbit.com [mailto:drbd-user-bounces at linbit.com]
On Behalf Of Lars Ellenberg
Sent: Tuesday, 11 April 2006 15:44
To: drbd-user at linbit.com
Subject: Re: [DRBD-user] kernel oops drbd 8.0_pre2 on Fedora Core 5 and
RHEL4

/ 2006-04-11 13:42:55 +0100
\ Langemeyer, Werner (IBW):
> Lars,
> 
> still the same... The complete /var/log/messages can be found below:

you are very sure that the module in use is the one with the patch?

because it is very simple:
in the kernel source: block/ll_rw_blk.c:
  |void blk_run_queue(struct request_queue *q)
  |{
  |	unsigned long flags;
  |
  |	spin_lock_irqsave(q->queue_lock, flags);
which is called from drbd_bitmap.c
  |	drbd_blk_run_queue(bdev_get_queue(mdev->bc->md_bdev));

which now is this macro wrapper
  |#define drbd_blk_run_queue(q) do {      \
  |        request_queue_t *_q = (q);      \
  |        if (_q) blk_run_queue(_q);      \
  |        else {                          \
  |                WARN(#q "== NULL??\n"); \
  |        };                              \
  |} while (0)

so, to get this "NULL pointer dereference" in the spinlock code, the
block device would have to have no queue defined. but with the macro in
place, that case no longer calls into blk_run_queue at all, and so it
could not produce the stack trace you have.
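
The guard described above can be tried outside the kernel. What follows
is only a minimal userspace sketch of the same macro shape, not DRBD
code: fake_queue, fake_blk_run_queue, guarded_run_queue and demo_guard
are hypothetical stand-ins invented for the illustration, and the
warning is printed with fprintf instead of DRBD's WARN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's request queue. */
struct fake_queue {
    int runs;   /* how often this queue was "run" */
};

/* Counts every entry into fake_blk_run_queue(). */
static int run_calls;

/* Stand-in for blk_run_queue(): it touches q unconditionally,
 * so a NULL q would fault here. */
static void fake_blk_run_queue(struct fake_queue *q)
{
    run_calls++;
    q->runs++;
}

/* Same shape as the drbd_blk_run_queue() wrapper quoted above:
 * evaluate the argument once, call through only if it is non-NULL. */
#define guarded_run_queue(q) do {                 \
    struct fake_queue *_q = (q);                  \
    if (_q)                                       \
        fake_blk_run_queue(_q);                   \
    else                                          \
        fprintf(stderr, #q " == NULL??\n");       \
} while (0)

/* Sends one present and one missing queue through the guard and
 * returns how many calls actually reached fake_blk_run_queue(). */
static int demo_guard(void)
{
    struct fake_queue present = { 0 };
    struct fake_queue *missing = NULL;

    run_calls = 0;
    guarded_run_queue(&present);   /* reaches fake_blk_run_queue() */
    guarded_run_queue(missing);    /* only prints the warning */
    assert(present.runs == 1);
    return run_calls;
}
```

Called from a main(), demo_guard() returns 1: only the non-NULL queue
gets through, and the NULL one is reported on stderr instead. Note that
a guard of this shape tests the queue pointer only. blk_run_queue takes
its lock through q->queue_lock, so a queue that exists but has no lock
pointer set would still pass the check.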

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :