[Drbd-dev] DRBD8: Panic in drbd_bm_write_sect() after an io error during resync.

Montrose, Ernest Ernest.Montrose at stratus.com
Wed Feb 14 19:03:43 CET 2007


Hi all,
We are overwhelmed with panics after I/O errors. It seems mdev->bc is NULL
due to some race condition.  Here is one instance:
 
Two-node cluster, node A and node B. The SyncSource is node A. While syncing,
reads are issued on node B.  I/O errors start to occur
on node A, and node A panics:
 
Feb 11 03:15:49 drbd0: Sending NegRSDReply. sector 11620032s.
Feb 11 03:15:49 drbd0: Notified peer that my disk is broken.
Feb 11 03:15:58 end_request: I/O error, dev sda, sector 59856375
Feb 11 03:16:10 end_request: I/O error, dev sda, sector 59856383
Feb 11 03:16:10 drbd0: Local IO failed. Detaching...
Feb 11 03:16:10 Unable to handle kernel NULL pointer dereference at
virtual address 00000010
Feb 11 03:16:10  printing eip:
Feb 11 03:16:10 ee3d0b0b
Feb 11 03:16:10 299ba000 -> *pde = 00000000:7f512001
Feb 11 03:16:10 00512000 -> *pme = 00000000:00000000
Feb 11 03:16:10 Oops: 0000 [#1]
Feb 11 03:16:10 SMP 
Feb 11 03:16:10 Modules linked in: drbd cn bridge ipv6 ipmi_devintf
ipmi_si ipmi_msghandler i2c_dev i2c_core binfmt_misc dm_mirror video
thermal processor fan container button battery ac shpchp pci_hotplug
e1000 piix ide_cd cdrom sg raid1 dm_mod ide_disk mptscsih mptsas mptspi
mptfc mptscsi mptbase sd_mod scsi_mod
Feb 11 03:16:10 CPU:    0
Feb 11 03:16:10 EIP:    0061:[<ee3d0b0b>]    Tainted: GF    VLI
Feb 11 03:16:10 EFLAGS: 00010292  (2.6.16.29-xen #1) 
Feb 11 03:16:10 EIP is at drbd_bm_write_sect+0x1b/0x1f0 [drbd]
Feb 11 03:16:10 eax: 00000000  ebx: eb925000  ecx: 00000000  edx:
000001bd
Feb 11 03:16:10 esi: eb925000  edi: eb92502c  ebp: eae77f50  esp:
eae77f1c
Feb 11 03:16:10 ds: 007b  es: 007b  ss: 0069
Feb 11 03:16:10 Process drbd0_worker (pid: 5777, threadinfo=eae76000
task=c060b570)
Feb 11 03:16:10 Stack: <0>c0136f00 eae77f20 eae77f20 ffffffff ffffffff
e7ea3550 00000000 000001bd 
Feb 11 03:16:10        00000000 000001bd eb925000 eb667980 eb92502c
eae77f74 ee3e2080 eb385d40 
Feb 11 03:16:10        eb92502c eae77f74 ee3e6606 00000005 eb667980
eb925000 eae77fc0 ee3d4cae 
Feb 11 03:16:10 Call Trace:
Feb 11 03:16:10  [<c0105401>] show_stack_log_lvl+0xa1/0xe0
Feb 11 03:16:10  [<c01055f1>] show_registers+0x181/0x200
Feb 11 03:16:10  [<c0105810>] die+0x100/0x1a0
Feb 11 03:16:10  [<c01156f6>] do_page_fault+0x3c6/0x8b1
Feb 11 03:16:10  [<c0105067>] error_code+0x2b/0x30
Feb 11 03:16:10  [<ee3e2080>] w_update_odbm+0x100/0x220 [drbd]
Feb 11 03:16:10  [<ee3d4cae>] drbd_worker+0x2de/0x4b5 [drbd]
Feb 11 03:16:10  [<ee3e70fc>] drbd_thread_setup+0x8c/0x100 [drbd]
Feb 11 03:16:10  [<c0102e95>] kernel_thread_helper+0x5/0x10
Feb 11 03:16:10 Code: 89 c8 5b 5d c3 8d 74 26 00 8d bc 27 00 00 00 00 55
89 e5 57 56 89 c6 53 83 ec 28 89 55 e8 c7 45 ec 00 00 00 00 89 55 f0 8b
40 14 <8b> 50 10 8b 48 14 01 55 e8 11 4d ec 8b 40 54 c7 45 e0 00 00 00 
Feb 11 03:16:10  <0>Fatal exception: panic in 5 seconds
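
The Code: bytes look consistent with the NULL mdev->bc reading: "8b 40 14"
loads a pointer from offset 0x14 of the first argument, and the faulting
"<8b> 50 10" reads offset 0x10 through it, which matches eax: 00000000 and
the fault address 00000010. To illustrate the kind of guard that would close
such a window, here is a minimal user-space sketch. It is not DRBD code:
struct device, struct backing_dev, get_backing(), put_backing(), detach() and
write_sect() are hypothetical names, and C11 atomics stand in for the kernel
primitives. The worker takes a reference before touching bc, and the detach
path only clears bc once no reference is held.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the backing device info (not the DRBD struct). */
struct backing_dev {
    unsigned long md_offset;   /* start of the on-disk meta data area */
};

/* Hypothetical stand-in for the per-device state. */
struct device {
    atomic_int attached;       /* 1 while bc may be used, 0 once detaching */
    atomic_int users;          /* number of code paths currently using bc */
    struct backing_dev *bc;    /* cleared by detach() when users == 0 */
};

/* Take a reference on bc. Returns 1 on success, 0 if the disk is going away.
 * The counter is raised before the flag is checked, so a concurrent detach()
 * either sees our reference or we see its cleared flag. */
static int get_backing(struct device *dev)
{
    atomic_fetch_add(&dev->users, 1);
    if (atomic_load(&dev->attached))
        return 1;
    atomic_fetch_sub(&dev->users, 1);
    return 0;
}

/* Drop a reference taken with get_backing(). */
static void put_backing(struct device *dev)
{
    int prev = atomic_fetch_sub(&dev->users, 1);
    assert(prev > 0);
}

/* Detach path: forbid new users first, then clear bc only if nobody holds
 * a reference. Returns -EBUSY if a user is still active (caller retries). */
static int detach(struct device *dev)
{
    atomic_store(&dev->attached, 0);
    if (atomic_load(&dev->users) != 0)
        return -EBUSY;
    dev->bc = NULL;
    return 0;
}

/* Worker path: compute the on-disk sector for bitmap extent enr, but only
 * while holding a reference, so bc cannot become NULL underneath us. */
static int write_sect(struct device *dev, unsigned long enr,
                      unsigned long *on_disk_sector)
{
    if (!get_backing(dev))
        return -EIO;           /* disk already detached: fail, do not oops */
    *on_disk_sector = dev->bc->md_offset + enr;
    put_backing(dev);
    return 0;
}
```

With a guard of this shape, a bitmap write queued before the detach would
return an error instead of dereferencing a NULL pointer in the worker.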

Thanks,
 
EM--