On Monday, 14 November 2005 16:24 you wrote:
> Good morning Philipp!
>
> Sorry bothering you Monday morning but I really need your help.
> I installed DRBD 0.7.14 from source on kernel 2.6.14 and during initial
> sync between Primary and Secondary nodes I had the "Stalled" situation 3
> times. "drbd status" reported:
>
> cs:SyncSource st:Primary/Secondary ld:Consistent
> ns:430188628 nr:0 dw:0 dr:431237628 al:0 bm:26242 lo:0 pe:108 ua:1555
> ap:0 [>...................] sync'ed: 0.1% (636555/636704)M
> stalled
>
> On Secondary node in /var/log/messages I found kernel BUG report and I
> attached it to this email. I'm unable to do "drbd restart" or "reload" or
> even "force-reload" in this situation, so, only after rebooting of
> Secondary node I'm able to continue to resync. This DRBD machines are
> supposed to be mission-critical, so, we will really appreciate any help
> from you. Maybe small patch for kernel 2.6.14 ?
>

[ decoded .rtf inserted here ]

Nov 14 01:26:57 san2 kernel: ------------[ cut here ]------------
Nov 14 01:26:57 san2 kernel: kernel BUG at /usr/local/src/drbd-0.7.14/drbd/lru_cache.c:259!
Nov 14 01:26:57 san2 kernel: invalid operand: 0000 [#1]
Nov 14 01:26:57 san2 kernel: Modules linked in: drbd iptable_mangle iptable_nat ip_nat ip_conntrack iptable_filter ip_tables edd sk98lin ohci1394 ieee1394 i2c_i801 i2c_core ehci_hcd uhci_hcd usbcore parport_pc lp parport
Nov 14 01:26:57 san2 kernel: CPU: 0
Nov 14 01:26:57 san2 kernel: EIP: 0060:[<f90e4054>] Tainted: GF VLI
Nov 14 01:26:57 san2 kernel: EFLAGS: 00010046 (2.6.14-default)
Nov 14 01:26:57 san2 kernel: EIP is at lc_put+0x64/0x90 [drbd]
Nov 14 01:26:57 san2 kernel: eax: 00000000 ebx: f90b5000 ecx: f90b5254 edx: f90b5254
Nov 14 01:26:57 san2 kernel: esi: f90b5254 edi: 00000246 ebp: 00000000 esp: f5831f54
Nov 14 01:26:57 san2 kernel: ds: 007b es: 007b ss: 0068
Nov 14 01:26:57 san2 kernel: Process drbd4_asender (pid: 4265, threadinfo=f5830000 task=f73c9540)
Nov 14 01:26:57 san2 kernel: Stack: f90b5254 f5f6b350 f90e3a4d f5f6b350 f5c7d6f8 0d07f860 f90dc817 f5c7d6f8
Nov 14 01:26:57 san2 kernel: f5f6b350 00000001 f5f6b7bc f90db5ce 00000046 f5851808 f73c99b4 00000008
Nov 14 01:26:57 san2 kernel: c011e09d fffffffc f5f6b350 00000008 f5f6b5f4 f90db5e6 f90e00a7 f5f6b760
Nov 14 01:26:57 san2 kernel: Call Trace:
Nov 14 01:26:57 san2 kernel: [<f90e3a4d>] drbd_rs_complete_io+0x2d/0x90 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90dc817>] e_end_resync_block+0x17/0xf0 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90db5ce>] _drbd_process_ee+0x15e/0x170 [drbd]
Nov 14 01:26:57 san2 kernel: [<c011e09d>] flush_sigqueue+0x4d/0x60
Nov 14 01:26:57 san2 kernel: [<f90db5e6>] drbd_process_ee+0x6/0x10 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90e00a7>] drbd_asender+0xf7/0x41e [drbd]
Nov 14 01:26:57 san2 kernel: [<f90e4ac7>] drbd_thread_setup+0x87/0xb0 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90e4a40>] drbd_thread_setup+0x0/0xb0 [drbd]
Nov 14 01:26:57 san2 kernel: [<c0100e69>] kernel_thread_helper+0x5/0xc
Nov 14 01:26:57 san2 kernel: Code: 41 04 89 08 8b 03 89 46 08 89 13 89 50 04 89 5a 04 0f ba 73 24 02 0f ba 73 24 00 8b 46 10 5b 5e c3 0f 0b fe 00 c8 c0 0e f9 eb a6 <0f> 0b 03 01 c8 c0 0e f9 eb b9 0f 0b 02 01 c8 c0 0e f9 eb a8 0f
Nov 14 01:26:57 san2 kernel: ------------[ cut here ]------------
Nov 14 01:26:57 san2 kernel: kernel BUG at /usr/local/src/drbd-0.7.14/drbd/lru_cache.c:201!
Nov 14 01:26:57 san2 kernel: invalid operand: 0000 [#2]
Nov 14 01:26:57 san2 kernel: Modules linked in: drbd iptable_mangle iptable_nat ip_nat ip_conntrack iptable_filter ip_tables edd sk98lin ohci1394 ieee1394 i2c_i801 i2c_core ehci_hcd uhci_hcd usbcore parport_pc lp parport
Nov 14 01:26:57 san2 kernel: CPU: 0
Nov 14 01:26:57 san2 kernel: EIP: 0060:[<f90e3f20>] Tainted: GF VLI
Nov 14 01:26:57 san2 kernel: EFLAGS: 00010086 (2.6.14-default)
Nov 14 01:26:57 san2 kernel: EIP is at lc_get+0xd0/0x110 [drbd]
Nov 14 01:26:57 san2 kernel: eax: ffffffff ebx: 00000bfb ecx: 00000000 edx: 00000bfb
Nov 14 01:26:57 san2 kernel: esi: f90b5000 edi: f5f6b350 ebp: f5f6b350 esp: f58fbf4c
Nov 14 01:26:57 san2 kernel: ds: 007b es: 007b ss: 0068
Nov 14 01:26:57 san2 kernel: Process drbd4_worker (pid: 4264, threadinfo=f58fa000 task=f582aa50)
Nov 14 01:26:57 san2 kernel: Stack: 00000bfb 00000000 f5f6b350 f90e373d 00000bfb 00000246 fa8f8654 f5ce97c0
Nov 14 01:26:57 san2 kernel: f5f6b350 8cc8e000 00000000 f90d4c47 f5ce97c0 05fd9508 00000000 00bfb2a1
Nov 14 01:26:57 san2 kernel: f5f6b350 f90d9738 8cc8e000 00000000 00000000 000002f1 f5f6b64c f5f6b350
Nov 14 01:26:57 san2 kernel: Call Trace:
Nov 14 01:26:57 san2 kernel: [<f90e373d>] drbd_rs_begin_io+0x13d/0x420 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90d4c47>] drbd_bm_find_next+0x167/0x1b0 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90d9738>] w_make_resync_request+0xf8/0x2b0 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90daac6>] drbd_worker+0x1c6/0x2d9 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90e4ac7>] drbd_thread_setup+0x87/0xb0 [drbd]
Nov 14 01:26:57 san2 kernel: [<f90e4a40>] drbd_thread_setup+0x0/0xb0 [drbd]
Nov 14 01:26:57 san2 kernel: [<c0100e69>] kernel_thread_helper+0x5/0xc
Nov 14 01:26:57 san2 kernel: Code: 8b 47 10 40 89 47 10 48 75 4a 89 7e 28 89 5e 20 0f ba 76 24 00 eb ae 0f ba 76 24 00 eb a7 0f 0b c6 00 c8 c0 0e f9 e9 3f ff ff ff <0f> 0b c9 00 c8 c0 0e f9 e9 4c ff ff ff 0f 0b c7 00 c8 c0 0e f9
Nov 14 01:26:57 san2 kernel: <1>Unable to handle kernel NULL pointer dereference at virtual address 000001d0
Nov 14 01:26:57 san2 kernel: printing eip:
Nov 14 01:26:57 san2 kernel: c011eaa6
Nov 14 01:26:57 san2 kernel: *pde = 00000000
Nov 14 01:26:57 san2 kernel: Oops: 0000 [#3]
Nov 14 01:26:58 san2 kernel: Modules linked in: drbd iptable_mangle iptable_nat ip_nat ip_conntrack iptable_filter ip_tables edd sk98lin ohci1394 ieee1394 i2c_i801 i2c_core ehci_hcd uhci_hcd usbcore parport_pc lp parport
Nov 14 01:26:58 san2 kernel: CPU: 0
Nov 14 01:26:58 san2 kernel: EIP: 0060:[<c011eaa6>] Tainted: GF VLI
Nov 14 01:26:58 san2 kernel: EFLAGS: 00010046 (2.6.14-default)
Nov 14 01:26:58 san2 kernel: EIP is at force_sig_info+0x26/0x70
Nov 14 01:26:58 san2 kernel: eax: 00000073 ebx: f73c9540 ecx: 00000000 edx: 00000017
Nov 14 01:26:58 san2 kernel: esi: 00000202 edi: 00000018 ebp: 00000001 esp: f7cdfe8c
Nov 14 01:26:58 san2 kernel: ds: 007b es: 007b ss: 0068
Nov 14 01:26:58 san2 kernel: Process md2_raid5 (pid: 795, threadinfo=f7cde000 task=f7c69a50)
Nov 14 01:26:58 san2 kernel: Stack: f5f6b350 f5f6b7ec f5b0a61c 00000286 f90d8cd0 00000000 f5b0a61c 00001000
Nov 14 01:26:58 san2 kernel: f90d8b30 00000000 c015515a f5a4da68 f6254a68 00000000 f5a09700 c02c5f29
Nov 14 01:26:58 san2 kernel: f8ff6080 00000000 00000000 f5f3db00 f7997438 00000000 f7c5d380 c02b91c5
Nov 14 01:26:58 san2 kernel: Call Trace:
Nov 14 01:26:58 san2 kernel: [<f90d8cd0>] drbd_dio_end_sec+0x1a0/0x2a0 [drbd]
Nov 14 01:26:58 san2 kernel: [<f90d8b30>] drbd_dio_end_sec+0x0/0x2a0 [drbd]
Nov 14 01:26:58 san2 kernel: [<c015515a>] bio_endio+0x3a/0x60
Nov 14 01:26:58 san2 kernel: [<c02c5f29>] clone_endio+0xc9/0x120
Nov 14 01:26:58 san2 kernel: [<c02b91c5>] handle_stripe+0x7a5/0x1020
Nov 14 01:26:58 san2 kernel: [<c0119f93>] __do_softirq+0x43/0xa0
Nov 14 01:26:58 san2 kernel: [<c02b715f>] release_stripe+0x7f/0x110
Nov 14 01:26:58 san2 kernel: [<c02ba29a>] raid5d+0x11a/0x270
Nov 14 01:26:58 san2 kernel: [<c02c1f23>] md_thread+0x53/0x100
Nov 14 01:26:58 san2 kernel: [<c0126eb0>] autoremove_wake_function+0x0/0x30
Nov 14 01:26:58 san2 kernel: [<c02c1ed0>] md_thread+0x0/0x100
Nov 14 01:26:58 san2 kernel: [<c0126b75>] kthread+0x85/0x90
Nov 14 01:26:58 san2 kernel: [<c0126af0>] kthread+0x0/0x90
Nov 14 01:26:58 san2 kernel: [<c0100e69>] kernel_thread_helper+0x5/0xc
Nov 14 01:26:58 san2 kernel: Code: 90 8d 74 26 00 55 89 d5 57 89 c7 56 53 89 cb 9c 5e fa 8d 50 ff 0f a3 91 64 04 00 00 19 c0 85 c0 75 23 8b 89 60 04 00 00 8d 04 92 <83> 7c 81 04 01 74 19 89 d9 89 ea 89 f8 e8 08 ff ff ff 56 9d 5b

Thanks,

An interesting BUG report.

BTW, please send oops messages as plain text and not as .rtf.

It will take a few days to look into that issue. I need to upgrade my
development cluster from 2.6.13 to 2.6.14...

PS: It would be nice if you would post such issues to the mailing list
and not to my personal address [subscription required!]. This helps
a lot to find out if more people experience that bug, etc.

-Phil
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :