[DRBD-user] Kernel BUG - Xen & DRBD setup

Shane Goulden shane at matrixau.net
Wed Jun 2 05:17:40 CEST 2010


I've submitted a bug report to CentOS about this (http://bugs.centos.org/view.php?id=4354), but figured I might as well post to this list too.

I'm getting random crashes with a Xen+DRBD setup. 

As per the bug report at bugs.centos.org: 

This has been happening for quite some time now (long before 2.6.18-194.3.1.el5xen). 

It happens with this combo: 
xen-3.0.3-94.el5_4.3 
drbd83-8.3.2-6.el5_3 
kmod-drbd83-xen-8.3.2-6.el5_3 

And it happened just now with this combo: 
xen-3.0.3-105.el5_5.2 
drbd83-8.3.7-1.el5.centos (from CentOS testing) 
kmod-drbd83-xen-8.3.7-2.el5.centos (from CentOS testing) 

I've run hardware diagnostics and they always come back fine. I don't understand the kernel panic, so maybe someone can help me out. Any idea at all as to what it could be? I have a vmcore crash dump from the last crash if more information is required.

This is the kernel bug message: 

BUG: unable to handle kernel paging request at virtual address e00ce5f0 
printing eip: 
c04ecc1e 
204df000 -> *pde = 00000001:0da5e001 
2105e000 -> *pme = 00000000:3e0f9067 
000f9000 -> *pte = 00000000:00000000 
Oops: 0000 [0000001] 
SMP 
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq 
Modules linked in: xt_physdev netloop netbk blktap blkbk ipt_MASQUERADE iptable_nat ip_nat bridge drbd(U) autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi ac parport_pc lp parport joydev sr_mod 8250_pnp sg serio_raw i5000_edac edac_mc 8250 pcspkr ide_cd serial_core cdrom bnx2 dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache usb_storage ata_piix libata mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd 
CPU: 5 
EIP: 0061:[<c04ecc1e>] Tainted: G VLI 
EFLAGS: 00010286 (2.6.18-194.3.1.el5xen 0000001) 
EIP is at csum_partial+0xca/0x120 
eax: 00000000 ebx: c04ecc1e ecx: 0000000b edx: 000005a8 
esi: e00ce618 edi: 000005a8 ebp: 00000034 esp: dbcc2b4c 
ds: 007b es: 007b ss: 0069 
Process drbd2_receiver (pid: 6635, ti=dbcc2000 task=ed7abaa0 task.ti=dbcc2000) 
Stack: e00ce000 00000034 c05bbf44 e00ce5f0 000005a8 00000000 00000010 ddbb4170 
00000000 00000020 000005dc eccdd114 c05bcc29 eccdd000 000005a8 ddbb4170 
ecd44acc ecd44ae0 dbcc2c5c c05c1139 d2d260b4 df257bf4 dbcc2c5c 00000003 
Call Trace: 
[<c05bbf44>] skb_checksum+0x111/0x27b 
[<c05bcc29>] pskb_expand_head+0xd6/0x11a 
[<c05c1139>] skb_checksum_help+0x64/0xb3 
[<ee5402ae>] ip_nat_fn+0x42/0x17a [iptable_nat] 
[<ee5405dd>] ip_nat_local_fn+0x34/0xa3 [iptable_nat] 
[<c05dee38>] dst_output+0x0/0x7 
[<c05d7480>] nf_iterate+0x30/0x61 
[<c05dee38>] dst_output+0x0/0x7 
[<c05d75a6>] nf_hook_slow+0x3a/0x90 
[<c05dee38>] dst_output+0x0/0x7 
[<c05e114f>] ip_queue_xmit+0x3bb/0x40c 
[<c05dee38>] dst_output+0x0/0x7 
[<c05c1da3>] dev_hard_start_xmit+0x1b4/0x25a 
[<c042530c>] local_bh_enable+0x5/0x81 
[<c05c3946>] dev_queue_xmit+0x329/0x357 
[<c05e19ad>] ip_output+0x22e/0x265 
[<c046e65c>] __kmalloc+0x7c/0x87 
[<c046e65c>] __kmalloc+0x7c/0x87 
[<c05eefba>] tcp_transmit_skb+0x5c7/0x5f5 
[<c05efd31>] tcp_retransmit_skb+0x4d5/0x5b7 
[<c05ee802>] tcp_may_send_now+0x3c/0x49 
[<c05effec>] tcp_xmit_retransmit_queue+0x1d9/0x257 
[<c05eb2dc>] tcp_ack+0x1573/0x16b5 
[<c05ee248>] tcp_rcv_established+0x6cd/0x7c5 
[<c05f3258>] tcp_v4_do_rcv+0x25/0x2b6 
[<c061e758>] _spin_lock_bh+0x8/0x18 
[<c05b9bb8>] release_sock+0x44/0x91 
[<c05ba3a7>] sk_wait_data+0x58/0x98 
[<c043128b>] autoremove_wake_function+0x0/0x2d 
[<c05e79d8>] tcp_recvmsg+0x3b6/0x9fa 
[<c05b968f>] sock_common_recvmsg+0x2f/0x45 
[<c05b72a1>] sock_recvmsg+0xee/0x141 
[<c043128b>] autoremove_wake_function+0x0/0x2d 
[<c04de451>] __make_request+0x319/0x348 
[<c0410000>] _speedstep_get+0x21/0x4b 
[<ee58142e>] drbd_recv+0x5a/0xdb [drbd] 
[<ee581714>] drbd_recv_header+0x10/0x81 [drbd] 
[<ee581ceb>] drbdd+0x1f/0xe6 [drbd] 
[<ee584970>] drbdd_init+0xaf/0xf2 [drbd] 
[<ee593df6>] drbd_thread_setup+0xfe/0x1a2 [drbd] 
[<ee593cf8>] drbd_thread_setup+0x0/0x1a2 [drbd] 
[<c0403005>] kernel_thread_helper+0x5/0xb 
======================= 
Code: 9c 13 46 a0 13 46 a4 13 46 a8 13 46 ac 13 46 b0 13 46 b4 13 46 b8 13 46 bc 13 46 c0 13 46 c4 13 46 c8 13 46 cc 13 46 d0 13 46 d4 <13> 46 d8 13 46 dc 13 46 e0 13 46 e4 13 46 e8 13 46 ec 13 46 f0 
EIP: [<c04ecc1e>] csum_partial+0xca/0x120 SS:ESP 0069:dbcc2b4c
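
For what it's worth, the faulting address can be cross-checked against the register dump. The bytes marked in the Code: line, <13> 46 d8, appear to decode as an x86 "adc" reading from [esi + disp8], with disp8 = 0xd8 (that is, -0x28 as a signed byte). A minimal Python sketch of that arithmetic follows; the function names are just illustrative, and the instruction decoding is my own reading of the bytes, not something the kernel printed:

```python
def sign_extend_8(value):
    """Interpret an 8-bit value as a signed two's-complement displacement."""
    return value - 0x100 if value & 0x80 else value


def effective_address(base, disp8):
    """Compute base + signed 8-bit displacement, wrapped to 32 bits."""
    return (base + sign_extend_8(disp8)) & 0xFFFFFFFF


# Values taken from the oops above: esi from the register dump,
# disp8 from the third byte of the marked instruction (13 46 d8).
esi = 0xE00CE618
disp8 = 0xD8

fault = effective_address(esi, disp8)
print(hex(fault))  # 0xe00ce5f0
```

That matches the "virtual address e00ce5f0" in the first line of the oops, so csum_partial seems to have been reading packet data at an address whose page table entry is empty (*pte = 00000000:00000000).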


Any ideas?

Regards,
Shane


