[DRBD-user] Xenserver 5.6 FP 1 DRBD crash

Alex Kuehne alex at movx.de
Mon Jan 10 16:07:12 CET 2011


Quoting Alex Kuehne <alex at movx.de>:

> Hi guys,
>
> This is another report of DRBD not working with Xenserver 5.6 FP1. I
> tried versions 8.3.9 and 8.3.8.1. With Xenserver 5.6 (without FP1), at
> least version 8.3.8.1 works.
>
> The crash occurred when promoting one side to primary and calling
> drbd-overview. The system immediately froze and became
> unresponsive.
>
> Another crash scenario is when I try to create an LVM storage on the
> DRBD device. The host where I type the commands crashes; the peer
> remains available, i.e. it does not reboot or similar.
>
> Here is what Xenserver wrote to the crash log before halting:
>
> <1>BUG: unable to handle kernel NULL pointer dereference at 00000004
>         <1>IP: [<c01b9abc>] bio_free+0x2c/0x50
>         <4>*pdpt = 000000020c6ac027 *pde = 0000000000000000
>         <0>Oops: 0000 [#1] SMP
>         <0>last sysfs file: /sys/block/drbd1/dev
>         <4>Modules linked in: sha1_generic drbd cn lockd sunrpc  
> bonding bridge stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4  
> xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables  
> binfmt_misc nls_utf8 isofs dm_mirror video output sbs sbshc fan  
> battery ac parport_pc lp parport nvram sr_mod cdrom sg container  
> e1000e pata_acpi evdev bnx2 button thermal processor thermal_sys  
> ata_piix ata_generic serio_raw 8250_pnp 8250 serial_core rtc_cmos  
> rtc_core rtc_lib tpm_tis tpm i5k_amb hwmon tpm_bios libata i2c_i801  
> i2c_core pcspkr dm_region_hash dm_log dm_mod ide_gd_mod megaraid_sas  
> sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore fbcon  
> font tileblit bitblit softcursor [last unloaded: microcode]
>         <4>
>         <4>Pid: 0, comm: swapper Not tainted  
> (2.6.32.12-0.7.1.xs5.6.100.307.170586xen #1) PRIMERGY RX200 S4
>         <4>EIP: 0061:[<c01b9abc>] EFLAGS: 00010246 CPU: 1
>         <4>EIP is at bio_free+0x2c/0x50
>         <4>EAX: ee96410c EBX: ee9640c0 ECX: cec03000 EDX: ee96410c
>         <4>ESI: 00000000 EDI: ee863d20 EBP: ee863cf0 ESP: ee863ce8
>         <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>         <0>Process swapper (pid: 0, ti=ee862000 task=ee83d3f0  
> task.ti=ee862000)
>         <0>Stack:
>         <4> 00000000 ee8d76a8 ee863cf8 c01b9aeb ee863d00 c01b8385  
> ee863d34 f0c8628b
>         <4><0> 0000002f 00000001 00000000 c16cc380 cfbf075c 00000014  
> ee863d6c c0145279
>         <4><0> f0c86200 ee8d76a8 00000000 ee863d40 c01b8290 ee9640c0  
> ee863d64 c020f12c
>         <0>Call Trace:
>         <4> [<c01b9aeb>] ? bio_fs_destructor+0xb/0x10
>         <4> [<c01b8385>] ? bio_put+0x25/0x30
>         <4> [<f0c8628b>] ? drbd_endio_pri+0x8b/0x130 [drbd]
>         <4> [<c0145279>] ? sched_clock_local+0xc9/0x1a0
>         <4> [<f0c86200>] ? drbd_endio_pri+0x0/0x130 [drbd]
>         <4> [<c01b8290>] ? bio_endio+0x20/0x40
>         <4> [<c020f12c>] ? req_bio_endio+0x5c/0xd0
>         <4> [<c020f22e>] ? blk_update_request+0x8e/0x390
>         <4> [<c020f546>] ? blk_update_bidi_request+0x16/0x60
>         <4> [<c0210056>] ? blk_end_bidi_request+0x26/0x70
>         <4> [<c02100b2>] ? blk_end_request+0x12/0x20
>         <4> [<f037c13c>] ? scsi_io_completion+0x9c/0x480 [scsi_mod]
>         <4> [<f037bcfc>] ? scsi_device_unbusy+0x8c/0xc0 [scsi_mod]
>         <4> [<f037564d>] ? scsi_finish_command+0x9d/0x100 [scsi_mod]
>         <4> [<f037919e>] ? scsi_decide_disposition+0x15e/0x170 [scsi_mod]
>         <4> [<f037c61d>] ? scsi_softirq_done+0xfd/0x130 [scsi_mod]
>         <4> [<c021576a>] ? trigger_softirq+0x8a/0xa0
>         <4> [<c02157e8>] ? blk_done_softirq+0x68/0x80
>         <4> [<c013117a>] ? __do_softirq+0xba/0x180
>         <4> [<c01591a7>] ? handle_IRQ_event+0x37/0x100
>         <4> [<c015c2c4>] ? move_native_irq+0x14/0x50
>         <4> [<c01312b5>] ? do_softirq+0x75/0x80
>         <4> [<c013159b>] ? irq_exit+0x2b/0x40
>         <4> [<c02987e7>] ? evtchn_do_upcall+0x1e7/0x330
>         <4> [<c012076f>] ? set_next_entity+0x1f/0x50
>         <4> [<c01046ef>] ? hypervisor_callback+0x43/0x4b
>         <4> [<c0106f35>] ? xen_safe_halt+0xb5/0x150
>         <4> [<c010ac4e>] ? xen_idle+0x1e/0x50
>         <4> [<c0102a7b>] ? cpu_idle+0x3b/0x60
>         <4> [<c037afdd>] ? cpu_bringup_and_idle+0xd/0x10
>         <0>Code: 89 e5 83 ec 08 89 1c 24 89 c3 89 74 24 04 89 d6 8b  
> 50 38 85 d2 74 14 8d 40 4c 39 c2 74 0d 8b 4b 10 89 f0 c1 e9 1c e8 a4  
> ff ff ff <2b> 5e 04 8b 56 08 89 d8 e8 97 99 fa ff 8b 1c 24 8b 74 24  
> 04 89
>         <0>EIP: [<c01b9abc>] bio_free+0x2c/0x50 SS:ESP 0069:ee863ce8
>         <0>CR2: 0000000000000004
>
> I hope this is only a minor bug, since it used to work with 5.6. I
> really intend to use DRBD with FP1, so any response is appreciated.
> If you need further info, just let me know.
>
> Best regards
> Alex Kuehne
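
For anyone reading the trace, here is a minimal Python sketch that pulls the key fields out of the oops quoted above. The input lines are copied from the crash log; the function name parse_oops and the dictionary keys are made up for illustration and are not part of any DRBD or kernel tooling.

```python
import re

# Lines copied from the crash log quoted above.
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c01b9abc>] bio_free+0x2c/0x50
ESI: 00000000 EDI: ee863d20 EBP: ee863cf0 ESP: ee863ce8
CR2: 0000000000000004
"""


def parse_oops(text):
    """Extract the faulting symbol, its offsets, and the fault address.

    Hypothetical helper for illustration only.
    """
    ip = re.search(
        r"IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text
    )
    cr2 = re.search(r"CR2: ([0-9a-f]+)", text)
    esi = re.search(r"ESI: ([0-9a-f]+)", text)
    return {
        "ip": int(ip.group(1), 16),      # faulting instruction address
        "symbol": ip.group(2),           # function containing that address
        "offset": int(ip.group(3), 16),  # bytes into the function
        "size": int(ip.group(4), 16),    # total size of the function
        "cr2": int(cr2.group(1), 16),    # address whose access faulted
        "esi": int(esi.group(1), 16),
    }


info = parse_oops(OOPS)
func_start = info["ip"] - info["offset"]
print(f"{info['symbol']} starts at {func_start:#x}, "
      f"fault is {info['offset']:#x} bytes in")
print(f"faulting address {info['cr2']:#x} = ESI ({info['esi']:#x}) + 4")
```

The marked bytes "<2b> 5e 04" in the Code: line appear to decode as "sub ebx, [esi+4]", which would be consistent with ESI being 0 and CR2 being 4: bio_free seems to read a field at offset 4 of a NULL pointer. This only narrows down where the fault happens, not why the pointer is NULL.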

Follow-up: I have now tried version 8.3.10rc1. When running "service
drbd start" on the Xenserver, the drbdadm command crashes with this
error message:

Jan 10 15:57:50 xs1 kernel: drbd: initialized. Version: 8.3.10rc1  
(api:88/proto:86-96)
Jan 10 15:57:50 xs1 kernel: drbd: GIT-hash:
1a1dfa9f736c091cf4a4b8f8042601f3bcd00c5e build by root@std11526-vm01,
2011-01-10 09:38:20
Jan 10 15:57:50 xs1 kernel: drbd: registered as block device major 147
Jan 10 15:57:50 xs1 kernel: drbd: minor_table @ 0xea4e30c0
Jan 10 15:57:50 xs1 kernel: drbdadm[11720]: segfault at 0 ip 08052690  
sp bffb4f60 error 4 in drbdadm[8048000+23000]
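
As a reading aid for the segfault line above, here is a small Python sketch that decodes its fields. It assumes the usual x86 page-fault error code layout (bit 0: protection fault rather than missing page, bit 1: write access, bit 2: user mode); the function name decode_segfault and the result keys are hypothetical.

```python
import re

# The segfault line from the log above, joined back into one line.
LINE = ("drbdadm[11720]: segfault at 0 ip 08052690 "
        "sp bffb4f60 error 4 in drbdadm[8048000+23000]")

PATTERN = re.compile(
    r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) "
    r"error ([0-9a-f]+) in ([^\[\s]+)\[([0-9a-f]+)\+([0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode a kernel segfault message (hypothetical helper)."""
    m = PATTERN.search(line)
    addr, ip, sp, err = (int(m.group(i), 16) for i in range(1, 5))
    base, size = int(m.group(6), 16), int(m.group(7), 16)
    return {
        "address": addr,                     # address that was accessed
        "object": m.group(5),                # mapped file containing ip
        "offset": ip - base,                 # ip relative to mapping start
        "in_mapping": base <= ip < base + size,
        "protection_fault": bool(err & 1),   # False: page not present
        "write": bool(err & 2),              # False: read access
        "user_mode": bool(err & 4),
    }


result = decode_segfault(LINE)
print(f"access to {result['address']:#x} from {result['object']}"
      f"+{result['offset']:#x}")
print("user-mode" if result["user_mode"] else "kernel-mode",
      "write" if result["write"] else "read")
```

Under these assumptions, "error 4" at address 0 is a user-mode read of an unmapped page, i.e. drbdadm reads through a NULL pointer. If the drbdadm binary was built with debug information, running addr2line -e on it with the ip value 0x08052690 may point to the source line.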

The DRBD resource does not get initialized at all. I am building
everything on the Xenserver 5.6 FP1 DDK as RPM packages.

BR,
Alex Kuehne


