Hi all,

I am trying to set up a KVM cluster with CentOS 6.0, corosync/pacemaker and dual-primary drbd. Whenever I restart the corosync process or reboot one of the machines, I get a kernel panic and one (or even both) machines die.

I tried all the tips I found in mailing lists or bugtrackers, like loading the drbd module with disable_sendpage=1 or disabling checksumming and generic segmentation offload via ethtool. The same happens with the drbd83 and drbd84 packages from elrepo and with a self-compiled drbd84 from linbit sources.

/etc/drbd.conf:

global {
    dialog-refresh 1;
    minor-count 5;
    usage-count no;
}
common {
}
resource r0 {
    protocol C;
    disk {
        on-io-error pass_on;
    }
    syncer {
        rate 100M;
    }
    net {
        allow-two-primaries yes;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    startup {
        wfc-timeout 10;
        become-primary-on both;
    }
    on proxy03 {
        device /dev/drbd0;
        address 10.10.10.27:7788;
        meta-disk internal;
        disk /dev/sysvg/kvm;
    }
    on proxy04 {
        device /dev/drbd0;
        address 10.10.10.28:7788;
        meta-disk internal;
        disk /dev/sysvg/kvm;
    }
}

Last messages from /var/log/messages:

Aug 24 10:43:55 proxy03 kernel: d-con r0: Handshake successful: Agreed network protocol version 100
Aug 24 10:43:55 proxy03 kernel: d-con r0: conn( WFConnection -> WFReportParams )
Aug 24 10:43:55 proxy03 kernel: d-con r0: Starting asender thread (from drbd_r_r0 [19247])
Aug 24 10:43:55 proxy03 kernel: block drbd0: drbd_sync_handshake:
Aug 24 10:43:55 proxy03 kernel: block drbd0: self 52406041848E78A3:F32F8530A9B9C955:66C1B63DDC072892:66C0B63DDC072893 bits:0 flags:0
Aug 24 10:43:55 proxy03 kernel: block drbd0: peer F32F8530A9B9C954:0000000000000000:66C1B63DDC072893:66C0B63DDC072893 bits:0 flags:0
Aug 24 10:43:55 proxy03 kernel: block drbd0: uuid_compare()=1 by rule 70
Aug 24 10:43:55 proxy03 kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Aug 24 10:43:55 proxy03 kernel: block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 99.9%
Aug 24 10:43:55 proxy03 kernel: block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 99.9%
Aug 24 10:43:55 proxy03 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
Aug 24 10:43:55 proxy03 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
Aug 24 10:43:55 proxy03 kernel: IP: [<ffffffff813fda60>] sock_ioctl+0x30/0x280
Aug 24 10:43:55 proxy03 kernel: PGD 242b39067 PUD 2422a0067 PMD 0
Aug 24 10:43:55 proxy03 kernel: Oops: 0000 [#1] SMP

Message from syslogd at proxy03 at Aug 24 10:43:55 ...
 kernel:Oops: 0000 [#1] SMP

Aug 24 10:43:55 proxy03 kernel: last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map

Message from syslogd at proxy03 at Aug 24 10:43:55 ...
 kernel:last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map

Aug 24 10:43:55 proxy03 kernel: CPU 3
Aug 24 10:43:55 proxy03 kernel: Modules linked in: sctp gfs2 dlm configfs drbd(U) libcrc32c sunrpc cpufreq_ondemand acpi_cpufreq freq_table bonding ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support sg shpchp ioatdma dca i7core_edac edac_core bnx2 ext3 jbd mbcache sd_mod crc_t10dif megaraid_sas ata_generic pata_acpi ata_piix dm_mod [last unloaded: microcode]
Aug 24 10:43:55 proxy03 kernel:
Aug 24 10:43:55 proxy03 kernel: Modules linked in: sctp gfs2 dlm configfs drbd(U) libcrc32c sunrpc cpufreq_ondemand acpi_cpufreq freq_table bonding ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support sg shpchp ioatdma dca i7core_edac edac_core bnx2 ext3 jbd mbcache sd_mod crc_t10dif megaraid_sas ata_generic pata_acpi ata_piix dm_mod [last unloaded: microcode]
Aug 24 10:43:55 proxy03 kernel: Pid: 20331, comm: drbdadm Not tainted 2.6.32-71.29.1.el6.x86_64 #1 System x3550 M3 -[7944KBG]-
Aug 24 10:43:55 proxy03 kernel: RIP: 0010:[<ffffffff813fda60>] [<ffffffff813fda60>] sock_ioctl+0x30/0x280
Aug 24 10:43:55 proxy03 kernel: RSP: 0018:ffff880242949e38 EFLAGS: 00010282
Aug 24 10:43:55 proxy03 kernel: RAX: 0000000000000000 RBX: 0000000000005401 RCX: 00007fff34be3c40
Aug 24 10:43:55 proxy03 kernel: RDX: 00007fff34be3c40 RSI: 0000000000005401 RDI: ffff880242b0b840
Aug 24 10:43:55 proxy03 kernel: RBP: ffff880242949e58 R08: ffffffff81536380 R09: 000000316920e930
Aug 24 10:43:55 proxy03 kernel: R10: 00007fff34be3a50 R11: 0000000000000202 R12: 00007fff34be3c40
Aug 24 10:43:55 proxy03 kernel: R13: 00007fff34be3c40 R14: ffff880252493140 R15: 0000000000000000
Aug 24 10:43:55 proxy03 kernel: FS: 00007fe14fe14700(0000) GS:ffff88002f660000(0000) knlGS:0000000000000000
Aug 24 10:43:55 proxy03 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 24 10:43:55 proxy03 kernel: CR2: 0000000000000038 CR3: 0000000242196000 CR4: 00000000000006e0
Aug 24 10:43:55 proxy03 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 24 10:43:55 proxy03 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 24 10:43:55 proxy03 kernel: Process drbdadm (pid: 20331, threadinfo ffff880242948000, task ffff8802714c34e0)
Aug 24 10:43:55 proxy03 kernel: Stack:

Message from syslogd at proxy03 at Aug 24 10:43:55 ...
 kernel:Stack:

Aug 24 10:43:55 proxy03 kernel: ffff880242b0b840 ffff880252493188 00007fff34be3c40 0000000000000000
Aug 24 10:43:55 proxy03 kernel: <0> ffff880242949e98 ffffffff8117fdf2 ffff880242949eb8 0000000000000001
Aug 24 10:43:55 proxy03 kernel: <0> 0000000000402340 0000003169ad9050 ffff8802429db080 ffff880242b0b840
Aug 24 10:43:55 proxy03 kernel: Call Trace:

Message from syslogd at proxy03 at Aug 24 10:43:55 ...
 kernel:Call Trace:

Aug 24 10:43:55 proxy03 kernel: [<ffffffff8117fdf2>] vfs_ioctl+0x22/0xa0
Aug 24 10:43:55 proxy03 kernel: [<ffffffff8117ff94>] do_vfs_ioctl+0x84/0x580
Aug 24 10:43:55 proxy03 kernel: [<ffffffff8113676d>] ? handle_mm_fault+0x1ed/0x2b0
Aug 24 10:43:55 proxy03 kernel: [<ffffffff81180511>] sys_ioctl+0x81/0xa0
Aug 24 10:43:55 proxy03 kernel: [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
Aug 24 10:43:55 proxy03 kernel: Code: 83 ec 20 48 89 1c 24 4c 89 64 24 08 4c 89 6c 24 10 4c 89 74 24 18 0f 1f 44 00 00 4c 8b b7 a0 00 00 00 89 f3 49 89 d4 49 8b 46 38 <4c> 8b 68 38 8d 83 10 76 ff ff 83 f8 0f 76 51 8d 83 00 75 ff ff

Message from syslogd at proxy03 at Aug 24 10:43:55 ...
 kernel:Code: 83 ec 20 48 89 1c 24 4c 89 64 24 08 4c 89 6c 24 10 4c 89 74 24 18 0f 1f 44 00 00 4c 8b b7 a0 00 00 00 89 f3 49 89 d4 49 8b 46 38 <4c> 8b 68 38 8d 83 10 76 ff ff 83 f8 0f 76 51 8d 83 00 75 ff ff

Aug 24 10:43:55 proxy03 kernel: RIP [<ffffffff813fda60>] sock_ioctl+0x30/0x280
Aug 24 10:43:55 proxy03 kernel: RSP <ffff880242949e38>
Aug 24 10:43:55 proxy03 kernel: CR2: 0000000000000038

Message from syslogd at proxy03 at Aug 24 10:43:55 ...
 kernel:CR2: 0000000000000038

Aug 24 10:43:55 proxy03 kernel: ---[ end trace 2a8c21ee3fd5b98d ]---
Aug 24 10:43:55 proxy03 kernel: Kernel panic - not syncing: Fatal exception

Message from syslogd at proxy03 at Aug 24 10:43:55 ...
 kernel:Kernel panic - not syncing: Fatal exception

Aug 24 10:43:55 proxy03 kernel: Pid: 20331, comm: drbdadm Tainted: G D ---------------- 2.6.32-71.29.1.el6.x86_64 #1
Aug 24 10:43:55 proxy03 kernel: Call Trace:
Aug 24 10:43:55 proxy03 kernel: [<ffffffff814c8b54>] panic+0x78/0x137
Aug 24 10:43:55 proxy03 kernel: [<ffffffff814ccc24>] oops_end+0xe4/0x100
Aug 24 10:43:55 proxy03 kernel: [<ffffffff8104656b>] no_context+0xfb/0x260
Aug 24 10:43:55 proxy03 kernel: [<ffffffff810467f5>] __bad_area_nosemaphore+0x125/0x1e0

Any ideas? Is more information needed?

Regards,
Peter