[DRBD-user] CentOS 7.3 3.10.0-514.10.2.el7.x86_64 w. DRBD v8.4.9-2: PANIC: ".1BUG: unable to handle kernel NULL pointer dereference at 0000000000000014"

Lars Ellenberg lars.ellenberg at linbit.com
Tue Apr 18 15:32:07 CEST 2017


On Tue, Apr 18, 2017 at 10:52:57AM +1000, Adi Pircalabu wrote:
> Hi, initially submitted here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1442593
> The node that crashed was at the time the active member of an active/passive
> Pacemaker cluster, using DRBD backed replicated storage for iSCSI and NFS
> resources.
> The Red Hat developer closed the bug because drbd is loaded out-of-tree. The
> module is built using the source from http://git.drbd.org/drbd-8.4.git/
> Even though it may or may not be related to DRBD, I thought it's worth
> having your opinion on this.

I don't see how the presence of DRBD would make
the apic_timer_interrupt deref some bad pointer,
while the cpu is "idle",
with such a "boring" backtrace,
and for you only.

But yes, I know, just having DRBD around
makes it responsible for everything, I'm used to that.

I mean, sure, in theory, it would be possible, somehow...
but I see no indication of that in the data provided.
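
A rough way to check this mechanically is sketched below. It relies on
the fact that the kernel prints a trailing "[module]" after any stack
frame whose symbol lives in a loadable module. The sample text is a
shortened excerpt of the oops in this thread with the timestamps
dropped; the helper names (frames, module_frames) are made up for
illustration. An empty result only says that no drbd code was on this
stack. It cannot rule out earlier memory corruption from elsewhere.

```python
import re

# Shortened excerpt of the oops in this thread, timestamps dropped.
OOPS = """\
Modules linked in: binfmt_misc vfat fat drbd(OE) mpt3sas
RIP: 0010:[<ffffffff810c8375>]  [<ffffffff810c8375>] account_system_time+0x15/0x170
 [<ffffffff810c8682>] account_process_tick+0x62/0x170
 [<ffffffff810f3180>] ? tick_sched_handle.isra.13+0x60/0x60
 [<ffffffff8109932d>] update_process_times+0x2d/0x80
 [<ffffffff8169775d>] apic_timer_interrupt+0x6d/0x80
 [<ffffffff81514492>] ? cpuidle_enter_state+0x52/0xc0
 [<ffffffff815145d9>] cpuidle_idle_call+0xd9/0x210
"""

# A frame looks like "[<addr>] [? ]symbol+0xoff/0xsize [module]".
# The "? " prefix marks a frame the unwinder is not sure about, and the
# trailing "[module]" only appears for symbols that belong to a module.
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def frames(text):
    """Return (symbol, module_or_None, reliable) for every frame found."""
    return [
        (m.group("sym"), m.group("mod"), m.group("unsure") is None)
        for m in FRAME.finditer(text)
    ]


def module_frames(text, module):
    """Return the symbols of all frames attributed to the given module."""
    return [sym for sym, mod, _ in frames(text) if mod == module]


if __name__ == "__main__":
    for sym, mod, reliable in frames(OOPS):
        print(sym, mod or "(core kernel)", "" if reliable else "(unreliable)")
    print("drbd frames:", module_frames(OOPS, "drbd"))
```

On the excerpt above, every frame is attributed to the core kernel and
the list of drbd frames is empty.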

> More details below.

[524641.634488] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[609611.321388] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[699614.201392] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[714013.791543] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[717613.869888] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[721213.962027] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[768015.053756] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[775215.979066] UDP: bad checksum. From 10.168.1.206:137 to 10.168.1.255:137 ulen 58
[793292.358213] .1BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
[793292.358710] IP: [<ffffffff810c8375>] account_system_time+0x15/0x170
[793292.358966] PGD 0
[793292.359202] Oops: 0000 [#1] SMP
[793292.359444] Modules linked in: binfmt_misc vfat fat drbd(OE) mpt3sas
	mpt2sas raid_class scsi_transport_sas mptctl mptbase dell_rbu bonding
	ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack
	nf_conntrack iptable_filter dm_cache_smq dm_cache dm_persistent_data
	dm_bio_prison dm_bufio intel_powerclamp coretemp intel_rapl iosf_mbi kvm
	irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
	ablk_helper cryptd iTCO_wdt iTCO_vendor_support dcdbas pcspkr mxm_wmi sg
	sb_edac edac_core ipmi_devintf ipmi_si ipmi_msghandler lpc_ich mei_me mei shpchp
	acpi_power_meter wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc tcp_htcp
	ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul
	crct10dif_common crc32c_intel drm_kms_helper syscopyarea sysfillrect
[793292.362164]  sysimgblt fb_sys_fops ttm ixgbe drm ahci uas igb libahci mdio
	i2c_algo_bit usb_storage ptp libata pps_core i2c_core megaraid_sas dca fjes
	dm_mirror dm_region_hash dm_log dm_mod
[793292.362978] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           OE  ------------   3.10.0-514.10.2.el7.x86_64 #1
[793292.363448] Hardware name: Dell Inc. PowerEdge R730xd/072T6D, BIOS 2.3.4 11/08/2016
[793292.363910] task: ffff8804fa559f60 ti: ffff8804fa680000 task.ti: ffff8804fa680000
[793292.364377] RIP: 0010:[<ffffffff810c8375>]  [<ffffffff810c8375>] account_system_time+0x15/0x170
[793292.364850] RSP: 0018:ffff88086de43e00  EFLAGS: 00010086
[793292.365088] RAX: 0000000000000000 RBX: ffff88086de56c40 RCX: 00000000000f4240
[793292.365550] RDX: 00000000000f4240 RSI: 0000000000010000 RDI: 0000000000000000
[793292.366012] RBP: ffff88086de43e28 R08: 0000000000000000 R09: 00000000000c1af5
[793292.366470] R10: 000000003b9aca00 R11: 0000000000000000 R12: 00000000000f4240
[793292.367018] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88086de4f9d8
[793292.367473] FS:  0000000000000000(0000) GS:ffff88086de40000(0000) knlGS:0000000000000000
[793292.367935] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[793292.368179] CR2: 0000000000000014 CR3: 00000000019ba000 CR4: 00000000003407e0
[793292.368640] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[793292.369106] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[793292.369571] Stack:
[793292.369801]  ffff88086de56c40 0000000000016c40 0000000000000000 0000000000000000
[793292.370282]  ffff88086de4f9d8 ffff88086de43e60 ffffffff810c8682 0000000000000000
[793292.370764]  0000000000000000 0000000000000003 ffffffff810f3180 ffff88086de4f9d8
[793292.371250] Call Trace:
[793292.371484]  <IRQ>
[793292.371494]
[793292.371730]  [<ffffffff810c8682>] account_process_tick+0x62/0x170
[793292.371973]  [<ffffffff810f3180>] ? tick_sched_handle.isra.13+0x60/0x60
[793292.372218]  [<ffffffff8109932d>] update_process_times+0x2d/0x80
[793292.372465]  [<ffffffff810f3145>] tick_sched_handle.isra.13+0x25/0x60
[793292.372712]  [<ffffffff810f31c1>] tick_sched_timer+0x41/0x70
[793292.372957]  [<ffffffff810b4a32>] __hrtimer_run_queues+0xd2/0x260
[793292.373197]  [<ffffffff810b4fd0>] hrtimer_interrupt+0xb0/0x1e0
[793292.373445]  [<ffffffff81050fd7>] local_apic_timer_interrupt+0x37/0x60
[793292.373692]  [<ffffffff8169920f>] smp_apic_timer_interrupt+0x3f/0x60
[793292.373935]  [<ffffffff8169775d>] apic_timer_interrupt+0x6d/0x80
[793292.374178]  <EOI>
[793292.374187]
[793292.374423]  [<ffffffff81514492>] ? cpuidle_enter_state+0x52/0xc0
[793292.374664]  [<ffffffff815145d9>] cpuidle_idle_call+0xd9/0x210
[793292.374908]  [<ffffffff810350ee>] arch_cpu_idle+0xe/0x30
[793292.375154]  [<ffffffff810e7e65>] cpu_startup_entry+0x245/0x290
[793292.375398]  [<ffffffff8104f07a>] start_secondary+0x1ba/0x230
[793292.375640] Code: e8 81 63 07 00 5b 41 5c 41 5d 41 5e 5d c3 0f 1f 84 00 00
	00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 d4 53
	<f6> 47 14 10 48 89 fb 74 1c 65 48 8b 04 25 b8 cd 00 00 8b 80 44
[793292.376666] RIP  [<ffffffff810c8375>] account_system_time+0x15/0x170
[793292.376917]  RSP <ffff88086de43e00>
[793292.377158] CR2: 0000000000000014
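
For what it is worth, the faulting address can be reproduced from the
dump itself. The sketch below hand-decodes the four bytes at the
faulting RIP (the "<f6>" marker in the Code: line) and adds the
displacement to RDI from the register dump. The decoder is a made-up
helper that handles only this one narrow instruction form (F6 /0 ib,
i.e. "test imm8" on a byte in memory, with an 8-bit displacement, no
REX prefix and no SIB byte). It is not a general disassembler.

```python
# Bytes at the faulting RIP, copied from the "Code:" line of the oops.
code = bytes.fromhex("f6 47 14 10")
regs = {"rdi": 0x0}  # RDI from the register dump
cr2 = 0x14           # faulting address reported in CR2

# ModRM r/m field to register name, valid only without a REX prefix.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_test_imm8(b):
    """Decode 'F6 /0 ib' with a disp8 memory operand; reject anything else."""
    opcode, modrm, disp8, imm8 = b[0], b[1], b[2], b[3]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 1 means "base register + 8-bit displacement";
    # rm == 4 would mean a SIB byte follows, which this sketch skips.
    if opcode != 0xF6 or reg != 0 or mod != 1 or rm == 4:
        raise ValueError("not the narrow form this sketch handles")
    disp = disp8 - 256 if disp8 >= 0x80 else disp8  # sign-extend disp8
    return REG64[rm], disp, imm8


base, disp, imm = decode_test_imm8(code)
addr = regs[base] + disp

if __name__ == "__main__":
    print(f"testb ${imm:#x}, {disp:#x}(%{base})")
    print(f"effective address {addr:#x}, CR2 {cr2:#x}")
```

The decoded operand is 0x14(%rdi), and with RDI being zero the
effective address matches CR2. That is consistent with a NULL pointer
arriving in RDI, which on x86_64 carries the first function argument,
presumably the task pointer handed to account_system_time. Where that
NULL came from is not visible in this data.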



> modinfo drbd
> filename:       /lib/modules/3.10.0-514.16.1.el7.x86_64/weak-updates/drbd84/drbd.ko
> alias:          block-major-147-*
> license:        GPL
> version:        8.4.9-2
> description:    drbd - Distributed Replicated Block Device v8.4.9-2
> author:         Philipp Reisner <phil at linbit.com>, Lars Ellenberg <lars at linbit.com>
> rhelversion:    7.3
> srcversion:     F9BD1E13798E45BA4CA1B92
> depends:        libcrc32c
> vermagic:       3.10.0-514.10.2.el7.x86_64 SMP mod_unload modversions
> parm:           minor_count:Approximate number of drbd devices (1-255) (uint)
> parm:           disable_sendpage:bool
> parm:           allow_oos:DONT USE! (bool)
> parm:           proc_details:int
> parm:           enable_faults:int
> parm:           fault_rate:int
> parm:           fault_count:int
> parm:           fault_devs:int
> parm:           usermode_helper:string
> 
>       KERNEL: /data/adi/crash/3.10.0-514.10.2.el7.x86_64/vmlinux
>     DUMPFILE: /data/adi/crash/127.0.0.1-2017-04-15-02:53:06/vmcore [PARTIAL DUMP]
>         CPUS: 12
>         DATE: Sat Apr 15 02:52:55 2017
>       UPTIME: 9 days, 04:22:12
> LOAD AVERAGE: 0.08, 0.08, 0.05
>        TASKS: 416
>     NODENAME: xd3
>      RELEASE: 3.10.0-514.10.2.el7.x86_64
>      VERSION: #1 SMP Fri Mar 3 00:04:05 UTC 2017
>      MACHINE: x86_64 (1700 Mhz)
>       MEMORY: 31.9 GB
>        PANIC: ".1BUG: unable to handle kernel NULL pointer dereference at 0000000000000014"
>          PID: 0
>      COMMAND: "swapper/3"
>         TASK: ffff8804fa559f60 (1 of 12) [THREAD_INFO: ffff8804fa680000]
>          CPU: 3
>        STATE: TASK_RUNNING (PANIC)
> 
> Backtrace and other information extracted after the crash attached. Please
> let me know what other information I should provide.
> Thanks,
> 
> -- 
> Adi Pircalabu



-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
