[DRBD-user] drbd 8.4.6-5 oops when disconnect during upgrade

li songmin lisongmin9 at gmail.com
Wed Aug 9 16:21:33 CEST 2017


Hi,

When I upgrade drbd from 8.3.15 to 8.4.6-5, there is an oops caused by a NULL
pointer dereference.

The upgrade steps are as follows:

1. The primary node is running drbd 8.3.15 as normal.
2. Stop drbd 8.3.15 on the secondary node and upgrade it to 8.4.6-5.
3. Start the secondary node; data now begins to sync from the primary node.
4. Upgrade the primary node with the following steps:
      1. Stop the business service running on drbd.
      2. Disconnect drbd so that umount completes quickly.  <--  oops on the secondary node here?
      3. Umount the filesystem.
      4. Demote the node: primary -> secondary.
      5. Connect drbd and wait for the sync to complete.
      6. The business service may now start on the secondary node.
      7. Stop drbd 8.3.15 on the primary node and upgrade it to 8.4.6-5.
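
For readers who want to retrace steps 4.1 to 4.5 on the primary node, a minimal shell sketch might look like the following. The resource name `r0`, the mount point `/mnt/drbd`, the service name `myservice`, and the `run`/`demote_primary` helpers are all assumptions made for illustration; none of them come from the original report. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of steps 4.1-4.5 on the (old) primary node.
# RES, MNT and SVC are placeholder values, not taken from the report.
RES="${RES:-r0}"
MNT="${MNT:-/mnt/drbd}"
SVC="${SVC:-myservice}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop at the first failing step so later steps never run on a bad state.
demote_primary() {
    run service "$SVC" stop       || return 1   # 4.1 stop the business service
    run drbdadm disconnect "$RES" || return 1   # 4.2 disconnect from the peer
    run umount "$MNT"             || return 1   # 4.3 umount the filesystem
    run drbdadm secondary "$RES"  || return 1   # 4.4 primary -> secondary
    run drbdadm connect "$RES"    || return 1   # 4.5 reconnect to the peer
    # Sync progress can then be watched in /proc/drbd.
}

demote_primary
```

Running it with the defaults prints the five commands in order, which makes it easy to see that the disconnect in step 4.2 is issued before the umount.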

Kernel log and call trace:

<6>[66071016.839607] block drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
<3>[66071016.840029] drbd drbd0: meta connection shut down by peer.
<6>[66071016.840037] drbd drbd0: ack_receiver terminated
<6>[66071016.840040] drbd drbd0: Terminating drbd_a_drbd0
<1>[66071017.154996] BUG: unable to handle kernel NULL pointer dereference at           (null)
<1>[66071017.155013] IP: [<ffffffff8107b4e7>] __queue_work+0x17/0x3f0
<4>[66071017.155030] PGD 0
<0>[66071017.155037] Oops: 0000 [#1] SMP
<4>[66071017.155048] CPU 0
<4>[66071017.155051] Modules linked in: softdog drbd(FN) crc32c libcrc32c ib_ipoib ib_cm mlx4_en mlx4_ib ib_sa ib_mad ib_core mlx4_core mpt3sas mptctl mptbase lp parport_pc ppdev st parport ide_cd_mod ide_core joydev ipmi_devintf ipmi_si ipmi_msghandler tcp_diag inet_diag nls_utf8 dm_snapshot af_packet md5 binfmt_misc edd bonding cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse loop dm_mod cdc_ether igb usbnet shpchp tpm_tis dca tpm ipv6 pci_hotplug sr_mod ipv6_lib tpm_bios ptp mii sg pcspkr i2c_i801 rtc_cmos cdrom pps_core wmi button ext3 jbd mbcache usbhid hid ttm drm_kms_helper drm i2c_algo_bit sysimgblt sysfillrect i2c_core syscopyarea ehci_hcd usbcore sd_mod usb_common crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_hp_sw scsi_dh_emc scsi_dh_rdac scsi_dh ahci libahci libata mpt2sas scsi_transport_sas raid_class megaraid_sas scsi_mod [last unloaded: drbd]
<4>[66071017.155207] Supported: No, Unsupported modules are loaded
<4>[66071017.155214]
<4>[66071017.155221] Pid: 0, comm: swapper Tainted: GF   B      N  3.0.76-0.11-default #1 IBM System x3650 M4 : -[7915OSC]-/00Y8494
<4>[66071017.155234] RIP: 0010:[<ffffffff8107b4e7>]  [<ffffffff8107b4e7>] __queue_work+0x17/0x3f0
<4>[66071017.155246] RSP: 0018:ffff88047fc03cc0  EFLAGS: 00010086
<4>[66071017.155253] RAX: 0000000000000000 RBX: ffff880465032c00 RCX: 0000000000000000
<4>[66071017.155261] RDX: ffff8804294fb3e0 RSI: 0000000000000000 RDI: 0000000000000000
<4>[66071017.155269] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000004d9030553
<4>[66071017.155277] R10: 00000004d903059c R11: 0000000000000001 R12: ffff880465032800
<4>[66071017.155285] R13: ffff8804294fb3e0 R14: 0000000000000000 R15: 0000000000003800
<4>[66071017.155293] FS:  0000000000000000(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000
<4>[66071017.155302] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[66071017.155309] CR2: 0000000000000000 CR3: 0000000001a09000 CR4: 00000000000407f0
<4>[66071017.155317] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[66071017.155325] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[66071017.155333] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a11020)
<0>[66071017.155341] Stack:
<4>[66071017.155346]  ffff88046557b000 ffff880438b41248 ffff880465bfe000 ffff880465032c00
<4>[66071017.155363]  ffff880465032870 ffff880465032800 ffff88076b08e000 ffff8804294fb3c0
<4>[66071017.155376]  0000000000003800 ffffffff8107b906 ffff8804654e8e10 ffffffff8107b95a
<0>[66071017.155389] Call Trace:
<4>[66071017.155408]  [<ffffffff8107b906>] queue_work_on+0x16/0x30
<4>[66071017.155419]  [<ffffffff8107b95a>] queue_work+0x1a/0x20
<4>[66071017.155440]  [<ffffffffa06aa8cd>] drbd_endio_write_sec_final+0x25d/0x650 [drbd]
<4>[66071017.155488]  [<ffffffff8122404b>] blk_update_request+0x10b/0x440
<4>[66071017.155502]  [<ffffffff8122439f>] blk_update_bidi_request+0x1f/0x90
<4>[66071017.155513]  [<ffffffff81225517>] blk_end_bidi_request+0x27/0x80
<4>[66071017.155538]  [<ffffffffa000a5ea>] scsi_end_request+0x3a/0xb0 [scsi_mod]
<4>[66071017.155572]  [<ffffffffa000a9dc>] scsi_io_completion+0x10c/0x5b0 [scsi_mod]
<4>[66071017.155598]  [<ffffffff8122b8b5>] blk_done_softirq+0x75/0x90
<4>[66071017.155611]  [<ffffffff81066eaf>] __do_softirq+0xef/0x220
<4>[66071017.155627]  [<ffffffff814657dc>] call_softirq+0x1c/0x30
<4>[66071017.155642]  [<ffffffff81004445>] do_softirq+0x65/0xa0
<4>[66071017.155654]  [<ffffffff81066ca5>] irq_exit+0xc5/0xe0
<4>[66071017.155667]  [<ffffffff81465433>] call_function_single_interrupt+0x13/0x20
<4>[66071017.155683]  [<ffffffff812ba85e>] intel_idle+0x9e/0x130
<4>[66071017.155697]  [<ffffffff813769eb>] cpuidle_idle_call+0x11b/0x280
<4>[66071017.155710]  [<ffffffff81002126>] cpu_idle+0x66/0xb0
<4>[66071017.155722]  [<ffffffff81beeeff>] start_kernel+0x376/0x447
<4>[66071017.155736]  [<ffffffff81bee3c9>] x86_64_start_kernel+0x123/0x13d
<0>[66071017.155746] Code: 8f b5 00 7d d4 5b c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 89 fe 41 55 49 89 d5 41 54 55 48 89 f5 53 48 83 ec 18 <8b> 16 f6 c2 40 0f 85 64 02 00 00 f6 c2 02 0f 85 f5 01 00 00 41
<1>[66071017.155810] RIP  [<ffffffff8107b4e7>] __queue_work+0x17/0x3f0
<4>[66071017.155820]  RSP <ffff88047fc03cc0>
<0>[66071017.155826] CR2: 0000000000000000