[DRBD-user] 9.0.1-1 crash on disconnect after digest mismatch

Jan Janicki jj at lp.pl
Tue Mar 15 17:59:44 CET 2016


Hello there,
I am not sure whether this has already been reported, but here goes:

This is a 3-node setup built following the manual, with all settings at 
their defaults except for two:

[GLOBAL]
storage-plugin = drbdmanage.storage.lvm.Lvm

common {
    net {
        verify-alg crc32c;
    }
}


My workload (KVM with guest disk cache=none) every now and then triggers 
the "Digest mismatch, buffer modified by upper layers during write:" error.
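To illustrate what this message describes, here is a minimal Python sketch 
(not DRBD code). It assumes the sender checksums the buffer once when the 
write is prepared and once more after the data has gone out, and that the 
guest changes the buffer in between. The function names are hypothetical, 
and zlib.crc32 (plain CRC-32) is only a stand-in for crc32c.

```python
import zlib


def checksum(buf: bytearray) -> int:
    # Stand-in digest: plain CRC-32 from zlib, NOT the crc32c used in the config.
    return zlib.crc32(buf) & 0xFFFFFFFF


def digest_still_matches(buf: bytearray, modify_in_flight: bool) -> bool:
    """Return True if the digest taken before sending still matches afterwards."""
    digest_before = checksum(buf)   # digest computed when the write is prepared
    if modify_in_flight:
        buf[0] ^= 0xFF              # simulated guest touching the page mid-write
    digest_after = checksum(buf)    # digest recomputed over the same buffer
    return digest_before == digest_after


if __name__ == "__main__":
    print(digest_still_matches(bytearray(b"x" * 4096), modify_in_flight=False))
    print(digest_still_matches(bytearray(b"x" * 4096), modify_in_flight=True))
```

With cache=none the guest's pages are handed to the block layer directly, 
so a guest that rewrites a page while it is still in flight can produce 
exactly this kind of before/after difference.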

According to the documentation, a digest mismatch should immediately cause 
a disconnect, reconnect and resync.
The first problem I observe: in my case the disconnect happens immediately 
only some of the time. At other times the message is repeated many times, 
and a couple of minutes can pass before the connection is actually dropped.

The second problem I observe: sometimes, when such a disconnect happens, 
the drbd thread on the primary node crashes (the receiver thread, 
drbd_r_vm-10030, in the trace below). After that the kernel keeps working 
for a moment, then all CPUs lock up and the node needs a power cycle.

Disabling verify-alg helps, but perhaps only because there are then no 
more disconnects to trigger the crash?

software versions:
---
[    0.000000] Linux version 4.2.8-1-pve (root at elsa) (gcc version 4.9.2 
(Debian 4.9.2-10) ) #1 SMP Fri Feb 26 16:37:36 CET 2016 ()
(...)
[  116.274693] drbd: initialized. Version: 9.0.1-1 (api:2/proto:86-111)
[  116.281869] drbd: GIT-hash: 3d38916489fac62b036d8e79d3fcd81d318ca4cb 
build by root at elsa, 2016-02-26 16:42:55
---

crash-relevant dmesg output:
---
[114257.391726] drbd vm-10030-disk-1/0 drbd104: Digest mismatch, buffer 
modified by upper layers during write: 3534072s +163840
[114257.408687] drbd vm-10030-disk-1 hn51: Connection closed
[114257.414970] drbd vm-10030-disk-1 hn51: conn( Disconnecting -> 
StandAlone )
[114257.422821] drbd vm-10030-disk-1 hn51: Terminating receiver thread
[114257.450955] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000068
[114257.459878] IP: [<ffffffffc04b0ffa>] _tl_restart+0xaa/0xf0 [drbd]
[114257.466834] PGD 0
[114257.469214] Oops: 0000 [#1] SMP
[114257.472968] Modules linked in: veth act_police cls_u32 sch_ingress 
sch_htb drbd_transport_tcp(O) drbd(O) ip6t_REJECT nf_reject_ipv6 
nf_conntrack_ipv6 nf_defrag_ipv6 xt_mac ipt_REJECT nf_reject_ipv4 
xt_physdev xt_comment xt_tcpudp xt_mark xt_addrtype ip_set_hash_net 
softdog iptable_filter nfsd auth_rpcgss nfs_acl nfs lockd grace fscache 
sunrpc 8021q garp mrp openvswitch libcrc32c bonding xt_set ip_set 
xt_multiport xt_conntrack ip6table_filter ip6_tables xt_nat iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
ip_tables x_tables nfnetlink_log nfnetlink intel_rapl 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm 
crct10dif_pclmul snd_pcm crc32_pclmul snd_timer aesni_intel snd 
aes_x86_64 soundcore lrw uas gf128mul glue_helper ablk_helper cdc_ether 
cryptd pcspkr sb_edac
[114257.559773]  usbnet usb_storage mii edac_core lpc_ich 8250_fintek 
shpchp mac_hid ioatdma wmi ipmi_ssif vhost_net vhost macvtap macvlan 
ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler autofs4 igb 
i2c_algo_bit dca ahci ptp libahci pps_core megaraid_sas
[114257.583826] CPU: 5 PID: 10103 Comm: drbd_r_vm-10030 Tainted: 
G           O    4.2.8-1-pve #1
[114257.593406] Hardware name: IBM System x3650 M4 -[7915E3G]-/00W2665, 
BIOS -[VVE146AUS-2.00]- 09/17/2015
[114257.603957] task: ffff881003872940 ti: ffff88201b2ec000 task.ti: 
ffff88201b2ec000
[114257.612467] RIP: 0010:[<ffffffffc04b0ffa>] [<ffffffffc04b0ffa>] 
_tl_restart+0xaa/0xf0 [drbd]
[114257.622159] RSP: 0018:ffff88201b2efcd8  EFLAGS: 00010082
[114257.628219] RAX: ffff88103573d1e0 RBX: ffff881fc0d56db0 RCX: 
0000000000000000
[114257.636342] RDX: ffff881fc0d56e08 RSI: 0000000000000092 RDI: 
0000000000000092
[114257.644464] RBP: ffff88201b2efd38 R08: 0000000000000000 R09: 
0000000180190014
[114257.652587] R10: ffff88103fb60720 R11: ffff8810010f7200 R12: 
ffff882035dbd930
[114257.660709] R13: ffff881035e22000 R14: 0000000000000000 R15: 
ffff881fc0d56db0
[114257.668832] FS:  0000000000000000(0000) GS:ffff88103fb40000(0000) 
knlGS:0000000000000000
[114257.678022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[114257.684567] CR2: 0000000000000068 CR3: 0000000001e0d000 CR4: 
00000000000426e0
[114257.692689] Stack:
[114257.695055]  ffff881035e22048 0000000b00000286 ffff88201b2efd18 
ffff880f7a01df00
[114257.703525]  0000000000000000 0000000016746d9c ffff882035dbdbb0 
ffff882035dbd918
[114257.711992]  ffff881035e22000 000000000000000b ffff882034cc9800 
ffff882034cc9800
[114257.720456] Call Trace:
[114257.723318]  [<ffffffffc04b1082>] tl_restart+0x42/0x60 [drbd]
[114257.729860]  [<ffffffffc04b10b3>] tl_clear+0x13/0x20 [drbd]
[114257.736215]  [<ffffffffc04a3681>] conn_disconnect+0x281/0x830 [drbd]
[114257.743447]  [<ffffffffc04d30b6>] ? change_cstate+0x86/0xc0 [drbd]
[114257.750483]  [<ffffffffc0499590>] ? got_IsInSync+0x300/0x300 [drbd]
[114257.757617]  [<ffffffffc04a4887>] drbd_receiver+0x177/0x5e0 [drbd]
[114257.764654]  [<ffffffffc04af390>] ? w_complete+0x20/0x20 [drbd]
[114257.771397]  [<ffffffffc04af3f4>] drbd_thread_setup+0x64/0x120 [drbd]
[114257.778723]  [<ffffffffc04af390>] ? w_complete+0x20/0x20 [drbd]
[114257.785466]  [<ffffffff8109b1fa>] kthread+0xea/0x100
[114257.791140]  [<ffffffff8109b110>] ? kthread_create_on_node+0x1f0/0x1f0
[114257.798562]  [<ffffffff8180af1f>] ret_from_fork+0x3f/0x70
[114257.804718]  [<ffffffff8109b110>] ? kthread_create_on_node+0x1f0/0x1f0
[114257.812139] Code: 75 b8 4c 89 f7 e8 87 5f ff ff 48 8b 43 58 48 8d 53 
58 49 89 df 48 83 e8 58 49 39 d4 74 29 48 89 c3 49 8b 45 48 4d 8b 37 48 
85 c0 <41> 8b 76 68 74 a8 89 f2 30 d2 3b 10 75 a0 40 0f b6 f6 48 8d 04
[114257.834267] RIP  [<ffffffffc04b0ffa>] _tl_restart+0xaa/0xf0 [drbd]
[114257.841312]  RSP <ffff88201b2efcd8>
[114257.845331] CR2: 0000000000000068
[114257.849575] ---[ end trace 7c993d7d40ff47ee ]---
[114271.130266] ------------[ cut here ]------------
---
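
As a reading aid for the trace above, the following Python sketch 
cross-checks the faulting instruction against the register dump. The 
excerpt is copied from the oops with the mail line-wrapping undone. The 
helper names are hypothetical, and the decoder deliberately handles only 
the one x86-64 encoding seen here (REX-prefixed MOV r32, [base+disp8]).

```python
import re

OOPS_EXCERPT = """\
BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
R13: ffff881035e22000 R14: 0000000000000000 R15: ffff881fc0d56db0
CR2: 0000000000000068 CR3: 0000000001e0d000 CR4: 00000000000426e0
Code: 48 85 c0 <41> 8b 76 68 74 a8
"""

REG_NAMES = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"] + [
    f"R{n}" for n in range(8, 16)
]


def parse_registers(text):
    """Collect 'NAME: 16-hex-digit value' pairs such as R14 or CR2."""
    pairs = re.findall(r"\b(C?R\w{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text):
    """Return the bytes starting at the <xx> marker in the Code: line."""
    match = re.search(r"<([0-9a-f]{2})>((?: [0-9a-f]{2})*)", text)
    return bytes.fromhex(match.group(1) + match.group(2).replace(" ", ""))


def decode_mov_disp8(code):
    """Decode REX + 8B /r with mod=01 (no SIB) into (base register, disp8)."""
    rex, opcode, modrm, disp = code[0], code[1], code[2], code[3]
    if (rex & 0xF0) != 0x40 or opcode != 0x8B or (modrm >> 6) != 0b01:
        raise ValueError("not a REX-prefixed MOV r32, [base+disp8]")
    if (modrm & 0b111) == 0b100:
        raise ValueError("SIB addressing is not handled in this sketch")
    base = (modrm & 0b111) | ((rex & 0b1) << 3)   # REX.B extends the base register
    if disp >= 0x80:                              # disp8 is sign-extended
        disp -= 0x100
    return REG_NAMES[base], disp


if __name__ == "__main__":
    regs = parse_registers(OOPS_EXCERPT)
    base, disp = decode_mov_disp8(faulting_bytes(OOPS_EXCERPT))
    print(base, hex(disp), hex(regs[base] + disp), hex(regs["CR2"]))
```

The marked bytes decode to a 32-bit load from R14 + 0x68. With R14 equal 
to zero this gives the fault address 0x68 reported in CR2, which is 
consistent with _tl_restart reading a field at offset 0x68 through a NULL 
pointer. Which structure that pointer belongs to cannot be told from the 
dump alone.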

--
   Jan Janicki



