<div dir="ltr"><div><div><div>Hi,<br><br></div>I am experimenting with DRBD dual-primary with OCFS2, along with DRBD clients,<br></div>in the hope that every node can access the storage in a unified way. However, I got a<br></div><div>kernel call trace and a huge number of ASSERTION failures (*before* OCFS2 was mounted):<br><br></div><div>----&lt;paste begins&gt;----<br>[11160.192091] INFO: task drbdsetup:19442 blocked for more than 120 seconds.<br>[11160.192096]       Tainted: G           OE   4.1.12-37.2.2.el7uek.x86_64 #2<br>[11160.192097] &quot;echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs&quot; disables this message.<br>[11160.192099] drbdsetup       D ffff88013fd17840     0 19442      1 0x00000084<br>[11160.192108]  ffff8800addef8c8 0000000000000082 ffff88013a3d3800 ffff8800369eb800<br>[11160.192111]  ffff8800addef938 ffff8800addf0000 ffff8800adb192c0 7fffffffffffffff<br>[11160.192113]  ffff8800369eb800 0000000000000297 ffff8800addef8e8 ffffffff81712947<br>[11160.192116] Call Trace:<br>[11160.192128]  [&lt;ffffffff81712947&gt;] schedule+0x37/0x90<br>[11160.192131]  [&lt;ffffffff8171596c&gt;] schedule_timeout+0x20c/0x280<br>[11160.192134]  [&lt;ffffffff817158b6&gt;] ? schedule_timeout+0x156/0x280<br>[11160.192148]  [&lt;ffffffffa05c2695&gt;] ? drbd_destroy_path+0x15/0x20 [drbd]<br>[11160.192152]  [&lt;ffffffff817134b4&gt;] wait_for_completion+0x134/0x190<br>[11160.192157]  [&lt;ffffffff810b1d90&gt;] ? wake_up_state+0x20/0x20<br>[11160.192165]  [&lt;ffffffffa05c4d51&gt;] _drbd_thread_stop+0xc1/0x110 [drbd]<br>[11160.192173]  [&lt;ffffffffa05dd84c&gt;] del_connection+0x3c/0x140 [drbd]<br>[11160.192179]  [&lt;ffffffffa05e0bd3&gt;] drbd_adm_down+0xc3/0x2c0 [drbd]<br>[11160.192184]  [&lt;ffffffff8162886d&gt;] genl_family_rcv_msg+0x1cd/0x400<br>[11160.192186]  [&lt;ffffffff81628aa0&gt;] ? 
genl_family_rcv_msg+0x400/0x400<br>[11160.192188]  [&lt;ffffffff81628b31&gt;] genl_rcv_msg+0x91/0xd0<br>[11160.192190]  [&lt;ffffffff81627901&gt;] netlink_rcv_skb+0xc1/0xe0<br>[11160.192192]  [&lt;ffffffff81627fec&gt;] genl_rcv+0x2c/0x40<br>[11160.192193]  [&lt;ffffffff81626f86&gt;] netlink_unicast+0x106/0x210<br>[11160.192195]  [&lt;ffffffff816274c4&gt;] netlink_sendmsg+0x434/0x690<br>[11160.192199]  [&lt;ffffffff815d66ed&gt;] sock_sendmsg+0x3d/0x50<br>[11160.192201]  [&lt;ffffffff815d6785&gt;] sock_write_iter+0x85/0xf0<br>[11160.192205]  [&lt;ffffffff81209f6e&gt;] __vfs_write+0xce/0x120<br>[11160.192207]  [&lt;ffffffff8120a619&gt;] vfs_write+0xa9/0x1b0<br>[11160.192210]  [&lt;ffffffff8102587c&gt;] ? do_audit_syscall_entry+0x6c/0x70<br>[11160.192213]  [&lt;ffffffff8120b505&gt;] SyS_write+0x55/0xd0<br>[11160.192215]  [&lt;ffffffff81716aee&gt;] system_call_fastpath+0x12/0x71<br>[11163.573075] __bm_op: 84153300 callbacks suppressed<br>[11163.573075] drbd r0/0 drbd100: ASSERTION bitmap-&gt;bm_pages FAILED in __bm_op<br>[10968.421046] drbd r0/0 drbd100: ASSERTION bitmap-&gt;bm_pages FAILED in __bm_op<br>[10968.421046] drbd r0/0 drbd100: ASSERTION bitmap-&gt;bm_pages FAILED in __bm_op<br>[10968.421046] drbd r0/0 drbd100: ASSERTION bitmap-&gt;bm_pages FAILED in __bm_op<br>[10973.403026] __bm_op: 84588466 callbacks suppressed<br>[10973.403026] drbd r0/0 drbd100: ASSERTION bitmap-&gt;bm_pages FAILED in __bm_op<br>[10973.403026] drbd r0/0 drbd100: ASSERTION bitmap-&gt;bm_pages FAILED in __bm_op<br>[10973.403026] drbd r0/0 drbd100: ASSERTION bitmap-&gt;bm_pages FAILED in __bm_op<br></div><div>----&lt;paste ends&gt;----<br><br></div><div>&#39;grep -c&#39; shows tens of thousands of the ASSERTION errors shown above.<br><br></div><div>The call trace happened on a DRBD client node, which then rebooted automatically.<br><br></div><div>Any insights?<br><br></div><div>Thanks in advance.<br></div><div></div><div><br># cat /proc/drbd<br>version: 9.0.7-1 
(api:2/proto:86-112)<br><br></div><div>My DRBD resource configuration:<br>resource r0 {<br>        handlers {<br>                split-brain &quot;/usr/lib64/drbd/notify-split-brain.sh root&quot;;<br>        }<br>        startup {<br>                become-primary-on both;<br>        }<br>        connection-mesh {<br>                hosts 10-0-149-20 10-0-147-191 10-0-218-14 10-0-183-69;<br>        }<br>        on 10-0-149-20 {<br>                node-id   0;<br>                address ipv4 10.0.149.20:7789;<br>                volume 0 {<br>                        device minor 100;<br>                        disk   /dev/disk/by-id/wwn-0x000f5ab58042677f;<br>                        meta-disk internal;<br>                }<br>        }<br>        on 10-0-147-191 {<br>                node-id   1;<br>                address ipv4 10.0.147.191:7789;<br>                volume 0 {<br>                        device minor 100;<br>                        disk   /dev/disk/by-id/wwn-0x000f5ab58042677f;<br>                        meta-disk internal;<br>                }<br>        }<br>        # DRBD client<br>        on 10-0-218-14 {<br>                node-id 2;<br>                address ipv4 10.0.218.14:7789;<br>                volume 0 {<br>                        device minor 100;<br>                        disk none;<br>                        meta-disk internal;<br>                }<br>        }<br>        # DRBD client<br>        on 10-0-183-69 {<br>                node-id 3;<br>                address ipv4 10.0.183.69:7789;<br>                volume 0 {<br>                        device minor 100;<br>                        disk none;<br>                        meta-disk internal;<br>                }<br>        }<br>        net {<br>                after-sb-0pri discard-zero-changes;<br>                after-sb-1pri 
discard-secondary;<br>                after-sb-2pri disconnect;<br>                fencing resource-and-stonith;<br>                protocol C;<br>                allow-two-primaries yes;<br>                sndbuf-size 0;<br>        }<br>}<br><br></div><div><div><div><div><br>-- <br><div class="gmail_signature">Thanks,<br>Li Qun<br></div>
</div></div></div></div></div>