<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jun 12, 2017 at 5:45 PM, Lars Ellenberg <span dir="ltr"><<a href="mailto:lars.ellenberg@linbit.com" target="_blank">lars.ellenberg@linbit.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail-HOEnZb"><div class="gmail-h5">On Fri, Jun 09, 2017 at 11:39:05PM +0800, David Lee wrote:<br>
> Hi,<br>
><br>
> I am experimenting with DRBD dual-primary with OCFS2, and a DRBD client as<br>
> well,<br>
> with the hope that every node can access the storage in a unified way.<br>
> But I got a<br>
> kernel call trace and huge number of ASSERTION failure (*before* OCFS2 is<br>
> mounted):<br>
><br>
> ----<paste begins>----<br>
> [11160.192091] INFO: task drbdsetup:19442 blocked for more than 120 seconds.<br>
> [11160.192096] Tainted: G OE 4.1.12-37.2.2.el7uek.x86_64<br>
> #2<br>
> [11160.192097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables<br>
> this message.<br>
> [11160.192099] drbdsetup D ffff88013fd17840 0 19442 1<br>
> 0x00000084<br>
> [11160.192108] ffff8800addef8c8 0000000000000082 ffff88013a3d3800<br>
> ffff8800369eb800<br>
> [11160.192111] ffff8800addef938 ffff8800addf0000 ffff8800adb192c0<br>
> 7fffffffffffffff<br>
> [11160.192113] ffff8800369eb800 0000000000000297 ffff8800addef8e8<br>
> ffffffff81712947<br>
> [11160.192116] Call Trace:<br>
> [11160.192128] [<ffffffff81712947>] schedule+0x37/0x90<br>
> [11160.192131] [<ffffffff8171596c>] schedule_timeout+0x20c/0x280<br>
> [11160.192134] [<ffffffff817158b6>] ? schedule_timeout+0x156/0x280<br>
> [11160.192148] [<ffffffffa05c2695>] ? drbd_destroy_path+0x15/0x20 [drbd]<br>
> [11160.192152] [<ffffffff817134b4>] wait_for_completion+0x134/0x190<br>
> [11160.192157] [<ffffffff810b1d90>] ? wake_up_state+0x20/0x20<br>
> [11160.192165] [<ffffffffa05c4d51>] _drbd_thread_stop+0xc1/0x110 [drbd]<br>
> [11160.192173] [<ffffffffa05dd84c>] del_connection+0x3c/0x140 [drbd]<br>
> [11160.192179] [<ffffffffa05e0bd3>] drbd_adm_down+0xc3/0x2c0 [drbd]<br>
> [11160.192184] [<ffffffff8162886d>] genl_family_rcv_msg+0x1cd/0x400<br>
<br>
</div></div><span class="gmail-">> [11163.573075] __bm_op: 84153300 callbacks suppressed<br>
> [11163.573075] drbd r0/0 drbd100: ASSERTION bitmap->bm_pages FAILED in<br>
<br>
<br>
</span>The assertion is that the bitmap pages are supposed to be allocated<br>
when we do bitmap operations.<br>
<br>
Apparently in this case, they are not.<br>
<br>
So either the bitmap pages have never been allocated, and our error<br>
handling for that case sucks, or they are freed too early, while<br>
"something" still wants to flip or count some bits. But I would have<br>
expected someone to notice something like that before. Strange.<br>
<br>
Lars<br></blockquote></div><br></div><div class="gmail_extra"><br>Thanks for your comments, Lars.<br><br></div><div class="gmail_extra">I found some other interesting (and weird) behavior with OCFS2 and DRBD clients,<br></div><div class="gmail_extra">and have since moved in a different direction. The interesting findings are:<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">1. In a three-node OCFS2 cluster with dual-primary DRBD and one client node,<br> the whole cluster fences (every node reboots) when the DRBD client node goes down.<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">2. If I add one more DRBD client node (with the drbd/o2cb/ocfs2 configs updated accordingly),<br></div><div class="gmail_extra"> then both client nodes constantly fail to join, with mount.ocfs2 failing.<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">I've changed the experiment to drop OCFS2. But if any help is needed (for example,<br></div><div class="gmail_extra">to verify some configuration), please let me know.<br clear="all"></div><div class="gmail_extra"><br>-- <br><div class="gmail_signature">Thanks,<br>Li Qun<br></div>
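For readers following the thread: a minimal sketch of the kind of resource configuration being discussed, assuming DRBD 9 syntax, where a "DRBD client" is a node configured with `disk none` (diskless, I/O served over the network). All hostnames, addresses, and device paths below are hypothetical, not taken from the poster's setup:

```
resource r0 {
    net {
        protocol C;
        allow-two-primaries yes;   # needed for dual-primary use with OCFS2
    }
    on node-a {
        node-id 0;
        device  /dev/drbd100;
        disk    /dev/sdb1;
        address 192.168.1.1:7789;
    }
    on node-b {
        node-id 1;
        device  /dev/drbd100;
        disk    /dev/sdb1;
        address 192.168.1.2:7789;
    }
    on node-c {
        node-id 2;
        device  /dev/drbd100;
        disk    none;              # the "DRBD client": diskless node
        address 192.168.1.3:7789;
    }
    connection-mesh {
        hosts node-a node-b node-c;
    }
}
```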
</div></div>