<div dir="ltr">Hello everybody, <div><br></div><div>I am currently facing some issues with DRBD synchronization.</div><div>Here is the config file:<br><div>global {<br></div><div> usage-count no;</div><div>}</div><div><br></div><div>common {</div><div> startup {</div><div> wfc-timeout 15;</div><div> degr-wfc-timeout 15;</div><div> outdated-wfc-timeout 15;</div><div> }</div><div> disk {</div><div> resync-rate 80M;</div><div> disk-flushes no;</div><div> disk-barrier no;</div><div> al-extents 3389;</div><div> c-fill-target 0;</div><div> c-plan-ahead 18;</div><div> c-max-rate 200M;</div><div> }</div><div> net {</div><div> protocol C;</div><div> max-buffers 8000;</div><div> max-epoch-size 8000;</div><div> sndbuf-size 1024k;</div><div> }</div><div>}</div><div><br></div><div>resource cmshareddrbdres {</div><div> net {</div><div> cram-hmac-alg sha1;</div><div> shared-secret xxxxxxx;</div><div> after-sb-0pri discard-younger-primary;</div><div> after-sb-1pri discard-secondary;</div><div> csums-alg md5;</div><div> }</div><div> on master1 {</div><div> device /dev/drbd1;</div><div> disk /dev/sdb;</div><div> address 10.149.255.254:7789;</div><div> meta-disk internal;</div><div> }</div><div> on master2 {</div><div> device /dev/drbd1;</div><div> disk /dev/sdb;</div><div> address 10.149.255.253:7789;</div><div> meta-disk internal;</div></div><div><div> }</div><div>}<br><br>The 10.149.0.0/16 network uses IPoIB.<br><br>The messages that I see on the first master are here: <a href="https://pastebin.com/0xCLceeD">https://pastebin.com/0xCLceeD</a></div><div><br></div><div>Suspicious messages:<br>[Sun Jun 4 03:50:17 2017] block drbd1: logical block size of local backend does not match (drbd:512, backend:4096); was this a late attach?</div><div>[Sun Jun 4 03:51:01 2017] drbd cmshareddrbdres: [drbd_w_cmshared/3640] sock_sendmsg time expired, ko = 6</div><div>[Sun Jun 4 03:34:12 
2017] block drbd1: We did not send a P_BARRIER for 84203ms > ko-count (7) * timeout (60 * 0.1s); drbd kernel thread blocked?<br>(I see many of these.)<br><br>My guess is that there is a network issue, but I am not sure: in that case I would expect DRBD to be able to send the messages and then time out waiting for the other side.</div><div><br></div><div>I have tried to stress the system and I couldn't reproduce the problem, so it doesn't seem to be load-related.</div><div><br></div><div><div>[root@master1 ~]# uname -r</div><div>3.10.0-327.el7.x86_64</div><div>[root@master1 ~]# rpm -qa | grep drbd</div><div>kmod-drbd84-8.4.7-1_1.el7.elrepo.x86_64</div><div>drbd84-utils-8.9.5-1.el7.elrepo.x86_64</div></div><div><br></div><div>Any ideas?<br><br><br>Regards,</div></div><div>-- <br><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><br><table height="195" width="312" style="font-size:7pt;font-family:Tahoma,Arial,Helvetica;padding:0px;border:1px solid rgb(234,239,242)"><tbody><tr valign="top"><td colspan="2"><img alt="clustervision_logo.png" title="" src="http://www.clustervision.com/images/cv_sig.gif"></td></tr><tr><td valign="bottom" nowrap style="padding-left:12px"><font style="font-size:9pt;font-weight:bold">Andrea Del Monaco<br></font><font style="font-size:7pt">Internal Engineer<br> <br> <br>Mob: +31 64 166 4003<br>Skype: delmonaco.andrea<br><a href="mailto:andrea.delmonaco@clustervision.com" style="text-decoration:none;color:rgb(57,136,194)" target="_blank">andrea.delmonaco@clustervision.com</a></font><br> <br></td><td valign="bottom" nowrap><font style="font-size:8pt;font-weight:bold">ClusterVision BV<br></font><font style="font-size:7pt">Gyroscoopweg 56<br>1042 AC Amsterdam<br>The Netherlands<br>Tel: +31 20 407 7550<br>Fax: +31 84 759 8389<br><a href="http://www.clustervision.com/" style="text-decoration:none;color:rgb(0,63,119)" target="_blank">www.clustervision.com</a></font><br> 
<br></td></tr></tbody></table></div></div></div></div></div></div></div></div></div>
</div></div>