<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Thanks.<div>However, DRBD seems to have been stuck for 2 days with these "time expired" messages until I cut the connection between the nodes (it then resumed flawlessly).</div><div>It seems it would have stayed in this non-working state indefinitely.</div><div><br></div><div>I already ran into this issue a few months ago; an online verification was also running at the time.&nbsp;</div><div><br></div><div>Is there anything I can do?</div><div>Some parameter tuning?</div><div>A "retry patch" so that DRBD would "stop and retry" when it encounters this issue?</div><div>...</div><div><br></div><div>Thank you very much!</div><div><br></div><div>Best regards,</div><div><br></div><div>Ben</div><div><br></div><div><div><div>On 10 March 2013 at 16:33, David Coulson wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">
  
  
  <div bgcolor="#FFFFFF" text="#000000">
    Sorry - I picked out the wrong line(s).<br>
    <br>
    Feb 17 20:31:11 srv2-1 kernel: block drbd1: [drbd1_worker/3083]
    sock_sendmsg time expired, ko = 4294967295<br>
    Feb 17 20:31:17 srv2-1 kernel: block drbd1: [drbd1_worker/3083]
    sock_sendmsg time expired, ko = 4294967294<br>
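    <div>A side note on the two ko values in these log lines: they look like a counter that started at 0 and was decremented past zero. The short Python sketch below only illustrates that reading. It assumes the kernel prints an unsigned 32-bit counter that drops by one per expired send timeout; this is inferred from the logged numbers and is not stated in the thread. If the assumption holds, it would be consistent with the ko-count option in the net section of the configuration being 0 (disabled), which may be worth checking.</div>
    <pre>
```python
import struct


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("=i", struct.pack("=I", value))[0]


# The two "ko" values quoted from the srv2-1 log.
logged = [4294967295, 4294967294]

# Assumption: the counter is unsigned 32-bit, so 0 - 1 wraps around.
wrapped = (0 - 1) % 2**32

for ko in logged:
    # Show each logged value next to its signed interpretation.
    print(ko, as_signed32(ko))

print("0 - 1 wraps to", wrapped)
```
    </pre>
    <div>Read this way, the values correspond to -1 and -2, that is, one and two decrements below a starting value of 0.</div>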
    <pre style="color: rgb(0, 0, 0); font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; word-wrap: break-word; white-space: pre-wrap;">That means your network is unreliable. Not much DRBD can do about it - I would investigate the cause of that problem.

David
</pre>
    <br>
    <div class="moz-cite-prefix">On 3/10/13 11:21 AM, AZ 9901 wrote:<br>
    </div>
    <blockquote cite="mid:953FD5A6-F266-4D20-82C1-E10472F2377F@gmail.com" type="cite">David,
      <div><br>
      </div>
      <div>Thank you for your answer !</div>
      <div><br>
      </div>
      <div>This log entry appeared just after I cut network
        communication between srv2-1 and srv2-2, and is almost certainly
        caused by that:</div>
      <div>I connected to the secondary server and used iptables to block
        traffic between the two servers.</div>
      <div>Right after that, the primary server was reachable again!</div>
      <div>But according to the logs, the issue started 2 days earlier.</div>
      <div><br>
      </div>
      <div>However, to answer your question, the network between the two
        servers is the private dedicated network that OVH runs between its
        two data centres, RBX &amp; SBG:</div>
      <div><a moz-do-not-send="true" href="http://www.ovh.co.uk/dedicated_servers/data_centre_selection.xml">http://www.ovh.co.uk/dedicated_servers/data_centre_selection.xml</a></div>
      <div>I have a 100 Mbps connection between the two servers.</div>
      <div><br>
      </div>
      <div>Best regards,</div>
      <div><br>
      </div>
      <div>Ben</div>
      <div><br>
      </div>
      <div>
        <div>
          <div>On 10 March 2013 at 16:01, David Coulson wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000"> What is your network
              between the two systems?<br>
              <br>
              <pre style="color: rgb(0, 0, 0); font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; word-wrap: break-word; white-space: pre-wrap;">Feb 19 19:20:56 srv2-2 kernel: block drbd1: PingAck did not arrive in time.
</pre>
              <br class="Apple-interchange-newline">
              That means DRBD couldn't communicate between the nodes.<br>
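              <div>To see when such messages first and last appeared on a node, the kernel log can be scanned for them. The Python sketch below is a minimal illustration. It assumes log lines formatted exactly like the ones quoted in this thread; the function name and the regular expression are hypothetical helpers, not part of DRBD.</div>
              <pre>
```python
import re

# Message fragments taken from the log lines quoted in this thread.
PATTERNS = ("sock_sendmsg time expired", "PingAck did not arrive in time")

# Assumed line layout: "Mon DD HH:MM:SS host kernel: block drbdN: message"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) kernel: block (drbd\d+): (.*)$"
)


def drbd_timeout_events(lines):
    """Return (timestamp, host, device, message) for each matching line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and any(p in match.group(4) for p in PATTERNS):
            events.append(match.groups())
    return events


# Sample input: two lines quoted in this thread, each joined onto one line.
sample = [
    "Feb 19 19:20:56 srv2-2 kernel: block drbd1: PingAck did not arrive in time.",
    "Feb 17 20:31:11 srv2-1 kernel: block drbd1: [drbd1_worker/3083] "
    "sock_sendmsg time expired, ko = 4294967295",
]

events = drbd_timeout_events(sample)
if events:
    # Events are reported in input order, not sorted by date.
    print("first:", events[0][0], "last:", events[-1][0], "count:", len(events))
```
              </pre>
              <div>In practice the sample list would be replaced by the lines of the node's own kernel log file.</div>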
              <br>
              David<br>
              <br>
              <div class="moz-cite-prefix">On 3/10/13 10:59 AM, AZ 9901
                wrote:<br>
              </div>
              <blockquote cite="mid:D91E5861-11AA-4507-92AB-A0C36FC01157@gmail.com" type="cite">
                <div>On 5 March 2013 at 07:21, AZ 9901 wrote:</div>
                <div>
                  <div><br class="Apple-interchange-newline">
                    <blockquote type="cite">
                      <div style="word-wrap: break-word;
                        -webkit-nbsp-mode: space; -webkit-line-break:
                        after-white-space; ">
                        <div>// I made some errors in my previous mail,
                          here they are corrected</div>
                        <div><br>
                        </div>
                        Hello,<br>
                        <br>
                        I faced a big issue with DRBD.<br>
                        <br>
                        OS : Linux Debian 6<br>
                        Kernel : 2.6.32-46<br>
                        DRBD : 8.3.14<br>
                        <br>
                        My primary server (srv2-2) was almost entirely
                        unreachable; it only replied to ping.<br>
                        Apache, SSH, etc. no longer responded.<br>
                        <br>
                        So I connected to my secondary server (srv2-1)
                        and cut network communication between the two.<br>
                        This made srv2-2 available again!<br>
                        However, I decided to switch srv2-1 from
                        Secondary to Primary and to reboot srv2-2.<br>
                        <br>
                        Following are logs from srv2-2 and srv2-1, with
                        some comments.<br>
                        srv2-2 :&nbsp;<a moz-do-not-send="true" href="http://pastebin.com/raw.php?i=zkHV5Tr9">http://pastebin.com/raw.php?i=zkHV5Tr9</a><br>
                        srv2-1 :&nbsp;<a moz-do-not-send="true" href="http://pastebin.com/raw.php?i=WX4vNR6d">http://pastebin.com/raw.php?i=WX4vNR6d</a><br>
                        <br>
                        On srv2-2, sar shows that some of my CPU
                        cores were at 100% utilization (100% iowait) for
                        the entire period in which I had "time expired"
                        errors.&nbsp;<br>
                        <br>
                        Could you help me please ?<br>
                        <br>
                        Thank you very much,<br>
                        <br>
                        Ben
                        <div><br>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br>
                </div>
                <div>Hello,
                  <div><br>
                  </div>
                  <div>Any help on this problem ?</div>
                  <div><br>
                  </div>
                  <div>To help further, here is my configuration :&nbsp;<a moz-do-not-send="true" href="http://pastebin.com/raw.php?i=UJ7npfBD">http://pastebin.com/raw.php?i=UJ7npfBD</a></div>
                  <div><br>
                  </div>
                  <div>Thank you very much,</div>
                  <div><br>
                  </div>
                  <div>Best regards,</div>
                  <div><br>
                  </div>
                  <div>Ben</div>
                </div>
                <div><br>
                </div>
                <br>
                <br>
                <pre wrap="">_______________________________________________
drbd-user mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://lists.linbit.com/mailman/listinfo/drbd-user">http://lists.linbit.com/mailman/listinfo/drbd-user</a>
</pre>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div>

</blockquote></div><br></div></body></html>