<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Hi all,<br>
      <br>
      Does anyone know a trick to freeze the RAID controller or
      otherwise block its I/O?<br>
      <br>
      I'd like to reproduce our problem to check whether ko-count fixes
      it.<br>
      <br>
      Thanks for your help<br>
      <br>
      Matthieu<br>
      <br>
      <br>
      <br>
      <br>
      <br>
      On 10/03/14 09:44, Matthieu Lejeune wrote:<br>
    </div>
    <blockquote cite="mid:531D7B8B.8060208@exxoss.com" type="cite">
      <div class="moz-cite-prefix">Hi,<br>
        <br>
        Thanks for your reply.<br>
        <br>
        If I modify global_common.conf like this:<br>
        <br>
        global {<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; usage-count yes;<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # minor-count dialog-refresh disable-ip-verification<br>
        }<br>
        common {<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; protocol C;<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; handlers {<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # The following 3 handlers were disabled due to
        #576511.<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # Please check the DRBD manual and enable them,
        if they make sense in your setup.<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # pri-on-incon-degr
        "/usr/lib/drbd/notify-pri-on-incon-degr.sh;
        /usr/lib/drbd/notify-emergency-reboot.sh; echo b &gt;
        /proc/sysrq-trigger ; reboot -f";<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # pri-lost-after-sb
        "/usr/lib/drbd/notify-pri-lost-after-sb.sh;
        /usr/lib/drbd/notify-emergency-reboot.sh; echo b &gt;
        /proc/sysrq-trigger ; reboot -f";<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # local-io-error
        "/usr/lib/drbd/notify-io-error.sh;
        /usr/lib/drbd/notify-emergency-shutdown.sh; echo o &gt;
        /proc/sysrq-trigger ; halt -f";<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # split-brain
        "/usr/lib/drbd/notify-split-brain.sh root";<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # out-of-sync
        "/usr/lib/drbd/notify-out-of-sync.sh root";<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # before-resync-target
        "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # after-resync-target
        /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; startup {<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # wfc-timeout degr-wfc-timeout
        outdated-wfc-timeout wait-after-sb<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; disk {<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # on-io-error fencing use-bmbv no-disk-barrier
        no-disk-flushes<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # no-disk-drain no-md-flushes max-bio-bvecs<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; net {<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ko-count 2;<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; timeout 50;<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # sndbuf-size rcvbuf-size timeout connect-int
        ping-int ping-timeout max-buffers<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # max-epoch-size ko-count allow-two-primaries
        cram-hmac-alg shared-secret<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # after-sb-0pri after-sb-1pri after-sb-2pri
        data-integrity-alg no-tcp-cork<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
        <br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; syncer {<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # rate after al-extents use-rle cpu-mask
        verify-alg csums-alg<br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
        }<br>
        <br>
        If I apply this configuration on the secondary node, will the
        slave be disconnected cleanly when we have hardware problems
        like the one in my previous post?<br>
        <br>
        Thanks<br>
        <br>
        Matthieu Lejeune<br>
        <br>
        <br>
        <br>
        <br>
        On 5/03/14 11:32, Philip Gaw wrote:<br>
      </div>
      <blockquote cite="mid:5316FD47.3050101@darktech.org.uk"
        type="cite">
        <blockquote cite="mid:5316F0DA.7050704@darktech.org.uk"
          type="cite">Hi Matthieu,<br>
          <br>
          <div class="moz-cite-prefix">On 05/03/2014 07:29, Matthieu
            Lejeune wrote:<br>
          </div>
          <blockquote cite="mid:5316D24E.8040201@exxoss.com" type="cite">
            Hi all,<br>
            <br>
            I had a problem last night with a DRBD primary/slave pair.<br>
            <br>
            <br>
            The slave experienced a hardware issue (the LSI controller
            froze).<br>
            It seems the master held I/O while waiting for the slave to
            respond, until the timeout.<br>
            <br>
            <br>
            This caused all targets exported through InfiniBand to be
            disconnected from the master.<br>
            <br>
            <br>
            So, in practice, the master stopped responding because of a
            failure on the slave.<br>
            <br>
            I had to hard reboot (power cycle) the slave because udev
            wasn't responding and did not allow a normal reboot.<br>
            After the slave rebooted, DRBD reconnected. Its status was
            Primary/Secondary, UpToDate/UpToDate.<br>
            But the LSI controller immediately timed out, causing the same
            issue a second time.<br>
            <br>
            <br>
            How can we prevent an issue on the slave from impacting the master?<br>
            <br>
          </blockquote>
          Have a look at ko-count:<br>
          <br>
          <dt><span class="term"><code class="option">ko-count&nbsp;<em
                  class="replaceable"><code>number</code></em></code></span></dt>
          <dd>
            <p>In case the secondary node fails to complete a single
              write request for&nbsp;<em
                class="replaceable"><code>count</code></em>&nbsp;times
              the&nbsp;<em class="replaceable"><code>timeout</code></em>, it is
              expelled from the cluster. (I.e. the primary node goes
              into&nbsp;<code class="option">StandAlone</code>&nbsp;mode.) The default
              value is 0, which disables this feature.</p>
          </dd>
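          <p>As a rough illustration of the rule quoted above, the sketch
            below estimates how long the primary may wait before it
            expels an unresponsive secondary. It assumes that
            <code>timeout</code> is expressed in tenths of a second, as
            the DRBD 8.3 drbd.conf reference describes. The function name
            <code>expel_window_seconds</code> is hypothetical and is not
            part of DRBD.</p>
          <pre>
```python
def expel_window_seconds(ko_count, timeout_tenths):
    """Approximate time before the primary drops an unresponsive secondary.

    Assumption: timeout_tenths is in units of 0.1 s, and the peer is
    expelled once a single write request has been pending for
    ko_count times the timeout.
    """
    if ko_count == 0:
        # ko-count 0 is the default and disables the feature.
        return None
    return ko_count * timeout_tenths / 10


# Example values only: ko-count 2 with timeout 50 (5 seconds).
print(expel_window_seconds(2, 50))
```
</pre>
          <p>With ko-count 2 and timeout 50 this comes to about 10
            seconds, so it is only an estimate of how long I/O on the
            primary could stall before the peer is dropped.</p>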
          <dt><br>
          </dt>
          <a moz-do-not-send="true" class="moz-txt-link-freetext"
            href="http://www.drbd.org/users-guide/re-drbdconf.html">http://www.drbd.org/users-guide/re-drbdconf.html</a><br>
          <br>
          <blockquote cite="mid:5316D24E.8040201@exxoss.com" type="cite">
            <br>
            Thank you.<br>
            Matthieu Lejeune<br>
            <br>
            <br>
            drbd8-utils : &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
            2:8.3.13-2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; amd64&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; RAID 1 over
            tcp/ip for Linux utilities<br>
            Debian : <br>
            root@ifprdstor8a:~/trunk# cat /proc/version <br>
            Linux version 3.2.0-4-amd64 (<a moz-do-not-send="true"
              class="moz-txt-link-abbreviated"
              href="mailto:debian-kernel@lists.debian.org">debian-kernel@lists.debian.org</a>)
            (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian
            3.2.51-1<br>
            root@ifprdstor8a:~/trunk# <br>
            <br>
            We are using SCST/SRPT, trunk version of 7 January
            2014.<br>
            <br>
            Here is the config:<br>
            <b>DRBD global configuration:</b><br>
            <br>
            root@ifprdstor8a:/etc/drbd.d# cat global_common.conf <br>
            global {<br>
            &nbsp;&nbsp;&nbsp; usage-count yes;<br>
            &nbsp;&nbsp;&nbsp; # minor-count dialog-refresh disable-ip-verification<br>
            }<br>
            <br>
            common {<br>
            &nbsp;&nbsp;&nbsp; protocol C;<br>
            <br>
            &nbsp;&nbsp;&nbsp; handlers {<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # The following 3 handlers were disabled due to
            #576511.<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # Please check the DRBD manual and enable them, if
            they make sense in your setup.<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # pri-on-incon-degr
            "/usr/lib/drbd/notify-pri-on-incon-degr.sh;
            /usr/lib/drbd/notify-emergency-reboot.sh; echo b &gt;
            /proc/sysrq-trigger ; reboot -f";<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # pri-lost-after-sb
            "/usr/lib/drbd/notify-pri-lost-after-sb.sh;
            /usr/lib/drbd/notify-emergency-reboot.sh; echo b &gt;
            /proc/sysrq-trigger ; reboot -f";<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # local-io-error "/usr/lib/drbd/notify-io-error.sh;
            /usr/lib/drbd/notify-emergency-shutdown.sh; echo o &gt;
            /proc/sysrq-trigger ; halt -f";<br>
            <br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # split-brain "/usr/lib/drbd/notify-split-brain.sh
            root";<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh
            root";<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # before-resync-target
            "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c
            16k";<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # after-resync-target
            /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;<br>
            &nbsp;&nbsp;&nbsp; }<br>
            <br>
            &nbsp;&nbsp;&nbsp; startup {<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # wfc-timeout degr-wfc-timeout outdated-wfc-timeout
            wait-after-sb<br>
            &nbsp;&nbsp;&nbsp; }<br>
            <br>
            &nbsp;&nbsp;&nbsp; disk {<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # on-io-error fencing use-bmbv no-disk-barrier
            no-disk-flushes<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # no-disk-drain no-md-flushes max-bio-bvecs<br>
            &nbsp;&nbsp;&nbsp; }<br>
            <br>
            &nbsp;&nbsp;&nbsp; net {<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # sndbuf-size rcvbuf-size timeout connect-int
            ping-int ping-timeout max-buffers<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # max-epoch-size ko-count allow-two-primaries
            cram-hmac-alg shared-secret<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # after-sb-0pri after-sb-1pri after-sb-2pri
            data-integrity-alg no-tcp-cork<br>
            &nbsp;&nbsp;&nbsp; }<br>
            <br>
            &nbsp;&nbsp;&nbsp; syncer {<br>
            &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # rate after al-extents use-rle cpu-mask verify-alg
            csums-alg<br>
            &nbsp;&nbsp;&nbsp; }<br>
            }<br>
            <br>
            <b>Resource configuration:</b><br>
            <br>
            root@ifprdstor8a:/etc/drbd.d# cat DSA801.res <br>
            resource DSA801 {<br>
            &nbsp; protocol C;<br>
            <br>
            &nbsp; startup {<br>
            &nbsp;&nbsp;&nbsp; wfc-timeout 0;<br>
            &nbsp; }<br>
            <br>
            &nbsp; disk {<br>
            &nbsp;&nbsp;&nbsp; on-io-error detach;<br>
            &nbsp; }<br>
            <br>
            &nbsp; syncer {<br>
            &nbsp;&nbsp;&nbsp; rate 400M;<br>
            &nbsp;&nbsp;&nbsp; verify-alg md5;<br>
            &nbsp; }<br>
            <br>
            &nbsp; on ifprdstor8a {<br>
            &nbsp;&nbsp;&nbsp; device&nbsp;&nbsp;&nbsp; /dev/drbd1;<br>
            &nbsp;&nbsp;&nbsp; disk&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /dev/sda;<br>
            &nbsp;&nbsp;&nbsp; address&nbsp;&nbsp; 10.13.1.5:7788;<br>
            &nbsp;&nbsp;&nbsp; meta-disk internal;<br>
            &nbsp; }<br>
            <br>
            &nbsp; on ifprdstor8b {<br>
            &nbsp;&nbsp;&nbsp; device&nbsp;&nbsp;&nbsp; /dev/drbd1;<br>
            &nbsp;&nbsp;&nbsp; disk&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /dev/sda;<br>
            &nbsp;&nbsp;&nbsp; address&nbsp;&nbsp; 10.13.1.6:7788;<br>
            &nbsp;&nbsp;&nbsp; meta-disk internal;<br>
            &nbsp; }<br>
            }<br>
            <br>
          </blockquote>
        </blockquote>
        <br>
        <br>
        <fieldset class="mimeAttachmentHeader"></fieldset>
        <br>
        <pre wrap="">_______________________________________________
drbd-user mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://lists.linbit.com/mailman/listinfo/drbd-user">http://lists.linbit.com/mailman/listinfo/drbd-user</a>
</pre>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>