<div dir="ltr">So,<br><br>I have played for hours with the dynamic resync rate controller.<br>Here are my settings:<br><br>c-plan-ahead 10; // 10ms between my 2 nodes, but a minimum of 1 second is recommended here: <a href="https://blogs.linbit.com/p/128/drbd-sync-rate-controller/" target="_blank">https://blogs.linbit.com/p/128/drbd-sync-rate-controller/</a><br>resync-rate 680M; // mostly ignored<br>c-min-rate 400M;<br>c-max-rate 680M;<br>c-fill-target 20M; // my BDP is 6.8M, guides say to use a value between 1x and 3x BDP<br><br>Resync can reach up to 680M when there are no application IOs on the source.<br>However, as soon as there are application IOs (writes with dd in my tests), resync slows down to a few MB/s...<br>I played with c-plan-ahead and c-fill-target without success.<br>I also tested c-delay-target.<br>I tried to set unplug-watermark to 16.<br>My IO scheduler is already the deadline one...<br><br>Well, I'm a little bit lost: I can't get resync to sustain a minimum rate of 400M when there are application IOs...<br><br>Here, Lars says:<br><a href="http://lists.linbit.com/pipermail/drbd-user/2011-August/016739.html" target="_blank">http://lists.linbit.com/pipermail/drbd-user/2011-August/016739.html</a><br>The dynamic resync rate controller basically tries to use as much as c-max-rate bandwidth, but will automatically throttle if<br>- application IO on the device is detected (READ or WRITE), AND the estimated current resync speed is above c-min-rate<br>- the amount of in-flight resync requests exceeds c-fill-target<br><br>However, does DRBD throttle application IOs when the resync rate is lower than c-min-rate?<br>According to my tests, I'm not so sure.<br><br>Thank you very much for your support,<br><br>Ben<div class="gmail_extra"><br><br><div class="gmail_quote">2015-05-26 15:06 GMT+02:00 A.Rubio <span dir="ltr"><<a href="mailto:aurusa@etsii.upv.es" target="_blank">aurusa@etsii.upv.es</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 
.8ex;border-left:1px #ccc solid;padding-left:1ex">
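To double-check my own numbers, here is a quick sketch (the names and the function are mine, not DRBD's) of the BDP arithmetic behind my c-fill-target choice, and of the throttle rule as Lars describes it:

```python
# Back-of-the-envelope check of the settings above.
# This is a sketch of the rule as described, not DRBD source code.

link_rate = 680e6      # bytes/s -- c-max-rate (the 10Gb/s link, capped at 680M)
rtt = 0.010            # seconds -- ~10ms between the two nodes

# Bandwidth-delay product: the amount of data that must be "in flight"
# to keep the link full.
bdp = link_rate * rtt               # 6.8e6 bytes = 6.8M
fill_target_range = (bdp, 3 * bdp)  # guides: 1x to 3x BDP -> 6.8M .. 20.4M

def should_throttle_resync(app_io_active, est_resync_rate, in_flight,
                           c_min_rate=400e6, c_fill_target=20e6):
    """Throttle rule per Lars' description: slow resync down when
    application IO is seen AND the estimated resync rate is already
    above c-min-rate, or when in-flight resync data exceeds
    c-fill-target."""
    return ((app_io_active and est_resync_rate > c_min_rate)
            or in_flight > c_fill_target)
```

So 20M sits right at the top of the 1x-3x BDP range, and by this rule resync should not be throttled while it runs below c-min-rate.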
<div bgcolor="#FFFFFF" text="#000000">
Have you tested these values?<br>
<br>
<a href="https://drbd.linbit.com/users-guide/s-throughput-tuning.html" target="_blank">https://drbd.linbit.com/users-guide/s-throughput-tuning.html</a><br>
<br>
<br>
<div>On 26/05/15 at 13:16, Ben RUBSON
wrote:<br>
</div>
<blockquote type="cite"><div><div class="h5">
<div dir="ltr">
<div>RAID controller is OK yes.<br>
<br>
</div>
Here is a 4-step example of the issue:<br>
<br>
<br>
<br>
### 1 - initial state:<br>
<br>
Source :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 0<br>
- eth1 incoming MB/s : 0<br>
- eth1 outgoing MB/s : 0<br>
Target :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 0<br>
- eth1 incoming MB/s : 0<br>
- eth1 outgoing MB/s : 0<br>
<br>
<br>
<br>
### 2 - dd if=/dev/zero of=bigfile:<br>
<br>
Source :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 670<br>
- eth1 incoming MB/s : 1<br>
- eth1 outgoing MB/s : 670<br>
Target :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 670<br>
- eth1 incoming MB/s : 670<br>
- eth1 outgoing MB/s : 1<br>
<br>
<br>
<br>
### 3 - disable the link between the 2 nodes:<br>
<br>
Source :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 670<br>
- eth1 incoming MB/s : 0<br>
- eth1 outgoing MB/s : 0<br>
Target :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 0<br>
- eth1 incoming MB/s : 0<br>
- eth1 outgoing MB/s : 0<br>
<br>
<br>
<br>
### 4 - re-enable the link between the 2 nodes:<br>
<br>
Source :<br>
- sdb read MB/s : ~20<br>
- sdb write MB/s : ~670<br>
- eth1 incoming MB/s : 1<br>
- eth1 outgoing MB/s : 670<br>
Target :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 670<br>
- eth1 incoming MB/s : 670<br>
- eth1 outgoing MB/s : 1<br>
DRBD resource :<br>
1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate
C r-----<br>
ns:62950732 nr:1143320132 dw:1206271712 dr:1379744185
al:9869 bm:6499 lo:2 pe:681 ua:1 ap:0 ep:1 wo:d oos:11883000<br>
[>...................] sync'ed: 6.9% (11604/12448)M<br>
finish: 0:34:22 speed: 5,756 (6,568) want: 696,320 K/sec<br>
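As a sanity check on the status line above (DRBD prints these figures in KiB and KiB/s), a quick calculation:

```python
# Figures copied from the /proc/drbd status line above.
oos_kib = 11883000      # out-of-sync data (KiB)
speed_kib_s = 5756      # current resync speed (KiB/s)
want_kib_s = 696320     # configured target rate (KiB/s)

eta_s = oos_kib / speed_kib_s         # ~2065 s, i.e. about 0:34:25 -> matches "finish: 0:34:22"
eta_at_want_s = oos_kib / want_kib_s  # ~17 s if resync actually ran at the wanted rate
```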
<br>
<br>
<br>
### values I would have expected in step 4:<br>
<br>
Source :<br>
- sdb read MB/s : ~400 (because of c-min-rate 400M)<br>
- sdb write MB/s : ~370<br>
- eth1 incoming MB/s : 1<br>
- eth1 outgoing MB/s : 670<br>
Target :<br>
- sdb read MB/s : 0<br>
- sdb write MB/s : 670<br>
- eth1 incoming MB/s : 670<br>
- eth1 outgoing MB/s : 1<br>
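The arithmetic behind those expected values (the ~770 MB/s figure is my assumption about what the array sustains under mixed read+write load, based on the measurements above):

```python
# My expectation, not DRBD output: resync reads pinned at the
# configured floor, the application gets whatever the array has left.
c_min_rate = 400        # MB/s, resync floor from the config
array_mixed_rate = 770  # MB/s, assumed sdb throughput under mixed load

resync_read = c_min_rate                    # expected sdb read MB/s
app_write = array_mixed_rate - resync_read  # expected sdb write MB/s for dd
```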
<div>
<div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">Why is resync totally ignored, while the
application (dd here in the example) still consumes all
available IOs / bandwidth?<br>
<br>
<br>
<br>
</div>
<div class="gmail_extra">Thank you,<br>
<br>
</div>
<div class="gmail_extra">Ben<br>
</div>
<div class="gmail_extra"><br>
<br>
<br>
<div class="gmail_quote">2015-05-25 16:50 GMT+02:00
A.Rubio <span dir="ltr"><<a href="mailto:aurusa@etsii.upv.es" target="_blank">aurusa@etsii.upv.es</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Are the cache and
I/O settings in the RAID controller optimal? Write-back,
write-through, cache enabled, I/O direct, ...<br>
<br>
On 25/05/15 at 11:50, Ben RUBSON wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div>
The link between nodes is a 10Gb/s link.<br>
The DRBD resource is a RAID-10 array which is
able to resync at up to 800M (as you can see I
have lowered it to 680M in my configuration
file).<br>
<br>
The "issue" here seems to be a prioritization
problem between application IOs and resync IOs.<br>
Perhaps I misconfigured something?<br>
The goal is to have a resync rate of up to 680M,
with a minimum of 400M, even if there are application
IOs.<br>
This is in order to be sure the resync completes
even if there are a lot of write IOs from the
application.<br>
<br>
With my simple test below, this is not the case:
dd still writes at its best throughput, lowering
the resync rate, which can't reach 400M at all.<br>
<br>
Thank you !<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
On 25 May 2015 at 11:18, A.Rubio <<a href="mailto:aurusa@etsii.upv.es" target="_blank">aurusa@etsii.upv.es</a>>
wrote:<br>
<br>
What is the link between the nodes? 1Gb/s?
10Gb/s? ...<br>
<br>
What are the hard disks? SATA 7200rpm?
10000rpm? SAS?<br>
SSD? ...<br>
<br>
400M to 680M is OK with a 10Gb/s link and SAS
15,000 rpm disks, but expect less otherwise ...<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
On 12 Apr 2014 at 17:23, Ben RUBSON <<a href="mailto:ben.rubson@gmail.com" target="_blank">ben.rubson@gmail.com</a>>
wrote:<br>
<br>
Hello,<br>
<br>
Let's assume the following configuration:<br>
disk {<br>
c-plan-ahead 0;<br>
resync-rate 680M;<br>
c-min-rate 400M;<br>
}<br>
<br>
Both nodes are UpToDate, and on the primary
I have a test IO burst running, using dd.<br>
<br>
I then cut the replication link for a few
minutes so that the secondary node falls
several GB behind the primary node.<br>
<br>
I then re-enable the replication link.<br>
What I expect here, according to the
configuration, is that the secondary node will
fetch the missing GB at 400 MB/s throughput.<br>
DRBD should then prefer resync IOs over
application (dd here) IOs.<br>
<br>
However, it does not seem to work.<br>
dd still writes at its best throughput,
while reads are made from the primary
disk at between 30 and 60 MB/s to complete the
resync.<br>
Of course this is not the expected
behaviour.<br>
<br>
Did I miss something ?</blockquote></blockquote></div></div></blockquote></blockquote></div></div></div></div></div></div></div></blockquote></div></blockquote></div></div></div>