<div dir="ltr">Hi,<div><br></div><div>Thanks for your answers.</div><div><br></div><div>Answering your questions:</div><div>DRBD_KERNEL_VERSION=9.0.25<br></div><div><br></div><div>Linux kernel:</div><div>4.18.0-305.3.1.el8.x86_6<br></div><div><br></div><div>File system type: </div><div>XFS.</div><div><br></div><div>So the file system is not cluster-aware, but as far as I understand, in an active/passive, single-primary setup (which is what I have) that should be OK.</div><div>I just checked the documentation, which seems to confirm that.</div><div><br></div><div>I think the problem may come from the way I&#39;m testing it.</div><div>I came up with the testing scenario described in my first post because I didn&#39;t have an easy way to abruptly restart the server.</div><div>When I do a hard reset of the primary server, it works as expected (or at least I can find a logical explanation for what I see).</div><div><br></div><div>I think what happened in my previous scenario was:</div><div>Service is writing to the disk, and some portion of the written data is in a disk cache. 
As the picture <a href="https://linbit.com/wp-content/uploads/drbd/drbd-guide-9_0-en/images/drbd-in-kernel.png">https://linbit.com/wp-content/uploads/drbd/drbd-guide-9_0-en/images/drbd-in-kernel.png</a> shows, the cache is above the DRBD module.</div><div>Then I kill the service and the network, but some data is still in the cache.</div><div>At some point the cache is flushed and the data gets written to the disk.</div><div>DRBD probably reports some error at this point, as it can&#39;t send that data to the secondary node (DRBD thinks the other node has left the cluster).</div><div><br></div><div>When I check the files at this point, I see more data on the primary because it also contains the data from the cache, which was not replicated because the network was already down when that data reached DRBD.</div><div><br></div><div>When I do a hard restart of the server, the data in the cache is lost, so the difference described above does not appear.</div><div><br></div><div>Does that make sense?</div><div><br></div><div>Regards,</div><div>Janusz.</div><div><br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 10 Aug 2021 at 17:55, Digimer &lt;<a href="mailto:lists@alteeve.ca">lists@alteeve.ca</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div>

    <div>On 2021-08-10 2:11 a.m., Eddie Chapman

      wrote:<br>

    </div>

    <blockquote type="cite">On

      10/08/2021 00:01, Digimer wrote:

      <br>

      <blockquote type="cite">On 2021-08-05 5:53 p.m., Janusz Jaskiewicz

        wrote:

        <br>

        <blockquote type="cite">Hello.

          <br>

          <br>

          I&#39;m experimenting a bit with DRBD in a cluster managed by

          Pacemaker.

          <br>

          It&#39;s a two node, active-passive cluster and the service that

          I&#39;m trying to put in the cluster writes to the file system.

          <br>

          The service manages many files, it appends to some of them and

          increments the single integer value in others.

          <br>

          <br>

          I&#39;m observing surprising behaviour and I would like to ask you

          if what I see is expected or not (I think not).

          <br>

          <br>

          I&#39;m using protocol C, but still I see some delay in the files

          that are being replicated to the secondary server.

          <br>

          For the files that increment the integer I see a difference

          which corresponds roughly to 1 second of traffic.

          <br>

          <br>

          I&#39;m really surprised to see this, as protocol C should

          guarantee synchronous replication.

          <br>

          I&#39;d rather expect some delay in processing (potentially slower

          disk writes due to the network replication).

          <br>

          <br>

          The way I&#39;m testing it:

          <br>

          The service runs on primary and writes to DRBD drive,

          secondary connected and &quot;UpToDate&quot;.

          <br>

          I kill the service abruptly (kill -9) and then take down the

          network interface between primary and secondary (kill and

          ifdown commands in the script so executed quite promptly one

          after the other).

          <br>

          Then I mount the DRBD drive on both nodes and check the

          difference in the files with incrementing integer.

          <br>

          <br>

          I would appreciate any help or pointers on how to fix this.

          <br>

          But first of all I would like to confirm that this behaviour

          is not expected.

          <br>

          <br>

          Also if it is expected/allowed, how can I decrease the impact?

          <br>

        </blockquote>

      </blockquote>

      <br>

      <blockquote type="cite">What filesystem are you using? Is it

        cluster / multi-node aware?

        <br>

      </blockquote>

      <br>

      The filesystem may be relevant in that filesystems can behave in

      ways one might not expect, depending on how they are tuned, so

      would be good to know what the fs is and what mount options are

      being used. However, the filesystem certainly does not need to be

      aware that the underlying block device is drbd or a cluster of any

      kind. A drbd device should look like a regular block device and

      there is no need to treat it like anything else.

      <br>

      <br>

      I would like to know the kernel and drbd versions, if these are

      old enough then expecting things to &quot;work&quot; in a sane fashion might

      not be a reasonable expectation :-)

      <br>

    </blockquote>

    <p>The reason I asked is because all DRBD does is replicate the

      blocks. It doesn&#39;t (and can&#39;t) handle locks to avoid corruption,

      update file lists, etc. If you&#39;ve mounted, say, xfs or ext4 on

      both nodes, then it&#39;s a surprise it updates at all, never mind

      being a few seconds delayed. If you&#39;re using a proper cluster FS

      like GFS2 or ocfs, then we may want to investigate those.</p>

    <pre cols="72">-- 

Digimer

Papers and Projects: <a href="https://alteeve.com/w/" target="_blank">https://alteeve.com/w/</a>

&quot;I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops.&quot; - Stephen Jay Gould</pre>

  </div>

</blockquote></div>
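<div><br></div>
<div>A minimal sketch of the cache effect discussed in this thread, written in Python. It assumes the service could flush each update itself; the function name <code>durable_increment</code> and the one-integer-per-file layout are hypothetical and not taken from the actual service. A plain write only reaches the page cache, which sits above DRBD. Calling fsync pushes the data down to the block device, and with protocol C that write should complete only once the peer has confirmed it as well.</div>
<pre>
```python
import os


def durable_increment(path):
    """Increment the integer stored in `path` and force it to the block device.

    Hypothetical helper: it illustrates the write + fsync pattern only.
    """
    try:
        with open(path, "r") as f:
            value = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        # Missing or empty file: start counting from zero.
        value = 0

    value += 1

    with open(path, "w") as f:
        f.write(str(value))
        # Move the data from Python's userspace buffer into the kernel page cache.
        f.flush()
        # Ask the kernel to push the cached data down to the block device.
        # On a DRBD device this is the point where replication is involved.
        os.fsync(f.fileno())

    return value
```
</pre>
<div>If the service worked this way, every increment that returned before the kill -9 should already have passed through DRBD, so the two nodes would be expected to differ by at most the one update that was in flight. Without the fsync call, the kernel decides when to write the cached data back, which may explain a gap of roughly one second of traffic.</div>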