No subject


Tue Jan 23 13:55:44 CET 2018


Hi,

I have two identical bare-metal machines, "server1" and "server2",
running DRBD_KERNEL_VERSION=9.0.13 on openSUSE Leap 15.
I have one resource, "res1".

Dual-primary is NOT allowed.

If I run the following scenario:

0. "server1" stops the VM that uses res1 as its virtual hard disk
   (to ensure a consistent VM state for the backup and no reads/writes
   on the DRBD resource).
1. Set "res1" on both nodes, "server1" and "server2", to Secondary and
   disconnect.
2. Set "res1" on both nodes to Primary in StandAlone mode.
3. "server1" starts the VM that uses res1 as its virtual hard disk.
4. "server2" runs the command "ddrescue /dev/drbd/by-res/bb
   /backup/backup.img".
5. When the backup is finished, I set "res1" on node "server2" back to
   Secondary (while on server1, res1 is still Primary and receives
   reads/writes).
6. Connect both nodes again.

Now I see "random" behavior:

Sometimes res1 connects after I run the command "drbdadm connect res1"
on both nodes.

Sometimes I get:

server1:~ # drbdadm status res1

server1 role:Primary
  disk:UpToDate
  server2 connection:StandAlone

And if I look into the log file on server1:

drbd res1/0 drbd1: Split-Brain detected but unresolved, dropping connection!

From the DRBD 9 documentation I read that this behavior should be
normal:

https://docs.linbit.com/docs/users-guide-9.0/
"11.3. Manual split brain recovery"

If DRBD detects that both nodes are (or were at some point, while
disconnected) in the primary role, it immediately tears down the
replication connection. The tell-tale sign of this is a message like
the following appearing in the system log:

Split-Brain detected, dropping connection!

The solution for this problem is to use "drbdadm --discard-my-data
connect res1" on the node (server2) that made the backup.

But my question is: why do I *sometimes not* get a split brain?
It makes no difference whether I write to or read from the primary
device while the backup is running.
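For reference, the scenario above corresponds roughly to this command
sequence (a sketch based on my description; the VM start/stop commands
are omitted, and I use the resource name res1 throughout):

    # on BOTH nodes: demote and cut the replication link
    drbdadm secondary res1
    drbdadm disconnect res1

    # on BOTH nodes: promote while StandAlone (the dual-primary
    # restriction is not violated because the link is down)
    drbdadm primary res1

    # on server2 only: copy the standalone device into an image file
    ddrescue /dev/drbd/by-res/bb /backup/backup.img

    # on server2 only, after the backup: demote again
    drbdadm secondary res1

    # on BOTH nodes: re-establish replication
    drbdadm connect res1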

--------------F204265983ECF13F8227D2D6--
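In case it helps anyone reading along: my understanding of the manual
recovery from "11.3. Manual split brain recovery", applied to this
setup, is the following sketch (server2 is the split-brain victim,
since its changes from the backup run should be discarded):

    # on server2 (the victim, whose writes are thrown away):
    drbdadm disconnect res1
    drbdadm --discard-my-data connect res1

    # on server1 (the survivor), if it shows connection:StandAlone:
    drbdadm connect res1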


More information about the drbd-user mailing list