<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Hello!<br>
</p>
On 24.02.2017 at 15:53, Lars Ellenberg wrote:<br>
<blockquote cite="mid:20170224145351.GS21236@soda.linbit"
type="cite">
<pre wrap="">On Fri, Feb 24, 2017 at 03:08:04PM +0100, Dr. Volker Jaenisch wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
If both 10Gbit links fail then the bond0 aka the worker connection fails
and DRBD goes - as expected - into split brain. But that is not the problem.
</pre>
</blockquote>
<pre wrap="">
DRBD will be *disconnected*, yes.</pre>
</blockquote>
Sorry, I was not precise in my wording. I assumed that after DRBD
goes into the disconnected state, the cluster manager is informed
and reflects this somehow.<br>
I have since noticed that a CIB rule is set that pins the Master
role to the former primary (please have a look at the cluster state
at the end of this email), but I still wonder why this is not
reflected in the crm status output. I was misled by this missing
status information and wrongly concluded that the ocf:linbit:drbd
agent does not inform the CRM/CIB. Sorry for blaming DRBD. <br>
<br>
But I am still confused that Pacemaker does not reflect the DRBD
state change in the crm status output. Maybe this question should go
to the Pacemaker list.<br>
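<br>
(For reference: the rule the fence-peer handler adds can be inspected
in the CIB configuration, e.g. with something like the following; the
constraint id is the one visible in our crm config further down:)<br>
<br>
<pre wrap="">root@mail1:~# crm configure show | grep drbd-fence
root@mail1:~# cibadmin -Q -o constraints</pre>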
<blockquote cite="mid:20170224145351.GS21236@soda.linbit"
type="cite">
<pre wrap="">
But no reason for it to be "split brain"ed yet.
and with proper fencing configured, it won't.</pre>
</blockquote>
This is our DRBD config. This is all quite basic:<br>
<br>
<pre wrap="">resource r0 {

    disk {
        fencing resource-only;
    }

    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }

    on mail1 {
        device    /dev/drbd1;
        disk      /dev/sda1;
        address   172.27.250.8:7789;
        meta-disk internal;
    }
    on mail2 {
        device    /dev/drbd1;
        disk      /dev/sda1;
        address   172.27.250.9:7789;
        meta-disk internal;
    }
}</pre>
<br>
<b>What did we miss?</b> We have no STONITH configured yet, and
IMHO a missing STONITH configuration should not interfere with the
DRBD state change. Or am I wrong with this assumption?<br>
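<br>
(To make the question more concrete: what I would read "proper
fencing" as is something along the following lines, assuming
IPMI-capable boards. The agent, addresses and credentials below are
only placeholders, not anything we actually run; on top of that,
stonith-enabled would have to be set to true and the DRBD fencing
policy changed to resource-and-stonith. Please correct me if that
picture is wrong.)<br>
<br>
<pre wrap="">primitive st-mail1 stonith:external/ipmi \
        params hostname=mail1 ipaddr=192.0.2.11 userid=stonith passwd=secret interface=lan \
        op monitor interval=60s
primitive st-mail2 stonith:external/ipmi \
        params hostname=mail2 ipaddr=192.0.2.12 userid=stonith passwd=secret interface=lan \
        op monitor interval=60s
location st-mail1-not-on-mail1 st-mail1 -inf: mail1
location st-mail2-not-on-mail2 st-mail2 -inf: mail2</pre>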
<br>
<br>
State after bond0 goes down:<br>
<br>
<pre wrap="">root@mail1:/home/volker# crm status
Stack: corosync
Current DC: mail2 (version 1.1.15-e174ec8) - partition with quorum
Last updated: Fri Feb 24 16:56:44 2017          Last change: Fri Feb 24 16:45:19 2017 by root via cibadmin on mail2

2 nodes and 7 resources configured

Online: [ mail1 mail2 ]

Full list of resources:

 Master/Slave Set: ms_drbd_mail [drbd_mail]
     Masters: [ mail2 ]
     Slaves: [ mail1 ]
 Resource Group: FS_IP
     fs_mail            (ocf::heartbeat:Filesystem):    Started mail2
     vip_193.239.30.23  (ocf::heartbeat:IPaddr2):       Started mail2
     vip_172.27.250.7   (ocf::heartbeat:IPaddr2):       Started mail2
 Resource Group: Services
     postgres_pg2       (ocf::heartbeat:pgsql):         Started mail2
     Dovecot            (lsb:dovecot):                  Started mail2

Failed Actions:
* vip_172.27.250.7_monitor_30000 on mail2 'not running' (7): call=55, status=complete, exitreason='none',
    last-rc-change='Fri Feb 24 16:47:07 2017', queued=0ms, exec=0ms

root@mail2:/home/volker# drbd-overview
 1:r0/0  StandAlone Primary/Unknown UpToDate/Outdated  /shared/data ext4 916G 12G 858G 2%

root@mail1:/home/volker# drbd-overview
 1:r0/0  WFConnection Secondary/Unknown UpToDate/DUnknown</pre>
<br>
<br>
After bringing bond0 up again, both machines show the same state as
before. After a cleanup of the failed VIP resource, still the same state:<br>
<br>
<pre wrap="">root@mail2:/home/volker# crm status
Stack: corosync
Current DC: mail2 (version 1.1.15-e174ec8) - partition with quorum
Last updated: Fri Feb 24 17:01:05 2017          Last change: Fri Feb 24 16:59:32 2017 by hacluster via crmd on mail2

2 nodes and 7 resources configured

Online: [ mail1 mail2 ]

Full list of resources:

 Master/Slave Set: ms_drbd_mail [drbd_mail]
     Masters: [ mail2 ]
     Slaves: [ mail1 ]
 Resource Group: FS_IP
     fs_mail            (ocf::heartbeat:Filesystem):    Started mail2
     vip_193.239.30.23  (ocf::heartbeat:IPaddr2):       Started mail2
     vip_172.27.250.7   (ocf::heartbeat:IPaddr2):       Started mail2
 Resource Group: Services
     postgres_pg2       (ocf::heartbeat:pgsql):         Started mail2
     Dovecot            (lsb:dovecot):                  Started mail2

root@mail2:/home/volker# drbd-overview
 1:r0/0  StandAlone Primary/Unknown UpToDate/Outdated  /shared/data ext4 916G 12G 858G 2%

root@mail1:/home/volker# drbd-overview
 1:r0/0  WFConnection Secondary/Unknown UpToDate/DUnknown</pre>
<br>
After issuing<br>
<br>
mail2# drbdadm connect all<br>
<br>
the nodes resync and everything is back in order (the "sticky" fence
rule is cleared as well).<br>
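<br>
(For completeness, that end state can be verified with e.g. the
following; the constraint id is the one shown in the crm config below:)<br>
<br>
<pre wrap="">mail1# cat /proc/drbd                        # should show cs:Connected ... UpToDate/UpToDate again
mail2# crm configure show | grep drbd-fence  # should no longer return anything</pre>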
<br>
Cheers,<br>
<br>
Volker<br>
<br>
General setup: stock Debian Jessie without any modifications; DRBD,
Pacemaker etc. are all the stock Debian packages.<br>
<br>
Here is our crm config:<br>
<br>
<pre wrap="">node 740030984: mail1 \
        attributes standby=off
node 740030985: mail2 \
        attributes standby=off
primitive Dovecot lsb:dovecot \
        op monitor interval=20s timeout=15s \
        meta target-role=Started
primitive drbd_mail ocf:linbit:drbd \
        params drbd_resource=r0 \
        op monitor interval=15s role=Master \
        op monitor interval=16s role=Slave \
        op start interval=0 timeout=240s \
        op stop interval=0 timeout=100s
...
ms ms_drbd_mail drbd_mail \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true is-managed=true target-role=Started
order FS_IP_after_drbd inf: ms_drbd_mail:promote FS_IP:start
order dovecot_after_FS_IP inf: FS_IP:start Services:start
location drbd-fence-by-handler-r0-ms_drbd_mail ms_drbd_mail \
        <b>rule $role=Master -inf: #uname ne mail2</b>
colocation mail_fs_on_drbd inf: FS_IP Services ms_drbd_mail:Master
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.15-e174ec8 \
        cluster-infrastructure=corosync \
        cluster-name=mail \
        stonith-enabled=false \
        last-lrm-refresh=1487951972 \
        no-quorum-policy=ignore</pre>
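<br>
(Side note, in case such a fence rule ever lingers: I assume it could
also be removed by hand with something like the following, but the
after-resync-target handler took care of it here.)<br>
<br>
<pre wrap="">root@mail2:~# crm configure delete drbd-fence-by-handler-r0-ms_drbd_mail</pre>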
<br>
<br>
<br>
<br>
<br>
<pre class="moz-signature" cols="72">--
=========================================================
inqbus Scientific Computing Dr. Volker Jaenisch
Richard-Strauss-Straße 1 +49(08861) 690 474 0
86956 Schongau-West <a class="moz-txt-link-freetext" href="http://www.inqbus.de">http://www.inqbus.de</a>
=========================================================</pre>
</body>
</html>