[DRBD-user] Confused about shared partitions

Wed Apr 3 20:52:51 CEST 2019

My apologies for the length of this; I'd rather give too much information
than too little.

I have a simple Pacemaker cluster with two nodes, A and B, that are CentOS
7 VMs.  On both nodes I have a file named my-data.res under /etc/drbd.d,
with the following contents (identical on both nodes):

          resource my-data {
            protocol C;
            meta-disk internal;
            device /dev/drbd1;

            syncer {
             verify-alg sha1;
            }

            net {
             allow-two-primaries;
            }
            on A {
             disk /dev/sdb1;
             address 192.168.0.5:7789;
            }

            on B {
             disk /dev/sdb1;
             address 192.168.0.6:7789;
            }
          }

In both A and B, I did the following:

   # drbdadm create-md my-data
   --==  Thank you for participating in the global usage survey  ==--
   The server's response is:

    you are the xxxxxth user to install this version
    drbdmeta 1 v08 /dev/sdb1 internal create-md

   # drbdadm up my-data

In A alone, I did the following:

     # drbdadm primary --force my-data

After this, and giving time for both nodes to become synchronized, I get
the following in A:

     # cat /proc/drbd
      version: 8.4.11-1 (api:1/proto:86-101)
      GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55

      1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
         ns:2096028 nr:0 dw:0 dr:2098148 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

whereas the output from the same command in B is

       version: 8.4.11-1 (api:1/proto:86-101)
       GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55

       1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
          ns:0 nr:2096028 dw:2096028 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
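
For what it is worth, here is a minimal sketch of how the fields in that
status line could be read from a script instead of by eye. The helper name
drbd_field is hypothetical, and the sample text is the status of node A
shown above, laid out the way /proc/drbd prints it; on a live node one
would presumably pass "$(cat /proc/drbd)" instead.

```shell
#!/bin/sh
# drbd_field (hypothetical helper): print the value of one "key:value"
# field (cs, ro, ds, oos, ...) found in DRBD 8.4 status text.
#   $1 = field name, $2 = status text (may span several lines)
drbd_field() {
    # Put every whitespace-separated token on its own line, then keep the
    # first token that starts with "<field>:" and strip that prefix.
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# Sample copied from the output of node A above.
sample=' 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:2096028 nr:0 dw:0 dr:2098148 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0'

echo "connection:  $(drbd_field cs "$sample")"
echo "roles:       $(drbd_field ro "$sample")"
echo "disk states: $(drbd_field ds "$sample")"
echo "out of sync: $(drbd_field oos "$sample")"
```

With cs:Connected and ds:UpToDate/UpToDate on both sides, I take the two
nodes to be in sync at this point.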

In A, I then created a filesystem in the /dev/drbd1 device

        # mkfs.ext4 /dev/drbd1

I then mounted it (still in A) and populated it with some data, consisting
of two files, f1 and f2, unmounting when done.

After this, I issued the following commands in A:

        # pcs cluster cib drbd_cfg
        # pcs -f drbd_cfg resource create DrbdData ocf:linbit:drbd \
              drbd_resource=my-data op monitor interval=60s
        # pcs -f drbd_cfg resource master DrbdDataClone DrbdData \
              master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
        # pcs cluster cib-push drbd_cfg
        # pcs cluster cib fs_cfg
        # pcs -f fs_cfg resource create DrbdFS Filesystem \
              device="/dev/drbd1" directory="/var/lib/my-data" fstype="ext4"
        # pcs -f fs_cfg constraint colocation add DrbdFS with DrbdDataClone \
              INFINITY with-rsc-role=Master
        # pcs -f fs_cfg constraint order promote DrbdDataClone then start DrbdFS
        Adding DrbdDataClone DrbdFS (kind: Mandatory) (Options: first-action=promote then-action=start)
        # pcs -f fs_cfg constraint colocation add ClusterIP with DrbdFS INFINITY
        # pcs -f fs_cfg constraint order DrbdFS then ClusterIP
        Adding DrbdFS ClusterIP (kind: Mandatory) (Options: first-action=start then-action=start)
        # pcs cluster cib-push fs_cfg
        CIB Updated
        # pcs status resources
        ClusterIP  (ocf::heartbeat:IPaddr2):  Started A
         Master/Slave Set: DrbdDataClone [DrbdData]
             Masters: [ A ]
             Slaves: [ B ]
         DrbdFS  (ocf::heartbeat:Filesystem):  Started A

The output from the last command in B is very similar,  mutatis mutandis.
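
As an aside, the node currently holding the Master role can be pulled out
of that output with a small filter. This is only a sketch: pcs_masters is a
hypothetical helper, and it assumes the "Masters: [ ... ]" line format
printed by this pcs version.

```shell
#!/bin/sh
# pcs_masters (hypothetical helper): read `pcs status resources` output on
# stdin and print the node name(s) listed on the "Masters: [ ... ]" line.
pcs_masters() {
    # \[ matches a literal "["; the unescaped "]" is literal in a basic regex.
    sed -n 's/^[[:space:]]*Masters: \[ \(.*\) ][[:space:]]*$/\1/p'
}

# Sample copied from the output above; on a cluster node one could run
#   pcs status resources | pcs_masters
sample='ClusterIP  (ocf::heartbeat:IPaddr2):  Started A
 Master/Slave Set: DrbdDataClone [DrbdData]
     Masters: [ A ]
     Slaves: [ B ]
 DrbdFS  (ocf::heartbeat:Filesystem):  Started A'

echo "DRBD master: $(printf '%s\n' "$sample" | pcs_masters)"
```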

Once this is all done, node A is active whereas node B is passive. When I do

           # ls /var/lib/my-data

in A I can see the f1 and f2 files that I created above. In B, however,
there is nothing. My understanding of the DRBD framework tells me that this
is expected.
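
One way to check that understanding, sketched under assumptions: on the
passive node nothing should be mounted on /var/lib/my-data, so ls shows
only the empty mount-point directory on the local disk. The helper
mount_source below is hypothetical, and the two /proc/mounts excerpts are
made up for illustration rather than copied from my nodes.

```shell
#!/bin/sh
# mount_source (hypothetical helper): read text in /proc/mounts format on
# stdin and print the device mounted on the directory given as $1.
# Prints nothing when the directory is not a mount point.
mount_source() {
    # Field 1 is the device, field 2 the mount point; the last match wins,
    # which corresponds to the most recent mount on that directory.
    awk -v dir="$1" '$2 == dir { src = $1 } END { if (src != "") print src }'
}

# Illustrative /proc/mounts excerpts (assumed, not real output).
active_node='/dev/sda1 / xfs rw,relatime 0 0
/dev/drbd1 /var/lib/my-data ext4 rw,relatime 0 0'
passive_node='/dev/sda1 / xfs rw,relatime 0 0'

echo "active:  $(printf '%s\n' "$active_node" | mount_source /var/lib/my-data)"
echo "passive: $(printf '%s\n' "$passive_node" | mount_source /var/lib/my-data)"

# On a real node: mount_source /var/lib/my-data < /proc/mounts
```

If the active node did not report /dev/drbd1 here, anything written under
/var/lib/my-data would land on the local disk and not on the replicated
device.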

During this state of affairs (i.e. the cluster up both in A and B, with A
being the active node and B being the passive node) I stopped the A node by
issuing the following command in A:

             # pcs cluster stop A

After this, and giving time for node B to take over, I get the following in
B:

             # pcs status
            Cluster name: FirstCluster
            Stack: corosync
            Current DC: B (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
            Last updated: Wed Apr  3 11:49:19 2019
            Last change: Wed Apr  3 09:53:30 2019 by root via cibadmin on B

           2 nodes configured
           4 resources configured

           Online: [ B ]
           OFFLINE: [ A ]

           Full list of resources:

           ClusterIP (ocf::heartbeat:IPaddr2): Started B
           Master/Slave Set: DrbdDataClone [DrbdData]
               Masters: [ B ]
               Stopped: [ A ]
           DrbdFS (ocf::heartbeat:Filesystem): Started B

           Daemon Status:
              corosync: active/enabled
              pacemaker: active/enabled
              pcsd: active/enabled

On listing the contents of /var/lib/my-data on A, nothing is found there.
However, on listing the contents of /var/lib/my-data on B, I find files f1
and f2. So, B has taken over and is now accessing, thanks to DRBD, the
filesystem created in A earlier on.

Next I started and stopped both A and B (the nodes, not the VMs) until I
got back the original situation - namely, A is active, B is passive,
/var/lib/my-data contains f1 and f2 in A, and nothing in B.

Next I did the following: In A I removed both f1 and f2, and added new
files, f3 and f4. I did this while A was the active node. After this I
stopped the A node from A as above. Having done this, the output from pcs
status in B is as described above. Here's the thing:

In this situation, when I list the contents of /var/lib/my-data in A, I
find it to be empty - as expected, for A has been stopped. When I list the
contents of /var/lib/my-data in B, however, what I see is files f1 and f2,
not f3 and f4. I was expecting any changes made to /var/lib/my-data
while A was active to be visible to B when it takes over.

Is my expectation misplaced? Or perhaps the changes to /var/lib/my-data
have to be made by some resource explicitly managed by Pacemaker, while
changes done by hand are just dismissed? Notice, however, that if I bring
the A node up again, and make it active, on listing /var/lib/my-data on A I
see files f3 and f4: the changes I made by hand have not just been
dismissed.

As you can see, I am very confused.