[DRBD-user] DRBD Filesystem Pacemaker Resources Stopped

Robert Langley Robert.Langley at ventura.org
Thu Mar 29 23:53:00 CEST 2012

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On 3/27/2012 01:18 AM, Andreas Kurz wrote:
> Mountpoint created on both nodes, defined correct device and valid file system? What happens after a cleanup? ... crm resource cleanup p_fs_vol01 ... grep for "Filesystem" in your logs to get the error output from the resource agent.
>
> For more ... please share current drbd state/configuration and your cluster configuration.
>
> Regards,
> Andreas

* Pardon me if I'm not replying correctly; I'm still learning how this mailing list works. I'll see how this goes. Look out, I'm a noob!

Andreas,
Thank you for your reply.

The filesystems sit on LVM2 logical volumes (as described in the LINBIT guide; the DRBD device is then used as the physical volume for the volume group). All logical volumes show as available on ds01, but NOT available on ds02 at this time. I formatted them with ext4 and adjusted the Pacemaker configuration from the guide accordingly (the guide uses ext3).
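
For reference, this is roughly how each volume was built per the guide (the lvcreate size below is only a placeholder, not my real value):

pvcreate /dev/drbd0                # the DRBD device is the LVM physical volume
vgcreate nfs /dev/drbd0
lvcreate -n vol01 -L 100G nfs      # repeated for vol02..vol10 and dc1
mkfs.ext4 /dev/nfs/vol01
mkdir -p /srv/nfs/vol01            # mount point created on both nodes
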
I had previously run the cleanup, and it did not appear to make a difference.

This time I first ran crm_mon -1; the LVM resource was Started, but the Filesystem resources were not. That is how it has been.
Then (maybe I shouldn't have worried about this yet) I noticed that the startup section of my global_common.conf did not include any wait-for-connection timeouts, so I added some (see my additions below).
After doing so, and without restarting anything yet, I ran crm resource cleanup p_fs_vol01. The LVM resource then reported FAILED, and I am now getting the following from crm_mon -1:

Stack: Heartbeat
Current DC: ds02 (8a61ab9e-da93-4b4d-8f37-9523436b5f14) - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, unknown expected votes
4 Resources configured.
============

Online: [ ds01 ds02 ]

 Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
     p_drbd_nfs:0       (ocf::linbit:drbd):     Slave ds01 (unmanaged) FAILED
     p_drbd_nfs:1       (ocf::linbit:drbd):     Slave ds02 (unmanaged) FAILED
 Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
     Started: [ ds01 ds02 ]

Failed actions:
    p_drbd_nfs:0_demote_0 (node=ds01, call=75, rc=5, status=complete): not installed
    p_drbd_nfs:0_stop_0 (node=ds01, call=79, rc=5, status=complete): not installed
    p_drbd_nfs:1_monitor_30000 (node=ds02, call=447, rc=5, status=complete): not installed
    p_drbd_nfs:1_stop_0 (node=ds02, call=458, rc=5, status=complete): not installed
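
As I understand it, rc=5 is the OCF "not installed" return code, i.e. the resource agent believes something it depends on is missing. A quick sanity check of the DRBD pieces on each node (a sketch; paths are the usual defaults):

lsmod | grep drbd                        # is the kernel module loaded?
drbdadm --version                        # is the userland installed?
ls /usr/lib/ocf/resource.d/linbit/drbd   # is the linbit resource agent present?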

Grep for "Filesystem" in /var/log/syslog on ds01 shows the following for every volume repeatedly:
Mar 29 11:06:19 ds01 pengine: [27987]: notice: native_print:      p_fs_vol01#011(ocf::heartbeat:Filesystem):#011Stopped

On ds02, I see the same in the syslog file, plus this additional message after the ones above:
Mar 29 11:09:26 ds02 Filesystem[2000]: [2021]: WARNING: Couldn't find device [/dev/nfs/vol01]. Expected /dev/??? to exist
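
That warning would make sense if the volume group never activates on ds02: the LVM agent can only activate the VG once the underlying DRBD device is readable (Primary) there. To check by hand on the node reporting the missing device (roughly what the agent does on start):

cat /proc/drbd      # DRBD must be up, and Primary, for the PV to be readable
vgchange -ay nfs    # activate the VG, approximately what the LVM agent does
lvs nfs             # /dev/nfs/vol01 should appear once the VG is active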

DRBD State from ds01 (Before restarting ds02): Connected and UpToDate with ds01 as the Primary.
DRBD State from ds02 (After restarting ds02; interesting; Pacemaker?): cat: /proc/drbd: No such file or directory
DRBD State from ds01 (After restarting ds02): WFConnection with ds02 as unknown.
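
Since /proc/drbd does not exist on ds02, the drbd kernel module apparently never loaded after the reboot; if Pacemaker is supposed to manage DRBD, the drbd init script is usually disabled, so nothing loads the module until the resource agent runs. To bring it up by hand for testing (outside Pacemaker's control):

modprobe drbd       # load the kernel module; this creates /proc/drbd
drbdadm up nfs      # attach the backing disk and connect to the peer
cat /proc/drbd      # should now show the resource and connection state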

---- Configuration below here ---

:::DRBD Resource Config:::
resource nfs {
    device /dev/drbd0;
    disk /dev/sda1;
    meta-disk internal;
    on ds01 {
        address 192.168.1.11:7790;
    }
    on ds02 {
        address 192.168.1.12:7790;
    }
}
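
(A quick way to catch syntax slips in the resource file, such as a missing brace or semicolon, is to let drbdadm parse it:

drbdadm dump nfs    # prints the parsed resource, or errors out on bad syntax
)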

:::DRBD Global_common.conf:::
global {
        usage-count yes;
}

common {
        protocol C;

        handlers {
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        }

        startup {
                # Added just before this reply (3/29/2012).
                # Please comment on this thread if I am mistaken in adding these.
                wfc-timeout 120;
                degr-wfc-timeout 120;
                outdated-wfc-timeout 120;
                wait-after-sb 180;
        }

        disk {
                on-io-error detach;
        }

        net {
                after-sb-0pri disconnect;
                after-sb-1pri disconnect;
                after-sb-2pri disconnect;
        }

        syncer {
                rate 100M;
                al-extents 257;
        }
}

:::Heartbeat ha.cf:::
autojoin none
mcast bond0 239.0.0.1 694 1 0
bcast bond1
keepalive 2
deadtime 15
warntime 5
initdead 60
node ds01
node ds02
pacemaker respawn

:::Pacemaker CIB.XML:::
<cib epoch="60" num_updates="0" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.5" have-quorum="1" cib-last-written="Thu Mar 29 10:39:59 2012" dc-uuid="8a61ab9e-da93-4b4d-8f37-9523436b5f14">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
        <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1333042796"/>
        <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="b0dff0ec-073e-475b-b7b9-167ae122e5e0" type="normal" uname="ds01"/>
      <node id="8a61ab9e-da93-4b4d-8f37-9523436b5f14" type="normal" uname="ds02"/>
    </nodes>
    <resources>
      <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
        <instance_attributes id="failover-ip-instance_attributes">
          <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="192.168.2.10"/>
        </instance_attributes>
        <operations>
          <op id="failover-ip-monitor-10s" interval="10s" name="monitor"/>
        </operations>
        <meta_attributes id="failover-ip-meta_attributes">
          <nvpair id="failover-ip-meta_attributes-target-role" name="target-role" value="Stopped"/>
        </meta_attributes>
      </primitive>
      <master id="ms_drbd_nfs">
        <meta_attributes id="ms_drbd_nfs-meta_attributes">
          <nvpair id="ms_drbd_nfs-meta_attributes-master-max" name="master-max" value="1"/>
          <nvpair id="ms_drbd_nfs-meta_attributes-master-node-max" name="master-node-max" value="1"/>
          <nvpair id="ms_drbd_nfs-meta_attributes-clone-max" name="clone-max" value="2"/>
          <nvpair id="ms_drbd_nfs-meta_attributes-clone-node-max" name="clone-node-max" value="1"/>
 	  <nvpair id="ms_drbd_nfs-meta_attributes-notify" name="notify" value="true"/>
        </meta_attributes>
        <primitive class="ocf" id="p_drbd_nfs" provider="linbit" type="drbd">
          <instance_attributes id="p_drbd_nfs-instance_attributes">
            <nvpair id="p_drbd_nfs-instance_attributes-drbd_resource" name="drbd_resource" value="nfs"/>
          </instance_attributes>
          <operations>
            <op id="p_drbd_nfs-monitor-15" interval="15" name="monitor" role="Master"/>
            <op id="p_drbd_nfs-monitor-30" interval="30" name="monitor" role="Slave"/>
          </operations>
        </primitive>
      </master>
      <clone id="cl_lsb_nfsserver">
        <primitive class="lsb" id="p_lsb_nfsserver" type="nfs-kernel-server">
          <operations>
            <op id="p_lsb_nfsserver-monitor-30s" interval="30s" name="monitor"/>
          </operations>
        </primitive>
      </clone>
      <group id="g_nfs">
        <primitive class="ocf" id="p_lvm_nfs" provider="heartbeat" type="LVM">
          <instance_attributes id="p_lvm_nfs-instance_attributes">
            <nvpair id="p_lvm_nfs-instance_attributes-volgrpname" name="volgrpname" value="nfs"/>
          </instance_attributes>
          <operations>
            <op id="p_lvm_nfs-monitor-30s" interval="30s" name="monitor"/>
          </operations>
          <meta_attributes id="p_lvm_nfs-meta_attributes">
            <nvpair id="p_lvm_nfs-meta_attributes-is-managed" name="is-managed" value="true"/>
          </meta_attributes>
        </primitive>
        <primitive class="ocf" id="p_fs_vol01" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol01-instance_attributes">
            <nvpair id="p_fs_vol01-instance_attributes-device" name="device" value="/dev/nfs/vol01"/>
            <nvpair id="p_fs_vol01-instance_attributes-directory" name="directory" value="/srv/nfs/vol01"/>
            <nvpair id="p_fs_vol01-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol01-monitor-10s" interval="10s" name="monitor"/>
          </operations>
          <meta_attributes id="p_fs_vol01-meta_attributes">

      </meta_attributes>
        </primitive>
        <primitive class="ocf" id="p_fs_vol02" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol02-instance_attributes">
            <nvpair id="p_fs_vol02-instance_attributes-device" name="device" value="/dev/nfs/vol02"/>
            <nvpair id="p_fs_vol02-instance_attributes-directory" name="directory" value="/srv/nfs/vol02"/>
            <nvpair id="p_fs_vol02-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol02-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol03" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol03-instance_attributes">
            <nvpair id="p_fs_vol03-instance_attributes-device" name="device" value="/dev/nfs/vol03"/>
            <nvpair id="p_fs_vol03-instance_attributes-directory" name="directory" value="/srv/nfs/vol03"/>
            <nvpair id="p_fs_vol03-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol03-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol04" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol04-instance_attributes">
            <nvpair id="p_fs_vol04-instance_attributes-device" name="device" value="/dev/nfs/vol04"/>
            <nvpair id="p_fs_vol04-instance_attributes-directory" name="directory" value="/srv/nfs/vol04"/>
            <nvpair id="p_fs_vol04-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol04-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol05" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol05-instance_attributes">
            <nvpair id="p_fs_vol05-instance_attributes-device" name="device" value="/dev/nfs/vol05"/>
            <nvpair id="p_fs_vol05-instance_attributes-directory" name="directory" value="/srv/nfs/vol05"/>
            <nvpair id="p_fs_vol05-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol05-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol06" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol06-instance_attributes">
            <nvpair id="p_fs_vol06-instance_attributes-device" name="device" value="/dev/nfs/vol06"/>
            <nvpair id="p_fs_vol06-instance_attributes-directory" name="directory" value="/srv/nfs/vol06"/>
            <nvpair id="p_fs_vol06-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol06-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol07" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol07-instance_attributes">
            <nvpair id="p_fs_vol07-instance_attributes-device" name="device" value="/dev/nfs/vol07"/>
            <nvpair id="p_fs_vol07-instance_attributes-directory" name="directory" value="/srv/nfs/vol07"/>
            <nvpair id="p_fs_vol07-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol07-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol08" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol08-instance_attributes">
            <nvpair id="p_fs_vol08-instance_attributes-device" name="device" value="/dev/nfs/vol08"/>
            <nvpair id="p_fs_vol08-instance_attributes-directory" name="directory" value="/srv/nfs/vol08"/>
            <nvpair id="p_fs_vol08-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol08-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol09" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol09-instance_attributes">
            <nvpair id="p_fs_vol09-instance_attributes-device" name="device" value="/dev/nfs/vol09"/>
            <nvpair id="p_fs_vol09-instance_attributes-directory" name="directory" value="/srv/nfs/vol09"/>
            <nvpair id="p_fs_vol09-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol09-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_vol10" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_vol10-instance_attributes">
            <nvpair id="p_fs_vol10-instance_attributes-device" name="device" value="/dev/nfs/vol10"/>
            <nvpair id="p_fs_vol10-instance_attributes-directory" name="directory" value="/srv/nfs/vol10"/>
            <nvpair id="p_fs_vol10-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_vol10-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="p_fs_dc1" provider="heartbeat" type="Filesystem">
          <instance_attributes id="p_fs_dc1-instance_attributes">
            <nvpair id="p_fs_dc1-instance_attributes-device" name="device" value="/dev/nfs/dc1"/>
            <nvpair id="p_fs_dc1-instance_attributes-directory" name="directory" value="/srv/nfs/dc1"/>
            <nvpair id="p_fs_dc1-instance_attributes-fstype" name="fstype" value="ext4"/>
          </instance_attributes>
          <operations>
            <op id="p_fs_dc1-monitor-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <meta_attributes id="g_nfs-meta_attributes">
          <nvpair id="g_nfs-meta_attributes-target-role" name="target-role" value="Started"/>
        </meta_attributes>
      </group>
    </resources>
    <constraints>
      <rsc_order first="ms_drbd_nfs" first-action="promote" id="o_drbd_before_nfs" score="INFINITY" then="g_nfs" then-action="start"/>
      <rsc_colocation id="c_nfs_on_drbd" rsc="g_nfs" score="INFINITY" with-rsc="ms_drbd_nfs" with-rsc-role="Master"/>
      <rsc_location id="cli-prefer-failover-ip" rsc="failover-ip">
        <rule id="cli-prefer-rule-failover-ip" score="INFINITY" boolean-op="and">
          <expression id="cli-prefer-expr-failover-ip" attribute="#uname" operation="eq" value="ds01" type="string"/>
        </rule>
      </rsc_location>
    </constraints>
    <rsc_defaults>
      <meta_attributes id="rsc-options">
        <nvpair id="rsc-options-resource-stickiness" name="resource-stickiness" value="200"/>
      </meta_attributes>
    </rsc_defaults>
  </configuration>
</cib>
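
For readability, the two main constraints in the XML above should correspond to this crm shell syntax (as I understand the mapping):

order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master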


Thank you,
Robert



