[DRBD-user] DRBD Filesystem Pacemaker Resources Stopped

Andreas Kurz andreas at hastexo.com
Wed Apr 4 00:14:58 CEST 2012

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On 03/29/2012 11:53 PM, Robert Langley wrote:
> On 3/27/2012 01:18 AM, Andreas Kurz wrote:
>> Mountpoint created on both nodes, correct device and valid file system defined? What happens after a cleanup? ... crm resource cleanup p_fs_vol01 ... grep for "Filesystem" in your logs to get the error output from the resource agent.
>>
>> For more ... please share current drbd state/configuration and your cluster configuration.
>>
>> Regards,
>> Andreas
> 
> * Pardon me if I'm not replying correctly; I'm still learning how to use the mailing list. I'll see how this goes. Look out, I'm a noob!
> 
> Andreas,
> Thank you for your reply.
> 
> Mountpoints are done using LVM2 (as described in the LINBIT guide; the DRBD resource is then used as the physical volume for the volume group) and are all showing available on ds01; status is NOT available on ds02 at this time. I formatted them with ext4 and accounted for that difference when going through the LINBIT guide's Pacemaker configuration (their guide uses ext3).
> I had previously run the cleanup, and it did not appear to make a difference.
> 
> This time I first ran crm_mon -1 and the lvm resource was Started, but the Filesystems were not. That is how it has been.
> Then, and maybe I shouldn't have worried about this yet, I noticed in my global_common.conf file that I hadn't included any wait-connect timeouts in the startup section (see below for my additions).
> After doing so (though I have not restarted anything yet), I ran crm resource cleanup p_fs_vol01, then saw the lvm resource say "FAILED", and I am now getting the following from crm_mon -1:
> 
> Stack: Heartbeat
> Current DC: ds02 (8a61ab9e-da93-4b4d-8f37-9523436b5f14) - partition with quorum
> Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 2 Nodes configured, unknown expected votes
> 4 Resources configured.
> ============
> 
> Online: [ ds01 ds02 ]
> 
>  Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
>      p_drbd_nfs:0       (ocf::linbit:drbd):     Slave ds01 (unmanaged) FAILED
>      p_drbd_nfs:1       (ocf::linbit:drbd):     Slave ds02 (unmanaged) FAILED
>  Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
>      Started: [ ds01 ds02 ]
> 
> Failed actions:
>     p_drbd_nfs:0_demote_0 (node=ds01, call=75, rc=5, status=complete): not installed
>     p_drbd_nfs:0_stop_0 (node=ds01, call=79, rc=5, status=complete): not installed
>     p_drbd_nfs:1_monitor_30000 (node=ds02, call=447, rc=5, status=complete): not installed
>     p_drbd_nfs:1_stop_0 (node=ds02, call=458, rc=5, status=complete): not installed

hmm ... "not installed" ... looks like a broken config. Only make
changes to your drbd config while Pacemaker is in maintenance-mode, and
don't switch it live without testing the drbd config for validity.
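
For example, a minimal sketch (assuming your config lives in the usual
/etc/drbd.d/ location):

  crm configure property maintenance-mode=true
  # ... edit the drbd config ...
  drbdadm dump all      # parses the config; complains loudly on errors
  drbdadm adjust all    # apply only once the dump is clean
  crm configure property maintenance-mode=false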

> 
> Grep for "Filesystem" in /var/log/syslog on ds01 shows the following for every volume repeatedly:
> Mar 29 11:06:19 ds01 pengine: [27987]: notice: native_print:      p_fs_vol01#011(ocf::heartbeat:Filesystem):#011Stopped
> 
> On ds02, I receive the same in the syslog file, with the addition of this message after the above messages:
> Mar 29 11:09:26 ds02 Filesystem[2000]: [2021]: WARNING: Couldn't find device [/dev/nfs/vol01]. Expected /dev/??? to exist
> 
> DRBD State from ds01 (Before restarting ds02): Connected and UpToDate with ds01 as the Primary.
> DRBD State from ds02 (After restarting ds02; interesting; Pacemaker?): cat: /proc/drbd: No such file or directory
> DRBD State from ds01 (After restarting ds02): WFConnection with ds02 as unknown.
> 
> ---- Configuration below here ---
> 
> :::DRBD Resource Config:::
> resource nfs {
>   device    /dev/drbd0;
>   disk      /dev/sda1;
>   meta-disk internal;
>   on ds01 {
>     address 192.168.1.11:7790;
>   }
>   on ds02 {
>     address 192.168.1.12:7790;
>   }
> }
> 
> :::DRBD Global_common.conf:::
> global {
>         usage-count yes;
> }
> 
> common {
>         protocol C;
> 
>         handlers {
>                 pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
>         }
> 
>         startup {
>                 # Added just before replying to the mailing list, 3/29/2012.
>                 # Please comment if I am mistaken in adding these.
>                 wfc-timeout 120
>                 degr-wfc-timeout 120
>                 outdated-wfc-timeout 120
>                 wait-after-sb 180

ah yes ... missing ";" at the end of those lines. But you don't need
those startup timeouts: they are only read by the init script, which
should be disabled anyway. DRBD should be managed entirely by
Pacemaker.
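
On a Debian-style system (which your /var/log/syslog path suggests)
that would be something like:

  update-rc.d -f drbd remove

or, on Red Hat-style systems:

  chkconfig drbd off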

The rest of your config looks ok, except that you are using neither
STONITH nor DRBD resource-level fencing.
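
Resource-level fencing would look roughly like this in the disk and
handlers sections (a sketch along the lines of the DRBD 8.3 user's
guide; untested against your setup):

  disk {
          fencing resource-only;
  }
  handlers {
          fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
          after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }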

If the filesystems still don't start because of "not installed", these
are the possible reasons, and you should find the corresponding entries
in your logs:

ocf_log err "Couldn't find filesystem $FSTYPE in /proc/filesystems"

ocf_log err "Couldn't find device [$DEVICE]. Expected /dev/??? to exist"
		
ocf_log err "Couldn't find directory  [$MOUNTPOINT] to use as a mount point"
		
So at least p_fs_vol01 suffers from one of the above problems ... check
the mountpoints and LVs for existence and typos.
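
A quick sanity check on the node where DRBD is Primary might look like
this (a sketch, using the names from your config):

  vgs nfs && lvs nfs               # VG "nfs" visible, LVs listed?
  ls -l /dev/nfs/vol01             # device node present?
  ls -ld /srv/nfs/vol01            # mountpoint directory present?
  grep -w ext4 /proc/filesystems   # ext4 known to the kernel?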

Regards,
Andreas

-- 
Need help with DRBD?
http://www.hastexo.com/now

>         }
> 
>         disk {
>                 on-io-error detach;
>         }
> 
>         net {
>                 after-sb-0pri disconnect;
>                 after-sb-1pri disconnect;
>                 after-sb-2pri disconnect;
>         }
> 
>         syncer {
>                 rate 100M;
>                 al-extents 257;
>         }
> }
> 
> :::Heartbeat ha.cf:::
> autojoin none
> mcast bond0 239.0.0.1 694 1 0
> bcast bond1
> keepalive 2
> deadtime 15
> warntime 5
> initdead 60
> node ds01
> node ds02
> pacemaker respawn
> 
> :::Pacemaker CIB.XML:::
> <cib epoch="60" num_updates="0" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.5" have-quorum="1" cib-last-written="Thu Mar 29 10:39:59 2012" dc-uuid="8a61ab9e-da93-4b4d-8f37-9523436b5f14">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f"/>
>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>         <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1333042796"/>
>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="b0dff0ec-073e-475b-b7b9-167ae122e5e0" type="normal" uname="ds01"/>
>       <node id="8a61ab9e-da93-4b4d-8f37-9523436b5f14" type="normal" uname="ds02"/>
>     </nodes>
>     <resources>
>       <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
>         <instance_attributes id="failover-ip-instance_attributes">
>           <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="192.168.2.10"/>
>         </instance_attributes>
>         <operations>
>           <op id="failover-ip-monitor-10s" interval="10s" name="monitor"/>
>         </operations>
>         <meta_attributes id="failover-ip-meta_attributes">
>           <nvpair id="failover-ip-meta_attributes-target-role" name="target-role" value="Stopped"/>
>         </meta_attributes>
>       </primitive>
>       <master id="ms_drbd_nfs">
>         <meta_attributes id="ms_drbd_nfs-meta_attributes">
>           <nvpair id="ms_drbd_nfs-meta_attributes-master-max" name="master-max" value="1"/>
>           <nvpair id="ms_drbd_nfs-meta_attributes-master-node-max" name="master-node-max" value="1"/>
>           <nvpair id="ms_drbd_nfs-meta_attributes-clone-max" name="clone-max" value="2"/>
>           <nvpair id="ms_drbd_nfs-meta_attributes-clone-node-max" name="clone-node-max" value="1"/>
>  	  <nvpair id="ms_drbd_nfs-meta_attributes-notify" name="notify" value="true"/>
>         </meta_attributes>
>         <primitive class="ocf" id="p_drbd_nfs" provider="linbit" type="drbd">
>           <instance_attributes id="p_drbd_nfs-instance_attributes">
>             <nvpair id="p_drbd_nfs-instance_attributes-drbd_resource" name="drbd_resource" value="nfs"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_drbd_nfs-monitor-15" interval="15" name="monitor" role="Master"/>
>             <op id="p_drbd_nfs-monitor-30" interval="30" name="monitor" role="Slave"/>
>           </operations>
>         </primitive>
>       </master>
>       <clone id="cl_lsb_nfsserver">
>         <primitive class="lsb" id="p_lsb_nfsserver" type="nfs-kernel-server">
>           <operations>
>             <op id="p_lsb_nfsserver-monitor-30s" interval="30s" name="monitor"/>
>           </operations>
>         </primitive>
>       </clone>
>       <group id="g_nfs">
>         <primitive class="ocf" id="p_lvm_nfs" provider="heartbeat" type="LVM">
>           <instance_attributes id="p_lvm_nfs-instance_attributes">
>             <nvpair id="p_lvm_nfs-instance_attributes-volgrpname" name="volgrpname" value="nfs"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_lvm_nfs-monitor-30s" interval="30s" name="monitor"/>
>           </operations>
>           <meta_attributes id="p_lvm_nfs-meta_attributes">
>             <nvpair id="p_lvm_nfs-meta_attributes-is-managed" name="is-managed" value="true"/>
>           </meta_attributes>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol01" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol01-instance_attributes">
>             <nvpair id="p_fs_vol01-instance_attributes-device" name="device" value="/dev/nfs/vol01"/>
>             <nvpair id="p_fs_vol01-instance_attributes-directory" name="directory" value="/srv/nfs/vol01"/>
>             <nvpair id="p_fs_vol01-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol01-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>           <meta_attributes id="p_fs_vol01-meta_attributes">
> 
>       </meta_attributes>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol02" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol02-instance_attributes">
>             <nvpair id="p_fs_vol02-instance_attributes-device" name="device" value="/dev/nfs/vol02"/>
>             <nvpair id="p_fs_vol02-instance_attributes-directory" name="directory" value="/srv/nfs/vol02"/>
>             <nvpair id="p_fs_vol02-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol02-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol03" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol03-instance_attributes">
>             <nvpair id="p_fs_vol03-instance_attributes-device" name="device" value="/dev/nfs/vol03"/>
>             <nvpair id="p_fs_vol03-instance_attributes-directory" name="directory" value="/srv/nfs/vol03"/>
>             <nvpair id="p_fs_vol03-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol03-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol04" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol04-instance_attributes">
>             <nvpair id="p_fs_vol04-instance_attributes-device" name="device" value="/dev/nfs/vol04"/>
>             <nvpair id="p_fs_vol04-instance_attributes-directory" name="directory" value="/srv/nfs/vol04"/>
>             <nvpair id="p_fs_vol04-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol04-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol05" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol05-instance_attributes">
>             <nvpair id="p_fs_vol05-instance_attributes-device" name="device" value="/dev/nfs/vol05"/>
>             <nvpair id="p_fs_vol05-instance_attributes-directory" name="directory" value="/srv/nfs/vol05"/>
>             <nvpair id="p_fs_vol05-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol05-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol06" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol06-instance_attributes">
>             <nvpair id="p_fs_vol06-instance_attributes-device" name="device" value="/dev/nfs/vol06"/>
>             <nvpair id="p_fs_vol06-instance_attributes-directory" name="directory" value="/srv/nfs/vol06"/>
>             <nvpair id="p_fs_vol06-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol06-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol07" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol07-instance_attributes">
>             <nvpair id="p_fs_vol07-instance_attributes-device" name="device" value="/dev/nfs/vol07"/>
>             <nvpair id="p_fs_vol07-instance_attributes-directory" name="directory" value="/srv/nfs/vol07"/>
>             <nvpair id="p_fs_vol07-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol07-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol08" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol08-instance_attributes">
>             <nvpair id="p_fs_vol08-instance_attributes-device" name="device" value="/dev/nfs/vol08"/>
>             <nvpair id="p_fs_vol08-instance_attributes-directory" name="directory" value="/srv/nfs/vol08"/>
>             <nvpair id="p_fs_vol08-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol08-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol09" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol09-instance_attributes">
>             <nvpair id="p_fs_vol09-instance_attributes-device" name="device" value="/dev/nfs/vol09"/>
>             <nvpair id="p_fs_vol09-instance_attributes-directory" name="directory" value="/srv/nfs/vol09"/>
>             <nvpair id="p_fs_vol09-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol09-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_vol10" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_vol10-instance_attributes">
>             <nvpair id="p_fs_vol10-instance_attributes-device" name="device" value="/dev/nfs/vol10"/>
>             <nvpair id="p_fs_vol10-instance_attributes-directory" name="directory" value="/srv/nfs/vol10"/>
>             <nvpair id="p_fs_vol10-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_vol10-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="ocf" id="p_fs_dc1" provider="heartbeat" type="Filesystem">
>           <instance_attributes id="p_fs_dc1-instance_attributes">
>             <nvpair id="p_fs_dc1-instance_attributes-device" name="device" value="/dev/nfs/dc1"/>
>             <nvpair id="p_fs_dc1-instance_attributes-directory" name="directory" value="/srv/nfs/dc1"/>
>             <nvpair id="p_fs_dc1-instance_attributes-fstype" name="fstype" value="ext4"/>
>           </instance_attributes>
>           <operations>
>             <op id="p_fs_dc1-monitor-10s" interval="10s" name="monitor"/>
>           </operations>
>         </primitive>
>         <meta_attributes id="g_nfs-meta_attributes">
>           <nvpair id="g_nfs-meta_attributes-target-role" name="target-role" value="Started"/>
>         </meta_attributes>
>       </group>
>     </resources>
>     <constraints>
>       <rsc_order first="ms_drbd_nfs" first-action="promote" id="o_drbd_before_nfs" score="INFINITY" then="g_nfs" then-action="start"/>
>       <rsc_colocation id="c_nfs_on_drbd" rsc="g_nfs" score="INFINITY" with-rsc="ms_drbd_nfs" with-rsc-role="Master"/>
>       <rsc_location id="cli-prefer-failover-ip" rsc="failover-ip">
>         <rule id="cli-prefer-rule-failover-ip" score="INFINITY" boolean-op="and">
>           <expression id="cli-prefer-expr-failover-ip" attribute="#uname" operation="eq" value="ds01" type="string"/>
>         </rule>
>       </rsc_location>
>     </constraints>
>     <rsc_defaults>
>       <meta_attributes id="rsc-options">
>         <nvpair id="rsc-options-resource-stickiness" name="resource-stickiness" value="200"/>
>       </meta_attributes>
>     </rsc_defaults>
>   </configuration>
> </cib>
> 
> 
> Thank you,
> Robert
> 


