Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On 26.06.2015 12:42, JA E wrote:
> Hi,
>
> I am a newbie to clustering. I managed to set up a cluster with
> pacemaker, corosync, drbd and pcs on two nodes. But after a test restart
> of both nodes it seems DRBD can't mount the desired folder when controlled
> by Pacemaker; mounting it manually works fine.
>
>> [root@master Desktop]# pcs config
>> Cluster Name: cluster_web
>> Corosync Nodes:
>>  master slave
>> Pacemaker Nodes:
>>  master slave
>>
>> Resources:
>>  Resource: virtual_ip (class=ocf provider=heartbeat type=IPaddr2)
>>   Attributes: ip=192.168.73.133 cidr_netmask=32
>>   Operations: start interval=0s timeout=20s (virtual_ip-start-timeout-20s)
>>               stop interval=0s timeout=20s (virtual_ip-stop-timeout-20s)
>>               monitor interval=30s (virtual_ip-monitor-interval-30s)
>>  Resource: webserver (class=ocf provider=heartbeat type=apache)
>>   Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status
>>   Operations: start interval=0s timeout=40s (webserver-start-timeout-40s)
>>               stop interval=0s timeout=60s (webserver-stop-timeout-60s)
>>               monitor interval=1min (webserver-monitor-interval-1min)
>>  Master: webserver_data_sync
>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>>   Resource: webserver_data (class=ocf provider=linbit type=drbd)
>>    Attributes: drbd_resource=drbd0
>>    Operations: start interval=0s timeout=240 (webserver_data-start-timeout-240)
>>                promote interval=0s timeout=90 (webserver_data-promote-timeout-90)
>>                demote interval=0s timeout=90 (webserver_data-demote-timeout-90)
>>                stop interval=0s timeout=100 (webserver_data-stop-timeout-100)
>>                monitor interval=60s (webserver_data-monitor-interval-60s)
>>  Resource: webserver_fs (class=ocf provider=heartbeat type=Filesystem)
>>   Attributes: device=/dev/drbd0 directory=/var/www/html fstype=ext3
>>   Operations: start interval=0s timeout=60 (webserver_fs-start-timeout-60)
>>               stop interval=0s timeout=60 (webserver_fs-stop-timeout-60)
>>               monitor interval=20 timeout=40 (webserver_fs-monitor-interval-20)
>>
>> Stonith Devices:
>> Fencing Levels:
>>
>> Location Constraints:
>>   Resource: webserver
>>     Enabled on: node01 (score:50) (id:location-webserver-node01-50)
>>     Enabled on: master (score:50) (id:location-webserver-master-50)
>> Ordering Constraints:
>>   start virtual_ip then start webserver (kind:Mandatory) (id:order-virtual_ip-webserver-mandatory)
>>   start webserver_fs then start webserver (kind:Mandatory) (id:order-webserver_fs-webserver-mandatory)
>> Colocation Constraints:
>>   webserver with virtual_ip (score:INFINITY) (id:colocation-webserver-virtual_ip-INFINITY)
>>   webserver_fs with webserver_data_sync (score:INFINITY) (with-rsc-role:Master) (id:colocation-webserver_fs-webserver_data_sync-INFINITY)
>>
>> Cluster Properties:
>>  cluster-infrastructure: corosync
>>  cluster-name: cluster_web
>>  dc-version: 1.1.12-a14efad
>>  have-watchdog: false
>>  no-quorum-policy: ignore
>>  stonith-enabled: false
>
>> [root@master Desktop]# drbdadm dump
>> # /etc/drbd.conf
>> global {
>>     usage-count yes;
>>     cmd-timeout-medium 600;
>>     cmd-timeout-long 0;
>> }
>>
>> common {
>>     net {
>>         protocol C;
>>     }
>> }
>>
>> # resource drbd0 on master: not ignored, not stacked
>> # defined at /etc/drbd.d/drbd0.res:1
>> resource drbd0 {
>>     on master {
>>         volume 0 {
>>             device       /dev/drbd0 minor 0;
>>             disk         /dev/vg_drbd0/lv_drbd0;
>>             meta-disk    internal;
>>         }
>>         address          ipv4 192.168.73.131:7789;
>>     }
>>     on slave {
>>         volume 0 {
>>             device       /dev/drbd0 minor 0;
>>             disk         /dev/vg_drbd0/lv_drbd0;
>>             meta-disk    internal;
>>         }
>>         address          ipv4 192.168.73.132:7789;
>>     }
>> }
>
>> [root@master Desktop]# pcs status
>> Cluster name: cluster_web
>> Last updated: Fri Jun 26 03:04:18 2015
>> Last change: Fri Jun 26 02:13:11 2015
>> Stack: corosync
>> Current DC: master (1) - partition with quorum
>> Version: 1.1.12-a14efad
>> 2 Nodes configured
>> 5 Resources configured
>>
>> Online: [ master slave ]
>>
>> Full list of resources:
>>
>>  virtual_ip    (ocf::heartbeat:IPaddr2):       Started master
>>  webserver     (ocf::heartbeat:apache):        Stopped
>>  Master/Slave Set: webserver_data_sync [webserver_data]
>>      Masters: [ master ]
>>      Slaves: [ slave ]
>>  webserver_fs  (ocf::heartbeat:Filesystem):    Stopped
>>
>> Failed actions:
>>     webserver_fs_start_0 on master 'unknown error' (1): call=23, status=complete,
>>         exit-reason='Couldn't mount filesystem /dev/drbd0 on /var/www/html',
>>         last-rc-change='Fri Jun 26 02:20:45 2015', queued=0ms, exec=87ms
>>     webserver_fs_start_0 on slave 'unknown error' (1): call=23, status=complete,
>>         exit-reason='Couldn't mount filesystem /dev/drbd0 on /var/www/html',
>>         last-rc-change='Fri Jun 26 02:20:45 2015', queued=0ms, exec=79ms
>>
>> PCSD Status:
>>   master: Online
>>   slave: Online
>>
>> Daemon Status:
>>   corosync: active/enabled
>>   pacemaker: active/enabled
>>   pcsd: active/enabled
>
> cat /proc/drbd
>> version: 8.4.6 (api:1/proto:86-101)
>> GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R7, 2015-04-10 05:13:52
>>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
>>     ns:0 nr:0 dw:0 dr:216 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>
> I know it might be a silly configuration issue, but any help would be
> greatly appreciated.

Hi,

I've only skimmed the configuration, but what I noticed is that there seems
to be no order constraint saying that the filesystem may only be mounted
after one of the DRBD peers has transitioned to the Master role. I suspect
that Pacemaker tries to mount the filesystem before the corresponding DRBD
resource has been promoted to primary.

Also, I only see two colocation rules: one ties the webserver to the
virtual_ip, and one ties webserver_fs to the Master of webserver_data_sync,
but none tells Pacemaker to run the IP or the webserver on the same node
where webserver_fs is running. So according to this configuration it would
be valid to run the IP and webserver on one node and the DRBD primary with
webserver_fs on the other.

Regards,
Dennis
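
P.S.: Untested, but with the resource names from your config the missing
pieces could be added roughly like this (pcs syntax; the exact form may
vary a bit between pcs releases, see `pcs constraint --help` on your nodes):

  # only mount the filesystem after DRBD has been promoted on that node
  pcs constraint order promote webserver_data_sync then start webserver_fs

  # keep the IP (and with it the webserver, which is already colocated
  # with the IP) on the node where the filesystem is mounted
  pcs constraint colocation add virtual_ip with webserver_fs INFINITY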