[DRBD-user] Having Trouble with LVM on DRBD

Igor Cicimov icicimov at gmail.com
Sat Feb 27 03:36:24 CET 2016

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On Sat, Feb 27, 2016 at 11:18 AM, Eric Robinson <eric.robinson at psmnv.com>
wrote:

> Sadly, it still isn’t working.
>
> Here is my crm config...
>
> node ha13a
> node ha13b
> primitive p_drbd0 ocf:linbit:drbd \
>         params drbd_resource=ha01_mysql \
>         op monitor interval=31s role=Slave \
>         op monitor interval=30s role=Master
> primitive p_drbd1 ocf:linbit:drbd \
>         params drbd_resource=ha02_mysql \
>         op monitor interval=29s role=Slave \
>         op monitor interval=28s role=Master
> primitive p_fs_clust17 Filesystem \
>         params device="/dev/vg_drbd0/lv_drbd0" directory="/ha01_mysql" \
>         fstype=ext3 options=noatime
> primitive p_fs_clust18 Filesystem \
>         params device="/dev/vg_drbd1/lv_drbd1" directory="/ha02_mysql" \
>         fstype=ext3 options=noatime
> primitive p_lvm_drbd0 LVM \
>         params volgrpname=vg_drbd0
> primitive p_lvm_drbd1 LVM \
>         params volgrpname=vg_drbd1
> primitive p_vip_clust17 IPaddr2 \
>         params ip=192.168.9.104 cidr_netmask=32 \
>         op monitor interval=30s
> primitive p_vip_clust18 IPaddr2 \
>         params ip=192.168.9.105 cidr_netmask=32 \
>         op monitor interval=30s
> ms ms_drbd0 p_drbd0 \
>         meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
>         notify=true target-role=Master
> ms ms_drbd1 p_drbd1 \
>         meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
>         notify=true target-role=Master
> colocation c_clust17 inf: p_vip_clust17 p_lvm_drbd0 ms_drbd0:Master
> colocation c_clust18 inf: p_vip_clust18 p_lvm_drbd1 ms_drbd1:Master
> order o_clust17 inf: ms_drbd0:promote p_lvm_drbd0:start p_vip_clust17
> order o_clust18 inf: ms_drbd1:promote p_lvm_drbd1:start p_vip_clust18
> property cib-bootstrap-options: \
>         dc-version=1.1.11-97629de \
>         cluster-infrastructure="classic openais (with plugin)" \
>         no-quorum-policy=ignore \
>         stonith-enabled=false \
>         maintenance-mode=false \
>         expected-quorum-votes=2 \
>         last-lrm-refresh=1456529727
> # vim: set filetype=pcmk:
>
> Here is what my filter looks like...
>
>     filter = [ "a|/dev/sda*|", "a|/dev/drbd*|", "r|.*|" ]
>     write_cache_state = 0
>     volume_list = [ "vg00", "vg_drbd0", "vg_drbd1" ]
>
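A quick way to sanity-check the filter (a minimal sketch, assuming the snippet
above is the active /etc/lvm/lvm.conf on both nodes):

# only the PVs you expect (your sda and drbd devices) should be listed
pvs -o pv_name,vg_name
# the DRBD-backed VGs should be seen through /dev/drbdX, not the backing disk
vgs -o vg_name,pv_name vg_drbd0 vg_drbd1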
>
> Here is what lvdisplay shows on node ha13a...
>
>  [root@ha13a ~]# lvdisplay
>   --- Logical volume ---
>   LV Path                /dev/vg00/lv00
>   LV Name                lv00
>   VG Name                vg00
>   LV UUID                BfYyBv-VPNI-2f5s-0kVZ-AoSr-dGcY-gojAzs
>   LV Write Access        read/write
>   LV Creation host, time ha13a.mycharts.md, 2014-01-23 03:38:38 -0800
>   LV Status              available
>   # open                 1
>   LV Size                78.12 GiB
>   Current LE             20000
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:0
>
>   --- Logical volume ---
>   LV Path                /dev/vg_drbd1/lv_drbd1
>   LV Name                lv_drbd1
>   VG Name                vg_drbd1
>   LV UUID                HLVYSz-mZbQ-rCUm-OMBg-a1G9-vqdg-FwRp5S
>   LV Write Access        read/write
>   LV Creation host, time ha13b, 2016-02-26 13:48:51 -0800
>   LV Status              NOT available
>   LV Size                1.00 TiB
>   Current LE             262144
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>
>   --- Logical volume ---
>   LV Path                /dev/vg_drbd0/lv_drbd0
>   LV Name                lv_drbd0
>   VG Name                vg_drbd0
>   LV UUID                2q0e0v-P2g1-inu4-GKDN-cTyn-e2L7-jCJ1BY
>   LV Write Access        read/write
>   LV Creation host, time ha13a, 2016-02-26 13:48:06 -0800
>   LV Status              available
>   # open                 1
>   LV Size                1.00 TiB
>   Current LE             262144
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:1
>
> And here is what it shows on ha13b...
>
> [root@ha13b ~]# lvdisplay
>   --- Logical volume ---
>   LV Path                /dev/vg_drbd1/lv_drbd1
>   LV Name                lv_drbd1
>   VG Name                vg_drbd1
>   LV UUID                HLVYSz-mZbQ-rCUm-OMBg-a1G9-vqdg-FwRp5S
>   LV Write Access        read/write
>   LV Creation host, time ha13b, 2016-02-26 13:48:51 -0800
>   LV Status              available
>   # open                 1
>   LV Size                1.00 TiB
>   Current LE             262144
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:1
>
>   --- Logical volume ---
>   LV Path                /dev/vg_drbd0/lv_drbd0
>   LV Name                lv_drbd0
>   VG Name                vg_drbd0
>   LV UUID                2q0e0v-P2g1-inu4-GKDN-cTyn-e2L7-jCJ1BY
>   LV Write Access        read/write
>   LV Creation host, time ha13a, 2016-02-26 13:48:06 -0800
>   LV Status              NOT available
>   LV Size                1.00 TiB
>   Current LE             262144
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>
>   --- Logical volume ---
>   LV Path                /dev/vg00/lv00
>   LV Name                lv00
>   VG Name                vg00
>   LV UUID                lIJWiz-2Y9j-cq2G-Ie4f-9wVK-xJbu-2s1f23
>   LV Write Access        read/write
>   LV Creation host, time ha13b.mycharts.md, 2014-01-23 10:01:36 -0800
>   LV Status              available
>   # open                 1
>   LV Size                78.12 GiB
>   Current LE             20000
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:0
>
> And here is crm_mon...
>
> Last updated: Fri Feb 26 16:16:30 2016
> Last change: Fri Feb 26 16:12:39 2016
> Stack: classic openais (with plugin)
> Current DC: ha13a - partition with quorum
> Version: 1.1.11-97629de
> 2 Nodes configured, 2 expected votes
> 10 Resources configured
>
>
> Online: [ ha13a ha13b ]
>
>  Master/Slave Set: ms_drbd0 [p_drbd0]
>      Masters: [ ha13a ]
>      Slaves: [ ha13b ]
>  Master/Slave Set: ms_drbd1 [p_drbd1]
>      Masters: [ ha13a ]
>      Slaves: [ ha13b ]
> p_vip_clust17   (ocf::heartbeat:IPaddr2):       Started ha13a
> p_fs_clust17    (ocf::heartbeat:Filesystem):    Started ha13a
> *p_fs_clust18    (ocf::heartbeat:Filesystem):    Started ha13b*
>

^^^^^^^ How could this happen without starting the LV first? Nothing in the
config ties p_fs_clust18 (or p_fs_clust17) to the LVM or DRBD resources, so
Pacemaker is free to start the filesystems anywhere.

> p_lvm_drbd0     (ocf::heartbeat:LVM):   Started ha13a
> *p_lvm_drbd1     (ocf::heartbeat:LVM):   FAILED ha13b (unmanaged)*

^^^^^^^^ And this one failed, so p_fs_clust18 should never have been allowed
to start on ha13b in the first place.

>
> Failed actions:
>     p_lvm_drbd1_stop_0 on ha13b 'unknown error' (1): call=104,
> status=complete, last-rc-change='Fri Feb 26 16:12:29 2016', queued=0ms,
> exec=10447ms
>     p_lvm_drbd1_stop_0 on ha13b 'unknown error' (1): call=104,
> status=complete, last-rc-change='Fri Feb 26 16:12:29 2016', queued=0ms,
> exec=10447ms
>     p_fs_clust17_start_0 on ha13b 'not installed' (5): call=110,
> status=complete, last-rc-change='Fri Feb 26 16:12:40 2016', queued=0ms,
> exec=46ms
>
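Note the failed stop above: with stonith-enabled=false a failed stop cannot be
recovered by fencing, so Pacemaker leaves the resource unmanaged (that is the
"FAILED ha13b (unmanaged)" you see) and will not touch it again until the
failure is cleared. A minimal crmsh sketch, using the resource names from your
output:

# clear the failure records so the resources become manageable again
crm resource cleanup p_lvm_drbd1
crm resource cleanup p_fs_clust17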
>
Can you please try the following constraints instead of the ones you have:

group g_drbd0 p_lvm_drbd0 p_fs_clust17 p_vip_clust17
group g_drbd1 p_lvm_drbd1 p_fs_clust18 p_vip_clust18
colocation c_clust17 inf: g_drbd0 ms_drbd0:Master
colocation c_clust18 inf: g_drbd1 ms_drbd1:Master
order o_clust17 inf: ms_drbd0:promote g_drbd0:start
order o_clust18 inf: ms_drbd1:promote g_drbd1:start
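
In case it helps, one way to get there from your existing configuration (a
sketch, assuming crmsh; the old constraints have to go first so they do not
conflict with the groups):

crm configure delete c_clust17 c_clust18 o_clust17 o_clust18
crm configure group g_drbd0 p_lvm_drbd0 p_fs_clust17 p_vip_clust17
crm configure group g_drbd1 p_lvm_drbd1 p_fs_clust18 p_vip_clust18
crm configure colocation c_clust17 inf: g_drbd0 ms_drbd0:Master
crm configure colocation c_clust18 inf: g_drbd1 ms_drbd1:Master
crm configure order o_clust17 inf: ms_drbd0:promote g_drbd0:start
crm configure order o_clust18 inf: ms_drbd1:promote g_drbd1:start

Or simply make the whole change in one go with "crm configure edit".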



> I've tried cleaning up all the resources, but this is what I get.
> Sometimes if I mess around enough, I can get everything up, but as soon as
> I try to fail over one of the cluster IPs or filesystems, the whole thing
> goes to crap.
>
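Once the groups are in place, a controlled failover test is just a matter of
moving a group and watching crm_mon (again a crmsh sketch):

# move everything that belongs to drbd0 over to ha13b, then check the status
crm resource migrate g_drbd0 ha13b
crm_mon -1
# drop the location constraint that migrate created, once you are happy
crm resource unmigrate g_drbd0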
> Do you see any potential causes?
>
> --Eric
>
>
>