[DRBD-user] GFS2 - DualPrimaryDRBD hangs if a node Crashes

Raman Gupta ramangupta16 at gmail.com
Thu Mar 30 14:31:42 CEST 2017

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi,

I was able to integrate DRBD with Pacemaker and my problem was solved.
After this, no hang was observed on the running node after the other node
was shut down. DRBD was integrated as a Master/Slave resource in Pacemaker
with both nodes as Primary, since DRBD is running in dual-Primary mode.

Here is what I did:
1) Started Pacemaker cluster with DLM+CLVM integrated on both nodes
(server4 and server7).
2) Started DRBD, GFS2 on server4.
3) Integrated DRBD in Pacemaker on server4 using the commands below (the
shadow CIB is then pushed to the live cluster, as sketched after step 5):
pcs cluster cib drbd_cfg
pcs -f drbd_cfg resource create drbd_data ocf:linbit:drbd \
    drbd_resource=vDrbd op monitor interval=60s
pcs -f drbd_cfg resource master drbd_data_clone drbd_data master-max=2 \
    master-node-max=1 clone-max=2 clone-node-max=1 notify=true
4) Now shut down the other node (server7); the running node (server4) was
seen to be working fine. The VM on it did not stop, and cLVM commands also
worked fine.
5) Brought up server7, started Pacemaker on it and verified that neither
node hangs or crashes. The VM was also working fine.

Thanks a lot for everybody's help and suggestions! They have really helped me.

However:
1) I have not yet integrated GFS2 into Pacemaker and shall do it sometime
soon (a rough pcs sketch is below, after point 3).
2) In step 5 above I need to activate the LV using the command: lvchange -a y
/dev/DRBD_VolGroup/DRBD_LogicalVolume after starting the cluster. Not sure
whether this is required only because GFS2 has not yet been integrated into
Pacemaker, or whether it is always required. Shall experiment and publish
results.
3) If DRBD in step 2 is not yet started but the Pacemaker commands in step 3
are run (the first time I tried this), then the other node is fenced off,
with this error in /var/log/messages: server4 drbd(drbd_data)[12224]: ERROR:
meta parameter misconfigured, expected clone-max -le 2, but found unset.
So basically, if Pacemaker is configured for DRBD (step 3) before DRBD is
started (step 2), the other node is stonith'd. Thus I need to start the DRBD
resources first, before integrating Pacemaker with DRBD.


Here is my new Pacemaker Status with new resources highlighted:
[root at server7 ~]# pcs status
Cluster name: vCluster
Stack: corosync
Current DC: server7ha (version 1.1.15-11.el7_3.4-e174ec8) - partition with
quorum
Last updated: Thu Mar 30 17:29:11 2017          Last change: Wed Mar 29
21:09:19 2017 by root via cibadmin on server4ha

2 nodes and 9 resources configured

Online: [ server4ha server7ha ]

Full list of resources:

 vCluster-VirtualIP-10.168.10.199       (ocf::heartbeat:IPaddr2):
Started server7ha
 vCluster-Stonith-server7ha     (stonith:fence_ipmilan):        Started
server4ha
 vCluster-Stonith-server4ha     (stonith:fence_ipmilan):        Started
server7ha
 Clone Set: dlm-clone [dlm]
     Started: [ server4ha server7ha ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ server4ha server7ha ]
 *Master/Slave Set: drbd_data_clone [drbd_data]*
*     Masters: [ server4ha server7ha ]*

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root at server7 ~]#

Again, thanks for the help.

--Raman


On Sat, Mar 25, 2017 at 4:58 PM, Raman Gupta <ramangupta16 at gmail.com> wrote:

> Hi,
>
> Thanks for the detailed explanation and sample examples.
>
> I will work on suggestions about missing DRBD-Pacemaker, GFS2-Pacemaker
> configuration and re-check fencing configuration and I will let you know
> the results of my experiments.
>
> --Raman
>
>
> On Sat, Mar 25, 2017 at 5:48 AM, Igor Cicimov
> <igorc at encompasscorporation.com> wrote:
>
>>
>>
>> On 25 Mar 2017 11:00 am, "Igor Cicimov" <icicimov at gmail.com> wrote:
>>
>> Raman,
>>
>> On Sat, Mar 25, 2017 at 12:07 AM, Raman Gupta <ramangupta16 at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Thanks for looking into this issue. Here is my 'pcs status' and attached
>>> is cib.xml pacemaker file
>>>
>>> [root at server4 cib]# pcs status
>>> Cluster name: vCluster
>>> Stack: corosync
>>> Current DC: server7ha (version 1.1.15-11.el7_3.4-e174ec8) - partition
>>> with quorum
>>> Last updated: Fri Mar 24 18:33:05 2017          Last change: Wed Mar 22
>>> 13:22:19 2017 by root via cibadmin on server7ha
>>>
>>> 2 nodes and 7 resources configured
>>>
>>> Online: [ server4ha server7ha ]
>>>
>>> Full list of resources:
>>>
>>>  vCluster-VirtualIP-10.168.10.199       (ocf::heartbeat:IPaddr2):
>>> Started server7ha
>>>  vCluster-Stonith-server7ha     (stonith:fence_ipmilan):        Started
>>> server4ha
>>>  vCluster-Stonith-server4ha     (stonith:fence_ipmilan):        Started
>>> server7ha
>>>  Clone Set: dlm-clone [dlm]
>>>      Started: [ server4ha server7ha ]
>>>  Clone Set: clvmd-clone [clvmd]
>>>      Started: [ server4ha server7ha ]
>>>
>>> Daemon Status:
>>>   corosync: active/disabled
>>>   pacemaker: active/disabled
>>>   pcsd: active/enabled
>>> [root at server4 cib]#
>>>
>>>
>> This shows us the problem: you have not configured any DRBD resource in
>> Pacemaker, hence it has no knowledge of it and no control over it.
>>
>> This is from one of my clusters:
>>
>> Online: [ sl01 sl02 ]
>>
>>  p_fence_sl01 (stonith:fence_ipmilan): Started sl02
>>  p_fence_sl02 (stonith:fence_ipmilan): Started sl01
>> * Master/Slave Set: ms_drbd [p_drbd_r0]*
>> *     Masters: [ sl01 sl02 ]*
>>  Clone Set: cl_dlm [p_controld]
>>      Started: [ sl01 sl02 ]
>>  Clone Set: cl_fs_gfs2 [p_fs_gfs2]
>>      Started: [ sl01 sl02 ]
>>
>> You can see the resources you are missing in bold; more specifically, you
>> have missed configuring DRBD and its MS resource, plus the colocation and
>> ordering constraints. So the "resource-and-stonith" hook in your drbd
>> config will never work, since Pacemaker does not know about any drbd
>> resources.
>>
>> This is from one of my production clusters, it's on Ubuntu so no PCS just
>> CRM and I'm not using cLVM just DLM:
>>
>> primitive p_controld ocf:pacemaker:controld \
>> op monitor interval="60" timeout="60" \
>> op start interval="0" timeout="90" \
>> op stop interval="0" timeout="100" \
>> params daemon="dlm_controld" \
>> meta target-role="Started"
>> *primitive p_drbd_r0 ocf:linbit:drbd \*
>> * params drbd_resource="r0" adjust_master_score="0 10 1000 10000" \*
>> * op monitor interval="10" role="Master" \*
>> * op monitor interval="20" role="Slave" \*
>> * op start interval="0" timeout="240" \*
>> * op stop interval="0" timeout="100"*
>> *ms ms_drbd p_drbd_r0 \*
>> * meta master-max="2" master-node-max="1" clone-max="2"
>> clone-node-max="1" notify="true" interleave="true" target-role="Started"*
>> primitive p_fs_gfs2 ocf:heartbeat:Filesystem \
>> params device="/dev/drbd0" directory="/data" fstype="gfs2"
>> options="_netdev,noatime,rw,acl" \
>> op monitor interval="20" timeout="40" \
>> op start interval="0" timeout="60" \
>> op stop interval="0" timeout="60" \
>> meta is-managed="true"
>> clone cl_dlm p_controld \
>> meta globally-unique="false" interleave="true" target-role="Started"
>> clone cl_fs_gfs2 p_fs_gfs2 \
>> meta globally-unique="false" interleave="true" ordered="true"
>> target-role="Started"
>> colocation cl_fs_gfs2_dlm inf: cl_fs_gfs2 cl_dlm
>> *colocation co_drbd_dlm inf: cl_dlm ms_drbd:Master*
>> order o_dlm_fs_gfs2 inf: cl_dlm:start cl_fs_gfs2:start
>> *order o_drbd_dlm_fs_gfs2 inf: ms_drbd:promote cl_dlm:start
>> cl_fs_gfs2:start*
>>
>> I have excluded the fencing stuff for brevity and highlighted the
>> resources you are missing. Check the rest as well, though; you might find
>> something you can use or cross-check against your config.
>>
>> Also thanks to Digimer for the very useful information (as always) he
>> contributed explaining how things actually work.
>>
>>
>> Just noticed your gfs2 is out of pacemaker control; you need to sort that
>> out too.
>>
>>
>>
>>> On Fri, Mar 24, 2017 at 1:49 PM, Raman Gupta <ramangupta16 at gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am having a problem where, in a GFS2 dual-Primary DRBD Pacemaker
>>>> cluster, if a node crashes then the surviving node hangs! The cLVM
>>>> commands hang, and the libvirt VM on the surviving node hangs.
>>>>
>>>> Env:
>>>> ---------
>>>> CentOS 7.3
>>>> DRBD 8.4
>>>> gfs2-utils-3.1.9-3.el7.x86_64
>>>> Pacemaker 1.1.15-11.el7_3.4
>>>> corosync-2.4.0-4.el7.x86_64
>>>>
>>>>
>>>> Infrastructure:
>>>> ------------------------
>>>> 1) Running A 2 node Pacemaker Cluster with proper fencing between the
>>>> two. Nodes are server4 and server7.
>>>>
>>>> 2) Running DRBD dual-Primary and hosting GFS2 filesystem.
>>>>
>>>> 3) Pacemaker has DLM and cLVM resources configured among others.
>>>>
>>>> 4) A KVM/QEMU virtual machine is running on server4 which is holding
>>>> the cluster resources.
>>>>
>>>>
>>>> Normal:
>>>> ------------
>>>> 5) In normal condition when the two nodes are completely UP then things
>>>> are fine. The DRBD dual-primary works fine. The disk of VM is hosted on
>>>> DRBD mount directory /backup and VM runs fine with Live Migration happily
>>>> happening between the 2 nodes.
>>>>
>>>>
>>>> Problem:
>>>> ----------------
>>>> 6) Stop server7 [shutdown -h now] ---> LVM commands like pvdisplay
>>>> hang, the VM runs for only 120s ---> After 120s DRBD/GFS2 panics
>>>> (/var/log/messages below) on server4, the DRBD mount directory (/backup)
>>>> becomes unavailable and the VM hangs on server4. DRBD itself is fine on
>>>> server4, in Primary/Secondary mode and WFConnection state.
>>>>
>>>> Mar 24 11:29:28 server4 crm-fence-peer.sh[54702]: invoked for vDrbd
>>>> Mar 24 11:29:28 server4 crm-fence-peer.sh[54702]: WARNING drbd-fencing
>>>> could not determine the master id of drbd resource vDrbd
>>>> Mar 24 11:29:28 server4 kernel: drbd vDrbd: helper command:
>>>> /sbin/drbdadm fence-peer vDrbd exit code 1 (0x100)
>>>> Mar 24 11:29:28 server4 kernel: drbd vDrbd: fence-peer helper broken,
>>>> returned 1
>>>> Mar 24 11:32:01 server4 kernel: INFO: task kworker/8:1H:822 blocked for
>>>> more than 120 seconds.
>>>> Mar 24 11:32:01 server4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>> disables this message.
>>>> Mar 24 11:32:01 server4 kernel: kworker/8:1H    D ffff880473796c18
>>>> 0   822      2 0x00000080
>>>> Mar 24 11:32:01 server4 kernel: Workqueue: glock_workqueue
>>>> glock_work_func [gfs2]
>>>> Mar 24 11:32:01 server4 kernel: ffff88027674bb10 0000000000000046
>>>> ffff8802736e9f60 ffff88027674bfd8
>>>> Mar 24 11:32:01 server4 kernel: ffff88027674bfd8 ffff88027674bfd8
>>>> ffff8802736e9f60 ffff8804757ef808
>>>> Mar 24 11:32:01 server4 kernel: 0000000000000000 ffff8804757efa28
>>>> ffff8804757ef800 ffff880473796c18
>>>> Mar 24 11:32:01 server4 kernel: Call Trace:
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffff8168bbb9>] schedule+0x29/0x70
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffffa0714ce4>]
>>>> drbd_make_request+0x2a4/0x380 [drbd]
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffff812e0000>] ?
>>>> aes_decrypt+0x260/0xe10
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffff810b17d0>] ?
>>>> wake_up_atomic_t+0x30/0x30
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffff812ee6f9>]
>>>> generic_make_request+0x109/0x1e0
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffff812ee841>]
>>>> submit_bio+0x71/0x150
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffffa063ee11>]
>>>> gfs2_meta_read+0x121/0x2a0 [gfs2]
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffffa063f392>]
>>>> gfs2_meta_indirect_buffer+0x62/0x150 [gfs2]
>>>> Mar 24 11:32:01 server4 kernel: [<ffffffff810d2422>] ?
>>>> load_balance+0x192/0x990
>>>>
>>>> 7) After server7 is UP, the Pacemaker cluster is started, DRBD is
>>>> started and the Logical Volume is activated; only after that does the
>>>> DRBD mount directory (/backup) become available again on server4 and the
>>>> VM resume there. So from the moment server7 goes down until it is
>>>> completely UP again, the VM on server4 hangs.
>>>>
>>>>
>>>> Can anyone suggest how to avoid the running node hanging when the other
>>>> node crashes?
>>>>
>>>>
>>>> Attaching DRBD config file.
>>>>
>>>>
>>>> --Raman
>>>>
>>>>
>>>
>