Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Fri, Mar 24, 2017 at 7:19 PM, Raman Gupta <ramangupta16 at gmail.com> wrote:
> Hi All,
>
> I am having a problem where, in a GFS2 dual-Primary-DRBD Pacemaker
> Cluster, if a node crashes then the running node hangs! The CLVM commands
> hang, and the libvirt VM on the running node hangs.
>
> Env:
> ---------
> CentOS 7.3
> DRBD 8.4
> gfs2-utils-3.1.9-3.el7.x86_64
> Pacemaker 1.1.15-11.el7_3.4
> corosync-2.4.0-4.el7.x86_64
>
>
> Infrastructure:
> ------------------------
> 1) Running a 2-node Pacemaker Cluster with proper fencing between the two.
>    Nodes are server4 and server7.
>
> 2) Running DRBD dual-Primary and hosting a GFS2 filesystem.
>
> 3) Pacemaker has DLM and cLVM resources configured, among others.
>
> 4) A KVM/QEMU virtual machine is running on server4, which is holding the
>    cluster resources.
>
>
> Normal:
> ------------
> 5) In normal condition, when the two nodes are completely UP, things are
>    fine. The DRBD dual-primary works fine. The disk of the VM is hosted on
>    the DRBD mount directory /backup and the VM runs fine, with Live
>    Migration happily happening between the 2 nodes.
>
>
> Problem:
> ----------------
> 6) Stop server7 [shutdown -h now] ---> LVM commands like pvdisplay hang,
>    the VM runs only for 120s ---> After 120s DRBD/GFS2 panics
>    (/var/log/messages below) on server4, the DRBD mount directory (/backup)
>    becomes unavailable and the VM hangs on server4. DRBD itself is fine on
>    server4, in Primary/Secondary mode and WFConnection state.
>
> Mar 24 11:29:28 server4 crm-fence-peer.sh[54702]: invoked for vDrbd
> Mar 24 11:29:28 server4 crm-fence-peer.sh[54702]: WARNING drbd-fencing could not determine the master id of drbd resource vDrbd
> *Mar 24 11:29:28 server4 kernel: drbd vDrbd: helper command: /sbin/drbdadm fence-peer vDrbd exit code 1 (0x100)*
> *Mar 24 11:29:28 server4 kernel: drbd vDrbd: fence-peer helper broken, returned 1*

I guess this is the problem. Since the drbd fencing script fails, DLM will
hang to avoid resource corruption, because it has no information about the
status of the other node.
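The "could not determine the master id of drbd resource vDrbd" warning
usually means crm-fence-peer.sh could not find a Master/Slave resource for
vDrbd in the CIB, for example because DRBD is started outside of pacemaker.
Just as a rough sketch (the resource IDs below are made up, not taken from
your setup), managing DRBD from pacemaker with pcs would look something
like this:

    # DRBD primitive plus a master/slave clone that is promoted on both
    # nodes (dual-primary). crm-fence-peer.sh looks this up in the CIB.
    pcs resource create p_drbd_vDrbd ocf:linbit:drbd \
        drbd_resource=vDrbd \
        op monitor interval=29s role=Master \
        op monitor interval=31s role=Slave
    pcs resource master ms_drbd_vDrbd p_drbd_vDrbd \
        master-max=2 master-node-max=1 \
        clone-max=2 clone-node-max=1 notify=true

Whether that is really the cause here is hard to say without seeing your
pacemaker config.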
> Mar 24 11:32:01 server4 kernel: INFO: task kworker/8:1H:822 blocked for more than 120 seconds.
> Mar 24 11:32:01 server4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Mar 24 11:32:01 server4 kernel: kworker/8:1H D ffff880473796c18 0 822 2 0x00000080
> Mar 24 11:32:01 server4 kernel: Workqueue: glock_workqueue glock_work_func [gfs2]
> Mar 24 11:32:01 server4 kernel: ffff88027674bb10 0000000000000046 ffff8802736e9f60 ffff88027674bfd8
> Mar 24 11:32:01 server4 kernel: ffff88027674bfd8 ffff88027674bfd8 ffff8802736e9f60 ffff8804757ef808
> Mar 24 11:32:01 server4 kernel: 0000000000000000 ffff8804757efa28 ffff8804757ef800 ffff880473796c18
> Mar 24 11:32:01 server4 kernel: Call Trace:
> Mar 24 11:32:01 server4 kernel: [<ffffffff8168bbb9>] schedule+0x29/0x70
> Mar 24 11:32:01 server4 kernel: [<ffffffffa0714ce4>] drbd_make_request+0x2a4/0x380 [drbd]
> Mar 24 11:32:01 server4 kernel: [<ffffffff812e0000>] ? aes_decrypt+0x260/0xe10
> Mar 24 11:32:01 server4 kernel: [<ffffffff810b17d0>] ? wake_up_atomic_t+0x30/0x30
> Mar 24 11:32:01 server4 kernel: [<ffffffff812ee6f9>] generic_make_request+0x109/0x1e0
> Mar 24 11:32:01 server4 kernel: [<ffffffff812ee841>] submit_bio+0x71/0x150
> Mar 24 11:32:01 server4 kernel: [<ffffffffa063ee11>] gfs2_meta_read+0x121/0x2a0 [gfs2]
> Mar 24 11:32:01 server4 kernel: [<ffffffffa063f392>] gfs2_meta_indirect_buffer+0x62/0x150 [gfs2]
> Mar 24 11:32:01 server4 kernel: [<ffffffff810d2422>] ? load_balance+0x192/0x990
>
> 7) After server7 is UP, the Pacemaker Cluster is started, DRBD is started
>    and the Logical Volume is activated; only after that does the DRBD mount
>    directory (/backup) become available again on server4 and the VM resume
>    on server4. So from the moment server7 goes down until it is completely
>    UP again, the VM on server4 hangs.
>
> Can anyone help how to avoid the running node hanging when the other node
> crashes?
>
> Attaching DRBD config file.

Do you actually have fencing configured in pacemaker? Since you have the
drbd fencing policy set to "resource-and-stonith" you *must* have fencing
set up in pacemaker too. Have you also set no-quorum-policy="ignore" in
pacemaker? Maybe show us your pacemaker config as well, so we don't have to
guess (a rough sketch of a minimal stonith setup is at the bottom of this
mail).

Not related to the problem, but I would also add the "after-resync-target"
handler:

    handlers {
        ...
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }

> --Raman
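P.S.: In case it helps, a minimal stonith setup with pcs could look roughly
like the sketch below. The fence agent, addresses and credentials are only
placeholders for whatever out-of-band management your servers actually
have; none of them are taken from this thread.

    # One stonith device per node; fence_ipmilan is just an example agent.
    pcs stonith create fence_server4 fence_ipmilan \
        pcmk_host_list=server4 ipaddr=192.168.0.14 \
        login=admin passwd=secret op monitor interval=60s
    pcs stonith create fence_server7 fence_ipmilan \
        pcmk_host_list=server7 ipaddr=192.168.0.17 \
        login=admin passwd=secret op monitor interval=60s
    pcs property set stonith-enabled=true
    # For DLM/GFS2 clusters "freeze" is the commonly recommended
    # no-quorum-policy, rather than "ignore".
    pcs property set no-quorum-policy=freeze

Without working stonith, neither crm-fence-peer.sh nor DLM has a safe way
to decide that the dead peer is really gone, which matches the hang you are
seeing.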