[DRBD-user] GFS2 freezes

Zohair Raza engineerzuhairraza at gmail.com
Mon Oct 29 15:26:47 CET 2012


On Mon, Oct 29, 2012 at 5:43 PM, Maurits van de Lande <
M.vandeLande at vdl-fittings.com> wrote:

> Hello,
>
> When one node unexpectedly shuts down, dlm locks down until quorum is
> regained AND the faulty node is fenced, before it can take over the cluster
> resources.
>
> I assume that you have set the “two_node” flag in cluster.conf
>

Yes, I have it set because I want a primary/primary setup.
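
For readers following along, the flag normally lives on the cman element. A minimal sketch, assuming a stock cman-style cluster.conf: the cluster name is taken from the GFS2 fsid in the quoted log, while config_version and the bare node entries are placeholders, not the actual file from this setup.

```xml
<?xml version="1.0"?>
<cluster name="cluster-setup" config_version="1">
  <!-- two_node="1" lets a single surviving node keep quorum;
       cman expects expected_votes="1" alongside it -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <!-- placeholder entries; real ones would also carry fence methods -->
    <clusternode name="node1" nodeid="1"/>
    <clusternode name="node2" nodeid="2"/>
  </clusternodes>
</cluster>
```

Note that two_node only settles the quorum question. dlm and GFS2 still wait for the failed node to be fenced, which matches the freeze described in the quoted message.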


> > Oct 29 08:05:59 node1 fenced[1401]: fence node2 dev 0.0 agent fence_ack_manual result: error from agent
> > Oct 29 08:05:59 node1 fenced[1401]: fence node2 failed
>
> I think that adding the following option to the dlm section in cluster.conf
>
> enable_fencing="0"
>
> might solve this problem (but I have not tested this). This will disable
> fencing.
>
I'll give it a try.
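
A sketch of where that option would go, assuming the same cman-style cluster.conf layout. The config_version value is a placeholder, and the effect is untested here, as noted in the quoted suggestion.

```xml
<cluster name="cluster-setup" config_version="2">
  <cman two_node="1" expected_votes="1"/>
  <!-- untested suggestion from this thread: dlm stops waiting for
       a successful fence before recovering its lockspaces -->
  <dlm enable_fencing="0"/>
  <!-- clusternodes section unchanged -->
</cluster>
```

With fencing disabled on a dual-primary DRBD device, nothing stops a node that is only unreachable (not dead) from continuing to write, so this is probably only reasonable on a throwaway test cluster.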


> Or you can set up fencing.
>
How can I do so?

I am testing this on two virtual machines on VMware Workstation. Do I need
fence_vmware for this?
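
For orientation, the general shape of a fencing configuration in cluster.conf is a fencedevice definition plus a per-node method that refers to it. The sketch below is only an illustration under assumptions: the device name, host address, credentials and VM names are hypothetical placeholders, and fence_vmware_soap is used as an example agent. That agent targets the ESX/vCenter SOAP API, and nothing in this thread establishes that it works against VMware Workstation.

```xml
<cluster name="cluster-setup" config_version="3">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="1">
          <!-- port = name of the VM to power off (placeholder) -->
          <device name="vmfence" port="node1-vm"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="1">
          <device name="vmfence" port="node2-vm"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- agent, address and credentials are placeholders, not tested values -->
    <fencedevice name="vmfence" agent="fence_vmware_soap"
                 ipaddr="vmware-host.example" login="fence-user"
                 passwd="secret" ssl="on"/>
  </fencedevices>
</cluster>
```

The fence_ack_manual errors in the quoted log suggest that no real fence device is configured at the moment. Once one is, running `fence_node node2` from node1 is a way to check that the agent can actually power the peer off.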



> Best regards,
>
> Maurits van de Lande
>
> From: drbd-user-bounces at lists.linbit.com [mailto:
> drbd-user-bounces at lists.linbit.com] On Behalf Of Zohair Raza
> Sent: Monday, 29 October 2012 11:03
> To: drbd-user at lists.linbit.com
> Subject: [DRBD-user] GFS2 freezes
>
> Hi,
>
> I have set up a Primary/Primary cluster with GFS2.
>
> Everything works fine if I shut down either node cleanly, but when I unplug
> the power of either node, GFS2 freezes and I cannot access the device.
>
> I tried to use http://people.redhat.com/lhh/obliterate
>
> This is what I see in the logs:
>
> Oct 29 08:05:41 node1 kernel: d-con res0: PingAck did not arrive in time.
> Oct 29 08:05:41 node1 kernel: d-con res0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
> Oct 29 08:05:41 node1 kernel: d-con res0: asender terminated
> Oct 29 08:05:41 node1 kernel: d-con res0: Terminating asender thread
> Oct 29 08:05:41 node1 kernel: d-con res0: Connection closed
> Oct 29 08:05:41 node1 kernel: d-con res0: conn( NetworkFailure -> Unconnected )
> Oct 29 08:05:41 node1 kernel: d-con res0: receiver terminated
> Oct 29 08:05:41 node1 kernel: d-con res0: Restarting receiver thread
> Oct 29 08:05:41 node1 kernel: d-con res0: receiver (re)started
> Oct 29 08:05:41 node1 kernel: d-con res0: conn( Unconnected -> WFConnection )
> Oct 29 08:05:41 node1 kernel: d-con res0: helper command: /sbin/drbdadm fence-peer res0
> Oct 29 08:05:41 node1 fence_node[1912]: fence node2 failed
> Oct 29 08:05:41 node1 kernel: d-con res0: helper command: /sbin/drbdadm fence-peer res0 exit code 1 (0x100)
> Oct 29 08:05:41 node1 kernel: d-con res0: fence-peer helper broken, returned 1
> Oct 29 08:05:48 node1 corosync[1346]:   [TOTEM ] A processor failed, forming new configuration.
> Oct 29 08:05:53 node1 corosync[1346]:   [QUORUM] Members[1]: 1
> Oct 29 08:05:53 node1 corosync[1346]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Oct 29 08:05:53 node1 corosync[1346]:   [CPG   ] chosen downlist: sender r(0) ip(192.168.23.128) ; members(old:2 left:1)
> Oct 29 08:05:53 node1 corosync[1346]:   [MAIN  ] Completed service synchronization, ready to provide service.
> Oct 29 08:05:53 node1 kernel: dlm: closing connection to node 2
> Oct 29 08:05:53 node1 fenced[1401]: fencing node node2
> Oct 29 08:05:53 node1 kernel: GFS2: fsid=cluster-setup:res0.0: jid=1: Trying to acquire journal lock...
> Oct 29 08:05:53 node1 fenced[1401]: fence node2 dev 0.0 agent fence_ack_manual result: error from agent
> Oct 29 08:05:53 node1 fenced[1401]: fence node2 failed
> Oct 29 08:05:56 node1 fenced[1401]: fencing node node2
> Oct 29 08:05:56 node1 fenced[1401]: fence node2 dev 0.0 agent fence_ack_manual result: error from agent
> Oct 29 08:05:56 node1 fenced[1401]: fence node2 failed
> Oct 29 08:05:59 node1 fenced[1401]: fencing node node2
> Oct 29 08:05:59 node1 fenced[1401]: fence node2 dev 0.0 agent fence_ack_manual result: error from agent
> Oct 29 08:05:59 node1 fenced[1401]: fence node2 failed
>
> Regards,
> Zohair Raza
>