[DRBD-user] Dual Primary Mode: Shared Directory blocked after node crash until reboot

Digimer lists at alteeve.ca
Tue May 12 09:44:59 CEST 2015

On 12/05/15 03:42 AM, DRBD User wrote:
>>>> Pacemaker's pcs property stonith-enabled is currently set to false
>> Well there's your problem. :)
> Since I don't have any (hardware) STONITH device, I have set stonith-enabled to false.
> DRBD's fencing policy is set to: 'fencing: resource-only'
> My goal is: if one node crashes, the other node should take over the work immediately. But in practice I have to wait for the crashed node to finish rebooting. I thought that in such a situation the surviving node (or rather the shared directory) would be usable immediately?
> Maybe I should use another fence script?
> I tried to create the resource with the operation option 'on-fail=restart', but with no success ...
> Any other suggestions?

You *CAN NOT* safely proceed when a node stops responding _until_ you
have put the lost node into a known state. To do otherwise would be to
risk a split-brain.

A good fence device is a switched PDU, like the APC-brand AP7900 (not
all makes/models are supported, so check first before buying other
brands). The AP7900 can usually be found used for ~$200 and makes an
excellent external fence device.

Trying to use DRBD without proper fencing will result in pain and
heartache. The delay needed to fence a lost node is FAR preferable to
risking a split-brain.
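
As a rough sketch of the DRBD side only, assuming DRBD 8.4 syntax, a
resource named "r0" (a placeholder, substitute your own), and the
handler scripts in the location where the DRBD packages usually
install them, the fencing policy would move from 'resource-only' to
'resource-and-stonith'. This only makes sense once stonith is enabled
and working in Pacemaker; check the paths and section names against
your installed version before using it.

```
resource r0 {                # "r0" is a placeholder resource name
  disk {
    # Freeze I/O on a disconnected primary and call the fence-peer
    # handler, instead of carrying on as 'resource-only' would.
    fencing resource-and-stonith;
  }
  handlers {
    # Script paths assume a stock package install; verify locally.
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

With this policy, I/O on the surviving node stays blocked until the
fence handler returns, so the length of the pause depends on how
quickly the cluster can fence the lost node.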

Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
