Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
[...]
> > My opinion is that if we are trying to make a reliable cluster, we must
> > take care of hanging processes in a way that the kernel and
> > whatever-other-subsystem developers may well consider incorrect -
> > terminate those processes forcibly even if they are in 'D'
> > uninterruptible state. But without powering the other node off.
>
> think about it. if you have the remote possibility that something is
> still accessing, i.e. modifying, your data on one node, the other node
> may not take over or you will get data corruption.
> so if one process hangs on one server, the other is not able to take over.
> without STONITH, then you have very reliable NO-availability. or, one
> node *thinks* the other is dead while actually it is still alive, and
> takes over regardless. without STONITH you have data corruption.
>
> and, btw, you *cannot* terminate a process
> which hangs in uninterruptible state.
>
> so if the operator (or cluster manager) tells the node to *stop* all
> resources, and the node does not succeed in doing so, and after a
> certain timeout it still does not respond successfully to the stop
> request, then, to have availability, this node needs to commit suicide
> and hope for the other node to take over.
>
> the other issue is, if one node *thinks* that the other is dead,
> and they have shared data, it typically *needs* to STONITH it regardless,
> "just to make sure" that the peer really IS dead. if it was, then
> shooting a dead node is a noop. if it was NOT, but we thought it was,
> then the STONITH just saved our data.
>
> no matter how one would like to have it behave,
> if there is a possibility for it to misbehave (and there always is),
> this *is* the only way to ensure availability and avoid corruption.
>
> and since this is a generic issue, this needs to be handled at the
> generic level: the cluster manager.
>
> but these issues are better discussed on linux-ha [-dev],
> rather than drbd-user.
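The two fencing rules argued for above - a node that cannot stop its resources must self-fence, and a node taking over shared data must shoot the peer first - can be sketched as follows. This is a hedged illustration with hypothetical names, not actual Heartbeat/DRBD code:

```python
def stop_resources(stop_succeeded: bool) -> str:
    """A node ordered to stop its resources but unable to do so
    (e.g. a process stuck in 'D' state that cannot be killed)
    must self-fence so the peer can safely take over."""
    if stop_succeeded:
        return "stopped"
    return "suicide"  # power off and hope the peer takes over


def take_over(stonith) -> str:
    """Before touching shared data, always shoot the peer first:
    a no-op if it was really dead, corruption-prevention if not."""
    stonith()  # STONITH the peer "just to make sure"
    return "takeover"  # only now is it safe to modify shared data
```

The asymmetry is the point: a *suspicion* that the peer is dead is never enough; only a successful STONITH (or a successful stop confirmation) makes takeover safe.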
> From DRBD's point of view we could say: if the other node fails to
> release its resources, we simply abandon the connection to the peer node
> (= exclude it from the cluster, at least as far as DRBD is concerned).
> Of course this is a very DRBD-centric point of view, and not real-world
> suitable, since in the real world we also have a service address
> (= other resources that need to be released as well).
>
> -Philipp
> --
> : Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
> : LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
> : Schönbrunnerstr 244, 1120 Vienna, Austria  http://www.linbit.com :
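The point about the service address generalizes: a failover unit is a *group* of resources, and takeover is only safe once the peer has released all of them - dropping the DRBD connection alone still leaves the service IP claimed by the old node. A minimal sketch, with hypothetical resource names:

```python
# Hypothetical resource group for one failover unit; the names are
# illustrative, not taken from any actual DRBD/Heartbeat configuration.
RESOURCE_GROUP = ["drbd0", "filesystem", "service_ip"]


def peer_fully_released(released: set) -> bool:
    """Takeover is safe only when *every* resource in the group has
    been confirmed released on the peer, not just the DRBD device."""
    return all(resource in released for resource in RESOURCE_GROUP)
```

If any resource cannot be confirmed released, the cluster manager is back in the STONITH case above.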