[DRBD-user] Re: A big bug in datadisk

Lars Ellenberg Lars.Ellenberg at linbit.com
Tue May 11 14:09:46 CEST 2004

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


/ 2004-05-11 14:14:18 +0300
\ Dmitry Golubev:
> Hello, 
> > ah, you are referring to that three-liner patch from early May.
> > this does nothing useful but adds some "echo > /var/log/drbd.log" just
> > before every call of logger, for those who don't have the standard logger
> > binary at hand to interface with syslog...
> 
> Well, at least that 'standard logger' is absent in Debian... Are you sure it 
> is standard on all systems?

dpkg -S /usr/bin/logger
bsdutils: ...

maybe it is not installed on every Debian system,
but I think it should be. and even if it is not, all messages go to
STDOUT/STDERR, too, which is typically not connected to /dev/null, but
to some logfile ...
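
for illustration, the kind of fallback such a patch could do instead
(a sketch only; helper name is made up, this is not the actual datadisk
code):

  # log via syslog if logger(1) is there, otherwise append to a file
  log_msg() {
      if command -v logger >/dev/null 2>&1 ; then
          logger -t datadisk "$*"
      else
          echo "`date` datadisk: $*" >> /var/log/drbd.log
      fi
  }

  log_msg "switching device to primary"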

> > if you have it mounted, someone/something did the mount.
> > so that same someone needs to do the umount before you deactivate the
> > device. that's not at all DRBD's business.
> 
> Yes, they need to be unmounted. But then why do you need to kill processes? 

as a last resort if something is seriously wrong,
just to *try* to avoid suicide.
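
roughly the kind of last-resort step meant here (a sketch; the mount
point is made up, this is not the literal datadisk code):

  # kill whatever still keeps the filesystem busy, then unmount it
  fuser -km /mnt/drbddata 2>/dev/null
  sleep 1
  umount /mnt/drbddata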

> They should be stopped as well? But somehow you've rightly decided to make 
> that an option, and I think that to complete that process, internal mounts 
> should also be terminated.

if it can be done easily, yes.
but as mentioned, in drbd 0.7 we do not bother with mounts or users at
all. we just [try to] make the device primary/secondary on request.
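
in 0.7 terms that boils down to something like (a sketch; the resource
name "r0" is made up):

  drbdadm primary r0      # before you mount / start services
  # ... services run, filesystem mounted ...
  drbdadm secondary r0    # only after everything is stopped and unmounted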

> > that's why you have to start things in a certain order, and usually stop
> > things in reverse order. you obviously need to properly pair starts and
> > stops.
> 
> Well, then something else is broken, since my scripts are correct, and they 
> should unmount the filesystems. They have been checked and rechecked, and in 
> my opinion, if they were incorrect, my system would not be able to start at 
> all... Naive me :)))
> 
> > be aware that *all* operations may hang or block, for whatever reason.
> > so if you want to migrate services, and you need to stop them on the
> > active node before you can start them on some other node, and the
> > stopping hangs, you have a problem. *every* cluster manager needs to be
> > able to cope with hanging resource scripts. typically the only way out
> > of this is STONITH.
> 
> No, STONITH is way too brutal and completely unacceptable for me, since I 
> have two virtual servers running on two physical servers - normally each 
> physical server is responsible for one of the logical servers, but in case 
> one of the physical ones dies, the second one will take over.
> 
> My opinion is that if we are trying to make a reliable cluster, we must take 
> care of hanging processes in a way that the kernel and whatever-other-subsystem 
> developers would call incorrect - terminate those processes forcibly even if 
> they are in 'D' uninterruptible state. But without powering the other node 
> off.

think about it. if there is even the remote possibility that something is
still accessing, i.e. modifying, your data on one node, the other node
must not take over, or you will get data corruption.
so if one process hangs on one server, the other is not able to take over.
without STONITH, you then have very reliable NO-availability.  or, one
node *thinks* the other is dead while actually it is still alive, and
takes over regardless.  without STONITH you have data corruption.

and, btw, you *cannot* terminate a process
which hangs in uninterruptible state.
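
you can see that for yourself (sketch):

  # a hung process shows state 'D' (uninterruptible sleep)
  ps -o pid,stat,wchan,cmd -p <pid>
  # SIGKILL only gets acted upon once the process leaves that sleep,
  # so this does nothing for you:
  kill -9 <pid>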

so if the operator (or cluster manager) tells the node to *stop* all
resources, and the node does not succeed in doing so, and after a
certain timeout it still does not respond successfully to the stop
request, then, to have availability, this node needs to commit suicide 
and hope for the other node to take over.
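
as a sketch of that logic (hypothetical names, hand-rolled timeout; a
real cluster manager does this part for you):

  rm -f /tmp/stop.done
  ( stop_all_resources ; touch /tmp/stop.done ) &
  i=0
  while [ $i -lt 60 ] ; do              # give the stop 60 seconds
      [ -e /tmp/stop.done ] && exit 0   # stop finished, all is well
      sleep 1 ; i=$((i+1))
  done
  # stop hangs: self-fence so the peer may safely take over
  # (assumes sysrq is enabled; 'b' reboots immediately, without sync)
  echo b > /proc/sysrq-trigger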

the other issue is, if one node *thinks* that the other is dead,
and they have shared data, it typically *needs* to STONITH it regardless
"just to make sure" that the peer really IS dead. if it was, then
shooting a dead node is a noop. if it was NOT, but we thought it was,
then the STONITH just saved our data.
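
and the takeover side, sketched (fence_peer stands in for whatever
STONITH mechanism you use; resource name, device and mount point are
made up):

  # claim the data only once the peer is known to be dead
  fence_peer other-node || exit 1   # could not shoot it -> do NOT take over
  drbdadm primary r0
  mount /dev/drbd0 /mnt/drbddata
  # ... now start the services that use the data ...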

No matter how one would like to have it behave,
if there is a possibility for it to misbehave (and there always is),
this *is* the only way to ensure availability and avoid corruption.

and since this is a generic issue, it needs to be handled at the
generic level: by the cluster manager.

but these issues are better discussed on linux-ha [-dev],
rather than drbd-user.

	Lars Ellenberg


