[DRBD-user] To stonith or not to stonith?

Thu Sep 8 18:39:28 CEST 2005

[ Even though Alan Cc'ed his message to drbd-user I haven't seen it
  show up there yet. ]

On Tue, Sep 06, 2005 at 04:47:53PM -0600, Alan Robertson wrote:
> Dave Dykstra wrote:
> >On Tue, Aug 30, 2005 at 11:24:29AM +0200, Lars Ellenberg wrote:
> >>/ 2005-08-29 13:39:20 -0500
> >>\ Dave Dykstra:
> >>>(someone asked about why to use stonith if DRBD prevents corruption)
> >>>Drbd will prevent data corruption on its own, but stonith with drbd can
> >>>give you increased uptime because there are cases when a standby drbd or
> >>>heartbeat will refuse to take over until the formerly active one has been
> >>>proven to be shut down.
> >>which are: ... ?
> >>
> >>
> >>
> >>btw:
> >>we at LINBIT make sure that heartbeat has as many communication
> >>channels as possible, but try to avoid stonith in most deployments:
> >>we had cases where heartbeat would reboot one node, and might have
> >>stonithed the other at the same event -- not exactly heartbeat's fault,
> >>more "misbehaving resource agents", but still very annoying.
> >>
> >>we feel better if we automate as little as possible,
> >>though obviously as much as necessary or convenient.
> >>
> >>as far as I can see, stonith with drbd does not really buy you anything.
> >
> >You know better than I do, Lars, about the states that DRBD can get into,
> >but I know that heartbeat tries very hard to avoid split brain and doesn't
> >distinguish between whether it's using DRBD or not.   I initially tried
> >to get by without stonith but eventually came to the conclusion that I
> >needed it because failovers sometimes didn't happen properly.   Come to
> >think of it, it may be because if heartbeat dies on the active side but
> >DRBD doesn't, the takeover by heartbeat fails and I had assumed that a
> >stonith would clean that up.  As it turns out, DRBD still won't take over
> >immediately after a stonith, not until it times out, and that continues
> >to be a thorny issue that I've raised on both mailing lists and do not
> >yet have an answer for.
> 
> STONITH can keep both sides from becoming master.
> This requires human intervention to recover from.
> 
> It's not a happy circumstance.  And, STONITH avoids it.
> 
> BUT, it's not as serious as if it were true shared storage - in which 
> case all the online data is destroyed - an even less happy circumstance.
> 
> I don't see a huge problem caused by stonithing a node which has killed 
> itself.  But, maybe I missed something.
> 
> Regarding DRBD not taking over when it hasn't declared the other node 
> dead, I would think that a good solution might be to have DRBD wait up 
> to "drbd deadtime" seconds before giving up.
> 
> Since Heartbeat happily has no clue about DRBD (or its internal 
> deadtime), it would seem to be best dealt with by DRBD.
> 
> -- 
>     Alan Robertson <alanr at unix.sh>

That sounds like a reasonable solution to me.

In fact, I already see code to try a 'drbdadm primary' command 6
times with a one second sleep between each try in the 'start' case of
/etc/heartbeat/resource.d/drbddisk, and a comment saying that it is "in
case heartbeat deadtime was smaller than drbd ping time".   My situation
differs from the one that comment describes, in that heartbeat forcibly
knocked down the remote side and immediately took over, so the timeout
probably starts ticking only when the first 'drbdadm primary' command is
executed.
I'm using the default drbd "timeout" time of six seconds, so presumably
on the 7th or 8th try it would work.
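
To make the arithmetic above concrete, here is a rough model of that retry
loop. It is a sketch in Python, not the actual drbddisk code (which is a
shell script). The resource name "r0", the helper names, and the
FakePeerTimeout class are all made up for illustration, and the simulation
assumes, as guessed above, that the timeout starts counting at the first
attempt.

```python
import subprocess
import time


def run_drbdadm_primary(resource):
    """Run 'drbdadm primary <resource>'; assumes drbdadm is on PATH."""
    return subprocess.call(["drbdadm", "primary", resource]) == 0


def try_primary(resource, tries=6, delay=1.0,
                run=run_drbdadm_primary, sleep=time.sleep):
    """Try to become primary up to `tries` times, sleeping between tries.

    Returns the 1-based number of the attempt that succeeded, or None.
    """
    for attempt in range(1, tries + 1):
        if run(resource):
            return attempt
        if attempt < tries:
            sleep(delay)
    return None


class FakePeerTimeout:
    """Hypothetical stand-in for DRBD: promotion succeeds only once
    `timeout` seconds have passed since the first attempt."""

    def __init__(self, timeout=6.0):
        self.timeout = timeout
        self.now = 0.0  # simulated seconds since the first attempt

    def sleep(self, seconds):
        self.now += seconds

    def run(self, resource):
        return self.now >= self.timeout


# Six tries, one second apart, end at t=5s: too early for a 6s timeout.
sim = FakePeerTimeout(timeout=6.0)
six_tries = try_primary("r0", tries=6, run=sim.run, sleep=sim.sleep)

# With eight tries allowed, the attempt made at t=6s gets through.
sim = FakePeerTimeout(timeout=6.0)
eight_tries = try_primary("r0", tries=8, run=sim.run, sleep=sim.sleep)
```

Under those assumptions six_tries is None and eight_tries is 7, which
matches the "7th or 8th try" estimate.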

I think that doing multiple tries in the drbddisk command is a hack,
though, especially since it doesn't take into account any change to
the "timeout" parameter in drbd.conf. I think the
'drbdsetup primary' command (possibly with a new option that drbddisk
invokes) should try to contact the remote side and wait until there is
either a positive response or a timeout before it exits with an error.
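
As a rough illustration of the behavior I have in mind, here is a sketch
in Python of a wait that is bounded by the configured timeout instead of
a fixed number of tries. This is not how drbdsetup works today; the
function name, the `margin` and `poll` parameters, and the FakeClock
helper are all hypothetical, and reading the "timeout" value out of
drbd.conf is left out.

```python
import time


def wait_for_primary(resource, timeout, attempt, margin=1.0, poll=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Call attempt(resource) until it succeeds or the deadline passes.

    `timeout` is the DRBD timeout in seconds (assumed to be supplied by
    the caller); `margin` is a little slack added on top of it.
    Returns True on success, False if the deadline was reached.
    """
    deadline = clock() + timeout + margin
    while True:
        if attempt(resource):
            return True
        if clock() >= deadline:
            return False
        sleep(poll)


class FakeClock:
    """Simulated clock so the example runs without real delays."""

    def __init__(self):
        self.now = 0.0

    def clock(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# Simulated peer that is declared dead 6 seconds after the first attempt.
fc = FakeClock()
ok = wait_for_primary("r0", timeout=6.0,
                      attempt=lambda res: fc.now >= 6.0,
                      clock=fc.clock, sleep=fc.sleep)

# Simulated peer that never lets go: the wait gives up after 7 seconds.
fc2 = FakeClock()
gave_up = not wait_for_primary("r0", timeout=6.0,
                               attempt=lambda res: False,
                               clock=fc2.clock, sleep=fc2.sleep)
```

The point of the design is that the wait follows whatever timeout is
configured, so changing it in drbd.conf would not require touching
drbddisk.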

- Dave