[DRBD-user] DRBD: failover when sync connection dies?
lars.ellenberg at linbit.com
Wed Dec 19 17:00:13 CET 2007
On Wed, Dec 19, 2007 at 03:58:33PM +0100, Martin Gombac wrote:
> On 2007.12.18, at 15:52, Lars Ellenberg wrote:
> >On Mon, Dec 17, 2007 at 10:55:02AM +0100, Martin Gombac wrote:
> >>On 2007.12.13, at 18:24, Martin Gombac wrote:
> >>>On 2007.12.13, at 17:42, Florian Haas wrote:
> >>>>On Thursday 13 December 2007 13:45:47 Martin Gombac wrote:
> >>>>>without synchronized drbd resources (amongst other things).
> >>>>>My question is this:
> >>>>>How can i make one node take over all resources if local crossover
> >you don't want to.
> But i do. :-)
> See below why.
> >if your lan connection dies,
> >and your lan connection was your replication link,
> >then you don't have replication anymore,
> >and so you would go online with non-current data.
> Data a couple of seconds old would come up on the second node, i agree.
> But it's way better than scenario described below:
> So both nodes have secondary/slave drbd resource out of sync. Since
> my replication link probably died due to a broken network card i have
> to take the node with broken card down. In this case the second
> (healthy) node would come up with _really_ out of date data for
> second resource or not at all if i used dopd. Which means we don't
> have a true fail-over cluster and would have unnecessary downtime.
> But we use clustering in the first place to avoid downtime.
> (I also have other applications communicating over this link, but can
> easily make them use WAN.)
> >if currently your LAN connection is a direct "crossover cable",
> >why would you think any clients would benefit from failing over?
> There would be very little downtime (if it fails over to the healthy
> node), no outdated drbd resources (as opposed to one outdated if it
> doesn't fail over), and as soon as i fix the network card,
> synchronization would be auto-magic. Losing a couple of seconds of
> data is in my opinion much better than having at least half an hour
> of downtime when i shut down the broken server.
> >if you change to a switched LAN, and add a ping node,
> >why do you think any clients would benefit from that?
> We would know on which node the network failed and on which it works,
> so fail-over would be in the right direction. Later on i would fix
> the node with the broken network card or whatever, plug it back in,
> and it would come back to the cluster (sync and all). Clients would
> benefit by not having any downtime, as they would have using dopd.
> >how can you be sure what component failed,
> > local NIC, cables, remote NIC, switch, driver, ...?
> If the switch or ping node failed, both nodes would detect that and
> no fail-over would happen. I would replace the switch and
> synchronization would start auto-magically. If there were a failure
> on a local network interface, cable or driver, one node would still
> get the pings back, so we would know on which side it failed and the
> healthy node would take over thanks to heartbeat.
> >what problem are you trying to solve?
> > I mean not "failing over when the LAN link dies".
> > please zoom out a little.
> To have as little downtime as possible if the replication link fails.
> I think i explained this a couple of times by now. If data gets
> outdated on one node and the other has to be taken down for repairs
> => service offline.
reconfigure drbd to use the "outside" link,
wait for resync, switch over, take down the server
with the broken local nic, fix it, reconfigure drbd to use
"local" link again, be happy.
preferably also use bonding for "local" link,
so you can lose one NIC without losing the link.
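
as a rough illustration of the "wait for resync" step above: a minimal
sketch that decides from one device line of /proc/drbd whether it looks
safe to switch over. the function name and the sample lines are made up
for illustration, and it assumes the usual "cs:" (connection state) and
"ds:" (disk states) fields; check the output of your own drbd version
before relying on anything like this.

```python
def safe_to_switch_over(status_line: str) -> bool:
    """Return True if a /proc/drbd device line shows a connected,
    fully synced pair (hypothetical helper, not part of DRBD)."""
    # Collect the "key:value" tokens of the line, e.g. cs:Connected.
    fields = dict(
        token.split(":", 1) for token in status_line.split() if ":" in token
    )
    # Only switch over once the peers are connected and both disks
    # are reported as UpToDate (local/peer).
    return (
        fields.get("cs") == "Connected"
        and fields.get("ds") == "UpToDate/UpToDate"
    )


# Illustrative sample lines, modelled on /proc/drbd output.
synced = " 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---"
syncing = " 0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---"
cut_off = " 0: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---"

print(safe_to_switch_over(synced))
print(safe_to_switch_over(syncing))
print(safe_to_switch_over(cut_off))
```

the point of checking both fields is that "Connected" alone is not
enough: only when both disks are reported UpToDate does the healthy
node take over with current data.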
: Lars Ellenberg http://www.linbit.com :
: DRBD/HA support and consulting sales at linbit.com :
: LINBIT Information Technologies GmbH Tel +43-1-8178292-0 :
: Vivenotgasse 48, A-1120 Vienna/Europe Fax +43-1-8178292-82 :
please use the "List-Reply" function of your email client.