[DRBD-user] after split brain

Lars Ellenberg Lars.Ellenberg at linbit.com
Mon Nov 27 10:21:53 CET 2006



/ 2006-11-26 20:04:40 -0300
\ Roberto Scattini:
> hi:
> 
> >> can anybody tell me if there is a way, after a split brain (caused by a
> >> network disconnection), to tell the new node that it should sync its
> >> data with the current primary node without having to resynchronize the
> >> whole device? (it is too slow! approx. 1 K per sec)

fix your network,
fix your configuration,
or whatever else is broken.
if your hardware can do better,
drbd can do better, too.
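
As a rough sanity check on the figures quoted further down (a 30GB
partition taking about 8 hours), the average resync rate works out to
roughly 1 MB/s, not 1 KB/s. A minimal sketch in shell arithmetic; the
helper name is made up for illustration, and "30GB" is assumed to mean
30 GiB:

```shell
#!/bin/sh
# Average throughput in KiB/s for SIZE_GIB gibibytes copied in HOURS hours.
# avg_rate_kib is a hypothetical helper, not part of DRBD.
avg_rate_kib() {
    size_gib=$1
    hours=$2
    # GiB -> KiB is a factor of 1024 * 1024; hours -> seconds is 3600.
    echo $(( size_gib * 1024 * 1024 / (hours * 3600) ))
}

avg_rate_kib 30 8    # figures from this thread; prints 1092
```

That is still slow for most hardware. The syncer rate setting in
drbd.conf limits resync bandwidth, so it is one of the configuration
items worth checking alongside the network itself.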

> >>
> >> what I'm doing now is a "drbdadm invalidate all", but it takes about
> >> 8 hours to resynchronize a 30GB partition...
> > Is this a test machine or a production machine? If it's under test, you
> > can do this with the drbdadm command.
> >
> 
> in the future, both will be "production machines", one as primary and the
> other as a failover server. the problem is that when i disconnect the
> network cable from the primary server, drbd times out and (approx. 30
> seconds later) heartbeat switches drbd over to the secondary server (which
> gets the state Primary/Unknown).

read up about split brain
and why multiple communication channels are a must.
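
For Heartbeat, "multiple communication channels" means more than one
bcast, ucast, mcast or serial path declared in ha.cf. A small sketch
that counts such lines; the function name is illustrative only, and the
check is deliberately crude:

```shell
#!/bin/sh
# Count heartbeat communication paths declared in an ha.cf-style file.
# count_hb_channels is a hypothetical helper; it only looks for lines
# that start with the bcast, ucast, mcast or serial directives.
# A single bcast line may list several interfaces, so the result is a
# lower bound on the number of paths, not an exact count.
count_hb_channels() {
    grep -cE '^[[:space:]]*(bcast|ucast|mcast|serial)[[:space:]]' "$1"
}
```

Assuming the usual location, something like
count_hb_channels /etc/ha.d/ha.cf would print the number of matching
lines. A result of 1 suggests that heartbeat has a single path between
the nodes, which is the situation that leads to split brain when that
one cable is pulled.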

> Then, when i reconnect the network cable,
> both nodes go into StandAlone mode. i would like the "old primary" to set
> its state to Secondary so that replication keeps working. i think
> that one of the possible solutions for this is to set the drbd timeout
> greater than the heartbeat timeout (i will test this).

NOoo.

you realize that there are more failure scenarios than
"you unplugging the network cable"?
what if the current primary really crashes?
what did you think the heartbeat "deadtime" was for?
how do you think the failover can work if you set the drbd timeout much
higher than the heartbeat deadtime?
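
To make the relationship concrete: the net timeout in drbd.conf is
given in tenths of a second, while the deadtime in ha.cf is given in
seconds by default. A hedged sketch of a check that DRBD gives up on the
peer before Heartbeat declares it dead; the helper name is invented, and
the strict "less than" comparison is one reading of the point above, not
an official rule:

```shell
#!/bin/sh
# drbd_timeout_tenths: value of "timeout" from drbd.conf (unit: 0.1 s)
# deadtime_secs:       value of "deadtime" from ha.cf (unit: seconds)
# Succeeds when the DRBD timeout is shorter than the heartbeat deadtime.
# timeouts_sane is a hypothetical helper, not part of DRBD or Heartbeat.
timeouts_sane() {
    drbd_timeout_tenths=$1
    deadtime_secs=$2
    [ "$drbd_timeout_tenths" -lt $(( deadtime_secs * 10 )) ]
}

# Example values: timeout 60 (6 seconds) against a 30 second deadtime.
if timeouts_sane 60 30; then
    echo "ok: drbd times out before the heartbeat deadtime"
else
    echo "warning: drbd timeout is not below the heartbeat deadtime"
fi
```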

> but anyway... how can i get the old primary to forget all its changes since
> the last time it was connected to the slave?

easiest way: drbdadm invalidate
(you get a full sync)

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


