[DRBD-user] DRBD: failover when sync connection dies?

Martin Gombac martin at isg.si
Thu Dec 13 15:07:26 CET 2007


On 2007.12.13, at 14:42, Dominik Klein wrote:

>>> If the local crossover connection fails - what would be different  
>>> on the other node?
>>>
>>> I think it's sane to leave everything as is.
>>>
>>> Or I understood your setup wrong or you explained it wrong :)
>>>
>> If the LAN/crossover fails, then:
>> - DRBD doesn't sync anymore. Each node then has one up-to-date
>> resource, and if I bring one node down to replace the failed
>> network card, the out-of-date resource on the other node comes up
>> as primary. When I start the node with the replaced network card
>> again, split brain would occur. Correct me if I'm wrong.
>> - webmail and other apps stop working, since they use the internal
>> LAN/crossover network for communication (faster, less busy). I
>> can work around this by using the external network (IPs)
>> whenever communicating (mysql, smtp, imap, ...) with the other node,
>> but it's less efficient and still doesn't fix the split brain
>> problem above.
>> I hope I explained it :-)
>
> Your words are not clear. A crossover connection is usually  
> considered a one on one connection between two computers. That's  
> why I asked what would be different.
>
> What you describe sounds more like you have two networks, an
> external and an internal one.
>
> To prevent your situation, you need to outdate your DRBD peer or
> use STONITH.
>
> Read:
> http://blogs.linbit.com/florian/2007/10/01/an-underrated-cluster-admins-companion-dopd/
> http://www.linux-ha.org/STONITH
>
> Regards
> Dominik
>

Exactly, I have two network cards in each server. As I stated in my
first mail:
"
  In each server there are two network cards. One for internet access
(WAN) and one for internal communication and synchronization (LAN).
If one node loses its internet connection (WAN), the other takes over
all resources as expected, but this does not happen with the local
connection (LAN), which is just a crossover UTP cable.
"

Anyway, I think I will have a look at dopd and STONITH. But this is
then not yet a failover cluster in its true sense. If I remember
correctly, dopd will just mark the out-of-sync resources so that they
won't mount. So the cluster will not work fully (will not fail over)
when one server is down, until I replace the broken network card and
sync the data by hand.
In my opinion, if the sync connection (LAN/crossover) fails, one node
should take over all resources immediately (as happens when the WAN
goes down, achieved by ipfail). Then we would not need dopd, and we
would also not need to sync the resource data by hand when the node
with the broken network card comes back, since the working node would
have the right data on both resources.
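
To make the dopd trade-off discussed above concrete, here is a minimal
sketch of the outdating idea. It is a toy model only, assuming a single
resource and a heartbeat path (for example the WAN) that may or may not
survive the crossover failure. The names Node and replication_link_lost
are hypothetical and are not DRBD or Heartbeat APIs.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy model of one cluster node holding one replicated resource."""
    name: str
    role: str = "Secondary"   # "Primary" or "Secondary"
    disk: str = "UpToDate"    # "UpToDate" or "Outdated"

    def promote(self) -> bool:
        """Refuse promotion when the local copy is marked Outdated."""
        if self.disk == "Outdated":
            return False
        self.role = "Primary"
        return True


def replication_link_lost(primary: Node, secondary: Node,
                          heartbeat_path_alive: bool) -> None:
    """Hypothetical stand-in for the outdating step: if another
    communication path still works, the secondary's copy is marked
    Outdated. If no path is left, nothing can be marked."""
    if heartbeat_path_alive:
        secondary.disk = "Outdated"


# Crossover cable fails while node-a is primary; the WAN still works.
a = Node("node-a", role="Primary")
b = Node("node-b")
replication_link_lost(a, b, heartbeat_path_alive=True)

# node-a is shut down to replace the network card.
a.role = "Secondary"

# node-b may hold stale data, so it must not take over.
print(b.promote())  # False
```

In this model the outdated node refuses to take over, which avoids
split brain but also means there is no failover until the link is
repaired and the data has been resynchronized. If the heartbeat path is
dead as well, nothing gets marked and promotion is not blocked, which is
the case STONITH is meant to cover.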



