[DRBD-user] DRBD: failover when sync connection dies?

Paul Court pc at matrixonline.co.uk
Thu Dec 13 15:17:35 CET 2007



> On 2007.12.13, at 14:42, Dominik Klein wrote:
> 
>>>> If the local crossover connection fails - what would be different on 
>>>> the other node?
>>>>
>>>> I think it's sane to leave everything as is.
>>>>
>>>> Either I misunderstood your setup or you explained it wrong :)
>>>>
>>> If the LAN/crossover fails, then:
>>> - drbd doesn't sync anymore (so each node has one up-to-date resource, 
>>> and if I want to bring one node down to replace the failed network card, 
>>> the out-of-date resource on the other node comes up as primary. When 
>>> I start the node with the replaced network card again, split brain would 
>>> occur. Correct me if I'm wrong.)
>>> - webmail and other apps stop working since they use the internal network 
>>> (LAN/crossover) for communication (faster, less busy). (I can work around 
>>> this problem and just use the external network (IPs) whenever 
>>> communicating (mysql, smtp, imap, ...) with the other node, but it's less 
>>> efficient and still doesn't fix the split brain problem above.)
>>> I hope I explained it :-)
>>
>> Your description is not clear. A crossover connection is usually considered 
>> a one-to-one connection between two computers. That's why I asked what 
>> would be different.
>>
>> What you describe sounds more like you have two networks, an external and 
>> an internal one.
>>
>> To prevent your situation, you need to outdate your drbd peer or use 
>> STONITH.
>>
>> Read:
>> http://blogs.linbit.com/florian/2007/10/01/an-underrated-cluster-admins-companion-dopd/ 
>>
>> http://www.linux-ha.org/STONITH
>>
>> Regards
>> Dominik
>>
> 
> Exactly, I have two network cards in each server. As I stated in my 
> first mail:
> "
>  In each server there are two network cards. One for internet access 
> (WAN) and one for internal communication and synchronization (LAN). If 
> one node loses its internet connection (WAN), the other takes over all 
> resources as expected, but this does not happen with the local connection 
> (LAN), which is just a crossover UTP cable.
> "
> 
> Anyway, I think I will have a look at dopd and STONITH. But this is then 
> not yet a failover cluster in its true sense. If I remember correctly, 
> dopd will just mark the out-of-sync resources so that they won't mount. 
> So the cluster will not work fully (will not fail over) when one server 
> is down, until I replace the broken network card and sync the data by hand.
> In my opinion, if the sync connection (LAN/crossover) fails, one node should 
> take over all resources immediately (as happens when the WAN goes down, 
> achieved by ipfail). Then we would not need dopd, and also would not need 
> to sync the resource data by hand when the node with the broken network card 
> comes back, since the working node would have the right data on both resources.
> 

If you want to protect against the failure of the network card, then you 
will want to install a serial cable and configure heartbeat to also use 
the serial link. That way DRBD can use dopd to tell the other node (via 
heartbeat's serial link) that it has been outdated.
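
As a rough sketch of what that could look like (not a tested setup): the 
fragments below assume a null-modem cable on /dev/ttyS0, a crossover 
interface called eth1, a resource named r0, and the helper binaries living 
under /usr/lib/heartbeat (on 64-bit systems they may be under 
/usr/lib64/heartbeat instead). All of those names are assumptions; adjust 
them to your installation and check them against the documentation for your 
heartbeat and DRBD versions.

```text
# /etc/ha.d/ha.cf (heartbeat)
# Second, independent heartbeat path over the serial cable (assumed port)
serial /dev/ttyS0
baud 19200
# Existing heartbeat path over the crossover link (assumed interface name)
bcast eth1
# Start dopd together with heartbeat and allow it to use the heartbeat API
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster

# /etc/drbd.conf (DRBD 8.0-style syntax; r0 is a hypothetical resource name)
resource r0 {
  disk {
    # On loss of the replication link, try to outdate the peer's data
    fencing resource-only;
  }
  handlers {
    # Helper that asks dopd to mark the peer's disk as outdated
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
  }
}
```

With something like this in place, a node that loses the replication link 
can still reach its peer over the serial line, so the peer learns that its 
copy is outdated and will not come up as primary on stale data.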

Or, you can use NIC bonding to make your network cards redundant.
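
A minimal sketch of the bonding side, assuming a second NIC and a second 
crossover cable for the replication link and a bond device named bond0 (both 
are assumptions, and the file location and the way interfaces are enslaved 
differ between distributions):

```text
# /etc/modprobe.conf (or your distribution's equivalent)
# Load the bonding driver for bond0
alias bond0 bonding
# active-backup: one NIC carries traffic, the other takes over on failure
# miimon=100: check link state every 100 ms
options bond0 mode=active-backup miimon=100
```

The replication IP address then goes on bond0 instead of the single NIC, and 
the physical interfaces are attached to it as slaves (for example with 
ifenslave or your distribution's network scripts).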


