[DRBD-user] DRBD: failover when sync connection dies?

Mon Dec 17 10:55:02 CET 2007

On 2007.12.13, at 18:24, Martin Gombac wrote:

> On 2007.12.13, at 17:42, Florian Haas wrote:
>
>> On Thursday 13 December 2007 13:45:47 Martin Gombac wrote:
>>> Hi,
>>>
>>> I have a simple fail-over heartbeat/drbd cluster set up. Each
>>> server has two network cards: one for internet access (WAN) and
>>> one for internal communication and synchronization (LAN). If one
>>> node loses its internet connection (WAN), the other takes over all
>>> resources as expected, but this does not happen with the local
>>> connection (LAN), which is just a crossover UTP cable.
>>> If the local connection (LAN) fails, the cluster does not migrate
>>> all resources to one of the nodes and just continues to work without
>>> synchronized drbd resources (amongst other things).
>>>
>>> My question is this:
>>> How can I make one node take over all resources if the local
>>> crossover connection fails?
>>
>> Give us your ha.cf, please.
>>
>> Florian
>>
>> -- : Florian G. Haas
>> : LINBIT Information Technologies GmbH
>> : Vivenotgasse 48, A-1120 Vienna, Austria
>
> crm no
>
>
> #
> #	There are lots of options in this file.  All you have to have is a set
> #	of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
> #	and a value for "auto_failback".
> #
> #	ATTENTION: As the configuration file is read line by line,
> #		   THE ORDER OF DIRECTIVE MATTERS!
> #
> #	In particular, make sure that the udpport, serial baud rate
> #	etc. are set before the heartbeat media are defined!
> #	debug and log file directives go into effect when they
> #	are encountered.
> #
> #	All will be fine if you keep them ordered as in this example.
> #
> #
> #       Note on logging:
....
> #	e.g. if the threshold is 1, then any message with size greater than 1 KB
> #	will be compressed, the default is 2 (KB)
> compression_threshold 2
>
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

Did my config file help?

I guess I could use dopd to outdate resources over one of the
working heartbeat network connections (WAN or crossover LAN), but this
would mean downtime for half of the services until I fix the broken
machine and manually resync.
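
For reference, the ha.cf side of a dopd setup would look roughly like
the fragment below. This is only a sketch, not my running
configuration: the dopd path and the uid/gid values are assumptions
that depend on the distribution. drbd.conf would additionally need a
fencing policy and an outdate-peer handler, which are not shown here.

```
# Sketch only: path and user/group names are assumptions.
# Start the DRBD outdate-peer daemon under heartbeat's control.
respawn hacluster /usr/lib/heartbeat/dopd
# Allow dopd to talk to the heartbeat API.
apiauth dopd gid=haclient uid=hacluster
```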

So I have two options.

Optimal:
1. Add a serial cable as another path for heartbeat packets. Replace the
(LAN) crossover network connection with a switched one. Add another
server to the LAN switch with an appropriate IP and monitor it with
ipfail. This way heartbeat will fail over as soon as LAN (or WAN) dies
on one node, and services will continue to work with DRBD
active/primary for all resources on the second node and
passive/secondary on the broken node.
After I fix the broken node, it will be synced automatically in the
right direction and take its resources back.
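
The ha.cf changes for option 1 might look something like the fragment
below. It is a rough sketch under assumptions, not a tested
configuration: the serial device, the interface names, the ping
address and the ipfail path are all hypothetical placeholders. The
baud rate is set before the serial device because, as the comments in
ha.cf say, the order of directives matters.

```
# Sketch only: device names, address and path are assumptions.
# Serial baud rate must be set before the serial medium is defined.
baud 19200
# Extra heartbeat path over a null-modem cable.
serial /dev/ttyS0
# Heartbeat over the switched LAN (replacing the crossover link).
bcast eth1
# Heartbeat over the WAN interface.
bcast eth0
# Hypothetical always-on host attached to the LAN switch.
ping 192.168.1.254
# ipfail compares ping-node reachability between the two nodes.
respawn hacluster /usr/lib/heartbeat/ipfail
# Let the repaired node take its resources back.
auto_failback on
```

With something like this in place, a node that can no longer reach the
ping node while its peer still can should hand over its resources.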

Not so optimal:
2. Just move all the syncing for drbd from the crossover LAN to WAN. Also
move all internal program communications to WAN. The LAN crossover is
then used only for heartbeat. I guess it's pretty obvious why this is a
slower setup than 1. (I use one network connection for everything
and the second just for heartbeat.)

I'll probably just do 2., since I have no physical access to the nodes
at the moment, and limit DRBD sync to 30 Mbit/s.

Regards,
M.