[DRBD-user] KVM / Heartbeat Master->Server Takeover

Arnold Krille arnold at arnoldarts.de
Wed Sep 14 22:00:26 CEST 2011


On Wednesday 14 September 2011 18:43:10 Robert P wrote:
> I've installed DRBD / Heartbeat (Ubuntu 10.04 LTS) on two HP servers.
> The final solution would be:
> The KVM image is on the mirrored partition (i.e. /dev/drbd0), and it
> should start automatically on whichever machine becomes the DRBD "master".
> Now I have a strange problem, and I still don't know how to figure it
> out with Google :-(
> The master server is up, /dev/drbd0 is mounted, and the virtual machine
> with the image under /dev/drbd0 is running. eth0:0 also got an IP address
> from Heartbeat.
> On the slave server, no IP is assigned to eth0:0, and /dev/drbd0 isn't
> mounted, so there's also no virtual machine running.
> 
> When I disconnect the Ethernet cable, so that the heartbeat from the master
> is missing, on the slave /dev/drbd0 gets mounted correctly, eth0:0 gets
> an IP address as configured, and the virtual machine starts up.
> BUT: /dev/drbd0 is still mounted on the master server, and the virtual
> machine on the master server also stays up. eth0:0 also still has the same
> address as before I unplugged the cable.
> And that should not be, because it would result in two servers with the
> same IP in the network.

Classic split-brain. The usual remedy is to prevent one side from starting the
VM. In clusters of three or more nodes this is decided by quorum: only the
partition that still contains more than half of the nodes may run resources.
In two-node clusters this fails badly, because after a split neither side has
a majority.
Maybe that's an area for STONITH, but as most (all?) STONITH mechanisms need a
network connection between the servers, this doesn't help in our case, where
the missing network is the problem.
I think I solved this by using a ping resource on each node to ping other
machines in the network (and the router and switch) and only starting the VMs
on nodes where the ping attribute is above a certain threshold. So when the
network between the nodes (which is actually two networks with two cables, but
currently all connected to the same switch) fails, new services should only be
started on servers that still have a connection to some other non-virtual
machine...
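
The two decision rules mentioned above (majority quorum and a ping threshold)
can be sketched in a few lines of Python. This is only an illustration of the
logic, not cluster configuration and not the setup actually used here: the
target addresses, the score per reachable host, the threshold and all function
names are made-up assumptions, and the ping flags are those of the Linux
iputils ping. In a real cluster this job would normally be done by the cluster
manager's own ping resource plus a location rule rather than a script.

```python
import subprocess

# Hypothetical values for illustration only (documentation address range).
PING_TARGETS = ["192.0.2.1", "192.0.2.2", "192.0.2.10"]  # e.g. router, switch, a physical host
SCORE_PER_HOST = 100   # assumed score added for every reachable target
THRESHOLD = 100        # assumed: the score must be strictly above this value


def is_reachable(host, timeout_s=1):
    """Send a single ICMP echo request using the system ping (Linux flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def connectivity_score(targets, probe=is_reachable, score_per_host=SCORE_PER_HOST):
    """Add up a fixed score for every target that answers the probe."""
    return sum(score_per_host for host in targets if probe(host))


def may_start_vm(score, threshold=THRESHOLD):
    """Ping rule: only start services if the score is above the threshold."""
    return score > threshold


def has_quorum(visible_nodes, total_nodes):
    """Quorum rule: the partition must contain more than half of all nodes."""
    return visible_nodes > total_nodes / 2
```

With these assumed numbers a node needs at least two reachable targets before
may_start_vm(connectivity_score(PING_TARGETS)) returns True, so a node whose
cable was pulled scores 0 and starts nothing. has_quorum(1, 2) is False for
both halves of a split two-node cluster, which is why quorum alone does not
help there. Note that this only keeps an isolated node from starting new
services; stopping what is already running on it is a separate step.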

Hope that helps,

Arnold