hi everyone,<br><br>we are setting up a two-node failover cluster, but there is one point that confuses us.<br>we have a working config with:<br>-heartbeat-1.2.3-9sarge6<br>-drbd0.7<br>-ldirectord-1.2.3-9sarge6
<br><br>what we want: a two-node failover config with auto_failback off, drbd replication always running, and, if needed, operator intervention to re-establish the master-slave config.<br><br>what we have: when we unplug the network cable on the primary host (host-a), all the services, including the virtual IP and the drbd filesystem, switch correctly to the slave server (host-b).
<br><br>the problem: when we reconnect the network cable on the primary (host-a), the drbd service stops on it. from what we have read, we think this is expected behavior, since we do not have a stonith device to prevent split-brain situations (in fact, both nodes switch to standalone mode to prevent data corruption).
<br><br>the question: does anybody have this setup working the way we want? is it possible? is there a way to force host-a to become a slave of host-b and continue replication? we know this implies that host-a would have to throw away some good updates, but i think we can live with that. both nodes are connected through a switch, and we will keep some web folders and maildirs on drbd. if the primary goes down, nobody can access it, so the worst case should be some lost mail and some outdated web content (only for the time host-b takes to bring up the virtual IP and mount the drbd filesystem).
<br>this is the first time we have worked with clusters, and we know this may not be the best setup, but it is the most balanced one (price/reliability) that we could find.<br>if anyone has advice for us, it is welcome.
<br><br>thanks in advance.<br><br>--<br>Roberto Scattini.