[DRBD-user] drbd with heartbeat won't fail over
lars.ellenberg at linbit.com
Thu Jun 14 17:00:24 CEST 2007
On Thu, Jun 14, 2007 at 10:37:38AM -0400, Dan Gahlinger wrote:
> I posted this in linux-ha but got no response, and didn't even see my post get
> to the list.
> so here it is here. seems more like a drbd issue anyhow.
> I have two systems, with heartbeat and DRBD installed.
> Initially I tested with just DRBD, and was able to fail back and forth very
> well and easily.
> However, when using heartbeat, it won't fail over, no matter what I do. status
> doesn't change.
> I have it setup so that DRBD goes over a cross-over cable between the two
> systems on a private IP.
> and heartbeat is run over the public (internet facing) interfaces.
> My heartbeat config looks like this:
> vi /etc/ha.d/ha.cf -
> logfacility local0
> logfile /var/log/ha-log
> debugfile /var/log/ha-debug
> udpport 694
> keepalive 1
> deadtime 60
> bcast eth0
> node LAB-TEST-01
> node LAB-TEST-02
> auto_failback on
I don't like automatic failback.
it may even be dangerous
(in case you have some misbehaving resource agent on stop ...
if you don't know what I mean, consider yourself happy
to have missed out on one of the most fun parts of setting up
a heartbeat cluster)
in a "homogeneous" 2-node-failover-cluster
(i.e. both nodes are more or less identical)
it does not make much sense.
and to have a non-homogeneous cluster is
not a good idea either (most of the time).
even then, the operator will get paged for the first failover,
and, if deemed useful, will initiate the switch-back by hand.
> and /etc/ha.d/haresources (note IP address is the virtual public IP):
( this is all one long single line, right?
if not, you _have_ to use backslash! )
> lab-test-01 192.168.10.218 drbddisk Filesystem::/dev/drbd0::/mysql::ext3 Filesystem::/dev/drbd1::/data::ext3
  ^^^^^^^^^^^                ^^^^^^^^
node name (first marker): should be the same cAsE here and in ha.cf
(preferably both lowercase).
it must be the actual node name, as reported by "uname -n".
drbddisk (second marker): please use one drbddisk statement per drbd
resource explicitly, i.e. drbddisk::<resource name>
(with whatever your resource names are in drbd.conf).
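Both points can be checked mechanically before restarting heartbeat. What follows is only a minimal sketch, assuming a POSIX shell and awk; `lint_ha_config` is a hypothetical helper name, not something shipped with heartbeat or DRBD. It compares the owner field of each haresources line and the local node name against the `node` entries in ha.cf, and flags any `drbddisk` entry that does not name a resource.

```shell
#!/bin/sh
# Hypothetical helper, not part of heartbeat or DRBD.
# Usage: lint_ha_config HA_CF HARESOURCES [NODENAME]
# NODENAME defaults to the output of "uname -n".
# Prints one WARN line per finding; returns non-zero if anything was found.
lint_ha_config() {
    ha_cf=$1
    haresources=$2
    me=${3:-$(uname -n)}
    awk -v me="$me" '
        function warn(msg) { print "WARN: " msg; bad = 1 }
        function check(name, what,    key) {
            if (name in nodes) { return }
            key = tolower(name)
            if (key in lower) {
                warn(what " \"" name "\" differs only in case from ha.cf node \"" lower[key] "\"")
            } else {
                warn(what " \"" name "\" is not listed as a node in ha.cf")
            }
        }
        # Skip comment lines and blank lines in both files.
        /^[ \t]*#/ || /^[ \t]*$/ { next }
        # First file (ha.cf): remember every name given on a "node" line.
        FILENAME == ARGV[1] {
            if ($1 == "node") {
                for (i = 2; i <= NF; i++) { nodes[$i] = 1; lower[tolower($i)] = $i }
            }
            next
        }
        # Second file (haresources): join backslash-continued lines first.
        {
            buf = buf $0
            if (buf ~ /\\$/) { sub(/\\$/, " ", buf); next }
            n = split(buf, f)
            buf = ""
            check(f[1], "haresources owner")
            for (i = 2; i <= n; i++) {
                if (f[i] == "drbddisk") {
                    warn("bare drbddisk entry; name the resource explicitly (drbddisk::<resource>)")
                }
            }
        }
        END { check(me, "local node name"); exit bad + 0 }
    ' "$ha_cf" "$haresources"
}
```

Run on each node as `lint_ha_config /etc/ha.d/ha.cf /etc/ha.d/haresources`. With the files quoted in this thread it should report the case difference between LAB-TEST-01 and lab-test-01 as well as the unnamed drbddisk entry.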
> configs on both systems are the same, hosts files identical with all
> the entries. I've tried with auto_failback on and off seems to make
> no difference.
> I test by pulling the public cable on lab-test-01, or using ifconfig eth0 down
> Also, when I bring the server back up drbd can't see the other system
> (either one), it becomes
> secondary/unknown and primary/unknown.
> It seems for some cases I need to use the drbdadm primary all on the
> primary at boot up to fix that.
> One other note about the heartbeat issue above. I found if I enter the
> commands manually it seems to work.
> which makes it really weird.
> Can anyone tell me what's going wrong?
what do the heartbeat log file(s) (ha-debug) say?
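To pull the interesting lines out of those logs before posting them, something like the following may help. It is only a sketch: `ha_log_problems` is a hypothetical helper name, the default path is the debugfile from the ha.cf quoted above, and the WARN:/ERROR:/CRIT: severity tags are an assumption about how heartbeat labels its log lines, so adjust the pattern if your logs look different.

```shell
#!/bin/sh
# Hypothetical helper: print the most recent warning/error lines of a heartbeat log.
# Usage: ha_log_problems [LOGFILE] [MAX_LINES]
# Defaults: /var/log/ha-debug (the debugfile set in the quoted ha.cf), 50 lines.
ha_log_problems() {
    # Keep only lines carrying one of the assumed severity tags,
    # then show the newest MAX_LINES of them.
    grep -E '(WARN|ERROR|CRIT):' "${1:-/var/log/ha-debug}" | tail -n "${2:-50}"
}
```

Running it on both nodes right after a failed failover test, and posting the output together with the surrounding lines, usually gives the list enough to work with.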
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
please use the "List-Reply" function of your email client.