On Thu, Jun 14, 2007 at 10:37:38AM -0400, Dan Gahlinger wrote:
> I posted this in linux-ha but got no response, and didn't even see my post get
> to the list.
> so here it is here. seems more like a drbd issue anyhow.
>
> I have two systems, with heartbeat and DRBD installed.
> Initially I tested with just DRBD, and was able to fail back and forth very
> well and easily.
>
> However, when using heartbeat, it won't fail over, no matter what I do. status
> doesn't change.
>
> I have it setup so that DRBD goes over a cross-over cable between the two
> systems on a private IP.
> and heartbeat is run over the public (internet facing) interfaces.
>
> My heartbeat config looks like this:
>
> vi /etc/ha.d/ha.cf -
> logfacility local0
>
> logfile /var/log/ha-log
>
> debugfile /var/log/ha-debug
>
> udpport 694
>
> keepalive 1
>
> deadtime 60
>
> bcast eth0
>
> node LAB-TEST-01
       ^^^^^^^^^^^ [1]
>
> node LAB-TEST-02
>
> auto_failback on

I don't like automatic failback.
it may even be dangerous (in case you have some misbehaving resource
agent on stop ... if you don't know what I mean, consider yourself happy
to have missed out on one of the most fun parts of setting up a
heartbeat cluster)

in a "homogeneous" 2-node-failover-cluster (i.e. both nodes are more or
less identical) it does not make much sense. and to have a
non-homogeneous cluster is not a good idea either (most of the time).
even then, the operator will get paged for the first failover, and if
deemed useful, will initiate the switch-back by hand.

> and /etc/ha.d/haresources (note IP address is the virtual public IP):

( this is all one long single line, right?
  if not, you _have_ to use backslash! )

> lab-test-01 192.168.10.218 drbddisk Filesystem::/dev/drbd0::/mysql::ext3 Filesystem::/dev/drbd1::/data::ext3
  ^^^^^^^^^^^ [1]            ^^^^^^^^ [2]

[1] should be the same cAsE in ha.cf and haresources (preferably both
    lower case). it must be the actual node name, as reported by
    "uname -n"

[2] please use one drbddisk statement per drbd resource explicitly.
    drbddisk::r0 drbddisk::r1
    (or whatever your resource names are in drbd.conf)

> configs on both systems are the same, hosts files identical with all
> the entries. I've tried with auto_failback on and off seems to make
> no difference.
>
> I test by pulling the public cable on lab-test-01, or using ifconfig eth0 down
>
> Also, when I bring the server back up drbd can't see the other system
> (either one), it becomes
> secondary/unknown and primary/unknown.
>
> It seems for some cases I need to use the drbdadm primary all on the
> primary at boot up to fix that.
> One other note about the heartbeat issue above. I found if I enter the
> commands manually it seems to work.
> which makes it really weird.
>
> Can anyone tell me what's going wrong?

what do the heartbeat log file(s) (ha-debug) say?

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
__
please use the "List-Reply" function of your email client.
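
for illustration only, points [1] and [2] applied to the quoted configs
might look roughly like the sketch below. it rests on two assumptions
that are not confirmed anywhere in this thread: that "uname -n" on the
two nodes really prints lab-test-01 and lab-test-02 in lower case, and
that the drbd resources are named r0 and r1 in drbd.conf. substitute
whatever your systems actually report.

```text
# /etc/ha.d/ha.cf (node lines only; the other settings stay as quoted)
# ASSUMPTION: "uname -n" prints these names exactly, in lower case
node lab-test-01
node lab-test-02

# /etc/ha.d/haresources (one single long line)
# ASSUMPTION: r0 and r1 are the resource names from drbd.conf
lab-test-01 192.168.10.218 drbddisk::r0 drbddisk::r1 Filesystem::/dev/drbd0::/mysql::ext3 Filesystem::/dev/drbd1::/data::ext3
```

the node name is now spelled the same way in both files, and each drbd
resource gets its own drbddisk entry ahead of the Filesystem entry that
mounts it.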