Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On 2007.12.13, at 17:42, Florian Haas wrote: > On Thursday 13 December 2007 13:45:47 Martin Gombac wrote: >> Hi, >> >> i have a simple fail-over heartbeat/drbd cluster set up. In each >> server there are two network cards. One for internet access (WAN) and >> one for internal communication and synchronization (LAN). If one node >> looses internet connection (WAN) the other takes over all resources >> as expected, but this does no thappen with local connection (LAN) >> which is just crossover UTP cable. >> In case if local connection (LAN) fails, cluster does not migrate all >> resources to one of the nodes and just continues to work without >> synchronized drbd resources (amongst other things). >> >> My question is this: >> How can i make one node take over all resources if local crossover >> connection fails? > > Give us your ha.cf, please. > > Florian > > -- > : Florian G. Haas > : LINBIT Information Technologies GmbH > : Vivenotgasse 48, A-1120 Vienna, Austria crm no # # There are lots of options in this file. All you have to have is a set # of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast}, # and a value for "auto_failback". # # ATTENTION: As the configuration file is read line by line, # THE ORDER OF DIRECTIVE MATTERS! # # In particular, make sure that the udpport, serial baud rate # etc. are set before the heartbeat media are defined! # debug and log file directives go into effect when they # are encountered. # # All will be fine if you keep them ordered as in this example. # # # Note on logging: # If any of debugfile, logfile and logfacility are defined then they # will be used. If debugfile and/or logfile are not defined and # logfacility is defined then the respective logging and debug # messages will be loged to syslog. If logfacility is not defined # then debugfile and logfile will be used to log messges. If # logfacility is not defined and debugfile and/or logfile are not # defined then defaults will be used for debugfile and logfile as # required and messages will be sent there. # # File to write debug messages to #debugfile /var/log/ha-debug # # # File to write other messages to # #logfile /var/log/ha-log # # # Facility to use for syslog()/logger # #logfacility local0 # # # A note on specifying "how long" times below... # # The default time unit is seconds # 10 means ten seconds # # You can also specify them in milliseconds # 1500ms means 1.5 seconds # # # keepalive: how long between heartbeats? # keepalive 2 # # deadtime: how long-to-declare-host-dead? # # If you set this too low you will get the problematic # split-brain (or cluster partition) problem. # See the FAQ for how to use warntime to tune deadtime. # deadtime 30 # # warntime: how long before issuing "late heartbeat" warning? # See the FAQ for how to use warntime to tune deadtime. # warntime 10 # # # Very first dead time (initdead) # # On some machines/OSes, etc. the network takes a while to come up # and start working right after you've been rebooted. As a result # we have a separate dead time for when things first come up. # It should be at least twice the normal dead time. # #initdead 120 # # # What UDP port to use for bcast/ucast communication? # udpport 713 # # Baud rate for serial ports... # #baud 19200 # # serial serialportname ... #serial /dev/ttyS0 # Linux #serial /dev/cuaa0 # FreeBSD #serial /dev/cuad0 # FreeBSD 6.x #serial /dev/cua/a # Solaris # # # What interfaces to broadcast heartbeats over? # #bcast eth0 # Linux #bcast eth1 eth2 # Linux bcast eth0 eth1 #bcast le0 # Solaris #bcast le1 le2 # Solaris # # Set up a multicast heartbeat medium # mcast [dev] [mcast group] [port] [ttl] [loop] # # [dev] device to send/rcv heartbeats on # [mcast group] multicast group to join (class D multicast address # 224.0.0.0 - 239.255.255.255) # [port] udp port to sendto/rcvfrom (set this value to the # same value as "udpport" above) # [ttl] the ttl value for outbound heartbeats. this effects # how far the multicast packet will propagate. (0-255) # Must be greater than zero. # [loop] toggles loopback for outbound multicast heartbeats. # if enabled, an outbound packet will be looped back and # received by the interface it was sent on. (0 or 1) # Set this value to zero. # # #mcast eth0 225.0.0.1 694 1 0 # # Set up a unicast / udp heartbeat medium # ucast [dev] [peer-ip-addr] # # [dev] device to send/rcv heartbeats on # [peer-ip-addr] IP address of peer to send packets to # #ucast eth0 192.168.1.2 # # # About boolean values... # # Any of the following case-insensitive values will work for true: # true, on, yes, y, 1 # Any of the following case-insensitive values will work for false: # false, off, no, n, 0 # # # # auto_failback: determines whether a resource will # automatically fail back to its "primary" node, or remain # on whatever node is serving it until that node fails, or # an administrator intervenes. # # The possible values for auto_failback are: # on - enable automatic failbacks # off - disable automatic failbacks # legacy - enable automatic failbacks in systems # where all nodes do not yet support # the auto_failback option. # # auto_failback "on" and "off" are backwards compatible with the old # "nice_failback on" setting. # # See the FAQ for information on how to convert # from "legacy" to "on" without a flash cut. # (i.e., using a "rolling upgrade" process) # # The default value for auto_failback is "legacy", which # will issue a warning at startup. So, make sure you put # an auto_failback directive in your ha.cf file. # (note: auto_failback can be any boolean or "legacy") # auto_failback on # # # Basic STONITH support # Using this directive assumes that there is one stonith # device in the cluster. Parameters to this device are # read from a configuration file. The format of this line is: # # stonith <stonith_type> <configfile> # # NOTE: it is up to you to maintain this file on each node in the # cluster! # #stonith baytech /etc/ha.d/conf/stonith.baytech # # STONITH support # You can configure multiple stonith devices using this directive. # The format of the line is: # stonith_host <hostfrom> <stonith_type> <params...> # <hostfrom> is the machine the stonith device is attached # to or * to mean it is accessible from any host. # <stonith_type> is the type of stonith device (a list of # supported drives is in /usr/lib/stonith.) # <params...> are driver specific parameters. To see the # format for a particular device, run: # stonith -l -t <stonith_type> # # # Note that if you put your stonith device access information in # here, and you make this file publically readable, you're asking # for a denial of service attack ;-) # # To get a list of supported stonith devices, run # stonith -L # For detailed information on which stonith devices are supported # and their detailed configuration options, run this command: # stonith -h # #stonith_host * baytech 10.0.0.3 mylogin mysecretpassword #stonith_host ken3 rps10 /dev/ttyS1 kathy 0 #stonith_host kathy rps10 /dev/ttyS1 ken3 0 # # Watchdog is the watchdog timer. If our own heart doesn't beat for # a minute, then our machine will reboot. # NOTE: If you are using the software watchdog, you very likely # wish to load the module with the parameter "nowayout=0" or # compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even # an orderly shutdown of heartbeat will trigger a reboot, which is # very likely NOT what you want. # #watchdog /dev/watchdog # # Tell what machines are in the cluster # node nodename ... -- must match uname -n node veliki node mali # # Less common options... # # Treats 10.10.10.254 as a psuedo-cluster-member # Used together with ipfail below... # note: don't use a cluster node as ping node # #ping 10.10.10.254 #ping 88.200.116.1 # # Treats 10.10.10.254 and 10.10.10.253 as a psuedo-cluster-member # called group1. If either 10.10.10.254 or 10.10.10.253 are up # then group1 is up # Used together with ipfail below... # #ping_group group1 10.10.10.254 10.10.10.253 ping_group internet 88.200.116.1 193.2.85.3 # # HBA ping derective for Fiber Channel # Treats fc-card-name as psudo-cluster-member # used with ipfail below ... # # You can obtain HBAAPI from http://hbaapi.sourceforge.net. You need # to get the library specific to your HBA directly from the vender # To install HBAAPI stuff, all You need to do is to compile the common # part you obtained from the sourceforge. This will produce libHBAAPI.so # which you need to copy to /usr/lib. You need also copy hbaapi.h to # /usr/include. # # The fc-card-name is the name obtained from the hbaapitest program # that is part of the hbaapi package. Running hbaapitest will produce # a verbose output. One of the first line is similar to: # Apapter number 0 is named: qlogic-qla2200-0 # Here fc-card-name is qlogic-qla2200-0. # #hbaping fc-card-name # # # Processes started and stopped with heartbeat. Restarted unless # they exit with rc=100 # #respawn userid /path/name/to/run respawn cluster /usr/lib/heartbeat/ipfail # # Access control for client api # default is no access # #apiauth client-name gid=gidlist uid=uidlist #apiauth ipfail gid=haclient uid=hacluster ########################### # # Unusual options. # ########################### # # hopfudge maximum hop count minus number of nodes in config #hopfudge 1 # # deadping - dead time for ping nodes #deadping 30 # # hbgenmethod - Heartbeat generation number creation method # Normally these are stored on disk and incremented as needed. #hbgenmethod time # # realtime - enable/disable realtime execution (high priority, etc.) # defaults to on #realtime off # # debug - set debug level # defaults to zero #debug 1 # # API Authentication - replaces the fifo-permissions-based system of the past # # # You can put a uid list and/or a gid list. # If you put both, then a process is authorized if it qualifies under either # the uid list, or under the gid list. # # The groupname "default" has special meaning. If it is specified, then # this will be used for authorizing groupless clients, and any client groups # not otherwise specified. # # There is a subtle exception to this. "default" will never be used in the # following cases (actual default auth directives noted in brackets) # ipfail (uid=HA_CCMUSER) # ccm (uid=HA_CCMUSER) # ping (gid=HA_APIGROUP) # cl_status (gid=HA_APIGROUP) # # This is done to avoid creating a gaping security hole and matches the most # likely desired configuration. # #apiauth ipfail uid=hacluster #apiauth ccm uid=hacluster #apiauth cms uid=hacluster #apiauth ping gid=haclient uid=alanr,root #apiauth default gid=haclient # message format in the wire, it can be classic or netstring, # default: classic #msgfmt classic/netstring # Do we use logging daemon? # If logging daemon is used, logfile/debugfile/logfacility in this file # are not meaningful any longer. You should check the config file for logging # daemon (the default is /etc/logd.cf) # more infomartion can be fould in http://www.linux-ha.org/ ha_2ecf_2fUseLogdDirective # Setting use_logd to "yes" is recommended # use_logd yes # # the interval we reconnect to logging daemon if the previous connection failed # default: 60 seconds conn_logd_time 60 # # # Configure compression module # It could be zlib or bz2, depending on whether u have the corresponding # library in the system. compression bz2 # # Confiugre compression threshold # This value determines the threshold to compress a message, # e.g. if the threshold is 1, then any message with size greater than 1 KB # will be compressed, the default is 2 (KB) compression_threshold 2