<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<body bgcolor="#ffffff" text="#000000">
<div class="moz-text-html" lang="x-western">
Hello All,<br>
As I mentioned in my previous mail, my DRBD was not working with
Heartbeat. After making some changes to the configuration files, it now
starts automatically on the primary node under Heartbeat.<br>
Thanks to you all for that!<br>
However, when the primary node goes down, the secondary node refuses to
become primary, and as a consequence the drive is not mounted
automatically at the mount point on the secondary node.<br>
The message log in /var/log on the secondary node shows the following
when the primary node goes down (I manually detach the network cables for testing):<br>
<pre><big>Jul 15 10:55:03 sec kernel: block drbd1: PingAck did not arrive in time.
Jul 15 10:55:03 sec kernel: block drbd1: peer( Primary -> Unknown ) conn( SyncTarget -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jul 15 10:55:03 sec kernel: block drbd1: asender terminated
Jul 15 10:55:03 sec kernel: block drbd1: short read expecting header on sock: r=-512
Jul 15 10:55:03 sec kernel: block drbd1: Terminating asender thread
Jul 15 10:55:03 sec kernel: block drbd1: Connection closed
Jul 15 10:55:03 sec kernel: block drbd1: conn( NetworkFailure -> Unconnected )
Jul 15 10:55:03 sec kernel: block drbd1: receiver terminated
Jul 15 10:55:03 sec kernel: block drbd1: Restarting receiver thread
Jul 15 10:55:03 sec kernel: block drbd1: receiver (re)started
Jul 15 10:55:03 sec kernel: block drbd1: conn( Unconnected -> WFConnection )
<font color="#cc0000">Jul 15 10:55:09 sec kernel: block drbd1: State change failed: Refusing to be Primary without at least one UpToDate disk
Jul 15 10:55:09 sec kernel: block drbd1: state = { cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- }
Jul 15 10:55:09 sec kernel: block drbd1: wanted = { cs:WFConnection ro:Primary/Unknown ds:Inconsistent/DUnknown r--- }</font>
Jul 15 10:55:10 sec kernel: block drbd1: State change failed: Refusing to be Primary without at least one UpToDate disk
Jul 15 10:55:10 sec kernel: block drbd1: state = { cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- }
Jul 15 10:55:10 sec kernel: block drbd1: wanted = { cs:WFConnection ro:Primary/Unknown ds:Inconsistent/DUnknown r--- }
Jul 15 10:55:11 sec kernel: block drbd1: State change failed: Refusing to be Primary without at least one UpToDate disk
Jul 15 10:55:11 sec kernel: block drbd1: state = { cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- }
Jul 15 10:55:11 sec kernel: block drbd1: wanted = { cs:WFConnection ro:Primary/Unknown ds:Inconsistent/DUnknown r--- }</big></pre>
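To make the refused state change easier to read, here is a minimal Python sketch that pulls the fields out of one of the "state = { ... }" lines above. It is only an illustration: the names parse_state and can_promote are made up, and the check simply mirrors the wording of the kernel message ("at least one UpToDate disk"), not DRBD's actual internal logic. The first transition in the log, conn( SyncTarget -> NetworkFailure ), suggests this node was still a sync target when the cable was pulled, which would fit the local disk being Inconsistent.<br>

```python
import re

# Matches the "cs:... ro:local/peer ds:local/peer" part of the state lines
# that DRBD writes to the kernel log (format taken from the log above).
STATE_RE = re.compile(r"cs:(\w+)\s+ro:(\w+)/(\w+)\s+ds:(\w+)/(\w+)")


def parse_state(line):
    """Return the connection, role and disk state fields of one log line."""
    match = STATE_RE.search(line)
    if match is None:
        raise ValueError("no DRBD state found in line")
    keys = ("cs", "ro_local", "ro_peer", "ds_local", "ds_peer")
    return dict(zip(keys, match.groups()))


def can_promote(state):
    """Simplified reading of the log message: one disk must be UpToDate."""
    return "UpToDate" in (state["ds_local"], state["ds_peer"])


line = ("Jul 15 10:55:09 sec kernel: block drbd1: state = "
        "{ cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- }")
state = parse_state(line)
print(state["ds_local"], can_promote(state))  # prints: Inconsistent False
```

With the state from my log, the local disk is Inconsistent and the peer disk is DUnknown, so the sketch reports that promotion would be refused, which matches what the kernel printed.<br>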
<big><b>The /var/log/ha-debug file is:</b></big><br>
<pre>Jul 15 10:04:30 sec.master heartbeat: [4559]: info: Link test.cluster:eth0 dead.
Jul 15 10:04:32 sec.master heartbeat: [4559]: info: Heartbeat restart on node test.cluster
Jul 15 10:04:32 sec.master heartbeat: [4559]: info: Link test.cluster:eth0 up.
Jul 15 10:04:32 sec.master heartbeat: [4559]: info: Status update for node test.cluster: status init
Jul 15 10:04:32 sec.master heartbeat: [4559]: info: Status update for node test.cluster: status up
Jul 15 10:04:32 sec.master heartbeat: [6281]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Jul 15 10:04:32 sec.master heartbeat: [4559]: debug: StartNextRemoteRscReq(): child count 1
Jul 15 10:04:32 sec.master heartbeat: [4559]: debug: get_delnodelist: delnodelist=
logd is not running2010/07/15_10:04:32 info: Running /etc/ha.d/rc.d/status status
harc[6281]:        2010/07/15_10:04:32 info: Running /etc/ha.d/rc.d/status status
Jul 15 10:04:32 sec.master heartbeat: [4559]: info: Status update for node test.cluster: status active
Jul 15 10:04:32 sec.master heartbeat: [4559]: debug: StartNextRemoteRscReq(): child count 1
Jul 15 10:04:32 sec.master heartbeat: [6301]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
logd is not running2010/07/15_10:04:32 info: Running /etc/ha.d/rc.d/status status
harc[6301]:        2010/07/15_10:04:32 info: Running /etc/ha.d/rc.d/status status
Jul 15 10:04:32 sec.master heartbeat: [6321]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
logd is not running2010/07/15_10:04:32 info: Running /etc/ha.d/rc.d/status status
harc[6321]:        2010/07/15_10:04:32 info: Running /etc/ha.d/rc.d/status status
Jul 15 10:04:33 sec.master heartbeat: [4559]: info: remote resource transition completed.
Jul 15 10:04:33 sec.master heartbeat: [4559]: info: sec.master wants to go standby [foreign]
Jul 15 10:04:34 sec.master heartbeat: [4559]: info: standby: test.cluster can take our foreign resources
Jul 15 10:04:34 sec.master heartbeat: [6341]: info: give up foreign HA resources (standby).
logd is not running2010/07/15_10:04:34 info: Releasing resource group: test.cluster IPaddr::
ResourceManager[6356]:        2010/07/15_10:04:34 info: Releasing resource group: test.cluster IPaddr::
logd is not running2010/07/15_10:04:34 info: Running /etc/ha.d/resource.d/IPaddr stop
ResourceManager[6356]:        2010/07/15_10:04:34 info: Running /etc/ha.d/resource.d/IPaddr stop
In IP Stop
SIOCDELRT: No such process
logd is not running2010/07/15_10:04:34 INFO: ifconfig eth0:0 down
IPaddr[6414]:        2010/07/15_10:04:34 INFO: ifconfig eth0:0 down
logd is not running2010/07/15_10:04:34 INFO: Success
IPaddr[6393]:        2010/07/15_10:04:34 INFO: Success
INFO: Success
logd is not running2010/07/15_10:04:34 info: Releasing resource group: test.cluster drbddisk::drbd1 Filesystem::/dev/drbd1::/replicated::ext3
ResourceManager[6445]:        2010/07/15_10:04:34 info: Releasing resource group: test.cluster drbddisk::drbd1 Filesystem::/dev/drbd1::/replicated::ext3
logd is not running2010/07/15_10:04:34 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /replicated ext3 stop
ResourceManager[6445]:        2010/07/15_10:04:34 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /replicated ext3 stop
logd is not running2010/07/15_10:04:34 INFO: Running stop for /dev/drbd1 on /replicated
Filesystem[6502]:        2010/07/15_10:04:34 INFO: Running stop for /dev/drbd1 on /replicated
logd is not running2010/07/15_10:04:34 INFO: Trying to unmount /replicated
Filesystem[6502]:        2010/07/15_10:04:34 INFO: Trying to unmount /replicated
logd is not running2010/07/15_10:04:34 INFO: unmounted /replicated successfully
Filesystem[6502]:        2010/07/15_10:04:34 INFO: unmounted /replicated successfully
logd is not running2010/07/15_10:04:34 INFO: Success
Filesystem[6484]:        2010/07/15_10:04:34 INFO: Success
INFO: Success
logd is not running2010/07/15_10:04:34 info: Running /etc/ha.d/resource.d/drbddisk drbd1 stop
ResourceManager[6445]:        2010/07/15_10:04:34 info: Running /etc/ha.d/resource.d/drbddisk drbd1 stop
Jul 15 10:04:34 sec.master heartbeat: [6341]: info: foreign HA resource release completed (standby).
Jul 15 10:04:34 sec.master heartbeat: [4559]: info: Local standby process completed [foreign].
Jul 15 10:04:35 sec.master heartbeat: [4559]: WARN: 1 lost packet(s) for [test.cluster] [13:15]
Jul 15 10:04:35 sec.master heartbeat: [4559]: info: remote resource transition completed.
Jul 15 10:04:35 sec.master heartbeat: [4559]: info: No pkts missing from test.cluster!
Jul 15 10:04:35 sec.master heartbeat: [4559]: info: Other node completed standby takeover of foreign resources.
Jul 15 10:55:08 sec.master heartbeat: [4559]: WARN: node test.cluster: is dead
Jul 15 10:55:08 sec.master heartbeat: [4559]: WARN: No STONITH device configured.
Jul 15 10:55:08 sec.master heartbeat: [4559]: WARN: Shared disks are not protected.
Jul 15 10:55:08 sec.master heartbeat: [4559]: info: Resources being acquired from test.cluster.
Jul 15 10:55:08 sec.master heartbeat: [7293]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Jul 15 10:55:08 sec.master heartbeat: [4559]: info: Link test.cluster:eth0 dead.
Jul 15 10:55:08 sec.master heartbeat: [7294]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys sec.master] to acquire.
Jul 15 10:55:08 sec.master heartbeat: [4559]: debug: StartNextRemoteRscReq(): child count 1
<b><font color="#333333">logd is not running2010/07/15_10:55:08 info: Running /etc/ha.d/rc.d/status status
harc[7293]:        2010/07/15_10:55:08 info: Running /etc/ha.d/rc.d/status status
logd is not running2010/07/15_10:55:08 info: Taking over resource group IPaddr::
mach_down[7328]:        2010/07/15_10:55:08 info: Taking over resource group IPaddr::
logd is not running2010/07/15_10:55:08 info: Acquiring resource group: test.cluster IPaddr::
ResourceManager[7358]:        2010/07/15_10:55:08 info: Acquiring resource group: test.cluster IPaddr::
logd is not running2010/07/15_10:55:08 INFO: Resource is stopped
IPaddr[7387]:        2010/07/15_10:55:08 INFO: Resource is stopped
logd is not running2010/07/15_10:55:08 info: Running /etc/ha.d/resource.d/IPaddr start</font></b>
ResourceManager[7358]:        2010/07/15_10:55:08 info: Running /etc/ha.d/resource.d/IPaddr start
logd is not running2010/07/15_10:55:09 INFO: Using calculated nic for eth0
IPaddr[7470]:        2010/07/15_10:55:09 INFO: Using calculated nic for eth0
logd is not running2010/07/15_10:55:09 INFO: Using calculated netmask for
IPaddr[7470]:        2010/07/15_10:55:09 INFO: Using calculated netmask for
logd is not running2010/07/15_10:55:09 INFO: eval ifconfig eth0:0 netmask broadcast
IPaddr[7470]:        2010/07/15_10:55:09 INFO: eval ifconfig eth0:0 netmask broadcast
logd is not running2010/07/15_10:55:09 INFO: Success
IPaddr[7449]:        2010/07/15_10:55:09 INFO: Success
INFO: Success
<font color="#333333"><b>logd is not running2010/07/15_10:55:09 info: Taking over resource group drbddisk::drbd1
mach_down[7328]:        2010/07/15_10:55:09 info: Taking over resource group drbddisk::drbd1
logd is not running2010/07/15_10:55:09 info: Acquiring resource group: test.cluster drbddisk::drbd1 Filesystem::/dev/drbd1::/replicated::ext3
ResourceManager[7570]:        2010/07/15_10:55:09 info: Acquiring resource group: test.cluster drbddisk::drbd1 Filesystem::/dev/drbd1::/replicated::ext3
logd is not running2010/07/15_10:55:09 info: Running /etc/ha.d/resource.d/drbddisk drbd1 start
ResourceManager[7570]:        2010/07/15_10:55:09 info: Running /etc/ha.d/resource.d/drbddisk drbd1 start
1: State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command '/sbin/drbdsetup 1 primary' terminated with exit code 17</b></font>
1: State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command '/sbin/drbdsetup 1 primary' terminated with exit code 17
1: State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command '/sbin/drbdsetup 1 primary' terminated with exit code 17
1: State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command '/sbin/drbdsetup 1 primary' terminated with exit code 17
1: State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command '/sbin/drbdsetup 1 primary' terminated with exit code 17
1: State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command '/sbin/drbdsetup 1 primary' terminated with exit code 17
logd is not running2010/07/15_10:55:14 ERROR: Return code 1 from /etc/ha.d/resource.d/drbddisk
ResourceManager[7570]:        2010/07/15_10:55:14 ERROR: Return code 1 from /etc/ha.d/resource.d/drbddisk
logd is not running2010/07/15_10:55:14 CRIT: Giving up resources due to failure of drbddisk::drbd1
ResourceManager[7570]:        2010/07/15_10:55:14 CRIT: Giving up resources due to failure of drbddisk::drbd1
logd is not running2010/07/15_10:55:14 info: Releasing resource group: test.cluster drbddisk::drbd1 Filesystem::/dev/drbd1::/replicated::ext3
ResourceManager[7570]:        2010/07/15_10:55:14 info: Releasing resource group: test.cluster drbddisk::drbd1 Filesystem::/dev/drbd1::/replicated::ext3
logd is not running2010/07/15_10:55:14 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /replicated ext3 stop
ResourceManager[7570]:        2010/07/15_10:55:14 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /replicated ext3 stop
logd is not running2010/07/15_10:55:14 INFO: Running stop for /dev/drbd1 on /replicated
Filesystem[7694]:        2010/07/15_10:55:14 INFO: Running stop for /dev/drbd1 on /replicated
/dev/drbd1: Wrong medium type
logd is not running2010/07/15_10:55:14 INFO: Success
Filesystem[7679]:        2010/07/15_10:55:14 INFO: Success
INFO: Success
logd is not running2010/07/15_10:55:14 info: Running /etc/ha.d/resource.d/drbddisk drbd1 stop
ResourceManager[7570]:        2010/07/15_10:55:14 info: Running /etc/ha.d/resource.d/drbddisk drbd1 stop
logd is not running2010/07/15_10:55:14 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[7328]:        2010/07/15_10:55:14 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
logd is not running2010/07/15_10:55:14 info: mach_down takeover complete for node test.cluster.
mach_down[7328]:        2010/07/15_10:55:14 info: mach_down takeover complete for node test.cluster.
Jul 15 10:55:14 sec.master heartbeat: [4559]: info: mach_down takeover complete.
ARPING from eth0
Sent 10 probes (10 broadcast(s))
Received 0 response(s)
logd is not running2010/07/15_10:55:19 ERROR: Could not send gratuitous arps. rc=1
IPaddr[7470]:        2010/07/15_10:55:19 ERROR: Could not send gratuitous arps. rc=1
logd is not running2010/07/15_10:55:44 Going standby [foreign].
hb_standby[7820]:        2010/07/15_10:55:44 Going standby [foreign].
Jul 15 10:55:44 sec.master heartbeat: [4559]: info: sec.master wants to go standby [foreign]
Jul 15 10:55:55 sec.master heartbeat: [4559]: WARN: No reply to standby request. Standby request cancelled.</pre>
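Because the ha-debug output is long and most lines appear twice (once with the "logd is not running" prefix and once with the script name), a small filter helps to find the resource script that actually failed. This is a minimal sketch, assuming the "ERROR: Return code N from SCRIPT" format seen in the log above; failed_scripts is a hypothetical helper name, not part of Heartbeat.<br>

```python
import re

# ResourceManager reports a failing resource script in this form (see log above).
FAIL_RE = re.compile(r"ERROR: Return code (\d+) from (\S+)")


def failed_scripts(log_text):
    """Return unique (script, return_code) pairs in order of first appearance."""
    seen = []
    for match in FAIL_RE.finditer(log_text):
        entry = (match.group(2), int(match.group(1)))
        if entry not in seen:  # the log repeats each message, so skip repeats
            seen.append(entry)
    return seen


sample = """\
logd is not running2010/07/15_10:55:14 ERROR: Return code 1 from /etc/ha.d/resource.d/drbddisk
ResourceManager[7570]:        2010/07/15_10:55:14 ERROR: Return code 1 from /etc/ha.d/resource.d/drbddisk
logd is not running2010/07/15_10:55:19 ERROR: Could not send gratuitous arps. rc=1
"""
print(failed_scripts(sample))  # prints: [('/etc/ha.d/resource.d/drbddisk', 1)]
```

On my log this points at drbddisk only, so the Filesystem resource is given up as a consequence of the failed promotion rather than failing by itself.<br>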
<big><b>My /etc/ha.d/haresources file is:</b></big><br>
test.cluster drbddisk::drbd1
<big>*where test.cluster is my primary node and is virtual IP on which heartbeat is</big><br>
<big><b>my /etc/drbd.conf file is:</b></big><br>
<pre><big>global {
    usage-count yes;
}
common {
    protocol C;
}
resource drbd1 {
    on test.cluster {
        device /dev/drbd1;
        disk /dev/sda5;
        meta-disk internal;
    }
    on sec.master {
        device /dev/drbd1;
        disk /dev/sda5;
        meta-disk internal;
    }
}</big></pre>
<big>I just want to run DRBD with Heartbeat, replicate the data, and
run some services on it immediately in the case of failover.
Please help me as soon as possible.</big><br>
Thanks <br>