[DRBD-user] heartbeat starts until drbd sync is finished

Wed Dec 13 12:50:33 CET 2006

/ 2006-12-13 12:15:12 +0100
\ Uwe Melzer:
> Let me describe the situation:
> the secondary node is Primary/Unknown
> the primary is booting
> drbd starts on the primary, the sync per group starts, and the start script exits
> heartbeat starts on the primary; auto_failback is on, so resource takeover is initiated
> the secondary node tries to set the devices to secondary state via drbddisk.
> 
> Today I looked at the ha-log files on both nodes. During the sync phase I found
> these entries on the secondary node:
> 
> heartbeat: 2006/12/12_19:44:37 info: Running /etc/ha.d/resource.d/drbddisk logs stop
> heartbeat: 2006/12/12_19:44:42 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk

well.
there should be output from drbddisk (or rather from the drbdadm it calls)
as to why it "fails".
you can probably find that output in ha-debug.

> It seems to me that drbddisk doesn't check whether the device is in sync mode.

nope.
drbddisk stop
 * always succeeds if there is no /proc/drbd
 * otherwise execs "drbdadm secondary resourcename"

if
# drbdadm secondary resourcename
does not succeed in making the resource secondary,
drbddisk stop fails as well.
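
The stop behaviour described above can be sketched roughly as follows. This is an illustration of the described logic, not the shipped drbddisk script: the function name `drbddisk_stop` and the overridable `DRBDADM` and `PROC_DRBD` variables are assumptions made so the logic can be tried without a DRBD setup.

```shell
#!/bin/sh
# Sketch of the "stop" logic described above (not the shipped script).
# DRBDADM and PROC_DRBD are overridable here purely for illustration.
DRBDADM="${DRBDADM:-drbdadm}"
PROC_DRBD="${PROC_DRBD:-/proc/drbd}"

drbddisk_stop() {
    res="$1"
    # no /proc/drbd: nothing to demote, report success
    [ -e "$PROC_DRBD" ] || return 0
    # otherwise the result is whatever "drbdadm secondary" returns;
    # the description above says exec, a function call is used here
    # only so the sketch can be sourced and tested
    $DRBDADM secondary "$res"
}
```

In this sketch a non-zero exit status from `drbdadm secondary` becomes the status of the stop itself, which is why the reason for the failure has to be looked up in the drbdadm output.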

drbddisk status prints
 "running"
only if the resource in question is in fact primary.
if it is secondary, it prints "stopped",
and if something else is wrong (unconfigured, not found)
it prints whatever the output of the drbdadm state command was,
which heartbeat will interpret as equivalent to "stopped".
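
A minimal sketch of the status mapping just described, assuming the output of `drbdadm state` has the form local/peer (for example Primary/Unknown). The function name `drbddisk_status` and the overridable `DRBDADM` variable are illustrative assumptions, and the shipped script may differ in detail.

```shell
#!/bin/sh
# Sketch of the "status" mapping described above (not the shipped script).
DRBDADM="${DRBDADM:-drbdadm}"

drbddisk_status() {
    res="$1"
    # capture stdout and stderr, so error text is available as well
    st=$($DRBDADM state "$res" 2>&1)
    # assumption: output looks like "local/peer"; keep the local part
    case "${st%%/*}" in
        Primary)   echo "running" ;;
        Secondary) echo "stopped" ;;
        # anything else is passed through unchanged
        *)         echo "$st" ;;
    esac
}
```

Only the exact string "running" signals an active resource here; every other output, including error text, ends up being treated like "stopped".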

please verify that you don't have spelling errors or strange characters
in your haresources file, and find the output of the drbddisk stop
command.
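
Both checks can be done with plain grep. The helper names below are hypothetical, and the file paths in the usage note are assumptions that depend on the local heartbeat configuration.

```shell
#!/bin/sh
# Hypothetical helpers for the two checks suggested above.

# print lines containing carriage returns, control characters or
# non-ASCII bytes; exit status 0 means something suspicious was found
check_strange_chars() {
    LC_ALL=C grep -n '[^[:print:][:blank:]]' "$1"
}

# print the log lines that mention drbddisk
find_drbddisk_output() {
    grep -n 'drbddisk' "$1"
}
```

Typical use would be `check_strange_chars /etc/ha.d/haresources` and `find_drbddisk_output /var/log/ha-debug`. The second path is an assumption; the actual location is whatever the debugfile directive in ha.cf points to.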

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :