[DRBD-user] DRBD+Linux-HA: Failover fails if initiated by init 6 on DRBD Master Node

Thu Sep 18 11:14:09 CEST 2008

On Thu, Sep 18, 2008 at 10:05:15AM +0200, Christoph Eßer wrote:
> Hi there,
>
> I first asked this question on the Linux-HA mailing list. As they
> think it seems to be a DRBD-related problem, I repeat it here:
>
> I am using Heartbeat 2.99, Pacemaker 0.6.6 and DRBD 8.0.13 on Debian Lenny.
>
> Everything works as expected if I just force the system to fail over by
> unplugging the power supply of either of my two nodes. Shutting down the
> DRBD slave via "init 0" or "init 6" works fine as well.
>
> But whenever I restart my DRBD master via "init 6" the whole cluster
> crashes and fails to reassign the resources, even after the node started
> up again. Anyway, I found some strange DRBD related messages in
> /var/log/syslog on the second node while restarting the master node:
>
> Sep 16 13:48:44 viktor-01 drbd[4123]: [4135]: DEBUG: r0: Calling drbdadm
> -c /etc/drbd.conf state r0

...

compare with the kernel log (where the drbd module printks go to)
from that time span.

double check the order of the init scripts.
verify that you _DO NOT_ start or stop drbd via init
when using the heartbeat drbd ocf ra.

in other words, when using the heartbeat ocf drbd ra,
/etc/init.d/drbd MUST NOT be referenced from
rc.d/* or runlevel.conf or whatever init system you use.
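
a minimal sketch of that check, assuming a SysV-style layout where init
links live in /etc/rc?.d/ as S??name / K??name (other init systems keep
this information elsewhere). the helper name find_init_links is made up
for illustration, it is not part of drbd or heartbeat:

```python
import os
import re


def find_init_links(etc_dir, service="drbd"):
    """Return sorted (rc_dir, link_name) pairs for S??/K?? links of `service`.

    Assumes SysV-style directories named rc0.d .. rc6.d and rcS.d
    directly below `etc_dir`.
    """
    link_re = re.compile(r"^[SK]\d\d" + re.escape(service) + r"$")
    rc_dir_re = re.compile(r"^rc[0-9S]\.d$")
    hits = []
    for entry in sorted(os.listdir(etc_dir)):
        rc_dir = os.path.join(etc_dir, entry)
        if not (rc_dir_re.match(entry) and os.path.isdir(rc_dir)):
            continue
        for name in sorted(os.listdir(rc_dir)):
            if link_re.match(name):
                hits.append((entry, name))
    return hits


if __name__ == "__main__":
    # With the heartbeat ocf drbd ra, this loop should print nothing.
    for rc_dir, name in find_init_links("/etc"):
        print("drbd is still referenced by init: %s/%s" % (rc_dir, name))
```

an empty result is what you want when the ocf drbd ra manages drbd.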

if you happen to let init stop drbd
before init would stop heartbeat,
then what you describe may be one of the symptoms.

consider using the drbddisk ra instead (where, in turn, you have to let
init start and stop drbd, start drbd before heartbeat, stop heartbeat
before drbd)
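
the ordering for the drbddisk case can be sanity-checked with a small
sketch like the one below. it assumes SysV-style S??name / K??name
links that init runs sequentially in sorted name order within one
runlevel directory, and that the scripts are called "drbd" and
"heartbeat"; the function names are hypothetical:

```python
import re


def find_link(names, kind, service):
    """Return the S??/K?? link name for `service` in `names`, or None."""
    link_re = re.compile(r"^%s\d\d%s$" % (kind, re.escape(service)))
    for name in sorted(names):
        if link_re.match(name):
            return name
    return None


def drbddisk_order_problems(names):
    """Check one runlevel directory listing for the drbddisk ordering.

    Expected: drbd starts before heartbeat, heartbeat stops before drbd.
    Links are compared as strings because init is assumed to run them
    in sorted name order.
    """
    problems = []
    s_drbd = find_link(names, "S", "drbd")
    s_hb = find_link(names, "S", "heartbeat")
    k_drbd = find_link(names, "K", "drbd")
    k_hb = find_link(names, "K", "heartbeat")
    if s_drbd and s_hb and not s_drbd < s_hb:
        problems.append("start order: %s runs before %s" % (s_hb, s_drbd))
    if k_drbd and k_hb and not k_hb < k_drbd:
        problems.append("stop order: %s runs before %s" % (k_drbd, k_hb))
    return problems


if __name__ == "__main__":
    # Example listing of one runlevel directory (made-up priorities).
    print(drbddisk_order_problems(["K05drbd", "K08heartbeat"]))
```

feed it the output of a directory listing of each rc?.d directory; an
empty list means the order matches what drbddisk needs.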

did that help?

-- 
: Lars Ellenberg                
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
