[DRBD-user] Drbd and heartbeat - bit off topic

Lars Ellenberg Lars.Ellenberg at linbit.com
Wed Jun 8 09:38:30 CEST 2005


/ 2005-06-08 09:03:13 +0200
\ Seuberth, Frank:
> > Hello,
> > 
> > I have the following question:
> > We have SLES9, drbd-0.7.5-0.16, heartbeat-1.2.3-2.3
> > 
> > The /etc/ha.d/haresources is:
> > gf01sxas34  datadisk::r0
> > Filesystem::/dev/drbd0::/vol02::ext3::defaults,noauto 10.90.194.52
> > 

/ 2005-06-08 09:08:52 +0200
\ Stephan Rattai:
> The datadisk script was right for drbd-0.6. In 0.7, use drbddisk.

Correct, but on SUSE...
the packages are "backwards" (uhm, compatible, of course :-> )
and ship a symlink datadisk -> drbddisk, which is a good thing.

> > When NodeB takes over due to a network failure of the public LAN
> > (by ipfail; the drbd and heartbeat LAN is still running), the drbd
> > resource does not become primary on NodeB automatically.
> > 
> > Which drbd command do I have to put into /etc/ha.d/haresources to get
> > this running?

Well, you could adjust the timeouts (heartbeat, drbd),
or raise the "try" count in the drbddisk script:
/etc/ha.d/resource.d/drbddisk:
   start)
        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time

==>>    try=6    <<== HERE

        while true; do
                $DRBDADM primary $RES && break
                let "--try" || exit 20
                sleep 1
        done
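
To see how the try count behaves, here is a minimal stand-alone sketch of
the same retry loop. It is not the drbddisk script itself: fake_primary is a
hypothetical stand-in for "$DRBDADM primary $RES" that fails the first three
times, and retry_primary and its delay argument are names made up for
illustration.

```shell
#!/bin/sh
# Sketch only: fake_primary is a hypothetical stand-in for
# "$DRBDADM primary $RES". It fails on the first three calls,
# as if the peer had not yet been declared dead.
attempts=0
fake_primary() {
    attempts=$((attempts + 1))
    [ "$attempts" -gt 3 ]
}

# retry_primary TRIES [DELAY]: call fake_primary up to TRIES times,
# sleeping DELAY seconds (default 1) between attempts.
# Returns 0 on success, 20 once the tries are used up.
retry_primary() {
    try=$1
    delay=${2:-1}
    while true; do
        fake_primary && return 0
        try=$((try - 1))
        [ "$try" -gt 0 ] || return 20
        sleep "$delay"
    done
}
```

With a count of 6 the fourth attempt succeeds; with a count of 3 the loop
gives up and returns 20, the same exit code the excerpt above uses.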

And you probably want to verify, and if necessary adjust, the heartbeat
ResourceManager script so that it does NOT go on to start dependent resources
when an earlier resource fails to start. Otherwise you end up with running
services but no storage (fixed in heartbeat STABLE_1_2, but not yet released;
it will be in 1.2.4).

/usr/lib/heartbeat/ResourceManager: ~ line 253
  acquireresourcegroup() {
    ha_log "info: Acquiring resource group: $*"
    node="$1"
    shift
    rc=0;
    for j in "$@"
    do
      if
        we_own_resource "$j" || doscript "$j" start
      then
        : $j start succeeded
      else
        rc=$?

==>> HERE
        ha_log "CRIT: Giving up resources due to failure of $j"
        giveupresourcegroup "$node" "$@"
        break
<<==

      fi
    done
    return $rc
  }
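
The effect of that change can be tried out without heartbeat. The following
is only a sketch of the stop-on-first-failure pattern: start_resource and
release_group are hypothetical stand-ins for doscript and
giveupresourcegroup, and FAIL_ON, attempted and running are variables
invented for the demonstration.

```shell
#!/bin/sh
# Sketch only: start_resource and release_group are hypothetical
# stand-ins for heartbeat's doscript and giveupresourcegroup.
FAIL_ON=""      # name of the resource whose start should fail
attempted=""    # every resource we tried to start
running=""      # resources currently started

start_resource() {
    attempted="$attempted $1"
    if [ "$1" = "$FAIL_ON" ]; then
        return 1
    fi
    running="$running $1"
}

release_group() {
    running=""
}

# Start resources in order; on the first failure release everything
# and stop, so later resources are never started.
acquire_group() {
    rc=0
    for j in "$@"; do
        if start_resource "$j"; then
            : "$j" start succeeded
        else
            rc=$?
            release_group "$@"
            break
        fi
    done
    return $rc
}
```

Setting FAIL_ON=drbddisk and calling "acquire_group drbddisk Filesystem IPaddr"
returns non-zero, attempts only drbddisk, and leaves nothing running, which
is the "no services without storage" behaviour described above.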



-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


