On Wed, Oct 20, 2010 at 5:41 PM, Colin Simpson <Colin.Simpson at iongeo.com> wrote:
>
> It all works clean if I wait for the drbd to be fully in sync, before
> clvmd is started.
>
> Any thoughts?
I worked on something similar last year.
The cluster systems were based on F12 and rhcs 3, but this probably still applies.
In my case I modified the clvmd init script with something like this inside
the start section. It is probably suboptimal, but it works in my
primary/primary setup. You could turn NR_ATTEMPTS and the sleep time into
parameters, and do the same for the resource name (or run a drbdadm
command first to get all your resources...)
Instead of the original:
start)
    start
    rtrn=$?
    [ $rtrn = 0 ] && touch $LOCK_FILE
    ;;
I used:
> start)
>     echo -n "Wait for drbd to be UpToDate and Primary:"
>     DRBD_STATUS=KO
>     ATTEMPT=0
>     NR_ATTEMPTS=10
>     while [ $ATTEMPT -lt $NR_ATTEMPTS ]
>     do
>         (( ATTEMPT++ ))
>         # The local state is the field before the "/"; drbdadm errors go to the log.
>         DRBD_DSTATE=$(drbdadm dstate r0 2>> /var/log/drbd_clvmd.log | cut -d "/" -f 1)
>         DRBD_ROLE=$(drbdadm role r0 2>> /var/log/drbd_clvmd.log | cut -d "/" -f 1)
>         if [ "$DRBD_DSTATE" != "UpToDate" -o "$DRBD_ROLE" != "Primary" ]
>         then
>             echo "$(date): $DRBD_DSTATE $DRBD_ROLE" >> /var/log/drbd_clvmd.log
>             sleep 60
>         else
>             DRBD_STATUS=OK
>             # Stop polling as soon as the resource is ready.
>             break
>         fi
>     done
>     if [ $DRBD_STATUS = "OK" ]
>     then
>         start
>         rtrn=$?
>         [ $rtrn = 0 ] && touch $LOCK_FILE
>     else
>         exit 1
>     fi
>     ;;
So in this case clvmd does not even try to start if, after 10 attempts
(about 10 minutes), the drbd resource is not both UpToDate and Primary;
the clvmd script then exits with status 1.
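
The same wait could be factored into a function that takes the resource
names as arguments, so the attempts and sleep time become settings rather
than hard-coded values. This is only a rough sketch: wait_for_drbd,
DRBD_WAIT_ATTEMPTS, DRBD_WAIT_SLEEP and DRBD_WAIT_LOG are hypothetical names,
not part of the stock clvmd init script, and it assumes "drbdadm dstate" and
"drbdadm role" print the local value before the "/" as above.

```shell
#!/bin/bash
# Sketch only: all names below are hypothetical, not from the stock clvmd script.
DRBD_WAIT_ATTEMPTS=${DRBD_WAIT_ATTEMPTS:-10}   # checks per resource
DRBD_WAIT_SLEEP=${DRBD_WAIT_SLEEP:-60}         # seconds between checks
DRBD_WAIT_LOG=${DRBD_WAIT_LOG:-/var/log/drbd_clvmd.log}

# Return 0 once every resource given as an argument is UpToDate and Primary
# on this node, or 1 as soon as one resource runs out of attempts.
wait_for_drbd() {
    local res dstate role attempt
    for res in "$@"; do
        attempt=0
        while :; do
            # Keep only the local value, i.e. the field before the "/".
            dstate=$(drbdadm dstate "$res" 2>> "$DRBD_WAIT_LOG" | cut -d / -f 1)
            role=$(drbdadm role "$res" 2>> "$DRBD_WAIT_LOG" | cut -d / -f 1)
            [ "$dstate" = "UpToDate" ] && [ "$role" = "Primary" ] && break
            attempt=$(( attempt + 1 ))
            [ "$attempt" -ge "$DRBD_WAIT_ATTEMPTS" ] && return 1
            echo "$(date): $res $dstate $role" >> "$DRBD_WAIT_LOG"
            sleep "$DRBD_WAIT_SLEEP"
        done
    done
    return 0
}
```

Inside the start section this would be called as "wait_for_drbd r0 || exit 1"
before the original start lines. To cover every configured resource, the
argument list could come from "drbdadm sh-resources", which as far as I know
prints the resource names; please check that against your drbd version.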
Hope that helps in considering possible strategies for your case.
Gianluca