[DRBD-user] Best Practice with DRBD RHCS and GFS2?

Thu Oct 21 12:52:28 CEST 2010

On Wed, Oct 20, 2010 at 5:41 PM, Colin Simpson <Colin.Simpson at iongeo.com> wrote:
>
> It all works clean if I wait for the drbd to be fully in sync, before
> clvmd is started.
>
> Any thoughts?

I worked on something similar last year.
The cluster systems were based on F12 and RHCS 3, but it probably still applies.
In my case I modified the clvmd init script with something like the
following inside the start section. It is probably suboptimal, but it
works in my primary/primary setup. You can adapt the parameters
(NR_ATTEMPTS and the sleep time) and the resource name (r0 here) to
other values, or first run a drbdadm command to get all your resources.

Instead of the original:
  start)
        start
        rtrn=$?
        [ $rtrn = 0 ] && touch $LOCK_FILE
        ;;

I used this as the body of the start) case:
> 	echo -n "Wait for drbd to be UpToDate and Primary:"
> 	DRBD_STATUS=KO
> 	ATTEMPT=0
> 	NR_ATTEMPTS=10
> 	while [ $ATTEMPT -lt $NR_ATTEMPTS ]
> 	do
> 		(( ATTEMPT++ ))
> 		# Redirect stderr of drbdadm itself (not of cut) to the log;
> 		# the first field before "/" is the local state or role.
> 		DRBD_DSTATE=$(drbdadm dstate r0 2>> /var/log/drbd_clvmd.log | cut -d "/" -f 1)
> 		DRBD_ROLE=$(drbdadm role r0 2>> /var/log/drbd_clvmd.log | cut -d "/" -f 1)
> 		if [ "$DRBD_DSTATE" != "UpToDate" ] || [ "$DRBD_ROLE" != "Primary" ]
> 		then
> 			echo "$(date): $DRBD_DSTATE $DRBD_ROLE" >> /var/log/drbd_clvmd.log
> 			sleep 60
> 		else
> 			DRBD_STATUS=OK
> 			# Stop polling as soon as the resource is ready.
> 			break
> 		fi
> 	done
> 	if [ "$DRBD_STATUS" = "OK" ]
> 	then
> 		start
> 		rtrn=$?
> 		[ $rtrn = 0 ] && touch $LOCK_FILE
> 	else
> 		exit 1
> 	fi
>      ;;

So in this case clvmd does not actually try to start if, after 10
attempts at 60-second intervals (about 10 minutes), the drbd resource
is not both UpToDate and Primary; the clvmd script then exits with
status 1.
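
The parameterization mentioned above (number of attempts, sleep time,
several resources) could be sketched roughly as follows. This is only an
illustration, not what was actually deployed: the function names
drbd_ready and wait_for_drbd and the SLEEP_SECS variable are made up
here, and the only drbdadm calls used are the same "dstate" and "role"
subcommands as in the snippet above.

```shell
#!/bin/bash
# Hypothetical tunables; the defaults mirror the values used above.
NR_ATTEMPTS=${NR_ATTEMPTS:-10}
SLEEP_SECS=${SLEEP_SECS:-60}

# Return 0 if the local side of resource $1 is UpToDate and Primary.
drbd_ready() {
    local dstate role
    dstate=$(drbdadm dstate "$1" 2>/dev/null | cut -d "/" -f 1)
    role=$(drbdadm role "$1" 2>/dev/null | cut -d "/" -f 1)
    [ "$dstate" = "UpToDate" ] && [ "$role" = "Primary" ]
}

# Poll until every resource passed as an argument is ready.
# Return 0 on success, 1 if NR_ATTEMPTS polls were not enough.
wait_for_drbd() {
    local attempt=0 res pending
    while [ "$attempt" -lt "$NR_ATTEMPTS" ]; do
        attempt=$((attempt + 1))
        pending=0
        for res in "$@"; do
            drbd_ready "$res" || pending=1
        done
        if [ "$pending" -eq 0 ]; then
            return 0
        fi
        # Skip the sleep after the last failed attempt.
        if [ "$attempt" -lt "$NR_ATTEMPTS" ]; then
            sleep "$SLEEP_SECS"
        fi
    done
    return 1
}
```

Inside the start) case this would be called as "wait_for_drbd r0 || exit 1"
before the original start call. To cover every configured resource, the
list could probably come from "drbdadm sh-resources", but check that your
drbdadm version provides that subcommand. Note that calling wait_for_drbd
with no arguments returns success immediately.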

Hope that helps in considering possible strategies for your case.
Gianluca