Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
In case anyone out there hits this issue again: I noticed there was a newer drbd version in the CentOS extras repos. When I first implemented this the latest there was drbd82; I now see they have drbd83, which is 8.3.8. Upgrading to 8.3.8 has resolved my issue! A reboot now goes through cleanly with no oops, and it looks like the correct behaviour with respect to a consistent view during the resync process. Fantastic! It means I don't need any workaround script to wait for UpToDate before starting RH Cluster Services.

The advice given by J. Ryan Earl earlier in this thread, to alter the clvmd and drbd startup scripts to ensure that clvmd starts after drbd, now works great. I have made one slight modification to his method in cluster.conf, because I personally have multiple services using the same file mounts. I do, though, still have cluster.conf managing my GFS2 mounts for me, e.g.

  <clusterfs device="/dev/CluVG0/CluVG0-projects" force_umount="0"
             fstype="gfs2" mountpoint="/mnt/projects" name="projects"
             options="acl"/>

The issue is: I don't want force_umount, as other services might be using this mount point (but may not actually be "in" it at the time). For example, Samba may not have any files open in here, so the unmount would succeed but Samba would then fail to access files in here afterwards.

So I have added this mount to fstab, but set to noauto:

  /dev/mapper/CluVG0-CluVG0-projects /mnt/projects gfs2 noauto 0 0

and then chkconfig'd gfs2 on. This means that nothing happens at boot time, but on the way down any gfs2 mounts present will get unmounted. A node shutdown now cleanly leaves the cluster and shuts down, and cluster.conf still fully manages my gfs2 mounts.

I still have the issue that a restart always brings the device up as Secondary/Primary. I wonder if the startup script doesn't do enough on restart? I notice the "start" section does:

  $DRBDADM sh-b-pri all # Become primary if configured

but this is missing from "restart". Not a big deal, but I need to be careful with that.
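Until I understand the restart behaviour better, my rough workaround is just to promote by hand afterwards. This is only a sketch, and it assumes "allow-two-primaries" / "become-primary-on both" are in the drbd config (which they presumably are, given that a plain start comes up Primary/Primary):

  /etc/init.d/drbd restart
  drbdadm primary all    # promote manually; roughly what the start path's
                         # "$DRBDADM sh-b-pri all" would have done

And for anyone wanting the init-script ordering change mentioned above, it amounts to making clvmd start after drbd (and stop before it) and then re-registering the script. No numbers shown here on purpose; check the actual "# chkconfig:" headers of your own /etc/init.d/drbd and /etc/init.d/clvmd:

  # edit the "# chkconfig:" header in /etc/init.d/clvmd so its start
  # number is higher than drbd's and its stop number is lower, then:
  chkconfig --del clvmd
  chkconfig --add clvmd
  chkconfig clvmd on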
Colin

On Wed, 2010-10-27 at 19:18 +0100, Colin Simpson wrote:
> Grr, sadly I've just tried waiting for it to become fully "UpToDate",
> with a mount in place, but the GFS2 mount remains hung even after it
> reaches this state. The noquota is probably a false lead, as I see the
> manual page for mount.gfs2 says quotas are defaulted to off anyway.
>
> I do like your idea of putting /etc/init.d/gfs2 as the furthest out
> resource, though I think I might be unable to use it for the same
> reason I have dismissed the idea of using "force_unmount=1" in the
> clusterfs resource (and I can't see the advantage of what you are
> doing over force_unmount, again I'm maybe missing something). Namely,
> I have multiple services using the same mounts, i.e. in my case Samba
> and NFS.
>
> I know an umount may be safe, as they will probably get a busy error
> if they try to unmount while another service is using it, but some
> services, e.g. Samba, may not be "in" the mount point (i.e. if no one
> is accessing a file in there at this time), so will have that rug
> pulled away?
>
> Another weird thing on my drbd just now: is there any reason why
> bringing the drbd service up using restart causes it to come up as
> Secondary/Primary, but using just start does the right thing,
> Primary/Primary? See below (ok, the restart generates some spurious
> gunk because it isn't running, but I'd have thought it shouldn't do
> this):
>
> [root at node2 ~]# /etc/init.d/drbd stop
> Stopping all DRBD resources.
> [root at node2 ~]# /etc/init.d/drbd start
> Starting DRBD resources: [ d(r0) s(r0) n(r0) ].
> [root at node2 ~]# more /proc/drbd
> version: 8.2.6 (api:88/proto:86-88)
> GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> buildsvn at c5-i386-build, 2008-10-03 11:42:32
>
>  1: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:0
> [root at node2 ~]# /etc/init.d/drbd stop
> Stopping all DRBD resources.
> [root at node2 ~]# /etc/init.d/drbd restart
> Restarting all DRBD resourcesNo response from the DRBD driver! Is the
> module loaded?
> Command '/sbin/drbdsetup /dev/drbd1 down' terminated with exit code 20
> command exited with code 20
> ERROR: Module drbd does not exist in /proc/modules
> .
> [root at node2 ~]# more /proc/drbd
> version: 8.2.6 (api:88/proto:86-88)
> GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> buildsvn at c5-i386-build, 2008-10-03 11:42:32
>
>  1: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:0
>
> Any ideas?
>
> Thanks
>
> Colin
>
>
> On Tue, 2010-10-26 at 17:37 +0100, J. Ryan Earl wrote:
> > On Fri, Oct 22, 2010 at 12:49 PM, Colin Simpson
> > <Colin.Simpson at iongeo.com> wrote:
> > Maybe I just need to leave it for a long time? Or I wonder, because
> > you have "noquota" in your mount options and the oops is in the
> > gfs2_quotad module, you never see it?
> >
> > I saw that too... I'm not sure if the noquota statement has any
> > effect; I didn't have any problems before adding that, but I saw in
> > some tuning document it could help performance.
> >
> > Though I don't see why you are adding the /etc/init.d/gfs2 service
> > to the cluster.conf, as all that does is mount gfs2 filesystems from
> > fstab (and you say these are noauto in there), so will this do
> > anything? The inner "clusterfs" directives will handle the actual
> > mount?
> >
> > It's to handle the unmount so that the volume goes down cleanly when
> > the rgmanager service stops. clusterfs won't stop the mount, so I
> > put the mount in /etc/fstab with "noauto" and let rgmanager mount
> > and unmount GFS2.
> >
> > <resources>
> >     <clusterfs device="/dev/CluVG0/CluVG0-projects" force_umount="0"
> >                fstype="gfs2" mountpoint="/mnt/projects"
> >                name="projects" options="acl"/>
> >     <nfsexport name="tcluexports"/>
> >     <nfsclient name="NFSprojectsclnt" options="rw"
> >                target="192.168.1.0/24"/>
> >     <ip address="192.168.1.60" monitor_link="1"/>
> > </resources>
> > <service autostart="1" domain="clusterA" name="NFSprojects">
> >     <ip ref="192.168.1.60"/>
> >     <clusterfs fstype="gfs" ref="projects">
> >         <nfsexport ref="tcluexports">
> >             <nfsclient name=" " ref="NFSprojectsclnt"/>
> >         </nfsexport>
> >     </clusterfs>
> > </service>
> >
> > YMMV, but I found it best to keep 'chkconfig gfs2 off' and control
> > that as a script from rgmanager. It fixed order-of-operation issues
> > such as the GFS2 volume still being mounted during shutdown. I'd
> > wrap all your gfs2 clusterfs stanzas within a script for gfs2. I
> > suspect your gfs2 is recovering after an unclean shutdown; if you're
> > using quotas, that could add time to that operation, I suppose. Does
> > it eventually come up if you just wait?
> >
> > -JR
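Coming back to JR's suggestion quoted above ("I'd wrap all your gfs2 clusterfs stanzas within a script for gfs2"): as far as I understand it, the shape of that in cluster.conf would be roughly the sketch below. This is just my reading of it, untested, reusing the resource names from his example; the "gfs2-init" label is simply a name I made up for rgmanager's standard script resource pointed at /etc/init.d/gfs2, sitting outermost so that its stop (and hence the gfs2 unmount) runs after everything else in the service has stopped:

  <resources>
      <script file="/etc/init.d/gfs2" name="gfs2-init"/>
      ... (clusterfs, nfsexport, nfsclient, ip as in JR's example) ...
  </resources>
  <service autostart="1" domain="clusterA" name="NFSprojects">
      <script ref="gfs2-init">
          <ip ref="192.168.1.60"/>
          <clusterfs fstype="gfs2" ref="projects">
              <nfsexport ref="tcluexports">
                  <nfsclient ref="NFSprojectsclnt"/>
              </nfsexport>
          </clusterfs>
      </script>
  </service>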