[DRBD-user] HA-NFS & drbd

Dave Dykstra dwdha at drdykstra.us
Thu Mar 3 21:21:27 CET 2005


On Wed, Mar 02, 2005 at 12:09:36PM -0500, Todd Denniston wrote:
> Eugene Crosser wrote:
> > Brett Bolen wrote:
...
> > > Does /var/lib/nfs in a shared directory take care of these things?
> > > I've seen cases where the file system goes away ( without drbd),
> > > but the nfs clients continue to operate as if there was a
> > > file system ( nfs caching?).
...
> > Second, *if* you export shares for networks rather than individual IP
> > addresses, you *must* have /var/lib/nfs symlinked (or mounted) onto a
> > filesystem residing on a DRBD device.  Otherwise you will get EPERM
> > errors after failover, and will have to remount shares on the clients.
> > 
> <SNIP>
> 
> Having /var/lib/nfs on a DRBD device (or symlinked) is how the howto[2]
> showed it, and it works great.  Why would you not want to share this
> state information between the servers?

I need some clarification on this.  I have tried it both ways and have
had slightly better luck keeping /var/lib/nfs separate on the two
servers than making it a symlink to the shared filesystem.  I have not
seen EPERM errors after failover, but I do sometimes have NFS filesystems
show up in 'df' on some clients with a huge number of free blocks
(which I assumed was a misinterpretation of -1) until I run "exportfs -r"
on the new active server.  Is that an EPERM error?  It usually does not
affect all clients, just some of the ones in the netgroup that the
filesystem is exported to (and I've even seen it when explicitly listing
all the hosts in /etc/exports).  I'm quite sure I saw these problems
whether or not /var/lib/nfs was a symlink to the shared filesystem.  If
I make /var/lib/nfs a symlink to the shared filesystem, then it
disappears on the standby server, and 'df' hangs there when it reaches
the shared filesystem, which I have NFS-mounted from the active server.
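
To show what I mean by a misinterpretation of -1, here is a minimal
Python sketch.  It only demonstrates the arithmetic; that the client
keeps the free-block count in an unsigned 32- or 64-bit field is my
assumption, not something I have confirmed, and as_unsigned is just a
name made up for this example.

```python
def as_unsigned(value, bits):
    """Reinterpret a signed integer as an unsigned integer of `bits` width."""
    return value & ((1 << bits) - 1)


# A free-block count of -1 (for example, an error value) read back from an
# unsigned field turns into the largest value that field can hold.
print(as_unsigned(-1, 32))  # 4294967295
print(as_unsigned(-1, 64))  # 18446744073709551615
```

Either value would show up in 'df' as an absurdly large amount of free
space, which is consistent with what I see, though it does not prove
that this is the cause.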

Is it in general a Very Bad Thing to NFS-mount the shared filesystem on
the HA-NFS servers?  I have never seen anybody explicitly state that,
although I'm beginning to come to that conclusion.  I've had problems with
fuser hanging on failover, and even after avoiding that I still sometimes
see hangs on shutdown that I'm quite sure are related to operations
attempting to access the non-responding NFS mountpoint.  In my case the
shared filesystem holds almost everybody's home directories, so it's
rather a pain not to be able to access them on the standby file server.
I need to allow people to log in to the active server, so their home
directories there would have to be symlinks directly to the mounted
filesystem (that's how we do CVS accesses, to avoid problems with CVS
over NFS).  But that means that when a failover happens, every process
that is directly accessing the filesystem will get killed, which isn't
very friendly.

- Dave Dykstra


