[DRBD-user] HA-NFS & drbd
pauln at psc.edu
Thu Mar 3 21:53:45 CET 2005
> I need some clarification on this. I have tried it both ways and have
> had a little better luck with having /var/lib/nfs be separate on the two
> servers than having it be a symlink to the shared filesystem. I have not
> seen EPERM errors after failover, but I do sometimes have NFS filesystems
> show up in 'df' on some clients as having some huge number of free blocks
> (which I assumed was a misinterpretation of -1) until I do "exportfs -r"
> on the new active server. Is that an EPERM error? It usually is not on
> all clients, just some of the ones in the netgroup that the filesystem is
> exported-to (and I've even seen it when explicitly listing all the hosts
> in /etc/exports). I'm quite sure I saw these problems whether or not
> /var/lib/nfs was a symlink to the shared filesystem. If I make /var/lib/nfs
> a symlink to the shared filesystem, then it disappears on the standby
> server and 'df' hangs there when it gets up to the shared filesystem,
> which I have mounted from the active server by NFS.
After some trip-ups, I've got a similar config working very well.
I've been sharing /var/lib/nfs the entire time and have not seen any
problems, even after 20 or 30 failovers (hard and soft). One thing
I have learned is that running an NFS server on your standby machine
is a bad idea. I run "chkconfig nfs off" on both machines and let
heartbeat start nfsd after it has mounted the drbd device. If the machine
is on standby, nfsd is not started at all. One time, after a machine was
rebuilt, nfsd was on by default and failovers ceased to work at all.
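
A minimal sketch of that arrangement in /etc/ha.d/haresources might look
like the following. It assumes heartbeat v1-style configuration and the
drbddisk resource script shipped with DRBD 0.7. The node name, VIP, DRBD
resource name, device, mount point, and filesystem type are all
placeholders, and the init script is assumed to be called "nfs" as on
Red Hat-style systems:

```conf
# /etc/ha.d/haresources (keep identical on both nodes)
# Hypothetical values: node "nfs1", VIP 192.168.1.100, DRBD resource "r0"
# on /dev/drbd0, mounted at /export as ext3.
# heartbeat starts the resources left to right and stops them right to
# left, so nfs is started only after the DRBD-backed filesystem is mounted.
nfs1 IPaddr::192.168.1.100 drbddisk::r0 Filesystem::/dev/drbd0::/export::ext3 nfs
```

With nfs turned off in chkconfig on both nodes, nothing but heartbeat
should ever start nfsd, so the standby never runs an NFS server.
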
I have not been mounting the HA share on the failover cluster nodes,
but I don't think that would be a bad thing, unless the NFS client
translates the VIP to the loopback. Is your NFS client explicitly
mounting from the VIP or from the real IP? I'd recommend that you don't
allow any non-vital processes on your failover cluster.
> Is it in general a Very Bad Thing to NFS-mount the shared filesystem on
> the HA-NFS servers? I have never seen anybody explicitly state that,
> although I'm beginning to come to that conclusion. I've had problems with
> fuser hanging on failover, and even after avoiding that I still sometimes
> see hangs on shutdown that I'm quite sure are related to operations
> attempting to access the non-responding NFS mountpoint. In my case the
> shared filesystem holds almost everybody's home directories so it's rather
> a pain to not be able to access them on the standby shared file server.
> I need to allow people to log in to the active server so I'd have to
> have their home directories be set up there to be symlinks directly to
> the mounted filesystem (because that's how we do CVS accesses to avoid
> problems with CVS over NFS), but that means that when a failover happens
> every process that is directly accessing the filesystem will get killed
> which isn't very friendly.
> - Dave Dykstra