[DRBD-user] how to avoid Stale NFS file handle with multiple drbds?

Raoul Borenius raoul at sgs.dfn.de
Tue Dec 20 14:58:39 CET 2005


On Tue, Dec 13, 2005 at 09:54:40AM -0500, Todd Denniston wrote:
> >
> >Both nodes are always set up identically. I just tried two different setups:
> >
> >1.) the 'simple' setup with only one drbd as described in
> >    http://linux-ha.org/DRBD_2fNFS, i.e. with a symlink from /var/lib/nfs
> >    to the drbd-volume. Everything works perfectly, no errors on failover.
> >
> >2.) a setup without symlink but /var/lib/nfs as /dev/drbd0 and
> >    /srv/nfs/mail as /dev/drbd1. I get stale nfs errors during failover.
> >
> 
> I find setup 2 to be interesting in its failure, because I would think it 
> should work the same, but I have no idea why it fails.

Anyone else on the list?

> To me your haresources snippet looks reasonable for 0.7.x, though I question
> two things:
> 
> 1) Why do you run some kind of killnfsd before running `nfs-common start` 

Because the 'DRBD Heartbeat and NFS on Debian HowTo' told me to...

You can find it on http://linux-ha.org/DRBD_2fNFS.

I left it out today and it makes no difference. My haresources for the
simple setup looks like this now:

b1      drbddisk::nfs \
        Filesystem::/dev/drbd1::/srv/nfs::ext3::noatime \
        nfs-common \
        nfs-kernel-server \
        sleep::3 \
        IPaddr::194.95.249.20/24/eth0

Remember, I use a symlink /var/lib/nfs -> /srv/nfs/varlibnfs which helps
to avoid the Stale NFS error.
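
Whether the NFS state directory really ends up on the drbd volume can be
checked on each node with a few lines of Python. This is only a sketch:
the function name nfs_state_on_drbd is made up for illustration, and the
default paths are simply the ones from the setup above.

```python
import os

def nfs_state_on_drbd(state_dir="/var/lib/nfs", drbd_mount="/srv/nfs"):
    """Return True if state_dir resolves to a path inside drbd_mount.

    Hypothetical helper: it only follows symlinks and compares paths,
    it does not check that drbd_mount is actually mounted.
    """
    real = os.path.realpath(state_dir)      # follows the symlink, if any
    mount = os.path.realpath(drbd_mount)
    return os.path.commonpath([real, mount]) == mount

# With /var/lib/nfs -> /srv/nfs/varlibnfs in place this should print True.
print(nfs_state_on_drbd())
```

If this prints False on the active node, the NFS state is still living on
the local disk and will not follow the failover.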

> and `nfs-kernel-server start`, if nfs for some reason was running when 
> heartbeat got kicked off, what you really need to do is edit your rc config 
> (on fedora I would use ntsysv ) to not start nfs services on boot.

I've done that. nfs is completely controlled by heartbeat!

> 2) why do you have a 3 second sleep between starting nfs and starting your 
> Ethernet config?

Without the 'sleep 3' I get the Stale NFS error on the client. On
shutdown heartbeat stops the resources in reverse order, so without the
sleep the nfs-server is shut down immediately after 'IPaddr' is stopped.
It seems the interface takes some time to really go away, and as long as
the interface is up, the nfs-server should be up as well.
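
The ordering can be made visible with a small Python sketch. It only
models the rule that heartbeat starts the resources of a haresources
entry left to right and stops them right to left; it does not talk to
heartbeat. The helper names (parse_entry, start_order, stop_order) are
made up for illustration, and it assumes the local 'sleep' resource
script pauses on stop as well as on start.

```python
# Sketch only: models the start/stop ordering of one haresources entry.
HARESOURCES = r"""
b1      drbddisk::nfs \
        Filesystem::/dev/drbd1::/srv/nfs::ext3::noatime \
        nfs-common \
        nfs-kernel-server \
        sleep::3 \
        IPaddr::194.95.249.20/24/eth0
"""

def parse_entry(text):
    """Join backslash-continued lines, return (node, [resources])."""
    joined = text.replace("\\\n", " ")
    fields = joined.split()
    return fields[0], fields[1:]

def start_order(resources):
    """Resources are started in the order they are listed."""
    return list(resources)

def stop_order(resources):
    """Resources are stopped in reverse order."""
    return list(reversed(resources))

node, resources = parse_entry(HARESOURCES)
print("start:", " -> ".join(start_order(resources)))
print("stop: ", " -> ".join(stop_order(resources)))
```

In the stop sequence 'sleep::3' sits between 'IPaddr' and
'nfs-kernel-server', which is where the pause described above happens.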

Could anyone with more knowledge comment on this?

> I did just have a thought on your setup #2 (mounting /dev/drbd0 at 
> /var/lib/nfs), try putting a 3 second sleep between the Filesystem call and 
> the nfs startups. The thought being to make sure the /var/lib/nfs 
> Filesystem is fully ready for use before trying to use it, i.e., see if it 
> is a race condition.

I've configured my second setup again like this:

b1      drbddisk::varlibnfs \
        drbddisk::nfs \
        Filesystem::/dev/drbd0::/var/lib/nfs::ext3::noatime \
        Filesystem::/dev/drbd1::/srv/nfs::ext3::noatime \
        nfs-common \
        nfs-kernel-server \
        sleep::3 \
        IPaddr::194.95.249.20/24/eth0

and today I can switch back and forth between the two nodes without
error!

I can't say what has changed. Still investigating...
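
One way to test the race-condition idea without a fixed sleep would be to
wait until both mount points are really mounted before the NFS services
start. The following Python is a sketch under that assumption, not
something heartbeat provides: wait_for_mounts and its parameters are
hypothetical, and it would have to be wrapped in a resource script of
its own.

```python
import os
import time

def wait_for_mounts(paths, timeout=3.0, interval=0.1, is_mount=os.path.ismount):
    """Poll until every path is a mount point or the timeout expires.

    Returns the list of paths that are still NOT mounted (empty on success).
    is_mount is a parameter only so the check can be replaced in tests.
    """
    deadline = time.monotonic() + timeout
    pending = list(paths)
    while True:
        pending = [p for p in pending if not is_mount(p)]
        if not pending or time.monotonic() >= deadline:
            return pending
        time.sleep(interval)

if __name__ == "__main__":
    # Paths from the second setup above.
    missing = wait_for_mounts(["/var/lib/nfs", "/srv/nfs"])
    if missing:
        raise SystemExit("not mounted: " + ", ".join(missing))
```

If the check ever reports a missing mount during failover, that would
point at the race; if it never does, the cause lies elsewhere.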

Regards
 Raoul