[DRBD-user] Stale NFS file handle vs. NFS-Server-README.txt

Eugene Crosser crosser at rol.ru
Wed Jun 9 08:03:52 CEST 2004


On Wed, 2004-06-09 at 00:10, Jens Dreger wrote:
> Hi!
> 
> I'm trying to set up a drbd+heartbeat NFS-server. Most things work fine,
> but if I write to the NFS storage during failover, I get a Stale NFS
> file handle error:
> 
> root:~> cp /tmp/large_file /mnt
> cp: writing `/mnt/large_file': Stale NFS file handle
> cp: closing `/mnt/large_file': Stale NFS file handle

Not sure if it will help in your particular case, but I have two pieces
of advice:

- do *not* ever unexport the filesystem when you stop the NFS server
- keep /var/lib/nfs on the exported filesystem (e.g. as a symlink pointing there)
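
A minimal sketch of the second point, assuming the replicated filesystem is
mounted on /drbd/0 (as in the log quoted below) and that the NFS server is
stopped while this runs. The function name relocate_nfs_state and the target
directory /drbd/0/varlibnfs are made up for illustration; adjust both to
your setup.

```shell
#!/bin/sh
# Sketch only: move the NFS state directory onto the shared filesystem
# and leave a symlink behind. relocate_nfs_state is a hypothetical helper,
# not part of drbd, heartbeat, or nfs-utils.
relocate_nfs_state() {
    state_dir=$1    # normally /var/lib/nfs
    shared_dir=$2   # assumed location on the DRBD device, e.g. /drbd/0/varlibnfs

    # Already a symlink: nothing to do, so the function is safe to re-run.
    if [ -L "$state_dir" ]; then
        return 0
    fi

    mkdir -p "$shared_dir" || return 1

    # Copy the existing state (rmtab, etab, ...) and keep the old
    # directory around as a local backup.
    if [ -d "$state_dir" ]; then
        cp -a "$state_dir/." "$shared_dir/" || return 1
        mv "$state_dir" "$state_dir.local" || return 1
    fi

    ln -s "$shared_dir" "$state_dir"
}

# Example call (run on the node that currently has the device mounted):
#   relocate_nfs_state /var/lib/nfs /drbd/0/varlibnfs
```

On the other node the target only exists after a takeover, so the symlink
there will dangle until the device is mounted; that is expected.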

> I can get rid of this error if I insert a short delay before
> taking over the IP:
> 
> [/etc/ha.d/haresources]
>   node1 datadisk::drbd0 nfs-kernel-server nfs-common \
>       wait_n_seconds::5 IPaddr::160.45.32.173
> 
> (wait_n_seconds::5 just sleeps for 5 seconds). Putting the ip in front
> as suggested in http://www.slackworks.com/~dkrovich/DRBD/heartbeat.html
> doesn't work at all.
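
For readers who want to try the same workaround: the wait_n_seconds script
itself was not posted, so the following is only a guess at what such a
resource script could look like. It assumes heartbeat invokes a resource
listed as name::arg with the argument first and the action (start, stop or
status) last, and that the delay is only wanted on start.

```shell
#!/bin/sh
# Hypothetical reconstruction of /etc/ha.d/resource.d/wait_n_seconds.
# Assumed calling convention: wait_n_seconds <seconds> {start|stop|status}
wait_n_seconds() {
    seconds=$1
    action=$2

    case "$action" in
        start)
            # Delay the resources listed after this one (here: the IP takeover).
            sleep "$seconds"
            ;;
        stop)
            # Nothing to tear down.
            :
            ;;
        status)
            # Assumption: report "stopped" so that start is always run on takeover.
            echo "stopped"
            ;;
        *)
            echo "usage: wait_n_seconds <seconds> {start|stop|status}" >&2
            return 1
            ;;
    esac
}

# An installed script would end with:
#   wait_n_seconds "$@"
```

A fixed sleep only hides the race between the NFS server becoming ready and
the service IP coming up, so treat it as a stopgap rather than a fix.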
> 
> drbd/documentation/NFS-Server-README.txt suggests removing any
> "exportfs -au" from the NFS init scripts. But then heartbeat can no
> longer unmount the drbd device on failover, which leads to a reboot
> of the primary node followed by a re-sync.
> 
>   node1 datadisk: ===> datadisk drbd0 stop <===
>   node1 datadisk: 'drbd0' /dev/nbd/0 is mounted on /drbd/0, trying to unmount
>   node1 datadisk: 'drbd0' trying to kill users of /dev/nbd/0
>   node1 datadisk: fuser -k -m /dev/nbd/0
>   node1 datadisk: umount -v /dev/nbd/0
>   node1 heartbeat: CRIT: Resource STOP failure. Reboot required!
>   node1 heartbeat: CRIT: Killing heartbeat ungracefully!
> 
> This behaviour can be reproduced by:
> 
>   root:~> mount /dev/hda3 /mountpoint
>   root:~> exportfs -vi node1:/mountpoint
>   exporting node1.physik.fu-berlin.de:/mountpoint
>   exporting node1.physik.fu-berlin.de:/mountpoint to kernel
>   root:~> umount /mountpoint
>   umount: /mountpoint: device is busy
>   umount: /mountpoint: device is busy
>   root:~> fuser -k -m /mountpoint
>   root:~>				[NO OUTPUT]
> 
> After issuing an "exportfs -au" the filesystem can be unmounted:
> 
>   root:~> exportfs -au
>   root:~> umount /mountpoint
>   root:~>				[WORKS]
> 
> Thus I cannot understand how the advice given in
> NFS-Server-README.txt could have worked.
> 
> /var/lib/nfs is on the shared device and, as I said, everything works
> fine (no data corruption whatsoever) iff I insert the small delay.
> So this is not a big problem, but I would like to understand why
> no one else seems to have this problem.
> 
> Using:
> 	drbd-0.6.12
> 	heartbeat-1.2.2
> 	kernel 2.4.26
> 	debian woody
> 
> Any help is greatly appreciated,
> 
> Jens.