[DRBD-user] Stale NFS file handle

Diego Julian Remolina diego.remolina at ibb.gatech.edu
Fri Dec 16 21:09:55 CET 2005

Is the Stale NFS problem happening on NFS version 3, 4 or both?  I am 
using NFS version 4 on RHEL4 with RHEL4, FC3 and FC4 clients, and I have 
no problems with NFS switching from primary to secondary and back. 
The one exception is that I sometimes need to restart rpcidmapd on the 
NFS server a few times after the switch (I actually added a script that 
restarts rpcidmapd 5 times at 5-second intervals); otherwise I get the 
following error message on the clients:

Dec 11 07:15:50 sauron kernel: decode_getfattr: xdr error 10008!

I also have not tested whether I can now remove that rpcidmapd script, as 
there have been kernel upgrades since I first set everything up with the 
2.6.9-11 kernel.
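
A script of that kind might look roughly like the following minimal
sketch. It is not the exact script in use here: the
`service rpcidmapd restart` command and the variable names RESTART_CMD,
ATTEMPTS and INTERVAL are assumptions for illustration, so adjust them
to your distribution's init scripts.

```shell
#!/bin/sh
# Sketch only: restart rpcidmapd several times after a failover.
# RESTART_CMD, ATTEMPTS and INTERVAL are hypothetical names; the
# restart command is an assumption and may differ on your system.
RESTART_CMD=${RESTART_CMD:-"service rpcidmapd restart"}
ATTEMPTS=${ATTEMPTS:-5}     # number of restarts
INTERVAL=${INTERVAL:-5}     # seconds between restarts

restart_idmapd() {
    i=1
    while [ "$i" -le "$ATTEMPTS" ]; do
        # Keep going even if one attempt fails; just report it.
        $RESTART_CMD || echo "restart attempt $i failed" >&2
        # No need to sleep after the final attempt.
        if [ "$i" -lt "$ATTEMPTS" ]; then
            sleep "$INTERVAL"
        fi
        i=$((i + 1))
    done
}
```

Calling `restart_idmapd` once the NFS service has been started on the
node that takes over would give the 5 restarts at 5-second intervals
described above.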


Todd Denniston wrote:
> Andrés Cañada wrote:
>> Hi!
>> I've been reading about the Stale NFS file handle problem in this list.
>> As far as I can see, this is a Debian/NFS problem, so that's my 
>> problem, and I haven't read the solution that works for me.
>  From the above and below text you wrote I think you may have one 
> problem causing another.
> If you kill nfsd, it cannot store the data in drbd0 that the other 
> system needs at failover to resume the NFS sessions.
> you are telling us below that you are getting an error with your 
> killnfsd script, but you have not answered the question I also asked 
> The reason for that question is simple,
> 1) nfsd service should not be started at boot so it should not be in the 
> way when heartbeat tries to start the nfs service scripts,
> on Fedora chkconfig or system-config-services can be used,
> on Debian I think one of the following will head you in the correct 
> direction:
> http://packages.debian.org/unstable/admin/sysvconfig
> http://packages.debian.org/unstable/admin/sysv-rc
> http://packages.debian.org/unstable/admin/rcconf
> http://packages.debian.org/unstable/admin/policyrcd-script-zg2
> http://packages.debian.org/unstable/admin/file-rc.en.html
> But if you are not sure which one to use, you would be better off asking 
> on a Debian list.
> But if you still need to kill your nfsd's, you would be better off having 
> your killnfsd make calls to
> `service nfs-kernel-server stop` and `service nfs-common stop`,
> as these should make sure all related programs are taken down in an 
> orderly fashion.
> Once you get over the startup/shutdown problems, then we can come back 
> to the stale NFS handles.  Granted I am making the assumption that you 
> have either mounted a drbd resource at /var/lib/nfs like Raoul, or have 
> made softlinks from /var/lib/nfs to a drbd resource directory. Note that 
> Raoul indicated the softlinks worked for him but the mounting did not, 
> so I suggest sticking with the links.
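
The softlink approach Todd mentions could be sketched as below. The
function name `relocate_nfs_state` and the destination directory are
hypothetical (the thread does not say where the drbd resource is
mounted), and the sketch assumes the NFS services are stopped and the
drbd resource is mounted while it runs.

```shell
#!/bin/sh
# Sketch only: move the NFS state directory onto a DRBD-backed
# directory and leave a symlink behind. Both paths are arguments;
# nothing here is taken from a real setup.
relocate_nfs_state() {
    src=$1    # e.g. /var/lib/nfs
    dst=$2    # hypothetical directory on the mounted drbd resource

    # Already a symlink: nothing to do.
    if [ -L "$src" ]; then
        return 0
    fi

    mkdir -p "$dst" &&
    cp -a "$src"/. "$dst"/ &&      # copy contents, keeping ownership and modes
    mv "$src" "$src.orig" &&       # keep the original as a fallback
    ln -s "$dst" "$src"
}
```

For example, `relocate_nfs_state /var/lib/nfs /mnt/drbd0/nfs`, where
`/mnt/drbd0/nfs` is only a placeholder. On the other node, where the
resource is not mounted, only the symlink itself would need to exist.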
>> I've added a 3 second delay before taking the VIP, but nothing.
>> I also made "echo 'killall -9 nfsd' > 
>> /etc/heartbeat/resource.d/killnfsd" and added killnfsd to haresources 
>> but in this case I get " CRIT: Giving up resources due to failure of 
>> killnfsd".
>> Can anybody help me? Why can't heartbeat run killnfsd, and why does it 
>> return that error?
>> Thanks in advance.
>> Andrés
> Look at the other heartbeat scripts; your killnfsd script needs to
> 1) accept the options start|stop|restart|status.
> 2) return specific data for each of the options.
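
A skeleton along the lines Todd describes might look like the sketch
below. It is an illustration, not a tested heartbeat resource script:
KILL_CMD and CHECK_CMD are hypothetical overridable hooks, `pgrep nfsd`
is an assumed way to check for nfsd, and the exact strings heartbeat
expects from `status` are not given in this thread, so compare with the
other scripts in resource.d. One plausible reason for the "Giving up
resources" error is that `killall` exits non-zero when it finds no
nfsd to kill, so the sketch always reports success for start and stop.

```shell
#!/bin/sh
# Sketch of /etc/heartbeat/resource.d/killnfsd.
# KILL_CMD and CHECK_CMD are hypothetical hooks so the logic can be
# dry-run; the defaults are assumptions, not verified settings.
KILL_CMD=${KILL_CMD:-"killall -9 nfsd"}   # command from the original post
CHECK_CMD=${CHECK_CMD:-"pgrep nfsd"}      # assumed check for running nfsd

killnfsd() {
    case "$1" in
        start|stop|restart)
            # killall exits non-zero when nothing was killed; that is
            # not a failure for this resource, so report success.
            $KILL_CMD >/dev/null 2>&1
            return 0
            ;;
        status)
            if $CHECK_CMD >/dev/null 2>&1; then
                echo "nfsd running"
            else
                echo "nfsd stopped"
            fi
            return 0
            ;;
        *)
            echo "Usage: killnfsd {start|stop|restart|status}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when called with an argument, as heartbeat would do.
if [ "$#" -gt 0 ]; then
    killnfsd "$1"
    exit $?
fi
```

The file also needs the `#!/bin/sh` line and the executable bit, which
the `echo ... > killnfsd` one-liner in the original message does not
provide.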

Diego Julian Remolina
System Administrator - Systems Support Specialist III
Institute for Bioengineering and Bioscience
Georgia Institute of Technology
Phone (404) 385-0127
Fax   (404) 894-2291
315 Ferst Drive
Atlanta, GA 30332-0363
