Is the Stale NFS problem happening on NFS version 3, 4, or both?

I am using NFS version 4 on RHEL4 with RHEL4, FC3 and FC4 clients, and I
have no problems with NFS switching from primary to secondary and back.
The one exception is that I sometimes need to restart rpcidmapd on the
NFS server a few times after the switch (I actually added a script that
restarts rpcidmapd 5 times at 5-second intervals); otherwise I get the
following error message on the clients:

Dec 11 07:15:50 sauron kernel: decode_getfattr: xdr error 10008!

I have not yet tested whether I can now remove that rpcidmapd script, as
there have been kernel upgrades since I first set everything up with the
2.6.9-11 kernel.

Diego

Todd Denniston wrote:
> Andrés Cañada wrote:
>
>> Hi!
>> I've been reading about the Stale NFS file handle problem in this list.
>> As far as I can see, this is a Debian/NFS problem, so that's my
>> problem, and I haven't read the solution that works for me.
>
>
> From the above and below text you wrote I think you may have one
> problem causing another.
>
> If you kill nfsd it can not store needed data in drbd0 that would be
> needed on failover so the other system can resume the nfs sessions.
>
> you are telling us below that you are getting an error with your
> killnfsd script, but you have not answered the question I also asked
> Raoul, 'WHY DO YOU NEED TO KILL NFSD?'.
>
> The reason for that question is simple,
> 1) nfsd service should not be started at boot so it should not be in the
> way when heartbeat tries to start the nfs service scripts,
> on Fedora chkconfig or system-config-services can be used,
> on Debian I think one of the following will head you in the correct
> direction:
>
> http://packages.debian.org/unstable/admin/sysvconfig
> http://packages.debian.org/unstable/admin/sysv-rc
> http://packages.debian.org/unstable/admin/rcconf
> http://packages.debian.org/unstable/admin/policyrcd-script-zg2
> http://packages.debian.org/unstable/admin/file-rc.en.html
> But if you are not sure which one to use, you would be better to ask
> on a debian list.
>
> But if you still need to kill your nfsd's you would be better to have
> your killnfsd make calls to
> `service nfs-kernel-server stop` and `service nfs-common stop`,
> as these should make sure all related programs are taken down in an
> orderly fashion.
>
> Once you get over the startup/shutdown problems, then we can come back
> to the stale NFS handles. Granted I am making the assumption that you
> have either mounted a drbd resource at /var/lib/nfs like Raoul, or have
> made softlinks from /var/lib/nfs to a drbd resource directory. Note that
> Raoul indicated the softlinks worked for him but the mounting did not,
> so I suggest sticking with the links.
>
>> I've added a 3 second delay before taking the VIP, but nothing.
>> I also made "echo 'killall -9 nfsd' >
>> /etc/heartbeat/resource.d/killnfsd" and added killnfsd to haresources
>> but in this case I get " CRIT: Giving up resources due to failure of
>> killnfsd".
>> Can anybody help me? Why can't heartbeat run killnfsd, and why does it
>> return that error?
>> Thanks in advance.
>> Andrés
>>
>
> look at the other heartbeat scripts, your killnfsd script needs to
> 1) accept the options start|stop|restart|status.
> 2) return specific data for each of the options.
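
The two requirements above can be sketched as a small resource script.
This is only a sketch under assumptions, not a tested heartbeat resource:
the function name killnfsd_main and the SERVICE_CMD variable are made up
for illustration, the exit codes and the "stopped" status text are
assumptions that should be compared against the scripts already shipped
in resource.d, and the service names are the ones suggested in the quoted
text. One thing that is certain is that a bare "killall -9 nfsd" exits
non-zero when no matching process was found, so a one-line script built
from it can report failure even when there was nothing to kill.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/heartbeat/resource.d/killnfsd.
# Assumption: heartbeat calls this script with start, stop, restart or
# status, like an init script, and treats a non-zero exit as a failure.

# Command used to control services (assumed name; overridable for testing).
SERVICE_CMD=${SERVICE_CMD:-service}

killnfsd_main() {
    case "$1" in
        start)
            # Nothing to take down on start; report success.
            return 0
            ;;
        stop)
            # Stop NFS in an orderly way instead of killall -9 nfsd.
            rc=0
            $SERVICE_CMD nfs-kernel-server stop || rc=1
            $SERVICE_CMD nfs-common stop || rc=1
            return $rc
            ;;
        restart)
            killnfsd_main stop && killnfsd_main start
            ;;
        status)
            # This helper keeps nothing running itself (assumed output
            # format and LSB-style "not running" code).
            echo "stopped"
            return 3
            ;;
        *)
            echo "Usage: killnfsd {start|stop|restart|status}" >&2
            return 2
            ;;
    esac
}

# Dispatch only when an argument is given, so the file can also be
# sourced without side effects.
if [ "$#" -gt 0 ]; then
    killnfsd_main "$@"
    exit $?
fi
```

Saved as an executable file, it could be tried by hand with
"SERVICE_CMD=echo ./killnfsd stop", which only prints the two service
calls instead of running them.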
>
>
>> **CONFIDENTIALITY NOTICE** This email <WAS SENT TO A MAILING LIST AND
>> IS STORED IN MANY PLACES NOW>
>
> http://www.goldmark.org/jeff/stupid-disclaimers/
> http://www.goldmark.org/jeff/stupid-disclaimers/#sec-least-stupid
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

-- 
Diego Julian Remolina
System Administrator - Systems Support Specialist III
Institute for Bioengineering and Bioscience
Georgia Institute of Technology
Phone (404) 385-0127
Fax (404) 894-2291
315 Ferst Drive
Atlanta, GA 30332-0363
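
P.S. The rpcidmapd workaround described at the top of this message (5
restarts, 5 seconds apart) could look roughly like the sketch below. The
actual script was not posted, so the function name restart_repeatedly and
the use of "service rpcidmapd restart" are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: not the original script.
# restart_repeatedly COUNT DELAY COMMAND [ARGS...]
# Runs COMMAND COUNT times, sleeping DELAY seconds between runs.
restart_repeatedly() {
    count=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        # Keep going even if one attempt fails; a later one may succeed.
        "$@" || echo "attempt $i failed" >&2
        # Sleep between attempts, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
}

# Assumed invocation on the NFS server after a switch:
#   restart_repeatedly 5 5 service rpcidmapd restart
```

Passing the command as arguments keeps the loop independent of the init
system, so the same function works with a different restart command.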