[DRBD-user] Stale NFS file handle problem

Mike mike-drbd at tiedyenetworks.com
Sat Nov 5 19:34:13 CET 2011

On 11/03/2011 12:33 AM, Nick Morrison wrote:

> Where are you mounting your exports from?  A third machine?  FWIW, I
> had THAT working just fine.  I did quite a bit of testing with
> reading and writing large files, and reading and writing many little
> files, and so on, whilst doing the switchover, and it seemed to work
> well.  I followed probably the exact same HOWTOs you did - the linode
> library one was memorable.  As long as I wasn't making my machines be
> both nfs-servers AND nfs-clients, it was smooth.

I have a client machine which is neither of the NFS servers. In my
case I discovered that trying to do this with btrfs on a raw partition
just doesn't work: the primary can't become primary on startup (it has
to do with how btrfs detects file systems). If you put btrfs on an
LVM volume instead, then it does at least automount and export the
fs on startup. At any rate, the stale handle issue is driving
me nuts. I don't know what variables are at play here, but digging around
I see that the UUIDs of the partitions differ between my two servers.
sda5 is the raw PV, while 'vg00-replicated' is the LV residing on sda5,
containing the replicated file system:

/dev/sda5: UUID="78bc44d60960f573" TYPE="drbd"
/dev/mapper/vg00-replicated: UUID="c1d2bcc90aadabdf" TYPE="drbd"

/dev/sda5: UUID="12d84d507ae6cccf" TYPE="drbd"
/dev/mapper/vg00-replicated: UUID="cf11685207c92f2d" TYPE="drbd"
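
For anyone repeating this comparison, here is a minimal sketch that parses
blkid-style output from two nodes and lists the devices whose UUIDs differ.
The names parse_blkid, uuid_mismatches, NODE_A and NODE_B are hypothetical;
the sample text is the output quoted above, and calling the first pair
"node A" and the second "node B" is an assumption, since the post does not
say which server produced which pair.

```python
import re

# Matches KEY="value" pairs in one line of blkid-style output.
_PAIR = re.compile(r'(\w+)="([^"]*)"')


def parse_blkid(text):
    """Return {device: {key: value}} parsed from blkid-style output."""
    devices = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        # The device path ends at the first colon.
        device, _, rest = line.partition(":")
        devices[device.strip()] = dict(_PAIR.findall(rest))
    return devices


def uuid_mismatches(a, b):
    """Return {device: (uuid_a, uuid_b)} for shared devices whose UUIDs differ."""
    out = {}
    for dev in sorted(set(a) & set(b)):
        ua = a[dev].get("UUID")
        ub = b[dev].get("UUID")
        if ua != ub:
            out[dev] = (ua, ub)
    return out


# Sample input copied from the post; the node labels are assumed.
NODE_A = '''
/dev/sda5: UUID="78bc44d60960f573" TYPE="drbd"
/dev/mapper/vg00-replicated: UUID="c1d2bcc90aadabdf" TYPE="drbd"
'''

NODE_B = '''
/dev/sda5: UUID="12d84d507ae6cccf" TYPE="drbd"
/dev/mapper/vg00-replicated: UUID="cf11685207c92f2d" TYPE="drbd"
'''

if __name__ == "__main__":
    diffs = uuid_mismatches(parse_blkid(NODE_A), parse_blkid(NODE_B))
    for dev, (ua, ub) in diffs.items():
        print(f"{dev}: {ua} != {ub}")
```

This only reports that the identifiers differ; it says nothing about whether
the difference is related to the stale handles.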

	This puzzles me. These file systems are the same, and I've been through
multiple (hours-long) resyncs, as well as running md5sum on the
raw partitions. Maybe this is a red herring.

	I've just tried reformatting with ext4. In that config (ext4 on top of
an LVM volume), things are much more stable and there are no more stale
NFS file handles reported as I stop/start heartbeat on the 'main',
forcing a transition to the backup. However, there is a secondary problem
- occasionally my rsync process will report other fs errors, for example
'permission denied' when writing (if it's in the middle) or 'file
disappeared' (but checking the fs afterwards shows it's there).
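
One way to narrow down errors like these is to run a small write probe on the
NFS client while forcing the failover, and record which errno each attempt
returns. The following is only a sketch under assumptions: probe_write and
probe_loop are hypothetical helper names, and the path passed in would be a
file on the NFS mount, which the post does not name.

```python
import errno
import os
import time


def probe_write(path, payload=b"probe\n"):
    """Attempt one write plus fsync; return "ok" or the symbolic errno name."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            os.write(fd, payload)
            os.fsync(fd)
        finally:
            os.close(fd)
        return "ok"
    except OSError as exc:
        # For example ESTALE (stale file handle) or EACCES (permission denied).
        return errno.errorcode.get(exc.errno, str(exc.errno))


def probe_loop(path, count, interval=1.0):
    """Run probe_write `count` times; return a list of (timestamp, result)."""
    results = []
    for _ in range(count):
        results.append((time.time(), probe_write(path)))
        time.sleep(interval)
    return results


if __name__ == "__main__":
    import sys

    # Usage: python probe.py /path/on/nfs/mount/probe.tmp
    for stamp, result in probe_loop(sys.argv[1], count=60):
        print(f"{stamp:.0f} {result}")
```

Comparing the timestamps against the heartbeat logs on both servers would show
whether the errors cluster around the moment of takeover.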

	I just don't know what could be the deal here. Everything I can see
works until you test it during failovers, and that's where the issues
show up. I can start file system operations up again (start my rsync
over again, for example) and the images on disk are fine.
But the errors caused during the transition are deadly and make it a no-go.
What other kinds of things should I be looking at?

