On 11/03/2011 12:33 AM, Nick Morrison wrote:
>
> Where are you mounting your exports from? A third machine? FWIW, I
> had THAT working just fine. I did quite a bit of testing with
> reading and writing large files, and reading and writing many little
> files, and so on, whilst doing the switchover, and it seemed to work
> well. I followed probably the exact same HOWTOs you did - the linode
> library one was memorable. As long as I wasn't making my machines be
> both nfs-servers AND nfs-clients, it was smooth.
>

I have a client machine which is neither of the nfs servers.

In my case I discovered that trying to do this with BTRFS on a raw
partition just doesn't work: the primary can't become primary on
startup (it has to do with how btrfs detects file systems). If you put
btrfs on an LVM volume instead, then at least it does work to automount
and export the fs on startup.

But at any rate, the stale handle issue is just driving me nuts. I
don't know what variables are at play here, but digging around I see
that the UUIDs of the partitions differ between my two servers. sda5 is
the raw pv, while 'vg00-replicated' is the lv residing on sda5,
containing the replicated file system:

store0:
/dev/sda5: UUID="78bc44d60960f573" TYPE="drbd"
/dev/mapper/vg00-replicated: UUID="c1d2bcc90aadabdf" TYPE="drbd"

store1:
/dev/sda5: UUID="12d84d507ae6cccf" TYPE="drbd"
/dev/mapper/vg00-replicated: UUID="cf11685207c92f2d" TYPE="drbd"

This puzzles me. These file systems are the same: I've been through
multiple (hours-long) resyncs, as well as doing an md5sum on the raw
partitions. Maybe this is a red herring.

I've just tried reformatting with ext4. In that config (ext4 on top of
an lvm volume), things are much more stable and there are no more stale
nfs file handles reported as I stop/start heartbeat on the 'main',
forcing transition to the backup.
However, there is a secondary problem: occasionally my rsync process
will report other fs errors, for example 'permission denied' when
writing (if it's in the middle of a write) or 'file disappeared' (but
checking the fs afterwards shows the file is there).

I just don't know what could be the deal here. Everything I can see
works - until you test it during failovers, and that's where the issues
show up. I can start file system operations up again (start my rsync
over again, for example) and the images on disk are fine. But the
errors caused during transition are deadly and make it a no-go.

What other kinds of things should I be looking at?

Mike-
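
One way to narrow down errors like these is to record exactly which
operation fails, with which errno, and at what time, so the failures
can be lined up against the heartbeat logs on both servers. What
follows is only a minimal sketch in Python, not anything used in the
setup above. It assumes it is run on the NFS client against a
directory on the mounted export; the names probe_once and run_probe
are made up for illustration.

```python
import errno
import os
import time


def probe_once(directory, index, payload=b"x" * 4096):
    """Write, fsync, re-read, and delete one small file.

    Returns a dict with the start time and either None (success),
    "MISMATCH" (data read back differs), or the symbolic errno name
    such as "ESTALE" or "EACCES".
    """
    path = os.path.join(directory, f"probe-{index}.tmp")
    started = time.time()
    try:
        with open(path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the write out to the server
        with open(path, "rb") as handle:
            matches = handle.read() == payload
        os.remove(path)
        error = None if matches else "MISMATCH"
    except OSError as exc:
        # Translate the numeric errno into its name for easier log reading.
        error = errno.errorcode.get(exc.errno, str(exc.errno))
    return {"index": index, "time": started, "error": error}


def run_probe(directory, count, interval=0.5):
    """Yield one result per probe, sleeping `interval` seconds between them."""
    for index in range(count):
        yield probe_once(directory, index)
        time.sleep(interval)
```

For example, iterating over run_probe("/mnt/replicated", 600) and
printing each record while stopping heartbeat on the primary would show
which seconds of the transition produce errors, and whether they are
ESTALE, EACCES, or something else. The mount point /mnt/replicated is a
placeholder, not a path from the setup described above.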