Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
/ 2004-10-14 23:59:42 -0400 \ Omar Kilani:
> Hello, World,
>
> I'm in the process of moving my RHEL3-based NFS server onto DRBD.
> Setup and configuration is fine, replication works, etc.
>
> Unfortunately, if I stop the NFS service, remount the filesystem that
> NFS exports onto the DRBD device, and start up again, my NFS clients
> start receiving 'stale file handle' errors. AFAICS, this should work,
> since NFS has no inherent tie to the lower level device -- it sits on
> top of the file system, after all.
>
> Stopping NFS, unmounting from the DRBD block dev, remounting the
> lower-level device and starting NFS again gets things working on the
> client side. So I was wondering what the problem could be?

NFS "handles" are, in most implementations, basically just
<block device number>:<inode number>, which is supposed to be unique
and easily available.

But this means that if you change something on your NFS server, and
the data is now on some other block device (drbd), the block device
number changes, and thus all handles are invalid... if you change
back, the handles are suddenly valid again.

So there is no way (without patching the NFS server to support a
configurable mapping of exported handle numbers to block device
numbers) to migrate the data on the NFS server to some other block
device without rebooting / forcefully remounting the clients.
[see the stat illustration after the signature]

> I'm using an *external* meta data device.
>
> Oh, one last question. I've got:
>
> wfc-timeout 60;
> degr-wfc-timeout 120; # 2 minutes.

It does not make sense at all to have a degraded timeout that is less
than your non-degraded timeout.

> yet, when the drbd initscript runs, it just waits forever (although the
> message does say "the timeout for resource X is 60 seconds"). I'm
> obviously missing something... I haven't used drbd in production since
> 2001 (version 0.5.8 is still running well... :) so I'm probably not up
> with the latest configuration syntax. :)

You need to specify the timeouts for all resources. The message only
prints the values for the first resource in the configuration file,
so if one of your later resources happens to have the default of "0",
it would print something about 60 seconds, but it would still wait
forever. [see the config sketch after the signature]

This is maybe suboptimal, and I guess we could move those timeouts
into the global section. But it is more flexible the way it is, and
there may be configurations where different timeouts for different
resources do make sense.

	Lars Ellenberg

--
please use the "List-Reply" function of your email client.
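
A rough illustration of the handle point above, with made-up device
names, mount point and file (none of this is from Omar's actual
setup): the inode number of an exported file survives the move onto
drbd, but the device number does not.

    # hypothetical example: the same filesystem, mounted first from the
    # plain partition, then from the drbd device sitting on top of it
    mount /dev/sda5 /export
    stat -c 'dev=%d ino=%i %n' /export/somefile
    umount /export
    mount /dev/drbd0 /export
    stat -c 'dev=%d ino=%i %n' /export/somefile
    # st_ino is unchanged, st_dev is not -- so every <dev>:<ino> handle
    # the clients still cache refers to a device number that no longer
    # matches, hence "stale file handle".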
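
And a sketch of the per-resource timeouts in a drbd 0.7 style
drbd.conf, assuming two placeholder resources r0 and r1 (the names and
the second resource are invented for illustration): the wfc timeouts
live in each resource's startup section, so any resource that omits
them falls back to the default of 0 and waits forever.

    resource r0 {
        startup {
            wfc-timeout       60;   # seconds
            degr-wfc-timeout 120;   # 2 minutes
        }
        # disk, net, syncer and on <host> sections omitted
    }

    resource r1 {
        startup {
            wfc-timeout       60;   # must be repeated here; without it,
            degr-wfc-timeout 120;   # r1 uses the default of 0 = forever
        }
        # ...
    }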