[DRBD-user] Two NFS servers in passive-active

Patrick Zwahlen paz at navixia.com
Tue Apr 6 19:09:46 CEST 2010


Olivier Le Cam <Olivier.LeCam at ...> writes:

> 
> Hi -
> 
> I have a setup with one NFS server over a couple of DRBD servers in 
> active-passive mode. It is running just fine!
> 
> Now I would like to have a second NFS server, also in passive-active mode, 
> but on the opposite DRBD server.
> 
>    - node1: NFS server exporting /data1 (passive-active)
>    - node2: NFS server exporting /data2 (passive-active)
>    - if one node fails, the other one would then take over and export 
> both shares (fail-over).
> 
> As you can imagine, I am interested in maximizing resource usage by 
> taking advantage of having two servers.
> 

Hi Olivier,

I have tried to build such a setup for quite some time now, without success. In
our case the goal is not so much 100% resource utilization as very specifically
highly redundant VMware clusters. The trend goes toward virtual SANs (look at
the VM marketplace for virtual storage VMs). These allow building HA clusters
with only two physical hosts: the vSAN is created by two VMs doing IP failover
and disk sync. Most of the commercial solutions can do active/active, in which
case VMs configured on datastore/vSAN #1 run on host #1, and the ones
configured on vSAN #2 run on host #2. All the SAN traffic then remains internal
to the host (quite efficient), and only if a host (or vSAN VM) crashes do both
vSANs get exported by a single VM.

They all (the commercial solutions) use iSCSI. None of them uses NFS (which
would actually create a vNAS). However, NFS is so much simpler than iSCSI, and
runs so well with VMware, that it is a very good candidate for small HA clusters.

As I see it, there are different problems with NFS:

1) NFS runs in the Linux kernel, so you cannot really have multiple independent
instances on one host.
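(A quick way to see this: the nfsd workers are kernel threads, which ps shows in
square brackets, so there is no second daemon you could start with a different
config. The output below is just a typical illustration, not from a real box.)

```shell
$ ps ax | grep '\[nfsd\]'
 2412 ?        S<     0:00 [nfsd]
 2413 ?        S<     0:00 [nfsd]
```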

2) NFS stores state in /var/lib/nfs. When you fail over, the new NFS server
should see consistent data in there, but because of 1) there is no "new" NFS
server: it is the one already running on the other host. So /var/lib/nfs itself
would have to be replicated, for instance on a very small active/active
GFS2/OCFS2 volume! This becomes complex.
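For the simple active/passive case, the workaround seen in most DRBD+NFS howtos
is to keep /var/lib/nfs on the replicated device itself and bind-mount it into
place on the active node. A sketch (device names and paths are made up for
illustration):

```
# /etc/fstab fragment -- hypothetical device/paths; mounted by the
# cluster manager on the active node only, hence "noauto"
/dev/drbd0        /data1         ext3   noauto         0 0
/data1/varlibnfs  /var/lib/nfs   none   bind,noauto    0 0
```

With two independent resources you would need two such state directories for a
single kernel NFS server, which is exactly the conflict described above.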

3) Finally, how would you do a clean failover (resource migration) of a single
NFS directory? If the directory is in use by NFS clients, then you cannot
umount the filesystem (FS busy). In active/passive setups, the solution we see
everywhere is "kill -9 nfsd", but here that would kill both resources!
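To illustrate, here is roughly what a per-export teardown would look like. The
commands are real (exportfs is standard), but as far as I can tell unexporting
alone does not reliably free the filesystem, hence the kill -9 folklore:

```shell
# illustrative only -- needs root and a running kernel NFS server
exportfs -u '*:/data1'   # withdraw just the /data1 export
umount /data1            # often still fails: the kernel nfsd can keep
                         # the filesystem busy
# the usual advice is then to kill and restart nfsd, but that also
# interrupts /data2, since one nfsd serves both exports
```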

There is a bit of a chicken-and-egg problem here, and I haven't found a
solution so far... Since I work with VMs, I simply run four of them, creating
two independent active/passive NFS servers. It's a bit more admin work, but it
works like a charm.
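For reference, each of the two VM pairs is then just a textbook active/passive
DRBD+NFS cluster; with heartbeat v1, one such cluster fits on a single
haresources line (node name, DRBD resource, device, mount point and IP below
are all hypothetical):

```
# /etc/ha.d/haresources -- cluster #1 of 2, names made up
nfs1a drbddisk::r0 Filesystem::/dev/drbd0::/data1::ext3 IPaddr::192.168.1.10 nfs-kernel-server
```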

Just adding my two cents to the discussion; no real answer, sorry.

Regards, - Patrick -

> Does someone have experience with such a setup? How would one run two NFS 
> instances on the same node (each on a different IP)? TBH, I don't even 
> know if this is possible though...
> 
> PS: I don't want to use a cluster FS because, given the overhead, I don't 
> think I can expect better performance than a single EXT3 server.
> 
> Thanks in anticipation for sharing experiences/pointers.
> 
> Best regards,






