[DRBD-user] DRBD Fencing with RHCS

Fri Jan 4 17:18:15 CET 2008

I'm new to setting up clustering and DRBD. We're setting up an HA NFS
cluster between 2 nodes, each running CentOS 5.1. I have DRBD working
between them, and RHCS is serving an NFS export backed by GFS. The GFS
volume sits on drbd0. When both nodes are running, everything works as
expected.

However, I simulated a failure by shutting down the node that held the  
NFS service while a client was writing to it. Now here's what happened:

Node 1 went down
NFS Client paused during a write
NFS Service came up on Node 2
NFS Client finished write
NFS Client issued an ls
NFS Client's ls hung; other processes on the Client were fine
ls issued on Node 2 against the GFS filesystem also hung; all
other processes on Node 2 were fine

So what happened is that DRBD hung without affecting any other
operation. When Node 1 was brought back online, communication with
Node 2 did not resume and a split-brain situation occurred.

Looking into this, I believe the problem is that DRBD is not properly
fenced, or possibly not fenced at all. I see a lot of documentation on
fencing DRBD with Heartbeat, but I am already using RHCS for other
services that depend on DRBD working. Is it possible to use RHCS to
handle the fencing and heartbeat processes instead? How would I go
about configuring it?

Thank you in advance.