[DRBD-user] RedHat Clustering Services does not fence when DRBD breaks

Colin Simpson Colin.Simpson at iongeo.com
Tue Nov 23 12:06:01 CET 2010

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On Tue, 2010-11-23 at 09:16 +0000, Jakov Sosic wrote:
> On 11/22/2010 08:04 PM, Joe Hammerman wrote:
> > Well, we're running DRBD in Primary-Primary mode, so the service
> > should be enabled on both machines at the same time.
> >
> > GFS breaks when DRBD loses sync, and both nodes become unusable,
> > since neither can guarantee write integrity. If one of the nodes
> > were fenced, when it rebooted it would at worst become secondary.
> > Then the node that never fenced stays online, and we have 100%
> > uptime.
> >
> > This is a pretty non-standard setup, huh?
> 
> But what's the point of a two-node cluster if your setup cannot
> withstand the loss of one node? In the case of sync loss, one node
> should be fenced, so the other can keep working with GFS mounted.
> Your goal should be to achieve that.
> 
> You should resolve it on the DRBD level indeed, so that when DRBD
> loses sync one node gets fenced... Something like:
> 
> disk {
>    fencing resource-and-stonith;
> }
> handlers {
>    outdate-peer "/sbin/obliterate-peer.sh"; # We'll get back to this.
> }

I'm slightly confused by this thread.

I understood the recommended way to run DRBD and GFS/RHCS was to do the
fencing in Cluster Suite, not in DRBD, and all you need in drbd.conf
is:

  startup {
        become-primary-on both;
  }
  net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
  }

This should always keep the newer of the two nodes' data without
needing to add an outdate-peer handler.
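
If the after-sb policies can't resolve a split brain on their own, DRBD
8.x lets you pick the survivor by hand. A rough sketch, assuming a
resource named r0 (the resource name is my assumption, not from this
thread):

  # On the node whose changes should be discarded:
  drbdadm secondary r0
  drbdadm -- --discard-my-data connect r0

  # On the node whose data should survive (only needed if it has
  # dropped to StandAlone):
  drbdadm connect r0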

There should be no need to enable both nodes at the same time, as DRBD
will wait until it sees the other node, or can be configured to wait
only for a certain time.

I have on mine:

  startup {
        wfc-timeout 300;        # Wait 300 seconds for the initial connection
        degr-wfc-timeout 60;    # Wait only 60 seconds if this node was
                                # part of a degraded cluster
        become-primary-on both;
  }

Provided drbd is set to start before clvmd, all should work.
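
On a RHEL-style SysV init system you can sanity-check that ordering; a
rough sketch, assuming the stock drbd and clvmd init scripts:

  chkconfig --list | grep -E 'drbd|clvmd'   # both should be on
  ls /etc/rc3.d/ | grep -E 'drbd|clvmd'
  # drbd's S number must be lower than clvmd's, so drbd starts first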

I'm led to believe this will always take the data from the node with
the newest copy (whilst DRBD resyncs the other).
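
The resync itself can be watched from either node; again assuming a
resource named r0:

  cat /proc/drbd      # shows cs:SyncSource/SyncTarget and progress
  drbdadm dstate r0   # e.g. UpToDate/Inconsistent while syncing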

GFS should continue, provided it is assured the other node has been
fenced.
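
For what it's worth, the obliterate-peer.sh handler quoted above gives
that assurance by asking the cluster manager to fence the DRBD peer. A
minimal sketch of such a fence-peer handler (the DRBD_PEERS variable is
an assumption about the handler environment, and the real script is
considerably more careful):

  #!/bin/sh
  # Hypothetical fence-peer handler: delegate fencing to RHCS.
  # DRBD exports DRBD_RESOURCE to its handlers; DRBD_PEERS (the peer's
  # hostname) is assumed here.
  fence_node "$DRBD_PEERS" && exit 7  # 7 tells DRBD the peer was stonithed
  exit 1                              # any other code: fencing failed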


Colin






