[DRBD-user] Unexplained reboots in DRBD82 + OCFS2 setup

Kris Buytaert mlkb at inuits.be
Thu Jun 25 11:42:47 CEST 2009



Hi Lars, 

> > 
> > At first I assumed OCFS2 to be the root of this problem, so I moved
> > forward and set up an iSCSI target on a third node, and used that
> > device with the same OCFS2 setup. There, no crashes occurred and
> > bonnie++ completed its test run flawlessly.
> > 
> > So my attention went back to the combination of DRBD and OCFS2.
> > 
> > I tried both DRBD 8.2 (drbd82-8.2.6-1.el5.centos, kmod-drbd82-8.2.6-2)
> > and the 83 variant from CentOS Testing.
> > 
> > At first I was trying with the ocfs2 1.4.1-1.el5.i386.rpm version, but
> > upgrading to 1.4.2-1.el5.i386.rpm didn't change the behaviour.
> > 
> > 
> > Does anyone have an idea about this?
> 
> OCFS2 heartbeat to disk takes too long, and it self-fences?
> Increase the OCFS2 timeout there.

I've already altered the O2CB_HEARTBEAT_THRESHOLD values,
with no significant change in behaviour.
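
For reference, a minimal sketch of how a threshold value maps to the
time a node may miss its disk heartbeat before it fences itself. It
assumes the relation given in the OCFS2 documentation, timeout =
(threshold - 1) * 2 seconds, with a 2-second heartbeat interval. The
function names are hypothetical and only there for illustration.

```python
def heartbeat_timeout_seconds(threshold, interval=2):
    """Seconds without a disk heartbeat before the node self-fences.

    Assumes timeout = (threshold - 1) * interval, with the heartbeat
    interval taken to be 2 seconds.
    """
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * interval


def threshold_for_timeout(timeout_seconds, interval=2):
    """Inverse mapping: the threshold to configure for a wanted timeout."""
    if timeout_seconds < interval:
        raise ValueError("timeout must be at least one heartbeat interval")
    return timeout_seconds // interval + 1


if __name__ == "__main__":
    for value in (7, 31, 61):
        print(value, "->", heartbeat_timeout_seconds(value), "seconds")
```

Under that assumption, a threshold of 31 allows 60 seconds of stalled
disk heartbeats, and 61 allows 120 seconds.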

> 
> Also, use the "deadline" IO scheduler,
> or the latencies will kill you!
> 
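
To check which IO scheduler is actually active on the backing device, a
small sketch along these lines could help. It assumes the usual sysfs
format, in which /sys/block/<device>/queue/scheduler lists the available
schedulers and marks the active one with square brackets. The device
name "sda" and the function names are hypothetical.

```python
def active_scheduler(sysfs_text):
    """Return the scheduler marked with [brackets], or None if none is."""
    for name in sysfs_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def read_scheduler(device):
    """Read and parse the scheduler file for one block device."""
    path = f"/sys/block/{device}/queue/scheduler"
    with open(path) as handle:
        return active_scheduler(handle.read())


if __name__ == "__main__":
    # "sda" is a placeholder; use the disk backing the DRBD device.
    try:
        print(read_scheduler("sda"))
    except OSError as error:
        print("could not read scheduler:", error)
```

Writing "deadline" into the same sysfs file as root switches the
scheduler for that device at runtime.
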
> > How can we get more debug info from OCFS2, apart from heartbeat
> > tracing, which hasn't taught me anything yet, in order to
> > file a useful bug report?
> 
> Use a serial console, attach that to some "monitoring" host
> (you can use USB-to-serial adapters, they are cheap and work), and log
> on that one. You'll get the last messages from there.
> 
I had indeed hoped to see some output on the serial console when the
reboots happened, but the best I got so far was a partial timestamp
with no further explanation before the reboot output started again.

Any other ideas?

greetings

Kris

PS. I've already played with different sync rates, different heartbeat
thresholds, etc.




