Hi Lars,

> > At first I assumed OCFS2 to be the root of this problem, so I moved
> > forward and set up an iSCSI target on a 3rd node, and used that device
> > with the same OCFS2 setup. There no crashes occurred and bonnie++
> > flawlessly completed its test run.
> >
> > So my attention went back to the combination of DRBD and OCFS2.
> >
> > I tried both DRBD 8.2 (drbd82-8.2.6-1.el5.centos, kmod-drbd82-8.2.6-2)
> > and the 83 variant from CentOS Testing.
> >
> > At first I was trying with the ocfs2 1.4.1-1.el5.i386.rpm version, but
> > upgrading to 1.4.2-1.el5.i386.rpm didn't change the behaviour.
> >
> > Does anyone have an idea on this?
>
> OCFS2 heartbeat to disk takes too long, and it self-fences?
> increase the OCFS2 timeout there.

I've already altered the O2CB_HEARTBEAT_THRESHOLD values, with no
significant change in behaviour.

> also, use the "deadline" IO scheduler,
> or the latencies will kill you!
>
> > How can we get more debug info from OCFS2, apart from heartbeat
> > tracing, which doesn't tell me anything yet, in order to potentially
> > file a valuable bug report?
>
> Use a serial console, attach that to some "monitoring" host.
> (you can use USB-to-Serial, they are cheap and work), and log
> on that one. You'll get the last messages from there.

I had indeed hoped to see some output on the serial console when the
reboots happened, but the best I got so far was a partial timestamp
with no further explanation before the reboot output started again.

Any other ideas?

greetings

Kris

PS. I've already played with different sync rates, different heartbeat
thresholds, etc.
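
For reference when picking O2CB_HEARTBEAT_THRESHOLD values: as far as I
understand the OCFS2 documentation, the threshold counts 2-second disk
heartbeat iterations, so the time before a node self-fences is roughly
(threshold - 1) * 2 seconds. The sketch below only does that arithmetic.
The function names are made up for illustration, and the 2-second
interval is an assumption that should be checked against the OCFS2
release actually in use.

```python
import math

# Assumed heartbeat interval in seconds; verify for your OCFS2 release.
HEARTBEAT_INTERVAL_SECONDS = 2


def heartbeat_timeout_seconds(threshold: int) -> int:
    """Approximate seconds without a disk heartbeat before self-fencing."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECONDS


def threshold_for_timeout(timeout_seconds: int) -> int:
    """Smallest threshold whose timeout is at least timeout_seconds."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    # Round up so that an odd timeout is never silently shortened.
    return math.ceil(timeout_seconds / HEARTBEAT_INTERVAL_SECONDS) + 1
```

Under these assumptions a threshold of 31 corresponds to about 60
seconds, and asking for 120 seconds gives a threshold of 61. The value
would normally have to be the same on every node in the cluster.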