I do have fencing set up via a Dell DRAC card, which uses the system NICs as its connection. The first time I tried this test, I thought RHCM was the culprit, but they assured me that both nodes should not be fenced when the connection was re-established. After that, I found the halt commands in drbd.conf, which fit the symptoms perfectly.

I actually performed this test twice, once with each node. The first time it behaved as expected, with the offending node being rebooted by the cluster. When testing the second node, both systems halted. I plan to do more testing and try to rule out one subsystem in this mix.

Sadly, manual fencing is busted in CS 5, at least when setting up with Conga. I haven't found useful info on it for manual editing. I think Conga omits a nodename attribute in cluster.conf.

Florian, do you have recommendations for drbd.conf settings for after split-brain events if RHCS is going to do the fencing?

Many thanks,
Chris

Florian G. Haas wrote:
> Chris,
>
> since you're on RHCM, are you sure it's DRBD that's causing your node lockup?
> When RHCM loses the connection to the peer node, AFAIK it will assume it's in
> split brain until it is certain that the peer has been properly fenced.
> Assuming you don't have fencing in place, now would probably be a good time
> to implement it. For testing purposes, you may use the "manual" fence device,
> which you must acknowledge using fence_ack_manual.
>
> I hope this is applicable to your setup. Let us know if it helps. Mind that my
> suggestions stem from experience gathered using RHEL 4 U4 and GFS, but the
> scenario you describe sounds all too familiar. :-)
>
> Cheers,
> Florian
>
> On Tuesday 03 July 2007 03:01:20 Chris Harms wrote:
>
>> Hi All,
>>
>> I'm having a problem after simulating a network failure (unplugging the
>> cables) and reconnecting. Upon reconnecting the cables, both nodes get
>> halted by the system and do not log anything. I have removed the
>> default settings for Split Brain scenarios in drbd.conf and replaced
>> them with what I thought were innocuous commands:
>>
>> Is there an unlisted default setting in DRBD that might issue a halt to
>> the system? Also, if I want the cluster manager to do fencing, what
>> would be good settings for the after split brain handlers?
>> [...]
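
For readers following the thread: the settings under discussion are the after-split-brain policies in the net section of drbd.conf and the commands in its handlers section. The fragment below is only a sketch of what a conservative setup might look like when the cluster manager is left to do the fencing. It is not the configuration either poster used, which the thread does not show. The resource name r0 and the logger command are assumptions made for illustration, and the fragment leaves out the per-node "on" sections a real resource needs.

```
# Hypothetical fragment, not taken from the thread.
resource r0 {                      # "r0" is an assumed resource name
  net {
    # On split brain, drop the connection and wait for manual recovery
    # instead of automatically discarding either node's data.
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
  }
  handlers {
    # Log instead of halting. This handler only runs when an after-sb
    # policy is set to call-pri-lost-after-sb, so with the policies above
    # it should stay idle; it is shown as the harmless counterpart to a
    # "halt -f" style command.
    pri-lost-after-sb "logger -t drbd 'pri-lost-after-sb handler invoked'";
  }
}
```

With "disconnect" on all three policies, DRBD should do nothing beyond refusing to resynchronize, so any reboot or halt observed during the cable-pull test would have to come from another handler or from the cluster manager's own fencing.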