Hi,

After various diversions from the project, I'm back to gathering clues on what should be a simple setup. But either my clue basket is leaking, or some of the necessary clues take better luck than mine to find.

It should be a fairly simple setup to complete. It consists of two servers, dedicated only to hosting KVM VMs, which run raw on LVM volumes, which sit on top of individual DRBD resources. Half the VMs run on each of the servers. So far so good. All that needs to happen now is: if one host goes down,
- fence it from the other
- promote the DRBD partitions for the VMs it wasn't running to primary
- start the KVM VMs it wasn't already running
- and notify me.

The step for today is trying to get IPMI stonith working, using Pacemaker 1.0.9 and Heartbeat 3.0.2. I've tried configuring external/ipmi through DRBD-MC, but all I end up with is a gray box saying "Starting ... Not Running," even though the various connection settings I've put in are known good. The logs show it generating a bunch of XML, with no errors, but no success either. Evidently I'll need to configure this by hand. There are bits of documentation like this out there: http://www.karlkatzke.com/ipmi-stonith-howto-with-pacemaker/ but it's just a fragment. Let's assume I can work out the missing pieces of that part of the puzzle.

Then there's the promotion of DRBD resources, for which it looks like there are a few different, somewhat disputed, ways to handle things. Advice on the simplest reliable way would be welcome.

As for a resource agent to start a defined list of KVM VMs, I don't see one in Pacemaker or Heartbeat at all. Should I find one elsewhere, or do I need to learn how to write one? If so, is there a good template around?

The notification I'm not too worried about, since I can get most of what I want from Nagios running elsewhere. As an alternative, it wouldn't take much of a custom script to send the IPMI reset message, promote a list of DRBD partitions, and start a list of KVMs.
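For what it's worth, here's my current guess at the by-hand stonith configuration, extrapolated from that howto — a sketch only, with hostnames, IPMI addresses, and credentials as placeholders:

```
# crm configure sketch: one external/ipmi stonith resource per host,
# each pinned away from the node it is meant to shoot.
# node1/node2, the IPs, and the credentials are placeholders.
primitive stonith-node1 stonith:external/ipmi \
        params hostname="node1" ipaddr="192.168.1.101" \
               userid="admin" passwd="secret" interface="lan" \
        op monitor interval="60s"
primitive stonith-node2 stonith:external/ipmi \
        params hostname="node2" ipaddr="192.168.1.102" \
               userid="admin" passwd="secret" interface="lan" \
        op monitor interval="60s"
location l-stonith-node1 stonith-node1 -inf: node1
location l-stonith-node2 stonith-node2 -inf: node2
property stonith-enabled="true"
```

If someone can confirm whether that's the right shape for external/ipmi on this Pacemaker/Heartbeat combination, that would already help.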
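Similarly, for the DRBD promotion and VM start, here's the shape I think each per-VM stack would take — assuming (which I haven't confirmed) that an ocf:heartbeat:VirtualDomain agent is available in the resource-agents on my systems. All names and paths are made up:

```
# crm configure sketch for one VM: a master/slave DRBD resource,
# promoted on whichever node runs the VM. "vm1" is a placeholder.
primitive drbd-vm1 ocf:linbit:drbd \
        params drbd_resource="vm1" \
        op monitor interval="29s" role="Master" \
        op monitor interval="31s" role="Slave"
ms ms-drbd-vm1 drbd-vm1 \
        meta master-max="1" clone-max="2" notify="true"
primitive vm1 ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/vm1.xml" \
        op monitor interval="30s"
colocation vm1-on-drbd inf: vm1 ms-drbd-vm1:Master
order vm1-after-drbd inf: ms-drbd-vm1:promote vm1:start
```

If that's roughly right, repeated once per VM, then maybe I don't need a "start a list of VMs" agent at all — but corrections welcome.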
The trickier part of that approach would be joining it with Pacemaker/Heartbeat, which I presume should still be preferred for the initial failure detection over simply sending out pings from a script. There'd also have to be some conditionals for initial system startup, to make sure not to promote DRBD resources or start VMs that the other system had already taken over in failover. But doing it all "by hand" looks doable.

So, short of taking the long way around and spending weeks more studying the whole Pacemaker/OpenAIS universe, what's the simplest way to a reasonably dependable two-host KVM-VM-on-LVM-on-DRBD private cloud? I'm totally open to doing it "wrong." I'm running an NFS server consisting of two systems with DRBD and UCARP plus some simple scripts, and its failover tests have been flawless. But that approach obviously isn't adaptable to this KVM puzzle. Nor are Red Hat-specific methods; I've got this on Ubuntu.

And I'd really like the solution to avoid having more layers than it needs - some of the cluster resource stacks are top-heavy with their jack-of-all-trades approaches. I'm just trying to master this relatively simple system.

Best,
Whit
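P.S. For concreteness, the "by hand" alternative I have in mind is roughly the following. Resource and domain names are placeholders, I'm assuming each DRBD resource is named after its VM, and it deliberately omits the startup conditionals mentioned above:

```sh
#!/bin/sh
# Rough failover sketch: fence the peer via IPMI, promote our DRBD
# resources, then start the VMs the peer had been running.
# The IP, credentials, and resource names are placeholders.
PEER_IPMI=192.168.1.102
DRBD_RESOURCES="vm1 vm2 vm3"

# Fence the dead host before touching its data; bail out if that fails.
ipmitool -I lanplus -H "$PEER_IPMI" -U admin -P secret \
        chassis power off || exit 1

for r in $DRBD_RESOURCES; do
    drbdadm primary "$r"    # take over the DRBD resource
    virsh start "$r"        # start the matching KVM domain
done
```

Simple enough — which is exactly why I suspect I'm missing the reasons it's harder than it looks.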