[DRBD-user] KVM on multiple DRBD resources + IPMI stonith - questions

Fri Jun 24 16:58:28 CEST 2011

On Thu, Jun 23, 2011 at 01:54:45PM -0600, Pete Ashdown wrote:

> I don't think the cluster software is quite ready for primetime in Ubuntu. 
> I have high hopes for the next LTS (12.04?).  In any case, you'll need the
> ubuntu-ha ppa now.

I'm an old-school sysadmin. It's gotten to a point in the last few years
where I'll trust various distros' LAMP stacks, but before that, no matter
which distro, I always built each component of anything mission-critical
from source. I never trusted distro kernels either. I see cluster stacks now
as being where the LAMP stack was 15 years ago and where kernels were 5
years ago. But compared with the LAMP stack 15 years back, the documentation
is worse, and configuration methods are a bizarre, diverse, inconsistent mess.

> If you're not looking to live clustering replacement of downed KVM guests,
> you can get clvm running by just configuring corosync then disabling the
> corosync & clvm init.d scripts (update-rc.d disable) and creating one to
> run /usr/sbin/aisexec.

But of course I am looking for live replacement of guests. It's easy enough
to do that with a human sitting at virt-manager or drbd-mc. Anything a human
can do that is a simple, consistent operation like that can be handled by a
well-designed script.
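
To make that concrete, here is a minimal sketch in Python of the sort of
script I mean, for one guest on one dedicated DRBD resource. It assumes the
failed host has already been fenced (IPMI stonith or otherwise) and that the
guest is already defined in libvirt on the surviving host. The function
names and the resource and domain names are hypothetical, and a real script
would need to check DRBD connection and disk state before promoting.

```python
import subprocess


def failover_plan(resource, domain):
    """Return the commands that bring one guest up on the surviving host.

    Assumption: the failed peer has already been fenced. Promoting DRBD
    without fencing first risks split brain.
    """
    return [
        ["drbdadm", "primary", resource],  # promote the local DRBD resource
        ["virsh", "start", domain],        # start the libvirt guest on top of it
    ]


def run_plan(plan, runner=subprocess.check_call):
    """Run each command in order; check_call raises on the first failure."""
    for cmd in plan:
        runner(cmd)
```

Calling run_plan(failover_plan("vm-web1", "web1")) on the surviving host
would promote the resource and then start the guest. Passing a different
runner (a logger, say) gives a dry run.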

> I know this topic isn't directly drbd related, but I do know how hard it is
> to find good clustering help for Ubuntu.  You can email me offlist if you
> need any extra help.

I truly appreciate that, Pete. What I've been hoping to find is
distribution-neutral models for handling KVM failover, where each KVM VM is
directly on top of a dedicated DRBD resource - which is a setup with an
ideal granularity, as compared to having multiple VMs sharing a single DRBD
resource. Because the granularity is right-sized, it greatly reduces the
possible failure modes and what should be needed for reliable failover. For
example, if it were one shared DRBD resource for many VMs running across two
hosts, it would need to run primary-primary, with CLVM and a clustering file
system and the full pacemaker-corosync/heartbeat treatment. I get that. But
with dedicated, per-VM DRBD resources, each can be run primary-secondary (if
you don't mind a few seconds down during migration - which will be there in
failover in any case), so there's no need for CLVM or one of the (less
mature than ext4 or xfs) clustering file systems in the arrangement.
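
As an illustration of the one-resource-per-VM layout, here is a rough Python
sketch that renders one DRBD resource stanza per guest, each with its own
minor number and replication port. The host names, addresses, volume group,
and naming scheme are all hypothetical, and the stanza follows the DRBD 8.3
configuration syntax as I understand it, so check it against drbd.conf(5)
before relying on it.

```python
def drbd_resource(name, minor, port, hosts, disk):
    """Render one DRBD resource stanza (8.3-style syntax) for a single VM.

    hosts maps each hostname to its replication IP address.
    """
    lines = ["resource %s {" % name, "  protocol C;"]
    for host, ip in sorted(hosts.items()):
        lines += [
            "  on %s {" % host,
            "    device    /dev/drbd%d;" % minor,
            "    disk      %s;" % disk,
            "    address   %s:%d;" % (ip, port),
            "    meta-disk internal;",
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines)


def resources_for(vms, hosts, base_port=7788, vg="vg0"):
    """One resource per VM: distinct minor, distinct port, one LV each."""
    return [
        drbd_resource("vm-" + vm, i, base_port + i, hosts, "/dev/%s/%s" % (vg, vm))
        for i, vm in enumerate(vms)
    ]
```

For example, resources_for(["web1", "db1"], {"hostA": "10.0.0.1", "hostB":
"10.0.0.2"}) yields two independent stanzas, so each guest's storage can be
promoted or demoted on its own.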

There also should be a whole lot less needed on the
pacemaker-corosync/heartbeat side. What I've been hoping to find is
documentation on just enough of pacemaker-corosync/heartbeat to handle this
simplified architecture adequately. But most of the documentation isn't
aimed at an architecture like this at all, and almost nothing I've found
addresses a KVM environment.

Because each layer is sliced into its own project, with its own
documentation, configuration syntax, and mailing list, there's no perfect
place for addressing these questions - no place I've found dedicated to the
overview, to the questions of architecture and integration rather than to
close focus on one of the components. While KVM/QEMU is at an admirable
state of maturity - and rapidly improving - that doesn't extend to its
documentation at all. The libvirt list has fascinating discussion at the
edge of the technology, but even less about the practical issues of
deployment than we have here. LINBIT has adopted pacemaker because failover
concerns are a natural fit to DRBD. Failover concerns for KVM VMs hosted on
dedicated DRBD resources, I'll argue, are also a natural fit for DRBD. I'm
not sure, in this context, whether pacemaker etc. is the answer. If it is, it
would be nice to see that documented.

Best,
Whit