[DRBD-user] Parallel resource startup, scalability questions

Arnold Krille arnold at arnoldarts.de
Tue Jul 2 22:15:06 CEST 2013

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi,

On Tue, 2 Jul 2013 17:08:30 +0900 Christian Balzer <chibi at gol.com>
wrote:
> not purely a DRBD issue, but it sort of touches most of the bases, so
> here goes.
> 
> I'm looking at deploying a small (as things go these days) cluster (2
> machines) for about 30 KVM guests, with DRBD providing the storage.
> 
> My main concern here is how long it would take to fail-over (restart)
> all the guests if a node goes down. From what I gathered none of the
> things listed below do anything in terms of parallelism when it comes
> to starting up resources, even if the HW (I/O system) could handle it.
<snip>
> Lastly I could go with Pacemaker, as I've done in the past for much
> simpler clusters, but I really wonder how long starting up those
> resources will take. If I forgo live-migration I guess I could just
> do one DRBD backing resource for all the LVMs. But still, firing up
> 30 guests in sequence will take considerable time, likely more than
> what I would really consider "HA failover" quality.

Why do the VMs have to start in sequence?

Pacemaker happily starts several services in parallel, provided they
don't depend on each other. You have to define those dependencies
yourself as orders/groups; otherwise Pacemaker assumes the services
can be started in parallel. (At least that's what I see here when
booting my 2+1 node cluster from cold.)
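
For illustration, a minimal crm shell sketch (the resource names
vm_web and vm_db are made up, not from my config): two VirtualDomain
primitives with no constraint between them, which Pacemaker is free
to start in the same transition. The batch-limit cluster property
caps how many actions run at once cluster-wide.

  # Two independent VMs -- no order/colocation between them, so
  # Pacemaker may start both at the same time.
  primitive vm_web ocf:heartbeat:VirtualDomain \
      params config="/etc/libvirt/qemu/web.xml" \
      op start timeout=120s op stop timeout=120s
  primitive vm_db ocf:heartbeat:VirtualDomain \
      params config="/etc/libvirt/qemu/db.xml" \
      op start timeout=120s op stop timeout=120s
  # Cap on how many actions the cluster executes in parallel.
  property batch-limit=30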

I don't have 30 VMs, more like 15, but there is at least one DRBD
volume for each machine, and dependencies are defined so the LDAP
server has to be up before the other VMs that need it start.
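
Expressed in crm shell, that dependency is just an order constraint
(vm_ldap, vm_mail and vm_web are hypothetical names here):

  # Only the VMs that actually need LDAP wait for it; everything
  # else still starts in parallel.
  order mail_after_ldap inf: vm_ldap vm_mail
  order web_after_ldap inf: vm_ldap vm_web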

While using individual DRBD resources for the machines may be a bit
more work to set up when doing it all at once (my setup has grown
over time), it allows you to distribute the VMs across the two
nodes, so they don't all have to run on one node. And when you also
define scores for importance, you can over-commit the memory (and
CPU) of the two nodes, so that normally everything runs and only in
the case of a node failure are some less important VMs stopped/not
started.
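
A sketch of that, again with hypothetical names and numbers: the
priority meta-attribute decides which VMs are kept when not
everything fits on the surviving node, and utilization-based
placement lets Pacemaker take memory into account:

  # High-priority VMs are kept running when capacity runs short;
  # priority=0 VMs are the first to be stopped/not started.
  primitive vm_ldap ocf:heartbeat:VirtualDomain \
      params config="/etc/libvirt/qemu/ldap.xml" \
      meta priority=100 utilization memory=2048
  primitive vm_test ocf:heartbeat:VirtualDomain \
      params config="/etc/libvirt/qemu/test.xml" \
      meta priority=0 utilization memory=2048
  # Tell Pacemaker how much memory each node really has and let
  # placement honour it.
  node node1 utilization memory=32768
  node node2 utilization memory=32768
  property placement-strategy=utilization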

On the other hand, if I had to start all over again (without a
deadline within the next two weeks), I would look at Ceph or
Sheepdog for the storage and either use Pacemaker to manage the VMs
or take a look at OpenStack's HA support.

Have fun,

Arnold