[DRBD-user] KVM on multiple DRBD resources + IPMI stonith - questions

joukoy joukoy at gmail.com
Fri Jul 1 16:36:38 CEST 2011


On 1.7.2011 16:43, Helmut Wollmersdorfer wrote:
>
> On 24.06.2011 at 16:58, Whit Blauvelt wrote:
>
>> I truly appreciate that, Pete. What I've been hoping to find is
>> distribution-neutral models for handling KVM failover, where each KVM VM
>> sits directly on top of a dedicated DRBD resource - a setup with ideal
>> granularity compared to having multiple VMs share a single DRBD resource.
>> Because the granularity is right-sized, it greatly simplifies the
>> possible failure modes and what is needed for reliable failover. For
>> example, one shared DRBD resource for many VMs running across two hosts
>> would need to run primary-primary, with CLVM, a clustering file system,
>> and the full pacemaker-corosync/heartbeat treatment. I get that. But with
>> dedicated, per-VM DRBD resources, each can run primary-secondary (if you
>> don't mind a few seconds of downtime during migration - which will be
>> there in a failover in any case), so there's no need for CLVM or one of
>> the (less mature than ext4 or xfs) clustering file systems.
>
> ACK. Such a configuration ought to be common and widespread: a two-node
> cluster with some VMs running criss-cross, each on a dedicated
> Primary/Secondary DRBD resource.
>
> One can do this with KVM, Xen, or something similar - Primary/Primary
> with live migration, or Primary/Secondary without.
>
> IMHO it is a very common configuration.
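>
> For illustration, a dedicated per-VM resource could look like this in
> /etc/drbd.d/ (a sketch only - hostnames, devices, addresses, and the
> backing LVM volume are placeholders):
>
> resource vm_www {
>   protocol C;
>   device    /dev/drbd1;
>   disk      /dev/vg0/vm_www;   # one backing LV per VM
>   meta-disk internal;
>   on xen10 { address 10.0.0.10:7789; }
>   on xen11 { address 10.0.0.11:7789; }
> }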
>
>> There also should be a whole lot less needed on the
>> pacemaker-corosync/heartbeat side. What I've been hoping to find is
>> documentation on just enough of pacemaker-corosync/heartbeat to handle
>> this simplified architecture adequately. But most of the documentation
>> isn't aimed towards an architecture like this at all, and just about
>> nothing I've found addresses a KVM environment.
>
> I'll join in the ranting: there are a lot of docs and HOWTOs out there,
> but nothing of sufficient quality.
>
> It still takes many hours (days) of trial and error.
>
> Here is my crm configuration for a similar setup - Xen on dedicated DRBD
> resources (shortened to two VMs):
>
> node $id="..." xen11
> node $id="..." xen10
>
> primitive xen_cmsdb ocf:heartbeat:Xen \
>         params xmfile="/etc/xen/cmsdb.cfg" \
>         op monitor interval="3s" timeout="30s" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="40s" \
>         meta target-role="Started" allow-migrate="false"
> primitive xen_www ocf:heartbeat:Xen \
>         params xmfile="/etc/xen/www.cfg" \
>         op monitor interval="3s" timeout="30s" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="40s" \
>         meta target-role="Started" allow-migrate="false"
>
> primitive xen_drbd1_1 ocf:linbit:drbd \
>         params drbd_resource="drbd1_1" \
>         op monitor interval="15s" \
>         op start interval="0" timeout="240s" \
>         op stop interval="0" timeout="100s"
> primitive xen_drbd1_2 ocf:linbit:drbd \
>         params drbd_resource="drbd1_2" \
>         op monitor interval="15s" \
>         op start interval="0" timeout="240s" \
>         op stop interval="0" timeout="100s"
> primitive xen_drbd5_1 ocf:linbit:drbd \
>         params drbd_resource="drbd5_1" \
>         op monitor interval="15s" \
>         op start interval="0" timeout="240s" \
>         op stop interval="0" timeout="100s"
> primitive xen_drbd5_2 ocf:linbit:drbd \
>         params drbd_resource="drbd5_2" \
>         op monitor interval="15s" \
>         op start interval="0" timeout="240s" \
>         op stop interval="0" timeout="100s"
>
> group group_drbd1 xen_drbd1_1 xen_drbd1_2
> group group_drbd5 xen_drbd5_1 xen_drbd5_2
>
> ms DrbdClone1 group_drbd1 \
>         meta master-max="1" master-node-max="1" clone-max="2" \
>         clone-node-max="1" notify="true"
> ms DrbdClone5 group_drbd5 \
>         meta master-max="1" master-node-max="1" clone-max="2" \
>         clone-node-max="1" notify="true"
>
> location cli-prefer-xen_cmsdb xen_cmsdb \
>         rule $id="cli-prefer-rule-xen_cmsdb" inf: #uname eq xen10
> location cli-prefer-xen_www xen_www \
>         rule $id="cli-prefer-rule-xen_www" inf: #uname eq xen11
>
> location prefer_xen10_cmsdb xen_cmsdb 50: xen10
> location prefer_xen10_www xen_www 40: xen10
>
> location prefer_xen11_cmsdb xen_cmsdb 40: xen11
> location prefer_xen11_www xen_www 50: xen11
>
> colocation xen_cmsdb_and_drbd inf: xen_cmsdb DrbdClone5:Master
> colocation xen_www_and_drbd inf: xen_www DrbdClone1:Master
>
> order xen_cmsdb_after_drbd inf: DrbdClone5:promote xen_cmsdb:start
> order xen_www_after_drbd inf: DrbdClone1:promote xen_www:start
>
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>         cluster-infrastructure="Heartbeat" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1309523267"
>
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
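>
> Note that this configuration runs with stonith-enabled="false". For the
> IPMI STONITH asked about in the subject, something along these lines
> should work (a sketch using the external/ipmi plugin - IPMI addresses and
> credentials are placeholders, each device is banned from the node it
> fences, and stonith-enabled would then be flipped to "true"):
>
> primitive stonith_xen10 stonith:external/ipmi \
>         params hostname="xen10" ipaddr="192.168.2.10" \
>         userid="admin" passwd="secret" interface="lan" \
>         op monitor interval="60s" timeout="30s"
> primitive stonith_xen11 stonith:external/ipmi \
>         params hostname="xen11" ipaddr="192.168.2.11" \
>         userid="admin" passwd="secret" interface="lan" \
>         op monitor interval="60s" timeout="30s"
> location stonith_xen10_placement stonith_xen10 -inf: xen10
> location stonith_xen11_placement stonith_xen11 -inf: xen11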
>
> HTH
>
> Helmut Wollmersdorfer
>

I have a similar setup, and a draft howto plus a GUI script:

https://github.com/joukoy/KVM-cluster-tool

This is still far from perfect but it works for me.
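
For KVM, the same structure should work with ocf:heartbeat:VirtualDomain
in place of ocf:heartbeat:Xen in the config above. A minimal sketch (the
libvirt domain XML path and the timeouts are examples, not tested values):

primitive vm_www ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/www.xml" \
        hypervisor="qemu:///system" \
        op monitor interval="10s" timeout="30s" \
        op start interval="0" timeout="120s" \
        op stop interval="0" timeout="120s" \
        meta allow-migrate="false"

The DRBD ms, colocation, and order constraints stay the same.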
Any feedback on the script or the howto is welcome. If you think it's
totally useless, please tell me - maybe I'll remove it from GitHub then
:-) (not)

- Jouko -




