<div dir="ltr"><div>Thanks Bram & Ivan! These are some great hints. The ability to perform a KVM live migration with no service disruption will be a huge benefit.<br><br></div><div>Paul<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 5, 2015 at 5:52 AM, Ivan <span dir="ltr"><<a href="mailto:ivan@c3i.bg" target="_blank">ivan@c3i.bg</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi !<span class=""><br>
<br>
On 03/03/2015 07:26 PM, Paul Mackinney wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Apologies in advance if this is too off-topic for this forum; feel free to<br>
reply directly & point to other resources.<br>
<br>
I'm having good success with DRBD in the following configuration:<br>
<br>
Two identical hosts, dual-bonded ethernet for DRBD with a dedicated VLAN on<br>
my smart switch.<br>
<br>
I'm running DRBD on top of LVM, each device is the backing storage for a<br>
KVM virtual guest. The top-down diagram is:<br>
<br>
[/dev/sda1] Guest root system<br>
[ /dev/sda ]<br>
[ /dev/drbdN ] <--- primary/secondary mode<br>
[ /dev/mapper/vg-lvN ]<br>
[ lvm physical volume ]<br>
</blockquote>
<br></span>
I have the same DRBD setup as you.<span class=""><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
This works perfectly, but guest migration requires a shutdown:<br>
1. Shut down guest on host 0<br>
2. Demote DRBD primary to secondary on host 0<br>
3. Promote DRBD secondary to primary on host 1<br>
4. Start guest on host 1<br>
<br>
Apparently running GFS with DRBD's dual-primary mode would allow KVM<br>
migration from one host to another without a shutdown, but all the<br>
tutorials jump straight into cluster configuration.<br>
</blockquote>
<br></span>
Live migration indeed requires dual-primary mode, but it has nothing to do with GFS (unless you want your machines' disks to be file-based, but in that case putting them on GFS would take a huge performance hit, and it would defeat the purpose of the setup you described above).<br>
<br>
So - just set up dual-primary (protocol C; allow-two-primaries yes;) and you'll be able to live-migrate your VM with:<br>
<br>
virsh migrate --live vm-test qemu://other.host/system<br>
<br>
provided you have set up libvirt properly for remote connections (hint, if you haven't: configure listen_addr, the *_file options and tls_allowed_dn_list in libvirtd.conf).<br>
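<br>
A rough idea of the relevant bits of /etc/libvirt/libvirtd.conf (the file paths below are libvirt's defaults, the DNs are placeholders for whatever your own CA issues, and libvirtd has to be started with --listen for the listen_* settings to take effect):<br>
<br>
listen_tls = 1<br>
listen_addr = "0.0.0.0"<br>
key_file = "/etc/pki/libvirt/private/serverkey.pem"<br>
cert_file = "/etc/pki/libvirt/servercert.pem"<br>
ca_file = "/etc/pki/CA/cacert.pem"<br>
tls_allowed_dn_list = ["C=US,O=example,CN=host0", "C=US,O=example,CN=host1"]<br>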
<br>
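Back on the DRBD side, the dual-primary part of the resource configuration looks roughly like this (a sketch assuming DRBD 8.4-style syntax; "r0" is just a placeholder resource name):<br>
<br>
resource r0 {<br>
  net {<br>
    protocol C;               # synchronous replication, a must for dual-primary<br>
    allow-two-primaries yes;  # lets both nodes hold the device open during migration<br>
  }<br>
  # the rest of your existing resource definition stays as it is<br>
}<br>
<br>
After editing the config, "drbdadm adjust r0" on both nodes should apply the change on the fly.<br>
<br>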
Having both nodes as primary without proper clustering/fencing is a recipe for trouble, but you can reduce the risk of doing something stupid by using a libvirt hook script that sets both nodes primary only while a live migration is being performed, and then reverts the "slave" side's resource to secondary. Here's such a script:<br>
<br>
<a href="https://github.com/taradiddles/cluster/blob/master/libvirt_hooks/qemu" target="_blank">https://github.com/<u></u>taradiddles/cluster/blob/<u></u>master/libvirt_hooks/qemu</a><span class=""><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Is converting my hosts into a cluster really the next step? Of course the<br>
point of KVM migration is to get to automatic failover.<br>
</blockquote>
<br></span>
You could write a few scripts instead of setting up clustering, but you'd end up reinventing the wheel, getting upset, and setting up clustering anyway (been there, done that). A few tips in case you use pacemaker:<br>
<br>
restrict the number of concurrent migrations (avoids I/O thrashing):<br>
<br>
pcs property set migration-limit=1<br>
<br>
create a basic domain resource (provided you have already set up a drbd resource, named e.g. drbd_${VM}-clone):<br>
<br>
VM=vm-test<br>
pcs resource create ${VM} VirtualDomain \<br>
hypervisor="qemu:///system" \<br>
config="/etc/cluster/vm-xml/${<u></u>VM}.xml" \<br>
meta allow-migrate="true"<br>
<br>
(note that the xml files are not in /etc/libvirt/qemu; obviously you'll need to synchronize those files between both hosts).<br>
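<br>
One way to produce and keep them in sync (other.host is just a placeholder, as in the migrate example above):<br>
<br>
virsh dumpxml ${VM} > /etc/cluster/vm-xml/${VM}.xml<br>
rsync -a /etc/cluster/vm-xml/ other.host:/etc/cluster/vm-xml/<br>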
<br>
start the drbd resource before this resource:<br>
<br>
pcs constraint order promote drbd_${VM}-clone then ${VM}<br>
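<br>
(For completeness, the drbd_${VM}-clone master resource assumed earlier would have been created along these lines; this is a sketch using pcs 0.9-era syntax, and it assumes the DRBD resource carries the same name as the VM. The master-max=2 part is what allows dual-primary.)<br>
<br>
pcs resource create drbd_${VM} ocf:linbit:drbd \<br>
  drbd_resource=${VM} op monitor interval="30s"<br>
pcs resource master drbd_${VM}-clone drbd_${VM} \<br>
  master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true<br>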
<br>
you'll also want to set up various timeouts, e.g.:<br>
<br>
pcs resource op add ${VM} start timeout="120s" interval="0"<br>
pcs resource op add ${VM} stop timeout="240s" interval="0"<br>
pcs resource op add ${VM} monitor timeout="30s" interval="10s"<br>
pcs resource op add ${VM} migrate_from timeout="60s" interval="0"<br>
pcs resource op add ${VM} migrate_to timeout="120s" interval="0"<br>
<br>
the commands use pcs but you can easily translate them to crm.<br>
<br>
good luck<span class="HOEnZb"><font color="#888888"><br>
ivan</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
_______________________________________________<br>
drbd-user mailing list<br>
<a href="mailto:drbd-user@lists.linbit.com" target="_blank">drbd-user@lists.linbit.com</a><br>
<a href="http://lists.linbit.com/mailman/listinfo/drbd-user" target="_blank">http://lists.linbit.com/<u></u>mailman/listinfo/drbd-user</a><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr">Paul Mackinney<br>Systems & Quality Manager<br>O.N. Diagnostics, LLC
<br>2150 Shattuck Ave. Suite 610, Berkeley, CA 94704
<br>510-204-0688 (phone) | 510-356-4349 (fax)
<br>_____________________________________________
<br>If you receive this message in error, please delete it immediately.
This message may contain information that is privileged, confidential
and exempt from disclosure and dissemination under applicable law.
<br>
<br></div></div>
</div>