Rather than setting up a cluster from scratch I'd suggest using something
like Proxmox VE. I've been using it with DRBD in dual-primary without issue
for many years.

Eric

On Mar 5, 2015 1:47 PM, "Paul Mackinney" <paul.mackinney at ondiagnostics.com> wrote:

> Thanks Bram & Ivan! These are some great hints. The ability to perform a
> KVM live migration with no service disruption will be a huge benefit.
>
> Paul
>
> On Thu, Mar 5, 2015 at 5:52 AM, Ivan <ivan at c3i.bg> wrote:
>
>> Hi!
>>
>> On 03/03/2015 07:26 PM, Paul Mackinney wrote:
>>
>>> Apologies in advance if this is too off-topic for this forum, feel
>>> free to reply directly & point to other resources.
>>>
>>> I'm having good success with DRBD in the following configuration:
>>>
>>> Two identical hosts, dual-bonded ethernet for DRBD with a dedicated
>>> VLAN on my smart switch.
>>>
>>> I'm running DRBD on top of LVM; each device is the backing storage
>>> for a KVM virtual guest. The top-down diagram is:
>>>
>>> [ /dev/sda1 ]            Guest root system
>>> [ /dev/sda ]
>>> [ /dev/drbdN ]           <--- primary/secondary mode
>>> [ /dev/mapper/vg-lvN ]
>>> [ lvm physical volume ]
>>
>> I have the same DRBD setup as you.
>>
>>> This works perfectly, but guest migration requires a shutdown:
>>> 1. Shut down guest on host 0
>>> 2. Demote DRBD primary to secondary on host 0
>>> 3. Promote DRBD secondary to primary on host 1
>>> 4. Start guest on host 1
>>>
>>> Apparently running GFS with DRBD's dual-primary mode would allow KVM
>>> migration from one host to another without a shutdown, but all the
>>> tutorials jump straight into cluster configuration.
>>
>> Live migration indeed requires dual-primary mode, but it has nothing to
>> do with GFS (except if you want your machines' disks to be file based,
>> and in that case doing it over GFS would take a huge performance hit and
>> defeat the purpose of the setup you described above).
>>
>> So - just set up dual-primary (protocol C; allow-two-primaries yes;) and
>> you'll be able to live migrate your vm with:
>>
>> virsh migrate vm-test qemu://other.host/system
>>
>> provided you have set up libvirt properly (hint if you haven't:
>> configure listen_addr, *_file, tls_allowed_dn_list in libvirtd.conf).
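>>
>> As a rough sketch (DRBD 8.4-style syntax; the resource name, backing
>> devices, hostnames, addresses and port below are only placeholders, not
>> taken from your setup), the relevant part of the resource config would
>> look something like:
>>
>>   resource r0 {
>>     net {
>>       protocol C;
>>       allow-two-primaries yes;   # required for live migration
>>     }
>>     on host0 {
>>       device    /dev/drbd0;
>>       disk      /dev/mapper/vg-lv0;   # LVM backing device
>>       address   10.0.0.1:7788;
>>       meta-disk internal;
>>     }
>>     on host1 {
>>       device    /dev/drbd0;
>>       disk      /dev/mapper/vg-lv0;
>>       address   10.0.0.2:7788;
>>       meta-disk internal;
>>     }
>>   }
>>
>> then run "drbdadm adjust r0" on both nodes to apply the change.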
>>
>> Having the 2 nodes as primary without proper clustering/fencing is a
>> recipe for trouble, but you can decrease the risk of doing stupid things
>> by using a libvirt hook script which sets both nodes to primary only
>> while a live migration is performed, and then reverts the "slave"
>> resource to secondary. Here's such a script:
>>
>> https://github.com/taradiddles/cluster/blob/master/libvirt_hooks/qemu
>>
>>> Is converting my hosts into a cluster really the next step? Of course
>>> the point of KVM migration is to get to automatic failover.
>>
>> You could write a few scripts instead of setting up clustering, but
>> you'll end up reinventing the wheel, getting upset, and setting up
>> clustering anyway (been there, done that). A few tips in case you use
>> pacemaker:
>>
>> Restrict the number of concurrent migrations (avoids I/O thrashing):
>>
>> pcs property set migration-limit=1
>>
>> Create a basic domain resource (provided you have already set up a DRBD
>> resource named e.g. drbd_${VM}-clone):
>>
>> VM=vmtest
>> pcs resource create ${VM} VirtualDomain \
>>     hypervisor="qemu:///system" \
>>     config="/etc/cluster/vm-xml/${VM}.xml" \
>>     meta allow-migrate="true"
>>
>> (Pay attention that the xml files are not in /etc/libvirt/qemu;
>> obviously you'll need to synchronize those files between both hosts.)
>>
>> Start the DRBD resource before this resource:
>>
>> pcs constraint order promote drbd_${VM}-clone then ${VM}
>>
>> You'll also want to set up various timeouts, e.g.:
>>
>> pcs resource op add ${VM} start timeout="120s" interval="0"
>> pcs resource op add ${VM} stop timeout="240s" interval="0"
>> pcs resource op add ${VM} monitor timeout="30s" interval="10s"
>> pcs resource op add ${VM} migrate_from timeout="60s" interval="0"
>> pcs resource op add ${VM} migrate_to timeout="120s" interval="0"
>>
>> The commands use pcs but you can easily translate them to crm.
>>
>> good luck
>> ivan
>
> --
> Paul Mackinney
> Systems & Quality Manager
> O.N. Diagnostics, LLC
> 2150 Shattuck Ave. Suite 610, Berkeley, CA 94704
> 510-204-0688 (phone) | 510-356-4349 (fax)
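(For context: the pacemaker tips above assume an existing master/slave DRBD
resource named drbd_${VM}-clone. A rough sketch of how such a resource is
typically created with pcs follows; the resource names, monitor intervals and
colocation constraint are illustrative, not taken from the thread.)

VM=vmtest
# DRBD primitive managed by the linbit OCF agent; here ${VM} is also the
# DRBD resource name defined in the drbd config
pcs resource create drbd_${VM} ocf:linbit:drbd \
    drbd_resource=${VM} \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"

# master/slave wrapper; master-max=2 allows dual-primary for live migration,
# notify=true is required by the DRBD agent
pcs resource master drbd_${VM}-clone drbd_${VM} \
    master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 \
    notify=true

# keep the VM on a node where the DRBD resource is promoted
pcs constraint colocation add ${VM} with master drbd_${VM}-clone INFINITY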