[DRBD-user] DRBD8, dedicated repository for PVE 4

Jasmin J. jasmin at anw.at
Sun Dec 25 03:43:14 CET 2016


Hello Cesar!

 > LVM-Partition / DRBD / LVM-PV / LVM-VG / LVM-LV(the virtual disk)
 > I guess that you are talking about using two LVM partitions, and my question
 > about performance is compared with that.
 > Other questions:
 > 1- Will your storage plugin work with my setup explained above? Or
 > 2- What setup must I have? (Please explain such a setup in detail)
I explained it already in my last email. It is designed to use the DRBD8
resource DIRECTLY as a disk for a VM in Proxmox.
   -> PhysDisk / LVM / DRBD8(the virtual disk)
Whenever Proxmox needs to activate this disk, the plugin performs the
required drbdadm calls to activate DRBD8 on that host. As long as the disk is
not in use, the DRBD8 resource stays in the "down" state.
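For illustration, the activation and deactivation the plugin performs
corresponds roughly to the following drbdadm calls (the resource name
"vm100-disk-1" is a made-up example, not something the plugin mandates):

```shell
# Activate: bring the resource up and promote it on the host running the VM
drbdadm up vm100-disk-1        # attach backing disk, start replication
drbdadm primary vm100-disk-1   # allow local read/write access

# ... the VM uses the /dev/drbdX device as its disk ...

# Deactivate: demote and tear the resource down again
drbdadm secondary vm100-disk-1
drbdadm down vm100-disk-1
```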

 > (Please explain in detail such setup)
I am leaving soon for vacation and have no time now to describe more than
what is already in the README.md on the GitHub page. When I have a good
Internet connection in the Philippines and some time, I can improve the
description.

But basically it is:
- Define a PV/VG/LV on node A and also on node B. In my configuration the PVs
   on node A and B are different. You have full freedom to choose what you
   like. Even the names of the VG/LV can differ, but I recommend using the
   same names to get an easier setup.
   You have to define one LV for each virtual disk you need!
- Define a DRBD8 configuration for each LV (virtual disk) and copy it to
   both machines. The resource name needs to follow the plugin's naming
   convention (see the README.md).
- Define a drbd8 storage entry in "/etc/pve/storage.cfg" for each DRBD8
   resource (virtual disk) and use the DRBD8 resource name for the "resource"
   config entry. To make your life easier, you should name the storage like
   the DRBD8 resource name with the prefix "storage_drbd_", but you can use
   any name here.
- Use the predefined storage for each virtual disk in each VM. Please note
   that the plugin doesn't support real allocation; it always uses the whole
   storage for the virtual disk. You need to set the size of the "allocated"
   disk to less than or equal to the real size.
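Put together, a minimal setup for one virtual disk could look like the sketch
below. All names (the VG "drbdvg", the resource "vm100-disk-1", the backing
device, hostnames and IP addresses) are placeholders, and the exact
storage.cfg keys depend on the plugin version, so check the README.md:

```shell
# On BOTH nodes: backing PV/VG/LV (the PVs may differ between the nodes)
pvcreate /dev/sdb
vgcreate drbdvg /dev/sdb
lvcreate -L 32G -n vm100-disk-1 drbdvg

# On BOTH nodes: one DRBD8 resource per virtual disk
cat > /etc/drbd.d/vm100-disk-1.res <<'EOF'
resource vm100-disk-1 {
    device    /dev/drbd0;
    disk      /dev/drbdvg/vm100-disk-1;
    meta-disk internal;
    on nodeA { address 10.0.0.1:7788; }
    on nodeB { address 10.0.0.2:7788; }
}
EOF
drbdadm create-md vm100-disk-1

# Once (storage.cfg is shared cluster-wide): the storage entry
cat >> /etc/pve/storage.cfg <<'EOF'
drbd8: storage_drbd_vm100-disk-1
        resource vm100-disk-1
EOF
```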

Now my DRBD8 storage plugin will be executed by the Proxmox storage manager
whenever the VM is started, stopped or migrated, and will do all the required
DRBD interaction.

Yes, it is a lot to configure manually (LVM, DRBD8, storage), but you end up
with a safer system (in my opinion) and full DRBD8 automation. The only thing
which is currently not supported is live migration. The plugin checks that the
other side is in the secondary role, otherwise it will not activate the DRBD8
volume.
It might be possible to implement this (remove the check in the plugin) and
use dual-primary mode, but ... .
In my opinion live migration is not necessary, but others may see this
differently and extend the plugin to allow it, and of course test whether this
really works without getting a split brain.

 > To solve split-brain problems very quickly, and while the VMs are running
 > (online), I have a LVM-Partition/DRBD/etc. for each PVE node, and each PVE
 > node can have a maximum of two LVM partitions with DRBD8 if I run VMs on
 > both PVE nodes.
I had a similar configuration before, but now I am free in the number of
disks and VMs I can have. The drawback is that there are now a lot of DRBD8
instances, and if the unlikely situation of a split brain happens, you have
to resolve it on a lot of DRBD instances. But even this should not be too
complicated, because you know which VM was running on which server, so you
know which side has the newest data. The plugin avoids activating the DRBD8
resource on two servers by checking the other side's secondary role. And your
cluster manager should avoid starting a VM on both servers if the DRBD8
driver can't reach its other end and would allow switching to primary. But in
reality it is very unlikely that the DRBD8 drivers can't see each other,
because you should have a redundant infrastructure for the server
interconnection. So again, it should never be possible to activate the VM on
both servers, even if the cluster manager tried to do this.
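If a split brain does occur on one of the many resources, the usual manual
DRBD8 recovery applies per resource; since you know which node ran the VM,
you know whose data must survive. A sketch (the resource name is a
placeholder):

```shell
# On the node whose data should be DISCARDED (the VM was NOT running here):
drbdadm secondary vm100-disk-1
drbdadm connect --discard-my-data vm100-disk-1

# On the node whose data survives (the VM ran here), if it also disconnected:
drbdadm connect vm100-disk-1
```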

 > I guess it is necessary to create a second plugin for it, and this plugin
 > must do this:
 > 1- Compare the sync of the DRBD resources and check whether they are
 > healthy.
 > 2- If the DRBD resources are synced correctly, change the DRBD resource to
 > primary on the other PVE node; else show an error message and abort the
 > process.
 > 3- Run the VM migration.
 > 4- Change the DRBD resource to secondary on the first PVE node.
Your idea is nice, but this is not how the PVE storage interface works!
The PVE storage interface will execute de/activate_storage, de/activate_volume
and file_path. It assumes the storage is shared, which means it is allowed
to write from both sides at the same time. Thus it calls activate_volume on
the second node while it is still activated on the first node. This would
be a primary-primary situation, which requires another DRBD8 configuration
(dual-primary mode) and a change in my plugin.
To be honest, I don't trust the PVE cluster manager here. Or better: I don't
know whether it would really be possible to write the same disk blocks from
both sides. If someone would like to test it, I wrote above how it might work
with the existing plugin by simply removing the check for the other side
being in the secondary role.
 > I did several tests about that, and I can say with certainty that in DRBD8
 > online resize is only possible if the VM is not running and the DRBD
 > resource is unmounted on the host (thoroughly tested in my lab with its
 > possible alternatives, always with the condition that the DRBD resource is
 > as small as possible, i.e. matching the size of the virtual disk)
Thanks for sharing this information!
So you need to shut down the VM first. Then the plugin will change to
secondary on both sides, and then you can do the LVM/DRBD8 shrink/extend. And
of course shrinking requires more attention and intelligence.
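Under those constraints, growing a virtual disk might look like the following
sketch (VM stopped; the resource demoted to secondary but up and connected on
both nodes; all names are placeholders):

```shell
# On BOTH nodes: grow the backing LV first
lvextend -L +8G /dev/drbdvg/vm100-disk-1

# On one node, with the resource connected: grow the DRBD device itself
drbdadm resize vm100-disk-1

# Finally adjust the disk size in the VM configuration before starting it.
```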

