[DRBD-user] [dm-crypt] dm-crypt on top of DRBD for live migration

Milan Broz mbroz at redhat.com
Fri Dec 16 10:22:00 CET 2011



Hi,

I think most of the problems were already discussed,
so just a few comments here.

First, dm-crypt is not tested in cluster scenarios at all,
you have been warned :)

Dm-crypt provides transparent block device encryption. It always
works on 512-byte sectors (no problem for 4k drives, but the
atomic encryption unit is always 512 bytes) and does not randomize
sectors - so the mapping is always 1:1 between the plaintext and
ciphertext devices (but note that LUKS stores a header on disk, so
there is an offset and the plaintext device is smaller than the
underlying device.)
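To make the offset arithmetic concrete, a minimal sketch (the device
path, sector counts and payload offset below are made-up example
values; on a real system `cryptsetup luksDump` and `blockdev --getsz`
report the actual numbers):

```shell
# Hypothetical numbers: a 1 GiB underlying device and a LUKS1 payload
# offset of 4096 sectors (2 MiB header), all in 512-byte sectors.
# On a real setup these would come from:
#   blockdev --getsz /dev/sdb1      # underlying device size in sectors
#   cryptsetup luksDump /dev/sdb1   # "Payload offset" field
DEVICE_SECTORS=2097152
LUKS_OFFSET=4096
PLAINTEXT_SECTORS=$((DEVICE_SECTORS - LUKS_OFFSET))
echo "plaintext device: $PLAINTEXT_SECTORS sectors"
```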

On 12/07/2011 01:30 PM, Berengar Lehr wrote:
> We want to use LVM, dm-crypt and DRBD in a 2-machine setup for KVM.
> 
> We think, a proper setup could be something like this (dm-crypt below DRBD):
> 
> 
>    Machine 1               Machine 2
> 
>       KVM  -> -> -> -> -> ->  KVM
>        |   (live migration)    .
>        |                       .
>       DRBD - - - - - - - - - DRBD
>        |                       |
>       LVM                     LVM
>        |                       |
>     dm-crypt                dm-crypt
>        |                       |
>  Disk/Partition          Disk/Partition
> 
> The KVM guest machines should run on machine 1. Live migration to
> machine 2 should be supported.

dm-crypt internally uses kernel threads (workqueues) which perform
encryption (and even IO submission).

So without a synchronization mechanism, data hits the disk
at different times on machine 1 and machine 2.

This is nothing new; the IO scheduler does the same, dm-crypt just
adds more delay to the mix (encryption takes some time and
it can depend on the load of the machine).

It is the filesystem's responsibility (or that of the application using
the top-level partition) to issue a FLUSH operation to ensure that data
reached the disk, and DRBD's responsibility to distribute this to both
machines. Both LVM and dm-crypt support FLUSH (formerly barrier)
requests now.
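For instance, a userspace writer can force data to stable storage
explicitly; a minimal sketch using dd (the file path is just an example,
and if the file lived on a filesystem backed by the dm-crypt/DRBD stack,
the final fsync() would propagate as a FLUSH request down the layers):

```shell
# Write 4 KiB and force it to stable storage before dd returns;
# conv=fsync makes dd call fsync() on the output after the last write.
dd if=/dev/zero of=/tmp/flush-demo bs=4096 count=1 conv=fsync 2>/dev/null
ls -l /tmp/flush-demo
```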

> Using this setup, every write to DRBD would be (independently) crypted
> on both machines,
> leading to additional (unnecessary?) cpu load on machine 2 before live
> migrating, and additional
> cpu load on machine 1 after live migration.

I think CPU load is nothing you should worry about.
It is data consistency that matters here.

> Could these additional cpu loads be avoided using a setup like this
> (dm-crypt in top of DRBD):
> 
> 
>    Machine 1               Machine 2
> 
>       KVM  -> -> -> -> -> ->  KVM
>        |   (live migration)    .
>        |                       .(b)
>     dm-crypt                dm-crypt
>        |                       |(a)
>       DRBD - - - - - - - - - DRBD
>        |                       |
>       LVM                     LVM
>        |                       |
>  Disk/Partition          Disk/Partition
> 
> In this setup, dm-crypt runs on both machines, too, but is not used on
> machine 2 until KVM
> guests send write-requests after the live migration. So crypting is
> done only by one machine
> at every time point.
> 
> Is such a setup safe and stable?
> What about caching at points (a) or (b) on machine 2?
> Can KVM read cached, outdated data from dm-crypt after live migration?

Yes, in this scenario you should create the dm-crypt mapping on only
one machine and "migrate" it together with the KVM guest.

Or force-flush all buffers before migration on the host the guest
is migrating to (blockdev --flushbufs - but this is not tested, maybe
it needs more force).

As said above, if the application (KVM, fs) properly uses some
synchronization, it should work, but it is not tested.
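An operational sketch of the first approach (the device name
/dev/drbd0 and the mapping name guestdisk are hypothetical examples,
and as noted above none of this is tested):

```shell
# On machine 1, while the guest runs there:
#   cryptsetup open --type plain /dev/drbd0 guestdisk
#   ... live-migrate the KVM guest away ...
#   cryptsetup close guestdisk              # tear down after the guest leaves
# On machine 2, before the guest resumes:
#   cryptsetup open --type plain /dev/drbd0 guestdisk
#
# Alternative (mapping kept active on both sides): drop potentially stale
# buffers on the migration target first - untested, may need more force:
#   blockdev --flushbufs /dev/mapper/guestdisk
```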


Also please note that although you are exporting the ciphertext device
here (IOW, data is encrypted on the wire), it is NOT SECURE on the
network.

In short, an attacker cannot decrypt the data, but he can replace
sectors with their old content (replay attack) etc.
(This follows from how disk encryption works - the IV is always constant
for a given sector, while the IV for a network data stream block must be
a unique nonce that never repeats.) Also, no authentication/checksum is
possible at the dm-crypt level, because it simply has no space to store
an authentication tag or checksum.
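A toy illustration of why the constant per-sector IV makes replay
possible (this sketch uses the openssl CLI with a made-up key and IV,
not dm-crypt itself): encrypting the same "sector" content twice yields
byte-identical ciphertext, so an old recorded ciphertext written back
later decrypts to valid but stale plaintext, and nothing detects it.

```shell
KEY=00112233445566778899aabbccddeeff    # made-up example key
IV=000102030405060708090a0b0c0d0e0f     # fixed "per-sector" IV
# Encrypt a 16-byte block with AES-128-CBC, print ciphertext as hex.
enc() { printf '%s' "$1" | openssl enc -aes-128-cbc -K "$KEY" -iv "$IV" -nopad \
        | od -An -tx1 | tr -d ' \n'; }
C1=$(enc '0123456789abcdef')   # original "sector" content
C2=$(enc '0123456789abcdef')   # same content encrypted again
[ "$C1" = "$C2" ] && echo "identical ciphertext - replay is undetectable"
```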

So you need another layer of encryption for the network data stream
(IPsec or something similar) if you need that.

Milan


