[DRBD-user] Out-of-sync woes

Fri Aug 4 09:55:51 CEST 2017

Hi Luke,

I assume you are seeing data inconsistency caused by in-flight writes.
This means that a process (here your VM's qemu) can change a block that
is already queued to be written to disk.
Whether this happens (undetected) or not depends on how the data is
accessed for writing and synced to disk.
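
To make the effect concrete, here is a toy model in Python. It is only a
sketch, not how DRBD or qemu are implemented: the function names, the
block size and the assumed order of reads (local disk first, peer second)
are illustrative assumptions. It shows that if a buffer is changed between
the two reads, the two copies end up different without any I/O error.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def replicate_without_stable_copy(buffer, mutate):
    """Both targets read the live buffer, at different points in time."""
    local = bytes(buffer)   # local disk write reads the buffer first
    mutate(buffer)          # writer changes the block while it is in flight
    peer = bytes(buffer)    # replication to the peer reads it afterwards
    return local, peer


def replicate_with_stable_copy(buffer, mutate):
    """Both targets read one snapshot taken when the write was submitted."""
    snapshot = bytes(buffer)
    mutate(buffer)          # later changes no longer affect this write
    return snapshot, snapshot


def scribble(buf):
    # Hypothetical writer that reuses the buffer before the write completes.
    buf[0:4] = b"NEW!"


original = bytearray(b"old." * (BLOCK_SIZE // 4))

local, peer = replicate_without_stable_copy(bytearray(original), scribble)
unstable_in_sync = (local == peer)

local, peer = replicate_with_stable_copy(bytearray(original), scribble)
stable_in_sync = (local == peer)

print("live buffer in sync:    ", unstable_in_sync)
print("stable snapshot in sync:", stable_in_sync)
```

In the first case a later verify run would flag the block as out of sync,
even though neither node saw a failed write.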

For qemu, you have to consider two factors; the guest OS' file systems'
configuration and qemu's disk caching configuration:
On Linux guests, this usually only happens with file systems that are
NOT mounted either sync or with barriers, and with block-backed swap.
On Windows guests it always happens.
For qemu it depends on how the disk caching strategy is configured and
thus whether it allows in-flight writes or not.

The usual recommendation is to configure qemu for writethrough caching for
all disks and leave your guests' OS unchanged. You will also have to
ignore/override libvirt's warning about unsafe migration with this cache
setting, as it only applies to file-backed VM disks, not
blockdev-backed.
I use this for hundreds of Linux and Windows VMs backed by DRBD block
devices and have had no inconsistency problems at all since this
change.
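
If you want to check which of your VM disks are not yet set to
writethrough, something like the following Python sketch could help. It
assumes libvirt-style domain XML (as printed by "virsh dumpxml"); the
sample XML, the VM name and the /dev/drbdN paths are made up for
illustration, and the helper name is my own.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of libvirt domain XML; names are hypothetical.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>example-vm</name>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source dev='/dev/drbd0'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/drbd1'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/drbd2'/>
      <target dev='vdc' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disks_not_writethrough(domain_xml):
    """Return (source device, cache mode) for block disks that are not
    explicitly configured with cache='writethrough'."""
    root = ET.fromstring(domain_xml)
    offenders = []
    for disk in root.findall("./devices/disk"):
        if disk.get("type") != "block":
            continue  # only block-device-backed disks are of interest here
        driver = disk.find("driver")
        cache = driver.get("cache") if driver is not None else None
        if cache != "writethrough":
            source = disk.find("source")
            dev = source.get("dev") if source is not None else None
            offenders.append((dev, cache))
    return offenders


offenders = disks_not_writethrough(SAMPLE_DOMAIN_XML)
for dev, cache in offenders:
    print(dev, "cache =", cache or "(not set)")
```

A disk without a cache attribute is reported too, since its mode is then
left to the hypervisor default rather than set explicitly.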

Changing qemu's caching strategy might affect performance.
For performance reasons you are advised to use a hardware RAID
controller with battery-backed write-back cache.

For consistency reasons you are advised to use real hardware RAID, too,
as the in-flight block changing problem described above might also
affect mdraid, dmraid/FakeRAID, LVM mirroring, etc. (depending on
configuration).

Best regards,
// Veit

On Friday, 04.08.2017, at 11:11 +1200, Luke Pascoe wrote:
> Hello everyone.
> 
> I have a fairly simple 2-node CentOS 7 setup running KVM virtual
> machines, with DRBD 8.4.9 between them.
> 
> There is one DRBD resource per VM, with at least 1 volume each,
> totalling 47 volumes.
> 
> There's no clustering or heartbeat or other complexity. DRBD has its
> own Gig-E interface to sync over.
> 
> I recently migrated a host between nodes and it crashed. During
> diagnostics I did a verification on the drbd volume for the host and
> found that it had _a lot_ of out of sync blocks.
> 
> This led me to run a verification on all volumes, and while I didn't
> find any other volumes with large numbers of out of sync blocks, there
> were several with a few. I have disconnected and reconnected all these
> volumes, to force them to resync.
> 
> I have now set up a nightly cron job which will verify as many volumes
> as it can in a 2-hour window; this means I get through the whole lot in
> about a week.
> 
> Almost every night, it reports at least 1 volume which is out-of-sync,
> and I'm trying to understand why that would be.
> 
> I did some research and the only likely candidate I could find was
> related to TCP checksum offloading on the NICs, which I have now
> disabled, but it has made no difference.
> 
> Any suggestions what might be going on here?
> 
> Thanks.
> 
> Luke Pascoe