[DRBD-user] Out-of-sync blocks with VMware Workstation

Tue Nov 12 10:46:16 CET 2013

Dear DRBD users and developers,

We have 2 clusters with several "simple" resources (single master, 
single resource per connection).
Each resource contains either an LXC container or a VMware virtual 
machine (VMware Workstation v9.0.2).

We run "drbdadm verify all" every weekend and we noticed that the 
resources hosting VMware machines often have out-of-sync blocks, while 
the ones hosting LXC containers never have any.
We've seen this on both clusters (configs quoted below).
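
For reference, here is a minimal sketch of how the out-of-sync counters can be read once a verify run has finished. It assumes the /proc/drbd layout of DRBD 8.3/8.4, where each device's statistics line ends with an "oos:" field (in KiB); the function name report_oos is made up for illustration and is not part of DRBD.

```shell
#!/bin/sh
# Print "<minor> <oos KiB>" for every DRBD device whose out-of-sync
# counter is non-zero. Reads /proc/drbd-style text from the file given
# as $1 (defaults to /proc/drbd). Assumes the DRBD 8.3/8.4 layout.
report_oos() {
    awk '
        # Device header line, e.g. " 0: cs:Connected ro:Primary/Secondary ..."
        /^ *[0-9]+: / { minor = $1; sub(/:$/, "", minor) }
        # The statistics line carries the counter as "oos:<KiB>"
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^oos:[0-9]+$/) {
                    n = substr($i, 5) + 0
                    if (n > 0) print minor, n
                }
            }
        }
    ' "${1:-/proc/drbd}"
}
```

Calling report_oos with no argument after the weekly verify lists only the minors that still have out-of-sync data; an empty output means every device reports oos:0.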

I was wondering what could be causing these out-of-sync blocks.
Could VMware be modifying in-flight data?
Is there a way I can check?

Thanks in advance to anyone who can shed some light on this issue.
Lionel Sausin.

---
Config on the older cluster:
Ubuntu 10.04, kernel 2.6.32 and DRBD 8.3.13. Its resources use ext4 
mounted with -o nobarrier. The storage is a hardware RAID10 on SSDs.

global {
	usage-count yes;
}

common {

	protocol C;

	# Actions to take in the face of special events
	handlers {
		pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
		split-brain "/usr/lib/drbd/notify-split-brain.sh <<<cut>>>";
		out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh <<<cut>>>";
		before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 5 -- -c 16k";
		after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
	}

	startup {
	}

	net {
		# Restrict access to the resources with a shared secret
		cram-hmac-alg md5;
		shared-secret <<<cut>>>;

		# Congestion management lets writes flow without disconnecting
		# on-congestion pull-ahead;
		# congestion-fill 1M;

		# Go StandAlone if the peer is unreachable too long
		ko-count 10;
		# Allow 6 seconds for the other node's reply before we drop connections
		timeout 60;
	}

	syncer {
		# Compress the dirty-bitmaps
		use-rle;

		# Use checksumming to allow online verification
		# sha1 has fewer chances of hash collision but is CPU-hungry (Noa's CPU can only process up to 60MB/s)
		verify-alg md5;

		# Checksum-based resync while verifying used to lead to a deadlock, fixed in v8.3.11
		csums-alg md5;

		# Adaptive syncer rate: let DRBD decide the best sync speed
		#   initial sync rate
		rate 50M;
		#   size of the rate adaptation window
		c-plan-ahead 20;
		#   min/max rate
		#   The network will allow only up to ~110MB/s, but verify and identical-block resyncs use very little network BW
		c-max-rate 800M;
		#   quantity of sync data to maintain in the buffers (impacts the length of the wait queue)
		c-fill-target 100k;

		# Limit the bandwidth available for resync on the primary node when DRBD detects application I/O
		c-min-rate 8M;

		al-extents 1023;
	}
}

Typical resource config:

resource openerp {
   device    /dev/drbd_openerp minor 0;
   meta-disk internal;
   on NodeA {
     address   10.100.1.2:7788;
     disk      /dev/fast_vol/openerp;
   }
   on NodeB {
     address   10.100.1.3:7788;
     disk      /dev/fast_vol/openerp;
   }
}

---
Config on the newer cluster:
Ubuntu 12.04, kernel 3.8 (raring stack) and DRBD 8.4.2. Its resources 
use ext4 with default options, and were created with -b 4096 -E 
stride=64,stripe-width=192. The storage is an SSD on the primary node, 
hardware RAID5 on the other side.

global_common.conf:

global {
     usage-count yes;
}

common {
     handlers {
         pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
         pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
         local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
         split-brain "/usr/lib/drbd/notify-split-brain.sh <<<cut>>>";
         out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh <<<cut>>>";
         before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
         after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
     }

     startup {
     }

     options {
     }

     disk {
     }

     net {

         # Peer authentication
         cram-hmac-alg sha1;
         shared-secret <<<cut>>>;

         # Skip sync when checksum matches
         csums-alg sha1;

         # Enable online verify
         verify-alg md5;
     }
}

Typical resource config:

resource web {
     device /dev/drbd_web minor 4;
     # Master
     on vmhost7 {
         address 10.100.0.14:7804;
         disk /dev/data/web;
         meta-disk internal;
     }
     # Slave
     on stockagec {
         address 10.100.0.13:7804;
         disk /dev/data1/web;
         meta-disk internal;
     }
}