[DRBD-user] LCMC display (and other tools) says "up to date" but DRBD is not

Whit Blauvelt whit+drbd at transpect.com
Wed Oct 24 23:44:49 CEST 2012

Date: Wed, 24 Oct 2012 17:16:21 -0400
From: Whit Blauvelt <whit.drbd at transpect.com>
To: Rasto Levrinc <rasto.levrinc at gmail.com>, drbd-user at lists.linbit.com
Cc: drbd-mc at lists.linbit.com
Subject: Re: [drbd-mc] LCMC display says "up to date" but DRBD is not
User-Agent: Mutt/1.5.21 (2010-09-15)

I wrote:

> > I've got a fairly simple setup, that back some time ago was working well,
> > but at some point has slipped away from me. I have a number of KVM VMs which
> > have been set up by using a distinct LVM partition behind each, and then
> > using DRBD to mirror these between two servers via a dedicated crossover.
> > There are 6-8 VMs on each of the two servers, with both dedicated LVMs and
> > dedicated DRBD resources. I've been using current versions of LCMC along the
> > way to set up the DRBD mirroring. The LVMs have been set up using native
> > tools, and the KVM VMs through libvirt.
> > 
> > To put the problem briefly, I've recently discovered, on shutting down VMs
> > on one server and then restarting the VMs on the other, after shifting DRBD
> > primary assignments, that the secondary DRBD storage has not kept up. This
> > is despite Connected/UpToDate claims in the Storage display of LCMC.

> The display in LCMC should be ok. Your problem is probably either your
> config or an administration error at some point, forcing the DRBD to think
> the data are up-to-date. You can run online verify to check if your
> secondary has the same data as primary, before finding out the hard way. For
> DRBD specific questions, you should ask in drbd-user mailing list.
> Rasto

Thanks Rasto,

Including the drbd list now. 

I'm certainly capable of administrative error. But the reporting of UpToDate
when the filesystems are definitely not goes deeper than LCMC: drbd-overview
shows the same thing, "Connected UpToDate/UpToDate", even though the mirror
doesn't match. "cat /proc/drbd" gives the same misinformation. "drbdadm
cstate xxx" also gives "Connected". And "drbdadm dstate cent_s" gives
"UpToDate/UpToDate" on both servers.

A problem with the "administrative error" hypothesis is that the DRBD
administration has been, beyond the initial installation, entirely through
LCMC. That is, it's a problem for LCMC (perhaps an older version though) if
it allows an admin's error that results in false reports of up-to-date
status.

Using online verify also confirms that we're not at all up to date:

Oct 24 16:49:24 vm1 kernel: [5730169.131424] block drbd0: conn( Connected -> VerifyS ) 
Oct 24 16:49:24 vm1 kernel: [5730169.131434] block drbd0: Starting Online Verify from sector 0
Oct 24 16:49:24 vm1 kernel: [5730169.185546] block drbd0: Out of sync: start=584, size=8 (sectors)
Oct 24 16:49:24 vm1 kernel: [5730169.188980] block drbd0: Out of sync: start=1112, size=16 (sectors)
Oct 24 16:49:24 vm1 kernel: [5730169.236967] block drbd0: Out of sync: start=64, size=8 (sectors)
Oct 24 16:49:24 vm1 kernel: [5730169.630823] block drbd0: Out of sync: start=32832, size=8 (sectors)
... on for 947 lines of such notices in this case.
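
For sizing up a verify run without reading hundreds of such lines, a rough
Python sketch along these lines could total them per device. The regular
expression is written against the four log lines quoted above only; the
function name and the 512-byte sector size are assumptions for illustration,
not anything taken from DRBD's own tooling.

```python
import re

# Matches kernel messages of the form quoted above, e.g.
#   block drbd0: Out of sync: start=584, size=8 (sectors)
OOS_RE = re.compile(
    r"block (drbd\d+): Out of sync: start=(\d+), size=(\d+) \(sectors\)"
)

SECTOR_BYTES = 512  # assumption: sizes are reported in 512-byte sectors


def summarize_out_of_sync(lines):
    """Return {device: (range_count, total_sectors)} for 'Out of sync' lines."""
    summary = {}
    for line in lines:
        match = OOS_RE.search(line)
        if match is None:
            continue  # not an out-of-sync notice; skip it
        device, size = match.group(1), int(match.group(3))
        count, total = summary.get(device, (0, 0))
        summary[device] = (count + 1, total + size)
    return summary


# The four notices quoted in this message, used as sample input.
SAMPLE = """\
Oct 24 16:49:24 vm1 kernel: [5730169.185546] block drbd0: Out of sync: start=584, size=8 (sectors)
Oct 24 16:49:24 vm1 kernel: [5730169.188980] block drbd0: Out of sync: start=1112, size=16 (sectors)
Oct 24 16:49:24 vm1 kernel: [5730169.236967] block drbd0: Out of sync: start=64, size=8 (sectors)
Oct 24 16:49:24 vm1 kernel: [5730169.630823] block drbd0: Out of sync: start=32832, size=8 (sectors)
"""

if __name__ == "__main__":
    for device, (count, sectors) in summarize_out_of_sync(SAMPLE.splitlines()).items():
        print(f"{device}: {count} ranges, {sectors} sectors "
              f"({sectors * SECTOR_BYTES} bytes)")
```

In practice the input would be the syslog or dmesg output rather than the
embedded sample.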

Disconnecting and reconnecting the secondary should cause a resync, per the
manual. Okay. But that only repairs the damage after the fact. It neither
identifies nor corrects the cause, so nothing prevents the problem from
redeveloping.

To review how these were administratively set up: An LVM partition was used
as a backing store in creating each VM. A matching LVM partition was created
on the second server. LCMC was used at that point to assign both to DRBD,
using the data from the first LVM.

It is initially working, or else the secondary wouldn't be populated at all.
But it stops working at some point, while leaving DRBD showing that
everything's just fine - short of running online verify or doing the
disconnect-reconnect sequence. I could script disconnect-reconnect behavior
overnight. That still wouldn't guarantee good mirrors in between, so DRBD
still can't be 100% depended on for failover then.
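
The overnight scripting idea above might look roughly like the following
Python sketch. It assumes a verify algorithm is already configured for the
resource, that "drbdadm cstate" prints exactly "Connected" once the verify
pass has finished, and that the script runs on only one of the two nodes.
The function names, the resource name "r0" and the polling interval are
hypothetical.

```python
import subprocess
import time


def run_command(cmd):
    """Run a command, raise if it fails, and return its stripped stdout."""
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return done.stdout.strip()


def verify_and_resync(resource, runner=run_command, poll_seconds=60,
                      sleep=time.sleep):
    """Start online verify, wait for it to finish, then disconnect and
    reconnect so that blocks marked out of sync get resynchronized."""
    runner(["drbdadm", "verify", resource])

    # Online verify runs in the background, so poll the connection state
    # until it returns to Connected (assumption: it reads VerifyS or
    # VerifyT while the verify pass is still running).
    while runner(["drbdadm", "cstate", resource]) != "Connected":
        sleep(poll_seconds)

    runner(["drbdadm", "disconnect", resource])
    runner(["drbdadm", "connect", resource])


# Hypothetical usage from a nightly cron job, for a resource named "r0":
#     verify_and_resync("r0")
```

The runner and sleep parameters exist only so the sequence can be exercised
without touching a real DRBD device. As noted, this would repair drift after
the fact rather than explain why the mirrors diverge in the first place.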

This is not the most up-to-date system, drbd version Still....

