[DRBD-user] [drbd-mc] LCMC display says "up to date" but DRBD is not

Rasto Levrinc rasto.levrinc at gmail.com
Thu Oct 25 00:50:57 CEST 2012



On Wed, Oct 24, 2012 at 11:16 PM, Whit Blauvelt <whit.drbd at transpect.com> wrote:
> I wrote:
>
>> > I've got a fairly simple setup, that back some time ago was working well,
>> > but at some point has slipped away from me. I have a number of KVM VMs which
>> > have been set up by using a distinct LVM partition behind each, and then
>> > using DRBD to mirror these between two servers via a dedicated crossover.

You mean LVM -> DRBD -> KVM, not LVM->(KVM, DRBD), right?

>> > There are 6-8 VMs on each of the two servers, with both dedicated LVMs and
>> > dedicated DRBD resources. I've been using current versions of LCMC along the
>> > way to set up the DRBD mirroring. The LVMs have been set up using native
>> > tools, and the KVM VMs through libvirt.
>> >
>> > To put the problem briefly, I've recently discovered, on shutting down VMs
>> > on one server and then restarting the VMs on the other, after shifting DRBD
>> > primary assignments, that the secondary DRBD storage has not kept up. This
>> > is despite Connected/UpToDate claims in the Storage display of LCMC.
>
>
>> The display in LCMC should be ok. Your problem is probably either your
>> config or an administration error at some point, forcing the DRBD to think
>> the data are up-to-date. You can run online verify to check if your
>> secondary has the same data as primary, before finding out the hard way. For
>> DRBD specific questions, you should ask in drbd-user mailing list.
>>
>> Rasto
>
> Thanks Rasto,
>
> Including the drbd list now.
>
> I'm certainly capable of administrative error. And the reporting of UpToDate
> when the filesystems are definitely not is deeper than LCMC - drbd-overview
> shows the same thing, "Connected UpToDate/UpToDate" even though the mirror
> doesn't match. "cat /proc/drbd" gives the same misinformation. "drbdadm
> cstate xxx" also gives "Connected". And "drbdadm dstate cent_s" gives
> "UpToDate/UpToDate" on both servers.
>
> A problem with the "administrative error" hypothesis is that the DRBD
> administration has been, beyond the initial installation, entirely through
> LCMC. That is, it's a problem for LCMC (perhaps an older version though) if
> it allows an admin's error that results in false reports of up-to-date
> connections.
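
For reference, the connection and disk states quoted above reflect DRBD's own
bookkeeping rather than a block-by-block comparison, so it can help to look at
the "oos:" (out of sync, in KiB) counter in /proc/drbd as well, which online
verify updates when it finds mismatches. A minimal sketch, assuming the 8.3
style /proc/drbd layout; the sample text and its numbers are made up for
illustration, and parse_proc_drbd() and suspicious() are hypothetical helper
names:

```python
import re

# Made-up sample in the DRBD 8.3 /proc/drbd layout (values are illustrative).
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:1048576 nr:0 dw:1048576 dr:2048 al:12 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:3788
"""

DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")
OOS_RE = re.compile(r"\boos:(\d+)")


def parse_proc_drbd(text):
    """Return {minor: {"cs", "ro", "ds", "oos"}} parsed from /proc/drbd text."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            # First line of a device entry: connection, role and disk states.
            current = int(m.group(1))
            devices[current] = {
                "cs": m.group(2),
                "ro": m.group(3),
                "ds": m.group(4),
                "oos": None,
            }
            continue
        m = OOS_RE.search(line)
        if m and current is not None:
            # Counter line belonging to the most recent device entry.
            devices[current]["oos"] = int(m.group(1))
    return devices


def suspicious(devices):
    """Minors that report UpToDate/UpToDate yet have a non-zero oos counter."""
    return [
        minor
        for minor, d in sorted(devices.items())
        if d["ds"] == "UpToDate/UpToDate" and d["oos"]
    ]
```

On a live node the input would come from reading /proc/drbd instead of SAMPLE.
A zero counter only means DRBD has not recorded any mismatch; it does not prove
that the two sides are identical.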

I see, we can leave out the administrative error, then. LCMC doesn't let you
do that, and it's actually difficult to achieve even from the command line.
Well, one way to do it would be to skip the initial full sync, right at
the beginning.

Your DRBD config file could give an indication of what's wrong, but it could
also be that the problem is underneath DRBD, or that something is writing to
the backing device underneath DRBD.

Rasto

...

> Oct 24 16:49:24 vm1 kernel: [5730169.630823] block drbd0: Out of sync: start=32832, size=8 (sectors)
> ... on for 947 lines of such notices in this case.
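
To get a feel for how much data such notices cover, the kernel log lines can
be tallied per device. This is only a sketch: it assumes the message format is
exactly as in the line quoted above and that the sizes are in 512-byte
sectors; tally_out_of_sync() is a hypothetical helper name.

```python
import re

# Matches the kernel notice quoted in the thread, e.g.
# "block drbd0: Out of sync: start=32832, size=8 (sectors)"
OOS_LINE = re.compile(
    r"block (drbd\d+): Out of sync: start=(\d+), size=(\d+) \(sectors\)"
)
SECTOR_BYTES = 512  # assumption: sizes are reported in 512-byte sectors


def tally_out_of_sync(lines):
    """Return {device: (notice_count, total_sectors)} for matching log lines."""
    totals = {}
    for line in lines:
        m = OOS_LINE.search(line)
        if not m:
            continue  # unrelated log line
        device, size = m.group(1), int(m.group(3))
        count, sectors = totals.get(device, (0, 0))
        totals[device] = (count + 1, sectors + size)
    return totals


def total_bytes(totals, device):
    """Out-of-sync bytes reported for one device (0 if it never appeared)."""
    return totals.get(device, (0, 0))[1] * SECTOR_BYTES
```

Fed with the relevant lines of the kernel log, this gives one count and one
sector total per DRBD device, which is easier to compare between verify runs
than several hundred individual notices.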

It could still mean that

>
> Disconnecting and reconnecting the secondary should cause a resync per the
> manual. Okay. But that's not preventing the problem redeveloping - not
> identifying and correcting the cause.
>
> To review how these were administratively set up: An LVM partition was used
> as a backing store in creating each VM. A matching LVM partition was created
> on the second server. LCMC was used at that point to assign both to DRBD,
> using the data from the first LVM.
>
> It is initially working, or else the secondary wouldn't be populated at all.
> But it stops working at some point, while leaving DRBD showing that
> everything's just fine - short of running online verify or doing the
> disconnect-reconnect sequence. I could script disconnect-reconnect behavior
> overnight. That still wouldn't guarantee good mirrors in between, so DRBD
> still can't be 100% depended on for failover then.
>
> This is not the most up-to-date system, drbd version 8.3.8.1. Still....
>
> Whit
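
The overnight check mentioned above could be scripted roughly as follows. This
is a sketch under assumptions: it uses only the drbdadm subcommands verify,
disconnect and connect, it assumes a verify-alg is configured for the
resource, and the helper names are hypothetical. It defaults to a dry run that
only returns the command lines.

```python
import shlex
import subprocess


def verify_command(resource):
    """Command that starts an online verify for one resource."""
    return ["drbdadm", "verify", resource]


def resync_commands(resource):
    """Disconnect/connect pair that resyncs blocks marked out of sync."""
    return [
        ["drbdadm", "disconnect", resource],
        ["drbdadm", "connect", resource],
    ]


def run(commands, dry_run=True):
    """Return the command lines as strings; execute them unless dry_run."""
    rendered = []
    for cmd in commands:
        rendered.append(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)
    return rendered


if __name__ == "__main__":
    # "cent_s" is the resource name used earlier in this thread.
    for line in run([verify_command("cent_s")] + resync_commands("cent_s")):
        print(line)
```

Online verify runs in the background, so a real script would have to wait
until the verify has finished before issuing the disconnect/connect pair. As
noted above, this only repairs what has already diverged; it does not address
the cause.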



-- 
Dipl.-Ing. Rastislav Levrinc
rasto.levrinc at gmail.com
Linux Cluster Management Console
http://lcmc.sf.net/


