[DRBD-user] Mystery with online verify and Out of sync sectors.

Florian Haas florian.haas at linbit.com
Tue Jun 17 09:52:31 CEST 2008

> Florian Haas wrote:
>> OK. Can you post a relevant excerpt of your syslog? Please do "grep drbd0
>> /var/log/syslog", and then cut & paste to include 100 or so relevant lines
>> from your most recent verify run.
> Sure:
> Jun 15 10:47:48 host-a kernel: drbd0: conn( Connected -> VerifyS )
> Jun 15 10:48:43 host-a kernel: drbd0: Out of sync: start=4475928, size=32 (sectors)
> Jun 15 10:48:43 host-a kernel: drbd0: Out of sync: start=4475960, size=8 (sectors)

Right. And now can you do

dd if=<dev> skip=<start> bs=512 count=<size> iflag=direct | openssl md5

... where <dev> is your backing device, <start> is the first out-of-sync
sector as reported in the kernel log, and <size> is the number of
out-of-sync sectors?

Please do that on both nodes, for a handful (say 5 or so) of out-of-sync
areas reported in your syslog.
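
To save some typing, something along these lines might help. It is only a
sketch: the function names oos_ranges and oos_md5 are made up for this
example, and it assumes each "Out of sync" message sits on a single line in
your actual syslog (unlike the wrapped quote above). <dev> is still your
backing device.

```shell
#!/bin/sh
# Sketch only; oos_ranges and oos_md5 are hypothetical helper names.

# Read kernel log lines on stdin and print one "start size" pair per
# "Out of sync" message, e.g. "4475928 32".
oos_ranges() {
    sed -n 's/.*Out of sync: start=\([0-9][0-9]*\), size=\([0-9][0-9]*\) (sectors).*/\1 \2/p'
}

# Checksum one range: $1 = backing device, $2 = start sector, $3 = size.
# This is the same dd | openssl pipeline as the command given earlier.
oos_md5() {
    dd if="$1" skip="$2" bs=512 count="$3" iflag=direct 2>/dev/null | openssl md5
}

# Possible usage, run as root on each node (replace <dev> first):
#   grep drbd0 /var/log/syslog | oos_ranges | head -n 5 |
#   while read -r start size; do
#       printf '%s+%s: ' "$start" "$size"
#       oos_md5 <dev> "$start" "$size"
#   done
```

Take the list of ranges from one node's log and use the same ranges on both
nodes, so that you are comparing like with like.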

If those MD5 sums match, then these are apparently false positives and
we'll have to look into what's causing them.

If, however, they do not match, replace "openssl md5" with "xxd" in the
command above, and try to interpret those hex dumps. Are they completely
different, or are you seeing just one or two seemingly random differences?
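
If eyeballing the hex dumps gets tedious, a byte count of the differences
can give a first impression. The sketch below assumes you have saved the
same range from each node to a file (for example by using "of=<file>" in
the dd command instead of piping to openssl) and copied both files to one
machine. The function name count_diff_bytes and the file names are made up
for this example.

```shell
#!/bin/sh
# Sketch only; count_diff_bytes is a hypothetical helper name.

# Print the number of byte positions at which two equal-sized files differ.
# cmp -l lists one line per differing byte; it exits non-zero when the
# files differ, which is expected here, hence the "|| true".
count_diff_bytes() {
    { cmp -l "$1" "$2" 2>/dev/null || true; } | wc -l | tr -d ' '
}

# Possible usage (hypothetical file names):
#   count_diff_bytes node-a.bin node-b.bin
```

A count of one or two points to isolated differences; a count close to the
size of the range in bytes means the blocks are essentially unrelated.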

And just so I understand you correctly: you
- unmounted your file system,
- made both devices Secondary,
- ran "drbdadm invalidate-remote <resource>",
- waited for the full sync to complete,
- ran "drbdadm verify <resource>" immediately thereafter,
- and even then you saw these out-of-sync messages in your syslog?


