[DRBD-user] BUG: Uncatchable DRBD out-of-sync issue

Bram Matthys syzop at vulnscan.org
Mon Jan 27 13:18:08 CET 2014



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

Just jumping in, unaware of the history of this thread...

Stanislav German-Evtushenko wrote, on 27-1-2014 7:08:
> 
> On Thu, Apr 18, 2013 at 4:21 PM, Stanislav German-Evtushenko
> <ginermail at gmail.com> wrote:
> 
>     No choice so far :)
>     http://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_2.3
> 
>     I don't think this is a kernel bug. Anyway, it would be nice if somebody
>     could investigate and fix it, or at least find a workaround. IDE is slow
>     compared to VIRTIO.
> 
>     On Thu, Apr 18, 2013 at 2:31 PM, Felix Frank <ff at mpexnet.de> wrote:
>     > On 04/18/2013 12:20 PM, Stanislav German-Evtushenko wrote:
>     >>> Note that your kernel (and hence kvm/virtio) can be considered rather old by now.
>     >> This is a stable RHEL 6 kernel at the moment.
>     >
>     > Exactly ;-)
>     >
>     > Same for Debian 6, which I no longer consider fit for KVM setups
>     > (without backports and such).
> 
> 
> I have replaced all hard drives on the first server and upgraded the DRBD
> kernel modules to 8.3.15. I run a verify every week. It usually finds new
> out-of-sync sectors; I then check whether they are false positives (with
> md5sum) and find that 95% of them are real.
> Could anybody suggest a way to debug this? Could it be a DRBD + RAID problem?
> Or a DRBD + one specific RAID problem?

Have you figured out which of the two servers holds the correct data? And is
it always the same server? This assumes a primary/secondary setup.
If you know which server has the correct data, then you know (IF it's a
hardware problem) which server is at fault. If it's a software problem,
you still can't tell.
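
As a rough sketch of that comparison (not something taken from this thread):
hash the same byte range in fixed-size chunks on each node, then compare the
two lists to see which offsets differ. The function names, the 4096-byte chunk
size and the idea of reading the backing device as a plain file are all
assumptions for illustration; the range must not be written to while it is
being read, and MD5 is used only because md5sum was mentioned above.

```python
import hashlib


def chunk_digests(path, offset=0, length=None, chunk_size=4096):
    """Return [(byte_offset, md5_hex), ...] for consecutive chunks of a file.

    `path` is a hypothetical device or image path; `length=None` reads to EOF.
    """
    digests = []
    with open(path, "rb") as dev:
        dev.seek(offset)
        pos = offset
        remaining = length
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            data = dev.read(want)
            if not data:
                break  # reached end of file/device
            digests.append((pos, hashlib.md5(data).hexdigest()))
            pos += len(data)
            if remaining is not None:
                remaining -= len(data)
    return digests


def differing_offsets(local, remote):
    """Offsets present in `local` whose digest differs from (or is absent in) `remote`."""
    remote_map = dict(remote)
    return [off for off, digest in local if remote_map.get(off) != digest]
```

Run chunk_digests() on each node over the range that the verify flagged, copy
one list to the other node, and feed both to differing_offsets(). Knowing which
chunks differ does not by itself say which copy is the right one; that still
needs a known-good reference for the data.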

Do you run a weekly or monthly RAID verification job? On both servers? Linux
software RAID has this, and presumably hardware RAID has the option as well.
This would pick up (most) RAID / disk issues.
Silent disk corruption can occur on RAID arrays, and disk verification would
be the only way to detect it (well, apart from using a filesystem like ZFS).
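
For Linux software RAID (md), such a check can be started and inspected
through sysfs. A minimal sketch, assuming an array named md0 (a placeholder,
substitute your own) and root privileges; the helper names are made up for
this example, while sync_action and mismatch_cnt are the md sysfs attributes.
Some distributions already schedule a periodic check, so look for an existing
job first.

```python
from pathlib import Path


def _md_attr(md_name, attr, sysfs_root="/sys/block"):
    """Path of an md sysfs attribute, e.g. /sys/block/md0/md/sync_action."""
    return Path(sysfs_root) / md_name / "md" / attr


def start_md_check(md_name, sysfs_root="/sys/block"):
    """Ask the kernel to start a read-only consistency check of the array."""
    _md_attr(md_name, "sync_action", sysfs_root).write_text("check\n")


def check_running(md_name, sysfs_root="/sys/block"):
    """True while sync_action reports anything other than 'idle'."""
    return _md_attr(md_name, "sync_action", sysfs_root).read_text().strip() != "idle"


def read_mismatch_count(md_name, sysfs_root="/sys/block"):
    """Mismatch counter reported by md; read it after the check has finished."""
    return int(_md_attr(md_name, "mismatch_cnt", sysfs_root).read_text().strip())
```

Typical use would be start_md_check("md0"), wait until check_running("md0")
returns False, then look at read_mismatch_count("md0"). A nonzero count is a
reason to investigate that server further; it does not tell you which member
disk holds the bad data.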

Good luck,

Bram.


- -- 
Bram Matthys
Software developer/IT consultant        syzop at vulnscan.org
Website:                                  www.vulnscan.org
PGP key:                       www.vulnscan.org/pubkey.asc
PGP fp: EBCA 8977 FCA6 0AB0 6EDB  04A7 6E67 6D45 7FE1 99A6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)

iF4EAREIAAYFAlLmToAACgkQbmdtRX/hmabbewD9HEaFbFw1j91AgDiAbgWcDari
qZ/fYOYBw/qyMMempbMA/iCKM5Y2Oa3XAUApPWc05cTZ+W9FyOGdOmNgIl4FMGE0
=z7Jn
-----END PGP SIGNATURE-----


