Re: [DRBD-user] endless device verification and strange checksums

Schmidt, Florian florian.schmidt at centric-it.de
Thu Aug 7 13:44:26 CEST 2008





> -----Original Message-----
> From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Eric Marin
> Sent: Wednesday, 6 August 2008 15:23
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] endless device verification and strange checksums
> 
> Hi,
> 
> Just out of curiosity (not to answer your question, sorry!), which kernel do you use?

I am using RHEL5 with 2.6.18-53.el5.
 
> When going _back_ from 2.6.25-2-686-bigmem to 2.6.18-6-686-bigmem (Debian Etch),
> I experienced far fewer errors: one out-of-sync in nearly a week instead of one
> every two minutes.
> 
> Then, even with the flood of errors detected (which I quickly stopped trying to
> correct) when I used 2.6.25-2-686-bigmem, I didn't notice any data corruption
> (that I can tell).
> 
> I've now deactivated checksum offloading in the network cards and activated
> verify-alg crc32c, and so far I've only noticed one out-of-sync occurrence.
> Still, there shouldn't be any!
> (Memtest didn't detect any RAM defect)


What I do not understand is how the checksums for the same sectors can differ this much without any write operations on the device (DRBD is Secondary on both nodes).
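
To make the repeated-read observation easier to reproduce, here is a minimal Python sketch of the same check that the dd | openssl md5 commands quoted below perform. It is only an illustration under assumptions: the function names md5_of_sectors and distinct_digests are hypothetical, it needs Python 3.7+ on Linux (for os.preadv and os.O_DIRECT), and it has not been run against a DRBD backing device.

```python
import hashlib
import mmap
import os

SECTOR_SIZE = 512  # same unit as bs=512 in the dd commands


def md5_of_sectors(path, start_sector, sector_count, direct=False):
    """Return the MD5 hex digest of a sector range read from path."""
    length = sector_count * SECTOR_SIZE
    offset = start_sector * SECTOR_SIZE
    flags = os.O_RDONLY
    if direct:
        # O_DIRECT bypasses the page cache, like dd's iflag=direct (Linux only).
        flags |= os.O_DIRECT
    fd = os.open(path, flags)
    try:
        # An anonymous mmap is page-aligned, which O_DIRECT reads require.
        buf = mmap.mmap(-1, length)
        try:
            got = os.preadv(fd, [buf], offset)
            if got != length:
                raise OSError(f"short read: {got} of {length} bytes")
            return hashlib.md5(buf[:length]).hexdigest()
        finally:
            buf.close()
    finally:
        os.close(fd)


def distinct_digests(path, start_sector, sector_count, repeats=5, direct=False):
    """Read the same range several times and return the set of digests seen."""
    return {
        md5_of_sectors(path, start_sector, sector_count, direct)
        for _ in range(repeats)
    }
```

Run as root on one node, something like distinct_digests("/dev/sda9", 192896, 24, repeats=10, direct=True) should return a set with exactly one element if the range is stable; more than one element would reproduce the behaviour described here.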

I think that if I could solve this, many of these errors would disappear.

I have now disabled device verification again, but of course I would still like to know the reason.
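
For anyone who wants to check more than one range by hand, the kernel log lines quoted below ("drbd3: Out of sync: start=192896, size=24 (sectors)") can be turned into dd commands mechanically. The following is a rough sketch only: the helper names parse_out_of_sync and dd_command are hypothetical, the regular expression assumes exactly the log format shown in this thread, and the backing device has to be supplied by hand because the log line names only the DRBD device.

```python
import re

# Matches lines such as: drbd3: Out of sync: start=192896, size=24 (sectors)
OUT_OF_SYNC = re.compile(
    r"(?P<dev>drbd\d+): Out of sync: "
    r"start=(?P<start>\d+), size=(?P<size>\d+) \(sectors\)"
)


def parse_out_of_sync(log_text):
    """Return a list of (drbd_device, start_sector, sector_count) tuples."""
    return [
        (m.group("dev"), int(m.group("start")), int(m.group("size")))
        for m in OUT_OF_SYNC.finditer(log_text)
    ]


def dd_command(backing_device, start, size):
    """Build the dd | openssl md5 command line for one reported range."""
    return (
        f"dd if={backing_device} skip={start} bs=512 count={size} "
        "iflag=direct | openssl md5"
    )
```

Feeding the output of dmesg into parse_out_of_sync and printing dd_command("/dev/sda9", start, size) for each tuple would give one command per reported range, to be run on both nodes and compared.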

Regards
Florian


> Eric
> 
> Schmidt, Florian wrote:
> > Hello everyone again,
> >
> > I just added verify-alg crc32c; to my drbd.conf and ran drbdadm verify all
> >
> > After that I saw lots of out-of-sync sectors in dmesg.
> > I thought: OK, let's sync and then everything should be alright.
> >
> > But after the sync (drbdadm disconnect and connect all on primary one),
> > a re-run of drbdadm verify all still found out-of-sync sectors.
> >
> > So I googled around and saw that other people have had this problem
> > before.
> >
> > I found the following command and ran it on a sector range reported as
> > out of sync:
> >
> > drbd3: Out of sync: start=192896, size=24 (sectors)
> > <lots of lines of them>
> >
> > [root at saprouter1 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 6.06248 seconds, 2.0 kB/s
> > 6d79f038d772cac9ce34af477574ef7d
> >
> > [root at saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 1.93692 seconds, 6.3 kB/s
> > 30704f13388d5e23d644e44996cc2b62
> >
> > Uh, just found something very strange after executing the command again:
> >
> > [root at saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 1.93692 seconds, 6.3 kB/s
> > 30704f13388d5e23d644e44996cc2b62
> > [root at saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 0.99269 seconds, 12.4 kB/s
> > a2915f04e3533f1f329ac9eab8a875f9
> > [root at saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 0.856155 seconds, 14.4 kB/s
> > b94b4be0da055b290d4ae4d421cf17f0
> > [root at saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 0.597763 seconds, 20.6 kB/s
> > b94b4be0da055b290d4ae4d421cf17f0
> > [root at saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 0.621022 seconds, 19.8 kB/s
> > a2915f04e3533f1f329ac9eab8a875f9
> > [root at saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 0.041115 seconds, 299 kB/s
> > 6d79f038d772cac9ce34af477574ef7d
> >
> > How can the checksums sometimes differ between reads? O_o
> > Problems with the RAID driver?
> >
> > Current state is:
> >
> > [root at saprouter1 ~]# /etc/init.d/drbd status
> > drbd driver loaded OK; device status:
> > version: 8.2.6 (api:88/proto:86-88)
> > GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn at c5-i386-build, 2008-06-02 10:17:29
> > m:res             cs         st                   ds                 p  mounted  fstype
> > 0:drbd_afd        Connected  Secondary/Secondary  UpToDate/UpToDate  C
> > 1:drbd_ftpdata    Connected  Secondary/Secondary  UpToDate/UpToDate  C
> > 2:drbd_saprouter  Connected  Secondary/Secondary  UpToDate/UpToDate  C
> > 3:drbd_configs    Connected  Secondary/Secondary  UpToDate/UpToDate  C
> >
> > Greetings and sorry for bothering again^^
> > Florian
> > _______________________________________________
> > drbd-user mailing list
> > drbd-user at lists.linbit.com
> > http://lists.linbit.com/mailman/listinfo/drbd-user
> >


