> -----Original Message-----
> From: drbd-user-bounces at lists.linbit.com
> [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Eric Marin
> Sent: Wednesday, 6 August 2008 15:23
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] endless device verification and strange checksums
>
> Hi,
>
> Just out of curiosity (not to answer your question, sorry!), which
> kernel do you use?

I am using RHEL5 with 2.6.18-53.el5.

> When going _back_ from 2.6.25-2-686-bigmem to 2.6.18-6-686-bigmem
> (Debian Etch), I experienced far fewer errors: one out-of-sync in
> nearly a week instead of one every two minutes.
>
> Then, even with the flood of errors detected (which I quickly stopped
> trying to correct) when I used 2.6.25-2-686-bigmem, I didn't notice
> any data corruption (that I can tell).
>
> I've now deactivated checksum offloading in the network cards and
> activated verify-alg crc32c, and so far I've only noticed one
> out-of-sync occurrence. Still, there shouldn't be any!
> (Memtest didn't detect any RAM defect.)

What I do not understand is how the checksums for the same sectors can
differ like this without any write operations on the device (DRBD is
Secondary on both nodes). I think if I could solve this, lots of these
errors would disappear.

For now I have disabled device verification again, but of course I would
like to know the reason...

Regards,
Florian

> Eric
>
> Schmidt, Florian wrote:
> > Hello everyone again,
> >
> > I just added verify-alg crc32c; to my drbd.conf and ran
> > drbdadm verify all
> >
> > After that I saw lots of out-of-sync sectors in dmesg.
> > I thought: OK, let's sync and then everything should be all right.
> >
> > But after the sync (drbdadm disconnect and connect all on the
> > primary), a re-run of drbdadm verify all still found out-of-sync
> > sectors.
> >
> > So I googled around and saw that there were also people having this
> > problem before.
> >
> > I found the following command and executed it on a sector range
> > reported as out of sync:
> >
> > drbd3: Out of sync: start=192896, size=24 (sectors)
> > <lots of lines of them>
> >
> > [root@saprouter1 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 6.06248 seconds, 2.0 kB/s
> > 6d79f038d772cac9ce34af477574ef7d
> >
> > [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 1.93692 seconds, 6.3 kB/s
> > 30704f13388d5e23d644e44996cc2b62
> >
> > Uh, I just found something very strange after executing the command
> > again:
> >
> > [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 1.93692 seconds, 6.3 kB/s
> > 30704f13388d5e23d644e44996cc2b62
> > [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > a2915f04e3533f1f329ac9eab8a875f9
> > 12288 bytes (12 kB) copied, 0.99269 seconds, 12.4 kB/s
> > [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied, 0.856155 seconds, 14.4 kB/s
> > b94b4be0da055b290d4ae4d421cf17f0
> > [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copiedb94b4be0da055b290d4ae4d421cf17f0
> > , 0.597763 seconds, 20.6 kB/s
> > [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copieda2915f04e3533f1f329ac9eab8a875f9
> > , 0.621022 seconds, 19.8 kB/s
> > [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> > 24+0 records in
> > 24+0 records out
> > 12288 bytes (12 kB) copied6d79f038d772cac9ce34af477574ef7d
> > , 0.041115 seconds, 299 kB/s
> >
> > How can the checksums sometimes be different? O_o
> > Problems with the RAID driver?
> >
> > Current state is:
> >
> > [root@saprouter1 ~]# /etc/init.d/drbd status
> > drbd driver loaded OK; device status:
> > version: 8.2.6 (api:88/proto:86-88)
> > GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
> > buildsvn@c5-i386-build, 2008-06-02 10:17:29
> > m:res             cs         st                   ds                 p  mounted  fstype
> > 0:drbd_afd        Connected  Secondary/Secondary  UpToDate/UpToDate  C
> > 1:drbd_ftpdata    Connected  Secondary/Secondary  UpToDate/UpToDate  C
> > 2:drbd_saprouter  Connected  Secondary/Secondary  UpToDate/UpToDate  C
> > 3:drbd_configs    Connected  Secondary/Secondary  UpToDate/UpToDate  C
> >
> > Greetings, and sorry for bothering you again ^^
> > Florian
> > _______________________________________________
> > drbd-user mailing list
> > drbd-user at lists.linbit.com
> > http://lists.linbit.com/mailman/listinfo/drbd-user
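
The repeated `dd | openssl md5` check from the message above can be wrapped in a small loop so that every round is hashed the same way and the digests are tallied. This is only a sketch under assumptions: `read_stability` is a hypothetical helper name (not a DRBD tool), `md5sum` from coreutils is swapped in for `openssl md5` because its output format is uniform, and dd's statistics on stderr are discarded so they cannot interleave with the digest as they do in the pasted transcript.

```shell
# Hypothetical helper (not part of DRBD): hash one sector range several
# times and print each distinct digest with the number of times it was seen.
# Usage: read_stability DEVICE START_SECTOR SECTOR_COUNT [ROUNDS] [DD_FLAG]
read_stability() (
    dev=$1 start=$2 count=$3 rounds=${4:-5} flag=${5:-}
    i=0
    while [ "$i" -lt "$rounds" ]; do
        # Discard dd's statistics (stderr) so they cannot interleave
        # with the digest on the terminal.
        dd if="$dev" skip="$start" bs=512 count="$count" $flag 2>/dev/null |
            md5sum | cut -d' ' -f1
        i=$((i + 1))
    done | sort | uniq -c
)

# Example call for the range reported in the log above (run on each node):
# read_stability /dev/sda9 192896 24 5 iflag=direct
```

If the range is stable, the output is a single line whose count equals the number of rounds. More than one line means that successive reads of the same sectors returned different data on that node. The loop only makes the symptom easier to reproduce; it says nothing about whether the cause is the kernel, the RAID driver, or something else.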