On Wed, Aug 06, 2008 at 02:22:07PM +0200, Schmidt, Florian wrote:
> Hello everyone again,
>
> I just added verify-alg crc32c; to my drbd.conf and ran drbdadm verify
> all
>
> After that I saw lots of out-of-sync sectors in dmesg.
> I thought: OK, let's sync and then everything should be alright.
>
> But after the sync (drbdadm disconnect and connect all on primary one),
> a re-run of drbdadm verify all still found out-of-sync sectors.
>
> So I googled around and saw that there were also people having this
> problem before.
>
> I found the following command and executed this on a sector, reported as
> out of sync:
>
> drbd3: Out of sync: start=192896, size=24 (sectors)
> <lots of lines of them>
>
> [root@saprouter1 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> 24+0 records in
> 24+0 records out
> 12288 bytes (12 kB) copied, 6.06248 seconds, 2.0 kB/s
> 6d79f038d772cac9ce34af477574ef7d
>
> [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> 24+0 records in
> 24+0 records out
> 12288 bytes (12 kB) copied, 1.93692 seconds, 6.3 kB/s
> 30704f13388d5e23d644e44996cc2b62
>
> Uh, just found something very strange after executing the command again:
>
> [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> 30704f13388d5e23d644e44996cc2b62
> [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> a2915f04e3533f1f329ac9eab8a875f9
> [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> b94b4be0da055b290d4ae4d421cf17f0
> [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> b94b4be0da055b290d4ae4d421cf17f0
> [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> a2915f04e3533f1f329ac9eab8a875f9
> [root@saprouter2 ~]# dd if=/dev/sda9 skip=192896 bs=512 count=24 iflag=direct | openssl md5
> 6d79f038d772cac9ce34af477574ef7d
>
> How could there be sometimes different checksums? O_o
> Problems with the RAID-driver?

So what you are saying is: you don't have any activity on the device,
yet if you look at the same location, the data changes.

if you can reproduce this behaviour, please "tee" the raw data
somewhere, so you can diff the hexdumps, and see if it is
 * bit flips
 * some words changed
 * totally unrelated data
 * anything else that may hint at something.

but yes, if nothing is writing to the device, and you read it several
times, and you get different data back, and it is not supposed to
spontaneously change its content (i.e. it is not a random generator),
then that device, or the communication path to it, or its driver, is
broken.

is that a RAID5? RAID10?
did you rebuild/resilver/consistency check it lately?
are there any diagnostic tools to determine its own view of its health
status and the state of the world?

--
: Lars Ellenberg                            http://www.linbit.com :
: DRBD/HA support and consulting             sales at linbit.com :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
: Vivenotgasse 48, A-1120 Vienna/Europe    Fax +43-1-8178292-82  :
__
please don't Cc me, but send to list -- I'm subscribed
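
The suggestion to "tee" the raw data and diff the hexdumps could be sketched roughly as below. This is only an illustration, not something from the thread: the function names (capture_reads, count_diff_bytes, diff_hexdumps), the DD_FLAGS variable and the output directory layout are all made up for the example, md5sum stands in for "openssl md5", and iflag=direct assumes GNU dd.

```shell
#!/bin/sh
# Sketch only: all function names, DD_FLAGS and the file layout are hypothetical.

# capture_reads SRC SKIP COUNT N OUTDIR
# Read COUNT 512-byte sectors starting at sector SKIP from SRC, N times.
# Each read is tee'd to OUTDIR/read-<i>.raw and its MD5 sum is printed.
capture_reads() {
    src=$1 skip=$2 count=$3 n=$4 outdir=$5
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$n" ]; do
        # DD_FLAGS defaults to iflag=direct (GNU dd) to bypass the page cache;
        # set DD_FLAGS="" when trying this out on a regular file.
        dd if="$src" skip="$skip" bs=512 count="$count" ${DD_FLAGS-iflag=direct} 2>/dev/null \
            | tee "$outdir/read-$i.raw" | md5sum
        i=$((i + 1))
    done
}

# count_diff_bytes A B
# Print how many byte positions differ ("cmp -l" lists one per line).
# A handful of differing bytes hints at bit flips or changed words,
# nearly all bytes differing hints at totally unrelated data.
count_diff_bytes() {
    cmp -l "$1" "$2" 2>/dev/null | wc -l | tr -d ' '
}

# diff_hexdumps A B
# Write A.hex and B.hex (one hex byte per column) and diff them.
diff_hexdumps() {
    od -A x -t x1 -v "$1" > "$1.hex"
    od -A x -t x1 -v "$2" > "$2.hex"
    diff "$1.hex" "$2.hex"
}

# Example, using the sector range reported in the thread (run as root,
# with nothing writing to the device):
#   capture_reads /dev/sda9 192896 24 6 /tmp/reads
#   count_diff_bytes /tmp/reads/read-1.raw /tmp/reads/read-2.raw
#   diff_hexdumps /tmp/reads/read-1.raw /tmp/reads/read-2.raw
```

Keeping the raw copies around, rather than only the checksums, is what makes it possible to tell the cases in the list above apart.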