[DRBD-user] Re: Could drbd randomly flip bits? Was: Database page corruption on disk occurring during mysqldump on a fresh database and Was: Spontaneous development of supremely large files on different ext3 filesystems

Mon Sep 17 20:59:00 CEST 2007

Hi Maurice,

If you're running into corruption both in ext3 metadata and in MySQL 
data, it is certainly not the fault of MySQL, as you're likely aware.

There are many places where corruption could occur between MySQL and
the physical bits on disk.  The corruption you're seeing does not
appear to be just "flipped bits", although I guess any corruption could
be called that.  Compare the two i_size values from the fsck output
quoted below:

 >> Inode 16257874, i_size is 18014398562775391, should be 53297152

53297152:

0000 0000 0000 0000 0000 0000 0000 0000
0000 0011 0010 1101 0100 0000 0000 0000

18014398562775391:

0000 0000 0100 0000 0000 0000 0000 0000
0000 0011 0010 1101 0011 0001 0101 1111

Differences: 10 x 0->1, 1 x 1->0.

 >> Inode 2121855, i_size is 35184386120704, should be 14032896.

14032896:

0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 1101 0110 0010 0000 0000 0000

35184386120704:

0000 0000 0000 0000 0010 0000 0000 0000
0000 0000 1101 0110 0001 1100 0000 0000

Differences: 4 x 0->1, 1 x 1->0

You can see that there are in fact many bits flipped in each.  I would
suspect corruption at a higher level than the disks themselves (which
typically produce single-bit or double-bit flips, and generally 1->0
only) but at a lower level than the OS (which typically corrupts entire
pages of 4k-64k).
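
If you want to check the bit counts above yourself, here is a minimal
sketch in Python.  The helper names flip_counts and as_bits are just
mine for illustration; the only inputs are the expected and observed
i_size values from the fsck output.

```python
def flip_counts(expected: int, observed: int) -> tuple[int, int]:
    """Return (zero_to_one, one_to_zero) bit flips going from expected to observed."""
    changed = expected ^ observed  # bits that differ between the two values
    zero_to_one = bin(changed & observed).count("1")  # differing bits set in observed
    one_to_zero = bin(changed & expected).count("1")  # differing bits set in expected
    return zero_to_one, one_to_zero


def as_bits(value: int, width: int = 64) -> str:
    """Format value as zero-padded binary in groups of four bits."""
    bits = format(value, f"0{width}b")
    return " ".join(bits[i:i + 4] for i in range(0, width, 4))


# (expected i_size, observed i_size) pairs taken from the fsck output
cases = [
    (53297152, 18014398562775391),
    (14032896, 35184386120704),
]

for expected, observed in cases:
    up, down = flip_counts(expected, observed)
    print(as_bits(expected))
    print(as_bits(observed))
    print(f"{up} x 0->1, {down} x 1->0")
```

XORing the two values isolates the bits that changed; masking that
result with the observed value gives the 0->1 flips, and masking it
with the expected value gives the 1->0 flips.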

That leaves network, SATA controller, various system buses, and possibly 
stupid errors in DRBD (although I'd call this unlikely).

Do note that data on e.g. the PCI bus is not protected by any sort of 
checksum.  I've seen this cause corruption problems with PCI risers and 
RAID cards.  Are you using a PCI riser card?  Note that LSI does *not* 
certify their cards to be used on risers if you are custom building a 
machine.

Regards,

Jeremy

Maurice Volaski wrote:
> In using drbd 8.0.5 recently, I have come across at least two 
> instances where a bit on disk apparently flipped spontaneously in the 
> ext3 metadata on volumes running on top of drbd.
> 
> Also, I have been seeing regular corruption of a mysql database, 
> which runs on top of drbd, and when I reported this as a bug since I 
> also recently upgraded mysql versions, they question whether drbd 
> could be responsible!
> 
> All the volumes have been fscked recently and there were no reported 
> errors. And, of course, there have been no errors reported from the 
> underlying hardware.
> 
> I have since upgraded to 8.0.6, but it's too early to say whether 
> there is a change.
> 
> I'm also seeing the backup server complain of files not comparing, 
> though this may be a separate problem on the backup server.
> 
> 
> 
> The ext-3  bit flipping:
> At 12:00 PM -0400 9/11/07, ext3-users-request at redhat.com wrote:
>> I have come across two files, essentially untouched in years, on two
>> different ext3 filesystems on the same server, Gentoo AMD 64-bit with
>> kernel 2.6.22 and fsck version 1.40.2 currently, spontaneously
>> becoming supremely large:
>>
>> Filesystem one
>> Inode 16257874, i_size is 18014398562775391, should be 53297152
>>
>> Filesystem two
>> Inode 2121855, i_size is 35184386120704, should be 14032896.
>>
>> Both were discovered during an ordinary backup operation (via EMC
>> Insignia's Retrospect Linux client).
>>
>> The backup runs daily and so one day, one file must have grown
>> spontaneously to this size and then on another day, it happened to
>> the second file, which is on a second filesystem. The backup attempt
>> generated repeated errors:
>>
>> EXT3-fs warning (device dm-2): ext3_block_to_path: block > big
>>
>> Both filesystems are running on different logical volumes, but
>> underlying that are drbd network raid devices and underlying that
>> is a RAID 6-based SATA disk array.
> 
> 
> 
> The answer to the bug report regarding mysql data corruption, which 
> is blaming drbd!
>> http://bugs.mysql.com/?id=31038
>>
>>  Updated by:  Heikki Tuuri
>>  Reported by: Maurice Volaski
>>  Category:    Server: InnoDB
>>  Severity:    S2 (Serious)
>>  Status:      Open
>>  Version:     5.0.48
>>  OS:          Linux
>>  OS Details:  Gentoo
>>  Tags:        database page corruption locking up corrupt doublewrite
>>
>> [17 Sep 18:49] Heikki Tuuri
>>
>> Maurice, my first guess is to suspect the RAID-1 driver.
> 
> 
> My initial report of mysql data corruption:
>>> A 64-bit Gentoo Linux box had just been upgraded from MySQL 4.1 
>>> to 5.0.44 fresh (by dumping in 4.1 and restoring in 5.0.44) and 
>>> almost immediately after that, during which time the database was 
>>> not used, a crash occurred during a scripted mysqldump. So I 
>>> restored and days later, it happened again. The crash details seem 
>>> to be trying to suggest some other aspect of the operating system, 
>>> even the memory or disk is flipping a bit. Or could I be running 
>>> into a bug in this version of MySQL?
>>>
>>> Here's the output of the crash
>>> -----------------------------------
>>> InnoDB: Database page corruption on disk or a failed
>>> InnoDB: file read of page 533.
>>> InnoDB: You may have to recover from a backup.
>>> 070827  3:10:04  InnoDB: Page dump in ascii and hex (16384 bytes):
>>>  len 16384; hex
>>>
>>> [dump itself deleted for brevity]
>>>
>>>  ;InnoDB: End of page dump
>>> 070827  3:10:04  InnoDB: Page checksum 646563254, prior-to-4.0.14-form checksum 2415947328
>>> InnoDB: stored checksum 4187530870, prior-to-4.0.14-form stored checksum 2415947328
>>> InnoDB: Page lsn 0 4409041, low 4 bytes of lsn at page end 4409041
>>> InnoDB: Page number (if stored to page already) 533,
>>> InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
>>> InnoDB: Page may be an index page where index id is 0 35
>>> InnoDB: (index PRIMARY of table elegance/image)
>>> InnoDB: Database page corruption on disk or a failed
>>> InnoDB: file read of page 533.
>>> InnoDB: You may have to recover from a backup.
>>> InnoDB: It is also possible that your operating
>>> InnoDB: system has corrupted its own file cache
>>> InnoDB: and rebooting your computer removes the
>>> InnoDB: error.
>>> InnoDB: If the corrupt page is an index page
>>> InnoDB: you can also try to fix the corruption
>>> InnoDB: by dumping, dropping, and reimporting
>>> InnoDB: the corrupt table. You can use CHECK
>>> InnoDB: TABLE to scan your table for corruption.
>>> InnoDB: See also InnoDB: http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
>>> InnoDB: about forcing recovery.
>>> InnoDB: Ending processing because of a corrupt database page.
> 

-- 
high performance mysql consulting
www.provenscaling.com