Hi all.

A while ago we had a problem with a hard disk with bad blocks. We got filesystem errors on both primary and secondary. I thought it was the bad disk's fault; we replaced it and got no errors in the Dell tests, so it looked like the problem was solved. It wasn't: we kept getting errors all the time. We ran fsck on all partitions, upgraded drbd to 0.7.10, and upgraded the kernel to 2.6.10-gentoo-r6. I destroyed the partitions (the meta data one and the one we wanted mirrored), but we still got errors.

If I leave the primary (let's call it d2) unconnected to the secondary (d1), it runs fine.

I started to do some tests on d1, with around 60 bonnie instances running at the same time. In all tests drbd was in WFConnection state. The machine keeps erroring out about the filesystem.

I tried using the device without drbd, and it ran fine for 24 hours. With drbd and debug off, it errors out within 1-5 hours, no exceptions. I compiled drbd with debug on and, surprise, it ran fine for over 48 hours. I repeated the tests a few times, 24 hours minimum each: the machine runs fine only with debug_all_symbols defined in drbd_config.h.

The errors we get are:

  EXT3-fs error (device XXXX) in start transaction: Journal has aborted
  EXT3-fs error (device XXXX) in ext3_ordered_writepage: IO failure
  EXT3-fs error (device XXXX) in ext3_find_entry: reading directory #asd offset n
  EXT3-fs error (device XXXX) in ext3_get_inode_loc: unable to read inode block inode - xxx block -xxx
  EXT3-fs error (device XXXX) ext3_journal_start_db: Detected aborted journal - Remounting read only

I put XXXX in place of the device name because we get these errors for all the mounted devices.

Does anyone know why drbd behaves this way when debug is off?

Thank you.
Florin Cazacu.