[DRBD-user] ext3 filesystem errors using drbd without debug_all_symbols on

Fri Mar 4 00:57:06 CET 2005

Hi all.

Some time ago we had a problem with a hard disk with bad blocks. We got
filesystem errors on both primary and secondary. I thought it was the bad
disk's fault, so we replaced it, and the Dell tests reported no errors, so
it looked like the problem was solved. It wasn't: we started to get
errors every time we ran fsck, on all partitions. We upgraded drbd to
0.7.10, upgraded the kernel to 2.6.10-gentoo-r6, and I destroyed the
partitions (the meta data one and the one we wanted mirrored), but we
still got errors.

If I leave the primary (let's call it d2) disconnected from the secondary
(d1), it runs fine.

I started doing some tests on d1, running around 60 Bonnie instances at
the same time. In all tests drbd was in the WFConnection state. The
machine keeps reporting filesystem errors. When I used the device
without drbd, it ran fine for 24 hours. With drbd built without debug, it
fails within 1-5 hours, without exception. I compiled drbd with debug on
and, surprise, it ran fine for over 48 hours. I repeated the tests a few
times, for at least 24 hours each: the machine runs fine only with
debug_all_symbols defined in drbd_config.h.

The errors we get are:

EXT3-fs error (device XXXX) in start transaction: Journal has aborted
EXT3-fs error (device XXXX) in ext3_ordered_writepage: IO failure
EXT3-fs error (device XXXX) in ext3_find_entry: reading directory #asd offset n
EXT3-fs error (device XXXX) in ext3_get_inode_loc: unable to read inode block inode - xxx block -xxx
EXT3-fs error (device XXXX) ext3_journal_start_db: Detected aborted journal - Remounting read only

I put XXXX in place of the device name because we get these errors for
all the mounted devices.
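
For anyone who wants to compare runs, here is a minimal sketch of how
the messages could be tallied per device and per reporting function. It
is only an illustration: the pattern is inferred from the lines quoted
above, and the real kernel log format may differ. The name
tally_ext3_errors is hypothetical and not part of drbd or any other
tool.

```python
import re
import sys
from collections import Counter

# Pattern inferred from the messages quoted in this post (an assumption);
# actual kernel log lines may be formatted differently.
EXT3_ERROR = re.compile(
    r"EXT3-fs error \(device (?P<dev>[^)]+)\):? (?:in )?(?P<func>[^:]+): (?P<msg>.*)"
)


def tally_ext3_errors(lines):
    """Count EXT3-fs error lines per (device, reporting function)."""
    counts = Counter()
    for line in lines:
        match = EXT3_ERROR.search(line)
        if match:
            key = (match.group("dev"), match.group("func").strip())
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Read log text from stdin and print one tab-separated row per pair.
    for (dev, func), n in sorted(tally_ext3_errors(sys.stdin).items()):
        print(f"{dev}\t{func}\t{n}")
```

It reads log text on standard input, so the saved kernel messages from
a run with debug off can be fed to it to see whether the errors cluster
on one device or one function.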

Does anyone know why this happens when drbd is built without debug?

Thank You.
Florin Cazacu.