[DRBD-user] filesystem corruptions

Thu Sep 29 18:18:11 CEST 2005

bro wrote:
> 
> Lars Ellenberg wrote:
> > I am not saying that this is definitely what you see,
> > but I do strongly suspect it.
> >
> > upgrade the kernel to something >= 2.6.12.4.
> 
> Oh, I'm sorry, this must have slipped my mind.
> After we came across that problem, we set up a different set
> of boxes for testing only, configured as follows:
> 
> DRBD version 0.7.11 (api:77/proto:74)
> Linux test1 2.6.12.5 #1 SMP
> 
> and still we had errors while running tests, this is one of them:
> 
> EXT3-fs error (device drbd0): ext3_add_entry: bad entry in directory
> #16548547: rec_len is smaller than minimal - offset=0, inode=0,
> rec_len=0, name_len=0
> Aborting journal on device drbd0.
> EXT3-fs error (device drbd0) in ext3_reserve_inode_write: Journal has
> aborted
> ext3_abort called.
> EXT3-fs error (device drbd0): ext3_journal_start_sb: Detected aborted
> journal
> Remounting filesystem read-only
> EXT3-fs error (device drbd0) in start_transaction: Journal has aborted
> __journal_remove_journal_head: freeing b_committed_data
> 
> So I guess that's why we didn't use it on our servers, although we still
> got errors.

I am assuming you are still getting those errors on your test system after
upgrading from a 2.6.11 kernel.

Question: are you getting those errors on your test system _after_ an fsck
was run following the kernel upgrade?

The only time I have seen anything similar (with a 2.4 kernel and drbd
0.6.13) was when hardware faults caused disk hangs on one machine at the
same time as someone's attempt to "FIX" the systems caused a split brain.
fsck fixed it.

-- 
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane) 
Harnessing the Power of Technology for the Warfighter