[DRBD-user] Machine crashed repeatedly: drbd16: Epoch set size wrong!!found=1061 reported=1060

Lars Ellenberg Lars.Ellenberg at linbit.com
Tue Nov 2 17:52:05 CET 2004


/ 2004-11-02 17:22:48 +0100
\ Andreas Hartmann:
> Hello Lars,
> 
> [...]
> 
> Yesterday evening, one of the machines crashed again during data transfer
> (I don't know why yet, because the machine is at a different location
> from mine. Maybe I can see something on Tuesday). I rebooted the secondary
> to prevent it from crashing, too.
> 
> It was syslogd that crashed. Unfortunately, the log was not written to
> the messages file, only to the screen.
> 
> I could still see the earlier drbd errors (from the days before):
> 
> Oct 27 16:03:03 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=1 reported=0
> Oct 27 16:03:10 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=364 reported=363
> Oct 27 16:04:14 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=372 reported=371
> Oct 27 16:04:41 FAGINTSC kernel: drbd11: tl messed up!
> Oct 27 16:04:41 FAGINTSC kernel: drbd11: invalid barrier number!!found=4522056, reported=42224

well, it looks like something is corrupting your memory. I really doubt it
is drbd (someone else would have noticed, too). probably it is the memory
itself that is faulty.

maybe an intense burst of cosmic rays hit your boxes some weeks ago,
and all memory modules now have some broken bits :-/
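
for what it's worth, here is a minimal sketch for tallying the quoted
"Epoch set size wrong" messages, so the found/reported gap can be compared
across occurrences. the regular expression and the function name
epoch_mismatches are assumptions made for illustration, based only on the
three lines quoted above; this is not a drbd tool.

```python
import re

# Log lines copied from the quoted report, with the mail wrapping undone.
LOG = """\
Oct 27 16:03:03 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=1 reported=0
Oct 27 16:03:10 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=364 reported=363
Oct 27 16:04:14 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=372 reported=371
"""

# Assumed pattern: derived only from the three messages above.
EPOCH_RE = re.compile(
    r"(drbd\d+): Epoch set size wrong!!found=(\d+) reported=(\d+)"
)


def epoch_mismatches(text):
    """Return (device, found, reported, found - reported) per matching line."""
    rows = []
    for line in text.splitlines():
        match = EPOCH_RE.search(line)
        if match:
            found = int(match.group(2))
            reported = int(match.group(3))
            rows.append((match.group(1), found, reported, found - reported))
    return rows


for row in epoch_mismatches(LOG):
    print(row)
```

to run it against a real log, pass the file contents to epoch_mismatches()
instead of LOG. lines that do not match the assumed pattern are skipped.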

	lge


