[DRBD-user] Machine crashed repeatedly: drbd16: Epoch set size wrong!!found=1061 reported=1060
Todd.Denniston at ssa.crane.navy.mil
Mon Nov 1 17:14:16 CET 2004
Andreas Hartmann wrote:
> Hello Lars,
> Lars Ellenberg schrieb:
> > sure, it still could be drbd's fault. but there is no real sign of this.
> > and the symptom of "gcc compile fails always the same;
> > after reboot, it works fine" for me clearly suggests bad ram.
> This would mean:
> I've got two machines, both would have bad RAM, which comes up with drbd
> only. All other applications are running fine. Hmmm. Ok, not impossible
> but very hard to believe.
> Could the potential errors be avoided by changing the RAM timings in the
> BIOS to less critical values (if this is possible at all)?
> I recall that there are memory modules (with an additional chip) which can
> detect flipped bits. I don't know if the machines' memory has this feature.
> I will have a look!
I am not sure whether the XSeries are x86 machines or not.
If they are, you might try memtest86; if they are not, the old 'memory
test script' can be useful. If you have 'upgraded' the RAM, you might want
to take a quick look at some linuxforum comments. The memtest script
does have one other advantage over memtest86: you can run it while the
system is live, though that is not necessarily a good idea.
We recently found two systems which got a batch of new RAM and then would
not finish an install of Fedora before crashing. memtest86 showed that a
portion of the RAM was failing. We slowed down the RAM bus, after which it
passed memtest86 and worked, which was acceptable for the system's use.
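To illustrate what a test that runs on a live system does in principle, here is a minimal Python sketch. It is not the 'memory test script' mentioned above; the function names, the buffer size and the bit patterns are all assumptions chosen for illustration. It can only touch whatever pages the kernel happens to hand to the process, so a clean run proves very little, whereas memtest86 tests the RAM directly from boot.

```python
# Hypothetical userland memory check (illustrative only, not the script
# referred to in this thread). It fills a buffer with fixed bit patterns
# and reads them back, counting bytes that do not match.

def check_pattern(buf, pattern):
    """Fill buf with one byte pattern and return the mismatching offsets."""
    expected = bytes([pattern]) * len(buf)
    buf[:] = expected                      # write the pattern
    if buf == expected:                    # fast path: everything read back fine
        return []
    return [i for i in range(len(buf)) if buf[i] != pattern]


def userland_memtest(size_mib=16, passes=2):
    """Run all patterns over a size_mib buffer; return the mismatch count."""
    buf = bytearray(size_mib * 1024 * 1024)
    errors = 0
    for _ in range(passes):
        # all zeros, all ones, and two alternating-bit patterns
        for pattern in (0x00, 0xFF, 0xAA, 0x55):
            errors += len(check_pattern(buf, pattern))
    return errors


if __name__ == "__main__":
    print("mismatches:", userland_memtest())
```

Any non-zero count would be a strong hint to boot memtest86 (on x86) and test properly; a count of zero does not clear the RAM.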
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter