[DRBD-user] Machine crashed repeatedly: drbd16: Epoch set size wrong!!found=1061 reported=1060

Andreas Hartmann andihartmann at freenet.de
Tue Nov 2 17:22:48 CET 2004



Hello Lars,

[...]

Yesterday evening, one of the machines crashed again during data transfer
(I don't know why yet, because the machine is at a different location
from mine. Maybe I can find out something on Tuesday). I rebooted the
secondary to prevent it from crashing, too.

It was syslogd that crashed. Unfortunately, the log was not written to the
messages file, only to the screen.

I could still see the previous drbd errors (from the days before):

Oct 27 16:03:03 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=1 reported=0
Oct 27 16:03:10 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=364 reported=363
Oct 27 16:04:14 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=372 reported=371
Oct 27 16:04:41 FAGINTSC kernel: drbd11: tl messed up!
Oct 27 16:04:41 FAGINTSC kernel: drbd11: invalid barrier number!!found=4522056, reported=42224
Oct 27 16:04:41 FAGINTSC kernel: drbd11: Epoch set size wrong!!found=197 reported=63
Oct 27 16:04:44 FAGINTSC kernel: drbd11: [bdflush/6] sock_sendmsg timeout count down: ko=4294967295
Oct 27 16:04:50 FAGINTSC kernel: drbd11: [bdflush/6] sock_sendmsg timeout count down: ko=4294967294
Oct 27 16:04:56 FAGINTSC kernel: drbd11: [bdflush/6] sock_sendmsg timeout count down: ko=4294967293
Oct 27 16:05:02 FAGINTSC kernel: drbd11: [bdflush/6] sock_sendmsg timeout count down: ko=4294967292
[...]
Oct 27 16:20:02 FAGINTSC kernel: drbd11: [bdflush/6] sock_sendmsg timeout count down: ko=4294967142
Oct 27 16:20:08 FAGINTSC kernel: drbd11: [bdflush/6] sock_sendmsg timeout count down: ko=4294967141
Oct 27 16:20:14 FAGINTSC kernel: drbd11: [bdflush/6] sock_sendmsg timeout count down: ko=4294967140




Oct 29 19:52:33 fagintsc kernel: drbd16: Epoch set size wrong!!found=1138 reported=1137
Oct 29 19:53:32 fagintsc kernel: drbd16: tl messed up!
Oct 29 19:53:32 fagintsc kernel: drbd16: invalid barrier number!!found=0, reported=2739
Oct 29 19:53:32 fagintsc kernel: drbd16: Epoch set size wrong!!found=1331 reported=241
Oct 29 19:53:32 fagintsc kernel: drbd16: invalid barrier number!!found=0, reported=2740
Oct 29 19:53:32 fagintsc kernel: drbd16: Epoch set size wrong!!found=5 reported=260
Oct 29 19:53:32 fagintsc kernel: drbd16: invalid barrier number!!found=0, reported=2741
Oct 29 19:53:32 fagintsc kernel: drbd16: Epoch set size wrong!!found=5 reported=238
Oct 29 19:53:32 fagintsc kernel: drbd16: invalid barrier number!!found=0, reported=2742
Oct 29 19:53:32 fagintsc kernel: drbd16: Epoch set size wrong!!found=5 reported=239


After each of these sequences, the machine was dead.
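
In case it helps anyone going through similar logs: below is a minimal
sketch (Python, not part of drbd) that joins the wrapped syslog lines,
extracts the "Epoch set size wrong" entries and computes found minus
reported. The function names (unwrap, epoch_mismatches, as_signed32) are
made up for illustration. The last helper reinterprets the ko values as
signed 32-bit numbers; 4294967295 is 2^32 - 1, so it reads as -1 that way.
Whether drbd really holds the counter in a 32-bit variable is only an
assumption on my part.

```python
import re

# Start of a syslog entry, e.g. "Oct 27 16:03:03 ".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")
# The message format is taken from the log excerpts above.
EPOCH = re.compile(r"(drbd\d+): Epoch set size wrong!!found=(\d+) reported=(\d+)")


def unwrap(text):
    """Join continuation lines that were wrapped when the log was pasted."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)        # a new log entry starts here
        else:
            entries[-1] += " " + line   # continuation of the previous entry
    return entries


def epoch_mismatches(text):
    """Return (device, found, reported, found - reported) per mismatch."""
    result = []
    for entry in unwrap(text):
        m = EPOCH.search(entry)
        if m:
            found, reported = int(m.group(2)), int(m.group(3))
            result.append((m.group(1), found, reported, found - reported))
    return result


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as signed (assumption: 32 bits)."""
    return value - 2**32 if value >= 2**31 else value
```

Feeding the Oct 27 excerpt to epoch_mismatches() gives a difference of 1
for the first three entries and 134 for the one after "tl messed up!".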



Kind regards,
Andreas Hartmann



