[DRBD-user] drbd 0.6.12 and 0.6.7: Epoch set size wrong!! tl messed up! transferlog too small!!

Lars Ellenberg lars.ellenberg at linbit.com
Sat Nov 24 19:23:52 CET 2007


On Fri, Nov 23, 2007 at 12:51:48AM +1100, Nick Urbanik wrote:
> Dear Folks,
> 
> I am using DRBD 0.6.12 and have created a second drbd device on two
> 200MB raw disks.
> I am syncing data, and am copying data, and all was
> going nicely until these messages began to appear in
> /var/log/messages:
> Nov 23 00:23:51 machine1 kernel: drbd1: tl messed up!
> Nov 23 00:23:51 machine1 kernel: drbd1: Epoch set size wrong!!found=192 reported=191
> Nov 23 00:23:51 machine1 kernel: drbd1: transferlog too small!!
> What does this mean?

it means that the transfer log in 0.6 is too small, and got messed up.
you don't want more details.

to put it another way, it means you should upgrade your drbd.

> Will I lose data?

not immediately.
this does not mean that it would be harmless.
only that as long as nothing else fails,
this alone will not eat your data.
however, in combination with real failures, it very well might.

> Is something configured wrongly?
probably.
you could try to increase tl-size.
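
as a rough sketch only, assuming the 0.6-style drbd.conf syntax quoted
further down and that tl-size lives in the net section (check the
drbd.conf man page shipped with your version). the resource name drbd1
and the value 5000 are made up for illustration, not recommendations:

```
resource drbd1 {
    net {
        # assumed placement and illustrative value:
        # number of transfer log entries
        tl-size = 5000
    }
}
```

the same change has to go into the config on both nodes, and the device
needs to be reconfigured before it takes effect.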

> Note: heartbeat is monitoring the first device, but not the second, on
> which these errors are being reported.

heartbeat is monitoring?  are you sure?
you mean you use heartbeat 2, with drbd 0.6,
on a red hat 7.3?
why would you do that?

and you only mirror 200 MB?
what is this, some sort of embedded thing?

> I am migrating from the first (full) device to the second (much
> bigger) device.
> 
> The OS is Red Hat 7.3.
> 
> On machine1:
> version: 0.6.7 (api:63/proto:62)
> On machine2:
> $ cat /proc/drbd
> version: 0.6.12 (api:64/proto:62)

> On both machine1 and machine2:
> ==============================
> 
> $ cat /etc/drbd.conf
> global {
>      minor_count=2
> }
> 
> resource drbd0 {
>     protocol = B

note that the implementation of protocol A and B
has been wrong, always, from the beginning,
and only got fixed in drbd 0.7.22 and later,
as well as 8.x. using protocol != C with older versions
may well lead to data loss if you had some failovers.
also, with older versions, protocol C performs better
(even though that is counterintuitive, and was documented
otherwise back then, iirc)

I'd recommend upgrading to drbd 8,
and probably upgrade the rest of the system as well.

unless you are locked in for some obscure reason.
though, you should not be, this is the OSS world...

-- 
: commercial DRBD/HA support and consulting: sales at linbit.com :
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


