[Drbd-dev] DRBD-8: recent regression causing corruption and crashes
Lars Ellenberg
Lars.Ellenberg at linbit.com
Fri Aug 11 20:45:59 CEST 2006
/ 2006-08-11 12:01:23 -0400
\ Graham, Simon:
> Quick update:
>
How exactly do you "test"?
Kernel and hardware?
(sorry, if you posted that earlier, just point me to it)
I triggered a full sync (drbdadm invalidate),
and while that was running, accessed the Primary (SyncSource)
(cp -av /somethinghuge/ /mnt/drbd-mount-point/).
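As a rough sketch of those two steps: the resource name r0 and the
RES/SRC/MNT/DRY_RUN variables are placeholders and not part of the
setup described here; only "drbdadm invalidate" and the cp command
line come from the text above. The script defaults to DRY_RUN=1, so
it only prints the commands instead of touching a real device.
"drbdadm invalidate" is normally run on the node whose data is to be
discarded (the SyncTarget), while the copy runs on the Primary.

```shell
#!/bin/sh
# Sketch of the reproduction steps; all names below are placeholders.
RES="${RES:-r0}"                        # hypothetical resource name
SRC="${SRC:-/somethinghuge/}"           # large source tree to copy
MNT="${MNT:-/mnt/drbd-mount-point/}"    # mount point of the DRBD device
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1 (on the node that should become SyncTarget): force a full resync.
sync_step() {
    run drbdadm invalidate "$RES"
}

# Step 2 (on the Primary/SyncSource, while the resync is still running):
# generate sustained write load on the replicated file system.
load_step() {
    run cp -av "$SRC" "$MNT"
}
```

Call sync_step on one node and, once the resync has started, load_step
on the Primary; set DRY_RUN=0 only on a test cluster.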
> > 1. I get errors during initial synchronization of a volume like this
> > that cause the resync to be aborted:
> >
> > drbd15: tl_verify: failed to find req e51a4da0, sector 0 in list
I don't see those here.
> DRBD, Cmd: WriteAck, BlkId: SYNCER Sector: 0, AckLen: 8000
I don't see these either.
> > 2. I get panics with the following signature:- these look like they are
> > happening when a local write
> > on the primary (which this node is) completes.
>
> The panic signature seems to change - for example, I just got one like
> this in the receiver thread:
>
> drbd15: ASSERT( drbd_req_get_sector(i) == sector ) in
> /sandbox/sgraham/sn/trunk/platform/drbd/8.0/drbd/drbd_main.c:313
> drbd15: tl_verify: found req e63d0240 but it has wrong sector (8 versus 0)
nor these.
> drbd15: in tl_clear_barrier:374: ap_pending_cnt = -1 < 0 !
this is bad...
What I do see here is "ap_pending > 0", still too often, when I
disconnect during resync + write activity, effectively blocking the
Primary's I/O subsystem. Seemingly we still have bugs in tl_clear :(
I need to look into that further.
> Code: Bad EIP value.
> <0>Fatal exception: panic in 5 seconds
Ouch.
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :