[Drbd-dev] DRBD-8 - handling data write errors
Lars Ellenberg
Lars.Ellenberg at linbit.com
Thu Jan 11 11:01:25 CET 2007
/ 2007-01-10 23:00:53 -0500
\ Graham, Simon:
> > I'm not really sure how to fix this at the moment, but I'm considering
> > the following:
> >
> > 1. The side that gets the error marks the block as out of sync AND
> > marks the local disk as inconsistent.
> > 2. Receipt of a NegAck causes the block to be marked as out of sync
> > AND the peer disk is made inconsistent (not sure if I need this step
> > since step 1 should cause this fact to be broadcast but it seems
> > safer).
> >
>
> So - I've found there is some existing code in place already - for
> example, set_out_of_sync is done in req_may_be_done if either local or
> remote fails, however, this is not sufficient for a couple of reasons:
>
> 1. Need to get the failing disk set Inconsistent so that following reads
> do not attempt to use the local block.
>
> 2. It seems to me that the current code doesn't really handle
> set_out_of_sync being set whilst resync is in progress (i.e. if a
> write error occurs on an application write during resync).
>
> I've also coded something that sends the Inconsistent state to the other
> side, which will trigger resync immediately - perhaps I shouldn't do
> this??? Not really going to be able to fix this problem (although it
> might be worth trying if the error was transient)...
>
> I wonder if we shouldn't instead simply always detach on error (i.e.
> stop using PassOn at all) to get the best behavior... this would
> certainly make things simpler (and we could remove the forcible detach
> on meta-data error that I added earlier) -- if you want to be able to
> handle errors then never use PassOn!

this is not a short-term project, but how about this:

introduce an additional "badblocks" bitmap -- actually, I think a
"range-list" type of storage would probably be more appropriate here.
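
to illustrate what I mean by a range-list, here is a rough userspace
sketch. none of this is existing DRBD code: bad_range, bad_range_list,
bad_ranges_add, bad_ranges_overlap and the fixed capacity
BAD_RANGES_MAX are names invented for this example, and a real
in-kernel version would also need locking and some on-disk persistence.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BAD_RANGES_MAX 64	/* hypothetical fixed capacity, sketch only */

/* one bad extent, half-open: [start, end), counted in sectors */
struct bad_range {
	uint64_t start;
	uint64_t end;
};

/* ranges are kept sorted by start and never overlap or touch */
struct bad_range_list {
	struct bad_range r[BAD_RANGES_MAX];
	size_t n;
};

/* mark [start, end) as bad, merging with overlapping or adjacent
 * ranges.  returns false only if a new slot is needed and none is left. */
static bool bad_ranges_add(struct bad_range_list *l,
			   uint64_t start, uint64_t end)
{
	size_t i = 0, j;

	if (start >= end)
		return true;	/* empty range, nothing to record */

	/* skip ranges that end strictly before the new one begins */
	while (i < l->n && l->r[i].end < start)
		i++;

	/* absorb every range that overlaps or touches [start, end) */
	for (j = i; j < l->n && l->r[j].start <= end; j++) {
		if (l->r[j].start < start)
			start = l->r[j].start;
		if (l->r[j].end > end)
			end = l->r[j].end;
	}

	if (j == i) {
		/* nothing absorbed: open a new slot at position i */
		if (l->n == BAD_RANGES_MAX)
			return false;
		memmove(&l->r[i + 1], &l->r[i],
			(l->n - i) * sizeof(l->r[0]));
		l->n++;
	} else {
		/* ranges i..j-1 collapse into the single slot i */
		memmove(&l->r[i + 1], &l->r[j],
			(l->n - j) * sizeof(l->r[0]));
		l->n -= j - i - 1;
	}
	l->r[i].start = start;
	l->r[i].end = end;
	return true;
}

/* does [start, end) intersect any recorded bad range? */
static bool bad_ranges_overlap(const struct bad_range_list *l,
			       uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < l->n; i++)
		if (l->r[i].start < end && start < l->r[i].end)
			return true;
	return false;
}
```

compared to a bitmap, this stays small as long as bad areas are few and
contiguous, which is the case I would expect for media errors.
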

local read error:
    mark dirty, read the full blocks remotely (which may be more than
    the application requested), write them locally;
    if written ok: mark clean again.
local write error:
    mark the block (range) as bad,
    mark the system "degraded".
both blocks bad, or remote not reachable:
    pass the error to the upper layers.

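
as a decision function, the cases above might look roughly like this.
again only a sketch with invented names (io_dir, io_action,
on_local_io_error), not existing DRBD code. it assumes that a local
write error while the peer is unreachable is also passed up, which the
list above does not spell out.

```c
#include <stdbool.h>

enum io_dir { IO_READ, IO_WRITE };

enum io_action {
	/* read error: mark dirty, fetch from peer, rewrite locally,
	 * mark clean again if the rewrite succeeds */
	ACT_READ_REMOTE_AND_REWRITE,
	/* write error: record the bad range, flag "degraded" */
	ACT_MARK_BAD_AND_DEGRADE,
	/* no good copy available: report the error to the caller */
	ACT_PASS_ERROR_UP,
};

/* decide how to react to a local io error on one block (range).
 * peer_block_bad: the peer's copy of the same range is known bad. */
static enum io_action on_local_io_error(enum io_dir dir,
					bool peer_reachable,
					bool peer_block_bad)
{
	/* assumption: without a usable peer copy, always pass up */
	if (!peer_reachable || peer_block_bad)
		return ACT_PASS_ERROR_UP;

	return dir == IO_READ ? ACT_READ_REMOTE_AND_REWRITE
			      : ACT_MARK_BAD_AND_DEGRADE;
}
```
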
I still need to think about the various meta-data io-error possibilities.
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :