[Drbd-dev] DRBD-8 - handling data write errors

Graham, Simon Simon.Graham at stratus.com
Thu Jan 11 16:50:27 CET 2007

> this is not a short-term project, but
> how about this:
>  introduce an additional "badblocks" bitmap -- actually, I think probably
>  a "range-list" type of storage would be appropriate here.
>  local read error:
>     mark dirty, read full blocks remotely (which may be more than the
>     application requested), write -- written ok: mark clean again.
>  local write error:
>     mark block (range) as bad,
>     mark system "degraded"
>  both blocks bad, or remote not reachable:
>     pass to upper layers.

You are right - it's not short term! Also:
. I think it'd be necessary to write this new badblocks structure to the
  meta-data, so we'd need to allocate space for it.
. We'd then need to deal with the case of having no more space to record
  badblocks (the disk is pretty toasty in this case - maybe just detach).
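
To make the two points above concrete, here is a rough sketch of what a
bounded range-list could look like. It is only an illustration in Python,
not DRBD code: the names BadBlockList, max_ranges and
handle_local_write_error are made up for this example, and max_ranges is
an assumed stand-in for whatever space would be reserved in the meta-data.

```python
from dataclasses import dataclass, field


class BadBlockTableFull(Exception):
    """Raised when no slot is left to record another bad range."""


@dataclass
class BadBlockList:
    """Sorted, non-overlapping list of half-open sector ranges [start, end)."""
    max_ranges: int  # hypothetical cap standing in for reserved meta-data space
    ranges: list = field(default_factory=list)

    def mark_bad(self, start, length):
        """Record a bad range, merging it with overlapping or adjacent ranges."""
        new_start, new_end = start, start + length
        kept = []
        for s, e in self.ranges:
            if e < new_start or s > new_end:
                # Disjoint and not touching: keep the existing range as-is.
                kept.append((s, e))
            else:
                # Overlapping or adjacent: fold it into the new range.
                new_start, new_end = min(s, new_start), max(e, new_end)
        if len(kept) + 1 > self.max_ranges:
            # Out of space; the existing list is left unchanged.
            raise BadBlockTableFull()
        kept.append((new_start, new_end))
        kept.sort()
        self.ranges = kept

    def is_bad(self, start, length):
        """True if any sector in [start, start + length) is recorded as bad."""
        end = start + length
        return any(s < end and start < e for s, e in self.ranges)


def handle_local_write_error(table, start, length):
    """Policy sketch: mark the range bad and run degraded, or detach
    when the table has no room left (the 'toasty disk' case)."""
    try:
        table.mark_bad(start, length)
    except BadBlockTableFull:
        return "detached"
    return "degraded"
```

Merging adjacent ranges keeps the list short when errors cluster, which is
the main argument for a range-list over a bitmap; the fixed cap is what
forces the "no more space" decision.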

You know, the underlying disks already include a lot of this
functionality, and the more I think about it, the more convinced I am
that detaching on any error is the right thing to do:
. DRBD already (I think) correctly handles things if you re-attach
  following this (it'll try to resync the failed blocks, and if that
  fails it would detach again).
. Although this seems like you end up doing a lot of work, these errors
  are unlikely, so I think it's OK to use a large hammer.

I'm going to do some experiments with the error handler (on-io-error)
set to detach - will report back on results.
