[Drbd-dev] Re: drbd_panic() in drbd_receiver.c
Philipp Reisner
philipp.reisner at linbit.com
Wed Jul 5 18:15:01 CEST 2006
> Apologies for the detail below, but I want to make sure I'm going about
> this the right way - Here's what I'm thinking as a way to fix this --
> please comment; you know this code so much better than I do!
>
> 1. Add a new field in the mdev - rs_failed - that counts the number of
> NegDSReply's received, init to zero
> at start of resync
ack.
> 2. Move the code that checks for end of resync into a new routine -
> drbd_check_for_end_resync() and change it
> to check if the bitmap weight is <= rs_failed.
ok.
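
A minimal sketch of the check in steps 1 and 2, assuming a simplified stand-in for the device state. The names resync_state, resync_start and resync_finished are hypothetical and are not the actual mdev layout or driver functions; only rs_failed and the "bitmap weight <= rs_failed" condition come from the proposal above.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the per-device resync state. */
struct resync_state {
    unsigned long bm_weight;  /* bits still set in the bitmap (out of sync) */
    unsigned long rs_failed;  /* blocks answered with a NegDSReply */
};

/* Step 1: the failure counter starts at zero for every resync run. */
void resync_start(struct resync_state *rs, unsigned long out_of_sync_bits)
{
    rs->bm_weight = out_of_sync_bits;
    rs->rs_failed = 0;
}

/* Step 2: resync is over once every bit that is still set
 * belongs to a block that failed and cannot be cleared. */
bool resync_finished(const struct resync_state *rs)
{
    return rs->bm_weight <= rs->rs_failed;
}
```

With no failures this reduces to the usual "bitmap weight is zero" test, so the behavior without I/O errors stays the same.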
> 3. Change drbd_try_to_clean_on_disk_bm to schedule w_update_odbm if
> _any_ bits are cleared on disk (perhaps it should
> be some-bit-cleared AND (rs_failed!=0 || extent-now-completely-clear))
> - that won't change the current behavior if
> no failures occur -- I'm just a bit worried about doing this too
> often...
I see the problem here... and I have some advice for you.
The bm_extent holds the number of dirty bits for the extent (rs_left).
Add a member there that holds the number of I/O errors for that
sync extent (rs_failed).
... Do you see by now what I mean?
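
One possible reading of this hint, sketched under assumptions: with a per-extent failure count, the on-disk bitmap update could be scheduled only when an extent has nothing left to sync except failed bits, rather than on every cleared bit. The struct and function names below are hypothetical; only rs_left and rs_failed are taken from the thread, and the trigger condition is an interpretation, not a confirmed design.

```c
#include <stdbool.h>

/* Hypothetical model of one resync extent. */
struct bm_extent_sketch {
    int rs_left;    /* dirty bits remaining in this extent */
    int rs_failed;  /* assumed new member: bits that hit an I/O error */
};

/* An extent is "as clean as it will get" when every remaining
 * dirty bit is accounted for by a failure. */
static bool extent_settled(const struct bm_extent_sketch *ext)
{
    return ext->rs_left <= ext->rs_failed;
}

/* `count` bits were synced successfully and cleared.
 * Returns true if the on-disk bitmap update should be scheduled. */
bool extent_bits_synced(struct bm_extent_sketch *ext, int count)
{
    ext->rs_left -= count;
    return extent_settled(ext);
}

/* `count` bits failed; they stay dirty but are counted.
 * Returns true if the on-disk bitmap update should be scheduled. */
bool extent_bits_failed(struct bm_extent_sketch *ext, int count)
{
    ext->rs_failed += count;
    return extent_settled(ext);
}
```

Under this reading the update still fires once per extent, which would address the worry about scheduling it too often.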
> 4. Add a call to drbd_check_for_end_resync() in got_NegDSReply() to
> handle the case where the last block failed.
right.
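
To illustrate why step 4 matters, here is a small sketch in which the last outstanding block fails. All names (mdev_sketch, on_block_synced, on_neg_ds_reply, check_for_end_resync) are hypothetical simplifications and not the driver's real functions; the only point is that both reply paths have to run the end-of-resync check.

```c
#include <stdbool.h>

/* Hypothetical, simplified device state. */
struct mdev_sketch {
    unsigned long bm_weight;  /* bits still set in the bitmap */
    unsigned long rs_failed;  /* negative replies seen in this resync */
    bool resync_done;
};

/* Shared end-of-resync test, as proposed in step 2. */
void check_for_end_resync(struct mdev_sketch *mdev)
{
    if (mdev->bm_weight <= mdev->rs_failed)
        mdev->resync_done = true;
}

/* Positive reply: the block was written, so its bit is cleared. */
void on_block_synced(struct mdev_sketch *mdev)
{
    if (mdev->bm_weight > 0)
        mdev->bm_weight--;
    check_for_end_resync(mdev);
}

/* Negative reply: the bit stays set, but the failure is counted.
 * Without the check here, a failure on the last block would
 * leave the resync waiting forever. */
void on_neg_ds_reply(struct mdev_sketch *mdev)
{
    mdev->rs_failed++;
    check_for_end_resync(mdev);
}
```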
> 5. Find all the places where rs_total, rs_mark_left and the bitmap
> weight are referenced and include rs_failed as
> necessary (e.g. BM_PARANOIA_CHECK in drbd_bitmap.c).
-Philipp
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com :