[Drbd-dev] Checksum based resync block size

Wed Jun 26 21:20:31 CEST 2019

On Mon, 24 Jun 2019, Lars Ellenberg wrote:

> On Sat, Jun 22, 2019 at 12:03:55AM +0000, Eric Wheeler wrote:
> > Hello all,
> > 
> > Can someone help explain how checksum-based sync and verify are 
> > implemented on the sender and receiver sides?  It looks like the hashes are 
> > per-sector (looking at read_for_csum?) and I am interested in making the 
> > csum chunk size configurable, or at least hacking in some test code to see if 
> > checksumming multiple sectors at once would provide a performance benefit.
> > 
> > I'm also trying to understand what iterates over the lldev and 
> > where the csum takes place for each chunk of data.
> > 
> > Any direction would be helpful.  Thank you.
> 
> As our in-sync/out-of-sync bitmap tracks 4k blocks,
> we want to compare 4k checksums.
> 
> Yes, that generates "a lot" of requests, and if these are not merged by
> some IO scheduler on the lower layers, that may seriously suck.
> 
> make_ov_request() is what generates the online-verify requests.
> 
> What we potentially could do is issue the requests to the backends in
> larger chunks (such as 1 MiB), then calculate and communicate the
> checksum for each 4k block, as well as the result.

What if it calculated one checksum per 1 MiB chunk (configurable) and 
then invalidated all 4k bitmap entries in that 1 MiB range if the hash 
mismatches?
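
To make the trade-off concrete, here is a minimal user-space sketch of both
variants. It is not DRBD code: toy_csum(), mark_per_block() and
mark_per_chunk() are hypothetical names, the checksum is a stand-in (FNV-1a)
for whatever csum algorithm is configured, the bitmap is simplified to one
byte per 4k block, and both copies of the data sit in local buffers, whereas
in practice only the checksums would cross the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096u   /* granularity of the out-of-sync bitmap */

/* Stand-in checksum (64-bit FNV-1a); an assumption, not DRBD's csum code. */
static uint64_t toy_csum(const uint8_t *p, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/*
 * Variant A: one checksum per 4k block; only mismatching blocks are marked.
 * oos[] holds one byte per 4k block (a simplified bitmap).
 * Returns the number of blocks marked out of sync.
 */
static size_t mark_per_block(const uint8_t *local, const uint8_t *peer,
                             size_t len, uint8_t *oos)
{
    size_t marked = 0;

    assert(len % BLOCK_SIZE == 0);
    for (size_t b = 0; b < len / BLOCK_SIZE; b++) {
        const size_t off = b * BLOCK_SIZE;

        if (toy_csum(local + off, BLOCK_SIZE) !=
            toy_csum(peer + off, BLOCK_SIZE)) {
            oos[b] = 1;
            marked++;
        }
    }
    return marked;
}

/*
 * Variant B: one checksum per chunk_size bytes; on a mismatch every 4k
 * block inside that chunk is marked out of sync.
 * Returns the number of blocks marked out of sync.
 */
static size_t mark_per_chunk(const uint8_t *local, const uint8_t *peer,
                             size_t len, size_t chunk_size, uint8_t *oos)
{
    size_t marked = 0;

    assert(chunk_size > 0 && chunk_size % BLOCK_SIZE == 0);
    assert(len % chunk_size == 0);

    const size_t blocks_per_chunk = chunk_size / BLOCK_SIZE;

    for (size_t c = 0; c < len / chunk_size; c++) {
        const size_t off = c * chunk_size;

        if (toy_csum(local + off, chunk_size) !=
            toy_csum(peer + off, chunk_size)) {
            memset(oos + c * blocks_per_chunk, 1, blocks_per_chunk);
            marked += blocks_per_chunk;
        }
    }
    return marked;
}
```

With a single differing 4k block inside a 1 MiB chunk, variant A marks 1
block while variant B marks all 256 (1 MiB / 4 KiB), so the resync would
transfer 1 MiB instead of 4 KiB in exchange for 256 times fewer checksums.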

--
Eric Wheeler

> 
> -- 
> : Lars Ellenberg
> : LINBIT | Keeping the Digital World Running
> : DRBD -- Heartbeat -- Corosync -- Pacemaker
> : R&D, Integration, Ops, Consulting, Support