[Drbd-dev] FLUSH/FUA documentation & code discrepancy

Lars Ellenberg lars.ellenberg at linbit.com
Wed Sep 5 12:07:24 CEST 2012


On Wed, Sep 05, 2012 at 01:49:15AM -0700, Tejun Heo wrote:
> On Wed, Sep 05, 2012 at 10:44:55AM +0200, Philipp Reisner wrote:
> > > Currently, FLUSH/FUA doesn't enforce any ordering requirement.  File
> > > systems are responsible for draining all writes which have to happen
> > > before and not issue further writes which should come after.
> > 
> > Ok. That is a clear statement. So we will do it that way.
> > 
> > The "Currently" in your statement suggests that there might be something
> > more mighty in the future. Is that true?
> 
> Heh, I was more thinking about the past.  We used to have barrier
> support with much stricter ordering.  I don't think we're gonna change
> the ordering requirement in any foreseeable future.

So, to reiterate the situation:

If I submit a non-empty bio with FLUSH/FUA set
on a queue that does support flush, we get to
	blk_queue_bio()
		if (bio->bi_rw & (REQ_FLUSH | REQ_FUA)) {
			spin_lock_irq(q->queue_lock);
			where = ELEVATOR_INSERT_FLUSH;
			goto get_rq;
		}

This bio ends up *not* being merged or reordered by the elevator
(and, by means of FLUSH/FUA, not by the hardware either, obviously).


If the queue does not support it, flags are stripped away in
generic_make_request_checks(), and we will not take that branch
in blk_queue_bio(), but enter the normal elevator code path,
attempting a merge, or doing ELEVATOR_INSERT_SORT.

This same bio, if it happens to be submitted on a different IO stack,
now *is* subject to reordering in the elevator already,
even before being sent to the hardware.
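
To make the difference between the two paths concrete, here is a toy
user-space model. It is only a sketch: every toy_* name and flag value
is made up for illustration, and it mirrors the kernel code only as far
as described above (flags stripped when the queue has no flush support,
then the insert position chosen from the remaining flags).

```c
/* Toy model, NOT kernel code: all names and values are hypothetical. */
#define TOY_REQ_FLUSH	(1u << 0)
#define TOY_REQ_FUA	(1u << 1)

enum toy_insert { TOY_INSERT_SORT, TOY_INSERT_FLUSH };

struct toy_queue { unsigned int flush_flags; };	/* 0: no flush support */
struct toy_bio { unsigned int rw; };

/* Models the stripping step in generic_make_request_checks(). */
static void toy_checks(const struct toy_queue *q, struct toy_bio *bio)
{
	if ((bio->rw & (TOY_REQ_FLUSH | TOY_REQ_FUA)) && !q->flush_flags)
		bio->rw &= ~(TOY_REQ_FLUSH | TOY_REQ_FUA);
}

/* Models the branch in blk_queue_bio() that picks the insert position. */
static enum toy_insert toy_queue_bio(const struct toy_bio *bio)
{
	if (bio->rw & (TOY_REQ_FLUSH | TOY_REQ_FUA))
		return TOY_INSERT_FLUSH;	/* no merge, no reordering */
	return TOY_INSERT_SORT;			/* normal elevator path */
}

/* The bio is passed by value, so the caller's copy keeps its flags. */
static enum toy_insert toy_submit(const struct toy_queue *q, struct toy_bio bio)
{
	toy_checks(q, &bio);
	return toy_queue_bio(&bio);
}
```

Submitting the same bio (rw = TOY_REQ_FLUSH | TOY_REQ_FUA) through
toy_submit() yields TOY_INSERT_FLUSH on a queue with flush_flags set,
and TOY_INSERT_SORT on a queue with flush_flags == 0.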




If we could somehow express at submit_bio() time that we would like
this bio, once it reaches the elevator, not to be reordered,
regardless of whether FLUSH is supported or required
by the IO stack in use, that would be better than what we have now, IMO.
In fact, for our particular use case it would even be good enough.


Could we "just" strip these flags a "little bit later"?
Or set some other indicator when stripping them?
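
A rough sketch of the second variant, again as a toy user-space model
and not as a patch. HINT_REQ_NOREORDER is a purely hypothetical flag,
and whether a back insert without merging would be the right place for
such a bio is an assumption on my part.

```c
/* Toy model of the idea, NOT kernel code: all names are hypothetical. */
#define HINT_REQ_FLUSH		(1u << 0)
#define HINT_REQ_FUA		(1u << 1)
#define HINT_REQ_NOREORDER	(1u << 2)	/* hypothetical new indicator */

enum hint_insert { HINT_INSERT_SORT, HINT_INSERT_BACK, HINT_INSERT_FLUSH };

struct hint_queue { unsigned int flush_flags; };	/* 0: no flush support */
struct hint_bio { unsigned int rw; };

/* Strip FLUSH/FUA as before, but remember that they had been set. */
static void hint_checks(const struct hint_queue *q, struct hint_bio *bio)
{
	if ((bio->rw & (HINT_REQ_FLUSH | HINT_REQ_FUA)) && !q->flush_flags) {
		bio->rw &= ~(HINT_REQ_FLUSH | HINT_REQ_FUA);
		bio->rw |= HINT_REQ_NOREORDER;
	}
}

/* Pick the insert position; the indicator bypasses merge and sort. */
static enum hint_insert hint_queue_bio(const struct hint_bio *bio)
{
	if (bio->rw & (HINT_REQ_FLUSH | HINT_REQ_FUA))
		return HINT_INSERT_FLUSH;
	if (bio->rw & HINT_REQ_NOREORDER)
		return HINT_INSERT_BACK;	/* assumed: append, no merge */
	return HINT_INSERT_SORT;
}

/* The bio is passed by value, so the caller's copy keeps its flags. */
static enum hint_insert hint_submit(const struct hint_queue *q, struct hint_bio bio)
{
	hint_checks(q, &bio);
	return hint_queue_bio(&bio);
}
```

With this, a FLUSH/FUA bio on a queue without flush support would no
longer fall through to the sort path, while plain bios are unaffected.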


Thanks,

	Lars


