[DRBD-user] kvm, drbd, elevator, rotational - quite an interesting co-operation

Lars Ellenberg lars.ellenberg at linbit.com
Fri Jul 3 16:00:59 CEST 2009


On Fri, Jul 03, 2009 at 08:06:07AM -0500, Javier Guerra wrote:
> Lars Ellenberg wrote:
> > On Thu, Jul 02, 2009 at 11:55:05PM +0400, Michael Tokarev wrote:
> > > drbd: what's the difference in write pattern on secondary and
> > >   primary nodes?  Why `rotational' flag makes very big difference
> > >   on secondary and no difference whatsoever on primary?
> > 
> > not much.
> > disk IO on Primary is usually submitted in the context of the
> > submitter (vm subsystem, filesystem or the process itself)
> > whereas on Secondary, IO is naturally submitted just by the
> > DRBD receiver and worker threads.
> just like with KVM itself, using several worker threads against a
> single IO device makes performance heavily dependent on a sensible
> elevator algorithm.  ideally, there should be only one worker thread
> for each thread/process originating the initial write.  unfortunately
> DRBD, being a block-level protocol, might have a hard time unraveling
> which writes belong to which process.  maybe just merging adjacent (in
> block address space, not in time) write operations would keep most of
> the relationships.

again, on the DRBD secondary, _all_ IO is naturally submitted
by _one_ thread (the receiver thread), apart from just a few
"internally" generated writes, which have to happen from a second
thread, the (one) drbd worker thread.
no, we do not do thread pooling (yet).

the elevator of the lower level block device (in this case,
the kvm virtual block device, or the host real block device)
is responsible for merging requests adjacent in block address space.

you have to choose a sensible elevator there.

please do use deadline IO scheduler on the host.
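as a sketch of how to do that at runtime via sysfs (sda is a
placeholder device name; pick the backing device DRBD actually
writes to, and run as root):

```shell
# show the available schedulers; the active one is in brackets,
# e.g.: noop anticipatory deadline [cfq]
cat /sys/block/sda/queue/scheduler

# switch this device to the deadline scheduler, effective immediately
echo deadline > /sys/block/sda/queue/scheduler
```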

if you really think you have reason to use cfq,
just _try_ deadline anyways.
or upgrade to a 2.6.30 or 2.6.31 kernel, where cfq should finally be
usable for many IO requests of one thread (receiver) depending on single
synchronous IO from another thread (drbd meta data updates from worker).
but using cfq on kernel < 2.6.30 for this access pattern
introduces unnecessary "idle-wait" latency (in the range of ms!)
for those meta data updates.

so, on kernel < 2.6.30,
please do use deadline IO scheduler.
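to make that the default for all block devices from boot onwards,
the elevator= kernel parameter can be used. a hedged example (the
kernel image and root device below are placeholders; adapt to your
bootloader config, e.g. grub's menu.lst):

```shell
# kernel command line: select deadline as the default IO scheduler
kernel /vmlinuz-2.6.29 root=/dev/sda1 elevator=deadline
```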



and if performance then still sucks,
we can talk further about why this may be.

: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
please don't Cc me, but send to list   --   I'm subscribed
