[DRBD-user] I/O Scheduler and DRBD

Lars Ellenberg lars.ellenberg at linbit.com
Tue Jan 8 11:29:38 CET 2008


On Mon, Jan 07, 2008 at 08:13:29PM +0100, Bernd Petrovitsch wrote:
> Hi all!
> 
> I have 2 DRBD clusters - one cluster with Pentium 4 3.2 GHz CPUs and
> "3ware 7000-series ATA-RAID" controllers with RAID1 over two 75GB SCSI
> disks each, the other with Xeon 3.2 GHz CPUs and "Adaptec SmartRAID V"
> controllers with 6 disks (RAID0 over 3 RAID0), both with a
> 2.6.15-1-em64t-p4-smp kernel from Debian/Sarge backports
> (admittedly from many months ago).

any software raid or lvm striping involved?

> Typical workload is manipulating many, many small files. However, the
> nightly backup job (especially on the 6-disk hosts) stresses the I/O
> subsystem so much that it renders the rest of the host almost
> completely unusable, and programs run into timeouts on I/O.
> That kernel uses the "anticipatory" I/O scheduler by default (which
> seems to be the cause of the starvation).
> I wonder
> - if there is any risk involved in changing that (via kernel command
>   line and/or via /sys/block/<devicename>/queue/scheduler) to "cfq" -
>   the presumably best one, and

go ahead.
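
For illustration, checking and switching the scheduler at runtime through
the sysfs file mentioned above could look roughly like the sketch below.
The function names and the SYSFS_ROOT variable are made up for this
example (SYSFS_ROOT only exists so the functions can be tried against a
scratch directory), and "hda" is just a sample device name.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT and both function names are hypothetical,
# introduced here for illustration.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Print the active scheduler. The kernel marks it with [brackets],
# e.g. "noop [anticipatory] deadline cfq".
current_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$SYSFS_ROOT/block/$1/queue/scheduler"
}

# Select a scheduler for one device at runtime (needs root on a real host).
set_scheduler() {
    echo "$2" > "$SYSFS_ROOT/block/$1/queue/scheduler"
}
```

Usage would be something like "set_scheduler hda cfq" followed by
"current_scheduler hda" to confirm. A change made this way does not
survive a reboot; on kernels of this generation the boot-time default is
selected with the elevator= kernel command line parameter instead.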

I personally prefer "deadline" on servers,
even though for a busy PostgreSQL we had good results with cfq,
which "felt" marginally better than deadline there.
  main tunables for deadline appear to be:
	/sys/block/hda/queue/iosched/front_merges:1
		(can be switched off for raid controllers with good write cache)
	/sys/block/hda/queue/iosched/read_expire:500
		(ms. tune to 300 maybe)
	/sys/block/hda/queue/iosched/write_expire:5000
		(ms. tune to 1500 maybe)
cfq parameters are much less intuitive to tune :)
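
A rough sketch of applying the two expiry values suggested above (300 ms
for reads, 1500 ms for writes). The function name and SYSFS_ROOT are
hypothetical, and the defaults are only the "maybe" values from this
mail, not measured recommendations. The function refuses to write unless
deadline is the active scheduler for the device.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT and set_deadline_expiry are hypothetical names.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# usage: set_deadline_expiry <device> [read_ms] [write_ms]
set_deadline_expiry() {
    dev=$1
    read_ms=${2:-300}     # default taken from the suggestion above
    write_ms=${3:-1500}   # default taken from the suggestion above
    queue="$SYSFS_ROOT/block/$dev/queue"

    # Only proceed if deadline is the scheduler shown in [brackets].
    case $(cat "$queue/scheduler") in
        *"[deadline]"*) ;;
        *) return 1 ;;
    esac

    echo "$read_ms"  > "$queue/iosched/read_expire"
    echo "$write_ms" > "$queue/iosched/write_expire"
}
```

e.g. "set_deadline_expiry hda" for the defaults, or
"set_deadline_expiry hda 300 1500" to be explicit. As with the scheduler
choice itself, these values are lost on reboot unless reapplied.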

> - if that interferes with DRBD on top, and

if you care about write latency,
anticipatory is very bad for DRBD.
even noop should be better.

if you care most about read latency,
and don't care for write latency at all,
stay with anticipatory.

> - if that actually buys anything, especially avoids the starvation.

depends. tuning the io scheduler can buy you a lot.

depending on how your backup job works, I suggest tuning there, too.

e.g. if there is any streaming pipe involved, like
  input_streaming | output_streaming
just insert some "buffer -u 100" (apt-get install buffer; man buffer)
like so:
  input_streaming | buffer -u 100 | output_streaming

that should be enough to avoid it starving the rest of the system.
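
To make that a bit more concrete, here is a small sketch of a wrapper
around the buffer step. The name "throttle" is made up for this example,
and the silent fallback to cat (no throttling at all) when buffer is not
installed is an assumption of the sketch, not something buffer does
itself.

```shell
#!/bin/sh
# throttle: hypothetical wrapper for use in the middle of a pipeline.
# With buffer(1) installed it pauses N microseconds after each write;
# otherwise it passes the data through unthrottled.
throttle() {
    if command -v buffer >/dev/null 2>&1; then
        buffer -u "${1:-100}"
    else
        cat
    fi
}
```

It would sit in a backup pipeline like this (paths are placeholders):
  tar cf - /srv/data | throttle 100 | gzip > /backup/data.tar.gz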

-- 
: Lars Ellenberg                           http://www.linbit.com :
: DRBD/HA support and consulting             sales at linbit.com :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
: Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :