[DRBD-cvs] r1847 - trunk
www-data
www-data at linbit.com
Mon Jul 11 10:57:42 CEST 2005
Author: phil
Date: 2005-07-11 10:57:41 +0200 (Mon, 11 Jul 2005)
New Revision: 1847
Modified:
trunk/ROADMAP
Log:
A bit more research on the topic of write barriers in the kernel,
and the capabilities of disk drives.
Modified: trunk/ROADMAP
===================================================================
--- trunk/ROADMAP 2005-07-09 16:35:46 UTC (rev 1846)
+++ trunk/ROADMAP 2005-07-11 08:57:41 UTC (rev 1847)
@@ -503,7 +503,9 @@
BIO_RW (=write) requests.
The REQ_HARDBARRIER bit is currently used to do a cache flush on
- IDE device, which some IDE devices are capable of.
+ IDE devices. Actually, not all IDE devices can do cache flushes; some
+ older models out there can do write-caching but cannot perform a
+ cache flush!
Journaling file systems should use this barrier mechanism in their journal
writes (actually on the commit block, this is the last write in a
@@ -512,16 +514,48 @@
As for DRBD, we should probably ship the REQ_HARDBARRIER flags with
our wire protocol (or should they be expressed by Barrier packets?)
- We will only see such REQ_HARDBARRIER flags if state to the upper layers
+ We will only see such REQ_HARDBARRIER flags if we state to the upper layers
that we are able to deal with them. We need to do this by announcing it:
blk_queue_ordered(q, QUEUE_ORDERED_FLUSH or QUEUE_ORDERED_TAG ) .
- Default ist QUEUE_ORDERED_NODE. This is the reason why we never see
+ Default is QUEUE_ORDERED_NONE. This is the reason why we never see
the REQ_HARDBARRIER flag currently.
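+ The announcement step above can be sketched as a small userspace model
+ (the QUEUE_ORDERED_* names match the 2.6 block layer; the queue struct
+ and helper functions here are illustrative stand-ins, not kernel code):
+
+ ```c
+ #include <assert.h>
+ #include <stdio.h>
+
+ /* Ordering modes as announced via blk_queue_ordered() in the 2.6
+  * block layer.  The struct below is a stand-in for the request queue. */
+ enum { QUEUE_ORDERED_NONE, QUEUE_ORDERED_FLUSH, QUEUE_ORDERED_TAG };
+
+ struct queue {
+     int ordered;   /* defaults to QUEUE_ORDERED_NONE */
+ };
+
+ /* Stand-in for blk_queue_ordered(q, mode): announce barrier support. */
+ static void queue_set_ordered(struct queue *q, int mode)
+ {
+     q->ordered = mode;
+ }
+
+ /* The upper layers only pass REQ_HARDBARRIER requests down when the
+  * queue announced a mode other than QUEUE_ORDERED_NONE. */
+ static int may_see_barriers(const struct queue *q)
+ {
+     return q->ordered != QUEUE_ORDERED_NONE;
+ }
+
+ int main(void)
+ {
+     struct queue q = { QUEUE_ORDERED_NONE };
+     assert(!may_see_barriers(&q));   /* default: no barriers arrive */
+
+     queue_set_ordered(&q, QUEUE_ORDERED_FLUSH);
+     assert(may_see_barriers(&q));    /* now REQ_HARDBARRIER is passed */
+
+     printf("barrier announcement model ok\n");
+     return 0;
+ }
+ ```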
- An other consequence of this is, that IDE devices _not_ support cache
- flushes and have write cache enabled are inherent buggy to use with
+ Another consequence of this is that IDE devices that do _not_ support
+ cache flushes and have the write cache enabled are inherently buggy to use with
a journaled file system.
+ SCSI's Tagged queuing (seems to be present in SATA as well)
+ [excerpt from http://www.scsimechanic.com/scsi/SCSI2-07.html]
+
+ Tagged queuing allows a target to accept multiple I/O processes from
+ the same or different initiators until the logical unit's command queue
+ is full.
+
+ If only SIMPLE QUEUE TAG messages are used, the target may execute the
+ commands in any order that is deemed desirable within the constraints
+ of the queue management algorithm specified in the control mode page
+ (see 8.3.3.1).
+
+ If ORDERED QUEUE TAG messages are used, the target shall execute the
+ commands in the order received with respect to other commands received
+ with ORDERED QUEUE TAG messages. All commands received with a SIMPLE
+ QUEUE TAG message prior to a command received with an ORDERED QUEUE
+ TAG message, regardless of initiator, shall be executed before that
+ command with the ORDERED QUEUE TAG message. All commands received with
+ a SIMPLE QUEUE TAG message after a command received with an ORDERED
+ QUEUE TAG message, regardless of initiator, shall be executed after
+ that command with the ORDERED QUEUE TAG message.
+
+ A command received with a HEAD OF QUEUE TAG message is placed first in
+ the queue, to be executed next. A command received with a HEAD OF
+ QUEUE TAG message shall be executed prior to any queued I/O
+ process. Consecutive commands received with HEAD OF QUEUE TAG messages
+ are executed in a last-in-first-out order.
+
+ I think in the context of SCSI the kernel usually issues write requests
+ with a SIMPLE QUEUE TAG, and requests with REQ_HARDBARRIER set
+ (i.e. bios with BIO_RW_BARRIER) with an ORDERED QUEUE TAG.
+
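+ The ordering rules quoted above can be modelled in a few lines: assign
+ each queued command a "rank", where the target may reorder commands
+ that share a rank but must respect rank order.  This is an illustrative
+ sketch, not kernel or firmware code, and it leaves HEAD OF QUEUE TAG out:
+
+ ```c
+ #include <assert.h>
+ #include <stdio.h>
+
+ enum tag { SIMPLE_QUEUE_TAG, ORDERED_QUEUE_TAG };
+
+ /* An ORDERED QUEUE TAG command gets a rank of its own, so it executes
+  * after every earlier command and before every later one; consecutive
+  * SIMPLE QUEUE TAG commands share a rank and may be reordered freely. */
+ static void assign_ranks(const enum tag *tags, int n, int *rank)
+ {
+     int r = 0;
+     for (int i = 0; i < n; i++) {
+         if (tags[i] == ORDERED_QUEUE_TAG) {
+             rank[i] = ++r;  /* after all previously queued commands */
+             r++;            /* and before everything queued later   */
+         } else {
+             rank[i] = r;    /* reorderable within the current group */
+         }
+     }
+ }
+
+ int main(void)
+ {
+     /* two SIMPLE writes, one ORDERED (barrier) write, one SIMPLE write */
+     enum tag tags[] = { SIMPLE_QUEUE_TAG, SIMPLE_QUEUE_TAG,
+                         ORDERED_QUEUE_TAG, SIMPLE_QUEUE_TAG };
+     int rank[4];
+
+     assign_ranks(tags, 4, rank);
+     assert(rank[0] == rank[1]);  /* the two SIMPLEs may reorder  */
+     assert(rank[2] >  rank[1]);  /* barrier runs after them      */
+     assert(rank[3] >  rank[2]);  /* and before the later SIMPLE  */
+     printf("ordered tag acts as a barrier\n");
+     return 0;
+ }
+ ```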
What QUEUE_ORDERED_ type should we expose ?
In order to support capable IDE devices correctly, we should ship the
@@ -534,13 +568,13 @@
self peer DRBD
---------------------
NONE , NONE => NONE
- NONE , FLUSH => FLUSH ( for shipping. Strip local. Save ? )
+ NONE , FLUSH => NONE
NONE , TAG => NONE
- FLUSH, NONE => FLUSH ( for local flush. Strip remote. Save ? )
+ FLUSH, NONE => NONE
FLUSH, FLUSH => FLUSH
FLUSH, TAG => FLUSH
TAG, NONE => NONE
- TAG, FLUSH => TAG
+ TAG, FLUSH => FLUSH
TAG, TAG => TAG
How should we deal with our self generated barrier packets ?
@@ -551,8 +585,8 @@
done upon receiving a write barrier packet. This could give us a
performance boost.
- In case out backing device only supports QUEUE_ORDERED_FLUSH it is
- probabely better to use the current code. That mesns, when we receive
+ In case our backing device only supports QUEUE_ORDERED_FLUSH it is
+ probably better to use the current code. That means, when we receive
a write barrier packet we wait until all of our pending local write
requests are done.