[DRBD-cvs] r1847 - trunk
www-data
www-data at linbit.com
Mon Jul 11 10:57:42 CEST 2005
Author: phil
Date: 2005-07-11 10:57:41 +0200 (Mon, 11 Jul 2005)
New Revision: 1847
Modified:
trunk/ROADMAP
Log:
A bit more research on the topic of write barriers in the kernel,
and the capabilities of disk drives.
Modified: trunk/ROADMAP
===================================================================
--- trunk/ROADMAP 2005-07-09 16:35:46 UTC (rev 1846)
+++ trunk/ROADMAP 2005-07-11 08:57:41 UTC (rev 1847)
@@ -503,7 +503,9 @@
BIO_RW (=write) requests.
The REQ_HARDBARRIER bit is currently used to do a cache flush on
- IDE device, which some IDE devices are capable of.
+ IDE devices. Actually, not all IDE devices can do cache flushes; some
+ older models out there can do write-caching but cannot perform a
+ cache flush!
Journaling file systems should use this barrier mechanism in their journal
writes (actually on the commit block, this is the last write in a
@@ -512,16 +514,48 @@
As for DRBD, we should probably ship the REQ_HARDBARRIER flags with
our wire protocol (or should they be expressed by Barrier packets?)
- We will only see such REQ_HARDBARRIER flags if state to the upper layers
+ We will only see such REQ_HARDBARRIER flags if we state to the upper layers
that we are able to deal with them. We need to do this by announcing it:
blk_queue_ordered(q, QUEUE_ORDERED_FLUSH or QUEUE_ORDERED_TAG ) .
- Default ist QUEUE_ORDERED_NODE. This is the reason why we never see
+ Default is QUEUE_ORDERED_NONE. This is the reason why we never see
the REQ_HARDBARRIER flag currently.
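+ The announcement step above can be sketched as a small userspace model
+ (the QUEUE_ORDERED_* names match the 2.6 block layer; the queue struct
+ and helper functions here are illustrative stand-ins, not kernel code):
+
+ ```c
+ #include <assert.h>
+ #include <stdio.h>
+
+ /* Ordering modes as announced via blk_queue_ordered() in the 2.6
+  * block layer.  The struct below is a stand-in for the request queue. */
+ enum { QUEUE_ORDERED_NONE, QUEUE_ORDERED_FLUSH, QUEUE_ORDERED_TAG };
+
+ struct queue {
+     int ordered;   /* defaults to QUEUE_ORDERED_NONE */
+ };
+
+ /* Stand-in for blk_queue_ordered(q, mode): announce barrier support. */
+ static void queue_set_ordered(struct queue *q, int mode)
+ {
+     q->ordered = mode;
+ }
+
+ /* The upper layers only pass REQ_HARDBARRIER requests down when the
+  * queue announced a mode other than QUEUE_ORDERED_NONE. */
+ static int may_see_barriers(const struct queue *q)
+ {
+     return q->ordered != QUEUE_ORDERED_NONE;
+ }
+
+ int main(void)
+ {
+     struct queue q = { QUEUE_ORDERED_NONE };
+     assert(!may_see_barriers(&q));   /* default: no barriers arrive */
+
+     queue_set_ordered(&q, QUEUE_ORDERED_FLUSH);
+     assert(may_see_barriers(&q));    /* now REQ_HARDBARRIER is passed */
+
+     printf("barrier announcement model ok\n");
+     return 0;
+ }
+ ```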
- An other consequence of this is, that IDE devices _not_ support cache
- flushes and have write cache enabled are inherent buggy to use with
+ Another consequence of this is that IDE devices that do _not_ support
+ cache flushes and have the write cache enabled are inherently buggy to use with
a journaled file system.
+ SCSI's Tagged queuing (seems to be present in SATA as well)
+ [excerpt from http://www.scsimechanic.com/scsi/SCSI2-07.html]
+
+ Tagged queuing allows a target to accept multiple I/O processes from
+ the same or different initiators until the logical unit's command queue
+ is full.
+
+ If only SIMPLE QUEUE TAG messages are used, the target may execute the
+ commands in any order that is deemed desirable within the constraints
+ of the queue management algorithm specified in the control mode page
+ (see 8.3.3.1).
+
+ If ORDERED QUEUE TAG messages are used, the target shall execute the
+ commands in the order received with respect to other commands received
+ with ORDERED QUEUE TAG messages. All commands received with a SIMPLE
+ QUEUE TAG message prior to a command received with an ORDERED QUEUE
+ TAG message, regardless of initiator, shall be executed before that
+ command with the ORDERED QUEUE TAG message. All commands received with
+ a SIMPLE QUEUE TAG message after a command received with an ORDERED
+ QUEUE TAG message, regardless of initiator, shall be executed after
+ that command with the ORDERED QUEUE TAG message.
+
+ A command received with a HEAD OF QUEUE TAG message is placed first in
+ the queue, to be executed next. A command received with a HEAD OF
+ QUEUE TAG message shall be executed prior to any queued I/O
+ process. Consecutive commands received with HEAD OF QUEUE TAG messages
+ are executed in a last-in-first-out order.
+
+ I think in the context of SCSI the kernel usually issues write requests
+ with a SIMPLE QUEUE TAG, and requests with REQ_HARDBARRIER set
+ (i.e. bios with BIO_RW_BARRIER) with an ORDERED QUEUE TAG.
+
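+ The ordering rules quoted above can be modelled in a few lines: assign
+ each queued command a "rank", where the target may reorder commands
+ that share a rank but must respect rank order.  This is an illustrative
+ sketch, not kernel or firmware code, and it leaves HEAD OF QUEUE TAG out:
+
+ ```c
+ #include <assert.h>
+ #include <stdio.h>
+
+ enum tag { SIMPLE_QUEUE_TAG, ORDERED_QUEUE_TAG };
+
+ /* An ORDERED QUEUE TAG command gets a rank of its own, so it executes
+  * after every earlier command and before every later one; consecutive
+  * SIMPLE QUEUE TAG commands share a rank and may be reordered freely. */
+ static void assign_ranks(const enum tag *tags, int n, int *rank)
+ {
+     int r = 0;
+     for (int i = 0; i < n; i++) {
+         if (tags[i] == ORDERED_QUEUE_TAG) {
+             rank[i] = ++r;  /* after all previously queued commands */
+             r++;            /* and before everything queued later   */
+         } else {
+             rank[i] = r;    /* reorderable within the current group */
+         }
+     }
+ }
+
+ int main(void)
+ {
+     /* two SIMPLE writes, one ORDERED (barrier) write, one SIMPLE write */
+     enum tag tags[] = { SIMPLE_QUEUE_TAG, SIMPLE_QUEUE_TAG,
+                         ORDERED_QUEUE_TAG, SIMPLE_QUEUE_TAG };
+     int rank[4];
+
+     assign_ranks(tags, 4, rank);
+     assert(rank[0] == rank[1]);  /* the two SIMPLEs may reorder  */
+     assert(rank[2] >  rank[1]);  /* barrier runs after them      */
+     assert(rank[3] >  rank[2]);  /* and before the later SIMPLE  */
+     printf("ordered tag acts as a barrier\n");
+     return 0;
+ }
+ ```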
What QUEUE_ORDERED_ type should we expose ?
In order to support capable IDE devices correctly, we should ship the
@@ -534,13 +568,13 @@
self peer DRBD
---------------------
NONE , NONE => NONE
- NONE , FLUSH => FLUSH ( for shipping. Strip local. Save ? )
+ NONE , FLUSH => NONE
NONE , TAG => NONE
- FLUSH, NONE => FLUSH ( for local flush. Strip remote. Save ? )
+ FLUSH, NONE => NONE
FLUSH, FLUSH => FLUSH
FLUSH, TAG => FLUSH
TAG, NONE => NONE
- TAG, FLUSH => TAG
+ TAG, FLUSH => FLUSH
TAG, TAG => TAG
How should we deal with our self generated barrier packets ?
@@ -551,8 +585,8 @@
done upon receiving a write barrier packet. This could give us a
performance boost.
- In case out backing device only supports QUEUE_ORDERED_FLUSH it is
- probabely better to use the current code. That mesns, when we receive
+ In case our backing device only supports QUEUE_ORDERED_FLUSH it is
+ probably better to use the current code. That means, when we receive
a write barrier packet we wait until all of our pending local write
requests are done.