[DRBD-user] Syncer question

Fri Feb 15 20:18:13 CET 2008

Thank you so much for that synopsis.  You made it very clear and now I
see the correlation.  Once I finish I *hope* to be able to write a
tutorial that covers heartbeat, drbd, stonith, etc.  With your permission
I would like to include your detailed synopsis, with proper attribution
of course.

regards,

Douglas Lochart

On Fri, Feb 15, 2008 at 7:09 AM, Lars Ellenberg
<lars.ellenberg at linbit.com> wrote:
> On Thu, Feb 14, 2008 at 10:07:39AM -0500, Doug Lochart wrote:
>  > >  > I am using DRBD to mirror a 1 terabyte device.  I will be syncing
>  > >  > over a gigabit switch so I assume I can use 100M to 150M as a rate
>  > >  > safely.  There will be other traffic on the switch.  However I do not
>  > >  > seem to understand how the al-extents parameter is used.  Most
>  > >  > examples leave it set to 257 and that is good for an active set of 1
>  > >  > Gig.  Could someone explain what the active set implies, AND what
>  > >  > the pros/cons of increasing the active set size are, or whether I
>  > >  > need to worry about it at all?
>  > >
>  > >  please read
>  > >  http://www.drbd.org/fileadmin/drbd/publications/drbd8.linux-conf.eu.2007.pdf
>  > >
>
>  > Forgive me if I am obviously too stupid to understand but I did read
>  > that document and it did not help.  It makes no mention of the
>  > parameters in question nor enough information for me to relate the
>  > information in the document to drbd.conf file entries.
>  >
>  > I can accept (though not really understand) not being offered a
>  > high level and simplified explanation in answer to my questions.  I do
>  > ask, however, for help correlating the config file parameters in
>  > question to the contents of this document.
>  >
>  > To others: I would gladly accept a high level synopsis of what the
>  > parameters affect and how to use them to tune my replication.  Armed
>  > with a high level view I hope to be able to correlate them back to the
>  > technical details in this document.
>
>  ok, sorry.
>  section 6 explains the problem of finding which blocks might have been
>  modified independently during different degraded situations.
>
>  6.1 explains the easy part, where we have just a network hiccup:
>     we can use the in-memory dirty-bitmap on the Primary.
>
>  6.2 explains why the bitmap alone does not suffice when we consider
>     a Primary node crash.
>
>  6.3 argues against synchronous "write intent bitmap writes"
>
>  6.4 explains what we do to reliably keep track of the
>     target blocks of in-flight IO, while minimizing
>     the frequency of housekeeping meta data transactions.
>
>  it summarizes:
>     The number of slots in the activity-log is tunable
>  [this is the al-extents configuration parameter you asked for],
>  the trade-off: a larger activity log means less frequent meta data
>  transactions and less risk of latency spikes from activity-log
>  starvation, but also a longer minimum resync time after a
>  Primary crash.
>
>  say you have 2.5 TByte of storage.
>
>  |-- 2.5 TB of storage ---------------------------------------------|
>  |- 4MB -|- 4MB -|-~~ // ~~-|- 4MB -|- 4MB -|- 4MB -|- 4MB -|- 4MB -|
>    0         1 extent numbers ....   655356  655357  655358  655359
>
>  you have 655360 extents/areas/regions/whatever of 4MB each.
>  the al-extents parameter tunes how many of them may be the target of
>  in-flight writes at the same time.
>
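
The extent arithmetic above can be checked with a short Python sketch.
It assumes binary units (1 TB = 1024**4 bytes, 4MB = 4 * 1024**2 bytes),
which is what makes the 655360 figure come out exactly; the function
names are made up for illustration and are not part of DRBD.

```python
# Sketch only: units and names are assumptions, not DRBD code.
EXTENT_SIZE = 4 * 1024**2  # bytes covered by one activity-log extent (4 MiB)


def extent_count(device_bytes):
    """Number of 4 MiB extents needed to cover the device, rounded up."""
    return -(-device_bytes // EXTENT_SIZE)


def extent_of(byte_offset):
    """Extent number that a given byte offset on the device falls into."""
    return byte_offset // EXTENT_SIZE


def active_set_bytes(al_extents):
    """Amount of storage that can be 'hot' at once for a given al-extents."""
    return al_extents * EXTENT_SIZE


device = 5 * 1024**4 // 2              # the 2.5 TB example device
print(extent_count(device))            # 655360
print(extent_of(device - 1))           # 655359, the last extent in the diagram
print(active_set_bytes(257) / 1024**3) # roughly 1 GiB for al-extents = 257
```

The last line is where the "257 extents is good for an active set of
1 Gig" rule of thumb from the original question comes from.
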
>  you set al-extents = 7
>
>  you have a set of
>   [ 0, 1, 2, 17, 35, 270, 87 ] in the activity log.
>
>  you want to write to somewhere in extent 37, which is not in the al.
>  so you wait for any of the extents in the set to become unused,
>  take that extent out of the active set, put 37 into the active set,
>  and write that transaction to disk, which takes some time.
>  only after that has happened may the original write to extent 37
>  continue.
>
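
The al-extents = 7 walk-through above can be modelled with a toy Python
class.  This is a simplified sketch, not DRBD's implementation: it
assumes the extent to be replaced is the least recently used one, it
ignores the wait for that extent to become unused, and the class and
method names are invented for illustration.

```python
from collections import OrderedDict


class ActivityLog:
    """Toy model of the active set; eviction policy (LRU) is an assumption."""

    def __init__(self, al_extents, initial=()):
        self.al_extents = al_extents
        # Insertion order doubles as recency order: oldest entry first.
        self.active = OrderedDict((extent, None) for extent in initial)
        self.transactions = 0  # meta data transactions written so far

    def write(self, extent):
        """Return True if this write had to wait for a meta data transaction."""
        if extent in self.active:
            self.active.move_to_end(extent)  # already hot: no extra latency
            return False
        if len(self.active) >= self.al_extents:
            self.active.popitem(last=False)  # drop the least recently used extent
        self.active[extent] = None
        self.transactions += 1               # the slow part: update on-disk log
        return True


al = ActivityLog(7, [0, 1, 2, 17, 35, 270, 87])
print(al.write(35))       # False: extent 35 is already in the active set
print(al.write(37))       # True: 37 had to be swapped in first
print(sorted(al.active))  # [1, 2, 17, 35, 37, 87, 270]
```

With a larger al_extents the `transactions` counter grows more slowly
for the same write pattern, which is the latency side of the trade-off.
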
>  if a crashed primary comes up again, the active set at the time of the
>  crash will be reconstructed from these meta data transactions (the
>  on-disk activity log), and all bitmap bits corresponding to extents in
>  that active set will be marked dirty (reason given in section 6.2).
>
>  so again:
>   if you go up with al-extents, you have less frequent meta data
>  transactions, so your IO stream is interrupted less often and
>  you introduce this additional latency less frequently.
>  but if your primary crashes, it has to resync all extents that were
>  in the active set, so you have to resync more if the set is larger.
>
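
The resync side of the trade-off can be put into rough numbers.  The
Python sketch below assumes that every extent in the active set is
resynced in full after a Primary crash, that the "100M" rate from the
original question means 100 MiB/s, and that the rate is sustained; the
al-extents values other than 7 and 257 are arbitrary examples.

```python
# Back-of-the-envelope estimate only; all names here are illustrative.
EXTENT_SIZE = 4 * 1024**2  # 4 MiB per activity-log extent


def worst_case_resync_bytes(al_extents):
    """Data marked dirty if every extent in the active set must be resynced."""
    return al_extents * EXTENT_SIZE


def worst_case_resync_seconds(al_extents, rate_bytes_per_s):
    """Time to resync the whole active set at a constant sync rate."""
    return worst_case_resync_bytes(al_extents) / rate_bytes_per_s


rate = 100 * 1024**2  # assumed syncer rate: 100 MiB/s
for n in (7, 257, 1024):
    mib = worst_case_resync_bytes(n) // 1024**2
    secs = worst_case_resync_seconds(n, rate)
    print(f"al-extents={n}: up to {mib} MiB to resync, about {secs:.1f} s")
```

This only bounds the part of the resync caused by the activity log;
blocks already marked dirty in the bitmap for other reasons come on top.
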
>  does that help at all?
>
>  --
>
>
>  : Lars Ellenberg                           http://www.linbit.com :
>  : DRBD/HA support and consulting             sales at linbit.com :
>  : LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
>  : Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
>  __
>  please use the "List-Reply" function of your email client.
>

-- 
What profits a man if he gains the whole world yet loses his soul?