[DRBD-user] How does protocol A work across a WAN?

Wed Jul 4 00:33:10 CEST 2007

On Tue, Jul 03, 2007 at 12:19:14PM -0700, Chris de Vidal wrote:
> Thanks for the rapid reply!
> 
> --- Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
> > > Suppose then the primary node goes down before all of the recent changes could be copied to
> > > the secondary node.
> > > 
> > > Would the only losses to the secondary node be those recent few minutes?  Or might there be
> > > other, out-of-step changes that are lost as well?
> > > 
> > > Another way to ask the question: are changes run through a strictly first-in first-out (FIFO)
> > > process or could they be sent across the wire in random order for optimization?  Is the order
> > > of changes strictly in the order that they happened on the primary node?  Or is there some
> > > reordering for performance or some other reason?
> > 
> > the image on the secondary is always consistent, regardless of when the
> > connection is lost or the primary crashes (looks the same from this
> > perspective: no more requests from Primary).
> > (however, it is inconsistent during resynch, so keep that in mind.)
> > 
> > the only reordering taking place is the reordering in the local io stacks
> > below drbd. this is restricted to happen within reorder domains which are
> > bounded by "drbd barriers". drbd barriers are issued whenever the
> > primary relays an io-completion event to the upper layers (file system).
> > 
> > rationale: if the "user" (file system) submits several write requests
> > while some other write request is not yet completed, these are obviously
> > independent and may be reordered. any write request submitted after some
> > other request has been completed however may well be dependent on that
> > very completion. it follows that reordering may take place, but only
> > between completion events.
> 
> I'm not sure I understand.  Forgive me, I've not done as much low-level/network programming as you
> have, I'm just an admin.
> 
> Here's what I have in mind: database replication over slow, inexpensive WAN links.  For example,
> T1s or even cablemodem/DSL.
> 
> I've read lots of warnings which recommend I only use protocol C for databases, because if you run
> protocol A you run a risk of completing a transaction locally that doesn't complete on the remote
> copy.  If you have a local crash and you bring up the remote copy the transaction that was
> supposed to have been completed is actually rolled back.
> 
> I understand why that would be a Bad Thing (tm).
> 
> However, I don't see how this loss is any more significant than running snapshot backups on a
> database.  If you can accept that kind of risk then DRBD/protocol A is sufficient.

[snip]

> Many, many small businesses accept the risk of using periodic snapshot backups of databases (such
> as ours), and it seems to me it's the same level of risk for DRBD/protocol A, unless there's
> reordering going on somewhere.
> 
> 
> So... does protocol A send writes across the wire on a first-in/first-out basis?

"sort of." read my answer above again.
if it still does not make sense, take it as a "yes".
 :)
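
to make the "sort of" more concrete, here is a toy model of the reorder
domains described above. it is only a sketch, not drbd code: the BARRIER
marker, the write ids and all function names are made up for illustration.
writes between two barriers form one domain and may be applied in any
order; nothing may cross a barrier.

```python
import random

BARRIER = "barrier"  # hypothetical marker, not a DRBD identifier


def split_into_epochs(stream):
    """Group write ids into reorder domains separated by BARRIER markers."""
    epochs, current = [], []
    for item in stream:
        if item == BARRIER:
            if current:
                epochs.append(current)
                current = []
        else:
            current.append(item)
    if current:
        epochs.append(current)
    return epochs


def apply_on_secondary(stream, rng):
    """Return one permissible apply order: shuffle inside each domain only."""
    applied = []
    for epoch in split_into_epochs(stream):
        shuffled = epoch[:]
        rng.shuffle(shuffled)
        applied.extend(shuffled)
    return applied


def respects_barriers(stream, applied):
    """True if no write was applied before a write from an earlier domain."""
    epoch_of = {}
    for n, epoch in enumerate(split_into_epochs(stream)):
        for write in epoch:
            epoch_of[write] = n
    seen = [epoch_of[write] for write in applied]
    return seen == sorted(seen)


# w1..w3 were in flight together; w4 was submitted after a completion event.
stream = ["w1", "w2", "w3", BARRIER, "w4", "w5", BARRIER, "w6"]
order = apply_on_secondary(stream, random.Random(0))
print(order, respects_barriers(stream, order))
```

in this model w2 may land before w1, but w4 can never land before any of
w1..w3. from the file system's point of view that is as good as FIFO.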

keep in mind that protocol A does not have as huge a backlog as you
seem to think.  it does not buy you more bandwidth, it only helps to
reduce the negative effects of peak latency, and smooths out small
bursty write operations.

any write operation reaching "sustained" rates
will be throttled to "sustainable" bandwidth.
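
the throttling can be illustrated with a small simulation. again a sketch
under assumptions, not how drbd is implemented: the fixed-size buffer, the
tick-based clock and the names simulate, write_rate, link_rate and
buffer_size are all hypothetical. the writer is only ever allowed to fill
whatever space is left in the buffer, and the link drains it at a fixed rate.

```python
def simulate(write_rate, link_rate, buffer_size, ticks):
    """Per tick, return how many units the writer was allowed to submit."""
    backlog = 0
    accepted_per_tick = []
    for _ in range(ticks):
        # The link first drains what it can from the backlog.
        backlog -= min(backlog, link_rate)
        # The writer may only fill the remaining buffer space.
        accepted = min(write_rate, buffer_size - backlog)
        backlog += accepted
        accepted_per_tick.append(accepted)
    return accepted_per_tick


# Sustained writes at 10 units/tick over a link that carries 4 units/tick:
print(simulate(write_rate=10, link_rate=4, buffer_size=20, ticks=6))
# -> [10, 10, 8, 4, 4, 4]

# Writes below the link rate are never throttled:
print(simulate(write_rate=3, link_rate=4, buffer_size=20, ticks=5))
# -> [3, 3, 3, 3, 3]
```

the first run absorbs the burst until the buffer is full, then the writer
is held to the link rate. the buffer bounds how far the secondary can lag;
it does not add bandwidth.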

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
__
please use the "List-Reply" function of your email client.