[DRBD-user] use drbd on wan

Wed Sep 21 13:47:51 CEST 2011

[Sorry, seems like my first post didn't make it to the list: resending]

Dear Mia Lueng,

We've been in this very situation for 6 months so I think I can answer
most of your questions:

> (...) we have considered  the following solution:  On secondary node,
> run disconnect and connect command in sequence . Diconnect drbd0 ,
> wait 5 minutes (or longer time), connect drbd0 again, and after sync
> complete , disconnect drbd0 again.

Good idea I think. We tried that and it proved much harder to get
working reliably than I expected. Let me tell you the story.
We went so far as to write a script to watch the level of the
kernel's write cache, and automatically disconnect the resource when too
many writes are waiting. It works and we can share it if you want, but
read on.
Eventually, we gave up this idea because the data in write cache was
lost each time the primary crashed, which was unfortunately often.
So we upgraded to 8.3.11, which has a congestion management feature to
do even better: the secondary node falls behind when the link is too
slow, and catches up when writes calm down. But there were problems too,
like frequent full-syncs forced on us. Our setup is complex and it
probably hit a race condition or something.
So by all means, if you have the money consider using drbd-proxy. It's a
bit expensive but it's probably the best solution. It will give you a
big RAM buffer to handle short periods of heavy writes, and should work
great with the congestion management. You can ask for a free trial.
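
For readers who want to see the shape of the write-cache watcher idea,
here is a minimal sketch. It is not the script we actually ran. It
assumes that "writes waiting" can be approximated by the Dirty and
Writeback counters in /proc/meminfo, that the resource is named drbd0 as
in your mail, and the 200 MB threshold is a made-up value to tune.

```python
import re
import subprocess

# Hypothetical threshold: disconnect when more than this many kB are waiting.
MAX_PENDING_KB = 200 * 1024


def pending_write_kb(meminfo_text):
    """Sum the Dirty and Writeback counters (in kB) found in /proc/meminfo text."""
    total = 0
    for name in ("Dirty", "Writeback"):
        match = re.search(r"^%s:\s+(\d+)\s+kB" % name, meminfo_text, re.MULTILINE)
        if match:
            total += int(match.group(1))
    return total


def should_disconnect(meminfo_text, limit_kb=MAX_PENDING_KB):
    """True when the pending writes exceed the (assumed) limit."""
    return pending_write_kb(meminfo_text) > limit_kb


def check_once(resource="drbd0", meminfo_path="/proc/meminfo", run=subprocess.call):
    """Read the counters once and disconnect the resource if needed."""
    with open(meminfo_path) as handle:
        text = handle.read()
    if should_disconnect(text):
        run(["drbdadm", "disconnect", resource])
        return True
    return False
```

Called from cron or a small loop, check_once() only disconnects. Deciding
when to reconnect is left out on purpose, and as said above the data
still sitting in the write cache is lost if the primary crashes.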

If you stick to your initial idea, I'd advise you to stay disconnected
most of the time, and connect during the hours when write activity is
lowest. This way you won't slow your primary down too much and you have
essentially no risk of forcing a full-sync.
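
A minimal sketch of that "connect only during quiet hours" policy,
assuming a window of 01:00 to 05:00 (an invented example, pick your own
from your write statistics) and a resource named drbd0:

```python
from datetime import time


def in_window(now, start, end):
    """True if `now` falls in [start, end), including windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def wanted_action(now, start=time(1, 0), end=time(5, 0)):
    """Return the drbdadm sub-command that matches the time of day."""
    return "connect" if in_window(now, start, end) else "disconnect"
```

A cron job could then run something like
subprocess.call(["drbdadm", wanted_action(datetime.now().time()), "drbd0"]).
A real script should also check that the resync has finished before it
disconnects at the end of the window.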

> But we are wondering the follow
> issues:
>
> 1. How can we confirm that each sychorinzation after connecting is a
> quick sync and not a full sync? And how to tune it to be that.
Quick sync, no tuning needed. drbd is very smartly done.
However be warned that the connect/disconnect code probably still has
race conditions and I suppose you will trigger them eventually if you
disconnect/reconnect every 5 minutes. We had some full syncs and several
complete lockups.
In our experience, the congestion management had full sync problems too
when used without DRBD-proxy, but at least the nodes didn't hang.
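
If you want to watch for unexpected full syncs, one rough way is to look
at the out-of-sync counter (oos, in KB) in /proc/drbd right after
connecting. The sketch below assumes the 8.3-style /proc/drbd layout and
uses a heuristic of my own (oos close to the device size means a full
sync). It is not an official DRBD check, and the 95% ratio is arbitrary.

```python
import re


def parse_proc_drbd(text, minor=0):
    """Return connection state, disk states and oos (KB) for one minor, or None."""
    pattern = re.compile(
        r"^\s*%d:\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+).*?oos:(\d+)" % minor,
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(text)
    if match is None:
        return None
    return {
        "cs": match.group(1),
        "ro": match.group(2),
        "ds": match.group(3),
        "oos_kb": int(match.group(4)),
    }


def looks_like_full_sync(oos_kb, device_kb, ratio=0.95):
    """Heuristic: nearly the whole device is marked out of sync."""
    return oos_kb >= ratio * device_kb
```

With a 10 GB device, an oos of a few tens of MB after reconnecting would
be a quick sync, while an oos near 10 GB would mean a full sync.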

> 2. the secondary node disconnect drbd0 only when the cstate is
> Connected and dstate is UptoDate. Can  the data integration of db data
> be guaranteed?  In other words, If the primary node crash at this
> moment,  the oracle db can be started on the secondary node?  And in
> quich sync , Is the data written on secondary node  in the same order
> as  it written on primary node ?
No! During syncs, the secondary node *will be inconsistent* by design.
It's less of a problem if you make a snapshot of the secondary before the
sync. DRBD is distributed with an example script to do that, which works
great in v8.3.10+. The recovery may be complex but it can be done.
Document the procedures and you'll be fine.
And no, the blocks will *not be written in the same order*.
If you really need that you must stay connected, with the right protocol
and the right barriers and flushes configured, and it will be very slow.
But really, few people need it.

On a side note, DRBD lets you run primary with an inconsistent local disk
if the secondary is reachable (reads will be fetched over the network).
It's amazing and precious when you have system maintenance to do.

> 3. is the activity log size tuning helpful for this situation?
>
I had no time to try, but I guess it would. Why not build a prototype and
experiment?

Last piece of advice: DRBD needs the available bandwidth to be reliable.
You will have weird problems if you ask for 2 MB/s and suddenly only
0.5 MB/s is available because the line is busy with something else.
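
To get a feel for the numbers, here is a back-of-the-envelope
calculation of how long the secondary needs to catch up. The figures
(600 MB backlog, 0.5 MB/s of new writes) are invented for illustration,
and the model ignores protocol overhead and DRBD's own rate limits.

```python
def catch_up_seconds(backlog_mb, link_mb_s, write_mb_s):
    """Seconds needed to drain the backlog, or None if the link never catches up."""
    spare = link_mb_s - write_mb_s
    if spare <= 0:
        return None
    return backlog_mb / spare


# With the full 2 MB/s available, a 600 MB backlog drains in 400 s.
full_link = catch_up_seconds(600, 2.0, 0.5)
# With only 0.5 MB/s left, the backlog never shrinks.
busy_link = catch_up_seconds(600, 0.5, 0.5)
```

So the same backlog that clears in under 7 minutes on the full link
never clears at all once the line drops to 0.5 MB/s.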

Yours,
Lionel Sausin.