[DRBD-user] DRBD Dual Primary (writable/writeble) setup over VDSL WAN links

Lars Ellenberg lars.ellenberg at linbit.com
Tue Oct 27 17:26:20 CET 2015

On Tue, Oct 27, 2015 at 04:51:54PM +0100, Lorenzo Milesi wrote:
> > No, it is not feasible.
> out of curiosity, wasn't drbd-proxy [1] designed for this purpose? or was it just for active/passive?
> thanks

> [1] https://drbd.linbit.com/users-guide/s-drbd-proxy.html

For "Dual-Primary" (Cluster File System use case),
you have to use *synchronous* replication.

So apart from possibly trading CPU cycles (compression) against
available WAN bandwidth, DRBD proxy cannot help you here.

The issue is the high latency of WAN links,
and the distributed locks and other dances
any cluster file system has to do to provide cache-coherency.

Think of a single file "stat" operation.  With a network latency of
~30ms between cluster nodes, you will be able to do fewer than about 30
random (meaning: no directory leases or other shortcuts; I'm sorry if I'm
not using the proper GFS2 vocabulary) stats per second, best case.
And that's already "cache hot".
(compare with a "normal" file system, where you can do a million stats
per second and more, cache hot)
It gets worse for file open/create/remove, or any other operation
where the file system needs to acquire a distributed lock.
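
To put rough numbers on the stat example: if every uncached lock
acquisition has to wait for the network, the achievable rate is bounded
by 1/latency.  A minimal sketch, assuming the ~30ms figure is the full
round-trip time and that each operation needs a fixed number of
strictly serialized round trips (both are simplifications; the function
name is made up for illustration and is not part of DRBD or GFS2):

```python
def max_ops_per_second(latency_ms: float, round_trips: int = 1) -> float:
    """Upper bound on serialized operations per second.

    Assumption: each operation blocks for `round_trips` network round
    trips of `latency_ms` each, and nothing is pipelined or cached.
    """
    if latency_ms <= 0 or round_trips < 1:
        raise ValueError("latency_ms must be > 0 and round_trips >= 1")
    return 1000.0 / (latency_ms * round_trips)


if __name__ == "__main__":
    # Illustrative latencies: a LAN-like 0.1 ms versus a WAN-like 30 ms.
    for latency_ms in (0.1, 30.0):
        for trips in (1, 2):
            rate = max_ops_per_second(latency_ms, trips)
            print(f"{latency_ms:5.1f} ms x {trips} round trip(s): "
                  f"at most {rate:8.1f} ops/s")
```

One round trip at 30ms caps out at roughly 33 operations per second;
any lock protocol that needs a second round trip halves that.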

Then consider any network "hiccup", which would require you to
 * freeze IO (even on the filesystem level already)
 * reliably fence the other node
    (good luck with this via unreliable WAN)
 * recover other node's journal
 * resume
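
The ordering of those steps matters: IO must stay frozen until fencing
is confirmed.  A toy sketch of that sequence, assuming a hypothetical
`fence_peer` callback that returns True only when the other node is
known to be down (this is not how any real cluster manager is
implemented, it only illustrates the dependency between the steps):

```python
from typing import Callable, List


def handle_peer_loss(fence_peer: Callable[[], bool], log: List[str]) -> bool:
    """Walk through the recovery steps; return True if IO was resumed.

    `fence_peer` is a hypothetical callback standing in for whatever
    fencing mechanism the cluster uses.
    """
    log.append("freeze IO")
    if not fence_peer():
        # Without confirmed fencing it is not safe to touch the peer's
        # journal, so the file system has to stay frozen.
        log.append("fencing failed: IO stays frozen")
        return False
    log.append("recover peer journal")
    log.append("resume IO")
    return True


if __name__ == "__main__":
    for outcome in (True, False):
        steps: List[str] = []
        resumed = handle_peer_loss(lambda: outcome, steps)
        print(f"fencing ok={outcome}: resumed={resumed}, steps={steps}")
```

With an unreliable WAN link, the "fencing failed" branch is the one you
would have to expect regularly.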

It sure could potentially be made to "work".

But performance-wise, you'd be better off burning and Snail-Mailing
CD-ROMs back and forth...  And that's not because of DRBD,
but because of the cluster file system locking latencies.
Also, using cluster file systems does not necessarily increase
reliability or availability ;-)

Depending on the exact requirements and goals,
there are other methods to synchronize data between
branch-offices or similar.

But "cluster file systems" (as in GFS2 or OCFS2) are made
for environments with reliable low-latency LAN
(and require reliable fencing).

: Lars Ellenberg
: http://www.LINBIT.com | Your Way to High Availability
: DRBD, Linux-HA  and  Pacemaker support and consulting

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
please don't Cc me, but send to list   --   I'm subscribed
