[DRBD-user] Resources for learning how to use DRBD as Primary/Primary?

Wed Nov 21 05:00:07 CET 2007

I'm putting the two ends together again here,
because the first nicely explains what you have in mind,
the second provides a better opportunity to explain some things.

On Tue, Nov 20, 2007 at 06:17:34PM -0600, D. Dante Lorenso wrote:
> Florian Haas wrote:
> >Geographic redundancy in Primary/Primary mode? Forget that. What are you 
> >trying to do, run an OCFS2 cluster coast to coast?
> >I'm puzzled, to say the least. But always eager to learn. :-)
> 
> It's not really about redundancy in the sense of having a hot fail-over, 
> but more about having a live copy of data at a remote location.  I'd 
> like both locations to feel they have their own local file storage while 
> in effect sharing that storage remotely.
> 
> For a more specific example:
> 
>    I have a collocation facility in Austin, TX and another in Dallas, TX
>    and want to be able to write a file locally in Austin and have that
>    file also be written in Dallas and immediately be readable through an
>    NFS exported file system.
> 
> In my lab tests, I have DRBD running to mirror one file server to the 
> other, but I can only mount the /dev/drbd0 device on the Primary node 
> and the Secondary node is not usable for anything other than fail-over. 
>  I can't even use the secondary to perform backups to tape.

you can: if you have drbd on top of lvm, snapshot the lower level lv
on the secondary, and do the backup from the snapshot.
to get a consistent backup, you want to take it from a snapshot anyway.
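
a rough sketch of that backup sequence, written as a python helper that only builds the commands and prints them (it does not run anything). the volume group "vg0", the lv "drbd_backing", the snapshot name and size, the mount point and the archive path are all made-up placeholders. it also assumes the file system starts at the beginning of the backing lv and that your file system mounts a snapshot read-only without extra options; adjust for your setup.

```python
import shlex


def snapshot_backup_commands(vg, lv, snap_name="drbd_backup_snap",
                             snap_size="1G", mountpoint="/mnt/backup_snap",
                             archive="/backup/data.tar"):
    """Return the command sequence for a snapshot-based backup.

    All names and sizes are illustrative placeholders, not recommendations.
    """
    origin = f"/dev/{vg}/{lv}"        # the LV underneath the DRBD device
    snap = f"/dev/{vg}/{snap_name}"   # the temporary snapshot LV
    return [
        # 1. freeze a point-in-time view of the backing LV
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # 2. mount the snapshot read-only
        ["mount", "-o", "ro", snap, mountpoint],
        # 3. back up from the snapshot, not from the live device
        ["tar", "-cf", archive, "-C", mountpoint, "."],
        # 4. clean up
        ["umount", mountpoint],
        ["lvremove", "-f", snap],
    ]


# Print the commands for review instead of executing them.
for cmd in snapshot_backup_commands("vg0", "drbd_backing"):
    print(shlex.join(cmd))
```

the snapshot has to be large enough to absorb everything that gets written to the origin lv while the backup runs.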

> I don't mind having a master/slave relationship if say node1 was the 
> Primary and could be mounted read/write while node2 was the secondary 
> and could be mounted as read-only.  I think I'd prefer even better to 
> have a Primary/Primary setup where I could write files from either 
> location and have them written to both simultaneously.

the problem is the same,
in both scenarios you'd need a cluster file system.

> My connection between Dallas and Austin is only a few short hops over 
> the internet and latency/speed is not as important as the abstraction of 
> the transfer/storage.  This system is 99.99% read intensive and the 
> write traffic is quite low.

On Tue, Nov 20, 2007 at 06:34:26PM -0600, D. Dante Lorenso wrote:
> Lars Ellenberg wrote:
> >what is your understanding of "Primary/Primary"?
> 
> If I have 2 nodes, I'd like to use either node and have changes 
> reflected on both.

well, yes.
and you want to have it reflected "immediately".

> >  do you want to use a cluster file system on top of that?
> >  or do you want to have each site active for *its set* of shares,
> >  and be the backup only for the respective other site's set?
> 
> I'm exploring OCFS2, GFS, and Lustre in addition to my DRBD testing.

you'd always need a shared disk,
and you can't have one physical disk at both sites.

well, yes, you can have the equivalent, using synchronous replication instead;
that is why you want a cluster file system on top of DRBD.
but a cluster file system requires strictly synchronous replication.

> >what does "geographically remote" mean in link latency and bandwidth?
> >  30 km FO dedicated 1Gbps low latency feeling for all practical
> >  purposes almost like a lan?
> >  3000 km 1Mbps flaky high latency bell wire?
> 
> Regular internet bursting to 100 Mbps at each location, but likely to be 
> throttled to just 1-10 Mbps for this purpose.
>
> >what is "a file repository"? what kind of files?
> 
> Mostly media files audio/video/telecom.

you say above write traffic is quite low.
can't really believe that with "media files".

for any write (probably streaming),
your write throughput will be throttled to the link bandwidth.
1 - 10 Mbps is about what an old cdrom drive was capable of:
1 GB @ 5 Mbps -->  about half an hour.
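
the arithmetic behind that estimate, as a small python sketch. it assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead, so real transfers would be somewhat slower.

```python
def transfer_seconds(size_bytes, link_mbps):
    """Time to push size_bytes through a link of link_mbps (decimal units)."""
    bits = size_bytes * 8
    return bits / (link_mbps * 1_000_000)


GB = 10**9

# The 1 - 10 Mbps range mentioned for the throttled link.
for mbps in (1, 5, 10):
    minutes = transfer_seconds(GB, mbps) / 60
    print(f"1 GB @ {mbps:>2} Mbps: {minutes:6.1f} min")
```

at 5 Mbps this comes to 1600 seconds, i.e. a bit under 27 minutes.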

and this limit on write throughput also affects
what you consider to be a local write,
because it is not local anymore: it is replicated synchronously.

I doubt that such a write rate is acceptable
for media applications nowadays.

the other problem with cluster file systems and latency is that
they need a distributed lock manager, and they need to bounce
locks back and forth between nodes, even when one node is mounted
read-only (some optimizations can be made),
e.g. when one is creating/modifying some files,
and the other is reading the directory those files live in.
more so when both are able to modify (mounted read-write: even if they
don't actually write, each must still assume that the other
_could_ start a write at any time, thus they have to do even more locking).

and cluster file systems do not expect a lock passing to take 200 ms,
they expect that to be <0.1 ms.
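
to put rough numbers on that, a python sketch under a deliberately simple model: every operation needs exactly one lock handoff, and handoffs are strictly serialized. that model is an assumption for illustration only; real lock managers cache and batch. the 200 ms and 0.1 ms figures are the ones above.

```python
def max_lock_handoffs_per_second(round_trip_ms):
    """Upper bound on strictly serialized lock handoffs per second."""
    return 1000.0 / round_trip_ms


def seconds_for_operations(n_ops, round_trip_ms, handoffs_per_op=1):
    """Time spent only waiting for lock handoffs for n_ops operations."""
    return n_ops * handoffs_per_op * round_trip_ms / 1000.0


for label, rtt_ms in (("WAN, 200 ms", 200.0), ("LAN, 0.1 ms", 0.1)):
    rate = max_lock_handoffs_per_second(rtt_ms)
    wait = seconds_for_operations(1000, rtt_ms)
    print(f"{label}: {rate:8.0f} handoffs/s, "
          f"1000 operations wait {wait:7.1f} s")
```

under this model, 1000 contended operations wait 200 seconds on the 200 ms link, against a tenth of a second at 0.1 ms.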

> > why would you need synchronous replication, why would rsync not do?
> 
> Currently rsync does the job.  With synchronous replication, though, I'd 
> be able to write a file locally then trigger a remote job to begin using 
> that file immediately.  Rsync works but is slow and we are syncing such 
> a large number of files that the sync consumes a great deal of resources 
> on both the sending and receiving sides.  I've also been meaning to look 
> into csync2.

you should :)
it basically does the same as rsync,
only it takes much less resources and time to build the file list.
though it was not designed with network latency in mind,
so there may be room for improvement.

> >what is your expected storage size?
> > what is your estimated average write rate?
> > what is your estimated peak write rate?
> 
> I'm at 6 TB now.  We have been doubling storage each 6-12 months.  The 4 
> TB limit on DRBD makes me look at DRBD+, but the licensing costs there 
> have us concerned.
> 
> I see that the drbd.org site says: "Since DRBD-8.0.0 you can run both 
> nodes in the primary role, enabling to mount a cluster file system (a 
> physical parallel file system) one both nodes concurrently. Examples for 
> such file systems are OCFS2 and GFS."
> 
> I was hoping there was a nice tutorial/howto on specifically this type 
> of Primary/Primary configuration.

even if there was, it is not the right tool for the job.

cluster file systems will not work reliably over low-bandwidth,
high-latency, presumably flaky WAN links.

-- 
: commercial DRBD/HA support and consulting: sales at linbit.com :
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :