[DRBD-user] primary/primary with OCFS2 for the first time, questions

Thu Mar 27 14:17:29 CET 2008

On Fri, Mar 21, 2008 at 01:11:55PM +0100, Martin Gombac wrote:
> Hi DRBD users,
>
> I've been using DRBD since version 0.6. I have always set up fail-over
> clusters with primary/secondary and usually 2 DRBD resources, each running
> on one node as primary.
> Now I'm thinking it's time to try primary/primary for the first time. I
> would use OCFS2 and Gentoo.
>
> I wonder how OCFS2 or DRBD 8.0.* deals with a failed replication
> link when the same file is accessed.
> I mean, in primary/secondary, if the replication link went down, no data
> was written on the secondary. If for some strange reason the secondary
> resource became primary and was mounted and we got split brain, the solution
> was pretty straightforward. Data got (automatically or manually) replicated
> back from the first primary node or the node with the most current data, on a
> per-resource basis. r1 could be replicated one way, while r0 went the other way.
>
> But what happens when I have only one resource in primary/primary and the
> same file gets written to or created on both nodes with different content
> while the resource is disconnected?
>
> I cannot say sync from first to second node, since each node has some of the
> data correct and doing this either way would make me lose half of the
> data. Am I not understanding something here, or is this expected? Maybe
> point me to some manual which explains DRBD in primary/primary mode. I
> don't think I can find this info on:
> http://www.drbd.org/users-guide/index.html

drbd cannot do a "three way merge" of arbitrary file system data.
that is just not possible.

so, yes, if you have diverging data sets, the only thing drbd can offer
is to take either version, and sync up the other one to be identical again
with the one you chose.
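
(editor's sketch, not part of the original reply: the choice of which
version survives can be set ahead of time with the after-split-brain
policies in the net section of drbd.conf. the fragment below assumes
DRBD 8.0 syntax and a resource named r0; the policy values are
illustrative picks, not a recommendation. whichever side is discarded
loses its changes, nothing is merged.)

```
resource r0 {
  net {
    # required for primary/primary operation with a cluster file system
    allow-two-primaries;

    # split brain detected while neither node is primary:
    # keep the side that changed more blocks, overwrite the other
    after-sb-0pri discard-least-changes;

    # one or both nodes are primary: do not resolve automatically,
    # stay disconnected and leave the decision to the admin
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
  }
}
```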

for a replicated resource,
avoiding diverging data sets is not always easy.

drbd helps you by offering a lot of handlers that get triggered for
various events. one possible brutal but effective way would be to set
the outdate-peer handler to something that either hard-reboots this node
(self-fencing, SMITH), or hard-reboots the other node (STONITH).
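
(editor's sketch, not part of the original reply: wiring up such a
handler could look roughly like this, assuming DRBD 8.0 syntax and a
resource named r0. the script path is hypothetical; the script would
have to implement the actual reboot or STONITH call and is not
something DRBD ships under that name.)

```
resource r0 {
  disk {
    # on loss of the replication link, freeze I/O on a primary
    # and call the outdate-peer handler before continuing
    fencing resource-and-stonith;
  }
  handlers {
    # hypothetical site-specific script: either reboots this node
    # (self-fencing) or power-cycles the peer (STONITH)
    outdate-peer "/usr/local/sbin/fence-drbd-peer.sh";
  }
}
```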

productively using DRBD in Primary/Primary Cluster FS mode is,
while possible, still cumbersome in the face of connection errors.

we (LINBIT) will eventually put out some "best practices and caveats"
guide for DRBD and cluster file systems, but that will take some time.
I currently cannot estimate when we will get that one done.

-- 
: Lars Ellenberg                           http://www.linbit.com :
: DRBD/HA support and consulting             sales at linbit.com :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
: Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
__
please use the "List-Reply" function of your email client.