[DRBD-user] Reasons not to use allow-two-primaries with DRBD

Digimer lists at alteeve.ca
Sat May 19 23:30:22 CEST 2012

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On 05/19/2012 01:54 PM, Arnold Krille wrote:
> On Friday 18 May 2012 18:29:11 karel04 at gmail.com wrote:
>> I am in the process of setting up DRBD on my servers, the network
>> bandwidth being the bottleneck.  After having evaluated GlusterFS I
>> realised, that I need the instant read access offered by DRBD.
>>
>> Logically I am able to separate partitions that would require access
>> from both nodes, and partitions where an asynchronous master-slave
>> sync is sufficient.  But as far as I understand, the benefits from
>> using Protocol A instead of C are limited, when the network is stable.
>>
>> My question:
>> Are there any additional benefits from NOT using two primaries or
>> additional risks when using it? E.g., would there be a significant
>> performance gain by using ext4 instead of GFS2/OCFS2? Anything else I
>> should take into consideration?
> 
> A) There is a huge performance gain from using extX over gfs2/ocfs2,
> especially when you implement the latter wrong or incompletely, as I did :-(
> Skipping fencing basically kills gfs2 and ocfs2, and fencing was something
> I didn't want to set up, for a variety of reasons.
> B) There is a huge latency improvement when using protocol A or B over C.
> The docs say that you lose reliability unless the second machine (and the
> switches in between) have battery backup, which they should have anyway,
> unless you use one UPS for everything.
> C) There is a huge administration gain when using "simple" single-primary
> and a traditional fs.
> 
> My two cents...
> 
> Have fun,
> 
> Arnold
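
  (A quick note on Arnold's point B: the replication protocol is a
per-resource setting in drbd.conf. A minimal sketch, resource name
hypothetical; DRBD 8.3 puts it at the resource level, newer releases
move it into the net{} section:

    resource r0 {
      protocol C;   # A = asynchronous, B = memory-synchronous, C = synchronous
      ...
    }

Also note that dual-primary requires protocol C, so A and B are only an
option for single-primary resources.)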

To expand on this:

  The main reason dual-primary is not recommended is that it is hard to
do right, not that there is any inherent technical limitation.

  You *should* have fencing (aka 'stonith') with DRBD in any case, but
you absolutely *must* have it for dual-primary, or you will end up with
a split-brain and likely data loss. Split-brain can still happen with
single-primary if a secondary promotes while the real primary is still
alive but unable to talk to its peer. Really, *always* use fencing.
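
  To illustrate, the DRBD side of fencing looks something like this.
This is a sketch only, assuming a Pacemaker cluster underneath; the
handler scripts ship with DRBD, though paths can differ by distro, and
the section the 'fencing' option lives in varies between DRBD versions:

    resource r0 {
      disk {
        fencing resource-and-stonith;
      }
      handlers {
        fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
      }
      ...
    }

With 'resource-and-stonith', DRBD suspends I/O and calls the fence-peer
handler when it loses its peer, and only resumes once the peer has been
fenced. The cluster's stonith devices still have to be configured for
this to actually do anything.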

  Clustered file systems require some form of clustered locking. GFS2
uses the DLM; I'm not sure what OCFS2 uses. Regardless, this means that
locks have to be coordinated between nodes, which is expensive compared
to the local locking found in extX and other FSs. So yes, GFS2/OCFS2 is
slower.
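
  The lock manager is baked in when the file system is created. A GFS2
sketch, with hypothetical cluster and file system names (the lock table
must be <clustername>:<fsname>, and you want one journal per node):

    mkfs.gfs2 -p lock_dlm -t mycluster:shared0 -j 2 /dev/drbd0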

  In turn, speaking from my experience with GFS2, clustered locking
requires a proper cluster. This is inherently simple technology, but it
can be complex to learn at first because there is no real "hello,
world!" equivalent. You have to get a few things right before you get
off the ground at all, so it takes some patience. Once you "get it",
though, you're off to the races.
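
  As a rough illustration only, assuming a RHEL 6 style stack where cman
provides membership, fencing and the DLM, and a DRBD resource 'r0' that
is already configured and synced, the bring-up on each node looks
something like:

    service cman start      # cluster membership, fencing and the DLM
    drbdadm up r0           # attach the backing disk, connect to the peer
    drbdadm primary r0      # promote; on both nodes for dual-primary
    mount -t gfs2 /dev/drbd0 /shared

Each step has prerequisites (cluster.conf, fence devices, the DRBD
resource file, an existing GFS2 file system), which is exactly the "few
things to get right" mentioned above.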

  As a final comment: if you need simultaneous read and/or write access
to data, and you need your data in two places at all times, DRBD is the
answer and you will need a clustered file system. So depending on your
use case, you may not really find an alternative.
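
  For completeness, a sketch of what a dual-primary DRBD resource can
look like. This is DRBD 8.3 style syntax; hostnames, devices and
addresses are made up, and if a cluster manager controls DRBD you would
normally let it handle promotion instead of 'become-primary-on':

    resource r0 {
      protocol C;            # dual-primary requires synchronous replication
      net {
        allow-two-primaries;
      }
      startup {
        become-primary-on both;
      }
      on node1.example.com {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
      }
      on node2.example.com {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
      }
    }

The fencing pieces shown earlier belong in here as well; do not enable
allow-two-primaries without them.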

Cheers!

-- 
Digimer
Papers and Projects: https://alteeve.com


