[DRBD-user] DRBD with CentOS in Production?

Wed Aug 14 20:54:56 CEST 2013

On 14/08/13 10:58, Christian Völker wrote:
> Hi all,
>
> I'm planning to use DRBD in a production environment. I prefer to use
> CentOS as base system.
>
> The reason to use drbd is the synchronisation, not the high availability.
>
> We'll have two locations connected through a 100Mbit line. On both
> locations users will access the data at the same time. So I know I have
> to use a cluster aware filesystem.
>
> I'm a little bit unsure about the performance; of course it will slow
> down all access to the device which might be secondary. But are there
> any tweaks to improve the performance despite the slow 100 Mbit
> connection?
>
> So questions are:
> Is CentOS6 with DRBD suitable for production use?
> Which filesystem is recommended? GFS? ZFS (experimental?)?
>
> Thanks & Greetings
>
> Christian

First, the short answer: yes, DRBD on CentOS 6 is perfectly stable. I've
used 8.3.{11~15} on CentOS 6.{0~4} in production without issue. I also
use GFS2 partitions on all my clusters without issue.

If you want both locations to have simultaneous access to the storage /
filesystem, then you need a cluster aware filesystem and you need to run
DRBD in dual-primary. This, in turn, requires the use of "Protocol C",
which means that DRBD will not tell the caller that the write was
completed until it has hit persistent storage on both nodes. Effectively,
this limits your storage performance to the speed and latency of your
network link. Across a 100 Mbit link, this means that your raw write
speeds will never exceed ~11-12 MB/sec. Each write's latency will also be
at least the network link's latency plus the storage latency.
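
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 0.94 efficiency factor (protocol overhead on the wire) and the example latencies are assumptions picked for illustration, not measurements from any real link, and the function names are made up for this example.

```python
def max_write_throughput_mb_s(link_mbit: float, efficiency: float = 1.0) -> float:
    """Upper bound on replicated write speed in MB/s (decimal megabytes).

    link_mbit  -- nominal link speed in Mbit/s
    efficiency -- assumed fraction of the link left after protocol overhead
    """
    return link_mbit / 8 * efficiency


def min_write_latency_ms(network_ms: float, storage_ms: float) -> float:
    """Lower bound on one synchronous write: network latency plus storage latency."""
    return network_ms + storage_ms


# Theoretical ceiling of a 100 Mbit link with no overhead at all.
raw = max_write_throughput_mb_s(100)

# Same link with an assumed 6% lost to protocol overhead.
usable = max_write_throughput_mb_s(100, efficiency=0.94)

# Assumed example: 20 ms across the WAN, 5 ms for the remote disk.
latency = min_write_latency_ms(network_ms=20, storage_ms=5)

print(f"raw ceiling:    {raw:.2f} MB/s")
print(f"usable ceiling: {usable:.2f} MB/s")
print(f"write latency:  {latency:.0f} ms or more")
```

Plug in your own measured link latency and disk latency; the point is only that the link, not the disks, sets the ceiling.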

Performance will not be stellar.

What you're proposing is called a "stretch cluster" and it's notoriously 
hard to do well.

There is a further complication to your plan though: it will be nearly
impossible to differentiate a broken link from a failed remote server. 
So your network link becomes a single point of failure... If the link 
breaks, both nodes will block and call a fence against their peer. The 
fence will fail because the link to the fence device is lost, so the 
nodes will remain blocked and your storage will hang (better to hang 
than to risk corruption). The fence actions will remain pending for 
however long it takes to repair the link, and then both will try to 
fence the other at the exact same time. There is a chance that, post 
network repair, both nodes will get fenced and you will have to manually 
boot the nodes back up.
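
The reason a node can't tell the two failures apart can be shown with a toy model. This is only an illustrative sketch in Python, not how DRBD or the cluster stack is implemented; `heartbeat_seen` is a hypothetical helper that reduces a node's whole view of its peer to "do I still hear from it?".

```python
from itertools import product


def heartbeat_seen(link_up: bool, peer_alive: bool) -> bool:
    """What the local node can observe: it hears the peer only if the
    link works AND the peer is actually running."""
    return link_up and peer_alive


# Group every (link_up, peer_alive) combination by what the node observes.
observations = {}
for link_up, peer_alive in product([True, False], repeat=2):
    seen = heartbeat_seen(link_up, peer_alive)
    observations.setdefault(seen, []).append((link_up, peer_alive))

for seen, causes in observations.items():
    print(f"heartbeat seen = {seen}: possible (link_up, peer_alive) = {causes}")
```

A broken link with a healthy peer and a working link with a dead peer both land in the "no heartbeat" bucket, which is why the node has to fence before it can safely continue.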

There is yet another concern: Corosync expects low latency networks
(corosync being the communication and membership layer of the cluster).
So you will need to allocate time to tweaking the corosync timeouts to 
handle your high-latency network. If there is an intermittent blip in 
your network that exceeds corosync's timeouts, the cluster will 
partition and one or both of the nodes will be fenced, as per the issue 
above.

You said that "HA" is not your highest concern, so this might be an
acceptable risk... You have to make that call. The software itself is
stable; your implementation, however, may not be.

digimer

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?