Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On 15/08/13 04:54, Digimer wrote:
> On 14/08/13 10:58, Christian Völker wrote:
>> Hi all,
>>
>> I'm planning to use DRBD in a production environment. I prefer to use
>> CentOS as the base system.
>>
>> The reason to use drbd is the synchronisation, not the high
>> availability.
>>
>> We'll have two locations connected through a 100 Mbit line. At both
>> locations users will access the data at the same time, so I know I
>> have to use a cluster-aware filesystem.
>>
>> I'm a little unsure about the performance: of course it will slow
>> down all access to whichever device is secondary. But are there any
>> tweaks to improve the performance despite the slow 100 Mbit
>> connection?
>>
>> So questions are:
>> Is CentOS6 with DRBD suitable for production use?
>> Which filesystem is recommended? GFS? ZFS (experimental?)?
>>
>> Thanks & Greetings
>>
>> Christian
>
> First, the short answer: yes, DRBD on CentOS 6 is perfectly stable.
> I've used 8.3.{11~15} on CentOS 6.{0~4} in production without issue,
> and I also use GFS2 partitions on all my clusters.
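
For reference, creating a GFS2 filesystem on top of a dual-primary DRBD
device generally looks something like the command below. The cluster
name, filesystem name and device are placeholders here; "-j 2" creates
one journal per node:

    # "mycluster" must match the cluster name in cluster.conf,
    # "shared" is an arbitrary filesystem name, and /dev/drbd0 is
    # the DRBD device.
    mkfs.gfs2 -p lock_dlm -t mycluster:shared -j 2 /dev/drbd0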
>
> If you want both locations to have simultaneous access to the storage
> / filesystem, then you need a cluster-aware filesystem and you need
> to run DRBD in dual-primary mode. This, in turn, requires "Protocol
> C", which means DRBD will not tell the caller that a write has
> completed until it has hit persistent storage on both nodes,
> effectively making your storage performance that of your network
> link's speed and latency. Across a 100 Mbit link, this means that
> your raw write speed will never exceed ~11-12 MB/sec. The write
> latency will also be the network link's latency plus the storage
> latency.
>
> Performance will not be stellar.
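
To illustrate what that looks like in practice, a dual-primary resource
definition for DRBD 8.3 typically ends up along these lines (hostnames,
disks and addresses below are placeholders, and 8.4 syntax differs
slightly):

    resource r0 {
        protocol C;                  # writes complete only once both nodes have them
        startup {
            become-primary-on both;  # promote both nodes on startup
        }
        net {
            allow-two-primaries;     # required for simultaneous access (GFS2)
        }
        on node1.example.com {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.1:7788;
            meta-disk internal;
        }
        on node2.example.com {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }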
>
> What you're proposing is called a "stretch cluster" and it's
> notoriously hard to do well.
>
> There is a further complication to your plan though: it will be
> nearly impossible to differentiate a broken link from a failed remote
> server, so your network link becomes a single point of failure. If
> the link breaks, both nodes will block and call a fence against their
> peer. The fence will fail because the link to the fence device is
> lost, so the nodes will remain blocked and your storage will hang
> (better to hang than to risk corruption). The fence actions will
> remain pending for however long it takes to repair the link, and then
> both nodes will try to fence each other at the exact same time. There
> is a chance that, once the network is repaired, both nodes will get
> fenced and you will have to boot them back up manually.
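
If it helps, the DRBD side of that fencing behaviour is usually wired
up with additions like the ones below to the resource definition. The
handler path is a placeholder for whichever fence-peer script your
cluster stack provides (rhcs_fence is just one example on a cman-based
cluster):

    disk {
        fencing resource-and-stonith;   # on connection loss: block I/O and fence the peer
    }
    handlers {
        # placeholder: use the fence-peer script that matches your stack
        fence-peer "/usr/lib/drbd/rhcs_fence";
    }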
>
> There is yet another concern: Corosync expects low-latency networks
> (corosync being the communication and membership layer of the
> cluster). So you will need to allocate time to tweaking the corosync
> timeouts to handle your high-latency network. If there is an
> intermittent blip in your network that exceeds corosync's timeouts,
> the cluster will partition and one or both of the nodes will be
> fenced, as per the issue above.
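
The main knob here is the totem token timeout. As a rough sketch only
(the numbers are illustrative and need testing against the real link),
the corosync.conf form looks like this; on a cman-based CentOS 6
cluster the same value is normally set through the <totem> tag in
cluster.conf instead:

    totem {
        version: 2
        # Token timeout in ms; raise it well above the WAN
        # round-trip time and its jitter.
        token: 10000
        # Retransmits allowed before the peer is declared lost.
        token_retransmits_before_loss_const: 20
    }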
>
> You said that "HA" is not your highest concern, so this might be an
> acceptable risk... you have to make that call. The software itself is
> stable; your implementation, however, may not be.
>
> digimer
>
Just to jump in on this...
If one of those nodes were for a small mini-office, could you configure
the small-office node to always shut down / disable access / go
offline, and the big office to become a single primary upon link
failure? Then, on recovery, the small-office node reconnects, resyncs
any updates, and then returns to dual-primary?
I don't know if this is applicable for Christian, but it is something
I've considered previously.
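
For what it's worth, DRBD on its own only covers part of that: its
automatic split-brain recovery policies (8.3 syntax below) decide whose
changes are discarded when the nodes reconnect, but the "small office
takes itself offline on link failure" step would still need a fencing
or quorum mechanism outside DRBD. Treat this as a sketch of the
relevant knobs rather than a complete recipe:

    net {
        # Automatic recovery policies after a split brain:
        after-sb-0pri   discard-zero-changes;  # neither primary now: sync the side that made no changes
        after-sb-1pri   discard-secondary;     # one primary: the secondary discards its changes
        after-sb-2pri   disconnect;            # both primary: no automatic recovery, resolve by hand
    }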
Regards,
Adam
--
Adam Goryachev
Website Managers
www.websitemanagers.com.au