[DRBD-user] DRBD performance problems!

Digimer lists at alteeve.ca
Tue Feb 11 17:23:45 CET 2014

On 11/02/14 10:50 AM, Joeri Casteels wrote:
>> On 11/02/14 10:31 AM, Joeri Casteels wrote:
>>>> On 11/02/14 03:42 AM, Joeri Casteels wrote:
>>>>> I can crank up the initial sync to 750MB/s by playing with the c-rate settings (still not good performance), but after the initial sync it drops back to 400MB/s.
>>>> Initial sync is not reflective of real-world performance. In fact, it's kept low on purpose.
>>>> If your DRBD resource (slowest of the network or drives) is, say, 1.1 GB/sec, then that is your _replication_ speed. When both nodes are UpToDate, your applications using the DRBD resource will work at this speed.
>>>> Synchronization speed is the speed at which out-of-sync blocks are copied to the peer. So say, for example, your nodes are UpToDate. Then you disconnect node 2 for a couple of hours and, during that time, 50 GB of data changes on node 1. When node 2 returns, DRBD will start copying those 50 GB of changes from node 1 to node 2. This runs at the sync speed.
>>>> If this sync runs at 750 MB/sec, then your apps will feel like their storage has slowed from 1,100 MB/sec down to just 350 MB/sec (1,100 - 750). This is rarely what people want, so the best option is to let the sync operation take longer and let the application speed stay high. The general rule of thumb is to set the sync rate no higher than 30% of the tested maximum write speed. This is a good trade-off between syncing quickly and not hurting replication performance too much.
>>>> Make sense?
>>> Yes, I know :-) That’s not my problem; after the initial sync I do set it to at most 30% of the real speed.
>>> The initial sync I run without limits, and then I get about 755MB/s.
>>>> Please wait until both nodes are UpToDate and then run your 'dd' test by writing directly to the /dev/drbdX device (or a file on its FS, if you formatted /dev/drbdX). The performance you get with this test will reflect the performance your apps using DRBD should expect.
>>> Here is where the problem is: with oflag=direct I only get 400MB/s, and no more, after the nodes are UpToDate.
>>> If I detach the secondary, I’m back at 1.8GB/s with oflag=direct and 1.1GB/s with no flags.
>>> I would expect to get around 1+ GB/s with oflag=direct, and around 900MB/s without it, with DRBD in between.
>> Ok, in that case, the best I can offer is my raw notes from when I was doing my DRBD tuning:
>> https://alteeve.ca/w/AN!Cluster_Tutorial_2_-_Performance_Tuning
>> Those really are raw notes, totally unstructured. However, it might help you tune your system. I found tuning dramatically helped my performance, and the ideal numbers (for me) were not at all like those in the tuning docs I read.
> Wow, nice guide! Most of them I tested, but I see you have some settings I didn’t try before, or not at such low values.
> Will test and get back to you with what my best config was.

Excellent, please do!

The system I was using to write that has gone out for eval. When it
comes back, I plan to do a proper tutorial on tuning DRBD. Any insight
you gain and share will certainly help in that effort!
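
For anyone following along, the sync-rate arithmetic quoted above can be
sketched in a few lines of Python. This is only an illustration of the
rule of thumb, not anything DRBD ships: the function names are made up
for this example, and it assumes the simple model from the thread, where
resync traffic is subtracted directly from the bandwidth left for
applications. Real behaviour will vary with the hardware and settings.

```python
# Rough model of the rule of thumb from the thread (an assumption, not a
# DRBD API): resync bandwidth comes straight out of what applications get.

def app_rate_during_resync(max_write_mb_s: float, sync_rate_mb_s: float) -> float:
    """Bandwidth (MB/s) left for applications while a resync is running."""
    return max(0.0, max_write_mb_s - sync_rate_mb_s)


def suggested_sync_rate(max_write_mb_s: float, fraction: float = 0.30) -> float:
    """Sync rate (MB/s) capped at a fraction of the tested maximum write speed."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return max_write_mb_s * fraction


def resync_seconds(changed_mb: float, sync_rate_mb_s: float) -> float:
    """Time (seconds) to copy the out-of-sync data at the given sync rate."""
    return changed_mb / sync_rate_mb_s


if __name__ == "__main__":
    tested_max = 1100.0  # MB/s, the example figure used in the thread

    # Unthrottled sync at 750 MB/s leaves the apps with the remainder.
    print(round(app_rate_during_resync(tested_max, 750.0), 1))

    # Capping the sync at 30% keeps most of the bandwidth for the apps,
    # at the cost of a longer resync for the 50 GB example (taken as 50,000 MB).
    capped = suggested_sync_rate(tested_max)
    print(round(capped, 1), round(app_rate_during_resync(tested_max, capped), 1))
    print(round(resync_seconds(50_000.0, capped), 1))
```

With the thread's 1,100 MB/sec figure, the 30% cap trades a longer
resync for a much smaller hit on application writes.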

Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?
