[DRBD-user] Fast write performance on backing device, slow write Performance on DRBD

Wed Jan 2 17:51:22 CET 2013

Hi Florian,

Thanks for your reply. I was out of the office for some time, so here are 
my observations...

On Wednesday 02 01 2013 16:29:17 you wrote:
> On Tue, Dec 18, 2012 at 10:58 AM, Tom Fernandes <anyaddress at gmx.net> wrote:
> > ------------------------------- DRBD -----------------------------------------
> > tom at hydra04 [1526]:~$ sudo drbdadm dump
> > # /etc/drbd.conf
> > common {
> >     protocol               C;
> >     syncer {
> >         rate             150M;
> >     }
> > }
> >
> > # resource leela on hydra04: not ignored, not stacked
> > resource leela {
> >     on hydra04 {
> >         device           minor 0;
> >         disk             /dev/vg0/leela;
> >         address          ipv4 10.0.0.1:7788;
> >         meta-disk        internal;
> >     }
> >     on hydra05 {
> >         device           minor 0;
> >         disk             /dev/vg0/leela;
> >         address          ipv4 10.0.0.2:7788;
> >         meta-disk        internal;
> >     }
> > }
> 
> If that configuration is indeed "similar" to the one on the other
> cluster (the one where you're apparently writing to DRBD at 200 MB/s),
> I'd be duly surprised. Indeed I'd consider it quite unlikely for _any_
> DRBD 8.3 cluster to hit that throughput unless you tweaked at least
> al-extents, max-buffers and max-epoch-size, and possibly also
> sndbuf-size and rcvbuf-size, and set no-disk-flushes and no-md-flushes
> (assuming you run on flash or battery backed write cache).

I compared the DRBD configuration of the fast and the slow cluster again 
with drbdadm dump. They are the same. Both configurations use just the 
defaults, with none of the parameters you mentioned above modified.
To be on the safe side I re-ran the benchmarks with a 2048MB dd file (as 
we have big RAID caches). The fast cluster has 1024MB of flash-backed 
cache; the slow cluster has 512MB (without BBU). While running the 
tests on the fast cluster I watched nr and dw in /proc/drbd on the 
secondary node to be sure that the data really is being synced. The 
fast cluster consists of HP servers. The slow cluster is different 
hardware (it's rented from our provider and may be no-name hardware). 
But both have the same amount of RAM and the same number of threads, 
both use SAS drives, and both have RAID6 configured.
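
For anyone who wants to repeat this kind of check, here is a minimal 
sketch. It is not the exact command line behind the numbers in this 
mail: the dd flags, the function names and the target path 
/mnt/leela/ddtest are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: flags, function names and paths are illustrative assumptions.

# Write SIZE_MB megabytes of zeroes to TARGET. conv=fsync (GNU dd) makes dd
# flush the data to disk before it reports the throughput.
write_test() {
    target=$1
    size_mb=$2
    dd if=/dev/zero of="$target" bs=1M count="$size_mb" conv=fsync 2>&1 | tail -n 1
}

# Print one counter (for example nr or dw) from a /proc/drbd-style file.
# The second argument defaults to /proc/drbd.
drbd_counter() {
    name=$1
    file=${2:-/proc/drbd}
    # Put every space-separated token on its own line, then pick "name:value".
    tr -s ' ' '\n' < "$file" | grep "^${name}:" | head -n 1 | cut -d: -f2
}

# Example use (hypothetical mount point of the DRBD device):
#   primary:    write_test /mnt/leela/ddtest 2048
#   secondary:  drbd_counter nr; drbd_counter dw    # before and after the run
```

If nr and dw on the secondary grow by roughly the amount written during 
the run, the writes did reach the peer.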

The fast cluster gives ~176MB/s write performance (not 200MB/s as I 
mentioned before - I wasn't accurate when I wrote that, sorry). The 
slow cluster gives ~55MB/s write performance. The speed on the slow 
cluster stays roughly the same whether I use protocol C or A. On the 
fast cluster the speed increases from ~176MB/s to ~187MB/s when 
switching from protocol C to protocol A.

> 
> So I'd suggest that you refer back to your "fast" cluster and see if
> perhaps you forgot to copy over your /etc/drbd.d/global_common.conf.

I checked. Both configs are the same.

> 
> You may also need to switch your I/O scheduler from cfq to deadline on
> your backing devices, if you haven't already done so. 

I switched from cfq to deadline on the slow cluster. Performance 
increased from ~55MB/s to ~58MB/s.
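
A minimal sketch of checking and switching the scheduler through sysfs. 
The function names are made up for illustration, and "sda" in the 
example is only a placeholder for whatever block device sits underneath 
vg0.

```shell
#!/bin/sh
# Sketch only: function names and the device name are illustrative assumptions.

# Print the active scheduler from a sysfs scheduler file. The kernel marks
# the active one with brackets, e.g. "noop deadline [cfq]".
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# Switch block device DEV to scheduler SCHED. Needs root, and the setting
# does not survive a reboot.
set_scheduler() {
    dev=$1
    sched=$2
    echo "$sched" > "/sys/block/$dev/queue/scheduler"
}

# Example use (placeholder device name):
#   active_scheduler /sys/block/sda/queue/scheduler
#   set_scheduler sda deadline
```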

> And finally, for
> a round-robin bonded network link, upping the net.ipv4.tcp_reordering
> sysctl to perhaps 30 or so would also be wise.

I tried setting it to 30 on the slow cluster, but performance didn't 
really change.
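
A minimal sketch of how this sysctl can be read, set at runtime and 
made persistent. The wrapper function names are made up for 
illustration; only the sysctl key and the value 30 come from this 
thread.

```shell
#!/bin/sh
# Sketch only: the wrapper function names are illustrative assumptions.

# Show the current value.
show_reordering() {
    sysctl -n net.ipv4.tcp_reordering
}

# Set the value at runtime (needs root, lost on reboot).
set_reordering() {
    sysctl -w "net.ipv4.tcp_reordering=$1"
}

# Print the line that would make the value persistent in /etc/sysctl.conf.
reordering_conf_line() {
    printf 'net.ipv4.tcp_reordering = %s\n' "$1"
}

# Example use:
#   show_reordering
#   set_reordering 30
#   reordering_conf_line 30 >> /etc/sysctl.conf
```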

I did not feel it made sense to tweak the DRBD configuration on the 
slow cluster, as the fast cluster has the same DRBD configuration but 
gives more than 3x the performance.

I'll try with 8.4 tomorrow. Let's see if that makes a difference.

Is there any more information I can provide?

warm regards,

Tom Fernandes