Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi again,
Problem solved. There is a setting called storsave for newer 3ware RAID
controllers. It was set to "balanced", which, among other things, provides
a write journal for the disk cache to prevent data loss in case of power
failure. Setting it to "perform" boosted write performance from ~55MB/s
to ~155MB/s. Looking carefully at the output of atop and comparing the
writes/s on the backing device with the writes/s on the physical disks
suggested that it had something to do with RAID/disk-caching settings.
atop is a wonderful tool!
The manual http://www.3ware.com/support/UserDocs/UsrGuide-9.5.2.pdf has
more details on this setting.
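For anyone hitting the same issue: on our controller the policy can be
checked and changed with tw_cli, along these lines (the controller and
unit IDs are just examples):

  tw_cli /c0/u0 show storsave          # display the current storsave policy
  tw_cli /c0/u0 set storsave=perform   # trade cache safety for throughput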
I do wonder, though, why this performance bottleneck (storsave=balance)
only shows up for writes on the DRBD device. Writes directly to the local
backing device are fast (~300MB/s). Are different syncs or write calls
used when writing locally compared to when DRBD writes to the disk of the
secondary node? Or is this due to the network latency?
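If it is the flushes: Florian's earlier hint about no-disk-flushes and
no-md-flushes would presumably look like this in drbd.conf (untested here,
and probably only sane with a battery- or flash-backed write cache):

  common {
      disk {
          no-disk-flushes;   # don't pass cache flushes down to the backing device
          no-md-flushes;     # likewise for meta-data writes
      }
  }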
Warm regards and thanks for the good work!
Tom
On Wednesday 02 01 2013 17:51:22 Tom Fernandes wrote:
> Hi Florian,
>
> Thanks for your reply. I was out of the office for some time, so here
> are my observations...
>
> On Wednesday 02 01 2013 16:29:17 you wrote:
> > On Tue, Dec 18, 2012 at 10:58 AM, Tom Fernandes <anyaddress at gmx.net>
> > wrote:
> > > ------------------------------- DRBD -----------------------------------------
> > > tom@hydra04 [1526]:~$ sudo drbdadm dump
> > > # /etc/drbd.conf
> > > common {
> > >     protocol C;
> > >     syncer {
> > >         rate 150M;
> > >     }
> > > }
> > >
> > > # resource leela on hydra04: not ignored, not stacked
> > > resource leela {
> > >     on hydra04 {
> > >         device    minor 0;
> > >         disk      /dev/vg0/leela;
> > >         address   ipv4 10.0.0.1:7788;
> > >         meta-disk internal;
> > >     }
> > >     on hydra05 {
> > >         device    minor 0;
> > >         disk      /dev/vg0/leela;
> > >         address   ipv4 10.0.0.2:7788;
> > >         meta-disk internal;
> > >     }
> > > }
> >
> > If that configuration is indeed "similar" to the one on the other
> > cluster (the one where you're apparently writing to DRBD at 200
> > MB/s), I'd be duly surprised. Indeed I'd consider it quite unlikely
> > for _any_ DRBD 8.3 cluster to hit that throughput unless you tweaked
> > at least al-extents, max-buffers and max-epoch-size, and possibly
> > also sndbuf-size and rcvbuf-size, and set no-disk-flushes and no-md-
> > flushes (assuming you run on flash or battery backed write cache).
>
> I compared the DRBD-configuration of the fast and the slow cluster
> again with drbdadm dump. They are the same. Both configurations have
> just the defaults. No modifications of the parameters you mentioned
> above.
> To be on the safe side I re-ran the benchmarks with a 2048MB dd file
> (as we have big RAID caches). On the fast cluster I have 1024MB of
> flash-backed cache, on the slow cluster it's 512MB (without BBU). When
> doing the tests on the fast cluster I watched nr and dw in /proc/drbd
> on the secondary node to be sure that the data is really getting
> replicated. The fast cluster consists of HP servers. The slow cluster
> is different hardware (it's rented from our provider and may be no-name
> hardware), but both have the same amount of RAM, the same number of CPU
> threads, SAS drives, and a RAID6.
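> For reference, the benchmark was a dd of roughly this form (the target
> path and flags are only illustrative), with "watch -n1 cat /proc/drbd"
> running on the secondary:
>
>   dd if=/dev/zero of=/mnt/leela/ddtest bs=1M count=2048 conv=fsync  # path is an example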
>
> The fast cluster gives ~176MB/s write performance (not 200MB/s as I
> mentioned before - I wasn't accurate when I wrote that - sorry). The
> slow cluster gives ~55MB/s write performance. The speed on the slow
> cluster stays roughly the same, whether I use protocol C or A. On the
> fast cluster the speed increases from ~176MB/s to 187MB/s when
> switching from protocol C to protocol A.
>
> >
> > So I'd suggest that you refer back to your "fast" cluster and see if
> > perhaps you forgot to copy over your /etc/drbd.d/global_common.conf.
>
> I checked. Both configs are the same.
>
> >
> > You may also need to switch your I/O scheduler from cfq to deadline
> > on your backing devices, if you haven't already done so.
>
> I switched from cfq to deadline on the slow cluster. There was a
> performance increase from ~55MB/s to ~58MB/s.
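> For reference, the switch was done per device with something like
> (sdX standing for the actual backing disks):
>
>   echo deadline > /sys/block/sdX/queue/scheduler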
>
> > And finally, for
> > a round-robin bonded network link, upping the
> > net.ipv4.tcp_reordering sysctl to perhaps 30 or so would also be
> > wise.
>
> I tried setting it to 30 on the slow cluster, but performance didn't
> really change.
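> For reference, that was set on the fly with roughly:
>
>   sysctl -w net.ipv4.tcp_reordering=30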
>
> I didn't feel it made sense to tweak the DRBD configuration on the slow
> cluster, as the fast cluster has the same DRBD configuration but gives
> more than three times the performance.
>
> I'll try with 8.4 tomorrow. Let's see if that makes a difference.
>
> Is there any more information I can provide?
>
>
> warm regards,
>
>
> Tom Fernandes