[DRBD-user] Fast write performance on backing device, slow write performance on DRBD

Mon Jan 7 15:51:33 CET 2013

Note that this setting disables the BBU.

On 01/07/2013 03:44 AM, Tom Fernandes wrote:
> Hi again,
>
> Problem solved. Newer 3ware RAID controllers have a setting called
> storsave. It was set to "balanced", which among other things provides
> a write journal for the disk cache to prevent data loss in case of power
> failure. Setting it to "perform" boosted write performance from ~55MB/s
> to ~155MB/s. Looking carefully at the output of atop and comparing the
> writes/s to the backing device with the writes/s to the disk made me
> suspect that it had something to do with the RAID/disk caching settings.
> atop is a wonderful tool!
>
> The manual http://www.3ware.com/support/UserDocs/UsrGuide-9.5.2.pdf has
> more details on this setting.
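
A minimal sketch of how the storsave policy could be inspected and changed
with tw_cli. The controller/unit path /c0/u0 and the function name are
assumptions for illustration; check your own controller and unit IDs first.
With DRY_RUN=1 (the default in this sketch) it only prints the commands.

```shell
#!/bin/sh
# Sketch only: /c0/u0 is an assumed controller/unit path; adjust to your system.
# DRY_RUN=1 (the default here) prints the tw_cli commands instead of running them.
storsave_cmds() {
    unit="${1:-/c0/u0}"      # assumed controller/unit
    policy="${2:-perform}"   # one of: protect, balance, perform
    case "$policy" in
        protect|balance|perform) ;;
        *) echo "unknown storsave policy: $policy" >&2; return 1 ;;
    esac
    # First show the current policy, then set the new one.
    for cmd in "tw_cli $unit show storsave" "tw_cli $unit set storsave=$policy"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

storsave_cmds /c0/u0 perform
```

Running with DRY_RUN=0 would execute the commands for real. Keep in mind
that "perform" trades the write-journal protection for speed, so it is
probably only reasonable where losing cached writes on power failure is
acceptable or otherwise covered.
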
>
> I do wonder, though, why this bottleneck (storsave=balance) only
> applies to writes to the DRBD device. Writes directly to the local
> backing device are fast (~300MB/s). Are different syncs or write calls
> used when writing locally than when DRBD writes to the disk of the
> secondary node? Or is this due to network latency?
>
> Warm regards and thanks for the good work!
>
>
> Tom
>
>
> On Wednesday 02 01 2013 17:51:22 Tom Fernandes wrote:
>> Hi Florian,
>>
>> Thanks for your reply. I was out of office for some time, so here are
>> my observations...
>>
>> On Wednesday 02 01 2013 16:29:17 you wrote:
>>> On Tue, Dec 18, 2012 at 10:58 AM, Tom Fernandes <anyaddress at gmx.net>
>>> wrote:
>>>> ------------------------------- DRBD -----------------------------------------
>>>> tom at hydra04 [1526]:~$ sudo drbdadm dump
>>>> # /etc/drbd.conf
>>>> common {
>>>>      protocol               C;
>>>>      syncer {
>>>>          rate             150M;
>>>>      }
>>>> }
>>>>
>>>> # resource leela on hydra04: not ignored, not stacked
>>>> resource leela {
>>>>      on hydra04 {
>>>>          device           minor 0;
>>>>          disk             /dev/vg0/leela;
>>>>          address          ipv4 10.0.0.1:7788;
>>>>          meta-disk        internal;
>>>>      }
>>>>      on hydra05 {
>>>>          device           minor 0;
>>>>          disk             /dev/vg0/leela;
>>>>          address          ipv4 10.0.0.2:7788;
>>>>          meta-disk        internal;
>>>>      }
>>>> }
>>> If that configuration is indeed "similar" to the one on the other
>>> cluster (the one where you're apparently writing to DRBD at 200
>>> MB/s), I'd be duly surprised. Indeed I'd consider it quite unlikely
>>> for _any_ DRBD 8.3 cluster to hit that throughput unless you tweaked
>>> at least al-extents, max-buffers and max-epoch-size, and possibly
>>> also sndbuf-size and rcvbuf-size, and set no-disk-flushes and
>>> no-md-flushes (assuming you run on flash- or battery-backed write
>>> cache).
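
For reference, a sketch of where the options named above would go in a DRBD
8.3 configuration. The function name is hypothetical and every value is an
illustrative placeholder, not a recommendation made in this thread. The
script only prints the fragment for review and does not touch /etc.

```shell
#!/bin/sh
# Prints a candidate DRBD 8.3 tuning fragment for review; nothing is written.
# All numeric values are placeholders to be replaced after your own testing.
drbd_tuning_fragment() {
cat <<'EOF'
common {
    syncer {
        al-extents      3389;
    }
    net {
        max-buffers     8000;
        max-epoch-size  8000;
        sndbuf-size     512k;
        rcvbuf-size     512k;
    }
    disk {
        # only with flash- or battery-backed write cache
        no-disk-flushes;
        no-md-flushes;
    }
}
EOF
}

drbd_tuning_fragment
```

The two no-*-flushes options are grouped separately on purpose: per the
quoted advice they are only safe when the controller cache survives a
power loss.
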
>> I compared the DRBD configuration of the fast and the slow cluster
>> again with drbdadm dump. They are the same. Both configurations have
>> just the defaults, with none of the parameters you mentioned above
>> modified.
>> To be on the safe side I re-ran the benchmarks with a 2048MB dd file
>> (as we have big RAID caches). The fast cluster has 1024MB of
>> flash-backed cache; the slow cluster has 512MB (without BBU). When
>> running the tests on the fast cluster I watched nr and dw in
>> /proc/drbd on the secondary node to be sure that the data was really
>> getting synced.
>> The fast cluster consists of HP servers. The slow cluster is different
>> hardware (it's rented from our provider and may be no-name hardware).
>> But both have the same amount of RAM and the same number of threads,
>> both use SAS drives, and both have a RAID6 configured.
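
A sketch of a sequential-write test along the lines described above. The
exact dd invocation used in the thread is not shown, so the flags here are
an assumption, as are the function name and the example path.

```shell
#!/bin/sh
# Sketch of a sequential-write test with dd.
# conv=fsync makes dd flush the file to disk before it reports, so the
# figure is not just the speed of the page cache.
write_test() {
    target="$1"             # file on the filesystem that sits on the DRBD device
    size_mb="${2:-2048}"    # 2048MB, as in the benchmark described above
    dd if=/dev/zero of="$target" bs=1M count="$size_mb" conv=fsync 2>&1 | tail -n 1
}

# Example (hypothetical mount point; writes 2GB, so not run automatically):
#   write_test /mnt/leela/ddtest.img 2048
```

While the test runs, watching /proc/drbd on the secondary node (for
example with "watch -n1 cat /proc/drbd") shows whether the nr and dw
counters are increasing, i.e. whether the data is actually being
replicated.
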
>>
>> The fast cluster gives ~176MB/s write performance (not 200MB/s as I
>> mentioned before - I wasn't accurate when I wrote that - sorry). The
>> slow cluster gives ~55MB/s write performance. The speed on the slow
>> cluster stays roughly the same, whether I use protocol C or A. On the
>> fast cluster the speed increases from ~176MB/s to 187MB/s when
>> switching from protocol C to protocol A.
>>
>>> So I'd suggest that you refer back to your "fast" cluster and see if
>>> perhaps you forgot to copy over your /etc/drbd.d/global_common.conf.
>> I checked. Both configs are the same.
>>
>>> You may also need to switch your I/O scheduler from cfq to deadline
>>> on your backing devices, if you haven't already done so.
>> I switched from cfq to deadline on the slow cluster. There was a
>> performance increase from ~55MB/s to ~58MB/s.
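
A small sketch for checking which I/O scheduler is active on a backing
disk. The device name sda is an assumption; use whichever disk sits under
the backing LV. The helper name is hypothetical.

```shell
#!/bin/sh
# The kernel marks the active scheduler with [brackets],
# e.g. "noop deadline [cfq]".
active_scheduler() {
    # $1: contents of /sys/block/<dev>/queue/scheduler
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# DEV is an assumption; set it to the disk under your backing device.
DEV="${DEV:-sda}"
SCHED_FILE="/sys/block/$DEV/queue/scheduler"
if [ -r "$SCHED_FILE" ]; then
    echo "$DEV: $(active_scheduler "$(cat "$SCHED_FILE")")"
    # To switch (as root; not persistent across reboots):
    #   echo deadline > "$SCHED_FILE"
fi
```
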
>>
>>> And finally, for a round-robin bonded network link, upping the
>>> net.ipv4.tcp_reordering sysctl to perhaps 30 or so would also be
>>> wise.
>> I tried out setting it to 30 on the slow cluster but performance
>> didn't really change.
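
A sketch of how the sysctl could be checked and changed. The function name
is hypothetical, and DRY_RUN=1 (the default in this sketch) only prints
the command instead of running it.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints the command instead of
# running it; running it for real needs root.
set_tcp_reordering() {
    value="${1:-30}"
    case "$value" in
        ''|*[!0-9]*) echo "not a number: $value" >&2; return 1 ;;
    esac
    cmd="sysctl -w net.ipv4.tcp_reordering=$value"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Show the current value first (the kernel default is 3), then the change.
if [ -r /proc/sys/net/ipv4/tcp_reordering ]; then
    echo "current: $(cat /proc/sys/net/ipv4/tcp_reordering)"
fi
set_tcp_reordering 30
```

A value set with sysctl -w is lost on reboot; to keep it, the setting
would also need to go into /etc/sysctl.conf.
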
>>
>> I did not feel it makes sense to tweak the DRBD-configuration on the
>> slow cluster as the fast cluster has the same DRBD-configuration but
>> gives more than 3x better performance.
>>
>> I'll try with 8.4 tomorrow. Let's see if that makes a difference.
>>
>> Is there any more information I can provide?
>>
>>
>> warm regards,
>>
>>
>> Tom Fernandes
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user
>>
