Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Inline.

On 06/21/2011 07:41 AM, Noah Mehl wrote:
> Brian,
>
> I'm in the middle of tuning according to the user guide, and to Florian's
> blog post: http://fghaas.wordpress.com/2007/06/22/performance-tuning-drbd-setups/
>
> To speed up my process, can you post what settings you used for:
>
> 1. al-extents

3049, keep it prime

> 2. sndbuf-size and rcvbuf-size

5M

> 3. unplug-watermark

not set

> 4. max-buffers
> 5. max-epoch-size

9000

> Also, what MTU are you using, the standard 1500, or 9000, or something else?

9000

Also, in your syncer section set c-min-rate 0;

If you have battery-backed write cache and you know it's in good health:
no-disk-barrier and no-disk-flushes in your disk section.

Good luck,
Brian

> Thank you SOOOO much in advance!
>
> ~Noah
>
> On Jun 21, 2011, at 10:31 AM, Brian R. Hellman wrote:
>
>> Just for the record, I'm currently working on a system that achieves
>> 1.2GB/sec, maxing out the 10GbE connection, so DRBD can perform that
>> well. You just have to tune it to do so.
>>
>> Check out this section of the users guide:
>> http://www.drbd.org/users-guide/ch-throughput.html
>>
>> Good luck,
>> Brian
>>
>> On 06/21/2011 05:51 AM, Noah Mehl wrote:
>>> Yes,
>>>
>>> but I was getting the same performance with the nodes in
>>> Standalone/Primary. Also, if the lower-level physical device and the
>>> network link perform at 3x that rate, then what's the bottleneck? Is
>>> this the kind of performance loss I should expect from DRBD?
>>>
>>> ~Noah
>>>
>>> On Jun 21, 2011, at 2:29 AM, <Robert.Koeppl at knapp.com> wrote:
>>>
>>>> Hi!
>>>> You are getting about 4 Gbit/s actual throughput, which is not that
>>>> bad, but could be better. 1.25 GByte/s would be the theoretical
>>>> maximum of your interlink without any overhead or latency.
>>>>
>>>> Mit freundlichen Grüßen / Best Regards
>>>>
>>>> Robert Köppl
>>>>
>>>> Systemadministration
>>>> KNAPP Systemintegration GmbH
>>>> Waltenbachstraße 9
>>>> 8700 Leoben, Austria
>>>> Phone: +43 3842 805-910
>>>> Fax: +43 3842 82930-500
>>>> robert.koeppl at knapp.com
>>>> www.KNAPP.com
>>>>
>>>> Commercial register number: FN 138870x
>>>> Commercial register court: Leoben
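For reference, Brian's numbers gathered into one place would look roughly
like the fragment below. This is an illustrative sketch in DRBD 8.3-era
syntax, not a configuration taken from his system: the section placement
follows the 8.3 drbd.conf man page, and c-min-rate only exists from DRBD
8.3.9 onward.

    resource r0 {
      disk {
        no-disk-barrier;     # only with a healthy battery-backed write cache
        no-disk-flushes;
      }
      net {
        sndbuf-size    5M;
        rcvbuf-size    5M;
        max-buffers    9000;
        max-epoch-size 9000;
      }
      syncer {
        al-extents 3049;     # "keep it prime"
        c-min-rate 0;        # requires DRBD 8.3.9 or later
      }
    }

The 9000 MTU is set on the replication interface rather than in drbd.conf,
e.g. "ip link set dev eth1 mtu 9000" (the interface name here is a
placeholder), and it has to match on both nodes and any switch in between.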
>>>> Noah Mehl <noah at tritonlimited.com>
>>>> Sent by: drbd-user-bounces at lists.linbit.com
>>>> 21.06.2011 03:30
>>>>
>>>> To: "drbd-user at lists.linbit.com" <drbd-user at lists.linbit.com>
>>>> Cc:
>>>> Subject: Re: [DRBD-user] Poor DRBD performance, HELP!
>>>>
>>>> On Jun 20, 2011, at 6:06 AM, Cristian Mammoli - Apra Sistemi wrote:
>>>>
>>>>> On 06/20/2011 07:16 AM, Noah Mehl wrote:
>>>>>>
>>>>>> On Jun 20, 2011, at 12:39 AM, Noah Mehl wrote:
>>>>>>
>>>>>>> On Jun 18, 2011, at 2:27 PM, Florian Haas wrote:
>>>>>>>
>>>>>>>> On 06/17/2011 05:04 PM, Noah Mehl wrote:
>>>>>>>>> Below is the script I ran to do the performance testing. I
>>>>>>>>> basically took the script from the user guide and removed the
>>>>>>>>> oflag=direct,
>>>>>>>> ... which means that dd wrote to your page cache (read: RAM). At
>>>>>>>> this point, you started kidding yourself about your performance.
>>>>>>> I do have a question here: the total size of the dd write was
>>>>>>> 64GB, twice the amount of system RAM; does this still apply?
>>>>>>>>> because when it was in there, it brought the performance down
>>>>>>>>> to 26MB/s (not really my focus here, but maybe related?).
>>>>>>>> "Related" doesn't begin to describe it.
>>>>>>>>
>>>>>>>> Rerun the tests with oflag=direct and then repost them.
>>>>>>> Florian,
>>>>>>>
>>>>>>> I apologize for posting again without seeing your reply. I took
>>>>>>> the script directly from the user guide:
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>> TEST_RESOURCE=r0
>>>>>>> TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
>>>>>>> TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
>>>>>>> drbdadm primary $TEST_RESOURCE
>>>>>>> for i in $(seq 5); do
>>>>>>>   dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
>>>>>>> done
>>>>>>> drbdadm down $TEST_RESOURCE
>>>>>>> for i in $(seq 5); do
>>>>>>>   dd if=/dev/zero of=$TEST_LL_DEVICE bs=512M count=1 oflag=direct
>>>>>>> done
>>>>>>>
>>>>>>> Here are the results:
>>>>>>>
>>>>>>> 1+0 records in
>>>>>>> 1+0 records out
>>>>>>> 536870912 bytes (537 MB) copied, 0.911252 s, 589 MB/s
>>>>> [...]
>>>>>
>>>>> If your controller has a BBU, change the write policy to writeback
>>>>> and disable flushes in your drbd.conf.
>>>>>
>>>>> HTH
>>>>>
>>>>> --
>>>>> Cristian Mammoli
>>>>> APRA SISTEMI srl
>>>>> Via Brodolini, 6 Jesi (AN)
>>>>> tel dir. +390731719822
>>>>> Web www.apra.it
>>>>> e-mail c.mammoli at apra.it
>>>>
>>>> After taking many users' suggestions into account, here's where I am now.
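Florian's page-cache point is easy to see with dd itself. Without
oflag=direct the reported MB/s mostly measures the copy into RAM; once the
write volume is well past the size of RAM, writeback pressure drags the
average down toward device speed, which answers Noah's 64GB question: the
buffered figure converges, but the first gigabytes still inflate it. A
minimal comparison, reusing the $TEST_DEVICE variable from the script above:

    # Buffered: data lands in the page cache; the figure largely reflects RAM.
    dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1
    # Direct I/O: bypasses the page cache; the figure reflects the I/O path.
    dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
    # Buffered, but timed through a final flush; usually close to the direct run.
    dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 conv=fsync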
>>>> I've done the iperf between the machines:
>>>>
>>>> [root at storageb ~]# iperf -c 10.0.100.241
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.0.100.241, TCP port 5001
>>>> TCP window size: 27.8 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  3] local 10.0.100.242 port 57982 connected with 10.0.100.241 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  3]  0.0-10.0 sec  11.5 GBytes  9.86 Gbits/sec
>>>>
>>>> As you can see, the network connectivity between the machines should
>>>> not be a bottleneck, unless I'm running the wrong test or running it
>>>> the wrong way. Comments are definitely welcome here.
>>>>
>>>> I updated my resource config to remove flushes, because my controller
>>>> is set to writeback:
>>>>
>>>> # begin resource drbd0
>>>> resource r0 {
>>>>   protocol C;
>>>>
>>>>   disk {
>>>>     no-disk-flushes;
>>>>     no-md-flushes;
>>>>   }
>>>>
>>>>   startup {
>>>>     wfc-timeout 15;
>>>>     degr-wfc-timeout 60;
>>>>   }
>>>>
>>>>   net {
>>>>     allow-two-primaries;
>>>>     after-sb-0pri discard-zero-changes;
>>>>     after-sb-1pri discard-secondary;
>>>>     after-sb-2pri disconnect;
>>>>   }
>>>>
>>>>   syncer {
>>>>   }
>>>>
>>>>   on storagea {
>>>>     device /dev/drbd0;
>>>>     disk /dev/sda1;
>>>>     address 10.0.100.241:7788;
>>>>     meta-disk internal;
>>>>   }
>>>>
>>>>   on storageb {
>>>>     device /dev/drbd0;
>>>>     disk /dev/sda1;
>>>>     address 10.0.100.242:7788;
>>>>     meta-disk internal;
>>>>   }
>>>> }
>>>>
>>>> I've connected and synced the other node:
>>>>
>>>> version: 8.3.8.1 (api:88/proto:86-94)
>>>> GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by gardner@,
>>>> 2011-05-21 19:18:16
>>>>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>>>>     ns:1460706824 nr:0 dw:671088640 dr:2114869272 al:163840 bm:210874
>>>>     lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>>>>
>>>> I've updated the test script to include the oflag=direct in dd.
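One note on that iperf run: it already shows 9.86 Gbits/sec, so the link
itself is clearly not the problem, but it used a single stream with the
default 27.8 KByte TCP window. When such a test does come back low on a
10GbE link, it is worth cross-checking with a larger window and parallel
streams before blaming the network (iperf 2 syntax, same address as above):

    iperf -c 10.0.100.241 -w 1M -t 30    # larger TCP window, 30-second run
    iperf -c 10.0.100.241 -P 4 -t 30     # four parallel streams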
>>>> Also, I expanded the test writes to 64GB, twice the system RAM and
>>>> 64 times the controller RAM:
>>>>
>>>> #!/bin/bash
>>>> TEST_RESOURCE=r0
>>>> TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
>>>> TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
>>>> drbdadm primary $TEST_RESOURCE
>>>> for i in $(seq 5); do
>>>>   dd if=/dev/zero of=$TEST_DEVICE bs=1G count=64 oflag=direct
>>>> done
>>>> drbdadm down $TEST_RESOURCE
>>>> for i in $(seq 5); do
>>>>   dd if=/dev/zero of=$TEST_LL_DEVICE bs=1G count=64 oflag=direct
>>>> done
>>>>
>>>> And this is the result:
>>>>
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 152.376 s, 451 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 148.863 s, 462 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 152.587 s, 450 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 152.661 s, 450 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 148.099 s, 464 MB/s
>>>> 0: State change failed: (-12) Device is held open by someone
>>>> Command 'drbdsetup 0 down' terminated with exit code 11
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 52.5957 s, 1.3 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 56.9315 s, 1.2 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 57.5803 s, 1.2 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 52.4276 s, 1.3 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 52.8235 s, 1.3 GB/s
>>>>
>>>> I'm getting a huge performance difference between the drbd resource
>>>> and the lower-level device. Is this what I should expect?
>>>>
>>>> ~Noah
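Two details in that last run deserve a note. The "0: State change failed:
(-12) Device is held open by someone" message means "drbdadm down" was
attempted while something still had /dev/drbd0 open. Standard tools will
show the holder (device path taken from the script above):

    fuser -v /dev/drbd0            # processes with the device open
    lsof /dev/drbd0                # same information, with more detail
    ls /sys/block/drbd0/holders/   # kernel-level holders, e.g. LVM or multipath

And because the teardown failed, the five fast runs wrote to /dev/sda1
while DRBD was still attached to it. The ~1.2-1.3 GB/s figures remain a
fair measure of the raw array, but writes that bypass a live DRBD device
leave the two nodes silently inconsistent, so a verify or full resync would
be needed before trusting the replicated data again.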