Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Inline.

On 06/21/2011 07:41 AM, Noah Mehl wrote:
> Brian,
>
> I'm in the middle of tuning according to the user guide, and to Florian's
> blog post: http://fghaas.wordpress.com/2007/06/22/performance-tuning-drbd-setups/
>
> To speed up my process, can you post what settings you used for:
>
> 1. al-extents

3049, keep it prime

> 2. sndbuf-size and rcvbuf-size

5M

> 3. unplug-watermark

not set

> 4. max-buffers
> 5. max-epoch-size

9000

> Also, what MTU are you using, the standard 1500, or 9000, or something else?

9000

Also, in your syncer section set c-min-rate 0;

If you have battery-backed write cache and you know it's in good health:
no-disk-barrier and no-disk-flushes in your disk section.

Good luck,
Brian

> Thank you SOOOO much in advance!
>
> ~Noah
>
> On Jun 21, 2011, at 10:31 AM, Brian R. Hellman wrote:
>
>> Just for the record, I'm currently working on a system that achieves
>> 1.2GB/sec, maxing out the 10GbE connection, so DRBD can perform that
>> well. You just have to tune it to do so.
>>
>> Check out this section of the users guide:
>> http://www.drbd.org/users-guide/ch-throughput.html
>>
>> Good luck,
>> Brian
>>
>> On 06/21/2011 05:51 AM, Noah Mehl wrote:
>>> Yes,
>>>
>>> but I was getting the same performance with the nodes in
>>> Standalone/Primary. Also, if the lower-level physical device and the
>>> network link perform at 3x that rate, then what's the bottleneck? Is
>>> this the kind of performance loss I should expect from DRBD?
>>>
>>> ~Noah
>>>
>>> On Jun 21, 2011, at 2:29 AM, <Robert.Koeppl at knapp.com> wrote:
>>>
>>>> Hi!
>>>> You are getting about 4 Gbit/s actual throughput, which is not that
>>>> bad, but could be better. 1.25 GByte/s would be the theoretical
>>>> maximum of your interlink without any overhead or latency.
>>>>
>>>> Mit freundlichen Grüßen / Best Regards
>>>>
>>>> Robert Köppl
>>>>
>>>> Systemadministration
>>>> KNAPP Systemintegration GmbH
>>>> Waltenbachstraße 9
>>>> 8700 Leoben, Austria
>>>> Phone: +43 3842 805-910
>>>> Fax: +43 3842 82930-500
>>>> robert.koeppl at knapp.com
>>>> www.KNAPP.com
>>>>
>>>> Commercial register number: FN 138870x
>>>> Commercial register court: Leoben
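For reference, Brian's numbers gathered into one place would look roughly
like the fragment below. This is an illustrative sketch in DRBD 8.3-era
syntax, not a configuration taken from his system: the section placement
follows the 8.3 drbd.conf man page, and c-min-rate only exists from DRBD
8.3.9 onward.

    resource r0 {
      disk {
        no-disk-barrier;     # only with a healthy battery-backed write cache
        no-disk-flushes;
      }
      net {
        sndbuf-size    5M;
        rcvbuf-size    5M;
        max-buffers    9000;
        max-epoch-size 9000;
      }
      syncer {
        al-extents 3049;     # "keep it prime"
        c-min-rate 0;        # requires DRBD 8.3.9 or later
      }
    }

The 9000 MTU is set on the replication interface rather than in drbd.conf,
e.g. "ip link set dev eth1 mtu 9000" (the interface name here is a
placeholder), and it has to match on both nodes and any switch in between.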
>>>> Noah Mehl <noah at tritonlimited.com>
>>>> Sent by: drbd-user-bounces at lists.linbit.com
>>>> 21.06.2011 03:30
>>>>
>>>> To: "drbd-user at lists.linbit.com" <drbd-user at lists.linbit.com>
>>>> Cc:
>>>> Subject: Re: [DRBD-user] Poor DRBD performance, HELP!
>>>>
>>>> On Jun 20, 2011, at 6:06 AM, Cristian Mammoli - Apra Sistemi wrote:
>>>>
>>>>> On 06/20/2011 07:16 AM, Noah Mehl wrote:
>>>>>>
>>>>>> On Jun 20, 2011, at 12:39 AM, Noah Mehl wrote:
>>>>>>
>>>>>>> On Jun 18, 2011, at 2:27 PM, Florian Haas wrote:
>>>>>>>
>>>>>>>> On 06/17/2011 05:04 PM, Noah Mehl wrote:
>>>>>>>>> Below is the script I ran to do the performance testing. I
>>>>>>>>> basically took the script from the user guide and removed the
>>>>>>>>> oflag=direct,
>>>>>>>> ... which means that dd wrote to your page cache (read: RAM). At
>>>>>>>> this point, you started kidding yourself about your performance.
>>>>>>> I do have a question here: the total size of the dd write was
>>>>>>> 64GB, twice the amount of system RAM; does this still apply?
>>>>>>>>> because when it was in there, it brought the performance down
>>>>>>>>> to 26MB/s (not really my focus here, but maybe related?).
>>>>>>>> "Related" doesn't begin to describe it.
>>>>>>>>
>>>>>>>> Rerun the tests with oflag=direct and then repost them.
>>>>>>> Florian,
>>>>>>>
>>>>>>> I apologize for posting again without seeing your reply. I took
>>>>>>> the script directly from the user guide:
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>> TEST_RESOURCE=r0
>>>>>>> TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
>>>>>>> TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
>>>>>>> drbdadm primary $TEST_RESOURCE
>>>>>>> for i in $(seq 5); do
>>>>>>>   dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
>>>>>>> done
>>>>>>> drbdadm down $TEST_RESOURCE
>>>>>>> for i in $(seq 5); do
>>>>>>>   dd if=/dev/zero of=$TEST_LL_DEVICE bs=512M count=1 oflag=direct
>>>>>>> done
>>>>>>>
>>>>>>> Here are the results:
>>>>>>>
>>>>>>> 1+0 records in
>>>>>>> 1+0 records out
>>>>>>> 536870912 bytes (537 MB) copied, 0.911252 s, 589 MB/s
>>>>> [...]
>>>>>
>>>>> If your controller has a BBU, change the write policy to writeback
>>>>> and disable flushes in your drbd.conf.
>>>>>
>>>>> HTH
>>>>>
>>>>> --
>>>>> Cristian Mammoli
>>>>> APRA SISTEMI srl
>>>>> Via Brodolini, 6 Jesi (AN)
>>>>> tel dir. +390731719822
>>>>> Web www.apra.it
>>>>> e-mail c.mammoli at apra.it
>>>>
>>>> After taking many users' suggestions into account, here's where I am now.
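Florian's page-cache point is easy to see with dd itself. Without
oflag=direct the reported MB/s mostly measures the copy into RAM; once the
write volume is well past the size of RAM, writeback pressure drags the
average down toward device speed, which answers Noah's 64GB question: the
buffered figure converges, but the first gigabytes still inflate it. A
minimal comparison, reusing the $TEST_DEVICE variable from the script above:

    # Buffered: data lands in the page cache; the figure largely reflects RAM.
    dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1
    # Direct I/O: bypasses the page cache; the figure reflects the I/O path.
    dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
    # Buffered, but timed through a final flush; usually close to the direct run.
    dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 conv=fsync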
>>>> I've done the iperf between the machines:
>>>>
>>>> [root at storageb ~]# iperf -c 10.0.100.241
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.0.100.241, TCP port 5001
>>>> TCP window size: 27.8 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  3] local 10.0.100.242 port 57982 connected with 10.0.100.241 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  3]  0.0-10.0 sec  11.5 GBytes  9.86 Gbits/sec
>>>>
>>>> As you can see, the network connectivity between the machines should
>>>> not be a bottleneck, unless I'm running the wrong test or running it
>>>> the wrong way. Comments are definitely welcome here.
>>>>
>>>> I updated my resource config to remove flushes, because my controller
>>>> is set to writeback:
>>>>
>>>> # begin resource drbd0
>>>> resource r0 {
>>>>   protocol C;
>>>>
>>>>   disk {
>>>>     no-disk-flushes;
>>>>     no-md-flushes;
>>>>   }
>>>>
>>>>   startup {
>>>>     wfc-timeout 15;
>>>>     degr-wfc-timeout 60;
>>>>   }
>>>>
>>>>   net {
>>>>     allow-two-primaries;
>>>>     after-sb-0pri discard-zero-changes;
>>>>     after-sb-1pri discard-secondary;
>>>>     after-sb-2pri disconnect;
>>>>   }
>>>>
>>>>   syncer {
>>>>   }
>>>>
>>>>   on storagea {
>>>>     device /dev/drbd0;
>>>>     disk /dev/sda1;
>>>>     address 10.0.100.241:7788;
>>>>     meta-disk internal;
>>>>   }
>>>>
>>>>   on storageb {
>>>>     device /dev/drbd0;
>>>>     disk /dev/sda1;
>>>>     address 10.0.100.242:7788;
>>>>     meta-disk internal;
>>>>   }
>>>> }
>>>>
>>>> I've connected and synced the other node:
>>>>
>>>> version: 8.3.8.1 (api:88/proto:86-94)
>>>> GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by gardner@,
>>>> 2011-05-21 19:18:16
>>>>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>>>>     ns:1460706824 nr:0 dw:671088640 dr:2114869272 al:163840 bm:210874
>>>>     lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>>>>
>>>> I've updated the test script to include the oflag=direct in dd.
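One note on that iperf run: it already shows 9.86 Gbits/sec, so the link
itself is clearly not the problem, but it used a single stream with the
default 27.8 KByte TCP window. When such a test does come back low on a
10GbE link, it is worth cross-checking with a larger window and parallel
streams before blaming the network (iperf 2 syntax, same address as above):

    iperf -c 10.0.100.241 -w 1M -t 30    # larger TCP window, 30-second run
    iperf -c 10.0.100.241 -P 4 -t 30     # four parallel streams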
>>>> Also, I expanded the test writes to 64GB, twice the system RAM and
>>>> 64 times the controller RAM:
>>>>
>>>> #!/bin/bash
>>>> TEST_RESOURCE=r0
>>>> TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
>>>> TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
>>>> drbdadm primary $TEST_RESOURCE
>>>> for i in $(seq 5); do
>>>>   dd if=/dev/zero of=$TEST_DEVICE bs=1G count=64 oflag=direct
>>>> done
>>>> drbdadm down $TEST_RESOURCE
>>>> for i in $(seq 5); do
>>>>   dd if=/dev/zero of=$TEST_LL_DEVICE bs=1G count=64 oflag=direct
>>>> done
>>>>
>>>> And this is the result:
>>>>
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 152.376 s, 451 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 148.863 s, 462 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 152.587 s, 450 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 152.661 s, 450 MB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 148.099 s, 464 MB/s
>>>> 0: State change failed: (-12) Device is held open by someone
>>>> Command 'drbdsetup 0 down' terminated with exit code 11
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 52.5957 s, 1.3 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 56.9315 s, 1.2 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 57.5803 s, 1.2 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 52.4276 s, 1.3 GB/s
>>>> 64+0 records in
>>>> 64+0 records out
>>>> 68719476736 bytes (69 GB) copied, 52.8235 s, 1.3 GB/s
>>>>
>>>> I'm getting a huge performance difference between the drbd resource
>>>> and the lower-level device. Is this what I should expect?
>>>>
>>>> ~Noah
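Two details in that last run deserve a note. The "0: State change failed:
(-12) Device is held open by someone" message means "drbdadm down" was
attempted while something still had /dev/drbd0 open. Standard tools will
show the holder (device path taken from the script above):

    fuser -v /dev/drbd0            # processes with the device open
    lsof /dev/drbd0                # same information, with more detail
    ls /sys/block/drbd0/holders/   # kernel-level holders, e.g. LVM or multipath

And because the teardown failed, the five fast runs wrote to /dev/sda1
while DRBD was still attached to it. The ~1.2-1.3 GB/s figures remain a
fair measure of the raw array, but writes that bypass a live DRBD device
leave the two nodes silently inconsistent, so a verify or full resync would
be needed before trusting the replicated data again.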