Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
Hi!

You are getting about 4 Gbit/s of actual throughput, which is not that bad, but
could be better. 1.25 GByte/s (10 Gbit/s divided by 8) would be the theoretical
maximum of your interlink, without any overhead or latency.

Mit freundlichen Grüßen / Best Regards

Robert Köppl
Systemadministration
KNAPP Systemintegration GmbH
Waltenbachstraße 9
8700 Leoben, Austria
Phone: +43 3842 805-910
Fax: +43 3842 82930-500
robert.koeppl at knapp.com
www.KNAPP.com


Noah Mehl <noah at tritonlimited.com>
Sent by: drbd-user-bounces at lists.linbit.com
21.06.2011 03:30

To:      "drbd-user at lists.linbit.com" <drbd-user at lists.linbit.com>
Cc:
Subject: Re: [DRBD-user] Poor DRBD performance, HELP!


On Jun 20, 2011, at 6:06 AM, Cristian Mammoli - Apra Sistemi wrote:

> On 06/20/2011 07:16 AM, Noah Mehl wrote:
>>
>> On Jun 20, 2011, at 12:39 AM, Noah Mehl wrote:
>>
>>> On Jun 18, 2011, at 2:27 PM, Florian Haas wrote:
>>>
>>>> On 06/17/2011 05:04 PM, Noah Mehl wrote:
>>>>> Below is the script I ran to do the performance testing. I basically
>>>>> took the script from the user guide and removed the oflag=direct,
>>>>
>>>> ... which means that dd wrote to your page cache (read: RAM). At this
>>>> point, you started kidding yourself about your performance.
>>>
>>> I do have a question here: the total size of the dd write was 64GB,
>>> twice the amount of system RAM. Does this still apply?
>>>
>>>>> because when it was in there, it brought the performance down to
>>>>> 26MB/s (not really my focus here, but maybe related?).
>>>>
>>>> "Related" doesn't begin to describe it.
>>>>
>>>> Rerun the tests with oflag=direct and then repost them.
>>>
>>> Florian,
>>>
>>> I apologize for posting again without seeing your reply. I took the
>>> script directly from the user guide:
>>>
>>> #!/bin/bash
>>> TEST_RESOURCE=r0
>>> TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
>>> TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
>>> drbdadm primary $TEST_RESOURCE
>>> for i in $(seq 5); do
>>>   dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
>>> done
>>> drbdadm down $TEST_RESOURCE
>>> for i in $(seq 5); do
>>>   dd if=/dev/zero of=$TEST_LL_DEVICE bs=512M count=1 oflag=direct
>>> done
>>>
>>> Here are the results:
>>>
>>> 1+0 records in
>>> 1+0 records out
>>> 536870912 bytes (537 MB) copied, 0.911252 s, 589 MB/s
> [...]
>
> If your controller has a BBU, change the write policy to writeback and
> disable flushes in your drbd.conf.
>
> HTH
>
> --
> Cristian Mammoli
> APRA SISTEMI srl
> Via Brodolini, 6 Jesi (AN)
> tel dir. +390731719822
> Web    www.apra.it
> e-mail c.mammoli at apra.it
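For reference, the flush change Cristian suggests does not require taking the
resource down. A minimal sketch of how such a drbd.conf edit is typically
picked up on a running node, assuming the resource is named r0 as elsewhere in
this thread:

# After adding "no-disk-flushes;" and "no-md-flushes;" to the disk {} section
# of /etc/drbd.conf (or the matching file under /etc/drbd.d/) on both nodes,
# tell the running DRBD to re-read the configuration and apply the delta:
drbdadm adjust r0

# Sanity-check that the resource is still Connected and UpToDate afterwards:
cat /proc/drbd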
After taking many users' suggestions into account, here's where I am now.

I've run iperf between the machines:

[root at storageb ~]# iperf -c 10.0.100.241
------------------------------------------------------------
Client connecting to 10.0.100.241, TCP port 5001
TCP window size: 27.8 KByte (default)
------------------------------------------------------------
[  3] local 10.0.100.242 port 57982 connected with 10.0.100.241 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  11.5 GBytes  9.86 Gbits/sec

As you can see, the network connectivity between the machines should not be a
bottleneck. Unless I'm running the wrong test, or running it the wrong way.
Comments are definitely welcome here.

I updated my resource config to remove flushes, because my controller is set
to writeback:

# begin resource drbd0
resource r0 {
  protocol C;

  disk {
    no-disk-flushes;
    no-md-flushes;
  }

  startup {
    wfc-timeout      15;
    degr-wfc-timeout 60;
  }

  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }

  syncer {
  }

  on storagea {
    device    /dev/drbd0;
    disk      /dev/sda1;
    address   10.0.100.241:7788;
    meta-disk internal;
  }

  on storageb {
    device    /dev/drbd0;
    disk      /dev/sda1;
    address   10.0.100.242:7788;
    meta-disk internal;
  }
}

I've connected and synced the other node:

version: 8.3.8.1 (api:88/proto:86-94)
GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by gardner@, 2011-05-21 19:18:16
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:1460706824 nr:0 dw:671088640 dr:2114869272 al:163840 bm:210874 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

I've updated the test script to include oflag=direct in dd. I also expanded the
test writes to 64GB: twice the system RAM, and 64 times the controller RAM:

#!/bin/bash
TEST_RESOURCE=r0
TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
drbdadm primary $TEST_RESOURCE
for i in $(seq 5); do
  dd if=/dev/zero of=$TEST_DEVICE bs=1G count=64 oflag=direct
done
drbdadm down $TEST_RESOURCE
for i in $(seq 5); do
  dd if=/dev/zero of=$TEST_LL_DEVICE bs=1G count=64 oflag=direct
done

And this is the result:

64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 152.376 s, 451 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 148.863 s, 462 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 152.587 s, 450 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 152.661 s, 450 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 148.099 s, 464 MB/s
0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 0 down' terminated with exit code 11
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 52.5957 s, 1.3 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 56.9315 s, 1.2 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 57.5803 s, 1.2 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 52.4276 s, 1.3 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 52.8235 s, 1.3 GB/s

I'm getting a huge performance difference between the DRBD resource and the
lower-level device. Is this what I should expect?

~Noah
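For comparison with the iperf result above, a quick back-of-the-envelope
conversion of those dd figures to line rate. This is only a sketch using bc;
451 and 1300 are simply the MB/s and GB/s values reported by dd, which uses
decimal units:

# Throughput through the DRBD device, expressed in Gbit/s:
echo "451 * 8 / 1000" | bc -l    # ~3.6 Gbit/s worth of writes being replicated

# Throughput of the raw backing device, for reference (no network involved):
echo "1300 * 8 / 1000" | bc -l   # ~10.4 Gbit/s, local writes only

The ~3.6 Gbit/s matches the "about 4 Gbit/s" figure quoted in the reply at the
top of this page, well below the ~9.86 Gbit/s the link itself sustained under
iperf.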
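One possible next step concerns the empty syncer {} section in the config
above, which leaves everything at its defaults. Below is a sketch of the knobs
DRBD 8.3 tuning material commonly points at for fast replication links; the
values are illustrative placeholders, not settings taken from this thread, and
should be benchmarked before being adopted:

# Sketch only -- these go inside the resource (or common) section of drbd.conf:
syncer {
  al-extents 3389;       # larger activity log, fewer metadata updates
  rate       300M;       # caps resync speed only, not normal replication I/O
}
net {
  max-buffers    8000;   # more receive-side buffer pages in flight
  max-epoch-size 8000;   # more write requests allowed between two barriers
  sndbuf-size    512k;   # TCP send buffer for the replication socket
}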