<br><font size=2 face="sans-serif">Hi!</font>

<br><font size=2 face="sans-serif">You are getting about 3.6 Gbit/s (450 MB/s) of actual
throughput through DRBD, which is not that bad, but could be better. The theoretical
maximum of your 10 Gbit/s interlink would be 1.25 GByte/s, without any protocol overhead or latency.<br>
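<br>As a rough check of these figures, here is a minimal sketch of the unit conversion. It assumes decimal units (1 MB = 10^6 bytes, 1 Gbit = 10^9 bits) and a nominal 10 Gbit/s interlink, as the iperf result quoted below suggests; the function names are made up for illustration.<br>
<pre>
```shell
#!/bin/bash
# Convert a throughput in MB/s to Gbit/s (decimal units assumed).
mb_per_s_to_gbit_per_s() {
    awk -v mb="$1" 'BEGIN { printf "%.2f\n", mb * 8 / 1000 }'
}

# Convert a link speed in Gbit/s to GByte/s, ignoring protocol overhead.
gbit_per_s_to_gbyte_per_s() {
    awk -v gbit="$1" 'BEGIN { printf "%.2f\n", gbit / 8 }'
}

mb_per_s_to_gbit_per_s 450      # DRBD write rate reported by dd
gbit_per_s_to_gbyte_per_s 10    # assumed nominal link speed
```
</pre>
So 450 MB/s corresponds to 3.6 Gbit/s, and a 10 Gbit/s link tops out at 1.25 GByte/s before any overhead.<br>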

</font><font size=3 color=#5f5f5f>Best regards<br>

<br>

Robert Köppl<br>

<br>

Systemadministration<br>

<b><br>

KNAPP Systemintegration GmbH</b><br>

Waltenbachstraße 9<br>

8700 Leoben, Austria <br>

Phone: +43 3842 805-910<br>

Fax: +43 3842 82930-500<br>

robert.koeppl@knapp.com <br>

www.KNAPP.com <br>

<br>

Commercial register number: FN 138870x<br>

Commercial register court: Leoben</font><font size=3><br>

</font><font size=3 color=#d2d2d2><br>

</font>

<br>

<br>

<br>

<table width=100%>

<tr valign=top>

<td width=40%><font size=1 face="sans-serif"><b>Noah Mehl &lt;noah@tritonlimited.com&gt;</b>

</font>

<br><font size=1 face="sans-serif">Sent by: drbd-user-bounces@lists.linbit.com</font>

<p><font size=1 face="sans-serif">21.06.2011 03:30</font>

<td width=59%>

<table width=100%>

<tr valign=top>

<td>

<div align=right><font size=1 face="sans-serif">To</font></div>

<td><font size=1 face="sans-serif">&quot;drbd-user@lists.linbit.com&quot;

&lt;drbd-user@lists.linbit.com&gt;</font>

<tr valign=top>

<td>

<div align=right><font size=1 face="sans-serif">Cc</font></div>

<td>

<tr valign=top>

<td>

<div align=right><font size=1 face="sans-serif">Subject</font></div>

<td><font size=1 face="sans-serif">Re: [DRBD-user] Poor DRBD performance,

HELP!</font></table>

<br>

<table>

<tr valign=top>

<td>

<td></table>

<br></table>

<br>

<br>

<br><tt><font size=2><br>

On Jun 20, 2011, at 6:06 AM, Cristian Mammoli - Apra Sistemi wrote:<br>

<br>

&gt; On 06/20/2011 07:16 AM, Noah Mehl wrote:<br>

&gt;&gt; <br>

&gt;&gt; <br>

&gt;&gt; On Jun 20, 2011, at 12:39 AM, Noah Mehl wrote:<br>

&gt;&gt; <br>

&gt;&gt;&gt; On Jun 18, 2011, at 2:27 PM, Florian Haas wrote:<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt;&gt; On 06/17/2011 05:04 PM, Noah Mehl wrote:<br>

&gt;&gt;&gt;&gt;&gt; Below is the script I ran to do the performance testing.

&nbsp;I basically took the script from the user guide and removed the oflag=direct,<br>

&gt;&gt;&gt;&gt; <br>

&gt;&gt;&gt;&gt; ... which means that dd wrote to your page cache (read:

RAM). At this<br>

&gt;&gt;&gt;&gt; point, you started kidding yourself about your performance.<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt; I do have a question here: &nbsp;the total size of the dd

write was 64GB, twice the amount of system RAM, does this still apply?<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt;&gt; <br>

&gt;&gt;&gt;&gt;&gt; because when it was in there, it brought the performance

down to 26MB/s (not really my focus here, but maybe related?).<br>

&gt;&gt;&gt;&gt; <br>

&gt;&gt;&gt;&gt; &quot;Related&quot; doesn't begin to describe it.<br>

&gt;&gt;&gt;&gt; <br>

&gt;&gt;&gt;&gt; Rerun the tests with oflag=direct and then repost them.<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt; Florian,<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt; I apologize for posting again without seeing your reply. &nbsp;I

took the script directly from the user guide:<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt; #!/bin/bash<br>

&gt;&gt;&gt; TEST_RESOURCE=r0<br>

&gt;&gt;&gt; TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)<br>

&gt;&gt;&gt; TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)<br>

&gt;&gt;&gt; drbdadm primary $TEST_RESOURCE<br>

&gt;&gt;&gt; for i in $(seq 5); do<br>

&gt;&gt;&gt; &nbsp;dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct<br>

&gt;&gt;&gt; done<br>

&gt;&gt;&gt; drbdadm down $TEST_RESOURCE<br>

&gt;&gt;&gt; for i in $(seq 5); do<br>

&gt;&gt;&gt; &nbsp;dd if=/dev/zero of=$TEST_LL_DEVICE bs=512M count=1 oflag=direct<br>

&gt;&gt;&gt; done<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt; Here are the results:<br>

&gt;&gt;&gt; <br>

&gt;&gt;&gt; 1+0 records in<br>

&gt;&gt;&gt; 1+0 records out<br>

&gt;&gt;&gt; 536870912 bytes (537 MB) copied, 0.911252 s, 589 MB/s<br>

&gt; [...]<br>

&gt; <br>

&gt; If your controller has a BBU change the write policy to writeback

and <br>

&gt; disable flushes in your drbd.conf<br>

&gt; <br>

&gt; HTH<br>

&gt; <br>

&gt; -- <br>

&gt; Cristian Mammoli<br>

&gt; APRA SISTEMI srl<br>

&gt; Via Brodolini,6 Jesi (AN)<br>

&gt; tel dir. +390731719822<br>

&gt; <br>

&gt; Web &nbsp; www.apra.it<br>

&gt; e-mail &nbsp;c.mammoli@apra.it<br>

&gt; _______________________________________________<br>

&gt; drbd-user mailing list<br>

&gt; drbd-user@lists.linbit.com<br>

&gt; http://lists.linbit.com/mailman/listinfo/drbd-user<br>

<br>

After taking many users' suggestions into account, here's where I am now. &nbsp;I've
run iperf between the machines:<br>

<br>

[root@storageb ~]# iperf -c 10.0.100.241<br>

------------------------------------------------------------<br>

Client connecting to 10.0.100.241, TCP port 5001<br>

TCP window size: 27.8 KByte (default)<br>

------------------------------------------------------------<br>

[ &nbsp;3] local 10.0.100.242 port 57982 connected with 10.0.100.241 port 5001<br>

[ ID] Interval &nbsp; &nbsp; &nbsp; Transfer &nbsp; &nbsp; Bandwidth<br>

[ &nbsp;3] &nbsp;0.0-10.0 sec &nbsp;11.5 GBytes &nbsp;9.86 Gbits/sec<br>

<br>

As you can see, the network connectivity between the machines should not
be a bottleneck, unless I'm running the wrong test or running it the wrong
way. &nbsp;Comments are definitely welcome here.<br>

<br>

I updated my resource config to disable flushes, because my controller is
set to writeback:<br>

<br>

# begin resource drbd0<br>

resource r0 {<br>

 &nbsp; &nbsp; protocol C;<br>

<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;disk {<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;no-disk-flushes;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;no-md-flushes;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;}<br>

<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;startup {<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;wfc-timeout 15;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;degr-wfc-timeout 60;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;}<br>

<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;net {<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;allow-two-primaries;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;after-sb-0pri discard-zero-changes;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;after-sb-1pri discard-secondary;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;after-sb-2pri disconnect;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;}<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;syncer {<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;}<br>

 &nbsp; &nbsp; on storagea {<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; device /dev/drbd0;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; disk /dev/sda1;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; address 10.0.100.241:7788;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; meta-disk internal;<br>

 &nbsp; &nbsp; }<br>

 &nbsp; &nbsp; on storageb {<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;device /dev/drbd0;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;disk /dev/sda1;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;address 10.0.100.242:7788;<br>

 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;meta-disk internal;<br>

 &nbsp; &nbsp; }<br>

}<br>

<br>

I've connected and synced the other node:<br>

<br>

version: 8.3.8.1 (api:88/proto:86-94)<br>

GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by gardner@, 2011-05-21

19:18:16<br>

 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----<br>

 &nbsp; &nbsp;ns:1460706824 nr:0 dw:671088640 dr:2114869272 al:163840 bm:210874

lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0<br>

<br>

I've updated the test script to include oflag=direct in dd. &nbsp;Also,
I expanded the test writes to 64GB, twice the system RAM and 64 times
the controller RAM:<br>

<br>

#!/bin/bash<br>

TEST_RESOURCE=r0<br>

TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)<br>

TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)<br>

drbdadm primary $TEST_RESOURCE<br>

for i in $(seq 5); do<br>

 &nbsp;dd if=/dev/zero of=$TEST_DEVICE bs=1G count=64 oflag=direct<br>

done<br>

drbdadm down $TEST_RESOURCE<br>

for i in $(seq 5); do<br>

 &nbsp;dd if=/dev/zero of=$TEST_LL_DEVICE bs=1G count=64 oflag=direct<br>

done<br>

<br>

And this is the result:<br>

<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 152.376 s, 451 MB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 148.863 s, 462 MB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 152.587 s, 450 MB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 152.661 s, 450 MB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 148.099 s, 464 MB/s<br>

0: State change failed: (-12) Device is held open by someone<br>

Command 'drbdsetup 0 down' terminated with exit code 11<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 52.5957 s, 1.3 GB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 56.9315 s, 1.2 GB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 57.5803 s, 1.2 GB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 52.4276 s, 1.3 GB/s<br>

64+0 records in<br>

64+0 records out<br>

68719476736 bytes (69 GB) copied, 52.8235 s, 1.3 GB/s<br>

<br>

I'm getting a huge performance difference between the DRBD resource (about 450 MB/s) and
the lower-level device (about 1.3 GB/s). &nbsp;Is this what I should expect?<br>
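<br>
For reference, the rates dd prints can be reproduced from the byte count and the elapsed time, which also gives the slowdown factor. This is only a minimal sketch, assuming decimal megabytes as dd reports them; the helper names are hypothetical.<br>
<pre>
```shell
#!/bin/bash
# Throughput in MB/s (decimal, as dd reports it) from a byte count and a duration.
throughput_mb_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", bytes / secs / 1000000 }'
}

# How many times longer the DRBD run took than the lower-level run.
slowdown_factor() {
    awk -v drbd="$1" -v raw="$2" 'BEGIN { printf "%.1f\n", drbd / raw }'
}

throughput_mb_s 68719476736 152.376   # first run on the DRBD device
throughput_mb_s 68719476736 52.5957   # first run on the lower-level device
slowdown_factor 152.376 52.5957
```
</pre>
With the first run of each loop this gives about 451 MB/s for DRBD versus about 1307 MB/s for the backing device, a factor of roughly 2.9.<br>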

<br>

~Noah<br>

<br>

<br>

<br>


</font></tt>

<br>