Hi!

You are getting about 4 Gbit/s of actual throughput, which is not that bad, but it could be better. 1.25 GByte/s would be the theoretical maximum of your interlink, before any protocol overhead or latency.
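Just to spell out the arithmetic behind those numbers (assuming a 10 Gbit/s interlink, and the ~451 MByte/s figure from the dd runs quoted below):

echo $(( 10000 / 8 ))   # 1250 MByte/s, i.e. 1.25 GByte/s theoretical line rate
echo $(( 451 * 8 ))     # 3608 Mbit/s, i.e. roughly 3.6-4 Gbit/s actually achieved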
Best Regards

Robert Köppl

System Administration

KNAPP Systemintegration GmbH
Waltenbachstraße 9
8700 Leoben, Austria
Phone: +43 3842 805-910
Fax: +43 3842 82930-500
robert.koeppl@knapp.com
www.KNAPP.com

Commercial register number: FN 138870x
Commercial register court: Leoben

Noah Mehl <noah@tritonlimited.com>
Sent by: drbd-user-bounces@lists.linbit.com
Date: 21.06.2011 03:30
To: "drbd-user@lists.linbit.com" <drbd-user@lists.linbit.com>
Subject: Re: [DRBD-user] Poor DRBD performance, HELP!

On Jun 20, 2011, at 6:06 AM, Cristian Mammoli - Apra Sistemi wrote:

> On 06/20/2011 07:16 AM, Noah Mehl wrote:
>>
>>
>> On Jun 20, 2011, at 12:39 AM, Noah Mehl wrote:
>>
>>> On Jun 18, 2011, at 2:27 PM, Florian Haas wrote:
>>>
>>>> On 06/17/2011 05:04 PM, Noah Mehl wrote:
>>>>> Below is the script I ran to do the performance testing. I basically took the script from the user guide and removed the oflag=direct,
>>>>
>>>> ... which means that dd wrote to your page cache (read: RAM). At this point, you started kidding yourself about your performance.
>>>
>>> I do have a question here: the total size of the dd write was 64GB, twice the amount of system RAM, does this still apply?
>>>
>>>>
>>>>> because when it was in there, it brought the performance down to 26MB/s (not really my focus here, but maybe related?).
>>>>
>>>> "Related" doesn't begin to describe it.
>>>>
>>>> Rerun the tests with oflag=direct and then repost them.
>>>
>>> Florian,
>>>
>>> I apologize for posting again without seeing your reply. I took the script directly from the user guide:
>>>
>>> #!/bin/bash
>>> TEST_RESOURCE=r0
>>> TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
>>> TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
>>> drbdadm primary $TEST_RESOURCE
>>> for i in $(seq 5); do
>>>   dd if=/dev/zero of=$TEST_DEVICE bs=512M count=1 oflag=direct
>>> done
>>> drbdadm down $TEST_RESOURCE
>>> for i in $(seq 5); do
>>>   dd if=/dev/zero of=$TEST_LL_DEVICE bs=512M count=1 oflag=direct
>>> done
>>>
>>> Here are the results:
>>>
>>> 1+0 records in
>>> 1+0 records out
>>> 536870912 bytes (537 MB) copied, 0.911252 s, 589 MB/s
> [...]
>
> If your controller has a BBU change the write policy to writeback and
> disable flushes in your drbd.conf
>
> HTH
>
> --
> Cristian Mammoli
> APRA SISTEMI srl
> Via Brodolini, 6 Jesi (AN)
> tel dir. +390731719822
>
> Web www.apra.it
> e-mail c.mammoli@apra.it
> _______________________________________________
> drbd-user mailing list
> drbd-user@lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

After taking many users' suggestions into account, here's where I am now. I've run iperf between the machines:

[root@storageb ~]# iperf -c 10.0.100.241
------------------------------------------------------------
Client connecting to 10.0.100.241, TCP port 5001
TCP window size: 27.8 KByte (default)
------------------------------------------------------------
[  3] local 10.0.100.242 port 57982 connected with 10.0.100.241 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  11.5 GBytes  9.86 Gbits/sec

As you can see, the network connectivity between the machines should not be a bottleneck, unless I'm running the wrong test or running it the wrong way. Comments are definitely welcome here.
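One extra check that might be worth doing is repeating the test with parallel streams over a longer interval, in case a single TCP stream is itself the limit (iperf 2.x flags; adjust as needed):

iperf -s                            # on 10.0.100.241
iperf -c 10.0.100.241 -P 4 -t 60    # on 10.0.100.242: 4 parallel streams, 60 seconds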

I updated my resource config to remove flushes, because my controller is set to write-back:

# begin resource drbd0
resource r0 {
  protocol C;

  disk {
    no-disk-flushes;
    no-md-flushes;
  }

  startup {
    wfc-timeout      15;
    degr-wfc-timeout 60;
  }

  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }

  syncer {
  }

  on storagea {
    device    /dev/drbd0;
    disk      /dev/sda1;
    address   10.0.100.241:7788;
    meta-disk internal;
  }

  on storageb {
    device    /dev/drbd0;
    disk      /dev/sda1;
    address   10.0.100.242:7788;
    meta-disk internal;
  }
}
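For what it's worth, on a 10 GbE replication link it may also be worth experimenting with the network buffer and activity-log settings. A rough sketch for DRBD 8.3 (the values below are illustrative starting points, not something tuned for this hardware):

  net {
    sndbuf-size    512k;   # larger TCP send buffer for the replication socket
    max-buffers    8000;   # more buffers/requests allowed in flight
    max-epoch-size 8000;
  }
  syncer {
    rate       500M;       # only affects background resync, not application I/O
    al-extents 3389;       # larger activity log, fewer metadata updates
  }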

I've connected and synced the other node:

version: 8.3.8.1 (api:88/proto:86-94)
GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by gardner@, 2011-05-21 19:18:16
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:1460706824 nr:0 dw:671088640 dr:2114869272 al:163840 bm:210874 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

I've updated the test script to include oflag=direct in dd. I also expanded the test writes to 64 GB, which is twice the system RAM and 64 times the controller RAM:

#!/bin/bash
TEST_RESOURCE=r0
TEST_DEVICE=$(drbdadm sh-dev $TEST_RESOURCE)
TEST_LL_DEVICE=$(drbdadm sh-ll-dev $TEST_RESOURCE)
drbdadm primary $TEST_RESOURCE
for i in $(seq 5); do
  dd if=/dev/zero of=$TEST_DEVICE bs=1G count=64 oflag=direct
done
drbdadm down $TEST_RESOURCE
for i in $(seq 5); do
  dd if=/dev/zero of=$TEST_LL_DEVICE bs=1G count=64 oflag=direct
done

And this is the result:

64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 152.376 s, 451 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 148.863 s, 462 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 152.587 s, 450 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 152.661 s, 450 MB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 148.099 s, 464 MB/s
0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 0 down' terminated with exit code 11
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 52.5957 s, 1.3 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 56.9315 s, 1.2 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 57.5803 s, 1.2 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 52.4276 s, 1.3 GB/s
64+0 records in
64+0 records out
68719476736 bytes (69 GB) copied, 52.8235 s, 1.3 GB/s
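One thing worth noting in that output: the drbdadm down between the two loops failed ("Device is held open by someone", exit code 11), so the lower-level runs were presumably done with the DRBD resource still up and attached to /dev/sda1. If that is not intended, the script could guard against it, for example (assuming nothing else should be holding drbd0 open):

drbdadm secondary $TEST_RESOURCE
drbdadm down $TEST_RESOURCE || { echo "resource is still up, aborting" >&2; exit 1; }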

I'm getting a huge performance difference between the DRBD resource and the lower-level device. Is this what I should expect?
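If it helps to cross-check with something other than single-threaded dd, a quick fio run against both devices would show whether the gap persists with deeper queues (standard fio options; device paths as above, size kept small purely for illustration, and the raw-device run should only happen with the resource down, as in the script):

fio --name=drbd-seq --filename=/dev/drbd0 --rw=write --bs=1M --size=16G \
    --direct=1 --ioengine=libaio --iodepth=16
fio --name=raw-seq  --filename=/dev/sda1  --rw=write --bs=1M --size=16G \
    --direct=1 --ioengine=libaio --iodepth=16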

~Noah

_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user