<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Hi,<br>
<br>
No real improvement, maybe a couple of MB/s at most. Also, what's strange
is that the same test, performed over and over again, produces quite
different results:<br>
<br>
[root@leviathan ftp]# dd if=/dev/zero of=zero.dat bs=1G count=1
oflag=dsync<br>
1+0 records in<br>
1+0 records out<br>
1073741824 bytes (1.1 GB) copied, 39.0488 seconds, 27.5 MB/s<br>
[root@leviathan ftp]# rm zero.dat<br>
[root@leviathan ftp]# rm -f zero.dat<br>
[root@leviathan ftp]# dd if=/dev/zero of=zero.dat bs=1G count=1
oflag=dsync<br>
1+0 records in<br>
1+0 records out<br>
1073741824 bytes (1.1 GB) copied, 37.3662 seconds, 28.7 MB/s<br>
[root@leviathan ftp]# rm -f zero.dat<br>
[root@leviathan ftp]# dd if=/dev/zero of=zero.dat bs=1G count=1
oflag=dsync<br>
1+0 records in<br>
1+0 records out<br>
1073741824 bytes (1.1 GB) copied, 19.4406 seconds, 55.2 MB/s<br>
[root@leviathan ftp]# rm -f zero.dat<br>
[root@leviathan ftp]# dd if=/dev/zero of=zero.dat bs=1G count=1
oflag=dsync<br>
1+0 records in<br>
1+0 records out<br>
1073741824 bytes (1.1 GB) copied, 20.3066 seconds, 52.9 MB/s<br>
<br>
<br>
All of the above tests were run less than a minute apart.<br>
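If it's of any use, I can also redo the test with smaller blocks and direct
I/O (bypassing the page cache), which sometimes gives more repeatable numbers;
something along these lines:<br>
<blockquote><pre># 1 GiB written in 1M blocks with O_DIRECT, bypassing the page cache
dd if=/dev/zero of=zero.dat bs=1M count=1024 oflag=direct</pre></blockquote>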
<br>
Thanks,<br>
<br>
Andrei.<br>
<br>
Marcelo Azevedo wrote:
<blockquote
cite="mid:204aa80806260422s4bbbc32eg4ebf27c8cf1ca4e6@mail.gmail.com"
type="cite">
<div>Put:</div>
<div> no-disk-flushes;<br>
no-md-flushes;<br>
</div>
<div>under disk { }.<br>
Tell me if it makes a difference... <br>
</div>
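<div>I.e. something like this (the resource name r0 is just a placeholder,
use your own), and note that disabling flushes is only really safe if the
controller has a battery-backed write cache:<br>
</div>
<blockquote><pre>resource r0 {
  disk {
    no-disk-flushes;   # don't issue cache flushes to the backing data disk
    no-md-flushes;     # same for the meta-data device
  }
  ...
}</pre></blockquote>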
<div class="gmail_quote">On Tue, Jun 24, 2008 at 2:12 PM, Andrei
Neagoe <<a moz-do-not-send="true" href="mailto:anne@imc.nl">anne@imc.nl</a>>
wrote:<br>
<blockquote class="gmail_quote"
style="border-left: 1px solid rgb(204, 204, 204); margin: 0px 0px 0px 0.8ex; padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff">Thanks a lot for the
clarification. That was exactly the case... from my understanding of
the docs I thought it was just necessary to run drbdadm adjust all on
each node, regardless of the node state (primary or secondary). Right
now it's pretty clear how I must proceed with the testing.<br>
What still puzzles me is that only one of the two resources needed a full
resync, because, as I said, I'm running lvm2 over both of them (with drbd0
and drbd1 as physical volumes).<br>
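For reference, the LVM layout on top of drbd is roughly the following (the
volume group / LV names and sizes here are only placeholders, not the real
ones):<br>
<blockquote><pre># both drbd devices serve as LVM physical volumes
pvcreate /dev/drbd0 /dev/drbd1
vgcreate vg_data /dev/drbd0 /dev/drbd1
lvcreate -L 500G -n lv_ftp vg_data
mkfs.ext3 /dev/vg_data/lv_ftp</pre></blockquote>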
Another thing is the speed, which for the moment is, let's say, satisfactory,
but I found a thread in the linbit archive where a user with a very similar
setup and testing scheme was getting ~37 MB/s over a fiber link between two
datacenters, and almost 80 MB/s when the nodes were connected with a
crossover cable. You can view the thread here: <a
href="http://archives.free.net.ph/message/20080523.225430.9ba8ceac.en.html">http://archives.free.net.ph/message/20080523.225430.9ba8ceac.en.html</a><br>
Testing both the network and direct writes to the external storage box shows
that neither of them is the bottleneck:<br>
<blockquote><pre>------------------------------------------------------------
Client connecting to 10.0.0.20, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.10 port 39353 connected with 10.0.0.20 port 5001
[ ID] Interval       Transfer     Bandwidth
[ 3]  0.0-10.0 sec   1125 MBytes  113 MBytes/sec
[ ID] Interval       Transfer     Bandwidth
[ 3]  0.0-10.0 sec   1125 MBytes  112 MBytes/sec
------------------------------------------------------------

[root@erebus testing]# dd if=/dev/zero of=test.dat bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 10.321 seconds, 104 MB/s
</pre></blockquote>
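For reference, that network test is iperf; from memory the invocation was
roughly "iperf -s" on 10.0.0.20 and something like the following on
10.0.0.10 (exact flags may differ):<br>
<blockquote><pre># client side; -f M makes iperf report throughput in MBytes/sec
iperf -c 10.0.0.20 -f M</pre></blockquote>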
Note that in the dd test above a different device is mounted on /testing
(just another logical drive on the same storage box). As additional
information, the storage box is an IBM DS3200 connected to the machine
through two SAS HBAs (just for redundancy, no load balancing).<br>
<br>
So at the moment I'm also pretty stuck on the performance tuning side, as I
don't really know what else to try.<br>
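One idea that did occur to me is writing through drbd while the peer is
disconnected, to see how much of the cost is the replication link versus the
local stack; roughly (the resource name r0 is just a placeholder):<br>
<blockquote><pre># temporarily disconnect from the peer on the primary
drbdadm disconnect r0
dd if=/dev/zero of=zero.dat bs=1G count=1 oflag=dsync
# reconnect; blocks written in the meantime are resynced from the bitmap
drbdadm connect r0</pre></blockquote>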
<br>
Thanks,<br>
<font color="#888888">Andrei Neagoe.</font>
<div>
<div class="Wj3C7c"><br>
<br>
Lars Ellenberg wrote:
<blockquote type="cite">
<pre>On Tue, Jun 24, 2008 at 12:27:30PM +0200, Andrei Neagoe wrote:
</pre>
<blockquote type="cite">
<pre>Hi,
Today I was playing with drbd's settings and benchmarking the results in
order to get the best performance out of the setup.
Here is my test setup:
2 identical machines with SAS storage boxes. Each machine has two 2TB devices
(in my case /dev/sdb and /dev/sdc) that I mirror over drbd, with LVM set up
on top of them. The nodes share a gbit link dedicated to drbd traffic.
After the initial sync, which took around 20 hours to finish, I created the
LVM volume and formatted it with an ext3 FS. Then I started to play around
with params like al-extents, unplug-watermark, max-buffers and max-epoch-size,
changing the values and doing a drbdadm adjust all on each node (of course
after copying the config file over accordingly). In the beginning it went
pretty well; the best value the dd test over drbd reached was 28.9 MB/s:
[root@erebus testing]# dd if=/dev/zero of=test.dat bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 37.1114 seconds, 28.9 MB/s
The configuration used is described at the end. After a couple more tests, I
noticed a big impact on performance, getting only around 19-20 MB/s, so I
checked /proc/drbd to see what was going on. Surprisingly, it was doing a
full resync on one of the disks. The problem is, I don't understand why, as
normally it should only resync the discrepancies.
</pre>
</blockquote>
<pre>if you change anything in the config file that changes "disk"
parameters (like on-io-error, size, fencing, use-bmbv, ...),
which causes drbdadm adjust to think it needs to detach/attach, and you
do that while being primary, you get a full sync.
this is unfortunate, and there should probably
be a dialog to warn you about it.
if you detach a Primary, then reattach, it will receive a full sync.
you need to make it secondary first, if you want to avoid that.
detaching, then reattaching a secondary will only receive an
"incremental" resync, which typically is a few KB or nothing at all,
depending on the timing.
if this is not what happened for you, read the kernel log,
typically drbd tells you why a resync was necessary.
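for example, roughly (resource name and mount point are just placeholders):

  # on the node that is currently Primary
  umount /mnt/data              # stop everything using the device first (fs, LVM, ...)
  drbdadm secondary r0
  drbdadm adjust r0             # the detach/attach now happens as Secondary
  drbdadm primary r0
  # remount / reactivate whatever sits on top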
--
: Lars Ellenberg                           http://www.linbit.com :
: DRBD/HA support and consulting            sales at linbit.com  :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
: Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
__
please don't Cc me, but send to list -- I'm subscribed
</pre>
</blockquote>
<br>
</div>
</div>
</div>
<br>
_______________________________________________<br>
drbd-user mailing list<br>
<a moz-do-not-send="true" href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a><br>
<a moz-do-not-send="true"
href="http://lists.linbit.com/mailman/listinfo/drbd-user"
target="_blank">http://lists.linbit.com/mailman/listinfo/drbd-user</a><br>
<br>
</blockquote>
</div>
<br>
</blockquote>
<br>
</body>
</html>