<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Hi there,<br>
<br>
My rather strange problem is that I get reasonable throughput when
writing directly to the device, and also when writing a file through DRBD.
Even a single NFS client is fine, but when I let my web server
cluster loose on the DRBD-based NFS server pair, everything grinds to a
halt. No errors, no retransmissions, no packet loss, nothing - just extremely
slow, almost frozen-up servers.<br>
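<br>
For what it's worth, here is roughly what I intend to watch on both nodes
the next time I try - standard tools only, nothing DRBD-specific, so take
it as a sketch rather than a recipe:<br>
<pre wrap="">  nfsstat -s                      # NFS server call counts
  netstat -s | grep -i retrans    # TCP retransmissions
  iostat -x 5                     # per-device utilisation and wait times
  vmstat 5                        # run queue, swapping, iowait
</pre>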
<br>
The strange thing is that I have an old NFS server that does just fine
under the exact same circumstances - albeit without the DRBD mirroring.<br>
<br>
Since I last posted here I've upgraded the pair to kernel 2.6.9
and drbd 0.7.5, but I haven't yet dared to switch the pair into action.
(Last time I tried, I had the first and worst outage on our web portal
in three years - I wouldn't like that to happen again.)<br>
<br>
Will post my progress on this.<br>
<br>
Best regards<br>
Jan<br>
<br>
Jeff Buck wrote:<br>
<blockquote cite="mid1100889037.19316.3.camel@ace" type="cite">
<pre wrap="">I don't know if this will help you, but we get around 20MB/sec with our
3ware raid + lvm2 + drbd. Our version of drbd is fairly old, and I
haven't messed with it lately. I think there have been "speed ups" on
one of the releases I've seen since our version (We're on 0.7.1 I
think). We've got all 8 disks in use on it, one of which is a hot spare.
On Fri, 2004-11-19 at 05:42, Jan Bruvoll wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Hi there,
I find your comment about S-ATA RAID controllers highly interesting -
I am struggling big time with getting anywhere near reasonable speeds
using an NFS - DRBD - 3ware RAID chain. What exactly do you mean by "slow",
and would you be able to share any pointers on how to pinpoint why my
server pair is so painfully slow?
Best regards & TIA
Jan
Philipp Reisner wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Hi,
Again I had the chance to set up a DRBD cluster for a LINBIT
customer. It was the first time I had one of these new SATA
devices really under my fingers [ without any of those enterprise-ready
RAID controllers, which are in reality rather slow... ]
The machines are some DELLs with a single P4 at 2.8 GHz and an
"Intel Corp. 6300ESB SATA Storage Controller" IDE controller,
two "Intel Corp. 82547GI Gigabit Ethernet Controller" NICs
and 512 MB RAM; the disk calls itself "ST380011AS", a Seagate
Barracuda.
At first the performance of the disk was miserable, in the
area of ~5 MB/sec; as it turned out, the reason was that we were
using Linux's common IDE (PATA) driver.
Then we tried the libata/ata_piix driver combination, and suddenly
we got write performance in the area of 40 MB/sec.
BTW, with libata the disk suddenly appears as a SCSI disk!
[ -> changing all config files from "hdc" to "sda" ]
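[ For reference, roughly what that switch looked like - the module name is
  the real one, the exact device names will of course differ per box:
    modprobe ata_piix    # libata driver for the onboard Intel SATA ports
    # ...then replace /dev/hdc with /dev/sda in drbd.conf, /etc/fstab, etc.
]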
Network setup:
e1000 driver, the machines connected with a straight
Cat5e cable; the cards were forced to "speed 1000" with
ethtool, and the MTU was set to 9000, aka jumbo frames.
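[ In shell terms, roughly - the interface name eth1 is just an example:
    ethtool -s eth1 speed 1000 duplex full autoneg off
    ifconfig eth1 mtu 9000
]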
I am interested in raw data throughput, so I did sequential
writes on an ext3 Filesystem.
Test1
I wrote a 1GB File (with sync) to the root partition
[Cyl: 250 to 1466] 3 times:
43.35 MB/sec (1073741824 B / 00:23.621594)
40.43 MB/sec (1073741824 B / 00:25.328009)
40.78 MB/sec (1073741824 B / 00:25.112768)
avg: 41.52
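[ The MB/sec figures are simply bytes / seconds, with MB meaning 2^20 bytes:
  e.g. 1073741824 B / 23.621594 s = ~45.5e6 B/sec = 43.35 MB/sec. ]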
Test2
Then I did the same on a connected DRBD device (protocol C),
also ext3: [Cyl: 2747 to 6151]
39.05 MB/sec (1073741824 B / 00:26.226047)
35.95 MB/sec (1073741824 B / 00:28.483070)
36.48 MB/sec (1073741824 B / 00:28.068153)
avg: 37.16
At first I was satisfied with the outcome that DRBD [with protocol C]
costs you about 10% of your throughput with sequential writes.
Test3
But then I did the same test with DRBD disconnected and got these
numbers:
[Cyl: 2747 to 6151]
39.63 MB/sec (1073741824 B / 00:25.840004)
40.30 MB/sec (1073741824 B / 00:25.406312)
39.82 MB/sec (1073741824 B / 00:25.713998)
avg: 39.91
I asked myself: why is it 4% below the first test?
Assumption: maybe because the mirrored partition lies behind the
root partition, and hard disks are slower on the inner
cylinders than on the outer ones.
Test4:
So I unloaded the DRBD module and mounted the backing storage devices
on the mountpoints directly! [Cyl: 2747 to 6151]
39.65 MB/sec (1073741824 B / 00:25.823633)
38.54 MB/sec (1073741824 B / 00:26.570280)
37.26 MB/sec (1073741824 B / 00:27.479914)
avg: 38.48
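[ Test4 in commands, roughly - mount point and partition are examples:
    umount /mnt/ha0 ; drbdadm down all ; rmmod drbd
    mount /dev/sda3 /mnt/ha0
]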
Test3 was 3.5% faster than Test4. This could be explained by the fact
that DRBD sometimes triggers the immediate write of buffers to disk.
The DRBD mirroring overhead, i.e. Test4 to Test2, is thus 3.4%, which is
smaller than the performance difference within the disk itself:
Test1 to Test4 is 7.3%.
CPU Usage:
I monitored CPU usage on the secondary system using the "top" utility,
and the highest value for the drbd_receiver thread was 7.7%.
Resync performance:
For the customer I configured the syncer to run at 10 MB/sec, which
makes sure that the customer's application will continue to run
during a resync operation. For testing purposes I set the
resync rate to 40M and got a resync rate in the area of 33 MByte/sec.
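[ The relevant drbd.conf fragment, roughly - the resource name is made up:
    resource r0 {
      syncer {
        rate 10M;   # 40M for the resync test above
      }
      ...
    }
]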
Effect of JumboFrames / 9000 MTU
I repeated Test2 with an MTU of 1500 Byte and got these numbers:
36.27 MB/sec (1073741824 B / 00:28.234703)
36.22 MB/sec (1073741824 B / 00:28.270442)
36.41 MB/sec (1073741824 B / 00:28.121841)
On the secondary system the CPU system time peaked at 7%,
and the observed maximum for the drbd_receiver thread was 9.7%.
So it seems that the jumbo frames only ease the task of the secondary
node, but do not improve performance.
Test Setup:
Linux 2.6.9
DRBD 0.7.5
Writes were done with this command:
  ./dm -a 0x00 -o /mnt/ha0/1Gfile -b1M -m -p -y -s1G
dm is from drbd-0.7.5.tar.gz, in the benchmark directory.
Conclusion:
===========
The performance inhomogeneity within a single disk drive can be
bigger (measured: 7.3%) than the loss of performance caused by DRBD
mirroring (measured: 3.4%).
This only holds true if the limiting factor is the performance of
the disk. In other words: your network link and your CPU need to
be strong enough.
-phil
</pre>
</blockquote>
<pre wrap="">_______________________________________________
drbd-user mailing list
<a class="moz-txt-link-abbreviated" href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a>
<a class="moz-txt-link-freetext" href="http://lists.linbit.com/mailman/listinfo/drbd-user">http://lists.linbit.com/mailman/listinfo/drbd-user</a>
</pre>
</blockquote>
<pre wrap=""><!---->
_______________________________________________
drbd-user mailing list
<a class="moz-txt-link-abbreviated" href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a>
<a class="moz-txt-link-freetext" href="http://lists.linbit.com/mailman/listinfo/drbd-user">http://lists.linbit.com/mailman/listinfo/drbd-user</a>
</pre>
</blockquote>
<br>
</body>
</html>