<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

Hi there,<br>

<br>

My rather strange problem is that I get reasonable throughput both when

writing directly to the device and when writing a file through DRBD.

Even a single NFS client is OK, but when I let my web server

cluster loose on the DRBD-based NFS server pair, everything grinds to a

halt. No errors, retransmissions or packet loss, nothing; just extremely

slow, almost frozen servers.<br>

<br>

The strange thing is that I have an old NFS server that does just fine

under the exact same circumstances - however without the DRBD mirroring.<br>

<br>

Since I last posted here I've upgraded the pair to kernel 2.6.9

and drbd 0.7.5, but I have not yet dared to put the pair into production.

(The last time I tried, we had the first and worst outage on our web portal

in three years, and I would not like that to happen again.)<br>

<br>

Will post my progress on this.<br>

<br>

Best regards<br>

Jan<br>

<br>

Jeff Buck wrote:<br>

<blockquote cite="mid1100889037.19316.3.camel@ace" type="cite">

  <pre wrap="">I don't know if this will help you, but we get around 20MB/sec with our

3ware raid + lvm2 + drbd. Our version of drbd is fairly old, and I

haven't messed with it lately. I think there have been "speed ups" in

one of the releases I've seen since our version (We're on 0.7.1 I

think). We've got all 8 disks in use on it, one of which is a hot spare.

On Fri, 2004-11-19 at 05:42, Jan Bruvoll wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">Hi there,

I find your comment about S-ATA RAID controllers highly interesting. I

am struggling big time to get anywhere near reasonable speeds using

an NFS - DRBD - 3Ware RAID chain. What exactly do you mean by "slow",

and would you be able to share any pointers on how to pinpoint why my

server pair is so painfully slow?

Best regards &amp; TIA

Jan

Philipp Reisner wrote:

    </pre>

    <blockquote type="cite">

      <pre wrap="">Hi,

Again I had the chance to set up a DRBD Cluster for a LINBIT

Customer. It was the first time I had one of these new SATA

devices really under my fingers [ without any of these enterprise-

ready RAID controllers, which are in reality rather slow... ]

The machines are some DELLs with a single P4 at 2.8 GHz and an

"Intel Corp. 6300ESB SATA Storage Controller" IDE Controller,

two "Intel Corp. 82547GI Gigabit Ethernet Controller" NICs

and 512 MB RAM; the disk calls itself "ST380011AS", a Seagate

Barracuda.

At first the performance of the disk was miserable, in the

area of ~5 MB/sec; as it turned out, the reason for this was

that we used Linux's common IDE (PATA) driver.

Then we tried the libata/ata_piix driver combination, and suddenly

we got a write performance in the area of 40MB/sec.

BTW, with libata the disk suddenly appears as a SCSI disk!

[ -&gt; changing all config files from "hdc" to "sda" ]

Network setup:

e1000 driver, the machines connected with a straight

cat5E cable; I forced the cards into "speed 1000" with

ethtool and set the MTU to 9000, aka jumbo frames.

I am interested in raw data throughput, so I did sequential

writes on an ext3 Filesystem. 

Test1

I wrote a 1GB File (with sync) to the root partition

[Cyl: 250 to 1466] 3 times:

43.35 MB/sec (1073741824 B / 00:23.621594)

40.43 MB/sec (1073741824 B / 00:25.328009)

40.78 MB/sec (1073741824 B / 00:25.112768)

avg: 41.52
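
The per-run figures can be re-derived from the byte counts and times.
The following is only a cross-check sketch; it assumes that "MB" in
these results means 2**20 bytes, which is what makes the numbers match.

```python
# Re-derive the Test1 figures from the raw byte counts and wall-clock times.
# Assumption: "MB" in the report means 2**20 bytes (MiB).
MIB = 2 ** 20


def throughput_mib_per_s(num_bytes, seconds):
    """Return the throughput in MiB/s for num_bytes written in seconds."""
    return num_bytes / seconds / MIB


# (bytes written, elapsed seconds) for the three Test1 runs quoted above
test1_runs = [
    (1073741824, 23.621594),
    (1073741824, 25.328009),
    (1073741824, 25.112768),
]

rates = [throughput_mib_per_s(b, s) for b, s in test1_runs]
average = sum(rates) / len(rates)

for rate in rates:
    print(f"{rate:.2f} MB/sec")
print(f"avg: {average:.2f}")
```

The same function applies unchanged to the runs of the other tests.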

Test2

Then I did the same on a connected DRBD device (protocol C), 

also ext3: [Cyl: 2747 to 6151]

39.05 MB/sec (1073741824 B / 00:26.226047)

35.95 MB/sec (1073741824 B / 00:28.483070)

36.48 MB/sec (1073741824 B / 00:28.068153)

avg: 37.16

At first I was satisfied with the outcome that DRBD [with protocol C]

costs you about 10% of your throughput with sequential writes.

Test3

But then I did the same test with DRBD disconnected and got these

      </pre>

    </blockquote>

    <pre wrap="">numbers:

    </pre>

    <blockquote type="cite">

      <pre wrap="">[Cyl: 2747 to 6151]

39.63 MB/sec (1073741824 B / 00:25.840004)

40.30 MB/sec (1073741824 B / 00:25.406312)

39.82 MB/sec (1073741824 B / 00:25.713998)

avg: 39.91

I asked myself: Why is it 4% below the first test?

Assumption: Maybe because the mirrored partition is behind the

           root partition, and hard disks are slower on the inner

           cylinders (higher cylinder numbers) than on the outer ones.

Test4:

So I unloaded the DRBD module and mounted the backing storage devices

on the mountpoints directly! [Cyl: 2747 to 6151]

39.65 MB/sec (1073741824 B / 00:25.823633)

38.54 MB/sec (1073741824 B / 00:26.570280)

37.26 MB/sec (1073741824 B / 00:27.479914)

avg: 38.48

Test3 was 3.5% faster than Test4. This could be explained by the fact

that DRBD sometimes triggers the immediate write of buffers to disk.

The DRBD mirroring overhead, i.e. Test4 to Test2, is 3.4%, which is

      </pre>

    </blockquote>

    <pre wrap="">smaller

    </pre>

    <blockquote type="cite">

      <pre wrap="">than the performance difference within the disk device; Test1 to

      </pre>

    </blockquote>

    <pre wrap="">Test4 

    </pre>

    <blockquote type="cite">

      <pre wrap="">is 7.3%
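
The percentages quoted for the four tests can be recomputed from the
averages. This is a sketch only; it assumes that every percentage is
taken relative to the faster of the two tests being compared, which is
my reading and not something stated in the results.

```python
# Relative differences between the four test averages quoted above.
# Assumption: each percentage is relative to the faster (reference) test.
averages = {"Test1": 41.52, "Test2": 37.16, "Test3": 39.91, "Test4": 38.48}


def relative_drop(reference, value):
    """Return by how many percent value falls below reference."""
    return (reference - value) / reference * 100.0


comparisons = [
    ("Test1", "Test2"),  # plain disk vs. connected DRBD, different cylinders
    ("Test1", "Test3"),  # root partition vs. disconnected DRBD
    ("Test3", "Test4"),  # disconnected DRBD vs. backing device, no DRBD
    ("Test4", "Test2"),  # backing device vs. connected DRBD (mirroring cost)
    ("Test1", "Test4"),  # root partition vs. backing device (disk position)
]

for ref, other in comparisons:
    drop = relative_drop(averages[ref], averages[other])
    print(f"{ref} to {other}: {drop:.1f}%")
```

Computed this way, small deviations of about 0.1 percentage points from
the rounded figures in the text are possible.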

CPU Usage:

I monitored CPU Usage on the secondary system using the "top"

      </pre>

    </blockquote>

    <pre wrap="">utility

    </pre>

    <blockquote type="cite">

      <pre wrap="">and the highest value for the drbd_receiver was 7.7%.

Resync performance:

For the customer I configured the syncer to run at 10MB/sec; this

makes sure that the customer's application will continue to run

during a resync operation. For testing purposes I set the 

resync rate to 40M and got a resync rate in the area of 33MByte/sec.

Effect of JumboFrames / 9000 MTU

I repeated Test2 with an MTU of 1500 Byte and got these numbers:

36.27 MB/sec (1073741824 B / 00:28.234703)

36.22 MB/sec (1073741824 B / 00:28.270442)

36.41 MB/sec (1073741824 B / 00:28.121841)

On the secondary system the CPU system time's highest point was 7%,

and the spotted maximum on the drbd_receiver thread was 9.7%

So it seems that jumbo frames only ease the task of the secondary

      </pre>

    </blockquote>

    <pre wrap="">node,

    </pre>

    <blockquote type="cite">

      <pre wrap="">but do not improve performance.
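
One plausible reason for the lower receiver load is the number of
frames the secondary has to handle. The sketch below is a rough
estimate only: it assumes full-sized frames, 40 bytes of IPv4 plus TCP
headers per frame and no TCP options, and it ignores ACKs and DRBD's
own packet headers. None of this was measured.

```python
# Rough estimate of frames per second on the replication link.
MIB = 2 ** 20
# Assumption: 20 bytes IPv4 header plus 20 bytes TCP header, no options.
TCP_IP_HEADER_BYTES = 40


def frames_per_second(rate_mib_per_s, mtu):
    """Approximate number of full-sized frames per second for a given rate."""
    payload_per_frame = mtu - TCP_IP_HEADER_BYTES
    return rate_mib_per_s * MIB / payload_per_frame


rate_mtu_9000 = 37.16                        # Test2 average, MTU 9000
rate_mtu_1500 = (36.27 + 36.22 + 36.41) / 3  # the three MTU 1500 runs

jumbo = frames_per_second(rate_mtu_9000, 9000)
standard = frames_per_second(rate_mtu_1500, 1500)

print(f"MTU 9000: about {jumbo:.0f} frames/s")
print(f"MTU 1500: about {standard:.0f} frames/s")
print(f"ratio: {standard / jumbo:.1f}x")
```

Under these assumptions the secondary sees roughly six times as many
frames at MTU 1500 for nearly the same throughput.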

Test Setup:

Linux 2.6.9

DRBD 0.7.5

Writes were done with this command: ./dm -a 0x00 -o /mnt/ha0/1Gfile -b1M -m -p

      </pre>

    </blockquote>

    <pre wrap="">-y -s1G

    </pre>

    <blockquote type="cite">

      <pre wrap="">dm is from drbd-0.7.5.tar.gz in the benchmark directory.
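
For readers without the dm tool at hand, a comparable timed write can
be sketched in Python. This is not dm and does not reproduce its exact
behaviour; it assumes that the command above amounts to writing zero
bytes in 1 MiB blocks and syncing to disk before the clock stops. The
function name and the small demo size are made up for illustration.

```python
import os
import tempfile
import time


def timed_sequential_write(path, total_bytes, block_size=2 ** 20):
    """Write total_bytes of zeros to path in block_size chunks, fsync,
    and return (elapsed seconds, throughput in MiB/s)."""
    block = bytes(block_size)  # a buffer of zero bytes
    remaining = total_bytes
    start = time.perf_counter()
    with open(path, "wb") as f:
        while remaining:
            chunk = min(block_size, remaining)
            f.write(block[:chunk])
            remaining -= chunk
        f.flush()
        os.fsync(f.fileno())  # include the flush to disk in the timing
    elapsed = time.perf_counter() - start
    return elapsed, total_bytes / elapsed / 2 ** 20


if __name__ == "__main__":
    # Demo with 8 MiB in a temporary directory; for a real measurement,
    # point the path at the file system under test and use 2**30 bytes.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "testfile")
        size = 8 * 2 ** 20
        seconds, rate = timed_sequential_write(target, size)
        print(f"{rate:.2f} MB/sec ({size} B / {seconds:.6f} s)")
```

Python adds some overhead of its own, so the absolute numbers are not
directly comparable with the dm results above.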

Conclusion:

===========

The performance inhomogeneity within a single disk drive can be

bigger (measured 7.3%) than the loss of performance caused by DRBD

mirroring (measured 3.4%).

This only holds true if the limiting factor is the performance of

the disk. In other words: Your network link and your CPU need to 

be strong enough.

-phil

      </pre>

    </blockquote>

    <pre wrap="">_______________________________________________

drbd-user mailing list

<a class="moz-txt-link-abbreviated" href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a>

<a class="moz-txt-link-freetext" href="http://lists.linbit.com/mailman/listinfo/drbd-user">http://lists.linbit.com/mailman/listinfo/drbd-user</a>

    </pre>

  </blockquote>


</blockquote>

<br>

</body>

</html>