[DRBD-user] drbd is performing at 12 MB/sec on recent hardware

Wed Oct 15 11:41:08 CEST 2008

On Wed, Oct 15, 2008 at 10:17:24AM +0200, Bart Coninckx wrote:
> On Tuesday 14 October 2008 23:17, Lars Marowsky-Bree wrote:
> > On 2008-10-14T18:32:50, Bart Coninckx <bart.coninckx at telenet.be> wrote:
> > > Hi all,
> > >
> > > I previously posted about a sync going somewhat slow (managed to get it
> > > up to 18MB/sec).
> > > I figured this could be just the syncing, so I decided to do some
> > > performance tests as mentioned in the webinar.
> > >
> > > I used:
> > > dd if=/dev/zero of=/dev/drbd0 bs=1G count=1 oflag=direct
> >
> > oflag=direct uses direct sync'ing (obviously), which means there's not
> > very much optimization going on. bs=1G also means that a huge memory
> > block is allocated and flushed.
> 
> I see. I used this exact command from the webinar, assuming it was a good way 
> to test performance.

it is.
it is a good micro benchmark for streaming throughput.

> > So at least these parameters are far from sane - bs=16384 oflag=fsync
> > might provide better performance.

it does not, in my experience.

when writing to a block device (as opposed to a file in a file system on
top of a block device), oflag=direct in my experience gives MUCH better
throughput than going through the buffer cache.
I have not yet investigated why the vm behaves that way
(I assume it is the vm's fault).

btw, oflag=fsync does not exist, it is conv=fsync.
and conv=fsync got introduced to dd later than oflag=direct, iirc.
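
for reference, a minimal sketch of the two invocations discussed above,
side by side. TARGET, RUN and build_dd_cmd are illustrative names, not
anything from the webinar; count=65536 is an assumption chosen so that
the bs=16384 run writes the same 1 GiB as the bs=1G run. it only prints
the commands unless RUN=1 is set, because writing to a block device
destroys whatever is on it.

```shell
#!/bin/sh
# Sketch only: TARGET, RUN and build_dd_cmd are illustrative names.
# WARNING: writing to a block device destroys its contents.
TARGET="${TARGET:-/dev/drbd0}"

# Print the dd command line for a given mode: "direct" or "fsync".
build_dd_cmd() {
    case "$1" in
        # One 1 GiB write, bypassing the buffer cache.
        direct) echo "dd if=/dev/zero of=$TARGET bs=1G count=1 oflag=direct" ;;
        # 65536 writes of 16 KiB (1 GiB total) through the cache, fsync at the end.
        fsync)  echo "dd if=/dev/zero of=$TARGET bs=16384 count=65536 conv=fsync" ;;
        *)      echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Dry run by default; set RUN=1 to actually execute (destructive!).
for mode in direct fsync; do
    cmd=$(build_dd_cmd "$mode") || exit 1
    echo "$cmd"
    if [ "${RUN:-0}" = "1" ]; then $cmd; fi
done
```

run the same pair against the backing device and against the DRBD
device, and compare the rates dd reports.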

> I get about 1.3 MB/sec when using these values  :-s

fix your local io subsystem.
fix your local io subsystem driver.

check
 local io subsystem,
 (battery backed) write cache enabled?
 raid resync/rebuild/re-whatever going on at the same time?
 does the driver honor BIO_RW_SYNC?
 does the driver introduce additional latency
 because of ill-advised "optimizations"?

if local io subsystem is ok,
but DRBD costs more than a few percent in throughput,
check your local io subsystem on the other node!

if that is ok as well, check network for throughput,
latency and packet loss/retransmits.

only if all of that seems ok too,
check drbd configuration.
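
to put a number on "more than a few percent", something like the
following sketch can help. overhead_pct is a hypothetical helper, and
the figures passed to it are made-up examples, not measurements from
this thread; feed it the MB/s you measured on the backing device alone
and through DRBD.

```shell
#!/bin/sh
# Sketch only: overhead_pct and the sample numbers are hypothetical.

# Print how much slower DRBD is than the backing device, in percent.
#   $1 = throughput of the backing device alone (MB/s)
#   $2 = throughput of the same test through DRBD (MB/s)
overhead_pct() {
    awk -v base="$1" -v drbd="$2" 'BEGIN {
        # Refuse to divide by zero or a negative baseline.
        if (base <= 0) exit 1
        printf "%.1f\n", (base - drbd) / base * 100
    }'
}

# A few percent: nothing to worry about.
overhead_pct 100 95
# A large gap: work through the checks above, in order.
overhead_pct 100 12
```

a small result means DRBD is not the bottleneck; a large one means one
of the layers listed above (peer io subsystem, network, then drbd
configuration) deserves a closer look.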

> > The next step might be to check whether you're using internal meta-data,
> > which also might not exactly perform very well (seek time issues).
> 
> I indeed use internal meta-data. There is somewhat of a problem in using 
> external meta-data since I have no disk left. I could reconfigure the RAID-5 
> and dedicate a virtual disk to this, but would this enhance performance?
> 
> > > The server is an HP ML370 G5 with a 1TB RAID-5 setup. The controller uses
> > > the cciss driver. DRBD is 0.7.22. OS is SLES 10 SP1. The network link is
> > > gigabit, full duplex, set to 1000 Mbit.
> >
> > Of course, I am inclined to advise an upgrade to SP2, which might help
> > with a newer cciss driver ;-)

a-ha.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com