[DRBD-user] DRBD (write) Performance on Intel e1000

Lars Ellenberg lars.ellenberg at linbit.com
Wed Dec 17 15:46:00 CET 2008

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


you need to subscribe to get your posts through!

On Wed, Dec 17, 2008 at 11:38:08AM +0000, Rudolph Bott wrote:
> > On Tue, Dec 16, 2008 at 08:23:39PM +0000, Rudolph Bott wrote:
> > > Hi List,
> > > 
> > > I was wondering if anyone might be able to share some performance
> > > information about his/her DRBD setup. Ours comes along with the
> > > following Hardware:
> > > 
> > > Hardware: Xeon QuadCore CPU, 2GB RAM, Intel mainboard with 2 onboard
> > > e1000 NICs and one additional NIC plugged into a regular PCI slot,
> > > 3ware 9650SE (PCI-Express) with 4 S-ATA disks in a RAID-10 array
> > > 
> > > Software: Ubuntu Hardy LTS with DRBD 8.0.11 (from the Ubuntu
> > > repository), kernel 2.6.24
> > > 
> > > One NIC acts as the "management interface", one as the DRBD link, one
> > > as the heartbeat interface. On top of DRBD runs LVM to allow the
> > > creation of volumes (which are in turn exported via iSCSI). Everything
> > > seems to run smoothly - but I'm not quite satisfied with the write
> > > speed available on the DRBD device (locally, I don't care about the
> > > iSCSI part yet).
> > > 
> > > All tests were done with dd (either copying from /dev/zero or to
> > > /dev/null with 1, 2 or 4GB sized files). Reading gives me speeds of
> > > around 390MB/sec, which is way more than enough - but writing does not
> > > exceed 39MB/sec. Direct writes to the RAID controller (without DRBD)
> > > are at around 95MB/sec, which is still below the limit of Gig-Ethernet.
> > > I spent the whole day tweaking various aspects (block-device tuning,
> > > TCP-offload settings, DRBD net settings etc.) and managed to raise the
> > > write speed from initially 25MB/sec to 39MB/sec that way.
> > > 
> > > Any suggestions what happens to the missing ~50-60MB/sec that the
> > > 3ware controller is able to handle? Do you think the PCI bus is
> > > "overtasked"? Would it be enough to simply replace the onboard NICs
> > > with an additional PCI-Express card, or do you think the limit is
> > > elsewhere? (DRBD settings, options set in the default distro kernel,
> > > etc.)
> > 
> > drbdadm dump all
> 
> common {
>     syncer {
>         rate             100M;
>     }
> }
> 
> resource storage {
>     protocol               C;
>     on nas03 {
>         device           /dev/drbd0;
>         disk             /dev/sda3;
>         address          172.16.15.3:7788;
>         meta-disk        internal;
>     }
>     on nas04 {
>         device           /dev/drbd0;
>         disk             /dev/sda3;
>         address          172.16.15.4:7788;
>         meta-disk        internal;
>     }
>     net {
>         unplug-watermark 1024;
>         after-sb-0pri    disconnect;
>         after-sb-1pri    disconnect;
>         after-sb-2pri    disconnect;
>         rr-conflict      disconnect;

_any_ thread about drbd tuning mentions
at least sndbuf-size, max-buffers, max-epoch-size, ...
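
for example, something along these lines in the net section (values are
illustrative only, not a recommendation for your particular box -- change
one knob at a time and re-measure):

    net {
        sndbuf-size      512k;   # TCP send buffer of the replication socket
        max-buffers      8000;   # buffers drbd may use for data in flight to disk
        max-epoch-size   8000;   # write requests allowed between two barriers
    }

these go next to the after-sb-* and unplug-watermark settings you already have.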

>     }
>     disk {
>         on-io-error      detach;

do you have a battery backed write cache on the controller?
if not, get one, and read about no-disk-flushes and no-md-flushes.
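
if (and only if) the 9650SE has a working BBU, that would look roughly like
this -- a sketch only, and note that these options may not be available yet
in your 8.0.11; without a battery backed write cache they trade data
integrity for speed:

    disk {
        on-io-error      detach;
        no-disk-flushes;         # don't send flushes down to the BBU-protected controller
        no-md-flushes;           # same for drbd's own meta-data writes
    }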

>     }
>     syncer {
>         rate             100M;
>         al-extents       257;

did you understand the al-extents setting,
and its tradeoff?
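
in short: the activity log covers al-extents * 4MiB of "hot" area on the
device.  with 257 extents, writes that move around force frequent synchronous
meta-data updates, which costs write throughput; a larger value means fewer
meta-data updates, but a longer resync after a crash of the then-primary.
illustrative sketch only, pick a number you can live with:

    syncer {
        rate       100M;
        al-extents 3389;   # ~3389 * 4MiB =~ 13GiB worst-case resync after a primary crash
    }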

>     }
>     startup {
>         wfc-timeout       20;
>         degr-wfc-timeout 120;
>     }
>     handlers {
>         pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>         pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>         local-io-error   "echo o > /proc/sysrq-trigger ; halt -f";
>     }
> }

> > what exactly does your micro benchmark look like?
> dd if=/dev/zero of=/mnt/testfile bs=1M count=2048
> dd if=/mnt/testfile of=/dev/null

the write test does not fsync.
add "conv=fsync".

or use oflag=direct, and a much larger bs and a smaller count.

or both.
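
i.e., with your test file, something like:

    # flush to disk before dd reports, so the page cache cannot flatter the number
    dd if=/dev/zero of=/mnt/testfile bs=1M count=2048 conv=fsync
    # or bypass the page cache altogether
    dd if=/dev/zero of=/mnt/testfile bs=64M count=32 oflag=direct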

the read test reads from page cache, unless you drop caches before each
run.  ok, so it is a large file, about the size of your RAM.  but still.
use iflag=direct, a much larger bs, and a smaller count if you are
interested in streaming read performance from storage.
if you want to benchmark page cache, fine...
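
for a streaming read number from storage, something like (as root, because
of drop_caches):

    echo 3 > /proc/sys/vm/drop_caches      # drop the page cache first
    dd if=/mnt/testfile of=/dev/null bs=64M count=32 iflag=direct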

> hmm... when I take the information above into account, I would
> say... maybe LVM is the bottleneck? The speed comparison to local
> writes (achieving ~95MB/sec) was done on the root fs, which sits
> directly on the sda device, not on top of LVM.

well, you could easily verify with a non-drbd lv.
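
e.g. (vg and lv names are placeholders; use a vg that does not sit on drbd0,
but on the same 3ware array):

    lvcreate -L 4G -n ddtest somevg
    dd if=/dev/zero of=/dev/somevg/ddtest bs=1M count=2048 oflag=direct
    lvremove -f somevg/ddtest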

I'd say you should read up on the al-extents.

and get a battery-backed write cache.

cheers,

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


