[DRBD-user] Slow Lustre performance over DRBD

Robert Dunkley Robert at saq.co.uk
Wed Jan 28 09:36:17 CET 2009

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi Somsak,

I run some DRBD systems over InfiniBand/IPoIB. Try setting a larger MTU and switching IPoIB to connected mode. What sort of raw throughput do you get over IPoIB? (My own 20Gb setup shows only about 700-1000MByte/sec with iperf.) What version of OFED are you running?
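
For reference, switching to connected mode and raising the MTU would look something like this (a sketch only; ib0 is an assumed interface name, and 65520 is the usual MTU ceiling for connected mode, so adjust for your setup):

    # switch IPoIB to connected mode (per interface, not persistent across reboots)
    echo connected > /sys/class/net/ib0/mode
    # raise the MTU; connected mode allows up to 65520
    ifconfig ib0 mtu 65520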

Rob

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Florian Haas
Sent: 28 January 2009 08:27
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Slow Lustre performance over DRBD

Somsak,

can you please take a look at our performance tuning webinar
(http://www.linbit.com/en/education/on-demand-webinars/drbd-performance-tuning/),
run the micro-benchmarks described there, and share your results?
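
(To illustrate the kind of throughput micro-benchmark meant here: this is only a sketch, not necessarily the exact procedure from the webinar. /dev/md10 is a placeholder for one of your RAID10 sets, and the command overwrites the device, so run it only on a set that holds no data yet:

    # raw sequential write throughput of the backing device, bypassing the page cache
    dd if=/dev/zero of=/dev/md10 bs=1M count=4096 oflag=direct

Running the same test against the corresponding /dev/drbdX device shows how much of the raw throughput DRBD preserves.)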

Also, it would be helpful if you could provide network throughput test
results for the IPoIB connection you are using for DRBD replication.
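
For example, a plain iperf run over that interface would give a baseline (the address below is a placeholder; use the IPoIB addresses DRBD replicates over):

    # on the first node
    iperf -s
    # on the second node, against the first node's IPoIB address
    iperf -c 192.168.100.1 -t 30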

Why are you using protocol A?  Can you afford to lose updates on node
failure?
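
(For reference, the protocol is set per resource in drbd.conf; this is only a sketch, your attached configuration will differ:

    resource r0 {
        protocol C;   # C = fully synchronous; A considers a write complete once
                      # it is on the local disk and in the local TCP send buffer
        ...
    }

With protocol A you can lose the most recently acknowledged writes if the primary node fails.)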

Cheers,
Florian

On 01/28/2009 07:14 AM, Somsak Sriprayoonsakul wrote:
> Dear List,
> 
>   I am setting up a 4-node Lustre cluster. The cluster consists of:
> 
> 1. 2 nodes with about 800GB of shared external storage; these do not
> use DRBD. These nodes serve as the Lustre MDS in active/passive mode.
> 
> 2. 2 Sun X4500 nodes, each containing 48 disks of 750GB. These nodes
> serve as Lustre OSSs. The 2 boot disks were combined into a RAID1 for the
> OS installation. The remaining 46 disks I divided into 6 RAID10 groups of
> 8 drives each (one group uses the 2 vacant partitions left over after
> making the RAID1 on the OS drives). All the RAIDs are software RAID (this
> Thumper does not support hardware RAID).
> 
>   Note that both X4500 nodes connect to the clients over InfiniBand SDR
> (10Gbps). Each client has DDR (20Gbps) InfiniBand installed.
> 
>   I first conducted a test by creating Lustre over those 6 RAID10 sets
> (3 from one node, 3 from the other) without DRBD. Running iozone in
> parallel mode (-t 8 -+m) from 8 clients over InfiniBand yielded a total
> throughput of about 1.3-1.4GB/s. Monitoring the RAID devices with
> "iostat -d 2 -k", each RAID10 delivered about 200+MB/s.
> 
>   Then I switched over to DRBD, with one device per RAID group mirroring
> to the neighbor node. Each node serves 3 primary DRBD devices, exported
> as Lustre OSTs. Running the same test again, performance dropped to only
> about 350MB/s maximum. iostat on each machine showed each RAID delivering
> only about 50+MB/s. This is the maximum I got after tweaking many
> parameters in drbd.conf. Attached to this e-mail is the current
> configuration used in this set-up.
>     I think I have already tweaked and adjusted every parameter I could.
> I am aware that performance over DRBD will be slower, but I think it
> should be at least about 600-700MB/s. I also tested switching to Gigabit
> Ethernet for DRBD, but the performance was much worse, as expected.
> 
>   Could anyone suggest performance tuning for my set-up?
> 
> 
> Rgds,

-- 
When replying, there is no need to CC my personal address.
I monitor the list on a daily basis. Thank you.

LINBIT® and DRBD® are registered trademarks of LINBIT.
_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user

