[DRBD-user] Slow Lustre performance over DRBD
Somsak Sriprayoonsakul
somsak_sr at thaigrid.or.th
Wed Jan 28 17:43:28 CET 2009
Thanks all for sharing.
About the webinar, I have looked through it, but maybe I missed something.
I have already adjusted sndbuf-size, al-extents, no*flush, max-buffers, and
unplug-watermark. The only parameters that really improve performance
are sndbuf-size and no*flush.
I admit that I'm not an InfiniBand expert. But the performance of pure
Lustre over IB is really good without tweaking anything, so I thought
IB should be doing OK already. I will double-check that.
I use the InfiniBand tools that came with RHEL4, but the kernel-ib package was
taken from Lustre. rpm -qi kernel-ib returns
Name    : kernel-ib                          Relocations: (not relocatable)
Version : 1.3                                Vendor: OpenFabrics
Release : 2.6.9_67.0.22.EL_lustre.1.6.6smp   Build Date: Fri 12 Sep 2008 06:08:31 AM ICT
So I guess I'm using OFED 1.3.
Currently my IPoIB MTU is 65520. Do I need to increase it?
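
For anyone who wants to check the same two settings, here is a minimal sketch that reads the IPoIB transport mode and MTU from sysfs. It assumes the interface is called ib0 and that the IPoIB driver exposes `mode` and `mtu` under /sys/class/net/<iface>/; the function name `ipoib_settings` is only illustrative, and the script needs a reasonably recent Python 3. An MTU of 65520 is normally only reachable in connected mode, so the mode should read "connected" here.

```python
from pathlib import Path

def ipoib_settings(iface="ib0", sysfs_root="/sys/class/net"):
    """Return (mode, mtu) for an IPoIB interface as reported by sysfs.

    iface and sysfs_root are assumptions; adjust them for the actual system.
    """
    base = Path(sysfs_root) / iface
    mode = (base / "mode").read_text().strip()   # "datagram" or "connected"
    mtu = int((base / "mtu").read_text().strip())
    return mode, mtu

if __name__ == "__main__":
    try:
        mode, mtu = ipoib_settings()
        print(f"ib0: mode={mode} mtu={mtu}")
    except OSError as exc:
        # No IPoIB interface with that name, or sysfs layout differs.
        print(f"could not read IPoIB settings: {exc}")
```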
For the raw performance, I got

#bytes   #iterations   BW peak[MB/sec]   BW average[MB/sec]
65536    5000          623.60            623.58

from ib_write_bw. I think it reached 700 in my first test; the lower number
now might be because a few things (DRBD, Lustre) were running over the link
while I was testing.
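
As a rough sanity check of these numbers, the sketch below compares the measured ib_write_bw figure with the replication traffic this setup would need. It assumes 3 primary DRBD devices per node at about 200 MB/s each (figures from my original post quoted below), takes the SDR data rate as 8 Gbit/s after 8b/10b encoding, and treats MB as 10^6 bytes, which may not match the unit ib_write_bw reports. The function names are only illustrative.

```python
# Back-of-the-envelope check; names and unit conventions are assumptions.

SDR_SIGNAL_GBIT = 10.0      # SDR 4x signalling rate in Gbit/s
ENCODING_EFFICIENCY = 0.8   # 8b/10b encoding leaves 80% for data

def sdr_data_rate_mb_s():
    """Usable SDR data rate in MB/s (MB taken as 10**6 bytes)."""
    return SDR_SIGNAL_GBIT * ENCODING_EFFICIENCY * 1000 / 8

def replication_demand_mb_s(primaries_per_node, mb_s_per_raid):
    """Traffic one node sends to its peer if every primary writes at full disk speed."""
    return primaries_per_node * mb_s_per_raid

measured = 623.58                          # BW average from ib_write_bw
demand = replication_demand_mb_s(3, 200)   # 3 primaries at about 200 MB/s each

print(f"SDR data rate:       {sdr_data_rate_mb_s():.0f} MB/s")
print(f"measured RDMA write: {measured:.0f} MB/s")
print(f"replication demand:  {demand} MB/s per direction")
print(f"headroom:            {measured - demand:.0f} MB/s")
```

Under these assumptions the raw RDMA figure only just covers the replication demand, and IPoIB throughput is usually lower than raw RDMA, so the link itself may already be one of the limits.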
Robert Dunkley wrote:
> Hi Somsak,
>
> I use some DRBD systems running over Infiniband/IPOIB, try setting a larger MTU and switch IPOIB to connected mode. What sort of raw performance results do you get over IPOIB? (My own 20Gb setup shows only about 700-1000MByte/sec with IPerf). What version of OFED are you running?
>
> Rob
>
> -----Original Message-----
> From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Florian Haas
> Sent: 28 January 2009 08:27
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] Slow Lustre performance over DRBD
>
> Somsak,
>
> can you please take a look at our performance tuning webinar
> (http://www.linbit.com/en/education/on-demand-webinars/drbd-performance-tuning/),
> run the micro-benchmarks described there, and share your results?
>
> Also, it would be helpful if you could provide network throughput test
> results for your IPoIB connection you are using for DRBD replication.
>
> Why are you using protocol A? Can you afford to lose updates on node
> failure?
>
> Cheers,
> Florian
>
> On 01/28/2009 07:14 AM, Somsak Sriprayoonsakul wrote:
>
>> Dear List,
>>
>> I am setting up a 4-node Lustre cluster. The cluster consists of
>>
>> 1. 2 nodes with a shared external storage of about 800GB; this pair does not
>> use DRBD. These nodes serve as Lustre MDS in active/passive mode.
>>
>> 2. 2 Sun X4500 nodes, each containing 48 disks of 750GB. These nodes
>> serve as Lustre OSSs. 2 boot disks were combined using RAID1 for the OS
>> installation. The remaining 46 disks were divided into 6 groups of
>> RAID10 with 8 HDDs per group (one group uses the 2 vacant
>> partitions left over after making the RAID1 for the OS drive). All the RAIDs
>> were configured using software RAID (this Thumper does not support hardware
>> RAID).
>>
>> Note that both X4500 nodes use InfiniBand SDR (10Gbps) to connect
>> to the clients. Each client has DDR (20Gbps) InfiniBand
>> installed.
>>
>> I conducted a test by creating Lustre over those 6 RAID10 groups (3 from
>> one node, 3 from the other) without DRBD. Running iozone in
>> parallel mode (-t 8 -+m) on 8 clients over InfiniBand yielded a total
>> performance of about 1.3 - 1.4GB/s. I monitored the RAID devices with
>> "iostat -d 2 -k"; each RAID10 could deliver about 200+MB/s.
>>
>> Then I switched over to DRBD, with one device per RAID group mirroring to
>> the neighbor node. Each node serves 3 primary DRBD devices exported as
>> Lustre OSSs. Then I ran the same test again, but this time the performance
>> dropped to only about 350MB/s maximum. I ran iostat on each machine and
>> each RAID only delivered about 50+MB/s. This was the maximum I got from
>> tweaking many parameters in drbd.conf. Attached to this e-mail is the
>> current configuration used in this set-up.
>> I think I have already tweaked and adjusted all the parameters I could. I
>> am aware that the performance over DRBD will be slower, but I think it
>> should be at least about 600-700MB/s. I also tested switching over to
>> Gigabit Ethernet for DRBD, but the performance was much worse, as expected.
>>
>> Could anyone suggest performance tuning for my set-up?
>>
>>
>> Rgds,
>>
>
>