[DRBD-user] Slow Lustre performance over DRBD

Somsak Sriprayoonsakul somsak_sr at thaigrid.or.th
Wed Jan 28 18:21:12 CET 2009

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


More information: I just tried iperf, and the performance is quite low:

[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  2.01 GBytes  1.73 Gbits/sec

I used the pre-compiled version of iperf from RPMforge.
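For reference, this is roughly how I ran it (the host name and TCP window 
size are just examples):

# on one X4500, over the IPoIB interface (server side)
iperf -s

# on the peer X4500 (client side), 10-second TCP test
iperf -c storage-0-3-ib -t 10 -w 512k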

Oops! Did I mention that I'm using CentOS 4 on the X4500 machines, while 
the MDS machines run CentOS 5? I have to use different operating systems 
because of the poor driver support for the X4500 on RHEL5.

About IPoIB connected mode: it seems that the InfiniBand interface is 
already running in connected mode.

[root at storage-0-3 ~]# cat /sys/class/net/ib1/mode
connected
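(For completeness, had it not been, this is roughly how connected mode and a 
larger MTU would be set -- the interface name and MTU value are examples:)

echo connected > /sys/class/net/ib1/mode
ip link set ib1 mtu 65520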

I will try OFED 1.4 instead, to see if it improves the performance.

Do I need to upgrade the firmware to maximize performance? Right now it is:

[root at storage-0-3 ~]# cat /sys/class/infiniband/mthca0/fw_ver
3.5.0
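(In case it is useful to others, two other ways that I believe show the HCA 
firmware version -- the PCI address below is just a placeholder:)

ibstat                           # prints a "Firmware version" line per HCA
mstflint -d 0000:0b:00.0 query   # queries the firmware image on the adapter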

Sorry if the question is becoming irrelevant to the list.

Somsak Sriprayoonsakul wrote:
> Thanks all for sharing.
>
> About the webinar, I have looked through it, but maybe I missed 
> something. I have already adjusted sndbuf-size, al-extents, the 
> no-*-flushes options, max-buffers, and unplug-watermark. The only 
> parameters that really improved performance were sndbuf-size and the 
> no-*-flushes options.
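> For reference, a minimal sketch of where those options live in drbd.conf 
> (DRBD 8.x syntax; the resource name and values are only illustrative, not 
> the ones I actually use):
>
> resource ost1 {
>     protocol A;                  # currently protocol A here
>     syncer {
>         al-extents       3389;
>     }
>     disk {
>         no-disk-flushes;         # skip flushes to the backing device
>         no-md-flushes;           # skip flushes to the meta-data device
>     }
>     net {
>         sndbuf-size      512k;
>         max-buffers      8000;
>         unplug-watermark 8000;
>     }
>     # "on <host>" sections (device, disk, address, meta-disk) omitted
> }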
>
> I admit that I'm not an InfiniBand expert, but the performance of pure 
> Lustre over IB is really amazing without tweaking anything, so I 
> thought IB should already be doing OK. I will double-check that again.
>
> I use the InfiniBand tools that came with RHEL4, but the kernel-ib package 
> was taken from Lustre. rpm -qi kernel-ib returns:
>
> Name        : kernel-ib                    Relocations: (not relocatable)
> Version     : 1.3                               Vendor: OpenFabrics
> Release     : 2.6.9_67.0.22.EL_lustre.1.6.6smp   Build Date: Fri 12 Sep 2008 06:08:31 AM ICT
>
> So I guess I'm using OFED 1.3.
>
> Currently my MTU is 65520; do I need to increase it?
>
> For the raw IB performance, I got:
>
> #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec] 
>  65536        5000             623.60                623.58
>
> from ib_write_bw. I think it reached about 700 in my first test; the drop 
> might be because a few things (DRBD, Lustre) were running over it during 
> the test.
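> (That number came from the perftest tools, run roughly like this -- the 
> peer host name is just an example:)
>
> ib_write_bw                  # on one node (server side)
> ib_write_bw storage-0-2      # on the other node; prints BW peak/average in MB/sec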
>
> Robert Dunkley wrote:
>> Hi Somsak,
>>
>> I use some DRBD systems running over InfiniBand/IPoIB; try setting a 
>> larger MTU and switching IPoIB to connected mode. What sort of raw 
>> performance results do you get over IPoIB? (My own 20Gb setup shows 
>> only about 700-1000 MByte/sec with iperf.) What version of OFED are 
>> you running?
>>
>> Rob
>>
>> -----Original Message-----
>> From: drbd-user-bounces at lists.linbit.com 
>> [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Florian Haas
>> Sent: 28 January 2009 08:27
>> To: drbd-user at lists.linbit.com
>> Subject: Re: [DRBD-user] Slow Lustre performance over DRBD
>>
>> Somsak,
>>
>> can you please take a look at our performance tuning webinar
>> (http://www.linbit.com/en/education/on-demand-webinars/drbd-performance-tuning/), 
>>
>> run the micro-benchmarks described there, and share your results?
>>
>> Also, it would be helpful if you could provide network throughput test
>> results for the IPoIB connection you are using for DRBD replication.
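>> As a rough sketch, the disk-throughput micro-benchmarks are of this kind
>> (the exact commands in the webinar may differ; device names are examples,
>> and these writes destroy data, so use scratch devices only):
>>
>> # sequential writes to the raw backing device
>> dd if=/dev/zero of=/dev/md3 bs=1M count=4096 oflag=direct
>> # the same writes through the DRBD device stacked on top of it
>> dd if=/dev/zero of=/dev/drbd0 bs=1M count=4096 oflag=direct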
>>
>> Why are you using protocol A?  Can you afford to lose updates on node
>> failure?
>>
>> Cheers,
>> Florian
>>
>> On 01/28/2009 07:14 AM, Somsak Sriprayoonsakul wrote:
>>  
>>> Dear List,
>>>
>>>   I am setting up a 4-node Lustre cluster. The cluster consists of:
>>>
>>> 1. 2 nodes with shared external storage of about 800GB; these do not
>>> use DRBD. They serve as the Lustre MDS in active/passive mode.
>>>
>>> 2. 2 Sun X4500 nodes, each containing 48 disks of 750GB. These nodes
>>> serve as Lustre OSSs. The 2 boot disks were combined using RAID1 for the
>>> OS installation. The remaining 46 disks were divided into 6 RAID10 groups
>>> of 8 drives each (one group uses the 2 partitions left vacant after
>>> creating the RAID1 OS array). All the RAID was configured as software
>>> RAID (the Thumper does not support hardware RAID).
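>>> (Each RAID10 group was created roughly like this -- the md number and
>>> member names are just an example:)
>>>
>>> mdadm --create /dev/md2 --level=10 --raid-devices=8 /dev/sd[b-i]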
>>>
>>>   Note that both X4500 nodes use InfiniBand SDR (10Gbps) to connect
>>> to the clients, while each client has DDR (20Gbps) InfiniBand
>>> installed.
>>>
>>>   I conducted a test by creating Lustre over those 6 RAID10 groups (3
>>> from one node, 3 from the other) without DRBD, and ran iozone in
>>> parallel mode (-t 8 -+m) from 8 clients over InfiniBand, which yielded a
>>> total throughput of about 1.3-1.4GB/s. I monitored the RAID devices with
>>> "iostat -d 2 -k"; each RAID10 could deliver about 200+MB/s.
>>>
>>>   Then I switched over to DRBD, with one device per RAID group
>>> mirroring to the neighbor node. Each node serves 3 primary DRBD devices
>>> exported as Lustre OSTs. I then ran the same test again, but this time
>>> performance dropped to only about 350MB/s maximum. I ran iostat on each
>>> machine and each RAID delivered only about 50+MB/s. This is the maximum
>>> I got from tweaking many parameters in drbd.conf. Attached to this
>>> e-mail is the configuration currently used in this setup.
>>>     I think I have already tweaked and adjusted all the parameters I
>>> could. I am aware that performance over DRBD will be slower, but I think
>>> it should be at least about 600-700MB/s. I also tested switching to
>>> Gigabit Ethernet for DRBD, but the performance was much worse, as
>>> expected.
>>>
>>>   Could anyone suggest performance tuning for my setup?
>>>
>>>
>>> Rgds,
>>>     
>>


