[DRBD-user] Performance with DRBD + iSCSI

Weilin Gong wgong at alcatel-lucent.com
Thu Feb 22 01:39:12 CET 2007



Ross S. W. Walker wrote:
>> -----Original Message-----
>> From: Weilin Gong [mailto:wgong at alcatel-lucent.com] 
>> Sent: Wednesday, February 21, 2007 5:47 PM
>> To: Ross S. W. Walker
>> Cc: drbd-user at linbit.com
>> Subject: Re: [DRBD-user] Performance with DRBD + iSCSI
>>
>> Ross S. W. Walker wrote:
>>>> -----Original Message-----
>>>> From: drbd-user-bounces at lists.linbit.com
>>>> [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Weilin Gong
>>>> Sent: Wednesday, February 21, 2007 1:11 PM
>>>> Cc: drbd-user at lists.linbit.com
>>>> Subject: Re: [DRBD-user] Performance with DRBD + iSCSI
>>>>
>>>> Ross S. W. Walker wrote:
>>>>> You can only write into drbd using what your application can handle,
>>>>> and for VFS file operations that is 4k io!
>>>> On Solaris ufs, the "maxcontig" parameter can be tuned to specify
>>>> the number of contiguous blocks written to the disk. Haven't found
>>>> the equivalent on Linux yet.
>>> Well, if you write your own app you can bypass VFS page-memory io
>>> restriction by using the generic block layer.
>>>
>>> I'm not sure if you quite understand the maxcontig parameter either:
>>>
>>> maxcontig=n    The maximum number of logical blocks, belonging to
>>>                one file, that are allocated contiguously. The
>>>                default is calculated as follows...
>>>
>>> This parameter is for tuning disk space allocation in order to reduce
>>> fragmentation; it doesn't affect the io block size.
>> Actually, this defines the max io data size the file system sends down
>> to the driver. We have a home-grown vdisk driver, similar to drbd; the
>> "maxcontig" had to be tuned to match the size of the buffer allocated
>> for the network transport.
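That buffer-matching tuning comes down to simple division. A sketch with
assumed sizes (the 128 KiB transport buffer and 8 KiB UFS block size are
illustrative, not the actual values from our driver):

```shell
netbuf=131072    # assumed 128 KiB network transport buffer
fsblock=8192     # assumed 8 KiB UFS logical block size
# maxcontig should be netbuf / fsblock, so the largest request the
# filesystem emits fits exactly one transport buffer:
echo $((netbuf / fsblock))   # -> 16
```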
>
> Ok, so I found this:
>
> ----------------
> Tune maxcontig
>
> Under the Solaris OE, UFS uses an extent-like feature called clustering.
> It is impossible to have a default setting for maxcontig that is optimal
> for all file systems. It is too application dependent. Many small files
> accessed in a random pattern do not need extents and performance can
> suffer for both reads and writes when using extents. Larger files can
> benefit from read-ahead on reads and improved allocation units when
> using extents in writes.
>
> For reads, the extent-like feature is really just a read ahead. To
> simply and dynamically tune the read-ahead algorithm, use the tunefs(1M)
> command as follows:
>
> # tunefs -a 4 /ufs1
>
> The value changed is maxcontig, which sets the number of file system
> blocks read in read ahead. The preceding example changes the maximum
> contiguous block count from 32 (the default) to 4.
>
> When a process reads more than one file system block, the kernel
> schedules reads to fill the rest of maxcontig * file system blocksize
> bytes. A single 8 kilobyte, or smaller, random read on a file does not
> trigger read ahead. Read ahead does not occur on files being read with
> mmap(2).
>
> The kernel attempts to automatically detect whether an application is
> doing small random or large sequential I/O. This often works fine, but
> the definition of small or large depends more on system application
> criteria than on device characteristics. Tune maxcontig to obtain
> optimal performance.
> ----------------
>
> So this is like setting read-ahead with blockdev --setra XX /dev/XX
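The sizing in the quoted passage can be sanity-checked with quick shell
arithmetic (the 8 KiB UFS block size is a common default, assumed here):

```shell
# Read ahead fills maxcontig * filesystem blocksize bytes, per the
# quoted text. Assuming an 8 KiB UFS block size:
blocksize=8192
echo $((32 * blocksize))  # default maxcontig=32 -> 262144 bytes (256 KiB)
echo $((4 * blocksize))   # after tunefs -a 4    ->  32768 bytes (32 KiB)
```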
"maxcontig" is also used for writes:

In UFS, the filesystem cluster size, for both reads and writes, is set
to the value of maxcontig. The filesystem cluster size is used to
determine:

    * The maximum number of logical blocks contiguously laid out on disk
      for a UFS filesystem before inserting a rotational delay.
    * When, and how much, to read ahead and/or write behind when the
      sequential IO case is detected. The algorithm that determines
      sequential read ahead in UFS is broken, so system administrators
      use the maxcontig value to tune their filesystems to achieve
      better random I/O performance.
    * How many pages to attempt to push out to disk at a time, and how
      frequently to push them, because in UFS pages are clustered for
      writes based on the filesystem cluster size.
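For the read-ahead side, the closest Linux knob I've seen is the per-device
read-ahead setting mentioned above. A sketch, with a hypothetical device
path; note that Linux counts read-ahead in 512-byte sectors, not filesystem
blocks:

```shell
# Hypothetical device path; substitute the DRBD backing device.
# Linux read-ahead is counted in 512-byte sectors:
echo $((256 * 512))               # --setra 256 -> 131072 bytes (128 KiB)
# blockdev --getra /dev/sda       # query current read-ahead (needs a real device)
# blockdev --setra 256 /dev/sda   # set read-ahead to 256 sectors
```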


> -Ross


