[DRBD-user] Performance with DRBD + iSCSI

Ross S. W. Walker rwalker at medallion.com
Thu Feb 22 01:06:04 CET 2007

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


> -----Original Message-----
> From: Weilin Gong [mailto:wgong at alcatel-lucent.com] 
> Sent: Wednesday, February 21, 2007 5:47 PM
> To: Ross S. W. Walker
> Cc: drbd-user at linbit.com
> Subject: Re: [DRBD-user] Performance with DRBD + iSCSI
> 
> Ross S. W. Walker wrote:
> >> -----Original Message-----
> >> From: drbd-user-bounces at lists.linbit.com 
> >> [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of
> >> Weilin Gong
> >> Sent: Wednesday, February 21, 2007 1:11 PM
> >> Cc: drbd-user at lists.linbit.com
> >> Subject: Re: [DRBD-user] Performance with DRBD + iSCSI
> >>
> >> Ross S. W. Walker wrote:
> >>> You can only write into drbd using what your application can
> >>> handle and for VFS file operations that is 4k io!
> >> On Solaris ufs, the "maxcontig" parameter can be tuned to specify
> >> the number of contiguous blocks written to the disk. Haven't found
> >> the equivalent on Linux yet.
> >
> > Well, if you write your own app you can bypass the VFS page-memory
> > io restriction by using the generic block layer.
> >
> > I'm not sure if you quite understand the maxcontig parameter either:
> >
> >     maxcontig=n    The maximum number of logical blocks,
> >                    belonging to one file, that are allocated
> >                    contiguously. The default is calculated as
> >                    follows...
> >
> > This parameter is for tuning disk space allocation in order to
> > reduce fragmentation; it doesn't affect the io block size.
> >   
> Actually, this defines the max io data size the file system sends
> down to the driver. We have a home-grown vdisk driver, similar to
> drbd, and its "maxcontig" had to be tuned to match the size of the
> buffer allocated for the network transport.
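
Coming back to the point about bypassing the VFS 4k page io: the way
to do that from your own app is to open the device with O_DIRECT and
issue larger, aligned ios straight at the block layer. A minimal,
untested sketch (the device path and io size are just examples):

----------------
/* Untested sketch: open the device O_DIRECT and push one large,
 * aligned write past the page cache. /dev/drbd0 and 64k are just
 * examples. Build: gcc -O2 -o bigio bigio.c */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t io_size = 64 * 1024;  /* one 64k io, not 16 x 4k */
    void *buf;
    int fd;

    /* O_DIRECT needs the buffer aligned to the device's logical
     * block size; 4k covers the common cases. */
    if (posix_memalign(&buf, 4096, io_size) != 0) {
        perror("posix_memalign");
        return 1;
    }
    memset(buf, 0xab, io_size);

    fd = open("/dev/drbd0", O_WRONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (pwrite(fd, buf, io_size, 0) != (ssize_t)io_size) {
        perror("pwrite");
        close(fd);
        return 1;
    }

    close(fd);
    free(buf);
    return 0;
}
----------------

That sends one 64k io down as a single request instead of sixteen 4k
page writes (the block layer may still split it at the device's
limits).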

Ok, so on the maxcontig question I found this:

----------------
Tune maxcontig

Under the Solaris OE, UFS uses an extent-like feature called clustering.
It is impossible to have a default setting for maxcontig that is optimal
for all file systems. It is too application dependent. Many small files
accessed in a random pattern do not need extents and performance can
suffer for both reads and writes when using extents. Larger files can
benefit from read-ahead on reads and improved allocation units when
using extents in writes.

For reads, the extent-like feature is really just a read ahead. To
simply and dynamically tune the read-ahead algorithm, use the tunefs(1M)
command as follows:

# tunefs -a 4 /ufs1

The value changed is maxcontig, which sets the number of file system
blocks read in read ahead. The preceding example changes the maximum
contiguous block count from 32 (the default) to 4.

When a process reads more than one file system block, the kernel
schedules reads to fill the rest of maxcontig * file system blocksize
bytes. A single 8 kilobyte, or smaller, random read on a file does not
trigger read ahead. Read ahead does not occur on files being read with
mmap(2).

The kernel attempts to automatically detect whether an application is
doing small random or large sequential I/O. This often works fine, but
the definition of small or large depends more on system application
criteria than on device characteristics. Tune maxcontig to obtain
optimal performance.
----------------
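
(With the default maxcontig of 32 and the usual 8k UFS block size,
that works out to 32 * 8k = 256k of read-ahead per trigger; the
tunefs example above cuts it to 4 * 8k = 32k.)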

So this is like setting the read-ahead on Linux with blockdev --setra XX /dev/XX.
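
For example, on Linux (sdb is just a placeholder, and the units are
512-byte sectors):

# blockdev --getra /dev/sdb
# blockdev --setra 512 /dev/sdb

--getra prints the current read-ahead and --setra 512 would give 256k,
which is roughly the same knob maxcontig turns for UFS reads.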

-Ross




