[DRBD-user] Performance regression with DRBD 8.3.12 and newer

Matthias Hensler lists-drbd at wspse.de
Mon Jun 11 23:29:09 CEST 2012


On Mon, Jun 11, 2012 at 11:00:09PM +0200, Matthias Hensler wrote:
> On Mon, Jun 11, 2012 at 10:31:16PM +0200, Florian Haas wrote:
> > On 06/11/12 22:14, Matthias Hensler wrote:
> > > Indeed, the problem lies within the kernel version used to build the
> > > drbd.ko module. I double checked by using all userland tools from 8.3.13
> > > elrepo build together with my drbd.ko build on 2.6.32-71 (but run from
> > > 2.6.32-220).
> > > 
> > > Just to be clear: all tests were made with kernel 2.6.32-220, and the
> > > userland version does not matter.
> > > 
> > > drbd.ko              | 8.3.11 | 8.3.13
> > > ---------------------+--------+-------
> > > build on 2.6.32-71   | good   | good
> > > build on 2.6.32-220  | bad    | bad
> > > 
> > > 
> > > So, how to debug this further? I would suspect looking at the symbols of
> > > both modules might give a clue?
> > 
> > As a knee-jerk response based on a hunch -- you've been warned :) --,
> > this could be related to the BIO_RW_BARRIER vs. FLUSH/FUA dance that the
> > RHEL 6 kernel has been doing between the initial RHEL 6 release, and
> > more recent updates (when they've been backporting the "let's kill
> > barriers" upstream changes from post-2.6.32).
> 
> OK.
> 
> > Try configuring your disk section with no-disk-barrier, no-disk-flushes
> > and no-md-flushes (in both configurations) and see if your kernel module
> > change still makes a difference.
> 
> Just did that:
> 
> Using the drbd.ko built on 2.6.32-71 shows a minor increase in
> performance (108.5 MByte/s, so about 5% more).
> 
> Using the drbd.ko built on 2.6.32-220.17.1 now finally brings the
> expected performance (same as with the 2.6.32-71 build).
> 
> > Of course, in production you should only use those options if you have
> > no volatile caches involved in the I/O path.
> 
> Yes, that is clear. I did not plan to disable barriers, as the
> bottleneck in my setup should be clearly the network.
> 
> > Not sure if this is useful, but I sure hope it is. :)
> 
> Well, what does that mean: are the modules built on 2.6.32-71 broken in
> a way that they do not use barriers (and are therefore dangerous to use),
> or is everything fine with the 2.6.32-71 builds and just building on a
> newer kernel produces broken modules?

Let me extend this: with the module built on 2.6.32-71, barriers still
seem to be in use once I switch back to my config with barriers. At
least /proc/drbd shows "wo:b" and the kernel log says "Method to ensure
write ordering: barrier".
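
For anyone who wants to repeat the check, here is a rough sketch of how
the "wo:" field could be pulled out of /proc/drbd. The function name and
the sample text are made up for illustration, and the meaning of the
codes "f" and "n" is taken from my reading of the DRBD 8.3 documentation,
not from the tests above (which only showed "b" and "d").

```python
import re

# Single-letter codes behind "wo:" in /proc/drbd. "b" and "d" appear in
# this thread; "f" and "n" are assumptions based on the DRBD 8.3 docs.
WRITE_ORDERING = {"b": "barrier", "f": "flush", "d": "drain", "n": "none"}


def write_ordering_methods(proc_drbd_text):
    """Return the write-ordering method for every 'wo:' field, in order."""
    codes = re.findall(r"\bwo:(\w)", proc_drbd_text)
    return [WRITE_ORDERING.get(code, "unknown (%s)" % code) for code in codes]


# Illustrative sample only, not output captured from the test systems.
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
"""

if __name__ == "__main__":
    try:
        with open("/proc/drbd") as handle:
            text = handle.read()
    except IOError:
        # DRBD module not loaded on this machine: fall back to the sample.
        text = SAMPLE
    print(write_ordering_methods(text))
```

Run on a node with the module loaded, this prints one entry per
configured device, which makes it easy to compare the two builds.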

To summarize:

drbd.ko 8.3.13       | with barriers | with no barriers
                     | /proc:  wo:b  | /proc:  wo:d
---------------------+---------------+-----------------
built on 2.6.32-71   | 100-105 MB/s  | 108-109 MB/s
built on 2.6.32-220  | 45-55 MB/s    | 106-108 MB/s

So it looks like the module built on 2.6.32-71 still uses barriers and
delivers good performance, while the module built on 2.6.32-220 only
performs well when barriers are disabled.
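
To put a number on the difference, a small sketch that compares the
midpoints of the ranges from the summary table. Using the midpoint of
each range is my own simplification, and the helper names are only for
illustration.

```python
# Throughput ranges in MB/s, copied from the summary table.
RESULTS = {
    ("2.6.32-71", "barriers"): (100, 105),
    ("2.6.32-71", "no barriers"): (108, 109),
    ("2.6.32-220", "barriers"): (45, 55),
    ("2.6.32-220", "no barriers"): (106, 108),
}


def midpoint(value_range):
    """Midpoint of a (low, high) range; a simplification, not a measurement."""
    low, high = value_range
    return (low + high) / 2.0


def relative_throughput(build, baseline, mode):
    """Throughput of `build` as a fraction of `baseline` for the given mode."""
    return midpoint(RESULTS[(build, mode)]) / midpoint(RESULTS[(baseline, mode)])


if __name__ == "__main__":
    for mode in ("barriers", "no barriers"):
        ratio = relative_throughput("2.6.32-220", "2.6.32-71", mode)
        print("%-12s %.0f%% of the 2.6.32-71 build" % (mode, ratio * 100))
```

With barriers the 2.6.32-220 build reaches roughly half the throughput
of the 2.6.32-71 build; without barriers the two are nearly identical.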

Regards,
Matthias