[DRBD-user] Adjusting al-extents on-the-fly

Tue May 27 12:31:31 CEST 2014

On Tue, May 27, 2014 at 12:16:11PM +1000, Stuart Longland wrote:
> Hi all,
> 
> I'm in the process of trying to debug what I suspect is an I/O issue on
> a highly-available SCADA server operated by a mining company.  The
> systems run Ubuntu 10.04 LTS with two network interfaces, one on their
> business network, and one on their control network, both gigabit links.
> 
> drbdadm --version reports:
> > DRBDADM_BUILDTAG=GIT-hash:\ ea9e28dbff98e331a62bcbcc63a6135808fe2917\ build\ by\ buildd@panlong\,\ 2012-05-18\ 08:21:18
> > DRBDADM_API_VERSION=88
> > DRBD_KERNEL_VERSION_CODE=0x080307
> > DRBDADM_VERSION_CODE=0x080307
> > DRBDADM_VERSION=8.3.7
> 
> The system logs PLC-generated process data every 5 seconds, and at two
> times of the day, at midnight and midday, it misses a sample with the
> logging taking 6 seconds.  There's no obvious CPU spike at this time, so
> my hunch is I/O, and so I'm looking at ways to try and improve this.

Funny how if "something" happens,
and there is DRBD anywhere near it,
it is "obviously" DRBD's fault, naturally.

> iotop didn't show any huge spikes that I'd imagine the disks would have
> trouble with.  Then again, since it's effectively polling, I could have
> "blinked" and missed it.

If your data gathering and logging thingy misses a sample
because of the logging to disk (assuming for now that this is in fact
what happens), you are still doing it wrong.

Make the data sampling asynchronous wrt. flushing data to disk.
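
As a rough illustration of that decoupling (a minimal sketch only, not the
historian's actual design): the sampling loop hands each sample to an
in-memory queue, and a separate writer thread does the slow write and fsync.
The names run_sampler, writer and read_sample are hypothetical, and the
sketch assumes samples are plain strings.

```python
import os
import queue
import threading
import time


def writer(q, path):
    """Drain samples from the queue and append them to the log file."""
    with open(path, "a") as f:
        while True:
            item = q.get()
            if item is None:          # sentinel: sampler is done
                break
            f.write(item + "\n")
            f.flush()
            os.fsync(f.fileno())      # the slow part happens only here


def run_sampler(read_sample, n_samples, path, interval=5.0):
    """Take n_samples readings; the sampling loop never waits on the disk."""
    q = queue.Queue()                 # unbounded, so put() does not block
    t = threading.Thread(target=writer, args=(q, path))
    t.start()
    for _ in range(n_samples):
        q.put(read_sample())          # hand off and return immediately
        time.sleep(interval)
    q.put(None)                       # tell the writer to finish
    t.join()
```

With this shape a slow flush delays only the writer thread; the sampler keeps
its 5 second cadence as long as memory can absorb the backlog.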

> DRBD is configured with a disk partition on a RAID array as its backing

Wrong end of the system to tune in this case, imo.

> store, the same array being shared with the OS and swap space.  I don't
> know if the array has RAM or flash based cache, all I know is it uses
> the cciss driver.
> 
> The configuration file looks like this:
> > global {
> >         usage-count yes;
> > }
> > common {
> >         syncer { rate 50M; }
> > }
> > resource r0 {
> >         protocol C;
> >         handlers {
> >         }
> >         startup {
> >                 wfc-timeout 10;         # 10 seconds
> >                 degr-wfc-timeout 120;    # 2 minutes.
> >         }
> >         disk {
> >                 on-io-error   pass_on;
> >         }
> >         net {
> >                 sndbuf-size 512k;
> >                 timeout      60;    #  6 seconds  (unit = 0.1 seconds)
> >                 ping-int      10;    # 10 seconds  (unit = 1 second)
> >                 ping-timeout  10;    #  1 second   (unit = 0.1 seconds)
> >                 max-buffers     4096;
> >                 max-epoch-size  4096;
> >                 after-sb-0pri discard-zero-changes;
> >                 after-sb-1pri consensus;
> >                 after-sb-2pri disconnect;
> >                 rr-conflict disconnect;
> >         }
> >         syncer {
> >                 rate 50M;
> >                 al-extents 257;
> >         }
> >         on node1 {
> >                 device /dev/drbd0;
> >                 disk /dev/cciss/c0d0p4;
> >                 address 10.20.30.1:7788;
> >                 meta-disk internal;
> >         }
> >         on node2 {
> >                 device /dev/drbd0;
> >                 disk /dev/cciss/c0d0p4;
> >                 address 10.20.30.2:7788;
> >                 meta-disk internal;
> >         }
> > }
> 
> One thing I'm looking to adjust is the al-extents option, as reading the
> literature, 257 looks a little small.  The historian will be writing
> little bits of data every 5 seconds as part of its logging function, and
> so I suspect raising this may help.
> 
> However, I cannot bring the system down to adjust it at this time.  I
> read that some settings can be changed on-the-fly, so I tried setting it
> to 521 and issued the following dry-run command:
> 
> > root@node1:~# drbdadm -d adjust all
> > drbdsetup 0 syncer --set-defaults --create-device --rate=50M --al-extents=521
> 
> I saw --create-device and thought, what does "--create-device" mean?  Is
> it a *destructive* re-creation of the block device?  I've since reverted
> my changes back to what's shown above on both nodes, and have not
> proceeded.  The documentation on exactly what that command does is
> unclear to me.
> 
> Is it a sane thing to try and adjust this parameter (or others)
> on-the-fly like this, and am I going the right way about it?

This (adjusting of the "al-extents" only) is a rather boring command
actually.  It may stall IO on a very busy backend a bit,
changes some internal "caching hash table size" (sort of),
and continues.

As your server seems to be rather not-so-busy, IO wise,
I don't think this will even be noticeable.
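
For a feel of what the number means, here is a small back-of-the-envelope
sketch. It relies on each activity log extent covering 4 MiB of backing
storage, as described in the DRBD documentation; the helper name
al_hot_area_mib is made up for illustration.

```python
AL_EXTENT_MIB = 4  # each activity-log extent covers 4 MiB of backing storage


def al_hot_area_mib(al_extents):
    """Size of the 'hot' area that can be active at once, in MiB."""
    return al_extents * AL_EXTENT_MIB


for n in (257, 521):
    print("al-extents %d -> %d MiB" % (n, al_hot_area_mib(n)))
```

So the change discussed here roughly doubles the area that can be written
to without a meta-data update, from about 1 GiB to about 2 GiB.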

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed