[DRBD-user] drbdadm status blocked:lower

Lars Ellenberg lars.ellenberg at linbit.com
Wed Oct 24 20:06:51 CEST 2018


On Fri, Oct 19, 2018 at 10:18:12AM +0200, VictorSanchez2 wrote:
> On 10/18/2018 09:51 PM, Lars Ellenberg wrote:
> > On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote:
> > > On 10/11/2018 10:59 AM, Lars Ellenberg wrote:
> > > > On Wed, Oct 10, 2018 at 11:52:34AM +0000, Garrido, Cristina wrote:
> > > > > Hello,
> > > > > 
> > > > > I have two drbd devices configured on my cluster. On both nodes the status shows "blocked:lower" although everything seems to be fine. We have conducted IO tests on the physical devices and on the drbd devices with good results. Do you know why this message is shown and how to debug it?
> > > > > 
> > > > > The message from status command:
> > > > > 
> > > > > xxxx:/dev/mapper # drbdsetup status --verbose --statistics
> > > > > ASCS node-id:1 role:Primary suspended:no
> > > > >       write-ordering:flush
> > > > >     volume:0 minor:0 disk:UpToDate
> > > > >         size:10452636 read:3247 written:8185665 al-writes:53 bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:lower
> > > > "blocked:lower" means that the in-kernel API for querying block
> > > > device congestion info reported "congestion" for the backing device.
> > > > Why it did that, whether that was actually the case, and what
> > > > that actually means depends very much on that backing device,
> > > > and how it "felt" at the time of that status output.
> > > > 
> > > Thanks Lars,
> > > 
> > > Do you know how DRBD asks the kernel for congestion information? Which
> > > system call does it make?
> > DRBD is part of the kernel. No system call involved.
> > We call bdi_congested(), which is a wrapper around wb_congested(),
> > both defined in include/linux/backing-dev.h
> > 
> > > We want to know why it is marked as "blocked:lower",
> > just ignore that wording. don't panic just because it says "blocked"...
> > 
> > > because we are running heavy performance tests and there seems to be
> > > no problem at the disk or network level.
> > "congestion" does not mean "no progress".
> > Just that you reached some kind of, well, congestion, and that, likely,
> > if you were to increase the "IO load" even further, you'd probably just make
> > the latency tail longer, and not improve throughput or IOPS anymore.
> > 
> > so you throw "heavy" IO against the IO stack.  as a result, you drive
> > the IO stack into "congestion".  and if you ask it for some status,
> > it reports that back.
> > 
> > no surprise there.
> > 
> > > We think that DRBD/kernel is not getting the correct information from
> > > the system.
> > afaics, blk_set_congested() is called when a queue has more than
> > "nr_congestion_on" requests "in flight", and it is cleared once that
> > drops below "nr_congestion_off" again.  both hysteresis watermarks are
> > set in relation to the queue "nr_requests", which again is a tunable.
> > 
> > 
> Thanks Lars,
> 
> how can we tune nr_requests? By default it is 128, and we can't increase it:

It's not about DRBD, it's about the storage backend.

> # cat /sys/block/drbd1/queue/nr_requests
> 128
> # echo 129 > /sys/block/drbd1/queue/nr_requests
> -bash: echo: write error: Invalid argument

sure. DRBD is a "virtual" device, which does not even have a queue.
nr_requests for DRBD has no actual meaning.
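
For the backing device, the relation between nr_requests and the two
congestion watermarks can be made concrete with a rough shell sketch.
It assumes the arithmetic of blk_queue_congestion_threshold() in
block/blk-core.c of kernels with the legacy request path, reproduced
from memory, so please check it against your own kernel source. The
function name congestion_thresholds and the device name sdX are
placeholders, not anything DRBD or the kernel provides.

```shell
#!/bin/sh
# Sketch only. Assumption: this mirrors the legacy block layer's
# blk_queue_congestion_threshold(); verify against your kernel version.
# Prints "<nr_congestion_on> <nr_congestion_off>" for a given nr_requests.
congestion_thresholds() {
    nr=$1

    # "on" watermark: 7/8 of the queue depth plus one, capped at nr.
    on=$(( nr - nr / 8 + 1 ))
    if [ "$on" -gt "$nr" ]; then
        on=$nr
    fi

    # "off" watermark: a bit lower, so the flag does not flap; at least 1.
    off=$(( nr - nr / 8 - nr / 16 - 1 ))
    if [ "$off" -lt 1 ]; then
        off=1
    fi

    echo "$on $off"
}
```

With the default of 128 this would give 113 and 103. For a real backing
device (sdX is a placeholder), the input could come from sysfs:

    congestion_thresholds "$(cat /sys/block/sdX/queue/nr_requests)"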

> in any case, I think that increasing nr_requests will not solve the
> problem.

Well, do you have any indication that there actually is a "problem"?

If your only "problem" is the string "blocked:lower"
in the drbdsetup status output, may I suggest you just ignore that?
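
To see how that flag behaves over time, a small sketch like the
following pulls just the congestion flag and the pending counters out
of the status output, so they can be sampled repeatedly under load. It
assumes the field layout shown in the status output quoted earlier in
this thread; summarize_status is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only. Reads "drbdsetup status --verbose --statistics" output on
# stdin and prints, per line that has them, the upper-pending,
# lower-pending and blocked fields. Lines without those fields are skipped.
summarize_status() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^(upper-pending|lower-pending|blocked):/)
                out = out (out == "" ? "" : " ") $i
        if (out != "") print out
    }'
}
```

Possible usage:

    drbdsetup status --verbose --statistics | summarize_status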

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed

