[DRBD-user] drbdadm status blocked:lower

Lars Ellenberg lars.ellenberg at linbit.com
Thu Oct 18 21:51:32 CEST 2018


On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote:
> On 10/11/2018 10:59 AM, Lars Ellenberg wrote:
> > On Wed, Oct 10, 2018 at 11:52:34AM +0000, Garrido, Cristina wrote:
> > > Hello,
> > > 
> > > I have two drbd devices configured on my cluster. On both nodes the status shows "blocked:lower" although everything seems to be fine. We have conducted IO tests on the physical devices and on the drbd devices with good results. Do you know why this message is shown and how to debug it?
> > > 
> > > The message from status command:
> > > 
> > > xxxx:/dev/mapper # drbdsetup status --verbose --statistics
> > > ASCS node-id:1 role:Primary suspended:no
> > >      write-ordering:flush
> > >    volume:0 minor:0 disk:UpToDate
> > >        size:10452636 read:3247 written:8185665 al-writes:53 bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:lower
> > "blocked:lower" means that the in-kernel API for querying block
> > device congestion info reported "congestion" for the backing device.
> > Why it did that, whether that was actually the case, and what
> > that actually means all depend very much on that backing device,
> > and on how it "felt" at the time of that status output.
> > 
> Thanks Lars,
> 
> Do you know how DRBD asks the kernel for congestion information? Which
> system call does it make?

DRBD is part of the kernel. No system call is involved.
We call bdi_congested(), which is a wrapper around wb_congested();
both are defined in include/linux/backing-dev.h in the kernel source.

> We want to know why it is being marked as "blocked:lower",

Just ignore that wording. Don't panic just because it says "blocked"...
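
If you do want to see when the flag comes and goes during a test run,
rather than eyeballing the output, here is a minimal sketch that pulls
the key:value fields out of a statistics line like the one quoted
above.  The helper name parse_status_fields() is made up for this
example, and the code makes no assumption about which values "blocked"
can take; it just reports whatever the line contains.

```python
def parse_status_fields(line):
    """Split a drbdsetup statistics line into a dict of key -> value strings."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition(":")
        if sep:  # skip tokens that are not of the form key:value
            fields[key] = value
    return fields


# Sample taken from the status output quoted earlier in this thread.
sample = (
    "size:10452636 read:3247 written:8185665 al-writes:53 bm-writes:0 "
    "upper-pending:0 lower-pending:0 al-suspended:no blocked:lower"
)
fields = parse_status_fields(sample)
print(fields["blocked"], fields["lower-pending"])
```

Logging "blocked" next to "lower-pending" over time should show whether
the flag only appears while the test is pushing IO.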

> because we are running heavy performance tests and there seems to be
> no problem at the disk or network level.

"Congestion" does not mean "no progress".
It just means that you reached some kind of, well, congestion, and that
if you were to increase the "IO load" even further, you'd probably just
make the latency tail longer without improving throughput or IOPS.

So you throw "heavy" IO at the IO stack.  As a result, you drive
the IO stack into "congestion".  And if you ask it for its status,
it reports that back.

No surprise there.
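
To watch how full the backing device's queue gets while the test runs,
you could sample sysfs from user space.  This is a rough sketch, not
what DRBD does internally.  It assumes the usual
/sys/block/<dev>/inflight and /sys/block/<dev>/queue/nr_requests files
are present for your backing device; the function names are made up
for this example.

```python
from pathlib import Path


def parse_inflight(text):
    """Parse the two counters in /sys/block/<dev>/inflight (reads, writes)."""
    reads, writes = (int(field) for field in text.split())
    return reads, writes


def queue_snapshot(dev, sysfs="/sys/block"):
    """Return in-flight request counts and nr_requests for one block device.

    'dev' is the kernel name of the backing device as it appears under
    /sys/block; the caller has to supply it.
    """
    base = Path(sysfs) / dev
    reads, writes = parse_inflight((base / "inflight").read_text())
    nr_requests = int((base / "queue" / "nr_requests").read_text())
    return {"reads": reads, "writes": writes, "nr_requests": nr_requests}
```

Calling queue_snapshot() in a loop with the name of your backing device
during the benchmark shows how close the in-flight count gets to
nr_requests.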

> We think that DRBD/kernel is not getting the correct information from
> the system.

AFAICS, blk_set_congested() is called when a queue has more than
"nr_congestion_on" requests "in flight", and the flag is cleared once
that drops below "nr_congestion_off" again.  Both hysteresis watermarks
are set relative to the queue's "nr_requests", which is itself a tunable.
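
A toy model of that hysteresis, in user-space Python, in case it helps
to see why the flag "sticks" for a while.  This is not kernel code.
The fractions used to derive the two watermarks from nr_requests are
an assumption, written from memory of the legacy block layer, so check
the source of the kernel you actually run.  QueueModel and
congestion_thresholds() are names invented for this sketch.

```python
def congestion_thresholds(nr_requests):
    """Derive (on, off) watermarks from nr_requests.

    Assumption: "on" sits at roughly 7/8 of the queue depth and "off"
    somewhat below it; the exact fractions may differ in your kernel.
    """
    on = min(nr_requests - nr_requests // 8 + 1, nr_requests)
    off = max(nr_requests - nr_requests // 8 - nr_requests // 16 - 1, 1)
    return on, off


class QueueModel:
    """Tracks a 'congested' flag with on/off hysteresis."""

    def __init__(self, nr_requests=128):
        self.on, self.off = congestion_thresholds(nr_requests)
        self.congested = False

    def update(self, in_flight):
        """Feed the current number of in-flight requests; return the flag."""
        if in_flight > self.on:
            self.congested = True       # crossed the upper watermark
        elif in_flight < self.off:
            self.congested = False      # dropped below the lower watermark
        # between the two watermarks the previous state is kept
        return self.congested
```

With these assumed fractions, nr_requests = 128 gives watermarks of 113
and 103: once the flag is set, it stays set until the number of
in-flight requests falls below 103, even if the load has already
dropped back under 113.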


-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed

