[DRBD-user] drbdadm status blocked:lower

Fri Oct 19 10:18:12 CEST 2018

On 10/18/2018 09:51 PM, Lars Ellenberg wrote:
> On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote:
>> On 10/11/2018 10:59 AM, Lars Ellenberg wrote:
>>> On Wed, Oct 10, 2018 at 11:52:34AM +0000, Garrido, Cristina wrote:
>>>> Hello,
>>>>
>>>> I have two drbd devices configured on my cluster. On both nodes the status shows "blocked:lower" although everything seems to be fine. We have conducted IO tests on the physical devices and on the drbd devices with good results. Do you know why this message is shown and how to debug it?
>>>>
>>>> The message from status command:
>>>>
>>>> xxxx:/dev/mapper # drbdsetup status --verbose --statistics
>>>> ASCS node-id:1 role:Primary suspended:no
>>>>       write-ordering:flush
>>>>     volume:0 minor:0 disk:UpToDate
>>>>         size:10452636 read:3247 written:8185665 al-writes:53 bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:lower
>>> "blocked:lower" means that the in-kernel API for querying block
>>> device congestion info reported "congestion" for the backing device.
>>> Why it did that, whether that was actually the case, and what
>>> that actually means depends very much on that backing device,
>>> and how it "felt" at the time of that status output.
>>>
>> Thanks Lars,
>>
>> Do you know how DRBD asks kernel about congestion information? Which is the
>> system call it makes?
> DRBD is part of the kernel. No system call involved.
> We call bdi_congested(), which is a wrapper around wb_congested(),
> both defined in include/linux/backing-dev.h
>
>> We want to know why is marking it as "blocked:lower",
> just ignore that wording. don't panic just because it says "blocked"...
>
>> because we are making heavy performance test and seems that there is
>> no problem at disk or network level.
> "congestion" does not mean "no progress".
> Just that you reached some kind of, well, congestion, and that,
> if you were to increase the "IO load" even further, you'd probably just
> make the latency tail longer, and not improve throughput or IOPS anymore.
>
> so you throw "heavy" IO against the IO stack.  as a result, you drive
> the IO stack into "congestion".  and if you ask it for some status,
> it reports that back.
>
> no surprise there.
>
>> We think that DRBD/kernel is not getting the correct information from
>> the system.
> afaics, blk_set_congested() is called when a queue has more than
> "nr_congestion_on" requests "in flight", and it is cleared once that
> drops below "nr_congestion_off" again.  both hysteresis watermarks are
> set in relation to the queue "nr_requests", which again is a tunable.
>
>
Thanks Lars,

How can we tune nr_requests? By default it is 128, and we can't increase it:

# cat /sys/block/drbd1/queue/nr_requests
128
# echo 129 > /sys/block/drbd1/queue/nr_requests
-bash: echo: write error: Invalid argument
# uname -r
4.4.140-94.42-default
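
To put numbers on the watermarks Lars mentions, here is a rough sketch of the arithmetic. It assumes the thresholds are derived the way blk_queue_congestion_threshold() in block/blk-core.c does it in 4.4-era kernels (written from memory, so worth checking against the actual source tree). The function name congestion_thresholds is made up for illustration.

```shell
# Sketch only: compute the congestion on/off watermarks from nr_requests,
# assuming the 4.4-era formula in blk_queue_congestion_threshold().
# congestion_thresholds is a hypothetical helper, not a kernel or DRBD tool.
congestion_thresholds() {
    nr=$1

    # "on" watermark: nr_requests minus one eighth, plus one, capped at nr_requests
    on=$(( nr - nr / 8 + 1 ))
    if [ "$on" -gt "$nr" ]; then
        on=$nr
    fi

    # "off" watermark: a further sixteenth below, minus one, but at least 1
    off=$(( nr - nr / 8 - nr / 16 - 1 ))
    if [ "$off" -lt 1 ]; then
        off=1
    fi

    echo "$on $off"
}
```

Under that assumption, `congestion_thresholds 128` prints `113 103`: a queue would be flagged as congested above 113 requests in flight and cleared again below 103.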

In any case, I think that increasing nr_requests will not solve the
problem. I'm worried about how congestion can be reported when no
writes are being executed on the device at that moment. How can I see
the current queue size, or the requests in flight?
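
One place to look might be the per-device "inflight" file in sysfs, which holds two counters: reads and writes currently in flight. Below is a minimal sketch, assuming the device exposes /sys/block/<dev>/inflight and /sys/block/<dev>/queue/nr_requests. show_inflight is a hypothetical helper name, and its optional second argument only exists so it can be pointed at a different sysfs root.

```shell
# Sketch only: print the in-flight read/write counters next to nr_requests
# for one block device. show_inflight is a hypothetical helper.
#   $1  kernel device name as it appears under /sys/block
#   $2  sysfs block root (assumed default: /sys/block)
show_inflight() {
    dev=$1
    sysroot=${2:-/sys/block}

    # "inflight" contains two whitespace-separated numbers: reads, writes
    read -r reads writes < "$sysroot/$dev/inflight" || return 1

    # queue depth limit from which the congestion watermarks are derived
    nr=$(cat "$sysroot/$dev/queue/nr_requests") || return 1

    echo "$dev reads=$reads writes=$writes nr_requests=$nr"
}
```

Since "blocked:lower" refers to the backing device, it would presumably make more sense to call this with the name of the backing disk than with drbd1.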

While looking for information on these functions, I saw that there are
reported bugs; I don't know whether they are related:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=linux-4.18.y&qt=grep&q=wb_congested&showmsg=1

The kernel on these systems is 4.4.140-94.42-default.