[DRBD-user] "kernel: bio too big device drbd0"

Lutz Vieweg lvml at 5t9.de
Thu Jun 6 18:30:23 CEST 2013


On 06/06/2013 02:51 PM, Lars Ellenberg wrote:
> You did something bad, and that confused the IO stack.

I would have expected some kind of error message from one of the
tools I used to increase the device size if I had actually
done something bad...

> This causes IO errors.

Interestingly, while these "kernel: bio too big device drbd0" messages
keep coming, no human user or other component of the machine has complained
about any error... so far, after about one week of intensive usage.

On 06/06/2013 03:39 PM, Sebastian Riemer wrote:
> Looks like something in the IO stack above DRBD in the kernel doesn't
> respect the IO size limits of DRBD.
>
> In kernel 3.3 the function "blk_set_stacking_limits()" has been
> introduced to fix such issues. MD uses this function for example. Before
> that MD used too small IO limits.
>
> Try these commands and repeat them for the devices above:
> $ cat /sys/block/drbd0/queue/max_sectors_kb
> $ cat /sys/block/drbd0/queue/max_hw_sectors_kb

The fascinating results:

# for i in /sys/block/drbd*/queue/max_sectors_kb ; do echo -n "$i " ; cat $i ; done
/sys/block/drbd0/queue/max_sectors_kb 128
/sys/block/drbd1/queue/max_sectors_kb 512
/sys/block/drbd7/queue/max_sectors_kb 512

# for i in /sys/block/drbd*/queue/max_hw_sectors_kb ; do echo -n "$i " ; cat $i ; done
/sys/block/drbd0/queue/max_hw_sectors_kb 128
/sys/block/drbd1/queue/max_hw_sectors_kb 1024
/sys/block/drbd7/queue/max_hw_sectors_kb 1024

> Should be 128 as DRBD has 128 KiB hashing functions and can't do bigger
> IO because of that. The kernel internally calculates with 512 byte
> sectors. So 256 sectors are 128 KiB.
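
The sector arithmetic quoted above can be checked directly in the shell. This is only a minimal sketch; the function names `sectors_to_kib` and `kib_to_sectors` are made up for illustration and are not part of any DRBD or kernel tooling:

```shell
# Convert a count of 512-byte kernel sectors to KiB.
sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# Convert a sysfs value in KiB (e.g. max_hw_sectors_kb) to 512-byte sectors.
kib_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

sectors_to_kib 256    # prints 128
kib_to_sectors 1024   # prints 2048
```

So the 128 reported for drbd0 corresponds to 256 sectors, while the 1024 reported for drbd1 and drbd7 corresponds to 2048 sectors.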

I wonder why only drbd0, which is one of three drbd devices used
on the machine, shows such a result - and drbd0 is the only device
that the "bio too big" messages are reported for.

> Have a look into the kernel source in "block/blk-core.c" and search for
> "bio too big device" for details. In the function
> "generic_make_request_checks()" you can see that an IO error is sent to
> the upper layers in that case ( bio_endio(bio, -EIO) ).

Yes, so the next layer, which is dm-crypt, should either complain / return
an error, too, or do some magic to slice the write into pieces, right?

BTW: This is what I get for the dm-crypt device that sits on top of drbd0:
/sys/block/dm-9/queue/max_hw_sectors_kb 1024
/sys/block/dm-9/queue/max_sectors_kb    512
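
The mismatch between dm-9 (512) and drbd0 (128) could be spotted automatically by comparing a stacked device's limit with the limits of the devices listed in its sysfs `slaves` directory. The following is a rough sketch under that assumption; `check_stack_limits` is a hypothetical helper name, and lower devices without their own `queue` directory (partitions, for instance) are simply skipped:

```shell
# check_stack_limits SYSFS_BLOCK_DIR DEVICE
# Print a line for every lower device whose max_sectors_kb is smaller
# than that of the device stacked on top of it.
check_stack_limits() {
    root=$1
    dev=$2
    upper=$(cat "$root/$dev/queue/max_sectors_kb")
    for slave in "$root/$dev"/slaves/*; do
        # Skip an unmatched glob and entries that have no queue limits.
        [ -e "$slave/queue/max_sectors_kb" ] || continue
        lower=$(cat "$slave/queue/max_sectors_kb")
        name=$(basename "$slave")
        if [ "$upper" -gt "$lower" ]; then
            echo "$dev ($upper KiB) exceeds $name ($lower KiB)"
        fi
    done
}

# Example call on a live system (not executed here):
#   check_stack_limits /sys/block dm-9
```

The sysfs root is passed as a parameter so the function can also be pointed at a copy of the directory tree instead of the live `/sys/block`.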


>> Should I be worried?
>
> It depends on how the layers above react to this situation. If they try
> again with smaller IOs, then it's okay. Otherwise, there can be a major
> issue. Kernel code has to be read to verify.

I could not find the right place to look at in drivers/md/dm-crypt.c,
do you have a suggestion?

Regards,

Lutz Vieweg
