On 06/06/2013 02:51 PM, Lars Ellenberg wrote:
> You did something bad, and that confused the IO stack.

I would have expected any kind of error message from any of the tools
I used to increase the device size if I actually did something bad...

> This causes IO errors.

Interestingly, while these "kernel: bio too big device drbd0" keep
coming, no human user or other component of the machine complains
about any error... so far for ~ one week of intensive usage.

On 06/06/2013 03:39 PM, Sebastian Riemer wrote:
> Looks like something in the IO stack above DRBD in the kernel doesn't
> respect the IO size limits of DRBD.
>
> In kernel 3.3 the function "blk_set_stacking_limits()" has been
> introduced to fix such issues. MD uses this function for example. Before
> that MD used too small IO limits.
>
> Try these commands and repeat them for the devices above:
> $ cat /sys/block/drbd0/queue/max_sectors_kb
> $ cat /sys/block/drbd0/queue/max_hw_sectors_kb

The fascinating results:

# for i in /sys/block/drbd*/queue/max_sectors_kb ; do echo -n "$i " ; cat $i ; done
/sys/block/drbd0/queue/max_sectors_kb 128
/sys/block/drbd1/queue/max_sectors_kb 512
/sys/block/drbd7/queue/max_sectors_kb 512

# for i in /sys/block/drbd*/queue/max_hw_sectors_kb ; do echo -n "$i " ; cat $i ; done
/sys/block/drbd0/queue/max_hw_sectors_kb 128
/sys/block/drbd1/queue/max_hw_sectors_kb 1024
/sys/block/drbd7/queue/max_hw_sectors_kb 1024

> Should be 128 as DRBD has 128 KiB hashing functions and can't do bigger
> IO because of that. The kernel internally calculates with 512 byte
> sectors. So 256 sectors are 128 KiB.

I wonder why only drbd0, which is one of three drbd devices used on
the machine, shows such a result - and drbd0 is the only device that
the "bio too big" messages are reported for.

> Have a look into the kernel source in "block/blk-core.c" and search for
> "bio too big device" for details. In the function
> "generic_make_request_checks()" you can see that an IO error is sent to
> the upper layers in that case ( bio_endio(bio, -EIO) ).

Yes, so the next layer, which is dm-crypt, should either complain /
return an error, too, or do some magic to slice the write into pieces,
right?

BTW: This is what I get for the dm-crypt device that sits on top of
drbd0:

/sys/block/dm-9/queue/max_hw_sectors_kb 1024
/sys/block/dm-9/queue/max_sectors_kb 512

>> Should I be worried?
>
> It depends on how the layers above react on this situation. If they try
> again with smaller IOs, then it's okay. Otherwise, there can be a major
> issue. Kernel code has to be read to verify.

I could not find the right place to look at in drivers/md/dm-crypt.c,
do you have a suggestion?

Regards,

Lutz Vieweg
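
For anyone who wants to repeat the comparison across a whole stack, here is a minimal sketch (not part of the original exchange). It assumes that the lower devices of a stacked device are listed under its sysfs "slaves" directory and that they are whole block devices with their own "queue" directory; partitions used as lower devices would not be covered. The function name check_stack_limits and the SYSFS_BLOCK variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: report lower devices whose max_sectors_kb is smaller than
# that of the device stacked on top of them.
# SYSFS_BLOCK is a hypothetical override so the function can also be
# pointed at a copy of the sysfs tree; by default it is /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# check_stack_limits UPPER_DEV
# Prints one line for each device below UPPER_DEV that advertises a
# smaller max_sectors_kb than UPPER_DEV does.
check_stack_limits() {
    upper="$1"
    upper_kb=$(cat "$SYSFS_BLOCK/$upper/queue/max_sectors_kb") || return 1
    for slave in "$SYSFS_BLOCK/$upper/slaves"/*; do
        # Skip the unexpanded glob when there are no lower devices.
        [ -e "$slave" ] || continue
        lower=$(basename "$slave")
        lower_kb=$(cat "$SYSFS_BLOCK/$lower/queue/max_sectors_kb") || continue
        if [ "$upper_kb" -gt "$lower_kb" ]; then
            echo "$upper ($upper_kb KiB) exceeds $lower ($lower_kb KiB)"
        fi
    done
}
```

With the values reported above (512 for dm-9, 128 for drbd0), and assuming drbd0 is listed as the slave of dm-9, calling "check_stack_limits dm-9" would print a line for that pair. This only shows the mismatch in advertised limits; it says nothing about how dm-crypt reacts to the resulting error.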