[DRBD-user] DRBD on top of mdraid troubles
Josh Fisher
jfisher at jaybus.com
Fri Mar 17 16:53:27 CET 2023
On 3/17/23 03:50, Roland Kammerer wrote:
> On Wed, Mar 15, 2023 at 03:16:20PM +0200, Athanasios Chatziathanassiou wrote:
>> drbd raid10_ssd/0 drbd1: Local IO failed in drbd_endio_write_sec_final.
>> Detaching...
> I'd say you have a hardware problem on the backing device. Whenever DRBD
> tries to write there, local IO fails and then it detaches. So test and
> verify that the backing device/storage actually works.
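Agreed that is the first thing to check. A plain read pass over the md device plus a look at the array state is a quick way to verify the backing storage (a sketch only; the device name and block size are examples, adjust to the actual array):

  # read the whole array once; medium or link errors will show up in dmesg
  dd if=/dev/md127 of=/dev/null bs=1M status=progress
  # confirm the array and all member disks are healthy
  mdadm --detail /dev/md127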
I have ruled out a hardware problem in my case. My raid10 backing device
works perfectly with the kernel module from 9.1.4. The kernel modules
from 9.1.5 through 9.1.13 all fail with:
Mar 1 08:43:39 cnode2 kernel: md/raid10:md127: make_request bug: can't convert block across chunks or bigger than 256k 448794880 132
Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: disk( UpToDate -> Failed )
Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: Local IO failed in drbd_request_endio. Detaching...
Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: local READ IO error sector 29362432+264 on ffff9fcff9a389c0
Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: sending new current UUID: 9C66E258C0F9F361
Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: disk( Failed -> Diskless )
This appears to be the same problem as in issue #26, or at least related.
Note that this could still be an mdraid bug; however, the same raid10
works perfectly well with the DRBD 9.1.4 kmod.
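The md message complains that raid10 was handed a request bigger than the 256 KiB chunk, or one that crosses a chunk boundary, which points at how the bios are sized and split on the way down rather than at the disks themselves. A few things worth comparing between a node running the 9.1.4 kmod and one running a failing version (a sketch; md127 comes from the log above, the paths are standard procfs/sysfs locations):

  # which DRBD kmod is actually loaded
  cat /proc/drbd                               # "version: 9.1.x ..."
  # raid10 chunk size in bytes, and as reported by mdadm
  cat /sys/block/md127/md/chunk_size
  mdadm --detail /dev/md127 | grep -i chunk
  # request-queue limits the stacked device is expected to honour
  cat /sys/block/md127/queue/max_sectors_kb
  cat /sys/block/md127/queue/max_hw_sectors_kb
  cat /sys/block/md127/queue/max_segments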
Also note that the DRBD device starts up OK and resync works as long as
both hosts are secondary. Promoting either host to primary seems to
trigger the error on the host with the raid10 backing device.
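For reference, roughly the sequence that shows it, written out as commands (a sketch; the resource name drbd_access_home is taken from the log above):

  # on both nodes: bring the resource up; with both sides Secondary,
  # resync runs through without problems
  drbdadm up drbd_access_home
  drbdadm status drbd_access_home
  # promoting either node makes the node with the raid10 backing device
  # detach; the make_request bug shows up in dmesg
  drbdadm primary drbd_access_home
  dmesg | tail -n 20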