[DRBD-user] DRBD on top of mdraid troubles

Sinisa sinisa at 4net.rs
Tue Mar 21 08:31:02 CET 2023


Maybe you should forward this to linux-raid at vger.kernel.org.
It could be some race condition, similar to what happened to me with XFS and 
md10 a few years ago (resolved very quickly, btw).

Srdačan pozdrav / Best regards / Freundliche Grüße / Cordialement / よろしくお願いします
Siniša Bandin

On 2023-03-17 16:53, Josh Fisher wrote:
> On 3/17/23 03:50, Roland Kammerer wrote:
>> On Wed, Mar 15, 2023 at 03:16:20PM +0200, Athanasios Chatziathanassiou wrote:
>>> drbd raid10_ssd/0 drbd1: Local IO failed in drbd_endio_write_sec_final.
>>> Detaching...
>> I'd say you have a hardware problem on the backing device. Whenever DRBD
>> tries to write there local IO fails and then it detaches. So test and
>> verify that the backing device/storage actually works.
>
>
> I have ruled out a hardware problem in my case. My raid10 backing device 
> works perfectly with the kernel module from 9.1.4. The kernel modules from 
> 9.1.5 through 9.1.13 all fail with:
>
> Mar 1 08:43:39 cnode2 kernel: md/raid10:md127: make_request bug: can't convert block across chunks or bigger than 256k 448794880 132
> Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: disk( UpToDate -> Failed )
> Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: Local IO failed in drbd_request_endio. Detaching...
> Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: local READ IO error sector 29362432+264 on ffff9fcff9a389c0
> Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: sending new current UUID: 9C66E258C0F9F361
> Mar 1 08:43:39 cnode2 kernel: drbd drbd_access_home/0 drbd13: disk( Failed -> Diskless )
>
> This appears to be the same problem as in issue #26, or at least related.
>
> Note that this could still be an mdraid bug; however, the same raid10 works 
> perfectly well with the DRBD 9.1.4 kmod.
>
> Also note that the DRBD device starts up OK and resync works as long as both 
> hosts are secondary. Promoting either host to primary seems to trigger the 
> error on the host that has the raid10 backing device.


