[DRBD-user] drbd 9.0.26 and newer fails to build on 5.4 kernels

Christoph Böhmwalder christoph.boehmwalder at linbit.com
Mon Feb 22 12:37:41 CET 2021


On 2/22/21 11:40 AM, Natanael Copa wrote:
> On Fri, 19 Feb 2021 19:50:13 +0100
> Natanael Copa <ncopa at alpinelinux.org> wrote:
> 
>> On Fri, 19 Feb 2021 19:38:53 +0100
>> Natanael Copa <ncopa at alpinelinux.org> wrote:
>>
>>> Hi,
>>>
>>> I tried to update the kernel for the alpine 3.12-stable branch from 5.4.84
>>> to 5.4.99. The third-party kernel module drbd 9.0.22-2 failed to build, so
>>> I updated it to 9.0.27-1. This built on the x86_64 machine I tested it
>>> on, so I pushed it.
>>>
>>> But it failed on 32 bit arm builders:
>>>
>>> ...
>>> In file included from ./include/linux/module.h:27,
>>>                   from /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd_req.h:16,
>>>                   from /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd_state.c:22:
>>> ./arch/arm/include/asm/module.h:59: warning: "MODULE_ARCH_VERMAGIC" redefined
>>>     59 | #define MODULE_ARCH_VERMAGIC \
>>>        |
>>> In file included from /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd_state.c:19:
>>> ./include/linux/vermagic.h:28: note: this is the location of the previous definition
>>>     28 | #define MODULE_ARCH_VERMAGIC ""
>>>        |
>>>    GEN     /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd_buildtag.c
>>>    CC [M]  /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd_buildtag.o
>>>    LD [M]  /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd.o
>>>    Building modules, stage 2.
>>>    MODPOST 2 modules
>>> ERROR: "__aeabi_ldivmod" [/home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd.ko] undefined!
>>> ERROR: "__aeabi_uldivmod" [/home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd.ko] undefined!
>>> make[3]: *** [scripts/Makefile.modpost:94: __modpost] Error 1
>>> make[2]: *** [Makefile:1639: modules] Error 2
>>> make[1]: *** [Makefile:132: kbuild] Error 2
>>> make[1]: Leaving directory '/home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd'
>>> make: *** [Makefile:131: module] Error 2
>>>>>> ERROR: drbd-lts: build failed
>>>
>>> And on 32 bit x86:
>>>
>>>    GEN     /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd_buildtag.c
>>>    CC [M]  /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd_buildtag.o
>>>    LD [M]  /home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd.o
>>>    Building modules, stage 2.
>>>    MODPOST 2 modules
>>> ERROR: "__udivdi3" [/home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd.ko] undefined!
>>> ERROR: "__divdi3" [/home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd/drbd.ko] undefined!
>>> make[3]: *** [scripts/Makefile.modpost:94: __modpost] Error 1
>>> make[2]: *** [Makefile:1639: modules] Error 2
>>> make[1]: *** [Makefile:132: kbuild] Error 2
>>> make[1]: Leaving directory '/home/buildozer/aports/main/drbd-lts/src/drbd-9.0.27-1/drbd'
>>> make: *** [Makefile:131: module] Error 2
>>>>>> ERROR: drbd-lts: build failed
>>>
>>> I read that this can happen if do_div() is not used and libgcc is not
>>> linked in (which I assume we shouldn't do in the kernel). I looked through
>>> the git log for an obvious cause but couldn't find one.
>>>
>>> So I tried to find an older version. It turns out that the latest version
>>> that compiles with a 5.4 kernel is 9.0.25-2, but it only compiles with
>>> the 5.4.84 kernel and not 5.4.99.
>>>
>>> Does anyone have a clue how to solve this? If not, I guess I will have
>>> to git bisect it.
> 
> 
> I found the commit that introduces the issue using objdump and git log:
> 
> commit 8dc8ede32de4410e99148b39d0f960e975eeddea
> Author: Joel Colledge <joel.colledge at linbit.com>
> Date:   Wed Nov 11 18:11:26 2020 +0100
> 
>      drbd: fix slow sync when sync requests are answered quickly
> 
> 
> It is the DIV_ROUND_UP in that commit that introduces this.
> 
> This change fixes it:
> diff --git a/drbd/drbd_sender.c b/drbd/drbd_sender.c
> index 30e301fb..34030870 100644
> --- a/drbd/drbd_sender.c
> +++ b/drbd/drbd_sender.c
> @@ -687,7 +687,7 @@ static int drbd_resync_delay(struct drbd_peer_device *peer_device)
>                           * that the rate limiting prevents any new requests
>                           * from being made. Wait just long enough so that we
>                           * can request some data next time. */
> -                       delay = DIV_ROUND_UP(HZ * BM_SECT_PER_BIT, pdc->c_max_rate * 2);
> +                       delay = DIV_ROUND_UP((unsigned long)(HZ * BM_SECT_PER_BIT / 2), pdc->c_max_rate);
>                  }
>          } else {
>                  /* Fixed resync rate. Use the standard delay. */
> 
> 
> As I understand it, the compiler promotes HZ * BM_SECT_PER_BIT to a
> 64-bit type, which triggers the error.
> 
> From the source, BM_SECT_PER_BIT = 1 << (12-9) = 8, so it should be safe
> to move the `* 2` on the right side to a `/ 2` on the left, which
> results in HZ * 4 / pdc->c_max_rate.
> 
> To my understanding, HZ is seldom (never?) set to anything above 1000.
> 1000 * 4 fits just fine in an unsigned long, so I think the above diff
> should be safe.
> 
> -nc

Hi Natanael,

thanks for the proposed fix! Your logic seems right to me.
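
The equivalence of the two expressions in the diff can be checked in
userspace. The following is only a sketch under stated assumptions:
DIV_ROUND_UP is copied from its usual kernel definition, BM_SECT_PER_BIT
is taken to be 8 in a 64-bit type, c_max_rate is assumed to be an
unsigned int, and delay_old(), delay_new() and delays_agree() are
hypothetical names that do not exist in DRBD.

```c
#include <stdio.h>

/* Userspace copy of the kernel's DIV_ROUND_UP macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumption: BM_SECT_PER_BIT is 8 (1 << (12 - 9)) in a 64-bit type. */
#define BM_SECT_PER_BIT 8ULL

/* Expression before the patch: the dividend is 64 bits wide, so the
 * division itself is a 64-bit division. */
static unsigned long long delay_old(unsigned long hz, unsigned int c_max_rate)
{
    return DIV_ROUND_UP(hz * BM_SECT_PER_BIT, c_max_rate * 2);
}

/* Expression after the patch: the dividend is halved and narrowed to
 * unsigned long before the division by c_max_rate. */
static unsigned long delay_new(unsigned long hz, unsigned int c_max_rate)
{
    return DIV_ROUND_UP((unsigned long)(hz * BM_SECT_PER_BIT / 2), c_max_rate);
}

/* Returns 1 if both expressions agree for every sampled HZ and rate,
 * 0 on the first mismatch. The sampled values are illustrative only. */
static int delays_agree(void)
{
    static const unsigned long hz_values[] = { 100, 250, 300, 1000 };
    size_t i;
    unsigned int rate;

    for (i = 0; i < sizeof(hz_values) / sizeof(hz_values[0]); i++)
        for (rate = 1; rate <= 100000; rate++)
            if (delay_old(hz_values[i], rate) != delay_new(hz_values[i], rate))
                return 0;
    return 1;
}
```

Calling delays_agree() from a small main() should return 1, since
rounding up (HZ * 8) / (2 * rate) and (HZ * 4) / rate gives the same
result for any non-zero rate.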

I have committed it to the drbd repository, along with one other similar 
fix. See 
https://github.com/LINBIT/drbd/commit/4194a136b08c1f204707bdb0695fc05c0b2997e9
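
As background on the undefined symbols in the build log: on 32-bit
targets a plain 64-bit `/` is compiled into a call to a libgcc helper
such as __udivdi3 or __aeabi_uldivmod, and kernel modules are not
linked against libgcc. The kernel offers do_div() and div_u64_rem() for
this case. The sketch below is not taken from the commit above and is
not the kernel's implementation; sketch_div_u64_rem() is a hypothetical
name, and the code only illustrates how a 64-bit by 32-bit division can
be written with shifts and subtractions, without a 64-bit `/` operator.

```c
#include <stdint.h>

/* Divide a 64-bit dividend by a non-zero 32-bit divisor using binary
 * long division. Returns the quotient and stores the remainder.
 * Hypothetical helper for illustration; the caller must not pass 0. */
static uint64_t sketch_div_u64_rem(uint64_t dividend, uint32_t divisor,
                                   uint32_t *remainder)
{
    uint64_t quotient = 0;
    uint64_t rem = 0;
    int bit;

    for (bit = 63; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. */
        rem = (rem << 1) | ((dividend >> bit) & 1);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }
    *remainder = (uint32_t)rem;
    return quotient;
}
```

In real module code the existing kernel helpers would be the natural
choice; the loop above is only meant to show why no libgcc call is
needed once the division is expressed this way.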

Are you able to test the latest git HEAD (drbd-9.0 branch) on your 
infrastructure to verify that this is fixed for you?

Thanks again,
Christoph

--
Christoph Böhmwalder
LINBIT | Keeping the Digital World Running
DRBD HA —  Disaster Recovery — Software defined Storage

