Issue while loading drbd_transport_rdma module
Indivar Nair
indivar.nair at techterra.in
Tue Sep 3 10:11:47 CEST 2024
Hello Akemi, Davide,
It is a live pacemaker cluster system. We are currently running it
with DRBD+TCP.
Upgrading it right away would be a challenge.
I will first try again with the default Linux OFED and see whether that helps.
Otherwise, I will try recompiling MOFED and DRBD, going through all
the compile time parameters.
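
Before recompiling, a check along these lines might narrow things down. It is only a sketch: the helper names (vermagic_release, built_for_kernel, symbol_crc) are made up for illustration, and the MOFED Module.symvers path in the comments is an assumption that may differ on a given install. "Disagrees about version of symbol" is a symbol CRC mismatch, and since MOFED installs its own ib_core, the vermagic can match the running kernel while the CRCs still differ.

```shell
#!/bin/sh
# Sketch: helpers for comparing what a module was built against with what
# is actually installed. Function names are hypothetical.

# Print the kernel release embedded in a vermagic string (its first field).
vermagic_release() {
    printf '%s\n' "$1" | awk '{ print $1 }'
}

# Succeed when the vermagic release ($1) equals the kernel release ($2).
built_for_kernel() {
    [ "$(vermagic_release "$1")" = "$2" ]
}

# Read "<crc> <symbol> ..." lines on stdin and print the CRC for symbol $1.
# Works for `modprobe --dump-modversions` output and for Module.symvers.
symbol_crc() {
    awk -v sym="$1" '$2 == sym { print $1 }'
}

# Possible use on a node (module and symbol names are from the error above;
# the Module.symvers path is an assumption about where MOFED puts it):
#   built_for_kernel "$(modinfo -F vermagic drbd_transport_rdma)" "$(uname -r)" \
#       || echo "module was built for a different kernel"
#   modprobe --dump-modversions "$(modinfo -n drbd_transport_rdma)" \
#       | symbol_crc __ib_alloc_pd
#   symbol_crc __ib_alloc_pd < /usr/src/ofa_kernel/default/Module.symvers
```

If the two CRCs printed for __ib_alloc_pd differ, DRBD was most likely built against the in-kernel RDMA headers rather than the MOFED ones.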
Thanks,
Indivar Nair
On Tue, Sep 3, 2024 at 11:42 AM Davide Obbi (E4)
<davide.obbi at e4company.com> wrote:
>
> Hi,
>
> As far as I know, if you use MOFED you need to recompile the DRBD module. Also, when installing MOFED, make sure you are on a kernel it supports; otherwise, when compiling MOFED itself with `--add-kernel-support`, you need the matching kernel-devel-$(uname -r) package correctly installed before running the MOFED installation script.
>
> Instead, if you use the default Linux OFED (dnf groups install Infiniband\ Support; dnf install kernel-modules-$(uname -r)) you can use the pre-compiled version.
>
> Instructions are available at https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-rdma_transport
>
>
>
> -----Original Message-----
> From: drbd-user-bounces at lists.linbit.com <drbd-user-bounces at lists.linbit.com> On Behalf Of Akemi Yagi
> Sent: Monday, September 2, 2024 8:08 PM
> To: Indivar Nair <indivar.nair at techterra.in>
> Cc: drbd-user at lists.linbit.com
> Subject: Re: Issue while loading drbd_transport_rdma module
>
> On Sun, Sep 1, 2024 at 10:59 PM Indivar Nair <indivar.nair at techterra.in> wrote:
> >
> > Hello All,
> >
> > I have a 2-node cluster on which I am trying to load the
> > drbd_transport_rdma.ko module.
> >
> > The nodes have -
> > - Rocky Linux 9.1 (Kernel 5.14.0-162.23.1)
> > - NVIDIA/Mellanox ConnectX-5 EN 100GbE NIC
> > - MLNX_OFED_LINUX-23.10-3.2.2.0-rhel9.1-x86_64 drivers
> > - DRBD 9.2.3 (compiled on the same machine)
> >
> > I have connected the 100G Ethernet (RoCE) ports back-to-back with a
> > short DAC cable.
> > Tests with perftest tools (ib_send_bw and ib_read_bw) show proper
> > connectivity. RoCE is working properly.
> >
> > But I get the following error when I try to load the
> > drbd_transport_rdma.ko module:
> > ----------------------------------------------------------------------
> > drbd_transport_rdma: disagrees about version of symbol __ib_alloc_pd
> > drbd_transport_rdma: Unknown symbol __ib_alloc_pd (err -22)
> > drbd_transport_rdma: disagrees about version of symbol rdma_resolve_addr
> (snip)
> > ----------------------------------------------------------------------
> > What could be the issue?
> > Thanks
> >
> > Regards,
> > Indivar Nair
>
> Looks like the kernel modules you built do not match the running kernel.
>
> Rocky Linux 9.1 is obsolete and it has many security vulnerabilities.
> Can you update it to the current 9.4? If you can, then I suggest you use ELRepo's kmod-drbd9x package. It is currently at version 9.2.11 and is available from the elrepo-testing repository.
>
> If for some reason you cannot update the OS, make sure you build your modules against the kernel in use.
>
> Akemi