[PATCH 2/3] drbd: Fix kernel crash in drbd_find_path_by_addr()

Philipp Reisner philipp.reisner at linbit.com
Thu Jul 31 14:36:05 CEST 2025


Thanks, applied.

On Wed, Jul 9, 2025 at 5:01 AM zhengbing.huang
<zhengbing.huang at easystack.cn> wrote:
>
> We have the following crash info:
>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>  Workqueue: ib_cm cm_work_handler [ib_cm]
>  RIP: 0010:drbd_find_path_by_addr+0x6c/0xd0 [drbd]
>  Call Trace:
>   dtr_cma_event_handler+0x1c1/0x4ee [drbd_transport_rdma]
>   cma_cm_event_handler+0x25/0xd0 [rdma_cm]
>   cma_ib_req_handler+0x7cd/0x1250 [rdma_cm]
>   ? addr4_resolve+0x67/0xd0 [ib_core]
>   cm_process_work+0x22/0xf0 [ib_cm]
>   cm_req_handler+0x7ed/0xf40 [ib_cm]
>   ? __switch_to_asm+0x35/0x70
>   cm_work_handler+0x798/0xf30 [ib_cm]
>   ? finish_task_switch+0x18e/0x2e0
>   process_one_work+0x1a7/0x360
>   ? create_worker+0x1a0/0x1a0
>   worker_thread+0x30/0x390
>   ? create_worker+0x1a0/0x1a0
>   kthread+0x10a/0x120
>   ? set_kthread_struct+0x40/0x40
>   ret_from_fork+0x1f/0x40
>
> The crash occurs while traversing the listener->waiters list:
> struct drbd_path *drbd_find_path_by_addr(struct drbd_listener *listener, struct sockaddr_storage *addr)
> {
>         struct drbd_path *path;
>
>         list_for_each_entry(path, &listener->waiters, listener_link) {
>                 if (addr_equal(&path->peer_addr, addr))
>                         return path;
>         }
>
>         return NULL;
> }
>
> The listener->waiters list contains a single path node:
> crash> struct dtr_listener ff4ba75054797c00
> struct dtr_listener {
>   listener = {
>     kref = {
>       refcount = {
>         refs = {
>           counter = 2
>         }
>       }
>     },
>     resource = 0xff4ba766cc325000,
>     transport_class = 0xffffffffc037f080 <rdma_transport_class>,
>     list = {
>       next = 0xff4ba766cc325500,
>       prev = 0xff4ba766cc325500
>     },
>     waiters = {
>       next = 0xff4ba74fd578e138,
>       prev = 0xff4ba74fd578e138
>     },
>  ...
> }
>
> but this path has already been released:
> crash> struct drbd_path 0xff4ba74fd578e000
> struct drbd_path {
>   my_addr = {
>     ss_family = 1,
>     __data = "\000\000\000\000"
>   },
>   peer_addr = {
>     ss_family = 0,
>     __data = "\000\000\000\000\000\000\0"
>   },
>   kref = {
>     refcount = {
>       refs = {
>         counter = 0
>       }
>     }
>   },
>   net = 0x0,
>   my_addr_len = 0,
>   peer_addr_len = 0,
>   flags = 0,
>   // all zero
>   ...
> }
>
> So the path has been freed but is still linked on the listener->waiters list,
> which causes a crash when the list is traversed later.
>
> The scenario that leads to this is probably as follows:
> thread_1:
>   remove_path()
>     dtr_remove_path()
>       drbd_put_listener()
>         list_del(&path->listener_link)
>                                           thread_2:
>                                             ...
>                                             dtr_activate_path()
>                                               drbd_get_listener()
>                                                 list_add(&path->listener_link, &listener->waiters);
>                                             ...
>    ...
>    kfree(path)
>
> thread_3:
> a connect request comes in:
> dtr_cma_event_handler()
>   dtr_cma_accept()
>     drbd_find_path_by_addr()
>     crash
>
> To avoid this use-after-free, we hold an additional reference to drbd_path
> whenever it is added to the listener->waiters list, and drop it when removed.
>
> This ensures the path memory remains valid during list traversal.
>
> Signed-off-by: zhengbing.huang <zhengbing.huang at easystack.cn>
> ---
>  drbd/drbd_transport.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drbd/drbd_transport.c b/drbd/drbd_transport.c
> index 00e7f9269..aff96716f 100644
> --- a/drbd/drbd_transport.c
> +++ b/drbd/drbd_transport.c
> @@ -224,6 +224,7 @@ int drbd_get_listener(struct drbd_path *path)
>
>         spin_lock_bh(&listener->waiters_lock);
>         list_add(&path->listener_link, &listener->waiters);
> +       kref_get(&path->kref);
>         path->listener = listener;
>         spin_unlock_bh(&listener->waiters_lock);
>         /* After exposing the listener on a path, drbd_put_listenr() can destroy it. */
> @@ -258,6 +259,7 @@ void drbd_put_listener(struct drbd_path *path)
>
>         spin_lock_bh(&listener->waiters_lock);
>         list_del(&path->listener_link);
> +       kref_put(&path->kref, drbd_destroy_path);
>         spin_unlock_bh(&listener->waiters_lock);
>         kref_put(&listener->kref, drbd_listener_destroy);
>  }
> --
> 2.43.0
>
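
For readers following the thread, the idea behind the fix (the waiters list
holds its own reference on every path linked into it) can be replayed in
plain userspace C. This is only a minimal sketch under stated assumptions:
struct path, struct listener, path_get(), path_put(), listener_add_waiter(),
listener_del_waiter() and waiter_demo() are hypothetical stand-ins, not the
DRBD code; an int counter stands in for struct kref, a singly linked list
for listener_link, and the waiters_lock locking is left out because the
interleaving is replayed sequentially.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, not the DRBD structures. */
struct path {
	int refs;           /* stand-in for struct kref */
	struct path *next;  /* stand-in for listener_link */
	int addr;           /* stand-in for peer_addr */
};

struct listener {
	struct path *waiters;
};

static int paths_freed;

static void path_get(struct path *p)
{
	p->refs++;
}

static void path_put(struct path *p)
{
	if (--p->refs == 0) {
		paths_freed++;
		free(p);
	}
}

static struct path *path_new(int addr)
{
	struct path *p = calloc(1, sizeof(*p));

	if (p) {
		p->refs = 1;    /* the owner's reference */
		p->addr = addr;
	}
	return p;
}

/* Linking a path takes a reference on behalf of the list. */
static void listener_add_waiter(struct listener *l, struct path *p)
{
	p->next = l->waiters;
	l->waiters = p;
	path_get(p);
}

/* Unlinking a path drops the reference held by the list. */
static void listener_del_waiter(struct listener *l, struct path *p)
{
	struct path **pp;

	for (pp = &l->waiters; *pp; pp = &(*pp)->next) {
		if (*pp == p) {
			*pp = p->next;
			p->next = NULL;
			path_put(p);
			return;
		}
	}
}

static struct path *find_path_by_addr(struct listener *l, int addr)
{
	struct path *p;

	for (p = l->waiters; p; p = p->next)
		if (p->addr == addr)
			return p;
	return NULL;
}

/* Replays the reported interleaving; returns how many paths were freed. */
int waiter_demo(void)
{
	struct listener l = { .waiters = NULL };
	struct path *p = path_new(42);

	if (!p)
		return -1;
	paths_freed = 0;

	listener_add_waiter(&l, p);  /* initial activation */
	listener_del_waiter(&l, p);  /* thread_1: path is unlinked */
	listener_add_waiter(&l, p);  /* thread_2: path is linked again */
	path_put(p);                 /* thread_1: owner drops its reference */

	assert(paths_freed == 0);                /* the list still pins the path */
	assert(find_path_by_addr(&l, 42) == p);  /* thread_3: lookup hits live memory */

	listener_del_waiter(&l, p);  /* last reference gone, path is freed */
	assert(l.waiters == NULL);
	return paths_freed;
}
```

In this sketch the owner's final path_put() no longer frees the path while it
is still linked, so the lookup walks valid memory; the path is freed only once
it has been unlinked. Without the path_get() in listener_add_waiter(), the
same sequence would free the path while it is still on the list.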

