[PATCH 2/3] drbd: Fix kernel crash in drbd_find_path_by_addr()
Philipp Reisner
philipp.reisner at linbit.com
Thu Jul 31 14:36:05 CEST 2025
Thanks, applied.
On Wed, Jul 9, 2025 at 5:01 AM zhengbing.huang
<zhengbing.huang at easystack.cn> wrote:
>
> We hit the following crash:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> Workqueue: ib_cm cm_work_handler [ib_cm]
> RIP: 0010:drbd_find_path_by_addr+0x6c/0xd0 [drbd]
> Call Trace:
> dtr_cma_event_handler+0x1c1/0x4ee [drbd_transport_rdma]
> cma_cm_event_handler+0x25/0xd0 [rdma_cm]
> cma_ib_req_handler+0x7cd/0x1250 [rdma_cm]
> ? addr4_resolve+0x67/0xd0 [ib_core]
> cm_process_work+0x22/0xf0 [ib_cm]
> cm_req_handler+0x7ed/0xf40 [ib_cm]
> ? __switch_to_asm+0x35/0x70
> cm_work_handler+0x798/0xf30 [ib_cm]
> ? finish_task_switch+0x18e/0x2e0
> process_one_work+0x1a7/0x360
> ? create_worker+0x1a0/0x1a0
> worker_thread+0x30/0x390
> ? create_worker+0x1a0/0x1a0
> kthread+0x10a/0x120
> ? set_kthread_struct+0x40/0x40
> ret_from_fork+0x1f/0x40
>
> The crashing code traverses the listener->waiters list:
> struct drbd_path *drbd_find_path_by_addr(struct drbd_listener *listener, struct sockaddr_storage *addr)
> {
> 	struct drbd_path *path;
>
> 	list_for_each_entry(path, &listener->waiters, listener_link) {
> 		if (addr_equal(&path->peer_addr, addr))
> 			return path;
> 	}
>
> 	return NULL;
> }
>
> The listener->waiters list contains a single path node:
> crash> struct dtr_listener ff4ba75054797c00
> struct dtr_listener {
>   listener = {
>     kref = {
>       refcount = {
>         refs = {
>           counter = 2
>         }
>       }
>     },
>     resource = 0xff4ba766cc325000,
>     transport_class = 0xffffffffc037f080 <rdma_transport_class>,
>     list = {
>       next = 0xff4ba766cc325500,
>       prev = 0xff4ba766cc325500
>     },
>     waiters = {
>       next = 0xff4ba74fd578e138,
>       prev = 0xff4ba74fd578e138
>     },
>   ...
> }
>
> But this path has already been released:
> crash> struct drbd_path 0xff4ba74fd578e000
> struct drbd_path {
>   my_addr = {
>     ss_family = 1,
>     __data = "\000\000\000\000"
>   },
>   peer_addr = {
>     ss_family = 0,
>     __data = "\000\000\000\000\000\000\0"
>   },
>   kref = {
>     refcount = {
>       refs = {
>         counter = 0
>       }
>     }
>   },
>   net = 0x0,
>   my_addr_len = 0,
>   peer_addr_len = 0,
>   flags = 0,
>   // all zero
>   ...
> }
>
> So the path has been freed while it is still linked on the
> listener->waiters list, and any later traversal of that list
> dereferences freed memory.
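
For readers following the dump: the list node address and the path address above differ by 0x138, which suggests that listener_link sits at offset 0x138 inside struct drbd_path in that particular build. The following is only a user-space sketch of that arithmetic and of the container_of idea behind list_for_each_entry(). The demo_* names and the padded struct layout are assumptions made for illustration, not the real DRBD definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical layout: padding chosen so the link lands at offset 0x138. */
struct demo_path {
	char payload[0x138];
	struct list_head listener_link;
};

/* Same idea as the kernel's container_of(): subtract the member offset. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Offset implied by the crash dump: node address minus object address. */
static uint64_t demo_link_offset(void)
{
	const uint64_t node = 0xff4ba74fd578e138ULL; /* waiters.next in the dump */
	const uint64_t path = 0xff4ba74fd578e000ULL; /* struct drbd_path in the dump */

	return node - path;
}

/* Going from the embedded node back to its container yields the object. */
static int demo_roundtrip(void)
{
	struct demo_path p;
	struct list_head *node = &p.listener_link;

	return demo_container_of(node, struct demo_path, listener_link) == &p;
}
```

Because the list only stores the address of the embedded node, freeing the path does not unlink it. The listener keeps pointing into the freed object until someone calls list_del() on it.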
>
> The scenario that leads to this problem is probably the following:
> thread_1:
>   remove_path()
>     dtr_remove_path()
>       drbd_put_listener()
>         list_del(&path->listener_link)
>                         thread_2:
>                           ...
>                           dtr_activate_path()
>                             drbd_get_listener()
>                               list_add(&path->listener_link, &listener->waiters);
>                           ...
>     ...
>   kfree(path)
>
> thread_3:
>   a connection request comes in:
>   dtr_cma_event_handler()
>     dtr_cma_accept()
>       drbd_find_path_by_addr()
>         crash
>
> To avoid this use-after-free, we hold an additional reference to drbd_path
> whenever it is added to the listener->waiters list, and drop it when removed.
>
> This ensures the path memory remains valid during list traversal.
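
The ownership rule described above can be sketched in plain user-space C. This is not the DRBD code: the demo_* helpers, the integer reference count, and the single-threaded replay are assumptions made for illustration. Locking is left out, and the final release is modelled as dropping the owner's reference rather than a direct kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, in the style of the kernel's list_head. */
struct demo_node {
	struct demo_node *next, *prev;
};

struct demo_path {
	int refs;               /* stand-in for struct kref */
	struct demo_node link;  /* stand-in for listener_link */
};

static int demo_freed;          /* number of paths destroyed so far */

static void demo_list_init(struct demo_node *h)
{
	h->next = h->prev = h;
}

static void demo_list_add(struct demo_node *n, struct demo_node *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

static void demo_list_del(struct demo_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void demo_path_put(struct demo_path *p)
{
	if (--p->refs == 0) {
		demo_freed++;
		free(p);
	}
}

/* Adding to the waiters list takes a reference owned by the list. */
static void demo_add_waiter(struct demo_path *p, struct demo_node *waiters)
{
	demo_list_add(&p->link, waiters);
	p->refs++;
}

/* Removing from the waiters list drops the list's reference. */
static void demo_remove_waiter(struct demo_path *p)
{
	demo_list_del(&p->link);
	demo_path_put(p);
}

/* Replays the interleaving from the report; returns 1 if the rule held. */
static int demo_replay(void)
{
	struct demo_node waiters;
	struct demo_path *p = calloc(1, sizeof(*p));
	int alive_while_listed;

	if (!p)
		return -1;
	demo_list_init(&waiters);
	demo_freed = 0;
	p->refs = 1;                    /* owner's reference */

	demo_add_waiter(p, &waiters);   /* initial activation: refs == 2 */
	demo_remove_waiter(p);          /* thread_1 unlinks:   refs == 1 */
	demo_add_waiter(p, &waiters);   /* thread_2 re-adds:   refs == 2 */
	demo_path_put(p);               /* thread_1 releases:  refs == 1 */

	/* The path is still linked, and it has not been destroyed. */
	alive_while_listed = (demo_freed == 0 && waiters.next == &p->link);

	demo_remove_waiter(p);          /* last reference: path is freed */

	return alive_while_listed && demo_freed == 1 && waiters.next == &waiters;
}
```

In this model the late release by thread_1 only drops the count to 1, so the node on the waiters list stays valid until it is unlinked. Without the reference taken in demo_add_waiter(), the same sequence would free the path while the list still points at it.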
>
> Signed-off-by: zhengbing.huang <zhengbing.huang at easystack.cn>
> ---
> drbd/drbd_transport.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drbd/drbd_transport.c b/drbd/drbd_transport.c
> index 00e7f9269..aff96716f 100644
> --- a/drbd/drbd_transport.c
> +++ b/drbd/drbd_transport.c
> @@ -224,6 +224,7 @@ int drbd_get_listener(struct drbd_path *path)
>
> spin_lock_bh(&listener->waiters_lock);
> list_add(&path->listener_link, &listener->waiters);
> + kref_get(&path->kref);
> path->listener = listener;
> spin_unlock_bh(&listener->waiters_lock);
> /* After exposing the listener on a path, drbd_put_listenr() can destroy it. */
> @@ -258,6 +259,7 @@ void drbd_put_listener(struct drbd_path *path)
>
> spin_lock_bh(&listener->waiters_lock);
> list_del(&path->listener_link);
> + kref_put(&path->kref, drbd_destroy_path);
> spin_unlock_bh(&listener->waiters_lock);
> kref_put(&listener->kref, drbd_listener_destroy);
> }
> --
> 2.43.0
>