From zhengbing.huang at easystack.cn Thu Apr 17 08:08:19 2025
From: zhengbing.huang at easystack.cn (zhengbing.huang)
Date: Thu, 17 Apr 2025 14:08:19 +0800
Subject: [PATCH] rdma: Fix cm leaks in some abnormal scenarios
Message-ID: <20250417060819.2157347-1-zhengbing.huang@easystack.cn>

In dtr_create_rx_desc(), if ib_dma_map_single() returns an error, the
code jumps to the error branch, which does not drop the reference it
holds on cm.

The retry branch in dtr_post_tx_desc() has a similar issue.

Signed-off-by: zhengbing.huang
---
 drbd/drbd-headers          |  2 +-
 drbd/drbd_transport_rdma.c | 14 ++++++++++----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drbd/drbd-headers b/drbd/drbd-headers
index 94f447251..9188ee14f 160000
--- a/drbd/drbd-headers
+++ b/drbd/drbd-headers
@@ -1 +1 @@
-Subproject commit 94f4472513f351efba5788f783feba6ac6efe9fc
+Subproject commit 9188ee14f6de582a493d260c091db0c655b30d50
diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index 9ce15a0ce..be919a926 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -2080,8 +2080,10 @@ static int dtr_create_rx_desc(struct dtr_flow *flow, gfp_t gfp_mask)
 	rx_desc->sge.addr = ib_dma_map_single(cm->id->device, page_address(page), alloc_size,
 					      DMA_FROM_DEVICE);
 	err = ib_dma_mapping_error(cm->id->device, rx_desc->sge.addr);
-	if (err)
-		goto out;
+	if (err) {
+		tr_err(transport, "ib_dma_map_single() failed %d\n", err);
+		goto out_put;
+	}
 	rx_desc->sge.length = alloc_size;
 
 	atomic_inc(&flow->rx_descs_allocated);
@@ -2094,6 +2096,9 @@ static int dtr_create_rx_desc(struct dtr_flow *flow, gfp_t gfp_mask)
 		dtr_free_rx_desc(rx_desc);
 	}
 	return err;
+
+out_put:
+	kref_put(&cm->kref, dtr_destroy_cm);
 out:
 	kfree(rx_desc);
 	drbd_free_pages(transport, page, 0);
@@ -2396,9 +2401,10 @@ retry:
 		return -EINTR;
 
 	flow = &cm->path->flow[stream];
-	if (atomic_dec_if_positive(&flow->peer_rx_descs) < 0)
+	if (atomic_dec_if_positive(&flow->peer_rx_descs) < 0) {
+		kref_put(&cm->kref, dtr_destroy_cm);
 		goto retry;
-
+	}
 	device = cm->id->device;
 	switch (tx_desc->type) {
 	case SEND_PAGE:
-- 
2.43.0


From zhengbing.huang at easystack.cn Fri Apr 25 12:24:21 2025
From: zhengbing.huang at easystack.cn (zhengbing.huang)
Date: Fri, 25 Apr 2025 18:24:21 +0800
Subject: [PATCH] rdma: Fix cm leak
Message-ID: <20250425102421.1673048-1-zhengbing.huang@easystack.cn>

We found that when all the DRBDs are down, the reference count of the
drbd_transport_rdma module is still 1.

[root at node-4 ~]# drbdadm status
No currently configured DRBD found.
[root at node-4 ~]# lsmod | grep drbd
drbd_transport_rdma 262144 1

Then we found an unreleased cm structure and discovered that its state
is DSB_CONNECT_REQ + DSB_ERROR.

crash> struct dtr_cm ffff57e515da9400
struct dtr_cm {
  kref = {
    refcount = {
      refs = {
        counter = 1
...
  state = 9,
...
}

The scenario of this problem should be like this:
dtr_cma_event_handler() gets an RDMA_CM_EVENT_CONNECT_REQUEST event,
calls dtr_cma_accept() to allocate a cm, and sets
cm->state = DSM_CONNECT_REQ. Now the cm->kref count is 2.
Then dtr_cma_event_handler() gets an
xxx_CONNECT_ERROR/xxx_UNREACHABLE/xxx_REJECTED event and calls
set_bit(DSB_ERROR, &cm->state). The cm is removed from the path in
dtr_cma_retry_connect(), which puts one ref. cm->state does not have
the DSB_CONNECTING flag, so the handler returns 0.
Now the cm->kref count is 1, and the state is
DSB_CONNECT_REQ + DSB_ERROR.

Therefore, when we test the DSB_CONNECTING flag, we should also test
the DSB_CONNECT_REQ flag to avoid the cm leak.
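
The reference counting in the scenario above can be replayed in a few
lines of user-space C. This is only a sketch, not driver code: the
model_* names are hypothetical, the bit values are an assumption chosen
so that CONNECT_REQ + ERROR reads as state = 9 as in the crash output,
and it assumes that the "break" path of the event handler ends in a
kref_put(), as the "keep ref" comment on the other branch suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen so CONNECT_REQ + ERROR reads as state = 9. */
enum {
	MODEL_CONNECT_REQ = 1u << 0,
	MODEL_CONNECTING  = 1u << 1,
	MODEL_ERROR       = 1u << 3,
};

struct model_cm {
	int refs;
	unsigned int state;
};

/* Non-atomic stand-in for test_and_clear_bit(), operating on a mask. */
static bool model_test_and_clear(unsigned int *state, unsigned int mask)
{
	bool was_set = (*state & mask) != 0;

	*state &= ~mask;
	return was_set;
}

/*
 * Replay a connect error on a cm that was created by the accept path
 * (two references, CONNECT_REQ set) and return the reference count
 * that is left afterwards.
 */
static int model_refs_after_connect_error(bool also_test_connect_req)
{
	struct model_cm cm = { .refs = 2, .state = MODEL_CONNECT_REQ };
	bool handler_owns_ref;

	cm.state |= MODEL_ERROR;	/* set_bit(DSB_ERROR, &cm->state) */
	cm.refs--;			/* cm removed from the path, one ref put */

	handler_owns_ref = model_test_and_clear(&cm.state, MODEL_CONNECTING) ||
		(also_test_connect_req &&
		 model_test_and_clear(&cm.state, MODEL_CONNECT_REQ));
	if (handler_owns_ref)
		cm.refs--;		/* assumed: the "break" path puts the last ref */

	assert(cm.refs >= 0);
	return cm.refs;
}
```

With also_test_connect_req set to false the function returns 1, which
corresponds to the leaked reference; with true it returns 0.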

Signed-off-by: zhengbing.huang
---
 drbd/drbd_transport_rdma.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index be919a926..f24440580 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -1307,9 +1307,10 @@ static int dtr_cma_event_handler(struct rdma_cm_id *cm_id, struct rdma_cm_event
 
 		set_bit(DSB_ERROR, &cm->state);
 		dtr_cma_retry_connect(cm->path, cm);
-		if (!test_and_clear_bit(DSB_CONNECTING, &cm->state))
-			return 0; /* keep ref; __dtr_disconnect_path() won */
-		break;
+		if (test_and_clear_bit(DSB_CONNECTING, &cm->state) ||
+			test_and_clear_bit(DSB_CONNECT_REQ, &cm->state))
+			break;
+		return 0; /* keep ref; __dtr_disconnect_path() won */
 
 	case RDMA_CM_EVENT_DISCONNECTED:
 		// pr_info("%s: RDMA_CM_EVENT_DISCONNECTED\n", cm->name);
@@ -2787,7 +2788,8 @@ static void __dtr_disconnect_path(struct dtr_path *path)
 	 * events. Destroy the cm and cm_id to avoid leaking it.
 	 * This is racing with the event delivery, which drops a reference.
 	 */
-	if (test_and_clear_bit(DSB_CONNECTING, &cm->state))
+	if (test_and_clear_bit(DSB_CONNECTING, &cm->state) ||
+		test_and_clear_bit(DSB_CONNECT_REQ, &cm->state))
 		kref_put(&cm->kref, dtr_destroy_cm);
 
 	kref_put(&cm->kref, dtr_destroy_cm);
-- 
2.43.0


From splc.regional.east at gmail.com Thu Apr 24 17:22:52 2025
From: splc.regional.east at gmail.com (Reginald Cirque)
Date: Thu, 24 Apr 2025 11:22:52 -0400
Subject: Possible memory leak in DRBD 8.4.11
Message-ID:

Good day,

I was syncing a 300 GB LVM volume from a DRBD primary to a newly-built
secondary, and noticed that the sending host (primary) had 300G of
"untracked", used, memory (not visible in slab, cached, or associated
with any application(s), simply shown as "kernel dynamic memory" in
"smem -twk" output) for long (many hours) after the sync had completed,
suggesting that DRBD buffers/page-pool were not reclaimed.

When I ran "drbdsetup down" to disconnect the secondary, I observed a
kernel log message: "block drbd3: net_ee not empty, killed 291226
entries", which further suggests to me that DRBD buffers are not being
properly reclaimed. The memory was returned to the system almost
instantly after disconnecting the secondary.

I am running Linux kernel 6.1.128-1.el8.x86_64 and patching in the
8.4.11 DRBD module in-tree.