[PATCH] rdma: Fix drbd_transport_rdma module reference count exception
zhengbing.huang
zhengbing.huang at easystack.cn
Wed Feb 19 04:08:04 CET 2025
During testing, we found that the drbd_transport_rdma module reference count was abnormally high:
drbd_transport_rdma 262144 28293
We do not have anywhere near that many DRBD devices.
If an XXX_ADDR_ERROR or XXX_ROUTE_ERROR event occurs
while the DSB_CONNECTING flag bit is not set,
dtr_cma_event_handler() returns 0 directly.
The cm structure is then never destroyed,
and the drbd_transport_rdma module reference count stays elevated.
So, for XXX_ADDR_ERROR/XXX_ROUTE_ERROR events,
we do not need to check the DSB_CONNECTING flag,
and we do need to kref_put the cm structure.
Signed-off-by: zhengbing.huang <zhengbing.huang at easystack.cn>
---
drbd/drbd_transport_rdma.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index ba4f1baa7..bb59e6501 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -1292,6 +1292,11 @@ static int dtr_cma_event_handler(struct rdma_cm_id *cm_id, struct rdma_cm_event
// pr_info("%s: RDMA_CM_EVENT_ADDR_ERROR\n", cm->name);
case RDMA_CM_EVENT_ROUTE_ERROR:
// pr_info("%s: RDMA_CM_EVENT_ROUTE_ERROR\n", cm->name);
+ set_bit(DSB_ERROR, &cm->state);
+
+ dtr_cma_retry_connect(cm->path, cm);
+ break;
+
case RDMA_CM_EVENT_CONNECT_ERROR:
// pr_info("%s: RDMA_CM_EVENT_CONNECT_ERROR\n", cm->name);
case RDMA_CM_EVENT_UNREACHABLE:
--
2.43.0
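
The leak described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: every name in it (model_cm, module_refs, handle_error_old, handle_error_new, leaked_module_refs) is hypothetical, it assumes each live cm pins the module exactly once, and it leaves out the reconnect retry that the real patch also performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every name below is a hypothetical stand-in,
 * not the DRBD or RDMA CM API. */
struct model_cm {
	int refs;        /* models the kref on the cm structure */
	bool connecting; /* models the DSB_CONNECTING bit */
	bool error;      /* models the DSB_ERROR bit */
};

static int module_refs; /* models the module reference count */

/* Assumption: a live cm pins the module exactly once. */
static void model_cm_init(struct model_cm *cm, bool connecting)
{
	cm->refs = 1;
	cm->connecting = connecting;
	cm->error = false;
	module_refs++;
}

static void model_cm_put(struct model_cm *cm)
{
	assert(cm->refs > 0);
	if (--cm->refs == 0)
		module_refs--; /* cm destroyed, module reference released */
}

/* Before: the put is skipped when the connecting bit is clear. */
static int handle_error_old(struct model_cm *cm)
{
	if (!cm->connecting)
		return 0; /* early return leaks the cm */
	cm->connecting = false;
	cm->error = true;
	model_cm_put(cm);
	return 0;
}

/* After: mark the error and drop the reference unconditionally. */
static int handle_error_new(struct model_cm *cm)
{
	cm->error = true;
	model_cm_put(cm);
	return 0;
}

/* Deliver `events` error events, each to a fresh cm, and report how many
 * module references are still held afterwards. */
static int leaked_module_refs(int (*handler)(struct model_cm *),
			      int events, bool connecting)
{
	module_refs = 0;
	for (int i = 0; i < events; i++) {
		struct model_cm cm;

		model_cm_init(&cm, connecting);
		handler(&cm);
	}
	return module_refs;
}
```

In this model, delivering 28293 error events with the connecting bit clear leaves 28293 module references held under handle_error_old and none under handle_error_new, which mirrors the kind of count reported above.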