From: Philipp Reisner <philipp.reisner@linbit.com>
To: "zhengbing . huang" <zhengbing.huang@easystack.cn>
Cc: drbd-dev@lists.linbit.com
Subject: [PATCH 1/1] rdma: Fix cm leak
Date: Mon, 5 May 2025 16:26:23 +0200
Message-ID: <20250505142623.424049-2-philipp.reisner@linbit.com>
In-Reply-To: <20250425102421.1673048-1-zhengbing.huang@easystack.cn>
From: "zhengbing.huang" <zhengbing.huang@easystack.cn>
We found that when all the DRBD devices are down, the reference count
of the drbd_transport_rdma module is still 1.
[root@node-4 ~]# drbdadm status
No currently configured DRBD found.
[root@node-4 ~]# lsmod | grep drbd
drbd_transport_rdma 262144 1
We then found an unreleased cm structure whose state is
DSB_CONNECT_REQ + DSB_ERROR.
crash> struct dtr_cm ffff57e515da9400
struct dtr_cm {
kref = {
refcount = {
refs = {
counter = 1
...
state = 9,
...
}
The scenario leading to the leak is as follows:
dtr_cma_event_handler() gets an RDMA_CM_EVENT_CONNECT_REQUEST event and
calls dtr_cma_accept() to allocate a cm, which sets cm->state = DSM_CONNECT_REQ.
At this point the cm->kref count is 2.
Then dtr_cma_event_handler() gets an xxx_CONNECT_ERROR/xxx_UNREACHABLE/xxx_REJECTED
event and calls set_bit(DSB_ERROR, &cm->state).
dtr_cma_retry_connect() removes the cm from the path and puts one reference.
Because cm->state does not have the DSB_CONNECTING flag set, the handler
returns 0 without putting the remaining reference.
Now the cm->kref count is 1 and the state is DSB_CONNECT_REQ + DSB_ERROR.
Therefore, wherever we test the DSB_CONNECTING flag, we should also test
the DSB_CONNECT_REQ flag to avoid leaking the cm.
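
The sequence above can be replayed in a small user-space model. This is only
a sketch, not the driver code: struct model_cm, model_put(),
model_test_and_clear_bit() and model_error_event() are hypothetical stand-ins,
the bit numbers are assumptions chosen so that DSB_CONNECT_REQ + DSB_ERROR
gives state = 9 as in the crash output, and the final put on the "break" path
is assumed from the "keep ref" comment in the handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit numbers: CONNECT_REQ (bit 0) + ERROR (bit 3) == 9. */
enum { DSB_CONNECT_REQ = 0, DSB_CONNECTING = 1, DSB_CONNECTED = 2, DSB_ERROR = 3 };

struct model_cm {
	int refs;            /* stands in for cm->kref */
	unsigned long state; /* stands in for cm->state */
};

/* Non-atomic stand-in for the kernel's test_and_clear_bit(). */
static bool model_test_and_clear_bit(int bit, unsigned long *state)
{
	bool was_set = (*state >> bit) & 1UL;

	*state &= ~(1UL << bit);
	return was_set;
}

/* Stand-in for kref_put(); the model only counts, it frees nothing. */
static void model_put(struct model_cm *cm)
{
	assert(cm->refs > 0);
	cm->refs--;
}

/*
 * Replays an error event on a passively accepted cm.
 * check_connect_req == false models the old test, true the patched one.
 */
static void model_error_event(struct model_cm *cm, bool check_connect_req)
{
	bool connecting;

	/* State after dtr_cma_accept(): two references, CONNECT_REQ set. */
	cm->refs = 2;
	cm->state = 1UL << DSB_CONNECT_REQ;

	cm->state |= 1UL << DSB_ERROR; /* set_bit(DSB_ERROR, &cm->state) */
	model_put(cm);                 /* retry_connect removes cm from the path */

	connecting = model_test_and_clear_bit(DSB_CONNECTING, &cm->state);
	if (check_connect_req)
		connecting = connecting ||
			model_test_and_clear_bit(DSB_CONNECT_REQ, &cm->state);

	if (!connecting)
		return;    /* "return 0" path: the last reference is kept */

	model_put(cm);     /* assumed: the "break" path ends in a final put */
}
```

With the old test the model ends with one reference and state 9, matching
the crash output; with the additional DSB_CONNECT_REQ test the last
reference is dropped.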
Signed-off-by: zhengbing.huang <zhengbing.huang@easystack.cn>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
---
drbd/drbd_transport_rdma.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index be919a926..4a9ba8fa6 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -1278,8 +1278,8 @@ static int dtr_cma_event_handler(struct rdma_cm_id *cm_id, struct rdma_cm_event
/* cm->state = DSM_CONNECTED; is set later in the work item */
/* This is called for active and passive connections */
- connecting = test_and_clear_bit(DSB_CONNECTING, &cm->state);
- connecting |= test_bit(DSB_CONNECT_REQ, &cm->state);
+ connecting = test_and_clear_bit(DSB_CONNECTING, &cm->state) ||
+ test_and_clear_bit(DSB_CONNECT_REQ, &cm->state);
kref_get(&cm->kref); /* connected -> expect a disconnect in the future */
kref_get(&cm->kref); /* for the work */
schedule_work(&cm->establish_work);
@@ -1307,7 +1307,9 @@ static int dtr_cma_event_handler(struct rdma_cm_id *cm_id, struct rdma_cm_event
set_bit(DSB_ERROR, &cm->state);
dtr_cma_retry_connect(cm->path, cm);
- if (!test_and_clear_bit(DSB_CONNECTING, &cm->state))
+ connecting = test_and_clear_bit(DSB_CONNECTING, &cm->state) ||
+ test_and_clear_bit(DSB_CONNECT_REQ, &cm->state);
+ if (!connecting)
return 0; /* keep ref; __dtr_disconnect_path() won */
break;
@@ -2787,7 +2789,8 @@ static void __dtr_disconnect_path(struct dtr_path *path)
* events. Destroy the cm and cm_id to avoid leaking it.
* This is racing with the event delivery, which drops a reference.
*/
- if (test_and_clear_bit(DSB_CONNECTING, &cm->state))
+ if (test_and_clear_bit(DSB_CONNECTING, &cm->state) ||
+ test_and_clear_bit(DSB_CONNECT_REQ, &cm->state))
kref_put(&cm->kref, dtr_destroy_cm);
kref_put(&cm->kref, dtr_destroy_cm);
--
2.49.0