* [PATCH 0/2] nvmet-rdma: fix response resource leak on queue teardown
@ 2026-06-29 5:15 Shin'ichiro Kawasaki
2026-06-29 5:15 ` [PATCH 1/2] nvmet-rdma: factor out response resource cleanup Shin'ichiro Kawasaki
2026-06-29 5:15 ` [PATCH 2/2] nvmet-rdma: fix response resource leak on queue teardown Shin'ichiro Kawasaki
0 siblings, 2 replies; 5+ messages in thread
From: Shin'ichiro Kawasaki @ 2026-06-29 5:15 UTC (permalink / raw)
To: linux-nvme, Keith Busch
Cc: Christoph Hellwig, Sagi Grimberg, Chaitanya Kulkarni,
Shin'ichiro Kawasaki
When an nvme target using the rdma transport is removed while I/Os are in
flight, a response can be posted without its send completion ever being
delivered before the connection is torn down. This leaks the request
SGLs and the RDMA read/write context. The leaks can be reproduced by
running blktests nvme/061 with the rdma transport and the siw driver [1].
[1] https://lore.kernel.org/linux-nvme/ajtk1CaN1pBreS4O@shinmob/
This series addresses the problem. The first patch is a preparatory
refactoring. The second patch fixes the leak by reclaiming the leaked
memory objects when the queue QP is torn down.
Shin'ichiro Kawasaki (2):
nvmet-rdma: factor out response resource cleanup
nvmet-rdma: fix response resource leak on queue teardown
drivers/nvme/target/rdma.c | 31 ++++++++++++++++++++++++++++---
1 file changed, 28 insertions(+), 3 deletions(-)
--
2.54.0
* [PATCH 1/2] nvmet-rdma: factor out response resource cleanup
From: Shin'ichiro Kawasaki @ 2026-06-29 5:15 UTC (permalink / raw)
To: linux-nvme, Keith Busch
Cc: Christoph Hellwig, Sagi Grimberg, Chaitanya Kulkarni,
Shin'ichiro Kawasaki
Move the RDMA read/write context teardown and the request SGL freeing
out of nvmet_rdma_release_rsp() into a new helper function
nvmet_rdma_free_rsp_resources().
This is a refactoring with no functional change, in preparation for the
following patch that uses nvmet_rdma_free_rsp_resources().
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
drivers/nvme/target/rdma.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index ea1185b8267e..3c34f235e542 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -657,18 +657,25 @@ static void nvmet_rdma_rw_ctx_destroy(struct nvmet_rdma_rsp *rsp)
 			req->sg, req->sg_cnt, nvmet_data_dir(req));
 }
 
-static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
+static void nvmet_rdma_free_rsp_resources(struct nvmet_rdma_rsp *rsp)
 {
 	struct nvmet_rdma_queue *queue = rsp->queue;
 
-	atomic_add(1 + rsp->n_rdma, &queue->sq_wr_avail);
-
 	if (rsp->n_rdma)
 		nvmet_rdma_rw_ctx_destroy(rsp);
 
 	if (rsp->req.sg < rsp->cmd->inline_sg ||
 	    rsp->req.sg >= rsp->cmd->inline_sg + queue->dev->inline_page_count)
 		nvmet_req_free_sgls(&rsp->req);
+}
+
+static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
+{
+	struct nvmet_rdma_queue *queue = rsp->queue;
+
+	atomic_add(1 + rsp->n_rdma, &queue->sq_wr_avail);
+
+	nvmet_rdma_free_rsp_resources(rsp);
 
 	if (unlikely(!list_empty_careful(&queue->rsp_wr_wait_list)))
 		nvmet_rdma_process_wr_wait_list(queue);
--
2.54.0
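
The split described in this patch can be illustrated with a small userspace
model in C. This is only a sketch under stated assumptions, not the driver
code: the *_model structs, INLINE_PAGE_COUNT, sg_is_inline() and the demo_*
functions are hypothetical names, malloc/free stand in for the SGL allocator,
and a boolean stands in for the RDMA read/write context. It shows why the
resource-freeing part can be separated from the send-queue credit accounting,
and how the pointer range check distinguishes the preallocated inline SGL
from a separately allocated one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical constant; stands in for queue->dev->inline_page_count. */
#define INLINE_PAGE_COUNT 4

struct sg_entry { void *page; };

struct cmd_model {
	struct sg_entry inline_sg[INLINE_PAGE_COUNT];
};

struct rsp_model {
	struct cmd_model *cmd;
	struct sg_entry *sg;	/* models rsp->req.sg */
	int n_rdma;		/* number of RDMA work requests used */
	bool rw_ctx_live;	/* models a live RDMA read/write context */
};

struct queue_model {
	int sq_wr_avail;	/* models the send-queue credit counter */
};

/* True when sg points into the command's preallocated inline array. */
static bool sg_is_inline(const struct rsp_model *rsp)
{
	uintptr_t sg = (uintptr_t)rsp->sg;
	uintptr_t first = (uintptr_t)rsp->cmd->inline_sg;
	uintptr_t end = (uintptr_t)(rsp->cmd->inline_sg + INLINE_PAGE_COUNT);

	return sg >= first && sg < end;
}

/* Only frees memory; it does not touch the credit counter. */
static void free_rsp_resources(struct rsp_model *rsp)
{
	if (rsp->n_rdma)
		rsp->rw_ctx_live = false;
	if (!sg_is_inline(rsp)) {
		free(rsp->sg);
		rsp->sg = NULL;
	}
}

/* Completion path: return send-queue credits, then free resources. */
static void release_rsp(struct queue_model *queue, struct rsp_model *rsp)
{
	queue->sq_wr_avail += 1 + rsp->n_rdma;
	free_rsp_resources(rsp);
}

/* Releases one response that used 2 RDMA WRs; returns the credit count. */
int demo_release(void)
{
	struct cmd_model cmd = { 0 };
	struct queue_model queue = { .sq_wr_avail = 0 };
	struct rsp_model rsp = {
		.cmd = &cmd,
		.sg = calloc(8, sizeof(struct sg_entry)),
		.n_rdma = 2,
		.rw_ctx_live = true,
	};

	release_rsp(&queue, &rsp);
	return (rsp.sg == NULL && !rsp.rw_ctx_live) ? queue.sq_wr_avail : -1;
}

/* An inline SGL must be left alone; returns 1 if it was not freed. */
int demo_inline_kept(void)
{
	struct cmd_model cmd = { 0 };
	struct rsp_model rsp = { .cmd = &cmd, .sg = cmd.inline_sg };

	free_rsp_resources(&rsp);
	return rsp.sg == cmd.inline_sg;
}
```

Calling demo_release() and demo_inline_kept() from a main() exercises both
paths. In this model free_rsp_resources() has no dependency on the credit
counter, so a teardown path can call it on its own.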
* [PATCH 2/2] nvmet-rdma: fix response resource leak on queue teardown
From: Shin'ichiro Kawasaki @ 2026-06-29 5:15 UTC (permalink / raw)
To: linux-nvme, Keith Busch
Cc: Christoph Hellwig, Sagi Grimberg, Chaitanya Kulkarni,
Shin'ichiro Kawasaki
When an nvme target using the rdma transport is removed while I/Os are in
flight, a response can be posted without its send completion ever being
delivered before the connection is torn down. As a result,
nvmet_rdma_send_done() and nvmet_rdma_release_rsp() are never called for
the response, which leaks the allocated RDMA read/write context and
request SGLs.
These leaks can be reproduced by running blktests nvme/061 with the rdma
transport and the siw driver. The kernel's kmemleak feature reports them
as follows:
unreferenced object 0xffff88812bc490c0 (size 32):
comm "kworker/2:1H", pid 409, jiffies 4307744490
backtrace (crc 89afd339):
__kmalloc_noprof+0x5f9/0x890
sgl_alloc_order+0x7b/0x380
nvmet_req_alloc_sgls+0x290/0x4f0 [nvmet]
nvmet_rdma_map_sgl_keyed+0x241/0x12e0 [nvmet_rdma]
nvmet_rdma_handle_command+0x73e/0xb80 [nvmet_rdma]
__ib_process_cq+0x149/0x4c0 [ib_core]
ib_cq_poll_work+0x49/0x160 [ib_core]
process_one_work+0x8b2/0x1640
worker_thread+0x5fd/0xfe0
kthread+0x367/0x460
ret_from_fork+0x655/0x9d0
ret_from_fork_asm+0x1a/0x30
unreferenced object 0xffff88814bd05e80 (size 64):
comm "kworker/3:1H", pid 148, jiffies 4295195428
backtrace (crc e35510cb):
__kmalloc_noprof+0x5f9/0x890
rdma_rw_ctx_init+0x333/0x1fa0 [ib_core]
nvmet_rdma_map_sgl_keyed+0x5c8/0x12e0 [nvmet_rdma]
nvmet_rdma_handle_command+0x73e/0xb80 [nvmet_rdma]
__ib_process_cq+0x149/0x4c0 [ib_core]
ib_cq_poll_work+0x49/0x160 [ib_core]
process_one_work+0x8b2/0x1640
worker_thread+0x5fd/0xfe0
kthread+0x367/0x460
ret_from_fork+0x655/0x9d0
ret_from_fork_asm+0x1a/0x30
To avoid the memory leaks, reclaim the resources of in-flight responses
when the queue QP is torn down: call nvmet_rdma_free_rsp_resources(),
which frees the RDMA read/write context and the request SGLs of such
responses.
Fixes: 8f000cac6e7a ("nvmet-rdma: add a NVMe over Fabrics RDMA target driver")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
drivers/nvme/target/rdma.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 3c34f235e542..de5a88fbb233 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1345,9 +1345,27 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
 	goto out;
 }
 
+static bool nvmet_rdma_reclaim_rsp(struct sbitmap *sb, unsigned int bitnr,
+				   void *data)
+{
+	struct nvmet_rdma_queue *queue = data;
+
+	nvmet_rdma_free_rsp_resources(&queue->rsps[bitnr]);
+
+	return true;
+}
+
 static void nvmet_rdma_destroy_queue_ib(struct nvmet_rdma_queue *queue)
 {
 	ib_drain_qp(queue->qp);
+
+	/*
+	 * Reclaim resources of a response that is still in-flight when the
+	 * queue is being torn down. This happens when the connection was
+	 * forcefully disconnected while an I/O is in flight.
+	 */
+	sbitmap_for_each_set(&queue->rsp_tags, nvmet_rdma_reclaim_rsp, queue);
+
 	if (queue->cm_id)
 		rdma_destroy_id(queue->cm_id);
 	ib_destroy_qp(queue->qp);
--
2.54.0
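
The reclaim step in this patch walks the set bits of the response tag bitmap
and frees what each still-allocated response holds. The following userspace
model in C sketches that pattern. It is an illustration under stated
assumptions, not the kernel implementation: a single uint64_t stands in for
struct sbitmap, for_each_set() is a hypothetical stand-in for
sbitmap_for_each_set(), NR_RSPS and the *_model structs are invented for the
example, and malloc/free stand in for the SGL and read/write context
allocations. As with the kernel callback type, the callback returns true to
continue the walk.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_RSPS 8	/* hypothetical queue depth for the model */

struct rsp_model {
	void *sgl;	/* models memory still held by an in-flight response */
};

struct queue_model {
	uint64_t rsp_tags;			/* bit N set: rsps[N] in flight */
	struct rsp_model rsps[NR_RSPS];
	unsigned int reclaimed;
};

typedef bool (*for_each_fn)(unsigned int bitnr, void *data);

/* Call fn for every set bit below depth; stop early if fn returns false. */
static void for_each_set(uint64_t map, unsigned int depth,
			 for_each_fn fn, void *data)
{
	for (unsigned int bit = 0; bit < depth; bit++) {
		if (!(map & (UINT64_C(1) << bit)))
			continue;
		if (!fn(bit, data))
			return;
	}
}

/* Free whatever the response at this tag still holds. */
static bool reclaim_rsp(unsigned int bitnr, void *data)
{
	struct queue_model *queue = data;

	free(queue->rsps[bitnr].sgl);
	queue->rsps[bitnr].sgl = NULL;
	queue->reclaimed++;

	return true;	/* keep iterating over the remaining tags */
}

/* Two responses are in flight at teardown; returns how many were reclaimed. */
unsigned int demo_teardown(void)
{
	struct queue_model queue = { 0 };
	const unsigned int in_flight[] = { 1, 5 };

	for (size_t i = 0; i < sizeof(in_flight) / sizeof(in_flight[0]); i++) {
		queue.rsps[in_flight[i]].sgl = malloc(32);
		queue.rsp_tags |= UINT64_C(1) << in_flight[i];
	}

	/* Teardown: no completions will arrive, so reclaim by tag. */
	for_each_set(queue.rsp_tags, NR_RSPS, reclaim_rsp, &queue);

	if (queue.rsps[1].sgl || queue.rsps[5].sgl)
		return 0;
	return queue.reclaimed;
}
```

In the model, only tags whose bit is set are visited, so responses that
already completed and returned their tag are not touched a second time.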
* Re: [PATCH 1/2] nvmet-rdma: factor out response resource cleanup
From: Christoph Hellwig @ 2026-06-30 11:54 UTC (permalink / raw)
To: Shin'ichiro Kawasaki
Cc: linux-nvme, Keith Busch, Sagi Grimberg, Chaitanya Kulkarni
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 2/2] nvmet-rdma: fix response resource leak on queue teardown
From: Christoph Hellwig @ 2026-06-30 11:54 UTC (permalink / raw)
To: Shin'ichiro Kawasaki
Cc: linux-nvme, Keith Busch, Sagi Grimberg, Chaitanya Kulkarni
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>