* [PATCH 6.6.y] nvme-rdma: unquiesce admin_q before destroy it
@ 2025-04-11 3:04 Feng Liu
2025-04-13 16:47 ` Sasha Levin
0 siblings, 1 reply; 2+ messages in thread
From: Feng Liu @ 2025-04-11 3:04 UTC
To: chunguang.xu, yue.zhao, kbusch, Feng.Liu3; +Cc: stable
From: "Chunguang.xu" <chunguang.xu@shopee.com>
[ Upstream commit 5858b687559809f05393af745cbadf06dee61295 ]
The kernel hangs while destroying admin_q when controller creation
fails, as shown in the following call trace:
PID: 23644 TASK: ff2d52b40f439fc0 CPU: 2 COMMAND: "nvme"
#0 [ff61d23de260fb78] __schedule at ffffffff8323bc15
#1 [ff61d23de260fc08] schedule at ffffffff8323c014
#2 [ff61d23de260fc28] blk_mq_freeze_queue_wait at ffffffff82a3dba1
#3 [ff61d23de260fc78] blk_freeze_queue at ffffffff82a4113a
#4 [ff61d23de260fc90] blk_cleanup_queue at ffffffff82a33006
#5 [ff61d23de260fcb0] nvme_rdma_destroy_admin_queue at ffffffffc12686ce
#6 [ff61d23de260fcc8] nvme_rdma_setup_ctrl at ffffffffc1268ced
#7 [ff61d23de260fd28] nvme_rdma_create_ctrl at ffffffffc126919b
#8 [ff61d23de260fd68] nvmf_dev_write at ffffffffc024f362
#9 [ff61d23de260fe38] vfs_write at ffffffff827d5f25
RIP: 00007fda7891d574 RSP: 00007ffe2ef06958 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: 000055e8122a4d90 RCX: 00007fda7891d574
RDX: 000000000000012b RSI: 000055e8122a4d90 RDI: 0000000000000004
RBP: 00007ffe2ef079c0 R8: 000000000000012b R9: 000055e8122a4d90
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000004
R13: 000055e8122923c0 R14: 000000000000012b R15: 00007fda78a54500
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
This is because we quiesce admin_q before cancelling requests but
forget to unquiesce it before destroying it. As a result, the pending
requests are never drained and the task hangs in
blk_mq_freeze_queue_wait() forever. Reuse
nvme_rdma_teardown_admin_queue() to fix this issue and simplify the
code.
Fixes: 958dc1d32c80 ("nvme-rdma: add clean action for failed reconnection")
Reported-by: Yingfu.zhou <yingfu.zhou@shopee.com>
Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com>
Signed-off-by: Yue.zhao <yue.zhao@shopee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
[Minor context change fixed]
Signed-off-by: Feng Liu <Feng.Liu3@windriver.com>
Signed-off-by: He Zhe <Zhe.He@windriver.com>
---
Verified with a build test.
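
For reviewers who want the failure mode in isolation, here is a minimal
user-space sketch of the hang described in the commit message. It is a
toy model, not kernel code: struct toy_queue, toy_run_queue(),
toy_freeze_wait(), toy_destroy_old() and toy_destroy_fixed() are
hypothetical names invented for illustration, and the bounded loop only
stands in for blk_mq_freeze_queue_wait(), which in the kernel would
sleep indefinitely. It assumes, as the commit message states, that
pending requests cannot drain while the queue is quiesced.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; NOT the kernel's struct request_queue. */
struct toy_queue {
	bool quiesced;	/* dispatch is blocked while true */
	int pending;	/* requests that still hold the queue busy */
};

/* One dispatch pass: pending requests drain only if the queue is not quiesced. */
static void toy_run_queue(struct toy_queue *q)
{
	if (!q->quiesced)
		q->pending = 0;
}

/*
 * Bounded model of a freeze wait. Returns true if the queue drained
 * within max_passes, false where the real wait would never return.
 */
static bool toy_freeze_wait(struct toy_queue *q, int max_passes)
{
	assert(max_passes > 0);
	for (int i = 0; i < max_passes && q->pending > 0; i++)
		toy_run_queue(q);
	return q->pending == 0;
}

/* Old error path (modelled): quiesce, then destroy without unquiescing. */
static bool toy_destroy_old(struct toy_queue *q)
{
	q->quiesced = true;
	return toy_freeze_wait(q, 1000);
}

/* Fixed path (modelled): quiesce, unquiesce, then destroy. */
static bool toy_destroy_fixed(struct toy_queue *q)
{
	q->quiesced = true;
	q->quiesced = false;
	return toy_freeze_wait(q, 1000);
}
```

Starting from a queue with three pending requests, toy_destroy_old()
returns false with the requests still pending, while
toy_destroy_fixed() returns true. Per the commit message, the real fix
does not open-code this sequence but reuses
nvme_rdma_teardown_admin_queue().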
---
drivers/nvme/host/rdma.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c04317a966b3..055b95d2ce93 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1083,13 +1083,7 @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
 		nvme_rdma_free_io_queues(ctrl);
 	}
 destroy_admin:
-	nvme_quiesce_admin_queue(&ctrl->ctrl);
-	blk_sync_queue(ctrl->ctrl.admin_q);
-	nvme_rdma_stop_queue(&ctrl->queues[0]);
-	nvme_cancel_admin_tagset(&ctrl->ctrl);
-	if (new)
-		nvme_remove_admin_tag_set(&ctrl->ctrl);
-	nvme_rdma_destroy_admin_queue(ctrl);
+	nvme_rdma_teardown_admin_queue(ctrl, new);
 	return ret;
 }
--
2.34.1
* Re: [PATCH 6.6.y] nvme-rdma: unquiesce admin_q before destroy it
2025-04-11 3:04 [PATCH 6.6.y] nvme-rdma: unquiesce admin_q before destroy it Feng Liu
@ 2025-04-13 16:47 ` Sasha Levin
0 siblings, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2025-04-13 16:47 UTC
To: stable; +Cc: Feng Liu, Sasha Levin
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected.
No action required from the submitter.
The upstream commit SHA1 provided is correct: 5858b687559809f05393af745cbadf06dee61295
WARNING: Author mismatch between patch and upstream commit:
Backport author: Feng Liu <Feng.Liu3@windriver.com>
Commit author: Chunguang.xu <chunguang.xu@shopee.com>
Status in newer kernel trees:
6.14.y | Present (exact SHA1)
6.13.y | Present (exact SHA1)
6.12.y | Present (different SHA1: 05b436f3cf65)
Note: The patch differs from the upstream commit:
---
1: 5858b68755980 ! 1: 4d596a05770c2 nvme-rdma: unquiesce admin_q before destroy it
@@ Metadata
## Commit message ##
nvme-rdma: unquiesce admin_q before destroy it
+ [ Upstream commit 5858b687559809f05393af745cbadf06dee61295 ]
+
The kernel hangs while destroying admin_q when controller creation
fails, as shown in the following call trace:
@@ Commit message
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
+ [Minor context change fixed]
+ Signed-off-by: Feng Liu <Feng.Liu3@windriver.com>
+ Signed-off-by: He Zhe <Zhe.He@windriver.com>
## drivers/nvme/host/rdma.c ##
@@ drivers/nvme/host/rdma.c: static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
+ nvme_rdma_free_io_queues(ctrl);
}
destroy_admin:
- nvme_stop_keep_alive(&ctrl->ctrl);
- nvme_quiesce_admin_queue(&ctrl->ctrl);
- blk_sync_queue(ctrl->ctrl.admin_q);
- nvme_rdma_stop_queue(&ctrl->queues[0]);
---
Results of testing on various branches:
| Branch | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.6.y | Success | Success |