From: Chao Shi <coshi036@gmail.com>
To: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>
Cc: Chaitanya Kulkarni <kch@nvidia.com>,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
Sungwoo Kim <iam@sung-woo.kim>, Dave Tian <daveti@purdue.edu>,
Weidong Zhu <weizhu@fiu.edu>
Subject: [PATCH] nvme: bound the freeze drain in passthrough commands
Date: Wed, 27 May 2026 01:59:23 -0400 [thread overview]
Message-ID: <20260527055923.456769-1-coshi036@gmail.com> (raw)
nvme_passthru_start() drains in-flight I/O via the unbounded
nvme_wait_freeze() before submitting a command with command-set
effects (Format NVM, Sanitize, Namespace Management, vendor unique).
If a completion is silently dropped or the device hangs, the calling
task wedges with ctrl->scan_lock and ctrl->subsys->lock held, fanning
out into hung-task reports on any concurrent open/close/passthru on
the same controller:
INFO: task syz-executor:NNNN blocked for more than 123 seconds.
nvme_wait_freeze+0x82/0x100
nvme_passthru_start drivers/nvme/host/core.c:1249 [inline]
nvme_submit_user_cmd+0x1ee/0x3d0 drivers/nvme/host/ioctl.c:189
The other freeze-drain sites (pci shutdown, tcp/rdma reset) already
bound the wait with nvme_wait_freeze_timeout(NVME_IO_TIMEOUT). Apply
it here too; on timeout, unwind the freeze and return -EBUSY (or
NVME_SC_INTERNAL on the nvmet path) instead of submitting the command.
Found by FuzzNvme(Syzkaller with FEMU fuzzing framework).
Acked-by: Sungwoo Kim <iam@sung-woo.kim>
Acked-by: Dave Tian <daveti@purdue.edu>
Acked-by: Weidong Zhu <weizhu@fiu.edu>
Signed-off-by: Chao Shi <coshi036@gmail.com>
---
drivers/nvme/host/core.c | 26 ++++++++++++++++++++------
drivers/nvme/host/ioctl.c | 7 ++++++-
drivers/nvme/host/nvme.h | 3 ++-
drivers/nvme/target/passthru.c | 7 ++++++-
4 files changed, 34 insertions(+), 9 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7bf228df6001..575f98b9a6cc 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1232,23 +1232,37 @@ u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode)
}
EXPORT_SYMBOL_NS_GPL(nvme_command_effects, "NVME_TARGET_PASSTHRU");
-u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode)
+int nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode,
+ u32 *effects)
{
- u32 effects = nvme_command_effects(ctrl, ns, opcode);
+ *effects = nvme_command_effects(ctrl, ns, opcode);
/*
* For simplicity, IO to all namespaces is quiesced even if the command
- * effects say only one namespace is affected.
+ * effects say only one namespace is affected. Bound the drain wait so
+ * a stuck I/O cannot wedge the passthrough caller (and any task on the
+ * scan_lock or subsys lock) indefinitely; the other in-tree callers of
+ * the freeze drain (pci shutdown, tcp/rdma reset) already use this same
+ * NVME_IO_TIMEOUT bound.
*/
- if (effects & NVME_CMD_EFFECTS_CSE_MASK) {
+ if (*effects & NVME_CMD_EFFECTS_CSE_MASK) {
mutex_lock(&ctrl->scan_lock);
mutex_lock(&ctrl->subsys->lock);
nvme_mpath_start_freeze(ctrl->subsys);
nvme_mpath_wait_freeze(ctrl->subsys);
nvme_start_freeze(ctrl);
- nvme_wait_freeze(ctrl);
+ if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
+ dev_warn(ctrl->device,
+ "I/O did not drain in %u seconds; aborting passthrough\n",
+ nvme_io_timeout);
+ nvme_unfreeze(ctrl);
+ nvme_mpath_unfreeze(ctrl->subsys);
+ mutex_unlock(&ctrl->subsys->lock);
+ mutex_unlock(&ctrl->scan_lock);
+ return -EBUSY;
+ }
}
- return effects;
+ return 0;
}
EXPORT_SYMBOL_NS_GPL(nvme_passthru_start, "NVME_TARGET_PASSTHRU");
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index a9c097dacad6..762458a23b38 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -186,7 +186,12 @@ static int nvme_submit_user_cmd(struct request_queue *q,
bio = req->bio;
ctrl = nvme_req(req)->ctrl;
- effects = nvme_passthru_start(ctrl, ns, cmd->common.opcode);
+ ret = nvme_passthru_start(ctrl, ns, cmd->common.opcode, &effects);
+ if (ret) {
+ if (bio)
+ blk_rq_unmap_user(bio);
+ goto out_free_req;
+ }
ret = nvme_execute_rq(req, false);
if (result)
*result = le64_to_cpu(nvme_req(req)->result.u64);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9a5f28c5103c..665d75de044e 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1211,7 +1211,8 @@ static inline void nvme_auth_revoke_tls_key(struct nvme_ctrl *ctrl) {};
u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
u8 opcode);
-u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode);
+int nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode,
+ u32 *effects);
int nvme_execute_rq(struct request *rq, bool at_head);
void nvme_passthru_end(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u32 effects,
struct nvme_command *cmd, int status);
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 67c423a8b052..7b97bfc1ace6 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -220,7 +220,12 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
u32 effects;
int status;
- effects = nvme_passthru_start(ctrl, ns, req->cmd->common.opcode);
+ status = nvme_passthru_start(ctrl, ns, req->cmd->common.opcode, &effects);
+ if (status) {
+ nvmet_req_complete(req, NVME_SC_INTERNAL);
+ blk_mq_free_request(rq);
+ return;
+ }
status = nvme_execute_rq(rq, false);
if (status == NVME_SC_SUCCESS &&
req->cmd->common.opcode == nvme_admin_identify) {
--
2.43.0
next reply other threads:[~2026-05-27 5:59 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-27 5:59 Chao Shi [this message]
2026-05-27 13:26 ` [PATCH] nvme: bound the freeze drain in passthrough commands Christoph Hellwig
2026-05-27 15:46 ` Keith Busch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260527055923.456769-1-coshi036@gmail.com \
--to=coshi036@gmail.com \
--cc=axboe@kernel.dk \
--cc=daveti@purdue.edu \
--cc=hch@lst.de \
--cc=iam@sung-woo.kim \
--cc=kbusch@kernel.org \
--cc=kch@nvidia.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
--cc=weizhu@fiu.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.