* [PATCH V5 1/7] nvme: Let the block layer set timeouts for requests
2026-05-14 8:32 [PATCH V5 0/7] nvme: Refactor and expose per-controller timeout configuration Maurizio Lombardi
@ 2026-05-14 8:32 ` Maurizio Lombardi
2026-05-14 8:32 ` [PATCH V5 2/7] nvme: remove redundant timeout argument from nvme_wait_freeze_timeout Maurizio Lombardi
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Maurizio Lombardi @ 2026-05-14 8:32 UTC (permalink / raw)
To: kbusch
Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard,
mkhalfella, chaitanyak, hare, hch
From: Maximilian Heyne <mheyne@amazon.de>
When initializing an nvme request which is about to be sent to the block
layer, we do not need to initialize its timeout. If it's left
uninitialized at 0, the block layer will use the request queue's timeout
in blk_add_timer() (via nvme_start_request, which is called from
nvme_*_queue_rq). These timeouts are set up to either NVME_IO_TIMEOUT or
NVME_ADMIN_TIMEOUT when the request queues are created.
Because the io_timeout of the IO queues can be modified via sysfs, the
following situation can occur:
1) NVME_IO_TIMEOUT = 30 (default module parameter)
2) nvme1n1 is probed. The IO queues' default timeout is 30 s
3) manually change the IO timeout to 90 s
echo 90000 > /sys/class/nvme/nvme1/nvme1n1/queue/io_timeout
4) Any call of __submit_sync_cmd on nvme1n1 to an IO queue will issue
commands with the 30 s timeout instead of the intended 90 s, which might
be more suitable for this device.
Commit 470e900c8036 ("nvme: refactor nvme_alloc_request") had already
silently changed the behavior for ioctls, because it unconditionally
overrides the request's timeout that was set in nvme_init_request. If it
was left unset by the user of the ioctl, it will be overridden with 0,
meaning the block layer will pick the request queue's IO timeout.
Following up on that, this patch further improves the consistency of IO
timeout usage. However, there are still uses of NVME_IO_TIMEOUT which
could be inconsistent with what is set in the device's request_queue by
the user.
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/host/core.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c3032d6ad6b1..6c75253d2866 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -729,10 +729,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
struct nvme_ns *ns = req->q->disk->private_data;
logging_enabled = ns->head->passthru_err_log_enabled;
- req->timeout = NVME_IO_TIMEOUT;
} else { /* no queuedata implies admin queue */
logging_enabled = nr->ctrl->passthru_err_log_enabled;
- req->timeout = NVME_ADMIN_TIMEOUT;
}
if (!logging_enabled)
--
2.54.0
* [PATCH V5 2/7] nvme: remove redundant timeout argument from nvme_wait_freeze_timeout
From: Maurizio Lombardi @ 2026-05-14 8:32 UTC (permalink / raw)
To: kbusch
Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard,
mkhalfella, chaitanyak, hare, hch
All callers of nvme_wait_freeze_timeout() currently pass the exact same
NVME_IO_TIMEOUT default as their timeout argument.
Remove the argument and use a local variable instead.
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/host/apple.c | 2 +-
drivers/nvme/host/core.c | 3 ++-
drivers/nvme/host/nvme.h | 2 +-
drivers/nvme/host/pci.c | 2 +-
drivers/nvme/host/rdma.c | 2 +-
drivers/nvme/host/tcp.c | 2 +-
6 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index 423c9c628e7b..e77c47408102 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -858,7 +858,7 @@ static void apple_nvme_disable(struct apple_nvme *anv, bool shutdown)
* doing a safe shutdown.
*/
if (!dead && shutdown && freeze)
- nvme_wait_freeze_timeout(&anv->ctrl, NVME_IO_TIMEOUT);
+ nvme_wait_freeze_timeout(&anv->ctrl);
nvme_quiesce_io_queues(&anv->ctrl);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6c75253d2866..84f295e3bf08 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -5250,8 +5250,9 @@ void nvme_unfreeze(struct nvme_ctrl *ctrl)
}
EXPORT_SYMBOL_GPL(nvme_unfreeze);
-int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout)
+int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl)
{
+ unsigned long timeout = NVME_IO_TIMEOUT;
struct nvme_ns *ns;
int srcu_idx;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ccd5e05dac98..6f9ecb4948f4 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -900,7 +900,7 @@ void nvme_sync_queues(struct nvme_ctrl *ctrl);
void nvme_sync_io_queues(struct nvme_ctrl *ctrl);
void nvme_unfreeze(struct nvme_ctrl *ctrl);
void nvme_wait_freeze(struct nvme_ctrl *ctrl);
-int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout);
+int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl);
void nvme_start_freeze(struct nvme_ctrl *ctrl);
static inline enum req_op nvme_req_op(struct nvme_command *cmd)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 139a10cd687f..8dc353451b21 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3278,7 +3278,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
* if doing a safe shutdown.
*/
if (!dead && shutdown)
- nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
+ nvme_wait_freeze_timeout(&dev->ctrl);
}
nvme_quiesce_io_queues(&dev->ctrl);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index f77c960f7632..bf73135c1439 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -888,7 +888,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
if (!new) {
nvme_start_freeze(&ctrl->ctrl);
nvme_unquiesce_io_queues(&ctrl->ctrl);
- if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+ if (!nvme_wait_freeze_timeout(&ctrl->ctrl)) {
/*
* If we timed out waiting for freeze we are likely to
* be stuck. Fail the controller initialization just
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 15d36d6a728e..0552aa8a1150 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2208,7 +2208,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
if (!new) {
nvme_start_freeze(ctrl);
nvme_unquiesce_io_queues(ctrl);
- if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
+ if (!nvme_wait_freeze_timeout(ctrl)) {
/*
* If we timed out waiting for freeze we are likely to
* be stuck. Fail the controller initialization just
--
2.54.0
* [PATCH V5 3/7] nvme: add sysfs attribute to change admin timeout per nvme controller
From: Maurizio Lombardi @ 2026-05-14 8:32 UTC (permalink / raw)
To: kbusch
Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard,
mkhalfella, chaitanyak, hare, hch
Currently, there is no way to adjust the timeout of NVMe admin queues
on a per-controller basis.
Add an admin_timeout attribute so that nvme controllers with different
timeout requirements can have custom admin timeouts set.
The admin timeout is also applied to the fabrics queue (fabrics_q).
The fabrics queue is utilized for fabric-specific administrative and
control operations, such as Connect and Property Get/Set commands.
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/host/core.c | 1 +
drivers/nvme/host/nvme.h | 1 +
drivers/nvme/host/pci.c | 2 +-
drivers/nvme/host/sysfs.c | 41 +++++++++++++++++++++++++++++++++++++++
4 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 84f295e3bf08..fe6dcd19cecb 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -5144,6 +5144,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
memset(&ctrl->ka_cmd, 0, sizeof(ctrl->ka_cmd));
ctrl->ka_cmd.common.opcode = nvme_admin_keep_alive;
ctrl->ka_last_check_time = jiffies;
+ ctrl->admin_timeout = NVME_ADMIN_TIMEOUT;
BUILD_BUG_ON(NVME_DSM_MAX_RANGES * sizeof(struct nvme_dsm_range) >
PAGE_SIZE);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 6f9ecb4948f4..7923533cce00 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -370,6 +370,7 @@ struct nvme_ctrl {
u16 mtfa;
u32 ctrl_config;
u32 queue_count;
+ u32 admin_timeout;
u64 cap;
u32 max_hw_sectors;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8dc353451b21..80d2e517aac3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3096,7 +3096,7 @@ static bool __nvme_delete_io_queues(struct nvme_dev *dev, u8 opcode)
unsigned long timeout;
retry:
- timeout = NVME_ADMIN_TIMEOUT;
+ timeout = dev->ctrl.admin_timeout;
while (nr_queues > 0) {
if (nvme_delete_queue(&dev->queues[nr_queues], opcode))
break;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index e59758616f27..3b39b64cd9da 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -623,6 +623,46 @@ static ssize_t quirks_show(struct device *dev, struct device_attribute *attr,
}
static DEVICE_ATTR_RO(quirks);
+static ssize_t nvme_admin_timeout_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+ return sysfs_emit(buf, "%u\n",
+ jiffies_to_msecs(ctrl->admin_timeout));
+}
+
+static ssize_t nvme_admin_timeout_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+ u32 timeout;
+ int err;
+
+ /*
+ * Wait until the controller reaches the LIVE state to be sure that
+ * admin_q and fabrics_q are properly initialized.
+ */
+ if (!test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags))
+ return -EBUSY;
+
+ err = kstrtou32(buf, 10, &timeout);
+ if (err || !timeout)
+ return -EINVAL;
+
+ ctrl->admin_timeout = msecs_to_jiffies(timeout);
+
+ blk_queue_rq_timeout(ctrl->admin_q, ctrl->admin_timeout);
+ if (ctrl->fabrics_q)
+ blk_queue_rq_timeout(ctrl->fabrics_q, ctrl->admin_timeout);
+
+ return count;
+}
+
+static DEVICE_ATTR(admin_timeout, S_IRUGO | S_IWUSR,
+ nvme_admin_timeout_show, nvme_admin_timeout_store);
+
#ifdef CONFIG_NVME_HOST_AUTH
static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
struct device_attribute *attr, char *buf)
@@ -765,6 +805,7 @@ static struct attribute *nvme_dev_attrs[] = {
&dev_attr_cntrltype.attr,
&dev_attr_dctype.attr,
&dev_attr_quirks.attr,
+ &dev_attr_admin_timeout.attr,
#ifdef CONFIG_NVME_HOST_AUTH
&dev_attr_dhchap_secret.attr,
&dev_attr_dhchap_ctrl_secret.attr,
--
2.54.0
* [PATCH V5 4/7] nvme: add sysfs attribute to change IO timeout per controller
From: Maurizio Lombardi @ 2026-05-14 8:32 UTC (permalink / raw)
To: kbusch
Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard,
mkhalfella, chaitanyak, hare, hch
Currently, there is no way to adjust the timeout of NVMe I/O queues
on a per-controller basis.
Add an io_timeout attribute so that nvme controllers with different
timeout requirements can have custom I/O timeouts set.
The I/O timeout is also applied to the connect queue (connect_q).
In NVMe over Fabrics, the connect queue is utilized specifically to
issue Connect commands that establish the I/O queues.
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/host/core.c | 4 +++-
drivers/nvme/host/nvme.h | 1 +
drivers/nvme/host/sysfs.c | 47 +++++++++++++++++++++++++++++++++++++++
3 files changed, 51 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fe6dcd19cecb..626db6088dcd 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4207,6 +4207,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
mutex_unlock(&ctrl->namespaces_lock);
goto out_unlink_ns;
}
+ blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
nvme_ns_add_to_ctrl_list(ns);
mutex_unlock(&ctrl->namespaces_lock);
synchronize_srcu(&ctrl->srcu);
@@ -5145,6 +5146,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
ctrl->ka_cmd.common.opcode = nvme_admin_keep_alive;
ctrl->ka_last_check_time = jiffies;
ctrl->admin_timeout = NVME_ADMIN_TIMEOUT;
+ ctrl->io_timeout = NVME_IO_TIMEOUT;
BUILD_BUG_ON(NVME_DSM_MAX_RANGES * sizeof(struct nvme_dsm_range) >
PAGE_SIZE);
@@ -5253,7 +5255,7 @@ EXPORT_SYMBOL_GPL(nvme_unfreeze);
int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl)
{
- unsigned long timeout = NVME_IO_TIMEOUT;
+ unsigned long timeout = ctrl->io_timeout;
struct nvme_ns *ns;
int srcu_idx;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 7923533cce00..9ccaed0b9dbf 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -371,6 +371,7 @@ struct nvme_ctrl {
u32 ctrl_config;
u32 queue_count;
u32 admin_timeout;
+ u32 io_timeout;
u64 cap;
u32 max_hw_sectors;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 3b39b64cd9da..b682c1a4b23f 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -663,6 +663,52 @@ static ssize_t nvme_admin_timeout_store(struct device *dev,
static DEVICE_ATTR(admin_timeout, S_IRUGO | S_IWUSR,
nvme_admin_timeout_show, nvme_admin_timeout_store);
+static ssize_t nvme_io_timeout_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+ return sysfs_emit(buf, "%u\n", jiffies_to_msecs(ctrl->io_timeout));
+}
+
+static ssize_t nvme_io_timeout_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+ struct nvme_ns *ns;
+ u32 timeout;
+ int err;
+
+ /*
+ * Wait until the controller reaches the LIVE state to be sure that
+ * connect_q is properly initialized.
+ */
+ if (!test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags))
+ return -EBUSY;
+
+ err = kstrtou32(buf, 10, &timeout);
+ if (err || !timeout)
+ return -EINVAL;
+
+ /* Take the namespaces_lock to avoid racing against nvme_alloc_ns() */
+ mutex_lock(&ctrl->namespaces_lock);
+
+ ctrl->io_timeout = msecs_to_jiffies(timeout);
+ list_for_each_entry(ns, &ctrl->namespaces, list)
+ blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
+
+ mutex_unlock(&ctrl->namespaces_lock);
+
+ if (ctrl->connect_q)
+ blk_queue_rq_timeout(ctrl->connect_q, ctrl->io_timeout);
+
+ return count;
+}
+
+static DEVICE_ATTR(io_timeout, S_IRUGO | S_IWUSR,
+ nvme_io_timeout_show, nvme_io_timeout_store);
+
#ifdef CONFIG_NVME_HOST_AUTH
static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
struct device_attribute *attr, char *buf)
@@ -806,6 +852,7 @@ static struct attribute *nvme_dev_attrs[] = {
&dev_attr_dctype.attr,
&dev_attr_quirks.attr,
&dev_attr_admin_timeout.attr,
+ &dev_attr_io_timeout.attr,
#ifdef CONFIG_NVME_HOST_AUTH
&dev_attr_dhchap_secret.attr,
&dev_attr_dhchap_ctrl_secret.attr,
--
2.54.0
* [PATCH V5 5/7] nvme-core: align fabrics_q teardown with admin_q in nvme_free_ctrl
From: Maurizio Lombardi @ 2026-05-14 8:32 UTC (permalink / raw)
To: kbusch
Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard,
mkhalfella, chaitanyak, hare, hch
Currently, the final reference for the fabrics admin queue (fabrics_q)
is dropped inside nvme_remove_admin_tag_set(). However, the primary admin
queue (admin_q) defers dropping its final reference until
nvme_free_ctrl().
Move the blk_put_queue() call for fabrics_q from
nvme_remove_admin_tag_set() to nvme_free_ctrl(). This aligns the
lifecycle management of both admin queues, ensuring they are freed
symmetrically when the controller is finally torn down.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/host/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 626db6088dcd..e6bb1f5e9657 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4936,10 +4936,8 @@ void nvme_remove_admin_tag_set(struct nvme_ctrl *ctrl)
*/
nvme_stop_keep_alive(ctrl);
blk_mq_destroy_queue(ctrl->admin_q);
- if (ctrl->ops->flags & NVME_F_FABRICS) {
+ if (ctrl->fabrics_q)
blk_mq_destroy_queue(ctrl->fabrics_q);
- blk_put_queue(ctrl->fabrics_q);
- }
blk_mq_free_tag_set(ctrl->admin_tagset);
}
EXPORT_SYMBOL_GPL(nvme_remove_admin_tag_set);
@@ -5081,6 +5079,8 @@ static void nvme_free_ctrl(struct device *dev)
if (ctrl->admin_q)
blk_put_queue(ctrl->admin_q);
+ if (ctrl->fabrics_q)
+ blk_put_queue(ctrl->fabrics_q);
if (!subsys || ctrl->instance != subsys->instance)
ida_free(&nvme_instance_ida, ctrl->instance);
nvme_free_cels(ctrl);
--
2.54.0
* [PATCH V5 6/7] nvmet-loop: do not alloc admin tag set during reset
From: Maurizio Lombardi @ 2026-05-14 8:32 UTC (permalink / raw)
To: kbusch
Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard,
mkhalfella, chaitanyak, hare, hch
Currently, resetting a loopback controller unconditionally invokes
nvme_alloc_admin_tag_set() inside nvme_loop_configure_admin_queue().
Doing so drops the old queue and allocates a new one. Consequently,
this reverts the admin queue's timeout (q->rq_timeout) back to the
module default (NVME_ADMIN_TIMEOUT), completely wiping out any custom
timeout values the user may have configured via sysfs, and it
potentially races against nvme_admin_timeout_store(), which may
dereference the admin_q pointer during the RESETTING state.
Decouple the admin tag set lifecycle from the admin queue
configuration and destruction paths, which are executed during resets.
Specifically:
* Move nvme_alloc_admin_tag_set() into nvme_loop_create_ctrl() so it
is only allocated once during the initial controller creation.
* Defer the destruction of the admin tag set to
nvme_loop_delete_ctrl_host() and the terminal error-handling
paths of nvme_loop_reset_ctrl_work() and
nvme_loop_create_ctrl().
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/target/loop.c | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index d98d0cdc5d6f..070d16068e6b 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -274,7 +274,6 @@ static void nvme_loop_destroy_admin_queue(struct nvme_loop_ctrl *ctrl)
nvmet_sq_destroy(&ctrl->queues[0].nvme_sq);
nvmet_cq_put(&ctrl->queues[0].nvme_cq);
- nvme_remove_admin_tag_set(&ctrl->ctrl);
}
static void nvme_loop_free_ctrl(struct nvme_ctrl *nctrl)
@@ -375,25 +374,18 @@ static int nvme_loop_configure_admin_queue(struct nvme_loop_ctrl *ctrl)
}
ctrl->ctrl.queue_count = 1;
- error = nvme_alloc_admin_tag_set(&ctrl->ctrl, &ctrl->admin_tag_set,
- &nvme_loop_admin_mq_ops,
- sizeof(struct nvme_loop_iod) +
- NVME_INLINE_SG_CNT * sizeof(struct scatterlist));
- if (error)
- goto out_free_sq;
-
/* reset stopped state for the fresh admin queue */
clear_bit(NVME_CTRL_ADMIN_Q_STOPPED, &ctrl->ctrl.flags);
error = nvmf_connect_admin_queue(&ctrl->ctrl);
if (error)
- goto out_cleanup_tagset;
+ goto out_free_sq;
set_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags);
error = nvme_enable_ctrl(&ctrl->ctrl);
if (error)
- goto out_cleanup_tagset;
+ goto out_free_sq;
ctrl->ctrl.max_hw_sectors =
(NVME_LOOP_MAX_SEGMENTS - 1) << PAGE_SECTORS_SHIFT;
@@ -402,14 +394,12 @@ static int nvme_loop_configure_admin_queue(struct nvme_loop_ctrl *ctrl)
error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
if (error)
- goto out_cleanup_tagset;
+ goto out_free_sq;
return 0;
-out_cleanup_tagset:
- clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags);
- nvme_remove_admin_tag_set(&ctrl->ctrl);
out_free_sq:
+ clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags);
nvmet_sq_destroy(&ctrl->queues[0].nvme_sq);
nvmet_cq_put(&ctrl->queues[0].nvme_cq);
return error;
@@ -432,6 +422,7 @@ static void nvme_loop_shutdown_ctrl(struct nvme_loop_ctrl *ctrl)
static void nvme_loop_delete_ctrl_host(struct nvme_ctrl *ctrl)
{
nvme_loop_shutdown_ctrl(to_loop_ctrl(ctrl));
+ nvme_remove_admin_tag_set(ctrl);
}
static void nvme_loop_delete_ctrl(struct nvmet_ctrl *nctrl)
@@ -494,6 +485,7 @@ static void nvme_loop_reset_ctrl_work(struct work_struct *work)
nvme_cancel_admin_tagset(&ctrl->ctrl);
nvme_loop_destroy_admin_queue(ctrl);
out_disable:
+ nvme_remove_admin_tag_set(&ctrl->ctrl);
dev_warn(ctrl->ctrl.device, "Removing after reset failure\n");
nvme_uninit_ctrl(&ctrl->ctrl);
}
@@ -594,10 +586,17 @@ static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
if (!ctrl->queues)
goto out_uninit_ctrl;
- ret = nvme_loop_configure_admin_queue(ctrl);
+ ret = nvme_alloc_admin_tag_set(&ctrl->ctrl, &ctrl->admin_tag_set,
+ &nvme_loop_admin_mq_ops,
+ sizeof(struct nvme_loop_iod) +
+ NVME_INLINE_SG_CNT * sizeof(struct scatterlist));
if (ret)
goto out_free_queues;
+ ret = nvme_loop_configure_admin_queue(ctrl);
+ if (ret)
+ goto out_remove_admin_tagset;
+
if (opts->queue_size > ctrl->ctrl.maxcmd) {
/* warn if maxcmd is lower than queue_size */
dev_warn(ctrl->ctrl.device,
@@ -633,6 +632,8 @@ static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
nvme_quiesce_admin_queue(&ctrl->ctrl);
nvme_cancel_admin_tagset(&ctrl->ctrl);
nvme_loop_destroy_admin_queue(ctrl);
+out_remove_admin_tagset:
+ nvme_remove_admin_tag_set(&ctrl->ctrl);
out_free_queues:
kfree(ctrl->queues);
out_uninit_ctrl:
--
2.54.0
* [PATCH V5 7/7] nvme-core: warn on allocating admin tag set with existing queue
From: Maurizio Lombardi @ 2026-05-14 8:32 UTC (permalink / raw)
To: kbusch
Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard,
mkhalfella, chaitanyak, hare, hch
Currently, nvme_alloc_admin_tag_set() silently drops and releases
the existing admin_q if it is called on a controller that already
has one (e.g., during a controller reset).
However, transport drivers should not be reallocating the admin tag
set and queue during a reset. Dropping the old queue and allocating
a new one destroys user-configured timeouts and may race against
nvme_admin_timeout_store().
Since all transport drivers are now expected to preserve the admin queue
across resets, calling nvme_alloc_admin_tag_set() when ctrl->admin_q
is already populated is a bug.
Remove the silent cleanup and replace it with a WARN_ON_ONCE() to
explicitly catch any transport drivers that violate this lifecycle rule.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/host/core.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e6bb1f5e9657..f23a957c6e8f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4893,12 +4893,7 @@ int nvme_alloc_admin_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
if (ret)
return ret;
- /*
- * If a previous admin queue exists (e.g., from before a reset),
- * put it now before allocating a new one to avoid orphaning it.
- */
- if (ctrl->admin_q)
- blk_put_queue(ctrl->admin_q);
+ WARN_ON_ONCE(ctrl->admin_q);
ctrl->admin_q = blk_mq_alloc_queue(set, NULL, NULL);
if (IS_ERR(ctrl->admin_q)) {
--
2.54.0