public inbox for linux-nvme@lists.infradead.org
* [PATCH RFC 0/5] nvme: Refactor and expose per-controller timeout configuration
@ 2026-02-12 12:09 Maurizio Lombardi
  2026-02-12 12:09 ` [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests Maurizio Lombardi
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-12 12:09 UTC (permalink / raw)
  To: kbusch; +Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

This patchset tries to address some limitations in how the NVMe driver handles
command timeouts.
Currently, the driver relies heavily on global module parameters
(NVME_IO_TIMEOUT and NVME_ADMIN_TIMEOUT), making it difficult for users to
tune timeouts for specific controllers that may have very different
characteristics. Also, in some cases, manual changes to sysfs timeout values
are ignored by the driver logic.

For example, this patchset removes the unconditional timeout assignment in
nvme_init_request. This allows the block layer to correctly apply the request
queue's timeout settings, ensuring that user-initiated changes via sysfs
are actually respected for all requests.

It introduces new sysfs attributes (admin_timeout and io_timeout) to the NVMe
controller. This allows users to configure distinct timeout requirements for
different controllers rather than relying on global module parameters.

Some examples:

Changes to the controller's io_timeout get propagated to all
the associated namespaces' queues:

# find /sys -name 'io_timeout' 
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n1/queue/io_timeout
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n2/queue/io_timeout
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n3/queue/io_timeout
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/io_timeout

# echo 27000 > /sys/devices/virtual/nvme-fabrics/ctl/nvme0/io_timeout
# cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n1/queue/io_timeout
27000
# cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n2/queue/io_timeout
27000
# cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n3/queue/io_timeout
27000
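
The propagation shown above can be sketched as a small user-space model (an illustration only, not kernel code; the `Ctrl` class and its fields are hypothetical stand-ins for `struct nvme_ctrl` and the per-namespace request queues):

```python
# Simplified model of the io_timeout store handler added by this series:
# update the controller-wide value, then propagate it to every
# namespace request queue (blk_queue_rq_timeout in the real code).
class Ctrl:
    def __init__(self, ns_queues):
        self.io_timeout = 30_000          # default, in milliseconds
        self.ns_queues = ns_queues        # ns name -> queue rq_timeout (ms)

    def set_io_timeout(self, ms):
        if ms <= 0:
            raise ValueError("-EINVAL")   # the real handler rejects 0
        self.io_timeout = ms
        for name in self.ns_queues:
            self.ns_queues[name] = ms     # propagate to each ns queue

ctrl = Ctrl({"nvme0c0n1": 30_000, "nvme0c0n2": 30_000, "nvme0c0n3": 30_000})
ctrl.set_io_timeout(27_000)
assert all(v == 27_000 for v in ctrl.ns_queues.values())
```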

When a namespace is added on the target side, its io_timeout is inherited
from the controller's preferred timeout:

* target side *
# nvmetcli
/> cd subsystems/test-nqn/namespaces/4 
/subsystems/t.../namespaces/4> enable
The Namespace has been enabled.

************

* Host-side *
nvme nvme0: rescanning namespaces.
# find /sys -name 'io_timeout' 
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n1/queue/io_timeout
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n2/queue/io_timeout
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n3/queue/io_timeout
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n4/queue/io_timeout <-- new namespace
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/io_timeout

# cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n4/queue/io_timeout
27000
***********

io_timeout and admin_timeout module parameters are used as default
values for new controllers:

# nvme connect -t tcp -a 10.37.153.138 -s 8000 -n test-nqn2
connecting to device: nvme1

# cat /sys/devices/virtual/nvme-fabrics/ctl/nvme1/nvme1c1n1/queue/io_timeout
30000
# cat /sys/devices/virtual/nvme-fabrics/ctl/nvme1/admin_timeout
60000


David Jeffery (1):
  nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT

Heyne, Maximilian (1):
  nvme: Let the blocklayer set timeouts for requests

Maurizio Lombardi (3):
  nvme: add sysfs attribute to change admin timeout per nvme controller
  nvme: add sysfs attribute to change IO timeout per nvme controller
  nvme: use per controller timeout waits over depending on global
    default

 drivers/nvme/host/apple.c |  4 +--
 drivers/nvme/host/core.c  | 11 ++++---
 drivers/nvme/host/nvme.h  |  3 +-
 drivers/nvme/host/pci.c   |  5 +--
 drivers/nvme/host/rdma.c  |  2 +-
 drivers/nvme/host/sysfs.c | 69 +++++++++++++++++++++++++++++++++++++++
 drivers/nvme/host/tcp.c   |  2 +-
 7 files changed, 84 insertions(+), 12 deletions(-)

-- 
2.53.0



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests
  2026-02-12 12:09 [PATCH RFC 0/5] nvme: Refactor and expose per-controller timeout configuration Maurizio Lombardi
@ 2026-02-12 12:09 ` Maurizio Lombardi
  2026-02-17 20:15   ` Mohamed Khalfella
  2026-02-12 12:09 ` [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller Maurizio Lombardi
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-12 12:09 UTC (permalink / raw)
  To: kbusch; +Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

From: "Heyne, Maximilian" <mheyne@amazon.de>

When initializing an NVMe request that is about to be sent to the block
layer, we do not need to initialize its timeout. If it is left
uninitialized at 0, the block layer will use the request queue's timeout
in blk_add_timer (via nvme_start_request, which is called from
nvme_*_queue_rq). These timeouts are set up as either NVME_IO_TIMEOUT or
NVME_ADMIN_TIMEOUT when the request queues are created.

Because the io_timeout of the IO queues can be modified via sysfs, the
following situation can occur:

1) NVME_IO_TIMEOUT = 30 (default module parameter)
2) nvme1n1 is probed. IO queues default timeout is 30 s
3) manually change the IO timeout to 90 s
   echo 90000 > /sys/class/nvme/nvme1/nvme1n1/queue/io_timeout
4) Any call of __submit_sync_cmd on nvme1n1 to an IO queue will issue
   commands with the 30 s timeout instead of the wanted 90 s which might
   be more suitable for this device.
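
The fallback behavior this patch relies on can be modeled roughly as follows (a simplified user-space sketch, not the actual blk_add_timer code; `effective_timeout` is a hypothetical helper):

```python
# Model of how the block layer picks a request timeout: a request whose
# timeout was left at 0 inherits the request queue's rq_timeout.
def effective_timeout(req_timeout, queue_rq_timeout):
    """Return the timeout the block layer would apply (ms)."""
    return req_timeout if req_timeout else queue_rq_timeout

# Before the patch, nvme_init_request forced req->timeout to the
# module-wide default, so the sysfs change to 90 s was ignored:
assert effective_timeout(30_000, 90_000) == 30_000
# After the patch, the request timeout stays 0 and the queue value wins:
assert effective_timeout(0, 90_000) == 90_000
```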

Commit 470e900c8036 ("nvme: refactor nvme_alloc_request") already
silently changed the behavior for ioctls because it unconditionally
overrides the request's timeout that was set in nvme_init_request. If it
was left unset by the caller of the ioctl, it will be overridden with 0,
meaning the block layer will pick the request queue's I/O timeout.

Following up on that, this patch further improves the consistency of IO
timeout usage. However, there are still uses of NVME_IO_TIMEOUT which
could be inconsistent with what is set in the device's request_queue by
the user.

Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---
 drivers/nvme/host/core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7bf228df6001..b9315f0abf80 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -724,10 +724,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
 		struct nvme_ns *ns = req->q->disk->private_data;
 
 		logging_enabled = ns->head->passthru_err_log_enabled;
-		req->timeout = NVME_IO_TIMEOUT;
 	} else { /* no queuedata implies admin queue */
 		logging_enabled = nr->ctrl->passthru_err_log_enabled;
-		req->timeout = NVME_ADMIN_TIMEOUT;
 	}
 
 	if (!logging_enabled)
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller
  2026-02-12 12:09 [PATCH RFC 0/5] nvme: Refactor and expose per-controller timeout configuration Maurizio Lombardi
  2026-02-12 12:09 ` [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests Maurizio Lombardi
@ 2026-02-12 12:09 ` Maurizio Lombardi
  2026-02-17 20:17   ` Mohamed Khalfella
                     ` (2 more replies)
  2026-02-12 12:09 ` [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT Maurizio Lombardi
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-12 12:09 UTC (permalink / raw)
  To: kbusch; +Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

Currently, there is no way to adjust the timeout of NVMe admin
queues on a per-controller basis.
Add an admin_timeout attribute to nvme so that different
NVMe controllers, which may have different timeout
requirements, can have custom admin timeouts set.
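
A rough user-space model of the input the new attribute accepts (mirroring the `kstrtou32` plus zero-check semantics in the diff below; `admin_timeout_store` here is an illustrative stand-in, and values are in milliseconds):

```python
EINVAL = 22

def admin_timeout_store(buf):
    """Return the parsed timeout in ms, or -EINVAL for non-numeric,
    negative, zero, or out-of-range (u32) input."""
    try:
        timeout = int(buf.strip(), 10)   # kstrtou32 tolerates trailing '\n'
    except ValueError:
        return -EINVAL
    if timeout <= 0 or timeout > 0xFFFFFFFF:
        return -EINVAL
    return timeout

assert admin_timeout_store("60000\n") == 60000   # valid: 60 s
assert admin_timeout_store("0") == -EINVAL       # zero is rejected
assert admin_timeout_store("abc") == -EINVAL     # non-numeric is rejected
```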

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/nvme/host/sysfs.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 29430949ce2f..149cd1ab3b3d 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -601,6 +601,36 @@ static ssize_t dctype_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(dctype);
 
+static ssize_t nvme_admin_timeout_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%u\n",
+				jiffies_to_msecs(ctrl->admin_q->rq_timeout));
+}
+
+static ssize_t nvme_admin_timeout_store(struct device *dev,
+			struct device_attribute *attr,
+			const char *buf, size_t count)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+	u32 timeout;
+	int err;
+
+	err = kstrtou32(buf, 10, &timeout);
+	if (err || !timeout)
+		return -EINVAL;
+
+	blk_queue_rq_timeout(ctrl->admin_q, msecs_to_jiffies(timeout));
+
+	return count;
+}
+
+static struct device_attribute dev_attr_admin_timeout =  \
+	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
+	nvme_admin_timeout_show, nvme_admin_timeout_store);
+
 #ifdef CONFIG_NVME_HOST_AUTH
 static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
@@ -742,6 +772,7 @@ static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_kato.attr,
 	&dev_attr_cntrltype.attr,
 	&dev_attr_dctype.attr,
+	&dev_attr_admin_timeout.attr,
 #ifdef CONFIG_NVME_HOST_AUTH
 	&dev_attr_dhchap_secret.attr,
 	&dev_attr_dhchap_ctrl_secret.attr,
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT
  2026-02-12 12:09 [PATCH RFC 0/5] nvme: Refactor and expose per-controller timeout configuration Maurizio Lombardi
  2026-02-12 12:09 ` [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests Maurizio Lombardi
  2026-02-12 12:09 ` [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller Maurizio Lombardi
@ 2026-02-12 12:09 ` Maurizio Lombardi
  2026-02-17 20:22   ` Mohamed Khalfella
  2026-02-18 13:10   ` Heyne, Maximilian
  2026-02-12 12:09 ` [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller Maurizio Lombardi
  2026-02-12 12:09 ` [PATCH RFC 5/5] nvme: use per controller timeout waits over depending on global default Maurizio Lombardi
  4 siblings, 2 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-12 12:09 UTC (permalink / raw)
  To: kbusch; +Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

From: David Jeffery <djeffery@redhat.com>

While tearing down its queues, nvme-pci uses NVME_ADMIN_TIMEOUT as its
timeout target. Instead, use the configured admin queue's timeout value
when available to match the device's existing timeout setting.

Signed-off-by: David Jeffery <djeffery@redhat.com>
---
 drivers/nvme/host/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 58f3097888a7..853cd57e4480 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2911,9 +2911,10 @@ static bool __nvme_delete_io_queues(struct nvme_dev *dev, u8 opcode)
 {
 	int nr_queues = dev->online_queues - 1, sent = 0;
 	unsigned long timeout;
+	struct request_queue *q = dev->ctrl.admin_q;
 
  retry:
-	timeout = NVME_ADMIN_TIMEOUT;
+	timeout = q ? q->rq_timeout : NVME_ADMIN_TIMEOUT;
 	while (nr_queues > 0) {
 		if (nvme_delete_queue(&dev->queues[nr_queues], opcode))
 			break;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-12 12:09 [PATCH RFC 0/5] nvme: Refactor and expose per-controller timeout configuration Maurizio Lombardi
                   ` (2 preceding siblings ...)
  2026-02-12 12:09 ` [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT Maurizio Lombardi
@ 2026-02-12 12:09 ` Maurizio Lombardi
  2026-02-17 20:25   ` Mohamed Khalfella
  2026-02-18 17:58   ` Mohamed Khalfella
  2026-02-12 12:09 ` [PATCH RFC 5/5] nvme: use per controller timeout waits over depending on global default Maurizio Lombardi
  4 siblings, 2 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-12 12:09 UTC (permalink / raw)
  To: kbusch; +Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

Currently, there is no way to adjust the timeout of NVMe I/O
queues on a per-controller basis.
Add an io_timeout attribute to nvme so that different
NVMe controllers, which may have different timeout
requirements, can have custom I/O timeouts set.
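
The attribute reports milliseconds while the controller stores jiffies; the round-trip can be sketched as follows (a simplified model; HZ=250 is an assumption for illustration, the real value is a build-time kernel config, and the rounding here only approximates the kernel's):

```python
HZ = 250  # assumed tick rate for this sketch

def msecs_to_jiffies(ms):
    # round up so a nonzero ms value never becomes zero jiffies
    return (ms * HZ + 999) // 1000

def jiffies_to_msecs(j):
    return j * 1000 // HZ

# Values that are whole ticks survive the store/show round-trip intact:
assert jiffies_to_msecs(msecs_to_jiffies(27000)) == 27000
```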

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/nvme/host/apple.c |  2 +-
 drivers/nvme/host/core.c  |  4 +++-
 drivers/nvme/host/nvme.h  |  1 +
 drivers/nvme/host/sysfs.c | 38 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index ed61b97fde59..e9bb64e3ec9c 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -1331,7 +1331,7 @@ static int apple_nvme_alloc_tagsets(struct apple_nvme *anv)
 	if (anv->hw->has_lsq_nvmmu)
 		anv->tagset.reserved_tags = APPLE_NVME_AQ_DEPTH;
 	anv->tagset.queue_depth = anv->hw->max_queue_depth - 1;
-	anv->tagset.timeout = NVME_IO_TIMEOUT;
+	anv->tagset.timeout = anv->ctrl.io_timeout;
 	anv->tagset.numa_node = NUMA_NO_NODE;
 	anv->tagset.cmd_size = sizeof(struct apple_nvme_iod);
 	anv->tagset.driver_data = &anv->ioq;
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b9315f0abf80..cc7d725bd6ff 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4169,6 +4169,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
 		mutex_unlock(&ctrl->namespaces_lock);
 		goto out_unlink_ns;
 	}
+	blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
 	nvme_ns_add_to_ctrl_list(ns);
 	mutex_unlock(&ctrl->namespaces_lock);
 	synchronize_srcu(&ctrl->srcu);
@@ -4930,7 +4931,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 	set->cmd_size = cmd_size;
 	set->driver_data = ctrl;
 	set->nr_hw_queues = ctrl->queue_count - 1;
-	set->timeout = NVME_IO_TIMEOUT;
+	set->timeout = ctrl->io_timeout;
 	set->nr_maps = nr_maps;
 	ret = blk_mq_alloc_tag_set(set);
 	if (ret)
@@ -5107,6 +5108,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	memset(&ctrl->ka_cmd, 0, sizeof(ctrl->ka_cmd));
 	ctrl->ka_cmd.common.opcode = nvme_admin_keep_alive;
 	ctrl->ka_last_check_time = jiffies;
+	ctrl->io_timeout = NVME_IO_TIMEOUT;
 
 	BUILD_BUG_ON(NVME_DSM_MAX_RANGES * sizeof(struct nvme_dsm_range) >
 			PAGE_SIZE);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9a5f28c5103c..ef390a020d8d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -316,6 +316,7 @@ struct nvme_ctrl {
 	u16 mtfa;
 	u32 ctrl_config;
 	u32 queue_count;
+	u32 io_timeout;
 
 	u64 cap;
 	u32 max_hw_sectors;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 149cd1ab3b3d..0536e4919aa7 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -631,6 +631,43 @@ static struct device_attribute dev_attr_admin_timeout =  \
 	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
 	nvme_admin_timeout_show, nvme_admin_timeout_store);
 
+static ssize_t nvme_io_timeout_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%u\n", jiffies_to_msecs(ctrl->io_timeout));
+}
+
+static ssize_t nvme_io_timeout_store(struct device *dev,
+			struct device_attribute *attr,
+			const char *buf, size_t count)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+	struct nvme_ns *ns;
+	u32 timeout;
+	int err;
+
+	err = kstrtou32(buf, 10, &timeout);
+	if (err || !timeout)
+		return -EINVAL;
+
+	/* Take the namespaces_lock to avoid racing against nvme_alloc_ns() */
+	mutex_lock(&ctrl->namespaces_lock);
+
+	ctrl->io_timeout = msecs_to_jiffies(timeout);
+	list_for_each_entry(ns, &ctrl->namespaces, list)
+		blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
+
+	mutex_unlock(&ctrl->namespaces_lock);
+
+	return count;
+}
+
+static struct device_attribute dev_attr_io_timeout = \
+	__ATTR(io_timeout, S_IRUGO | S_IWUSR, \
+	nvme_io_timeout_show, nvme_io_timeout_store);
+
 #ifdef CONFIG_NVME_HOST_AUTH
 static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
@@ -773,6 +810,7 @@ static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_cntrltype.attr,
 	&dev_attr_dctype.attr,
 	&dev_attr_admin_timeout.attr,
+	&dev_attr_io_timeout.attr,
 #ifdef CONFIG_NVME_HOST_AUTH
 	&dev_attr_dhchap_secret.attr,
 	&dev_attr_dhchap_ctrl_secret.attr,
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 5/5] nvme: use per controller timeout waits over depending on global default
  2026-02-12 12:09 [PATCH RFC 0/5] nvme: Refactor and expose per-controller timeout configuration Maurizio Lombardi
                   ` (3 preceding siblings ...)
  2026-02-12 12:09 ` [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller Maurizio Lombardi
@ 2026-02-12 12:09 ` Maurizio Lombardi
  2026-02-17 20:29   ` Mohamed Khalfella
  4 siblings, 1 reply; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-12 12:09 UTC (permalink / raw)
  To: kbusch; +Cc: mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

Instead of passing NVME_IO_TIMEOUT as a parameter with every call to
nvme_wait_freeze_timeout, use the controller's preferred timeout.
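
The shared timeout budget inside nvme_wait_freeze_timeout can be sketched as follows (a simplified model; `wait_freeze_timeout` is an illustrative stand-in for the loop over blk_mq_freeze_queue_wait_timeout, which returns the remaining budget after each wait):

```python
# One timeout budget is shared across all namespaces: each queue's wait
# consumes part of the remainder, and we stop early once it is exhausted.
def wait_freeze_timeout(wait_times, budget):
    """Return the leftover budget (0 means we timed out)."""
    for waited in wait_times:
        budget = max(0, budget - waited)  # blk_mq_freeze_queue_wait_timeout
        if budget == 0:
            break                         # no time left for further queues
    return budget

assert wait_freeze_timeout([10, 10], 30) == 10  # budget left: success
assert wait_freeze_timeout([20, 20], 30) == 0   # exhausted: timed out
```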

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/nvme/host/apple.c | 2 +-
 drivers/nvme/host/core.c  | 5 +++--
 drivers/nvme/host/nvme.h  | 2 +-
 drivers/nvme/host/pci.c   | 2 +-
 drivers/nvme/host/rdma.c  | 2 +-
 drivers/nvme/host/tcp.c   | 2 +-
 6 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index e9bb64e3ec9c..c29c1f45f73a 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -858,7 +858,7 @@ static void apple_nvme_disable(struct apple_nvme *anv, bool shutdown)
 	 * doing a safe shutdown.
 	 */
 	if (!dead && shutdown && freeze)
-		nvme_wait_freeze_timeout(&anv->ctrl, NVME_IO_TIMEOUT);
+		nvme_wait_freeze_timeout(&anv->ctrl);
 
 	nvme_quiesce_io_queues(&anv->ctrl);
 
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index cc7d725bd6ff..c1b1a5d08838 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -5215,16 +5215,17 @@ void nvme_unfreeze(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_unfreeze);
 
-int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout)
+int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
 	int srcu_idx;
+	unsigned long timeout = ctrl->io_timeout;
 
 	srcu_idx = srcu_read_lock(&ctrl->srcu);
 	list_for_each_entry_srcu(ns, &ctrl->namespaces, list,
 				 srcu_read_lock_held(&ctrl->srcu)) {
 		timeout = blk_mq_freeze_queue_wait_timeout(ns->queue, timeout);
-		if (timeout <= 0)
+		if (!timeout)
 			break;
 	}
 	srcu_read_unlock(&ctrl->srcu, srcu_idx);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ef390a020d8d..c59228895bf0 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -840,7 +840,7 @@ void nvme_sync_queues(struct nvme_ctrl *ctrl);
 void nvme_sync_io_queues(struct nvme_ctrl *ctrl);
 void nvme_unfreeze(struct nvme_ctrl *ctrl);
 void nvme_wait_freeze(struct nvme_ctrl *ctrl);
-int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout);
+int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl);
 void nvme_start_freeze(struct nvme_ctrl *ctrl);
 
 static inline enum req_op nvme_req_op(struct nvme_command *cmd)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 853cd57e4480..0bd533ddc07c 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3096,7 +3096,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 		 * if doing a safe shutdown.
 		 */
 		if (!dead && shutdown)
-			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
+			nvme_wait_freeze_timeout(&dev->ctrl);
 	}
 
 	nvme_quiesce_io_queues(&dev->ctrl);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 35c0822edb2d..c8859367ffff 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -888,7 +888,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 	if (!new) {
 		nvme_start_freeze(&ctrl->ctrl);
 		nvme_unquiesce_io_queues(&ctrl->ctrl);
-		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+		if (!nvme_wait_freeze_timeout(&ctrl->ctrl)) {
 			/*
 			 * If we timed out waiting for freeze we are likely to
 			 * be stuck.  Fail the controller initialization just
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 69cb04406b47..0f78fa8f1cbc 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2193,7 +2193,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 	if (!new) {
 		nvme_start_freeze(ctrl);
 		nvme_unquiesce_io_queues(ctrl);
-		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
+		if (!nvme_wait_freeze_timeout(ctrl)) {
 			/*
 			 * If we timed out waiting for freeze we are likely to
 			 * be stuck.  Fail the controller initialization just
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests
  2026-02-12 12:09 ` [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests Maurizio Lombardi
@ 2026-02-17 20:15   ` Mohamed Khalfella
  2026-02-18 12:19     ` Heyne, Maximilian
  0 siblings, 1 reply; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-17 20:15 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Thu 2026-02-12 13:09:47 +0100, Maurizio Lombardi wrote:
> From: "Heyne, Maximilian" <mheyne@amazon.de>
> 
> When initializing an NVMe request that is about to be sent to the block
> layer, we do not need to initialize its timeout. If it is left
> uninitialized at 0, the block layer will use the request queue's timeout
> in blk_add_timer (via nvme_start_request, which is called from
> nvme_*_queue_rq). These timeouts are set up as either NVME_IO_TIMEOUT or
> NVME_ADMIN_TIMEOUT when the request queues are created.
> 
> Because the io_timeout of the IO queues can be modified via sysfs, the
> following situation can occur:
> 
> 1) NVME_IO_TIMEOUT = 30 (default module parameter)
> 2) nvme1n1 is probed. IO queues default timeout is 30 s
> 3) manually change the IO timeout to 90 s
>    echo 90000 > /sys/class/nvme/nvme1/nvme1n1/queue/io_timeout
> 4) Any call of __submit_sync_cmd on nvme1n1 to an IO queue will issue
>    commands with the 30 s timeout instead of the wanted 90 s which might
>    be more suitable for this device.
> 
> Commit 470e900c8036 ("nvme: refactor nvme_alloc_request") already
> silently changed the behavior for ioctls because it unconditionally
> overrides the request's timeout that was set in nvme_init_request. If it
> was left unset by the caller of the ioctl, it will be overridden with 0,
> meaning the block layer will pick the request queue's I/O timeout.
> 
> Following up on that, this patch further improves the consistency of IO
> timeout usage. However, there are still uses of NVME_IO_TIMEOUT which
> could be inconsistent with what is set in the device's request_queue by
> the user.
> 
> Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
> ---
>  drivers/nvme/host/core.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 7bf228df6001..b9315f0abf80 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -724,10 +724,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
>  		struct nvme_ns *ns = req->q->disk->private_data;
>  
>  		logging_enabled = ns->head->passthru_err_log_enabled;
> -		req->timeout = NVME_IO_TIMEOUT;
>  	} else { /* no queuedata implies admin queue */
>  		logging_enabled = nr->ctrl->passthru_err_log_enabled;
> -		req->timeout = NVME_ADMIN_TIMEOUT;
>  	}
>  
>  	if (!logging_enabled)
> -- 
> 2.53.0
> 
> 

I wonder what the impact of these lines was, given that req->timeout is
set to 0 by blk_mq_rq_ctx_init() when the request is allocated?


Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller
  2026-02-12 12:09 ` [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller Maurizio Lombardi
@ 2026-02-17 20:17   ` Mohamed Khalfella
  2026-02-18 12:40   ` Heyne, Maximilian
  2026-02-18 18:00   ` Mohamed Khalfella
  2 siblings, 0 replies; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-17 20:17 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Thu 2026-02-12 13:09:48 +0100, Maurizio Lombardi wrote:
> Currently, there is no way to adjust the timeout of NVMe admin
> queues on a per-controller basis.
> Add an admin_timeout attribute to nvme so that different
> NVMe controllers, which may have different timeout
> requirements, can have custom admin timeouts set.
> 
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>

Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>

> ---
>  drivers/nvme/host/sysfs.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index 29430949ce2f..149cd1ab3b3d 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -601,6 +601,36 @@ static ssize_t dctype_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RO(dctype);
>  
> +static ssize_t nvme_admin_timeout_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	return sysfs_emit(buf, "%u\n",
> +				jiffies_to_msecs(ctrl->admin_q->rq_timeout));
> +}
> +
> +static ssize_t nvme_admin_timeout_store(struct device *dev,
> +			struct device_attribute *attr,
> +			const char *buf, size_t count)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +	u32 timeout;
> +	int err;
> +
> +	err = kstrtou32(buf, 10, &timeout);
> +	if (err || !timeout)
> +		return -EINVAL;
> +
> +	blk_queue_rq_timeout(ctrl->admin_q, msecs_to_jiffies(timeout));
> +
> +	return count;
> +}
> +
> +static struct device_attribute dev_attr_admin_timeout =  \
> +	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
> +	nvme_admin_timeout_show, nvme_admin_timeout_store);
> +
>  #ifdef CONFIG_NVME_HOST_AUTH
>  static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
> @@ -742,6 +772,7 @@ static struct attribute *nvme_dev_attrs[] = {
>  	&dev_attr_kato.attr,
>  	&dev_attr_cntrltype.attr,
>  	&dev_attr_dctype.attr,
> +	&dev_attr_admin_timeout.attr,
>  #ifdef CONFIG_NVME_HOST_AUTH
>  	&dev_attr_dhchap_secret.attr,
>  	&dev_attr_dhchap_ctrl_secret.attr,
> -- 
> 2.53.0
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT
  2026-02-12 12:09 ` [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT Maurizio Lombardi
@ 2026-02-17 20:22   ` Mohamed Khalfella
  2026-02-18 13:10   ` Heyne, Maximilian
  1 sibling, 0 replies; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-17 20:22 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Thu 2026-02-12 13:09:49 +0100, Maurizio Lombardi wrote:
> From: David Jeffery <djeffery@redhat.com>
> 
> While tearing down its queues, nvme-pci uses NVME_ADMIN_TIMEOUT as its
> timeout target. Instead, use the configured admin queue's timeout value
> when available to match the device's existing timeout setting.
> 
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> ---
>  drivers/nvme/host/pci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 58f3097888a7..853cd57e4480 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2911,9 +2911,10 @@ static bool __nvme_delete_io_queues(struct nvme_dev *dev, u8 opcode)
>  {
>  	int nr_queues = dev->online_queues - 1, sent = 0;
>  	unsigned long timeout;
> +	struct request_queue *q = dev->ctrl.admin_q;

Maybe place q between nr_queues and timeout?

>  
>   retry:
> -	timeout = NVME_ADMIN_TIMEOUT;
> +	timeout = q ? q->rq_timeout : NVME_ADMIN_TIMEOUT;

Under what condition do we expect admin_q to be NULL?

>  	while (nr_queues > 0) {
>  		if (nvme_delete_queue(&dev->queues[nr_queues], opcode))
>  			break;
> -- 
> 2.53.0
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-12 12:09 ` [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller Maurizio Lombardi
@ 2026-02-17 20:25   ` Mohamed Khalfella
  2026-02-18 13:28     ` Maurizio Lombardi
  2026-02-18 17:58   ` Mohamed Khalfella
  1 sibling, 1 reply; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-17 20:25 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Thu 2026-02-12 13:09:50 +0100, Maurizio Lombardi wrote:
> Currently, there is no way to adjust the timeout of NVMe I/O
> queues on a per-controller basis.
> Add an io_timeout attribute to nvme so that different
> NVMe controllers, which may have different timeout
> requirements, can have custom I/O timeouts set.
> 
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
> ---
>  drivers/nvme/host/apple.c |  2 +-
>  drivers/nvme/host/core.c  |  4 +++-
>  drivers/nvme/host/nvme.h  |  1 +
>  drivers/nvme/host/sysfs.c | 38 ++++++++++++++++++++++++++++++++++++++
>  4 files changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
> index ed61b97fde59..e9bb64e3ec9c 100644
> --- a/drivers/nvme/host/apple.c
> +++ b/drivers/nvme/host/apple.c
> @@ -1331,7 +1331,7 @@ static int apple_nvme_alloc_tagsets(struct apple_nvme *anv)
>  	if (anv->hw->has_lsq_nvmmu)
>  		anv->tagset.reserved_tags = APPLE_NVME_AQ_DEPTH;
>  	anv->tagset.queue_depth = anv->hw->max_queue_depth - 1;
> -	anv->tagset.timeout = NVME_IO_TIMEOUT;
> +	anv->tagset.timeout = anv->ctrl.io_timeout;
>  	anv->tagset.numa_node = NUMA_NO_NODE;
>  	anv->tagset.cmd_size = sizeof(struct apple_nvme_iod);
>  	anv->tagset.driver_data = &anv->ioq;
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index b9315f0abf80..cc7d725bd6ff 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4169,6 +4169,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
>  		mutex_unlock(&ctrl->namespaces_lock);
>  		goto out_unlink_ns;
>  	}
> +	blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);

Given that the I/O tagset->timeout has been initialized to ctrl->io_timeout,
do we still need the line above?

>  	nvme_ns_add_to_ctrl_list(ns);
>  	mutex_unlock(&ctrl->namespaces_lock);
>  	synchronize_srcu(&ctrl->srcu);
> @@ -4930,7 +4931,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
>  	set->cmd_size = cmd_size;
>  	set->driver_data = ctrl;
>  	set->nr_hw_queues = ctrl->queue_count - 1;
> -	set->timeout = NVME_IO_TIMEOUT;
> +	set->timeout = ctrl->io_timeout;
>  	set->nr_maps = nr_maps;
>  	ret = blk_mq_alloc_tag_set(set);
>  	if (ret)
> @@ -5107,6 +5108,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
>  	memset(&ctrl->ka_cmd, 0, sizeof(ctrl->ka_cmd));
>  	ctrl->ka_cmd.common.opcode = nvme_admin_keep_alive;
>  	ctrl->ka_last_check_time = jiffies;
> +	ctrl->io_timeout = NVME_IO_TIMEOUT;
>  
>  	BUILD_BUG_ON(NVME_DSM_MAX_RANGES * sizeof(struct nvme_dsm_range) >
>  			PAGE_SIZE);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9a5f28c5103c..ef390a020d8d 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -316,6 +316,7 @@ struct nvme_ctrl {
>  	u16 mtfa;
>  	u32 ctrl_config;
>  	u32 queue_count;
> +	u32 io_timeout;
>  
>  	u64 cap;
>  	u32 max_hw_sectors;
> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index 149cd1ab3b3d..0536e4919aa7 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -631,6 +631,43 @@ static struct device_attribute dev_attr_admin_timeout =  \
>  	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
>  	nvme_admin_timeout_show, nvme_admin_timeout_store);
>  
> +static ssize_t nvme_io_timeout_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	return sysfs_emit(buf, "%u\n", jiffies_to_msecs(ctrl->io_timeout));
> +}
> +
> +static ssize_t nvme_io_timeout_store(struct device *dev,
> +			struct device_attribute *attr,
> +			const char *buf, size_t count)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +	struct nvme_ns *ns;
> +	u32 timeout;
> +	int err;
> +
> +	err = kstrtou32(buf, 10, &timeout);
> +	if (err || !timeout)
> +		return -EINVAL;
> +
> +	/* Take the namespaces_lock to avoid racing against nvme_alloc_ns() */
> +	mutex_lock(&ctrl->namespaces_lock);
> +
> +	ctrl->io_timeout = msecs_to_jiffies(timeout);
> +	list_for_each_entry(ns, &ctrl->namespaces, list)
> +		blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
> +
> +	mutex_unlock(&ctrl->namespaces_lock);
> +
> +	return count;
> +}
> +
> +static struct device_attribute dev_attr_io_timeout = \
> +	__ATTR(io_timeout, S_IRUGO | S_IWUSR, \
> +	nvme_io_timeout_show, nvme_io_timeout_store);
> +
>  #ifdef CONFIG_NVME_HOST_AUTH
>  static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
> @@ -773,6 +810,7 @@ static struct attribute *nvme_dev_attrs[] = {
>  	&dev_attr_cntrltype.attr,
>  	&dev_attr_dctype.attr,
>  	&dev_attr_admin_timeout.attr,
> +	&dev_attr_io_timeout.attr,
>  #ifdef CONFIG_NVME_HOST_AUTH
>  	&dev_attr_dhchap_secret.attr,
>  	&dev_attr_dhchap_ctrl_secret.attr,
> -- 
> 2.53.0
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 5/5] nvme: use per controller timeout waits over depending on global default
  2026-02-12 12:09 ` [PATCH RFC 5/5] nvme: use per controller timeout waits over depending on global default Maurizio Lombardi
@ 2026-02-17 20:29   ` Mohamed Khalfella
  0 siblings, 0 replies; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-17 20:29 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Thu 2026-02-12 13:09:51 +0100, Maurizio Lombardi wrote:
> Instead of passing NVME_IO_TIMEOUT as a parameter with every call to
> nvme_wait_freeze_timeout, use the controller's preferred timeout.
> 
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>

Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>

> ---
>  drivers/nvme/host/apple.c | 2 +-
>  drivers/nvme/host/core.c  | 5 +++--
>  drivers/nvme/host/nvme.h  | 2 +-
>  drivers/nvme/host/pci.c   | 2 +-
>  drivers/nvme/host/rdma.c  | 2 +-
>  drivers/nvme/host/tcp.c   | 2 +-
>  6 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
> index e9bb64e3ec9c..c29c1f45f73a 100644
> --- a/drivers/nvme/host/apple.c
> +++ b/drivers/nvme/host/apple.c
> @@ -858,7 +858,7 @@ static void apple_nvme_disable(struct apple_nvme *anv, bool shutdown)
>  	 * doing a safe shutdown.
>  	 */
>  	if (!dead && shutdown && freeze)
> -		nvme_wait_freeze_timeout(&anv->ctrl, NVME_IO_TIMEOUT);
> +		nvme_wait_freeze_timeout(&anv->ctrl);
>  
>  	nvme_quiesce_io_queues(&anv->ctrl);
>  
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index cc7d725bd6ff..c1b1a5d08838 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -5215,16 +5215,17 @@ void nvme_unfreeze(struct nvme_ctrl *ctrl)
>  }
>  EXPORT_SYMBOL_GPL(nvme_unfreeze);
>  
> -int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout)
> +int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl)
>  {
>  	struct nvme_ns *ns;
>  	int srcu_idx;
> +	unsigned long timeout = ctrl->io_timeout;

nit: Maybe move this to the top before ns?

>  
>  	srcu_idx = srcu_read_lock(&ctrl->srcu);
>  	list_for_each_entry_srcu(ns, &ctrl->namespaces, list,
>  				 srcu_read_lock_held(&ctrl->srcu)) {
>  		timeout = blk_mq_freeze_queue_wait_timeout(ns->queue, timeout);
> -		if (timeout <= 0)
> +		if (!timeout)
>  			break;
>  	}
>  	srcu_read_unlock(&ctrl->srcu, srcu_idx);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index ef390a020d8d..c59228895bf0 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -840,7 +840,7 @@ void nvme_sync_queues(struct nvme_ctrl *ctrl);
>  void nvme_sync_io_queues(struct nvme_ctrl *ctrl);
>  void nvme_unfreeze(struct nvme_ctrl *ctrl);
>  void nvme_wait_freeze(struct nvme_ctrl *ctrl);
> -int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout);
> +int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl);
>  void nvme_start_freeze(struct nvme_ctrl *ctrl);
>  
>  static inline enum req_op nvme_req_op(struct nvme_command *cmd)
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 853cd57e4480..0bd533ddc07c 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3096,7 +3096,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>  		 * if doing a safe shutdown.
>  		 */
>  		if (!dead && shutdown)
> -			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
> +			nvme_wait_freeze_timeout(&dev->ctrl);
>  	}
>  
>  	nvme_quiesce_io_queues(&dev->ctrl);
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 35c0822edb2d..c8859367ffff 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -888,7 +888,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>  	if (!new) {
>  		nvme_start_freeze(&ctrl->ctrl);
>  		nvme_unquiesce_io_queues(&ctrl->ctrl);
> -		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> +		if (!nvme_wait_freeze_timeout(&ctrl->ctrl)) {
>  			/*
>  			 * If we timed out waiting for freeze we are likely to
>  			 * be stuck.  Fail the controller initialization just
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 69cb04406b47..0f78fa8f1cbc 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2193,7 +2193,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>  	if (!new) {
>  		nvme_start_freeze(ctrl);
>  		nvme_unquiesce_io_queues(ctrl);
> -		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
> +		if (!nvme_wait_freeze_timeout(ctrl)) {
>  			/*
>  			 * If we timed out waiting for freeze we are likely to
>  			 * be stuck.  Fail the controller initialization just
> -- 
> 2.53.0
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests
  2026-02-17 20:15   ` Mohamed Khalfella
@ 2026-02-18 12:19     ` Heyne, Maximilian
  0 siblings, 0 replies; 25+ messages in thread
From: Heyne, Maximilian @ 2026-02-18 12:19 UTC (permalink / raw)
  To: Mohamed Khalfella
  Cc: Maurizio Lombardi, kbusch@kernel.org, emilne@redhat.com,
	jmeneghi@redhat.com, linux-nvme@lists.infradead.org,
	dwagner@suse.de, mlombard@bsdbackstore.eu

On Tue, Feb 17, 2026 at 12:15:12PM -0800, Mohamed Khalfella wrote:
> On Thu 2026-02-12 13:09:47 +0100, Maurizio Lombardi wrote:
> > From: "Heyne, Maximilian" <mheyne@amazon.de>
> > 
> > When initializing an nvme request which is about to be sent to the block
> > layer, we do not need to initialize its timeout. If it's left
> > uninitialized at 0, the block layer will use the request queue's timeout
> > in blk_add_timer (via nvme_start_request, which is called from
> > nvme_*_queue_rq). These timeouts were set to either NVME_IO_TIMEOUT or
> > NVME_ADMIN_TIMEOUT when the request queues were created.
> > 
> > Because the io_timeout of the IO queues can be modified via sysfs, the
> > following situation can occur:
> > 
> > 1) NVME_IO_TIMEOUT = 30 (default module parameter)
> > 2) nvme1n1 is probed. IO queues default timeout is 30 s
> > 3) manually change the IO timeout to 90 s
> >    echo 90000 > /sys/class/nvme/nvme1/nvme1n1/queue/io_timeout
> > 4) Any call of __submit_sync_cmd on nvme1n1 to an IO queue will issue
> >    commands with the 30 s timeout instead of the wanted 90 s which might
> >    be more suitable for this device.
> > 
> > Commit 470e900c8036 ("nvme: refactor nvme_alloc_request") already
> > silently changed the behavior for ioctls because it unconditionally
> > overrides the request's timeout that was set in nvme_init_request. If it
> > was left unset by the user of the ioctl, it will be overridden with 0,
> > meaning the block layer will pick the request queue's IO timeout.
> > 
> > Following up on that, this patch further improves the consistency of IO
> > timeout usage. However, there are still uses of NVME_IO_TIMEOUT which
> > could be inconsistent with what is set in the device's request_queue by
> > the user.
> > 
> > Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
> > ---
> >  drivers/nvme/host/core.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 7bf228df6001..b9315f0abf80 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -724,10 +724,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
> >  		struct nvme_ns *ns = req->q->disk->private_data;
> >  
> >  		logging_enabled = ns->head->passthru_err_log_enabled;
> > -		req->timeout = NVME_IO_TIMEOUT;
> >  	} else { /* no queuedata implies admin queue */
> >  		logging_enabled = nr->ctrl->passthru_err_log_enabled;
> > -		req->timeout = NVME_ADMIN_TIMEOUT;
> >  	}
> >  
> >  	if (!logging_enabled)
> > -- 
> > 2.53.0
> > 
> > 
> 
> I wonder what the impact of these lines was, given that req->timeout is
> set to 0 by blk_mq_rq_ctx_init() when the request is allocated?
> 

Looking through the nvme code, blk_mq_rq_ctx_init() is always called
(indirectly, e.g. via blk_mq_alloc_request()) before nvme_init_request(),
and therefore the timeouts were being set to NVME_IO_TIMEOUT/NVME_ADMIN_TIMEOUT.
With these lines removed, the timeout stays 0, which means the
request_queue's rq_timeout will be used; see blk_add_timer().



Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christof Hellmis, Andreas Stieger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller
  2026-02-12 12:09 ` [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller Maurizio Lombardi
  2026-02-17 20:17   ` Mohamed Khalfella
@ 2026-02-18 12:40   ` Heyne, Maximilian
  2026-02-18 14:49     ` Maurizio Lombardi
  2026-02-18 18:00   ` Mohamed Khalfella
  2 siblings, 1 reply; 25+ messages in thread
From: Heyne, Maximilian @ 2026-02-18 12:40 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch@kernel.org, emilne@redhat.com, jmeneghi@redhat.com,
	linux-nvme@lists.infradead.org, dwagner@suse.de,
	mlombard@bsdbackstore.eu

On Thu, Feb 12, 2026 at 01:09:48PM +0100, Maurizio Lombardi wrote:
> Currently, there is no method to adjust the timeout values
> on a per-controller basis for nvme admin queues.
> Add an admin_timeout attribute to nvme so that different
> nvme controllers which may have different timeout
> requirements can have custom admin timeouts set.
> 
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>

What would it take to have a "queue" directory representing the
admin_queue for an nvme controller like we have "queue" for the io
queues of the namespaces? This way we could change/inspect more than
just the admin_timeout for the admin queue. Or does this not make any
sense?

> ---
>  drivers/nvme/host/sysfs.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index 29430949ce2f..149cd1ab3b3d 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -601,6 +601,36 @@ static ssize_t dctype_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RO(dctype);
>  
> +static ssize_t nvme_admin_timeout_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	return sysfs_emit(buf, "%u\n",
> +				jiffies_to_msecs(ctrl->admin_q->rq_timeout));
> +}
> +
> +static ssize_t nvme_admin_timeout_store(struct device *dev,
> +			struct device_attribute *attr,
> +			const char *buf, size_t count)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +	u32 timeout;
> +	int err;
> +
> +	err = kstrtou32(buf, 10, &timeout);
> +	if (err || !timeout)
> +		return -EINVAL;
> +
> +	blk_queue_rq_timeout(ctrl->admin_q, msecs_to_jiffies(timeout));
> +
> +	return count;
> +}
> +
> +static struct device_attribute dev_attr_admin_timeout =  \
> +	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
> +	nvme_admin_timeout_show, nvme_admin_timeout_store);

Why not

  DEVICE_ATTR(admin_timeout, S_IRUGO | S_IWUSR, nvme_admin_timeout_show,
  nvme_admin_timeout_store);

?

> +
>  #ifdef CONFIG_NVME_HOST_AUTH
>  static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
> @@ -742,6 +772,7 @@ static struct attribute *nvme_dev_attrs[] = {
>  	&dev_attr_kato.attr,
>  	&dev_attr_cntrltype.attr,
>  	&dev_attr_dctype.attr,
> +	&dev_attr_admin_timeout.attr,
>  #ifdef CONFIG_NVME_HOST_AUTH
>  	&dev_attr_dhchap_secret.attr,
>  	&dev_attr_dhchap_ctrl_secret.attr,
> -- 
> 2.53.0
> 
> 






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT
  2026-02-12 12:09 ` [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT Maurizio Lombardi
  2026-02-17 20:22   ` Mohamed Khalfella
@ 2026-02-18 13:10   ` Heyne, Maximilian
  2026-02-18 13:32     ` Maurizio Lombardi
  1 sibling, 1 reply; 25+ messages in thread
From: Heyne, Maximilian @ 2026-02-18 13:10 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch@kernel.org, emilne@redhat.com, jmeneghi@redhat.com,
	linux-nvme@lists.infradead.org, dwagner@suse.de,
	mlombard@bsdbackstore.eu

On Thu, Feb 12, 2026 at 01:09:49PM +0100, Maurizio Lombardi wrote:
> From: David Jeffery <djeffery@redhat.com>
> 
> While tearing down its queues, nvme-pci uses NVME_ADMIN_TIMEOUT as its
> timeout target. Instead, use the configured admin queue's timeout value
> when available to match the device's existing timeout setting.
> 
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> ---
>  drivers/nvme/host/pci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 58f3097888a7..853cd57e4480 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2911,9 +2911,10 @@ static bool __nvme_delete_io_queues(struct nvme_dev *dev, u8 opcode)
>  {
>  	int nr_queues = dev->online_queues - 1, sent = 0;
>  	unsigned long timeout;
> +	struct request_queue *q = dev->ctrl.admin_q;
>  
>   retry:
> -	timeout = NVME_ADMIN_TIMEOUT;
> +	timeout = q ? q->rq_timeout : NVME_ADMIN_TIMEOUT;

Skimming over the code, I think the check is unnecessary since
ctrl.admin_q is only set in nvme_alloc_admin_tag_set(). Only when that
succeeds do we set up the io queues, so when they get deleted we will
still have a valid ctrl.admin_q.

>  	while (nr_queues > 0) {
>  		if (nvme_delete_queue(&dev->queues[nr_queues], opcode))
>  			break;
> -- 
> 2.53.0
> 
> 






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-17 20:25   ` Mohamed Khalfella
@ 2026-02-18 13:28     ` Maurizio Lombardi
  2026-02-18 17:54       ` Mohamed Khalfella
  0 siblings, 1 reply; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-18 13:28 UTC (permalink / raw)
  To: Mohamed Khalfella, Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Tue Feb 17, 2026 at 9:25 PM CET, Mohamed Khalfella wrote:
>> @@ -4169,6 +4169,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
>>  		mutex_unlock(&ctrl->namespaces_lock);
>>  		goto out_unlink_ns;
>>  	}
>> +	blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
>
> Given that io tagset->timeout has been initialized to ctrl->io_timeout
> Do we still need the line above?


Yes, the line is necessary. While the tagset is initialized with
ctrl->io_timeout, that happens only once during tagset allocation.

If the user later changes io_timeout via sysfs, ctrl->io_timeout is updated,
but the tagset->timeout remains at the old default value. I deliberately
do not update tagset->timeout in the sysfs handler because the tagset
lifetime is tied to the controller state (e.g., it may be freed during a
controller reset). Attempting to update the tagset from the sysfs path
introduces a race condition where we might dereference a pointer to
a tagset that is being destroyed.

Therefore, when a new namespace is allocated (nvme_alloc_ns), the block
layer initializes the queue using the stale tagset->timeout. We must
explicitly override that with the current ctrl->io_timeout to
ensure the new namespace respects the runtime configuration.

Maurizio




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT
  2026-02-18 13:10   ` Heyne, Maximilian
@ 2026-02-18 13:32     ` Maurizio Lombardi
  0 siblings, 0 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-18 13:32 UTC (permalink / raw)
  To: Heyne, Maximilian, Maurizio Lombardi
  Cc: kbusch@kernel.org, emilne@redhat.com, jmeneghi@redhat.com,
	linux-nvme@lists.infradead.org, dwagner@suse.de,
	mlombard@bsdbackstore.eu

On Wed Feb 18, 2026 at 2:10 PM CET, Maximilian Heyne wrote:
> On Thu, Feb 12, 2026 at 01:09:49PM +0100, Maurizio Lombardi wrote:
>> From: David Jeffery <djeffery@redhat.com>
>> 
>> While tearing down its queues, nvme-pci uses NVME_ADMIN_TIMEOUT as its
>> timeout target. Instead, use the configured admin queue's timeout value
>> when available to match the device's existing timeout setting.
>> 
>> Signed-off-by: David Jeffery <djeffery@redhat.com>
>> ---
>>  drivers/nvme/host/pci.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 58f3097888a7..853cd57e4480 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -2911,9 +2911,10 @@ static bool __nvme_delete_io_queues(struct nvme_dev *dev, u8 opcode)
>>  {
>>  	int nr_queues = dev->online_queues - 1, sent = 0;
>>  	unsigned long timeout;
>> +	struct request_queue *q = dev->ctrl.admin_q;
>>  
>>   retry:
>> -	timeout = NVME_ADMIN_TIMEOUT;
>> +	timeout = q ? q->rq_timeout : NVME_ADMIN_TIMEOUT;
>
> Skimming over the code I think the check is unnecessary since
> ctrl.admin_q is only changed in nvme_alloc_admin_tag_set. Only when this
> succeeds, we'll setup io queues. When they get deleted we will still
> have the ctrl.admin_q being valid.

Yes, afaict you are right and this check is unnecessary; I will remove it.

Maurizio




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller
  2026-02-18 12:40   ` Heyne, Maximilian
@ 2026-02-18 14:49     ` Maurizio Lombardi
  0 siblings, 0 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-18 14:49 UTC (permalink / raw)
  To: Heyne, Maximilian, Maurizio Lombardi
  Cc: kbusch@kernel.org, emilne@redhat.com, jmeneghi@redhat.com,
	linux-nvme@lists.infradead.org, dwagner@suse.de,
	mlombard@bsdbackstore.eu

On Wed Feb 18, 2026 at 1:40 PM CET, Maximilian Heyne wrote:
>> +
>> +static struct device_attribute dev_attr_admin_timeout =  \
>> +	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
>> +	nvme_admin_timeout_show, nvme_admin_timeout_store);
>
> Why not
>
>   DEVICE_ATTR(admin_timeout, S_IRUGO | S_IWUSR, nvme_admin_timeout_show
>   nvme_admin_timeout_store);
>
> ?

No valid reason; I must have copy-pasted it from the old patchset sent
by David Jeffery.

I will fix it, thanks.

Maurizio



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-18 13:28     ` Maurizio Lombardi
@ 2026-02-18 17:54       ` Mohamed Khalfella
  2026-02-19 17:22         ` Maurizio Lombardi
  0 siblings, 1 reply; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-18 17:54 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: Maurizio Lombardi, kbusch, mheyne, emilne, jmeneghi, linux-nvme,
	dwagner, mlombard

On Wed 2026-02-18 14:28:03 +0100, Maurizio Lombardi wrote:
> On Tue Feb 17, 2026 at 9:25 PM CET, Mohamed Khalfella wrote:
> >> @@ -4169,6 +4169,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
> >>  		mutex_unlock(&ctrl->namespaces_lock);
> >>  		goto out_unlink_ns;
> >>  	}
> >> +	blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
> >
> > Given that io tagset->timeout has been initialized to ctrl->io_timeout
> > Do we still need the line above?
> 
> 
> Yes, the line is necessary. While the tagset is initialized with
> ctrl->io_timeout, that happens only once during tagset allocation.
> 
> If the user later changes io_timeout via sysfs, ctrl->io_timeout is updated,
> but the tagset->timeout remains at the old default value. I deliberately
> do not update tagset->timeout in the sysfs handler because the tagset
> lifetime is tied to the controller state (e.g., it may be freed during a
> controller reset). Attempting to update the tagset from the sysfs path
> introduces a race condition where we might dereference a pointer to
> a tagset that is being destroyed.
> 
> Therefore, when a new namespace is allocated (nvme_alloc_ns), the block
> layer initializes the queue using the stale tagset->timeout. We must
> explicitly override that with the current ctrl->io_timeout to
> ensure the new namespace respects the runtime configuration.

Yeah, I get it now. Thanks for the explanation.
It would be great if there were a way to update the admin and io tagset timeouts.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-12 12:09 ` [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller Maurizio Lombardi
  2026-02-17 20:25   ` Mohamed Khalfella
@ 2026-02-18 17:58   ` Mohamed Khalfella
  1 sibling, 0 replies; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-18 17:58 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Thu 2026-02-12 13:09:50 +0100, Maurizio Lombardi wrote:
> Currently, there is no method to adjust the timeout values
> on a per-controller basis for nvme I/O queues.
> Add an io_timeout attribute to nvme so that different
> nvme controllers which may have different timeout
> requirements can have custom I/O timeouts set.
> 
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
> ---
>  drivers/nvme/host/apple.c |  2 +-
>  drivers/nvme/host/core.c  |  4 +++-
>  drivers/nvme/host/nvme.h  |  1 +
>  drivers/nvme/host/sysfs.c | 38 ++++++++++++++++++++++++++++++++++++++
>  4 files changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
> index ed61b97fde59..e9bb64e3ec9c 100644
> --- a/drivers/nvme/host/apple.c
> +++ b/drivers/nvme/host/apple.c
> @@ -1331,7 +1331,7 @@ static int apple_nvme_alloc_tagsets(struct apple_nvme *anv)
>  	if (anv->hw->has_lsq_nvmmu)
>  		anv->tagset.reserved_tags = APPLE_NVME_AQ_DEPTH;
>  	anv->tagset.queue_depth = anv->hw->max_queue_depth - 1;
> -	anv->tagset.timeout = NVME_IO_TIMEOUT;
> +	anv->tagset.timeout = anv->ctrl.io_timeout;
>  	anv->tagset.numa_node = NUMA_NO_NODE;
>  	anv->tagset.cmd_size = sizeof(struct apple_nvme_iod);
>  	anv->tagset.driver_data = &anv->ioq;
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index b9315f0abf80..cc7d725bd6ff 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4169,6 +4169,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
>  		mutex_unlock(&ctrl->namespaces_lock);
>  		goto out_unlink_ns;
>  	}
> +	blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
>  	nvme_ns_add_to_ctrl_list(ns);
>  	mutex_unlock(&ctrl->namespaces_lock);
>  	synchronize_srcu(&ctrl->srcu);
> @@ -4930,7 +4931,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
>  	set->cmd_size = cmd_size;
>  	set->driver_data = ctrl;
>  	set->nr_hw_queues = ctrl->queue_count - 1;
> -	set->timeout = NVME_IO_TIMEOUT;
> +	set->timeout = ctrl->io_timeout;
>  	set->nr_maps = nr_maps;
>  	ret = blk_mq_alloc_tag_set(set);
>  	if (ret)
> @@ -5107,6 +5108,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
>  	memset(&ctrl->ka_cmd, 0, sizeof(ctrl->ka_cmd));
>  	ctrl->ka_cmd.common.opcode = nvme_admin_keep_alive;
>  	ctrl->ka_last_check_time = jiffies;
> +	ctrl->io_timeout = NVME_IO_TIMEOUT;
>  
>  	BUILD_BUG_ON(NVME_DSM_MAX_RANGES * sizeof(struct nvme_dsm_range) >
>  			PAGE_SIZE);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9a5f28c5103c..ef390a020d8d 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -316,6 +316,7 @@ struct nvme_ctrl {
>  	u16 mtfa;
>  	u32 ctrl_config;
>  	u32 queue_count;
> +	u32 io_timeout;
>  
>  	u64 cap;
>  	u32 max_hw_sectors;
> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index 149cd1ab3b3d..0536e4919aa7 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -631,6 +631,43 @@ static struct device_attribute dev_attr_admin_timeout =  \
>  	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
>  	nvme_admin_timeout_show, nvme_admin_timeout_store);
>  
> +static ssize_t nvme_io_timeout_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	return sysfs_emit(buf, "%u\n", jiffies_to_msecs(ctrl->io_timeout));
> +}
> +
> +static ssize_t nvme_io_timeout_store(struct device *dev,
> +			struct device_attribute *attr,
> +			const char *buf, size_t count)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +	struct nvme_ns *ns;
> +	u32 timeout;
> +	int err;
> +
> +	err = kstrtou32(buf, 10, &timeout);
> +	if (err || !timeout)
> +		return -EINVAL;
> +
> +	/* Take the namespaces_lock to avoid racing against nvme_alloc_ns() */
> +	mutex_lock(&ctrl->namespaces_lock);
> +
> +	ctrl->io_timeout = msecs_to_jiffies(timeout);
> +	list_for_each_entry(ns, &ctrl->namespaces, list)
> +		blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
> +
> +	mutex_unlock(&ctrl->namespaces_lock);

What about ctrl->connect_q? It uses the io tagset too. Do we want to update
its timeout as well?

> +
> +	return count;
> +}
> +
> +static struct device_attribute dev_attr_io_timeout = \
> +	__ATTR(io_timeout, S_IRUGO | S_IWUSR, \
> +	nvme_io_timeout_show, nvme_io_timeout_store);
> +
>  #ifdef CONFIG_NVME_HOST_AUTH
>  static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
> @@ -773,6 +810,7 @@ static struct attribute *nvme_dev_attrs[] = {
>  	&dev_attr_cntrltype.attr,
>  	&dev_attr_dctype.attr,
>  	&dev_attr_admin_timeout.attr,
> +	&dev_attr_io_timeout.attr,
>  #ifdef CONFIG_NVME_HOST_AUTH
>  	&dev_attr_dhchap_secret.attr,
>  	&dev_attr_dhchap_ctrl_secret.attr,
> -- 
> 2.53.0
> 
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller
  2026-02-12 12:09 ` [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller Maurizio Lombardi
  2026-02-17 20:17   ` Mohamed Khalfella
  2026-02-18 12:40   ` Heyne, Maximilian
@ 2026-02-18 18:00   ` Mohamed Khalfella
  2026-02-19 16:16     ` Maurizio Lombardi
  2 siblings, 1 reply; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-18 18:00 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Thu 2026-02-12 13:09:48 +0100, Maurizio Lombardi wrote:
> Currently, there is no method to adjust the timeout values
> on a per-controller basis for nvme admin queues.
> Add an admin_timeout attribute to nvme so that different
> nvme controllers which may have different timeout
> requirements can have custom admin timeouts set.
> 
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
> ---
>  drivers/nvme/host/sysfs.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index 29430949ce2f..149cd1ab3b3d 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -601,6 +601,36 @@ static ssize_t dctype_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RO(dctype);
>  
> +static ssize_t nvme_admin_timeout_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	return sysfs_emit(buf, "%u\n",
> +				jiffies_to_msecs(ctrl->admin_q->rq_timeout));
> +}
> +
> +static ssize_t nvme_admin_timeout_store(struct device *dev,
> +			struct device_attribute *attr,
> +			const char *buf, size_t count)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +	u32 timeout;
> +	int err;
> +
> +	err = kstrtou32(buf, 10, &timeout);
> +	if (err || !timeout)
> +		return -EINVAL;
> +
> +	blk_queue_rq_timeout(ctrl->admin_q, msecs_to_jiffies(timeout));
> +
> +	return count;
> +}
> +
> +static struct device_attribute dev_attr_admin_timeout =  \
> +	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
> +	nvme_admin_timeout_show, nvme_admin_timeout_store);
> +

Do we want a similar attribute for ctrl->fabrics_q?

>  #ifdef CONFIG_NVME_HOST_AUTH
>  static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
> @@ -742,6 +772,7 @@ static struct attribute *nvme_dev_attrs[] = {
>  	&dev_attr_kato.attr,
>  	&dev_attr_cntrltype.attr,
>  	&dev_attr_dctype.attr,
> +	&dev_attr_admin_timeout.attr,
>  #ifdef CONFIG_NVME_HOST_AUTH
>  	&dev_attr_dhchap_secret.attr,
>  	&dev_attr_dhchap_ctrl_secret.attr,
> -- 
> 2.53.0
> 
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller
  2026-02-18 18:00   ` Mohamed Khalfella
@ 2026-02-19 16:16     ` Maurizio Lombardi
  0 siblings, 0 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-19 16:16 UTC (permalink / raw)
  To: Mohamed Khalfella, Maurizio Lombardi
  Cc: kbusch, mheyne, emilne, jmeneghi, linux-nvme, dwagner, mlombard

On Wed Feb 18, 2026 at 7:00 PM CET, Mohamed Khalfella wrote:
>> +static ssize_t nvme_admin_timeout_store(struct device *dev,
>> +			struct device_attribute *attr,
>> +			const char *buf, size_t count)
>> +{
>> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>> +	u32 timeout;
>> +	int err;
>> +
>> +	err = kstrtou32(buf, 10, &timeout);
>> +	if (err || !timeout)
>> +		return -EINVAL;
>> +
>> +	blk_queue_rq_timeout(ctrl->admin_q, msecs_to_jiffies(timeout));
>> +
>> +	return count;
>> +}
>> +
>> +static struct device_attribute dev_attr_admin_timeout =  \
>> +	__ATTR(admin_timeout, S_IRUGO | S_IWUSR, \
>> +	nvme_admin_timeout_show, nvme_admin_timeout_store);
>> +
>
> Do we want a similar attribute for ctrl->fabrics_q?

Makes sense, I think.

We don't need other attributes; it should be sufficient to modify
nvme_admin_timeout_store() to do the following:

static ssize_t nvme_admin_timeout_store(struct device *dev,
			struct device_attribute *attr,
			const char *buf, size_t count)
{
	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
	u32 timeout;
	int err;

	err = kstrtou32(buf, 10, &timeout);
	if (err || !timeout)
		return -EINVAL;

	blk_queue_rq_timeout(ctrl->admin_q, msecs_to_jiffies(timeout));
	if (ctrl->ops->flags & NVME_F_FABRICS)
		blk_queue_rq_timeout(ctrl->fabrics_q, msecs_to_jiffies(timeout));

	return count;
}

Maurizio




* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-18 17:54       ` Mohamed Khalfella
@ 2026-02-19 17:22         ` Maurizio Lombardi
  2026-02-20 12:47           ` Maurizio Lombardi
  0 siblings, 1 reply; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-19 17:22 UTC (permalink / raw)
  To: Mohamed Khalfella, Maurizio Lombardi
  Cc: Maurizio Lombardi, kbusch, mheyne, emilne, jmeneghi, linux-nvme,
	dwagner, mlombard

On Wed Feb 18, 2026 at 6:54 PM CET, Mohamed Khalfella wrote:
>
> Yeah. I got it now. Thanks for the explanation.
> It would be great if there was a way to update the admin and io tagsets' timeouts.

Yes, it would be much better.

I think it can be done; the only driver that sometimes sets the tagset
pointer to NULL during reset is nvme-pci, and it does so only after
removing all the namespaces.

So changing the timeout field in the tagset should be doable; the only
problem is avoiding a race against nvme_alloc_ns().

I will try to come up with something.

Maurizio



* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-19 17:22         ` Maurizio Lombardi
@ 2026-02-20 12:47           ` Maurizio Lombardi
  2026-02-20 17:53             ` Mohamed Khalfella
  0 siblings, 1 reply; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-20 12:47 UTC (permalink / raw)
  To: Maurizio Lombardi, Mohamed Khalfella
  Cc: Maurizio Lombardi, kbusch, mheyne, emilne, jmeneghi, linux-nvme,
	dwagner, mlombard

On Thu Feb 19, 2026 at 6:22 PM CET, Maurizio Lombardi wrote:
> On Wed Feb 18, 2026 at 6:54 PM CET, Mohamed Khalfella wrote:
>
> So changing the timeout field in the tagset should be doable, the
> only problem is avoid racing against nvme_alloc_ns().
>
> I will try to come up with something.

I decided to keep the current design; calling blk_queue_rq_timeout()
with the namespaces_lock mutex held is the easiest solution.

I am sending a V2 in a few moments.

Maurizio




* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-20 12:47           ` Maurizio Lombardi
@ 2026-02-20 17:53             ` Mohamed Khalfella
  2026-02-23 10:36               ` Maurizio Lombardi
  0 siblings, 1 reply; 25+ messages in thread
From: Mohamed Khalfella @ 2026-02-20 17:53 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: Maurizio Lombardi, kbusch, mheyne, emilne, jmeneghi, linux-nvme,
	dwagner, mlombard

On Fri 2026-02-20 13:47:08 +0100, Maurizio Lombardi wrote:
> On Thu Feb 19, 2026 at 6:22 PM CET, Maurizio Lombardi wrote:
> > On Wed Feb 18, 2026 at 6:54 PM CET, Mohamed Khalfella wrote:
> >
> > So changing the timeout field in the tagset should be doable, the
> > only problem is avoid racing against nvme_alloc_ns().
> >
> > I will try to come up with something.
> 
> I decided to keep the current design, calling blk_queue_rq_timeout()
> with the namespaces_lock mutex locked is the easiest solution
> 
> I am sending a V2 in a few moments.
> 

How about restricting io timeout changes to LIVE controllers only? That
would make the problem easier to solve. Maybe something like the below?

	enum nvme_ctrl_state state;
	unsigned long flags;

	spin_lock_irqsave(&ctrl->lock, flags);
	state = nvme_ctrl_state(ctrl);
	if (state != NVME_CTRL_LIVE) {
		spin_unlock_irqrestore(&ctrl->lock, flags);
		return -EBUSY;
	}

	if (ctrl->queue_count > 1)
		WRITE_ONCE(ctrl->tagset->timeout, timeout);
	spin_unlock_irqrestore(&ctrl->lock, flags);

	/* Take the namespaces_lock to avoid racing against nvme_alloc_ns() */
	mutex_lock(&ctrl->namespaces_lock);

	ctrl->io_timeout = msecs_to_jiffies(timeout);
	list_for_each_entry(ns, &ctrl->namespaces, list)
		blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);

	mutex_unlock(&ctrl->namespaces_lock);


> Maurizio
> 



* Re: [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
  2026-02-20 17:53             ` Mohamed Khalfella
@ 2026-02-23 10:36               ` Maurizio Lombardi
  0 siblings, 0 replies; 25+ messages in thread
From: Maurizio Lombardi @ 2026-02-23 10:36 UTC (permalink / raw)
  To: Mohamed Khalfella, Maurizio Lombardi
  Cc: Maurizio Lombardi, kbusch, mheyne, emilne, jmeneghi, linux-nvme,
	dwagner, mlombard

On Fri Feb 20, 2026 at 6:53 PM CET, Mohamed Khalfella wrote:
> On Fri 2026-02-20 13:47:08 +0100, Maurizio Lombardi wrote:
>> On Thu Feb 19, 2026 at 6:22 PM CET, Maurizio Lombardi wrote:
>> > On Wed Feb 18, 2026 at 6:54 PM CET, Mohamed Khalfella wrote:
>> >
>> > So changing the timeout field in the tagset should be doable, the
>> > only problem is avoid racing against nvme_alloc_ns().
>> >
>> > I will try to come up with something.
>> 
>> I decided to keep the current design, calling blk_queue_rq_timeout()
>> with the namespaces_lock mutex locked is the easiest solution
>> 
>> I am sending a V2 in a few moments.
>> 
>
> How about restricting changing io timeout to LIVE controllers only. This
> will make the problem easier to solve. Maybe something like the below?
>
> 	enum nvme_ctrl_state state;
> 	unsigned long flags;
>
> 	spin_lock_irqsave(&ctrl->lock, flags);
> 	state = nvme_ctrl_state(ctrl);
> 	if (state != NVME_CTRL_LIVE) {
> 		spin_unlock_irqrestore(&ctrl->lock, flags);
> 		return -EBUSY;
> 	}

I previously considered this solution, but I discarded it for two reasons:

1) If the user sets the io_timeout too low, the controller ends up
resetting non-stop. With this restriction in place, it becomes hard to
catch the controller in a LIVE state and fix the problem, because the
controller is almost always RESETTING. You essentially have to wait
until the driver gives up and removes the controller.

2) This doesn't prevent nvme_alloc_ns() from racing against the sysfs
path.
Suppose that thread A is executing nvme_alloc_ns() and has just called
disk = blk_mq_alloc_disk(ctrl->tagset, &lim, ns);

Immediately after, the user changes io_timeout. Thread B takes the
namespaces_lock, scans the namespaces list and calls
blk_queue_rq_timeout() on each namespace's queue.

Thread A can now take the namespaces_lock and add the new namespace to
the list, but the new namespace is still using the stale timeout
setting it inherited from ctrl->tagset.

Maurizio

>
> 	if (ctrl->queue_count > 1)
> 		WRITE_ONCE(ctrl->tagset->timeout, timeout);
> 	spin_unlock_irqrestore(&ctrl->lock, flags);
>
> 	/* Take the namespaces_lock to avoid racing against nvme_alloc_ns() */
> 	mutex_lock(&ctrl->namespaces_lock);
>
> 	ctrl->io_timeout = msecs_to_jiffies(timeout);
> 	list_for_each_entry(ns, &ctrl->namespaces, list)
> 		blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
>
> 	mutex_unlock(&ctrl->namespaces_lock);
>
>
>> Maurizio
>> 




end of thread, other threads:[~2026-02-23 10:42 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-12 12:09 [PATCH RFC 0/5] nvme: Refactor and expose per-controller timeout configuration Maurizio Lombardi
2026-02-12 12:09 ` [PATCH RFC 1/5] nvme: Let the blocklayer set timeouts for requests Maurizio Lombardi
2026-02-17 20:15   ` Mohamed Khalfella
2026-02-18 12:19     ` Heyne, Maximilian
2026-02-12 12:09 ` [PATCH RFC 2/5] nvme: add sysfs attribute to change admin timeout per nvme controller Maurizio Lombardi
2026-02-17 20:17   ` Mohamed Khalfella
2026-02-18 12:40   ` Heyne, Maximilian
2026-02-18 14:49     ` Maurizio Lombardi
2026-02-18 18:00   ` Mohamed Khalfella
2026-02-19 16:16     ` Maurizio Lombardi
2026-02-12 12:09 ` [PATCH RFC 3/5] nvme: pci: use admin queue timeout over NVME_ADMIN_TIMEOUT Maurizio Lombardi
2026-02-17 20:22   ` Mohamed Khalfella
2026-02-18 13:10   ` Heyne, Maximilian
2026-02-18 13:32     ` Maurizio Lombardi
2026-02-12 12:09 ` [PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller Maurizio Lombardi
2026-02-17 20:25   ` Mohamed Khalfella
2026-02-18 13:28     ` Maurizio Lombardi
2026-02-18 17:54       ` Mohamed Khalfella
2026-02-19 17:22         ` Maurizio Lombardi
2026-02-20 12:47           ` Maurizio Lombardi
2026-02-20 17:53             ` Mohamed Khalfella
2026-02-23 10:36               ` Maurizio Lombardi
2026-02-18 17:58   ` Mohamed Khalfella
2026-02-12 12:09 ` [PATCH RFC 5/5] nvme: use per controller timeout waits over depending on global default Maurizio Lombardi
2026-02-17 20:29   ` Mohamed Khalfella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox