[PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs

* [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs
@ 2026-05-16 18:36 Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 1/8] nvme: add diag attribute group under sysfs Nilay Shroff
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

Hi,

The NVMe driver encounters various events and conditions during normal
operation that are either not tracked today or not exposed to userspace
via sysfs. Lack of visibility into these events can make it difficult to
diagnose subtle issues related to controller behavior, multipath
stability, and I/O reliability.

This patchset adds several diagnostic counters that provide improved
observability into NVMe behavior. These counters are intended to help
users understand events such as transient path unavailability,
controller retries/reconnect/reset, failovers, and I/O failures. They
can also be consumed by monitoring tools such as nvme-top.

Specifically, this series proposes to export the following counters via
sysfs:
  - Command retry count
  - Multipath failover count
  - Command error count
  - I/O requeue count
  - I/O failure count
  - Controller reset event counts
  - Controller reconnect counts

The first patch in the series adds a new diag attribute group under the
per-path, ns-head and ctrl sysfs directories so that all diagnostic
counters are grouped together under a diag sub-directory. The subsequent
patches in the series add the diagnostic counters listed above.

Please note that this patchset doesn't make any functional change; it
only exports the relevant counters to user space via sysfs.
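
As a rough illustration of how a userspace tool might consume these
counters, the sketch below builds a "<dev_dir>/diag/<attr>" path and reads
one value from it. The helper names diag_attr_path() and
read_counter_file() are hypothetical and not part of this series; the
only things taken from the series are the diag sub-directory and the
attribute names.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "<dev_dir>/diag/<attr>" into buf (hypothetical helper).
 * Returns 0 on success, -ENAMETOOLONG if buf is too small.
 */
int diag_attr_path(char *buf, size_t len, const char *dev_dir,
		   const char *attr)
{
	int n;

	assert(buf != NULL && dev_dir != NULL && attr != NULL);
	n = snprintf(buf, len, "%s/diag/%s", dev_dir, attr);
	if (n < 0 || (size_t)n >= len)
		return -ENAMETOOLONG;
	return 0;
}

/*
 * Read a single unsigned decimal value from the file at path
 * (hypothetical helper). Returns 0 and stores the value in *val,
 * or returns a negative errno value and leaves *val untouched.
 */
int read_counter_file(const char *path, unsigned long *val)
{
	char line[64];
	char *end;
	unsigned long tmp;
	FILE *f;

	assert(path != NULL && val != NULL);
	f = fopen(path, "r");
	if (!f)
		return -errno;
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return -EIO;
	}
	fclose(f);

	errno = 0;
	tmp = strtoul(line, &end, 10);
	if (errno != 0 || end == line)
		return -EINVAL;
	*val = tmp;
	return 0;
}
```

A caller would pass one of the directories named in patch 1, for example
/sys/block/<nvme-path-dev>, together with an attribute name such as
command_retries_count.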

As usual, feedback/comments/suggestions are welcome!

Changes from v3:
  - To be consistent in naming, all counters are suffixed with _count
    (Keith Busch)
  - The first patch in the series creates new attribute group named
    diag and all counters are now grouped under this new sysfs
    attribute group (Keith Busch)
  - Counters are defined as atomic_long_t instead of size_t (Keith Busch)
  - Removed RB and TB tags due to above changes
Link to v3: https://lore.kernel.org/all/20260220175024.292898-1-nilay@linux.ibm.com/

Changes from v2:
  - Allow the user to write to the sysfs attributes so that the stat
    counters can be reset, if needed (Sagi)
  - The controller reconnect counter nr_reconnects can be reset to
    zero once the connection is re-established, so instead of
    exposing nr_reconnects via sysfs, introduce a new counter which
    accumulates the reconnect attempts and export this accumulated
    counter via sysfs (Sagi)
Link to v2: https://lore.kernel.org/all/20260205124810.682559-1-nilay@linux.ibm.com/

Changes from v1:
  - Remove export of stats for admin command retry count (Keith)
  - Use size_add() to ensure stat counters don't overflow (Keith)
Link to v1: https://lore.kernel.org/all/20260130182028.885089-1-nilay@linux.ibm.com/  

Nilay Shroff (8):
  nvme: add diag attribute group under sysfs
  nvme: export command retry count via sysfs
  nvme: export multipath failover count via sysfs
  nvme: export command error counters via sysfs
  nvme: export I/O requeue count when no path is usable via sysfs
  nvme: export I/O failure count when no path is available via sysfs
  nvme: export controller reset event count via sysfs
  nvme: export controller reconnect event count via sysfs

 drivers/nvme/host/core.c      |  15 ++-
 drivers/nvme/host/fc.c        |   3 +
 drivers/nvme/host/multipath.c |  87 ++++++++++++++
 drivers/nvme/host/nvme.h      |  13 +++
 drivers/nvme/host/pci.c       |   1 +
 drivers/nvme/host/rdma.c      |   2 +
 drivers/nvme/host/sysfs.c     | 214 ++++++++++++++++++++++++++++++++++
 drivers/nvme/host/tcp.c       |   2 +
 8 files changed, 336 insertions(+), 1 deletion(-)

-- 
2.53.0




* [PATCHv4 1/8] nvme: add diag attribute group under sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 2/8] nvme: export command retry count via sysfs Nilay Shroff
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

Add a new diag attribute group under:
/sys/class/nvme/<ctrl>/
/sys/block/<nvme-path-dev>/
/sys/block/<ns-head-dev>/

This new sysfs attribute group will be used to organize NVMe diagnostic
and telemetry-related counters.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/nvme.h  |  1 +
 drivers/nvme/host/pci.c   |  1 +
 drivers/nvme/host/sysfs.c | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ccd5e05dac98..c8225d594252 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1012,6 +1012,7 @@ extern const struct attribute_group nvme_ns_mpath_attr_group;
 extern const struct pr_ops nvme_pr_ops;
 extern const struct block_device_operations nvme_ns_head_ops;
 extern const struct attribute_group nvme_dev_attrs_group;
+extern const struct attribute_group nvme_dev_diag_attrs_group;
 extern const struct attribute_group *nvme_subsys_attrs_groups[];
 extern const struct attribute_group *nvme_dev_attr_groups[];
 extern const struct block_device_operations nvme_bdev_ops;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 9fd04cd7c5cb..3f7bdf0fa786 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2810,6 +2810,7 @@ static const struct attribute_group nvme_pci_dev_attrs_group = {
 static const struct attribute_group *nvme_pci_dev_attr_groups[] = {
 	&nvme_dev_attrs_group,
 	&nvme_pci_dev_attrs_group,
+	&nvme_dev_diag_attrs_group,
 	NULL,
 };
 
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index e59758616f27..cc569c8556f3 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -351,11 +351,28 @@ const struct attribute_group nvme_ns_mpath_attr_group = {
 };
 #endif
 
+static struct attribute *nvme_ns_diag_attrs[] = {
+	NULL,
+};
+
+static umode_t nvme_ns_diag_attrs_are_visible(struct kobject *kobj,
+		struct attribute *a, int n)
+{
+	return a->mode;
+}
+
+const struct attribute_group nvme_ns_diag_attr_group = {
+	.name		= "diag",
+	.attrs		= nvme_ns_diag_attrs,
+	.is_visible	= nvme_ns_diag_attrs_are_visible,
+};
+
 const struct attribute_group *nvme_ns_attr_groups[] = {
 	&nvme_ns_attr_group,
 #ifdef CONFIG_NVME_MULTIPATH
 	&nvme_ns_mpath_attr_group,
 #endif
+	&nvme_ns_diag_attr_group,
 	NULL,
 };
 
@@ -937,11 +954,29 @@ static const struct attribute_group nvme_tls_attrs_group = {
 };
 #endif
 
+static struct attribute *nvme_dev_diag_attrs[] = {
+	NULL,
+};
+
+static umode_t nvme_dev_diag_attrs_are_visible(struct kobject *kobj,
+		struct attribute *a, int n)
+{
+	return a->mode;
+}
+
+const struct attribute_group nvme_dev_diag_attrs_group = {
+	.name		= "diag",
+	.attrs		= nvme_dev_diag_attrs,
+	.is_visible	= nvme_dev_diag_attrs_are_visible,
+};
+EXPORT_SYMBOL_GPL(nvme_dev_diag_attrs_group);
+
 const struct attribute_group *nvme_dev_attr_groups[] = {
 	&nvme_dev_attrs_group,
 #ifdef CONFIG_NVME_TCP_TLS
 	&nvme_tls_attrs_group,
 #endif
+	&nvme_dev_diag_attrs_group,
 	NULL,
 };
 
-- 
2.53.0




* [PATCHv4 2/8] nvme: export command retry count via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 1/8] nvme: add diag attribute group under sysfs Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 3/8] nvme: export multipath failover " Nilay Shroff
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

When Advanced Command Retry Enable (ACRE) is configured, a controller
may interrupt command execution and return a completion status
indicating command interrupted with the DNR bit cleared. In this case,
the driver retries the command based on the Command Retry Delay (CRD)
value provided in the completion status.
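
The delay arithmetic that nvme_retry_req() applies in the diff below can
be sketched in plain C as follows. This is a userspace model, not the
driver code: crd_delay_ms() is a hypothetical name, and the mask value
0x1800 is an assumption about NVME_STATUS_CRD that matches the shift by
11 and the "must be <= 3" comment in the hunk.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_CRD_MASK		0x1800u	/* assumed value of NVME_STATUS_CRD */
#define STATUS_CRD_SHIFT	11	/* shift used in nvme_retry_req() */

/*
 * Return the retry delay in milliseconds for a completion status.
 * crdt[] models the controller's three Command Retry Delay Time
 * values, each in units of 100 ms. A CRD field of 0 selects no delay.
 */
unsigned long crd_delay_ms(uint16_t status, const uint16_t crdt[3])
{
	unsigned int crd = (status & STATUS_CRD_MASK) >> STATUS_CRD_SHIFT;

	assert(crd <= 3);	/* two-bit field, so this always holds */
	if (crd == 0)
		return 0;
	return (unsigned long)crdt[crd - 1] * 100;
}
```

With crdt = {1, 5, 20}, a status carrying CRD = 2 would yield a 500 ms
delay before the requeue list is kicked.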

Currently, these command retries are handled entirely within the NVMe
driver and are not visible to userspace. As a result, there is no
observability into retry behavior, which can be a useful diagnostic
signal.

Expose a per-namespace sysfs attribute, command_retries_count, under the
diag attribute group to provide visibility into retry activity. This
information can help identify controller-side congestion under load
and enables comparison across paths in multipath setups (for example,
detecting cases where one path experiences significantly more retries
than another under identical workloads).

This exported metric is intended for diagnostics and monitoring tools
such as nvme-top, and does not change command retry behavior. The
attribute is both readable and writable, so the user can reset the
counter if needed.
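
The store handler in the diff below passes the written string to
kstrtoul() with base 0 and maps every parse failure to -EINVAL, so a
write is not limited to "0": any value that parses is stored. The sketch
below is a userspace approximation of that parsing, assuming kstrtoul()
auto-detects the base, tolerates one trailing newline, and rejects
leading whitespace and a minus sign. parse_counter_write() is a
hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Approximate how a *_count_store() handler in this series treats a
 * written string. Returns 0 and stores the value in *out, or returns
 * -EINVAL for anything the handler would be expected to reject.
 */
int parse_counter_write(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && out != NULL);

	/* strtoul() skips whitespace and accepts '-'; filter those out. */
	if (!(isdigit((unsigned char)buf[0]) ||
	      (buf[0] == '+' && isdigit((unsigned char)buf[1]))))
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 0);	/* base 0: 0x.. hex, 0.. octal */
	if (errno == ERANGE || end == buf)
		return -EINVAL;
	if (*end == '\n')		/* allow one trailing newline */
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}
```

Under these assumptions, writing "0x10" would set the counter to 16 and
writing "010" would set it to 8, which a monitoring tool may want to keep
in mind when it interprets a counter that someone else has written.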

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/core.c  |  4 ++++
 drivers/nvme/host/nvme.h  |  1 +
 drivers/nvme/host/sysfs.c | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dc388e24caad..bacd5e45c322 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -323,6 +323,7 @@ static void nvme_retry_req(struct request *req)
 {
 	unsigned long delay = 0;
 	u16 crd;
+	struct nvme_ns *ns = req->q->queuedata;
 
 	/* The mask and shift result must be <= 3 */
 	crd = (nvme_req(req)->status & NVME_STATUS_CRD) >> 11;
@@ -330,6 +331,9 @@ static void nvme_retry_req(struct request *req)
 		delay = nvme_req(req)->ctrl->crdt[crd - 1] * 100;
 
 	nvme_req(req)->retries++;
+	if (ns)
+		atomic_long_inc(&ns->retries);
+
 	blk_mq_requeue_request(req, false);
 	blk_mq_delay_kick_requeue_list(req->q, delay);
 }
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index c8225d594252..7538153fa61c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -590,6 +590,7 @@ struct nvme_ns {
 	enum nvme_ana_state ana_state;
 	u32 ana_grpid;
 #endif
+	atomic_long_t retries;
 	struct list_head siblings;
 	struct kref kref;
 	struct nvme_ns_head *head;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index cc569c8556f3..46071e87079f 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -351,13 +351,46 @@ const struct attribute_group nvme_ns_mpath_attr_group = {
 };
 #endif
 
+static ssize_t command_retries_count_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
+
+	return sysfs_emit(buf, "%lu\n", atomic_long_read(&ns->retries));
+}
+
+static ssize_t command_retries_count_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	unsigned long retries;
+	int err;
+	struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
+
+	err = kstrtoul(buf, 0, &retries);
+	if (err)
+		return -EINVAL;
+
+	atomic_long_set(&ns->retries, retries);
+
+	return count;
+}
+static DEVICE_ATTR_RW(command_retries_count);
+
 static struct attribute *nvme_ns_diag_attrs[] = {
+	&dev_attr_command_retries_count.attr,
 	NULL,
 };
 
 static umode_t nvme_ns_diag_attrs_are_visible(struct kobject *kobj,
 		struct attribute *a, int n)
 {
+	struct device *dev = container_of(kobj, struct device, kobj);
+
+	if (a == &dev_attr_command_retries_count.attr) {
+		if (nvme_disk_is_ns_head(dev_to_disk(dev)))
+			return 0;
+	}
+
 	return a->mode;
 }
 
-- 
2.53.0




* [PATCHv4 3/8] nvme: export multipath failover count via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 1/8] nvme: add diag attribute group under sysfs Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 2/8] nvme: export command retry count via sysfs Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 4/8] nvme: export command error counters " Nilay Shroff
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

When an NVMe command completes with a path-specific error, the NVMe
driver may retry the command on an alternate controller or path if one
is available. These failover events indicate that I/O was redirected
away from the original path.

Currently, the number of times requests are failed over to another
available path is not visible to userspace. Exposing this information
can be useful for diagnosing path health and stability.

Export a per-path sysfs attribute, multipath_failover_count, under the
diag attribute group. This attribute is both readable and writable,
allowing the user to reset the counter. The counter can be consumed by
monitoring tools such as nvme-top to help identify paths that
consistently trigger failovers under load.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/multipath.c | 27 +++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h      |  2 ++
 drivers/nvme/host/sysfs.c     | 10 +++++++++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 263161cb8ac0..032595502165 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -142,6 +142,7 @@ void nvme_failover_req(struct request *req)
 	struct bio *bio;
 
 	nvme_mpath_clear_current_path(ns);
+	atomic_long_inc(&ns->failover);
 
 	/*
 	 * If we got back an ANA error, we know the controller is alive but not
@@ -1151,6 +1152,32 @@ static ssize_t delayed_removal_secs_store(struct device *dev,
 
 DEVICE_ATTR_RW(delayed_removal_secs);
 
+static ssize_t multipath_failover_count_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
+
+	return sysfs_emit(buf, "%lu\n", atomic_long_read(&ns->failover));
+}
+
+static ssize_t multipath_failover_count_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	unsigned long failover;
+	int ret;
+	struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
+
+	ret = kstrtoul(buf, 0, &failover);
+	if (ret)
+		return -EINVAL;
+
+	atomic_long_set(&ns->failover, failover);
+
+	return count;
+}
+
+DEVICE_ATTR_RW(multipath_failover_count);
+
 static int nvme_lookup_ana_group_desc(struct nvme_ctrl *ctrl,
 		struct nvme_ana_group_desc *desc, void *data)
 {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 7538153fa61c..68c9df4f457a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -589,6 +589,7 @@ struct nvme_ns {
 #ifdef CONFIG_NVME_MULTIPATH
 	enum nvme_ana_state ana_state;
 	u32 ana_grpid;
+	atomic_long_t failover;
 #endif
 	atomic_long_t retries;
 	struct list_head siblings;
@@ -1063,6 +1064,7 @@ extern struct device_attribute dev_attr_ana_state;
 extern struct device_attribute dev_attr_queue_depth;
 extern struct device_attribute dev_attr_numa_nodes;
 extern struct device_attribute dev_attr_delayed_removal_secs;
+extern struct device_attribute dev_attr_multipath_failover_count;
 extern struct device_attribute subsys_attr_iopolicy;
 
 static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 46071e87079f..35a42fd4aec4 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -378,6 +378,9 @@ static DEVICE_ATTR_RW(command_retries_count);
 
 static struct attribute *nvme_ns_diag_attrs[] = {
 	&dev_attr_command_retries_count.attr,
+#ifdef CONFIG_NVME_MULTIPATH
+	&dev_attr_multipath_failover_count.attr,
+#endif
 	NULL,
 };
 
@@ -390,7 +393,12 @@ static umode_t nvme_ns_diag_attrs_are_visible(struct kobject *kobj,
 		if (nvme_disk_is_ns_head(dev_to_disk(dev)))
 			return 0;
 	}
-
+#ifdef CONFIG_NVME_MULTIPATH
+	if (a == &dev_attr_multipath_failover_count.attr) {
+		if (nvme_disk_is_ns_head(dev_to_disk(dev)))
+			return 0;
+	}
+#endif
 	return a->mode;
 }
 
-- 
2.53.0




* [PATCHv4 4/8] nvme: export command error counters via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (2 preceding siblings ...)
  2026-05-16 18:36 ` [PATCHv4 3/8] nvme: export multipath failover " Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 5/8] nvme: export I/O requeue count when no path is usable " Nilay Shroff
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

When an NVMe command completes with an error status, the driver
logs the error to the kernel log. However, these messages may be
lost or overwritten over time since dmesg is a circular buffer.

Expose per-path and per-ctrl sysfs attributes named command_error_count
under the diag attribute group to provide persistent visibility into
error occurrences. This allows users to observe the total number of commands
that have failed on a given path over time, which can be useful for
diagnosing path health and stability.

These attributes are both readable and writable, allowing the user to
reset the counters. The counters can also be consumed by observability
tools such as nvme-top to provide additional insight into NVMe error
behavior.
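
Because the counters are writable, a polling tool cannot assume they
only ever increase. A minimal sketch of per-interval delta tracking that
tolerates a reset is shown below; struct counter_sample and
counter_delta() are hypothetical names, and treating any backwards step
as a reset is a design assumption rather than something this series
specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct counter_sample {
	unsigned long prev;	/* value seen at the previous poll */
	bool primed;		/* false until the first poll has run */
};

/*
 * Feed the latest reading and return the number of events counted
 * since the previous poll. The first poll only records a baseline.
 */
unsigned long counter_delta(struct counter_sample *s, unsigned long cur)
{
	unsigned long delta;

	assert(s != NULL);
	if (!s->primed) {
		delta = 0;
		s->primed = true;
	} else if (cur < s->prev) {
		/* Went backwards: assume a reset, count from zero. */
		delta = cur;
	} else {
		delta = cur - s->prev;
	}
	s->prev = cur;
	return delta;
}
```

A write that sets the counter to a larger value cannot be told apart
from real events with this approach, so the deltas are best read as a
hint rather than an exact count.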

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/core.c  | 10 +++++-
 drivers/nvme/host/nvme.h  |  2 ++
 drivers/nvme/host/sysfs.c | 66 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index bacd5e45c322..3b2f7a972941 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -438,11 +438,19 @@ static inline void nvme_end_req_zoned(struct request *req)
 
 static inline void __nvme_end_req(struct request *req)
 {
-	if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET))) {
+	struct nvme_ns *ns = req->q->queuedata;
+	struct nvme_request *nr = nvme_req(req);
+
+	if (unlikely(nr->status && !(req->rq_flags & RQF_QUIET))) {
 		if (blk_rq_is_passthrough(req))
 			nvme_log_err_passthru(req);
 		else
 			nvme_log_error(req);
+
+		if (ns)
+			atomic_long_inc(&ns->errors);
+		else
+			atomic_long_inc(&nr->ctrl->errors);
 	}
 	nvme_end_req_zoned(req);
 	nvme_trace_bio_complete(req);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 68c9df4f457a..b83d702dbb92 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -413,6 +413,7 @@ struct nvme_ctrl {
 	unsigned long ka_last_check_time;
 	struct work_struct fw_act_work;
 	unsigned long events;
+	atomic_long_t errors;
 
 #ifdef CONFIG_NVME_MULTIPATH
 	/* asymmetric namespace access: */
@@ -592,6 +593,7 @@ struct nvme_ns {
 	atomic_long_t failover;
 #endif
 	atomic_long_t retries;
+	atomic_long_t errors;
 	struct list_head siblings;
 	struct kref kref;
 	struct nvme_ns_head *head;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 35a42fd4aec4..789518f21f40 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -6,6 +6,7 @@
  */
 
 #include <linux/nvme-auth.h>
+#include <linux/blkdev.h>
 
 #include "nvme.h"
 #include "fabrics.h"
@@ -376,8 +377,37 @@ static ssize_t command_retries_count_store(struct device *dev,
 }
 static DEVICE_ATTR_RW(command_retries_count);
 
+static ssize_t nvme_io_errors_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
+
+	return sysfs_emit(buf, "%lu\n", atomic_long_read(&ns->errors));
+}
+
+static ssize_t nvme_io_errors_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	unsigned long errors;
+	int err;
+	struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
+
+	err = kstrtoul(buf, 0, &errors);
+	if (err)
+		return -EINVAL;
+
+	atomic_long_set(&ns->errors, errors);
+
+	return count;
+}
+
+struct device_attribute dev_attr_io_errors =
+	__ATTR(command_error_count, 0644,
+		nvme_io_errors_show, nvme_io_errors_store);
+
 static struct attribute *nvme_ns_diag_attrs[] = {
 	&dev_attr_command_retries_count.attr,
+	&dev_attr_io_errors.attr,
 #ifdef CONFIG_NVME_MULTIPATH
 	&dev_attr_multipath_failover_count.attr,
 #endif
@@ -393,6 +423,12 @@ static umode_t nvme_ns_diag_attrs_are_visible(struct kobject *kobj,
 		if (nvme_disk_is_ns_head(dev_to_disk(dev)))
 			return 0;
 	}
+	if (a == &dev_attr_io_errors.attr) {
+		struct gendisk *disk = dev_to_disk(dev);
+
+		if (nvme_disk_is_ns_head(disk))
+			return 0;
+	}
 #ifdef CONFIG_NVME_MULTIPATH
 	if (a == &dev_attr_multipath_failover_count.attr) {
 		if (nvme_disk_is_ns_head(dev_to_disk(dev)))
@@ -995,7 +1031,37 @@ static const struct attribute_group nvme_tls_attrs_group = {
 };
 #endif
 
+static ssize_t nvme_adm_errors_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%lu\n",
+			(unsigned long)atomic_long_read(&ctrl->errors));
+}
+
+static ssize_t nvme_adm_errors_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	unsigned long errors;
+	int err;
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	err = kstrtoul(buf, 0, &errors);
+	if (err)
+		return -EINVAL;
+
+	atomic_long_set(&ctrl->errors, errors);
+
+	return count;
+}
+
+struct device_attribute dev_attr_adm_errors =
+	__ATTR(command_error_count, 0644,
+		nvme_adm_errors_show, nvme_adm_errors_store);
+
 static struct attribute *nvme_dev_diag_attrs[] = {
+	&dev_attr_adm_errors.attr,
 	NULL,
 };
 
-- 
2.53.0




* [PATCHv4 5/8] nvme: export I/O requeue count when no path is usable via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (3 preceding siblings ...)
  2026-05-16 18:36 ` [PATCHv4 4/8] nvme: export command error counters " Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 6/8] nvme: export I/O failure count when no path is available " Nilay Shroff
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

When the NVMe namespace head determines that there is no currently
available path to handle I/O (for example, while a controller is
resetting/connecting or due to a transient link failure), incoming
I/Os are added to the requeue list.

Currently, there is no visibility into how many I/Os have been requeued
in this situation. Add a new ns-head sysfs counter,
io_requeue_no_usable_path_count, under the diag attribute group to expose
the number of I/Os that were requeued due to the absence of an available
path. This counter is also writable, allowing the user to reset it if
needed.

This statistic can help users understand I/O slowdowns or stalls caused
by temporary path unavailability, and can be consumed by monitoring
tools such as nvme-top for real-time observability.
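
Up to this point in the series, the is_visible callback splits the diag
attributes between per-path disks and the ns-head disk. The sketch below
is a userspace model of that split as it reads from the hunks, with
diag_attr_visible() as a hypothetical name; the multipath-related
attributes are assumed to exist only when CONFIG_NVME_MULTIPATH is
enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Model of which diag attributes appear for a given disk kind.
 * Returns true if attr would be visible, false if it is hidden or
 * is not a member of the group in this model.
 */
bool diag_attr_visible(const char *attr, bool is_ns_head)
{
	/* Hidden when nvme_disk_is_ns_head() is true. */
	static const char *const path_only[] = {
		"command_retries_count",
		"command_error_count",
		"multipath_failover_count",
	};
	/* Hidden when nvme_disk_is_ns_head() is false. */
	static const char *const head_only[] = {
		"io_requeue_no_usable_path_count",
	};
	size_t i;

	assert(attr != NULL);
	for (i = 0; i < sizeof(path_only) / sizeof(path_only[0]); i++)
		if (strcmp(attr, path_only[i]) == 0)
			return !is_ns_head;
	for (i = 0; i < sizeof(head_only) / sizeof(head_only[0]); i++)
		if (strcmp(attr, head_only[i]) == 0)
			return is_ns_head;
	return false;
}
```

In other words, a tool looking for the requeue counter would read it from
the ns-head device directory, while retry, error and failover counters
would be read from each path device.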

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/multipath.c | 30 ++++++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h      |  2 ++
 drivers/nvme/host/sysfs.c     |  5 +++++
 3 files changed, 37 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 032595502165..f72a687daa8f 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -522,6 +522,7 @@ static void nvme_ns_head_submit_bio(struct bio *bio)
 		spin_lock_irq(&head->requeue_lock);
 		bio_list_add(&head->requeue_list, bio);
 		spin_unlock_irq(&head->requeue_lock);
+		atomic_long_inc(&head->io_requeue_no_usable_path_count);
 	} else {
 		dev_warn_ratelimited(dev, "no available path - failing I/O\n");
 
@@ -1178,6 +1179,35 @@ static ssize_t multipath_failover_count_store(struct device *dev,
 
 DEVICE_ATTR_RW(multipath_failover_count);
 
+static ssize_t io_requeue_no_usable_path_count_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct gendisk *disk = dev_to_disk(dev);
+	struct nvme_ns_head *head = disk->private_data;
+
+	return sysfs_emit(buf, "%lu\n",
+		    atomic_long_read(&head->io_requeue_no_usable_path_count));
+}
+
+static ssize_t io_requeue_no_usable_path_count_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	int err;
+	unsigned long requeue_cnt;
+	struct gendisk *disk = dev_to_disk(dev);
+	struct nvme_ns_head *head = disk->private_data;
+
+	err = kstrtoul(buf, 0, &requeue_cnt);
+	if (err)
+		return -EINVAL;
+
+	atomic_long_set(&head->io_requeue_no_usable_path_count, requeue_cnt);
+
+	return count;
+}
+
+DEVICE_ATTR_RW(io_requeue_no_usable_path_count);
+
 static int nvme_lookup_ana_group_desc(struct nvme_ctrl *ctrl,
 		struct nvme_ana_group_desc *desc, void *data)
 {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index b83d702dbb92..845e338449ce 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -564,6 +564,7 @@ struct nvme_ns_head {
 	unsigned long		flags;
 	struct delayed_work	remove_work;
 	unsigned int		delayed_removal_secs;
+	atomic_long_t		io_requeue_no_usable_path_count;
 #define NVME_NSHEAD_DISK_LIVE		0
 #define NVME_NSHEAD_QUEUE_IF_NO_PATH	1
 	struct nvme_ns __rcu	*current_path[];
@@ -1067,6 +1068,7 @@ extern struct device_attribute dev_attr_queue_depth;
 extern struct device_attribute dev_attr_numa_nodes;
 extern struct device_attribute dev_attr_delayed_removal_secs;
 extern struct device_attribute dev_attr_multipath_failover_count;
+extern struct device_attribute dev_attr_io_requeue_no_usable_path_count;
 extern struct device_attribute subsys_attr_iopolicy;
 
 static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 789518f21f40..9fe3a74b2bef 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -410,6 +410,7 @@ static struct attribute *nvme_ns_diag_attrs[] = {
 	&dev_attr_io_errors.attr,
 #ifdef CONFIG_NVME_MULTIPATH
 	&dev_attr_multipath_failover_count.attr,
+	&dev_attr_io_requeue_no_usable_path_count.attr,
 #endif
 	NULL,
 };
@@ -434,6 +435,10 @@ static umode_t nvme_ns_diag_attrs_are_visible(struct kobject *kobj,
 		if (nvme_disk_is_ns_head(dev_to_disk(dev)))
 			return 0;
 	}
+	if (a == &dev_attr_io_requeue_no_usable_path_count.attr) {
+		if (!nvme_disk_is_ns_head(dev_to_disk(dev)))
+			return 0;
+	}
 #endif
 	return a->mode;
 }
-- 
2.53.0




* [PATCHv4 6/8] nvme: export I/O failure count when no path is available via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (4 preceding siblings ...)
  2026-05-16 18:36 ` [PATCHv4 5/8] nvme: export I/O requeue count when no path is usable " Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 7/8] nvme: export controller reset event count " Nilay Shroff
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

When I/O is submitted to the NVMe namespace head and no available path
can handle the request, the driver fails the I/O immediately. Currently,
such failures are only reported via kernel log messages, which may be
lost over time since dmesg is a circular buffer.

Add a new ns-head sysfs counter, io_fail_no_available_path_count, under
the diag attribute group to expose the number of I/Os that failed due to
the absence of an available path. This provides persistent visibility
into path-related I/O failures and can help users diagnose the cause of
I/O errors. This counter is also writable, so the user may reset its
value if needed.

This counter can also be consumed by monitoring tools such as nvme-top.
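
The relationship between this counter and the requeue counter from the
previous patch can be modelled as a three-way decision in
nvme_ns_head_submit_bio(). The branch conditions are not visible in the
quoted hunks, so the two booleans below are hypothetical stand-ins for
"a usable path was found" and "some path may still become usable", and
plain integers replace atomic_long_t because the model is
single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum submit_outcome { IO_SUBMITTED, IO_REQUEUED, IO_FAILED };

/* Userspace stand-in for the two ns-head counters in this series. */
struct head_diag {
	unsigned long io_requeue_no_usable_path_count;
	unsigned long io_fail_no_available_path_count;
};

/*
 * Model the outcome of one bio submission on the ns-head and bump
 * the matching counter. Exactly one counter changes per call, and
 * neither changes when the bio is submitted normally.
 */
enum submit_outcome model_submit_bio(struct head_diag *d, bool usable_path,
				     bool path_may_recover)
{
	assert(d != NULL);
	if (usable_path)
		return IO_SUBMITTED;
	if (path_may_recover) {
		d->io_requeue_no_usable_path_count++;
		return IO_REQUEUED;
	}
	d->io_fail_no_available_path_count++;
	return IO_FAILED;
}
```

Read this way, a growing requeue count points at stalls that resolved
once a path returned, while a growing failure count points at I/O errors
that were reported to the submitter.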

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/multipath.c | 30 ++++++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h      |  2 ++
 drivers/nvme/host/sysfs.c     |  5 +++++
 3 files changed, 37 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index f72a687daa8f..dce566aca748 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -527,6 +527,7 @@ static void nvme_ns_head_submit_bio(struct bio *bio)
 		dev_warn_ratelimited(dev, "no available path - failing I/O\n");
 
 		bio_io_error(bio);
+		atomic_long_inc(&head->io_fail_no_available_path_count);
 	}
 
 	srcu_read_unlock(&head->srcu, srcu_idx);
@@ -1208,6 +1209,35 @@ static ssize_t io_requeue_no_usable_path_count_store(struct device *dev,
 
 DEVICE_ATTR_RW(io_requeue_no_usable_path_count);
 
+static ssize_t io_fail_no_available_path_count_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct gendisk *disk = dev_to_disk(dev);
+	struct nvme_ns_head *head = disk->private_data;
+
+	return sysfs_emit(buf, "%lu\n",
+		    atomic_long_read(&head->io_fail_no_available_path_count));
+}
+
+static ssize_t io_fail_no_available_path_count_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	int err;
+	unsigned long fail_cnt;
+	struct gendisk *disk = dev_to_disk(dev);
+	struct nvme_ns_head *head = disk->private_data;
+
+	err = kstrtoul(buf, 0, &fail_cnt);
+	if (err)
+		return -EINVAL;
+
+	atomic_long_set(&head->io_fail_no_available_path_count, fail_cnt);
+
+	return count;
+}
+
+DEVICE_ATTR_RW(io_fail_no_available_path_count);
+
 static int nvme_lookup_ana_group_desc(struct nvme_ctrl *ctrl,
 		struct nvme_ana_group_desc *desc, void *data)
 {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 845e338449ce..9434abf2659e 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -565,6 +565,7 @@ struct nvme_ns_head {
 	struct delayed_work	remove_work;
 	unsigned int		delayed_removal_secs;
 	atomic_long_t		io_requeue_no_usable_path_count;
+	atomic_long_t		io_fail_no_available_path_count;
 #define NVME_NSHEAD_DISK_LIVE		0
 #define NVME_NSHEAD_QUEUE_IF_NO_PATH	1
 	struct nvme_ns __rcu	*current_path[];
@@ -1069,6 +1070,7 @@ extern struct device_attribute dev_attr_numa_nodes;
 extern struct device_attribute dev_attr_delayed_removal_secs;
 extern struct device_attribute dev_attr_multipath_failover_count;
 extern struct device_attribute dev_attr_io_requeue_no_usable_path_count;
+extern struct device_attribute dev_attr_io_fail_no_available_path_count;
 extern struct device_attribute subsys_attr_iopolicy;
 
 static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 9fe3a74b2bef..01d771d85f31 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -411,6 +411,7 @@ static struct attribute *nvme_ns_diag_attrs[] = {
 #ifdef CONFIG_NVME_MULTIPATH
 	&dev_attr_multipath_failover_count.attr,
 	&dev_attr_io_requeue_no_usable_path_count.attr,
+	&dev_attr_io_fail_no_available_path_count.attr,
 #endif
 	NULL,
 };
@@ -439,6 +440,10 @@ static umode_t nvme_ns_diag_attrs_are_visible(struct kobject *kobj,
 		if (!nvme_disk_is_ns_head(dev_to_disk(dev)))
 			return 0;
 	}
+	if (a == &dev_attr_io_fail_no_available_path_count.attr) {
+		if (!nvme_disk_is_ns_head(dev_to_disk(dev)))
+			return 0;
+	}
 #endif
 	return a->mode;
 }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 7/8] nvme: export controller reset event count via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (5 preceding siblings ...)
  2026-05-16 18:36 ` [PATCHv4 6/8] nvme: export I/O failure count when no path is available " Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:36 ` [PATCHv4 8/8] nvme: export controller reconnect " Nilay Shroff
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

The NVMe controller transitions into the RESETTING state during error
recovery, link instability, firmware activation, or when a reset is
explicitly triggered by the user.

Expose a per-ctrl sysfs attribute, reset_count, under the diag attribute
group to provide visibility into these RESETTING state transitions.
Observing the frequency of reset events can help users identify issues
such as PCIe errors or unstable fabric links. The counter is also
writable, allowing the user to reset its value if needed.

This counter can also be consumed by monitoring tools such as nvme-top
to improve controller-level observability.
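
As a rough illustration of how a monitoring tool might consume such a
counter, here is a minimal userspace sketch. The helper name
model_read_counter() is hypothetical and not part of this series; it only
assumes the attribute prints a single unsigned decimal value followed by a
newline, as the show handler in this patch does.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one unsigned decimal counter from a
 * sysfs-style file. Returns 0 on success, -1 if the file cannot be
 * opened or does not start with a number.
 */
static int model_read_counter(const char *path, unsigned long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	/* The attribute is expected to contain e.g. "3\n". */
	ok = fscanf(f, "%lu", out) == 1;
	fclose(f);

	return ok ? 0 : -1;
}
```

A tool would call this periodically with the path of the reset_count
attribute and compare successive readings to derive a reset rate.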

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/core.c  |  1 +
 drivers/nvme/host/nvme.h  |  1 +
 drivers/nvme/host/sysfs.c | 27 +++++++++++++++++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3b2f7a972941..b4dace419552 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -596,6 +596,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		case NVME_CTRL_NEW:
 		case NVME_CTRL_LIVE:
 			changed = true;
+			atomic_long_inc(&ctrl->nr_reset);
 			fallthrough;
 		default:
 			break;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9434abf2659e..e575bef99d4a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -414,6 +414,7 @@ struct nvme_ctrl {
 	struct work_struct fw_act_work;
 	unsigned long events;
 	atomic_long_t errors;
+	atomic_long_t nr_reset;
 
 #ifdef CONFIG_NVME_MULTIPATH
 	/* asymmetric namespace access: */
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 01d771d85f31..72300d6de880 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -1070,8 +1070,35 @@ struct device_attribute dev_attr_adm_errors =
 	__ATTR(command_error_count, 0644,
 		nvme_adm_errors_show, nvme_adm_errors_store);
 
+static ssize_t reset_count_show(struct device *dev,
+		   struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%lu\n", atomic_long_read(&ctrl->nr_reset));
+}
+
+static ssize_t reset_count_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	int err;
+	unsigned long reset_cnt;
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	err = kstrtoul(buf, 0, &reset_cnt);
+	if (err)
+		return -EINVAL;
+
+	atomic_long_set(&ctrl->nr_reset, reset_cnt);
+
+	return count;
+}
+
+static DEVICE_ATTR_RW(reset_count);
+
 static struct attribute *nvme_dev_diag_attrs[] = {
 	&dev_attr_adm_errors.attr,
+	&dev_attr_reset_count.attr,
 	NULL,
 };
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 8/8] nvme: export controller reconnect event count via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (6 preceding siblings ...)
  2026-05-16 18:36 ` [PATCHv4 7/8] nvme: export controller reset event count " Nilay Shroff
@ 2026-05-16 18:36 ` Nilay Shroff
  2026-05-16 18:47 ` [PATCHv4 0/8] nvme: export additional diagnostic counters " Nilay Shroff
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:36 UTC (permalink / raw)
  To: linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong, Nilay Shroff

When an NVMe-oF link goes down, the driver attempts to recover the
connection by repeatedly reconnecting to the remote controller at
configured intervals. A maximum number of reconnect attempts is also
configured, after which recovery stops and the controller is removed
if the connection cannot be re-established.

The driver maintains a counter, nr_reconnects, which is incremented on
each reconnect attempt. However, when a reconnect succeeds, this counter
is reset to zero. Moreover, the counter is currently only reported via
kernel log messages and is not exposed to userspace. Since the kernel
log buffer (dmesg) is circular, this information may be lost over time.

So introduce a new accumulator, acc_reconnects, which accumulates the
nr_reconnects attempts, and expose it per fabrics ctrl via a new sysfs
attribute, reconnect_count, under the diag attribute group to provide
persistent visibility into the number of reconnect attempts made by the
host. This information can help users diagnose unstable links or
connectivity issues. Furthermore, this sysfs attribute is also writable
so that the user may reset it to zero if needed.

The reconnect_count can also be consumed by monitoring tools such as
nvme-top to improve controller-level observability.
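
The accumulation scheme described above can be sketched in userspace
roughly as follows. This is only a model under stated assumptions: C11
atomics stand in for the kernel's atomic_long_t, and struct model_ctrl
and the model_*() helpers are hypothetical names, not driver code.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the two nvme_ctrl fields involved. */
struct model_ctrl {
	int nr_reconnects;		/* attempts in the current recovery cycle */
	atomic_long acc_reconnects;	/* attempts folded in from earlier cycles */
};

static void model_ctrl_init(struct model_ctrl *c)
{
	c->nr_reconnects = 0;
	atomic_init(&c->acc_reconnects, 0);
}

/* One reconnect attempt was made. */
static void model_reconnect_attempt(struct model_ctrl *c)
{
	c->nr_reconnects++;
}

/* Reconnect succeeded: fold the per-cycle count in, then clear it. */
static void model_reconnect_success(struct model_ctrl *c)
{
	atomic_fetch_add(&c->acc_reconnects, c->nr_reconnects);
	c->nr_reconnects = 0;
}

/* Reported value: finished cycles plus attempts still in progress. */
static long model_reconnect_count(struct model_ctrl *c)
{
	return atomic_load(&c->acc_reconnects) + c->nr_reconnects;
}
```

With this model, two attempts followed by a successful reconnect still
report 2, and a later attempt reports 3, which is the property the raw
nr_reconnects counter lacks.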

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/fc.c    |  3 +++
 drivers/nvme/host/nvme.h  |  2 ++
 drivers/nvme/host/rdma.c  |  2 ++
 drivers/nvme/host/sysfs.c | 35 +++++++++++++++++++++++++++++++++++
 drivers/nvme/host/tcp.c   |  2 ++
 5 files changed, 44 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index e4f4528fe2a2..f04eb13dd5e9 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3148,6 +3148,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 		goto out_term_aen_ops;
 	}
 
+	/* accumulate reconnect attempts before resetting it to zero */
+	atomic_long_add(ctrl->ctrl.nr_reconnects, &ctrl->ctrl.acc_reconnects);
 	ctrl->ctrl.nr_reconnects = 0;
 	nvme_start_ctrl(&ctrl->ctrl);
 
@@ -3470,6 +3472,7 @@ nvme_fc_alloc_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 	ctrl->ctrl.opts = opts;
 	ctrl->ctrl.nr_reconnects = 0;
+	atomic_long_set(&ctrl->ctrl.acc_reconnects, 0);
 	INIT_LIST_HEAD(&ctrl->ctrl_list);
 	ctrl->lport = lport;
 	ctrl->rport = rport;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index e575bef99d4a..22535328fdd5 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -456,6 +456,8 @@ struct nvme_ctrl {
 	u16 icdoff;
 	u16 maxcmd;
 	int nr_reconnects;
+	/* accumulate reconnect attempts, as nr_reconnects can reset to zero */
+	atomic_long_t acc_reconnects;
 	unsigned long flags;
 	struct nvmf_ctrl_options *opts;
 
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index f77c960f7632..de45fefdc15e 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1110,6 +1110,8 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 	dev_info(ctrl->ctrl.device, "Successfully reconnected (%d attempts)\n",
 			ctrl->ctrl.nr_reconnects);
 
+	/* accumulate reconnect attempts before resetting it to zero */
+	atomic_long_add(ctrl->ctrl.nr_reconnects, &ctrl->ctrl.acc_reconnects);
 	ctrl->ctrl.nr_reconnects = 0;
 
 	return;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 72300d6de880..9c15e7d869ed 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -1094,17 +1094,52 @@ static ssize_t reset_count_store(struct device *dev,
 	return count;
 }
 
+static ssize_t reconnect_count_show(struct device *dev,
+		   struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%lu\n",
+			  atomic_long_read(&ctrl->acc_reconnects) +
+			  ctrl->nr_reconnects);
+}
+
+static ssize_t reconnect_count_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	int err;
+	unsigned long reconnect_cnt;
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	err = kstrtoul(buf, 0, &reconnect_cnt);
+	if (err)
+		return -EINVAL;
+
+	atomic_long_set(&ctrl->acc_reconnects, reconnect_cnt);
+
+	return count;
+}
+
+static DEVICE_ATTR_RW(reconnect_count);
+
 static DEVICE_ATTR_RW(reset_count);
 
 static struct attribute *nvme_dev_diag_attrs[] = {
 	&dev_attr_adm_errors.attr,
 	&dev_attr_reset_count.attr,
+	&dev_attr_reconnect_count.attr,
 	NULL,
 };
 
 static umode_t nvme_dev_diag_attrs_are_visible(struct kobject *kobj,
 		struct attribute *a, int n)
 {
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	if (a == &dev_attr_reconnect_count.attr && !ctrl->opts)
+		return 0;
+
 	return a->mode;
 }
 
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 15d36d6a728e..ab9d19497b3f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2475,6 +2475,8 @@ static void nvme_tcp_reconnect_ctrl_work(struct work_struct *work)
 	dev_info(ctrl->device, "Successfully reconnected (attempt %d/%d)\n",
 		 ctrl->nr_reconnects, ctrl->opts->max_reconnects);
 
+	/* accumulate reconnect attempts before resetting it to zero */
+	atomic_long_add(ctrl->nr_reconnects, &ctrl->acc_reconnects);
 	ctrl->nr_reconnects = 0;
 
 	return;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (7 preceding siblings ...)
  2026-05-16 18:36 ` [PATCHv4 8/8] nvme: export controller reconnect " Nilay Shroff
@ 2026-05-16 18:47 ` Nilay Shroff
  2026-05-25  9:12 ` Venkat Rao Bagalkote
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Nilay Shroff @ 2026-05-16 18:47 UTC (permalink / raw)
  To: Keith Busch
  Cc: dwagner, hare, hch, sagi, axboe, chaitanyak, venkat88, gjoyce,
	wenxiong, linux-nvme@lists.infradead.org

Hi Keith,

On 5/17/26 12:06 AM, Nilay Shroff wrote:
> Hi,
> 
> The NVMe driver encounters various events and conditions during normal
> operation that are either not tracked today or not exposed to userspace
> via sysfs. Lack of visibility into these events can make it difficult to
> diagnose subtle issues related to controller behavior, multipath
> stability, and I/O reliability.
> 
> This patchset adds several diagnostic counters that provide improved
> observability into NVMe behavior. These counters are intended to help
> users understand events such as transient path unavailability,
> controller retries/reconnect/reset, failovers, and I/O failures. They
> can also be consumed by monitoring tools such as nvme-top.
> 
> Specifically, this series proposes to export the following counters via
> sysfs:
>    - Command retry count
>    - Multipath failover count
>    - Command error count
>    - I/O requeue count
>    - I/O failure count
>    - Controller reset event counts
>    - Controller reconnect counts
> 
> The first patch in the series adds a new diag attribute group under per-path,
> ns-head and ctrl sysfs directories so that all diagnostics counters could be
> grouped together under diag sub-directory. The subsequent patches in the series
> adds diagnostics counters listed above.
> 
> Please note that this patchset doesn't make any functional change but
> rather export relevant counters to user space via sysfs.
> 
> As usual, feedback/comments/suggestions are welcome!
> 
> Changes from v3:
>    - To be consistent in naming, all counters are suffixed with _count
>      (Keith Busch)
>    - The first patch in the series creates new attribute group named
>      diag and all counters are now grouped under this new sysfs
>      attribute group (Keith Busch)
>    - Counters are defined as atomic_long_t instead of size_t (Keith Busch)
>    - Removed RB and TB tags due to above changes
> Link to v3: https://lore.kernel.org/all/20260220175024.292898-1-nilay@linux.ibm.com/

As discussed during LSFMM, I have incorporated all of your feedback listed in the
changelog above into the current patchset, except for integrating command_error_count
into the generic gendisk statistics.

While working through the implementation, I realized that although some namespace/
path-level error statistics could conceptually fit within generic block-layer accounting,
controller-wide error statistics do not naturally map onto a gendisk object. This is
because NVMe controllers themselves do not have an associated block device representation.

As a result, controller-scoped telemetry cannot be cleanly integrated into generic gendisk
statistics. Therefore, this patchset exports these counters through NVMe-specific sysfs
attributes instead. Let me know if this approach looks reasonable.

Thanks,
--Nilay



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (8 preceding siblings ...)
  2026-05-16 18:47 ` [PATCHv4 0/8] nvme: export additional diagnostic counters " Nilay Shroff
@ 2026-05-25  9:12 ` Venkat Rao Bagalkote
  2026-05-27 19:54 ` Keith Busch
  2026-06-04  8:58 ` Keith Busch
  11 siblings, 0 replies; 13+ messages in thread
From: Venkat Rao Bagalkote @ 2026-05-25  9:12 UTC (permalink / raw)
  To: Nilay Shroff, linux-nvme
  Cc: dwagner, hare, kbusch, hch, sagi, axboe, chaitanyak, gjoyce,
	wenxiong


On 17/05/26 12:06 am, Nilay Shroff wrote:
> Hi,
>
> The NVMe driver encounters various events and conditions during normal
> operation that are either not tracked today or not exposed to userspace
> via sysfs. Lack of visibility into these events can make it difficult to
> diagnose subtle issues related to controller behavior, multipath
> stability, and I/O reliability.
>
> This patchset adds several diagnostic counters that provide improved
> observability into NVMe behavior. These counters are intended to help
> users understand events such as transient path unavailability,
> controller retries/reconnect/reset, failovers, and I/O failures. They
> can also be consumed by monitoring tools such as nvme-top.
>
> Specifically, this series proposes to export the following counters via
> sysfs:
>    - Command retry count
>    - Multipath failover count
>    - Command error count
>    - I/O requeue count
>    - I/O failure count
>    - Controller reset event counts
>    - Controller reconnect counts
>
> The first patch in the series adds a new diag attribute group under per-path,
> ns-head and ctrl sysfs directories so that all diagnostics counters could be
> grouped together under diag sub-directory. The subsequent patches in the series
> adds diagnostics counters listed above.
>
> Please note that this patchset doesn't make any functional change but
> rather export relevant counters to user space via sysfs.
>
> As usual, feedback/comments/suggestions are welcome!
>
> Changes from v3:
>    - To be consistent in naming, all counters are suffixed with _count
>      (Keith Busch)
>    - The first patch in the series creates new attribute group named
>      diag and all counters are now grouped under this new sysfs
>      attribute group (Keith Busch)
>    - Counters are defined as atomic_long_t instead of size_t (Keith Busch)
>    - Removed RB and TB tags due to above changes
> Link to v3: https://lore.kernel.org/all/20260220175024.292898-1-nilay@linux.ibm.com/
>
> Changes from v2:
>    - Allow user to write to sysfs attributes so that user could
>      reset stat counters, if needed (Sagi)
>    - The controller reconnect counter nr_reconnects could reset
>      to zero once connection is re-established, so instead of
>      exposing nr_reconnects counter via sysfs introduce a new
>      counter which accumulates the reconnect attempts and export
>      this accumulated counter via sysfs (Sagi)
> Link to v2: https://lore.kernel.org/all/20260205124810.682559-1-nilay@linux.ibm.com/
>
> Changes from v1:
>    - Remove export of stats for admin command retry count (Keith)
>    - Use size_add() to ensure stat counters don't overflow (Keith)
> Link to v1: https://lore.kernel.org/all/20260130182028.885089-1-nilay@linux.ibm.com/
>
> Nilay Shroff (8):
>    nvme: add diag attribute group under sysfs
>    nvme: export command retry count via sysfs
>    nvme: export multipath failover count via sysfs
>    nvme: export command error counters via sysfs
>    nvme: export I/O requeue count when no path is available via sysfs
>    nvme: export I/O failure count when no path is available via sysfs
>    nvme: export controller reset event count via sysfs
>    nvme: export controller reconnect event count via sysfs
>
>   drivers/nvme/host/core.c      |  15 ++-
>   drivers/nvme/host/fc.c        |   3 +
>   drivers/nvme/host/multipath.c |  87 ++++++++++++++
>   drivers/nvme/host/nvme.h      |  13 +++
>   drivers/nvme/host/pci.c       |   1 +
>   drivers/nvme/host/rdma.c      |   2 +
>   drivers/nvme/host/sysfs.c     | 214 ++++++++++++++++++++++++++++++++++
>   drivers/nvme/host/tcp.c       |   2 +
>   8 files changed, 336 insertions(+), 1 deletion(-)
>

Hello Nilay,

Applied this patch series on top of v7.1-rc5 and boot-tested on ppc64le.

Verified the new NVMe diag sysfs hierarchy and counters exposed by this 
series.

Validation steps executed:

Read all exported NVMe diag counters:

  for f in $(find /sys -path '*nvme*diag/*_count' 2>/dev/null); do echo "$f: $(cat "$f")"; done

Reset all writable counters to zero:

  for f in $(find /sys -path '*nvme*diag/*_count' 2>/dev/null); do echo 0 > "$f" && echo "reset ok $f"; done

Negative test with invalid input:

  echo abc > /sys/devices/pci0525:48/0525:48:00.0/nvme/nvme0/diag/command_error_count

Observed results:

diag directories were present under:

controller paths, e.g. /sys/devices/.../nvme/nvmeX/diag/

per-path namespace paths, e.g. /sys/devices/.../nvme/nvmeX/nvmeYcZnW/diag/

namespace-head paths, e.g. 
/sys/devices/virtual/nvme-subsystem/nvme-subsysX/nvmeYnZ/diag/

Controller counters observed:
reset_count
command_error_count
reconnect_count on fabrics controllers


# ll /sys/devices/virtual/nvme-fabrics/ctl/nvme7/diag
total 0
-rw-r--r--. 1 root root 65536 May 25 03:58 command_error_count
-rw-r--r--. 1 root root 65536 May 25 03:58 reconnect_count
-rw-r--r--. 1 root root 65536 May 25 03:58 reset_count

# ll /sys/devices/pci052a:58/052a:58:00.0/nvme/nvme2/diag
total 0
-rw-r--r--. 1 root root 65536 May 25 03:58 command_error_count
-rw-r--r--. 1 root root 65536 May 25 03:58 reset_count


Per-path counters observed:
multipath_failover_count
command_error_count
command_retries_count

# ll /sys/devices/pci052a:58/052a:58:00.0/nvme/nvme2/nvme2c2n1/diag
total 0
-rw-r--r--. 1 root root 65536 May 25 03:58 command_error_count
-rw-r--r--. 1 root root 65536 May 25 03:58 command_retries_count
-rw-r--r--. 1 root root 65536 May 25 03:58 multipath_failover_count


Namespace-head counters observed:
io_fail_no_available_path_count
io_requeue_no_usable_path_count


# ll /sys/devices/virtual/nvme-subsystem/nvme-subsys1/nvme1n2/diag
total 0
-rw-r--r--. 1 root root 65536 May 25 03:58 io_fail_no_available_path_count
-rw-r--r--. 1 root root 65536 May 25 03:58 io_requeue_no_usable_path_count


All reads returned numeric values

All reset writes to 0 succeeded

Invalid text write failed as expected: -bash: echo: write error: Invalid 
argument.
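
The rejection of "abc" follows from the store handlers in this series,
which parse the input with kstrtoul() in base 0 and return -EINVAL on any
parse error. A rough userspace approximation, for illustration only:
model_parse_count() is a hypothetical name, and strtoul() is used here as
a stand-in whose edge-case behaviour may differ from kstrtoul().

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a counter value the way the store handlers
 * roughly do. Returns 0 and stores the value, or -EINVAL on any error.
 */
static int model_parse_count(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long val;

	/* Require a leading digit; strtoul() alone would skip spaces/signs. */
	if (!isdigit((unsigned char)buf[0]))
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (errno == ERANGE)
		return -EINVAL;

	/* Allow one trailing newline, as written by "echo". */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}
```

Under this model "0\n" parses to 0, while "abc\n" is rejected with
-EINVAL, matching the "Invalid argument" error seen above.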


If it all looks good, please add the tag below.


Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>


Regards,

Venkat.






^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (9 preceding siblings ...)
  2026-05-25  9:12 ` Venkat Rao Bagalkote
@ 2026-05-27 19:54 ` Keith Busch
  2026-06-04  8:58 ` Keith Busch
  11 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2026-05-27 19:54 UTC (permalink / raw)
  To: Nilay Shroff
  Cc: linux-nvme, dwagner, hare, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong

On Sun, May 17, 2026 at 12:06:47AM +0530, Nilay Shroff wrote:
> Hi,
> 
> The NVMe driver encounters various events and conditions during normal
> operation that are either not tracked today or not exposed to userspace
> via sysfs. Lack of visibility into these events can make it difficult to
> diagnose subtle issues related to controller behavior, multipath
> stability, and I/O reliability.
> 
> This patchset adds several diagnostic counters that provide improved
> observability into NVMe behavior. These counters are intended to help
> users understand events such as transient path unavailability,
> controller retries/reconnect/reset, failovers, and I/O failures. They
> can also be consumed by monitoring tools such as nvme-top.

Thanks, Nilay. This looks good to me. I'll queue it up if I don't hear
any objections.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs
  2026-05-16 18:36 [PATCHv4 0/8] nvme: export additional diagnostic counters via sysfs Nilay Shroff
                   ` (10 preceding siblings ...)
  2026-05-27 19:54 ` Keith Busch
@ 2026-06-04  8:58 ` Keith Busch
  11 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2026-06-04  8:58 UTC (permalink / raw)
  To: Nilay Shroff
  Cc: linux-nvme, dwagner, hare, hch, sagi, axboe, chaitanyak, venkat88,
	gjoyce, wenxiong

On Sun, May 17, 2026 at 12:06:47AM +0530, Nilay Shroff wrote:
> Hi,
> 
> The NVMe driver encounters various events and conditions during normal
> operation that are either not tracked today or not exposed to userspace
> via sysfs. Lack of visibility into these events can make it difficult to
> diagnose subtle issues related to controller behavior, multipath
> stability, and I/O reliability.
> 
> This patchset adds several diagnostic counters that provide improved
> observability into NVMe behavior. These counters are intended to help
> users understand events such as transient path unavailability,
> controller retries/reconnect/reset, failovers, and I/O failures. They
> can also be consumed by monitoring tools such as nvme-top.
> 
> Specifically, this series proposes to export the following counters via
> sysfs:
>   - Command retry count
>   - Multipath failover count
>   - Command error count
>   - I/O requeue count
>   - I/O failure count
>   - Controller reset event counts
>   - Controller reconnect counts

Thanks, applied to nvme-7.2.


^ permalink raw reply	[flat|nested] 13+ messages in thread

