* [PATCH v2] nvme-multipath: expose path_state via sysfs
@ 2026-06-22 13:03 Guixin Liu
2026-06-23 6:16 ` Nilay Shroff
2026-06-23 12:35 ` Keith Busch
0 siblings, 2 replies; 6+ messages in thread
From: Guixin Liu @ 2026-06-22 13:03 UTC (permalink / raw)
To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
Nilay Shroff, Daniel Wagner
Cc: linux-nvme
Add a read-only "path_state" sysfs attribute to each NVMe path namespace
device (/sys/class/nvme/nvmeX/nvmeXcYnZ/path_state) that exposes whether
the path is currently enabled or disabled with a specific reason.
The attribute reflects the result of nvme_path_is_disabled() checks:
- "enabled" : path is available for I/O
- "disabled (ctrl_down)" : controller is not live/deleting
- "disabled (ana_pending)" : ANA state change pending
- "disabled (ns_not_ready)" : namespace is not ready
This gives userspace visibility into the multipath path selection state
without requiring users to piece together controller state and namespace
flags manually.
Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
---
v1->v2:
- Show specific disabled reason instead of just "disabled":
"disabled (ctrl_down)", "disabled (ana_pending)",
"disabled (ns_not_ready)". (Nilay Shroff)
drivers/nvme/host/multipath.c | 16 ++++++++++++++++
drivers/nvme/host/nvme.h | 1 +
drivers/nvme/host/sysfs.c | 4 +++-
3 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 81fff2f20d23..5e3847fd6632 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -1210,6 +1210,22 @@ static ssize_t in_flight_bytes_show(struct device *dev,
}
DEVICE_ATTR_RO(in_flight_bytes);
+static ssize_t path_state_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
+	enum nvme_ctrl_state state = nvme_ctrl_state(ns->ctrl);
+
+	if (state != NVME_CTRL_LIVE && state != NVME_CTRL_DELETING)
+		return sysfs_emit(buf, "disabled (ctrl_down)\n");
+	if (test_bit(NVME_NS_ANA_PENDING, &ns->flags))
+		return sysfs_emit(buf, "disabled (ana_pending)\n");
+	if (!test_bit(NVME_NS_READY, &ns->flags))
+		return sysfs_emit(buf, "disabled (ns_not_ready)\n");
+	return sysfs_emit(buf, "enabled\n");
+}
+DEVICE_ATTR_RO(path_state);
+
static ssize_t relative_throughput_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 2b2627e0d3ce..bb9b8cdf973a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1066,6 +1066,7 @@ extern struct device_attribute dev_attr_queue_depth;
extern struct device_attribute dev_attr_in_flight_bytes;
extern struct device_attribute dev_attr_relative_throughput;
extern struct device_attribute dev_attr_numa_nodes;
+extern struct device_attribute dev_attr_path_state;
extern struct device_attribute dev_attr_delayed_removal_secs;
extern struct device_attribute subsys_attr_iopolicy;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 6309af224c93..5110a3bf5279 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -262,6 +262,7 @@ static struct attribute *nvme_ns_attrs[] = {
&dev_attr_in_flight_bytes.attr,
&dev_attr_relative_throughput.attr,
&dev_attr_numa_nodes.attr,
+ &dev_attr_path_state.attr,
&dev_attr_delayed_removal_secs.attr,
#endif
&dev_attr_io_passthru_err_log_enabled.attr,
@@ -296,7 +297,8 @@ static umode_t nvme_ns_attrs_are_visible(struct kobject *kobj,
return 0;
}
if (a == &dev_attr_queue_depth.attr || a == &dev_attr_in_flight_bytes.attr ||
- a == &dev_attr_relative_throughput.attr || a == &dev_attr_numa_nodes.attr) {
+ a == &dev_attr_relative_throughput.attr || a == &dev_attr_numa_nodes.attr ||
+ a == &dev_attr_path_state.attr) {
if (nvme_disk_is_ns_head(dev_to_disk(dev)))
return 0;
}
--
2.43.7
^ permalink raw reply related [flat|nested] 6+ messages in thread
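For readers who want to consume the attribute from userspace, here is a minimal sketch of a parser for the output format described in the commit message. It assumes the v2 strings ("enabled", "disabled (<reason>)") plus a bare "disabled" as in v1; the format may still change in later revisions. The names struct path_state, parse_path_state() and read_path_state() are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Parsed form of one path_state line; this struct is hypothetical. */
struct path_state {
	bool enabled;
	char reason[32];	/* empty when enabled or when no reason is given */
};

/*
 * Parse "enabled", "disabled" or "disabled (<reason>)", with an optional
 * trailing newline. Returns 0 on success, -1 on unrecognized input.
 */
int parse_path_state(const char *line, struct path_state *out)
{
	static const char dis[] = "disabled";
	const size_t dlen = sizeof(dis) - 1;
	size_t len, n;

	assert(line && out);
	len = strcspn(line, "\n");	/* ignore the trailing newline */
	out->enabled = false;
	out->reason[0] = '\0';

	if (len == strlen("enabled") && !strncmp(line, "enabled", len)) {
		out->enabled = true;
		return 0;
	}
	if (len < dlen || strncmp(line, dis, dlen))
		return -1;
	if (len == dlen)
		return 0;	/* bare "disabled", no reason given */

	/* Expect " (<reason>)" with a non-empty reason. */
	if (len < dlen + 4 || line[dlen] != ' ' || line[dlen + 1] != '(' ||
	    line[len - 1] != ')')
		return -1;
	n = len - dlen - 3;	/* strip "disabled", " (" and ")" */
	if (n >= sizeof(out->reason))
		return -1;
	memcpy(out->reason, line + dlen + 2, n);
	out->reason[n] = '\0';
	return 0;
}

/* Read the first line of a path_state file and parse it. */
int read_path_state(const char *path, struct path_state *out)
{
	char buf[64];
	FILE *f = fopen(path, "r");
	int ret;

	if (!f)
		return -1;
	ret = fgets(buf, sizeof(buf), f) ? parse_path_state(buf, out) : -1;
	fclose(f);
	return ret;
}
```

A caller would pass a concrete path of the form /sys/class/nvme/nvmeX/nvmeXcYnZ/path_state to read_path_state() and treat a -1 return as "attribute missing or format not understood".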
* Re: [PATCH v2] nvme-multipath: expose path_state via sysfs
2026-06-22 13:03 [PATCH v2] nvme-multipath: expose path_state via sysfs Guixin Liu
@ 2026-06-23 6:16 ` Nilay Shroff
2026-06-23 7:54 ` Guixin Liu
2026-06-23 12:35 ` Keith Busch
1 sibling, 1 reply; 6+ messages in thread
From: Nilay Shroff @ 2026-06-23 6:16 UTC (permalink / raw)
To: Guixin Liu, Keith Busch, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, Daniel Wagner
Cc: linux-nvme
On 6/22/26 6:33 PM, Guixin Liu wrote:
> Add a read-only "path_state" sysfs attribute to each NVMe path namespace
> device (/sys/class/nvme/nvmeX/nvmeXcYnZ/path_state) that exposes whether
> the path is currently enabled or disabled with a specific reason.
>
> The attribute reflects the result of nvme_path_is_disabled() checks:
> - "enabled" : path is available for I/O
> - "disabled (ctrl_down)" : controller is not live/deleting
> - "disabled (ana_pending)" : ANA state change pending
> - "disabled (ns_not_ready)" : namespace is not ready
>
> This gives userspace visibility into the multipath path selection state
> without requiring users to piece together controller state and namespace
> flags manually.
>
> Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
> ---
> v1->v2:
> - Show specific disabled reason instead of just "disabled":
> "disabled (ctrl_down)", "disabled (ana_pending)",
> "disabled (ns_not_ready)". (Nilay Shroff)
>
> drivers/nvme/host/multipath.c | 16 ++++++++++++++++
> drivers/nvme/host/nvme.h | 1 +
> drivers/nvme/host/sysfs.c | 4 +++-
> 3 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 81fff2f20d23..5e3847fd6632 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -1210,6 +1210,22 @@ static ssize_t in_flight_bytes_show(struct device *dev,
> }
> DEVICE_ATTR_RO(in_flight_bytes);
>
> +static ssize_t path_state_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
> + enum nvme_ctrl_state state = nvme_ctrl_state(ns->ctrl);
> +
> + if (state != NVME_CTRL_LIVE && state != NVME_CTRL_DELETING)
> + return sysfs_emit(buf, "disabled (ctrl_down)\n");
> + if (test_bit(NVME_NS_ANA_PENDING, &ns->flags))
> + return sysfs_emit(buf, "disabled (ana_pending)\n");
> + if (!test_bit(NVME_NS_READY, &ns->flags))
> + return sysfs_emit(buf, "disabled (ns_not_ready)\n");
> + return sysfs_emit(buf, "enabled\n");
> +}
> +DEVICE_ATTR_RO(path_state);
> +
I'd prefer not to open-code the path disabled checks in the sysfs attribute.
Ideally, we could factor the logic into a helper that returns the path state
(or disabled reason), and have nvme_path_is_disabled() built on top of that.
This would keep the path selection logic and sysfs reporting in sync.
The concern with the current implementation is that it duplicates the checks
from nvme_path_is_disabled(). If someone updates the path disable criteria in
the future, we'd also need to remember to update path_state_show(). Having a
common helper would avoid that.
For instance, something like below:
enum nvme_path_state {
	NVME_PATH_ENABLED,
	NVME_PATH_DISABLED_CTRL_DOWN,
	NVME_PATH_DISABLED_ANA_PENDING,
	NVME_PATH_DISABLED_NS_NOT_READY,
};

static enum nvme_path_state nvme_path_get_state(struct nvme_ns *ns)
{
	enum nvme_ctrl_state state = nvme_ctrl_state(ns->ctrl);

	/*
	 * We don't treat NVME_CTRL_DELETING as a disabled path as I/O should
	 * still be able to complete assuming that the controller is connected.
	 * Otherwise it will fail immediately and return to the requeue list.
	 */
	if (state != NVME_CTRL_LIVE && state != NVME_CTRL_DELETING)
		return NVME_PATH_DISABLED_CTRL_DOWN;

	if (test_bit(NVME_NS_ANA_PENDING, &ns->flags))
		return NVME_PATH_DISABLED_ANA_PENDING;

	if (!test_bit(NVME_NS_READY, &ns->flags))
		return NVME_PATH_DISABLED_NS_NOT_READY;

	return NVME_PATH_ENABLED;
}

static bool nvme_path_is_disabled(struct nvme_ns *ns)
{
	return nvme_path_get_state(ns) != NVME_PATH_ENABLED;
}
Now the existing callers can continue using nvme_path_is_disabled()
as is, without any change. For sysfs, we'd call nvme_path_get_state()
instead, then map the returned enum nvme_path_state value to a string
and use that for output.
static const char * const nvme_path_state_str[] = {
	[NVME_PATH_ENABLED]			= "enabled",
	[NVME_PATH_DISABLED_CTRL_DOWN]		= "disabled (ctrl_down)",
	[NVME_PATH_DISABLED_ANA_PENDING]	= "disabled (ana_pending)",
	[NVME_PATH_DISABLED_NS_NOT_READY]	= "disabled (ns_not_ready)",
};

...
	return sysfs_emit(buf, "%s\n",
			nvme_path_state_str[nvme_path_get_state(ns)]);
This way any future updates to the path disable logic would automatically
be reflected in the sysfs output as well.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 6+ messages in thread
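The shared-helper idea above can be tried outside the kernel. The following is a userspace sketch only: struct mock_ns, the flag masks and the reduced controller-state enum are simplified stand-ins for the kernel types (the kernel uses bit numbers with test_bit()), and NVME_PATH_STATE_COUNT plus nvme_path_state_name() are hypothetical additions that do not appear in the proposal. It shows one way a compile-time check might keep the string table from falling behind the enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel definitions; simplified. */
enum nvme_ctrl_state {
	NVME_CTRL_NEW,
	NVME_CTRL_LIVE,
	NVME_CTRL_RESETTING,
	NVME_CTRL_DELETING,
};

/* Flag masks for the mock; the kernel uses bit numbers instead. */
enum {
	NVME_NS_ANA_PENDING	= 1u << 0,
	NVME_NS_READY		= 1u << 1,
};

struct mock_ns {
	enum nvme_ctrl_state ctrl_state;
	unsigned int flags;
};

enum nvme_path_state {
	NVME_PATH_ENABLED,
	NVME_PATH_DISABLED_CTRL_DOWN,
	NVME_PATH_DISABLED_ANA_PENDING,
	NVME_PATH_DISABLED_NS_NOT_READY,
	NVME_PATH_STATE_COUNT,	/* hypothetical sentinel, not in the proposal */
};

const char * const nvme_path_state_str[] = {
	[NVME_PATH_ENABLED]			= "enabled",
	[NVME_PATH_DISABLED_CTRL_DOWN]		= "disabled (ctrl_down)",
	[NVME_PATH_DISABLED_ANA_PENDING]	= "disabled (ana_pending)",
	[NVME_PATH_DISABLED_NS_NOT_READY]	= "disabled (ns_not_ready)",
};

/*
 * Fails the build if the table is shorter than the enum, e.g. when a new
 * state is appended without a string. A missing entry in the middle of
 * the table would leave a NULL hole that this check does not catch.
 */
static_assert(sizeof(nvme_path_state_str) / sizeof(nvme_path_state_str[0]) ==
	      NVME_PATH_STATE_COUNT, "every path state needs a string");

/* Single source of truth for the disable criteria. */
enum nvme_path_state nvme_path_get_state(const struct mock_ns *ns)
{
	if (ns->ctrl_state != NVME_CTRL_LIVE &&
	    ns->ctrl_state != NVME_CTRL_DELETING)
		return NVME_PATH_DISABLED_CTRL_DOWN;
	if (ns->flags & NVME_NS_ANA_PENDING)
		return NVME_PATH_DISABLED_ANA_PENDING;
	if (!(ns->flags & NVME_NS_READY))
		return NVME_PATH_DISABLED_NS_NOT_READY;
	return NVME_PATH_ENABLED;
}

/* Path selection keeps its boolean view of the same logic. */
bool nvme_path_is_disabled(const struct mock_ns *ns)
{
	return nvme_path_get_state(ns) != NVME_PATH_ENABLED;
}

/* What a sysfs show routine would print, minus the newline. */
const char *nvme_path_state_name(const struct mock_ns *ns)
{
	return nvme_path_state_str[nvme_path_get_state(ns)];
}
```

Because both nvme_path_is_disabled() and nvme_path_state_name() go through nvme_path_get_state(), a change to the criteria is made in one place only.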
* Re: [PATCH v2] nvme-multipath: expose path_state via sysfs
2026-06-23 6:16 ` Nilay Shroff
@ 2026-06-23 7:54 ` Guixin Liu
0 siblings, 0 replies; 6+ messages in thread
From: Guixin Liu @ 2026-06-23 7:54 UTC (permalink / raw)
To: Nilay Shroff, Keith Busch, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, Daniel Wagner
Cc: linux-nvme
On 2026/6/23 14:16, Nilay Shroff wrote:
> On 6/22/26 6:33 PM, Guixin Liu wrote:
>> Add a read-only "path_state" sysfs attribute to each NVMe path namespace
>> device (/sys/class/nvme/nvmeX/nvmeXcYnZ/path_state) that exposes whether
>> the path is currently enabled or disabled with a specific reason.
>>
>> The attribute reflects the result of nvme_path_is_disabled() checks:
>> - "enabled" : path is available for I/O
>> - "disabled (ctrl_down)" : controller is not live/deleting
>> - "disabled (ana_pending)" : ANA state change pending
>> - "disabled (ns_not_ready)" : namespace is not ready
>>
>> This gives userspace visibility into the multipath path selection state
>> without requiring users to piece together controller state and namespace
>> flags manually.
>>
>> Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
>> ---
>> v1->v2:
>> - Show specific disabled reason instead of just "disabled":
>> "disabled (ctrl_down)", "disabled (ana_pending)",
>> "disabled (ns_not_ready)". (Nilay Shroff)
>>
>> drivers/nvme/host/multipath.c | 16 ++++++++++++++++
>> drivers/nvme/host/nvme.h | 1 +
>> drivers/nvme/host/sysfs.c | 4 +++-
>> 3 files changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/multipath.c
>> b/drivers/nvme/host/multipath.c
>> index 81fff2f20d23..5e3847fd6632 100644
>> --- a/drivers/nvme/host/multipath.c
>> +++ b/drivers/nvme/host/multipath.c
>> @@ -1210,6 +1210,22 @@ static ssize_t in_flight_bytes_show(struct
>> device *dev,
>> }
>> DEVICE_ATTR_RO(in_flight_bytes);
>> +static ssize_t path_state_show(struct device *dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
>> + enum nvme_ctrl_state state = nvme_ctrl_state(ns->ctrl);
>> +
>> + if (state != NVME_CTRL_LIVE && state != NVME_CTRL_DELETING)
>> + return sysfs_emit(buf, "disabled (ctrl_down)\n");
>> + if (test_bit(NVME_NS_ANA_PENDING, &ns->flags))
>> + return sysfs_emit(buf, "disabled (ana_pending)\n");
>> + if (!test_bit(NVME_NS_READY, &ns->flags))
>> + return sysfs_emit(buf, "disabled (ns_not_ready)\n");
>> + return sysfs_emit(buf, "enabled\n");
>> +}
>> +DEVICE_ATTR_RO(path_state);
>> +
>
> I'd prefer not to open-code the path disabled checks in the sysfs
> attribute.
> Ideally, we could factor the logic into a helper that returns the path
> state
> (or disabled reason), and have nvme_path_is_disabled() built on top of
> that.
> This would keep the path selection logic and sysfs reporting in sync.
>
> The concern with the current implementation is that it duplicates the
> checks
> from nvme_path_is_disabled(). If someone updates the path disable
> criteria in
> the future, we'd also need to remember to update path_state_show().
> Having a
> common helper would avoid that.
>
> For instance, something like below:
>
> enum nvme_path_state {
> NVME_PATH_ENABLED,
> NVME_PATH_DISABLED_CTRL_DOWN,
> NVME_PATH_DISABLED_ANA_PENDING,
> NVME_PATH_DISABLED_NS_NOT_READY,
> };
>
> static enum nvme_path_state nvme_path_get_state(struct nvme_ns *ns)
> {
> enum nvme_ctrl_state state = nvme_ctrl_state(ns->ctrl);
>
> /*
> * We don't treat NVME_CTRL_DELETING as a disabled path as I/O should
> * still be able to complete assuming that the controller is
> connected.
> * Otherwise it will fail immediately and return to the requeue list.
> */
> if (state != NVME_CTRL_LIVE && state != NVME_CTRL_DELETING)
> return NVME_PATH_DISABLED_CTRL_DOWN;
>
> if (test_bit(NVME_NS_ANA_PENDING, &ns->flags))
> return NVME_PATH_DISABLED_ANA_PENDING;
>
> if (!test_bit(NVME_NS_READY, &ns->flags))
> return NVME_PATH_DISABLED_NS_NOT_READY;
>
> return NVME_PATH_ENABLED;
> }
>
> static bool nvme_path_is_disabled(struct nvme_ns *ns)
> {
> return nvme_path_get_state(ns) != NVME_PATH_ENABLED;
> }
>
> Now the existing callers can continue using nvme_path_is_disabled()
> as is without any change. But for sysfs, we'd call nvme_path_get_state().
> Then the sysfs would map enum nvme_path_state to strings and
> use it for output.
>
> static const char * const nvme_path_state_str[] = {
> [NVME_PATH_ENABLED] = "enabled",
> [NVME_PATH_DISABLED_CTRL_DOWN] = "disabled (ctrl_down)",
> [NVME_PATH_DISABLED_ANA_PENDING] = "disabled (ana_pending)",
> [NVME_PATH_DISABLED_NS_NOT_READY] = "disabled (ns_not_ready)",
> };
>
> ...
> return sysfs_emit(buf, "%s\n",
> nvme_path_state_str[nvme_path_get_state(ns)]);
>
> This way any future updates to the path disable logic would automatically
> be reflected in the sysfs output as well.
>
> Thanks,
> --Nilay
That's reasonable; I will change this in v3, thanks.
Best Regards,
Guixin Liu
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] nvme-multipath: expose path_state via sysfs
2026-06-22 13:03 [PATCH v2] nvme-multipath: expose path_state via sysfs Guixin Liu
2026-06-23 6:16 ` Nilay Shroff
@ 2026-06-23 12:35 ` Keith Busch
2026-06-23 14:01 ` Nilay Shroff
2026-06-24 1:36 ` Guixin Liu
1 sibling, 2 replies; 6+ messages in thread
From: Keith Busch @ 2026-06-23 12:35 UTC (permalink / raw)
To: Guixin Liu
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, Nilay Shroff,
Daniel Wagner, linux-nvme
On Mon, Jun 22, 2026 at 09:03:18PM +0800, Guixin Liu wrote:
> The attribute reflects the result of nvme_path_is_disabled() checks:
> - "enabled" : path is available for I/O
> - "disabled (ctrl_down)" : controller is not live/deleting
> - "disabled (ana_pending)" : ANA state change pending
> - "disabled (ns_not_ready)" : namespace is not ready
I'm a bit late in looking at this, but just a knee-jerk thought:
wouldn't you want to distinguish "enabled optimized" vs "enabled
unoptimized"?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] nvme-multipath: expose path_state via sysfs
2026-06-23 12:35 ` Keith Busch
@ 2026-06-23 14:01 ` Nilay Shroff
2026-06-24 1:36 ` Guixin Liu
1 sibling, 0 replies; 6+ messages in thread
From: Nilay Shroff @ 2026-06-23 14:01 UTC (permalink / raw)
To: Keith Busch, Guixin Liu
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, Daniel Wagner,
linux-nvme
On 6/23/26 6:05 PM, Keith Busch wrote:
> On Mon, Jun 22, 2026 at 09:03:18PM +0800, Guixin Liu wrote:
>> The attribute reflects the result of nvme_path_is_disabled() checks:
>> - "enabled" : path is available for I/O
>> - "disabled (ctrl_down)" : controller is not live/deleting
>> - "disabled (ana_pending)" : ANA state change pending
>> - "disabled (ns_not_ready)" : namespace is not ready
>
> I'm a bit late to looking at this, but just a knee jerk thought,
> wouldn't you want to distinguish "enabled optimized" vs "enabled
> unoptimized"?
This is a good idea. I think we should consider reporting this.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] nvme-multipath: expose path_state via sysfs
2026-06-23 12:35 ` Keith Busch
2026-06-23 14:01 ` Nilay Shroff
@ 2026-06-24 1:36 ` Guixin Liu
1 sibling, 0 replies; 6+ messages in thread
From: Guixin Liu @ 2026-06-24 1:36 UTC (permalink / raw)
To: Keith Busch
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, Nilay Shroff,
Daniel Wagner, linux-nvme
On 2026/6/23 20:35, Keith Busch wrote:
> On Mon, Jun 22, 2026 at 09:03:18PM +0800, Guixin Liu wrote:
>> The attribute reflects the result of nvme_path_is_disabled() checks:
>> - "enabled" : path is available for I/O
>> - "disabled (ctrl_down)" : controller is not live/deleting
>> - "disabled (ana_pending)" : ANA state change pending
>> - "disabled (ns_not_ready)" : namespace is not ready
> I'm a bit late to looking at this, but just a knee jerk thought,
> wouldn't you want to distinguish "enabled optimized" vs "enabled
> unoptimized"?
Sure, good idea. I will change this in v3, thanks.
Best Regards,
Guixin Liu
^ permalink raw reply [flat|nested] 6+ messages in thread
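The thread leaves open how v3 will report the optimized versus non-optimized distinction. Purely as an illustration, the sketch below maps an ANA state to a label for a path that has already passed the disabled checks. The function name nvme_enabled_path_label() and the exact strings are hypothetical; the thread also does not say what an otherwise enabled path in another ANA state (inaccessible, persistent loss, change) should report, so the fallback here is an assumption.

```c
#include <assert.h>
#include <string.h>

/* ANA state values as defined by the NVMe specification. */
enum nvme_ana_state {
	NVME_ANA_OPTIMIZED		= 0x01,
	NVME_ANA_NONOPTIMIZED		= 0x02,
	NVME_ANA_INACCESSIBLE		= 0x03,
	NVME_ANA_PERSISTENT_LOSS	= 0x04,
	NVME_ANA_CHANGE			= 0x0f,
};

/*
 * Label for a path that is not disabled. The strings follow the
 * "state (detail)" shape of the v2 output but are hypothetical.
 */
const char *nvme_enabled_path_label(enum nvme_ana_state ana)
{
	switch (ana) {
	case NVME_ANA_OPTIMIZED:
		return "enabled (optimized)";
	case NVME_ANA_NONOPTIMIZED:
		return "enabled (non-optimized)";
	default:
		/* Assumption: other ANA states fall back to the v2 string. */
		return "enabled";
	}
}
```

Keeping the ANA detail in the parenthesized suffix would let a userspace parser that only looks at the first word keep working, though whether v3 takes that form is not decided in this thread.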