public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH nvme-cli] nvme: warn about attaching a namespace to unknown controller
@ 2024-06-20 12:55 Nilay Shroff
  2024-06-21 15:08 ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Nilay Shroff @ 2024-06-20 12:55 UTC (permalink / raw)
  To: dwagner
  Cc: linux-nvme, kbusch, hch, sagi, gjoyce, msmurthy, axboe,
	Nilay Shroff

Sometime it's possible for a multi-controller NVMe disk to have only
one of its controller discovered by the kernel. And if this happens
then it's also possible for a user to create and then attach a namespace
to a controller which has not been discovered by the kernel. In such a
case the attached namespace can't be used for IO because there's no path
to reach such namespace from the kernel.

This patch helps to warn about such case to user when user attempts to
attach a namepsace to an undiscovered controller by kernel. If this warning
appears to the user then user have about 10 seconds of time to abort the
operation by pressing CTRL-C. If user doesn't abort this operation
within 10 seconds of timeout then nvme cli goes ahead as usual and attach
the namespace to the controller.

Link: https://lore.kernel.org/all/f1a7c1e2-3203-4b7a-a922-82fa812455bd@linux.ibm.com/
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 nvme.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 86 insertions(+), 4 deletions(-)

diff --git a/nvme.c b/nvme.c
index ba760901..0f3b3297 100644
--- a/nvme.c
+++ b/nvme.c
@@ -2811,6 +2811,84 @@ static int delete_ns(int argc, char **argv, struct command *cmd, struct plugin *
 	return err;
 }
 
+static bool nvme_match_subsys_device_filter(nvme_subsystem_t s, nvme_ctrl_t c,
+		   nvme_ns_t ns, void *f_arg)
+{
+	nvme_ctrl_t _c;
+	const char *devname = (const char *)f_arg;
+
+	if (!s)
+		return true;
+
+	nvme_subsystem_for_each_ctrl(s, _c) {
+		if (!strcmp(devname, nvme_ctrl_get_name(_c)))
+			return true;
+	}
+
+	return false;
+}
+
+static int nvme_cli_ns_validate_cntlid(__u32 nsid, struct nvme_dev *dev,
+				       int *list, int num)
+{
+	const char *cntlid;
+	int __cntlid;
+	char *p;
+	nvme_host_t h;
+	nvme_subsystem_t s;
+	nvme_ctrl_t c;
+	nvme_root_t r = NULL;
+	int i, err, matched = 0;
+	nvme_scan_filter_t filter = nvme_match_subsys_device_filter;
+
+	r = nvme_create_root(stderr, log_level);
+	if (!r) {
+		nvme_show_error("Failed to create topology root: %s",
+				nvme_strerror(errno));
+		return -errno;
+	}
+
+	err = nvme_scan_topology(r, filter, (void *)dev->name);
+	if (err < 0) {
+		if (errno != ENOENT)
+			nvme_show_error("Failed to scan topology: %s",
+					nvme_strerror(errno));
+		nvme_free_tree(r);
+		return err;
+	}
+	nvme_for_each_host(r, h) {
+		nvme_for_each_subsystem(h, s) {
+			nvme_subsystem_for_each_ctrl(s, c) {
+				cntlid = nvme_ctrl_get_cntlid(c);
+				errno = 0;
+				__cntlid = strtoul(cntlid, &p, 0);
+				if (errno || *p != 0)
+					continue;
+				for (i = 0; i < num; i++) {
+					if (__cntlid == list[i])
+						matched++;
+				}
+			}
+		}
+	}
+
+	nvme_free_tree(r);
+
+	if (matched != num) {
+		fprintf(stderr,
+			"You are about to attach namespace 0x%x to an undiscovered nvme controller.\n",
+			nsid);
+		fprintf(stderr,
+			"WARNING: Attaching nampespace to undiscovered nvme controller may have undesired side effect!\n"
+			"You may not be able to perform any IO to such namespace.\n"
+			"You have 10 seconds to press Ctrl-C to cancel this operation.\n\n");
+		sleep(10);
+		fprintf(stderr, "Sending attach-ns operation ...\n");
+	}
+
+	return 0;
+}
+
 static int nvme_attach_ns(int argc, char **argv, int attach, const char *desc, struct command *cmd)
 {
 	_cleanup_free_ struct nvme_ctrl_list *cntlist = NULL;
@@ -2866,12 +2944,16 @@ static int nvme_attach_ns(int argc, char **argv, int attach, const char *desc, s
 
 	nvme_init_ctrl_list(cntlist, num, ctrlist);
 
-	if (attach)
-		err = nvme_cli_ns_attach_ctrls(dev, cfg.namespace_id,
-					       cntlist);
-	else
+	if (attach) {
+		err = nvme_cli_ns_validate_cntlid(cfg.namespace_id, dev,
+						  list, num);
+		if (!err)
+			err = nvme_cli_ns_attach_ctrls(dev, cfg.namespace_id,
+						       cntlist);
+	} else {
 		err = nvme_cli_ns_detach_ctrls(dev, cfg.namespace_id,
 					       cntlist);
+	}
 
 	if (!err)
 		printf("%s: Success, nsid:%d\n", cmd->name, cfg.namespace_id);
-- 
2.45.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH nvme-cli] nvme: warn about attaching a namespace to unknown controller
  2024-06-20 12:55 [PATCH nvme-cli] nvme: warn about attaching a namespace to unknown controller Nilay Shroff
@ 2024-06-21 15:08 ` Keith Busch
  2024-06-22 10:35   ` Nilay Shroff
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2024-06-21 15:08 UTC (permalink / raw)
  To: Nilay Shroff; +Cc: dwagner, linux-nvme, hch, sagi, gjoyce, msmurthy, axboe

On Thu, Jun 20, 2024 at 06:25:45PM +0530, Nilay Shroff wrote:
> Sometime it's possible for a multi-controller NVMe disk to have only
> one of its controller discovered by the kernel. And if this happens
> then it's also possible for a user to create and then attach a namespace
> to a controller which has not been discovered by the kernel. In such a
> case the attached namespace can't be used for IO because there's no path
> to reach such namespace from the kernel.

Isn't that a pretty normal thing to do, though? Like for an sriov
situation, a primary controller attaches namespaces to secondaries, but
not to itself.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH nvme-cli] nvme: warn about attaching a namespace to unknown controller
  2024-06-21 15:08 ` Keith Busch
@ 2024-06-22 10:35   ` Nilay Shroff
  2024-06-24 19:32     ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Nilay Shroff @ 2024-06-22 10:35 UTC (permalink / raw)
  To: Keith Busch; +Cc: dwagner, linux-nvme, hch, sagi, gjoyce, msmurthy, axboe



On 6/21/24 20:38, Keith Busch wrote:
> On Thu, Jun 20, 2024 at 06:25:45PM +0530, Nilay Shroff wrote:
>> Sometime it's possible for a multi-controller NVMe disk to have only
>> one of its controller discovered by the kernel. And if this happens
>> then it's also possible for a user to create and then attach a namespace
>> to a controller which has not been discovered by the kernel. In such a
>> case the attached namespace can't be used for IO because there's no path
>> to reach such namespace from the kernel.
> 
> Isn't that a pretty normal thing to do, though? Like for an sriov
> situation, a primary controller attaches namespaces to secondaries, but
> not to itself.
Yes correct, but in case of sriov, AFAIK, the "nvme list-secondary ...." shall 
not list any secondary controller unless the secondary controller's corresponding 
VF is enabled (i.e. /sys/class/nvme/nvmeX/device/sriov_numvfs set to at least the 
secondary controller's VF number). 

So it means that before attaching namespace to secondaries, user would first set 
sriov_numvfs. Setting numvfs would then enable the kernel to discover all secondary
controllers. 

However, in the proposed patch we're trying to address a case where "nvme list-ctrl ..."
shows multiple controller entries however only one of those controllers is discovered
by kernel. All controllers shown in the output of "nvme list-ctrl ..." are physical
controllers on the nvme disk. In this case attaching a namespace to any of the 
undiscovered controller has a side effect due which the namespace rendered unusable.  
This patch tries to address this issue by showing a WARNING to the user and optionally
allowing user to cancel the attach operation. However if user still prefers to go 
ahead without cancelling the attach operation then nvme-cli would execute this command
as usual. 
For sriov use case, as secondary controllers have been already discovered by kernel 
before we attach namespace to secondaries, it doesn't have any side effect. 

Thanks,
--Nilay


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH nvme-cli] nvme: warn about attaching a namespace to unknown controller
  2024-06-22 10:35   ` Nilay Shroff
@ 2024-06-24 19:32     ` Keith Busch
  2024-06-25  6:23       ` Nilay Shroff
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2024-06-24 19:32 UTC (permalink / raw)
  To: Nilay Shroff; +Cc: dwagner, linux-nvme, hch, sagi, gjoyce, msmurthy, axboe

On Sat, Jun 22, 2024 at 04:05:01PM +0530, Nilay Shroff wrote:
> 
> 
> On 6/21/24 20:38, Keith Busch wrote:
> > On Thu, Jun 20, 2024 at 06:25:45PM +0530, Nilay Shroff wrote:
> >> Sometime it's possible for a multi-controller NVMe disk to have only
> >> one of its controller discovered by the kernel. And if this happens
> >> then it's also possible for a user to create and then attach a namespace
> >> to a controller which has not been discovered by the kernel. In such a
> >> case the attached namespace can't be used for IO because there's no path
> >> to reach such namespace from the kernel.
> > 
> > Isn't that a pretty normal thing to do, though? Like for an sriov
> > situation, a primary controller attaches namespaces to secondaries, but
> > not to itself.
> Yes correct, but in case of sriov, AFAIK, the "nvme list-secondary ...." shall 
> not list any secondary controller unless the secondary controller's corresponding 
> VF is enabled (i.e. /sys/class/nvme/nvmeX/device/sriov_numvfs set to at least the 
> secondary controller's VF number). 
> 
> So it means that before attaching namespace to secondaries, user would first set 
> sriov_numvfs. Setting numvfs would then enable the kernel to discover all secondary
> controllers. 

You'd usually bind those to vfio, not nvme. So while the kernel knows
those pci functions exist, the nvme driver doesn't necessarily know
about these.

And that's not even an sriov exclusive thing. You can bind other PFs
just the same, though they just don't have the resource provisioning
features that VFs can do. But you can still attach and detach namespaces
to/from them all the same.
 
> However, in the proposed patch we're trying to address a case where "nvme list-ctrl ..."
> shows multiple controller entries however only one of those controllers is discovered
> by kernel. All controllers shown in the output of "nvme list-ctrl ..." are physical
> controllers on the nvme disk. In this case attaching a namespace to any of the 
> undiscovered controller has a side effect due which the namespace rendered unusable.  

Unusable from the driver the admin ran the command from. Some other
machine (real or virtual) may be able to use it, or maybe even the same
machine that you handed some function to a user space driver, and that's
not really an error.

> This patch tries to address this issue by showing a WARNING to the user and optionally
> allowing user to cancel the attach operation. However if user still prefers to go 
> ahead without cancelling the attach operation then nvme-cli would execute this command
> as usual. 

Have you received bug reports of people mistakenly attaching namespaces
to the wrong controller or something?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH nvme-cli] nvme: warn about attaching a namespace to unknown controller
  2024-06-24 19:32     ` Keith Busch
@ 2024-06-25  6:23       ` Nilay Shroff
  0 siblings, 0 replies; 5+ messages in thread
From: Nilay Shroff @ 2024-06-25  6:23 UTC (permalink / raw)
  To: Keith Busch; +Cc: dwagner, linux-nvme, hch, sagi, gjoyce, msmurthy, axboe



On 6/25/24 01:02, Keith Busch wrote:
> On Sat, Jun 22, 2024 at 04:05:01PM +0530, Nilay Shroff wrote:
>>
>>
>> On 6/21/24 20:38, Keith Busch wrote:
>>> On Thu, Jun 20, 2024 at 06:25:45PM +0530, Nilay Shroff wrote:
>>>> Sometime it's possible for a multi-controller NVMe disk to have only
>>>> one of its controller discovered by the kernel. And if this happens
>>>> then it's also possible for a user to create and then attach a namespace
>>>> to a controller which has not been discovered by the kernel. In such a
>>>> case the attached namespace can't be used for IO because there's no path
>>>> to reach such namespace from the kernel.
>>>
>>> Isn't that a pretty normal thing to do, though? Like for an sriov
>>> situation, a primary controller attaches namespaces to secondaries, but
>>> not to itself.
>> Yes correct, but in case of sriov, AFAIK, the "nvme list-secondary ...." shall 
>> not list any secondary controller unless the secondary controller's corresponding 
>> VF is enabled (i.e. /sys/class/nvme/nvmeX/device/sriov_numvfs set to at least the 
>> secondary controller's VF number). 
>>
>> So it means that before attaching namespace to secondaries, user would first set 
>> sriov_numvfs. Setting numvfs would then enable the kernel to discover all secondary
>> controllers. 
> 
> You'd usually bind those to vfio, not nvme. So while the kernel knows
> those pci functions exist, the nvme driver doesn't necessarily know
> about these.
> 
> And that's not even an sriov exclusive thing. You can bind other PFs
> just the same, though they just don't have the resource provisioning
> features that VFs can do. But you can still attach and detach namespaces
> to/from them all the same.
>  
>> However, in the proposed patch we're trying to address a case where "nvme list-ctrl ..."
>> shows multiple controller entries however only one of those controllers is discovered
>> by kernel. All controllers shown in the output of "nvme list-ctrl ..." are physical
>> controllers on the nvme disk. In this case attaching a namespace to any of the 
>> undiscovered controller has a side effect due which the namespace rendered unusable.  
> 
> Unusable from the driver the admin ran the command from. Some other
> machine (real or virtual) may be able to use it, or maybe even the same
> machine that you handed some function to a user space driver, and that's
> not really an error.
> 
Ok got it... I didn't know this topology is also supported!

>> This patch tries to address this issue by showing a WARNING to the user and optionally
>> allowing user to cancel the attach operation. However if user still prefers to go 
>> ahead without cancelling the attach operation then nvme-cli would execute this command
>> as usual. 
> 
> Have you received bug reports of people mistakenly attaching namespaces
> to the wrong controller or something?
Not from customer but it was an internally found defect where someone (mistakenly!) 
attached a namespace to a controller which was not discovered by the kernel and that
caused the namespace remains hidden behind undiscovered controller... 

The "nvme list" command would not show up that hidden namespace, and also 
"nvme list-ns <dev>" wouldn't list that namespace. However later I found that 
the only way to list that hidden namespace was to use "nvme list <dev> -a" 
command.

I think, it might be difficult to gauge what's the intent here when user/admin runs 
a command to attach a namespace. If the intent is to attach namespace behind 
controller/VF which might be usable from other (real/virtual) machine then it's OK. 
However if the intent is to use the namespace on the same host machine and using the 
same kernel driver where the command is being executed then there could be an issue. 
Any comment...?

Thanks,
--Nilay 


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-06-25  6:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-20 12:55 [PATCH nvme-cli] nvme: warn about attaching a namespace to unknown controller Nilay Shroff
2024-06-21 15:08 ` Keith Busch
2024-06-22 10:35   ` Nilay Shroff
2024-06-24 19:32     ` Keith Busch
2024-06-25  6:23       ` Nilay Shroff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox