* nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
@ 2024-05-06 14:15 Nilay Shroff
  2024-05-06 15:05 ` Daniel Wagner
  0 siblings, 1 reply; 10+ messages in thread

From: Nilay Shroff @ 2024-05-06 14:15 UTC (permalink / raw)
To: linux-nvme@lists.infradead.org
Cc: Keith Busch, Daniel Wagner, Christoph Hellwig, Sagi Grimberg,
    Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

Hi,

I have an NVMe disk with multi-controller support. This disk has two controllers; however, the kernel could only discover one of them. On this disk, if we create a namespace and attach it to a controller which is not discovered by the kernel, then that namespace can't be used for IO. In fact, there's no path to reach such a namespace from the kernel. Please find the details below:

# lspci
0018:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM173X

# nvme list -v
Subsystem        Subsystem-NQN                                                 Controllers
---------------- ------------------------------------------------------------ -----------
nvme-subsys0     nqn.1994-11.com.samsung:nvme:PM1735:2.5-inch:S6EUNA0R500358   nvme0

Device   SN              MN                       FR        TxPort Address        Slot Subsystem    Namespaces
-------- --------------- ------------------------ --------- ------ -------------- ---- ------------ ----------
nvme0    S6EUNA0R500358  1.6TB NVMe Gen4 U.2 SSD  REV.SN49  pcie   0018:01:00.0        nvme-subsys0

Device       Generic      NSID       Usage                      Format           Controllers
------------ ------------ ---------- -------------------------- ---------------- ----------------

# nvme id-ctrl /dev/nvme0 -H
NVME Identify Controller:
vid       : 0x144d
ssvid     : 0x1014
sn        : S6EUNA0R500358
mn        : 1.6TB NVMe Gen4 U.2 SSD
fr        : REV.SN49
rab       : 8
ieee      : 002538
cmic      : 0x3
  [3:3] : 0     ANA not supported
  [2:2] : 0     PCI
  [1:1] : 0x1   Multi Controller   <== this is a multi-controller disk
  [0:0] : 0x1   Multi Port
<snip>

# nvme list-ctrl /dev/nvme0
num of ctrls present: 2
[   0]:0x41
[   1]:0x42

# cat /sys/class/nvme/nvme0/cntlid
65

So it's apparent from the above output that the kernel has discovered one controller (nvme0) with cntlid 65 (0x41). The other controller, with cntlid 0x42, remains undiscovered. Please note that even though the kernel couldn't discover both disk controllers, it's still possible to create a namespace and attach it to a controller which is not discovered by the kernel.

# nvme create-ns /dev/nvme0 --nsze=0x1749A956 --ncap=0x1749A956 --block-size=4096
create-ns: Success, created nsid:1

Now, attach the namespace to a controller which is not discovered/known by the kernel:

# nvme attach-ns /dev/nvme0 -c 0x42 -n 1
attach-ns: Success, nsid:1

# nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------

nvme list doesn't show any namespace, because the namespace is attached to a controller which is not discovered by the kernel. The kernel doesn't have any path to access this namespace, so in essence the namespace can't be used for any IO.

Since the namespace list is empty, one might try creating another namespace; that fails with the error below:

# nvme create-ns /dev/nvme0 --nsze=0x1749A956 --ncap=0x1749A956 --block-size=4096
NVMe status: Namespace Insufficient Capacity: Creating the namespace requires more free space than is currently available (0x2115)

# nvme id-ctrl /dev/nvme0 -H
NVME Identify Controller:
vid       : 0x144d
ssvid     : 0x1014
sn        : S6EUNA0R500358
mn        : 1.6TB NVMe Gen4 U.2 SSD
<snip>
tnvmcap   : 1600321314816
  [127:0] : 1600321314816   Total NVM Capacity (TNVMCAP)
unvmcap   : 0
  [127:0] : 0               Unallocated NVM Capacity (UNVMCAP)
<snip>

The above output shows that all available capacity of the disk has been allocated, yet the nvme list output is empty. Please note that "kernel is unable to discover both NVMe controllers" is not a kernel bug.
Also, I don't think this is a bug in the NVMe disk or its firmware. This multi-controller NVMe disk has a "DualPortEn#" pin, and by default the disk is configured to operate as a single x4 port. Asserting the "DualPortEn#" pin enables dual x2 port mode. When the disk operates in dual x2 mode, the kernel can discover both controllers. However, when the "DualPortEn#" pin is not asserted, the disk enables only one of the controllers, and hence the kernel can discover only one controller.

There could be multiple ways to address this issue; however, my proposal would be to address it in nvme-cli. As nvme-cli builds the nvme topology, it should be easy for nvme-cli to detect this anomaly when a namespace is being attached to an undiscovered nvme controller and show a WARNING message. The WARNING could be shown for ~10 seconds, giving the user enough time to cancel the operation. If the operation is not cancelled, nvme-cli would proceed as usual.

Please help review; as usual, comments/feedback/suggestions are most welcome!

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-06 14:15 nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk Nilay Shroff
@ 2024-05-06 15:05 ` Daniel Wagner
  2024-05-07 11:44   ` Nilay Shroff
  0 siblings, 1 reply; 10+ messages in thread

From: Daniel Wagner @ 2024-05-06 15:05 UTC (permalink / raw)
To: Nilay Shroff
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On Mon, May 06, 2024 at 07:45:20PM GMT, Nilay Shroff wrote:
> There could be multiple ways to address this issue. However my proposal would be to address
> it in nvme-cli. As nvme-cli builds the nvme topology it shall be easy
> for nvme-cli to detect

The topology scan in libnvme only uses the information available in sysfs. libnvme doesn't issue any commands anymore, and I'd like to keep it this way. So if the kernel doesn't expose this information to userspace via sysfs, I don't think it's simple to 'fix' this in nvme-cli/libnvme.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-06 15:05 ` Daniel Wagner
@ 2024-05-07 11:44   ` Nilay Shroff
  2024-05-16  6:25     ` Nilay Shroff
  2024-06-20  6:52     ` Daniel Wagner
  0 siblings, 2 replies; 10+ messages in thread

From: Nilay Shroff @ 2024-05-07 11:44 UTC (permalink / raw)
To: Daniel Wagner
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On 5/6/24 20:35, Daniel Wagner wrote:
> On Mon, May 06, 2024 at 07:45:20PM GMT, Nilay Shroff wrote:
>> There could be multiple ways to address this issue. However my proposal would be to address
>> it in nvme-cli. As nvme-cli builds the nvme topology it shall be easy
>> for nvme-cli to detect
>
> The topology scan in libnvme only uses the information available in
> sysfs. libnvme doesn't issue any commands anymore, and I'd like to keep
> it this way. So if the kernel doesn't expose this information
> to userspace via sysfs, I don't think it's simple to 'fix' this in
> nvme-cli/libnvme.
>
I think the information we need to detect this issue is already available through sysfs. If we scan the nvme topology, we can find the controller id assigned to each controller under each nvme subsystem. We can then leverage this information to figure out whether each controller id specified in the attach-ns command is valid or not. So essentially, we match the controller ids specified in the attach-ns command against the controller ids learnt by scanning the topology. If we find any discrepancy, we can show a WARNING to the user. I have worked out a patch using this method and attached it below for suggestions/feedback.
diff --git a/nvme.c b/nvme.c
index c1d4352a..533cc390 100644
--- a/nvme.c
+++ b/nvme.c
@@ -2784,6 +2784,23 @@ static int delete_ns(int argc, char **argv, struct command *cmd, struct plugin *
 	return err;
 }
 
+static bool nvme_match_subsys_device_filter(nvme_subsystem_t s, nvme_ctrl_t c,
+					    nvme_ns_t ns, void *f_arg)
+{
+	nvme_ctrl_t _c;
+	const char *devname = (const char *)f_arg;
+
+	if (s) {
+		nvme_subsystem_for_each_ctrl(s, _c) {
+			if (!strcmp(devname, nvme_ctrl_get_name(_c)))
+				return true;
+		}
+		return false;
+	}
+
+	return true;
+}
+
 static int nvme_attach_ns(int argc, char **argv, int attach, const char *desc, struct command *cmd)
 {
 	_cleanup_free_ struct nvme_ctrl_list *cntlist = NULL;
@@ -2839,12 +2856,68 @@ static int nvme_attach_ns(int argc, char **argv, int attach, const char *desc, s
 
 	nvme_init_ctrl_list(cntlist, num, ctrlist);
 
-	if (attach)
+	if (attach) {
+		const char *cntlid;
+		int __cntlid;
+		char *p;
+		nvme_host_t h;
+		nvme_subsystem_t s;
+		nvme_ctrl_t c;
+		nvme_root_t r = NULL;
+		int matched = 0;
+		nvme_scan_filter_t filter = nvme_match_subsys_device_filter;
+
+		r = nvme_create_root(stderr, log_level);
+		if (!r) {
+			nvme_show_error("Failed to create topology root: %s",
+					nvme_strerror(errno));
+			return -errno;
+		}
+
+		err = nvme_scan_topology(r, filter, (void *)dev->name);
+		if (err < 0) {
+			if (errno != ENOENT)
+				nvme_show_error("Failed to scan topology: %s",
+						nvme_strerror(errno));
+			nvme_free_tree(r);
+			return err;
+		}
+		nvme_for_each_host(r, h) {
+			nvme_for_each_subsystem(h, s) {
+				nvme_subsystem_for_each_ctrl(s, c) {
+					cntlid = nvme_ctrl_get_cntlid(c);
+					errno = 0;
+					__cntlid = strtoul(cntlid, &p, 0);
+					if (errno || *p != 0)
+						continue;
+					for (i = 0; i < num; i++) {
+						if (__cntlid == list[i])
+							matched++;
+					}
+				}
+			}
+		}
+
+		nvme_free_tree(r);
+
+		if (matched != num) {
+			fprintf(stderr,
+				"You are about to attach namespace 0x%x to an undiscovered nvme controller.\n",
+				cfg.namespace_id);
+			fprintf(stderr,
+				"WARNING: Attaching a namespace to an undiscovered nvme controller may have undesired side effects!\n"
+				"You may not be able to perform any IO to such a namespace.\n"
+				"You have 10 seconds to press Ctrl-C to cancel this operation.\n\n");
+			sleep(10);
+			fprintf(stderr, "Sending attach-ns operation ...\n");
+		}
+
 		err = nvme_cli_ns_attach_ctrls(dev, cfg.namespace_id,
 					       cntlist);
-	else
+	} else {
 		err = nvme_cli_ns_detach_ctrls(dev, cfg.namespace_id,
 					       cntlist);
+	}
 
 	if (!err)
 		printf("%s: Success, nsid:%d\n", cmd->name, cfg.namespace_id);

Thanks,
--Nilay

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-07 11:44 ` Nilay Shroff
@ 2024-05-16  6:25   ` Nilay Shroff
  2024-05-27 11:20     ` Nilay Shroff
  2024-06-20  6:52   ` Daniel Wagner
  1 sibling, 1 reply; 10+ messages in thread

From: Nilay Shroff @ 2024-05-16 6:25 UTC (permalink / raw)
To: Daniel Wagner
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On 5/7/24 17:14, Nilay Shroff wrote:
>
> On 5/6/24 20:35, Daniel Wagner wrote:
>> On Mon, May 06, 2024 at 07:45:20PM GMT, Nilay Shroff wrote:
>>> There could be multiple ways to address this issue. However my proposal would be to address
>>> it in nvme-cli. As nvme-cli builds the nvme topology it shall be easy
>>> for nvme-cli to detect
>>
>> The topology scan in libnvme only uses the information available in
>> sysfs. libnvme doesn't issue any commands anymore, and I'd like to keep
>> it this way. So if the kernel doesn't expose this information
>> to userspace via sysfs, I don't think it's simple to 'fix' this in
>> nvme-cli/libnvme.
>>
> I think the information we need to detect this issue is already
> available through sysfs. If we scan the nvme topology, we can find
> the controller id assigned to each controller under each nvme subsystem.
> We can then leverage this information to figure out whether each controller
> id specified in the attach-ns command is valid or not. So essentially, we
> match the controller ids specified in the attach-ns command against the
> controller ids learnt by scanning the topology. If we find any discrepancy,
> we can show a WARNING to the user. I have worked out a patch using this
> method and attached it below for suggestions/feedback.
>
> <snip proposed patch>

Gentle ping...

If there's no objection to the above proposed patch then may I send a formal patch?

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-16  6:25 ` Nilay Shroff
@ 2024-05-27 11:20   ` Nilay Shroff
  2024-05-27 13:36     ` Daniel Wagner
  0 siblings, 1 reply; 10+ messages in thread

From: Nilay Shroff @ 2024-05-27 11:20 UTC (permalink / raw)
To: Daniel Wagner
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On 5/16/24 11:55, Nilay Shroff wrote:
>
> On 5/7/24 17:14, Nilay Shroff wrote:
>>
>> On 5/6/24 20:35, Daniel Wagner wrote:
>>> On Mon, May 06, 2024 at 07:45:20PM GMT, Nilay Shroff wrote:
>>>> There could be multiple ways to address this issue. However my proposal would be to address
>>>> it in nvme-cli. As nvme-cli builds the nvme topology it shall be easy
>>>> for nvme-cli to detect
>>>
>>> The topology scan in libnvme only uses the information available in
>>> sysfs. libnvme doesn't issue any commands anymore, and I'd like to keep
>>> it this way. So if the kernel doesn't expose this information
>>> to userspace via sysfs, I don't think it's simple to 'fix' this in
>>> nvme-cli/libnvme.
>>>
>> I think the information we need to detect this issue is already
>> available through sysfs. If we scan the nvme topology, we can find
>> the controller id assigned to each controller under each nvme subsystem.
>> We can then leverage this information to figure out whether each controller
>> id specified in the attach-ns command is valid or not. So essentially, we
>> match the controller ids specified in the attach-ns command against the
>> controller ids learnt by scanning the topology. If we find any discrepancy,
>> we can show a WARNING to the user. I have worked out a patch using this
>> method and attached it below for suggestions/feedback.
>>
>> <snip proposed patch>
>>
> Gentle ping...
>
> If there's no objection to the above proposed patch then may I send a formal patch?
>

Hi Daniel,

A gentle ping, if the proposed patch looks good then may I send a formal patch?

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-27 11:20 ` Nilay Shroff
@ 2024-05-27 13:36   ` Daniel Wagner
  2024-05-27 14:02     ` Nilay Shroff
  0 siblings, 1 reply; 10+ messages in thread

From: Daniel Wagner @ 2024-05-27 13:36 UTC (permalink / raw)
To: Nilay Shroff
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On Mon, May 27, 2024 at 04:50:34PM GMT, Nilay Shroff wrote:
> A gentle ping, if the proposed patch looks good then may I send a formal patch?

Sorry for the delay, very busy month. I'll look into it next week.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-27 13:36 ` Daniel Wagner
@ 2024-05-27 14:02   ` Nilay Shroff
  2024-06-19 10:49     ` Nilay Shroff
  0 siblings, 1 reply; 10+ messages in thread

From: Nilay Shroff @ 2024-05-27 14:02 UTC (permalink / raw)
To: Daniel Wagner
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On 5/27/24 19:06, Daniel Wagner wrote:
> On Mon, May 27, 2024 at 04:50:34PM GMT, Nilay Shroff wrote:
>> A gentle ping, if the proposed patch looks good then may I send a formal patch?
>
> Sorry for the delay, very busy month. I'll look into it next week.
>
Sure, no problem...

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-27 14:02 ` Nilay Shroff
@ 2024-06-19 10:49   ` Nilay Shroff
  0 siblings, 0 replies; 10+ messages in thread

From: Nilay Shroff @ 2024-06-19 10:49 UTC (permalink / raw)
To: Daniel Wagner
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On 5/27/24 19:32, Nilay Shroff wrote:
>
> On 5/27/24 19:06, Daniel Wagner wrote:
>> On Mon, May 27, 2024 at 04:50:34PM GMT, Nilay Shroff wrote:
>>> A gentle ping, if the proposed patch looks good then may I send a formal patch?
>>
>> Sorry for the delay, very busy month. I'll look into it next week.
>>

Hi Daniel,

A gentle ping, just wanted to check whether you have got a chance to review this change?

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-05-07 11:44 ` Nilay Shroff
  2024-05-16  6:25   ` Nilay Shroff
@ 2024-06-20  6:52   ` Daniel Wagner
  2024-06-20  9:02     ` Nilay Shroff
  1 sibling, 1 reply; 10+ messages in thread

From: Daniel Wagner @ 2024-06-20 6:52 UTC (permalink / raw)
To: Nilay Shroff
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On Tue, May 07, 2024 at 05:14:40PM GMT, Nilay Shroff wrote:
> I think the information we need to detect this issue is already
> available through sysfs. If we scan the nvme topology, we can find
> the controller id assigned to each controller under each nvme subsystem.
> We can then leverage this information to figure out whether each controller
> id specified in the attach-ns command is valid or not. So essentially, we
> match the controller ids specified in the attach-ns command against the
> controller ids learnt by scanning the topology. If we find any discrepancy,
> we can show a WARNING to the user. I have worked out a patch using this
> method and attached it below for suggestions/feedback.

So this is just about issuing a warning. That's fine. I thought you wanted to do something more 'difficult'.

> diff --git a/nvme.c b/nvme.c
> index c1d4352a..533cc390 100644
> --- a/nvme.c
> +++ b/nvme.c
> @@ -2784,6 +2784,23 @@ static int delete_ns(int argc, char **argv, struct command *cmd, struct plugin *
>  	return err;
>  }
> 
> +static bool nvme_match_subsys_device_filter(nvme_subsystem_t s, nvme_ctrl_t c,
> +					    nvme_ns_t ns, void *f_arg)
> +{
> +	nvme_ctrl_t _c;
> +	const char *devname = (const char *)f_arg;
> +
> +	if (s) {

Use the early-return pattern to avoid long indents, e.g.

	if (!s)
		return false;

> +		nvme_subsystem_for_each_ctrl(s, _c) {
> +			if (!strcmp(devname, nvme_ctrl_get_name(_c)))
> +				return true;
> +		}
> +		return false;
> +	}
> +
> +	return true;
> +}
>
> <snip>
>
> +		if (matched != num) {

Wouldn't this check also trigger when the user wants to attach to the discovered controller?

I would also suggest moving this check into a helper function in order to keep this function a bit more readable.

> <snip>

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk
  2024-06-20  6:52 ` Daniel Wagner
@ 2024-06-20  9:02   ` Nilay Shroff
  0 siblings, 0 replies; 10+ messages in thread

From: Nilay Shroff @ 2024-06-20 9:02 UTC (permalink / raw)
To: Daniel Wagner
Cc: linux-nvme@lists.infradead.org, Keith Busch, Christoph Hellwig,
    Sagi Grimberg, Gregory Joyce, Srimannarayana Murthy Maram, axboe@fb.com

On 6/20/24 12:22, Daniel Wagner wrote:
> On Tue, May 07, 2024 at 05:14:40PM GMT, Nilay Shroff wrote:
>> I think the information we need to detect this issue is already
>> available through sysfs.
>> <snip>
>
> So this is just about issuing a warning. That's fine. I thought you
> wanted to do something more 'difficult'.
>
>> +static bool nvme_match_subsys_device_filter(nvme_subsystem_t s, nvme_ctrl_t c,
>> +					    nvme_ns_t ns, void *f_arg)
>> +{
>> <snip>
>> +	if (s) {
>
> Use the early-return pattern to avoid long indents, e.g.
>
> 	if (!s)
> 		return false;
>
OK, will do.

>> <snip>
>> +		if (matched != num) {
>
> Wouldn't this check also trigger when the user wants to attach to the
> discovered controller?
>
Yes, the check runs in that case too; however, when the user attaches to a discovered controller, every specified controller id is found in the scanned topology, so matched == num, the condition evaluates to false, and we don't enter the warning path.

> I would also suggest moving this check into a helper function in order
> to keep this function a bit more readable.
>
Yeah, I will create a helper and call it from nvme_attach_ns().

I think I will send a formal patch now with the above comments incorporated, and we can discuss further if there are any additional comments.

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 10+ messages in thread
end of thread, other threads: [~2024-06-20  9:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
2024-05-06 14:15 nvme-cli : attaching a namespace to an undiscovered nvme controller on multi-controller nvme disk Nilay Shroff
2024-05-06 15:05 ` Daniel Wagner
2024-05-07 11:44   ` Nilay Shroff
2024-05-16  6:25     ` Nilay Shroff
2024-05-27 11:20       ` Nilay Shroff
2024-05-27 13:36         ` Daniel Wagner
2024-05-27 14:02           ` Nilay Shroff
2024-06-19 10:49             ` Nilay Shroff
2024-06-20  6:52     ` Daniel Wagner
2024-06-20  9:02       ` Nilay Shroff