* [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock @ 2026-03-02 22:25 Keith Busch 2026-03-02 22:25 ` [RFC-PATCH 2/2] nvme: use the namespace id for block device names Keith Busch 2026-03-03 7:29 ` [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock Hannes Reinecke 0 siblings, 2 replies; 8+ messages in thread From: Keith Busch @ 2026-03-02 22:25 UTC (permalink / raw) To: linux-nvme, hch, mlombard; +Cc: jmeneghi, hare, Keith Busch From: Keith Busch <kbusch@kernel.org> To maintain symmetry with the disk add side, perform the disk deletion under the nvme subsystem lock. This will ensure consistent ordering of device naming when adding namespaces while concurrently deleting stale references. The deferred removal requires special attention to the nvme_subsystem: the head may be holding the final reference in the deferred path, so take a reference on it while removing the head. Signed-off-by: Keith Busch <kbusch@kernel.org> --- drivers/nvme/host/core.c | 3 +-- drivers/nvme/host/multipath.c | 28 +++++++++++----------------- drivers/nvme/host/nvme.h | 1 + 3 files changed, 13 insertions(+), 19 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index a5c1af443420b..7a558d5103a21 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -147,7 +147,6 @@ static const struct class nvme_ns_chr_class = { .name = "nvme-generic", }; -static void nvme_put_subsystem(struct nvme_subsystem *subsys); static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl, unsigned nsid); static void nvme_update_keep_alive(struct nvme_ctrl *ctrl, @@ -3119,7 +3118,7 @@ static void nvme_destroy_subsystem(struct kref *ref) put_device(&subsys->dev); } -static void nvme_put_subsystem(struct nvme_subsystem *subsys) +void nvme_put_subsystem(struct nvme_subsystem *subsys) { kref_put(&subsys->ref, nvme_destroy_subsystem); } diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index fc6800a9f7f94..02a50181d1dd6 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -682,6 +682,8 @@ static void nvme_requeue_work(struct work_struct *work) static void nvme_remove_head(struct nvme_ns_head *head) { + list_del_init(&head->entry); + if (test_and_clear_bit(NVME_NSHEAD_DISK_LIVE, &head->flags)) { /* * requeue I/O after NVME_NSHEAD_DISK_LIVE has been cleared @@ -700,16 +702,14 @@ static void nvme_remove_head_work(struct work_struct *work) { struct nvme_ns_head *head = container_of(to_delayed_work(work), struct nvme_ns_head, remove_work); - bool remove = false; + struct nvme_subsystem *subsys = head->subsys; - mutex_lock(&head->subsys->lock); - if (list_empty(&head->list)) { - list_del_init(&head->entry); - remove = true; - } - mutex_unlock(&head->subsys->lock); - if (remove) + kref_get(&subsys->ref); + mutex_lock(&subsys->lock); + if (list_empty(&head->list)) nvme_remove_head(head); + mutex_unlock(&subsys->lock); + nvme_put_subsystem(subsys); module_put(THIS_MODULE); } @@ -1292,8 +1292,6 @@ void nvme_mpath_add_disk(struct nvme_ns *ns, __le32 anagrpid) void nvme_mpath_remove_disk(struct nvme_ns_head *head) { - bool remove = false; - if (!head->disk) return; @@ -1314,17 +1312,13 @@ void nvme_mpath_remove_disk(struct nvme_ns_head *head) * Ensure that no one could remove this module while the head * remove work is pending. */ - if (head->delayed_removal_secs && try_module_get(THIS_MODULE)) { + if (head->delayed_removal_secs && try_module_get(THIS_MODULE)) mod_delayed_work(nvme_wq, &head->remove_work, head->delayed_removal_secs * HZ); - } else { - list_del_init(&head->entry); - remove = true; - } + else + nvme_remove_head(head); out: mutex_unlock(&head->subsys->lock); - if (remove) - nvme_remove_head(head); } void nvme_mpath_put_disk(struct nvme_ns_head *head) diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 44801801fc289..ed8ce356c363e 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -1283,6 +1283,7 @@ struct nvme_ctrl *nvme_ctrl_from_file(struct file *file); struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned nsid); bool nvme_get_ns(struct nvme_ns *ns); void nvme_put_ns(struct nvme_ns *ns); +void nvme_put_subsystem(struct nvme_subsystem *subsys); static inline bool nvme_multi_css(struct nvme_ctrl *ctrl) { -- 2.47.3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [RFC-PATCH 2/2] nvme: use the namespace id for block device names 2026-03-02 22:25 [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock Keith Busch @ 2026-03-02 22:25 ` Keith Busch 2026-03-03 7:34 ` Hannes Reinecke 2026-03-03 7:29 ` [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock Hannes Reinecke 1 sibling, 1 reply; 8+ messages in thread From: Keith Busch @ 2026-03-02 22:25 UTC (permalink / raw) To: linux-nvme, hch, mlombard; +Cc: jmeneghi, hare, Keith Busch From: Keith Busch <kbusch@kernel.org> We can now assume the ns_id provides a unique number within a subsystem when creating a namespace kobject. Use that rather than pulling an available bit from an ida, which becomes unnecessary once we use the controller's reported identifier. The user observable change is that the suffix of the nvme namespace device handle name will always be consistent regardless of which controller is scanned first or what the namespace attachment state is for a given controller being scanned. The prefix will still be dependent on the order the controllers were connected/probed. Signed-off-by: Keith Busch <kbusch@kernel.org> --- drivers/nvme/host/core.c | 19 +++++-------------- drivers/nvme/host/multipath.c | 4 ++-- drivers/nvme/host/nvme.h | 2 -- 3 files changed, 7 insertions(+), 18 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 7a558d5103a21..3f2f9b2be87c2 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -667,7 +667,6 @@ static void nvme_free_ns_head(struct kref *ref) container_of(ref, struct nvme_ns_head, ref); nvme_mpath_put_disk(head); - ida_free(&head->subsys->ns_ida, head->instance); cleanup_srcu_struct(&head->srcu); nvme_put_subsystem(head->subsys); kfree(head->plids); @@ -3113,7 +3112,6 @@ static void nvme_destroy_subsystem(struct kref *ref) list_del(&subsys->entry); mutex_unlock(&nvme_subsystems_lock); - ida_destroy(&subsys->ns_ida); device_del(&subsys->dev); put_device(&subsys->dev); } @@ -3257,7 +3255,6 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id) put_device(&subsys->dev); goto out_unlock; } - ida_init(&subsys->ns_ida); list_add_tail(&subsys->entry, &nvme_subsystems); } @@ -3868,7 +3865,7 @@ static int nvme_add_ns_cdev(struct nvme_ns *ns) ns->cdev_device.parent = ns->ctrl->device; ret = dev_set_name(&ns->cdev_device, "ng%dn%d", - ns->ctrl->instance, ns->head->instance); + ns->ctrl->instance, ns->head->ns_id); if (ret) return ret; @@ -3890,14 +3887,10 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl, head = kzalloc(size, GFP_KERNEL); if (!head) goto out; - ret = ida_alloc_min(&ctrl->subsys->ns_ida, 1, GFP_KERNEL); - if (ret < 0) - goto out_free_head; - head->instance = ret; INIT_LIST_HEAD(&head->list); ret = init_srcu_struct(&head->srcu); if (ret) - goto out_ida_remove; + goto out_free_head; head->subsys = ctrl->subsys; head->ns_id = info->nsid; head->ids = info->ids; @@ -3925,8 +3918,6 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl, return head; out_cleanup_srcu: cleanup_srcu_struct(&head->srcu); -out_ida_remove: - ida_free(&ctrl->subsys->ns_ida, head->instance); out_free_head: kfree(head); out: @@ -4162,14 +4153,14 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info) */ if (nvme_ns_head_multipath(ns->head)) { sprintf(disk->disk_name, "nvme%dc%dn%d", ctrl->subsys->instance, - ctrl->instance, ns->head->instance); + ctrl->instance, ns->head->ns_id); disk->flags |= GENHD_FL_HIDDEN; } else if (multipath) { sprintf(disk->disk_name, "nvme%dn%d", ctrl->subsys->instance, - ns->head->instance); + ns->head->ns_id); } else { sprintf(disk->disk_name, "nvme%dn%d", ctrl->instance, - ns->head->instance); + ns->head->ns_id); } if (nvme_update_ns_info(ns, info)) diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 02a50181d1dd6..8715b04ff92a8 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -640,7 +640,7 @@ static int nvme_add_ns_head_cdev(struct nvme_ns_head *head) head->cdev_device.parent = &head->subsys->dev; ret = dev_set_name(&head->cdev_device, "ng%dn%d", - head->subsys->instance, head->instance); + head->subsys->instance, head->ns_id); if (ret) return ret; ret = nvme_cdev_add(&head->cdev, &head->cdev_device, @@ -767,7 +767,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head) */ set_bit(GD_SUPPRESS_PART_SCAN, &head->disk->state); sprintf(head->disk->disk_name, "nvme%dn%d", - ctrl->subsys->instance, head->instance); + ctrl->subsys->instance, head->ns_id); nvme_tryget_ns_head(head); return 0; } diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index ed8ce356c363e..142a5684205a8 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -499,7 +499,6 @@ struct nvme_subsystem { u8 cmic; enum nvme_subsys_type subtype; u16 vendor_id; - struct ida ns_ida; #ifdef CONFIG_NVME_MULTIPATH enum nvme_iopolicy iopolicy; #endif @@ -540,7 +539,6 @@ struct nvme_ns_head { struct nvme_effects_log *effects; u64 nuse; unsigned ns_id; - int instance; #ifdef CONFIG_BLK_DEV_ZONED u64 zsze; #endif -- 2.47.3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [RFC-PATCH 2/2] nvme: use the namespace id for block device names 2026-03-02 22:25 ` [RFC-PATCH 2/2] nvme: use the namespace id for block device names Keith Busch @ 2026-03-03 7:34 ` Hannes Reinecke 2026-03-03 15:39 ` Keith Busch 0 siblings, 1 reply; 8+ messages in thread From: Hannes Reinecke @ 2026-03-03 7:34 UTC (permalink / raw) To: Keith Busch, linux-nvme, hch, mlombard; +Cc: jmeneghi, Keith Busch On 3/2/26 23:25, Keith Busch wrote: > From: Keith Busch <kbusch@kernel.org> > > We can now assume the ns_id provides a unique number within a subsystem > when creating a namespace kobject. Use that rather than pulling an > available bit from an ida, which becomes unnecessary once we use the > controller's reported identifier. > > The user observable change is that the suffix of the nvme namespace > device handle name will always be consistent regardless of which > controller is scanned first or what the namespace attachment state is > for a given controller being scanned. The prefix will still be dependent > on the order the controllers were connected/probed. > > Signed-off-by: Keith Busch <kbusch@kernel.org> > --- > drivers/nvme/host/core.c | 19 +++++-------------- > drivers/nvme/host/multipath.c | 4 ++-- > drivers/nvme/host/nvme.h | 2 -- > 3 files changed, 7 insertions(+), 18 deletions(-) > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index 7a558d5103a21..3f2f9b2be87c2 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -667,7 +667,6 @@ static void nvme_free_ns_head(struct kref *ref) > container_of(ref, struct nvme_ns_head, ref); > > nvme_mpath_put_disk(head); > - ida_free(&head->subsys->ns_ida, head->instance); > cleanup_srcu_struct(&head->srcu); > nvme_put_subsystem(head->subsys); > kfree(head->plids); > @@ -3113,7 +3112,6 @@ static void nvme_destroy_subsystem(struct kref *ref) > list_del(&subsys->entry); > mutex_unlock(&nvme_subsystems_lock); > > - ida_destroy(&subsys->ns_ida); > device_del(&subsys->dev); > put_device(&subsys->dev); > } > @@ -3257,7 +3255,6 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id) > put_device(&subsys->dev); > goto out_unlock; > } > - ida_init(&subsys->ns_ida); > list_add_tail(&subsys->entry, &nvme_subsystems); > } > > @@ -3868,7 +3865,7 @@ static int nvme_add_ns_cdev(struct nvme_ns *ns) > > ns->cdev_device.parent = ns->ctrl->device; > ret = dev_set_name(&ns->cdev_device, "ng%dn%d", > - ns->ctrl->instance, ns->head->instance); > + ns->ctrl->instance, ns->head->ns_id); > if (ret) > return ret; > > @@ -3890,14 +3887,10 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl, > head = kzalloc(size, GFP_KERNEL); > if (!head) > goto out; > - ret = ida_alloc_min(&ctrl->subsys->ns_ida, 1, GFP_KERNEL); > - if (ret < 0) > - goto out_free_head; > - head->instance = ret; > INIT_LIST_HEAD(&head->list); > ret = init_srcu_struct(&head->srcu); > if (ret) > - goto out_ida_remove; > + goto out_free_head; > head->subsys = ctrl->subsys; > head->ns_id = info->nsid; > head->ids = info->ids; > @@ -3925,8 +3918,6 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl, > return head; > out_cleanup_srcu: > cleanup_srcu_struct(&head->srcu); > -out_ida_remove: > - ida_free(&ctrl->subsys->ns_ida, head->instance); > out_free_head: > kfree(head); > out: > @@ -4162,14 +4153,14 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info) > */ > if (nvme_ns_head_multipath(ns->head)) { > sprintf(disk->disk_name, "nvme%dc%dn%d", ctrl->subsys->instance, > - ctrl->instance, ns->head->instance); > + ctrl->instance, ns->head->ns_id); > disk->flags |= GENHD_FL_HIDDEN; > } else if (multipath) { > sprintf(disk->disk_name, "nvme%dn%d", ctrl->subsys->instance, > - ns->head->instance); > + ns->head->ns_id); > } else { > sprintf(disk->disk_name, "nvme%dn%d", ctrl->instance, > - ns->head->instance); > + ns->head->ns_id); > } > > if (nvme_update_ns_info(ns, info)) > diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c > index 02a50181d1dd6..8715b04ff92a8 100644 > --- a/drivers/nvme/host/multipath.c > +++ b/drivers/nvme/host/multipath.c > @@ -640,7 +640,7 @@ static int nvme_add_ns_head_cdev(struct nvme_ns_head *head) > > head->cdev_device.parent = &head->subsys->dev; > ret = dev_set_name(&head->cdev_device, "ng%dn%d", > - head->subsys->instance, head->instance); > + head->subsys->instance, head->ns_id); > if (ret) > return ret; > ret = nvme_cdev_add(&head->cdev, &head->cdev_device, > @@ -767,7 +767,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head) > */ > set_bit(GD_SUPPRESS_PART_SCAN, &head->disk->state); > sprintf(head->disk->disk_name, "nvme%dn%d", > - ctrl->subsys->instance, head->instance); > + ctrl->subsys->instance, head->ns_id); > nvme_tryget_ns_head(head); > return 0; > } > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h > index ed8ce356c363e..142a5684205a8 100644 > --- a/drivers/nvme/host/nvme.h > +++ b/drivers/nvme/host/nvme.h > @@ -499,7 +499,6 @@ struct nvme_subsystem { > u8 cmic; > enum nvme_subsys_type subtype; > u16 vendor_id; > - struct ida ns_ida; > #ifdef CONFIG_NVME_MULTIPATH > enum nvme_iopolicy iopolicy; > #endif > @@ -540,7 +539,6 @@ struct nvme_ns_head { > struct nvme_effects_log *effects; > u64 nuse; > unsigned ns_id; > - int instance; > #ifdef CONFIG_BLK_DEV_ZONED > u64 zsze; > #endif The idea is nice, and I would love to go into that direction. But have you checked how this holds up under rescan/remapping (eg things like blktest/nvme/058)? Removal of the sysfs nodes might be delayed, and we cannot create new entries with the same name until then. So if that is taken care of, fine, but I don't see that in the patch ... Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC-PATCH 2/2] nvme: use the namespace id for block device names 2026-03-03 7:34 ` Hannes Reinecke @ 2026-03-03 15:39 ` Keith Busch 2026-03-03 16:53 ` Keith Busch 0 siblings, 1 reply; 8+ messages in thread From: Keith Busch @ 2026-03-03 15:39 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Keith Busch, linux-nvme, hch, mlombard, jmeneghi On Tue, Mar 03, 2026 at 08:34:38AM +0100, Hannes Reinecke wrote: > The idea is nice, and I would love to go into that > direction. > But have you checked how this holds up under > rescan/remapping (eg things like blktest/nvme/058)? > Removal of the sysfs nodes might be delayed, and we cannot > create new entries with the same name until then. > So if that is taken care of, fine, but I don't see that in > the patch ... I see. I set out to ensure everything was ordered, but apparently I've missed this case. Thanks for pointing out the test. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC-PATCH 2/2] nvme: use the namespace id for block device names 2026-03-03 15:39 ` Keith Busch @ 2026-03-03 16:53 ` Keith Busch 2026-03-04 11:55 ` Nilay Shroff 0 siblings, 1 reply; 8+ messages in thread From: Keith Busch @ 2026-03-03 16:53 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Keith Busch, linux-nvme, hch, mlombard, jmeneghi On Tue, Mar 03, 2026 at 08:39:56AM -0700, Keith Busch wrote: > On Tue, Mar 03, 2026 at 08:34:38AM +0100, Hannes Reinecke wrote: > > The idea is nice, and I would love to go into that > > direction. > > But have you checked how this holds up under > > rescan/remapping (eg things like blktest/nvme/058)? > > Removal of the sysfs nodes might be delayed, and we cannot > > create new entries with the same name until then. > > So if that is taken care of, fine, but I don't see that in > > the patch ... > > I see. I set out to ensure everything was ordered, but apparently I've > missed this case. Thanks for pointing out the test. Okay, I think it's as simple as the head unlinking prior to actually doing the del_gendisk creates a time when one scan work makes a new namespace before the stale one is deleted in a different scan. This should fix it: --- diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 3f2f9b2be87c2..e8dbb6cb85694 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4246,11 +4246,8 @@ static void nvme_ns_remove(struct nvme_ns *ns) mutex_lock(&ns->ctrl->subsys->lock); list_del_rcu(&ns->siblings); - if (list_empty(&ns->head->list)) { - if (!nvme_mpath_queue_if_no_path(ns->head)) - list_del_init(&ns->head->entry); + if (list_empty(&ns->head->list)) last_path = true; - } mutex_unlock(&ns->ctrl->subsys->lock); /* guarantee not available in head->list */ -- This opens a different race where Controller A is deleting the last path, and Controller B is bringing up a new namespace that reused the NSID. An earlier patch from me should fix that by reschudeling B's scan_work once A detects it removed the last path. It should work, but it feels a bit off to me, so I'll think about it a little more. ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [RFC-PATCH 2/2] nvme: use the namespace id for block device names 2026-03-03 16:53 ` Keith Busch @ 2026-03-04 11:55 ` Nilay Shroff 2026-03-04 14:53 ` Keith Busch 0 siblings, 1 reply; 8+ messages in thread From: Nilay Shroff @ 2026-03-04 11:55 UTC (permalink / raw) To: Keith Busch, Hannes Reinecke Cc: Keith Busch, linux-nvme, hch, mlombard, jmeneghi On 3/3/26 10:23 PM, Keith Busch wrote: > On Tue, Mar 03, 2026 at 08:39:56AM -0700, Keith Busch wrote: >> On Tue, Mar 03, 2026 at 08:34:38AM +0100, Hannes Reinecke wrote: >>> The idea is nice, and I would love to go into that >>> direction. >>> But have you checked how this holds up under >>> rescan/remapping (eg things like blktest/nvme/058)? >>> Removal of the sysfs nodes might be delayed, and we cannot >>> create new entries with the same name until then. >>> So if that is taken care of, fine, but I don't see that in >>> the patch ... >> >> I see. I set out to ensure everything was ordered, but apparently I've >> missed this case. Thanks for pointing out the test. > > Okay, I think it's as simple as the head unlinking prior to actually > doing the del_gendisk creates a time when one scan work makes a new > namespace before the stale one is deleted in a different scan. This > should fix it: > > --- > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index 3f2f9b2be87c2..e8dbb6cb85694 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -4246,11 +4246,8 @@ static void nvme_ns_remove(struct nvme_ns *ns) > > mutex_lock(&ns->ctrl->subsys->lock); > list_del_rcu(&ns->siblings); > - if (list_empty(&ns->head->list)) { > - if (!nvme_mpath_queue_if_no_path(ns->head)) > - list_del_init(&ns->head->entry); > + if (list_empty(&ns->head->list)) > last_path = true; > - } > mutex_unlock(&ns->ctrl->subsys->lock); > > /* guarantee not available in head->list */ > -- > > This opens a different race where Controller A is deleting the last > path, and Controller B is bringing up a new namespace that reused the > NSID. An earlier patch from me should fix that by reschudeling B's > scan_work once A detects it removed the last path. It should work, but > it feels a bit off to me, so I'll think about it a little more. > I think we need to ensure that the head disk is fully removed from the system before a new head is created that may reuse the same NSID; otherwise we may run into name collisions. Since removal of the sysfs entries associated with the disk can be delayed, a namespace scan may attempt to create a new head with the same name while the previous one is still being torn down. To avoid this, it might be better to defer removing head->entry from the subsys->nsheads list until after the head node’s gendisk has been removed in nvme_remove_head(). From your first patch in this series, I see that the removal of head->entry from subsys->nsheads was moved to the beginning of nvme_remove_head(). IMO, in addition to the change you proposed above, if we also move the removal of head->entry until after del_gendisk() is called, that should ensure the old disk is fully removed before a new head with the same NSID can be created. Thanks, --Nilay ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC-PATCH 2/2] nvme: use the namespace id for block device names 2026-03-04 11:55 ` Nilay Shroff @ 2026-03-04 14:53 ` Keith Busch 0 siblings, 0 replies; 8+ messages in thread From: Keith Busch @ 2026-03-04 14:53 UTC (permalink / raw) To: Nilay Shroff Cc: Hannes Reinecke, Keith Busch, linux-nvme, hch, mlombard, jmeneghi On Wed, Mar 04, 2026 at 05:25:28PM +0530, Nilay Shroff wrote: > I think we need to ensure that the head disk is fully removed from the > system before a new head is created that may reuse the same NSID; otherwise > we may run into name collisions. Since removal of the sysfs entries > associated with the disk can be delayed, a namespace scan may attempt to > create a new head with the same name while the previous one is still being > torn down. > > To avoid this, it might be better to defer removing head->entry from the > subsys->nsheads list until after the head node´s gendisk has been removed in > nvme_remove_head(). > > From your first patch in this series, I see that the removal of > head->entry from subsys->nsheads was moved to the beginning of > nvme_remove_head(). IMO, in addition to the change you proposed above, if we > also move the removal of head->entry until after del_gendisk() is called, > that should ensure the old disk is fully removed before a new head with the > same NSID can be created. Yeah, I'm really sure why it was done in two steps split across different subsystem lock contexts. I'll respin the series along with the last two patches from this one: https://lore.kernel.org/linux-nvme/20260226183216.2098584-1-kbusch@meta.com/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock 2026-03-02 22:25 [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock Keith Busch 2026-03-02 22:25 ` [RFC-PATCH 2/2] nvme: use the namespace id for block device names Keith Busch @ 2026-03-03 7:29 ` Hannes Reinecke 1 sibling, 0 replies; 8+ messages in thread From: Hannes Reinecke @ 2026-03-03 7:29 UTC (permalink / raw) To: Keith Busch, linux-nvme, hch, mlombard; +Cc: jmeneghi, Keith Busch On 3/2/26 23:25, Keith Busch wrote: > From: Keith Busch <kbusch@kernel.org> > > To maintain symmetry with the disk add side, perform the disk deletion > under the nvme subsystem lock. This will ensure consistent ordering of > device naming when adding namespaces while concurrently deleting stale > references. > > The deferred removal requires special attention to the nvme_subsystem: > the head may be holding the final reference in the deferred path, so > take a reference on it while removing the head. > > Signed-off-by: Keith Busch <kbusch@kernel.org> > --- > drivers/nvme/host/core.c | 3 +-- > drivers/nvme/host/multipath.c | 28 +++++++++++----------------- > drivers/nvme/host/nvme.h | 1 + > 3 files changed, 13 insertions(+), 19 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-03-04 14:53 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-02 22:25 [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock Keith Busch 2026-03-02 22:25 ` [RFC-PATCH 2/2] nvme: use the namespace id for block device names Keith Busch 2026-03-03 7:34 ` Hannes Reinecke 2026-03-03 15:39 ` Keith Busch 2026-03-03 16:53 ` Keith Busch 2026-03-04 11:55 ` Nilay Shroff 2026-03-04 14:53 ` Keith Busch 2026-03-03 7:29 ` [RFC-PATCH 1/2] nvme-multipath: delete gendisk under subsys lock Hannes Reinecke
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox