* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
@ 2026-02-25 20:34 ` Keith Busch
2026-02-26 7:04 ` Nilay Shroff
2026-02-26 15:37 ` Christoph Hellwig
2 siblings, 0 replies; 12+ messages in thread
From: Keith Busch @ 2026-02-25 20:34 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, nilay
On Wed, Feb 25, 2026 at 12:21:09PM -0800, Keith Busch wrote:
> +
> +			WARN_ONCE(list_empty(&head->list));
Ugh, I know this should be WARN_ON_ONCE()...
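For reference, the distinction behind this fix: WARN_ONCE() expects a
condition followed by a printf-style message, while WARN_ON_ONCE() takes
only the condition, which is what the one-argument call in the patch
needs. The following is a minimal userspace sketch of the warn-once
behavior, not the kernel's implementation; model_warn_on_once(),
model_warn_count and check_three_times() are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counter so a caller can observe how many warnings fired. */
static int model_warn_count;

/*
 * Userspace stand-in for the warn-once idea: report only the first time
 * the condition is true for a given flag, but always return the condition
 * so it can still be used inside an if ().
 */
static bool model_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		model_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Evaluates the same check three times; only the first hit is reported. */
static int check_three_times(bool list_is_empty)
{
	static bool warned;	/* one flag per call site */
	int hits = 0;

	for (int i = 0; i < 3; i++)
		if (model_warn_on_once(list_is_empty, &warned,
				       "list_empty(&head->list)"))
			hits++;
	return hits;
}
```

In this model the condition is still seen as true on every pass, but
only one warning is emitted, which is the property the patch relies on
to avoid flooding the log.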
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
2026-02-25 20:34 ` Keith Busch
@ 2026-02-26 7:04 ` Nilay Shroff
2026-02-26 15:37 ` Christoph Hellwig
2 siblings, 0 replies; 12+ messages in thread
From: Nilay Shroff @ 2026-02-26 7:04 UTC (permalink / raw)
To: Keith Busch, linux-nvme, hch; +Cc: Keith Busch
On 2/26/26 1:51 AM, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The NVMe controller is allowed to reuse an NSID for a new namespace after
> deleting the previous namespace that had been using it. The delayed removal may
> have the stale namespace head in the subsystem list pending the timer, which
> would cause the scan to falsely report an ID mismatch error for the new
> namespace. Flush the pending removal work and retry to resolve this.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/nvme/host/core.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 3de52f1d27234..e731d3182f095 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3966,6 +3966,7 @@ static int nvme_global_check_duplicate_ids(struct nvme_subsystem *this,
>
>  static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
>  {
> +	bool retry = IS_ENABLED(CONFIG_NVME_MULTIPATH);
>  	struct nvme_ctrl *ctrl = ns->ctrl;
>  	struct nvme_ns_head *head = NULL;
>  	int ret;
> @@ -4008,6 +4009,7 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
>  		ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
>  	}
>
> +again:
>  	mutex_lock(&ctrl->subsys->lock);
>  	head = nvme_find_ns_head(ctrl, info->nsid);
>  	if (!head) {
> @@ -4033,6 +4035,22 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
>  			goto out_put_ns_head;
>  		}
>  		if (!nvme_ns_ids_equal(&head->ids, &info->ids)) {
> +			/*
> +			 * A newly created namespace can reuse an NSID that was
> +			 * previously deleted. If the head has no active paths,
> +			 * it is pending delayed removal and still occupying
> +			 * this NSID in the subsystem list. Flush the removal
> +			 * work to clear the stale head and retry.
> +			 */
> +			if (retry && list_empty(&head->list)) {
> +				mutex_unlock(&ctrl->subsys->lock);
> +				flush_delayed_work(&head->remove_work);
> +				nvme_put_ns_head(head);
> +				retry = false;
> +				goto again;
> +			}
> +
> +			WARN_ONCE(list_empty(&head->list));
We need to replace WARN_ONCE with WARN_ON_ONCE (as you already mentioned
in another thread), so with that change applied, this looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
2026-02-25 20:34 ` Keith Busch
2026-02-26 7:04 ` Nilay Shroff
@ 2026-02-26 15:37 ` Christoph Hellwig
2026-02-26 16:51 ` Keith Busch
2 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2026-02-26 15:37 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, nilay, Keith Busch
On Wed, Feb 25, 2026 at 12:21:09PM -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The NVMe controller is allowed to reuse an NSID for a new namespace after
> deleting the previous namespace that had been using it. The delayed removal may
> have the stale namespace head in the subsystem list pending the timer, which
Overlong lines.
> +	bool retry = IS_ENABLED(CONFIG_NVME_MULTIPATH);
>  	struct nvme_ctrl *ctrl = ns->ctrl;
>  	struct nvme_ns_head *head = NULL;
>  	int ret;
> @@ -4008,6 +4009,7 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
>  		ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
>  	}
>
> +again:
>  	mutex_lock(&ctrl->subsys->lock);
>  	head = nvme_find_ns_head(ctrl, info->nsid);
>  	if (!head) {
> @@ -4033,6 +4035,22 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
>  			goto out_put_ns_head;
>  		}
>  		if (!nvme_ns_ids_equal(&head->ids, &info->ids)) {
> +			/*
> +			 * A newly created namespace can reuse an NSID that was
> +			 * previously deleted. If the head has no active paths,
> +			 * it is pending delayed removal and still occupying
> +			 * this NSID in the subsystem list. Flush the removal
> +			 * work to clear the stale head and retry.
> +			 */
> +			if (retry && list_empty(&head->list)) {
I find the retry logic a bit odd and different from what other places
do in similar areas. What I'd expected is either a "nr_retries" or
"did_retry" variable initialized to 0/false, then checked here to
be not set (plus the IS_ENABLED() for multipath) and incremented/set
below.
But independent of that, the actual logic looks fine.
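The retry shape described above can be sketched in plain userspace C.
This is only a model under stated assumptions, not the driver code:
struct model_head, struct model_subsys, MODEL_MULTIPATH and every
model_*() function are hypothetical stand-ins, nr_paths == 0 plays the
role of list_empty(&head->list), and the subsystem lock that the real
code drops before flushing is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for IS_ENABLED(CONFIG_NVME_MULTIPATH); hypothetical. */
#define MODEL_MULTIPATH 1
#define MODEL_MAX_HEADS 4

struct model_head {
	bool in_use;
	unsigned int nsid;
	char uid[33];
	int nr_paths;		/* 0 models list_empty(&head->list) */
};

struct model_subsys {
	struct model_head heads[MODEL_MAX_HEADS];
	int flushes;		/* how often the removal work was flushed */
};

static struct model_head *model_find_head(struct model_subsys *s,
					  unsigned int nsid)
{
	for (size_t i = 0; i < MODEL_MAX_HEADS; i++)
		if (s->heads[i].in_use && s->heads[i].nsid == nsid)
			return &s->heads[i];
	return NULL;
}

/* Models flush_delayed_work(): a pathless head's pending removal runs now. */
static void model_flush_removal(struct model_subsys *s, struct model_head *h)
{
	s->flushes++;
	if (h->nr_paths == 0)
		h->in_use = false;
}

/* Returns 0 on success, -1 on an ID mismatch, -2 if no slot is free. */
static int model_init_ns_head(struct model_subsys *s, unsigned int nsid,
			      const char *uid)
{
	bool did_retry = false;
	struct model_head *head;

again:
	head = model_find_head(s, nsid);
	if (!head) {
		for (size_t i = 0; i < MODEL_MAX_HEADS; i++) {
			if (s->heads[i].in_use)
				continue;
			head = &s->heads[i];
			head->in_use = true;
			head->nsid = nsid;
			snprintf(head->uid, sizeof(head->uid), "%s", uid);
			head->nr_paths = 1;
			return 0;
		}
		return -2;
	}
	if (strcmp(head->uid, uid) != 0) {
		/* Retry at most once, and only for a head with no paths left. */
		if (MODEL_MULTIPATH && !did_retry && head->nr_paths == 0) {
			did_retry = true;
			model_flush_removal(s, head);
			goto again;
		}
		return -1;
	}
	head->nr_paths++;
	return 0;
}
```

With did_retry starting out false and the multipath check sitting next
to it in the condition, the "retry once" intent is visible at the point
of use instead of being folded into the initializer.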
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 15:37 ` Christoph Hellwig
@ 2026-02-26 16:51 ` Keith Busch
2026-02-26 18:31 ` Nilay Shroff
0 siblings, 1 reply; 12+ messages in thread
From: Keith Busch @ 2026-02-26 16:51 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, linux-nvme, nilay
On Thu, Feb 26, 2026 at 04:37:40PM +0100, Christoph Hellwig wrote:
> I find the retry logic a bit odd and different from what other places
> do in similar areas. What I'd expected is either a "nr_retries" or
> "did_retry" variable initialized to 0/false, then checked here to
> be not set (plus the IS_ENABLED() for multipath) and incremented/set
> below.
>
> But independent of that, the actual logic looks fine.
I was able to test this, and it does work when we're specifically
blocking on the delayed removal. But there's a different race this
doesn't handle: controller A's scan_work may depend on controller B's
scan_work to finish first to remove a final reference on the deleted
namespace when A is trying to add a newly created namespace that
recycled the NSID.
This is looking pretty tricky to resolve. The best solution I'm coming
up with so far is to have the scan_work synthesize a
NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
then re-kick their scan work if the scan_work removed anything.
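A rough userspace model of that idea, to make the ordering concrete. No
code for it has been posted in this thread, so everything here is an
assumption: struct model_ctrl, struct model_subsys, model_kick_siblings()
and model_scan_work() are hypothetical, the ns_changed flag merely
stands in for a synthesized NVME_AER_NOTICE_NS_CHANGED event, and the
sketch notifies only the other controllers in the subsystem, not the
one that just scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CTRLS 4

struct model_ctrl {
	bool ns_changed;	/* models a pending namespace-changed notice */
	int scans_queued;	/* times scan work was queued for this controller */
	int stale_paths;	/* paths to deleted namespaces the next scan drops */
};

struct model_subsys {
	struct model_ctrl ctrls[MODEL_MAX_CTRLS];
	size_t nr_ctrls;
};

/* Flag every other controller in the subsystem and queue its scan again. */
static void model_kick_siblings(struct model_subsys *s, size_t self)
{
	for (size_t i = 0; i < s->nr_ctrls; i++) {
		if (i == self)
			continue;
		s->ctrls[i].ns_changed = true;
		s->ctrls[i].scans_queued++;
	}
}

/*
 * One pass of a controller's scan: drop its stale paths and, only if
 * something was actually removed, re-kick the sibling controllers so
 * they can retry adding a namespace that recycled the NSID.
 */
static int model_scan_work(struct model_subsys *s, size_t idx)
{
	struct model_ctrl *c = &s->ctrls[idx];
	int removed = c->stale_paths;

	c->ns_changed = false;
	c->stale_paths = 0;
	if (removed > 0)
		model_kick_siblings(s, idx);
	return removed;
}
```

In this model, if controller A scans first it removes nothing and
kicks nobody; once controller B's scan drops the last stale path, A is
flagged and queued again, which is the dependency described above.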
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 16:51 ` Keith Busch
@ 2026-02-26 18:31 ` Nilay Shroff
2026-02-26 18:35 ` Keith Busch
0 siblings, 1 reply; 12+ messages in thread
From: Nilay Shroff @ 2026-02-26 18:31 UTC (permalink / raw)
To: Keith Busch, Christoph Hellwig; +Cc: Keith Busch, linux-nvme
On 2/26/26 10:21 PM, Keith Busch wrote:
> On Thu, Feb 26, 2026 at 04:37:40PM +0100, Christoph Hellwig wrote:
>> I find the retry logic a bit odd and different from what other places
>> do in similar areas. What I'd expected is either a "nr_retries" or
>> "did_retry" variable initialized to 0/false, then checked here to
>> be not set (plus the IS_ENABLED() for multipath) and incremented/set
>> below.
>>
>> But independent of that, the actual logic looks fine.
>
> I was able to test this, and it does work when we're specifically
> blocking on the delayed removal. But there's a different race this
> doesn't handle: controller A's scan_work may depend on controller B's
> scan_work to finish first to remove a final reference on the deleted
> namespace when A is trying to add a newly created namespace that
> recycled the NSID.
>
> This is looking pretty tricky to resolve. The best solution I'm coming
> up with so far is to have the scan_work synthesize a
> NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
> then re-kick their scan work if the scan_work removed anything.
So when your disk reuses an NSID, does it change the namespace IDs such
as NGUID/UUID/EUI64?
Thanks,
--Nilay
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 18:31 ` Nilay Shroff
@ 2026-02-26 18:35 ` Keith Busch
2026-02-27 13:53 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: Keith Busch @ 2026-02-26 18:35 UTC (permalink / raw)
To: Nilay Shroff; +Cc: Christoph Hellwig, Keith Busch, linux-nvme
On Fri, Feb 27, 2026 at 12:01:48AM +0530, Nilay Shroff wrote:
> On 2/26/26 10:21 PM, Keith Busch wrote:
> > This is looking pretty tricky to resolve. The best solution I'm coming
> > up with so far is to have the scan_work synthesize a
> > NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
> > then re-kick their scan work if the scan_work removed anything.
>
> So when your disk reuses an NSID, does it change the namespace IDs such
> as NGUID/UUID/EUI64?
Yes. The NSID is put back into the available pool when it's deleted, but
the UID associated with it is newly generated upon creation.
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 18:35 ` Keith Busch
@ 2026-02-27 13:53 ` Christoph Hellwig
0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2026-02-27 13:53 UTC (permalink / raw)
To: Keith Busch; +Cc: Nilay Shroff, Christoph Hellwig, Keith Busch, linux-nvme
On Thu, Feb 26, 2026 at 11:35:39AM -0700, Keith Busch wrote:
> On Fri, Feb 27, 2026 at 12:01:48AM +0530, Nilay Shroff wrote:
> > On 2/26/26 10:21 PM, Keith Busch wrote:
> > > This is looking pretty tricky to resolve. The best solution I'm coming
> > > up with so far is to have the scan_work synthesize a
> > > NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
> > > then re-kick their scan work if the scan_work removed anything.
> >
> > So when your disk reuses an NSID, does it change the namespace IDs such
> > as NGUID/UUID/EUI64?
>
> Yes. The NSID is put back into the available pool when it's deleted, but
> the UID associated with it is newly generated upon creation.
Yeah, that is one of the two allowed nvme behaviors, and there's a bit
to detect if the persistent ids can be reused or not.