* [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure
@ 2026-02-25 20:21 Keith Busch
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Keith Busch @ 2026-02-25 20:21 UTC (permalink / raw)
To: linux-nvme, hch, nilay; +Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
We need to fall back to the synchronous removal if we can't get a
reference on the module needed for the deferred removal.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/nvme/host/multipath.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index bfcc5904e6a26..fc6800a9f7f94 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -1310,13 +1310,11 @@ void nvme_mpath_remove_disk(struct nvme_ns_head *head)
if (!list_empty(&head->list))
goto out;
- if (head->delayed_removal_secs) {
- /*
- * Ensure that no one could remove this module while the head
- * remove work is pending.
- */
- if (!try_module_get(THIS_MODULE))
- goto out;
+ /*
+ * Ensure that no one could remove this module while the head
+ * remove work is pending.
+ */
+ if (head->delayed_removal_secs && try_module_get(THIS_MODULE)) {
mod_delayed_work(nvme_wq, &head->remove_work,
head->delayed_removal_secs * HZ);
} else {
--
2.47.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-25 20:21 [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Keith Busch
@ 2026-02-25 20:21 ` Keith Busch
2026-02-25 20:34 ` Keith Busch
` (2 more replies)
2026-02-26 6:35 ` [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Nilay Shroff
` (2 subsequent siblings)
3 siblings, 3 replies; 12+ messages in thread
From: Keith Busch @ 2026-02-25 20:21 UTC (permalink / raw)
To: linux-nvme, hch, nilay; +Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
The NVMe controller is allowed to reuse an NSID for a new namespace after
deleting the previous namespace that had been using it. The delayed removal may
have the stale namespace head in the subsystem list pending the timer, which
would cause the scan to falsely report an ID mismatch error for the new
namespace. Flush the pending removal work and retry to resolve this.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/nvme/host/core.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3de52f1d27234..e731d3182f095 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3966,6 +3966,7 @@ static int nvme_global_check_duplicate_ids(struct nvme_subsystem *this,
static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
{
+ bool retry = IS_ENABLED(CONFIG_NVME_MULTIPATH);
struct nvme_ctrl *ctrl = ns->ctrl;
struct nvme_ns_head *head = NULL;
int ret;
@@ -4008,6 +4009,7 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
}
+again:
mutex_lock(&ctrl->subsys->lock);
head = nvme_find_ns_head(ctrl, info->nsid);
if (!head) {
@@ -4033,6 +4035,22 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
goto out_put_ns_head;
}
if (!nvme_ns_ids_equal(&head->ids, &info->ids)) {
+ /*
+ * A newly created namespace can reuse an NSID that was
+ * previously deleted. If the head has no active paths,
+ * it is pending delayed removal and still occupying
+ * this NSID in the subsystem list. Flush the removal
+ * work to clear the stale head and retry.
+ */
+ if (retry && list_empty(&head->list)) {
+ mutex_unlock(&ctrl->subsys->lock);
+ flush_delayed_work(&head->remove_work);
+ nvme_put_ns_head(head);
+ retry = false;
+ goto again;
+ }
+
+ WARN_ONCE(list_empty(&head->list));
dev_err(ctrl->device,
"IDs don't match for shared namespace %d\n",
info->nsid);
--
2.47.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
@ 2026-02-25 20:34 ` Keith Busch
2026-02-26 7:04 ` Nilay Shroff
2026-02-26 15:37 ` Christoph Hellwig
2 siblings, 0 replies; 12+ messages in thread
From: Keith Busch @ 2026-02-25 20:34 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, nilay
On Wed, Feb 25, 2026 at 12:21:09PM -0800, Keith Busch wrote:
> +
> + WARN_ONCE(list_empty(&head->list));
Ugh, I know this should be WARN_ON_ONCE()...
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure
2026-02-25 20:21 [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Keith Busch
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
@ 2026-02-26 6:35 ` Nilay Shroff
2026-02-26 8:31 ` John Garry
2026-02-26 15:35 ` Christoph Hellwig
3 siblings, 0 replies; 12+ messages in thread
From: Nilay Shroff @ 2026-02-26 6:35 UTC (permalink / raw)
To: Keith Busch, linux-nvme, hch; +Cc: Keith Busch
On 2/26/26 1:51 AM, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> We need to fall back to the synchronous removal if we can't get a
> reference on the module needed for the deferred removal.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
2026-02-25 20:34 ` Keith Busch
@ 2026-02-26 7:04 ` Nilay Shroff
2026-02-26 15:37 ` Christoph Hellwig
2 siblings, 0 replies; 12+ messages in thread
From: Nilay Shroff @ 2026-02-26 7:04 UTC (permalink / raw)
To: Keith Busch, linux-nvme, hch; +Cc: Keith Busch
On 2/26/26 1:51 AM, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The NVMe controller is allowed to reuse an NSID for a new namespace after
> deleting the previous namespace that had been using it. The delayed removal may
> have the stale namespace head in the subsystem list pending the timer, which
> would cause the scan to falsely report an ID mismatch error for the new
> namespace. Flush the pending removal work and retry to resolve this.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/nvme/host/core.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 3de52f1d27234..e731d3182f095 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3966,6 +3966,7 @@ static int nvme_global_check_duplicate_ids(struct nvme_subsystem *this,
>
> static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
> {
> + bool retry = IS_ENABLED(CONFIG_NVME_MULTIPATH);
> struct nvme_ctrl *ctrl = ns->ctrl;
> struct nvme_ns_head *head = NULL;
> int ret;
> @@ -4008,6 +4009,7 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
> ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
> }
>
> +again:
> mutex_lock(&ctrl->subsys->lock);
> head = nvme_find_ns_head(ctrl, info->nsid);
> if (!head) {
> @@ -4033,6 +4035,22 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
> goto out_put_ns_head;
> }
> if (!nvme_ns_ids_equal(&head->ids, &info->ids)) {
> + /*
> + * A newly created namespace can reuse an NSID that was
> + * previously deleted. If the head has no active paths,
> + * it is pending delayed removal and still occupying
> + * this NSID in the subsystem list. Flush the removal
> + * work to clear the stale head and retry.
> + */
> + if (retry && list_empty(&head->list)) {
> + mutex_unlock(&ctrl->subsys->lock);
> + flush_delayed_work(&head->remove_work);
> + nvme_put_ns_head(head);
> + retry = false;
> + goto again;
> + }
> +
> + WARN_ONCE(list_empty(&head->list));
We need to replace WARN_ONCE with WARN_ON_ONCE (as you already mentioned
in another thread), so with that change applied, this looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure
2026-02-25 20:21 [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Keith Busch
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
2026-02-26 6:35 ` [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Nilay Shroff
@ 2026-02-26 8:31 ` John Garry
2026-02-26 15:35 ` Christoph Hellwig
3 siblings, 0 replies; 12+ messages in thread
From: John Garry @ 2026-02-26 8:31 UTC (permalink / raw)
To: Keith Busch, linux-nvme, hch, nilay; +Cc: Keith Busch
On 25/02/2026 20:21, Keith Busch wrote:
> From: Keith Busch<kbusch@kernel.org>
>
> We need to fall back to the synchronous removal if we can't get a
> reference on the module needed for the deferred removal.
>
> Signed-off-by: Keith Busch<kbusch@kernel.org>
> ---
Feel free to add:
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure
2026-02-25 20:21 [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Keith Busch
` (2 preceding siblings ...)
2026-02-26 8:31 ` John Garry
@ 2026-02-26 15:35 ` Christoph Hellwig
3 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2026-02-26 15:35 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, nilay, Keith Busch
On Wed, Feb 25, 2026 at 12:21:08PM -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> We need to fall back to the synchronous removal if we can't get a
> reference on the module needed for the deferred removal.
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
But this should probably have a Fixes tag?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
2026-02-25 20:34 ` Keith Busch
2026-02-26 7:04 ` Nilay Shroff
@ 2026-02-26 15:37 ` Christoph Hellwig
2026-02-26 16:51 ` Keith Busch
2 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2026-02-26 15:37 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, nilay, Keith Busch
On Wed, Feb 25, 2026 at 12:21:09PM -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The NVMe controller is allowed to reuse an NSID for a new namespace after
> deleting the previous namespace that had been using it. The delayed removal may
> have the stale namespace head in the subsystem list pending the timer, which
Overlong lines.
> + bool retry = IS_ENABLED(CONFIG_NVME_MULTIPATH);
> struct nvme_ctrl *ctrl = ns->ctrl;
> struct nvme_ns_head *head = NULL;
> int ret;
> @@ -4008,6 +4009,7 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
> ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
> }
>
> +again:
> mutex_lock(&ctrl->subsys->lock);
> head = nvme_find_ns_head(ctrl, info->nsid);
> if (!head) {
> @@ -4033,6 +4035,22 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
> goto out_put_ns_head;
> }
> if (!nvme_ns_ids_equal(&head->ids, &info->ids)) {
> + /*
> + * A newly created namespace can reuse an NSID that was
> + * previously deleted. If the head has no active paths,
> + * it is pending delayed removal and still occupying
> + * this NSID in the subsystem list. Flush the removal
> + * work to clear the stale head and retry.
> + */
> + if (retry && list_empty(&head->list)) {
I find the retry logic a bit odd and different from other places
do in similar areas. What I'd expected is either a "nr_retries" or
"did_retry" variable initialized to 0/false, then checked here to
be not set (plus the IS_ENABLED() for multipath) and incremented/set
below.
But independent of that, the actual logic looks fine.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 15:37 ` Christoph Hellwig
@ 2026-02-26 16:51 ` Keith Busch
2026-02-26 18:31 ` Nilay Shroff
0 siblings, 1 reply; 12+ messages in thread
From: Keith Busch @ 2026-02-26 16:51 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, linux-nvme, nilay
On Thu, Feb 26, 2026 at 04:37:40PM +0100, Christoph Hellwig wrote:
> I find the retry logic a bit odd and different from other places
> do in similar areas. What I'd expected is either a "nr_retries" or
> "did_retry" variable initialized to 0/false, then checked here to
> be not set (plus the IS_ENABLED() for multipath) and incremented/set
> below.
>
> But independent of that, the actual logic looks fine.
I was able to test this, and it does work when we're specifically
blocking on the delayed removal. But there's a different race this
doesn't handle: controller A's scan_work may depend on controller B's
scan_work to finish first to remove a final reference on the deleted
namespace when A is trying to add a newly created namespace that
recycled the NSID.
This is looking pretty tricky to resolve. The best solution I'm coming
up with so far is to have the scan_work synthesize a
NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
then re-kick their scan work if the scan_work removed anything.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 16:51 ` Keith Busch
@ 2026-02-26 18:31 ` Nilay Shroff
2026-02-26 18:35 ` Keith Busch
0 siblings, 1 reply; 12+ messages in thread
From: Nilay Shroff @ 2026-02-26 18:31 UTC (permalink / raw)
To: Keith Busch, Christoph Hellwig; +Cc: Keith Busch, linux-nvme
On 2/26/26 10:21 PM, Keith Busch wrote:
> On Thu, Feb 26, 2026 at 04:37:40PM +0100, Christoph Hellwig wrote:
>> I find the retry logic a bit odd and different from other places
>> do in similar areas. What I'd expected is either a "nr_retries" or
>> "did_retry" variable initialized to 0/false, then checked here to
>> be not set (plus the IS_ENABLED() for multipath) and incremented/set
>> below.
>>
>> But independent of that, the actual logic looks fine.
>
> I was able to test this, and it does work when we're specifically
> blocking on the delayed removal. But there's a different race this
> doesn't handle: controller A's scan_work may depend on controller B's
> scan_work to finish first to remove a final reference on the deleted
> namespace when A is trying to add a newly created namespace that
> recycled the NSID.
>
> This is looking pretty tricky to resolve. The best solution I'm coming
> up with so far is to have the scan_work synthesize a
> NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
> then re-kick their scan work if the scan_work removed anything.
So does your disk when reuse NSID, changes ns ids such as NGUID/UUID/EUI64?
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 18:31 ` Nilay Shroff
@ 2026-02-26 18:35 ` Keith Busch
2026-02-27 13:53 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: Keith Busch @ 2026-02-26 18:35 UTC (permalink / raw)
To: Nilay Shroff; +Cc: Christoph Hellwig, Keith Busch, linux-nvme
On Fri, Feb 27, 2026 at 12:01:48AM +0530, Nilay Shroff wrote:
> On 2/26/26 10:21 PM, Keith Busch wrote:
> > This is looking pretty tricky to resolve. The best solution I'm coming
> > up with so far is to have the scan_work synthesize a
> > NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
> > then re-kick their scan work if the scan_work removed anything.
>
> So does your disk when reuse NSID, changes ns ids such as NGUID/UUID/EUI64?
Yes. The NSID is put back into the available pool when it's deleted, but
the UID associated with it is newly generated upon creation.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion
2026-02-26 18:35 ` Keith Busch
@ 2026-02-27 13:53 ` Christoph Hellwig
0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2026-02-27 13:53 UTC (permalink / raw)
To: Keith Busch; +Cc: Nilay Shroff, Christoph Hellwig, Keith Busch, linux-nvme
On Thu, Feb 26, 2026 at 11:35:39AM -0700, Keith Busch wrote:
> On Fri, Feb 27, 2026 at 12:01:48AM +0530, Nilay Shroff wrote:
> > On 2/26/26 10:21 PM, Keith Busch wrote:
> > > This is looking pretty tricky to resolve. The best solution I'm coming
> > > up with so far is to have the scan_work synthesize a
> > > NVME_AER_NOTICE_NS_CHANGED event for every controller in the subsystem,
> > > then re-kick their scan work if the scan_work removed anything.
> >
> > So does your disk when reuse NSID, changes ns ids such as NGUID/UUID/EUI64?
>
> Yes. The NSID is put back into the available pool when it's deleted, but
> the UID associated with it is newly generated upon creation.
Yeah, that is one of the two allowed nvme behaviors, and there's a bit
to detect if the persistent ids can be reused or not.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-02-27 13:53 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-25 20:21 [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Keith Busch
2026-02-25 20:21 ` [PATCH 2/2] nvme: fix unmatched id's under delayed path deletion Keith Busch
2026-02-25 20:34 ` Keith Busch
2026-02-26 7:04 ` Nilay Shroff
2026-02-26 15:37 ` Christoph Hellwig
2026-02-26 16:51 ` Keith Busch
2026-02-26 18:31 ` Nilay Shroff
2026-02-26 18:35 ` Keith Busch
2026-02-27 13:53 ` Christoph Hellwig
2026-02-26 6:35 ` [PATCH 1/2] nvme-multipath: fix leak on try_module_get failure Nilay Shroff
2026-02-26 8:31 ` John Garry
2026-02-26 15:35 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox