public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
@ 2024-11-29 14:06 Hannes Reinecke
  2024-12-03 19:15 ` Keith Busch
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Hannes Reinecke @ 2024-11-29 14:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Hannes Reinecke

When a namespace gets unmapped on the target during scanning
nvme_identify_ns_descs() returns with a non-retryable error.
With the currrent code we will ignore that error on the grounds
that we failed to get information, and hence cannot make any
decisions whether to keep or remove that namespace.
But a non-retryable error implies that the namespace is _not_
present as we cannot retry that command and will never get
information about that namespace.
And we need to remove the namespace during scanning, as otherwise
the AEN informing us about a namespace change will find the NSID
present, but nvme_validate_ns() will fail, and the namespace
will never be updated with the correct information.

Signed-off-by: Hannes Reinecke <hare@kernel.org>
---
 drivers/nvme/host/core.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9e7e63e10e5a..396520cb3ce6 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3999,8 +3999,11 @@ static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	struct nvme_ns *ns;
 	int ret;
 
-	if (nvme_identify_ns_descs(ctrl, &info))
+	ret = nvme_identify_ns_descs(ctrl, &info);
+	if (ret > 0 && (ret & NVME_STATUS_DNR)) {
+		nvme_ns_remove_by_nsid(ctrl, nsid);
 		return;
+	}
 
 	if (info.ids.csi != NVME_CSI_NVM && !nvme_multi_css(ctrl)) {
 		dev_warn(ctrl->device,
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-11-29 14:06 [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed Hannes Reinecke
@ 2024-12-03 19:15 ` Keith Busch
  2024-12-04  7:14   ` Hannes Reinecke
  2024-12-24 11:35 ` Sagi Grimberg
  2024-12-25  9:58 ` Sagi Grimberg
  2 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2024-12-03 19:15 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, linux-nvme

On Fri, Nov 29, 2024 at 03:06:08PM +0100, Hannes Reinecke wrote:
> When a namespace gets unmapped on the target during scanning
> nvme_identify_ns_descs() returns with a non-retryable error.
> With the currrent code we will ignore that error on the grounds
> that we failed to get information, and hence cannot make any
> decisions whether to keep or remove that namespace.
> But a non-retryable error implies that the namespace is _not_
> present as we cannot retry that command and will never get
> information about that namespace.
> And we need to remove the namespace during scanning, as otherwise
> the AEN informing us about a namespace change will find the NSID
> present, but nvme_validate_ns() will fail, and the namespace
> will never be updated with the correct information.

The scanning only checks namespaces returned in the "active" namespace
list. Every namespace not in the active list gets removed already. Why
is this unmapped namespace appearing on the active list?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-03 19:15 ` Keith Busch
@ 2024-12-04  7:14   ` Hannes Reinecke
  2024-12-04 16:39     ` Keith Busch
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2024-12-04  7:14 UTC (permalink / raw)
  To: Keith Busch, Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, linux-nvme

On 12/3/24 20:15, Keith Busch wrote:
> On Fri, Nov 29, 2024 at 03:06:08PM +0100, Hannes Reinecke wrote:
>> When a namespace gets unmapped on the target during scanning
>> nvme_identify_ns_descs() returns with a non-retryable error.
>> With the currrent code we will ignore that error on the grounds
>> that we failed to get information, and hence cannot make any
>> decisions whether to keep or remove that namespace.
>> But a non-retryable error implies that the namespace is _not_
>> present as we cannot retry that command and will never get
>> information about that namespace.
>> And we need to remove the namespace during scanning, as otherwise
>> the AEN informing us about a namespace change will find the NSID
>> present, but nvme_validate_ns() will fail, and the namespace
>> will never be updated with the correct information.
> 
> The scanning only checks namespaces returned in the "active" namespace
> list. Every namespace not in the active list gets removed already. Why
> is this unmapped namespace appearing on the active list?

Timing. Imagine a system used as a backing store for kubernetes, where 
namespaces come and go at a _really_ fast pace.
1) AEN triggers a rescan
2) List of active namespace is retrieved
-> NSID A gets unmapped (or moved to another node in the cluster)
3) Scan of NSID A returns an error with DNR set.
Without this patch we keep the namespace around, so eventually we'll
trip over the 'non-matching UUID' check once the NSID is reused.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-04  7:14   ` Hannes Reinecke
@ 2024-12-04 16:39     ` Keith Busch
  2024-12-05 12:30       ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2024-12-04 16:39 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On Wed, Dec 04, 2024 at 08:14:08AM +0100, Hannes Reinecke wrote:
> On 12/3/24 20:15, Keith Busch wrote:
> > On Fri, Nov 29, 2024 at 03:06:08PM +0100, Hannes Reinecke wrote:
> > > When a namespace gets unmapped on the target during scanning
> > > nvme_identify_ns_descs() returns with a non-retryable error.
> > > With the currrent code we will ignore that error on the grounds
> > > that we failed to get information, and hence cannot make any
> > > decisions whether to keep or remove that namespace.
> > > But a non-retryable error implies that the namespace is _not_
> > > present as we cannot retry that command and will never get
> > > information about that namespace.
> > > And we need to remove the namespace during scanning, as otherwise
> > > the AEN informing us about a namespace change will find the NSID
> > > present, but nvme_validate_ns() will fail, and the namespace
> > > will never be updated with the correct information.
> > 
> > The scanning only checks namespaces returned in the "active" namespace
> > list. Every namespace not in the active list gets removed already. Why
> > is this unmapped namespace appearing on the active list?
> 
> Timing. Imagine a system used as a backing store for kubernetes, where
> namespaces come and go at a _really_ fast pace.
> 1) AEN triggers a rescan
> 2) List of active namespace is retrieved
> -> NSID A gets unmapped (or moved to another node in the cluster)
> 3) Scan of NSID A returns an error with DNR set.
> Without this patch we keep the namespace around, so eventually we'll
> trip over the 'non-matching UUID' check once the NSID is reused.

I'm still not sure that makes sense. The target shouldn't attach the new
namespace until the host acknowledges the removal of the older NSID via
the Namespace Change List log. Until the log is read, the inventory for
removed namespaces should be latched. Otherwise, timing might remove+add
a specific NSID before the host requests the NS Descriptor for the
racing removal, then it would just get the "non-matching UUID" issue
anyway.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-04 16:39     ` Keith Busch
@ 2024-12-05 12:30       ` Hannes Reinecke
  2024-12-05 16:15         ` Keith Busch
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2024-12-05 12:30 UTC (permalink / raw)
  To: Keith Busch; +Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On 12/4/24 17:39, Keith Busch wrote:
> On Wed, Dec 04, 2024 at 08:14:08AM +0100, Hannes Reinecke wrote:
>> On 12/3/24 20:15, Keith Busch wrote:
>>> On Fri, Nov 29, 2024 at 03:06:08PM +0100, Hannes Reinecke wrote:
>>>> When a namespace gets unmapped on the target during scanning
>>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>>> With the currrent code we will ignore that error on the grounds
>>>> that we failed to get information, and hence cannot make any
>>>> decisions whether to keep or remove that namespace.
>>>> But a non-retryable error implies that the namespace is _not_
>>>> present as we cannot retry that command and will never get
>>>> information about that namespace.
>>>> And we need to remove the namespace during scanning, as otherwise
>>>> the AEN informing us about a namespace change will find the NSID
>>>> present, but nvme_validate_ns() will fail, and the namespace
>>>> will never be updated with the correct information.
>>>
>>> The scanning only checks namespaces returned in the "active" namespace
>>> list. Every namespace not in the active list gets removed already. Why
>>> is this unmapped namespace appearing on the active list?
>>
>> Timing. Imagine a system used as a backing store for kubernetes, where
>> namespaces come and go at a _really_ fast pace.
>> 1) AEN triggers a rescan
>> 2) List of active namespace is retrieved
>> -> NSID A gets unmapped (or moved to another node in the cluster)
>> 3) Scan of NSID A returns an error with DNR set.
>> Without this patch we keep the namespace around, so eventually we'll
>> trip over the 'non-matching UUID' check once the NSID is reused.
> 
> I'm still not sure that makes sense. The target shouldn't attach the new
> namespace until the host acknowledges the removal of the older NSID via
> the Namespace Change List log. Until the log is read, the inventory for
> removed namespaces should be latched. Otherwise, timing might remove+add
> a specific NSID before the host requests the NS Descriptor for the
> racing removal, then it would just get the "non-matching UUID" issue
> anyway.

But we read the Namespace Change List log in step 2)
(Not that we're doing anything with it, but that's another story...)
Hmm?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-05 12:30       ` Hannes Reinecke
@ 2024-12-05 16:15         ` Keith Busch
  2024-12-06 12:41           ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2024-12-05 16:15 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On Thu, Dec 05, 2024 at 01:30:39PM +0100, Hannes Reinecke wrote:
> On 12/4/24 17:39, Keith Busch wrote:
> > > 1) AEN triggers a rescan
> > > 2) List of active namespace is retrieved
> > > -> NSID A gets unmapped (or moved to another node in the cluster)
> > > 3) Scan of NSID A returns an error with DNR set.
> > > Without this patch we keep the namespace around, so eventually we'll
> > > trip over the 'non-matching UUID' check once the NSID is reused.
> > 
> > I'm still not sure that makes sense. The target shouldn't attach the new
> > namespace until the host acknowledges the removal of the older NSID via
> > the Namespace Change List log. Until the log is read, the inventory for
> > removed namespaces should be latched. Otherwise, timing might remove+add
> > a specific NSID before the host requests the NS Descriptor for the
> > racing removal, then it would just get the "non-matching UUID" issue
> > anyway.
> 
> But we read the Namespace Change List log in step 2)
> (Not that we're doing anything with it, but that's another story...)
> Hmm?

Indeed. So maybe we should just move the log page retrevial *after* we
scan the identify active namespace list processing?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-05 16:15         ` Keith Busch
@ 2024-12-06 12:41           ` Hannes Reinecke
  2025-01-07 16:01             ` Keith Busch
  2025-01-11 14:01             ` Nilay Shroff
  0 siblings, 2 replies; 23+ messages in thread
From: Hannes Reinecke @ 2024-12-06 12:41 UTC (permalink / raw)
  To: Keith Busch; +Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On 12/5/24 17:15, Keith Busch wrote:
> On Thu, Dec 05, 2024 at 01:30:39PM +0100, Hannes Reinecke wrote:
>> On 12/4/24 17:39, Keith Busch wrote:
>>>> 1) AEN triggers a rescan
>>>> 2) List of active namespace is retrieved
>>>> -> NSID A gets unmapped (or moved to another node in the cluster)
>>>> 3) Scan of NSID A returns an error with DNR set.
>>>> Without this patch we keep the namespace around, so eventually we'll
>>>> trip over the 'non-matching UUID' check once the NSID is reused.
>>>
>>> I'm still not sure that makes sense. The target shouldn't attach the new
>>> namespace until the host acknowledges the removal of the older NSID via
>>> the Namespace Change List log. Until the log is read, the inventory for
>>> removed namespaces should be latched. Otherwise, timing might remove+add
>>> a specific NSID before the host requests the NS Descriptor for the
>>> racing removal, then it would just get the "non-matching UUID" issue
>>> anyway.
>>
>> But we read the Namespace Change List log in step 2)
>> (Not that we're doing anything with it, but that's another story...)
>> Hmm?
> 
> Indeed. So maybe we should just move the log page retrevial *after* we
> scan the identify active namespace list processing?

Not sure how that would help. We are getting an 'ANA inaccessible' with 
DNR set status when retrieving the NS descriptor list for the namespace.
And this has to happen after we read the list of active namespace.
Perfectly legit, but doesn't tell us anything if the namespace is 
present at all.
All we know is that we cannot get information about that, and my 
argument is that we should treat this as equivalent to a namespace
not present.

And I really don't want to delay clearing of the AEN, as that would
open the door for us to miss subsequent AENs, getting even more 
out-of-sync with the target.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-11-29 14:06 [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed Hannes Reinecke
  2024-12-03 19:15 ` Keith Busch
@ 2024-12-24 11:35 ` Sagi Grimberg
  2024-12-25  9:58 ` Sagi Grimberg
  2 siblings, 0 replies; 23+ messages in thread
From: Sagi Grimberg @ 2024-12-24 11:35 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme




On 29/11/2024 16:06, Hannes Reinecke wrote:
> When a namespace gets unmapped on the target during scanning
> nvme_identify_ns_descs() returns with a non-retryable error.
> With the currrent code we will ignore that error on the grounds
> that we failed to get information, and hence cannot make any
> decisions whether to keep or remove that namespace.
> But a non-retryable error implies that the namespace is _not_
> present as we cannot retry that command and will never get
> information about that namespace.
> And we need to remove the namespace during scanning, as otherwise
> the AEN informing us about a namespace change will find the NSID
> present, but nvme_validate_ns() will fail, and the namespace
> will never be updated with the correct information.

Hannes, can you please describe your test case?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-11-29 14:06 [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed Hannes Reinecke
  2024-12-03 19:15 ` Keith Busch
  2024-12-24 11:35 ` Sagi Grimberg
@ 2024-12-25  9:58 ` Sagi Grimberg
  2025-01-07  8:11   ` Hannes Reinecke
  2 siblings, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2024-12-25  9:58 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme




On 29/11/2024 16:06, Hannes Reinecke wrote:
> When a namespace gets unmapped on the target during scanning
> nvme_identify_ns_descs() returns with a non-retryable error.
> With the currrent code we will ignore that error on the grounds
> that we failed to get information, and hence cannot make any
> decisions whether to keep or remove that namespace.
> But a non-retryable error implies that the namespace is _not_
> present as we cannot retry that command and will never get
> information about that namespace.
> And we need to remove the namespace during scanning, as otherwise
> the AEN informing us about a namespace change will find the NSID
> present, but nvme_validate_ns() will fail, and the namespace
> will never be updated with the correct information.

Isn't that a bit harsh?
I would expect to see a specific status line NVME_SC_INVALID_NS or 
equivalent
for a full removal of the namespace?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-25  9:58 ` Sagi Grimberg
@ 2025-01-07  8:11   ` Hannes Reinecke
  2025-01-08 10:49     ` Sagi Grimberg
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2025-01-07  8:11 UTC (permalink / raw)
  To: Sagi Grimberg, Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 12/25/24 10:58, Sagi Grimberg wrote:
> 
> 
> 
> On 29/11/2024 16:06, Hannes Reinecke wrote:
>> When a namespace gets unmapped on the target during scanning
>> nvme_identify_ns_descs() returns with a non-retryable error.
>> With the currrent code we will ignore that error on the grounds
>> that we failed to get information, and hence cannot make any
>> decisions whether to keep or remove that namespace.
>> But a non-retryable error implies that the namespace is _not_
>> present as we cannot retry that command and will never get
>> information about that namespace.
>> And we need to remove the namespace during scanning, as otherwise
>> the AEN informing us about a namespace change will find the NSID
>> present, but nvme_validate_ns() will fail, and the namespace
>> will never be updated with the correct information.
> 
> Isn't that a bit harsh?
> I would expect to see a specific status line NVME_SC_INVALID_NS or 
> equivalent for a full removal of the namespace?

Does it matter? If we get a DNR status back from 
nvme_identify_ns_descs() we _cannot_ resend that command.
Meaning we cannot get the namespace descriptors. As we
rely on these descriptors to properly map the namespace
we cannot correctly work with it, and we're better off
to pretend the namespace is gone and wait for an AEN
indicating that the namespace (or controller) state has changed.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-06 12:41           ` Hannes Reinecke
@ 2025-01-07 16:01             ` Keith Busch
  2025-01-11 14:01             ` Nilay Shroff
  1 sibling, 0 replies; 23+ messages in thread
From: Keith Busch @ 2025-01-07 16:01 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On Fri, Dec 06, 2024 at 01:41:08PM +0100, Hannes Reinecke wrote:
> On 12/5/24 17:15, Keith Busch wrote:
> > On Thu, Dec 05, 2024 at 01:30:39PM +0100, Hannes Reinecke wrote:
> > > On 12/4/24 17:39, Keith Busch wrote:
> > > > > 1) AEN triggers a rescan
> > > > > 2) List of active namespace is retrieved
> > > > > -> NSID A gets unmapped (or moved to another node in the cluster)
> > > > > 3) Scan of NSID A returns an error with DNR set.
> > > > > Without this patch we keep the namespace around, so eventually we'll
> > > > > trip over the 'non-matching UUID' check once the NSID is reused.
> > > > 
> > > > I'm still not sure that makes sense. The target shouldn't attach the new
> > > > namespace until the host acknowledges the removal of the older NSID via
> > > > the Namespace Change List log. Until the log is read, the inventory for
> > > > removed namespaces should be latched. Otherwise, timing might remove+add
> > > > a specific NSID before the host requests the NS Descriptor for the
> > > > racing removal, then it would just get the "non-matching UUID" issue
> > > > anyway.
> > > 
> > > But we read the Namespace Change List log in step 2)
> > > (Not that we're doing anything with it, but that's another story...)
> > > Hmm?
> > 
> > Indeed. So maybe we should just move the log page retrevial *after* we
> > scan the identify active namespace list processing?
> 
> Not sure how that would help. We are getting an 'ANA inaccessible' with DNR
> set status when retrieving the NS descriptor list for the namespace.
> And this has to happen after we read the list of active namespace.
> Perfectly legit, but doesn't tell us anything if the namespace is present at
> all.
> All we know is that we cannot get information about that, and my argument is
> that we should treat this as equivalent to a namespace
> not present.
> 
> And I really don't want to delay clearing of the AEN, as that would
> open the door for us to miss subsequent AENs, getting even more out-of-sync
> with the target.

I just thought it would be cleaner if the driver could observe the
removed namespace is not present in the active namespace list
identification, so that all removals can happen in a single path.

What I'm worried about with your proposal is that it indicates we can
get a rapid remove + add sequence such that timing may create a
condition where instead of getting "ANA inaccessible w/ DNR", we'd
observe a mismached UUID.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-07  8:11   ` Hannes Reinecke
@ 2025-01-08 10:49     ` Sagi Grimberg
  2025-01-08 15:45       ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2025-01-08 10:49 UTC (permalink / raw)
  To: Hannes Reinecke, Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme




On 07/01/2025 10:11, Hannes Reinecke wrote:
> On 12/25/24 10:58, Sagi Grimberg wrote:
>>
>>
>>
>> On 29/11/2024 16:06, Hannes Reinecke wrote:
>>> When a namespace gets unmapped on the target during scanning
>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>> With the currrent code we will ignore that error on the grounds
>>> that we failed to get information, and hence cannot make any
>>> decisions whether to keep or remove that namespace.
>>> But a non-retryable error implies that the namespace is _not_
>>> present as we cannot retry that command and will never get
>>> information about that namespace.
>>> And we need to remove the namespace during scanning, as otherwise
>>> the AEN informing us about a namespace change will find the NSID
>>> present, but nvme_validate_ns() will fail, and the namespace
>>> will never be updated with the correct information.
>>
>> Isn't that a bit harsh?
>> I would expect to see a specific status line NVME_SC_INVALID_NS or 
>> equivalent for a full removal of the namespace?
>
> Does it matter? If we get a DNR status back from 
> nvme_identify_ns_descs() we _cannot_ resend that command.
> Meaning we cannot get the namespace descriptors. As we
> rely on these descriptors to properly map the namespace
> we cannot correctly work with it, and we're better off
> to pretend the namespace is gone and wait for an AEN
> indicating that the namespace (or controller) state has changed.

I think it does matter. I don't think we should be removing the NS without
the controller telling us that it is actually removed.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-08 10:49     ` Sagi Grimberg
@ 2025-01-08 15:45       ` Hannes Reinecke
  2025-01-10 23:16         ` Sagi Grimberg
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2025-01-08 15:45 UTC (permalink / raw)
  To: Sagi Grimberg, Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 1/8/25 11:49, Sagi Grimberg wrote:
> 
> 
> 
> On 07/01/2025 10:11, Hannes Reinecke wrote:
>> On 12/25/24 10:58, Sagi Grimberg wrote:
>>>
>>>
>>>
>>> On 29/11/2024 16:06, Hannes Reinecke wrote:
>>>> When a namespace gets unmapped on the target during scanning
>>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>>> With the currrent code we will ignore that error on the grounds
>>>> that we failed to get information, and hence cannot make any
>>>> decisions whether to keep or remove that namespace.
>>>> But a non-retryable error implies that the namespace is _not_
>>>> present as we cannot retry that command and will never get
>>>> information about that namespace.
>>>> And we need to remove the namespace during scanning, as otherwise
>>>> the AEN informing us about a namespace change will find the NSID
>>>> present, but nvme_validate_ns() will fail, and the namespace
>>>> will never be updated with the correct information.
>>>
>>> Isn't that a bit harsh?
>>> I would expect to see a specific status line NVME_SC_INVALID_NS or 
>>> equivalent for a full removal of the namespace?
>>
>> Does it matter? If we get a DNR status back from 
>> nvme_identify_ns_descs() we _cannot_ resend that command.
>> Meaning we cannot get the namespace descriptors. As we
>> rely on these descriptors to properly map the namespace
>> we cannot correctly work with it, and we're better off
>> to pretend the namespace is gone and wait for an AEN
>> indicating that the namespace (or controller) state has changed.
> 
> I think it does matter. I don't think we should be removing the NS without
> the controller telling us that it is actually removed.

But what would be the recovery action here?
If the 'identify ns descs' command cannot be retried, how are
we going to map the namespace to an ns_head?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-08 15:45       ` Hannes Reinecke
@ 2025-01-10 23:16         ` Sagi Grimberg
  2025-01-13  7:50           ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2025-01-10 23:16 UTC (permalink / raw)
  To: Hannes Reinecke, Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme




On 08/01/2025 17:45, Hannes Reinecke wrote:
> On 1/8/25 11:49, Sagi Grimberg wrote:
>>
>>
>>
>> On 07/01/2025 10:11, Hannes Reinecke wrote:
>>> On 12/25/24 10:58, Sagi Grimberg wrote:
>>>>
>>>>
>>>>
>>>> On 29/11/2024 16:06, Hannes Reinecke wrote:
>>>>> When a namespace gets unmapped on the target during scanning
>>>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>>>> With the currrent code we will ignore that error on the grounds
>>>>> that we failed to get information, and hence cannot make any
>>>>> decisions whether to keep or remove that namespace.
>>>>> But a non-retryable error implies that the namespace is _not_
>>>>> present as we cannot retry that command and will never get
>>>>> information about that namespace.
>>>>> And we need to remove the namespace during scanning, as otherwise
>>>>> the AEN informing us about a namespace change will find the NSID
>>>>> present, but nvme_validate_ns() will fail, and the namespace
>>>>> will never be updated with the correct information.
>>>>
>>>> Isn't that a bit harsh?
>>>> I would expect to see a specific status line NVME_SC_INVALID_NS or 
>>>> equivalent for a full removal of the namespace?
>>>
>>> Does it matter? If we get a DNR status back from 
>>> nvme_identify_ns_descs() we _cannot_ resend that command.
>>> Meaning we cannot get the namespace descriptors. As we
>>> rely on these descriptors to properly map the namespace
>>> we cannot correctly work with it, and we're better off
>>> to pretend the namespace is gone and wait for an AEN
>>> indicating that the namespace (or controller) state has changed.
>>
>> I think it does matter. I don't think we should be removing the NS 
>> without
>> the controller telling us that it is actually removed.
>
> But what would be the recovery action here?
> If the 'identify ns descs' command cannot be retried, how are
> we going to map the namespace to an ns_head?

Let's take a step back here. Can you describe the scenario you hit? what 
was the error
status that you observed?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2024-12-06 12:41           ` Hannes Reinecke
  2025-01-07 16:01             ` Keith Busch
@ 2025-01-11 14:01             ` Nilay Shroff
  2025-01-13  7:43               ` Hannes Reinecke
  1 sibling, 1 reply; 23+ messages in thread
From: Nilay Shroff @ 2025-01-11 14:01 UTC (permalink / raw)
  To: Hannes Reinecke, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme



On 12/6/24 6:11 PM, Hannes Reinecke wrote:
> On 12/5/24 17:15, Keith Busch wrote:
>> On Thu, Dec 05, 2024 at 01:30:39PM +0100, Hannes Reinecke wrote:
>>> On 12/4/24 17:39, Keith Busch wrote:
>>>>> 1) AEN triggers a rescan
>>>>> 2) List of active namespace is retrieved
>>>>> -> NSID A gets unmapped (or moved to another node in the cluster)
>>>>> 3) Scan of NSID A returns an error with DNR set.
>>>>> Without this patch we keep the namespace around, so eventually we'll
>>>>> trip over the 'non-matching UUID' check once the NSID is reused.
>>>>
>>>> I'm still not sure that makes sense. The target shouldn't attach the new
>>>> namespace until the host acknowledges the removal of the older NSID via
>>>> the Namespace Change List log. Until the log is read, the inventory for
>>>> removed namespaces should be latched. Otherwise, timing might remove+add
>>>> a specific NSID before the host requests the NS Descriptor for the
>>>> racing removal, then it would just get the "non-matching UUID" issue
>>>> anyway.
>>>
>>> But we read the Namespace Change List log in step 2)
>>> (Not that we're doing anything with it, but that's another story...)
>>> Hmm?
>>
>> Indeed. So maybe we should just move the log page retrevial *after* we
>> scan the identify active namespace list processing?
> 
> Not sure how that would help. We are getting an 'ANA inaccessible' with DNR set status when retrieving the NS descriptor list for the namespace.
> And this has to happen after we read the list of active namespace.
> Perfectly legit, but doesn't tell us anything if the namespace is present at all.
> All we know is that we cannot get information about that, and my argument is that we should treat this as equivalent to a namespace
> not present.
> 
I think when a nsid is in "ANA inaccessible" state sending any command which 
has that nsid described in it would be aborted by the controller. 
Per the NVMe 2.0  spec (quoting a snippet from section 8.1.3.3 ANA 
Inaccessible state):

"A controller shall abort commands, other than those described in section 8.1.4, with a status code of
Asymmetric Access Inaccessible if those commands are submitted while the relationship between the
namespace specified by the command and the controller processing the command is in this state.

While ANA Inaccessible state is reported by a controller for the namespace, the host should retry the
command on a different controller that is reporting ANA Optimized state or ANA Non-Optimized state. If no
controllers are reporting ANA Optimized state or ANA Non-Optimized state, then a transition may be
occurring such that a controller reporting the Inaccessible state may become accessible and the host should
retry the command on the controller reporting Inaccessible state for at least ANATT seconds (refer to Figure
275). Refer to section 8.10.2."

So as we can see above, removing nsid immediately just because ns-descriptor-list command
failed with status "ANA inaccessible and DNR set" may not be correct. Because it's possible
that ANA state may transition back to optimized/non-optimized state, So instead of removing
ns from host, we may retry that command on another controller which is reporting ANA optimized/
non-optimized state if that nsid is attached to more than one controller. If nsid is private
(means attached only to one controller) then we may not have any option but to skip this nsid
during scan and wait until either ANATT timer expires or nsid transition back from ANA 
inaccessible to ANA optimized/non-optimized state. 

Yes it might be possible that while nsid is in ANA inaccessible state, it might be un-mapped 
from the target controller. But in that case target should send namespace change notice to the 
host and that shall trigger ns scan. And as Keith proposed, we probably want to move the changed
log ns retrieval just after we get active list of ns.

Thanks,
--Nilay


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-11 14:01             ` Nilay Shroff
@ 2025-01-13  7:43               ` Hannes Reinecke
  2025-01-13 14:12                 ` Nilay Shroff
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2025-01-13  7:43 UTC (permalink / raw)
  To: Nilay Shroff, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On 1/11/25 15:01, Nilay Shroff wrote:
> 
> 
> On 12/6/24 6:11 PM, Hannes Reinecke wrote:
>> On 12/5/24 17:15, Keith Busch wrote:
>>> On Thu, Dec 05, 2024 at 01:30:39PM +0100, Hannes Reinecke wrote:
>>>> On 12/4/24 17:39, Keith Busch wrote:
>>>>>> 1) AEN triggers a rescan
>>>>>> 2) List of active namespace is retrieved
>>>>>> -> NSID A gets unmapped (or moved to another node in the cluster)
>>>>>> 3) Scan of NSID A returns an error with DNR set.
>>>>>> Without this patch we keep the namespace around, so eventually we'll
>>>>>> trip over the 'non-matching UUID' check once the NSID is reused.
>>>>>
>>>>> I'm still not sure that makes sense. The target shouldn't attach the new
>>>>> namespace until the host acknowledges the removal of the older NSID via
>>>>> the Namespace Change List log. Until the log is read, the inventory for
>>>>> removed namespaces should be latched. Otherwise, timing might remove+add
>>>>> a specific NSID before the host requests the NS Descriptor for the
>>>>> racing removal, then it would just get the "non-matching UUID" issue
>>>>> anyway.
>>>>
>>>> But we read the Namespace Change List log in step 2)
>>>> (Not that we're doing anything with it, but that's another story...)
>>>> Hmm?
>>>
>>> Indeed. So maybe we should just move the log page retrevial *after* we
>>> scan the identify active namespace list processing?
>>
>> Not sure how that would help. We are getting an 'ANA inaccessible' with DNR set status when retrieving the NS descriptor list for the namespace.
>> And this has to happen after we read the list of active namespace.
>> Perfectly legit, but doesn't tell us anything if the namespace is present at all.
>> All we know is that we cannot get information about that, and my argument is that we should treat this as equivalent to a namespace
>> not present.
>>
> I think when a nsid is in "ANA inaccessible" state sending any command which
> has that nsid described in it would be aborted by the controller.
> Per the NVMe 2.0  spec (quoting a snippet from section 8.1.3.3 ANA
> Inaccessible state):
> 
> "A controller shall abort commands, other than those described in section 8.1.4, with a status code of
> Asymmetric Access Inaccessible if those commands are submitted while the relationship between the
> namespace specified by the command and the controller processing the command is in this state.
> 
> While ANA Inaccessible state is reported by a controller for the namespace, the host should retry the
> command on a different controller that is reporting ANA Optimized state or ANA Non-Optimized state. If no
> controllers are reporting ANA Optimized state or ANA Non-Optimized state, then a transition may be
> occurring such that a controller reporting the Inaccessible state may become accessible and the host should
> retry the command on the controller reporting Inaccessible state for at least ANATT seconds (refer to Figure
> 275). Refer to section 8.10.2."
> 
> So as we can see above, removing nsid immediately just because ns-descriptor-list command
> failed with status "ANA inaccessible and DNR set" may not be correct. Because it's possible
> that ANA state may transition back to optimized/non-optimized state, So instead of removing
> ns from host, we may retry that command on another controller which is reporting ANA optimized/
> non-optimized state if that nsid is attached to more than one controller. If nsid is private
> (means attached only to one controller) then we may not have any option but to skip this nsid
> during scan and wait until either ANATT timer expires or nsid transition back from ANA
> inaccessible to ANA optimized/non-optimized state.
> 
I would agree with you for any other command. But the 'identify ns desc' 
command is the very first command send to the namespace, and it's 
required by our implementation to correctly attach the namespace to the 
corresponding ns_head structure.
We simply _cannot_ retry that command on another path, as that other 
path might (and, actually, is expected to) yield different information.

> Yes it might be possible that while nsid is in ANA inaccessible state, it might be un-mapped
> from the target controller. But in that case target should send namespace change notice to the
> host and that shall trigger ns scan. And as Keith proposed, we probably want to move the changed
> log ns retrieval just after we get active list of ns.
> 
That is precisely the scenario which I ran into.
We _do_ get the AEN changed event, but by the time when we start the ns 
scan the NSID has already been reassigned to a namespace with a 
different UUID.

When nvme_scan_ns() is called,  nvme_find_get_ns() would return 'true'
(as we still have the stale namespace in our lists), but the subsequent
nvme_validate_ns() would fail (as the UUID is different).
So the old namespace will be removed, but the new namespace will never 
be rescanned.

So my argument is that in this specific case the 'ANA inaccessible' nvme
state should _not_ be retried, but should be treated as identical to
'invalid namespace' errors.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-10 23:16         ` Sagi Grimberg
@ 2025-01-13  7:50           ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2025-01-13  7:50 UTC (permalink / raw)
  To: Sagi Grimberg, Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 1/11/25 00:16, Sagi Grimberg wrote:
> 
> 
> 
> On 08/01/2025 17:45, Hannes Reinecke wrote:
>> On 1/8/25 11:49, Sagi Grimberg wrote:
>>>
>>>
>>>
>>> On 07/01/2025 10:11, Hannes Reinecke wrote:
>>>> On 12/25/24 10:58, Sagi Grimberg wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 29/11/2024 16:06, Hannes Reinecke wrote:
>>>>>> When a namespace gets unmapped on the target during scanning
>>>>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>>>>> With the currrent code we will ignore that error on the grounds
>>>>>> that we failed to get information, and hence cannot make any
>>>>>> decisions whether to keep or remove that namespace.
>>>>>> But a non-retryable error implies that the namespace is _not_
>>>>>> present as we cannot retry that command and will never get
>>>>>> information about that namespace.
>>>>>> And we need to remove the namespace during scanning, as otherwise
>>>>>> the AEN informing us about a namespace change will find the NSID
>>>>>> present, but nvme_validate_ns() will fail, and the namespace
>>>>>> will never be updated with the correct information.
>>>>>
>>>>> Isn't that a bit harsh?
>>>>> I would expect to see a specific status line NVME_SC_INVALID_NS or 
>>>>> equivalent for a full removal of the namespace?
>>>>
>>>> Does it matter? If we get a DNR status back from 
>>>> nvme_identify_ns_descs() we _cannot_ resend that command.
>>>> Meaning we cannot get the namespace descriptors. As we
>>>> rely on these descriptors to properly map the namespace
>>>> we cannot correctly work with it, and we're better off
>>>> to pretend the namespace is gone and wait for an AEN
>>>> indicating that the namespace (or controller) state has changed.
>>>
>>> I think it does matter. I don't think we should be removing the NS 
>>> without
>>> the controller telling us that it is actually removed.
>>
>> But what would be the recovery action here?
>> If the 'identify ns descs' command cannot be retried, how are
>> we going to map the namespace to an ns_head?
> 
> Let's take a step back here. Can you describe the scenario you hit? what 
> was the error
> status that you observed?

See my reply to Nilay.

Target unmaps the namespace, and we get an AEN for 'namespace changed'.
We start a scan, but by the time we retrieve the list of namespaces the 
NSID has been reassigned to another namespace with a different UUID.
Then our stale namespace will be removed in nvme_validate_ns() and
not rescanned.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-13  7:43               ` Hannes Reinecke
@ 2025-01-13 14:12                 ` Nilay Shroff
  2025-01-13 14:29                   ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Nilay Shroff @ 2025-01-13 14:12 UTC (permalink / raw)
  To: Hannes Reinecke, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme



On 1/13/25 1:13 PM, Hannes Reinecke wrote:
> On 1/11/25 15:01, Nilay Shroff wrote:
>>
>>
>> On 12/6/24 6:11 PM, Hannes Reinecke wrote:
>>> On 12/5/24 17:15, Keith Busch wrote:
>>>> On Thu, Dec 05, 2024 at 01:30:39PM +0100, Hannes Reinecke wrote:
>>>>> On 12/4/24 17:39, Keith Busch wrote:
>>>>>>> 1) AEN triggers a rescan
>>>>>>> 2) List of active namespace is retrieved
>>>>>>> -> NSID A gets unmapped (or moved to another node in the cluster)
>>>>>>> 3) Scan of NSID A returns an error with DNR set.
>>>>>>> Without this patch we keep the namespace around, so eventually we'll
>>>>>>> trip over the 'non-matching UUID' check once the NSID is reused.
>>>>>>
>>>>>> I'm still not sure that makes sense. The target shouldn't attach the new
>>>>>> namespace until the host acknowledges the removal of the older NSID via
>>>>>> the Namespace Change List log. Until the log is read, the inventory for
>>>>>> removed namespaces should be latched. Otherwise, timing might remove+add
>>>>>> a specific NSID before the host requests the NS Descriptor for the
>>>>>> racing removal, then it would just get the "non-matching UUID" issue
>>>>>> anyway.
>>>>>
>>>>> But we read the Namespace Change List log in step 2)
>>>>> (Not that we're doing anything with it, but that's another story...)
>>>>> Hmm?
>>>>
>>>> Indeed. So maybe we should just move the log page retrevial *after* we
>>>> scan the identify active namespace list processing?
>>>
>>> Not sure how that would help. We are getting an 'ANA inaccessible' with DNR set status when retrieving the NS descriptor list for the namespace.
>>> And this has to happen after we read the list of active namespace.
>>> Perfectly legit, but doesn't tell us anything if the namespace is present at all.
>>> All we know is that we cannot get information about that, and my argument is that we should treat this as equivalent to a namespace
>>> not present.
>>>
>> I think when a nsid is in "ANA inaccessible" state sending any command which
>> has that nsid described in it would be aborted by the controller.
>> Per the NVMe 2.0  spec (quoting a snippet from section 8.1.3.3 ANA
>> Inaccessible state):
>>
>> "A controller shall abort commands, other than those described in section 8.1.4, with a status code of
>> Asymmetric Access Inaccessible if those commands are submitted while the relationship between the
>> namespace specified by the command and the controller processing the command is in this state.
>>
>> While ANA Inaccessible state is reported by a controller for the namespace, the host should retry the
>> command on a different controller that is reporting ANA Optimized state or ANA Non-Optimized state. If no
>> controllers are reporting ANA Optimized state or ANA Non-Optimized state, then a transition may be
>> occurring such that a controller reporting the Inaccessible state may become accessible and the host should
>> retry the command on the controller reporting Inaccessible state for at least ANATT seconds (refer to Figure
>> 275). Refer to section 8.10.2."
>>
>> So as we can see above, removing nsid immediately just because ns-descriptor-list command
>> failed with status "ANA inaccessible and DNR set" may not be correct. Because it's possible
>> that ANA state may transition back to optimized/non-optimized state, So instead of removing
>> ns from host, we may retry that command on another controller which is reporting ANA optimized/
>> non-optimized state if that nsid is attached to more than one controller. If nsid is private
>> (means attached only to one controller) then we may not have any option but to skip this nsid
>> during scan and wait until either ANATT timer expires or nsid transition back from ANA
>> inaccessible to ANA optimized/non-optimized state.
>>
> I would agree with you for any other command. But the 'identify ns desc' command is the very first command send to the namespace, and it's required by our implementation to correctly attach the namespace to the corresponding ns_head structure.
> We simply _cannot_ retry that command on another path, as that other path might (and, actually, is expected to) yield different information.
> 
>> Yes it might be possible that while nsid is in ANA inaccessible state, it might be un-mapped
>> from the target controller. But in that case target should send namespace change notice to the
>> host and that shall trigger ns scan. And as Keith proposed, we probably want to move the changed
>> log ns retrieval just after we get active list of ns.
>>
> That is precisely the scenario which I ran into.
> We _do_ get the AEN changed event, but by the time when we start the ns scan the NSID has already been reassigned to a namespace with a different UUID.
> 
> When nvme_scan_ns() is called,  nvme_find_get_ns() would return 'true'
> (as we still have the stale namespace in our lists), but the subsequent
> nvme_validate_ns() would fail (as the UUID is different).
> So the old namespace will be removed, but the new namespace will never be rescanned.
> 
> So my argument is that in this specific case the 'ANA inaccessible' nvme
> state should _not_ be retried, but should be treated as identical to
> 'invalid namespace' errors.
> 
I think I got what you're trying to propose. So when this issue manifests, on host, if we 
could possibly differentiate between nvme_identify_ns_descs() failed reasons : is it failed
because the nsid has been removed/un-mapped on the target or is it failed due to "ANA inaccessible" 
state? IMO, for "ANA inaccessible" status, we may not want to immediately remove the ns from
the host (due to reason I mentioned earlier per NVMe spec section 8.1.3.3), however for the
other error case we may remove the ns from the host. 
I think issuing ns descriptor list command to target for a nsid which doesn't exist on the 
target would return buffer filled with all zeros. So that might be an indication that ns has 
been removed from the target.
 
Thanks,
--Nilay


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-13 14:12                 ` Nilay Shroff
@ 2025-01-13 14:29                   ` Hannes Reinecke
  2025-01-15  7:48                     ` Nilay Shroff
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2025-01-13 14:29 UTC (permalink / raw)
  To: Nilay Shroff, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On 1/13/25 15:12, Nilay Shroff wrote:
> 
> 
> On 1/13/25 1:13 PM, Hannes Reinecke wrote:
>> On 1/11/25 15:01, Nilay Shroff wrote:
>>>
>>>
[ .. ]
>> So my argument is that in this specific case the 'ANA inaccessible' nvme
>> state should _not_ be retried, but should be treated as identical to
>> 'invalid namespace' errors.
>>
> I think I got what you're trying to propose. So when this issue manifests, on host, if we
> could possibly differentiate between nvme_identify_ns_descs() failed reasons : is it failed
> because the nsid has been removed/un-mapped on the target or is it failed due to "ANA inaccessible"
> state? IMO, for "ANA inaccessible" status, we may not want to immediately remove the ns from
> the host (due to reason I mentioned earlier per NVMe spec section 8.1.3.3), however for the
> other error case we may remove the ns from the host.
> I think issuing ns descriptor list command to target for a nsid which doesn't exist on the
> target would return buffer filled with all zeros. So that might be an indication that ns has
> been removed from the target.
>   
But only if the NSID has not been remapped in the meantime.
If it has (as in my case) the ns descriptor list will be valid, it just
refers to another namespace.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-13 14:29                   ` Hannes Reinecke
@ 2025-01-15  7:48                     ` Nilay Shroff
  2025-01-15  8:02                       ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Nilay Shroff @ 2025-01-15  7:48 UTC (permalink / raw)
  To: Hannes Reinecke, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme



On 1/13/25 7:59 PM, Hannes Reinecke wrote:
> On 1/13/25 15:12, Nilay Shroff wrote:
>>
>>
>> On 1/13/25 1:13 PM, Hannes Reinecke wrote:
>>> On 1/11/25 15:01, Nilay Shroff wrote:
>>>>
>>>>
> [ .. ]
>>> So my argument is that in this specific case the 'ANA inaccessible' nvme
>>> state should _not_ be retried, but should be treated as identical to
>>> 'invalid namespace' errors.
>>>
>> I think I got what you're trying to propose. So when this issue manifests, on host, if we
>> could possibly differentiate between nvme_identify_ns_descs() failed reasons : is it failed
>> because the nsid has been removed/un-mapped on the target or is it failed due to "ANA inaccessible"
>> state? IMO, for "ANA inaccessible" status, we may not want to immediately remove the ns from
>> the host (due to reason I mentioned earlier per NVMe spec section 8.1.3.3), however for the
>> other error case we may remove the ns from the host.
>> I think issuing ns descriptor list command to target for a nsid which doesn't exist on the
>> target would return buffer filled with all zeros. So that might be an indication that ns has
>> been removed from the target.
>>   
> But only if the NSID has not been remapped in the meantime.
> If it has (as in my case) the ns descriptor list will be valid, it just
> refers to another namespace.
> 
If NSID has been unmapped and then remapped on the targer then in that case, 
host would hit the mismatch uuid case (under nvme_validate_ns()) and so host 
would then remove the namespace. 

I think there are two cases, 
Case1:
1. AEN triggers rescan
2. List of active nsid is retrieved
-> NSID A is removed on the target
3. Scanning of NSID A fails (i.e. nvme_identify_ns_descs() returns buffer filled with all zeros)
-> host removes the respective namespace

Case2:
1. AEN triggers rescan
2. List of active nsid is retrieved
-> NSID A is unmapped and remapped (possibly with different uuid) on target
3. Scanning of NSID A succeed 
4. host finds the mismatch uuid for NSID A (i.e. nvme_validate_ns() fails)
-> host removes the respective namespace
 
Thanks,
--Nilay


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-15  7:48                     ` Nilay Shroff
@ 2025-01-15  8:02                       ` Hannes Reinecke
  2025-01-15  8:18                         ` Nilay Shroff
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2025-01-15  8:02 UTC (permalink / raw)
  To: Nilay Shroff, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On 1/15/25 08:48, Nilay Shroff wrote:
> 
> 
> On 1/13/25 7:59 PM, Hannes Reinecke wrote:
>> On 1/13/25 15:12, Nilay Shroff wrote:
>>>
>>>
>>> On 1/13/25 1:13 PM, Hannes Reinecke wrote:
>>>> On 1/11/25 15:01, Nilay Shroff wrote:
>>>>>
>>>>>
>> [ .. ]
>>>> So my argument is that in this specific case the 'ANA inaccessible' nvme
>>>> state should _not_ be retried, but should be treated as identical to
>>>> 'invalid namespace' errors.
>>>>
>>> I think I got what you're trying to propose. So when this issue manifests, on host, if we
>>> could possibly differentiate between nvme_identify_ns_descs() failed reasons : is it failed
>>> because the nsid has been removed/un-mapped on the target or is it failed due to "ANA inaccessible"
>>> state? IMO, for "ANA inaccessible" status, we may not want to immediately remove the ns from
>>> the host (due to reason I mentioned earlier per NVMe spec section 8.1.3.3), however for the
>>> other error case we may remove the ns from the host.
>>> I think issuing ns descriptor list command to target for a nsid which doesn't exist on the
>>> target would return buffer filled with all zeros. So that might be an indication that ns has
>>> been removed from the target.
>>>    
>> But only if the NSID has not been remapped in the meantime.
>> If it has (as in my case) the ns descriptor list will be valid, it just
>> refers to another namespace.
>>
> If NSID has been unmapped and then remapped on the targer then in that case,
> host would hit the mismatch uuid case (under nvme_validate_ns()) and so host
> would then remove the namespace.
> 
> I think there are two cases,
> Case1:
> 1. AEN triggers rescan
> 2. List of active nsid is retrieved
> -> NSID A is removed on the target
> 3. Scanning of NSID A fails (i.e. nvme_identify_ns_descs() returns buffer filled with all zeros)
> -> host removes the respective namespace
> 
> Case2:
> 1. AEN triggers rescan
> 2. List of active nsid is retrieved
> -> NSID A is unmapped and remapped (possibly with different uuid) on target
> 3. Scanning of NSID A succeed
> 4. host finds the mismatch uuid for NSID A (i.e. nvme_validate_ns() fails)
> -> host removes the respective namespace
>   
Entirely correct.
But Case2 results in the new namespace never to be scanned, and not 
visible to the OS. Which is the error I'm fighting with.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-15  8:02                       ` Hannes Reinecke
@ 2025-01-15  8:18                         ` Nilay Shroff
  2025-01-15  8:22                           ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Nilay Shroff @ 2025-01-15  8:18 UTC (permalink / raw)
  To: Hannes Reinecke, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme



On 1/15/25 1:32 PM, Hannes Reinecke wrote:
> On 1/15/25 08:48, Nilay Shroff wrote:
>>
>>
>> On 1/13/25 7:59 PM, Hannes Reinecke wrote:
>>> On 1/13/25 15:12, Nilay Shroff wrote:
>>>>
>>>>
>>>> On 1/13/25 1:13 PM, Hannes Reinecke wrote:
>>>>> On 1/11/25 15:01, Nilay Shroff wrote:
>>>>>>
>>>>>>
>>> [ .. ]
>>>>> So my argument is that in this specific case the 'ANA inaccessible' nvme
>>>>> state should _not_ be retried, but should be treated as identical to
>>>>> 'invalid namespace' errors.
>>>>>
>>>> I think I got what you're trying to propose. So when this issue manifests, on host, if we
>>>> could possibly differentiate between nvme_identify_ns_descs() failed reasons : is it failed
>>>> because the nsid has been removed/un-mapped on the target or is it failed due to "ANA inaccessible"
>>>> state? IMO, for "ANA inaccessible" status, we may not want to immediately remove the ns from
>>>> the host (due to reason I mentioned earlier per NVMe spec section 8.1.3.3), however for the
>>>> other error case we may remove the ns from the host.
>>>> I think issuing ns descriptor list command to target for a nsid which doesn't exist on the
>>>> target would return buffer filled with all zeros. So that might be an indication that ns has
>>>> been removed from the target.
>>>>    
>>> But only if the NSID has not been remapped in the meantime.
>>> If it has (as in my case) the ns descriptor list will be valid, it just
>>> refers to another namespace.
>>>
>> If NSID has been unmapped and then remapped on the targer then in that case,
>> host would hit the mismatch uuid case (under nvme_validate_ns()) and so host
>> would then remove the namespace.
>>
>> I think there are two cases,
>> Case1:
>> 1. AEN triggers rescan
>> 2. List of active nsid is retrieved
>> -> NSID A is removed on the target
>> 3. Scanning of NSID A fails (i.e. nvme_identify_ns_descs() returns buffer filled with all zeros)
>> -> host removes the respective namespace
>>
>> Case2:
>> 1. AEN triggers rescan
>> 2. List of active nsid is retrieved
>> -> NSID A is unmapped and remapped (possibly with different uuid) on target
>> 3. Scanning of NSID A succeed
>> 4. host finds the mismatch uuid for NSID A (i.e. nvme_validate_ns() fails)
>> -> host removes the respective namespace
>>   
> Entirely correct.
> But Case2 results in the new namespace never to be scanned, and not visible to the OS. Which is the error I'm fighting with.
> 
Ok but then it seems that your proposed patch doesn't address Case2, isn't it? 
It appears to me that the patch tries to address Case1 but with error code 
of "ANA inaccessible and DNR" set. IMO for case2, we may want to schedule 
queue scan again if nvme_validate_ns() fails due to the mismatch uuid.

Thanks,
--Nilay



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
  2025-01-15  8:18                         ` Nilay Shroff
@ 2025-01-15  8:22                           ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2025-01-15  8:22 UTC (permalink / raw)
  To: Nilay Shroff, Keith Busch
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, linux-nvme

On 1/15/25 09:18, Nilay Shroff wrote:
> 
> 
> On 1/15/25 1:32 PM, Hannes Reinecke wrote:
>> On 1/15/25 08:48, Nilay Shroff wrote:
>>>
>>>
>>> On 1/13/25 7:59 PM, Hannes Reinecke wrote:
>>>> On 1/13/25 15:12, Nilay Shroff wrote:
>>>>>
>>>>>
>>>>> On 1/13/25 1:13 PM, Hannes Reinecke wrote:
>>>>>> On 1/11/25 15:01, Nilay Shroff wrote:
>>>>>>>
>>>>>>>
>>>> [ .. ]
>>>>>> So my argument is that in this specific case the 'ANA inaccessible' nvme
>>>>>> state should _not_ be retried, but should be treated as identical to
>>>>>> 'invalid namespace' errors.
>>>>>>
>>>>> I think I got what you're trying to propose. So when this issue manifests, on host, if we
>>>>> could possibly differentiate between nvme_identify_ns_descs() failed reasons : is it failed
>>>>> because the nsid has been removed/un-mapped on the target or is it failed due to "ANA inaccessible"
>>>>> state? IMO, for "ANA inaccessible" status, we may not want to immediately remove the ns from
>>>>> the host (due to reason I mentioned earlier per NVMe spec section 8.1.3.3), however for the
>>>>> other error case we may remove the ns from the host.
>>>>> I think issuing ns descriptor list command to target for a nsid which doesn't exist on the
>>>>> target would return buffer filled with all zeros. So that might be an indication that ns has
>>>>> been removed from the target.
>>>>>     
>>>> But only if the NSID has not been remapped in the meantime.
>>>> If it has (as in my case) the ns descriptor list will be valid, it just
>>>> refers to another namespace.
>>>>
>>> If NSID has been unmapped and then remapped on the targer then in that case,
>>> host would hit the mismatch uuid case (under nvme_validate_ns()) and so host
>>> would then remove the namespace.
>>>
>>> I think there are two cases,
>>> Case1:
>>> 1. AEN triggers rescan
>>> 2. List of active nsid is retrieved
>>> -> NSID A is removed on the target
>>> 3. Scanning of NSID A fails (i.e. nvme_identify_ns_descs() returns buffer filled with all zeros)
>>> -> host removes the respective namespace
>>>
>>> Case2:
>>> 1. AEN triggers rescan
>>> 2. List of active nsid is retrieved
>>> -> NSID A is unmapped and remapped (possibly with different uuid) on target
>>> 3. Scanning of NSID A succeed
>>> 4. host finds the mismatch uuid for NSID A (i.e. nvme_validate_ns() fails)
>>> -> host removes the respective namespace
>>>    
>> Entirely correct.
>> But Case2 results in the new namespace never to be scanned, and not visible to the OS. Which is the error I'm fighting with.
>>
> Ok but then it seems that your proposed patch doesn't address Case2, isn't it?
> It appears to me that the patch tries to address Case1 but with error code
> of "ANA inaccessible and DNR" set. IMO for case2, we may want to schedule
> queue scan again if nvme_validate_ns() fails due to the mismatch uuid.
> 
That would be possible, too. I'll see to whip up a patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2025-01-15  8:22 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-29 14:06 [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed Hannes Reinecke
2024-12-03 19:15 ` Keith Busch
2024-12-04  7:14   ` Hannes Reinecke
2024-12-04 16:39     ` Keith Busch
2024-12-05 12:30       ` Hannes Reinecke
2024-12-05 16:15         ` Keith Busch
2024-12-06 12:41           ` Hannes Reinecke
2025-01-07 16:01             ` Keith Busch
2025-01-11 14:01             ` Nilay Shroff
2025-01-13  7:43               ` Hannes Reinecke
2025-01-13 14:12                 ` Nilay Shroff
2025-01-13 14:29                   ` Hannes Reinecke
2025-01-15  7:48                     ` Nilay Shroff
2025-01-15  8:02                       ` Hannes Reinecke
2025-01-15  8:18                         ` Nilay Shroff
2025-01-15  8:22                           ` Hannes Reinecke
2024-12-24 11:35 ` Sagi Grimberg
2024-12-25  9:58 ` Sagi Grimberg
2025-01-07  8:11   ` Hannes Reinecke
2025-01-08 10:49     ` Sagi Grimberg
2025-01-08 15:45       ` Hannes Reinecke
2025-01-10 23:16         ` Sagi Grimberg
2025-01-13  7:50           ` Hannes Reinecke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox