From: Hannes Reinecke <hare@suse.de>
To: Keith Busch <kbusch@kernel.org>
Cc: Hannes Reinecke <hare@kernel.org>, Christoph Hellwig <hch@lst.de>,
Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org
Subject: Re: [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
Date: Thu, 5 Dec 2024 13:30:39 +0100
Message-ID: <99025917-e201-4ec9-ba04-e979f61c411b@suse.de>
In-Reply-To: <Z1CF2vn1_xkNIVAM@kbusch-mbp>
On 12/4/24 17:39, Keith Busch wrote:
> On Wed, Dec 04, 2024 at 08:14:08AM +0100, Hannes Reinecke wrote:
>> On 12/3/24 20:15, Keith Busch wrote:
>>> On Fri, Nov 29, 2024 at 03:06:08PM +0100, Hannes Reinecke wrote:
>>>> When a namespace gets unmapped on the target during scanning
>>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>>> With the current code we will ignore that error on the grounds
>>>> that we failed to get information, and hence cannot make any
>>>> decisions whether to keep or remove that namespace.
>>>> But a non-retryable error implies that the namespace is _not_
>>>> present as we cannot retry that command and will never get
>>>> information about that namespace.
>>>> And we need to remove the namespace during scanning, as otherwise
>>>> the AEN informing us about a namespace change will find the NSID
>>>> present, but nvme_validate_ns() will fail, and the namespace
>>>> will never be updated with the correct information.
>>>
>>> The scanning only checks namespaces returned in the "active" namespace
>>> list. Every namespace not in the active list gets removed already. Why
>>> is this unmapped namespace appearing on the active list?
>>
>> Timing. Imagine a system used as a backing store for kubernetes, where
>> namespaces come and go at a _really_ fast pace.
>> 1) AEN triggers a rescan
>> 2) List of active namespaces is retrieved
>> -> NSID A gets unmapped (or moved to another node in the cluster)
>> 3) Scan of NSID A returns an error with DNR set.
>> Without this patch we keep the namespace around, so eventually we'll
>> trip over the 'non-matching UUID' check once the NSID is reused.
>
> I'm still not sure that makes sense. The target shouldn't attach the new
> namespace until the host acknowledges the removal of the older NSID via
> the Namespace Change List log. Until the log is read, the inventory for
> removed namespaces should be latched. Otherwise, timing might remove+add
> a specific NSID before the host requests the NS Descriptor for the
> racing removal, then it would just get the "non-matching UUID" issue
> anyway.
But we read the Namespace Change List log in step 2)
(Not that we're doing anything with it, but that's another story...)
Hmm?
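
For illustration, the decision at step 3) can be sketched in plain userspace C. This is a minimal sketch and not the actual patch or kernel code: the names SKETCH_STATUS_DNR, scan_action and scan_action_for() are hypothetical, and the bit value used for "Do Not Retry" is an assumption made for the example. It only shows the rule argued for above: a status with DNR set means the namespace is treated as gone, while anything else is treated as "no information" and the namespace is kept.

```c
#include <stdbool.h>

/* Assumed stand-in for the "Do Not Retry" bit of an NVMe status value.
 * The value is chosen for this sketch only. */
#define SKETCH_STATUS_DNR 0x4000

/* Hypothetical outcome of scanning one NSID from the active list. */
enum scan_action {
	NS_KEEP,	/* no reliable information: leave the namespace alone */
	NS_REMOVE,	/* non-retryable error: treat the namespace as gone */
};

/*
 * Hypothetical helper. Convention assumed here:
 *   ret == 0  command succeeded
 *   ret <  0  host/transport error (command may be retried later)
 *   ret >  0  status returned by the controller
 */
enum scan_action scan_action_for(int ret)
{
	bool non_retryable = ret > 0 && (ret & SKETCH_STATUS_DNR);

	/* DNR set: retrying cannot succeed, so the NSID is not present. */
	if (non_retryable)
		return NS_REMOVE;

	/* Success, or an error that may go away on a later rescan. */
	return NS_KEEP;
}
```

With this rule, a scan of NSID A that fails with DNR set removes the stale namespace right away, so a later reuse of the NSID starts from a clean state instead of hitting the 'non-matching UUID' check.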
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
Thread overview: 23+ messages
2024-11-29 14:06 [PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed Hannes Reinecke
2024-12-03 19:15 ` Keith Busch
2024-12-04 7:14 ` Hannes Reinecke
2024-12-04 16:39 ` Keith Busch
2024-12-05 12:30 ` Hannes Reinecke [this message]
2024-12-05 16:15 ` Keith Busch
2024-12-06 12:41 ` Hannes Reinecke
2025-01-07 16:01 ` Keith Busch
2025-01-11 14:01 ` Nilay Shroff
2025-01-13 7:43 ` Hannes Reinecke
2025-01-13 14:12 ` Nilay Shroff
2025-01-13 14:29 ` Hannes Reinecke
2025-01-15 7:48 ` Nilay Shroff
2025-01-15 8:02 ` Hannes Reinecke
2025-01-15 8:18 ` Nilay Shroff
2025-01-15 8:22 ` Hannes Reinecke
2024-12-24 11:35 ` Sagi Grimberg
2024-12-25 9:58 ` Sagi Grimberg
2025-01-07 8:11 ` Hannes Reinecke
2025-01-08 10:49 ` Sagi Grimberg
2025-01-08 15:45 ` Hannes Reinecke
2025-01-10 23:16 ` Sagi Grimberg
2025-01-13 7:50 ` Hannes Reinecke