* [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Hannes Reinecke @ 2022-03-23 16:20 UTC (permalink / raw)
To: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org
Hi all,
there has been quite some discussion in various venues about dispersed
namespaces on NVMe and the missing Linux support.
Especially since it looks as if the original specification will not be
implemented, yet vendors do view it as a crucial use-case.
Which is already supported on other protocols like SCSI just fine, I
might add. Even on Linux.
So I would like to have a discussion on where we stand, what the
proposals are, and what we can do from the Linux side to support the use
case.
To add a bit of background:
Dispersed namespaces have been defined to support live migration of data
from one subsystem to another. The general idea is that the same namespace
(as identified by the namespace identifier) might show up on different
subsystems.
This is already working on SCSI, as dm multipathing will just look at
the VPD page identification and arrange devices based on that.
For NVMe with native multipathing this currently does not work, as
a) we're identifying namespaces with the numerical NSID
and
b) namespaces are attached to the subsystem, and can only be assembled
within that subsystem.
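To illustrate the difference between the two grouping schemes, here is a
minimal sketch in Python. It is a simplified model, not kernel or
multipath-tools code: the `Path` type, the function names, the NQNs and the
UUID are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Path(NamedTuple):
    """One host-visible path to a namespace (hypothetical, simplified model)."""
    subsys_nqn: str  # NQN of the subsystem the controller belongs to
    nsid: int        # numerical namespace ID, only unique within a subsystem
    uuid: str        # globally unique namespace identifier (akin to a SCSI VPD ID)


def group_like_dm_multipath(paths):
    """Group paths by the globally unique identifier alone."""
    groups = defaultdict(list)
    for p in paths:
        groups[p.uuid].append(p)
    return dict(groups)


def group_like_native_nvme(paths):
    """Group paths by (subsystem, NSID); paths never merge across subsystems."""
    groups = defaultdict(list)
    for p in paths:
        groups[(p.subsys_nqn, p.nsid)].append(p)
    return dict(groups)


# The same namespace exported by two different subsystems during a migration.
# All identifiers below are made up for illustration.
paths = [
    Path("nqn.2014-08.org.example:array-a", 1,
         "6e0f1a2c-0000-4000-8000-000000000001"),
    Path("nqn.2014-08.org.example:array-b", 7,
         "6e0f1a2c-0000-4000-8000-000000000001"),
]

print(len(group_like_dm_multipath(paths)))  # 1: a single multipath device
print(len(group_like_native_nvme(paths)))   # 2: two unrelated devices
```

With identifier-based grouping both paths end up in one device; with
(subsystem, NSID) grouping they stay separate, which is the gap described
in a) and b) above.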
Sure we can always switch back to device-mapper multipathing, but I
don't think that's a direction we want to go.
(I certainly don't.)
This discussion will be on how we go from here; changing the spec
and/or the implementation is on the table.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Christoph Hellwig @ 2022-03-23 16:27 UTC (permalink / raw)
To: Hannes Reinecke
Cc: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org
Hi Hannes,
the answer is pretty simple: dispersed namespaces are a bad idea and
will not be implemented in Linux, and this has been the stance since
this was first proposed.
If these vendors want Linux support they can already trivially
implement virtual subsystems and are highly encouraged to do so.
(the concept of domains actually makes it even simpler than in
the beginning).
* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Hannes Reinecke @ 2022-03-23 17:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org
On 3/23/22 17:27, Christoph Hellwig wrote:
> Hi Hannes,
>
> the answer is pretty simple: dispersed namespaces are a bad idea and
> will not be implemented in Linux, and this has been the stance since
> this was first proposed.
>
> If these vendors want Linux support they can already trivially
> implement virtual subsystems and are highly encouraged to do so.
> (the concept of domains actually makes it even simpler than in
> the beginning).
Guess what, that's what I have been proposing.
And they have (somewhat) agreed.
I just wanted to use LSF to get everyone on board, and then be able to
come up with a proposal to NVM Express which I know will be acceptable to
the Linux community.
I have a patchset ready implementing virtual subsystems based on the NS
UUID (and thereby putting each namespace into a separate subsystem).
At this time it's just a PoC to demonstrate the concept; specification
is not even proposed, and so it's hard to code against.
I can send a pointer to my kernel.org branch if you like.
Cheers,
Hannes
* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Christoph Hellwig @ 2022-03-23 17:21 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, lsf-pc@lists.linux-foundation.org,
linux-nvme@lists.infradead.org
On Wed, Mar 23, 2022 at 06:16:10PM +0100, Hannes Reinecke wrote:
> I have a patchset ready implementing virtual subsystems based on the NS UUID
> (and thereby putting each namespace into a separate subsystem).
> At this time it's just a PoC to demonstrate the concept; specification is
> not even proposed, and so it's hard to code against.
Virtual subsystems already work, with one or more namespaces per
subsystem. If you need to write any new code you are doing this completely
wrong.
Just look at the Linux target code - we can create subsystems at will,
and a sensible configuration will have one per tenant (host or set of
closely cooperating hosts). You can trivially add another controller or
set of controllers on another piece of hardware from the protocol point
of view. You just need to have some way to synchronize the data access
just like you do for the dispersed namespace proposal.
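As a rough illustration of "creating subsystems at will", here is a minimal
sketch in Python that drives the Linux target's configfs tree. Assumptions
to check against your kernel: the tree is mounted at
/sys/kernel/config/nvmet and uses the attribute names shown
(attr_allow_any_host, device_path, device_uuid, enable, addr_*). The
function name, NQN, UUID, device and address are hypothetical. The root
directory is a parameter, so the demo at the bottom only writes to a
temporary directory.

```python
import os
import tempfile


def write_attr(path, value):
    """Write one configfs attribute (a plain file holding a short string)."""
    with open(path, "w") as f:
        f.write(value)


def export_tenant_subsystem(root, nqn, nsid, device, ns_uuid, port_id, traddr):
    """Create one per-tenant subsystem with one namespace and link it to a port."""
    subsys = os.path.join(root, "subsystems", nqn)
    ns = os.path.join(subsys, "namespaces", str(nsid))
    port = os.path.join(root, "ports", str(port_id))
    os.makedirs(ns, exist_ok=True)
    os.makedirs(os.path.join(port, "subsystems"), exist_ok=True)

    # Allowing any host is for demonstration only; real setups restrict hosts.
    write_attr(os.path.join(subsys, "attr_allow_any_host"), "1")
    write_attr(os.path.join(ns, "device_path"), device)
    write_attr(os.path.join(ns, "device_uuid"), ns_uuid)
    write_attr(os.path.join(ns, "enable"), "1")

    # TCP transport on the conventional NVMe-oF port; adjust as needed.
    write_attr(os.path.join(port, "addr_trtype"), "tcp")
    write_attr(os.path.join(port, "addr_adrfam"), "ipv4")
    write_attr(os.path.join(port, "addr_traddr"), traddr)
    write_attr(os.path.join(port, "addr_trsvcid"), "4420")

    # Linking the subsystem under the port makes it reachable through it.
    link = os.path.join(port, "subsystems", nqn)
    if not os.path.islink(link):
        os.symlink(subsys, link)
    return subsys


if __name__ == "__main__":
    # Dry run against a scratch directory instead of the real configfs mount.
    scratch = tempfile.mkdtemp()
    export_tenant_subsystem(
        scratch,
        "nqn.2014-08.org.example:tenant-a",      # hypothetical tenant NQN
        1,
        "/dev/null",                             # placeholder backing device
        "6e0f1a2c-0000-4000-8000-000000000001",  # hypothetical namespace UUID
        1,
        "192.0.2.10",                            # documentation-range address
    )
    print(sorted(os.listdir(os.path.join(scratch, "subsystems"))))
```

Under the same assumptions, running this with the same NQN and namespace
UUID on a second target machine would present another set of controllers
for the same subsystem. Keeping the data consistent between the two
backends is outside what this sketch covers.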
* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: John Meneghini @ 2022-03-24 1:17 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig
Cc: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org
I agree with Christoph that the current conception of dispersed namespaces fundamentally breaks the NVMe storage object model.
However, there have been several ideas proffered at FMDS to fix this and I think people are willing to make changes to TP-4034
to redress this problem.
I think this is a topic that needs to be discussed at LSF/MM or ALPSS.
/John
* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Christoph Hellwig @ 2022-03-24 5:42 UTC (permalink / raw)
To: John Meneghini
Cc: Hannes Reinecke, Christoph Hellwig,
lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org
On Wed, Mar 23, 2022 at 09:17:58PM -0400, John Meneghini wrote:
> I agree with Christoph that the current conception of dispersed namespaces fundamentally breaks the NVMe storage object model.
>
> However, there have been several ideas proffered at FMDS to fix this and I
> think people are willing to make changes to TP-4034 to redress this problem.
>
> I think this is a topic that needs to be discussed at LSF/MM or ALPSS.
NVMeoF has been designed to support more than one "virtual" subsystem
behind a single port, and thus use the subsystem as a lean per-tenant
container including the ability to scale and migrate it over multiple
pieces of hardware. And ANA and Domains have made that even easier.
So really what we need here is a clear explanation of why this does
not work for the people that scream loud, and why breaking fundamental
NVMe abstractions is the way to go. This is the specific question I've
asked since the TPAR was proposed but it has simply been ignored. And
no, dumb implementations with a lot of legacy baggage do not count.
The array vendors need to do their homework first.