* [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Hannes Reinecke @ 2022-03-23 16:20 UTC
  To: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org

Hi all,

there has been quite some discussion in various venues about dispersed
namespaces on NVMe and the missing Linux support, especially since it
looks as if the original specification will not be implemented, yet
vendors view it as a crucial use case.
I might add that this use case is already supported just fine on other
protocols such as SCSI, even on Linux.

So I would like to have a discussion on where we stand, what the
proposals are, and what we can do from the Linux side to support the
use case.

To add a bit of background:
Dispersed namespaces have been defined to support live migration of data
from one subsystem to another. The general idea is that the same
namespace (as identified by the namespace identifier) might show up on
different subsystems.
This already works on SCSI, as dm multipathing will just look at
the VPD page identification and arrange devices based on that.
For NVMe with native multipathing this currently does not work, as
a) we're identifying namespaces with the numerical NSID
and
b) namespaces are attached to the subsystem, and can only be assembled
within that subsystem.
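
To illustrate the difference between the two grouping schemes, here is
a toy Python model. It is not kernel or multipath-tools code: the `Path`
tuple, the NQNs, the UUID string and the device names are all made up
for illustration. It only shows that keying on (subsystem, NSID) keeps
two paths apart that an identifier-based key would merge.

```python
from collections import defaultdict
from typing import NamedTuple


class Path(NamedTuple):
    """One path to a namespace as seen by the host (hypothetical model)."""
    subsys_nqn: str   # subsystem the controller belongs to
    nsid: int         # numerical NSID, only meaningful within that subsystem
    ns_uuid: str      # globally unique namespace identifier
    dev: str          # per-path device name


def group_native(paths):
    """Group paths per (subsystem, NSID), as in points a) and b) above."""
    groups = defaultdict(list)
    for p in paths:
        groups[(p.subsys_nqn, p.nsid)].append(p.dev)
    return dict(groups)


def group_by_identifier(paths):
    """Group paths by the unique identifier only, the way dm multipathing
    arranges SCSI devices by their VPD identification."""
    groups = defaultdict(list)
    for p in paths:
        groups[p.ns_uuid].append(p.dev)
    return dict(groups)


# The same namespace exposed by two different subsystems during a migration.
paths = [
    Path("nqn.2014-08.org.example:array-a", 1, "uuid-1234", "nvme0n1"),
    Path("nqn.2014-08.org.example:array-b", 7, "uuid-1234", "nvme1n1"),
]

native = group_native(paths)            # two separate single-path groups
by_id = group_by_identifier(paths)      # one group holding both paths
```

With these inputs `native` has two entries with one device each, while
`by_id` has a single entry listing both devices.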

Sure we can always switch back to device-mapper multipathing, but I 
don't think that's a direction we want to go.
(I certainly don't.)

This discussion will be about where we go from here; changing the spec
and/or the implementation is on the table.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer



* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Christoph Hellwig @ 2022-03-23 16:27 UTC
  To: Hannes Reinecke
  Cc: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org

Hi Hannes,

the answer is pretty simple:  dispersed namespaces are a bad idea and
will not be implemented in Linux, and this has been the stance since
this was first proposed.

If these vendors want Linux support they can already trivially
implement virtual subsystems and are highly encouraged to do so.
(the concept of domains actually makes it even simpler than in
the beginning).



* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Hannes Reinecke @ 2022-03-23 17:16 UTC
  To: Christoph Hellwig
  Cc: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org

On 3/23/22 17:27, Christoph Hellwig wrote:
> Hi Hannes,
> 
> the answer is pretty simple:  dispersed namespaces are a bad idea and
> will not be implemented in Linux, and this has been the stance since
> this was first proposed.
> 
> If these vendors want Linux support they can already trivially
> implement virtual subsystems and are highly encouraged to do so.
> (the concept of domains actually makes it even simpler than in
> the beginning).

Guess what, that's what I have been proposing.
And they have (somewhat) agreed.

I just wanted to use LSF to get everyone on board, and then be able to
come with a proposal to NVM Express which I know will be acceptable to
the Linux community.

I have a patchset ready implementing virtual subsystems based on the NS 
UUID (and thereby putting each namespace into a separate subsystem).
At this time it's just a PoC to demonstrate the concept; a specification
has not even been proposed, and so it's hard to code against.

I can send a pointer to my kernel.org branch if you like.

Cheers,

Hannes



* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Christoph Hellwig @ 2022-03-23 17:21 UTC
  To: Hannes Reinecke
  Cc: Christoph Hellwig, lsf-pc@lists.linux-foundation.org,
	linux-nvme@lists.infradead.org

On Wed, Mar 23, 2022 at 06:16:10PM +0100, Hannes Reinecke wrote:
> I have a patchset ready implementing virtual subsystems based on the NS UUID
> (and thereby putting each namespace into a separate subsystem).
> At this time it's just a PoC to demonstrate the concept; specification is
> not even proposed, and so it's hard to code against.

Virtual subsystems already work, with one or more namespaces per
subsystem. If you need to write any new code, you are doing this
completely wrong.

Just look at the Linux target code - we can create subsystems at will,
and a sensible configuration will have one per tenant (host or set of
closely cooperating hosts).  You can trivially add another controller or
set of controllers on another piece of hardware from the protocol point
of view.  You just need to have some way to synchronize the data access
just like you do for the dispersed namespace proposal.
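
As a rough sketch of the "one subsystem per tenant" setup on the Linux
target, the Python helper below lists the configfs operations for one
tenant subsystem. It only builds a plan and touches nothing on the
system. The attribute names are meant to follow the nvmet configfs
layout; the helper name `tenant_subsystem_plan`, the NQNs, the device
path, the UUID and the port number are hypothetical, and the port is
assumed to exist and be configured already.

```python
from pathlib import PurePosixPath

NVMET = PurePosixPath("/sys/kernel/config/nvmet")


def tenant_subsystem_plan(nqn, ports, namespaces, hosts):
    """Return ordered (op, path, value) steps for one tenant subsystem.

    op is "mkdir", "write" or "symlink"; for "symlink", path is the link
    to create and value is its target.
    """
    subsys = NVMET / "subsystems" / nqn
    steps = [
        ("mkdir", str(subsys), None),
        # Only explicitly allowed hosts may connect to this tenant.
        ("write", str(subsys / "attr_allow_any_host"), "0"),
    ]
    for host in hosts:
        host_dir = NVMET / "hosts" / host
        steps.append(("mkdir", str(host_dir), None))
        steps.append(("symlink", str(subsys / "allowed_hosts" / host),
                      str(host_dir)))
    for nsid, (device, uuid) in sorted(namespaces.items()):
        ns = subsys / "namespaces" / str(nsid)
        steps.append(("mkdir", str(ns), None))
        steps.append(("write", str(ns / "device_path"), device))
        steps.append(("write", str(ns / "device_uuid"), uuid))
        steps.append(("write", str(ns / "enable"), "1"))
    for port in ports:
        link = NVMET / "ports" / str(port) / "subsystems" / nqn
        steps.append(("symlink", str(link), str(subsys)))
    return steps


# Hypothetical tenant: one namespace, one allowed host, exported on port 1.
plan = tenant_subsystem_plan(
    nqn="nqn.2014-08.org.example:tenant-a",
    ports=[1],
    namespaces={1: ("/dev/nvme0n1",
                    "11111111-2222-3333-4444-555555555555")},
    hosts=["nqn.2014-08.org.example:host-a"],
)
```

Applying the same plan, with the same subsystem NQN and namespace UUID,
on a second piece of hardware would model the additional controllers
mentioned above. Synchronizing data access between the two is outside
the scope of this sketch.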



* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: John Meneghini @ 2022-03-24  1:17 UTC
  To: Hannes Reinecke, Christoph Hellwig
  Cc: lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org

I agree with Christoph that the current conception of dispersed namespaces fundamentally breaks the NVMe storage object model.

However, there have been several ideas proffered at FMDS to fix this and I think people are willing to make changes to TP-4034 
to redress this problem.

I think this is a topic that needs to be discussed at LSF/MM or ALPSS.

/John



* Re: [LSF/MM/BPF TOPIC] dispersed namespaces revisited
From: Christoph Hellwig @ 2022-03-24  5:42 UTC
  To: John Meneghini
  Cc: Hannes Reinecke, Christoph Hellwig,
	lsf-pc@lists.linux-foundation.org, linux-nvme@lists.infradead.org

On Wed, Mar 23, 2022 at 09:17:58PM -0400, John Meneghini wrote:
> I agree with Christoph that the current conception of dispersed namespaces fundamentally breaks the NVMe storage object model.
> 
> However, there have been several ideas proffered at FMDS to fix this and I
> think people are willing to make changes to TP-4034 to redress this problem.
> 
> I think this is a topic that needs to be discussed at LSF/MM or ALPSS.

NVMeoF has been designed to support more than one "virtual" subsystem
behind a single port, and thus use the subsystem as a lean per-tenant
container including the ability to scale and migrate it over multiple
pieces of hardware.  And ANA and Domains have made that even easier.
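
A toy Python model of how ANA states could steer the host during such a
migration, assuming both controllers belong to the same subsystem and
expose the same namespace. It is not kernel code; the controller names
and the helpers `pick_path` and `migrate` are invented for illustration,
and only the ANA state names come from the NVMe specification.

```python
# ANA states usable for I/O, lower rank is preferred. Other states such as
# "inaccessible" or "persistent-loss" are treated as unusable here.
PREFERENCE = {"optimized": 0, "non-optimized": 1}


def pick_path(ana_by_controller):
    """Return the preferred controller for I/O, or None if none is usable."""
    usable = [(PREFERENCE[state], ctrl)
              for ctrl, state in ana_by_controller.items()
              if state in PREFERENCE]
    return min(usable)[1] if usable else None


def migrate(ana_by_controller, src, dst):
    """Model the target side moving the namespace from src to dst."""
    after = dict(ana_by_controller)
    after[src] = "inaccessible"
    after[dst] = "optimized"
    return after


before = {"ctrl-old-hw": "optimized", "ctrl-new-hw": "inaccessible"}
after = migrate(before, src="ctrl-old-hw", dst="ctrl-new-hw")

path_before = pick_path(before)   # I/O goes to the old hardware
path_after = pick_path(after)     # I/O follows the namespace to the new one
```

The host never sees a new subsystem or namespace in this model; only the
ANA state reported by each controller changes.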

So really what we need here is a clear explanation of why this does
not work for the people that scream loud, and why breaking fundamental
NVMe abstractions is the way to go.  This is the specific question I've
asked since the TPAR was proposed but it has simply been ignored.  And
no, dumb implementations with a lot of legacy baggage do not count.

The array vendors need to do their homework first.


