All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jerome Glisse <jglisse@redhat.com>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Andi Kleen <ak@linux.intel.com>, Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Haggai Eran <haggaie@mellanox.com>,
	balbirs@au1.ibm.com,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	"Kuehling, Felix" <felix.kuehling@amd.com>,
	Philip.Yang@amd.com, "Koenig,
	Christian" <christian.koenig@amd.com>,
	"Blinzer, Paul" <Paul.Blinzer@amd.com>,
	John Hubbard <jhubbard@nvidia.com>,
	rcampbell@nvidia.com
Subject: Re: [RFC PATCH 02/14] mm/hms: heterogenenous memory system (HMS) documentation
Date: Wed, 5 Dec 2018 17:58:28 -0500	[thread overview]
Message-ID: <20181205225828.GL3536@redhat.com> (raw)
In-Reply-To: <7ab26ea6-d16d-8d71-78ca-4266a864f8d3@deltatee.com>

On Wed, Dec 05, 2018 at 12:10:10PM -0700, Logan Gunthorpe wrote:
> 
> 
> On 2018-12-05 11:55 a.m., Jerome Glisse wrote:
> > So now once next type of device shows up with the exact same thing
> > let say FPGA, we have to create a new subsystem for them too. Also
> > this make the userspace life much much harder. Now userspace must
> > go parse PCIE, subsystem1, subsystem2, subsystemN, NUMA, ... and
> > merge all that different information together and rebuild the
> > representation i am putting forward in this patchset in userspace.
> 
> Yes. But seeing such FPGA links aren't common yet and there isn't really
> much in terms of common FPGA infrastructure in the kernel (which are
> hard seeing the hardware is infinitely customization) you can let the
> people developing FPGA code worry about it and come up with their own
> solution. Buses between FPGAs may end up never being common enough for
> people to care, or they may end up being so weird that they need their
> own description independent of GPUS, or maybe when they become common
> they find a way to use the GPU link subsystem -- who knows. Don't try to
> design for use cases that don't exist yet.
> 
> Yes, userspace will have to know about all the buses it cares to find
> links over. Sounds like a perfect thing for libhms to do.

So just to be clear here is how i understand your position:
"Single coherent sysfs hierarchy to describe something is useless
 let's git rm drivers/base/"

While i am arguing that "hey the /sys/bus/node/devices/* is nice
but it just does not cut it for all this new hardware platform
if i add new nodes there for my new memory i will break tons of
existing application. So what about a new hierarchy that allow
to describe those new hardware platform in a single place like
today node thing"


> 
> > There is no telling that kernel won't be able to provide quirk and
> > workaround because some merging is actually illegal on a given
> > platform (like some link from a subsystem is not accessible through
> > the PCI connection of one of the device connected to that link).
> 
> These are all just different individual problems which need different
> solutions not grand new design concepts.
> 
> > So it means userspace will have to grow its own database or work-
> > around and quirk and i am back in the situation i am in today.
> 
> No, as I've said, quirks are firmly the responsibility of kernels.
> Userspace will need to know how to work with the different buses and
> CPU/node information but there really isn't that many of these to deal
> with and this is a much easier approach than trying to come up with a
> new API that can wrap the nuances of all existing and potential future
> bus types we may have to deal with.

No can do that is what i am trying to explain. So if i bus 1 in a
sub-system A and usualy that kind of bus can serve a bridge for
PCIE ie a CPU can access device behind it by going through a PCIE
device first. So now the userspace libary have this knowledge
bake in. Now if a platform has a bug for whatever reasons where
that does not hold, the kernel has no way to tell userspace that
there is an exception there. It is up to userspace to have a data
base of quirks.

Kernel see all those objects in isolation in your scheme. While in
what i am proposing there is only one place and any device that
participate in this common place can report any quirks so that a
coherent view is given to user space.

If we have gazillion of places where all this informations is spread
around than we have no way to fix weird inter-action between any
of those.

Cheers,
J�r�me

WARNING: multiple messages have this Message-ID (diff)
From: Jerome Glisse <jglisse@redhat.com>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Andi Kleen <ak@linux.intel.com>, Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Haggai Eran <haggaie@mellanox.com>,
	balbirs@au1.ibm.com,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	"Kuehling, Felix" <felix.kuehling@amd.com>,
	Philip.Yang@amd.com, "Koenig,
	Christian" <christian.koenig@amd.com>,
	"Blinzer, Paul" <Paul.Blinzer@amd.com>,
	John Hubbard <jhubbard@nvidia.com>,
	rcampbell@nvidia.com
Subject: Re: [RFC PATCH 02/14] mm/hms: heterogenenous memory system (HMS) documentation
Date: Wed, 5 Dec 2018 17:58:28 -0500	[thread overview]
Message-ID: <20181205225828.GL3536@redhat.com> (raw)
In-Reply-To: <7ab26ea6-d16d-8d71-78ca-4266a864f8d3@deltatee.com>

On Wed, Dec 05, 2018 at 12:10:10PM -0700, Logan Gunthorpe wrote:
> 
> 
> On 2018-12-05 11:55 a.m., Jerome Glisse wrote:
> > So now once next type of device shows up with the exact same thing
> > let say FPGA, we have to create a new subsystem for them too. Also
> > this make the userspace life much much harder. Now userspace must
> > go parse PCIE, subsystem1, subsystem2, subsystemN, NUMA, ... and
> > merge all that different information together and rebuild the
> > representation i am putting forward in this patchset in userspace.
> 
> Yes. But seeing such FPGA links aren't common yet and there isn't really
> much in terms of common FPGA infrastructure in the kernel (which are
> hard seeing the hardware is infinitely customization) you can let the
> people developing FPGA code worry about it and come up with their own
> solution. Buses between FPGAs may end up never being common enough for
> people to care, or they may end up being so weird that they need their
> own description independent of GPUS, or maybe when they become common
> they find a way to use the GPU link subsystem -- who knows. Don't try to
> design for use cases that don't exist yet.
> 
> Yes, userspace will have to know about all the buses it cares to find
> links over. Sounds like a perfect thing for libhms to do.

So just to be clear here is how i understand your position:
"Single coherent sysfs hierarchy to describe something is useless
 let's git rm drivers/base/"

While i am arguing that "hey the /sys/bus/node/devices/* is nice
but it just does not cut it for all this new hardware platform
if i add new nodes there for my new memory i will break tons of
existing application. So what about a new hierarchy that allow
to describe those new hardware platform in a single place like
today node thing"


> 
> > There is no telling that kernel won't be able to provide quirk and
> > workaround because some merging is actually illegal on a given
> > platform (like some link from a subsystem is not accessible through
> > the PCI connection of one of the device connected to that link).
> 
> These are all just different individual problems which need different
> solutions not grand new design concepts.
> 
> > So it means userspace will have to grow its own database or work-
> > around and quirk and i am back in the situation i am in today.
> 
> No, as I've said, quirks are firmly the responsibility of kernels.
> Userspace will need to know how to work with the different buses and
> CPU/node information but there really isn't that many of these to deal
> with and this is a much easier approach than trying to come up with a
> new API that can wrap the nuances of all existing and potential future
> bus types we may have to deal with.

No can do that is what i am trying to explain. So if i bus 1 in a
sub-system A and usualy that kind of bus can serve a bridge for
PCIE ie a CPU can access device behind it by going through a PCIE
device first. So now the userspace libary have this knowledge
bake in. Now if a platform has a bug for whatever reasons where
that does not hold, the kernel has no way to tell userspace that
there is an exception there. It is up to userspace to have a data
base of quirks.

Kernel see all those objects in isolation in your scheme. While in
what i am proposing there is only one place and any device that
participate in this common place can report any quirks so that a
coherent view is given to user space.

If we have gazillion of places where all this informations is spread
around than we have no way to fix weird inter-action between any
of those.

Cheers,
Jérôme

  reply	other threads:[~2018-12-05 22:58 UTC|newest]

Thread overview: 169+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-03 23:34 [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind() jglisse
2018-12-03 23:34 ` jglisse
2018-12-03 23:34 ` [RFC PATCH 01/14] mm/hms: heterogeneous memory system (sysfs infrastructure) jglisse
2018-12-03 23:34 ` [RFC PATCH 02/14] mm/hms: heterogenenous memory system (HMS) documentation jglisse
2018-12-04 17:06   ` Andi Kleen
2018-12-04 18:24     ` Jerome Glisse
2018-12-04 18:24       ` Jerome Glisse
2018-12-04 18:31       ` Dan Williams
2018-12-04 18:57         ` Jerome Glisse
2018-12-04 18:57           ` Jerome Glisse
2018-12-04 19:11           ` Logan Gunthorpe
2018-12-04 19:22             ` Jerome Glisse
2018-12-04 19:22               ` Jerome Glisse
2018-12-04 19:41               ` Logan Gunthorpe
2018-12-04 20:13                 ` Jerome Glisse
2018-12-04 20:13                   ` Jerome Glisse
2018-12-04 20:30                   ` Logan Gunthorpe
2018-12-04 20:59                     ` Jerome Glisse
2018-12-04 20:59                       ` Jerome Glisse
2018-12-04 21:19                       ` Logan Gunthorpe
2018-12-04 21:51                         ` Jerome Glisse
2018-12-04 21:51                           ` Jerome Glisse
2018-12-04 22:16                           ` Logan Gunthorpe
2018-12-04 23:56                             ` Jerome Glisse
2018-12-04 23:56                               ` Jerome Glisse
2018-12-05  1:15                               ` Logan Gunthorpe
2018-12-05  2:31                                 ` Jerome Glisse
2018-12-05  2:31                                   ` Jerome Glisse
2018-12-05 17:41                                   ` Logan Gunthorpe
2018-12-05 18:07                                     ` Jerome Glisse
2018-12-05 18:07                                       ` Jerome Glisse
2018-12-05 18:20                                       ` Logan Gunthorpe
2018-12-05 18:33                                         ` Jerome Glisse
2018-12-05 18:33                                           ` Jerome Glisse
2018-12-05 18:48                                           ` Logan Gunthorpe
2018-12-05 18:55                                             ` Jerome Glisse
2018-12-05 18:55                                               ` Jerome Glisse
2018-12-05 19:10                                               ` Logan Gunthorpe
2018-12-05 22:58                                                 ` Jerome Glisse [this message]
2018-12-05 22:58                                                   ` Jerome Glisse
2018-12-05 23:09                                                   ` Logan Gunthorpe
2018-12-05 23:20                                                     ` Jerome Glisse
2018-12-05 23:20                                                       ` Jerome Glisse
2018-12-05 23:23                                                       ` Logan Gunthorpe
2018-12-05 23:27                                                         ` Jerome Glisse
2018-12-06  0:08                                                           ` Dan Williams
2018-12-05  2:34                                 ` Dan Williams
2018-12-05  2:37                                   ` Jerome Glisse
2018-12-05  2:37                                     ` Jerome Glisse
2018-12-05 17:25                                     ` Logan Gunthorpe
2018-12-05 18:01                                       ` Jerome Glisse
2018-12-05 18:01                                         ` Jerome Glisse
2018-12-04 20:14             ` Andi Kleen
2018-12-04 20:47               ` Logan Gunthorpe
2018-12-04 21:15                 ` Jerome Glisse
2018-12-04 21:15                   ` Jerome Glisse
2018-12-05  0:54             ` Kuehling, Felix
2018-12-04 19:19           ` Dan Williams
2018-12-04 19:32             ` Jerome Glisse
2018-12-04 19:32               ` Jerome Glisse
2018-12-04 20:12       ` Andi Kleen
2018-12-04 20:41         ` Jerome Glisse
2018-12-04 20:41           ` Jerome Glisse
2018-12-05  4:36       ` Aneesh Kumar K.V
2018-12-05  4:41         ` Jerome Glisse
2018-12-05  4:41           ` Jerome Glisse
2018-12-05 10:52   ` Mike Rapoport
2018-12-05 10:52     ` Mike Rapoport
2018-12-03 23:34 ` [RFC PATCH 03/14] mm/hms: add target memory to heterogeneous memory system infrastructure jglisse
2018-12-03 23:34 ` [RFC PATCH 04/14] mm/hms: add initiator " jglisse
2018-12-03 23:35 ` [RFC PATCH 05/14] mm/hms: add link " jglisse
2018-12-03 23:35 ` [RFC PATCH 06/14] mm/hms: add bridge " jglisse
2018-12-03 23:35 ` [RFC PATCH 07/14] mm/hms: register main memory with heterogenenous memory system jglisse
2018-12-03 23:35 ` [RFC PATCH 08/14] mm/hms: register main CPUs " jglisse
2018-12-03 23:35 ` [RFC PATCH 09/14] mm/hms: hbind() for heterogeneous memory system (aka mbind() for HMS) jglisse
2018-12-03 23:35 ` [RFC PATCH 10/14] mm/hbind: add heterogeneous memory policy tracking infrastructure jglisse
2018-12-03 23:35 ` [RFC PATCH 11/14] mm/hbind: add bind command to heterogeneous memory policy jglisse
2018-12-03 23:35 ` [RFC PATCH 12/14] mm/hbind: add migrate command to hbind() ioctl jglisse
2018-12-03 23:35 ` [RFC PATCH 13/14] drm/nouveau: register GPU under heterogeneous memory system jglisse
2018-12-03 23:35 ` [RFC PATCH 14/14] test/hms: tests for " jglisse
2018-12-04  7:44 ` [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind() Aneesh Kumar K.V
2018-12-04  7:44   ` Aneesh Kumar K.V
2018-12-04 14:44   ` Jerome Glisse
2018-12-04 14:44     ` Jerome Glisse
2018-12-04 14:44     ` Jerome Glisse
2018-12-04 18:02 ` Dave Hansen
2018-12-04 18:02   ` Dave Hansen
2018-12-04 18:49   ` Jerome Glisse
2018-12-04 18:49     ` Jerome Glisse
2018-12-04 18:49     ` Jerome Glisse
2018-12-04 18:54     ` Dave Hansen
2018-12-04 18:54       ` Dave Hansen
2018-12-04 19:11       ` Jerome Glisse
2018-12-04 19:11         ` Jerome Glisse
2018-12-04 19:11         ` Jerome Glisse
2018-12-04 21:37     ` Dave Hansen
2018-12-04 21:37       ` Dave Hansen
2018-12-04 21:57       ` Jerome Glisse
2018-12-04 21:57         ` Jerome Glisse
2018-12-04 21:57         ` Jerome Glisse
2018-12-04 23:58         ` Dave Hansen
2018-12-04 23:58           ` Dave Hansen
2018-12-05  0:29           ` Jerome Glisse
2018-12-05  0:29             ` Jerome Glisse
2018-12-05  0:29             ` Jerome Glisse
2018-12-05  1:22         ` Kuehling, Felix
2018-12-05  1:22           ` Kuehling, Felix
2018-12-05  1:22           ` Kuehling, Felix
2018-12-05 11:27     ` Aneesh Kumar K.V
2018-12-05 11:27       ` Aneesh Kumar K.V
2018-12-05 16:09       ` Jerome Glisse
2018-12-05 16:09         ` Jerome Glisse
2018-12-05 16:09         ` Jerome Glisse
2018-12-04 23:54 ` Dave Hansen
2018-12-04 23:54   ` Dave Hansen
2018-12-05  0:15   ` Jerome Glisse
2018-12-05  0:15     ` Jerome Glisse
2018-12-05  0:15     ` Jerome Glisse
2018-12-05  1:06     ` Dave Hansen
2018-12-05  1:06       ` Dave Hansen
2018-12-05  2:13       ` Jerome Glisse
2018-12-05  2:13         ` Jerome Glisse
2018-12-05  2:13         ` Jerome Glisse
2018-12-05 17:27         ` Dave Hansen
2018-12-05 17:27           ` Dave Hansen
2018-12-05 17:53           ` Jerome Glisse
2018-12-05 17:53             ` Jerome Glisse
2018-12-05 17:53             ` Jerome Glisse
2018-12-06 18:25             ` Dave Hansen
2018-12-06 18:25               ` Dave Hansen
2018-12-06 19:20               ` Jerome Glisse
2018-12-06 19:20                 ` Jerome Glisse
2018-12-06 19:20                 ` Jerome Glisse
2018-12-06 19:31                 ` Dave Hansen
2018-12-06 19:31                   ` Dave Hansen
2018-12-06 20:11                   ` Logan Gunthorpe
2018-12-06 20:11                     ` Logan Gunthorpe
2018-12-06 22:04                     ` Dave Hansen
2018-12-06 22:04                       ` Dave Hansen
2018-12-06 22:39                       ` Jerome Glisse
2018-12-06 22:39                         ` Jerome Glisse
2018-12-06 22:39                         ` Jerome Glisse
2018-12-06 23:09                         ` Dave Hansen
2018-12-06 23:09                           ` Dave Hansen
2018-12-06 23:28                           ` Logan Gunthorpe
2018-12-06 23:28                             ` Logan Gunthorpe
2018-12-06 23:34                             ` Dave Hansen
2018-12-06 23:34                               ` Dave Hansen
2018-12-06 23:38                             ` Dave Hansen
2018-12-06 23:38                               ` Dave Hansen
2018-12-06 23:48                               ` Logan Gunthorpe
2018-12-06 23:48                                 ` Logan Gunthorpe
2018-12-07  0:20                                 ` Jerome Glisse
2018-12-07  0:20                                   ` Jerome Glisse
2018-12-07  0:20                                   ` Jerome Glisse
2018-12-07 15:06                                   ` Jonathan Cameron
2018-12-07 15:06                                     ` Jonathan Cameron
2018-12-07 19:37                                     ` Jerome Glisse
2018-12-07 19:37                                       ` Jerome Glisse
2018-12-07 19:37                                       ` Jerome Glisse
2018-12-07  0:15                           ` Jerome Glisse
2018-12-07  0:15                             ` Jerome Glisse
2018-12-07  0:15                             ` Jerome Glisse
2018-12-06 20:27                   ` Jerome Glisse
2018-12-06 20:27                     ` Jerome Glisse
2018-12-06 20:27                     ` Jerome Glisse
2018-12-06 21:46                     ` Jerome Glisse
2018-12-06 21:46                       ` Jerome Glisse
2018-12-06 21:46                       ` Jerome Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181205225828.GL3536@redhat.com \
    --to=jglisse@redhat.com \
    --cc=Paul.Blinzer@amd.com \
    --cc=Philip.Yang@amd.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=balbirs@au1.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=christian.koenig@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=felix.kuehling@amd.com \
    --cc=haggaie@mellanox.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=logang@deltatee.com \
    --cc=rafael@kernel.org \
    --cc=rcampbell@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.