linux-mm.kvack.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Christoph Lameter <cl@linux.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	jglisse@redhat.com, mgorman@suse.de, aarcange@redhat.com,
	riel@redhat.com, airlied@redhat.com,
	aneesh.kumar@linux.vnet.ibm.com,
	Cameron Buschardt <cabuschardt@nvidia.com>,
	Mark Hairgrove <mhairgrove@nvidia.com>,
	Geoffrey Gerfin <ggerfin@nvidia.com>,
	John McKenna <jmckenna@nvidia.com>,
	akpm@linux-foundation.org
Subject: Re: Interacting with coherent memory on external devices
Date: Wed, 22 Apr 2015 10:42:52 +1000
Message-ID: <1429663372.27410.75.camel@kernel.crashing.org>
In-Reply-To: <alpine.DEB.2.11.1504211839120.6294@gentwo.org>

On Tue, 2015-04-21 at 18:49 -0500, Christoph Lameter wrote:
> On Tue, 21 Apr 2015, Paul E. McKenney wrote:
> 
> > Thoughts?
> 
> Use DAX for memory instead of the other approaches? That way it is
> explicitly clear what information is put on the CAPI device.

Care to elaborate on what DAX is ?

> > 	Although such a device will provide CPU's with cache-coherent
> 
> Maybe call this coprocessor like IBM does? It is like a processor after
> all in terms of its participation in cache coherency?

It is, yes, in a way, though the actual implementation could be anything
from a NIC to a GPU or a crypto accelerator or whatever you can think
of.

The device memory is fully cacheable from the CPU standpoint and the
device *completely* shares the MMU with the CPU (it operates within a
normal Linux mm context).

> > 	access to on-device memory, the resulting memory latency is
> > 	expected to be slower than the normal memory that is tightly
> > 	coupled to the CPUs.  Nevertheless, data that is only occasionally
> > 	accessed by CPUs should be stored in the device's memory.
> > 	On the other hand, data that is accessed rarely by the device but
> > 	frequently by the CPUs should be stored in normal system memory.
> 
> I would expect many devices to not have *normal memory* at all (those
> that simply process some data or otherwise interface with external
> hardware like f.e. a NIC). Other devices like GPUs have local memory but
> what is in GPU memory is very specific and general OS structures should
> not be allocated there.

That isn't entirely true. Take GPUs as an example: they can have
*large* amounts of local memory and you want to migrate the working set
(not just control structures) over.

So you can basically malloc() something on the host and hand it over to
the coprocessor, which churns on it. The bus interface/MMU on the device
"detects" that a given page or set of pages is heavily pounded on by the
GPU and sends an interrupt to the host via a sideband channel to request
its migration to the device.

Since the device memory is fully cacheable and coherent, it can simply be
represented with struct pages like normal system memory, and we can use
the existing migration mechanism.
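
To make that flow concrete, here is a toy user-space model of it, not
kernel code. Everything in it is an assumption made for illustration:
the names Page, Host, device_access() and handle_migration_request() are
hypothetical, and the fixed HOT_THRESHOLD merely stands in for whatever
heuristic the device's bus interface/MMU would really use.

```python
from dataclasses import dataclass, field

HOT_THRESHOLD = 4  # assumed: device accesses before a migration request


@dataclass
class Page:
    location: str = "system"   # "system" or "device"
    device_hits: int = 0       # device accesses since the last migration


@dataclass
class Host:
    pages: dict = field(default_factory=dict)
    migrations: list = field(default_factory=list)

    def handle_migration_request(self, pfn):
        """Stand-in for the host side: move the page into device memory."""
        page = self.pages[pfn]
        if page.location != "device":
            page.location = "device"
            page.device_hits = 0
            self.migrations.append(pfn)


def device_access(host, pfn):
    """The device touches a page; a hot page still in system memory
    triggers a migration request (this models the sideband interrupt)."""
    page = host.pages[pfn]
    page.device_hits += 1
    if page.location == "system" and page.device_hits >= HOT_THRESHOLD:
        host.handle_migration_request(pfn)


host = Host(pages={pfn: Page() for pfn in range(3)})
for _ in range(HOT_THRESHOLD):
    device_access(host, 1)     # page 1 is pounded on by the device
device_access(host, 2)         # page 2 is touched only once
print(host.migrations)         # [1]
```

Only the heavily used page moves; pages the device rarely touches stay
in system memory.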

> What I mostly would like to see is that these devices will have the
> ability to participate in the cpu cache coherency scheme. I.e. they
> will have l1/l2/l3 caches that will allow fast data exchange between the
> coprocessor and the regular processors in the system.

Yes, they can in theory.

> >
> > 		a.	It should be possible to migrate all data away
> > 			from the device's memory at any time.
> 
> That would be device specific and only a special device driver for that
> device could save the state of the device (if that is necessary. It would
> not be for something like a NIC).

Yes and no. If the memory is fully given to the system as struct pages,
we can have random kernel allocations on it which means we can't evict
it.

The ideas here try to mitigate that, i.e., keep the benefit of struct
page while limiting the problem of unrelated allocations hitting the
device.

> > 		b.	Normal memory allocation should avoid using the
> > 			device's memory, as this would interfere
> > 			with the needed migration.  It may nevertheless
> > 			be desirable to use the device's memory
> > 			if system memory is exhausted, however, in some
> > 			cases, even this "emergency" use is best avoided.
> > 			In fact, a good solution will provide some means
> > 			for avoiding this for those cases where it is
> > 			necessary to evacuate memory when offlining the
> > 			device.
> 
> Ok that seems to mean that none of the approaches suggested later would
> be useful.

Why? A far-away NUMA node covered by a CMA region would probably do the
trick; a separate ZONE would definitely do the trick...
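
As a rough illustration of why a dedicated zone helps, here is a toy
allocator model, not the kernel's zonelist code. The Zone class, the
movable_only flag, alloc_page() and evacuate() are all hypothetical; the
flag is only similar in spirit to a zone that accepts movable pages
exclusively. The point is that if unmovable allocations are never placed
in device memory, everything there can be migrated away later.

```python
class Zone:
    def __init__(self, name, nr_pages, movable_only=False):
        self.name = name
        self.free = nr_pages
        self.movable_only = movable_only
        self.allocs = []           # list of (owner, movable) tuples


def alloc_page(zonelist, owner, movable):
    """Walk the zonelist in order, skipping zones that refuse this request."""
    for zone in zonelist:
        if zone.movable_only and not movable:
            continue               # unmovable data never lands here
        if zone.free > 0:
            zone.free -= 1
            zone.allocs.append((owner, movable))
            return zone.name
    return None


def evacuate(src, dst):
    """Migrate everything out of src; False if dst runs out of room."""
    while src.allocs:
        if dst.free == 0:
            return False
        dst.allocs.append(src.allocs.pop())
        dst.free -= 1
        src.free += 1
    return True


system = Zone("system", 4)
device = Zone("device", 2, movable_only=True)
normal = [system, device]          # device memory is a last resort
device_first = [device, system]    # used for the device's working set

placements = [
    alloc_page(normal, "kernel-slab", movable=False),
    alloc_page(normal, "anon", movable=True),
    alloc_page(device_first, "gpu-working-set", movable=True),
    alloc_page([device], "kernel-slab", movable=False),
]
print(placements)                  # ['system', 'system', 'device', None]
evacuated = evacuate(device, system)
print(evacuated)                   # True
```

The unmovable request aimed at the device zone fails instead of pinning
device memory, so the later evacuation succeeds as long as system memory
has room.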

> > 	3.	The device's memory is treated like normal system
> > 		memory by the Linux kernel, for example, each page has a
> > 		"struct page" associated with it.  (In contrast, the
> > 		traditional approach has used special-purpose OS mechanisms
> > 		to manage the device's memory, and this memory was treated
> > 		as MMIO space by the kernel.)
> 
> Why do we need a struct page? If so then maybe equip DAX with a struct
> page so that the contents of the device memory can be controlled via a
> filesystem? (may be custom to the needs of the device).

What is DAX ?

struct page means we can transparently migrate anonymous memory across,
among other things.

Cheers,
Ben.

