From: Jerome Glisse <jglisse@redhat.com>
To: Balbir Singh <bsingharora@gmail.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
David Nellans <dnellans@nvidia.com>
Subject: Re: [HMM 00/15] HMM (Heterogeneous Memory Management) v22
Date: Thu, 1 Jun 2017 18:38:08 -0400
Message-ID: <20170601223808.GC2780@redhat.com>
In-Reply-To: <CAKTCnznUJcHt9cd3ZOn-f2-HVHrCM_L+BPC5mgBVhsB7o0=JUw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2666 bytes --]
On Thu, Jun 01, 2017 at 12:04:02PM +1000, Balbir Singh wrote:
> On Thu, May 25, 2017 at 3:53 AM, Jerome Glisse <jglisse@redhat.com> wrote:
> > On Wed, May 24, 2017 at 11:55:12AM +1000, Balbir Singh wrote:
> >> On Tue, May 23, 2017 at 2:51 AM, Jérôme Glisse <jglisse@redhat.com> wrote:
> >> > Patchset is on top of mmotm mmotm-2017-05-18, git branch:
> >> >
> >> > https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-v22
> >> >
> >> > Change since v21 is adding back special refcounting in put_page() to
> >> > catch when a ZONE_DEVICE page is free (refcount going from 2 to 1
> >> > unlike regular page where a refcount of 0 means the page is free).
> >> > See patch 8 of this serie for this refcounting. I did not use static
> >> > keys because it kind of scares me to do that for an inline function.
> >> > If people strongly feel about this i can try to make static key works
> >> > here. Kirill will most likely want to review this.
> >> >
> >> >
> >> > Everything else is the same. Below is the long description of what HMM
> >> > is about and why. At the end of this email i describe briefly each patch
> >> > and suggest reviewers for each of them.
> >> >
> >> >
> >> > Heterogeneous Memory Management (HMM) (description and justification)
> >> >
> >>
> >> Thanks for the patches! These patches are very helpful. There are a
> >> few additional things we would need on top of this (once HMM the base
> >> is merged)
> >>
> >> 1. Support for other architectures, we'd like to make sure we can get
> >> this working for powerpc for example. As a first step we have
> >> ZONE_DEVICE enablement patches, but I think we need some additional
> >> patches for iomem space searching and memory hotplug, IIRC
> >> 2. HMM-CDM and physical address based migration bits. In a recent RFC
> >> we decided to try and use the HMM CDM route as a route to implementing
> >> coherent device memory as a starting point. It would be nice to have
> >> those patches on top of these once these make it to mm -
> >> https://lwn.net/Articles/720380/
> >>
> >
> > I intend to post the updated HMM CDM patchset early next week. I am
> > tie in couple internal backport but i should be able to resume work
> > on that this week.
> >
>
> Thanks, I am looking at the HMM CDM branch and trying to forward port
> and see what the results look like on top of HMM-v23. Do we have a timeline
> for the v23 merge?
>
So I am moving to a new office and it has taken me more time than I thought
to pack stuff. Attached is the first step of CDM on top of the latest HMM. I
hope to have more time tomorrow or next week to finish rebasing patches and to
run some tests with stolen RAM as CDM memory.
Jérôme
[-- Attachment #2: 0001-mm-device-public-memory-device-memory-cache-coherent.patch --]
[-- Type: text/plain, Size: 6384 bytes --]
>From 0ca0ebe4aecedfe69ae029c529045d609352b921 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=A9r=C3=B4me=20Glisse?= <jglisse@redhat.com>
Date: Thu, 1 Jun 2017 11:25:59 -0400
Subject: [PATCH] mm/device-public-memory: device memory cache coherent with
CPU
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Platforms with an advanced system bus (like CAPI or CCIX) allow device
memory to be accessible from the CPU in a cache coherent fashion. Add
a new type of ZONE_DEVICE to represent such memory. The use cases
are the same as for un-addressable device memory but without
all the corner cases.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
---
include/linux/ioport.h | 1 +
include/linux/memremap.h | 21 +++++++++++++++++++++
mm/Kconfig | 13 +++++++++++++
mm/memory.c | 13 +++++++++++++
mm/migrate.c | 23 ++++++++++++++---------
5 files changed, 62 insertions(+), 9 deletions(-)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 3a4f691..f5cf32e 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -131,6 +131,7 @@ enum {
IORES_DESC_PERSISTENT_MEMORY = 4,
IORES_DESC_PERSISTENT_MEMORY_LEGACY = 5,
IORES_DESC_DEVICE_PRIVATE_MEMORY = 6,
+ IORES_DESC_DEVICE_PUBLIC_MEMORY = 7,
};
/* helpers to define resources */
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 0e0d2e6..b9f460a 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -56,10 +56,18 @@ static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
* page must be treated as an opaque object, rather than a "normal" struct page.
* A more complete discussion of unaddressable memory may be found in
* include/linux/hmm.h and Documentation/vm/hmm.txt.
+ *
+ * MEMORY_DEVICE_PUBLIC:
+ * Device memory that is cache coherent from device and CPU point of view. This
+ * is used on platforms that have an advanced system bus (like CAPI or CCIX). A
+ * driver can hotplug the device memory using ZONE_DEVICE and with that memory
+ * type. Any page of a process can be migrated to such memory. However no one
+ * should be allowed to pin such memory so that it can always be evicted.
*/
enum memory_type {
MEMORY_DEVICE_PERSISTENT = 0,
MEMORY_DEVICE_PRIVATE,
+ MEMORY_DEVICE_PUBLIC,
};
/*
@@ -91,6 +99,8 @@ enum memory_type {
* The page_free() callback is called once the page refcount reaches 1
* (ZONE_DEVICE pages never reach 0 refcount unless there is a refcount bug.
* This allows the device driver to implement its own memory management.)
+ *
+ * For MEMORY_DEVICE_PUBLIC only the page_free() callback matters.
*/
typedef int (*dev_page_fault_t)(struct vm_area_struct *vma,
unsigned long addr,
@@ -133,6 +143,12 @@ static inline bool is_device_private_page(const struct page *page)
return is_zone_device_page(page) &&
page->pgmap->type == MEMORY_DEVICE_PRIVATE;
}
+
+static inline bool is_device_public_page(const struct page *page)
+{
+ return is_zone_device_page(page) &&
+ page->pgmap->type == MEMORY_DEVICE_PUBLIC;
+}
#else
static inline void *devm_memremap_pages(struct device *dev,
struct resource *res, struct percpu_ref *ref,
@@ -156,6 +172,11 @@ static inline bool is_device_private_page(const struct page *page)
{
return false;
}
+
+static inline bool is_device_public_page(const struct page *page)
+{
+ return false;
+}
#endif
/**
diff --git a/mm/Kconfig b/mm/Kconfig
index 46296d5d7..bacb193 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -758,6 +758,19 @@ config DEVICE_PRIVATE
memory; i.e., memory that is only accessible from the device (or
group of devices).
+config DEVICE_PUBLIC
+ bool "Addressable device memory (GPU memory, ...)"
+ depends on X86_64
+ depends on ZONE_DEVICE
+ depends on MEMORY_HOTPLUG
+ depends on MEMORY_HOTREMOVE
+ depends on SPARSEMEM_VMEMMAP
+
+ help
+ Allows creation of struct pages to represent addressable device
+ memory; i.e., memory that is accessible from both the device and
+ the CPU.
+
config FRAME_VECTOR
bool
diff --git a/mm/memory.c b/mm/memory.c
index eba61dd..d192f3d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -983,6 +983,19 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
get_page(page);
page_dup_rmap(page, false);
rss[mm_counter(page)]++;
+ } else if (pte_devmap(pte)) {
+ page = pte_page(pte);
+
+ /*
+ * Cache coherent device memory behaves like a regular page and
+ * not like a persistent memory page. For more information see
+ * MEMORY_DEVICE_PUBLIC in include/linux/memremap.h
+ */
+ if (is_device_public_page(page)) {
+ get_page(page);
+ page_dup_rmap(page, false);
+ rss[mm_counter(page)]++;
+ }
}
out_set_pte:
diff --git a/mm/migrate.c b/mm/migrate.c
index d7c4db6..a0115b8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -229,12 +229,16 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
if (is_write_migration_entry(entry))
pte = maybe_mkwrite(pte, vma);
- if (unlikely(is_zone_device_page(new)) &&
- is_device_private_page(new)) {
- entry = make_device_private_entry(new, pte_write(pte));
- pte = swp_entry_to_pte(entry);
- if (pte_swp_soft_dirty(*pvmw.pte))
- pte = pte_mksoft_dirty(pte);
+ if (unlikely(is_zone_device_page(new))) {
+ if (is_device_private_page(new)) {
+ entry = make_device_private_entry(new, pte_write(pte));
+ pte = swp_entry_to_pte(entry);
+ if (pte_swp_soft_dirty(*pvmw.pte))
+ pte = pte_mksoft_dirty(pte);
+ } else if (is_device_public_page(new)) {
+ pte = pte_mkdevmap(pte);
+ flush_dcache_page(new);
+ }
} else
flush_dcache_page(new);
@@ -2300,9 +2304,10 @@ static bool migrate_vma_check_page(struct page *page)
/* Page from ZONE_DEVICE have one extra reference */
if (is_zone_device_page(page)) {
- if (is_device_private_page(page)) {
+ if (is_device_private_page(page) ||
+ is_device_public_page(page))
extra++;
- } else
+ else
/* Other ZONE_DEVICE memory type are not supported */
return false;
}
@@ -2621,7 +2626,7 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
migrate->src[i] &= ~MIGRATE_PFN_MIGRATE;
continue;
}
- } else {
+ } else if (!is_device_public_page(newpage)) {
/*
* Other types of ZONE_DEVICE page are not
* supported.
--
2.4.11
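
For anyone following the migrate.c hunks, here is a minimal user-space sketch
of how the new predicate and the extra ZONE_DEVICE reference are meant to fit
together. It is not kernel code: struct page, struct dev_pagemap, the
zone_device flag, the refcount/mapcount fields and mock_check_page() are
simplified stand-ins I made up for illustration, and the final comparison is
only my assumption of what the check in migrate_vma_check_page() looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same ordering as the enum in the patch. */
enum memory_type {
	MEMORY_DEVICE_PERSISTENT = 0,
	MEMORY_DEVICE_PRIVATE,
	MEMORY_DEVICE_PUBLIC,
};

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct dev_pagemap {
	enum memory_type type;
};

struct page {
	bool zone_device;                /* stand-in for the zone test */
	const struct dev_pagemap *pgmap; /* only valid for ZONE_DEVICE */
	int refcount;                    /* stand-in for page_count() */
	int mapcount;                    /* stand-in for page_mapcount() */
};

static bool is_zone_device_page(const struct page *page)
{
	return page->zone_device;
}

static bool is_device_private_page(const struct page *page)
{
	return is_zone_device_page(page) &&
	       page->pgmap->type == MEMORY_DEVICE_PRIVATE;
}

static bool is_device_public_page(const struct page *page)
{
	return is_zone_device_page(page) &&
	       page->pgmap->type == MEMORY_DEVICE_PUBLIC;
}

/*
 * Mock of the pin check: one reference is assumed to be held by the
 * caller, and private/public ZONE_DEVICE pages carry one more. Any
 * reference beyond those and the mappings is treated as a pin.
 */
static bool mock_check_page(const struct page *page)
{
	int extra = 1;

	if (is_zone_device_page(page)) {
		/* Note the call: a bare function name is always true. */
		if (is_device_private_page(page) ||
		    is_device_public_page(page))
			extra++;
		else
			/* Other ZONE_DEVICE types are not supported. */
			return false;
	}

	return (page->refcount - extra) <= page->mapcount;
}
```

With these stand-ins, a public page mapped once with a refcount of 3 passes
the check, the same page with a refcount of 4 is treated as pinned, and a
persistent memory page is always rejected.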