linux-kernel.vger.kernel.org archive mirror
From: Jerome Glisse <jglisse@redhat.com>
To: Balbir Singh <bsingharora@gmail.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
	David Nellans <dnellans@nvidia.com>
Subject: Re: [HMM 00/15] HMM (Heterogeneous Memory Management) v22
Date: Thu, 1 Jun 2017 18:38:08 -0400
Message-ID: <20170601223808.GC2780@redhat.com>
In-Reply-To: <CAKTCnznUJcHt9cd3ZOn-f2-HVHrCM_L+BPC5mgBVhsB7o0=JUw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2666 bytes --]

On Thu, Jun 01, 2017 at 12:04:02PM +1000, Balbir Singh wrote:
> On Thu, May 25, 2017 at 3:53 AM, Jerome Glisse <jglisse@redhat.com> wrote:
> > On Wed, May 24, 2017 at 11:55:12AM +1000, Balbir Singh wrote:
> >> On Tue, May 23, 2017 at 2:51 AM, Jérôme Glisse <jglisse@redhat.com> wrote:
> >> > Patchset is on top of mmotm mmotm-2017-05-18, git branch:
> >> >
> >> > https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-v22
> >> >
> >> > Change since v21 is adding back special refcounting in put_page() to
> >> > catch when a ZONE_DEVICE page is free (refcount going from 2 to 1
> >> > unlike regular page where a refcount of 0 means the page is free).
> >> > See patch 8 of this serie for this refcounting. I did not use static
> >> > keys because it kind of scares me to do that for an inline function.
> >> > If people strongly feel about this i can try to make static key works
> >> > here. Kirill will most likely want to review this.
> >> >
> >> >
> >> > Everything else is the same. Below is the long description of what HMM
> >> > is about and why. At the end of this email i describe briefly each patch
> >> > and suggest reviewers for each of them.
> >> >
> >> >
> >> > Heterogeneous Memory Management (HMM) (description and justification)
> >> >
> >>
> >> Thanks for the patches! These patches are very helpful. There are a
> >> few additional things we would need on top of this (once HMM the base
> >> is merged)
> >>
> >> 1. Support for other architectures, we'd like to make sure we can get
> >> this working for powerpc for example. As a first step we have
> >> ZONE_DEVICE enablement patches, but I think we need some additional
> >> patches for iomem space searching and memory hotplug, IIRC
> >> 2. HMM-CDM and physical address based migration bits. In a recent RFC
> >> we decided to try and use the HMM CDM route as a route to implementing
> >> coherent device memory as a starting point. It would be nice to have
> >> those patches on top of these once these make it to mm -
> >> https://lwn.net/Articles/720380/
> >>
> >
> > I intend to post the updated HMM CDM patchset early next week. I am
> > tie in couple internal backport but i should be able to resume work
> > on that this week.
> >
> 
> Thanks, I am looking at the HMM CDM branch and trying to forward port
> and see what the results look like on top of HMM-v23. Do we have a timeline
> for the v23 merge?
> 

So I am moving to a new office and it has taken me more time than I thought
to pack stuff. Attached is the first step of CDM on top of the latest HMM. I
hope to have more time tomorrow or next week to finish rebasing the patches
and to run some tests with stolen RAM as CDM memory.
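
As a side note for anyone following the thread: the refcounting rule quoted
above (a ZONE_DEVICE page is free when its refcount drops from 2 to 1, while
a regular page is free at 0) can be modeled in plain userspace C. This is
only a sketch under that assumption. struct model_page, model_put_page() and
model_page_free() are made-up names, not kernel functions; the real
put_page() change is in patch 8 of the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
	int refcount;		/* a ZONE_DEVICE page idles at 1, not 0 */
	bool zone_device;
	int free_calls;		/* times the "driver" callback was invoked */
};

/* Stand-in for the driver's page_free() callback. */
static void model_page_free(struct model_page *page)
{
	page->free_calls++;
}

/* Drop one reference; returns true if this drop made the page free. */
static bool model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;

	if (page->zone_device) {
		/* Free means "back to 1", and the driver is notified. */
		if (page->refcount == 1) {
			model_page_free(page);
			return true;
		}
		return false;
	}

	/* Regular page: free when the count reaches 0. */
	return page->refcount == 0;
}
```

With this model a device page that starts at refcount 3 is reported free on
the second model_put_page() call, and a regular page that starts at
refcount 1 is reported free on the first.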

Jérôme

[-- Attachment #2: 0001-mm-device-public-memory-device-memory-cache-coherent.patch --]
[-- Type: text/plain, Size: 6384 bytes --]

From 0ca0ebe4aecedfe69ae029c529045d609352b921 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=A9r=C3=B4me=20Glisse?= <jglisse@redhat.com>
Date: Thu, 1 Jun 2017 11:25:59 -0400
Subject: [PATCH] mm/device-public-memory: device memory cache coherent with
 CPU
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Platforms with an advanced system bus (like CAPI or CCIX) allow device
memory to be accessible from the CPU in a cache coherent fashion. Add
a new type of ZONE_DEVICE to represent such memory. The use cases
are the same as for un-addressable device memory, but without
all the corner cases.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
---
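Not part of the commit message: a rough userspace model of how the two
ZONE_DEVICE types are told apart by the migration code in this patch. It
assumes only what the hunks below show (private pages get a special swap
entry, public pages get a present devmap pte, and both carry one extra
reference). All cdm_* names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for enum memory_type and struct page. */
enum cdm_type { CDM_OTHER = 0, CDM_PRIVATE, CDM_PUBLIC };

struct cdm_page {
	bool zone_device;
	enum cdm_type type;
};

/* Kind of pte installed for the new page once migration completes. */
enum cdm_pte_kind { CDM_PTE_NORMAL, CDM_PTE_SWAP_ENTRY, CDM_PTE_DEVMAP };

static bool cdm_is_private(const struct cdm_page *p)
{
	assert(p);
	return p->zone_device && p->type == CDM_PRIVATE;
}

static bool cdm_is_public(const struct cdm_page *p)
{
	assert(p);
	return p->zone_device && p->type == CDM_PUBLIC;
}

/* Models the check in migrate_vma_check_page(); -1 means unsupported. */
static int cdm_extra_refs(const struct cdm_page *p)
{
	if (!p->zone_device)
		return 0;
	if (cdm_is_private(p) || cdm_is_public(p))
		return 1;
	return -1;
}

/* Models the branch in remove_migration_pte(). */
static enum cdm_pte_kind cdm_pte_kind_for(const struct cdm_page *p)
{
	if (cdm_is_private(p))
		return CDM_PTE_SWAP_ENTRY;	/* CPU cannot access the page */
	if (cdm_is_public(p))
		return CDM_PTE_DEVMAP;		/* CPU access is coherent */
	return CDM_PTE_NORMAL;
}
```

The point of the model is that a public page differs from a private one only
in the pte it ends up with; the extra-reference accounting is the same.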
 include/linux/ioport.h   |  1 +
 include/linux/memremap.h | 21 +++++++++++++++++++++
 mm/Kconfig               | 13 +++++++++++++
 mm/memory.c              | 13 +++++++++++++
 mm/migrate.c             | 23 ++++++++++++++---------
 5 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 3a4f691..f5cf32e 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -131,6 +131,7 @@ enum {
 	IORES_DESC_PERSISTENT_MEMORY		= 4,
 	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
 	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
+	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
 };
 
 /* helpers to define resources */
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 0e0d2e6..b9f460a 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -56,10 +56,18 @@ static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
  * page must be treated as an opaque object, rather than a "normal" struct page.
  * A more complete discussion of unaddressable memory may be found in
  * include/linux/hmm.h and Documentation/vm/hmm.txt.
+ *
+ * MEMORY_DEVICE_PUBLIC:
+ * Device memory that is cache coherent from device and CPU point of view. This
+ * is used on platforms that have an advanced system bus (like CAPI or CCIX). A
+ * driver can hotplug the device memory using ZONE_DEVICE with this memory
+ * type. Any page of a process can be migrated to such memory. However, no one
+ * should be allowed to pin such memory, so that it can always be evicted.
  */
 enum memory_type {
 	MEMORY_DEVICE_PUBLIC = 0,
 	MEMORY_DEVICE_PRIVATE,
+	MEMORY_DEVICE_PUBLIC,
 };
 
 /*
@@ -91,6 +99,8 @@ enum memory_type {
  * The page_free() callback is called once the page refcount reaches 1
  * (ZONE_DEVICE pages never reach 0 refcount unless there is a refcount bug.
  * This allows the device driver to implement its own memory management.)
+ *
+ * For MEMORY_DEVICE_PUBLIC only the page_free() callback matters.
  */
 typedef int (*dev_page_fault_t)(struct vm_area_struct *vma,
 				unsigned long addr,
@@ -133,6 +143,12 @@ static inline bool is_device_private_page(const struct page *page)
 	return is_zone_device_page(page) &&
 		page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
+
+static inline bool is_device_public_page(const struct page *page)
+{
+	return is_zone_device_page(page) &&
+		page->pgmap->type == MEMORY_DEVICE_PUBLIC;
+}
 #else
 static inline void *devm_memremap_pages(struct device *dev,
 		struct resource *res, struct percpu_ref *ref,
@@ -156,6 +172,11 @@ static inline bool is_device_private_page(const struct page *page)
 {
 	return false;
 }
+
+static inline bool is_device_public_page(const struct page *page)
+{
+	return false;
+}
 #endif
 
 /**
diff --git a/mm/Kconfig b/mm/Kconfig
index 46296d5d7..bacb193 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -758,6 +758,19 @@ config DEVICE_PRIVATE
 	  memory; i.e., memory that is only accessible from the device (or
 	  group of devices).
 
+config DEVICE_PUBLIC
+	bool "Addressable device memory (GPU memory, ...)"
+	depends on X86_64
+	depends on ZONE_DEVICE
+	depends on MEMORY_HOTPLUG
+	depends on MEMORY_HOTREMOVE
+	depends on SPARSEMEM_VMEMMAP
+
+	help
+	  Allows creation of struct pages to represent addressable device
+	  memory; i.e., memory that is accessible from both the device and
+	  the CPU.
+
 config FRAME_VECTOR
 	bool
 
diff --git a/mm/memory.c b/mm/memory.c
index eba61dd..d192f3d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -983,6 +983,19 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		get_page(page);
 		page_dup_rmap(page, false);
 		rss[mm_counter(page)]++;
+	} else if (pte_devmap(pte)) {
+		page = pte_page(pte);
+
+		/*
+		 * Cache coherent device memory behaves like a regular page and
+		 * not like a persistent memory page. For more information see
+		 * MEMORY_DEVICE_PUBLIC in memremap.h
+		 */
+		if (is_device_public_page(page)) {
+			get_page(page);
+			page_dup_rmap(page, false);
+			rss[mm_counter(page)]++;
+		}
 	}
 
 out_set_pte:
diff --git a/mm/migrate.c b/mm/migrate.c
index d7c4db6..a0115b8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -229,12 +229,16 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (is_write_migration_entry(entry))
 			pte = maybe_mkwrite(pte, vma);
 
-		if (unlikely(is_zone_device_page(new)) &&
-		    is_device_private_page(new)) {
-			entry = make_device_private_entry(new, pte_write(pte));
-			pte = swp_entry_to_pte(entry);
-			if (pte_swp_soft_dirty(*pvmw.pte))
-				pte = pte_mksoft_dirty(pte);
+		if (unlikely(is_zone_device_page(new))) {
+			if (is_device_private_page(new)) {
+				entry = make_device_private_entry(new, pte_write(pte));
+				pte = swp_entry_to_pte(entry);
+				if (pte_swp_soft_dirty(*pvmw.pte))
+					pte = pte_mksoft_dirty(pte);
+			} else if (is_device_public_page(new)) {
+				pte = pte_mkdevmap(pte);
+				flush_dcache_page(new);
+			}
 		} else
 			flush_dcache_page(new);
 
@@ -2300,9 +2304,10 @@ static bool migrate_vma_check_page(struct page *page)
 
 	/* Page from ZONE_DEVICE have one extra reference */
 	if (is_zone_device_page(page)) {
-		if (is_device_private_page(page)) {
+		if (is_device_private_page(page) ||
+		    is_device_public_page(page))
 			extra++;
-		} else
+		else
 			/* Other ZONE_DEVICE memory type are not supported */
 			return false;
 	}
@@ -2621,7 +2626,7 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
 					migrate->src[i] &= ~MIGRATE_PFN_MIGRATE;
 					continue;
 				}
-			} else {
+			} else if (!is_device_public_page(newpage)) {
 				/*
 				 * Other types of ZONE_DEVICE page are not
 				 * supported.
-- 
2.4.11

