linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
	David Nellans <dnellans@nvidia.com>,
	Balbir Singh <bsingharora@gmail.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>
Subject: Re: [HMM-v25 07/19] mm/ZONE_DEVICE: new type of ZONE_DEVICE for unaddressable memory v5
Date: Thu, 20 Dec 2018 00:33:47 -0800	[thread overview]
Message-ID: <CAA9_cmeag7n4yeiP6pYZSz80KyxqfwbsXJCWvyNE4PSaxCKA3A@mail.gmail.com> (raw)
In-Reply-To: <20170817000548.32038-8-jglisse@redhat.com>

On Wed, Aug 16, 2017 at 5:06 PM Jérôme Glisse <jglisse@redhat.com> wrote:
>
> HMM (heterogeneous memory management) need struct page to support migration
> from system main memory to device memory.  Reasons for HMM and migration to
> device memory is explained with HMM core patch.
>
> This patch deals with device memory that is un-addressable memory (ie CPU
> can not access it). Hence we do not want those struct page to be manage
> like regular memory. That is why we extend ZONE_DEVICE to support different
> types of memory.
>
> A persistent memory type is define for existing user of ZONE_DEVICE and a
> new device un-addressable type is added for the un-addressable memory type.
> There is a clear separation between what is expected from each memory type
> and existing user of ZONE_DEVICE are un-affected by new requirement and new
> use of the un-addressable type. All specific code path are protect with
> test against the memory type.
>
> Because memory is un-addressable we use a new special swap type for when
> a page is migrated to device memory (this reduces the number of maximum
> swap file).
>
> The main two additions beside memory type to ZONE_DEVICE is two callbacks.
> First one, page_free() is call whenever page refcount reach 1 (which means
> the page is free as ZONE_DEVICE page never reach a refcount of 0). This
> allow device driver to manage its memory and associated struct page.
>
> The second callback page_fault() happens when there is a CPU access to
> an address that is back by a device page (which are un-addressable by the
> CPU). This callback is responsible to migrate the page back to system
> main memory. Device driver can not block migration back to system memory,
> HMM make sure that such page can not be pin into device memory.
>
> If device is in some error condition and can not migrate memory back then
> a CPU page fault to device memory should end with SIGBUS.
>
> Changed since v4:
>   - s/DEVICE_PUBLIC/DEVICE_HOST (to free DEVICE_PUBLIC for HMM-CDM)
> Changed since v3:
>   - fix comments that was still using UNADDRESSABLE as keyword
>   - kernel configuration simplification
> Changed since v2:
>   - s/DEVICE_UNADDRESSABLE/DEVICE_PRIVATE
> Changed since v1:
>   - rename to device private memory (from device unaddressable)
>
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Acked-by: Dan Williams <dan.j.williams@intel.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
[..]
>  fs/proc/task_mmu.c       |  7 +++++
>  include/linux/ioport.h   |  1 +
>  include/linux/memremap.h | 73 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/mm.h       | 12 ++++++++
>  include/linux/swap.h     | 24 ++++++++++++++--
>  include/linux/swapops.h  | 68 ++++++++++++++++++++++++++++++++++++++++++++
>  kernel/memremap.c        | 34 ++++++++++++++++++++++
>  mm/Kconfig               | 11 +++++++-
>  mm/memory.c              | 61 ++++++++++++++++++++++++++++++++++++++++
>  mm/memory_hotplug.c      | 10 +++++--
>  mm/mprotect.c            | 14 ++++++++++
>  11 files changed, 309 insertions(+), 6 deletions(-)
>
[..]
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 93416196ba64..8e164ec9eed0 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -4,6 +4,8 @@
>  #include <linux/ioport.h>
>  #include <linux/percpu-refcount.h>
>
> +#include <asm/pgtable.h>
> +

So it turns out, over a year later, that this include was a mistake
and makes the build fragile.

>  struct resource;
>  struct device;
>
[..]
> +typedef int (*dev_page_fault_t)(struct vm_area_struct *vma,
> +                               unsigned long addr,
> +                               const struct page *page,
> +                               unsigned int flags,
> +                               pmd_t *pmdp);

I recently included this file somewhere that did not have a pile of
other mm headers included and 0day reports:

  In file included from arch/m68k/include/asm/pgtable_mm.h:148:0,
                    from arch/m68k/include/asm/pgtable.h:5,
                    from include/linux/memremap.h:7,
                    from drivers//dax/bus.c:3:
   arch/m68k/include/asm/motorola_pgtable.h: In function 'pgd_offset':
>> arch/m68k/include/asm/motorola_pgtable.h:199:11: error: dereferencing pointer to incomplete type 'const struct mm_struct'
     return mm->pgd + pgd_index(address);
              ^~
I assume this pulls in the entirety of pgtable.h just to get the pmd_t
definition?

> +typedef void (*dev_page_free_t)(struct page *page, void *data);
> +
>  /**
>   * struct dev_pagemap - metadata for ZONE_DEVICE mappings
> + * @page_fault: callback when CPU fault on an unaddressable device page
> + * @page_free: free page callback when page refcount reaches 1
>   * @altmap: pre-allocated/reserved memory for vmemmap allocations
>   * @res: physical address range covered by @ref
>   * @ref: reference count that pins the devm_memremap_pages() mapping
>   * @dev: host device of the mapping for debug
> + * @data: private data pointer for page_free()
> + * @type: memory type: see MEMORY_* in memory_hotplug.h
>   */
>  struct dev_pagemap {
> +       dev_page_fault_t page_fault;

Rather than try to figure out how to forward declare pmd_t, how about
just move dev_page_fault_t out of the generic dev_pagemap and into the
HMM specific container structure? This should be straightfoward on top
of the recent refactor.

  reply	other threads:[~2018-12-20  8:34 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-17  0:05 [HMM-v25 00/19] HMM (Heterogeneous Memory Management) v25 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 01/19] hmm: heterogeneous memory management documentation v3 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 02/19] mm/hmm: heterogeneous memory management (HMM for short) v5 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 03/19] mm/hmm/mirror: mirror process address space on device with HMM helpers v3 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 04/19] mm/hmm/mirror: helper to snapshot CPU page table v4 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 05/19] mm/hmm/mirror: device page fault handler Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 06/19] mm/memory_hotplug: introduce add_pages Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 07/19] mm/ZONE_DEVICE: new type of ZONE_DEVICE for unaddressable memory v5 Jérôme Glisse
2018-12-20  8:33   ` Dan Williams [this message]
2018-12-20 16:15     ` Jerome Glisse
2018-12-20 16:15       ` Jerome Glisse
2018-12-20 16:47       ` Dan Williams
2018-12-20 16:47         ` Dan Williams
2018-12-20 16:57         ` Jerome Glisse
2018-12-20 16:57           ` Jerome Glisse
2017-08-17  0:05 ` [HMM-v25 08/19] mm/ZONE_DEVICE: special case put_page() for device private pages v4 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 09/19] mm/memcontrol: allow to uncharge page without using page->lru field Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 10/19] mm/memcontrol: support MEMORY_DEVICE_PRIVATE v4 Jérôme Glisse
2017-09-05 17:13   ` Laurent Dufour
2017-09-05 17:21     ` Jerome Glisse
2017-08-17  0:05 ` [HMM-v25 11/19] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE v7 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 12/19] mm/hmm/devmem: dummy HMM device for ZONE_DEVICE memory v3 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 13/19] mm/migrate: new migrate mode MIGRATE_SYNC_NO_COPY Jérôme Glisse
2017-08-17 21:12   ` Andrew Morton
2017-08-17 21:44     ` Jerome Glisse
2017-08-17  0:05 ` [HMM-v25 14/19] mm/migrate: new memory migration helper for use with device memory v5 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 15/19] mm/migrate: migrate_vma() unmap page from vma while collecting pages Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 16/19] mm/migrate: support un-addressable ZONE_DEVICE page in migration v3 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 17/19] mm/migrate: allow migrate_vma() to alloc new page on empty entry v4 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 18/19] mm/device-public-memory: device memory cache coherent with CPU v5 Jérôme Glisse
2017-08-17  0:05 ` [HMM-v25 19/19] mm/hmm: add new helper to hotplug CDM memory region v3 Jérôme Glisse
2017-09-04  3:09   ` Bob Liu
2017-09-04 15:51     ` Jerome Glisse
2017-09-05  1:13       ` Bob Liu
2017-09-05  2:38         ` Jerome Glisse
2017-09-05  3:50           ` Bob Liu
2017-09-05 13:50             ` Jerome Glisse
2017-09-05 16:18               ` Dan Williams
2017-09-05 19:00               ` Ross Zwisler
2017-09-05 19:20                 ` Jerome Glisse
2017-09-08 19:43                   ` Ross Zwisler
2017-09-08 20:29                     ` Jerome Glisse
2017-09-05 18:54           ` Ross Zwisler
2017-09-06  1:25             ` Bob Liu
2017-09-06  2:12               ` Jerome Glisse
2017-09-07  2:06                 ` Bob Liu
2017-09-07 17:00                   ` Jerome Glisse
2017-09-07 17:27                   ` Jerome Glisse
2017-09-08  1:59                     ` Bob Liu
2017-09-08 20:43                       ` Dan Williams
2017-11-17  3:47                         ` chetan L
2017-09-05  3:36       ` Balbir Singh
2017-08-17 21:39 ` [HMM-v25 00/19] HMM (Heterogeneous Memory Management) v25 Andrew Morton
2017-08-17 21:55   ` Jerome Glisse
2017-08-17 21:59     ` Dan Williams
2017-08-17 22:02       ` Jerome Glisse
2017-08-17 22:06         ` Dan Williams
2017-08-17 22:16       ` Andrew Morton
2017-12-13 12:10 ` Figo.zhang
2017-12-13 16:12   ` Jerome Glisse
2017-12-14  2:48     ` Figo.zhang
2017-12-14  3:16       ` Jerome Glisse
2017-12-14  3:53         ` Figo.zhang
2017-12-14  4:16           ` Jerome Glisse
2017-12-14  7:05             ` Figo.zhang
2017-12-14 15:28               ` Jerome Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAA9_cmeag7n4yeiP6pYZSz80KyxqfwbsXJCWvyNE4PSaxCKA3A@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=bsingharora@gmail.com \
    --cc=dnellans@nvidia.com \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ross.zwisler@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).