From: Matthew Brost <matthew.brost@intel.com>
To: Jordan Niethe <jniethe@nvidia.com>
Cc: <linux-mm@kvack.org>, <balbirs@nvidia.com>,
	<akpm@linux-foundation.org>, <linux-kernel@vger.kernel.org>,
	<dri-devel@lists.freedesktop.org>, <david@redhat.com>,
	<ziy@nvidia.com>, <apopple@nvidia.com>,
	<lorenzo.stoakes@oracle.com>, <lyude@redhat.com>,
	<dakr@kernel.org>, <airlied@gmail.com>, <simona@ffwll.ch>,
	<rcampbell@nvidia.com>, <mpenttil@redhat.com>, <jgg@nvidia.com>,
	<willy@infradead.org>, <linuxppc-dev@lists.ozlabs.org>,
	<intel-xe@lists.freedesktop.org>, <jgg@ziepe.ca>,
	<Felix.Kuehling@amd.com>
Subject: Re: [PATCH v2 00/11] Remove device private pages from physical address space
Date: Wed, 7 Jan 2026 10:36:44 -0800
Message-ID: <aV6nvCw2ugAbSpFL@lstrano-desk.jf.intel.com>
In-Reply-To: <20260107091823.68974-1-jniethe@nvidia.com>

On Wed, Jan 07, 2026 at 08:18:12PM +1100, Jordan Niethe wrote:
> Today, when creating these device private struct pages, the first step
> is to use request_free_mem_region() to get a range of physical address
> space large enough to represent the device's memory. This allocated
> physical address range is then remapped as device private memory using
> memremap_pages().
> 
> Needing allocation of physical address space has some problems:
> 
>   1) There may be insufficient physical address space to represent the
>      device memory. KASLR reducing the physical address space and VM
>      configurations with limited physical address space increase the
>      likelihood of hitting this, especially as device memory increases.
>      This has been observed to prevent device private memory from being
>      initialized.
> 
>   2) Attempting to add the device private pages to the linear map at
>      addresses beyond the actual physical memory causes issues on
>      architectures like aarch64, meaning the feature does not work there [0].
> 
> This series changes device private memory so that it does not require
> allocation of physical address space and these problems are avoided.
> Instead of using the physical address space, we introduce a "device
> private address space" and allocate from there.
> 
> A consequence of placing the device private pages outside of the
> physical address space is that they no longer have a PFN. However, it is
> still necessary to be able to look up a corresponding device private
> page from a device private PTE entry, which means that we still require
> some way to index into this device private address space. Instead of a
> PFN, device private pages use an offset into this device private address
> space to look up device private struct pages.
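To make the offset lookup described above concrete, here is a minimal user-space sketch in C. It is not code from the series: struct dp_page, struct dp_range, dp_offset_to_page() and dp_page_to_offset() are hypothetical names, and the real series may organize the device private address space quite differently. It only illustrates the idea of an offset indexing into a per-range array of page descriptors instead of a PFN indexing the page map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct dp_page {
	uint64_t flags;
};

/* Hypothetical: one range of the device private address space. */
struct dp_range {
	uint64_t start;		/* first offset covered by this range */
	uint64_t npages;	/* number of pages in the range */
	struct dp_page *pages;	/* backing array of page descriptors */
};

/* Map a device private offset to its page descriptor, or NULL if unmapped. */
static struct dp_page *dp_offset_to_page(const struct dp_range *ranges,
					 size_t nranges, uint64_t offset)
{
	for (size_t i = 0; i < nranges; i++) {
		const struct dp_range *r = &ranges[i];

		if (offset >= r->start && offset - r->start < r->npages)
			return &r->pages[offset - r->start];
	}
	return NULL;
}

/* Inverse mapping: page descriptor back to its device private offset. */
static uint64_t dp_page_to_offset(const struct dp_range *r,
				  const struct dp_page *page)
{
	assert(page >= r->pages && (uint64_t)(page - r->pages) < r->npages);
	return r->start + (uint64_t)(page - r->pages);
}
```

Nothing in this sketch ties an offset to a physical address, which is the property the cover letter relies on.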
> 
> The problem that then needs to be addressed is how to avoid confusing
> these device private offsets with PFNs. It is the inherently limited
> usage of the device private pages themselves which makes this possible.
> A device private page is only used for userspace mappings, so we do not
> need to be concerned with them being used within the mm more broadly. This
> means that the only way that the core kernel looks up these pages is via
> the page table, where their PTE already indicates if they refer to a
> device private page via their swap type, e.g. SWP_DEVICE_WRITE. We can
> use this information to determine if the PTE contains a PFN which should
> be looked up in the page map, or a device private offset which should be
> looked up elsewhere.
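The disambiguation described above can be sketched in user-space C roughly as follows. This is an illustration under stated assumptions, not the kernel's swap entry encoding: the SKETCH_* type values, the 58-bit value field and every function name are hypothetical. The point is only that the entry type, not the stored value, tells the reader which index space the value belongs to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: type in the top bits, value in the low 58 bits. */
#define SKETCH_TYPE_SHIFT 58
#define SKETCH_VALUE_MASK ((UINT64_C(1) << SKETCH_TYPE_SHIFT) - 1)

/* Hypothetical entry types; the values are not the kernel's. */
enum sketch_swp_type {
	SKETCH_SWP_MIGRATION = 1,	/* value is a PFN in this sketch */
	SKETCH_SWP_DEVICE_READ = 2,	/* value is a device private offset */
	SKETCH_SWP_DEVICE_WRITE = 3,	/* value is a device private offset */
};

typedef struct {
	uint64_t val;
} sketch_entry_t;

static sketch_entry_t sketch_make_entry(enum sketch_swp_type type,
					uint64_t value)
{
	assert(value <= SKETCH_VALUE_MASK);
	return (sketch_entry_t){
		.val = ((uint64_t)type << SKETCH_TYPE_SHIFT) | value,
	};
}

static unsigned int sketch_entry_type(sketch_entry_t e)
{
	return (unsigned int)(e.val >> SKETCH_TYPE_SHIFT);
}

static uint64_t sketch_entry_value(sketch_entry_t e)
{
	return e.val & SKETCH_VALUE_MASK;
}

/* True if the value must be looked up in the device private space. */
static bool sketch_entry_is_device_private(sketch_entry_t e)
{
	switch (sketch_entry_type(e)) {
	case SKETCH_SWP_DEVICE_READ:
	case SKETCH_SWP_DEVICE_WRITE:
		return true;
	default:
		return false;
	}
}
```

Two entries can carry the same numeric value and still refer to different pages, so a caller has to check the type before choosing a lookup path.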
> 
> This applies when we are creating PTE entries for device private pages -
> because they have their own type, they must already be handled
> separately, so it is a small step to convert them to a device private
> PFN now too.
> 
> The first part of the series updates callers where device private
> offsets might now be encountered to track this extra state.
> 
> The last patch contains the bulk of the work where we change how we
> convert between device private pages and device private offsets and then
> use a new interface for allocating device private pages without the need
> for reserving physical address space.
> 
> By removing the device private pages from the physical address space,
> this series also opens up the possibility of moving away from tracking
> device private memory using struct pages in the future. This is
> desirable as on systems with large amounts of memory these device
> private struct pages use a significant amount of memory and take a
> significant amount of time to initialize.
> 
> *** Changes in v2 ***
> 
> The most significant change in v2 is addressing code paths that are
> common between MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT devices.
> 
> This had been overlooked in previous revisions.
> 
> To do this we introduce a migrate_pfn_from_page() helper which will call
> device_private_offset_to_page() and set the MIGRATE_PFN_DEVICE_PRIVATE
> flag if required.
> 
> In places where we could have a device private offset
> (MEMORY_DEVICE_PRIVATE) or a pfn (MEMORY_DEVICE_COHERENT) we update to
> use an mpfn to disambiguate.  This includes some users in the drivers
> and migrate_device_{pfns,range}().
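As a rough illustration of using an mpfn to carry this distinction, here is a self-contained user-space sketch in C. It is not taken from the series: the SKETCH_MPFN_* bit positions and the function names are hypothetical, and the actual MIGRATE_PFN_DEVICE_PRIVATE bit value and the real migrate_pfn_from_page() are not shown in this email. The sketch only demonstrates a flag bit travelling alongside the index so that an offset and a pfn with the same numeric value stay distinguishable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout; not the kernel's MIGRATE_PFN_* values. */
#define SKETCH_MPFN_VALID		(UINT64_C(1) << 0)
#define SKETCH_MPFN_DEVICE_PRIVATE	(UINT64_C(1) << 1)
#define SKETCH_MPFN_SHIFT		6

/*
 * Pack an index into an mpfn. The index is a device private offset when
 * device_private is true and an ordinary pfn otherwise.
 */
static uint64_t sketch_mpfn_encode(uint64_t index, bool device_private)
{
	uint64_t mpfn = (index << SKETCH_MPFN_SHIFT) | SKETCH_MPFN_VALID;

	if (device_private)
		mpfn |= SKETCH_MPFN_DEVICE_PRIVATE;
	return mpfn;
}

/* True if the packed index is a device private offset. */
static bool sketch_mpfn_is_device_private(uint64_t mpfn)
{
	return (mpfn & SKETCH_MPFN_DEVICE_PRIVATE) != 0;
}

/* Recover the raw index; the caller must check the flag to interpret it. */
static uint64_t sketch_mpfn_index(uint64_t mpfn)
{
	return mpfn >> SKETCH_MPFN_SHIFT;
}
```

A dedicated type, the alternative raised in the question that follows, would move the same distinction from a runtime flag into the type system.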
> 
> Seeking opinions on using the mpfns like this or if a new type would be
> preferred.
> 
>   - mm/migrate_device: Introduce migrate_pfn_from_page() helper
>     - New to series
> 
>   - drm/amdkfd: Use migrate pfns internally
>     - New to series
> 
>   - mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
>     - New to series
> 
>   - mm/migrate_device: Add migrate PFN flag to track device private pages
>     - Update for migrate_pfn_from_page()
>     - Rename to MIGRATE_PFN_DEVICE_PRIVATE
>     - drm/amd: Check adev->gmc.xgmi.connected_to_cpu
>     - lib/test_hmm.c: Check chunk->pagemap.type == MEMORY_DEVICE_PRIVATE
> 
>   - mm: Add helpers to create migration entries from struct pages
>     - Add a flags param
> 
>   - mm: Add a new swap type for migration entries of device private pages
>     - Add softleaf_is_migration_device_private_read()
> 
>   - mm: Add helpers to create device private entries from struct pages
>     - Add a flags param
> 
>   - mm: Remove device private pages from the physical address space
>     - Make sure last member of struct dev_pagemap remains DECLARE_FLEX_ARRAY(struct range, ranges);
> 
> Testing:
> - selftests/mm/hmm-tests on an amd64 VM
> 
> * NOTE: I will need help in testing the driver changes *
> 

Thanks for the series. For some reason Intel's CI couldn't apply this
series to drm-tip to get results [1]. I'll manually apply this, run all
our SVM tests, and get back to you with results and a review of the
changes here. For future reference, if you want to use our CI system, the
series must apply to drm-tip. Feel free to rebase this series and send it
to just the intel-xe list if you want CI results.

I was also wondering if Nvidia could help review one of our core MM
patches [2], which is also gating the enabling of 2M device pages?

Matt

[1] https://patchwork.freedesktop.org/series/159738/
[2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1 

> Revisions:
> - RFC: https://lore.kernel.org/all/20251128044146.80050-1-jniethe@nvidia.com/
> - v1: https://lore.kernel.org/all/20251231043154.42931-1-jniethe@nvidia.com/
> 
> [0] https://lore.kernel.org/lkml/CAMj1kXFZ=4hLL1w6iCV5O5uVoVLHAJbc0rr40j24ObenAjXe9w@mail.gmail.com/
> 
> Jordan Niethe (11):
>   mm/migrate_device: Introduce migrate_pfn_from_page() helper
>   drm/amdkfd: Use migrate pfns internally
>   mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
>   mm/migrate_device: Add migrate PFN flag to track device private pages
>   mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn to track
>     device private pages
>   mm: Add helpers to create migration entries from struct pages
>   mm: Add a new swap type for migration entries of device private pages
>   mm: Add helpers to create device private entries from struct pages
>   mm/util: Add flag to track device private pages in page snapshots
>   mm/hmm: Add flag to track device private pages
>   mm: Remove device private pages from the physical address space
> 
>  Documentation/mm/hmm.rst                 |  11 +-
>  arch/powerpc/kvm/book3s_hv_uvmem.c       |  43 ++---
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  45 +++---
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   2 +-
>  drivers/gpu/drm/drm_pagemap.c            |  11 +-
>  drivers/gpu/drm/nouveau/nouveau_dmem.c   |  45 ++----
>  drivers/gpu/drm/xe/xe_svm.c              |  37 ++---
>  fs/proc/page.c                           |   6 +-
>  include/drm/drm_pagemap.h                |   8 +-
>  include/linux/hmm.h                      |   7 +-
>  include/linux/leafops.h                  | 116 ++++++++++++--
>  include/linux/memremap.h                 |  64 +++++++-
>  include/linux/migrate.h                  |  23 ++-
>  include/linux/mm.h                       |   9 +-
>  include/linux/rmap.h                     |  33 +++-
>  include/linux/swap.h                     |   8 +-
>  include/linux/swapops.h                  | 136 ++++++++++++++++
>  lib/test_hmm.c                           |  86 ++++++----
>  mm/debug.c                               |   9 +-
>  mm/hmm.c                                 |   5 +-
>  mm/huge_memory.c                         |  43 ++---
>  mm/hugetlb.c                             |  15 +-
>  mm/memory.c                              |   5 +-
>  mm/memremap.c                            | 193 ++++++++++++++++++-----
>  mm/migrate.c                             |   6 +-
>  mm/migrate_device.c                      |  76 +++++----
>  mm/mm_init.c                             |   8 +-
>  mm/mprotect.c                            |  10 +-
>  mm/page_vma_mapped.c                     |  32 +++-
>  mm/rmap.c                                |  59 ++++---
>  mm/util.c                                |   8 +-
>  mm/vmscan.c                              |   2 +-
>  32 files changed, 822 insertions(+), 339 deletions(-)
> 
> 
> base-commit: f8f9c1f4d0c7a64600e2ca312dec824a0bc2f1da
> -- 
> 2.34.1
> 
