From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
dri-devel@lists.freedesktop.org, himal.prasad.ghimiray@intel.com,
apopple@nvidia.com, airlied@gmail.com,
"Simona Vetter" <simona.vetter@ffwll.ch>,
felix.kuehling@amd.com, "Matthew Brost" <matthew.brost@intel.com>,
"Christian König" <christian.koenig@amd.com>,
dakr@kernel.org, "Mrozek, Michal" <michal.mrozek@intel.com>,
"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>
Subject: [PATCH v5 00/24] Dynamic drm_pagemaps and Initial multi-device SVM
Date: Thu, 18 Dec 2025 17:20:37 +0100
Message-ID: <20251218162101.605379-1-thomas.hellstrom@linux.intel.com>
This series aims at providing an initial implementation of multi-device
SVM, where communication with peers (migration and direct execution out
of peer memory) uses some form of fast interconnect. In this series
we're using PCIe p2p.
In a multi-device environment, the struct pages for device-private memory
(the dev_pagemap) may take up a significant amount of system memory. We
therefore want to provide a means of revoking / removing the dev_pagemaps
not in use. In particular, when a device is offlined, we want to block
migrating *to* the device memory and migrate data already existing in the
device's memory to system memory. The dev_pagemap then becomes unused and
can be removed.
Removing and setting up a large dev_pagemap is also quite time-consuming,
so removal of unused dev_pagemaps only happens on system memory pressure
using a shrinker.
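As a rough illustration of the lifetime rules described above, here is a
minimal userspace sketch in C. It is not code from the series: the struct
and function names (pagemap_model, pagemap_model_get() and so on) are
hypothetical, and the expensive dev_pagemap setup is reduced to a boolean.
It only shows the intended behaviour: dropping the last reference does
not tear anything down, a shrink pass removes only unused pagemaps, and
an offlined device refuses new users.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the series. */
struct pagemap_model {
	int refcount;           /* active users, e.g. in-flight migrations */
	bool has_struct_pages;  /* the expensive dev_pagemap state is set up */
	bool offlined;          /* device offlined: block migration *to* it */
};

/*
 * Take a reference for a migration to the device, setting up the
 * expensive state on first use. Fails if the device is offlined.
 */
bool pagemap_model_get(struct pagemap_model *pm)
{
	if (pm->offlined)
		return false;
	if (!pm->has_struct_pages)
		pm->has_struct_pages = true; /* stands in for the slow setup */
	pm->refcount++;
	return true;
}

/* Drop a reference. Teardown is deliberately deferred to the shrinker. */
void pagemap_model_put(struct pagemap_model *pm)
{
	assert(pm->refcount > 0);
	pm->refcount--;
}

/*
 * Model of a shrink pass under memory pressure: tear down only the
 * pagemaps that are set up but unused. Returns how many were removed.
 */
size_t pagemap_model_shrink(struct pagemap_model *pms, size_t n)
{
	size_t freed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (pms[i].refcount == 0 && pms[i].has_struct_pages) {
			pms[i].has_struct_pages = false;
			freed++;
		}
	}
	return freed;
}
```

Keeping the unused state around until a shrink pass is what avoids
repeatedly paying the setup and removal cost when a pagemap is quickly
reused.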
Patch 1 is a small debug printout fix.
Patch 2 removes some dead code.
Patch 3 fixes a condition where memory was used while being cleared.
Patches 4-9 deal with dynamic drm_pagemaps as described above.
Patches 10-14 add infrastructure to handle remote drm_pagemaps with
fast interconnects.
Patch 15 extends the xe madvise() UAPI to handle remote drm_pagemaps.
Patch 16 adds a pcie-p2p dma SVM interconnect to the xe driver.
Patch 17 adds some SVM-related debug printouts for xe.
Patch 18 adds documentation on how the drm_pagemaps are reference counted.
Patch 19 cleans up the usage of the dev_private owner.
Patch 20 introduces a gpusvm function to scan the current CPU address space.
Patch 21 uses the above function in Xe to avoid unnecessary migrations.
Patch 22 adds drm_pagemap support for p2p destination migration.
Patch 23 adds drm_pagemap support for p2p source migration.
Patch 24 adds an rwsem to optionally serialize migration.
What's still missing is an implementation of migration policies.
That will be implemented in a follow-up series.
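To illustrate the idea behind patches 20 and 21 (scan first, migrate only
if needed), here is a small userspace sketch in C. It does not mirror the
signature or return values of drm_gpusvm_scan_mm(); the enum, the struct
and both functions are hypothetical, and the address space is modelled as
a plain array of per-page placements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placement of one page; not taken from the series. */
enum page_loc_model {
	LOC_SYSTEM,
	LOC_DEVICE_A,
	LOC_DEVICE_B,
};

/* Summary of one scan over a range. */
struct scan_result_model {
	size_t at_target;  /* pages already in the desired placement */
	size_t elsewhere;  /* pages that would have to move */
};

/* Walk the range once and count where the pages currently live. */
struct scan_result_model scan_range_model(const enum page_loc_model *pages,
					  size_t npages,
					  enum page_loc_model target)
{
	struct scan_result_model res = { 0, 0 };
	size_t i;

	assert(pages != NULL || npages == 0);
	for (i = 0; i < npages; i++) {
		if (pages[i] == target)
			res.at_target++;
		else
			res.elsewhere++;
	}
	return res;
}

/* A migration is only worth starting if at least one page must move. */
bool migration_needed_model(const struct scan_result_model *res)
{
	return res->elsewhere > 0;
}
```

In this model a range that is already fully resident in the target
placement is detected by the scan alone, so the comparatively expensive
migration path is never entered for it.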
v2:
- Address review comments from Matt Brost.
- Fix compilation issues reported by automated testing
- Add patch 1, 17.
- What's now patch 16 was extended to support p2p migration.
v3:
- Add patches 2, 18, 19, 10, 22. Main functionality is the address space
scan to avoid unnecessary migration, and p2p source migration which
is needed on Xe to decompress and to flush out the L2 cache.
- Rework what's now Patch 21 slightly.
- Minor fixes all over the place.
v4:
- Fix a build error (CI)
- Fix possibly incorrect waiting for the pre_migrate_fence.
v5:
- New patch: broken out from patch 22: drm/pagemap: Remove some dead code
(Matt Brost)
- New patch: drm/xe/svm: Serialize migration to device if racing
(Matt Brost)
- Fix a UAF in what's now patch 3. (CI)
- Release the migrate fence early in patch 3.
- Address review comments to patch 3. See the patch for details.
- Address review comments to patch 22. See the patch for details.
- Rebase, update R-Bs.
Test-with: 20251204085432.35023-1-nishit.sharma@intel.com
Thomas Hellström (24):
  drm/xe/svm: Fix a debug printout
  drm/pagemap: Remove some dead code
  drm/pagemap, drm/xe: Ensure that the devmem allocation is idle before
    use
  drm/pagemap, drm/xe: Add refcounting to struct drm_pagemap
  drm/pagemap: Add a refcounted drm_pagemap backpointer to struct
    drm_pagemap_zdd
  drm/pagemap, drm/xe: Manage drm_pagemap provider lifetimes
  drm/pagemap: Add a drm_pagemap cache and shrinker
  drm/xe: Use the drm_pagemap cache and shrinker
  drm/pagemap: Remove the drm_pagemap_create() interface
  drm/pagemap_util: Add a utility to assign an owner to a set of
    interconnected gpus
  drm/xe: Use the drm_pagemap_util helper to get a svm pagemap owner
  drm/xe: Pass a drm_pagemap pointer around with the memory advise
    attributes
  drm/xe: Use the vma attibute drm_pagemap to select where to migrate
  drm/xe: Simplify madvise_preferred_mem_loc()
  drm/xe/uapi: Extend the madvise functionality to support foreign
    pagemap placement for svm
  drm/xe: Support pcie p2p dma as a fast interconnect
  drm/xe/vm: Add a couple of VM debug printouts
  drm/xe/svm: Document how xe keeps drm_pagemap references
  drm/pagemap, drm/xe: Clean up the use of the device-private page owner
  drm/gpusvm: Introduce a function to scan the current migration state
  drm/xe: Use drm_gpusvm_scan_mm()
  drm/pagemap, drm/xe: Support destination migration over interconnect
  drm/pagemap: Support source migration over interconnect
  drm/xe/svm: Serialize migration to device if racing

 drivers/gpu/drm/Makefile             |   3 +-
 drivers/gpu/drm/drm_gpusvm.c         | 124 +++++
 drivers/gpu/drm/drm_pagemap.c        | 565 +++++++++++++++++---
 drivers/gpu/drm/drm_pagemap_util.c   | 568 +++++++++++++++++++++
 drivers/gpu/drm/xe/xe_device.c       |  20 +
 drivers/gpu/drm/xe/xe_device.h       |   2 +
 drivers/gpu/drm/xe/xe_device_types.h |   5 +
 drivers/gpu/drm/xe/xe_migrate.c      |   4 +-
 drivers/gpu/drm/xe/xe_svm.c          | 738 +++++++++++++++++++++++----
 drivers/gpu/drm/xe/xe_svm.h          |  85 ++-
 drivers/gpu/drm/xe/xe_tile.c         |  34 +-
 drivers/gpu/drm/xe/xe_tile.h         |  21 +
 drivers/gpu/drm/xe/xe_userptr.c      |   2 +-
 drivers/gpu/drm/xe/xe_vm.c           |  65 ++-
 drivers/gpu/drm/xe/xe_vm.h           |   1 +
 drivers/gpu/drm/xe/xe_vm_madvise.c   | 106 +++-
 drivers/gpu/drm/xe/xe_vm_types.h     |  21 +-
 drivers/gpu/drm/xe/xe_vram_types.h   |  15 +-
 include/drm/drm_gpusvm.h             |  29 ++
 include/drm/drm_pagemap.h            | 128 ++++-
 include/drm/drm_pagemap_util.h       |  92 ++++
 include/uapi/drm/xe_drm.h            |  18 +-
 22 files changed, 2355 insertions(+), 291 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_pagemap_util.c
 create mode 100644 include/drm/drm_pagemap_util.h

--
2.51.1
Thread overview: 37+ messages
2025-12-18 16:20 Thomas Hellström [this message]
2025-12-18 16:20 ` [PATCH v5 01/24] drm/xe/svm: Fix a debug printout Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 02/24] drm/pagemap: Remove some dead code Thomas Hellström
2025-12-18 18:16 ` Matthew Brost
2025-12-18 16:20 ` [PATCH v5 03/24] drm/pagemap, drm/xe: Ensure that the devmem allocation is idle before use Thomas Hellström
2025-12-18 18:33 ` Matthew Brost
2025-12-18 19:18 ` Thomas Hellström
2025-12-18 19:33 ` Matthew Brost
2025-12-18 16:20 ` [PATCH v5 04/24] drm/pagemap, drm/xe: Add refcounting to struct drm_pagemap Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 05/24] drm/pagemap: Add a refcounted drm_pagemap backpointer to struct drm_pagemap_zdd Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 06/24] drm/pagemap, drm/xe: Manage drm_pagemap provider lifetimes Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 07/24] drm/pagemap: Add a drm_pagemap cache and shrinker Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 08/24] drm/xe: Use the " Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 09/24] drm/pagemap: Remove the drm_pagemap_create() interface Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 10/24] drm/pagemap_util: Add a utility to assign an owner to a set of interconnected gpus Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 11/24] drm/xe: Use the drm_pagemap_util helper to get a svm pagemap owner Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 12/24] drm/xe: Pass a drm_pagemap pointer around with the memory advise attributes Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 13/24] drm/xe: Use the vma attibute drm_pagemap to select where to migrate Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 14/24] drm/xe: Simplify madvise_preferred_mem_loc() Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 15/24] drm/xe/uapi: Extend the madvise functionality to support foreign pagemap placement for svm Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 16/24] drm/xe: Support pcie p2p dma as a fast interconnect Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 17/24] drm/xe/vm: Add a couple of VM debug printouts Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 18/24] drm/xe/svm: Document how xe keeps drm_pagemap references Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 19/24] drm/pagemap, drm/xe: Clean up the use of the device-private page owner Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 20/24] drm/gpusvm: Introduce a function to scan the current migration state Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 21/24] drm/xe: Use drm_gpusvm_scan_mm() Thomas Hellström
2025-12-18 16:20 ` [PATCH v5 22/24] drm/pagemap, drm/xe: Support destination migration over interconnect Thomas Hellström
2025-12-18 18:40 ` Matthew Brost
2025-12-18 16:21 ` [PATCH v5 23/24] drm/pagemap: Support source " Thomas Hellström
2025-12-18 20:36 ` Matthew Brost
2025-12-18 23:01 ` Matthew Brost
2025-12-18 16:21 ` [PATCH v5 24/24] drm/xe/svm: Serialize migration to device if racing Thomas Hellström
2025-12-18 19:03 ` Matthew Brost
2025-12-18 16:58 ` ✗ CI.checkpatch: warning for Dynamic drm_pagemaps and Initial multi-device SVM (rev6) Patchwork
2025-12-18 16:59 ` ✓ CI.KUnit: success " Patchwork
2025-12-18 17:38 ` ✓ Xe.CI.BAT: " Patchwork
2025-12-19 14:56 ` ✓ Xe.CI.Full: " Patchwork