From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: "Ghimiray, Himal Prasad" <himal.prasad.ghimiray@intel.com>,
intel-xe@lists.freedesktop.org
Cc: "Matthew Brost" <matthew.brost@intel.com>,
dri-devel@lists.freedesktop.org, apopple@nvidia.com,
airlied@gmail.com, "Simona Vetter" <simona.vetter@ffwll.ch>,
"Felix Kühling" <felix.kuehling@amd.com>,
"Philip Yang" <philip.yang@amd.com>,
"Christian König" <christian.koenig@amd.com>,
dakr@kernel.org, "Mrozek, Michal" <michal.mrozek@intel.com>,
"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>
Subject: Re: [PATCH v3 1/3] drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemap
Date: Tue, 17 Jun 2025 16:55:26 +0200
Message-ID: <5da4b6f81517a6c091411a92e1c9d3fd5b902dba.camel@linux.intel.com>
In-Reply-To: <e4d37eb0-ec91-47c5-b888-0b5bbb74db84@intel.com>
On Tue, 2025-06-17 at 20:17 +0530, Ghimiray, Himal Prasad wrote:
>
>
> On 17-06-2025 18:41, Thomas Hellström wrote:
> > On Tue, 2025-06-17 at 18:25 +0530, Ghimiray, Himal Prasad wrote:
> > >
> > >
> > > On 13-06-2025 19:32, Thomas Hellström wrote:
> > > > From: Matthew Brost <matthew.brost@intel.com>
> > > >
> > > > > The migration functionality and track-keeping of per-pagemap VRAM
> > > > > mapped to the CPU mm is not per GPU_vm, but rather per pagemap.
> > > > > This is also reflected by the functions not needing the drm_gpusvm
> > > > > structures. So move to drm_pagemap.
> > > > >
> > > > > With this, drm_gpusvm shouldn't really access the page
> > > > > zone-device-data since its meaning is internal to drm_pagemap.
> > > > > Currently it's used to reject mapping ranges backed by multiple
> > > > > drm_pagemap allocations. For now, make the zone-device-data a void
> > > > > pointer.
> > > > >
> > > > > Alter the interface of drm_gpusvm_migrate_to_devmem() to ensure we
> > > > > don't pass a gpusvm pointer.
> > > > >
> > > > > Rename CONFIG_DRM_XE_DEVMEM_MIRROR to CONFIG_DRM_XE_PAGEMAP.
> > > > >
> > > > > Matt is listed as author of this commit since he wrote most of the
> > > > > code, and it makes sense to retain his git authorship.
> > > > > Thomas mostly moved the code around.
> > >
> > > >
> > > > v3:
> > > > - Kerneldoc fixes (CI)
> > > > - Don't update documentation about how the drm_pagemap
> > > > migration should be interpreted until upcoming
> > > > patches where the functionality is implemented.
> > > > (Matt Brost)
> > > >
> > > > > Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > > > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > > ---
> > > > Documentation/gpu/rfc/gpusvm.rst | 12 +-
> > > > drivers/gpu/drm/Makefile | 6 +-
> > > > > drivers/gpu/drm/drm_gpusvm.c | 759 +-------------------------
> > > > > drivers/gpu/drm/drm_pagemap.c | 788 +++++++++++++++++++++++++++
> > > > drivers/gpu/drm/xe/Kconfig | 10 +-
> > > > drivers/gpu/drm/xe/xe_bo_types.h | 2 +-
> > > > drivers/gpu/drm/xe/xe_device_types.h | 2 +-
> > > > drivers/gpu/drm/xe/xe_svm.c | 47 +-
> > > > include/drm/drm_gpusvm.h | 96 ----
> > > > include/drm/drm_pagemap.h | 101 ++++
> > > > 10 files changed, 950 insertions(+), 873 deletions(-)
> > > > create mode 100644 drivers/gpu/drm/drm_pagemap.c
> > > >
> > > > > diff --git a/Documentation/gpu/rfc/gpusvm.rst b/Documentation/gpu/rfc/gpusvm.rst
> > > > index bcf66a8137a6..469db1372f16 100644
> > > > --- a/Documentation/gpu/rfc/gpusvm.rst
> > > > +++ b/Documentation/gpu/rfc/gpusvm.rst
> > > > @@ -73,15 +73,21 @@ Overview of baseline design
> > > > .. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> > > > :doc: Locking
> > > >
> > > > -.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> > > > - :doc: Migration
> > > > -
> > > > .. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> > > > :doc: Partial Unmapping of Ranges
> > > >
> > > > .. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> > > > :doc: Examples
> > > >
> > > > +Overview of drm_pagemap design
> > > > +==============================
> > > > +
> > > > +.. kernel-doc:: drivers/gpu/drm/drm_pagemap.c
> > > > + :doc: Overview
> > > > +
> > > > +.. kernel-doc:: drivers/gpu/drm/drm_pagemap.c
> > > > + :doc: Migration
> > > > +
> > > > Possible future design features
> > > > ===============================
> > > >
> > > > > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > > > > index 5050ac32bba2..4dafbdc8f86a 100644
> > > > > --- a/drivers/gpu/drm/Makefile
> > > > > +++ b/drivers/gpu/drm/Makefile
> > > > > @@ -104,7 +104,11 @@ obj-$(CONFIG_DRM_PANEL_BACKLIGHT_QUIRKS) += drm_panel_backlight_quirks.o
> > > > #
> > > > obj-$(CONFIG_DRM_EXEC) += drm_exec.o
> > > > obj-$(CONFIG_DRM_GPUVM) += drm_gpuvm.o
> > > > -obj-$(CONFIG_DRM_GPUSVM) += drm_gpusvm.o
> > > > +
> > > > +drm_gpusvm_helper-y := \
> > > > + drm_gpusvm.o\
> > > > + drm_pagemap.o
> > > > +obj-$(CONFIG_DRM_GPUSVM) += drm_gpusvm_helper.o
> > > >
> > > > obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
> > > >
> > > > > diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> > > > index 7ff81aa0a1ca..ef81381609de 100644
> > > > --- a/drivers/gpu/drm/drm_gpusvm.c
> > > > +++ b/drivers/gpu/drm/drm_gpusvm.c
> > > > @@ -8,10 +8,9 @@
> > > >
> > > > #include <linux/dma-mapping.h>
> > > > #include <linux/hmm.h>
> > > > +#include <linux/hugetlb_inline.h>
> > > > #include <linux/memremap.h>
> > > > -#include <linux/migrate.h>
> > > > #include <linux/mm_types.h>
> > > > -#include <linux/pagemap.h>
> > > > #include <linux/slab.h>
> > > >
> > > > #include <drm/drm_device.h>
> > > > @@ -107,21 +106,6 @@
> > > > * to add annotations to GPU SVM.
> > > > */
> > > >
> > > > -/**
> > > > - * DOC: Migration
> > > > - *
> > > > > - * The migration support is quite simple, allowing migration between RAM and
> > > > > - * device memory at the range granularity. For example, GPU SVM currently does
> > > > > - * not support mixing RAM and device memory pages within a range. This means
> > > > > - * that upon GPU fault, the entire range can be migrated to device memory, and
> > > > > - * upon CPU fault, the entire range is migrated to RAM. Mixed RAM and device
> > > > > - * memory storage within a range could be added in the future if required.
> > > > > - *
> > > > > - * The reasoning for only supporting range granularity is as follows: it
> > > > > - * simplifies the implementation, and range sizes are driver-defined and should
> > > > > - * be relatively small.
> > > > - */
> > > > -
> > > > /**
> > > > * DOC: Partial Unmapping of Ranges
> > > > *
> > > > @@ -193,10 +177,9 @@
> > > > * if (driver_migration_policy(range)) {
> > > > * mmap_read_lock(mm);
> > > > * devmem = driver_alloc_devmem();
> > > > > - * err = drm_gpusvm_migrate_to_devmem(gpusvm, range,
> > > > > - * devmem_allocation,
> > > > > - * &ctx);
> > > > > - * mmap_read_unlock(mm);
> > > > > + * err = drm_pagemap_migrate_to_devmem(devmem, gpusvm->mm, gpuva_start,
> > > > > + * gpuva_end, driver_pgmap_owner());
> > >
> > >
> > >
> > > > Fix the doc to pass the timeslice as a parameter.
> >
> > Will fix.
> >
> > >
> >
> > 8<-----------------------------------------------------------------
> > ----
> > > > +/**
> > > > > + * drm_pagemap_migrate_to_devmem() - Migrate a struct mm_struct range to device memory
> > > > > + * @devmem_allocation: The device memory allocation to migrate to.
> > > > > + * The caller should hold a reference to the device memory allocation,
> > > > > + * and the reference is consumed by this function unless it returns with
> > > > > + * an error.
> > > > + * @mm: Pointer to the struct mm_struct.
> > > > + * @start: Start of the virtual address range to migrate.
> > > > + * @end: End of the virtual address range to migrate.
> > > > + * @timeslice_ms: The time requested for the migrated pages to
> > > > + * be present in the cpu memory map before migrated back.
> > >
> > > > Shouldn't this be "present in the GPU or CPU memory map"? We are
> > > > using this to ensure page faults can be handled effectively by
> > > > ensuring pages remain in VRAM here for the prescribed time too.
> >
> > So with this split, drm_pagemap is responsible for migrating memory
> > and updating the CPU memory map only, whereas drm_gpusvm is
> > responsible for setting up the GPU memory maps. So if it remains in
> > the CPU memory map, then nothing will force the gpu vms to invalidate,
> > unless the gpu driver decides to invalidate itself.
>
> That's true.
>
> >
> > But looking at this I should probably rephrase "before migrated back"
> > to "before being allowed to be migrated back".
>
> The confusion for me is that timeslice_ms does not represent the time
> pages are required to stay in the CPU memory map, but rather the time
> they must remain in the GPU memory map. We defer migrate_to_smem until
> this timeslice has expired.
Yeah, although drm_pagemap is not aware of any GPU memory map, so it
would be incorrect to bring that up in the API docs. I could add some
discussion, though, along the lines of "this could be used to..." and
give the typical GPU use-case?
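
To illustrate the semantics being discussed, here is a minimal userspace
sketch of "not allowed to be migrated back until the timeslice has
expired". It is not kernel code: struct model_devmem and the model_*()
helpers are hypothetical names invented for this illustration, and time
is passed in as a plain millisecond counter that is assumed to be
monotonic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one device memory allocation; not a kernel struct. */
struct model_devmem {
	uint64_t migrated_at_ms;	/* time of the migration to device memory */
	uint64_t timeslice_ms;		/* requested minimum residency */
	bool in_devmem;			/* true while pages live in device memory */
};

/* Model a migration to device memory: record when it happened and for how long. */
static void model_migrate_to_devmem(struct model_devmem *d, uint64_t now_ms,
				    uint64_t timeslice_ms)
{
	assert(d);
	d->migrated_at_ms = now_ms;
	d->timeslice_ms = timeslice_ms;
	d->in_devmem = true;
}

/* True once the timeslice has expired (assumes now_ms never goes backwards). */
static bool model_may_migrate_back(const struct model_devmem *d, uint64_t now_ms)
{
	assert(d);
	return d->in_devmem && now_ms - d->migrated_at_ms >= d->timeslice_ms;
}

/*
 * Model a request to move the pages back to system memory.
 * Returns true if the pages were moved, false if the request is deferred
 * because the timeslice has not expired yet.
 */
static bool model_try_migrate_to_ram(struct model_devmem *d, uint64_t now_ms)
{
	if (!model_may_migrate_back(d, now_ms))
		return false;
	d->in_devmem = false;
	return true;
}
```

In this model the allocation only tracks residency in device memory. A
GPU driver could rely on that window to keep its own GPU mappings
useful, which is the typical use-case mentioned above, but the model
itself knows nothing about GPU memory maps.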
Thanks,
Thomas