Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	himal.prasad.ghimiray@intel.com, apopple@nvidia.com,
	airlied@gmail.com, "Simona Vetter" <simona.vetter@ffwll.ch>,
	"Felix Kühling" <felix.kuehling@amd.com>,
	"Philip Yang" <philip.yang@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	dakr@kernel.org, "Mrozek, Michal" <michal.mrozek@intel.com>,
	"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>
Subject: Re: [PATCH v3 2/3] drm/pagemap: Add a populate_mm op
Date: Tue, 17 Jun 2025 10:05:15 -0700
Message-ID: <aFGgS0qa2YvsB6AQ@lstrano-desk.jf.intel.com>
In-Reply-To: <20250613140219.87479-3-thomas.hellstrom@linux.intel.com>

On Fri, Jun 13, 2025 at 04:02:18PM +0200, Thomas Hellström wrote:
> Add an operation to populate a part of a drm_mm with device
> private memory. Clarify how migration using it is intended
> to work.
> 
> v3:
> - Kerneldoc fixes and updates (Matt Brost).
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1

Review is still valid on this rev.

Matt

> ---
>  drivers/gpu/drm/drm_gpusvm.c  |  7 ++--
>  drivers/gpu/drm/drm_pagemap.c | 67 ++++++++++++++++++++++++++++-------
>  include/drm/drm_pagemap.h     | 34 ++++++++++++++++++
>  3 files changed, 91 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index ef81381609de..51afc8a9704d 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -175,11 +175,8 @@
>   *		}
>   *
>   *		if (driver_migration_policy(range)) {
> - *			mmap_read_lock(mm);
> - *			devmem = driver_alloc_devmem();
> - *			err = drm_pagemap_migrate_to_devmem(devmem, gpusvm->mm, gpuva_start,
> - *                                                          gpuva_end, driver_pgmap_owner());
> - *                      mmap_read_unlock(mm);
> + *			err = drm_pagemap_populate_mm(driver_choose_drm_pagemap(),
> + *                                                    gpuva_start, gpuva_end, gpusvm->mm);
>   *			if (err)	// CPU mappings may have changed
>   *				goto retry;
>   *		}
> diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> index b7a0e6d15aff..e4120c8db262 100644
> --- a/drivers/gpu/drm/drm_pagemap.c
> +++ b/drivers/gpu/drm/drm_pagemap.c
> @@ -6,6 +6,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/migrate.h>
>  #include <linux/pagemap.h>
> +#include <drm/drm_drv.h>
>  #include <drm/drm_pagemap.h>
>  
>  /**
> @@ -20,23 +21,30 @@
>   * system.
>   *
>   * Typically the DRM pagemap receives requests from one or more DRM GPU SVM
> - * instances to populate struct mm_struct virtual ranges with memory.
> + * instances to populate struct mm_struct virtual ranges with memory, and the
> + * migration is best effort only and may thus fail. The implementation should
> + * also handle device unbinding by blocking (return an -ENODEV) error for new
> + * population requests and after that migrate all device pages to system ram.
>   */
>  
>  /**
>   * DOC: Migration
>   *
> - * The migration support is quite simple, allowing migration between RAM and
> - * device memory at the range granularity. For example, GPU SVM currently does
> - * not support mixing RAM and device memory pages within a range. This means
> - * that upon GPU fault, the entire range can be migrated to device memory, and
> - * upon CPU fault, the entire range is migrated to RAM. Mixed RAM and device
> - * memory storage within a range could be added in the future if required.
> - *
> - * The reasoning for only supporting range granularity is as follows: it
> - * simplifies the implementation, and range sizes are driver-defined and should
> - * be relatively small.
> - *
> + * Migration granularity typically follows the GPU SVM range requests, but
> + * if there are clashes, due to races or due to the fact that multiple GPU
> + * SVM instances have different views of the ranges used, and because of that
> + * parts of a requested range is already present in the requested device memory,
> + * the implementation has a variety of options. It can fail and it can choose
> + * to populate only the part of the range that isn't already in device memory,
> + * and it can evict the range to system before trying to migrate. Ideally an
> + * implementation would just try to migrate the missing part of the range and
> + * allocate just enough memory to do so.
> + *
> + * When migrating to system memory as a response to a cpu fault or a device
> + * memory eviction request, currently a full device memory allocation is
> + * migrated back to system. Moving forward this might need improvement for
> + * situations where a single page needs bouncing between system memory and
> + * device memory due to, for example, atomic operations.
>   *
>   * Key DRM pagemap components:
>   *
> @@ -786,3 +794,38 @@ struct drm_pagemap *drm_pagemap_page_to_dpagemap(struct page *page)
>  	return zdd->devmem_allocation->dpagemap;
>  }
>  EXPORT_SYMBOL_GPL(drm_pagemap_page_to_dpagemap);
> +
> +/**
> + * drm_pagemap_populate_mm() - Populate a virtual range with device memory pages
> + * @dpagemap: Pointer to the drm_pagemap managing the device memory
> + * @start: Start of the virtual range to populate.
> + * @end: End of the virtual range to populate.
> + * @mm: Pointer to the virtual address space.
> + * @timeslice_ms: The time requested for the migrated pages to
> + * be present in the cpu memory map before migrated back.
> + *
> + * Attempt to populate a virtual range with device memory pages,
> + * clearing them or migrating data from the existing pages if necessary.
> + * The function is best effort only, and implementations may vary
> + * in how hard they try to satisfy the request.
> + *
> + * Return: %0 on success, negative error code on error. If the hardware
> + * device was removed / unbound the function will return %-ENODEV.
> + */
> +int drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
> +			    unsigned long start, unsigned long end,
> +			    struct mm_struct *mm,
> +			    unsigned long timeslice_ms)
> +{
> +	int err;
> +
> +	if (!mmget_not_zero(mm))
> +		return -EFAULT;
> +	mmap_read_lock(mm);
> +	err = dpagemap->ops->populate_mm(dpagemap, start, end, mm,
> +					 timeslice_ms);
> +	mmap_read_unlock(mm);
> +	mmput(mm);
> +
> +	return err;
> +}
> diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h
> index dabc9c365df4..e5f20a1235be 100644
> --- a/include/drm/drm_pagemap.h
> +++ b/include/drm/drm_pagemap.h
> @@ -92,6 +92,35 @@ struct drm_pagemap_ops {
>  			     struct device *dev,
>  			     struct drm_pagemap_device_addr addr);
>  
> +	/**
> +	 * @populate_mm: Populate part of the mm with @dpagemap memory,
> +	 * migrating existing data.
> +	 * @dpagemap: The struct drm_pagemap managing the memory.
> +	 * @start: The virtual start address in @mm
> +	 * @end: The virtual end address in @mm
> +	 * @mm: Pointer to a live mm. The caller must have an mmget()
> +	 * reference.
> +	 *
> +	 * The caller will have the mm lock at least in read mode.
> +	 * Note that there is no guarantee that the memory is resident
> +	 * after the function returns, it's best effort only.
> +	 * When the mm is not using the memory anymore,
> +	 * it will be released. The struct drm_pagemap might have a
> +	 * mechanism in place to reclaim the memory and the data will
> +	 * then be migrated. Typically to system memory.
> +	 * The implementation should hold sufficient runtime power-
> +	 * references while pages are used in an address space and
> +	 * should ideally guard against hardware device unbind in
> +	 * a way such that device pages are migrated back to system
> +	 * followed by device page removal. The implementation should
> +	 * return -ENODEV after device removal.
> +	 *
> +	 * Return: 0 if successful. Negative error code on error.
> +	 */
> +	int (*populate_mm)(struct drm_pagemap *dpagemap,
> +			   unsigned long start, unsigned long end,
> +			   struct mm_struct *mm,
> +			   unsigned long timeslice_ms);
>  };
>  
>  /**
> @@ -205,4 +234,9 @@ void drm_pagemap_devmem_init(struct drm_pagemap_devmem *devmem_allocation,
>  			     const struct drm_pagemap_devmem_ops *ops,
>  			     struct drm_pagemap *dpagemap, size_t size);
>  
> +int drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
> +			    unsigned long start, unsigned long end,
> +			    struct mm_struct *mm,
> +			    unsigned long timeslice_ms);
> +
>  #endif
> -- 
> 2.49.0
> 
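
For readers following the calling convention of the new wrapper, here is a
minimal userspace sketch of the pattern quoted above: take an mm reference,
hold the mm lock in read mode around the op, then drop both. It is not kernel
code. Every mock_* name is a hypothetical stand-in invented for illustration,
the counters only model mmget_not_zero()/mmput() and
mmap_read_lock()/mmap_read_unlock(), and the sample op does no migration at
all.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct mm_struct (not the kernel type). */
struct mock_mm {
	int users;       /* models the mm_users reference count */
	int read_locked; /* models how many readers hold the mm lock */
};

struct mock_pagemap;

/* Hypothetical stand-in for struct drm_pagemap_ops. */
struct mock_pagemap_ops {
	int (*populate_mm)(struct mock_pagemap *pm,
			   unsigned long start, unsigned long end,
			   struct mock_mm *mm, unsigned long timeslice_ms);
};

/* Hypothetical stand-in for struct drm_pagemap. */
struct mock_pagemap {
	const struct mock_pagemap_ops *ops;
	int unbound; /* assumption: set once the device has been unbound */
};

/* Models mmget_not_zero(): only succeeds while the mm is still live. */
static int mock_mmget_not_zero(struct mock_mm *mm)
{
	if (mm->users == 0)
		return 0;
	mm->users++;
	return 1;
}

/* Models mmput(). */
static void mock_mmput(struct mock_mm *mm)
{
	mm->users--;
}

/* Sample op: refuses new requests after unbind, otherwise "succeeds". */
static int mock_populate_mm(struct mock_pagemap *pm,
			    unsigned long start, unsigned long end,
			    struct mock_mm *mm, unsigned long timeslice_ms)
{
	(void)timeslice_ms; /* unused in this sketch */

	if (pm->unbound)
		return -ENODEV;
	/* The op relies on the caller holding the lock in read mode. */
	if (!mm->read_locked || start >= end)
		return -EINVAL;
	return 0;
}

static const struct mock_pagemap_ops mock_ops = {
	.populate_mm = mock_populate_mm,
};

/* Mirrors the shape of the wrapper in the quoted patch. */
static int mock_pagemap_populate_mm(struct mock_pagemap *pm,
				    unsigned long start, unsigned long end,
				    struct mock_mm *mm,
				    unsigned long timeslice_ms)
{
	int err;

	assert(pm && pm->ops && pm->ops->populate_mm);

	if (!mock_mmget_not_zero(mm))
		return -EFAULT;
	mm->read_locked++;
	err = pm->ops->populate_mm(pm, start, end, mm, timeslice_ms);
	mm->read_locked--;
	mock_mmput(mm);

	return err;
}
```

In this sketch a caller that starts with users == 1 gets 0 back while the
device is bound, -ENODEV once unbound is set, and -EFAULT if the mm has
already dropped to zero users. On every path the reference count and lock
depth return to their starting values.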
