Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Honglei Huang <honghuan@amd.com>
Cc: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v7 4/5] drm/gpusvm: move struct drm_gpusvm_pages out of struct drm_gpusvm_range
Date: Mon, 29 Jun 2026 09:27:56 -0700	[thread overview]
Message-ID: <akKdDBjNvTpqJHem@gsse-cloud1.jf.intel.com> (raw)
In-Reply-To: <20260629022921.17533-5-honghuan@amd.com>

On Mon, Jun 29, 2026 at 10:29:20AM +0800, Honglei Huang wrote:
> Since the pages the physical pages and MM VA range has been abstractly
> separated. Unbinding a single form of physical page from the MM VA
> range, brings flexibility to the drm gpu SVM framework, transfer the
> way of management of MM and device physical pages to the driver layer.
> 
> framework's range embedded pages object and its range level wrappers
> have no users left. Remove the following:
> 
>   - Drop pages in drm_gpusvm_range.
>   - Drop drm_gpusvm_range_pages_valid(), drm_gpusvm_range_get_pages()
>     and drm_gpusvm_range_unmap_pages(); drivers should use the
>     drm_gpusvm_pages helpers (drm_gpusvm_pages_valid,
>     drm_gpusvm_get_pages, drm_gpusvm_unmap_pages) directly on a
>     pages object they own.
>   - Drop the notifier_seq seeding in drm_gpusvm_range_alloc();
>     drivers initialise notifier_seq on their own pages object.
> 
> Update the DOC: overview to match the new model: document struct
> drm_gpusvm_pages and the driver owned 1:1 / N:1 layouts, and rewrite the
> usage examples to operate on a driver embedded pages object by the
> drm_gpusvm_pages helpers instead of the removed range level wrappers.
> 
> Suggested-by: Matthew Brost <matthew.brost@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> Signed-off-by: Honglei Huang <honghuan@amd.com>
> ---
>  drivers/gpu/drm/drm_gpusvm.c | 169 +++++++++++++++++++----------------
>  include/drm/drm_gpusvm.h     |  13 ---
>  2 files changed, 90 insertions(+), 92 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index b0e9a2db108..e0fd0b2fcc5 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -48,6 +48,47 @@
>   *	event. As mentioned above, ranges are tracked in a notifier's Red-Black
>   *	tree.
>   *
> + * - Pages:
> + *	struct drm_gpusvm_pages holds the DMA mapping state for a range of
> + *	CPU virtual addresses: the DMA mapped device addresses,
> + *	the device private pagemap, the IOVA state, the per mapping
> + *	notifier sequence number, and the drm_device that owns the DMA
> + *	mappings.
> + *	A driver embeds one or more struct drm_gpusvm_pages alongside its
> + *	struct drm_gpusvm_range, choosing one of two layouts:
> + *
> + *	1:1 - one drm_gpusvm_pages per range (one drm_device). Simplest
> + *	layout; to mirror a VA range on several devices a driver uses a
> + *	separate range (and notifier) per device, so the HMM fault is taken
> + *	once per device.
> + *
> + *	N:1 - one drm_gpusvm_pages per drm_device, all sharing one range and
> + *	notifier; only the per-device DMA mapping differs. The instances must
> + *	sit in contiguous memory so a single drm_gpusvm_range_set_unmapped()
> + *	can mark them all. A driver can keep one instance inline for the single
> + *	device case and switch to a heap array only when more devices join,
> + *	e.g.:
> + *
> + *	.. code-block:: c
> + *
> + *		struct driver_range {
> + *			struct drm_gpusvm_range base;
> + *			unsigned int num_pages;	// 1: inline_pages, >1: pages[]
> + *			union {
> + *				struct drm_gpusvm_pages inline_pages;
> + *				struct drm_gpusvm_pages *pages;
> + *			};
> + *		};
> + *
> + *	In the N:1 case the driver allocates the pages array with a zeroing
> + *	allocator (e.g. kcalloc(num_pages, ...)), initialises each entry with
> + *	drm_gpusvm_init_pages(), and frees each entry with
> + *	drm_gpusvm_free_pages() plus the array itself, from its range free
> + *	callback. Each drm_gpusvm_pages is mapped independently by their own
> + *	drm_device.
> + *	Each drm_gpusvm_pages must be zero-initialised and initialised with
> + *	drm_gpusvm_init_pages(), called once per entry.
> + *
>   * - Operations:
>   *	Define the interface for driver-specific GPU SVM operations such as
>   *	range allocation, notifier allocation, and invalidations.
> @@ -92,7 +133,7 @@
>   * range RB tree and list, as well as the range's DMA mappings and sequence
>   * number. GPU SVM manages all necessary locking and unlocking operations,
>   * except for the recheck range's pages being valid
> - * (drm_gpusvm_range_pages_valid) when the driver is committing GPU bindings.
> + * (drm_gpusvm_pages_valid) when the driver is committing GPU bindings.
>   * This lock corresponds to the ``driver->update`` lock mentioned in
>   * Documentation/mm/hmm.rst. Future revisions may transition from a GPU SVM
>   * global lock to a per-notifier lock if finer-grained locking is deemed
> @@ -140,15 +181,20 @@
>   *
>   * .. code-block:: c
>   *
> - *	int driver_bind_range(struct drm_gpusvm *gpusvm, struct drm_gpusvm_range *range)
> + *	struct driver_range {
> + *		struct drm_gpusvm_range base;
> + *		struct drm_gpusvm_pages pages;
> + *	};
> + *
> + *	int driver_bind_range(struct drm_gpusvm *gpusvm, struct driver_range *drange)
>   *	{
>   *		int err = 0;
>   *
> - *		driver_alloc_and_setup_memory_for_bind(gpusvm, range);
> + *		driver_alloc_and_setup_memory_for_bind(gpusvm, drange);
>   *
>   *		drm_gpusvm_notifier_lock(gpusvm);
> - *		if (drm_gpusvm_range_pages_valid(range))
> - *			driver_commit_bind(gpusvm, range);
> + *		if (drm_gpusvm_pages_valid(gpusvm, &drange->pages))
> + *			driver_commit_bind(gpusvm, drange);
>   *		else
>   *			err = -EAGAIN;
>   *		drm_gpusvm_notifier_unlock(gpusvm);
> @@ -160,6 +206,8 @@
>   *			     unsigned long gpuva_start, unsigned long gpuva_end)
>   *	{
>   *		struct drm_gpusvm_ctx ctx = {};
> + *		struct driver_range *drange;
> + *		struct drm_gpusvm_range *range;
>   *		int err;
>   *
>   *		driver_svm_lock();
> @@ -174,6 +222,7 @@
>   *			err = PTR_ERR(range);
>   *			goto unlock;
>   *		}
> + *		drange = container_of(range, struct driver_range, base);
>   *
>   *		if (driver_migration_policy(range)) {
>   *			err = drm_pagemap_populate_mm(driver_choose_drm_pagemap(),
> @@ -183,7 +232,10 @@
>   *				goto retry;
>   *		}
>   *
> - *		err = drm_gpusvm_range_get_pages(gpusvm, range, &ctx);
> + *		err = drm_gpusvm_get_pages(gpusvm, &drange->pages,
> + *					   gpusvm->mm, &range->notifier->notifier,
> + *					   drm_gpusvm_range_start(range),
> + *					   drm_gpusvm_range_end(range), &ctx);
>   *		if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {	// CPU mappings changed
>   *			if (err == -EOPNOTSUPP)
>   *				drm_gpusvm_range_evict(gpusvm, range);
> @@ -192,7 +244,7 @@
>   *			goto unlock;
>   *		}
>   *
> - *		err = driver_bind_range(gpusvm, range);
> + *		err = driver_bind_range(gpusvm, drange);
>   *		if (err == -EAGAIN)	// CPU mappings changed
>   *			goto retry
>   *
> @@ -205,6 +257,21 @@
>   *
>   * .. code-block:: c
>   *
> + *	// The driver owns the drm_gpusvm_pages lifecycle. ops->range_free is
> + *	// the final fallback: drm_gpusvm_free_pages() unmaps any
> + *	// lingering DMA mapping and a no-op if already unmapped and frees the
> + *	// dma_addr array. The normal flow is to DMA unmap before
> + *	// drm_gpusvm_range_remove() (before the range leaves the tree).
> + *	void driver_range_free(struct drm_gpusvm_range *range)
> + *	{
> + *		struct driver_range *drange =
> + *			container_of(range, struct driver_range, base);
> + *
> + *		drm_gpusvm_free_pages(range->gpusvm, &drange->pages,
> + *				      drm_gpusvm_range_size(range) >> PAGE_SHIFT);
> + *		kfree(drange);
> + *	}
> + *
>   *	void __driver_garbage_collector(struct drm_gpusvm *gpusvm,
>   *					struct drm_gpusvm_range *range)
>   *	{
> @@ -215,6 +282,14 @@
>   *			drm_gpusvm_range_evict(gpusvm, range);
>   *
>   *		driver_unbind_range(range);
> + *		// The pages must be DMA unmapped before drm_gpusvm_range_remove()
> + *		// , so a range is never off the MMU interval tree while still DMA
> + *		// mapped as the original drmsvm design flow. Otherwise a concurrent CPU
> + *		// munmap's notifier could miss this range and free pages still mapped
> + *		// for device DMA. This is the normal unmap point.
> + *		drm_gpusvm_unmap_pages(gpusvm, &drange->pages,
> + *				       drm_gpusvm_range_size(range) >> PAGE_SHIFT,
> + *				       &(struct drm_gpusvm_ctx){ .in_notifier = false });
>   *		drm_gpusvm_range_remove(gpusvm, range);
>   *	}
>   *
> @@ -236,17 +311,22 @@
>   *	{
>   *		struct drm_gpusvm_ctx ctx = { .in_notifier = true, };
>   *		struct drm_gpusvm_range *range = NULL;
> + *		struct driver_range *drange;
>   *
>   *		driver_invalidate_device_pages(gpusvm, mmu_range->start, mmu_range->end);
>   *
>   *		drm_gpusvm_for_each_range(range, notifier, mmu_range->start,
>   *					  mmu_range->end) {
> - *			drm_gpusvm_range_unmap_pages(gpusvm, range, &ctx);
> + *			drange = container_of(range, struct driver_range, base);
> + *
> + *			drm_gpusvm_unmap_pages(gpusvm, &drange->pages,
> + *					       drm_gpusvm_range_size(range) >> PAGE_SHIFT,
> + *					       &ctx);
>   *
>   *			if (mmu_range->event != MMU_NOTIFY_UNMAP)
>   *				continue;
>   *
> - *			drm_gpusvm_range_set_unmapped(range, mmu_range);
> + *			drm_gpusvm_range_set_unmapped(range, &drange->pages, 1, mmu_range);
>   *			driver_garbage_collector_add(gpusvm, range);
>   *		}
>   *	}
> @@ -640,8 +720,6 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
>  	range->itree.start = ALIGN_DOWN(fault_addr, chunk_size);
>  	range->itree.last = ALIGN(fault_addr + 1, chunk_size) - 1;
>  	INIT_LIST_HEAD(&range->entry);
> -	range->pages.notifier_seq = LONG_MAX;
> -	range->pages.drm = gpusvm->drm;
>  	range->flags.migrate_devmem = migrate_devmem ? 1 : 0;
>  
>  	return range;
> @@ -930,7 +1008,7 @@ drm_gpusvm_range_chunk_size(struct drm_gpusvm *gpusvm,
>  		 * mallocs 16k but the CPU VMA is ~128k which results in 64k SVM
>  		 * ranges. When migrating the SVM ranges, some processes fail in
>  		 * drm_pagemap_migrate_to_devmem with 'migrate.cpages != npages'
> -		 * and then upon drm_gpusvm_range_get_pages device pages from
> +		 * and then upon drm_gpusvm_get_pages device pages from
>  		 * other processes are collected + faulted in which creates all
>  		 * sorts of problems. Unsure exactly how this happening, also
>  		 * problem goes away if 'xe_exec_system_allocator --r
> @@ -1337,27 +1415,6 @@ bool drm_gpusvm_pages_valid(struct drm_gpusvm *gpusvm,
>  }
>  EXPORT_SYMBOL_GPL(drm_gpusvm_pages_valid);
>  
> -/**
> - * drm_gpusvm_range_pages_valid() - GPU SVM range pages valid
> - * @gpusvm: Pointer to the GPU SVM structure
> - * @range: Pointer to the GPU SVM range structure
> - *
> - * This function determines if a GPU SVM range pages are valid. Expected be
> - * called holding gpusvm->notifier_lock and as the last step before committing a
> - * GPU binding. This is akin to a notifier seqno check in the HMM documentation
> - * but due to wider notifiers (i.e., notifiers which span multiple ranges) this
> - * function is required for finer grained checking (i.e., per range) if pages
> - * are valid.
> - *
> - * Return: True if GPU SVM range has valid pages, False otherwise
> - */
> -bool drm_gpusvm_range_pages_valid(struct drm_gpusvm *gpusvm,
> -				  struct drm_gpusvm_range *range)
> -{
> -	return drm_gpusvm_pages_valid(gpusvm, &range->pages);
> -}
> -EXPORT_SYMBOL_GPL(drm_gpusvm_range_pages_valid);
> -
>  /**
>   * drm_gpusvm_pages_valid_unlocked() - GPU SVM pages valid unlocked
>   * @gpusvm: Pointer to the GPU SVM structure
> @@ -1638,29 +1695,6 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
>  }
>  EXPORT_SYMBOL_GPL(drm_gpusvm_get_pages);
>  
> -/**
> - * drm_gpusvm_range_get_pages() - Get pages for a GPU SVM range
> - * @gpusvm: Pointer to the GPU SVM structure
> - * @range: Pointer to the GPU SVM range structure
> - * @ctx: GPU SVM context
> - *
> - * This function gets pages for a GPU SVM range and ensures they are mapped for
> - * DMA access.
> - *
> - * Return: 0 on success, negative error code on failure.
> - */
> -int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> -			       struct drm_gpusvm_range *range,
> -			       const struct drm_gpusvm_ctx *ctx)
> -{
> -	return drm_gpusvm_get_pages(gpusvm, &range->pages,
> -				    gpusvm->mm,
> -				    &range->notifier->notifier,
> -				    drm_gpusvm_range_start(range),
> -				    drm_gpusvm_range_end(range), ctx);
> -}
> -EXPORT_SYMBOL_GPL(drm_gpusvm_range_get_pages);
> -
>  /**
>   * drm_gpusvm_unmap_pages() - Unmap GPU svm pages
>   * @gpusvm: Pointer to the GPU SVM structure
> @@ -1691,29 +1725,6 @@ void drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
>  }
>  EXPORT_SYMBOL_GPL(drm_gpusvm_unmap_pages);
>  
> -/**
> - * drm_gpusvm_range_unmap_pages() - Unmap pages associated with a GPU SVM range
> - * @gpusvm: Pointer to the GPU SVM structure
> - * @range: Pointer to the GPU SVM range structure
> - * @ctx: GPU SVM context
> - *
> - * This function unmaps pages associated with a GPU SVM range. If @in_notifier
> - * is set, it is assumed that gpusvm->notifier_lock is held in write mode; if it
> - * is clear, it acquires gpusvm->notifier_lock in read mode. Must be called on
> - * each GPU SVM range attached to notifier in gpusvm->ops->invalidate for IOMMU
> - * security model.
> - */
> -void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> -				  struct drm_gpusvm_range *range,
> -				  const struct drm_gpusvm_ctx *ctx)
> -{
> -	unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
> -					       drm_gpusvm_range_end(range));
> -
> -	return drm_gpusvm_unmap_pages(gpusvm, &range->pages, npages, ctx);
> -}
> -EXPORT_SYMBOL_GPL(drm_gpusvm_range_unmap_pages);
> -
>  /**
>   * drm_gpusvm_range_evict() - Evict GPU SVM range
>   * @gpusvm: Pointer to the GPU SVM structure
> diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
> index 3f38283111c..2862104aa1b 100644
> --- a/include/drm/drm_gpusvm.h
> +++ b/include/drm/drm_gpusvm.h
> @@ -177,7 +177,6 @@ struct drm_gpusvm_range_flags {
>   * @refcount: Reference count for the range
>   * @itree: Interval tree node for the range (inserted in GPU SVM notifier)
>   * @entry: List entry to fast interval tree traversal
> - * @pages: The pages for this range.
>   * @flags: Flags for range see &struct drm_gpusvm_range_flags
>   *
>   * This structure represents a GPU SVM range used for tracking memory ranges
> @@ -189,7 +188,6 @@ struct drm_gpusvm_range {
>  	struct kref refcount;
>  	struct interval_tree_node itree;
>  	struct list_head entry;
> -	struct drm_gpusvm_pages pages;
>  	struct drm_gpusvm_range_flags flags;
>  };
>  
> @@ -307,20 +305,9 @@ drm_gpusvm_range_get(struct drm_gpusvm_range *range);
>  
>  void drm_gpusvm_range_put(struct drm_gpusvm_range *range);
>  
> -bool drm_gpusvm_range_pages_valid(struct drm_gpusvm *gpusvm,
> -				  struct drm_gpusvm_range *range);
> -
>  bool drm_gpusvm_pages_valid(struct drm_gpusvm *gpusvm,
>  			    struct drm_gpusvm_pages *svm_pages);
>  
> -int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> -			       struct drm_gpusvm_range *range,
> -			       const struct drm_gpusvm_ctx *ctx);
> -
> -void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> -				  struct drm_gpusvm_range *range,
> -				  const struct drm_gpusvm_ctx *ctx);
> -
>  bool drm_gpusvm_has_mapping(struct drm_gpusvm *gpusvm, unsigned long start,
>  			    unsigned long end);
>  
> -- 
> 2.34.1
> 

  reply	other threads:[~2026-06-29 16:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-29  2:29 [PATCH v7 0/5] drm/gpusvm: split MM and device state across gpusvm/range/pages Honglei Huang
2026-06-29  2:29 ` [PATCH v7 1/5] drm/gpusvm: split MM state flags out of drm_gpusvm_pages_flags Honglei Huang
2026-06-29  2:29 ` [PATCH v7 2/5] drm/gpusvm: embed struct drm_device into drm_gpusvm_pages Honglei Huang
2026-06-29  2:29 ` [PATCH v7 3/5] drm/xe: have xe_svm_range embed one drm_gpusvm_pages Honglei Huang
2026-06-29  2:29 ` [PATCH v7 4/5] drm/gpusvm: move struct drm_gpusvm_pages out of struct drm_gpusvm_range Honglei Huang
2026-06-29 16:27   ` Matthew Brost [this message]
2026-06-29  2:29 ` [PATCH v7 5/5] drm/gpusvm: let the drm_gpusvm core context purely MM level Honglei Huang
2026-06-29 18:51 ` ✗ CI.KUnit: failure for drm/gpusvm: split MM and device state across gpusvm/range/pages (rev2) Patchwork
2026-06-30  3:31 ` [PATCH v8 0/5] drm/gpusvm: split MM and device state across gpusvm/range/pages Honglei Huang
2026-06-30  3:31   ` [PATCH v8 1/5] drm/gpusvm: split MM state flags out of drm_gpusvm_pages_flags Honglei Huang
2026-06-30  3:31   ` [PATCH v8 2/5] drm/gpusvm: embed struct drm_device into drm_gpusvm_pages Honglei Huang
2026-06-30  3:31   ` [PATCH v8 3/5] drm/xe: have xe_svm_range embed one drm_gpusvm_pages Honglei Huang
2026-06-30  3:31   ` [PATCH v8 4/5] drm/gpusvm: move struct drm_gpusvm_pages out of struct drm_gpusvm_range Honglei Huang
2026-06-30  3:31   ` [PATCH v8 5/5] drm/gpusvm: let the drm_gpusvm core context purely MM level Honglei Huang
  -- strict thread matches above, loose matches on Subject: below --
2026-06-28 11:14 [PATCH v7 0/5] drm/gpusvm: split MM and device state across gpusvm/range/pages Honglei Huang
2026-06-28 11:14 ` [PATCH v7 4/5] drm/gpusvm: move struct drm_gpusvm_pages out of struct drm_gpusvm_range Honglei Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akKdDBjNvTpqJHem@gsse-cloud1.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=honghuan@amd.com \
    --cc=intel-xe@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox