* [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support
@ 2025-05-14 16:58 Rob Clark
2025-05-14 16:59 ` [PATCH v4 01/40] drm/gpuvm: Don't require obj lock in destructor path Rob Clark
` (11 more replies)
0 siblings, 12 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:58 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Abhinav Kumar,
André Almeida, Arnd Bergmann, Barnabás Czémán,
Christian König, Christopher Snowhill, Dmitry Baryshkov,
Dmitry Baryshkov, Eugene Lepshy, open list:IOMMU SUBSYSTEM,
Jason Gunthorpe, Jessica Zhang, Joao Martins, Jonathan Marek,
Kevin Tian, Konrad Dybcio, Krzysztof Kozlowski,
moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b,
moderated list:ARM SMMU DRIVERS, open list,
open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b,
Marijn Suijten, Nicolin Chen, Robin Murphy, Sean Paul,
Will Deacon
From: Rob Clark <robdclark@chromium.org>
Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
Memory[2] in the form of:
1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
MAP_NULL/UNMAP commands
2. A new VM_BIND ioctl to allow submitting batches of one or more
MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
I did not implement support for synchronous VM_BIND commands. Since
userspace could just immediately wait for the `SUBMIT` to complete, I don't
think we need this extra complexity in the kernel. Synchronous/immediate
VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
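For illustration, a userspace-side sketch of what that looks like; vm_bind_submit()
and struct vm_bind_op are made-up stand-ins for the mesa plumbing, only the
drmSyncobj*() calls are real libdrm API:

#include <stdint.h>
#include <xf86drm.h>

struct vm_bind_op;   /* hypothetical: one MAP/MAP_NULL/UNMAP op */

/* hypothetical wrapper around the new VM_BIND ioctl, asking for
 * `out_syncobj` to be signaled once the pgtable updates land:
 */
int vm_bind_submit(int fd, uint32_t queue_id, const struct vm_bind_op *ops,
                   uint32_t nr_ops, uint32_t out_syncobj);

static int vm_bind_sync(int fd, uint32_t queue_id,
                        const struct vm_bind_op *ops, uint32_t nr_ops)
{
    uint32_t syncobj;
    int ret;

    ret = drmSyncobjCreate(fd, 0, &syncobj);
    if (ret)
        return ret;

    ret = vm_bind_submit(fd, queue_id, ops, nr_ops, syncobj);
    if (!ret) {
        /* "synchronous" == immediately wait for the submit to complete */
        ret = drmSyncobjWait(fd, &syncobj, 1, INT64_MAX,
                             DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL, NULL);
    }

    drmSyncobjDestroy(fd, syncobj);
    return ret;
}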
The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
Changes in v4:
- Various locking/etc fixes
- Optimize the pgtable preallocation. If userspace sorts the VM_BIND ops,
the kernel detects ops that fall into the same 2MB last-level PTD and
avoids preallocating duplicate pages (see the sketch after this list).
- Add a way to throttle pushing jobs to the scheduler, to cap the number
of potentially temporary prealloc'd pgtable pages.
- Add vm_log to devcoredump for debugging. If the vm_log_shift module
param is set, keep a log of the last 1<<vm_log_shift VM updates for
easier debugging of faults/crashes (also sketched after this list).
- Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/
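To illustrate the prealloc de-dup and the vm_log items above, two sketches
(names, fields and layout are made up here, this is not the actual kernel code).
The de-dup is just "don't count a 2MB last-level table twice when sorted ops
land in the same block":

#include <stdint.h>
#include <stdbool.h>

struct prealloc_state {
    uint64_t last;    /* highest 2MB block already counted */
    bool valid;
    uint32_t count;   /* pgtable pages to preallocate */
};

/* With a 4K granule, one last-level page table covers a 2MB block, so
 * each distinct 2MB block touched by the batch needs at most one
 * preallocated table page.  Assumes ops are processed in iova order.
 */
static void count_last_level(struct prealloc_state *p,
                             uint64_t iova, uint64_t len)
{
    uint64_t first = iova >> 21;
    uint64_t last = (iova + len - 1) >> 21;

    /* blocks at or below the highest one seen so far were already
     * counted by an earlier (sorted) op
     */
    if (p->valid && first <= p->last)
        first = p->last + 1;

    if (first <= last)
        p->count += last - first + 1;

    if (!p->valid || last > p->last) {
        p->last = last;
        p->valid = true;
    }
}

And the vm_log is just a power-of-two ring of recent VM updates indexed by a
wrapping counter:

/* vm_log sketch: a ring of the last (1 << vm_log_shift) updates */
struct vm_log_entry {
    uint64_t iova;
    uint64_t range;
    const char *op;   /* "map", "map_null", "unmap", ... */
};

static void vm_log_push(struct vm_log_entry *log, uint32_t *logidx,
                        unsigned vm_log_shift, struct vm_log_entry ent)
{
    uint32_t idx = (*logidx)++ & ((1u << vm_log_shift) - 1);
    log[idx] = ent;   /* oldest entries get overwritten */
}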
Changes in v3:
- Switched to separate VM_BIND ioctl. This makes the UABI a bit
cleaner, but OTOH the userspace code was cleaner when the end result
of either type of VkQueue led to the same ioctl. So I'm a bit on
the fence.
- Switched to doing the gpuvm bookkeeping synchronously, and only
deferring the pgtable updates. This avoids needing to hold any resv
locks in the fence signaling path, resolving the last shrinker-related
lockdep complaints. OTOH it means userspace can trigger invalid
pgtable updates with multiple VM_BIND queues. In this case, we ensure
that unmaps happen completely (to prevent userspace from using this to
access freed pages), mark the context as unusable, and move on with
life.
- Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/
Changes in v2:
- Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
merged.
- Pre-allocate all the things, and drop HACK patch which disabled shrinker.
This includes ensuring that vm_bo objects are allocated up front, pre-
allocating VMA objects, and pre-allocating pages used for pgtable updates.
The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that
were initially added for panthor.
- Add back support for BO dumping for devcoredump.
- Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.com/T/#t
[1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
[2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
[3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
Rob Clark (40):
drm/gpuvm: Don't require obj lock in destructor path
drm/gpuvm: Allow VAs to hold soft reference to BOs
drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
drm/sched: Add enqueue credit limit
iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
drm/msm: Rename msm_file_private -> msm_context
drm/msm: Improve msm_context comments
drm/msm: Rename msm_gem_address_space -> msm_gem_vm
drm/msm: Remove vram carveout support
drm/msm: Collapse vma allocation and initialization
drm/msm: Collapse vma close and delete
drm/msm: Don't close VMAs on purge
drm/msm: drm_gpuvm conversion
drm/msm: Convert vm locking
drm/msm: Use drm_gpuvm types more
drm/msm: Split out helper to get iommu prot flags
drm/msm: Add mmu support for non-zero offset
drm/msm: Add PRR support
drm/msm: Rename msm_gem_vma_purge() -> _unmap()
drm/msm: Drop queued submits on lastclose()
drm/msm: Lazily create context VM
drm/msm: Add opt-in for VM_BIND
drm/msm: Mark VM as unusable on GPU hangs
drm/msm: Add _NO_SHARE flag
drm/msm: Crashdump prep for sparse mappings
drm/msm: rd dumping prep for sparse mappings
drm/msm: Crashdec support for sparse
drm/msm: rd dumping support for sparse
drm/msm: Extract out syncobj helpers
drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
drm/msm: Add VM_BIND submitqueue
drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
drm/msm: Support pgtable preallocation
drm/msm: Split out map/unmap ops
drm/msm: Add VM_BIND ioctl
drm/msm: Add VM logging for VM_BIND updates
drm/msm: Add VMA unmap reason
drm/msm: Add mmu prealloc tracepoint
drm/msm: use trylock for debugfs
drm/msm: Bump UAPI version
drivers/gpu/drm/drm_gem.c | 14 +-
drivers/gpu/drm/drm_gpuvm.c | 15 +-
drivers/gpu/drm/msm/Kconfig | 1 +
drivers/gpu/drm/msm/Makefile | 1 +
drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +-
drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 99 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
drivers/gpu/drm/msm/msm_drv.c | 184 +--
drivers/gpu/drm/msm/msm_drv.h | 35 +-
drivers/gpu/drm/msm/msm_fb.c | 18 +-
drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
drivers/gpu/drm/msm/msm_gem.c | 494 +++---
drivers/gpu/drm/msm/msm_gem.h | 247 ++-
drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
drivers/gpu/drm/msm/msm_gem_shrinker.c | 104 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++--
drivers/gpu/drm/msm/msm_gem_vma.c | 1471 ++++++++++++++++-
drivers/gpu/drm/msm/msm_gpu.c | 214 ++-
drivers/gpu/drm/msm/msm_gpu.h | 144 +-
drivers/gpu/drm/msm/msm_gpu_trace.h | 14 +
drivers/gpu/drm/msm/msm_iommu.c | 302 +++-
drivers/gpu/drm/msm/msm_kms.c | 18 +-
drivers/gpu/drm/msm/msm_kms.h | 2 +-
drivers/gpu/drm/msm/msm_mmu.h | 38 +-
drivers/gpu/drm/msm/msm_rd.c | 62 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 10 +-
drivers/gpu/drm/msm/msm_submitqueue.c | 96 +-
drivers/gpu/drm/msm/msm_syncobj.c | 172 ++
drivers/gpu/drm/msm/msm_syncobj.h | 37 +
drivers/gpu/drm/scheduler/sched_entity.c | 16 +-
drivers/gpu/drm/scheduler/sched_main.c | 3 +
drivers/iommu/io-pgtable-arm.c | 27 +-
include/drm/drm_gem.h | 10 +-
include/drm/drm_gpuvm.h | 12 +-
include/drm/gpu_scheduler.h | 13 +-
include/linux/io-pgtable.h | 8 +
include/uapi/drm/msm_drm.h | 149 +-
63 files changed, 3484 insertions(+), 1251 deletions(-)
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h
--
2.49.0
* [PATCH v4 01/40] drm/gpuvm: Don't require obj lock in destructor path
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 02/40] drm/gpuvm: Allow VAs to hold soft reference to BOs Rob Clark
` (10 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
See commit a414fe3a2129 ("drm/msm/gem: Drop obj lock in
msm_gem_free_object()") for justification.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/drm_gpuvm.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index f9eb56f24bef..1e89a98caad4 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1511,7 +1511,9 @@ drm_gpuvm_bo_destroy(struct kref *kref)
drm_gpuvm_bo_list_del(vm_bo, extobj, lock);
drm_gpuvm_bo_list_del(vm_bo, evict, lock);
- drm_gem_gpuva_assert_lock_held(obj);
+ if (kref_read(&obj->refcount) > 0)
+ drm_gem_gpuva_assert_lock_held(obj);
+
list_del(&vm_bo->list.entry.gem);
if (ops && ops->vm_bo_free)
@@ -1871,7 +1873,8 @@ drm_gpuva_unlink(struct drm_gpuva *va)
if (unlikely(!obj))
return;
- drm_gem_gpuva_assert_lock_held(obj);
+ if (kref_read(&obj->refcount) > 0)
+ drm_gem_gpuva_assert_lock_held(obj);
list_del_init(&va->gem.entry);
va->vm_bo = NULL;
--
2.49.0
* [PATCH v4 02/40] drm/gpuvm: Allow VAs to hold soft reference to BOs
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
2025-05-14 16:59 ` [PATCH v4 01/40] drm/gpuvm: Don't require obj lock in destructor path Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 03/40] drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan() Rob Clark
` (9 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
Eases migration for drivers where VAs don't hold hard references to
their associated BO, avoiding reference loops.
In particular, msm uses soft references to optimistically keep around
mappings until the BO is destroyed, which obviously won't work if the
VA (the mapping) is holding a reference to the BO.
By making this a per-VM flag, we can use normal hard-references for
mappings in a "VM_BIND" managed VM, but soft references in other cases,
such as kernel-internal VMs (for display scanout, etc).
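For illustration, the per-VM choice could look something like this at init
time (the helper and the `managed` condition are placeholders; double-check
drm_gpuvm_init()'s signature against drm_gpuvm.h in your tree):

#include <drm/drm_gpuvm.h>

/* Sketch only: a userspace-managed ("VM_BIND") VM keeps the default
 * hard references, while a kernel-internal VM (display scanout, etc)
 * opts into weak references so a mapping never keeps its BO alive.
 */
static void example_vm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
                            struct drm_gem_object *r_obj, bool managed,
                            u64 va_start, u64 va_size,
                            const struct drm_gpuvm_ops *ops)
{
    enum drm_gpuvm_flags flags = 0;

    if (!managed)
        flags |= DRM_GPUVM_VA_WEAK_REF;

    drm_gpuvm_init(gpuvm, "example", flags, drm, r_obj,
                   va_start, va_size, 0, 0, ops);
}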
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/drm_gpuvm.c | 8 ++++++--
include/drm/drm_gpuvm.h | 12 ++++++++++--
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 1e89a98caad4..f1d521dc1fb0 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1482,7 +1482,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
vm_bo->vm = drm_gpuvm_get(gpuvm);
vm_bo->obj = obj;
- drm_gem_object_get(obj);
+
+ if (!(gpuvm->flags & DRM_GPUVM_VA_WEAK_REF))
+ drm_gem_object_get(obj);
kref_init(&vm_bo->kref);
INIT_LIST_HEAD(&vm_bo->list.gpuva);
@@ -1504,6 +1506,7 @@ drm_gpuvm_bo_destroy(struct kref *kref)
const struct drm_gpuvm_ops *ops = gpuvm->ops;
struct drm_gem_object *obj = vm_bo->obj;
bool lock = !drm_gpuvm_resv_protected(gpuvm);
+ bool unref = !(gpuvm->flags & DRM_GPUVM_VA_WEAK_REF);
if (!lock)
drm_gpuvm_resv_assert_held(gpuvm);
@@ -1522,7 +1525,8 @@ drm_gpuvm_bo_destroy(struct kref *kref)
kfree(vm_bo);
drm_gpuvm_put(gpuvm);
- drm_gem_object_put(obj);
+ if (unref)
+ drm_gem_object_put(obj);
}
/**
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 00d4e43b76b6..13ab087a45fa 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -205,10 +205,18 @@ enum drm_gpuvm_flags {
*/
DRM_GPUVM_RESV_PROTECTED = BIT(0),
+ /**
+ * @DRM_GPUVM_VA_WEAK_REF:
+ *
+ * Flag indicating that the &drm_gpuva (or more correctly, the
+ * &drm_gpuvm_bo) only holds a weak reference to the &drm_gem_object.
+ */
+ DRM_GPUVM_VA_WEAK_REF = BIT(1),
+
/**
* @DRM_GPUVM_USERBITS: user defined bits
*/
- DRM_GPUVM_USERBITS = BIT(1),
+ DRM_GPUVM_USERBITS = BIT(2),
};
/**
@@ -651,7 +659,7 @@ struct drm_gpuvm_bo {
/**
* @obj: The &drm_gem_object being mapped in @vm. This is a reference
- * counted pointer.
+ * counted pointer, unless the &DRM_GPUVM_VA_WEAK_REF flag is set.
*/
struct drm_gem_object *obj;
--
2.49.0
* [PATCH v4 03/40] drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
2025-05-14 16:59 ` [PATCH v4 01/40] drm/gpuvm: Don't require obj lock in destructor path Rob Clark
2025-05-14 16:59 ` [PATCH v4 02/40] drm/gpuvm: Allow VAs to hold soft reference to BOs Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 04/40] drm/sched: Add enqueue credit limit Rob Clark
` (8 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Rob Clark, Abhinav Kumar, Dmitry Baryshkov,
Sean Paul, Marijn Suijten, open list
From: Rob Clark <robdclark@chromium.org>
If the callback is going to have to attempt to grab more locks, it is
useful to have a ww_acquire_ctx to avoid locking order problems.
Why not use the drm_exec helper instead? Mainly because (a) where
ww_acquire_init() is called is awkward, and (b) we don't really
need to retry after backoff, we can just move on to the next object.
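For illustration, the sort of callback this enables; get_vm_resv() and
evict_mappings() are placeholders, the dma_resv calls are the real API:

#include <linux/dma-resv.h>
#include <drm/drm_gem.h>

struct dma_resv *get_vm_resv(struct drm_gem_object *obj);   /* placeholder */
bool evict_mappings(struct drm_gem_object *obj);            /* placeholder */

static bool example_evict(struct drm_gem_object *obj,
                          struct ww_acquire_ctx *ticket)
{
    /* placeholder: some driver-specific second lock, e.g. the VM's resv */
    struct dma_resv *vm_resv = get_vm_resv(obj);
    bool ret;

    /* Same acquire ctx that drm_gem_lru_scan() used for obj->resv, so
     * ww_mutex handles the lock ordering for us.  On failure we just
     * skip this object instead of backing off and retrying.
     */
    if (dma_resv_lock(vm_resv, ticket))
        return false;

    ret = evict_mappings(obj);      /* placeholder teardown */

    dma_resv_unlock(vm_resv);

    return ret;
}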
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/drm_gem.c | 14 +++++++++++---
drivers/gpu/drm/msm/msm_gem_shrinker.c | 24 +++++++++++++-----------
include/drm/drm_gem.h | 10 ++++++----
3 files changed, 30 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index ee811764c3df..9e3db9a864f8 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1460,12 +1460,14 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail);
* @nr_to_scan: The number of pages to try to reclaim
* @remaining: The number of pages left to reclaim, should be initialized by caller
* @shrink: Callback to try to shrink/reclaim the object.
+ * @ticket: Optional ww_acquire_ctx context to use for locking
*/
unsigned long
drm_gem_lru_scan(struct drm_gem_lru *lru,
unsigned int nr_to_scan,
unsigned long *remaining,
- bool (*shrink)(struct drm_gem_object *obj))
+ bool (*shrink)(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket),
+ struct ww_acquire_ctx *ticket)
{
struct drm_gem_lru still_in_lru;
struct drm_gem_object *obj;
@@ -1498,17 +1500,20 @@ drm_gem_lru_scan(struct drm_gem_lru *lru,
*/
mutex_unlock(lru->lock);
+ if (ticket)
+ ww_acquire_init(ticket, &reservation_ww_class);
+
/*
* Note that this still needs to be trylock, since we can
* hit shrinker in response to trying to get backing pages
* for this obj (ie. while it's lock is already held)
*/
- if (!dma_resv_trylock(obj->resv)) {
+ if (!ww_mutex_trylock(&obj->resv->lock, ticket)) {
*remaining += obj->size >> PAGE_SHIFT;
goto tail;
}
- if (shrink(obj)) {
+ if (shrink(obj, ticket)) {
freed += obj->size >> PAGE_SHIFT;
/*
@@ -1522,6 +1527,9 @@ drm_gem_lru_scan(struct drm_gem_lru *lru,
dma_resv_unlock(obj->resv);
+ if (ticket)
+ ww_acquire_fini(ticket);
+
tail:
drm_gem_object_put(obj);
mutex_lock(lru->lock);
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 07ca4ddfe4e3..de185fc34084 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -44,7 +44,7 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
}
static bool
-purge(struct drm_gem_object *obj)
+purge(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket)
{
if (!is_purgeable(to_msm_bo(obj)))
return false;
@@ -58,7 +58,7 @@ purge(struct drm_gem_object *obj)
}
static bool
-evict(struct drm_gem_object *obj)
+evict(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket)
{
if (is_unevictable(to_msm_bo(obj)))
return false;
@@ -79,21 +79,21 @@ wait_for_idle(struct drm_gem_object *obj)
}
static bool
-active_purge(struct drm_gem_object *obj)
+active_purge(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket)
{
if (!wait_for_idle(obj))
return false;
- return purge(obj);
+ return purge(obj, ticket);
}
static bool
-active_evict(struct drm_gem_object *obj)
+active_evict(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket)
{
if (!wait_for_idle(obj))
return false;
- return evict(obj);
+ return evict(obj, ticket);
}
static unsigned long
@@ -102,7 +102,7 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
struct msm_drm_private *priv = shrinker->private_data;
struct {
struct drm_gem_lru *lru;
- bool (*shrink)(struct drm_gem_object *obj);
+ bool (*shrink)(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket);
bool cond;
unsigned long freed;
unsigned long remaining;
@@ -122,8 +122,9 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
continue;
stages[i].freed =
drm_gem_lru_scan(stages[i].lru, nr,
- &stages[i].remaining,
- stages[i].shrink);
+ &stages[i].remaining,
+ stages[i].shrink,
+ NULL);
nr -= stages[i].freed;
freed += stages[i].freed;
remaining += stages[i].remaining;
@@ -164,7 +165,7 @@ msm_gem_shrinker_shrink(struct drm_device *dev, unsigned long nr_to_scan)
static const int vmap_shrink_limit = 15;
static bool
-vmap_shrink(struct drm_gem_object *obj)
+vmap_shrink(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket)
{
if (!is_vunmapable(to_msm_bo(obj)))
return false;
@@ -192,7 +193,8 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
unmapped += drm_gem_lru_scan(lrus[idx],
vmap_shrink_limit - unmapped,
&remaining,
- vmap_shrink);
+ vmap_shrink,
+ NULL);
}
*(unsigned long *)ptr += unmapped;
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index fdae947682cd..0e2c476df731 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -555,10 +555,12 @@ void drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock);
void drm_gem_lru_remove(struct drm_gem_object *obj);
void drm_gem_lru_move_tail_locked(struct drm_gem_lru *lru, struct drm_gem_object *obj);
void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj);
-unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
- unsigned int nr_to_scan,
- unsigned long *remaining,
- bool (*shrink)(struct drm_gem_object *obj));
+unsigned long
+drm_gem_lru_scan(struct drm_gem_lru *lru,
+ unsigned int nr_to_scan,
+ unsigned long *remaining,
+ bool (*shrink)(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket),
+ struct ww_acquire_ctx *ticket);
int drm_gem_evict(struct drm_gem_object *obj);
--
2.49.0
* [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (2 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 03/40] drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan() Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-15 9:28 ` Philipp Stanner
2025-05-14 16:59 ` [PATCH v4 05/40] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() Rob Clark
` (7 subsequent siblings)
11 siblings, 1 reply; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Matthew Brost,
Danilo Krummrich, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
Similar to the existing credit limit mechanism, but applying to jobs
enqueued to the scheduler but not yet run.
The use case is to put an upper bound on preallocated, and potentially
unneeded, pgtable pages. When this limit is exceeded, pushing new jobs
will block until the count drops below the limit.
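For illustration, driver-side usage is roughly the following (the 1024 limit
and prealloc_count are arbitrary example values):

#include <drm/gpu_scheduler.h>

/* at init: cap how many prealloc'd pgtable pages can sit in
 * queued-but-not-yet-run jobs
 */
static int example_sched_init(struct drm_gpu_scheduler *sched,
                              struct drm_sched_init_args *args)
{
    args->enqueue_credit_limit = 1024;
    return drm_sched_init(sched, args);
}

/* at submit: charge the job for the pgtable pages it preallocated;
 * push_job can now fail, e.g. -ERESTARTSYS if the wait for credits
 * to drop below the limit is interrupted by a signal
 */
static int example_push(struct drm_sched_job *job, u32 prealloc_count)
{
    job->enqueue_credits = prealloc_count;
    return drm_sched_entity_push_job(job);
}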
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/scheduler/sched_entity.c | 16 ++++++++++++++--
drivers/gpu/drm/scheduler/sched_main.c | 3 +++
include/drm/gpu_scheduler.h | 13 ++++++++++++-
3 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index dc0e60d2c14b..c5f688362a34 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -580,11 +580,21 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
* under common lock for the struct drm_sched_entity that was set up for
* @sched_job in drm_sched_job_init().
*/
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
+int drm_sched_entity_push_job(struct drm_sched_job *sched_job)
{
struct drm_sched_entity *entity = sched_job->entity;
+ struct drm_gpu_scheduler *sched = sched_job->sched;
bool first;
ktime_t submit_ts;
+ int ret;
+
+ ret = wait_event_interruptible(
+ sched->job_scheduled,
+ atomic_read(&sched->enqueue_credit_count) <=
+ sched->enqueue_credit_limit);
+ if (ret)
+ return ret;
+ atomic_add(sched_job->enqueue_credits, &sched->enqueue_credit_count);
trace_drm_sched_job(sched_job, entity);
atomic_inc(entity->rq->sched->score);
@@ -609,7 +619,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
spin_unlock(&entity->lock);
DRM_ERROR("Trying to push to a killed entity\n");
- return;
+ return -EINVAL;
}
rq = entity->rq;
@@ -626,5 +636,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
drm_sched_wakeup(sched);
}
+
+ return 0;
}
EXPORT_SYMBOL(drm_sched_entity_push_job);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 9412bffa8c74..1102cca69cb4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1217,6 +1217,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
trace_drm_run_job(sched_job, entity);
fence = sched->ops->run_job(sched_job);
+ atomic_sub(sched_job->enqueue_credits, &sched->enqueue_credit_count);
complete_all(&entity->entity_idle);
drm_sched_fence_scheduled(s_fence, fence);
@@ -1253,6 +1254,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
sched->ops = args->ops;
sched->credit_limit = args->credit_limit;
+ sched->enqueue_credit_limit = args->enqueue_credit_limit;
sched->name = args->name;
sched->timeout = args->timeout;
sched->hang_limit = args->hang_limit;
@@ -1308,6 +1310,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
INIT_LIST_HEAD(&sched->pending_list);
spin_lock_init(&sched->job_list_lock);
atomic_set(&sched->credit_count, 0);
+ atomic_set(&sched->enqueue_credit_count, 0);
INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index da64232c989d..d830ffe083f1 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -329,6 +329,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
* @s_fence: contains the fences for the scheduling of job.
* @finish_cb: the callback for the finished fence.
* @credits: the number of credits this job contributes to the scheduler
+ * @enqueue_credits: the number of enqueue credits this job contributes
* @work: Helper to reschedule job kill to different context.
* @id: a unique id assigned to each job scheduled on the scheduler.
* @karma: increment on every hang caused by this job. If this exceeds the hang
@@ -366,6 +367,7 @@ struct drm_sched_job {
enum drm_sched_priority s_priority;
u32 credits;
+ u32 enqueue_credits;
/** @last_dependency: tracks @dependencies as they signal */
unsigned int last_dependency;
atomic_t karma;
@@ -485,6 +487,10 @@ struct drm_sched_backend_ops {
* @ops: backend operations provided by the driver.
* @credit_limit: the credit limit of this scheduler
* @credit_count: the current credit count of this scheduler
+ * @enqueue_credit_limit: the credit limit of jobs pushed to scheduler and not
+ * yet run
+ * @enqueue_credit_count: the current credit count of jobs pushed to scheduler
+ * but not yet run
* @timeout: the time after which a job is removed from the scheduler.
* @name: name of the ring for which this scheduler is being used.
* @num_rqs: Number of run-queues. This is at most DRM_SCHED_PRIORITY_COUNT,
@@ -518,6 +524,8 @@ struct drm_gpu_scheduler {
const struct drm_sched_backend_ops *ops;
u32 credit_limit;
atomic_t credit_count;
+ u32 enqueue_credit_limit;
+ atomic_t enqueue_credit_count;
long timeout;
const char *name;
u32 num_rqs;
@@ -550,6 +558,8 @@ struct drm_gpu_scheduler {
* @num_rqs: Number of run-queues. This may be at most DRM_SCHED_PRIORITY_COUNT,
* as there's usually one run-queue per priority, but may be less.
* @credit_limit: the number of credits this scheduler can hold from all jobs
+ * @enqueue_credit_limit: the number of credits that can be enqueued before
+ * drm_sched_entity_push_job() blocks
* @hang_limit: number of times to allow a job to hang before dropping it.
* This mechanism is DEPRECATED. Set it to 0.
* @timeout: timeout value in jiffies for submitted jobs.
@@ -564,6 +574,7 @@ struct drm_sched_init_args {
struct workqueue_struct *timeout_wq;
u32 num_rqs;
u32 credit_limit;
+ u32 enqueue_credit_limit;
unsigned int hang_limit;
long timeout;
atomic_t *score;
@@ -600,7 +611,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
struct drm_sched_entity *entity,
u32 credits, void *owner);
void drm_sched_job_arm(struct drm_sched_job *job);
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
+int drm_sched_entity_push_job(struct drm_sched_job *sched_job);
int drm_sched_job_add_dependency(struct drm_sched_job *job,
struct dma_fence *fence);
int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,
--
2.49.0
* [PATCH v4 05/40] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (3 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 04/40] drm/sched: Add enqueue credit limit Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 06/40] drm/msm: Rename msm_file_private -> msm_context Rob Clark
` (6 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Robin Murphy,
Will Deacon, Joerg Roedel, Jason Gunthorpe, Kevin Tian,
Nicolin Chen, Joao Martins, moderated list:ARM SMMU DRIVERS,
open list:IOMMU SUBSYSTEM, open list
From: Rob Clark <robdclark@chromium.org>
In situations where the mapping/unmapping sequence can be controlled by
userspace, attempting to map over a region that has not yet been
unmapped is an error, but not something that should spam dmesg.
Now that there is a quirk, we can also drop the selftest_running
flag, and use the quirk instead for selftests.
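For illustration, a driver opting in just sets the quirk when allocating its
pgtable ops (cfg values other than the quirk are placeholder examples):

#include <linux/io-pgtable.h>
#include <linux/sizes.h>

static struct io_pgtable_ops *
example_alloc_pgtable(struct device *dev, const struct iommu_flush_ops *tlb,
                      void *cookie)
{
    struct io_pgtable_cfg cfg = {
        /* don't WARN, just return an error, when userspace asks
         * for a map over an existing mapping
         */
        .quirks         = IO_PGTABLE_QUIRK_NO_WARN_ON,
        .pgsize_bitmap  = SZ_4K | SZ_2M,
        .ias            = 48,
        .oas            = 48,
        .coherent_walk  = true,
        .tlb            = tlb,
        .iommu_dev      = dev,
    };

    /* ops->map_pages() over an existing mapping now returns -EEXIST
     * quietly instead of tripping the WARN_ON()
     */
    return alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, cookie);
}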
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/iommu/io-pgtable-arm.c | 27 ++++++++++++++-------------
include/linux/io-pgtable.h | 8 ++++++++
2 files changed, 22 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index f27965caf6a1..a535d88f8943 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -253,8 +253,6 @@ static inline bool arm_lpae_concat_mandatory(struct io_pgtable_cfg *cfg,
(data->start_level == 1) && (oas == 40);
}
-static bool selftest_running = false;
-
static dma_addr_t __arm_lpae_dma_addr(void *pages)
{
return (dma_addr_t)virt_to_phys(pages);
@@ -373,7 +371,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
for (i = 0; i < num_entries; i++)
if (iopte_leaf(ptep[i], lvl, data->iop.fmt)) {
/* We require an unmap first */
- WARN_ON(!selftest_running);
+ WARN_ON(!(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NO_WARN_ON));
return -EEXIST;
} else if (iopte_type(ptep[i]) == ARM_LPAE_PTE_TYPE_TABLE) {
/*
@@ -475,7 +473,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
cptep = iopte_deref(pte, data);
} else if (pte) {
/* We require an unmap first */
- WARN_ON(!selftest_running);
+ WARN_ON(!(cfg->quirks & IO_PGTABLE_QUIRK_NO_WARN_ON));
return -EEXIST;
}
@@ -649,8 +647,10 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data);
ptep += unmap_idx_start;
pte = READ_ONCE(*ptep);
- if (WARN_ON(!pte))
- return 0;
+ if (!pte) {
+ WARN_ON(!(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NO_WARN_ON));
+ return -ENOENT;
+ }
/* If the size matches this level, we're in the right place */
if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
@@ -660,8 +660,10 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
/* Find and handle non-leaf entries */
for (i = 0; i < num_entries; i++) {
pte = READ_ONCE(ptep[i]);
- if (WARN_ON(!pte))
+ if (!pte) {
+ WARN_ON(!(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NO_WARN_ON));
break;
+ }
if (!iopte_leaf(pte, lvl, iop->fmt)) {
__arm_lpae_clear_pte(&ptep[i], &iop->cfg, 1);
@@ -976,7 +978,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_ARM_TTBR1 |
IO_PGTABLE_QUIRK_ARM_OUTER_WBWA |
- IO_PGTABLE_QUIRK_ARM_HD))
+ IO_PGTABLE_QUIRK_ARM_HD |
+ IO_PGTABLE_QUIRK_NO_WARN_ON))
return NULL;
data = arm_lpae_alloc_pgtable(cfg);
@@ -1079,7 +1082,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
struct arm_lpae_io_pgtable *data;
typeof(&cfg->arm_lpae_s2_cfg.vtcr) vtcr = &cfg->arm_lpae_s2_cfg.vtcr;
- if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_S2FWB))
+ if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_S2FWB |
+ IO_PGTABLE_QUIRK_NO_WARN_ON))
return NULL;
data = arm_lpae_alloc_pgtable(cfg);
@@ -1320,7 +1324,6 @@ static void __init arm_lpae_dump_ops(struct io_pgtable_ops *ops)
#define __FAIL(ops, i) ({ \
WARN(1, "selftest: test failed for fmt idx %d\n", (i)); \
arm_lpae_dump_ops(ops); \
- selftest_running = false; \
-EFAULT; \
})
@@ -1336,8 +1339,6 @@ static int __init arm_lpae_run_tests(struct io_pgtable_cfg *cfg)
size_t size, mapped;
struct io_pgtable_ops *ops;
- selftest_running = true;
-
for (i = 0; i < ARRAY_SIZE(fmts); ++i) {
cfg_cookie = cfg;
ops = alloc_io_pgtable_ops(fmts[i], cfg, cfg);
@@ -1426,7 +1427,6 @@ static int __init arm_lpae_run_tests(struct io_pgtable_cfg *cfg)
free_io_pgtable_ops(ops);
}
- selftest_running = false;
return 0;
}
@@ -1448,6 +1448,7 @@ static int __init arm_lpae_do_selftests(void)
.tlb = &dummy_tlb_ops,
.coherent_walk = true,
.iommu_dev = &dev,
+ .quirks = IO_PGTABLE_QUIRK_NO_WARN_ON,
};
/* __arm_lpae_alloc_pages() merely needs dev_to_node() to work */
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index bba2a51c87d2..639b8f4fb87d 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -88,6 +88,13 @@ struct io_pgtable_cfg {
*
* IO_PGTABLE_QUIRK_ARM_HD: Enables dirty tracking in stage 1 pagetable.
* IO_PGTABLE_QUIRK_ARM_S2FWB: Use the FWB format for the MemAttrs bits
+ *
+ * IO_PGTABLE_QUIRK_NO_WARN_ON: Do not WARN_ON() on conflicting
+ * mappings, but silently return -EEXIST. Normally an attempt
+ * to map over an existing mapping would indicate some sort of
+ * kernel bug, which would justify the WARN_ON(). But for GPU
+ * drivers, this could be under the control of userspace, which
+ * deserves an error return, but not dmesg spam.
*/
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
@@ -97,6 +104,7 @@ struct io_pgtable_cfg {
#define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA BIT(6)
#define IO_PGTABLE_QUIRK_ARM_HD BIT(7)
#define IO_PGTABLE_QUIRK_ARM_S2FWB BIT(8)
+ #define IO_PGTABLE_QUIRK_NO_WARN_ON BIT(9)
unsigned long quirks;
unsigned long pgsize_bitmap;
unsigned int ias;
--
2.49.0
* [PATCH v4 06/40] drm/msm: Rename msm_file_private -> msm_context
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (4 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 05/40] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 07/40] drm/msm: Improve msm_context comments Rob Clark
` (5 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark,
Dmitry Baryshkov, Rob Clark, Sean Paul, Konrad Dybcio,
Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
This is a more descriptive name.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 6 ++--
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 4 +--
drivers/gpu/drm/msm/msm_drv.c | 14 ++++-----
drivers/gpu/drm/msm/msm_gem.c | 2 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
drivers/gpu/drm/msm/msm_gpu.c | 4 +--
drivers/gpu/drm/msm/msm_gpu.h | 39 ++++++++++++-------------
drivers/gpu/drm/msm/msm_submitqueue.c | 27 +++++++++--------
9 files changed, 49 insertions(+), 51 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 129c33f0b027..a32cce8b0c5c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -111,7 +111,7 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
struct msm_ringbuffer *ring, struct msm_gem_submit *submit)
{
bool sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 1;
- struct msm_file_private *ctx = submit->queue->ctx;
+ struct msm_context *ctx = submit->queue->ctx;
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
phys_addr_t ttbr;
u32 asid;
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index e80db01a01c0..25c939b3367a 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -356,7 +356,7 @@ int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
return 0;
}
-int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
uint32_t param, uint64_t *value, uint32_t *len)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -444,7 +444,7 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
}
}
-int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+int adreno_set_param(struct msm_gpu *gpu, struct msm_context *ctx,
uint32_t param, uint64_t value, uint32_t len)
{
struct drm_device *drm = gpu->dev;
@@ -490,7 +490,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
case MSM_PARAM_SYSPROF:
if (!capable(CAP_SYS_ADMIN))
return UERR(EPERM, drm, "invalid permissions");
- return msm_file_private_set_sysprof(ctx, gpu, value);
+ return msm_context_set_sysprof(ctx, gpu, value);
default:
return UERR(EINVAL, drm, "%s: invalid param: %u", gpu->name, param);
}
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 2366a57b280f..fed9516da365 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -603,9 +603,9 @@ static inline int adreno_is_a7xx(struct adreno_gpu *gpu)
/* Put vm_start above 32b to catch issues with not setting xyz_BASE_HI */
#define ADRENO_VM_START 0x100000000ULL
u64 adreno_private_address_space_size(struct msm_gpu *gpu);
-int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
uint32_t param, uint64_t *value, uint32_t *len);
-int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+int adreno_set_param(struct msm_gpu *gpu, struct msm_context *ctx,
uint32_t param, uint64_t value, uint32_t len);
const struct firmware *adreno_request_fw(struct adreno_gpu *adreno_gpu,
const char *fwname);
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index c3588dc9e537..29ca24548c67 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -333,7 +333,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
{
static atomic_t ident = ATOMIC_INIT(0);
struct msm_drm_private *priv = dev->dev_private;
- struct msm_file_private *ctx;
+ struct msm_context *ctx;
ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx)
@@ -363,23 +363,23 @@ static int msm_open(struct drm_device *dev, struct drm_file *file)
return context_init(dev, file);
}
-static void context_close(struct msm_file_private *ctx)
+static void context_close(struct msm_context *ctx)
{
msm_submitqueue_close(ctx);
- msm_file_private_put(ctx);
+ msm_context_put(ctx);
}
static void msm_postclose(struct drm_device *dev, struct drm_file *file)
{
struct msm_drm_private *priv = dev->dev_private;
- struct msm_file_private *ctx = file->driver_priv;
+ struct msm_context *ctx = file->driver_priv;
/*
* It is not possible to set sysprof param to non-zero if gpu
* is not initialized:
*/
if (priv->gpu)
- msm_file_private_set_sysprof(ctx, priv->gpu, 0);
+ msm_context_set_sysprof(ctx, priv->gpu, 0);
context_close(ctx);
}
@@ -511,7 +511,7 @@ static int msm_ioctl_gem_info_iova(struct drm_device *dev,
uint64_t *iova)
{
struct msm_drm_private *priv = dev->dev_private;
- struct msm_file_private *ctx = file->driver_priv;
+ struct msm_context *ctx = file->driver_priv;
if (!priv->gpu)
return -EINVAL;
@@ -531,7 +531,7 @@ static int msm_ioctl_gem_info_set_iova(struct drm_device *dev,
uint64_t iova)
{
struct msm_drm_private *priv = dev->dev_private;
- struct msm_file_private *ctx = file->driver_priv;
+ struct msm_context *ctx = file->driver_priv;
if (!priv->gpu)
return -EINVAL;
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index d2f38e1df510..fdeb6cf7eeb5 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -48,7 +48,7 @@ static void update_device_mem(struct msm_drm_private *priv, ssize_t size)
static void update_ctx_mem(struct drm_file *file, ssize_t size)
{
- struct msm_file_private *ctx = file->driver_priv;
+ struct msm_context *ctx = file->driver_priv;
uint64_t ctx_mem = atomic64_add_return(size, &ctx->ctx_mem);
rcu_read_lock(); /* Locks file->pid! */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index d4f71bb54e84..3aabf7f1da6d 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -651,7 +651,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
{
struct msm_drm_private *priv = dev->dev_private;
struct drm_msm_gem_submit *args = data;
- struct msm_file_private *ctx = file->driver_priv;
+ struct msm_context *ctx = file->driver_priv;
struct msm_gem_submit *submit = NULL;
struct msm_gpu *gpu = priv->gpu;
struct msm_gpu_submitqueue *queue;
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index c380d9d9f5af..d786fcfad62f 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -148,7 +148,7 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
return 0;
}
-void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
+void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_context *ctx,
struct drm_printer *p)
{
drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
@@ -339,7 +339,7 @@ static void retire_submits(struct msm_gpu *gpu);
static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char **cmd)
{
- struct msm_file_private *ctx = submit->queue->ctx;
+ struct msm_context *ctx = submit->queue->ctx;
struct task_struct *task;
WARN_ON(!mutex_is_locked(&submit->gpu->lock));
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index e25009150579..957d6fb3469d 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -22,7 +22,7 @@
struct msm_gem_submit;
struct msm_gpu_perfcntr;
struct msm_gpu_state;
-struct msm_file_private;
+struct msm_context;
struct msm_gpu_config {
const char *ioname;
@@ -44,9 +44,9 @@ struct msm_gpu_config {
* + z180_gpu
*/
struct msm_gpu_funcs {
- int (*get_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
+ int (*get_param)(struct msm_gpu *gpu, struct msm_context *ctx,
uint32_t param, uint64_t *value, uint32_t *len);
- int (*set_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
+ int (*set_param)(struct msm_gpu *gpu, struct msm_context *ctx,
uint32_t param, uint64_t value, uint32_t len);
int (*hw_init)(struct msm_gpu *gpu);
@@ -347,7 +347,7 @@ struct msm_gpu_perfcntr {
#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_LOW - DRM_SCHED_PRIORITY_HIGH)
/**
- * struct msm_file_private - per-drm_file context
+ * struct msm_context - per-drm_file context
*
* @queuelock: synchronizes access to submitqueues list
* @submitqueues: list of &msm_gpu_submitqueue created by userspace
@@ -357,7 +357,7 @@ struct msm_gpu_perfcntr {
* @ref: reference count
* @seqno: unique per process seqno
*/
-struct msm_file_private {
+struct msm_context {
rwlock_t queuelock;
struct list_head submitqueues;
int queueid;
@@ -512,7 +512,7 @@ struct msm_gpu_submitqueue {
u32 ring_nr;
int faults;
uint32_t last_fence;
- struct msm_file_private *ctx;
+ struct msm_context *ctx;
struct list_head node;
struct idr fence_idr;
struct spinlock idr_lock;
@@ -608,33 +608,32 @@ static inline void gpu_write64(struct msm_gpu *gpu, u32 reg, u64 val)
int msm_gpu_pm_suspend(struct msm_gpu *gpu);
int msm_gpu_pm_resume(struct msm_gpu *gpu);
-void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
+void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_context *ctx,
struct drm_printer *p);
-int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx);
-struct msm_gpu_submitqueue *msm_submitqueue_get(struct msm_file_private *ctx,
+int msm_submitqueue_init(struct drm_device *drm, struct msm_context *ctx);
+struct msm_gpu_submitqueue *msm_submitqueue_get(struct msm_context *ctx,
u32 id);
int msm_submitqueue_create(struct drm_device *drm,
- struct msm_file_private *ctx,
+ struct msm_context *ctx,
u32 prio, u32 flags, u32 *id);
-int msm_submitqueue_query(struct drm_device *drm, struct msm_file_private *ctx,
+int msm_submitqueue_query(struct drm_device *drm, struct msm_context *ctx,
struct drm_msm_submitqueue_query *args);
-int msm_submitqueue_remove(struct msm_file_private *ctx, u32 id);
-void msm_submitqueue_close(struct msm_file_private *ctx);
+int msm_submitqueue_remove(struct msm_context *ctx, u32 id);
+void msm_submitqueue_close(struct msm_context *ctx);
void msm_submitqueue_destroy(struct kref *kref);
-int msm_file_private_set_sysprof(struct msm_file_private *ctx,
- struct msm_gpu *gpu, int sysprof);
-void __msm_file_private_destroy(struct kref *kref);
+int msm_context_set_sysprof(struct msm_context *ctx, struct msm_gpu *gpu, int sysprof);
+void __msm_context_destroy(struct kref *kref);
-static inline void msm_file_private_put(struct msm_file_private *ctx)
+static inline void msm_context_put(struct msm_context *ctx)
{
- kref_put(&ctx->ref, __msm_file_private_destroy);
+ kref_put(&ctx->ref, __msm_context_destroy);
}
-static inline struct msm_file_private *msm_file_private_get(
- struct msm_file_private *ctx)
+static inline struct msm_context *msm_context_get(
+ struct msm_context *ctx)
{
kref_get(&ctx->ref);
return ctx;
diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
index 7fed1de63b5d..1acc0fe36353 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
@@ -7,8 +7,7 @@
#include "msm_gpu.h"
-int msm_file_private_set_sysprof(struct msm_file_private *ctx,
- struct msm_gpu *gpu, int sysprof)
+int msm_context_set_sysprof(struct msm_context *ctx, struct msm_gpu *gpu, int sysprof)
{
/*
* Since pm_runtime and sysprof_active are both refcounts, we
@@ -46,10 +45,10 @@ int msm_file_private_set_sysprof(struct msm_file_private *ctx,
return 0;
}
-void __msm_file_private_destroy(struct kref *kref)
+void __msm_context_destroy(struct kref *kref)
{
- struct msm_file_private *ctx = container_of(kref,
- struct msm_file_private, ref);
+ struct msm_context *ctx = container_of(kref,
+ struct msm_context, ref);
int i;
for (i = 0; i < ARRAY_SIZE(ctx->entities); i++) {
@@ -73,12 +72,12 @@ void msm_submitqueue_destroy(struct kref *kref)
idr_destroy(&queue->fence_idr);
- msm_file_private_put(queue->ctx);
+ msm_context_put(queue->ctx);
kfree(queue);
}
-struct msm_gpu_submitqueue *msm_submitqueue_get(struct msm_file_private *ctx,
+struct msm_gpu_submitqueue *msm_submitqueue_get(struct msm_context *ctx,
u32 id)
{
struct msm_gpu_submitqueue *entry;
@@ -101,7 +100,7 @@ struct msm_gpu_submitqueue *msm_submitqueue_get(struct msm_file_private *ctx,
return NULL;
}
-void msm_submitqueue_close(struct msm_file_private *ctx)
+void msm_submitqueue_close(struct msm_context *ctx)
{
struct msm_gpu_submitqueue *entry, *tmp;
@@ -119,7 +118,7 @@ void msm_submitqueue_close(struct msm_file_private *ctx)
}
static struct drm_sched_entity *
-get_sched_entity(struct msm_file_private *ctx, struct msm_ringbuffer *ring,
+get_sched_entity(struct msm_context *ctx, struct msm_ringbuffer *ring,
unsigned ring_nr, enum drm_sched_priority sched_prio)
{
static DEFINE_MUTEX(entity_lock);
@@ -155,7 +154,7 @@ get_sched_entity(struct msm_file_private *ctx, struct msm_ringbuffer *ring,
return ctx->entities[idx];
}
-int msm_submitqueue_create(struct drm_device *drm, struct msm_file_private *ctx,
+int msm_submitqueue_create(struct drm_device *drm, struct msm_context *ctx,
u32 prio, u32 flags, u32 *id)
{
struct msm_drm_private *priv = drm->dev_private;
@@ -200,7 +199,7 @@ int msm_submitqueue_create(struct drm_device *drm, struct msm_file_private *ctx,
write_lock(&ctx->queuelock);
- queue->ctx = msm_file_private_get(ctx);
+ queue->ctx = msm_context_get(ctx);
queue->id = ctx->queueid++;
if (id)
@@ -221,7 +220,7 @@ int msm_submitqueue_create(struct drm_device *drm, struct msm_file_private *ctx,
* Create the default submit-queue (id==0), used for backwards compatibility
* for userspace that pre-dates the introduction of submitqueues.
*/
-int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx)
+int msm_submitqueue_init(struct drm_device *drm, struct msm_context *ctx)
{
struct msm_drm_private *priv = drm->dev_private;
int default_prio, max_priority;
@@ -261,7 +260,7 @@ static int msm_submitqueue_query_faults(struct msm_gpu_submitqueue *queue,
return ret ? -EFAULT : 0;
}
-int msm_submitqueue_query(struct drm_device *drm, struct msm_file_private *ctx,
+int msm_submitqueue_query(struct drm_device *drm, struct msm_context *ctx,
struct drm_msm_submitqueue_query *args)
{
struct msm_gpu_submitqueue *queue;
@@ -282,7 +281,7 @@ int msm_submitqueue_query(struct drm_device *drm, struct msm_file_private *ctx,
return ret;
}
-int msm_submitqueue_remove(struct msm_file_private *ctx, u32 id)
+int msm_submitqueue_remove(struct msm_context *ctx, u32 id)
{
struct msm_gpu_submitqueue *entry;
--
2.49.0
* [PATCH v4 07/40] drm/msm: Improve msm_context comments
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (5 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 06/40] drm/msm: Rename msm_file_private -> msm_context Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 08/40] drm/msm: Rename msm_gem_address_space -> msm_gem_vm Rob Clark
` (4 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark,
Dmitry Baryshkov, Rob Clark, Sean Paul, Konrad Dybcio,
Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
Just some tidying up.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/msm/msm_gpu.h | 44 +++++++++++++++++++++++------------
1 file changed, 29 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 957d6fb3469d..c699ce0c557b 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -348,25 +348,39 @@ struct msm_gpu_perfcntr {
/**
* struct msm_context - per-drm_file context
- *
- * @queuelock: synchronizes access to submitqueues list
- * @submitqueues: list of &msm_gpu_submitqueue created by userspace
- * @queueid: counter incremented each time a submitqueue is created,
- * used to assign &msm_gpu_submitqueue.id
- * @aspace: the per-process GPU address-space
- * @ref: reference count
- * @seqno: unique per process seqno
*/
struct msm_context {
+ /** @queuelock: synchronizes access to submitqueues list */
rwlock_t queuelock;
+
+ /** @submitqueues: list of &msm_gpu_submitqueue created by userspace */
struct list_head submitqueues;
+
+ /**
+ * @queueid:
+ *
+ * Counter incremented each time a submitqueue is created, used to
+ * assign &msm_gpu_submitqueue.id
+ */
int queueid;
+
+ /** @aspace: the per-process GPU address-space */
struct msm_gem_address_space *aspace;
+
+ /** @kref: the reference count */
struct kref ref;
+
+ /**
+ * @seqno:
+ *
+ * A unique per-process sequence number. Used to detect context
+ * switches, without relying on keeping a, potentially dangling,
+ * pointer to the previous context.
+ */
int seqno;
/**
- * sysprof:
+ * @sysprof:
*
* The value of MSM_PARAM_SYSPROF set by userspace. This is
* intended to be used by system profiling tools like Mesa's
@@ -384,21 +398,21 @@ struct msm_context {
int sysprof;
/**
- * comm: Overridden task comm, see MSM_PARAM_COMM
+ * @comm: Overridden task comm, see MSM_PARAM_COMM
*
* Accessed under msm_gpu::lock
*/
char *comm;
/**
- * cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE
+ * @cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE
*
* Accessed under msm_gpu::lock
*/
char *cmdline;
/**
- * elapsed:
+ * @elapsed:
*
* The total (cumulative) elapsed time GPU was busy with rendering
* from this context in ns.
@@ -406,7 +420,7 @@ struct msm_context {
uint64_t elapsed_ns;
/**
- * cycles:
+ * @cycles:
*
* The total (cumulative) GPU cycles elapsed attributed to this
* context.
@@ -414,7 +428,7 @@ struct msm_context {
uint64_t cycles;
/**
- * entities:
+ * @entities:
*
* Table of per-priority-level sched entities used by submitqueues
* associated with this &drm_file. Because some userspace apps
@@ -427,7 +441,7 @@ struct msm_context {
struct drm_sched_entity *entities[NR_SCHED_PRIORITIES * MSM_GPU_MAX_RINGS];
/**
- * ctx_mem:
+ * @ctx_mem:
*
* Total amount of memory of GEM buffers with handles attached for
* this context.
--
2.49.0
* [PATCH v4 08/40] drm/msm: Rename msm_gem_address_space -> msm_gem_vm
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (6 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 07/40] drm/msm: Improve msm_context comments Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 09/40] drm/msm: Remove vram carveout support Rob Clark
` (3 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark,
Dmitry Baryshkov, Rob Clark, Sean Paul, Konrad Dybcio,
Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
Simona Vetter, Jessica Zhang, Barnabás Czémán,
Arnd Bergmann, Christopher Snowhill, André Almeida,
Jonathan Marek, Krzysztof Kozlowski, Haoxiang Li, Eugene Lepshy,
open list
From: Rob Clark <robdclark@chromium.org>
Re-aligning naming to better match drm_gpuvm terminology will make
things less confusing at the end of the drm_gpuvm conversion.
This is just rename churn, no functional change.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 18 ++--
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 4 +-
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 ++---
drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 26 +++---
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 45 +++++----
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 47 +++++-----
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 18 ++--
.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +--
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 ++--
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 ++--
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +--
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 24 ++---
drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +--
drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 18 ++--
drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +--
drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +--
drivers/gpu/drm/msm/msm_drv.c | 8 +-
drivers/gpu/drm/msm/msm_drv.h | 10 +-
drivers/gpu/drm/msm/msm_fb.c | 10 +-
drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
drivers/gpu/drm/msm/msm_gem.c | 74 +++++++--------
drivers/gpu/drm/msm/msm_gem.h | 34 +++----
drivers/gpu/drm/msm/msm_gem_submit.c | 6 +-
drivers/gpu/drm/msm/msm_gem_vma.c | 93 +++++++++----------
drivers/gpu/drm/msm/msm_gpu.c | 48 +++++-----
drivers/gpu/drm/msm/msm_gpu.h | 16 ++--
drivers/gpu/drm/msm/msm_kms.c | 16 ++--
drivers/gpu/drm/msm/msm_kms.h | 2 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +-
drivers/gpu/drm/msm/msm_submitqueue.c | 2 +-
41 files changed, 349 insertions(+), 354 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
index 379a3d346c30..5eb063ed0b46 100644
--- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
@@ -113,7 +113,7 @@ static int a2xx_hw_init(struct msm_gpu *gpu)
uint32_t *ptr, len;
int i, ret;
- a2xx_gpummu_params(gpu->aspace->mmu, &pt_base, &tran_error);
+ a2xx_gpummu_params(gpu->vm->mmu, &pt_base, &tran_error);
DBG("%s", gpu->name);
@@ -466,19 +466,19 @@ static struct msm_gpu_state *a2xx_gpu_state_get(struct msm_gpu *gpu)
return state;
}
-static struct msm_gem_address_space *
-a2xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
+static struct msm_gem_vm *
+a2xx_create_vm(struct msm_gpu *gpu, struct platform_device *pdev)
{
struct msm_mmu *mmu = a2xx_gpummu_new(&pdev->dev, gpu);
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
- aspace = msm_gem_address_space_create(mmu, "gpu", SZ_16M,
+ vm = msm_gem_vm_create(mmu, "gpu", SZ_16M,
0xfff * SZ_64K);
- if (IS_ERR(aspace) && !IS_ERR(mmu))
+ if (IS_ERR(vm) && !IS_ERR(mmu))
mmu->funcs->destroy(mmu);
- return aspace;
+ return vm;
}
static u32 a2xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
@@ -504,7 +504,7 @@ static const struct adreno_gpu_funcs funcs = {
#endif
.gpu_state_get = a2xx_gpu_state_get,
.gpu_state_put = adreno_gpu_state_put,
- .create_address_space = a2xx_create_address_space,
+ .create_vm = a2xx_create_vm,
.get_rptr = a2xx_get_rptr,
},
};
@@ -551,7 +551,7 @@ struct msm_gpu *a2xx_gpu_init(struct drm_device *dev)
else
adreno_gpu->registers = a220_registers;
- if (!gpu->aspace) {
+ if (!gpu->vm) {
dev_err(dev->dev, "No memory protection without MMU\n");
if (!allow_vram_carveout) {
ret = -ENXIO;
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index b6df115bb567..434e6ededf83 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -526,7 +526,7 @@ static const struct adreno_gpu_funcs funcs = {
.gpu_busy = a3xx_gpu_busy,
.gpu_state_get = a3xx_gpu_state_get,
.gpu_state_put = adreno_gpu_state_put,
- .create_address_space = adreno_create_address_space,
+ .create_vm = adreno_create_vm,
.get_rptr = a3xx_get_rptr,
},
};
@@ -581,7 +581,7 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev)
goto fail;
}
- if (!gpu->aspace) {
+ if (!gpu->vm) {
/* TODO we think it is possible to configure the GPU to
* restrict access to VRAM carveout. But the required
* registers are unknown. For now just bail out and
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index f1b18a6663f7..2c75debcfd84 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -645,7 +645,7 @@ static const struct adreno_gpu_funcs funcs = {
.gpu_busy = a4xx_gpu_busy,
.gpu_state_get = a4xx_gpu_state_get,
.gpu_state_put = adreno_gpu_state_put,
- .create_address_space = adreno_create_address_space,
+ .create_vm = adreno_create_vm,
.get_rptr = a4xx_get_rptr,
},
.get_timestamp = a4xx_get_timestamp,
@@ -695,7 +695,7 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev)
adreno_gpu->uche_trap_base = 0xffff0000ffff0000ull;
- if (!gpu->aspace) {
+ if (!gpu->vm) {
/* TODO we think it is possible to configure the GPU to
* restrict access to VRAM carveout. But the required
* registers are unknown. For now just bail out and
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_debugfs.c b/drivers/gpu/drm/msm/adreno/a5xx_debugfs.c
index 169b8fe688f8..625a4e787d8f 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_debugfs.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_debugfs.c
@@ -116,13 +116,13 @@ reset_set(void *data, u64 val)
adreno_gpu->fw[ADRENO_FW_PFP] = NULL;
if (a5xx_gpu->pm4_bo) {
- msm_gem_unpin_iova(a5xx_gpu->pm4_bo, gpu->aspace);
+ msm_gem_unpin_iova(a5xx_gpu->pm4_bo, gpu->vm);
drm_gem_object_put(a5xx_gpu->pm4_bo);
a5xx_gpu->pm4_bo = NULL;
}
if (a5xx_gpu->pfp_bo) {
- msm_gem_unpin_iova(a5xx_gpu->pfp_bo, gpu->aspace);
+ msm_gem_unpin_iova(a5xx_gpu->pfp_bo, gpu->vm);
drm_gem_object_put(a5xx_gpu->pfp_bo);
a5xx_gpu->pfp_bo = NULL;
}
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 670141531112..cce95ad3cfb8 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -622,7 +622,7 @@ static int a5xx_ucode_load(struct msm_gpu *gpu)
a5xx_gpu->shadow = msm_gem_kernel_new(gpu->dev,
sizeof(u32) * gpu->nr_rings,
MSM_BO_WC | MSM_BO_MAP_PRIV,
- gpu->aspace, &a5xx_gpu->shadow_bo,
+ gpu->vm, &a5xx_gpu->shadow_bo,
&a5xx_gpu->shadow_iova);
if (IS_ERR(a5xx_gpu->shadow))
@@ -1042,22 +1042,22 @@ static void a5xx_destroy(struct msm_gpu *gpu)
a5xx_preempt_fini(gpu);
if (a5xx_gpu->pm4_bo) {
- msm_gem_unpin_iova(a5xx_gpu->pm4_bo, gpu->aspace);
+ msm_gem_unpin_iova(a5xx_gpu->pm4_bo, gpu->vm);
drm_gem_object_put(a5xx_gpu->pm4_bo);
}
if (a5xx_gpu->pfp_bo) {
- msm_gem_unpin_iova(a5xx_gpu->pfp_bo, gpu->aspace);
+ msm_gem_unpin_iova(a5xx_gpu->pfp_bo, gpu->vm);
drm_gem_object_put(a5xx_gpu->pfp_bo);
}
if (a5xx_gpu->gpmu_bo) {
- msm_gem_unpin_iova(a5xx_gpu->gpmu_bo, gpu->aspace);
+ msm_gem_unpin_iova(a5xx_gpu->gpmu_bo, gpu->vm);
drm_gem_object_put(a5xx_gpu->gpmu_bo);
}
if (a5xx_gpu->shadow_bo) {
- msm_gem_unpin_iova(a5xx_gpu->shadow_bo, gpu->aspace);
+ msm_gem_unpin_iova(a5xx_gpu->shadow_bo, gpu->vm);
drm_gem_object_put(a5xx_gpu->shadow_bo);
}
@@ -1457,7 +1457,7 @@ static int a5xx_crashdumper_init(struct msm_gpu *gpu,
struct a5xx_crashdumper *dumper)
{
dumper->ptr = msm_gem_kernel_new(gpu->dev,
- SZ_1M, MSM_BO_WC, gpu->aspace,
+ SZ_1M, MSM_BO_WC, gpu->vm,
&dumper->bo, &dumper->iova);
if (!IS_ERR(dumper->ptr))
@@ -1557,7 +1557,7 @@ static void a5xx_gpu_state_get_hlsq_regs(struct msm_gpu *gpu,
if (a5xx_crashdumper_run(gpu, &dumper)) {
kfree(a5xx_state->hlsqregs);
- msm_gem_kernel_put(dumper.bo, gpu->aspace);
+ msm_gem_kernel_put(dumper.bo, gpu->vm);
return;
}
@@ -1565,7 +1565,7 @@ static void a5xx_gpu_state_get_hlsq_regs(struct msm_gpu *gpu,
memcpy(a5xx_state->hlsqregs, dumper.ptr + (256 * SZ_1K),
count * sizeof(u32));
- msm_gem_kernel_put(dumper.bo, gpu->aspace);
+ msm_gem_kernel_put(dumper.bo, gpu->vm);
}
static struct msm_gpu_state *a5xx_gpu_state_get(struct msm_gpu *gpu)
@@ -1713,7 +1713,7 @@ static const struct adreno_gpu_funcs funcs = {
.gpu_busy = a5xx_gpu_busy,
.gpu_state_get = a5xx_gpu_state_get,
.gpu_state_put = a5xx_gpu_state_put,
- .create_address_space = adreno_create_address_space,
+ .create_vm = adreno_create_vm,
.get_rptr = a5xx_get_rptr,
},
.get_timestamp = a5xx_get_timestamp,
@@ -1786,8 +1786,8 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
return ERR_PTR(ret);
}
- if (gpu->aspace)
- msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu, a5xx_fault_handler);
+ if (gpu->vm)
+ msm_mmu_set_fault_handler(gpu->vm->mmu, gpu, a5xx_fault_handler);
/* Set up the preemption specific bits and pieces for each ringbuffer */
a5xx_preempt_init(gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_power.c b/drivers/gpu/drm/msm/adreno/a5xx_power.c
index 6b91e0bd1514..d6da7351cfbb 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_power.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_power.c
@@ -363,7 +363,7 @@ void a5xx_gpmu_ucode_init(struct msm_gpu *gpu)
bosize = (cmds_size + (cmds_size / TYPE4_MAX_PAYLOAD) + 1) << 2;
ptr = msm_gem_kernel_new(drm, bosize,
- MSM_BO_WC | MSM_BO_GPU_READONLY, gpu->aspace,
+ MSM_BO_WC | MSM_BO_GPU_READONLY, gpu->vm,
&a5xx_gpu->gpmu_bo, &a5xx_gpu->gpmu_iova);
if (IS_ERR(ptr))
return;
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 0469fea55010..5f9e2eb80a2c 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -254,7 +254,7 @@ static int preempt_init_ring(struct a5xx_gpu *a5xx_gpu,
ptr = msm_gem_kernel_new(gpu->dev,
A5XX_PREEMPT_RECORD_SIZE + A5XX_PREEMPT_COUNTER_SIZE,
- MSM_BO_WC | MSM_BO_MAP_PRIV, gpu->aspace, &bo, &iova);
+ MSM_BO_WC | MSM_BO_MAP_PRIV, gpu->vm, &bo, &iova);
if (IS_ERR(ptr))
return PTR_ERR(ptr);
@@ -262,9 +262,9 @@ static int preempt_init_ring(struct a5xx_gpu *a5xx_gpu,
/* The buffer to store counters needs to be unprivileged */
counters = msm_gem_kernel_new(gpu->dev,
A5XX_PREEMPT_COUNTER_SIZE,
- MSM_BO_WC, gpu->aspace, &counters_bo, &counters_iova);
+ MSM_BO_WC, gpu->vm, &counters_bo, &counters_iova);
if (IS_ERR(counters)) {
- msm_gem_kernel_put(bo, gpu->aspace);
+ msm_gem_kernel_put(bo, gpu->vm);
return PTR_ERR(counters);
}
@@ -295,8 +295,8 @@ void a5xx_preempt_fini(struct msm_gpu *gpu)
int i;
for (i = 0; i < gpu->nr_rings; i++) {
- msm_gem_kernel_put(a5xx_gpu->preempt_bo[i], gpu->aspace);
- msm_gem_kernel_put(a5xx_gpu->preempt_counters_bo[i], gpu->aspace);
+ msm_gem_kernel_put(a5xx_gpu->preempt_bo[i], gpu->vm);
+ msm_gem_kernel_put(a5xx_gpu->preempt_counters_bo[i], gpu->vm);
}
}
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 3d2c5661dbee..4c459ae25cba 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1259,15 +1259,15 @@ int a6xx_gmu_stop(struct a6xx_gpu *a6xx_gpu)
static void a6xx_gmu_memory_free(struct a6xx_gmu *gmu)
{
- msm_gem_kernel_put(gmu->hfi.obj, gmu->aspace);
- msm_gem_kernel_put(gmu->debug.obj, gmu->aspace);
- msm_gem_kernel_put(gmu->icache.obj, gmu->aspace);
- msm_gem_kernel_put(gmu->dcache.obj, gmu->aspace);
- msm_gem_kernel_put(gmu->dummy.obj, gmu->aspace);
- msm_gem_kernel_put(gmu->log.obj, gmu->aspace);
-
- gmu->aspace->mmu->funcs->detach(gmu->aspace->mmu);
- msm_gem_address_space_put(gmu->aspace);
+ msm_gem_kernel_put(gmu->hfi.obj, gmu->vm);
+ msm_gem_kernel_put(gmu->debug.obj, gmu->vm);
+ msm_gem_kernel_put(gmu->icache.obj, gmu->vm);
+ msm_gem_kernel_put(gmu->dcache.obj, gmu->vm);
+ msm_gem_kernel_put(gmu->dummy.obj, gmu->vm);
+ msm_gem_kernel_put(gmu->log.obj, gmu->vm);
+
+ gmu->vm->mmu->funcs->detach(gmu->vm->mmu);
+ msm_gem_vm_put(gmu->vm);
}
static int a6xx_gmu_memory_alloc(struct a6xx_gmu *gmu, struct a6xx_gmu_bo *bo,
@@ -1296,7 +1296,7 @@ static int a6xx_gmu_memory_alloc(struct a6xx_gmu *gmu, struct a6xx_gmu_bo *bo,
if (IS_ERR(bo->obj))
return PTR_ERR(bo->obj);
- ret = msm_gem_get_and_pin_iova_range(bo->obj, gmu->aspace, &bo->iova,
+ ret = msm_gem_get_and_pin_iova_range(bo->obj, gmu->vm, &bo->iova,
range_start, range_end);
if (ret) {
drm_gem_object_put(bo->obj);
@@ -1321,9 +1321,9 @@ static int a6xx_gmu_memory_probe(struct a6xx_gmu *gmu)
if (IS_ERR(mmu))
return PTR_ERR(mmu);
- gmu->aspace = msm_gem_address_space_create(mmu, "gmu", 0x0, 0x80000000);
- if (IS_ERR(gmu->aspace))
- return PTR_ERR(gmu->aspace);
+ gmu->vm = msm_gem_vm_create(mmu, "gmu", 0x0, 0x80000000);
+ if (IS_ERR(gmu->vm))
+ return PTR_ERR(gmu->vm);
return 0;
}
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 39fb8c774a79..cceda7d9c33a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -62,7 +62,7 @@ struct a6xx_gmu {
/* For serializing communication with the GMU: */
struct mutex lock;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
void __iomem *mmio;
void __iomem *rscc;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index a32cce8b0c5c..3c92ea35d39a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -120,7 +120,7 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
if (ctx->seqno == ring->cur_ctx_seqno)
return;
- if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
+ if (msm_iommu_pagetable_params(ctx->vm->mmu, &ttbr, &asid))
return;
if (adreno_gpu->info->family >= ADRENO_7XX_GEN1) {
@@ -957,7 +957,7 @@ static int a6xx_ucode_load(struct msm_gpu *gpu)
msm_gem_object_set_name(a6xx_gpu->sqe_bo, "sqefw");
if (!a6xx_ucode_check_version(a6xx_gpu, a6xx_gpu->sqe_bo)) {
- msm_gem_unpin_iova(a6xx_gpu->sqe_bo, gpu->aspace);
+ msm_gem_unpin_iova(a6xx_gpu->sqe_bo, gpu->vm);
drm_gem_object_put(a6xx_gpu->sqe_bo);
a6xx_gpu->sqe_bo = NULL;
@@ -974,7 +974,7 @@ static int a6xx_ucode_load(struct msm_gpu *gpu)
a6xx_gpu->shadow = msm_gem_kernel_new(gpu->dev,
sizeof(u32) * gpu->nr_rings,
MSM_BO_WC | MSM_BO_MAP_PRIV,
- gpu->aspace, &a6xx_gpu->shadow_bo,
+ gpu->vm, &a6xx_gpu->shadow_bo,
&a6xx_gpu->shadow_iova);
if (IS_ERR(a6xx_gpu->shadow))
@@ -985,7 +985,7 @@ static int a6xx_ucode_load(struct msm_gpu *gpu)
a6xx_gpu->pwrup_reglist_ptr = msm_gem_kernel_new(gpu->dev, PAGE_SIZE,
MSM_BO_WC | MSM_BO_MAP_PRIV,
- gpu->aspace, &a6xx_gpu->pwrup_reglist_bo,
+ gpu->vm, &a6xx_gpu->pwrup_reglist_bo,
&a6xx_gpu->pwrup_reglist_iova);
if (IS_ERR(a6xx_gpu->pwrup_reglist_ptr))
@@ -2198,12 +2198,12 @@ static void a6xx_destroy(struct msm_gpu *gpu)
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
if (a6xx_gpu->sqe_bo) {
- msm_gem_unpin_iova(a6xx_gpu->sqe_bo, gpu->aspace);
+ msm_gem_unpin_iova(a6xx_gpu->sqe_bo, gpu->vm);
drm_gem_object_put(a6xx_gpu->sqe_bo);
}
if (a6xx_gpu->shadow_bo) {
- msm_gem_unpin_iova(a6xx_gpu->shadow_bo, gpu->aspace);
+ msm_gem_unpin_iova(a6xx_gpu->shadow_bo, gpu->vm);
drm_gem_object_put(a6xx_gpu->shadow_bo);
}
@@ -2243,8 +2243,8 @@ static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
mutex_unlock(&a6xx_gpu->gmu.lock);
}
-static struct msm_gem_address_space *
-a6xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
+static struct msm_gem_vm *
+a6xx_create_vm(struct msm_gpu *gpu, struct platform_device *pdev)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
@@ -2258,22 +2258,22 @@ a6xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
!device_iommu_capable(&pdev->dev, IOMMU_CAP_CACHE_COHERENCY))
quirks |= IO_PGTABLE_QUIRK_ARM_OUTER_WBWA;
- return adreno_iommu_create_address_space(gpu, pdev, quirks);
+ return adreno_iommu_create_vm(gpu, pdev, quirks);
}
-static struct msm_gem_address_space *
-a6xx_create_private_address_space(struct msm_gpu *gpu)
+static struct msm_gem_vm *
+a6xx_create_private_vm(struct msm_gpu *gpu)
{
struct msm_mmu *mmu;
- mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
+ mmu = msm_iommu_pagetable_create(gpu->vm->mmu);
if (IS_ERR(mmu))
return ERR_CAST(mmu);
- return msm_gem_address_space_create(mmu,
+ return msm_gem_vm_create(mmu,
"gpu", ADRENO_VM_START,
- adreno_private_address_space_size(gpu));
+ adreno_private_vm_size(gpu));
}
static uint32_t a6xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
@@ -2390,8 +2390,8 @@ static const struct adreno_gpu_funcs funcs = {
.gpu_state_get = a6xx_gpu_state_get,
.gpu_state_put = a6xx_gpu_state_put,
#endif
- .create_address_space = a6xx_create_address_space,
- .create_private_address_space = a6xx_create_private_address_space,
+ .create_vm = a6xx_create_vm,
+ .create_private_vm = a6xx_create_private_vm,
.get_rptr = a6xx_get_rptr,
.progress = a6xx_progress,
},
@@ -2419,8 +2419,8 @@ static const struct adreno_gpu_funcs funcs_gmuwrapper = {
.gpu_state_get = a6xx_gpu_state_get,
.gpu_state_put = a6xx_gpu_state_put,
#endif
- .create_address_space = a6xx_create_address_space,
- .create_private_address_space = a6xx_create_private_address_space,
+ .create_vm = a6xx_create_vm,
+ .create_private_vm = a6xx_create_private_vm,
.get_rptr = a6xx_get_rptr,
.progress = a6xx_progress,
},
@@ -2450,8 +2450,8 @@ static const struct adreno_gpu_funcs funcs_a7xx = {
.gpu_state_get = a6xx_gpu_state_get,
.gpu_state_put = a6xx_gpu_state_put,
#endif
- .create_address_space = a6xx_create_address_space,
- .create_private_address_space = a6xx_create_private_address_space,
+ .create_vm = a6xx_create_vm,
+ .create_private_vm = a6xx_create_private_vm,
.get_rptr = a6xx_get_rptr,
.progress = a6xx_progress,
},
@@ -2547,9 +2547,8 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
adreno_gpu->uche_trap_base = 0x1fffffffff000ull;
- if (gpu->aspace)
- msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu,
- a6xx_fault_handler);
+ if (gpu->vm)
+ msm_mmu_set_fault_handler(gpu->vm->mmu, gpu, a6xx_fault_handler);
a6xx_calc_ubwc_config(adreno_gpu);
/* Set up the preemption specific bits and pieces for each ringbuffer */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 341a72a67401..ff06bb75b76d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -132,7 +132,7 @@ static int a6xx_crashdumper_init(struct msm_gpu *gpu,
struct a6xx_crashdumper *dumper)
{
dumper->ptr = msm_gem_kernel_new(gpu->dev,
- SZ_1M, MSM_BO_WC, gpu->aspace,
+ SZ_1M, MSM_BO_WC, gpu->vm,
&dumper->bo, &dumper->iova);
if (!IS_ERR(dumper->ptr))
@@ -1619,7 +1619,7 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu)
a7xx_get_clusters(gpu, a6xx_state, dumper);
a7xx_get_dbgahb_clusters(gpu, a6xx_state, dumper);
- msm_gem_kernel_put(dumper->bo, gpu->aspace);
+ msm_gem_kernel_put(dumper->bo, gpu->vm);
}
a7xx_get_post_crashdumper_registers(gpu, a6xx_state);
@@ -1631,7 +1631,7 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu)
a6xx_get_clusters(gpu, a6xx_state, dumper);
a6xx_get_dbgahb_clusters(gpu, a6xx_state, dumper);
- msm_gem_kernel_put(dumper->bo, gpu->aspace);
+ msm_gem_kernel_put(dumper->bo, gpu->vm);
}
}
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
index 2fd4e39f618f..41229c60aa06 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
@@ -343,7 +343,7 @@ static int preempt_init_ring(struct a6xx_gpu *a6xx_gpu,
ptr = msm_gem_kernel_new(gpu->dev,
PREEMPT_RECORD_SIZE(adreno_gpu),
- MSM_BO_WC | MSM_BO_MAP_PRIV, gpu->aspace, &bo, &iova);
+ MSM_BO_WC | MSM_BO_MAP_PRIV, gpu->vm, &bo, &iova);
if (IS_ERR(ptr))
return PTR_ERR(ptr);
@@ -361,7 +361,7 @@ static int preempt_init_ring(struct a6xx_gpu *a6xx_gpu,
ptr = msm_gem_kernel_new(gpu->dev,
PREEMPT_SMMU_INFO_SIZE,
MSM_BO_WC | MSM_BO_MAP_PRIV | MSM_BO_GPU_READONLY,
- gpu->aspace, &bo, &iova);
+ gpu->vm, &bo, &iova);
if (IS_ERR(ptr))
return PTR_ERR(ptr);
@@ -376,7 +376,7 @@ static int preempt_init_ring(struct a6xx_gpu *a6xx_gpu,
struct a7xx_cp_smmu_info *smmu_info_ptr = ptr;
- msm_iommu_pagetable_params(gpu->aspace->mmu, &ttbr, &asid);
+ msm_iommu_pagetable_params(gpu->vm->mmu, &ttbr, &asid);
smmu_info_ptr->magic = GEN7_CP_SMMU_INFO_MAGIC;
smmu_info_ptr->ttbr0 = ttbr;
@@ -404,7 +404,7 @@ void a6xx_preempt_fini(struct msm_gpu *gpu)
int i;
for (i = 0; i < gpu->nr_rings; i++)
- msm_gem_kernel_put(a6xx_gpu->preempt_bo[i], gpu->aspace);
+ msm_gem_kernel_put(a6xx_gpu->preempt_bo[i], gpu->vm);
}
void a6xx_preempt_init(struct msm_gpu *gpu)
@@ -430,7 +430,7 @@ void a6xx_preempt_init(struct msm_gpu *gpu)
a6xx_gpu->preempt_postamble_ptr = msm_gem_kernel_new(gpu->dev,
PAGE_SIZE,
MSM_BO_WC | MSM_BO_MAP_PRIV | MSM_BO_GPU_READONLY,
- gpu->aspace, &a6xx_gpu->preempt_postamble_bo,
+ gpu->vm, &a6xx_gpu->preempt_postamble_bo,
&a6xx_gpu->preempt_postamble_iova);
preempt_prepare_postamble(a6xx_gpu);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 25c939b3367a..b13aaebd8da7 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -191,21 +191,21 @@ int adreno_zap_shader_load(struct msm_gpu *gpu, u32 pasid)
return zap_shader_load_mdt(gpu, adreno_gpu->info->zapfw, pasid);
}
-struct msm_gem_address_space *
-adreno_create_address_space(struct msm_gpu *gpu,
- struct platform_device *pdev)
+struct msm_gem_vm *
+adreno_create_vm(struct msm_gpu *gpu,
+ struct platform_device *pdev)
{
- return adreno_iommu_create_address_space(gpu, pdev, 0);
+ return adreno_iommu_create_vm(gpu, pdev, 0);
}
-struct msm_gem_address_space *
-adreno_iommu_create_address_space(struct msm_gpu *gpu,
- struct platform_device *pdev,
- unsigned long quirks)
+struct msm_gem_vm *
+adreno_iommu_create_vm(struct msm_gpu *gpu,
+ struct platform_device *pdev,
+ unsigned long quirks)
{
struct iommu_domain_geometry *geometry;
struct msm_mmu *mmu;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
u64 start, size;
mmu = msm_iommu_gpu_new(&pdev->dev, gpu, quirks);
@@ -224,16 +224,15 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
start = max_t(u64, SZ_16M, geometry->aperture_start);
size = geometry->aperture_end - start + 1;
- aspace = msm_gem_address_space_create(mmu, "gpu",
- start & GENMASK_ULL(48, 0), size);
+ vm = msm_gem_vm_create(mmu, "gpu", start & GENMASK_ULL(48, 0), size);
- if (IS_ERR(aspace) && !IS_ERR(mmu))
+ if (IS_ERR(vm) && !IS_ERR(mmu))
mmu->funcs->destroy(mmu);
- return aspace;
+ return vm;
}
-u64 adreno_private_address_space_size(struct msm_gpu *gpu)
+u64 adreno_private_vm_size(struct msm_gpu *gpu)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(&gpu->pdev->dev);
@@ -274,7 +273,7 @@ void adreno_check_and_reenable_stall(struct adreno_gpu *adreno_gpu)
!READ_ONCE(gpu->crashstate)) {
adreno_gpu->stall_enabled = true;
- gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, true);
+ gpu->vm->mmu->funcs->set_stall(gpu->vm->mmu, true);
}
spin_unlock_irqrestore(&adreno_gpu->fault_stall_lock, flags);
}
@@ -302,7 +301,7 @@ int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
if (adreno_gpu->stall_enabled) {
adreno_gpu->stall_enabled = false;
- gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, false);
+ gpu->vm->mmu->funcs->set_stall(gpu->vm->mmu, false);
}
adreno_gpu->stall_reenable_time = ktime_add_ms(ktime_get(), 500);
spin_unlock_irqrestore(&adreno_gpu->fault_stall_lock, irq_flags);
@@ -312,7 +311,7 @@ int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
* it now.
*/
if (!do_devcoredump) {
- gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu);
+ gpu->vm->mmu->funcs->resume_translation(gpu->vm->mmu);
}
/*
@@ -406,8 +405,8 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
*value = 0;
return 0;
case MSM_PARAM_FAULTS:
- if (ctx->aspace)
- *value = gpu->global_faults + ctx->aspace->faults;
+ if (ctx->vm)
+ *value = gpu->global_faults + ctx->vm->faults;
else
*value = gpu->global_faults;
return 0;
@@ -415,14 +414,14 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
*value = gpu->suspend_count;
return 0;
case MSM_PARAM_VA_START:
- if (ctx->aspace == gpu->aspace)
+ if (ctx->vm == gpu->vm)
return UERR(EINVAL, drm, "requires per-process pgtables");
- *value = ctx->aspace->va_start;
+ *value = ctx->vm->va_start;
return 0;
case MSM_PARAM_VA_SIZE:
- if (ctx->aspace == gpu->aspace)
+ if (ctx->vm == gpu->vm)
return UERR(EINVAL, drm, "requires per-process pgtables");
- *value = ctx->aspace->va_size;
+ *value = ctx->vm->va_size;
return 0;
case MSM_PARAM_HIGHEST_BANK_BIT:
*value = adreno_gpu->ubwc_config.highest_bank_bit;
@@ -612,7 +611,7 @@ struct drm_gem_object *adreno_fw_create_bo(struct msm_gpu *gpu,
void *ptr;
ptr = msm_gem_kernel_new(gpu->dev, fw->size - 4,
- MSM_BO_WC | MSM_BO_GPU_READONLY, gpu->aspace, &bo, iova);
+ MSM_BO_WC | MSM_BO_GPU_READONLY, gpu->vm, &bo, iova);
if (IS_ERR(ptr))
return ERR_CAST(ptr);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index fed9516da365..258c5c6dde2e 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -602,7 +602,7 @@ static inline int adreno_is_a7xx(struct adreno_gpu *gpu)
/* Put vm_start above 32b to catch issues with not setting xyz_BASE_HI */
#define ADRENO_VM_START 0x100000000ULL
-u64 adreno_private_address_space_size(struct msm_gpu *gpu);
+u64 adreno_private_vm_size(struct msm_gpu *gpu);
int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
uint32_t param, uint64_t *value, uint32_t *len);
int adreno_set_param(struct msm_gpu *gpu, struct msm_context *ctx,
@@ -645,14 +645,14 @@ void adreno_show_object(struct drm_printer *p, void **ptr, int len,
* Common helper function to initialize the default address space for arm-smmu
* attached targets
*/
-struct msm_gem_address_space *
-adreno_create_address_space(struct msm_gpu *gpu,
- struct platform_device *pdev);
-
-struct msm_gem_address_space *
-adreno_iommu_create_address_space(struct msm_gpu *gpu,
- struct platform_device *pdev,
- unsigned long quirks);
+struct msm_gem_vm *
+adreno_create_vm(struct msm_gpu *gpu,
+ struct platform_device *pdev);
+
+struct msm_gem_vm *
+adreno_iommu_create_vm(struct msm_gpu *gpu,
+ struct platform_device *pdev,
+ unsigned long quirks);
int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
struct adreno_smmu_fault_info *info, const char *block,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
index 849fea580a4c..32e208ee946d 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -566,7 +566,7 @@ static void dpu_encoder_phys_wb_prepare_wb_job(struct dpu_encoder_phys *phys_enc
struct drm_writeback_job *job)
{
const struct msm_format *format;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
struct dpu_hw_wb_cfg *wb_cfg;
int ret;
struct dpu_encoder_phys_wb *wb_enc = to_dpu_encoder_phys_wb(phys_enc);
@@ -576,13 +576,13 @@ static void dpu_encoder_phys_wb_prepare_wb_job(struct dpu_encoder_phys *phys_enc
wb_enc->wb_job = job;
wb_enc->wb_conn = job->connector;
- aspace = phys_enc->dpu_kms->base.aspace;
+ vm = phys_enc->dpu_kms->base.vm;
wb_cfg = &wb_enc->wb_cfg;
memset(wb_cfg, 0, sizeof(struct dpu_hw_wb_cfg));
- ret = msm_framebuffer_prepare(job->fb, aspace, false);
+ ret = msm_framebuffer_prepare(job->fb, vm, false);
if (ret) {
DPU_ERROR("prep fb failed, %d\n", ret);
return;
@@ -596,7 +596,7 @@ static void dpu_encoder_phys_wb_prepare_wb_job(struct dpu_encoder_phys *phys_enc
return;
}
- dpu_format_populate_addrs(aspace, job->fb, &wb_cfg->dest);
+ dpu_format_populate_addrs(vm, job->fb, &wb_cfg->dest);
wb_cfg->dest.width = job->fb->width;
wb_cfg->dest.height = job->fb->height;
@@ -619,14 +619,14 @@ static void dpu_encoder_phys_wb_cleanup_wb_job(struct dpu_encoder_phys *phys_enc
struct drm_writeback_job *job)
{
struct dpu_encoder_phys_wb *wb_enc = to_dpu_encoder_phys_wb(phys_enc);
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
if (!job->fb)
return;
- aspace = phys_enc->dpu_kms->base.aspace;
+ vm = phys_enc->dpu_kms->base.vm;
- msm_framebuffer_cleanup(job->fb, aspace, false);
+ msm_framebuffer_cleanup(job->fb, vm, false);
wb_enc->wb_job = NULL;
wb_enc->wb_conn = NULL;
}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
index 59c9427da7dd..d115b79af771 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
@@ -274,7 +274,7 @@ int dpu_format_populate_plane_sizes(
return _dpu_format_populate_plane_sizes_linear(fmt, fb, layout);
}
-static void _dpu_format_populate_addrs_ubwc(struct msm_gem_address_space *aspace,
+static void _dpu_format_populate_addrs_ubwc(struct msm_gem_vm *vm,
struct drm_framebuffer *fb,
struct dpu_hw_fmt_layout *layout)
{
@@ -282,7 +282,7 @@ static void _dpu_format_populate_addrs_ubwc(struct msm_gem_address_space *aspace
uint32_t base_addr = 0;
bool meta;
- base_addr = msm_framebuffer_iova(fb, aspace, 0);
+ base_addr = msm_framebuffer_iova(fb, vm, 0);
fmt = msm_framebuffer_format(fb);
meta = MSM_FORMAT_IS_UBWC(fmt);
@@ -355,7 +355,7 @@ static void _dpu_format_populate_addrs_ubwc(struct msm_gem_address_space *aspace
}
}
-static void _dpu_format_populate_addrs_linear(struct msm_gem_address_space *aspace,
+static void _dpu_format_populate_addrs_linear(struct msm_gem_vm *vm,
struct drm_framebuffer *fb,
struct dpu_hw_fmt_layout *layout)
{
@@ -363,17 +363,17 @@ static void _dpu_format_populate_addrs_linear(struct msm_gem_address_space *aspa
/* Populate addresses for simple formats here */
for (i = 0; i < layout->num_planes; ++i)
- layout->plane_addr[i] = msm_framebuffer_iova(fb, aspace, i);
-}
+ layout->plane_addr[i] = msm_framebuffer_iova(fb, vm, i);
+ }
/**
* dpu_format_populate_addrs - populate buffer addresses based on
* mmu, fb, and format found in the fb
- * @aspace: address space pointer
+ * @vm: address space pointer
* @fb: framebuffer pointer
* @layout: format layout structure to populate
*/
-void dpu_format_populate_addrs(struct msm_gem_address_space *aspace,
+void dpu_format_populate_addrs(struct msm_gem_vm *vm,
struct drm_framebuffer *fb,
struct dpu_hw_fmt_layout *layout)
{
@@ -384,7 +384,7 @@ void dpu_format_populate_addrs(struct msm_gem_address_space *aspace,
/* Populate the addresses given the fb */
if (MSM_FORMAT_IS_UBWC(fmt) ||
MSM_FORMAT_IS_TILE(fmt))
- _dpu_format_populate_addrs_ubwc(aspace, fb, layout);
+ _dpu_format_populate_addrs_ubwc(vm, fb, layout);
else
- _dpu_format_populate_addrs_linear(aspace, fb, layout);
+ _dpu_format_populate_addrs_linear(vm, fb, layout);
}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h
index c6145d43aa3f..989f3e13c497 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h
@@ -31,7 +31,7 @@ static inline bool dpu_find_format(u32 format, const u32 *supported_formats,
return false;
}
-void dpu_format_populate_addrs(struct msm_gem_address_space *aspace,
+void dpu_format_populate_addrs(struct msm_gem_vm *vm,
struct drm_framebuffer *fb,
struct dpu_hw_fmt_layout *layout);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 3305ad0623ca..bb5db6da636a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1095,26 +1095,26 @@ static void _dpu_kms_mmu_destroy(struct dpu_kms *dpu_kms)
{
struct msm_mmu *mmu;
- if (!dpu_kms->base.aspace)
+ if (!dpu_kms->base.vm)
return;
- mmu = dpu_kms->base.aspace->mmu;
+ mmu = dpu_kms->base.vm->mmu;
mmu->funcs->detach(mmu);
- msm_gem_address_space_put(dpu_kms->base.aspace);
+ msm_gem_vm_put(dpu_kms->base.vm);
- dpu_kms->base.aspace = NULL;
+ dpu_kms->base.vm = NULL;
}
static int _dpu_kms_mmu_init(struct dpu_kms *dpu_kms)
{
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
- aspace = msm_kms_init_aspace(dpu_kms->dev);
- if (IS_ERR(aspace))
- return PTR_ERR(aspace);
+ vm = msm_kms_init_vm(dpu_kms->dev);
+ if (IS_ERR(vm))
+ return PTR_ERR(vm);
- dpu_kms->base.aspace = aspace;
+ dpu_kms->base.vm = vm;
return 0;
}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index af3e541f60c3..92a249b2ef5f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -71,7 +71,7 @@ static const uint32_t qcom_compressed_supported_formats[] = {
/*
* struct dpu_plane - local dpu plane structure
- * @aspace: address space pointer
+ * @vm: address space pointer
* @csc_ptr: Points to dpu_csc_cfg structure to use for current
* @catalog: Points to dpu catalog structure
* @revalidate: force revalidation of all the plane properties
@@ -654,8 +654,8 @@ static int dpu_plane_prepare_fb(struct drm_plane *plane,
DPU_DEBUG_PLANE(pdpu, "FB[%u]\n", fb->base.id);
- /* cache aspace */
- pstate->aspace = kms->base.aspace;
+ /* cache vm */
+ pstate->vm = kms->base.vm;
/*
* TODO: Need to sort out the msm_framebuffer_prepare() call below so
@@ -664,9 +664,9 @@ static int dpu_plane_prepare_fb(struct drm_plane *plane,
*/
drm_gem_plane_helper_prepare_fb(plane, new_state);
- if (pstate->aspace) {
+ if (pstate->vm) {
ret = msm_framebuffer_prepare(new_state->fb,
- pstate->aspace, pstate->needs_dirtyfb);
+ pstate->vm, pstate->needs_dirtyfb);
if (ret) {
DPU_ERROR("failed to prepare framebuffer\n");
return ret;
@@ -689,7 +689,7 @@ static void dpu_plane_cleanup_fb(struct drm_plane *plane,
DPU_DEBUG_PLANE(pdpu, "FB[%u]\n", old_state->fb->base.id);
- msm_framebuffer_cleanup(old_state->fb, old_pstate->aspace,
+ msm_framebuffer_cleanup(old_state->fb, old_pstate->vm,
old_pstate->needs_dirtyfb);
}
@@ -1349,7 +1349,7 @@ static void dpu_plane_sspp_atomic_update(struct drm_plane *plane,
pstate->needs_qos_remap |= (is_rt_pipe != pdpu->is_rt_pipe);
pdpu->is_rt_pipe = is_rt_pipe;
- dpu_format_populate_addrs(pstate->aspace, new_state->fb, &pstate->layout);
+ dpu_format_populate_addrs(pstate->vm, new_state->fb, &pstate->layout);
DPU_DEBUG_PLANE(pdpu, "FB[%u] " DRM_RECT_FP_FMT "->crtc%u " DRM_RECT_FMT
", %p4cc ubwc %d\n", fb->base.id, DRM_RECT_FP_ARG(&state->src),
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h
index acd5725175cd..3578f52048a5 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h
@@ -17,7 +17,7 @@
/**
* struct dpu_plane_state: Define dpu extension of drm plane state object
* @base: base drm plane state object
- * @aspace: pointer to address space for input/output buffers
+ * @vm: pointer to address space for input/output buffers
* @pipe: software pipe description
* @r_pipe: software pipe description of the second pipe
* @pipe_cfg: software pipe configuration
@@ -34,7 +34,7 @@
*/
struct dpu_plane_state {
struct drm_plane_state base;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
struct dpu_sw_pipe pipe;
struct dpu_sw_pipe r_pipe;
struct dpu_sw_pipe_cfg pipe_cfg;
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c
index b8610aa806ea..0133c0c01a0b 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c
@@ -120,7 +120,7 @@ static void unref_cursor_worker(struct drm_flip_work *work, void *val)
struct mdp4_kms *mdp4_kms = get_kms(&mdp4_crtc->base);
struct msm_kms *kms = &mdp4_kms->base.base;
- msm_gem_unpin_iova(val, kms->aspace);
+ msm_gem_unpin_iova(val, kms->vm);
drm_gem_object_put(val);
}
@@ -369,7 +369,7 @@ static void update_cursor(struct drm_crtc *crtc)
if (next_bo) {
/* take a obj ref + iova ref when we start scanning out: */
drm_gem_object_get(next_bo);
- msm_gem_get_and_pin_iova(next_bo, kms->aspace, &iova);
+ msm_gem_get_and_pin_iova(next_bo, kms->vm, &iova);
/* enable cursor: */
mdp4_write(mdp4_kms, REG_MDP4_DMA_CURSOR_SIZE(dma),
@@ -427,7 +427,7 @@ static int mdp4_crtc_cursor_set(struct drm_crtc *crtc,
}
if (cursor_bo) {
- ret = msm_gem_get_and_pin_iova(cursor_bo, kms->aspace, &iova);
+ ret = msm_gem_get_and_pin_iova(cursor_bo, kms->vm, &iova);
if (ret)
goto fail;
} else {
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
index c469e66cfc11..94fbc20b2fbd 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
@@ -120,15 +120,15 @@ static void mdp4_destroy(struct msm_kms *kms)
{
struct mdp4_kms *mdp4_kms = to_mdp4_kms(to_mdp_kms(kms));
struct device *dev = mdp4_kms->dev->dev;
- struct msm_gem_address_space *aspace = kms->aspace;
+ struct msm_gem_vm *vm = kms->vm;
if (mdp4_kms->blank_cursor_iova)
- msm_gem_unpin_iova(mdp4_kms->blank_cursor_bo, kms->aspace);
+ msm_gem_unpin_iova(mdp4_kms->blank_cursor_bo, kms->vm);
drm_gem_object_put(mdp4_kms->blank_cursor_bo);
- if (aspace) {
- aspace->mmu->funcs->detach(aspace->mmu);
- msm_gem_address_space_put(aspace);
+ if (vm) {
+ vm->mmu->funcs->detach(vm->mmu);
+ msm_gem_vm_put(vm);
}
if (mdp4_kms->rpm_enabled)
@@ -380,7 +380,7 @@ static int mdp4_kms_init(struct drm_device *dev)
struct mdp4_kms *mdp4_kms = to_mdp4_kms(to_mdp_kms(priv->kms));
struct msm_kms *kms = NULL;
struct msm_mmu *mmu;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
int ret;
u32 major, minor;
unsigned long max_clk;
@@ -449,19 +449,19 @@ static int mdp4_kms_init(struct drm_device *dev)
} else if (!mmu) {
DRM_DEV_INFO(dev->dev, "no iommu, fallback to phys "
"contig buffers for scanout\n");
- aspace = NULL;
+ vm = NULL;
} else {
- aspace = msm_gem_address_space_create(mmu,
+ vm = msm_gem_vm_create(mmu,
"mdp4", 0x1000, 0x100000000 - 0x1000);
- if (IS_ERR(aspace)) {
+ if (IS_ERR(vm)) {
if (!IS_ERR(mmu))
mmu->funcs->destroy(mmu);
- ret = PTR_ERR(aspace);
+ ret = PTR_ERR(vm);
goto fail;
}
- kms->aspace = aspace;
+ kms->vm = vm;
}
ret = modeset_init(mdp4_kms);
@@ -478,7 +478,7 @@ static int mdp4_kms_init(struct drm_device *dev)
goto fail;
}
- ret = msm_gem_get_and_pin_iova(mdp4_kms->blank_cursor_bo, kms->aspace,
+ ret = msm_gem_get_and_pin_iova(mdp4_kms->blank_cursor_bo, kms->vm,
&mdp4_kms->blank_cursor_iova);
if (ret) {
DRM_DEV_ERROR(dev->dev, "could not pin blank-cursor bo: %d\n", ret);
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c
index 3fefb2088008..7743be6167f8 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c
@@ -87,7 +87,7 @@ static int mdp4_plane_prepare_fb(struct drm_plane *plane,
drm_gem_plane_helper_prepare_fb(plane, new_state);
- return msm_framebuffer_prepare(new_state->fb, kms->aspace, false);
+ return msm_framebuffer_prepare(new_state->fb, kms->vm, false);
}
static void mdp4_plane_cleanup_fb(struct drm_plane *plane,
@@ -102,7 +102,7 @@ static void mdp4_plane_cleanup_fb(struct drm_plane *plane,
return;
DBG("%s: cleanup: FB[%u]", mdp4_plane->name, fb->base.id);
- msm_framebuffer_cleanup(fb, kms->aspace, false);
+ msm_framebuffer_cleanup(fb, kms->vm, false);
}
@@ -153,13 +153,13 @@ static void mdp4_plane_set_scanout(struct drm_plane *plane,
MDP4_PIPE_SRC_STRIDE_B_P3(fb->pitches[3]));
mdp4_write(mdp4_kms, REG_MDP4_PIPE_SRCP0_BASE(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 0));
+ msm_framebuffer_iova(fb, kms->vm, 0));
mdp4_write(mdp4_kms, REG_MDP4_PIPE_SRCP1_BASE(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 1));
+ msm_framebuffer_iova(fb, kms->vm, 1));
mdp4_write(mdp4_kms, REG_MDP4_PIPE_SRCP2_BASE(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 2));
+ msm_framebuffer_iova(fb, kms->vm, 2));
mdp4_write(mdp4_kms, REG_MDP4_PIPE_SRCP3_BASE(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 3));
+ msm_framebuffer_iova(fb, kms->vm, 3));
}
static void mdp4_write_csc_config(struct mdp4_kms *mdp4_kms,
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
index 0f653e62b4a0..298861f373b0 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
@@ -169,7 +169,7 @@ static void unref_cursor_worker(struct drm_flip_work *work, void *val)
struct mdp5_kms *mdp5_kms = get_kms(&mdp5_crtc->base);
struct msm_kms *kms = &mdp5_kms->base.base;
- msm_gem_unpin_iova(val, kms->aspace);
+ msm_gem_unpin_iova(val, kms->vm);
drm_gem_object_put(val);
}
@@ -993,7 +993,7 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
if (!cursor_bo)
return -ENOENT;
- ret = msm_gem_get_and_pin_iova(cursor_bo, kms->aspace,
+ ret = msm_gem_get_and_pin_iova(cursor_bo, kms->vm,
&mdp5_crtc->cursor.iova);
if (ret) {
drm_gem_object_put(cursor_bo);
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
index 3fcca7a3d82e..9dca0385a42d 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
@@ -198,11 +198,11 @@ static void mdp5_destroy(struct mdp5_kms *mdp5_kms);
static void mdp5_kms_destroy(struct msm_kms *kms)
{
struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
- struct msm_gem_address_space *aspace = kms->aspace;
+ struct msm_gem_vm *vm = kms->vm;
- if (aspace) {
- aspace->mmu->funcs->detach(aspace->mmu);
- msm_gem_address_space_put(aspace);
+ if (vm) {
+ vm->mmu->funcs->detach(vm->mmu);
+ msm_gem_vm_put(vm);
}
mdp_kms_destroy(&mdp5_kms->base);
@@ -500,7 +500,7 @@ static int mdp5_kms_init(struct drm_device *dev)
struct mdp5_kms *mdp5_kms;
struct mdp5_cfg *config;
struct msm_kms *kms = priv->kms;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
int i, ret;
ret = mdp5_init(to_platform_device(dev->dev), dev);
@@ -534,13 +534,13 @@ static int mdp5_kms_init(struct drm_device *dev)
}
mdelay(16);
- aspace = msm_kms_init_aspace(mdp5_kms->dev);
- if (IS_ERR(aspace)) {
- ret = PTR_ERR(aspace);
+ vm = msm_kms_init_vm(mdp5_kms->dev);
+ if (IS_ERR(vm)) {
+ ret = PTR_ERR(vm);
goto fail;
}
- kms->aspace = aspace;
+ kms->vm = vm;
pm_runtime_put_sync(&pdev->dev);
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
index bb1601921938..9f68a4747203 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
@@ -144,7 +144,7 @@ static int mdp5_plane_prepare_fb(struct drm_plane *plane,
drm_gem_plane_helper_prepare_fb(plane, new_state);
- return msm_framebuffer_prepare(new_state->fb, kms->aspace, needs_dirtyfb);
+ return msm_framebuffer_prepare(new_state->fb, kms->vm, needs_dirtyfb);
}
static void mdp5_plane_cleanup_fb(struct drm_plane *plane,
@@ -159,7 +159,7 @@ static void mdp5_plane_cleanup_fb(struct drm_plane *plane,
return;
DBG("%s: cleanup: FB[%u]", plane->name, fb->base.id);
- msm_framebuffer_cleanup(fb, kms->aspace, needed_dirtyfb);
+ msm_framebuffer_cleanup(fb, kms->vm, needed_dirtyfb);
}
static int mdp5_plane_atomic_check_with_state(struct drm_crtc_state *crtc_state,
@@ -478,13 +478,13 @@ static void set_scanout_locked(struct mdp5_kms *mdp5_kms,
MDP5_PIPE_SRC_STRIDE_B_P3(fb->pitches[3]));
mdp5_write(mdp5_kms, REG_MDP5_PIPE_SRC0_ADDR(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 0));
+ msm_framebuffer_iova(fb, kms->vm, 0));
mdp5_write(mdp5_kms, REG_MDP5_PIPE_SRC1_ADDR(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 1));
+ msm_framebuffer_iova(fb, kms->vm, 1));
mdp5_write(mdp5_kms, REG_MDP5_PIPE_SRC2_ADDR(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 2));
+ msm_framebuffer_iova(fb, kms->vm, 2));
mdp5_write(mdp5_kms, REG_MDP5_PIPE_SRC3_ADDR(pipe),
- msm_framebuffer_iova(fb, kms->aspace, 3));
+ msm_framebuffer_iova(fb, kms->vm, 3));
}
/* Note: mdp5_plane->pipe_lock must be locked */
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 4d75529c0e85..16335ebd21e4 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -143,7 +143,7 @@ struct msm_dsi_host {
/* DSI 6G TX buffer*/
struct drm_gem_object *tx_gem_obj;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
/* DSI v2 TX buffer */
void *tx_buf;
@@ -1146,10 +1146,10 @@ int dsi_tx_buf_alloc_6g(struct msm_dsi_host *msm_host, int size)
uint64_t iova;
u8 *data;
- msm_host->aspace = msm_gem_address_space_get(priv->kms->aspace);
+ msm_host->vm = msm_gem_vm_get(priv->kms->vm);
data = msm_gem_kernel_new(dev, size, MSM_BO_WC,
- msm_host->aspace,
+ msm_host->vm,
&msm_host->tx_gem_obj, &iova);
if (IS_ERR(data)) {
@@ -1193,10 +1193,10 @@ void msm_dsi_tx_buf_free(struct mipi_dsi_host *host)
return;
if (msm_host->tx_gem_obj) {
- msm_gem_kernel_put(msm_host->tx_gem_obj, msm_host->aspace);
- msm_gem_address_space_put(msm_host->aspace);
+ msm_gem_kernel_put(msm_host->tx_gem_obj, msm_host->vm);
+ msm_gem_vm_put(msm_host->vm);
msm_host->tx_gem_obj = NULL;
- msm_host->aspace = NULL;
+ msm_host->vm = NULL;
}
if (msm_host->tx_buf)
@@ -1327,7 +1327,7 @@ int dsi_dma_base_get_6g(struct msm_dsi_host *msm_host, uint64_t *dma_base)
return -EINVAL;
return msm_gem_get_and_pin_iova(msm_host->tx_gem_obj,
- priv->kms->aspace, dma_base);
+ priv->kms->vm, dma_base);
}
int dsi_dma_base_get_v2(struct msm_dsi_host *msm_host, uint64_t *dma_base)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 29ca24548c67..903abf3532e0 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -345,7 +345,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
kref_init(&ctx->ref);
msm_submitqueue_init(dev, ctx);
- ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
+ ctx->vm = msm_gpu_create_private_vm(priv->gpu, current);
file->driver_priv = ctx;
ctx->seqno = atomic_inc_return(&ident);
@@ -523,7 +523,7 @@ static int msm_ioctl_gem_info_iova(struct drm_device *dev,
* Don't pin the memory here - just get an address so that userspace can
* be productive
*/
- return msm_gem_get_iova(obj, ctx->aspace, iova);
+ return msm_gem_get_iova(obj, ctx->vm, iova);
}
static int msm_ioctl_gem_info_set_iova(struct drm_device *dev,
@@ -537,13 +537,13 @@ static int msm_ioctl_gem_info_set_iova(struct drm_device *dev,
return -EINVAL;
/* Only supported if per-process address space is supported: */
- if (priv->gpu->aspace == ctx->aspace)
+ if (priv->gpu->vm == ctx->vm)
return UERR(EOPNOTSUPP, dev, "requires per-process pgtables");
if (should_fail(&fail_gem_iova, obj->size))
return -ENOMEM;
- return msm_gem_set_iova(obj, ctx->aspace, iova);
+ return msm_gem_set_iova(obj, ctx->vm, iova);
}
static int msm_ioctl_gem_info_set_metadata(struct drm_gem_object *obj,
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index a65077855201..0e675c9a7f83 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -48,7 +48,7 @@ struct msm_rd_state;
struct msm_perf_state;
struct msm_gem_submit;
struct msm_fence_context;
-struct msm_gem_address_space;
+struct msm_gem_vm;
struct msm_gem_vma;
struct msm_disp_state;
@@ -241,7 +241,7 @@ void msm_crtc_disable_vblank(struct drm_crtc *crtc);
int msm_register_mmu(struct drm_device *dev, struct msm_mmu *mmu);
void msm_unregister_mmu(struct drm_device *dev, struct msm_mmu *mmu);
-struct msm_gem_address_space *msm_kms_init_aspace(struct drm_device *dev);
+struct msm_gem_vm *msm_kms_init_vm(struct drm_device *dev);
bool msm_use_mmu(struct drm_device *dev);
int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
@@ -263,11 +263,11 @@ int msm_gem_prime_pin(struct drm_gem_object *obj);
void msm_gem_prime_unpin(struct drm_gem_object *obj);
int msm_framebuffer_prepare(struct drm_framebuffer *fb,
- struct msm_gem_address_space *aspace, bool needs_dirtyfb);
+ struct msm_gem_vm *vm, bool needs_dirtyfb);
void msm_framebuffer_cleanup(struct drm_framebuffer *fb,
- struct msm_gem_address_space *aspace, bool needed_dirtyfb);
+ struct msm_gem_vm *vm, bool needed_dirtyfb);
uint32_t msm_framebuffer_iova(struct drm_framebuffer *fb,
- struct msm_gem_address_space *aspace, int plane);
+ struct msm_gem_vm *vm, int plane);
struct drm_gem_object *msm_framebuffer_bo(struct drm_framebuffer *fb, int plane);
const struct msm_format *msm_framebuffer_format(struct drm_framebuffer *fb);
struct drm_framebuffer *msm_framebuffer_create(struct drm_device *dev,
diff --git a/drivers/gpu/drm/msm/msm_fb.c b/drivers/gpu/drm/msm/msm_fb.c
index 09268e416843..6df318b73534 100644
--- a/drivers/gpu/drm/msm/msm_fb.c
+++ b/drivers/gpu/drm/msm/msm_fb.c
@@ -76,7 +76,7 @@ void msm_framebuffer_describe(struct drm_framebuffer *fb, struct seq_file *m)
/* prepare/pin all the fb's bo's for scanout.
*/
int msm_framebuffer_prepare(struct drm_framebuffer *fb,
- struct msm_gem_address_space *aspace,
+ struct msm_gem_vm *vm,
bool needs_dirtyfb)
{
struct msm_framebuffer *msm_fb = to_msm_framebuffer(fb);
@@ -88,7 +88,7 @@ int msm_framebuffer_prepare(struct drm_framebuffer *fb,
atomic_inc(&msm_fb->prepare_count);
for (i = 0; i < n; i++) {
- ret = msm_gem_get_and_pin_iova(fb->obj[i], aspace, &msm_fb->iova[i]);
+ ret = msm_gem_get_and_pin_iova(fb->obj[i], vm, &msm_fb->iova[i]);
drm_dbg_state(fb->dev, "FB[%u]: iova[%d]: %08llx (%d)\n",
fb->base.id, i, msm_fb->iova[i], ret);
if (ret)
@@ -99,7 +99,7 @@ int msm_framebuffer_prepare(struct drm_framebuffer *fb,
}
void msm_framebuffer_cleanup(struct drm_framebuffer *fb,
- struct msm_gem_address_space *aspace,
+ struct msm_gem_vm *vm,
bool needed_dirtyfb)
{
struct msm_framebuffer *msm_fb = to_msm_framebuffer(fb);
@@ -109,14 +109,14 @@ void msm_framebuffer_cleanup(struct drm_framebuffer *fb,
refcount_dec(&msm_fb->dirtyfb);
for (i = 0; i < n; i++)
- msm_gem_unpin_iova(fb->obj[i], aspace);
+ msm_gem_unpin_iova(fb->obj[i], vm);
if (!atomic_dec_return(&msm_fb->prepare_count))
memset(msm_fb->iova, 0, sizeof(msm_fb->iova));
}
uint32_t msm_framebuffer_iova(struct drm_framebuffer *fb,
- struct msm_gem_address_space *aspace, int plane)
+ struct msm_gem_vm *vm, int plane)
{
struct msm_framebuffer *msm_fb = to_msm_framebuffer(fb);
return msm_fb->iova[plane] + fb->offsets[plane];
diff --git a/drivers/gpu/drm/msm/msm_fbdev.c b/drivers/gpu/drm/msm/msm_fbdev.c
index c62249b1ab3d..b5969374d53f 100644
--- a/drivers/gpu/drm/msm/msm_fbdev.c
+++ b/drivers/gpu/drm/msm/msm_fbdev.c
@@ -122,7 +122,7 @@ int msm_fbdev_driver_fbdev_probe(struct drm_fb_helper *helper,
* in panic (ie. lock-safe, etc) we could avoid pinning the
* buffer now:
*/
- ret = msm_gem_get_and_pin_iova(bo, priv->kms->aspace, &paddr);
+ ret = msm_gem_get_and_pin_iova(bo, priv->kms->vm, &paddr);
if (ret) {
DRM_DEV_ERROR(dev->dev, "failed to get buffer obj iova: %d\n", ret);
goto fail;
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index fdeb6cf7eeb5..07a30d29248c 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -402,14 +402,14 @@ uint64_t msm_gem_mmap_offset(struct drm_gem_object *obj)
}
static struct msm_gem_vma *add_vma(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace)
+ struct msm_gem_vm *vm)
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct msm_gem_vma *vma;
msm_gem_assert_locked(obj);
- vma = msm_gem_vma_new(aspace);
+ vma = msm_gem_vma_new(vm);
if (!vma)
return ERR_PTR(-ENOMEM);
@@ -419,7 +419,7 @@ static struct msm_gem_vma *add_vma(struct drm_gem_object *obj,
}
static struct msm_gem_vma *lookup_vma(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace)
+ struct msm_gem_vm *vm)
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct msm_gem_vma *vma;
@@ -427,7 +427,7 @@ static struct msm_gem_vma *lookup_vma(struct drm_gem_object *obj,
msm_gem_assert_locked(obj);
list_for_each_entry(vma, &msm_obj->vmas, list) {
- if (vma->aspace == aspace)
+ if (vma->vm == vm)
return vma;
}
@@ -458,7 +458,7 @@ put_iova_spaces(struct drm_gem_object *obj, bool close)
msm_gem_assert_locked(obj);
list_for_each_entry(vma, &msm_obj->vmas, list) {
- if (vma->aspace) {
+ if (vma->vm) {
msm_gem_vma_purge(vma);
if (close)
msm_gem_vma_close(vma);
@@ -481,19 +481,19 @@ put_iova_vmas(struct drm_gem_object *obj)
}
static struct msm_gem_vma *get_vma_locked(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace,
+ struct msm_gem_vm *vm,
u64 range_start, u64 range_end)
{
struct msm_gem_vma *vma;
msm_gem_assert_locked(obj);
- vma = lookup_vma(obj, aspace);
+ vma = lookup_vma(obj, vm);
if (!vma) {
int ret;
- vma = add_vma(obj, aspace);
+ vma = add_vma(obj, vm);
if (IS_ERR(vma))
return vma;
@@ -569,13 +569,13 @@ void msm_gem_unpin_active(struct drm_gem_object *obj)
}
struct msm_gem_vma *msm_gem_get_vma_locked(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace)
+ struct msm_gem_vm *vm)
{
- return get_vma_locked(obj, aspace, 0, U64_MAX);
+ return get_vma_locked(obj, vm, 0, U64_MAX);
}
static int get_and_pin_iova_range_locked(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t *iova,
+ struct msm_gem_vm *vm, uint64_t *iova,
u64 range_start, u64 range_end)
{
struct msm_gem_vma *vma;
@@ -583,7 +583,7 @@ static int get_and_pin_iova_range_locked(struct drm_gem_object *obj,
msm_gem_assert_locked(obj);
- vma = get_vma_locked(obj, aspace, range_start, range_end);
+ vma = get_vma_locked(obj, vm, range_start, range_end);
if (IS_ERR(vma))
return PTR_ERR(vma);
@@ -601,13 +601,13 @@ static int get_and_pin_iova_range_locked(struct drm_gem_object *obj,
* limits iova to specified range (in pages)
*/
int msm_gem_get_and_pin_iova_range(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t *iova,
+ struct msm_gem_vm *vm, uint64_t *iova,
u64 range_start, u64 range_end)
{
int ret;
msm_gem_lock(obj);
- ret = get_and_pin_iova_range_locked(obj, aspace, iova, range_start, range_end);
+ ret = get_and_pin_iova_range_locked(obj, vm, iova, range_start, range_end);
msm_gem_unlock(obj);
return ret;
@@ -615,9 +615,9 @@ int msm_gem_get_and_pin_iova_range(struct drm_gem_object *obj,
/* get iova and pin it. Should have a matching put */
int msm_gem_get_and_pin_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t *iova)
+ struct msm_gem_vm *vm, uint64_t *iova)
{
- return msm_gem_get_and_pin_iova_range(obj, aspace, iova, 0, U64_MAX);
+ return msm_gem_get_and_pin_iova_range(obj, vm, iova, 0, U64_MAX);
}
/*
@@ -625,13 +625,13 @@ int msm_gem_get_and_pin_iova(struct drm_gem_object *obj,
* valid for the life of the object
*/
int msm_gem_get_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t *iova)
+ struct msm_gem_vm *vm, uint64_t *iova)
{
struct msm_gem_vma *vma;
int ret = 0;
msm_gem_lock(obj);
- vma = get_vma_locked(obj, aspace, 0, U64_MAX);
+ vma = get_vma_locked(obj, vm, 0, U64_MAX);
if (IS_ERR(vma)) {
ret = PTR_ERR(vma);
} else {
@@ -643,9 +643,9 @@ int msm_gem_get_iova(struct drm_gem_object *obj,
}
static int clear_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace)
+ struct msm_gem_vm *vm)
{
- struct msm_gem_vma *vma = lookup_vma(obj, aspace);
+ struct msm_gem_vma *vma = lookup_vma(obj, vm);
if (!vma)
return 0;
@@ -665,20 +665,20 @@ static int clear_iova(struct drm_gem_object *obj,
* Setting an iova of zero will clear the vma.
*/
int msm_gem_set_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t iova)
+ struct msm_gem_vm *vm, uint64_t iova)
{
int ret = 0;
msm_gem_lock(obj);
if (!iova) {
- ret = clear_iova(obj, aspace);
+ ret = clear_iova(obj, vm);
} else {
struct msm_gem_vma *vma;
- vma = get_vma_locked(obj, aspace, iova, iova + obj->size);
+ vma = get_vma_locked(obj, vm, iova, iova + obj->size);
if (IS_ERR(vma)) {
ret = PTR_ERR(vma);
} else if (GEM_WARN_ON(vma->iova != iova)) {
- clear_iova(obj, aspace);
+ clear_iova(obj, vm);
ret = -EBUSY;
}
}
@@ -693,12 +693,12 @@ int msm_gem_set_iova(struct drm_gem_object *obj,
* to get rid of it
*/
void msm_gem_unpin_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace)
+ struct msm_gem_vm *vm)
{
struct msm_gem_vma *vma;
msm_gem_lock(obj);
- vma = lookup_vma(obj, aspace);
+ vma = lookup_vma(obj, vm);
if (!GEM_WARN_ON(!vma)) {
msm_gem_unpin_locked(obj);
}
@@ -1016,23 +1016,23 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
list_for_each_entry(vma, &msm_obj->vmas, list) {
const char *name, *comm;
- if (vma->aspace) {
- struct msm_gem_address_space *aspace = vma->aspace;
+ if (vma->vm) {
+ struct msm_gem_vm *vm = vma->vm;
struct task_struct *task =
- get_pid_task(aspace->pid, PIDTYPE_PID);
+ get_pid_task(vm->pid, PIDTYPE_PID);
if (task) {
comm = kstrdup(task->comm, GFP_KERNEL);
put_task_struct(task);
} else {
comm = NULL;
}
- name = aspace->name;
+ name = vm->name;
} else {
name = comm = NULL;
}
- seq_printf(m, " [%s%s%s: aspace=%p, %08llx,%s]",
+ seq_printf(m, " [%s%s%s: vm=%p, %08llx,%s]",
name, comm ? ":" : "", comm ? comm : "",
- vma->aspace, vma->iova,
+ vma->vm, vma->iova,
vma->mapped ? "mapped" : "unmapped");
kfree(comm);
}
@@ -1357,7 +1357,7 @@ struct drm_gem_object *msm_gem_import(struct drm_device *dev,
}
void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
- uint32_t flags, struct msm_gem_address_space *aspace,
+ uint32_t flags, struct msm_gem_vm *vm,
struct drm_gem_object **bo, uint64_t *iova)
{
void *vaddr;
@@ -1368,14 +1368,14 @@ void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
return ERR_CAST(obj);
if (iova) {
- ret = msm_gem_get_and_pin_iova(obj, aspace, iova);
+ ret = msm_gem_get_and_pin_iova(obj, vm, iova);
if (ret)
goto err;
}
vaddr = msm_gem_get_vaddr(obj);
if (IS_ERR(vaddr)) {
- msm_gem_unpin_iova(obj, aspace);
+ msm_gem_unpin_iova(obj, vm);
ret = PTR_ERR(vaddr);
goto err;
}
@@ -1392,13 +1392,13 @@ void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
}
void msm_gem_kernel_put(struct drm_gem_object *bo,
- struct msm_gem_address_space *aspace)
+ struct msm_gem_vm *vm)
{
if (IS_ERR_OR_NULL(bo))
return;
msm_gem_put_vaddr(bo);
- msm_gem_unpin_iova(bo, aspace);
+ msm_gem_unpin_iova(bo, vm);
drm_gem_object_put(bo);
}
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 85f0257e83da..d2f39a371373 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -22,7 +22,7 @@
#define MSM_BO_STOLEN 0x10000000 /* try to use stolen/splash memory */
#define MSM_BO_MAP_PRIV 0x20000000 /* use IOMMU_PRIV when mapping */
-struct msm_gem_address_space {
+struct msm_gem_vm {
const char *name;
/* NOTE: mm managed at the page level, size is in # of pages
* and position mm_node->start is in # of pages:
@@ -47,13 +47,13 @@ struct msm_gem_address_space {
uint64_t va_size;
};
-struct msm_gem_address_space *
-msm_gem_address_space_get(struct msm_gem_address_space *aspace);
+struct msm_gem_vm *
+msm_gem_vm_get(struct msm_gem_vm *vm);
-void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
+void msm_gem_vm_put(struct msm_gem_vm *vm);
-struct msm_gem_address_space *
-msm_gem_address_space_create(struct msm_mmu *mmu, const char *name,
+struct msm_gem_vm *
+msm_gem_vm_create(struct msm_mmu *mmu, const char *name,
u64 va_start, u64 size);
struct msm_fence_context;
@@ -61,12 +61,12 @@ struct msm_fence_context;
struct msm_gem_vma {
struct drm_mm_node node;
uint64_t iova;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
struct list_head list; /* node in msm_gem_object::vmas */
bool mapped;
};
-struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_address_space *aspace);
+struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_vm *vm);
int msm_gem_vma_init(struct msm_gem_vma *vma, int size,
u64 range_start, u64 range_end);
void msm_gem_vma_purge(struct msm_gem_vma *vma);
@@ -127,18 +127,18 @@ int msm_gem_pin_vma_locked(struct drm_gem_object *obj, struct msm_gem_vma *vma);
void msm_gem_unpin_locked(struct drm_gem_object *obj);
void msm_gem_unpin_active(struct drm_gem_object *obj);
struct msm_gem_vma *msm_gem_get_vma_locked(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace);
+ struct msm_gem_vm *vm);
int msm_gem_get_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t *iova);
+ struct msm_gem_vm *vm, uint64_t *iova);
int msm_gem_set_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t iova);
+ struct msm_gem_vm *vm, uint64_t iova);
int msm_gem_get_and_pin_iova_range(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t *iova,
+ struct msm_gem_vm *vm, uint64_t *iova,
u64 range_start, u64 range_end);
int msm_gem_get_and_pin_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace, uint64_t *iova);
+ struct msm_gem_vm *vm, uint64_t *iova);
void msm_gem_unpin_iova(struct drm_gem_object *obj,
- struct msm_gem_address_space *aspace);
+ struct msm_gem_vm *vm);
void msm_gem_pin_obj_locked(struct drm_gem_object *obj);
struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj);
void msm_gem_unpin_pages_locked(struct drm_gem_object *obj);
@@ -160,10 +160,10 @@ int msm_gem_new_handle(struct drm_device *dev, struct drm_file *file,
struct drm_gem_object *msm_gem_new(struct drm_device *dev,
uint32_t size, uint32_t flags);
void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
- uint32_t flags, struct msm_gem_address_space *aspace,
+ uint32_t flags, struct msm_gem_vm *vm,
struct drm_gem_object **bo, uint64_t *iova);
void msm_gem_kernel_put(struct drm_gem_object *bo,
- struct msm_gem_address_space *aspace);
+ struct msm_gem_vm *vm);
struct drm_gem_object *msm_gem_import(struct drm_device *dev,
struct dma_buf *dmabuf, struct sg_table *sgt);
__printf(2, 3)
@@ -257,7 +257,7 @@ struct msm_gem_submit {
struct kref ref;
struct drm_device *dev;
struct msm_gpu *gpu;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
struct list_head node; /* node in ring submit list */
struct drm_exec exec;
uint32_t seqno; /* Sequence number of the submit on the ring */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 3aabf7f1da6d..a59816b6b6de 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -63,7 +63,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
kref_init(&submit->ref);
submit->dev = dev;
- submit->aspace = queue->ctx->aspace;
+ submit->vm = queue->ctx->vm;
submit->gpu = gpu;
submit->cmd = (void *)&submit->bos[nr_bos];
submit->queue = queue;
@@ -311,7 +311,7 @@ static int submit_pin_objects(struct msm_gem_submit *submit)
struct msm_gem_vma *vma;
/* if locking succeeded, pin bo: */
- vma = msm_gem_get_vma_locked(obj, submit->aspace);
+ vma = msm_gem_get_vma_locked(obj, submit->vm);
if (IS_ERR(vma)) {
ret = PTR_ERR(vma);
break;
@@ -669,7 +669,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
if (args->pad)
return -EINVAL;
- if (unlikely(!ctx->aspace) && !capable(CAP_SYS_RAWIO)) {
+ if (unlikely(!ctx->vm) && !capable(CAP_SYS_RAWIO)) {
DRM_ERROR_RATELIMITED("IOMMU support or CAP_SYS_RAWIO required!\n");
return -EPERM;
}
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index 11e842dda73c..9419692f0cc8 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -10,45 +10,44 @@
#include "msm_mmu.h"
static void
-msm_gem_address_space_destroy(struct kref *kref)
+msm_gem_vm_destroy(struct kref *kref)
{
- struct msm_gem_address_space *aspace = container_of(kref,
- struct msm_gem_address_space, kref);
-
- drm_mm_takedown(&aspace->mm);
- if (aspace->mmu)
- aspace->mmu->funcs->destroy(aspace->mmu);
- put_pid(aspace->pid);
- kfree(aspace);
+ struct msm_gem_vm *vm = container_of(kref, struct msm_gem_vm, kref);
+
+ drm_mm_takedown(&vm->mm);
+ if (vm->mmu)
+ vm->mmu->funcs->destroy(vm->mmu);
+ put_pid(vm->pid);
+ kfree(vm);
}
-void msm_gem_address_space_put(struct msm_gem_address_space *aspace)
+void msm_gem_vm_put(struct msm_gem_vm *vm)
{
- if (aspace)
- kref_put(&aspace->kref, msm_gem_address_space_destroy);
+ if (vm)
+ kref_put(&vm->kref, msm_gem_vm_destroy);
}
-struct msm_gem_address_space *
-msm_gem_address_space_get(struct msm_gem_address_space *aspace)
+struct msm_gem_vm *
+msm_gem_vm_get(struct msm_gem_vm *vm)
{
- if (!IS_ERR_OR_NULL(aspace))
- kref_get(&aspace->kref);
+ if (!IS_ERR_OR_NULL(vm))
+ kref_get(&vm->kref);
- return aspace;
+ return vm;
}
/* Actually unmap memory for the vma */
void msm_gem_vma_purge(struct msm_gem_vma *vma)
{
- struct msm_gem_address_space *aspace = vma->aspace;
+ struct msm_gem_vm *vm = vma->vm;
unsigned size = vma->node.size;
/* Don't do anything if the memory isn't mapped */
if (!vma->mapped)
return;
- aspace->mmu->funcs->unmap(aspace->mmu, vma->iova, size);
+ vm->mmu->funcs->unmap(vm->mmu, vma->iova, size);
vma->mapped = false;
}
@@ -58,7 +57,7 @@ int
msm_gem_vma_map(struct msm_gem_vma *vma, int prot,
struct sg_table *sgt, int size)
{
- struct msm_gem_address_space *aspace = vma->aspace;
+ struct msm_gem_vm *vm = vma->vm;
int ret;
if (GEM_WARN_ON(!vma->iova))
@@ -69,7 +68,7 @@ msm_gem_vma_map(struct msm_gem_vma *vma, int prot,
vma->mapped = true;
- if (!aspace)
+ if (!vm)
return 0;
/*
@@ -81,7 +80,7 @@ msm_gem_vma_map(struct msm_gem_vma *vma, int prot,
* Revisit this if we can come up with a scheme to pre-alloc pages
* for the pgtable in map/unmap ops.
*/
- ret = aspace->mmu->funcs->map(aspace->mmu, vma->iova, sgt, size, prot);
+ ret = vm->mmu->funcs->map(vm->mmu, vma->iova, sgt, size, prot);
if (ret) {
vma->mapped = false;
@@ -93,21 +92,21 @@ msm_gem_vma_map(struct msm_gem_vma *vma, int prot,
/* Close an iova. Warn if it is still in use */
void msm_gem_vma_close(struct msm_gem_vma *vma)
{
- struct msm_gem_address_space *aspace = vma->aspace;
+ struct msm_gem_vm *vm = vma->vm;
GEM_WARN_ON(vma->mapped);
- spin_lock(&aspace->lock);
+ spin_lock(&vm->lock);
if (vma->iova)
drm_mm_remove_node(&vma->node);
- spin_unlock(&aspace->lock);
+ spin_unlock(&vm->lock);
vma->iova = 0;
- msm_gem_address_space_put(aspace);
+ msm_gem_vm_put(vm);
}
-struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_address_space *aspace)
+struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_vm *vm)
{
struct msm_gem_vma *vma;
@@ -115,7 +114,7 @@ struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_address_space *aspace)
if (!vma)
return NULL;
- vma->aspace = aspace;
+ vma->vm = vm;
return vma;
}
@@ -124,20 +123,20 @@ struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_address_space *aspace)
int msm_gem_vma_init(struct msm_gem_vma *vma, int size,
u64 range_start, u64 range_end)
{
- struct msm_gem_address_space *aspace = vma->aspace;
+ struct msm_gem_vm *vm = vma->vm;
int ret;
- if (GEM_WARN_ON(!aspace))
+ if (GEM_WARN_ON(!vm))
return -EINVAL;
if (GEM_WARN_ON(vma->iova))
return -EBUSY;
- spin_lock(&aspace->lock);
- ret = drm_mm_insert_node_in_range(&aspace->mm, &vma->node,
+ spin_lock(&vm->lock);
+ ret = drm_mm_insert_node_in_range(&vm->mm, &vma->node,
size, PAGE_SIZE, 0,
range_start, range_end, 0);
- spin_unlock(&aspace->lock);
+ spin_unlock(&vm->lock);
if (ret)
return ret;
@@ -145,33 +144,33 @@ int msm_gem_vma_init(struct msm_gem_vma *vma, int size,
vma->iova = vma->node.start;
vma->mapped = false;
- kref_get(&aspace->kref);
+ kref_get(&vm->kref);
return 0;
}
-struct msm_gem_address_space *
-msm_gem_address_space_create(struct msm_mmu *mmu, const char *name,
+struct msm_gem_vm *
+msm_gem_vm_create(struct msm_mmu *mmu, const char *name,
u64 va_start, u64 size)
{
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
if (IS_ERR(mmu))
return ERR_CAST(mmu);
- aspace = kzalloc(sizeof(*aspace), GFP_KERNEL);
- if (!aspace)
+ vm = kzalloc(sizeof(*vm), GFP_KERNEL);
+ if (!vm)
return ERR_PTR(-ENOMEM);
- spin_lock_init(&aspace->lock);
- aspace->name = name;
- aspace->mmu = mmu;
- aspace->va_start = va_start;
- aspace->va_size = size;
+ spin_lock_init(&vm->lock);
+ vm->name = name;
+ vm->mmu = mmu;
+ vm->va_start = va_start;
+ vm->va_size = size;
- drm_mm_init(&aspace->mm, va_start, size);
+ drm_mm_init(&vm->mm, va_start, size);
- kref_init(&aspace->kref);
+ kref_init(&vm->kref);
- return aspace;
+ return vm;
}
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index d786fcfad62f..0d466a2e9b32 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -283,7 +283,7 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
if (state->fault_info.ttbr0) {
struct msm_gpu_fault_info *info = &state->fault_info;
- struct msm_mmu *mmu = submit->aspace->mmu;
+ struct msm_mmu *mmu = submit->vm->mmu;
msm_iommu_pagetable_params(mmu, &info->pgtbl_ttbr0,
&info->asid);
@@ -386,8 +386,8 @@ static void recover_worker(struct kthread_work *work)
/* Increment the fault counts */
submit->queue->faults++;
- if (submit->aspace)
- submit->aspace->faults++;
+ if (submit->vm)
+ submit->vm->faults++;
get_comm_cmdline(submit, &comm, &cmd);
@@ -492,7 +492,7 @@ static void fault_worker(struct kthread_work *work)
resume_smmu:
memset(&gpu->fault_info, 0, sizeof(gpu->fault_info));
- gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu);
+ gpu->vm->mmu->funcs->resume_translation(gpu->vm->mmu);
mutex_unlock(&gpu->lock);
}
@@ -829,10 +829,10 @@ static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
}
/* Return a new address space for a msm_drm_private instance */
-struct msm_gem_address_space *
-msm_gpu_create_private_address_space(struct msm_gpu *gpu, struct task_struct *task)
+struct msm_gem_vm *
+msm_gpu_create_private_vm(struct msm_gpu *gpu, struct task_struct *task)
{
- struct msm_gem_address_space *aspace = NULL;
+ struct msm_gem_vm *vm = NULL;
if (!gpu)
return NULL;
@@ -840,16 +840,16 @@ msm_gpu_create_private_address_space(struct msm_gpu *gpu, struct task_struct *ta
* If the target doesn't support private address spaces then return
* the global one
*/
- if (gpu->funcs->create_private_address_space) {
- aspace = gpu->funcs->create_private_address_space(gpu);
- if (!IS_ERR(aspace))
- aspace->pid = get_pid(task_pid(task));
+ if (gpu->funcs->create_private_vm) {
+ vm = gpu->funcs->create_private_vm(gpu);
+ if (!IS_ERR(vm))
+ vm->pid = get_pid(task_pid(task));
}
- if (IS_ERR_OR_NULL(aspace))
- aspace = msm_gem_address_space_get(gpu->aspace);
+ if (IS_ERR_OR_NULL(vm))
+ vm = msm_gem_vm_get(gpu->vm);
- return aspace;
+ return vm;
}
int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
@@ -945,18 +945,18 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
msm_devfreq_init(gpu);
- gpu->aspace = gpu->funcs->create_address_space(gpu, pdev);
+ gpu->vm = gpu->funcs->create_vm(gpu, pdev);
- if (gpu->aspace == NULL)
+ if (gpu->vm == NULL)
DRM_DEV_INFO(drm->dev, "%s: no IOMMU, fallback to VRAM carveout!\n", name);
- else if (IS_ERR(gpu->aspace)) {
- ret = PTR_ERR(gpu->aspace);
+ else if (IS_ERR(gpu->vm)) {
+ ret = PTR_ERR(gpu->vm);
goto fail;
}
memptrs = msm_gem_kernel_new(drm,
sizeof(struct msm_rbmemptrs) * nr_rings,
- check_apriv(gpu, MSM_BO_WC), gpu->aspace, &gpu->memptrs_bo,
+ check_apriv(gpu, MSM_BO_WC), gpu->vm, &gpu->memptrs_bo,
&memptrs_iova);
if (IS_ERR(memptrs)) {
@@ -1000,7 +1000,7 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
gpu->rb[i] = NULL;
}
- msm_gem_kernel_put(gpu->memptrs_bo, gpu->aspace);
+ msm_gem_kernel_put(gpu->memptrs_bo, gpu->vm);
platform_set_drvdata(pdev, NULL);
return ret;
@@ -1017,11 +1017,11 @@ void msm_gpu_cleanup(struct msm_gpu *gpu)
gpu->rb[i] = NULL;
}
- msm_gem_kernel_put(gpu->memptrs_bo, gpu->aspace);
+ msm_gem_kernel_put(gpu->memptrs_bo, gpu->vm);
- if (!IS_ERR_OR_NULL(gpu->aspace)) {
- gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu);
- msm_gem_address_space_put(gpu->aspace);
+ if (!IS_ERR_OR_NULL(gpu->vm)) {
+ gpu->vm->mmu->funcs->detach(gpu->vm->mmu);
+ msm_gem_vm_put(gpu->vm);
}
if (gpu->worker) {
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index c699ce0c557b..1f26ba00f773 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -78,10 +78,8 @@ struct msm_gpu_funcs {
/* note: gpu_set_freq() can assume that we have been pm_resumed */
void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp,
bool suspended);
- struct msm_gem_address_space *(*create_address_space)
- (struct msm_gpu *gpu, struct platform_device *pdev);
- struct msm_gem_address_space *(*create_private_address_space)
- (struct msm_gpu *gpu);
+ struct msm_gem_vm *(*create_vm)(struct msm_gpu *gpu, struct platform_device *pdev);
+ struct msm_gem_vm *(*create_private_vm)(struct msm_gpu *gpu);
uint32_t (*get_rptr)(struct msm_gpu *gpu, struct msm_ringbuffer *ring);
/**
@@ -236,7 +234,7 @@ struct msm_gpu {
void __iomem *mmio;
int irq;
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
/* Power Control: */
struct regulator *gpu_reg, *gpu_cx;
@@ -364,8 +362,8 @@ struct msm_context {
*/
int queueid;
- /** @aspace: the per-process GPU address-space */
- struct msm_gem_address_space *aspace;
+ /** @vm: the per-process GPU address-space */
+ struct msm_gem_vm *vm;
/** @kref: the reference count */
struct kref ref;
@@ -675,8 +673,8 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
const char *name, struct msm_gpu_config *config);
-struct msm_gem_address_space *
-msm_gpu_create_private_address_space(struct msm_gpu *gpu, struct task_struct *task);
+struct msm_gem_vm *
+msm_gpu_create_private_vm(struct msm_gpu *gpu, struct task_struct *task);
void msm_gpu_cleanup(struct msm_gpu *gpu);
diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
index 35d5397e73b4..88504c4b842f 100644
--- a/drivers/gpu/drm/msm/msm_kms.c
+++ b/drivers/gpu/drm/msm/msm_kms.c
@@ -176,9 +176,9 @@ static int msm_kms_fault_handler(void *arg, unsigned long iova, int flags, void
return -ENOSYS;
}
-struct msm_gem_address_space *msm_kms_init_aspace(struct drm_device *dev)
+struct msm_gem_vm *msm_kms_init_vm(struct drm_device *dev)
{
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
struct msm_mmu *mmu;
struct device *mdp_dev = dev->dev;
struct device *mdss_dev = mdp_dev->parent;
@@ -204,17 +204,17 @@ struct msm_gem_address_space *msm_kms_init_aspace(struct drm_device *dev)
return NULL;
}
- aspace = msm_gem_address_space_create(mmu, "mdp_kms",
+ vm = msm_gem_vm_create(mmu, "mdp_kms",
0x1000, 0x100000000 - 0x1000);
- if (IS_ERR(aspace)) {
- dev_err(mdp_dev, "aspace create, error %pe\n", aspace);
+ if (IS_ERR(vm)) {
+ dev_err(mdp_dev, "vm create, error %pe\n", vm);
mmu->funcs->destroy(mmu);
- return aspace;
+ return vm;
}
- msm_mmu_set_fault_handler(aspace->mmu, kms, msm_kms_fault_handler);
+ msm_mmu_set_fault_handler(vm->mmu, kms, msm_kms_fault_handler);
- return aspace;
+ return vm;
}
void msm_drm_kms_uninit(struct device *dev)
diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index 43b58d052ee6..f45996a03e15 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -139,7 +139,7 @@ struct msm_kms {
atomic_t fault_snapshot_capture;
/* mapper-id used to request GEM buffer mapped for scanout: */
- struct msm_gem_address_space *aspace;
+ struct msm_gem_vm *vm;
/* disp snapshot support */
struct kthread_worker *dump_worker;
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index c5651c39ac2a..bbf8503f6bb5 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -84,7 +84,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
ring->start = msm_gem_kernel_new(gpu->dev, MSM_GPU_RINGBUFFER_SZ,
check_apriv(gpu, MSM_BO_WC | MSM_BO_GPU_READONLY),
- gpu->aspace, &ring->bo, &ring->iova);
+ gpu->vm, &ring->bo, &ring->iova);
if (IS_ERR(ring->start)) {
ret = PTR_ERR(ring->start);
@@ -131,7 +131,7 @@ void msm_ringbuffer_destroy(struct msm_ringbuffer *ring)
msm_fence_context_free(ring->fctx);
- msm_gem_kernel_put(ring->bo, ring->gpu->aspace);
+ msm_gem_kernel_put(ring->bo, ring->gpu->vm);
kfree(ring);
}
diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
index 1acc0fe36353..6298233c3568 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
@@ -59,7 +59,7 @@ void __msm_context_destroy(struct kref *kref)
kfree(ctx->entities[i]);
}
- msm_gem_address_space_put(ctx->aspace);
+ msm_gem_vm_put(ctx->vm);
kfree(ctx->comm);
kfree(ctx->cmdline);
kfree(ctx);
--
2.49.0
* [PATCH v4 09/40] drm/msm: Remove vram carveout support
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (7 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 08/40] drm/msm: Rename msm_gem_address_space -> msm_gem_vm Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 10/40] drm/msm: Collapse vma allocation and initialization Rob Clark
` (2 subsequent siblings)
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Rob Clark,
Sean Paul, Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov,
Marijn Suijten, David Airlie, Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
It is standing in the way of drm_gpuvm / VM_BIND support. Not to
mention it is frequently broken and rarely tested, and I think it is
only needed for a 10 year old, not-quite-upstream SoC (msm8974).
Maybe we can add support back in later, but I'm doubtful.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 8 --
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 15 ---
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 15 ---
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 3 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 3 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 4 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 1 -
drivers/gpu/drm/msm/msm_drv.c | 117 +-----------------
drivers/gpu/drm/msm/msm_drv.h | 11 --
drivers/gpu/drm/msm/msm_gem.c | 131 ++-------------------
drivers/gpu/drm/msm/msm_gem.h | 5 -
drivers/gpu/drm/msm/msm_gem_submit.c | 5 -
drivers/gpu/drm/msm/msm_gpu.c | 6 +-
14 files changed, 19 insertions(+), 309 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
index 5eb063ed0b46..095bae92e3e8 100644
--- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
@@ -551,14 +551,6 @@ struct msm_gpu *a2xx_gpu_init(struct drm_device *dev)
else
adreno_gpu->registers = a220_registers;
- if (!gpu->vm) {
- dev_err(dev->dev, "No memory protection without MMU\n");
- if (!allow_vram_carveout) {
- ret = -ENXIO;
- goto fail;
- }
- }
-
return gpu;
fail:
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 434e6ededf83..a956cd79195e 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -581,21 +581,6 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev)
goto fail;
}
- if (!gpu->vm) {
- /* TODO we think it is possible to configure the GPU to
- * restrict access to VRAM carveout. But the required
- * registers are unknown. For now just bail out and
- * limp along with just modesetting. If it turns out
- * to not be possible to restrict access, then we must
- * implement a cmdstream validator.
- */
- DRM_DEV_ERROR(dev->dev, "No memory protection without IOMMU\n");
- if (!allow_vram_carveout) {
- ret = -ENXIO;
- goto fail;
- }
- }
-
icc_path = devm_of_icc_get(&pdev->dev, "gfx-mem");
if (IS_ERR(icc_path)) {
ret = PTR_ERR(icc_path);
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 2c75debcfd84..83f6329accba 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -695,21 +695,6 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev)
adreno_gpu->uche_trap_base = 0xffff0000ffff0000ull;
- if (!gpu->vm) {
- /* TODO we think it is possible to configure the GPU to
- * restrict access to VRAM carveout. But the required
- * registers are unknown. For now just bail out and
- * limp along with just modesetting. If it turns out
- * to not be possible to restrict access, then we must
- * implement a cmdstream validator.
- */
- DRM_DEV_ERROR(dev->dev, "No memory protection without IOMMU\n");
- if (!allow_vram_carveout) {
- ret = -ENXIO;
- goto fail;
- }
- }
-
icc_path = devm_of_icc_get(&pdev->dev, "gfx-mem");
if (IS_ERR(icc_path)) {
ret = PTR_ERR(icc_path);
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index cce95ad3cfb8..913e4fdfca21 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1786,8 +1786,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
return ERR_PTR(ret);
}
- if (gpu->vm)
- msm_mmu_set_fault_handler(gpu->vm->mmu, gpu, a5xx_fault_handler);
+ msm_mmu_set_fault_handler(gpu->vm->mmu, gpu, a5xx_fault_handler);
/* Set up the preemption specific bits and pieces for each ringbuffer */
a5xx_preempt_init(gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 3c92ea35d39a..c119493c13aa 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2547,8 +2547,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
adreno_gpu->uche_trap_base = 0x1fffffffff000ull;
- if (gpu->vm)
- msm_mmu_set_fault_handler(gpu->vm->mmu, gpu, a6xx_fault_handler);
+ msm_mmu_set_fault_handler(gpu->vm->mmu, gpu, a6xx_fault_handler);
a6xx_calc_ubwc_config(adreno_gpu);
/* Set up the preemption specific bits and pieces for each ringbuffer */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index f4552b8c6767..6b0390c38bff 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -16,10 +16,6 @@ bool snapshot_debugbus = false;
MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)");
module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600);
-bool allow_vram_carveout = false;
-MODULE_PARM_DESC(allow_vram_carveout, "Allow using VRAM Carveout, in place of IOMMU");
-module_param_named(allow_vram_carveout, allow_vram_carveout, bool, 0600);
-
int enable_preemption = -1;
MODULE_PARM_DESC(enable_preemption, "Enable preemption (A7xx only) (1=on , 0=disable, -1=auto (default))");
module_param(enable_preemption, int, 0600);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index b13aaebd8da7..a2e39283360f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -209,7 +209,9 @@ adreno_iommu_create_vm(struct msm_gpu *gpu,
u64 start, size;
mmu = msm_iommu_gpu_new(&pdev->dev, gpu, quirks);
- if (IS_ERR_OR_NULL(mmu))
+ if (!mmu)
+ return ERR_PTR(-ENODEV);
+ else if (IS_ERR_OR_NULL(mmu))
return ERR_CAST(mmu);
geometry = msm_iommu_get_geometry(mmu);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 258c5c6dde2e..bbd7e664286e 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -18,7 +18,6 @@
#include "adreno_pm4.xml.h"
extern bool snapshot_debugbus;
-extern bool allow_vram_carveout;
enum {
ADRENO_FW_PM4 = 0,
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 903abf3532e0..978f1d355b42 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -46,12 +46,6 @@
#define MSM_VERSION_MINOR 12
#define MSM_VERSION_PATCHLEVEL 0
-static void msm_deinit_vram(struct drm_device *ddev);
-
-static char *vram = "16m";
-MODULE_PARM_DESC(vram, "Configure VRAM size (for devices without IOMMU/GPUMMU)");
-module_param(vram, charp, 0);
-
bool dumpstate;
MODULE_PARM_DESC(dumpstate, "Dump KMS state on errors");
module_param(dumpstate, bool, 0600);
@@ -97,8 +91,6 @@ static int msm_drm_uninit(struct device *dev)
if (priv->kms)
msm_drm_kms_uninit(dev);
- msm_deinit_vram(ddev);
-
component_unbind_all(dev, ddev);
ddev->dev_private = NULL;
@@ -109,107 +101,6 @@ static int msm_drm_uninit(struct device *dev)
return 0;
}
-bool msm_use_mmu(struct drm_device *dev)
-{
- struct msm_drm_private *priv = dev->dev_private;
-
- /*
- * a2xx comes with its own MMU
- * On other platforms IOMMU can be declared specified either for the
- * MDP/DPU device or for its parent, MDSS device.
- */
- return priv->is_a2xx ||
- device_iommu_mapped(dev->dev) ||
- device_iommu_mapped(dev->dev->parent);
-}
-
-static int msm_init_vram(struct drm_device *dev)
-{
- struct msm_drm_private *priv = dev->dev_private;
- struct device_node *node;
- unsigned long size = 0;
- int ret = 0;
-
- /* In the device-tree world, we could have a 'memory-region'
- * phandle, which gives us a link to our "vram". Allocating
- * is all nicely abstracted behind the dma api, but we need
- * to know the entire size to allocate it all in one go. There
- * are two cases:
- * 1) device with no IOMMU, in which case we need exclusive
- * access to a VRAM carveout big enough for all gpu
- * buffers
- * 2) device with IOMMU, but where the bootloader puts up
- * a splash screen. In this case, the VRAM carveout
- * need only be large enough for fbdev fb. But we need
- * exclusive access to the buffer to avoid the kernel
- * using those pages for other purposes (which appears
- * as corruption on screen before we have a chance to
- * load and do initial modeset)
- */
-
- node = of_parse_phandle(dev->dev->of_node, "memory-region", 0);
- if (node) {
- struct resource r;
- ret = of_address_to_resource(node, 0, &r);
- of_node_put(node);
- if (ret)
- return ret;
- size = r.end - r.start + 1;
- DRM_INFO("using VRAM carveout: %lx@%pa\n", size, &r.start);
-
- /* if we have no IOMMU, then we need to use carveout allocator.
- * Grab the entire DMA chunk carved out in early startup in
- * mach-msm:
- */
- } else if (!msm_use_mmu(dev)) {
- DRM_INFO("using %s VRAM carveout\n", vram);
- size = memparse(vram, NULL);
- }
-
- if (size) {
- unsigned long attrs = 0;
- void *p;
-
- priv->vram.size = size;
-
- drm_mm_init(&priv->vram.mm, 0, (size >> PAGE_SHIFT) - 1);
- spin_lock_init(&priv->vram.lock);
-
- attrs |= DMA_ATTR_NO_KERNEL_MAPPING;
- attrs |= DMA_ATTR_WRITE_COMBINE;
-
- /* note that for no-kernel-mapping, the vaddr returned
- * is bogus, but non-null if allocation succeeded:
- */
- p = dma_alloc_attrs(dev->dev, size,
- &priv->vram.paddr, GFP_KERNEL, attrs);
- if (!p) {
- DRM_DEV_ERROR(dev->dev, "failed to allocate VRAM\n");
- priv->vram.paddr = 0;
- return -ENOMEM;
- }
-
- DRM_DEV_INFO(dev->dev, "VRAM: %08x->%08x\n",
- (uint32_t)priv->vram.paddr,
- (uint32_t)(priv->vram.paddr + size));
- }
-
- return ret;
-}
-
-static void msm_deinit_vram(struct drm_device *ddev)
-{
- struct msm_drm_private *priv = ddev->dev_private;
- unsigned long attrs = DMA_ATTR_NO_KERNEL_MAPPING;
-
- if (!priv->vram.paddr)
- return;
-
- drm_mm_takedown(&priv->vram.mm);
- dma_free_attrs(ddev->dev, priv->vram.size, NULL, priv->vram.paddr,
- attrs);
-}
-
static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
{
struct msm_drm_private *priv = dev_get_drvdata(dev);
@@ -256,16 +147,12 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
goto err_destroy_wq;
}
- ret = msm_init_vram(ddev);
- if (ret)
- goto err_destroy_wq;
-
dma_set_max_seg_size(dev, UINT_MAX);
/* Bind all our sub-components: */
ret = component_bind_all(dev, ddev);
if (ret)
- goto err_deinit_vram;
+ goto err_destroy_wq;
ret = msm_gem_shrinker_init(ddev);
if (ret)
@@ -302,8 +189,6 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
return ret;
-err_deinit_vram:
- msm_deinit_vram(ddev);
err_destroy_wq:
destroy_workqueue(priv->wq);
err_put_dev:
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 0e675c9a7f83..ad509403f072 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -183,17 +183,6 @@ struct msm_drm_private {
struct msm_drm_thread event_thread[MAX_CRTCS];
- /* VRAM carveout, used when no IOMMU: */
- struct {
- unsigned long size;
- dma_addr_t paddr;
- /* NOTE: mm managed at the page level, size is in # of pages
- * and position mm_node->start is in # of pages:
- */
- struct drm_mm mm;
- spinlock_t lock; /* Protects drm_mm node allocation/removal */
- } vram;
-
struct notifier_block vmap_notifier;
struct shrinker *shrinker;
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 07a30d29248c..621fb4e17a2e 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -17,24 +17,8 @@
#include <trace/events/gpu_mem.h>
#include "msm_drv.h"
-#include "msm_fence.h"
#include "msm_gem.h"
#include "msm_gpu.h"
-#include "msm_mmu.h"
-
-static dma_addr_t physaddr(struct drm_gem_object *obj)
-{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
- struct msm_drm_private *priv = obj->dev->dev_private;
- return (((dma_addr_t)msm_obj->vram_node->start) << PAGE_SHIFT) +
- priv->vram.paddr;
-}
-
-static bool use_pages(struct drm_gem_object *obj)
-{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
- return !msm_obj->vram_node;
-}
static int pgprot = 0;
module_param(pgprot, int, 0600);
@@ -139,36 +123,6 @@ static void update_lru(struct drm_gem_object *obj)
mutex_unlock(&priv->lru.lock);
}
-/* allocate pages from VRAM carveout, used when no IOMMU: */
-static struct page **get_pages_vram(struct drm_gem_object *obj, int npages)
-{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
- struct msm_drm_private *priv = obj->dev->dev_private;
- dma_addr_t paddr;
- struct page **p;
- int ret, i;
-
- p = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
- if (!p)
- return ERR_PTR(-ENOMEM);
-
- spin_lock(&priv->vram.lock);
- ret = drm_mm_insert_node(&priv->vram.mm, msm_obj->vram_node, npages);
- spin_unlock(&priv->vram.lock);
- if (ret) {
- kvfree(p);
- return ERR_PTR(ret);
- }
-
- paddr = physaddr(obj);
- for (i = 0; i < npages; i++) {
- p[i] = pfn_to_page(__phys_to_pfn(paddr));
- paddr += PAGE_SIZE;
- }
-
- return p;
-}
-
static struct page **get_pages(struct drm_gem_object *obj)
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);
@@ -180,10 +134,7 @@ static struct page **get_pages(struct drm_gem_object *obj)
struct page **p;
int npages = obj->size >> PAGE_SHIFT;
- if (use_pages(obj))
- p = drm_gem_get_pages(obj);
- else
- p = get_pages_vram(obj, npages);
+ p = drm_gem_get_pages(obj);
if (IS_ERR(p)) {
DRM_DEV_ERROR(dev->dev, "could not get pages: %ld\n",
@@ -216,18 +167,6 @@ static struct page **get_pages(struct drm_gem_object *obj)
return msm_obj->pages;
}
-static void put_pages_vram(struct drm_gem_object *obj)
-{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
- struct msm_drm_private *priv = obj->dev->dev_private;
-
- spin_lock(&priv->vram.lock);
- drm_mm_remove_node(msm_obj->vram_node);
- spin_unlock(&priv->vram.lock);
-
- kvfree(msm_obj->pages);
-}
-
static void put_pages(struct drm_gem_object *obj)
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);
@@ -248,10 +187,7 @@ static void put_pages(struct drm_gem_object *obj)
update_device_mem(obj->dev->dev_private, -obj->size);
- if (use_pages(obj))
- drm_gem_put_pages(obj, msm_obj->pages, true, false);
- else
- put_pages_vram(obj);
+ drm_gem_put_pages(obj, msm_obj->pages, true, false);
msm_obj->pages = NULL;
update_lru(obj);
@@ -1215,19 +1151,10 @@ struct drm_gem_object *msm_gem_new(struct drm_device *dev, uint32_t size, uint32
struct msm_drm_private *priv = dev->dev_private;
struct msm_gem_object *msm_obj;
struct drm_gem_object *obj = NULL;
- bool use_vram = false;
int ret;
size = PAGE_ALIGN(size);
- if (!msm_use_mmu(dev))
- use_vram = true;
- else if ((flags & (MSM_BO_STOLEN | MSM_BO_SCANOUT)) && priv->vram.size)
- use_vram = true;
-
- if (GEM_WARN_ON(use_vram && !priv->vram.size))
- return ERR_PTR(-EINVAL);
-
/* Disallow zero sized objects as they make the underlying
* infrastructure grumpy
*/
@@ -1240,44 +1167,16 @@ struct drm_gem_object *msm_gem_new(struct drm_device *dev, uint32_t size, uint32
msm_obj = to_msm_bo(obj);
- if (use_vram) {
- struct msm_gem_vma *vma;
- struct page **pages;
-
- drm_gem_private_object_init(dev, obj, size);
-
- msm_gem_lock(obj);
-
- vma = add_vma(obj, NULL);
- msm_gem_unlock(obj);
- if (IS_ERR(vma)) {
- ret = PTR_ERR(vma);
- goto fail;
- }
-
- to_msm_bo(obj)->vram_node = &vma->node;
-
- msm_gem_lock(obj);
- pages = get_pages(obj);
- msm_gem_unlock(obj);
- if (IS_ERR(pages)) {
- ret = PTR_ERR(pages);
- goto fail;
- }
-
- vma->iova = physaddr(obj);
- } else {
- ret = drm_gem_object_init(dev, obj, size);
- if (ret)
- goto fail;
- /*
- * Our buffers are kept pinned, so allocating them from the
- * MOVABLE zone is a really bad idea, and conflicts with CMA.
- * See comments above new_inode() why this is required _and_
- * expected if you're going to pin these pages.
- */
- mapping_set_gfp_mask(obj->filp->f_mapping, GFP_HIGHUSER);
- }
+ ret = drm_gem_object_init(dev, obj, size);
+ if (ret)
+ goto fail;
+ /*
+ * Our buffers are kept pinned, so allocating them from the
+ * MOVABLE zone is a really bad idea, and conflicts with CMA.
+ * See comments above new_inode() why this is required _and_
+ * expected if you're going to pin these pages.
+ */
+ mapping_set_gfp_mask(obj->filp->f_mapping, GFP_HIGHUSER);
drm_gem_lru_move_tail(&priv->lru.unbacked, obj);
@@ -1305,12 +1204,6 @@ struct drm_gem_object *msm_gem_import(struct drm_device *dev,
uint32_t size;
int ret, npages;
- /* if we don't have IOMMU, don't bother pretending we can import: */
- if (!msm_use_mmu(dev)) {
- DRM_DEV_ERROR(dev->dev, "cannot import without IOMMU\n");
- return ERR_PTR(-EINVAL);
- }
-
size = PAGE_ALIGN(dmabuf->size);
ret = msm_gem_new_impl(dev, size, MSM_BO_WC, &obj);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index d2f39a371373..c16b11182831 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -102,11 +102,6 @@ struct msm_gem_object {
struct list_head vmas; /* list of msm_gem_vma */
- /* For physically contiguous buffers. Used when we don't have
- * an IOMMU. Also used for stolen/splashscreen buffer.
- */
- struct drm_mm_node *vram_node;
-
char name[32]; /* Identifier to print for the debugfs files */
/* userspace metadata backchannel */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index a59816b6b6de..c184b1a1f522 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -669,11 +669,6 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
if (args->pad)
return -EINVAL;
- if (unlikely(!ctx->vm) && !capable(CAP_SYS_RAWIO)) {
- DRM_ERROR_RATELIMITED("IOMMU support or CAP_SYS_RAWIO required!\n");
- return -EPERM;
- }
-
/* for now, we just have 3d pipe.. eventually this would need to
* be more clever to dispatch to appropriate gpu module:
*/
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 0d466a2e9b32..b30800f80120 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -944,12 +944,8 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
msm_devfreq_init(gpu);
-
gpu->vm = gpu->funcs->create_vm(gpu, pdev);
-
- if (gpu->vm == NULL)
- DRM_DEV_INFO(drm->dev, "%s: no IOMMU, fallback to VRAM carveout!\n", name);
- else if (IS_ERR(gpu->vm)) {
+ if (IS_ERR(gpu->vm)) {
ret = PTR_ERR(gpu->vm);
goto fail;
}
--
2.49.0
* [PATCH v4 10/40] drm/msm: Collapse vma allocation and initialization
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (8 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 09/40] drm/msm: Remove vram carveout support Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 11/40] drm/msm: Collapse vma close and delete Rob Clark
2025-05-14 17:13 ` [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Rob Clark,
Abhinav Kumar, Dmitry Baryshkov, Sean Paul, Marijn Suijten,
David Airlie, Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
Now that we've dropped vram carveout support, we can collapse vma
allocation and initialization. This better matches how things work
with drm_gpuvm.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
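Reviewer note, not part of the commit: the caller-side effect of this
collapse, visible in the get_vma_locked() hunk below, is that the old
two-step allocate-then-initialize pattern becomes a single call that
returns either a fully initialized vma or an ERR_PTR; roughly, with
error handling elided:

        /* before */
        vma = add_vma(obj, vm);
        ret = msm_gem_vma_init(vma, obj->size, range_start, range_end);

        /* after */
        vma = msm_gem_vma_new(vm, obj, range_start, range_end);
        list_add_tail(&vma->list, &msm_obj->vmas);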
drivers/gpu/drm/msm/msm_gem.c | 30 +++-----------------------
drivers/gpu/drm/msm/msm_gem.h | 4 ++--
drivers/gpu/drm/msm/msm_gem_vma.c | 36 +++++++++++++------------------
3 files changed, 20 insertions(+), 50 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 621fb4e17a2e..29247911f048 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -337,23 +337,6 @@ uint64_t msm_gem_mmap_offset(struct drm_gem_object *obj)
return offset;
}
-static struct msm_gem_vma *add_vma(struct drm_gem_object *obj,
- struct msm_gem_vm *vm)
-{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
- struct msm_gem_vma *vma;
-
- msm_gem_assert_locked(obj);
-
- vma = msm_gem_vma_new(vm);
- if (!vma)
- return ERR_PTR(-ENOMEM);
-
- list_add_tail(&vma->list, &msm_obj->vmas);
-
- return vma;
-}
-
static struct msm_gem_vma *lookup_vma(struct drm_gem_object *obj,
struct msm_gem_vm *vm)
{
@@ -420,6 +403,7 @@ static struct msm_gem_vma *get_vma_locked(struct drm_gem_object *obj,
struct msm_gem_vm *vm,
u64 range_start, u64 range_end)
{
+ struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct msm_gem_vma *vma;
msm_gem_assert_locked(obj);
@@ -427,18 +411,10 @@ static struct msm_gem_vma *get_vma_locked(struct drm_gem_object *obj,
vma = lookup_vma(obj, vm);
if (!vma) {
- int ret;
-
- vma = add_vma(obj, vm);
+ vma = msm_gem_vma_new(vm, obj, range_start, range_end);
if (IS_ERR(vma))
return vma;
-
- ret = msm_gem_vma_init(vma, obj->size,
- range_start, range_end);
- if (ret) {
- del_vma(vma);
- return ERR_PTR(ret);
- }
+ list_add_tail(&vma->list, &msm_obj->vmas);
} else {
GEM_WARN_ON(vma->iova < range_start);
GEM_WARN_ON((vma->iova + obj->size) > range_end);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index c16b11182831..9bd78642671c 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -66,8 +66,8 @@ struct msm_gem_vma {
bool mapped;
};
-struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_vm *vm);
-int msm_gem_vma_init(struct msm_gem_vma *vma, int size,
+struct msm_gem_vma *
+msm_gem_vma_new(struct msm_gem_vm *vm, struct drm_gem_object *obj,
u64 range_start, u64 range_end);
void msm_gem_vma_purge(struct msm_gem_vma *vma);
int msm_gem_vma_map(struct msm_gem_vma *vma, int prot, struct sg_table *sgt, int size);
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index 9419692f0cc8..6d18364f321c 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -106,47 +106,41 @@ void msm_gem_vma_close(struct msm_gem_vma *vma)
msm_gem_vm_put(vm);
}
-struct msm_gem_vma *msm_gem_vma_new(struct msm_gem_vm *vm)
+/* Create a new vma and allocate an iova for it */
+struct msm_gem_vma *
+msm_gem_vma_new(struct msm_gem_vm *vm, struct drm_gem_object *obj,
+ u64 range_start, u64 range_end)
{
struct msm_gem_vma *vma;
+ int ret;
vma = kzalloc(sizeof(*vma), GFP_KERNEL);
if (!vma)
- return NULL;
+ return ERR_PTR(-ENOMEM);
vma->vm = vm;
- return vma;
-}
-
-/* Initialize a new vma and allocate an iova for it */
-int msm_gem_vma_init(struct msm_gem_vma *vma, int size,
- u64 range_start, u64 range_end)
-{
- struct msm_gem_vm *vm = vma->vm;
- int ret;
-
- if (GEM_WARN_ON(!vm))
- return -EINVAL;
-
- if (GEM_WARN_ON(vma->iova))
- return -EBUSY;
-
spin_lock(&vm->lock);
ret = drm_mm_insert_node_in_range(&vm->mm, &vma->node,
- size, PAGE_SIZE, 0,
+ obj->size, PAGE_SIZE, 0,
range_start, range_end, 0);
spin_unlock(&vm->lock);
if (ret)
- return ret;
+ goto err_free_vma;
vma->iova = vma->node.start;
vma->mapped = false;
+ INIT_LIST_HEAD(&vma->list);
+
kref_get(&vm->kref);
- return 0;
+ return vma;
+
+err_free_vma:
+ kfree(vma);
+ return ERR_PTR(ret);
}
struct msm_gem_vm *
--
2.49.0
* [PATCH v4 11/40] drm/msm: Collapse vma close and delete
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (9 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 10/40] drm/msm: Collapse vma allocation and initialization Rob Clark
@ 2025-05-14 16:59 ` Rob Clark
2025-05-14 17:13 ` [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 16:59 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Rob Clark,
Abhinav Kumar, Dmitry Baryshkov, Sean Paul, Marijn Suijten,
David Airlie, Simona Vetter, open list
From: Rob Clark <robdclark@chromium.org>
This is a better fit for drm_gpuvm/drm_gpuva.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/gpu/drm/msm/msm_gem.c | 16 +++-------------
drivers/gpu/drm/msm/msm_gem_vma.c | 2 ++
2 files changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 29247911f048..4c10eca404e0 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -353,15 +353,6 @@ static struct msm_gem_vma *lookup_vma(struct drm_gem_object *obj,
return NULL;
}
-static void del_vma(struct msm_gem_vma *vma)
-{
- if (!vma)
- return;
-
- list_del(&vma->list);
- kfree(vma);
-}
-
/*
* If close is true, this also closes the VMA (releasing the allocated
* iova range) in addition to removing the iommu mapping. In the eviction
@@ -372,11 +363,11 @@ static void
put_iova_spaces(struct drm_gem_object *obj, bool close)
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);
- struct msm_gem_vma *vma;
+ struct msm_gem_vma *vma, *tmp;
msm_gem_assert_locked(obj);
- list_for_each_entry(vma, &msm_obj->vmas, list) {
+ list_for_each_entry_safe(vma, tmp, &msm_obj->vmas, list) {
if (vma->vm) {
msm_gem_vma_purge(vma);
if (close)
@@ -395,7 +386,7 @@ put_iova_vmas(struct drm_gem_object *obj)
msm_gem_assert_locked(obj);
list_for_each_entry_safe(vma, tmp, &msm_obj->vmas, list) {
- del_vma(vma);
+ msm_gem_vma_close(vma);
}
}
@@ -564,7 +555,6 @@ static int clear_iova(struct drm_gem_object *obj,
msm_gem_vma_purge(vma);
msm_gem_vma_close(vma);
- del_vma(vma);
return 0;
}
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index 6d18364f321c..ca29e81d79d2 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -102,8 +102,10 @@ void msm_gem_vma_close(struct msm_gem_vma *vma)
spin_unlock(&vm->lock);
vma->iova = 0;
+ list_del(&vma->list);
msm_gem_vm_put(vm);
+ kfree(vma);
}
/* Create a new vma and allocate an iova for it */
--
2.49.0
* Re: [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
` (10 preceding siblings ...)
2025-05-14 16:59 ` [PATCH v4 11/40] drm/msm: Collapse vma close and delete Rob Clark
@ 2025-05-14 17:13 ` Rob Clark
11 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 17:13 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Abhinav Kumar,
André Almeida, Arnd Bergmann, Barnabás Czémán,
Christian König, Christopher Snowhill, Dmitry Baryshkov,
Dmitry Baryshkov, Eugene Lepshy, open list:IOMMU SUBSYSTEM,
Jason Gunthorpe, Jessica Zhang, Joao Martins, Jonathan Marek,
Kevin Tian, Konrad Dybcio, Krzysztof Kozlowski,
moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b,
moderated list:ARM SMMU DRIVERS, open list,
open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b,
Marijn Suijten, Nicolin Chen, Robin Murphy, Sean Paul,
Will Deacon
Hmm, looks like git-send-email died with a TLS error a quarter of the
way through this series... I'll try to resend later.
BR,
-R
On Wed, May 14, 2025 at 10:03 AM Rob Clark <robdclark@gmail.com> wrote:
>
> From: Rob Clark <robdclark@chromium.org>
>
> Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
> Memory[2] in the form of:
>
> 1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
> MAP_NULL/UNMAP commands
>
> 2. A new VM_BIND ioctl to allow submitting batches of one or more
> MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
>
> I did not implement support for synchronous VM_BIND commands. Since
> userspace could just immediately wait for the `SUBMIT` to complete, I don't
> think we need this extra complexity in the kernel. Synchronous/immediate
> VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
>
> The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
>
> Changes in v4:
> - Various locking/etc fixes
> - Optimize the pgtable preallocation. If userspace sorts the VM_BIND ops
> then the kernel detects ops that fall into the same 2MB last level PTD
> to avoid duplicate page preallocation.
> - Add way to throttle pushing jobs to the scheduler, to cap the amount of
> potentially temporary prealloc'd pgtable pages.
> - Add vm_log to devcoredump for debugging. If the vm_log_shift module
> param is set, keep a log of the last 1<<vm_log_shift VM updates for
> easier debugging of faults/crashes.
> - Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/
>
> Changes in v3:
> - Switched to separate VM_BIND ioctl. This makes the UABI a bit
> cleaner, but OTOH the userspace code was cleaner when the end result
> of either type of VkQueue led to the same ioctl. So I'm a bit on
> the fence.
> - Switched to doing the gpuvm bookkeeping synchronously, and only
> deferring the pgtable updates. This avoids needing to hold any resv
> locks in the fence signaling path, resolving the last shrinker related
> lockdep complaints. OTOH it means userspace can trigger invalid
> pgtable updates with multiple VM_BIND queues. In this case, we ensure
> that unmaps happen completely (to prevent userspace from using this to
> access free'd pages), mark the context as unusable, and move on with
> life.
> - Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/
>
> Changes in v2:
> - Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
> merged.
> - Pre-allocate all the things, and drop HACK patch which disabled shrinker.
> This includes ensuring that vm_bo objects are allocated up front, pre-
> allocating VMA objects, and pre-allocating pages used for pgtable updates.
> The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that
> were initially added for panthor.
> - Add back support for BO dumping for devcoredump.
> - Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.com/T/#t
>
> [1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
> [2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
> [3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
>
> Rob Clark (40):
> drm/gpuvm: Don't require obj lock in destructor path
> drm/gpuvm: Allow VAs to hold soft reference to BOs
> drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
> drm/sched: Add enqueue credit limit
> iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
> drm/msm: Rename msm_file_private -> msm_context
> drm/msm: Improve msm_context comments
> drm/msm: Rename msm_gem_address_space -> msm_gem_vm
> drm/msm: Remove vram carveout support
> drm/msm: Collapse vma allocation and initialization
> drm/msm: Collapse vma close and delete
> drm/msm: Don't close VMAs on purge
> drm/msm: drm_gpuvm conversion
> drm/msm: Convert vm locking
> drm/msm: Use drm_gpuvm types more
> drm/msm: Split out helper to get iommu prot flags
> drm/msm: Add mmu support for non-zero offset
> drm/msm: Add PRR support
> drm/msm: Rename msm_gem_vma_purge() -> _unmap()
> drm/msm: Drop queued submits on lastclose()
> drm/msm: Lazily create context VM
> drm/msm: Add opt-in for VM_BIND
> drm/msm: Mark VM as unusable on GPU hangs
> drm/msm: Add _NO_SHARE flag
> drm/msm: Crashdump prep for sparse mappings
> drm/msm: rd dumping prep for sparse mappings
> drm/msm: Crashdec support for sparse
> drm/msm: rd dumping support for sparse
> drm/msm: Extract out syncobj helpers
> drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
> drm/msm: Add VM_BIND submitqueue
> drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
> drm/msm: Support pgtable preallocation
> drm/msm: Split out map/unmap ops
> drm/msm: Add VM_BIND ioctl
> drm/msm: Add VM logging for VM_BIND updates
> drm/msm: Add VMA unmap reason
> drm/msm: Add mmu prealloc tracepoint
> drm/msm: use trylock for debugfs
> drm/msm: Bump UAPI version
>
> drivers/gpu/drm/drm_gem.c | 14 +-
> drivers/gpu/drm/drm_gpuvm.c | 15 +-
> drivers/gpu/drm/msm/Kconfig | 1 +
> drivers/gpu/drm/msm/Makefile | 1 +
> drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
> drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
> drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
> drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
> drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
> drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +-
> drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
> drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
> drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
> drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +-
> drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
> drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
> drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 99 +-
> drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
> .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
> drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
> drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
> drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
> drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
> drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
> drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
> drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
> drivers/gpu/drm/msm/msm_drv.c | 184 +--
> drivers/gpu/drm/msm/msm_drv.h | 35 +-
> drivers/gpu/drm/msm/msm_fb.c | 18 +-
> drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
> drivers/gpu/drm/msm/msm_gem.c | 494 +++---
> drivers/gpu/drm/msm/msm_gem.h | 247 ++-
> drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
> drivers/gpu/drm/msm/msm_gem_shrinker.c | 104 +-
> drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++--
> drivers/gpu/drm/msm/msm_gem_vma.c | 1471 ++++++++++++++++-
> drivers/gpu/drm/msm/msm_gpu.c | 214 ++-
> drivers/gpu/drm/msm/msm_gpu.h | 144 +-
> drivers/gpu/drm/msm/msm_gpu_trace.h | 14 +
> drivers/gpu/drm/msm/msm_iommu.c | 302 +++-
> drivers/gpu/drm/msm/msm_kms.c | 18 +-
> drivers/gpu/drm/msm/msm_kms.h | 2 +-
> drivers/gpu/drm/msm/msm_mmu.h | 38 +-
> drivers/gpu/drm/msm/msm_rd.c | 62 +-
> drivers/gpu/drm/msm/msm_ringbuffer.c | 10 +-
> drivers/gpu/drm/msm/msm_submitqueue.c | 96 +-
> drivers/gpu/drm/msm/msm_syncobj.c | 172 ++
> drivers/gpu/drm/msm/msm_syncobj.h | 37 +
> drivers/gpu/drm/scheduler/sched_entity.c | 16 +-
> drivers/gpu/drm/scheduler/sched_main.c | 3 +
> drivers/iommu/io-pgtable-arm.c | 27 +-
> include/drm/drm_gem.h | 10 +-
> include/drm/drm_gpuvm.h | 12 +-
> include/drm/gpu_scheduler.h | 13 +-
> include/linux/io-pgtable.h | 8 +
> include/uapi/drm/msm_drm.h | 149 +-
> 63 files changed, 3484 insertions(+), 1251 deletions(-)
> create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
> create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h
>
> --
> 2.49.0
>
* [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support
@ 2025-05-14 17:53 Rob Clark
0 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-14 17:53 UTC (permalink / raw)
To: dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Abhinav Kumar,
André Almeida, Arnd Bergmann, Barnabás Czémán,
Christian König, Christopher Snowhill, Dmitry Baryshkov,
Dmitry Baryshkov, Eugene Lepshy, Haoxiang Li,
open list:IOMMU SUBSYSTEM, Jason Gunthorpe, Jessica Zhang,
Joao Martins, Jonathan Marek, Kevin Tian, Konrad Dybcio,
Krzysztof Kozlowski,
moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b,
moderated list:ARM SMMU DRIVERS, open list,
open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b,
Marijn Suijten, Nicolin Chen, Robin Murphy, Sean Paul,
Will Deacon
From: Rob Clark <robdclark@chromium.org>
Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
Memory[2] in the form of:
1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
MAP_NULL/UNMAP commands
2. A new VM_BIND ioctl to allow submitting batches of one or more
MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
I did not implement support for synchronous VM_BIND commands. Since
userspace could just immediately wait for the `SUBMIT` to complete, I don't
think we need this extra complexity in the kernel. Synchronous/immediate
VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
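To illustrate the point without tying it to the exact UAPI: a userspace
driver that wants immediate/synchronous semantics can request an
out-fence for the bind submit and block on it right away. A sync_file
fd signals POLLIN once its fence signals, so the wait itself is just a
poll(); the submit/out-fence plumbing is omitted in this sketch, which
is illustrative rather than part of the series:

        #include <errno.h>
        #include <poll.h>

        /* Block until a sync_file out-fence fd has signaled. */
        static int wait_fence_fd(int fence_fd)
        {
                struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };
                int ret;

                do {
                        ret = poll(&pfd, 1, -1);
                } while (ret < 0 && errno == EINTR);

                return (ret == 1) ? 0 : -1;
        }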
The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
Changes in v4:
- Various locking/etc fixes
- Optimize the pgtable preallocation. If userspace sorts the VM_BIND ops
then the kernel detects ops that fall into the same 2MB last level PTD
to avoid duplicate page preallocation (a rough sketch of the idea is
included after this list).
- Add way to throttle pushing jobs to the scheduler, to cap the amount of
potentially temporary prealloc'd pgtable pages.
- Add vm_log to devcoredump for debugging. If the vm_log_shift module
param is set, keep a log of the last 1<<vm_log_shift VM updates for
easier debugging of faults/crashes (a small sketch of the ring sizing
also follows this list).
- Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/
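A rough sketch of the pgtable preallocation dedup mentioned in the list
above (not the actual kernel code; it assumes a 4K granule, so one
last-level table covers a 2MB block, and it only looks at the start
address of each sorted op, ignoring ops that span a 2MB boundary):

        #include <stdbool.h>
        #include <stdint.h>

        #define LAST_LEVEL_SHIFT 21  /* 2MB: 512 x 4K entries per last-level table */

        /*
         * Returns true if this op's IOVA falls into a different 2MB block
         * than the previous op, i.e. a new last-level table must be
         * preallocated for it.  Caller initializes *last_ptd to UINT64_MAX.
         */
        static bool needs_prealloc(uint64_t iova, uint64_t *last_ptd)
        {
                uint64_t ptd = iova >> LAST_LEVEL_SHIFT;

                if (*last_ptd == ptd)
                        return false;

                *last_ptd = ptd;
                return true;
        }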
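For the vm_log item above, the 1<<vm_log_shift sizing simply means a
power-of-two ring buffer, so the write index can wrap with a mask; a
generic sketch (the entry layout here is made up, not the one used in
the series):

        #include <stdint.h>

        struct vm_log_entry {
                uint64_t iova;
                uint64_t range;
                uint32_t op;    /* e.g. map vs unmap */
        };

        struct vm_log {
                unsigned int shift;         /* 1 << shift entries */
                unsigned int idx;
                struct vm_log_entry ent[];  /* flexible array member */
        };

        static void vm_log_push(struct vm_log *log, const struct vm_log_entry *e)
        {
                unsigned int mask = (1u << log->shift) - 1;

                log->ent[log->idx++ & mask] = *e;
        }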
Changes in v3:
- Switched to separate VM_BIND ioctl. This makes the UABI a bit
cleaner, but OTOH the userspace code was cleaner when the end result
of either type of VkQueue led to the same ioctl. So I'm a bit on
the fence.
- Switched to doing the gpuvm bookkeeping synchronously, and only
deferring the pgtable updates. This avoids needing to hold any resv
locks in the fence signaling path, resolving the last shrinker related
lockdep complaints. OTOH it means userspace can trigger invalid
pgtable updates with multiple VM_BIND queues. In this case, we ensure
that unmaps happen completely (to prevent userspace from using this to
access free'd pages), mark the context as unusable, and move on with
life.
- Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/
Changes in v2:
- Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
merged.
- Pre-allocate all the things, and drop HACK patch which disabled shrinker.
This includes ensuring that vm_bo objects are allocated up front, pre-
allocating VMA objects, and pre-allocating pages used for pgtable updates.
The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that
were initially added for panthor.
- Add back support for BO dumping for devcoredump.
- Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.com/T/#t
[1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
[2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
[3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
Rob Clark (40):
drm/gpuvm: Don't require obj lock in destructor path
drm/gpuvm: Allow VAs to hold soft reference to BOs
drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
drm/sched: Add enqueue credit limit
iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
drm/msm: Rename msm_file_private -> msm_context
drm/msm: Improve msm_context comments
drm/msm: Rename msm_gem_address_space -> msm_gem_vm
drm/msm: Remove vram carveout support
drm/msm: Collapse vma allocation and initialization
drm/msm: Collapse vma close and delete
drm/msm: Don't close VMAs on purge
drm/msm: drm_gpuvm conversion
drm/msm: Convert vm locking
drm/msm: Use drm_gpuvm types more
drm/msm: Split out helper to get iommu prot flags
drm/msm: Add mmu support for non-zero offset
drm/msm: Add PRR support
drm/msm: Rename msm_gem_vma_purge() -> _unmap()
drm/msm: Drop queued submits on lastclose()
drm/msm: Lazily create context VM
drm/msm: Add opt-in for VM_BIND
drm/msm: Mark VM as unusable on GPU hangs
drm/msm: Add _NO_SHARE flag
drm/msm: Crashdump prep for sparse mappings
drm/msm: rd dumping prep for sparse mappings
drm/msm: Crashdec support for sparse
drm/msm: rd dumping support for sparse
drm/msm: Extract out syncobj helpers
drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
drm/msm: Add VM_BIND submitqueue
drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
drm/msm: Support pgtable preallocation
drm/msm: Split out map/unmap ops
drm/msm: Add VM_BIND ioctl
drm/msm: Add VM logging for VM_BIND updates
drm/msm: Add VMA unmap reason
drm/msm: Add mmu prealloc tracepoint
drm/msm: use trylock for debugfs
drm/msm: Bump UAPI version
drivers/gpu/drm/drm_gem.c | 14 +-
drivers/gpu/drm/drm_gpuvm.c | 15 +-
drivers/gpu/drm/msm/Kconfig | 1 +
drivers/gpu/drm/msm/Makefile | 1 +
drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +-
drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 99 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
drivers/gpu/drm/msm/msm_drv.c | 184 +--
drivers/gpu/drm/msm/msm_drv.h | 35 +-
drivers/gpu/drm/msm/msm_fb.c | 18 +-
drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
drivers/gpu/drm/msm/msm_gem.c | 494 +++---
drivers/gpu/drm/msm/msm_gem.h | 247 ++-
drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
drivers/gpu/drm/msm/msm_gem_shrinker.c | 104 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++--
drivers/gpu/drm/msm/msm_gem_vma.c | 1471 ++++++++++++++++-
drivers/gpu/drm/msm/msm_gpu.c | 214 ++-
drivers/gpu/drm/msm/msm_gpu.h | 144 +-
drivers/gpu/drm/msm/msm_gpu_trace.h | 14 +
drivers/gpu/drm/msm/msm_iommu.c | 302 +++-
drivers/gpu/drm/msm/msm_kms.c | 18 +-
drivers/gpu/drm/msm/msm_kms.h | 2 +-
drivers/gpu/drm/msm/msm_mmu.h | 38 +-
drivers/gpu/drm/msm/msm_rd.c | 62 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 10 +-
drivers/gpu/drm/msm/msm_submitqueue.c | 96 +-
drivers/gpu/drm/msm/msm_syncobj.c | 172 ++
drivers/gpu/drm/msm/msm_syncobj.h | 37 +
drivers/gpu/drm/scheduler/sched_entity.c | 16 +-
drivers/gpu/drm/scheduler/sched_main.c | 3 +
drivers/iommu/io-pgtable-arm.c | 27 +-
include/drm/drm_gem.h | 10 +-
include/drm/drm_gpuvm.h | 12 +-
include/drm/gpu_scheduler.h | 13 +-
include/linux/io-pgtable.h | 8 +
include/uapi/drm/msm_drm.h | 149 +-
63 files changed, 3484 insertions(+), 1251 deletions(-)
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h
--
2.49.0
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-14 16:59 ` [PATCH v4 04/40] drm/sched: Add enqueue credit limit Rob Clark
@ 2025-05-15 9:28 ` Philipp Stanner
2025-05-15 16:15 ` Rob Clark
0 siblings, 1 reply; 33+ messages in thread
From: Philipp Stanner @ 2025-05-15 9:28 UTC (permalink / raw)
To: Rob Clark, dri-devel
Cc: freedreno, linux-arm-msm, Connor Abbott, Rob Clark, Matthew Brost,
Danilo Krummrich, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
Hello,
On Wed, 2025-05-14 at 09:59 -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> Similar to the existing credit limit mechanism, but applying to jobs
> enqueued to the scheduler but not yet run.
>
> The use case is to put an upper bound on preallocated, and
> potentially
> unneeded, pgtable pages. When this limit is exceeded, pushing new
> jobs
> will block until the count drops below the limit.
the commit message doesn't make clear why that's needed within the
scheduler.
From what I understand from the cover letter, this is a (rare?) Vulkan
feature. And as important as Vulkan is, it's the drivers that implement
support for it. I don't see why the scheduler is a blocker.
All the knowledge about when to stop pushing into the entity is in the
driver, and the scheduler obtains all the knowledge about that from the
driver anyways.
So you could do
if (my_vulkan_condition())
drm_sched_entity_push_job();
couldn't you?
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
> drivers/gpu/drm/scheduler/sched_entity.c | 16 ++++++++++++++--
> drivers/gpu/drm/scheduler/sched_main.c | 3 +++
> include/drm/gpu_scheduler.h | 13 ++++++++++++-
> 3 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index dc0e60d2c14b..c5f688362a34 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -580,11 +580,21 @@ void drm_sched_entity_select_rq(struct
> drm_sched_entity *entity)
> * under common lock for the struct drm_sched_entity that was set up
> for
> * @sched_job in drm_sched_job_init().
> */
> -void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> +int drm_sched_entity_push_job(struct drm_sched_job *sched_job)
The return code would need to be documented in the docstring, too, if
we go for that solution.
> {
> struct drm_sched_entity *entity = sched_job->entity;
> + struct drm_gpu_scheduler *sched = sched_job->sched;
> bool first;
> ktime_t submit_ts;
> + int ret;
> +
> + ret = wait_event_interruptible(
> + sched->job_scheduled,
> + atomic_read(&sched->enqueue_credit_count) <=
> + sched->enqueue_credit_limit);
This very significantly changes the function's semantics. This function
is used in a great many drivers, and here it would be transformed into
a function that can block.
From what I see below, those credits are to be optional. But even so, it
needs to be clearly documented when a function can block.
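Something along these lines in the kernel-doc, for example (just a
sketch of the kind of note I mean; the error values assume the
wait_event_interruptible() and -EINVAL paths in this patch):

   * Returns: 0 on success, -EINVAL if the entity was already killed, or
   * -ERESTARTSYS if the wait for enqueue credits was interrupted by a
   * signal.
   *
   * May block (interruptibly) if the scheduler was initialized with a
   * non-zero enqueue_credit_limit and that limit is currently exceeded.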
> + if (ret)
> + return ret;
> + atomic_add(sched_job->enqueue_credits, &sched-
> >enqueue_credit_count);
>
> trace_drm_sched_job(sched_job, entity);
> atomic_inc(entity->rq->sched->score);
> @@ -609,7 +619,7 @@ void drm_sched_entity_push_job(struct
> drm_sched_job *sched_job)
> spin_unlock(&entity->lock);
>
> DRM_ERROR("Trying to push to a killed
> entity\n");
> - return;
> + return -EINVAL;
> }
>
> rq = entity->rq;
> @@ -626,5 +636,7 @@ void drm_sched_entity_push_job(struct
> drm_sched_job *sched_job)
>
> drm_sched_wakeup(sched);
> }
> +
> + return 0;
> }
> EXPORT_SYMBOL(drm_sched_entity_push_job);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 9412bffa8c74..1102cca69cb4 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1217,6 +1217,7 @@ static void drm_sched_run_job_work(struct
> work_struct *w)
>
> trace_drm_run_job(sched_job, entity);
> fence = sched->ops->run_job(sched_job);
> + atomic_sub(sched_job->enqueue_credits, &sched-
> >enqueue_credit_count);
> complete_all(&entity->entity_idle);
> drm_sched_fence_scheduled(s_fence, fence);
>
> @@ -1253,6 +1254,7 @@ int drm_sched_init(struct drm_gpu_scheduler
> *sched, const struct drm_sched_init_
>
> sched->ops = args->ops;
> sched->credit_limit = args->credit_limit;
> + sched->enqueue_credit_limit = args->enqueue_credit_limit;
> sched->name = args->name;
> sched->timeout = args->timeout;
> sched->hang_limit = args->hang_limit;
> @@ -1308,6 +1310,7 @@ int drm_sched_init(struct drm_gpu_scheduler
> *sched, const struct drm_sched_init_
> INIT_LIST_HEAD(&sched->pending_list);
> spin_lock_init(&sched->job_list_lock);
> atomic_set(&sched->credit_count, 0);
> + atomic_set(&sched->enqueue_credit_count, 0);
> INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
> INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
> diff --git a/include/drm/gpu_scheduler.h
> b/include/drm/gpu_scheduler.h
> index da64232c989d..d830ffe083f1 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -329,6 +329,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct
> dma_fence *f);
> * @s_fence: contains the fences for the scheduling of job.
> * @finish_cb: the callback for the finished fence.
> * @credits: the number of credits this job contributes to the
> scheduler
> + * @enqueue_credits: the number of enqueue credits this job
> contributes
> * @work: Helper to reschedule job kill to different context.
> * @id: a unique id assigned to each job scheduled on the scheduler.
> * @karma: increment on every hang caused by this job. If this
> exceeds the hang
> @@ -366,6 +367,7 @@ struct drm_sched_job {
>
> enum drm_sched_priority s_priority;
> u32 credits;
> + u32 enqueue_credits;
What's the policy of setting this?
drm_sched_job_init() and drm_sched_job_arm() are responsible for
initializing jobs.
> /** @last_dependency: tracks @dependencies as they signal */
> unsigned int last_dependency;
> atomic_t karma;
> @@ -485,6 +487,10 @@ struct drm_sched_backend_ops {
> * @ops: backend operations provided by the driver.
> * @credit_limit: the credit limit of this scheduler
> * @credit_count: the current credit count of this scheduler
> + * @enqueue_credit_limit: the credit limit of jobs pushed to
> scheduler and not
> + * yet run
> + * @enqueue_credit_count: the current credit count of jobs pushed to
> scheduler
> + * but not yet run
> * @timeout: the time after which a job is removed from the
> scheduler.
> * @name: name of the ring for which this scheduler is being used.
> * @num_rqs: Number of run-queues. This is at most
> DRM_SCHED_PRIORITY_COUNT,
> @@ -518,6 +524,8 @@ struct drm_gpu_scheduler {
> const struct drm_sched_backend_ops *ops;
> u32 credit_limit;
> atomic_t credit_count;
> + u32 enqueue_credit_limit;
> + atomic_t enqueue_credit_count;
> long timeout;
> const char *name;
> u32 num_rqs;
> @@ -550,6 +558,8 @@ struct drm_gpu_scheduler {
> * @num_rqs: Number of run-queues. This may be at most
> DRM_SCHED_PRIORITY_COUNT,
> * as there's usually one run-queue per priority, but may
> be less.
> * @credit_limit: the number of credits this scheduler can hold from
> all jobs
> + * @enqueue_credit_limit: the number of credits that can be enqueued
> before
> + * drm_sched_entity_push_job() blocks
Is it optional or not? Can it be deactivated?
It seems to me that it is optional, and so far only used in msm. If
there are no other parties in need of that mechanism, the right place
for this feature is probably msm, which already has all the knowledge
about when to block.
Regards
P.
> * @hang_limit: number of times to allow a job to hang before
> dropping it.
> * This mechanism is DEPRECATED. Set it to 0.
> * @timeout: timeout value in jiffies for submitted jobs.
> @@ -564,6 +574,7 @@ struct drm_sched_init_args {
> struct workqueue_struct *timeout_wq;
> u32 num_rqs;
> u32 credit_limit;
> + u32 enqueue_credit_limit;
> unsigned int hang_limit;
> long timeout;
> atomic_t *score;
> @@ -600,7 +611,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> struct drm_sched_entity *entity,
> u32 credits, void *owner);
> void drm_sched_job_arm(struct drm_sched_job *job);
> -void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> +int drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> int drm_sched_job_add_dependency(struct drm_sched_job *job,
> struct dma_fence *fence);
> int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 9:28 ` Philipp Stanner
@ 2025-05-15 16:15 ` Rob Clark
2025-05-15 16:22 ` Connor Abbott
2025-05-15 17:23 ` Danilo Krummrich
0 siblings, 2 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-15 16:15 UTC (permalink / raw)
To: phasta
Cc: Rob Clark, dri-devel, freedreno, linux-arm-msm, Connor Abbott,
Matthew Brost, Danilo Krummrich, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
On Thu, May 15, 2025 at 2:28 AM Philipp Stanner <phasta@mailbox.org> wrote:
>
> Hello,
>
> On Wed, 2025-05-14 at 09:59 -0700, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Similar to the existing credit limit mechanism, but applying to jobs
> > enqueued to the scheduler but not yet run.
> >
> > The use case is to put an upper bound on preallocated, and
> > potentially
> > unneeded, pgtable pages. When this limit is exceeded, pushing new
> > jobs
> > will block until the count drops below the limit.
>
> the commit message doesn't make clear why that's needed within the
> scheduler.
>
> From what I understand from the cover letter, this is a (rare?) Vulkan
> feature. And as important as Vulkan is, it's the drivers that implement
> support for it. I don't see why the scheduler is a blocker.
Maybe not rare, or at least it comes up with a group of deqp-vk tests ;-)
Basically it is a way to throttle userspace to prevent it from OoM'ing
itself. (I suppose userspace could throttle itself, but it doesn't
really know how much pre-allocation will need to be done for pgtable
updates.)
> All the knowledge about when to stop pushing into the entity is in the
> driver, and the scheduler obtains all the knowledge about that from the
> driver anyways.
>
> So you could do
>
> if (my_vulkan_condition())
> drm_sched_entity_push_job();
>
> couldn't you?
It would need to reach in and use the sched's job_scheduled
wait_queue_head_t... if that isn't too ugly, maybe the rest could be
implemented on top of sched. But it seemed like a reasonable thing
for the scheduler to support directly.
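Just to illustrate what "reaching in" would look like, a rough sketch of
that driver-side variant (the prealloc counter, limit and per-job page
count are made-up names; sched->job_scheduled is the scheduler's
existing wait queue):

   static int driver_throttled_push(struct drm_gpu_scheduler *sched,
                                    struct drm_sched_job *job,
                                    atomic_t *prealloc_pages,
                                    u32 limit, u32 job_pages)
   {
           int ret;

           /* block until earlier jobs have run and dropped the count */
           ret = wait_event_interruptible(sched->job_scheduled,
                                          atomic_read(prealloc_pages) <= limit);
           if (ret)
                   return ret;

           atomic_add(job_pages, prealloc_pages);
           drm_sched_entity_push_job(job);

           return 0;
   }

The driver would then drop job_pages from the counter in its run_job()
callback, which is more or less what this patch does inside the
scheduler instead.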
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> > drivers/gpu/drm/scheduler/sched_entity.c | 16 ++++++++++++++--
> > drivers/gpu/drm/scheduler/sched_main.c | 3 +++
> > include/drm/gpu_scheduler.h | 13 ++++++++++++-
> > 3 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
> > b/drivers/gpu/drm/scheduler/sched_entity.c
> > index dc0e60d2c14b..c5f688362a34 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -580,11 +580,21 @@ void drm_sched_entity_select_rq(struct
> > drm_sched_entity *entity)
> > * under common lock for the struct drm_sched_entity that was set up
> > for
> > * @sched_job in drm_sched_job_init().
> > */
> > -void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> > +int drm_sched_entity_push_job(struct drm_sched_job *sched_job)
>
> Return code would need to be documented in the docstring, too. If we'd
> go for that solution.
>
> > {
> > struct drm_sched_entity *entity = sched_job->entity;
> > + struct drm_gpu_scheduler *sched = sched_job->sched;
> > bool first;
> > ktime_t submit_ts;
> > + int ret;
> > +
> > + ret = wait_event_interruptible(
> > + sched->job_scheduled,
> > + atomic_read(&sched->enqueue_credit_count) <=
> > + sched->enqueue_credit_limit);
>
> This very significantly changes the function's semantics. This function
> is used in a great many drivers, and here it would be transformed into
> a function that can block.
>
> From what I see below those credits are to be optional. But even if, it
> needs to be clearly documented when a function can block.
Sure. The behavior changes only for drivers that use the
enqueue_credit_limit, so other drivers should be unaffected.
I can improve the docs.
(Maybe push_credit or something else would be a better name than
enqueue_credit?)
>
> > + if (ret)
> > + return ret;
> > + atomic_add(sched_job->enqueue_credits, &sched-
> > >enqueue_credit_count);
> >
> > trace_drm_sched_job(sched_job, entity);
> > atomic_inc(entity->rq->sched->score);
> > @@ -609,7 +619,7 @@ void drm_sched_entity_push_job(struct
> > drm_sched_job *sched_job)
> > spin_unlock(&entity->lock);
> >
> > DRM_ERROR("Trying to push to a killed
> > entity\n");
> > - return;
> > + return -EINVAL;
> > }
> >
> > rq = entity->rq;
> > @@ -626,5 +636,7 @@ void drm_sched_entity_push_job(struct
> > drm_sched_job *sched_job)
> >
> > drm_sched_wakeup(sched);
> > }
> > +
> > + return 0;
> > }
> > EXPORT_SYMBOL(drm_sched_entity_push_job);
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index 9412bffa8c74..1102cca69cb4 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1217,6 +1217,7 @@ static void drm_sched_run_job_work(struct
> > work_struct *w)
> >
> > trace_drm_run_job(sched_job, entity);
> > fence = sched->ops->run_job(sched_job);
> > + atomic_sub(sched_job->enqueue_credits, &sched-
> > >enqueue_credit_count);
> > complete_all(&entity->entity_idle);
> > drm_sched_fence_scheduled(s_fence, fence);
> >
> > @@ -1253,6 +1254,7 @@ int drm_sched_init(struct drm_gpu_scheduler
> > *sched, const struct drm_sched_init_
> >
> > sched->ops = args->ops;
> > sched->credit_limit = args->credit_limit;
> > + sched->enqueue_credit_limit = args->enqueue_credit_limit;
> > sched->name = args->name;
> > sched->timeout = args->timeout;
> > sched->hang_limit = args->hang_limit;
> > @@ -1308,6 +1310,7 @@ int drm_sched_init(struct drm_gpu_scheduler
> > *sched, const struct drm_sched_init_
> > INIT_LIST_HEAD(&sched->pending_list);
> > spin_lock_init(&sched->job_list_lock);
> > atomic_set(&sched->credit_count, 0);
> > + atomic_set(&sched->enqueue_credit_count, 0);
> > INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> > INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
> > INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
> > diff --git a/include/drm/gpu_scheduler.h
> > b/include/drm/gpu_scheduler.h
> > index da64232c989d..d830ffe083f1 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -329,6 +329,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct
> > dma_fence *f);
> > * @s_fence: contains the fences for the scheduling of job.
> > * @finish_cb: the callback for the finished fence.
> > * @credits: the number of credits this job contributes to the
> > scheduler
> > + * @enqueue_credits: the number of enqueue credits this job
> > contributes
> > * @work: Helper to reschedule job kill to different context.
> > * @id: a unique id assigned to each job scheduled on the scheduler.
> > * @karma: increment on every hang caused by this job. If this
> > exceeds the hang
> > @@ -366,6 +367,7 @@ struct drm_sched_job {
> >
> > enum drm_sched_priority s_priority;
> > u32 credits;
> > + u32 enqueue_credits;
>
> What's the policy of setting this?
>
> drm_sched_job_init() and drm_sched_job_arm() are responsible for
> initializing jobs.
It should be set before drm_sched_entity_push_job(). I wouldn't
really expect drivers to know the value at drm_sched_job_init() time.
But they would by the time drm_sched_entity_push_job() is called.
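Roughly like this (assuming the driver wraps the scheduler job as
job->base, and prealloc_page_count is whatever the driver computed
while preallocating pgtable memory for this job):

   /* job setup and pgtable preallocation happen first, then: */
   job->base.enqueue_credits = prealloc_page_count;

   drm_sched_job_arm(&job->base);
   ret = drm_sched_entity_push_job(&job->base);  /* may now block or fail */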
> > /** @last_dependency: tracks @dependencies as they signal */
> > unsigned int last_dependency;
> > atomic_t karma;
> > @@ -485,6 +487,10 @@ struct drm_sched_backend_ops {
> > * @ops: backend operations provided by the driver.
> > * @credit_limit: the credit limit of this scheduler
> > * @credit_count: the current credit count of this scheduler
> > + * @enqueue_credit_limit: the credit limit of jobs pushed to
> > scheduler and not
> > + * yet run
> > + * @enqueue_credit_count: the current credit count of jobs pushed to
> > scheduler
> > + * but not yet run
> > * @timeout: the time after which a job is removed from the
> > scheduler.
> > * @name: name of the ring for which this scheduler is being used.
> > * @num_rqs: Number of run-queues. This is at most
> > DRM_SCHED_PRIORITY_COUNT,
> > @@ -518,6 +524,8 @@ struct drm_gpu_scheduler {
> > const struct drm_sched_backend_ops *ops;
> > u32 credit_limit;
> > atomic_t credit_count;
> > + u32 enqueue_credit_limit;
> > + atomic_t enqueue_credit_count;
> > long timeout;
> > const char *name;
> > u32 num_rqs;
> > @@ -550,6 +558,8 @@ struct drm_gpu_scheduler {
> > * @num_rqs: Number of run-queues. This may be at most
> > DRM_SCHED_PRIORITY_COUNT,
> > * as there's usually one run-queue per priority, but may
> > be less.
> > * @credit_limit: the number of credits this scheduler can hold from
> > all jobs
> > + * @enqueue_credit_limit: the number of credits that can be enqueued
> > before
> > + * drm_sched_entity_push_job() blocks
>
> Is it optional or not? Can it be deactivated?
>
> It seems to me that it is optional, and so far only used in msm. If
> there are no other parties in need for that mechanism, the right place
> to have this feature probably is msm, which has all the knowledge about
> when to block already.
>
As with the existing credit_limit, it is optional, although I think it
would also be useful for other drivers that use drm sched for VM_BIND
queues, for the same reason.
BR,
-R
>
> Regards
> P.
>
>
> > * @hang_limit: number of times to allow a job to hang before
> > dropping it.
> > * This mechanism is DEPRECATED. Set it to 0.
> > * @timeout: timeout value in jiffies for submitted jobs.
> > @@ -564,6 +574,7 @@ struct drm_sched_init_args {
> > struct workqueue_struct *timeout_wq;
> > u32 num_rqs;
> > u32 credit_limit;
> > + u32 enqueue_credit_limit;
> > unsigned int hang_limit;
> > long timeout;
> > atomic_t *score;
> > @@ -600,7 +611,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> > struct drm_sched_entity *entity,
> > u32 credits, void *owner);
> > void drm_sched_job_arm(struct drm_sched_job *job);
> > -void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> > +int drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> > int drm_sched_job_add_dependency(struct drm_sched_job *job,
> > struct dma_fence *fence);
> > int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,
>
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 16:15 ` Rob Clark
@ 2025-05-15 16:22 ` Connor Abbott
2025-05-15 17:29 ` Danilo Krummrich
2025-05-15 17:23 ` Danilo Krummrich
1 sibling, 1 reply; 33+ messages in thread
From: Connor Abbott @ 2025-05-15 16:22 UTC (permalink / raw)
To: Rob Clark
Cc: phasta, Rob Clark, dri-devel, freedreno, linux-arm-msm,
Matthew Brost, Danilo Krummrich, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
On Thu, May 15, 2025 at 12:15 PM Rob Clark <robdclark@chromium.org> wrote:
>
> On Thu, May 15, 2025 at 2:28 AM Philipp Stanner <phasta@mailbox.org> wrote:
> >
> > Hello,
> >
> > On Wed, 2025-05-14 at 09:59 -0700, Rob Clark wrote:
> > > From: Rob Clark <robdclark@chromium.org>
> > >
> > > Similar to the existing credit limit mechanism, but applying to jobs
> > > enqueued to the scheduler but not yet run.
> > >
> > > The use case is to put an upper bound on preallocated, and
> > > potentially
> > > unneeded, pgtable pages. When this limit is exceeded, pushing new
> > > jobs
> > > will block until the count drops below the limit.
> >
> > the commit message doesn't make clear why that's needed within the
> > scheduler.
> >
> > From what I understand from the cover letter, this is a (rare?) Vulkan
> > feature. And as important as Vulkan is, it's the drivers that implement
> > support for it. I don't see why the scheduler is a blocker.
>
> Maybe not rare, or at least it comes up with a group of deqp-vk tests ;-)
>
> Basically it is a way to throttle userspace to prevent it from OoM'ing
> itself. (I suppose userspace could throttle itself, but it doesn't
> really know how much pre-allocation will need to be done for pgtable
> updates.)
For some context, other drivers have the concept of a "synchronous"
VM_BIND ioctl which completes immediately, and drivers implement it by
waiting for the whole thing to finish before returning. But this
doesn't work for native context, where everything has to be
asynchronous, so we're trying a new approach where we instead submit
an asynchronous bind for "normal" (non-sparse/driver internal)
allocations and only attach its out-fence as an in-fence of
subsequent submits to other queues. Once you do this, you need a
limit like this to prevent memory usage from pending page table
updates from getting out of control. Other drivers haven't needed this
yet, but they will when they get native context support.
Connor
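A loose userspace-side sketch of that scheme, where submit_vm_bind()
and submit_cmdbuf() are hypothetical wrappers around the driver's
VM_BIND and submit ioctls, and only the drmSyncobjCreate() call is real
libdrm API:

   uint32_t bind_done;                    /* out-fence of the async MAP */
   drmSyncobjCreate(fd, 0, &bind_done);

   /* queue the MAP asynchronously on the VM_BIND queue ... */
   submit_vm_bind(fd, vm_bind_queue, MAP, iova, bo_handle, bind_done /* out */);

   /* ... and make later GPU work wait on it, instead of blocking here */
   submit_cmdbuf(fd, gfx_queue, cmdbuf, bind_done /* in */);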
>
> > All the knowledge about when to stop pushing into the entity is in the
> > driver, and the scheduler obtains all the knowledge about that from the
> > driver anyways.
> >
> > So you could do
> >
> > if (my_vulkan_condition())
> > drm_sched_entity_push_job();
> >
> > couldn't you?
>
> It would need to reach in and use the sched's job_scheduled
> wait_queue_head_t... if that isn't too ugly, maybe the rest could be
> implemented on top of sched. But it seemed like a reasonable thing
> for the scheduler to support directly.
>
> > >
> > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > ---
> > > drivers/gpu/drm/scheduler/sched_entity.c | 16 ++++++++++++++--
> > > drivers/gpu/drm/scheduler/sched_main.c | 3 +++
> > > include/drm/gpu_scheduler.h | 13 ++++++++++++-
> > > 3 files changed, 29 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
> > > b/drivers/gpu/drm/scheduler/sched_entity.c
> > > index dc0e60d2c14b..c5f688362a34 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > > @@ -580,11 +580,21 @@ void drm_sched_entity_select_rq(struct
> > > drm_sched_entity *entity)
> > > * under common lock for the struct drm_sched_entity that was set up
> > > for
> > > * @sched_job in drm_sched_job_init().
> > > */
> > > -void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> > > +int drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> >
> > Return code would need to be documented in the docstring, too. If we'd
> > go for that solution.
> >
> > > {
> > > struct drm_sched_entity *entity = sched_job->entity;
> > > + struct drm_gpu_scheduler *sched = sched_job->sched;
> > > bool first;
> > > ktime_t submit_ts;
> > > + int ret;
> > > +
> > > + ret = wait_event_interruptible(
> > > + sched->job_scheduled,
> > > + atomic_read(&sched->enqueue_credit_count) <=
> > > + sched->enqueue_credit_limit);
> >
> > This very significantly changes the function's semantics. This function
> > is used in a great many drivers, and here it would be transformed into
> > a function that can block.
> >
> > From what I see below those credits are to be optional. But even if, it
> > needs to be clearly documented when a function can block.
>
> Sure. The behavior changes only for drivers that use the
> enqueue_credit_limit, so other drivers should be unaffected.
>
> I can improve the docs.
>
> (Maybe push_credit or something else would be a better name than
> enqueue_credit?)
>
> >
> > > + if (ret)
> > > + return ret;
> > > + atomic_add(sched_job->enqueue_credits, &sched-
> > > >enqueue_credit_count);
> > >
> > > trace_drm_sched_job(sched_job, entity);
> > > atomic_inc(entity->rq->sched->score);
> > > @@ -609,7 +619,7 @@ void drm_sched_entity_push_job(struct
> > > drm_sched_job *sched_job)
> > > spin_unlock(&entity->lock);
> > >
> > > DRM_ERROR("Trying to push to a killed
> > > entity\n");
> > > - return;
> > > + return -EINVAL;
> > > }
> > >
> > > rq = entity->rq;
> > > @@ -626,5 +636,7 @@ void drm_sched_entity_push_job(struct
> > > drm_sched_job *sched_job)
> > >
> > > drm_sched_wakeup(sched);
> > > }
> > > +
> > > + return 0;
> > > }
> > > EXPORT_SYMBOL(drm_sched_entity_push_job);
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 9412bffa8c74..1102cca69cb4 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -1217,6 +1217,7 @@ static void drm_sched_run_job_work(struct
> > > work_struct *w)
> > >
> > > trace_drm_run_job(sched_job, entity);
> > > fence = sched->ops->run_job(sched_job);
> > > + atomic_sub(sched_job->enqueue_credits, &sched-
> > > >enqueue_credit_count);
> > > complete_all(&entity->entity_idle);
> > > drm_sched_fence_scheduled(s_fence, fence);
> > >
> > > @@ -1253,6 +1254,7 @@ int drm_sched_init(struct drm_gpu_scheduler
> > > *sched, const struct drm_sched_init_
> > >
> > > sched->ops = args->ops;
> > > sched->credit_limit = args->credit_limit;
> > > + sched->enqueue_credit_limit = args->enqueue_credit_limit;
> > > sched->name = args->name;
> > > sched->timeout = args->timeout;
> > > sched->hang_limit = args->hang_limit;
> > > @@ -1308,6 +1310,7 @@ int drm_sched_init(struct drm_gpu_scheduler
> > > *sched, const struct drm_sched_init_
> > > INIT_LIST_HEAD(&sched->pending_list);
> > > spin_lock_init(&sched->job_list_lock);
> > > atomic_set(&sched->credit_count, 0);
> > > + atomic_set(&sched->enqueue_credit_count, 0);
> > > INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> > > INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
> > > INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
> > > diff --git a/include/drm/gpu_scheduler.h
> > > b/include/drm/gpu_scheduler.h
> > > index da64232c989d..d830ffe083f1 100644
> > > --- a/include/drm/gpu_scheduler.h
> > > +++ b/include/drm/gpu_scheduler.h
> > > @@ -329,6 +329,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct
> > > dma_fence *f);
> > > * @s_fence: contains the fences for the scheduling of job.
> > > * @finish_cb: the callback for the finished fence.
> > > * @credits: the number of credits this job contributes to the
> > > scheduler
> > > + * @enqueue_credits: the number of enqueue credits this job
> > > contributes
> > > * @work: Helper to reschedule job kill to different context.
> > > * @id: a unique id assigned to each job scheduled on the scheduler.
> > > * @karma: increment on every hang caused by this job. If this
> > > exceeds the hang
> > > @@ -366,6 +367,7 @@ struct drm_sched_job {
> > >
> > > enum drm_sched_priority s_priority;
> > > u32 credits;
> > > + u32 enqueue_credits;
> >
> > What's the policy of setting this?
> >
> > drm_sched_job_init() and drm_sched_job_arm() are responsible for
> > initializing jobs.
>
> It should be set before drm_sched_entity_push_job(). I wouldn't
> really expect drivers to know the value at drm_sched_job_init() time.
> But they would by the time drm_sched_entity_push_job() is called.
>
> > > /** @last_dependency: tracks @dependencies as they signal */
> > > unsigned int last_dependency;
> > > atomic_t karma;
> > > @@ -485,6 +487,10 @@ struct drm_sched_backend_ops {
> > > * @ops: backend operations provided by the driver.
> > > * @credit_limit: the credit limit of this scheduler
> > > * @credit_count: the current credit count of this scheduler
> > > + * @enqueue_credit_limit: the credit limit of jobs pushed to
> > > scheduler and not
> > > + * yet run
> > > + * @enqueue_credit_count: the current credit count of jobs pushed to
> > > scheduler
> > > + * but not yet run
> > > * @timeout: the time after which a job is removed from the
> > > scheduler.
> > > * @name: name of the ring for which this scheduler is being used.
> > > * @num_rqs: Number of run-queues. This is at most
> > > DRM_SCHED_PRIORITY_COUNT,
> > > @@ -518,6 +524,8 @@ struct drm_gpu_scheduler {
> > > const struct drm_sched_backend_ops *ops;
> > > u32 credit_limit;
> > > atomic_t credit_count;
> > > + u32 enqueue_credit_limit;
> > > + atomic_t enqueue_credit_count;
> > > long timeout;
> > > const char *name;
> > > u32 num_rqs;
> > > @@ -550,6 +558,8 @@ struct drm_gpu_scheduler {
> > > * @num_rqs: Number of run-queues. This may be at most
> > > DRM_SCHED_PRIORITY_COUNT,
> > > * as there's usually one run-queue per priority, but may
> > > be less.
> > > * @credit_limit: the number of credits this scheduler can hold from
> > > all jobs
> > > + * @enqueue_credit_limit: the number of credits that can be enqueued
> > > before
> > > + * drm_sched_entity_push_job() blocks
> >
> > Is it optional or not? Can it be deactivated?
> >
> > It seems to me that it is optional, and so far only used in msm. If
> > there are no other parties in need for that mechanism, the right place
> > to have this feature probably is msm, which has all the knowledge about
> > when to block already.
> >
>
> As with the existing credit_limit, it is optional. Although I think
> it would be also useful for other drivers that use drm sched for
> VM_BIND queues, for the same reason.
>
> BR,
> -R
>
> >
> > Regards
> > P.
> >
> >
> > > * @hang_limit: number of times to allow a job to hang before
> > > dropping it.
> > > * This mechanism is DEPRECATED. Set it to 0.
> > > * @timeout: timeout value in jiffies for submitted jobs.
> > > @@ -564,6 +574,7 @@ struct drm_sched_init_args {
> > > struct workqueue_struct *timeout_wq;
> > > u32 num_rqs;
> > > u32 credit_limit;
> > > + u32 enqueue_credit_limit;
> > > unsigned int hang_limit;
> > > long timeout;
> > > atomic_t *score;
> > > @@ -600,7 +611,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> > > struct drm_sched_entity *entity,
> > > u32 credits, void *owner);
> > > void drm_sched_job_arm(struct drm_sched_job *job);
> > > -void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> > > +int drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> > > int drm_sched_job_add_dependency(struct drm_sched_job *job,
> > > struct dma_fence *fence);
> > > int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,
> >
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 16:15 ` Rob Clark
2025-05-15 16:22 ` Connor Abbott
@ 2025-05-15 17:23 ` Danilo Krummrich
2025-05-15 17:36 ` Rob Clark
1 sibling, 1 reply; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-15 17:23 UTC (permalink / raw)
To: Rob Clark
Cc: phasta, Rob Clark, dri-devel, freedreno, linux-arm-msm,
Connor Abbott, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
On Thu, May 15, 2025 at 09:15:08AM -0700, Rob Clark wrote:
> Basically it is a way to throttle userspace to prevent it from OoM'ing
> itself. (I suppose userspace could throttle itself, but it doesn't
> really know how much pre-allocation will need to be done for pgtable
> updates.)
I assume you mean prevent a single process from OOM'ing itself by queuing up
VM_BIND requests much faster than they can be completed and hence
pre-allocations for page tables get out of control?
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 16:22 ` Connor Abbott
@ 2025-05-15 17:29 ` Danilo Krummrich
2025-05-15 17:40 ` Rob Clark
0 siblings, 1 reply; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-15 17:29 UTC (permalink / raw)
To: Connor Abbott
Cc: Rob Clark, phasta, Rob Clark, dri-devel, freedreno, linux-arm-msm,
Matthew Brost, Christian König, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
open list, Boris Brezillon
(Cc: Boris)
On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> For some context, other drivers have the concept of a "synchronous"
> VM_BIND ioctl which completes immediately, and drivers implement it by
> waiting for the whole thing to finish before returning.
Nouveau implements sync by issuing a normal async VM_BIND and
subsequently waiting for the out-fence synchronously.
> But this
> doesn't work for native context, where everything has to be
> asynchronous, so we're trying a new approach where we instead submit
> an asynchronous bind for "normal" (non-sparse/driver internal)
> allocations and only attach its out-fence to the in-fence of
> subsequent submits to other queues.
This is what nouveau does and I think other drivers like Xe and panthor do this
as well.
> Once you do this then you need a
> limit like this to prevent memory usage from pending page table
> updates from getting out of control. Other drivers haven't needed this
> yet, but they will when they get native context support.
What are the cases where you did run into this, i.e. which application in
userspace hit this? Was it the CTS, some game, something else?
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 17:23 ` Danilo Krummrich
@ 2025-05-15 17:36 ` Rob Clark
0 siblings, 0 replies; 33+ messages in thread
From: Rob Clark @ 2025-05-15 17:36 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Rob Clark, phasta, dri-devel, freedreno, linux-arm-msm,
Connor Abbott, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list
On Thu, May 15, 2025 at 10:23 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Thu, May 15, 2025 at 09:15:08AM -0700, Rob Clark wrote:
> > Basically it is a way to throttle userspace to prevent it from OoM'ing
> > itself. (I suppose userspace could throttle itself, but it doesn't
> > really know how much pre-allocation will need to be done for pgtable
> > updates.)
>
> I assume you mean prevent a single process from OOM'ing itself by queuing up
> VM_BIND requests much faster than they can be completed and hence
> pre-allocations for page tables get out of control?
Yes
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 17:29 ` Danilo Krummrich
@ 2025-05-15 17:40 ` Rob Clark
2025-05-15 18:56 ` Danilo Krummrich
0 siblings, 1 reply; 33+ messages in thread
From: Rob Clark @ 2025-05-15 17:40 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> (Cc: Boris)
>
> On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > For some context, other drivers have the concept of a "synchronous"
> > VM_BIND ioctl which completes immediately, and drivers implement it by
> > waiting for the whole thing to finish before returning.
>
> Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> waits for the out-fence synchronously.
As Connor mentioned, we'd prefer it to be async rather than blocking,
in normal cases, otherwise with drm native context for using native
UMD in guest VM, you'd be blocking the single host/VMM virglrender
thread.
The key is we want to keep it async in the normal cases, and not have
weird edge case CTS tests blow up from being _too_ async ;-)
> > But this
> > doesn't work for native context, where everything has to be
> > asynchronous, so we're trying a new approach where we instead submit
> > an asynchronous bind for "normal" (non-sparse/driver internal)
> > allocations and only attach its out-fence to the in-fence of
> > subsequent submits to other queues.
>
> This is what nouveau does and I think other drivers like Xe and panthor do this
> as well.
No one has added native context support for these drivers yet
> > Once you do this then you need a
> > limit like this to prevent memory usage from pending page table
> > updates from getting out of control. Other drivers haven't needed this
> > yet, but they will when they get native context support.
>
> What are the cases where you did run into this, i.e. which application in
> userspace hit this? Was it the CTS, some game, something else?
CTS tests that do weird things with massive # of small bind/unbind. I
wouldn't expect to hit the blocking case in the real world.
BR,
-R
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 17:40 ` Rob Clark
@ 2025-05-15 18:56 ` Danilo Krummrich
2025-05-15 19:56 ` Rob Clark
0 siblings, 1 reply; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-15 18:56 UTC (permalink / raw)
To: Rob Clark
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > (Cc: Boris)
> >
> > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > For some context, other drivers have the concept of a "synchronous"
> > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > waiting for the whole thing to finish before returning.
> >
> > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > waits for the out-fence synchronously.
>
> As Connor mentioned, we'd prefer it to be async rather than blocking,
> in normal cases, otherwise with drm native context for using native
> UMD in guest VM, you'd be blocking the single host/VMM virglrender
> thread.
>
> The key is we want to keep it async in the normal cases, and not have
> weird edge case CTS tests blow up from being _too_ async ;-)
I really wonder why they don't blow up in Nouveau, which also supports full
asynchronous VM_BIND. Mind sharing which tests blow up? :)
> > > But this
> > > doesn't work for native context, where everything has to be
> > > asynchronous, so we're trying a new approach where we instead submit
> > > an asynchronous bind for "normal" (non-sparse/driver internal)
> > > allocations and only attach its out-fence to the in-fence of
> > > subsequent submits to other queues.
> >
> > This is what nouveau does and I think other drivers like Xe and panthor do this
> > as well.
>
> No one has added native context support for these drivers yet
Huh? What exactly do you mean with "native context" then?
> > > Once you do this then you need a
> > > limit like this to prevent memory usage from pending page table
> > > updates from getting out of control. Other drivers haven't needed this
> > > yet, but they will when they get native context support.
> >
> > What are the cases where you did run into this, i.e. which application in
> > userspace hit this? Was it the CTS, some game, something else?
>
> CTS tests that do weird things with massive # of small bind/unbind. I
> wouldn't expect to hit the blocking case in the real world.
As mentioned above, can you please share them? I'd like to play around a bit. :)
- Danilo
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 18:56 ` Danilo Krummrich
@ 2025-05-15 19:56 ` Rob Clark
2025-05-20 7:06 ` Danilo Krummrich
0 siblings, 1 reply; 33+ messages in thread
From: Rob Clark @ 2025-05-15 19:56 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 15, 2025 at 11:56 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> > On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > >
> > > (Cc: Boris)
> > >
> > > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > > For some context, other drivers have the concept of a "synchronous"
> > > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > > waiting for the whole thing to finish before returning.
> > >
> > > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > > waits for the out-fence synchronously.
> >
> > As Connor mentioned, we'd prefer it to be async rather than blocking,
> > in normal cases, otherwise with drm native context for using native
> > UMD in guest VM, you'd be blocking the single host/VMM virglrender
> > thread.
> >
> > The key is we want to keep it async in the normal cases, and not have
> > weird edge case CTS tests blow up from being _too_ async ;-)
>
> I really wonder why they don't blow up in Nouveau, which also support full
> asynchronous VM_BIND. Mind sharing which tests blow up? :)
Maybe it was dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24,
but I might be mixing that up, I'd have to back out this patch and see
where things blow up, which would take many hours.
There definitely was one where I was seeing >5k VM_BIND jobs pile up,
so absolutely throttling like this is needed.
Part of the VM_BIND for msm series adds some tracepoints for amount of
memory preallocated vs used for each job. That plus scheduler
tracepoints should let you see how much memory is tied up in
prealloc'd pgtables. You might not be noticing only because you are
running on a big desktop with lots of RAM ;-)
> > > > But this
> > > > doesn't work for native context, where everything has to be
> > > > asynchronous, so we're trying a new approach where we instead submit
> > > > an asynchronous bind for "normal" (non-sparse/driver internal)
> > > > allocations and only attach its out-fence to the in-fence of
> > > > subsequent submits to other queues.
> > >
> > > This is what nouveau does and I think other drivers like Xe and panthor do this
> > > as well.
> >
> > No one has added native context support for these drivers yet
>
> Huh? What exactly do you mean with "native context" then?
It is a way to use the native usermode driver in a guest VM, by remoting
at the UAPI level, as opposed to the vk or gl API level. You can
generally get performance equal to native, but the guest/host boundary
strongly encourages asynchronous operation to hide the guest->host latency.
https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/693
https://indico.freedesktop.org/event/2/contributions/53/attachments/76/121/XDC2022_%20virtgpu%20drm%20native%20context.pdf
So far there is (merged) support for msm + freedreno/turnip, amdgpu +
radeonsi/radv, with MRs in-flight for i915 and asahi.
BR,
-R
> > > > Once you do this then you need a
> > > > limit like this to prevent memory usage from pending page table
> > > > updates from getting out of control. Other drivers haven't needed this
> > > > yet, but they will when they get native context support.
> > >
> > > What are the cases where you did run into this, i.e. which application in
> > > userspace hit this? Was it the CTS, some game, something else?
> >
> > CTS tests that do weird things with massive # of small bind/unbind. I
> > wouldn't expect to hit the blocking case in the real world.
>
> As mentioned above, can you please share them? I'd like to play around a bit. :)
>
> - Danilo
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-15 19:56 ` Rob Clark
@ 2025-05-20 7:06 ` Danilo Krummrich
2025-05-20 16:07 ` Rob Clark
0 siblings, 1 reply; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-20 7:06 UTC (permalink / raw)
To: Rob Clark
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 15, 2025 at 12:56:38PM -0700, Rob Clark wrote:
> On Thu, May 15, 2025 at 11:56 AM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> > > On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > >
> > > > (Cc: Boris)
> > > >
> > > > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > > > For some context, other drivers have the concept of a "synchronous"
> > > > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > > > waiting for the whole thing to finish before returning.
> > > >
> > > > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > > > waits for the out-fence synchronously.
> > >
> > > As Connor mentioned, we'd prefer it to be async rather than blocking,
> > > in normal cases, otherwise with drm native context for using native
> > > UMD in guest VM, you'd be blocking the single host/VMM virglrender
> > > thread.
> > >
> > > The key is we want to keep it async in the normal cases, and not have
> > > weird edge case CTS tests blow up from being _too_ async ;-)
> >
> > I really wonder why they don't blow up in Nouveau, which also support full
> > asynchronous VM_BIND. Mind sharing which tests blow up? :)
>
> Maybe it was dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24,
The test above is part of the smoke testing I do for nouveau, but I
haven't seen such issues there yet.
> but I might be mixing that up, I'd have to back out this patch and see
> where things blow up, which would take many hours.
Well, you said that you never had this issue with "real" workloads, but only
with VK CTS, so I really think we should know what we are trying to fix here.
We can't just add new generic infrastructure without reasonable and *well
understood* justification.
> There definitely was one where I was seeing >5k VM_BIND jobs pile up,
> so absolutely throttling like this is needed.
I still don't understand why the kernel must throttle this? If userspace uses
async VM_BIND, it obviously can't spam the kernel infinitely without running
into an OOM case.
But let's assume we agree that we want to prevent userspace from ever
OOM'ing itself through async VM_BIND; even then, the proposed solution
seems wrong:
Do we really want the driver developer to set an arbitrary boundary of a number
of jobs that can be submitted before *async* VM_BIND blocks and becomes
semi-sync?
How do we choose this number of jobs? A very small number to be safe, which
scales badly on powerful machines? A large number that scales well on powerful
machines, but OOMs on weaker ones?
I really think this isn't the correct solution, but more of a workaround.
> Part of the VM_BIND for msm series adds some tracepoints for amount of
> memory preallocated vs used for each job. That plus scheduler
> tracepoints should let you see how much memory is tied up in
> prealloc'd pgtables. You might not be noticing only because you are
> running on a big desktop with lots of RAM ;-)
>
> > > > > But this
> > > > > doesn't work for native context, where everything has to be
> > > > > asynchronous, so we're trying a new approach where we instead submit
> > > > > an asynchronous bind for "normal" (non-sparse/driver internal)
> > > > > allocations and only attach its out-fence to the in-fence of
> > > > > subsequent submits to other queues.
> > > >
> > > > This is what nouveau does and I think other drivers like Xe and panthor do this
> > > > as well.
> > >
> > > No one has added native context support for these drivers yet
> >
> > Huh? What exactly do you mean with "native context" then?
>
> It is a way to use native usermode driver in a guest VM, by remoting
> at the UAPI level, as opposed to the vk or gl API level. You can
> generally get equal to native performance, but the guest/host boundary
> strongly encourages asynchronous to hide the guest->host latency.
For the context we're discussing, this isn't different from other
drivers supporting async VM_BIND, utilizing it from the host rather
than from a guest.
So, my original statement about nouveau, Xe, panthor doing the same thing
without running into trouble should be valid.
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-20 7:06 ` Danilo Krummrich
@ 2025-05-20 16:07 ` Rob Clark
2025-05-20 16:54 ` Danilo Krummrich
0 siblings, 1 reply; 33+ messages in thread
From: Rob Clark @ 2025-05-20 16:07 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Thu, May 15, 2025 at 12:56:38PM -0700, Rob Clark wrote:
> > On Thu, May 15, 2025 at 11:56 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > >
> > > On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> > > > On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > >
> > > > > (Cc: Boris)
> > > > >
> > > > > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > > > > For some context, other drivers have the concept of a "synchronous"
> > > > > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > > > > waiting for the whole thing to finish before returning.
> > > > >
> > > > > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > > > > waits for the out-fence synchronously.
> > > >
> > > > As Connor mentioned, we'd prefer it to be async rather than blocking,
> > > > in normal cases, otherwise with drm native context for using native
> > > > UMD in guest VM, you'd be blocking the single host/VMM virglrender
> > > > thread.
> > > >
> > > > The key is we want to keep it async in the normal cases, and not have
> > > > weird edge case CTS tests blow up from being _too_ async ;-)
> > >
> > > I really wonder why they don't blow up in Nouveau, which also support full
> > > asynchronous VM_BIND. Mind sharing which tests blow up? :)
> >
> > Maybe it was dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24,
>
> The test above is part of the smoke testing I do for nouveau, but I haven't seen
> such issues yet for nouveau.
nouveau is probably not using async binds for everything? Or maybe
I'm just pointing to the wrong test.
> > but I might be mixing that up, I'd have to back out this patch and see
> > where things blow up, which would take many hours.
>
> Well, you said that you never had this issue with "real" workloads, but only
> with VK CTS, so I really think we should know what we are trying to fix here.
>
> We can't just add new generic infrastructure without reasonable and *well
> understood* justification.
What is not well understood about this? We need to pre-allocate
pgtable memory that we will likely end up not needing.
In the worst case, with a large # of async PAGE_SIZE binds, you end up
needing to pre-allocate 3 pgtable pages (for a 4 level pgtable) per one
page of mapping. Queue up enough of those and you can explode your
memory usage.
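To put rough, illustrative numbers on it: the >5k queued single-page
binds mentioned earlier in the thread, at 3 prealloc'd 4KB pgtable
pages each, works out to about 5000 * 3 * 4KB = ~60MB of preallocation
sitting around to back only ~20MB of actual mappings.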
> > There definitely was one where I was seeing >5k VM_BIND jobs pile up,
> > so absolutely throttling like this is needed.
>
> I still don't understand why the kernel must throttle this? If userspace uses
> async VM_BIND, it obviously can't spam the kernel infinitely without running
> into an OOM case.
It is a valid question whether the kernel or userspace should be the
one to do the throttling.
I went for doing it in the kernel because the kernel has better
knowledge of how much it needs to pre-allocate.
(There is also the side point that this pre-allocated memory is not
charged to the calling process from a memory accounting PoV. With
that in mind, it seems like a good idea for the kernel to throttle
memory usage.)
> But let's assume we agree that we want to avoid that userspace can ever OOM itself
> through async VM_BIND, then the proposed solution seems wrong:
>
> Do we really want the driver developer to set an arbitrary boundary of a number
> of jobs that can be submitted before *async* VM_BIND blocks and becomes
> semi-sync?
>
> How do we choose this number of jobs? A very small number to be safe, which
> scales badly on powerful machines? A large number that scales well on powerful
> machines, but OOMs on weaker ones?
The way I am using it in msm, the credit amount and limit are in units
of pre-allocated pages in-flight. I set the enqueue_credit_limit to
1024 pages, once there are jobs queued up exceeding that limit, they
start blocking.
The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
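To make that concrete, the flow looks roughly like this (just a sketch; apart
from enqueue_credit_limit, the field and helper names are illustrative, not
the literal patch):

  /* scheduler init: the cap is in units of pre-allocated pgtable pages */
  args.enqueue_credit_limit = 1024;

  /* VM_BIND submit: charge the job for the pages pre-allocated for it */
  job->enqueue_credits = prealloc_page_count(job);  /* illustrative helper */

  /* pushing the job blocks once the sum of in-flight credits would exceed
   * the limit; the credits are returned when the job has run and its unused
   * pre-allocated pages are freed */
  drm_sched_job_arm(&job->base);
  drm_sched_entity_push_job(&job->base);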
> I really think, this isn't the correct solution, but more a workaround.
>
> > Part of the VM_BIND for msm series adds some tracepoints for amount of
> > memory preallocated vs used for each job. That plus scheduler
> > tracepoints should let you see how much memory is tied up in
> > prealloc'd pgtables. You might not be noticing only because you are
> > running on a big desktop with lots of RAM ;-)
> >
> > > > > > But this
> > > > > > doesn't work for native context, where everything has to be
> > > > > > asynchronous, so we're trying a new approach where we instead submit
> > > > > > an asynchronous bind for "normal" (non-sparse/driver internal)
> > > > > > allocations and only attach its out-fence to the in-fence of
> > > > > > subsequent submits to other queues.
> > > > >
> > > > > This is what nouveau does and I think other drivers like Xe and panthor do this
> > > > > as well.
> > > >
> > > > No one has added native context support for these drivers yet
> > >
> > > Huh? What exactly do you mean with "native context" then?
> >
> > It is a way to use native usermode driver in a guest VM, by remoting
> > at the UAPI level, as opposed to the vk or gl API level. You can
> > generally get equal to native performance, but the guest/host boundary
> > strongly encourages asynchronous to hide the guest->host latency.
>
> For the context we're discussing this isn't different to other drivers supporting
> async VM_BIND utilizing it from the host, rather than from a guest.
>
> So, my original statement about nouveau, Xe, panthor doing the same thing
> without running into trouble should be valid.
Probably the difference is that we don't do any _synchronous_ binds.
And that is partially motivated by the virtual machine case.
BR,
-R
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-20 16:07 ` Rob Clark
@ 2025-05-20 16:54 ` Danilo Krummrich
2025-05-20 17:05 ` Connor Abbott
2025-05-20 17:22 ` Rob Clark
0 siblings, 2 replies; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-20 16:54 UTC (permalink / raw)
To: Rob Clark
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Thu, May 15, 2025 at 12:56:38PM -0700, Rob Clark wrote:
> > > On Thu, May 15, 2025 at 11:56 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > >
> > > > On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> > > > > On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > >
> > > > > > (Cc: Boris)
> > > > > >
> > > > > > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > > > > > For some context, other drivers have the concept of a "synchronous"
> > > > > > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > > > > > waiting for the whole thing to finish before returning.
> > > > > >
> > > > > > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > > > > > waits for the out-fence synchronously.
> > > > >
> > > > > As Connor mentioned, we'd prefer it to be async rather than blocking,
> > > > > in normal cases, otherwise with drm native context for using native
> > > > > UMD in guest VM, you'd be blocking the single host/VMM virglrender
> > > > > thread.
> > > > >
> > > > > The key is we want to keep it async in the normal cases, and not have
> > > > > weird edge case CTS tests blow up from being _too_ async ;-)
> > > >
> > > > I really wonder why they don't blow up in Nouveau, which also support full
> > > > asynchronous VM_BIND. Mind sharing which tests blow up? :)
> > >
> > > Maybe it was dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24,
> >
> > The test above is part of the smoke testing I do for nouveau, but I haven't seen
> > such issues yet for nouveau.
>
> nouveau is probably not using async binds for everything? Or maybe
> I'm just pointing to the wrong test.
Let me double check later on.
> > > but I might be mixing that up, I'd have to back out this patch and see
> > > where things blow up, which would take many hours.
> >
> > Well, you said that you never had this issue with "real" workloads, but only
> > with VK CTS, so I really think we should know what we are trying to fix here.
> >
> > We can't just add new generic infrastructure without reasonable and *well
> > understood* justification.
>
> What is not well understood about this? We need to pre-allocate
> memory that we likely don't need for pagetables.
>
> In the worst case, a large # of async PAGE_SIZE binds, you end up
> needing to pre-allocate 3 pgtable pages (4 lvl pgtable) per one page
> of mapping. Queue up enough of those and you can explode your memory
> usage.
Well, the general principle of how this can OOM is well understood, sure. What's
not well understood is how we run into this case. I think we should also
understand what test causes the issue and why other drivers are not affected
(yet).
> > > There definitely was one where I was seeing >5k VM_BIND jobs pile up,
> > > so absolutely throttling like this is needed.
> >
> > I still don't understand why the kernel must throttle this? If userspace uses
> > async VM_BIND, it obviously can't spam the kernel infinitely without running
> > into an OOM case.
>
> It is a valid question about whether the kernel or userspace should be
> the one to do the throttling.
>
> I went for doing it in the kernel because the kernel has better
> knowledge of how much it needs to pre-allocate.
>
> (There is also the side point, that this pre-allocated memory is not
> charged to the calling process from a PoV of memory accounting. So
> with that in mind it seems like a good idea for the kernel to throttle
> memory usage.)
That's a very valid point, maybe we should investigate in the direction of
addressing this, rather than trying to work around it in the scheduler, where we
can only set an arbitrary credit limit.
> > But let's assume we agree that we want to avoid that userspace can ever OOM itself
> > through async VM_BIND, then the proposed solution seems wrong:
> >
> > Do we really want the driver developer to set an arbitrary boundary of a number
> > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > semi-sync?
> >
> > How do we choose this number of jobs? A very small number to be safe, which
> > scales badly on powerful machines? A large number that scales well on powerful
> > machines, but OOMs on weaker ones?
>
> The way I am using it in msm, the credit amount and limit are in units
> of pre-allocated pages in-flight. I set the enqueue_credit_limit to
> 1024 pages, once there are jobs queued up exceeding that limit, they
> start blocking.
>
> The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
That doesn't make a difference for my question. How do you know 1024 pages is a
good value? How do we scale for different machines with different capabilities?
If you have a powerful machine with lots of memory, we might throttle userspace
for no reason, no?
If the machine has very limited resources, it might already be too much?
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-20 16:54 ` Danilo Krummrich
@ 2025-05-20 17:05 ` Connor Abbott
2025-05-20 17:22 ` Rob Clark
1 sibling, 0 replies; 33+ messages in thread
From: Connor Abbott @ 2025-05-20 17:05 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Rob Clark, Rob Clark, phasta, dri-devel, freedreno, linux-arm-msm,
Matthew Brost, Christian König, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
open list, Boris Brezillon
On Tue, May 20, 2025 at 12:54 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> > On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > >
> > > On Thu, May 15, 2025 at 12:56:38PM -0700, Rob Clark wrote:
> > > > On Thu, May 15, 2025 at 11:56 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > >
> > > > > On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> > > > > > On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > > >
> > > > > > > (Cc: Boris)
> > > > > > >
> > > > > > > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > > > > > > For some context, other drivers have the concept of a "synchronous"
> > > > > > > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > > > > > > waiting for the whole thing to finish before returning.
> > > > > > >
> > > > > > > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > > > > > > waits for the out-fence synchronously.
> > > > > >
> > > > > > As Connor mentioned, we'd prefer it to be async rather than blocking,
> > > > > > in normal cases, otherwise with drm native context for using native
> > > > > > UMD in guest VM, you'd be blocking the single host/VMM virglrender
> > > > > > thread.
> > > > > >
> > > > > > The key is we want to keep it async in the normal cases, and not have
> > > > > > weird edge case CTS tests blow up from being _too_ async ;-)
> > > > >
> > > > > I really wonder why they don't blow up in Nouveau, which also support full
> > > > > asynchronous VM_BIND. Mind sharing which tests blow up? :)
> > > >
> > > > Maybe it was dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24,
> > >
> > > The test above is part of the smoke testing I do for nouveau, but I haven't seen
> > > such issues yet for nouveau.
> >
> > nouveau is probably not using async binds for everything? Or maybe
> > I'm just pointing to the wrong test.
>
> Let me double check later on.
>
> > > > but I might be mixing that up, I'd have to back out this patch and see
> > > > where things blow up, which would take many hours.
> > >
> > > Well, you said that you never had this issue with "real" workloads, but only
> > > with VK CTS, so I really think we should know what we are trying to fix here.
> > >
> > > We can't just add new generic infrastructure without reasonable and *well
> > > understood* justification.
> >
> > What is not well understood about this? We need to pre-allocate
> > memory that we likely don't need for pagetables.
> >
> > In the worst case, a large # of async PAGE_SIZE binds, you end up
> > needing to pre-allocate 3 pgtable pages (4 lvl pgtable) per one page
> > of mapping. Queue up enough of those and you can explode your memory
> > usage.
>
> Well, the general principle of how this can OOM is well understood, sure. What's
> not well understood is how we run into this case. I think we should also
> understand what test causes the issue and why other drivers are not affected
> (yet).
Once again, it's well understood why other drivers aren't affected.
They have both synchronous and asynchronous VM_BINDs in the uabi, and
the userspace driver uses synchronous VM_BIND for everything except
sparse mappings. For freedreno we tried to change that because async
works better for native context, and that exposed the pre-existing
issue: once more mappings became async, async VM_BINDs could hang the
whole system when we run out of memory.
I think it would be possible in theory for other drivers to forward
synchronous VM_BINDs asynchronously to the host as long as the host
kernel executes them synchronously, so maybe other drivers won't have
a problem with native context support. But it will still be possible
to make them fall over if you poke them the right way.
Connor
>
> > > > There definitely was one where I was seeing >5k VM_BIND jobs pile up,
> > > > so absolutely throttling like this is needed.
> > >
> > > I still don't understand why the kernel must throttle this? If userspace uses
> > > async VM_BIND, it obviously can't spam the kernel infinitely without running
> > > into an OOM case.
> >
> > It is a valid question about whether the kernel or userspace should be
> > the one to do the throttling.
> >
> > I went for doing it in the kernel because the kernel has better
> > knowledge of how much it needs to pre-allocate.
> >
> > (There is also the side point, that this pre-allocated memory is not
> > charged to the calling process from a PoV of memory accounting. So
> > with that in mind it seems like a good idea for the kernel to throttle
> > memory usage.)
>
> That's a very valid point, maybe we should investigate in the direction of
> addressing this, rather than trying to work around it in the scheduler, where we
> can only set an arbitrary credit limit.
>
> > > But let's assume we agree that we want to avoid that userspace can ever OOM itself
> > > through async VM_BIND, then the proposed solution seems wrong:
> > >
> > > Do we really want the driver developer to set an arbitrary boundary of a number
> > > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > > semi-sync?
> > >
> > > How do we choose this number of jobs? A very small number to be safe, which
> > > scales badly on powerful machines? A large number that scales well on powerful
> > > machines, but OOMs on weaker ones?
> >
> > The way I am using it in msm, the credit amount and limit are in units
> > of pre-allocated pages in-flight. I set the enqueue_credit_limit to
> > 1024 pages, once there are jobs queued up exceeding that limit, they
> > start blocking.
> >
> > The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
>
> That doesn't make a difference for my question. How do you know 1024 pages is a
> good value? How do we scale for different machines with different capabilities?
>
> If you have a powerful machine with lots of memory, we might throttle userspace
> for no reason, no?
>
> If the machine has very limited resources, it might already be too much?
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-20 16:54 ` Danilo Krummrich
2025-05-20 17:05 ` Connor Abbott
@ 2025-05-20 17:22 ` Rob Clark
2025-05-22 11:00 ` Danilo Krummrich
1 sibling, 1 reply; 33+ messages in thread
From: Rob Clark @ 2025-05-20 17:22 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Tue, May 20, 2025 at 9:54 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> > On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > >
> > > On Thu, May 15, 2025 at 12:56:38PM -0700, Rob Clark wrote:
> > > > On Thu, May 15, 2025 at 11:56 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > >
> > > > > On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> > > > > > On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > > >
> > > > > > > (Cc: Boris)
> > > > > > >
> > > > > > > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > > > > > > For some context, other drivers have the concept of a "synchronous"
> > > > > > > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > > > > > > waiting for the whole thing to finish before returning.
> > > > > > >
> > > > > > > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > > > > > > waits for the out-fence synchronously.
> > > > > >
> > > > > > As Connor mentioned, we'd prefer it to be async rather than blocking,
> > > > > > in normal cases, otherwise with drm native context for using native
> > > > > > UMD in guest VM, you'd be blocking the single host/VMM virglrender
> > > > > > thread.
> > > > > >
> > > > > > The key is we want to keep it async in the normal cases, and not have
> > > > > > weird edge case CTS tests blow up from being _too_ async ;-)
> > > > >
> > > > > I really wonder why they don't blow up in Nouveau, which also support full
> > > > > asynchronous VM_BIND. Mind sharing which tests blow up? :)
> > > >
> > > > Maybe it was dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24,
> > >
> > > The test above is part of the smoke testing I do for nouveau, but I haven't seen
> > > such issues yet for nouveau.
> >
> > nouveau is probably not using async binds for everything? Or maybe
> > I'm just pointing to the wrong test.
>
> Let me double check later on.
>
> > > > but I might be mixing that up, I'd have to back out this patch and see
> > > > where things blow up, which would take many hours.
> > >
> > > Well, you said that you never had this issue with "real" workloads, but only
> > > with VK CTS, so I really think we should know what we are trying to fix here.
> > >
> > > We can't just add new generic infrastructure without reasonable and *well
> > > understood* justification.
> >
> > What is not well understood about this? We need to pre-allocate
> > memory that we likely don't need for pagetables.
> >
> > In the worst case, a large # of async PAGE_SIZE binds, you end up
> > needing to pre-allocate 3 pgtable pages (4 lvl pgtable) per one page
> > of mapping. Queue up enough of those and you can explode your memory
> > usage.
>
> Well, the general principle of how this can OOM is well understood, sure. What's
> not well understood is how we run into this case. I think we should also
> understand what test causes the issue and why other drivers are not affected
> (yet).
>
> > > > There definitely was one where I was seeing >5k VM_BIND jobs pile up,
> > > > so absolutely throttling like this is needed.
> > >
> > > I still don't understand why the kernel must throttle this? If userspace uses
> > > async VM_BIND, it obviously can't spam the kernel infinitely without running
> > > into an OOM case.
> >
> > It is a valid question about whether the kernel or userspace should be
> > the one to do the throttling.
> >
> > I went for doing it in the kernel because the kernel has better
> > knowledge of how much it needs to pre-allocate.
> >
> > (There is also the side point, that this pre-allocated memory is not
> > charged to the calling process from a PoV of memory accounting. So
> > with that in mind it seems like a good idea for the kernel to throttle
> > memory usage.)
>
> That's a very valid point, maybe we should investigate in the direction of
> addressing this, rather than trying to work around it in the scheduler, where we
> can only set an arbitrary credit limit.
Perhaps.. but that seems like a bigger can of worms
> > > But let's assume we agree that we want to avoid that userspace can ever OOM itself
> > > through async VM_BIND, then the proposed solution seems wrong:
> > >
> > > Do we really want the driver developer to set an arbitrary boundary of a number
> > > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > > semi-sync?
> > >
> > > How do we choose this number of jobs? A very small number to be safe, which
> > > scales badly on powerful machines? A large number that scales well on powerful
> > > machines, but OOMs on weaker ones?
> >
> > The way I am using it in msm, the credit amount and limit are in units
> > of pre-allocated pages in-flight. I set the enqueue_credit_limit to
> > 1024 pages, once there are jobs queued up exceeding that limit, they
> > start blocking.
> >
> > The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
>
> That doesn't make a difference for my question. How do you know 1024 pages is a
> good value? How do we scale for different machines with different capabilities?
>
> If you have a powerful machine with lots of memory, we might throttle userspace
> for no reason, no?
>
> If the machine has very limited resources, it might already be too much?
It may be a bit arbitrary, but then again I'm not sure that userspace
is in any better position to pick an appropriate limit.
4MB of in-flight pages isn't going to be too much for anything that is
capable enough to run vk, but still allows for a lot of in-flight
maps. As I mentioned before, I don't expect anyone to hit this case
normally, unless they are just trying to poke the driver in weird
ways. Having the kernel guard against that doesn't seem unreasonable.
BR,
-R
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-20 17:22 ` Rob Clark
@ 2025-05-22 11:00 ` Danilo Krummrich
2025-05-22 14:47 ` Rob Clark
0 siblings, 1 reply; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-22 11:00 UTC (permalink / raw)
To: Rob Clark
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Tue, May 20, 2025 at 10:22:54AM -0700, Rob Clark wrote:
> On Tue, May 20, 2025 at 9:54 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> > > On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > But let's assume we agree that we want to avoid that userspace can ever OOM itself
> > > > through async VM_BIND, then the proposed solution seems wrong:
> > > >
> > > > Do we really want the driver developer to set an arbitrary boundary of a number
> > > > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > > > semi-sync?
> > > >
> > > > How do we choose this number of jobs? A very small number to be safe, which
> > > > scales badly on powerful machines? A large number that scales well on powerful
> > > > machines, but OOMs on weaker ones?
> > >
> > > The way I am using it in msm, the credit amount and limit are in units
> > > of pre-allocated pages in-flight. I set the enqueue_credit_limit to
> > > 1024 pages, once there are jobs queued up exceeding that limit, they
> > > start blocking.
> > >
> > > The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
> >
> > That doesn't make a difference for my question. How do you know 1024 pages is a
> > good value? How do we scale for different machines with different capabilities?
> >
> > If you have a powerful machine with lots of memory, we might throttle userspace
> > for no reason, no?
> >
> > If the machine has very limited resources, it might already be too much?
>
> It may be a bit arbitrary, but then again I'm not sure that userspace
> is in any better position to pick an appropriate limit.
>
> 4MB of in-flight pages isn't going to be too much for anything that is
> capable enough to run vk, but still allows for a lot of in-flight
> maps.
Ok, but what about the other way around? What's the performance impact if the
limit is chosen rather small, but we're running on a very powerful machine?
Since you already have the implementation for hardware you have access to, can
you please check if and how performance degrades when you use a very small
threshold?
Also, I think we should probably put this throttle mechanism in a separate
component, that just wraps a counter of bytes or rather pages that can be
increased and decreased through an API and the increase just blocks at a certain
threshold.
This component can then be called by a driver from the job submit IOCTL and the
corresponding place where the pre-allocated memory is actually used / freed.
Depending on the driver, this might not necessarily be in the scheduler's
run_job() callback.
We could call the component something like drm_throttle or drm_submit_throttle.
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-22 11:00 ` Danilo Krummrich
@ 2025-05-22 14:47 ` Rob Clark
2025-05-22 15:53 ` Danilo Krummrich
0 siblings, 1 reply; 33+ messages in thread
From: Rob Clark @ 2025-05-22 14:47 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 22, 2025 at 4:00 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue, May 20, 2025 at 10:22:54AM -0700, Rob Clark wrote:
> > On Tue, May 20, 2025 at 9:54 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> > > > On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > But let's assume we agree that we want to avoid that userspace can ever OOM itself
> > > > > through async VM_BIND, then the proposed solution seems wrong:
> > > > >
> > > > > Do we really want the driver developer to set an arbitrary boundary of a number
> > > > > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > > > > semi-sync?
> > > > >
> > > > > How do we choose this number of jobs? A very small number to be safe, which
> > > > > scales badly on powerful machines? A large number that scales well on powerful
> > > > > machines, but OOMs on weaker ones?
> > > >
> > > > The way I am using it in msm, the credit amount and limit are in units
> > > > of pre-allocated pages in-flight. I set the enqueue_credit_limit to
> > > > 1024 pages, once there are jobs queued up exceeding that limit, they
> > > > start blocking.
> > > >
> > > > The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
> > >
> > > That doesn't make a difference for my question. How do you know 1024 pages is a
> > > good value? How do we scale for different machines with different capabilities?
> > >
> > > If you have a powerful machine with lots of memory, we might throttle userspace
> > > for no reason, no?
> > >
> > > If the machine has very limited resources, it might already be too much?
> >
> > It may be a bit arbitrary, but then again I'm not sure that userspace
> > is in any better position to pick an appropriate limit.
> >
> > 4MB of in-flight pages isn't going to be too much for anything that is
> > capable enough to run vk, but still allows for a lot of in-flight
> > maps.
>
> Ok, but what about the other way around? What's the performance impact if the
> limit is chosen rather small, but we're running on a very powerful machine?
>
> Since you already have the implementation for hardware you have access to, can
> you please check if and how performance degrades when you use a very small
> threshold?
I mean, considering that some drivers (asahi, at least), _only_
implement synchronous VM_BIND, I guess blocking in extreme cases isn't
so bad. But I think you are overthinking this. 4MB of pagetables is
enough to map ~8GB of buffers.
Perhaps drivers would want to set their limit based on the amount of
memory the GPU could map, which might land them on a # larger than
1024, but still not an order of magnitude more.
I don't really have a good setup for testing games that use this, atm,
fex-emu isn't working for me atm. But I think Connor has a setup with
proton working?
But, flip it around. It is pretty simple to create a test program
that submits a flood of 4k (or whatever your min page size is)
VM_BINDs, and see how prealloc memory usage blows up. This is really
the thing this patch is trying to protect against.
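Something along these lines is enough (pseudo-code; submit_async_map() and
GPU_VA_BASE are stand-ins for the real uapi, which I'm not spelling out here):

  #include <stdint.h>

  /* stand-in, not a real API: queue one *asynchronous* VM_BIND MAP op on a
   * VM_BIND queue and return without waiting on the out-fence */
  void submit_async_map(uint64_t iova, uint64_t len);

  #define GPU_VA_BASE 0x100000000ull  /* arbitrary sparse VA range */

  void flood(void)
  {
          /* with a 4KiB granule, striding by 2MiB puts every 4KiB map in a
           * different last-level table, so each one forces its own
           * pgtable pre-allocation */
          for (uint64_t i = 0; i < 100000; i++)
                  submit_async_map(GPU_VA_BASE + i * (2ull << 20), 4096);
  }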
> Also, I think we should probably put this throttle mechanism in a separate
> component, that just wraps a counter of bytes or rather pages that can be
> increased and decreased through an API and the increase just blocks at a certain
> threshold.
Maybe? I don't see why we need to explicitly define the units for the
credit. This wasn't done for the existing credit mechanism.. which,
seems like if you used some extra fences could also have been
implemented externally.
> This component can then be called by a driver from the job submit IOCTL and the
> corresponding place where the pre-allocated memory is actually used / freed.
>
> Depending on the driver, this might not necessarily be in the scheduler's
> run_job() callback.
>
> We could call the component something like drm_throttle or drm_submit_throttle.
Maybe? This still has the same complaint I had about just
implementing this in msm.. it would have to reach in and use the
scheduler's job_scheduled wait-queue. Which, to me at least, seems
like more of an internal detail about how the scheduler works.
BR,
-R
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-22 14:47 ` Rob Clark
@ 2025-05-22 15:53 ` Danilo Krummrich
2025-05-23 2:31 ` Rob Clark
0 siblings, 1 reply; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-22 15:53 UTC (permalink / raw)
To: Rob Clark
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 22, 2025 at 07:47:17AM -0700, Rob Clark wrote:
> On Thu, May 22, 2025 at 4:00 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > On Tue, May 20, 2025 at 10:22:54AM -0700, Rob Clark wrote:
> > > On Tue, May 20, 2025 at 9:54 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> > > > > On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > > But let's assume we agree that we want to avoid that userspace can ever OOM itself
> > > > > > through async VM_BIND, then the proposed solution seems wrong:
> > > > > >
> > > > > > Do we really want the driver developer to set an arbitrary boundary of a number
> > > > > > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > > > > > semi-sync?
> > > > > >
> > > > > > How do we choose this number of jobs? A very small number to be safe, which
> > > > > > scales badly on powerful machines? A large number that scales well on powerful
> > > > > > machines, but OOMs on weaker ones?
> > > > >
> > > > > The way I am using it in msm, the credit amount and limit are in units
> > > > > of pre-allocated pages in-flight. I set the enqueue_credit_limit to
> > > > > 1024 pages, once there are jobs queued up exceeding that limit, they
> > > > > start blocking.
> > > > >
> > > > > The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
> > > >
> > > > That doesn't make a difference for my question. How do you know 1024 pages is a
> > > > good value? How do we scale for different machines with different capabilities?
> > > >
> > > > If you have a powerful machine with lots of memory, we might throttle userspace
> > > > for no reason, no?
> > > >
> > > > If the machine has very limited resources, it might already be too much?
> > >
> > > It may be a bit arbitrary, but then again I'm not sure that userspace
> > > is in any better position to pick an appropriate limit.
> > >
> > > 4MB of in-flight pages isn't going to be too much for anything that is
> > > capable enough to run vk, but still allows for a lot of in-flight
> > > maps.
> >
> > Ok, but what about the other way around? What's the performance impact if the
> > limit is chosen rather small, but we're running on a very powerful machine?
> >
> > Since you already have the implementation for hardware you have access to, can
> > you please check if and how performance degrades when you use a very small
> > threshold?
>
> I mean, considering that some drivers (asahi, at least), _only_
> implement synchronous VM_BIND, I guess blocking in extreme cases isn't
> so bad.
Which is not even upstream yet and eventually will support async VM_BIND too,
AFAIK.
> But I think you are overthinking this. 4MB of pagetables is
> enough to map ~8GB of buffers.
>
> Perhaps drivers would want to set their limit based on the amount of
> memory the GPU could map, which might land them on a # larger than
> 1024, but still not an order of magnitude more.
Nouveau currently supports an address space width of 128TiB.
In general, we have to cover the range of some small laptop or handheld devices
to huge datacenter machines.
> I don't really have a good setup for testing games that use this, atm,
> fex-emu isn't working for me atm. But I think Connor has a setup with
> proton working?
I just want to be sure that an arbitrarily small limit doing the job for a small
device to not fail VK CTS can't regress the performance on large machines.
So, kindly try to prove that we're not prone to extreme performance regression
with a static value as you propose.
> > Also, I think we should probably put this throttle mechanism in a separate
> > component, that just wraps a counter of bytes or rather pages that can be
> > increased and decreased through an API and the increase just blocks at a certain
> > threshold.
>
> Maybe? I don't see why we need to explicitly define the units for the
> credit. This wasn't done for the existing credit mechanism.. which,
> seems like if you used some extra fences could also have been
> implemented externally.
If you are referring to the credit mechanism in the scheduler for ring buffers,
that's a different case. Drivers know the size of their ring buffers exactly and
the scheduler has the responsibility of when to submit tasks to the ring buffer.
So the scheduler kind of owns the resource.
However, the throttle mechanism you propose is independent from the scheduler,
it depends on the available system memory, a resource the scheduler doesn't own.
I'm fine to make the unit credits as well, but in this case we really care about
the consumption of system memory, so we could just use an applicable unit.
> > This component can then be called by a driver from the job submit IOCTL and the
> > corresponding place where the pre-allocated memory is actually used / freed.
> >
> > Depending on the driver, this might not necessarily be in the scheduler's
> > run_job() callback.
> >
> > We could call the component something like drm_throttle or drm_submit_throttle.
>
> Maybe? This still has the same complaint I had about just
> implementing this in msm.. it would have to reach in and use the
> scheduler's job_scheduled wait-queue. Which, to me at least, seems
> like more of an internal detail about how the scheduler works.
Why? The component should use its own waitqueue. Subsequently, from your code
that releases the pre-allocated memory, you can decrement the counter through
the drm_throttle API, which automatically kicks its waitqueue.
For instance from your VM_BIND IOCTL you can call
drm_throttle_inc(value)
which blocks if the increment goes above the threshold. And when you release the
pre-allocated memory you call
drm_throttle_dec(value)
which wakes the waitqueue and unblocks the drm_throttle_inc() call from your
VM_BIND IOCTL.
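A minimal sketch of what I have in mind (nothing of this exists yet; names
and internals are illustrative):

  /* <linux/atomic.h>, <linux/wait.h> */

  struct drm_throttle {
          unsigned long limit;    /* e.g. pre-allocated pgtable pages */
          atomic_long_t count;
          wait_queue_head_t wq;   /* init_waitqueue_head() at setup */
  };

  /* Called from the submit ioctl; sleeps while the budget is exhausted.
   * (Simplified: a real version would close the check-then-add race.) */
  static int drm_throttle_inc(struct drm_throttle *t, unsigned long value)
  {
          int ret;

          ret = wait_event_interruptible(t->wq,
                          atomic_long_read(&t->count) + value <= t->limit);
          if (ret)
                  return ret;

          atomic_long_add(value, &t->count);
          return 0;
  }

  /* Called when the pre-allocated memory is actually consumed or freed. */
  static void drm_throttle_dec(struct drm_throttle *t, unsigned long value)
  {
          atomic_long_sub(value, &t->count);
          wake_up_all(&t->wq);
  }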
Another advantage is that, if necessary, we can make drm_throttle
(automatically) scale for the machine's resources, which otherwise we'd need to
pollute the scheduler with.
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-22 15:53 ` Danilo Krummrich
@ 2025-05-23 2:31 ` Rob Clark
2025-05-23 6:58 ` Danilo Krummrich
0 siblings, 1 reply; 33+ messages in thread
From: Rob Clark @ 2025-05-23 2:31 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 22, 2025 at 8:53 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Thu, May 22, 2025 at 07:47:17AM -0700, Rob Clark wrote:
> > On Thu, May 22, 2025 at 4:00 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > On Tue, May 20, 2025 at 10:22:54AM -0700, Rob Clark wrote:
> > > > On Tue, May 20, 2025 at 9:54 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> > > > > > On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > > > But let's assume we agree that we want to avoid that userspace can ever OOM itself
> > > > > > > through async VM_BIND, then the proposed solution seems wrong:
> > > > > > >
> > > > > > > Do we really want the driver developer to set an arbitrary boundary of a number
> > > > > > > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > > > > > > semi-sync?
> > > > > > >
> > > > > > > How do we choose this number of jobs? A very small number to be safe, which
> > > > > > > scales badly on powerful machines? A large number that scales well on powerful
> > > > > > > machines, but OOMs on weaker ones?
> > > > > >
> > > > > > The way I am using it in msm, the credit amount and limit are in units
> > > > > > of pre-allocated pages in-flight. I set the enqueue_credit_limit to
> > > > > > 1024 pages, once there are jobs queued up exceeding that limit, they
> > > > > > start blocking.
> > > > > >
> > > > > > The number of _jobs_ is irrelevant, it is # of pre-alloc'd pages in flight.
> > > > >
> > > > > That doesn't make a difference for my question. How do you know 1024 pages is a
> > > > > good value? How do we scale for different machines with different capabilities?
> > > > >
> > > > > If you have a powerful machine with lots of memory, we might throttle userspace
> > > > > for no reason, no?
> > > > >
> > > > > If the machine has very limited resources, it might already be too much?
> > > >
> > > > It may be a bit arbitrary, but then again I'm not sure that userspace
> > > > is in any better position to pick an appropriate limit.
> > > >
> > > > 4MB of in-flight pages isn't going to be too much for anything that is
> > > > capable enough to run vk, but still allows for a lot of in-flight
> > > > maps.
> > >
> > > Ok, but what about the other way around? What's the performance impact if the
> > > limit is chosen rather small, but we're running on a very powerful machine?
> > >
> > > Since you already have the implementation for hardware you have access to, can
> > > you please check if and how performance degrades when you use a very small
> > > threshold?
> >
> > I mean, considering that some drivers (asahi, at least), _only_
> > implement synchronous VM_BIND, I guess blocking in extreme cases isn't
> > so bad.
>
> Which is not even upstream yet and eventually will support async VM_BIND too,
> AFAIK.
the uapi is upstream
> > But I think you are overthinking this. 4MB of pagetables is
> > enough to map ~8GB of buffers.
> >
> > Perhaps drivers would want to set their limit based on the amount of
> > memory the GPU could map, which might land them on a # larger than
> > 1024, but still not an order of magnitude more.
>
> Nouveau currently supports an address space width of 128TiB.
>
> In general, we have to cover the range of some small laptop or handheld devices
> to huge datacenter machines.
sure.. and? It is still up to the user of sched to set their own
limits, I'm not proposing that sched takes charge of that policy
Maybe msm doesn't have to scale up quite as much (yet).. but it has to
scale quite a bit further down (like watches). In the end it is the
same. And also not really the point here.
> > I don't really have a good setup for testing games that use this, atm,
> > fex-emu isn't working for me atm. But I think Connor has a setup with
> > proton working?
>
> I just want to be sure that an arbitrarily small limit doing the job for a small
> device to not fail VK CTS can't regress the performance on large machines.
why are we debating the limit I set outside of sched.. even that might
be subject to some tuning for devices that have more memory, but that
is really outside the scope of this patch
> So, kindly try to prove that we're not prone to extreme performance regression
> with a static value as you propose.
>
> > > Also, I think we should probably put this throttle mechanism in a separate
> > > component, that just wraps a counter of bytes or rather pages that can be
> > > increased and decreased through an API and the increase just blocks at a certain
> > > threshold.
> >
> > Maybe? I don't see why we need to explicitly define the units for the
> > credit. This wasn't done for the existing credit mechanism.. which,
> > seems like if you used some extra fences could also have been
> > implemented externally.
>
> If you are referring to the credit mechanism in the scheduler for ring buffers,
> that's a different case. Drivers know the size of their ring buffers exactly and
> the scheduler has the responsibility of when to submit tasks to the ring buffer.
> So the scheduler kind of owns the resource.
>
> However, the throttle mechanism you propose is independent from the scheduler,
> it depends on the available system memory, a resource the scheduler doesn't own.
it is a distinction that is perhaps a matter of opinion. I don't see
such a big difference, it is all just a matter of managing physical
resource usage in different stages of a scheduled job's lifetime.
> I'm fine to make the unit credits as well, but in this case we really care about
> the consumption of system memory, so we could just use an applicable unit.
>
> > > This component can then be called by a driver from the job submit IOCTL and the
> > > corresponding place where the pre-allocated memory is actually used / freed.
> > >
> > > Depending on the driver, this might not necessarily be in the scheduler's
> > > run_job() callback.
> > >
> > > We could call the component something like drm_throttle or drm_submit_throttle.
> >
> > Maybe? This still has the same complaint I had about just
> > implementing this in msm.. it would have to reach in and use the
> > scheduler's job_scheduled wait-queue. Which, to me at least, seems
> > like more of an internal detail about how the scheduler works.
>
> Why? The component should use its own waitqueue. Subsequently, from your code
> that releases the pre-allocated memory, you can decrement the counter through
> the drm_throttle API, which automatically kicks its waitqueue.
>
> For instance from your VM_BIND IOCTL you can call
>
> drm_throttle_inc(value)
>
> which blocks if the increment goes above the threshold. And when you release the
> pre-allocated memory you call
>
> drm_throttle_dec(value)
>
> which wakes the waitqueue and unblocks the drm_throttle_inc() call from your
> VM_BIND IOCTL.
ok, sure, we could introduce another waitqueue, but with my proposal
that is not needed. And like I said, the existing throttling could
also be implemented externally to the scheduler.. so I'm not seeing
any fundamental difference.
> Another advantage is that, if necessary, we can make drm_throttle
> (automatically) scale for the machine's resources, which otherwise we'd need to
> pollute the scheduler with.
How is this different from drivers being more sophisticated about
picking the limit we configure the scheduler with?
Sure, maybe just setting a hard coded limit of 1024 might not be the
final solution.. maybe we should take into consideration the size of
the device. But this is also entirely outside of the scheduler and I
fail to understand why we are discussing this here?
BR,
-R
* Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
2025-05-23 2:31 ` Rob Clark
@ 2025-05-23 6:58 ` Danilo Krummrich
0 siblings, 0 replies; 33+ messages in thread
From: Danilo Krummrich @ 2025-05-23 6:58 UTC (permalink / raw)
To: Rob Clark
Cc: Connor Abbott, Rob Clark, phasta, dri-devel, freedreno,
linux-arm-msm, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, open list, Boris Brezillon
On Thu, May 22, 2025 at 07:31:28PM -0700, Rob Clark wrote:
> On Thu, May 22, 2025 at 8:53 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > On Thu, May 22, 2025 at 07:47:17AM -0700, Rob Clark wrote:
> > > On Thu, May 22, 2025 at 4:00 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > Ok, but what about the other way around? What's the performance impact if the
> > > > limit is chosen rather small, but we're running on a very powerful machine?
> > > >
> > > > Since you already have the implementation for hardware you have access to, can
> > > > you please check if and how performance degrades when you use a very small
> > > > threshold?
> > >
> > > I mean, considering that some drivers (asahi, at least), _only_
> > > implement synchronous VM_BIND, I guess blocking in extreme cases isn't
> > > so bad.
> >
> > Which is not even upstream yet and eventually will support async VM_BIND too,
> > AFAIK.
>
> the uapi is upstream
And will be extended once they have the corresponding async implementation in
the driver.
> > > But I think you are overthinking this. 4MB of pagetables is
> > > enough to map ~8GB of buffers.
> > >
> > > Perhaps drivers would want to set their limit based on the amount of
> > > memory the GPU could map, which might land them on a # larger than
> > > 1024, but still not an order of magnitude more.
> >
> > Nouveau currently supports an address space width of 128TiB.
> >
> > In general, we have to cover the range of some small laptop or handheld devices
> > to huge datacenter machines.
>
> sure.. and? It is still up to the user of sched to set their own
> limits, I'm not proposing that sched takes charge of that policy
>
> Maybe msm doesn't have to scale up quite as much (yet).. but it has to
> scale quite a bit further down (like watches). In the end it is the
> same. And also not really the point here.
>
> > > I don't really have a good setup for testing games that use this, atm,
> > > fex-emu isn't working for me atm. But I think Connor has a setup with
> > > proton working?
> >
> > I just want to be sure that an arbitrarily small limit doing the job for a small
> > device to not fail VK CTS can't regress the performance on large machines.
>
> why are we debating the limit I set outside of sched.. even that might
> be subject to some tuning for devices that have more memory, but that
> really outside the scope of this patch
We are not debating the number you set in MSM; we're talking about whether a
statically set number will be sufficient.
Also, do we really want it to be our quality standard that we introduce some
throttling mechanism as generic infrastructure for drivers and don't even add a
comment guiding drivers on how to choose a proper limit and what the potential
pitfalls are in choosing it?
When working on a driver, do you want to run into APIs that don't give you
proper guidance on how to use them correctly?
I think it would not be very nice to tell drivers, "Look, here's a throttling API
for when VK CTS (unknown test) ruins your day. We also can't give any advice on
the limit that should be set depending on the scale of the machine, since we
never looked into it.".
> > So, kindly try to prove that we're not prone to extreme performance regression
> > with a static value as you propose.
> >
> > > > Also, I think we should probably put this throttle mechanism in a separate
> > > > component, that just wraps a counter of bytes or rather pages that can be
> > > > increased and decreased through an API and the increase just blocks at a certain
> > > > threshold.
> > >
> > > Maybe? I don't see why we need to explicitly define the units for the
> > > credit. This wasn't done for the existing credit mechanism.. which,
> > > seems like if you used some extra fences could also have been
> > > implemented externally.
> >
> > If you are referring to the credit mechanism in the scheduler for ring buffers,
> > that's a different case. Drivers know the size of their ring buffers exactly and
> > the scheduler has the responsibility of when to submit tasks to the ring buffer.
> > So the scheduler kind of owns the resource.
> >
> > However, the throttle mechanism you propose is independent from the scheduler,
> > it depends on the available system memory, a resource the scheduler doesn't own.
>
> it is a distinction that is perhaps a matter of opinion. I don't see
> such a big difference, it is all just a matter of managing physical
> resource usage in different stages of a scheduled job's lifetime.
Yes, but the ring buffer as a resource is owned by the scheduler, and hence
having the scheduler care about flow control makes sense.
Here you want to flow control the uAPI (i.e. VM_BIND ioctl) -- let's do this in
a separate component, please.
> > > Maybe? This still has the same complaint I had about just
> > > implementing this in msm.. it would have to reach in and use the
> > > scheduler's job_scheduled wait-queue. Which, to me at least, seems
> > > like more of an internal detail about how the scheduler works.
> >
> > Why? The component should use its own waitqueue. Subsequently, from your code
> > that releases the pre-allocated memory, you can decrement the counter through
> > the drm_throttle API, which automatically kicks its the waitqueue.
> >
> > For instance from your VM_BIND IOCTL you can call
> >
> > drm_throttle_inc(value)
> >
> > which blocks if the increment goes above the threshold. And when you release the
> > pre-allocated memory you call
> >
> > drm_throttle_dec(value)
> >
> > which wakes the waitqueue and unblocks the drm_throttle_inc() call from your
> > VM_BIND IOCTL.
>
> ok, sure, we could introduce another waitqueue, but with my proposal
> that is not needed. And like I said, the existing throttling could
> also be implemented externally to the scheduler.. so I'm not seeing
> any fundamental difference.
Yes, but you also implicitly force drivers to actually release the pre-allocated
memory before the scheduler's internal waitqueue is woken. Having such implicit
rules isn't nice.
Also, with that drivers would need to do so in run_job(), i.e. in the fence
signalling critical path, which some drivers may not be able to do.
And, it also adds complexity to the scheduler, which we're trying to reduce.
All this goes away with making this a separate component -- please do that
instead.
Thread overview: 33+ messages
2025-05-14 16:58 [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark
2025-05-14 16:59 ` [PATCH v4 01/40] drm/gpuvm: Don't require obj lock in destructor path Rob Clark
2025-05-14 16:59 ` [PATCH v4 02/40] drm/gpuvm: Allow VAs to hold soft reference to BOs Rob Clark
2025-05-14 16:59 ` [PATCH v4 03/40] drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan() Rob Clark
2025-05-14 16:59 ` [PATCH v4 04/40] drm/sched: Add enqueue credit limit Rob Clark
2025-05-15 9:28 ` Philipp Stanner
2025-05-15 16:15 ` Rob Clark
2025-05-15 16:22 ` Connor Abbott
2025-05-15 17:29 ` Danilo Krummrich
2025-05-15 17:40 ` Rob Clark
2025-05-15 18:56 ` Danilo Krummrich
2025-05-15 19:56 ` Rob Clark
2025-05-20 7:06 ` Danilo Krummrich
2025-05-20 16:07 ` Rob Clark
2025-05-20 16:54 ` Danilo Krummrich
2025-05-20 17:05 ` Connor Abbott
2025-05-20 17:22 ` Rob Clark
2025-05-22 11:00 ` Danilo Krummrich
2025-05-22 14:47 ` Rob Clark
2025-05-22 15:53 ` Danilo Krummrich
2025-05-23 2:31 ` Rob Clark
2025-05-23 6:58 ` Danilo Krummrich
2025-05-15 17:23 ` Danilo Krummrich
2025-05-15 17:36 ` Rob Clark
2025-05-14 16:59 ` [PATCH v4 05/40] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() Rob Clark
2025-05-14 16:59 ` [PATCH v4 06/40] drm/msm: Rename msm_file_private -> msm_context Rob Clark
2025-05-14 16:59 ` [PATCH v4 07/40] drm/msm: Improve msm_context comments Rob Clark
2025-05-14 16:59 ` [PATCH v4 08/40] drm/msm: Rename msm_gem_address_space -> msm_gem_vm Rob Clark
2025-05-14 16:59 ` [PATCH v4 09/40] drm/msm: Remove vram carveout support Rob Clark
2025-05-14 16:59 ` [PATCH v4 10/40] drm/msm: Collapse vma allocation and initialization Rob Clark
2025-05-14 16:59 ` [PATCH v4 11/40] drm/msm: Collapse vma close and delete Rob Clark
2025-05-14 17:13 ` [PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support Rob Clark