[PATCH 0/4] Rust GPUVM support

rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/4] Rust GPUVM support
@ 2025-11-28 14:14 Alice Ryhl
  2025-11-28 14:14 ` [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc() Alice Ryhl
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Alice Ryhl @ 2025-11-28 14:14 UTC (permalink / raw)
  To: Danilo Krummrich, Daniel Almeida
  Cc: Matthew Brost, Thomas Hellström, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Boris Brezillon, Steven Price, Liviu Dudau, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, Frank Binns, Matt Coster,
	Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
	Sean Paul, Marijn Suijten, Lyude Paul, Lucas De Marchi,
	Rodrigo Vivi, Sumit Semwal, Christian König, dri-devel,
	linux-kernel, rust-for-linux, linux-arm-msm, freedreno, nouveau,
	intel-xe, linux-media, linaro-mm-sig, Alice Ryhl, Asahi Lina

This makes a few changes to the way immediate mode works, and then it
implements a Rust immediate mode GPUVM abstraction on top of that.

Please see the following branch for example usage in Tyr:
https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/53

For context, please see this previous patch:
https://lore.kernel.org/rust-for-linux/20250621-gpuvm-v3-1-10203da06867@collabora.com/
and the commit message of the last patch.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Alice Ryhl (4):
      drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc()
      drm/gpuvm: drm_gpuvm_bo_obtain() requires lock and staged mode
      drm/gpuvm: use const for drm_gpuva_op_* ptrs
      rust: drm: add GPUVM immediate mode abstraction

 MAINTAINERS                            |   1 +
 drivers/gpu/drm/drm_gpuvm.c            |  80 ++++--
 drivers/gpu/drm/imagination/pvr_vm.c   |   2 +-
 drivers/gpu/drm/msm/msm_gem.h          |   2 +-
 drivers/gpu/drm/msm/msm_gem_vma.c      |   2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c |   2 +-
 drivers/gpu/drm/panthor/panthor_mmu.c  |  10 -
 drivers/gpu/drm/xe/xe_vm.c             |   4 +-
 include/drm/drm_gpuvm.h                |  12 +-
 rust/bindings/bindings_helper.h        |   2 +
 rust/helpers/drm_gpuvm.c               |  43 +++
 rust/helpers/helpers.c                 |   1 +
 rust/kernel/drm/gpuvm/mod.rs           | 394 +++++++++++++++++++++++++++
 rust/kernel/drm/gpuvm/sm_ops.rs        | 469 +++++++++++++++++++++++++++++++++
 rust/kernel/drm/gpuvm/va.rs            | 148 +++++++++++
 rust/kernel/drm/gpuvm/vm_bo.rs         | 213 +++++++++++++++
 rust/kernel/drm/mod.rs                 |   1 +
 17 files changed, 1337 insertions(+), 49 deletions(-)
---
base-commit: 77b686f688126a5f758b51441a03186e9eb1b0f1
change-id: 20251128-gpuvm-rust-b719cac27ad6

Best regards,
-- 
Alice Ryhl <aliceryhl@google.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc()
  2025-11-28 14:14 [PATCH 0/4] Rust GPUVM support Alice Ryhl
@ 2025-11-28 14:14 ` Alice Ryhl
  2025-11-28 14:24   ` Boris Brezillon
  2025-11-28 14:14 ` [PATCH 2/4] drm/gpuvm: drm_gpuvm_bo_obtain() requires lock and staged mode Alice Ryhl
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Alice Ryhl @ 2025-11-28 14:14 UTC (permalink / raw)
  To: Danilo Krummrich, Daniel Almeida
  Cc: Matthew Brost, Thomas Hellström, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Boris Brezillon, Steven Price, Liviu Dudau, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, Frank Binns, Matt Coster,
	Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
	Sean Paul, Marijn Suijten, Lyude Paul, Lucas De Marchi,
	Rodrigo Vivi, Sumit Semwal, Christian König, dri-devel,
	linux-kernel, rust-for-linux, linux-arm-msm, freedreno, nouveau,
	intel-xe, linux-media, linaro-mm-sig, Alice Ryhl

When calling drm_gpuvm_bo_obtain_prealloc() and using immediate mode,
this may result in a call to ops->vm_bo_free(vm_bo) while holding the
GEMs gpuva mutex. This is a problem if ops->vm_bo_free(vm_bo) performs
any operations that are not safe in the fence signalling critical path,
and it turns out that Panthor (the only current user of the method)
calls drm_gem_shmem_unpin() which takes a resv lock internally.

This constitutes both a violation of signalling safety and lock
inversion. To fix this, we modify the method to internally take the GEMs
gpuva mutex so that the mutex can be unlocked before freeing the
preallocated vm_bo.

Note that this modification introduces a requirement that the driver
uses immediate mode to call drm_gpuvm_bo_obtain_prealloc() as it would
otherwise take the wrong lock.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/gpu/drm/drm_gpuvm.c           | 58 ++++++++++++++++++++++-------------
 drivers/gpu/drm/panthor/panthor_mmu.c | 10 ------
 2 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 936e6c1a60c16ed5a6898546bf99e23a74f6b58b..f08a5cc1d611f971862c1272987e5ecd6d97c163 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1601,14 +1601,37 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_bo_create);
 
+static void
+drm_gpuvm_bo_destroy_not_in_lists(struct drm_gpuvm_bo *vm_bo)
+{
+	struct drm_gpuvm *gpuvm = vm_bo->vm;
+	const struct drm_gpuvm_ops *ops = gpuvm->ops;
+	struct drm_gem_object *obj = vm_bo->obj;
+
+	if (ops && ops->vm_bo_free)
+		ops->vm_bo_free(vm_bo);
+	else
+		kfree(vm_bo);
+
+	drm_gpuvm_put(gpuvm);
+	drm_gem_object_put(obj);
+}
+
+static void
+drm_gpuvm_bo_destroy_not_in_lists_kref(struct kref *kref)
+{
+	struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
+						  kref);
+
+	drm_gpuvm_bo_destroy_not_in_lists(vm_bo);
+}
+
 static void
 drm_gpuvm_bo_destroy(struct kref *kref)
 {
 	struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
 						  kref);
 	struct drm_gpuvm *gpuvm = vm_bo->vm;
-	const struct drm_gpuvm_ops *ops = gpuvm->ops;
-	struct drm_gem_object *obj = vm_bo->obj;
 	bool lock = !drm_gpuvm_resv_protected(gpuvm);
 
 	if (!lock)
@@ -1617,16 +1640,10 @@ drm_gpuvm_bo_destroy(struct kref *kref)
 	drm_gpuvm_bo_list_del(vm_bo, extobj, lock);
 	drm_gpuvm_bo_list_del(vm_bo, evict, lock);
 
-	drm_gem_gpuva_assert_lock_held(gpuvm, obj);
+	drm_gem_gpuva_assert_lock_held(gpuvm, vm_bo->obj);
 	list_del(&vm_bo->list.entry.gem);
 
-	if (ops && ops->vm_bo_free)
-		ops->vm_bo_free(vm_bo);
-	else
-		kfree(vm_bo);
-
-	drm_gpuvm_put(gpuvm);
-	drm_gem_object_put(obj);
+	drm_gpuvm_bo_destroy_not_in_lists(vm_bo);
 }
 
 /**
@@ -1744,9 +1761,7 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put_deferred);
 void
 drm_gpuvm_bo_deferred_cleanup(struct drm_gpuvm *gpuvm)
 {
-	const struct drm_gpuvm_ops *ops = gpuvm->ops;
 	struct drm_gpuvm_bo *vm_bo;
-	struct drm_gem_object *obj;
 	struct llist_node *bo_defer;
 
 	bo_defer = llist_del_all(&gpuvm->bo_defer);
@@ -1765,14 +1780,7 @@ drm_gpuvm_bo_deferred_cleanup(struct drm_gpuvm *gpuvm)
 	while (bo_defer) {
 		vm_bo = llist_entry(bo_defer, struct drm_gpuvm_bo, list.entry.bo_defer);
 		bo_defer = bo_defer->next;
-		obj = vm_bo->obj;
-		if (ops && ops->vm_bo_free)
-			ops->vm_bo_free(vm_bo);
-		else
-			kfree(vm_bo);
-
-		drm_gpuvm_put(gpuvm);
-		drm_gem_object_put(obj);
+		drm_gpuvm_bo_destroy_not_in_lists(vm_bo);
 	}
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_bo_deferred_cleanup);
@@ -1860,6 +1868,9 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain);
  * count is decreased. If not found @__vm_bo is returned without further
  * increase of the reference count.
  *
+ * The provided @__vm_bo must not already be in the gpuva, evict, or extobj
+ * lists prior to calling this method.
+ *
  * A new &drm_gpuvm_bo is added to the GEMs gpuva list.
  *
  * Returns: a pointer to the found &drm_gpuvm_bo or @__vm_bo if no existing
@@ -1872,14 +1883,19 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
 	struct drm_gem_object *obj = __vm_bo->obj;
 	struct drm_gpuvm_bo *vm_bo;
 
+	drm_WARN_ON(gpuvm->drm, !drm_gpuvm_immediate_mode(gpuvm));
+
+	mutex_lock(&obj->gpuva.lock);
 	vm_bo = drm_gpuvm_bo_find(gpuvm, obj);
 	if (vm_bo) {
-		drm_gpuvm_bo_put(__vm_bo);
+		mutex_unlock(&obj->gpuva.lock);
+		kref_put(&__vm_bo->kref, drm_gpuvm_bo_destroy_not_in_lists_kref);
 		return vm_bo;
 	}
 
 	drm_gem_gpuva_assert_lock_held(gpuvm, obj);
 	list_add_tail(&__vm_bo->list.entry.gem, &obj->gpuva.list);
+	mutex_unlock(&obj->gpuva.lock);
 
 	return __vm_bo;
 }
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 9f5f4ddf291024121f3fd5644f2fdeba354fa67c..be8811a70e1a3adec87ca4a85cad7c838f54bebf 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -1224,17 +1224,7 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
 		goto err_cleanup;
 	}
 
-	/* drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our
-	 * pre-allocated BO if the <BO,VM> association exists. Given we
-	 * only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will
-	 * be called immediately, and we have to hold the VM resv lock when
-	 * calling this function.
-	 */
-	dma_resv_lock(panthor_vm_resv(vm), NULL);
-	mutex_lock(&bo->base.base.gpuva.lock);
 	op_ctx->map.vm_bo = drm_gpuvm_bo_obtain_prealloc(preallocated_vm_bo);
-	mutex_unlock(&bo->base.base.gpuva.lock);
-	dma_resv_unlock(panthor_vm_resv(vm));
 
 	op_ctx->map.bo_offset = offset;
 

-- 
2.52.0.487.g5c8c507ade-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc()
  2025-11-28 14:14 ` [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc() Alice Ryhl
@ 2025-11-28 14:24   ` Boris Brezillon
  2025-12-01  9:55     ` Alice Ryhl
  0 siblings, 1 reply; 12+ messages in thread
From: Boris Brezillon @ 2025-11-28 14:24 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Danilo Krummrich, Daniel Almeida, Matthew Brost,
	Thomas Hellström, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Steven Price,
	Liviu Dudau, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Frank Binns, Matt Coster, Rob Clark,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Sean Paul,
	Marijn Suijten, Lyude Paul, Lucas De Marchi, Rodrigo Vivi,
	Sumit Semwal, Christian König, dri-devel, linux-kernel,
	rust-for-linux, linux-arm-msm, freedreno, nouveau, intel-xe,
	linux-media, linaro-mm-sig

On Fri, 28 Nov 2025 14:14:15 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> When calling drm_gpuvm_bo_obtain_prealloc() and using immediate mode,
> this may result in a call to ops->vm_bo_free(vm_bo) while holding the
> GEMs gpuva mutex. This is a problem if ops->vm_bo_free(vm_bo) performs
> any operations that are not safe in the fence signalling critical path,
> and it turns out that Panthor (the only current user of the method)
> calls drm_gem_shmem_unpin() which takes a resv lock internally.
> 
> This constitutes both a violation of signalling safety and lock
> inversion. To fix this, we modify the method to internally take the GEMs
> gpuva mutex so that the mutex can be unlocked before freeing the
> preallocated vm_bo.
> 
> Note that this modification introduces a requirement that the driver
> uses immediate mode to call drm_gpuvm_bo_obtain_prealloc() as it would
> otherwise take the wrong lock.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

Should we add a Fixes tag?

> ---
>  drivers/gpu/drm/drm_gpuvm.c           | 58 ++++++++++++++++++++++-------------
>  drivers/gpu/drm/panthor/panthor_mmu.c | 10 ------
>  2 files changed, 37 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> index 936e6c1a60c16ed5a6898546bf99e23a74f6b58b..f08a5cc1d611f971862c1272987e5ecd6d97c163 100644
> --- a/drivers/gpu/drm/drm_gpuvm.c
> +++ b/drivers/gpu/drm/drm_gpuvm.c
> @@ -1601,14 +1601,37 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
>  }
>  EXPORT_SYMBOL_GPL(drm_gpuvm_bo_create);
>  
> +static void
> +drm_gpuvm_bo_destroy_not_in_lists(struct drm_gpuvm_bo *vm_bo)
> +{
> +	struct drm_gpuvm *gpuvm = vm_bo->vm;
> +	const struct drm_gpuvm_ops *ops = gpuvm->ops;
> +	struct drm_gem_object *obj = vm_bo->obj;
> +
> +	if (ops && ops->vm_bo_free)
> +		ops->vm_bo_free(vm_bo);
> +	else
> +		kfree(vm_bo);
> +
> +	drm_gpuvm_put(gpuvm);
> +	drm_gem_object_put(obj);
> +}
> +
> +static void
> +drm_gpuvm_bo_destroy_not_in_lists_kref(struct kref *kref)
> +{
> +	struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
> +						  kref);
> +
> +	drm_gpuvm_bo_destroy_not_in_lists(vm_bo);
> +}
> +
>  static void
>  drm_gpuvm_bo_destroy(struct kref *kref)
>  {
>  	struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
>  						  kref);
>  	struct drm_gpuvm *gpuvm = vm_bo->vm;
> -	const struct drm_gpuvm_ops *ops = gpuvm->ops;
> -	struct drm_gem_object *obj = vm_bo->obj;
>  	bool lock = !drm_gpuvm_resv_protected(gpuvm);
>  
>  	if (!lock)
> @@ -1617,16 +1640,10 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>  	drm_gpuvm_bo_list_del(vm_bo, extobj, lock);
>  	drm_gpuvm_bo_list_del(vm_bo, evict, lock);
>  
> -	drm_gem_gpuva_assert_lock_held(gpuvm, obj);
> +	drm_gem_gpuva_assert_lock_held(gpuvm, vm_bo->obj);
>  	list_del(&vm_bo->list.entry.gem);
>  
> -	if (ops && ops->vm_bo_free)
> -		ops->vm_bo_free(vm_bo);
> -	else
> -		kfree(vm_bo);
> -
> -	drm_gpuvm_put(gpuvm);
> -	drm_gem_object_put(obj);
> +	drm_gpuvm_bo_destroy_not_in_lists(vm_bo);
>  }
>  
>  /**
> @@ -1744,9 +1761,7 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put_deferred);
>  void
>  drm_gpuvm_bo_deferred_cleanup(struct drm_gpuvm *gpuvm)
>  {
> -	const struct drm_gpuvm_ops *ops = gpuvm->ops;
>  	struct drm_gpuvm_bo *vm_bo;
> -	struct drm_gem_object *obj;
>  	struct llist_node *bo_defer;
>  
>  	bo_defer = llist_del_all(&gpuvm->bo_defer);
> @@ -1765,14 +1780,7 @@ drm_gpuvm_bo_deferred_cleanup(struct drm_gpuvm *gpuvm)
>  	while (bo_defer) {
>  		vm_bo = llist_entry(bo_defer, struct drm_gpuvm_bo, list.entry.bo_defer);
>  		bo_defer = bo_defer->next;
> -		obj = vm_bo->obj;
> -		if (ops && ops->vm_bo_free)
> -			ops->vm_bo_free(vm_bo);
> -		else
> -			kfree(vm_bo);
> -
> -		drm_gpuvm_put(gpuvm);
> -		drm_gem_object_put(obj);
> +		drm_gpuvm_bo_destroy_not_in_lists(vm_bo);
>  	}
>  }
>  EXPORT_SYMBOL_GPL(drm_gpuvm_bo_deferred_cleanup);
> @@ -1860,6 +1868,9 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain);
>   * count is decreased. If not found @__vm_bo is returned without further
>   * increase of the reference count.
>   *
> + * The provided @__vm_bo must not already be in the gpuva, evict, or extobj
> + * lists prior to calling this method.
> + *
>   * A new &drm_gpuvm_bo is added to the GEMs gpuva list.
>   *
>   * Returns: a pointer to the found &drm_gpuvm_bo or @__vm_bo if no existing
> @@ -1872,14 +1883,19 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
>  	struct drm_gem_object *obj = __vm_bo->obj;
>  	struct drm_gpuvm_bo *vm_bo;
>  
> +	drm_WARN_ON(gpuvm->drm, !drm_gpuvm_immediate_mode(gpuvm));
> +
> +	mutex_lock(&obj->gpuva.lock);
>  	vm_bo = drm_gpuvm_bo_find(gpuvm, obj);
>  	if (vm_bo) {
> -		drm_gpuvm_bo_put(__vm_bo);
> +		mutex_unlock(&obj->gpuva.lock);
> +		kref_put(&__vm_bo->kref, drm_gpuvm_bo_destroy_not_in_lists_kref);
>  		return vm_bo;
>  	}
>  
>  	drm_gem_gpuva_assert_lock_held(gpuvm, obj);
>  	list_add_tail(&__vm_bo->list.entry.gem, &obj->gpuva.list);
> +	mutex_unlock(&obj->gpuva.lock);
>  
>  	return __vm_bo;
>  }
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index 9f5f4ddf291024121f3fd5644f2fdeba354fa67c..be8811a70e1a3adec87ca4a85cad7c838f54bebf 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -1224,17 +1224,7 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
>  		goto err_cleanup;
>  	}
>  
> -	/* drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our
> -	 * pre-allocated BO if the <BO,VM> association exists. Given we
> -	 * only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will
> -	 * be called immediately, and we have to hold the VM resv lock when
> -	 * calling this function.
> -	 */
> -	dma_resv_lock(panthor_vm_resv(vm), NULL);
> -	mutex_lock(&bo->base.base.gpuva.lock);
>  	op_ctx->map.vm_bo = drm_gpuvm_bo_obtain_prealloc(preallocated_vm_bo);
> -	mutex_unlock(&bo->base.base.gpuva.lock);
> -	dma_resv_unlock(panthor_vm_resv(vm));
>  
>  	op_ctx->map.bo_offset = offset;
>  
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc()
  2025-11-28 14:24   ` Boris Brezillon
@ 2025-12-01  9:55     ` Alice Ryhl
  0 siblings, 0 replies; 12+ messages in thread
From: Alice Ryhl @ 2025-12-01  9:55 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Danilo Krummrich, Daniel Almeida, Matthew Brost,
	Thomas Hellström, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Steven Price,
	Liviu Dudau, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Frank Binns, Matt Coster, Rob Clark,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Sean Paul,
	Marijn Suijten, Lyude Paul, Lucas De Marchi, Rodrigo Vivi,
	Sumit Semwal, Christian König, dri-devel, linux-kernel,
	rust-for-linux, linux-arm-msm, freedreno, nouveau, intel-xe,
	linux-media, linaro-mm-sig

On Fri, Nov 28, 2025 at 03:24:03PM +0100, Boris Brezillon wrote:
> On Fri, 28 Nov 2025 14:14:15 +0000
> Alice Ryhl <aliceryhl@google.com> wrote:
> 
> > When calling drm_gpuvm_bo_obtain_prealloc() and using immediate mode,
> > this may result in a call to ops->vm_bo_free(vm_bo) while holding the
> > GEMs gpuva mutex. This is a problem if ops->vm_bo_free(vm_bo) performs
> > any operations that are not safe in the fence signalling critical path,
> > and it turns out that Panthor (the only current user of the method)
> > calls drm_gem_shmem_unpin() which takes a resv lock internally.
> > 
> > This constitutes both a violation of signalling safety and lock
> > inversion. To fix this, we modify the method to internally take the GEMs
> > gpuva mutex so that the mutex can be unlocked before freeing the
> > preallocated vm_bo.
> > 
> > Note that this modification introduces a requirement that the driver
> > uses immediate mode to call drm_gpuvm_bo_obtain_prealloc() as it would
> > otherwise take the wrong lock.
> > 
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> 
> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
> 
> Should we add a Fixes tag?

Yeah, let's add:

Fixes: 63e919a31625 ("panthor: use drm_gpuva_unlink_defer()")

Alice

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/4] drm/gpuvm: drm_gpuvm_bo_obtain() requires lock and staged mode
  2025-11-28 14:14 [PATCH 0/4] Rust GPUVM support Alice Ryhl
  2025-11-28 14:14 ` [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc() Alice Ryhl
@ 2025-11-28 14:14 ` Alice Ryhl
  2025-11-28 14:25   ` Boris Brezillon
  2025-11-28 14:14 ` [PATCH 3/4] drm/gpuvm: use const for drm_gpuva_op_* ptrs Alice Ryhl
  2025-11-28 14:14 ` [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction Alice Ryhl
  3 siblings, 1 reply; 12+ messages in thread
From: Alice Ryhl @ 2025-11-28 14:14 UTC (permalink / raw)
  To: Danilo Krummrich, Daniel Almeida
  Cc: Matthew Brost, Thomas Hellström, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Boris Brezillon, Steven Price, Liviu Dudau, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, Frank Binns, Matt Coster,
	Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
	Sean Paul, Marijn Suijten, Lyude Paul, Lucas De Marchi,
	Rodrigo Vivi, Sumit Semwal, Christian König, dri-devel,
	linux-kernel, rust-for-linux, linux-arm-msm, freedreno, nouveau,
	intel-xe, linux-media, linaro-mm-sig, Alice Ryhl

In the previous commit we updated drm_gpuvm_bo_obtain_prealloc() to take
locks internally, which means that it's only usable in immediate mode.
In this commit, we notice that drm_gpuvm_bo_obtain() requires you to use
staged mode. This means that we now have one variant of obtain for each
mode you might use gpuvm in.

To reflect this information, we add a warning about using it in
immediate mode, and to make the distinction clearer we rename the method
with a _locked() suffix so that it's clear that it requires the caller
to take the locks.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/gpu/drm/drm_gpuvm.c            | 16 +++++++++++++---
 drivers/gpu/drm/imagination/pvr_vm.c   |  2 +-
 drivers/gpu/drm/msm/msm_gem.h          |  2 +-
 drivers/gpu/drm/msm/msm_gem_vma.c      |  2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c |  2 +-
 drivers/gpu/drm/xe/xe_vm.c             |  4 ++--
 include/drm/drm_gpuvm.h                |  4 ++--
 7 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index f08a5cc1d611f971862c1272987e5ecd6d97c163..9cd06c7600dc32ceee0f0beb5e3daf31698a66b3 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1832,16 +1832,26 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_find);
  * count of the &drm_gpuvm_bo accordingly. If not found, allocates a new
  * &drm_gpuvm_bo.
  *
+ * Requires the lock for the GEMs gpuva list.
+ *
  * A new &drm_gpuvm_bo is added to the GEMs gpuva list.
  *
  * Returns: a pointer to the &drm_gpuvm_bo on success, an ERR_PTR on failure
  */
 struct drm_gpuvm_bo *
-drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
-		    struct drm_gem_object *obj)
+drm_gpuvm_bo_obtain_locked(struct drm_gpuvm *gpuvm,
+			   struct drm_gem_object *obj)
 {
 	struct drm_gpuvm_bo *vm_bo;
 
+	/*
+	 * In immediate mode this would require the caller to hold the GEMs
+	 * gpuva mutex, but it's not okay to allocate while holding that lock,
+	 * and this method allocates. Immediate mode drivers should use
+	 * drm_gpuvm_bo_obtain_prealloc() instead.
+	 */
+	drm_WARN_ON(gpuvm->drm, drm_gpuvm_immediate_mode(gpuvm));
+
 	vm_bo = drm_gpuvm_bo_find(gpuvm, obj);
 	if (vm_bo)
 		return vm_bo;
@@ -1855,7 +1865,7 @@ drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
 
 	return vm_bo;
 }
-EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain);
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_locked);
 
 /**
  * drm_gpuvm_bo_obtain_prealloc() - obtains an instance of the &drm_gpuvm_bo
diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c
index 3d97990170bf6b1341116c5c8b9d01421944eda4..30ff9b84eb14f2455003e76108de6d489a13f61a 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.c
+++ b/drivers/gpu/drm/imagination/pvr_vm.c
@@ -255,7 +255,7 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op,
 	bind_op->type = PVR_VM_BIND_TYPE_MAP;
 
 	dma_resv_lock(obj->resv, NULL);
-	bind_op->gpuvm_bo = drm_gpuvm_bo_obtain(&vm_ctx->gpuvm_mgr, obj);
+	bind_op->gpuvm_bo = drm_gpuvm_bo_obtain_locked(&vm_ctx->gpuvm_mgr, obj);
 	dma_resv_unlock(obj->resv);
 	if (IS_ERR(bind_op->gpuvm_bo))
 		return PTR_ERR(bind_op->gpuvm_bo);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index a4cf31853c5008e171c3ad72cde1004c60fe5212..26dfe3d22e3e847f7e63174481d03f72878a8ced 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -60,7 +60,7 @@ struct msm_gem_vm_log_entry {
  * embedded in any larger driver structure.  The GEM object holds a list of
  * drm_gpuvm_bo, which in turn holds a list of msm_gem_vma.  A linked vma
  * holds a reference to the vm_bo, and drops it when the vma is unlinked.
- * So we just need to call drm_gpuvm_bo_obtain() to return a ref to an
+ * So we just need to call drm_gpuvm_bo_obtain_locked() to return a ref to an
  * existing vm_bo, or create a new one.  Once the vma is linked, the ref
  * to the vm_bo can be dropped (since the vma is holding one).
  */
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index 8316af1723c227f919594446c3721e1a948cbc9e..239b6168a26e636b511187b4993945d1565d149f 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -413,7 +413,7 @@ msm_gem_vma_new(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj,
 	if (!obj)
 		return &vma->base;
 
-	vm_bo = drm_gpuvm_bo_obtain(&vm->base, obj);
+	vm_bo = drm_gpuvm_bo_obtain_locked(&vm->base, obj);
 	if (IS_ERR(vm_bo)) {
 		ret = PTR_ERR(vm_bo);
 		goto err_va_remove;
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 79eefdfd08a2678fedf69503ddf7e9e17ed14c6f..d8888bd29cccef4b8dad9eff2bf6e2b1fd1a7e4d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1207,7 +1207,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job,
 				return -ENOENT;
 
 			dma_resv_lock(obj->resv, NULL);
-			op->vm_bo = drm_gpuvm_bo_obtain(&uvmm->base, obj);
+			op->vm_bo = drm_gpuvm_bo_obtain_locked(&uvmm->base, obj);
 			dma_resv_unlock(obj->resv);
 			if (IS_ERR(op->vm_bo))
 				return PTR_ERR(op->vm_bo);
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index f602b874e0547591d9008333c18f3de0634c48c7..de52d01b0921cc8ac619deeed47b578e0ae69257 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1004,7 +1004,7 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 
 		xe_bo_assert_held(bo);
 
-		vm_bo = drm_gpuvm_bo_obtain(vma->gpuva.vm, &bo->ttm.base);
+		vm_bo = drm_gpuvm_bo_obtain_locked(vma->gpuva.vm, &bo->ttm.base);
 		if (IS_ERR(vm_bo)) {
 			xe_vma_free(vma);
 			return ERR_CAST(vm_bo);
@@ -2249,7 +2249,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_vma_ops *vops,
 		if (err)
 			return ERR_PTR(err);
 
-		vm_bo = drm_gpuvm_bo_obtain(&vm->gpuvm, obj);
+		vm_bo = drm_gpuvm_bo_obtain_locked(&vm->gpuvm, obj);
 		if (IS_ERR(vm_bo)) {
 			xe_bo_unlock(bo);
 			return ERR_CAST(vm_bo);
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index fdfc575b260360611ff8ce16c327acede787929f..0d3fc1f6cac9966a42f3bc82b0b491bfefaf5b96 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -736,8 +736,8 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
 		    struct drm_gem_object *obj);
 
 struct drm_gpuvm_bo *
-drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
-		    struct drm_gem_object *obj);
+drm_gpuvm_bo_obtain_locked(struct drm_gpuvm *gpuvm,
+			   struct drm_gem_object *obj);
 struct drm_gpuvm_bo *
 drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *vm_bo);
 

-- 
2.52.0.487.g5c8c507ade-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] drm/gpuvm: drm_gpuvm_bo_obtain() requires lock and staged mode
  2025-11-28 14:14 ` [PATCH 2/4] drm/gpuvm: drm_gpuvm_bo_obtain() requires lock and staged mode Alice Ryhl
@ 2025-11-28 14:25   ` Boris Brezillon
  0 siblings, 0 replies; 12+ messages in thread
From: Boris Brezillon @ 2025-11-28 14:25 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Danilo Krummrich, Daniel Almeida, Matthew Brost,
	Thomas Hellström, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Steven Price,
	Liviu Dudau, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Frank Binns, Matt Coster, Rob Clark,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Sean Paul,
	Marijn Suijten, Lyude Paul, Lucas De Marchi, Rodrigo Vivi,
	Sumit Semwal, Christian König, dri-devel, linux-kernel,
	rust-for-linux, linux-arm-msm, freedreno, nouveau, intel-xe,
	linux-media, linaro-mm-sig

On Fri, 28 Nov 2025 14:14:16 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> In the previous commit we updated drm_gpuvm_bo_obtain_prealloc() to take
> locks internally, which means that it's only usable in immediate mode.
> In this commit, we notice that drm_gpuvm_bo_obtain() requires you to use
> staged mode. This means that we now have one variant of obtain for each
> mode you might use gpuvm in.
> 
> To reflect this information, we add a warning about using it in
> immediate mode, and to make the distinction clearer we rename the method
> with a _locked() suffix so that it's clear that it requires the caller
> to take the locks.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  drivers/gpu/drm/drm_gpuvm.c            | 16 +++++++++++++---

The gpuvm changes are

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

>  drivers/gpu/drm/imagination/pvr_vm.c   |  2 +-
>  drivers/gpu/drm/msm/msm_gem.h          |  2 +-
>  drivers/gpu/drm/msm/msm_gem_vma.c      |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_uvmm.c |  2 +-
>  drivers/gpu/drm/xe/xe_vm.c             |  4 ++--
>  include/drm/drm_gpuvm.h                |  4 ++--
>  7 files changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> index f08a5cc1d611f971862c1272987e5ecd6d97c163..9cd06c7600dc32ceee0f0beb5e3daf31698a66b3 100644
> --- a/drivers/gpu/drm/drm_gpuvm.c
> +++ b/drivers/gpu/drm/drm_gpuvm.c
> @@ -1832,16 +1832,26 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_find);
>   * count of the &drm_gpuvm_bo accordingly. If not found, allocates a new
>   * &drm_gpuvm_bo.
>   *
> + * Requires the lock for the GEMs gpuva list.
> + *
>   * A new &drm_gpuvm_bo is added to the GEMs gpuva list.
>   *
>   * Returns: a pointer to the &drm_gpuvm_bo on success, an ERR_PTR on failure
>   */
>  struct drm_gpuvm_bo *
> -drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
> -		    struct drm_gem_object *obj)
> +drm_gpuvm_bo_obtain_locked(struct drm_gpuvm *gpuvm,
> +			   struct drm_gem_object *obj)
>  {
>  	struct drm_gpuvm_bo *vm_bo;
>  
> +	/*
> +	 * In immediate mode this would require the caller to hold the GEMs
> +	 * gpuva mutex, but it's not okay to allocate while holding that lock,
> +	 * and this method allocates. Immediate mode drivers should use
> +	 * drm_gpuvm_bo_obtain_prealloc() instead.
> +	 */
> +	drm_WARN_ON(gpuvm->drm, drm_gpuvm_immediate_mode(gpuvm));
> +
>  	vm_bo = drm_gpuvm_bo_find(gpuvm, obj);
>  	if (vm_bo)
>  		return vm_bo;
> @@ -1855,7 +1865,7 @@ drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
>  
>  	return vm_bo;
>  }
> -EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain);
> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_locked);
>  
>  /**
>   * drm_gpuvm_bo_obtain_prealloc() - obtains an instance of the &drm_gpuvm_bo
> diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c
> index 3d97990170bf6b1341116c5c8b9d01421944eda4..30ff9b84eb14f2455003e76108de6d489a13f61a 100644
> --- a/drivers/gpu/drm/imagination/pvr_vm.c
> +++ b/drivers/gpu/drm/imagination/pvr_vm.c
> @@ -255,7 +255,7 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op,
>  	bind_op->type = PVR_VM_BIND_TYPE_MAP;
>  
>  	dma_resv_lock(obj->resv, NULL);
> -	bind_op->gpuvm_bo = drm_gpuvm_bo_obtain(&vm_ctx->gpuvm_mgr, obj);
> +	bind_op->gpuvm_bo = drm_gpuvm_bo_obtain_locked(&vm_ctx->gpuvm_mgr, obj);
>  	dma_resv_unlock(obj->resv);
>  	if (IS_ERR(bind_op->gpuvm_bo))
>  		return PTR_ERR(bind_op->gpuvm_bo);
> diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
> index a4cf31853c5008e171c3ad72cde1004c60fe5212..26dfe3d22e3e847f7e63174481d03f72878a8ced 100644
> --- a/drivers/gpu/drm/msm/msm_gem.h
> +++ b/drivers/gpu/drm/msm/msm_gem.h
> @@ -60,7 +60,7 @@ struct msm_gem_vm_log_entry {
>   * embedded in any larger driver structure.  The GEM object holds a list of
>   * drm_gpuvm_bo, which in turn holds a list of msm_gem_vma.  A linked vma
>   * holds a reference to the vm_bo, and drops it when the vma is unlinked.
> - * So we just need to call drm_gpuvm_bo_obtain() to return a ref to an
> + * So we just need to call drm_gpuvm_bo_obtain_locked() to return a ref to an
>   * existing vm_bo, or create a new one.  Once the vma is linked, the ref
>   * to the vm_bo can be dropped (since the vma is holding one).
>   */
> diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
> index 8316af1723c227f919594446c3721e1a948cbc9e..239b6168a26e636b511187b4993945d1565d149f 100644
> --- a/drivers/gpu/drm/msm/msm_gem_vma.c
> +++ b/drivers/gpu/drm/msm/msm_gem_vma.c
> @@ -413,7 +413,7 @@ msm_gem_vma_new(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj,
>  	if (!obj)
>  		return &vma->base;
>  
> -	vm_bo = drm_gpuvm_bo_obtain(&vm->base, obj);
> +	vm_bo = drm_gpuvm_bo_obtain_locked(&vm->base, obj);
>  	if (IS_ERR(vm_bo)) {
>  		ret = PTR_ERR(vm_bo);
>  		goto err_va_remove;
> diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
> index 79eefdfd08a2678fedf69503ddf7e9e17ed14c6f..d8888bd29cccef4b8dad9eff2bf6e2b1fd1a7e4d 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
> @@ -1207,7 +1207,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job,
>  				return -ENOENT;
>  
>  			dma_resv_lock(obj->resv, NULL);
> -			op->vm_bo = drm_gpuvm_bo_obtain(&uvmm->base, obj);
> +			op->vm_bo = drm_gpuvm_bo_obtain_locked(&uvmm->base, obj);
>  			dma_resv_unlock(obj->resv);
>  			if (IS_ERR(op->vm_bo))
>  				return PTR_ERR(op->vm_bo);
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index f602b874e0547591d9008333c18f3de0634c48c7..de52d01b0921cc8ac619deeed47b578e0ae69257 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -1004,7 +1004,7 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
>  
>  		xe_bo_assert_held(bo);
>  
> -		vm_bo = drm_gpuvm_bo_obtain(vma->gpuva.vm, &bo->ttm.base);
> +		vm_bo = drm_gpuvm_bo_obtain_locked(vma->gpuva.vm, &bo->ttm.base);
>  		if (IS_ERR(vm_bo)) {
>  			xe_vma_free(vma);
>  			return ERR_CAST(vm_bo);
> @@ -2249,7 +2249,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_vma_ops *vops,
>  		if (err)
>  			return ERR_PTR(err);
>  
> -		vm_bo = drm_gpuvm_bo_obtain(&vm->gpuvm, obj);
> +		vm_bo = drm_gpuvm_bo_obtain_locked(&vm->gpuvm, obj);
>  		if (IS_ERR(vm_bo)) {
>  			xe_bo_unlock(bo);
>  			return ERR_CAST(vm_bo);
> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
> index fdfc575b260360611ff8ce16c327acede787929f..0d3fc1f6cac9966a42f3bc82b0b491bfefaf5b96 100644
> --- a/include/drm/drm_gpuvm.h
> +++ b/include/drm/drm_gpuvm.h
> @@ -736,8 +736,8 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
>  		    struct drm_gem_object *obj);
>  
>  struct drm_gpuvm_bo *
> -drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
> -		    struct drm_gem_object *obj);
> +drm_gpuvm_bo_obtain_locked(struct drm_gpuvm *gpuvm,
> +			   struct drm_gem_object *obj);
>  struct drm_gpuvm_bo *
>  drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *vm_bo);
>  
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3/4] drm/gpuvm: use const for drm_gpuva_op_* ptrs
  2025-11-28 14:14 [PATCH 0/4] Rust GPUVM support Alice Ryhl
  2025-11-28 14:14 ` [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc() Alice Ryhl
  2025-11-28 14:14 ` [PATCH 2/4] drm/gpuvm: drm_gpuvm_bo_obtain() requires lock and staged mode Alice Ryhl
@ 2025-11-28 14:14 ` Alice Ryhl
  2025-11-28 14:27   ` Boris Brezillon
  2025-11-28 14:14 ` [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction Alice Ryhl
  3 siblings, 1 reply; 12+ messages in thread
From: Alice Ryhl @ 2025-11-28 14:14 UTC (permalink / raw)
  To: Danilo Krummrich, Daniel Almeida
  Cc: Matthew Brost, Thomas Hellström, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Boris Brezillon, Steven Price, Liviu Dudau, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, Frank Binns, Matt Coster,
	Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
	Sean Paul, Marijn Suijten, Lyude Paul, Lucas De Marchi,
	Rodrigo Vivi, Sumit Semwal, Christian König, dri-devel,
	linux-kernel, rust-for-linux, linux-arm-msm, freedreno, nouveau,
	intel-xe, linux-media, linaro-mm-sig, Alice Ryhl

These methods just read the values stored in the op pointers without
modifying them, so it is appropriate to use const ptrs here.

This allows us to avoid const -> mut pointer casts in Rust.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/gpu/drm/drm_gpuvm.c | 6 +++---
 include/drm/drm_gpuvm.h     | 8 ++++----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 9cd06c7600dc32ceee0f0beb5e3daf31698a66b3..e06b58aabb8ea6ebd92c625583ae2852c9d2caf1 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -2283,7 +2283,7 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_interval_empty);
 void
 drm_gpuva_map(struct drm_gpuvm *gpuvm,
 	      struct drm_gpuva *va,
-	      struct drm_gpuva_op_map *op)
+	      const struct drm_gpuva_op_map *op)
 {
 	drm_gpuva_init_from_op(va, op);
 	drm_gpuva_insert(gpuvm, va);
@@ -2303,7 +2303,7 @@ EXPORT_SYMBOL_GPL(drm_gpuva_map);
 void
 drm_gpuva_remap(struct drm_gpuva *prev,
 		struct drm_gpuva *next,
-		struct drm_gpuva_op_remap *op)
+		const struct drm_gpuva_op_remap *op)
 {
 	struct drm_gpuva *va = op->unmap->va;
 	struct drm_gpuvm *gpuvm = va->vm;
@@ -2330,7 +2330,7 @@ EXPORT_SYMBOL_GPL(drm_gpuva_remap);
  * Removes the &drm_gpuva associated with the &drm_gpuva_op_unmap.
  */
 void
-drm_gpuva_unmap(struct drm_gpuva_op_unmap *op)
+drm_gpuva_unmap(const struct drm_gpuva_op_unmap *op)
 {
 	drm_gpuva_remove(op->va);
 }
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 0d3fc1f6cac9966a42f3bc82b0b491bfefaf5b96..655bd9104ffb24170fca14dfa034ee79f5400930 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -1121,7 +1121,7 @@ void drm_gpuva_ops_free(struct drm_gpuvm *gpuvm,
 			struct drm_gpuva_ops *ops);
 
 static inline void drm_gpuva_init_from_op(struct drm_gpuva *va,
-					  struct drm_gpuva_op_map *op)
+					  const struct drm_gpuva_op_map *op)
 {
 	va->va.addr = op->va.addr;
 	va->va.range = op->va.range;
@@ -1265,13 +1265,13 @@ int drm_gpuvm_sm_unmap_exec_lock(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
 
 void drm_gpuva_map(struct drm_gpuvm *gpuvm,
 		   struct drm_gpuva *va,
-		   struct drm_gpuva_op_map *op);
+		   const struct drm_gpuva_op_map *op);
 
 void drm_gpuva_remap(struct drm_gpuva *prev,
 		     struct drm_gpuva *next,
-		     struct drm_gpuva_op_remap *op);
+		     const struct drm_gpuva_op_remap *op);
 
-void drm_gpuva_unmap(struct drm_gpuva_op_unmap *op);
+void drm_gpuva_unmap(const struct drm_gpuva_op_unmap *op);
 
 /**
  * drm_gpuva_op_remap_to_unmap_range() - Helper to get the start and range of

-- 
2.52.0.487.g5c8c507ade-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] drm/gpuvm: use const for drm_gpuva_op_* ptrs
  2025-11-28 14:14 ` [PATCH 3/4] drm/gpuvm: use const for drm_gpuva_op_* ptrs Alice Ryhl
@ 2025-11-28 14:27   ` Boris Brezillon
  0 siblings, 0 replies; 12+ messages in thread
From: Boris Brezillon @ 2025-11-28 14:27 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Danilo Krummrich, Daniel Almeida, Matthew Brost,
	Thomas Hellström, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Steven Price,
	Liviu Dudau, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Frank Binns, Matt Coster, Rob Clark,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Sean Paul,
	Marijn Suijten, Lyude Paul, Lucas De Marchi, Rodrigo Vivi,
	Sumit Semwal, Christian König, dri-devel, linux-kernel,
	rust-for-linux, linux-arm-msm, freedreno, nouveau, intel-xe,
	linux-media, linaro-mm-sig

On Fri, 28 Nov 2025 14:14:17 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> These methods just read the values stored in the op pointers without
> modifying them, so it is appropriate to use const ptrs here.
> 
> This allows us to avoid const -> mut pointer casts in Rust.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> ---
>  drivers/gpu/drm/drm_gpuvm.c | 6 +++---
>  include/drm/drm_gpuvm.h     | 8 ++++----
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> index 9cd06c7600dc32ceee0f0beb5e3daf31698a66b3..e06b58aabb8ea6ebd92c625583ae2852c9d2caf1 100644
> --- a/drivers/gpu/drm/drm_gpuvm.c
> +++ b/drivers/gpu/drm/drm_gpuvm.c
> @@ -2283,7 +2283,7 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_interval_empty);
>  void
>  drm_gpuva_map(struct drm_gpuvm *gpuvm,
>  	      struct drm_gpuva *va,
> -	      struct drm_gpuva_op_map *op)
> +	      const struct drm_gpuva_op_map *op)
>  {
>  	drm_gpuva_init_from_op(va, op);
>  	drm_gpuva_insert(gpuvm, va);
> @@ -2303,7 +2303,7 @@ EXPORT_SYMBOL_GPL(drm_gpuva_map);
>  void
>  drm_gpuva_remap(struct drm_gpuva *prev,
>  		struct drm_gpuva *next,
> -		struct drm_gpuva_op_remap *op)
> +		const struct drm_gpuva_op_remap *op)
>  {
>  	struct drm_gpuva *va = op->unmap->va;
>  	struct drm_gpuvm *gpuvm = va->vm;
> @@ -2330,7 +2330,7 @@ EXPORT_SYMBOL_GPL(drm_gpuva_remap);
>   * Removes the &drm_gpuva associated with the &drm_gpuva_op_unmap.
>   */
>  void
> -drm_gpuva_unmap(struct drm_gpuva_op_unmap *op)
> +drm_gpuva_unmap(const struct drm_gpuva_op_unmap *op)
>  {
>  	drm_gpuva_remove(op->va);
>  }
> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
> index 0d3fc1f6cac9966a42f3bc82b0b491bfefaf5b96..655bd9104ffb24170fca14dfa034ee79f5400930 100644
> --- a/include/drm/drm_gpuvm.h
> +++ b/include/drm/drm_gpuvm.h
> @@ -1121,7 +1121,7 @@ void drm_gpuva_ops_free(struct drm_gpuvm *gpuvm,
>  			struct drm_gpuva_ops *ops);
>  
>  static inline void drm_gpuva_init_from_op(struct drm_gpuva *va,
> -					  struct drm_gpuva_op_map *op)
> +					  const struct drm_gpuva_op_map *op)
>  {
>  	va->va.addr = op->va.addr;
>  	va->va.range = op->va.range;
> @@ -1265,13 +1265,13 @@ int drm_gpuvm_sm_unmap_exec_lock(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
>  
>  void drm_gpuva_map(struct drm_gpuvm *gpuvm,
>  		   struct drm_gpuva *va,
> -		   struct drm_gpuva_op_map *op);
> +		   const struct drm_gpuva_op_map *op);
>  
>  void drm_gpuva_remap(struct drm_gpuva *prev,
>  		     struct drm_gpuva *next,
> -		     struct drm_gpuva_op_remap *op);
> +		     const struct drm_gpuva_op_remap *op);
>  
> -void drm_gpuva_unmap(struct drm_gpuva_op_unmap *op);
> +void drm_gpuva_unmap(const struct drm_gpuva_op_unmap *op);
>  
>  /**
>   * drm_gpuva_op_remap_to_unmap_range() - Helper to get the start and range of
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction
  2025-11-28 14:14 [PATCH 0/4] Rust GPUVM support Alice Ryhl
                   ` (2 preceding siblings ...)
  2025-11-28 14:14 ` [PATCH 3/4] drm/gpuvm: use const for drm_gpuva_op_* ptrs Alice Ryhl
@ 2025-11-28 14:14 ` Alice Ryhl
  2025-12-01 15:16   ` Daniel Almeida
  3 siblings, 1 reply; 12+ messages in thread
From: Alice Ryhl @ 2025-11-28 14:14 UTC (permalink / raw)
  To: Danilo Krummrich, Daniel Almeida
  Cc: Matthew Brost, Thomas Hellström, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Boris Brezillon, Steven Price, Liviu Dudau, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, Frank Binns, Matt Coster,
	Rob Clark, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
	Sean Paul, Marijn Suijten, Lyude Paul, Lucas De Marchi,
	Rodrigo Vivi, Sumit Semwal, Christian König, dri-devel,
	linux-kernel, rust-for-linux, linux-arm-msm, freedreno, nouveau,
	intel-xe, linux-media, linaro-mm-sig, Alice Ryhl, Asahi Lina

Add a GPUVM abstraction to be used by Rust GPU drivers.

GPUVM keeps track of a GPU's virtual address (VA) space and manages the
corresponding virtual mappings represented by "GPU VA" objects. It also
keeps track of the gem::Object<T> used to back the mappings through
GpuVmBo<T>.

This abstraction is only usable by drivers that wish to use GPUVM in
immediate mode. This allows us to build the locking scheme into the API
design. It means that the GEM mutex is used for the GEM gpuva list, and
that the resv lock is used for the extobj list. The evicted list is not
yet used in this version.

This abstraction provides a special handle called the GpuVmCore, which
is a wrapper around ARef<GpuVm> that provides access to the interval
tree. Generally, all changes to the address space requires mutable
access to this unique handle.

Some of the safety comments are still somewhat WIP, but I think the API
should be sound as-is.

Co-developed-by: Asahi Lina <lina+kernel@asahilina.net>
Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
Co-developed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 MAINTAINERS                     |   1 +
 rust/bindings/bindings_helper.h |   2 +
 rust/helpers/drm_gpuvm.c        |  43 ++++
 rust/helpers/helpers.c          |   1 +
 rust/kernel/drm/gpuvm/mod.rs    | 394 +++++++++++++++++++++++++++++++++
 rust/kernel/drm/gpuvm/sm_ops.rs | 469 ++++++++++++++++++++++++++++++++++++++++
 rust/kernel/drm/gpuvm/va.rs     | 148 +++++++++++++
 rust/kernel/drm/gpuvm/vm_bo.rs  | 213 ++++++++++++++++++
 rust/kernel/drm/mod.rs          |   1 +
 9 files changed, 1272 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 952aed4619c25d395c12962e559d6cd3362f64a7..946629eb9ebf19922bbe782fed37be07067d6bf2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8591,6 +8591,7 @@ S:	Supported
 T:	git https://gitlab.freedesktop.org/drm/misc/kernel.git
 F:	drivers/gpu/drm/drm_gpuvm.c
 F:	include/drm/drm_gpuvm.h
+F:	rust/kernel/drm/gpuvm/
 
 DRM LOG
 M:	Jocelyn Falempe <jfalempe@redhat.com>
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 2e43c66635a2c9f31bd99b9817bd2d6ab89fbcf2..c776ae198e1db91f010f88ff1d1c888a3036a87f 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -33,6 +33,7 @@
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
 #include <drm/drm_gem.h>
+#include <drm/drm_gpuvm.h>
 #include <drm/drm_ioctl.h>
 #include <kunit/test.h>
 #include <linux/auxiliary_bus.h>
@@ -103,6 +104,7 @@ const gfp_t RUST_CONST_HELPER___GFP_HIGHMEM = ___GFP_HIGHMEM;
 const gfp_t RUST_CONST_HELPER___GFP_NOWARN = ___GFP_NOWARN;
 const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ROTATIONAL = BLK_FEAT_ROTATIONAL;
 const fop_flags_t RUST_CONST_HELPER_FOP_UNSIGNED_OFFSET = FOP_UNSIGNED_OFFSET;
+const u32 RUST_CONST_HELPER_DRM_EXEC_INTERRUPTIBLE_WAIT = DRM_EXEC_INTERRUPTIBLE_WAIT;
 
 const xa_mark_t RUST_CONST_HELPER_XA_PRESENT = XA_PRESENT;
 
diff --git a/rust/helpers/drm_gpuvm.c b/rust/helpers/drm_gpuvm.c
new file mode 100644
index 0000000000000000000000000000000000000000..18b7dbd2e32c3162455b344e72ec2940c632cc6b
--- /dev/null
+++ b/rust/helpers/drm_gpuvm.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0 or MIT
+
+#ifdef CONFIG_DRM_GPUVM
+
+#include <drm/drm_gpuvm.h>
+
+struct drm_gpuvm *rust_helper_drm_gpuvm_get(struct drm_gpuvm *obj)
+{
+	return drm_gpuvm_get(obj);
+}
+
+void rust_helper_drm_gpuva_init_from_op(struct drm_gpuva *va, struct drm_gpuva_op_map *op)
+{
+	drm_gpuva_init_from_op(va, op);
+}
+
+struct drm_gpuvm_bo *rust_helper_drm_gpuvm_bo_get(struct drm_gpuvm_bo *vm_bo)
+{
+	return drm_gpuvm_bo_get(vm_bo);
+}
+
+void rust_helper_drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
+{
+	return drm_gpuvm_exec_unlock(vm_exec);
+}
+
+bool rust_helper_drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
+				     struct drm_gem_object *obj)
+{
+	return drm_gpuvm_is_extobj(gpuvm, obj);
+}
+
+int rust_helper_dma_resv_lock(struct dma_resv *obj, struct ww_acquire_ctx *ctx)
+{
+	return dma_resv_lock(obj, ctx);
+}
+
+void rust_helper_dma_resv_unlock(struct dma_resv *obj)
+{
+	dma_resv_unlock(obj);
+}
+
+#endif // CONFIG_DRM_GPUVM
diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index 551da6c9b5064c324d6f62bafcec672c6c6f5bee..91f45155eb9c2c4e92b56ee1abf7d45188873f3c 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -26,6 +26,7 @@
 #include "device.c"
 #include "dma.c"
 #include "drm.c"
+#include "drm_gpuvm.c"
 #include "err.c"
 #include "irq.c"
 #include "fs.c"
diff --git a/rust/kernel/drm/gpuvm/mod.rs b/rust/kernel/drm/gpuvm/mod.rs
new file mode 100644
index 0000000000000000000000000000000000000000..9834dbb938a3622e46048e9b8e06bc6bf03aa0d2
--- /dev/null
+++ b/rust/kernel/drm/gpuvm/mod.rs
@@ -0,0 +1,394 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+//! DRM GPUVM in immediate mode
+//!
+//! Rust abstractions for using GPUVM in immediate mode. This is when the GPUVM state is updated
+//! during `run_job()`, i.e., in the DMA fence signalling critical path, to ensure that the GPUVM
+//! and the GPU's virtual address space has the same state at all times.
+//!
+//! C header: [`include/drm/drm_gpuvm.h`](srctree/include/drm/drm_gpuvm.h)
+
+use kernel::{
+    alloc::{AllocError, Flags as AllocFlags},
+    bindings, drm,
+    drm::gem::IntoGEMObject,
+    error::to_result,
+    prelude::*,
+    sync::aref::{ARef, AlwaysRefCounted},
+    types::Opaque,
+};
+
+use core::{
+    cell::UnsafeCell,
+    marker::PhantomData,
+    mem::{ManuallyDrop, MaybeUninit},
+    ops::{Deref, DerefMut, Range},
+    ptr::{self, NonNull},
+};
+
+mod sm_ops;
+pub use self::sm_ops::*;
+
+mod vm_bo;
+pub use self::vm_bo::*;
+
+mod va;
+pub use self::va::*;
+
+/// A DRM GPU VA manager.
+///
+/// This object is refcounted, but the "core" is only accessible using a special unique handle. The
+/// core consists of the `core` field and the GPUVM's interval tree.
+#[repr(C)]
+#[pin_data]
+pub struct GpuVm<T: DriverGpuVm> {
+    #[pin]
+    vm: Opaque<bindings::drm_gpuvm>,
+    /// Accessed only through the [`GpuVmCore`] reference.
+    core: UnsafeCell<T>,
+    /// Shared data not protected by any lock.
+    #[pin]
+    shared_data: T::SharedData,
+}
+
+// SAFETY: dox
+unsafe impl<T: DriverGpuVm> AlwaysRefCounted for GpuVm<T> {
+    fn inc_ref(&self) {
+        // SAFETY: dox
+        unsafe { bindings::drm_gpuvm_get(self.vm.get()) };
+    }
+
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: dox
+        unsafe { bindings::drm_gpuvm_put((*obj.as_ptr()).vm.get()) };
+    }
+}
+
+impl<T: DriverGpuVm> GpuVm<T> {
+    const fn vtable() -> &'static bindings::drm_gpuvm_ops {
+        &bindings::drm_gpuvm_ops {
+            vm_free: Some(Self::vm_free),
+            op_alloc: None,
+            op_free: None,
+            vm_bo_alloc: GpuVmBo::<T>::ALLOC_FN,
+            vm_bo_free: GpuVmBo::<T>::FREE_FN,
+            vm_bo_validate: None,
+            sm_step_map: Some(Self::sm_step_map),
+            sm_step_unmap: Some(Self::sm_step_unmap),
+            sm_step_remap: Some(Self::sm_step_remap),
+        }
+    }
+
+    /// Creates a GPUVM instance.
+    #[expect(clippy::new_ret_no_self)]
+    pub fn new<E>(
+        name: &'static CStr,
+        dev: &drm::Device<T::Driver>,
+        r_obj: &T::Object,
+        range: Range<u64>,
+        reserve_range: Range<u64>,
+        core: T,
+        shared: impl PinInit<T::SharedData, E>,
+    ) -> Result<GpuVmCore<T>, E>
+    where
+        E: From<AllocError>,
+        E: From<core::convert::Infallible>,
+    {
+        let obj = KBox::try_pin_init::<E>(
+            try_pin_init!(Self {
+                core <- UnsafeCell::new(core),
+                shared_data <- shared,
+                vm <- Opaque::ffi_init(|vm| {
+                    // SAFETY: These arguments are valid. `vm` is valid until refcount drops to
+                    // zero.
+                    unsafe {
+                        bindings::drm_gpuvm_init(
+                            vm,
+                            name.as_char_ptr(),
+                            bindings::drm_gpuvm_flags_DRM_GPUVM_IMMEDIATE_MODE
+                                | bindings::drm_gpuvm_flags_DRM_GPUVM_RESV_PROTECTED,
+                            dev.as_raw(),
+                            r_obj.as_raw(),
+                            range.start,
+                            range.end - range.start,
+                            reserve_range.start,
+                            reserve_range.end - reserve_range.start,
+                            const { Self::vtable() },
+                        )
+                    }
+                }),
+            }? E),
+            GFP_KERNEL,
+        )?;
+        // SAFETY: This transfers the initial refcount to the ARef.
+        Ok(GpuVmCore(unsafe {
+            ARef::from_raw(NonNull::new_unchecked(KBox::into_raw(
+                Pin::into_inner_unchecked(obj),
+            )))
+        }))
+    }
+
+    /// Access this [`GpuVm`] from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// For the duration of `'a`, the pointer must reference a valid [`GpuVm<T>`].
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *mut bindings::drm_gpuvm) -> &'a Self {
+        // SAFETY: `drm_gpuvm` is first field and `repr(C)`.
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Get a raw pointer.
+    #[inline]
+    pub fn as_raw(&self) -> *mut bindings::drm_gpuvm {
+        self.vm.get()
+    }
+
+    /// Access the shared data.
+    #[inline]
+    pub fn shared(&self) -> &T::SharedData {
+        &self.shared_data
+    }
+
+    /// The start of the VA space.
+    #[inline]
+    pub fn va_start(&self) -> u64 {
+        // SAFETY: Safe by the type invariant of `GpuVm<T>`.
+        unsafe { (*self.as_raw()).mm_start }
+    }
+
+    /// The length of the address space
+    #[inline]
+    pub fn va_length(&self) -> u64 {
+        // SAFETY: Safe by the type invariant of `GpuVm<T>`.
+        unsafe { (*self.as_raw()).mm_range }
+    }
+
+    /// Returns the range of the GPU virtual address space.
+    #[inline]
+    pub fn va_range(&self) -> Range<u64> {
+        let start = self.va_start();
+        let end = start + self.va_length();
+        Range { start, end }
+    }
+
+    /// Returns a [`GpuVmBoObtain`] for the provided GEM object.
+    #[inline]
+    pub fn obtain(
+        &self,
+        obj: &T::Object,
+        data: impl PinInit<T::VmBoData>,
+    ) -> Result<GpuVmBoObtain<T>, AllocError> {
+        Ok(GpuVmBoAlloc::new(self, obj, data)?.obtain())
+    }
+
+    /// Prepare this GPUVM.
+    #[inline]
+    pub fn prepare(&self, num_fences: u32) -> impl PinInit<GpuVmExec<'_, T>, Error> {
+        try_pin_init!(GpuVmExec {
+            exec <- Opaque::try_ffi_init(|exec: *mut bindings::drm_gpuvm_exec| {
+                // SAFETY: exec is valid but unused memory, so we can write.
+                unsafe {
+                    ptr::write_bytes(exec, 0u8, 1usize);
+                    ptr::write(&raw mut (*exec).vm, self.as_raw());
+                    ptr::write(&raw mut (*exec).flags, bindings::DRM_EXEC_INTERRUPTIBLE_WAIT);
+                    ptr::write(&raw mut (*exec).num_fences, num_fences);
+                }
+
+                // SAFETY: We can prepare the GPUVM.
+                to_result(unsafe { bindings::drm_gpuvm_exec_lock(exec) })
+            }),
+            _gpuvm: PhantomData,
+        })
+    }
+
+    /// Clean up buffer objects that are no longer used.
+    #[inline]
+    pub fn deferred_cleanup(&self) {
+        // SAFETY: Always safe to perform deferred cleanup.
+        unsafe { bindings::drm_gpuvm_bo_deferred_cleanup(self.as_raw()) }
+    }
+
+    /// Check if this GEM object is an external object for this GPUVM.
+    #[inline]
+    pub fn is_extobj(&self, obj: &T::Object) -> bool {
+        // SAFETY: We may call this with any GPUVM and GEM object.
+        unsafe { bindings::drm_gpuvm_is_extobj(self.as_raw(), obj.as_raw()) }
+    }
+
+    /// Free this GPUVM.
+    ///
+    /// # Safety
+    ///
+    /// Called when refcount hits zero.
+    unsafe extern "C" fn vm_free(me: *mut bindings::drm_gpuvm) {
+        // SAFETY: GPUVM was allocated with KBox and can now be freed.
+        drop(unsafe { KBox::<Self>::from_raw(me.cast()) })
+    }
+}
+
+/// The manager for a GPUVM.
+pub trait DriverGpuVm: Sized {
+    /// Parent `Driver` for this object.
+    type Driver: drm::Driver;
+
+    /// The kind of GEM object stored in this GPUVM.
+    type Object: IntoGEMObject;
+
+    /// Data stored in the [`GpuVm`] that is fully shared.
+    type SharedData;
+
+    /// Data stored with each `struct drm_gpuvm_bo`.
+    type VmBoData;
+
+    /// Data stored with each `struct drm_gpuva`.
+    type VaData;
+
+    /// The private data passed to callbacks.
+    type SmContext;
+
+    /// Indicates that a new mapping should be created.
+    fn sm_step_map<'op>(
+        &mut self,
+        op: OpMap<'op, Self>,
+        context: &mut Self::SmContext,
+    ) -> Result<OpMapped<'op, Self>, Error>;
+
+    /// Indicates that an existing mapping should be removed.
+    fn sm_step_unmap<'op>(
+        &mut self,
+        op: OpUnmap<'op, Self>,
+        context: &mut Self::SmContext,
+    ) -> Result<OpUnmapped<'op, Self>, Error>;
+
+    /// Indicates that an existing mapping should be split up.
+    fn sm_step_remap<'op>(
+        &mut self,
+        op: OpRemap<'op, Self>,
+        context: &mut Self::SmContext,
+    ) -> Result<OpRemapped<'op, Self>, Error>;
+}
+
+/// The core of the DRM GPU VA manager.
+///
+/// This object is the reference to the GPUVM that
+///
+/// # Invariants
+///
+/// This object owns the core.
+pub struct GpuVmCore<T: DriverGpuVm>(ARef<GpuVm<T>>);
+
+impl<T: DriverGpuVm> GpuVmCore<T> {
+    /// Get a reference without access to `core`.
+    #[inline]
+    pub fn gpuvm(&self) -> &GpuVm<T> {
+        &self.0
+    }
+}
+
+impl<T: DriverGpuVm> Deref for GpuVmCore<T> {
+    type Target = T;
+    #[inline]
+    fn deref(&self) -> &T {
+        // SAFETY: By the type invariants we may access `core`.
+        unsafe { &*self.0.core.get() }
+    }
+}
+
+impl<T: DriverGpuVm> DerefMut for GpuVmCore<T> {
+    #[inline]
+    fn deref_mut(&mut self) -> &mut T {
+        // SAFETY: By the type invariants we may access `core`.
+        unsafe { &mut *self.0.core.get() }
+    }
+}
+
+/// The exec token for preparing the objects.
+#[pin_data(PinnedDrop)]
+pub struct GpuVmExec<'a, T: DriverGpuVm> {
+    #[pin]
+    exec: Opaque<bindings::drm_gpuvm_exec>,
+    _gpuvm: PhantomData<&'a mut GpuVm<T>>,
+}
+
+impl<'a, T: DriverGpuVm> GpuVmExec<'a, T> {
+    /// Add a fence.
+    ///
+    /// # Safety
+    ///
+    /// `fence` arg must be valid.
+    pub unsafe fn resv_add_fence(
+        &self,
+        // TODO: use a safe fence abstraction
+        fence: *mut bindings::dma_fence,
+        private_usage: DmaResvUsage,
+        extobj_usage: DmaResvUsage,
+    ) {
+        // SAFETY: Caller ensures fence is ok.
+        unsafe {
+            bindings::drm_gpuvm_resv_add_fence(
+                (*self.exec.get()).vm,
+                &raw mut (*self.exec.get()).exec,
+                fence,
+                private_usage as u32,
+                extobj_usage as u32,
+            )
+        }
+    }
+}
+
+#[pinned_drop]
+impl<'a, T: DriverGpuVm> PinnedDrop for GpuVmExec<'a, T> {
+    fn drop(self: Pin<&mut Self>) {
+        // SAFETY: We hold the lock, so it's safe to unlock.
+        unsafe { bindings::drm_gpuvm_exec_unlock(self.exec.get()) };
+    }
+}
+
+/// How the fence will be used.
+#[repr(u32)]
+pub enum DmaResvUsage {
+    /// For in kernel memory management only (e.g. copying, clearing memory).
+    Kernel = bindings::dma_resv_usage_DMA_RESV_USAGE_KERNEL,
+    /// Implicit write synchronization for userspace submissions.
+    Write = bindings::dma_resv_usage_DMA_RESV_USAGE_WRITE,
+    /// Implicit read synchronization for userspace submissions.
+    Read = bindings::dma_resv_usage_DMA_RESV_USAGE_READ,
+    /// No implicit sync (e.g. preemption fences, page table updates, TLB flushes).
+    Bookkeep = bindings::dma_resv_usage_DMA_RESV_USAGE_BOOKKEEP,
+}
+
+/// A lock guard for the GPUVM's resv lock.
+///
+/// This guard provides access to the extobj and evicted lists.
+///
+/// # Invariants
+///
+/// Holds the GPUVM resv lock.
+pub struct GpuvmResvLockGuard<'a, T: DriverGpuVm>(&'a GpuVm<T>);
+
+impl<T: DriverGpuVm> GpuVm<T> {
+    /// Lock the VM's resv lock.
+    #[inline]
+    pub fn resv_lock(&self) -> GpuvmResvLockGuard<'_, T> {
+        // SAFETY: It's always ok to lock the resv lock.
+        unsafe { bindings::dma_resv_lock(self.raw_resv_lock(), ptr::null_mut()) };
+        // INVARIANTS: We took the lock.
+        GpuvmResvLockGuard(self)
+    }
+
+    #[inline]
+    fn raw_resv_lock(&self) -> *mut bindings::dma_resv {
+        // SAFETY: `r_obj` is immutable and valid for duration of GPUVM.
+        unsafe { (*(*self.as_raw()).r_obj).resv }
+    }
+}
+
+impl<'a, T: DriverGpuVm> Drop for GpuvmResvLockGuard<'a, T> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: We hold the lock so we can release it.
+        unsafe { bindings::dma_resv_unlock(self.0.raw_resv_lock()) };
+    }
+}
diff --git a/rust/kernel/drm/gpuvm/sm_ops.rs b/rust/kernel/drm/gpuvm/sm_ops.rs
new file mode 100644
index 0000000000000000000000000000000000000000..c0dbd4675de644a3b1cbe7d528194ca7fb471848
--- /dev/null
+++ b/rust/kernel/drm/gpuvm/sm_ops.rs
@@ -0,0 +1,469 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+#![allow(clippy::tabs_in_doc_comments)]
+
+use super::*;
+
+struct SmData<'a, T: DriverGpuVm> {
+    gpuvm: &'a mut GpuVmCore<T>,
+    user_context: &'a mut T::SmContext,
+}
+
+#[repr(C)]
+struct SmMapData<'a, T: DriverGpuVm> {
+    sm_data: SmData<'a, T>,
+    vm_bo: GpuVmBoObtain<T>,
+}
+
+/// The argument for [`GpuVmCore::sm_map`].
+pub struct OpMapRequest<'a, T: DriverGpuVm> {
+    /// Address in GPU virtual address space.
+    pub addr: u64,
+    /// Length of mapping to create.
+    pub range: u64,
+    /// Offset in GEM object.
+    pub offset: u64,
+    /// The GEM object to map.
+    pub vm_bo: GpuVmBoObtain<T>,
+    /// The user-provided context type.
+    pub context: &'a mut T::SmContext,
+}
+
+impl<'a, T: DriverGpuVm> OpMapRequest<'a, T> {
+    fn raw_request(&self) -> bindings::drm_gpuvm_map_req {
+        bindings::drm_gpuvm_map_req {
+            map: bindings::drm_gpuva_op_map {
+                va: bindings::drm_gpuva_op_map__bindgen_ty_1 {
+                    addr: self.addr,
+                    range: self.range,
+                },
+                gem: bindings::drm_gpuva_op_map__bindgen_ty_2 {
+                    offset: self.offset,
+                    obj: self.vm_bo.obj().as_raw(),
+                },
+            },
+        }
+    }
+}
+
+/// ```
+/// struct drm_gpuva_op_map {
+/// 	/**
+/// 	 * @va: structure containing address and range of a map
+/// 	 * operation
+/// 	 */
+/// 	struct {
+/// 		/**
+/// 		 * @va.addr: the base address of the new mapping
+/// 		 */
+/// 		u64 addr;
+///
+/// 		/**
+/// 		 * @va.range: the range of the new mapping
+/// 		 */
+/// 		u64 range;
+/// 	} va;
+///
+/// 	/**
+/// 	 * @gem: structure containing the &drm_gem_object and it's offset
+/// 	 */
+/// 	struct {
+/// 		/**
+/// 		 * @gem.offset: the offset within the &drm_gem_object
+/// 		 */
+/// 		u64 offset;
+///
+/// 		/**
+/// 		 * @gem.obj: the &drm_gem_object to map
+/// 		 */
+/// 		struct drm_gem_object *obj;
+/// 	} gem;
+/// };
+/// ```
+pub struct OpMap<'op, T: DriverGpuVm> {
+    op: &'op bindings::drm_gpuva_op_map,
+    // Since these abstractions are designed for immediate mode, the VM BO needs to be
+    // pre-allocated, so we always have it available when we reach this point.
+    vm_bo: &'op GpuVmBo<T>,
+    _invariant: PhantomData<*mut &'op mut T>,
+}
+
+impl<'op, T: DriverGpuVm> OpMap<'op, T> {
+    /// The base address of the new mapping.
+    pub fn addr(&self) -> u64 {
+        self.op.va.addr
+    }
+
+    /// The length of the new mapping.
+    pub fn length(&self) -> u64 {
+        self.op.va.range
+    }
+
+    /// The offset within the [`drm_gem_object`](crate::gem::Object).
+    pub fn gem_offset(&self) -> u64 {
+        self.op.gem.offset
+    }
+
+    /// The [`drm_gem_object`](crate::gem::Object) to map.
+    pub fn obj(&self) -> &T::Object {
+        // SAFETY: The `obj` pointer is guaranteed to be valid.
+        unsafe { <T::Object as IntoGEMObject>::from_raw(self.op.gem.obj) }
+    }
+
+    /// The [`GpuVmBo`] that the new VA will be associated with.
+    pub fn vm_bo(&self) -> &GpuVmBo<T> {
+        self.vm_bo
+    }
+
+    /// Use the pre-allocated VA to carry out this map operation.
+    pub fn insert(self, va: GpuVaAlloc<T>, va_data: impl PinInit<T::VaData>) -> OpMapped<'op, T> {
+        let va = va.prepare(va_data);
+        // SAFETY: By the type invariants we may access the interval tree.
+        unsafe { bindings::drm_gpuva_map(self.vm_bo.gpuvm().as_raw(), va, self.op) };
+        // SAFETY: The GEM object is valid, so the mutex is properly initialized.
+        unsafe { bindings::mutex_lock(&raw mut (*self.op.gem.obj).gpuva.lock) };
+        // SAFETY: The va is prepared for insertion, and we hold the GEM lock.
+        unsafe { bindings::drm_gpuva_link(va, self.vm_bo.as_raw()) };
+        // SAFETY: We took the mutex above, so we may unlock it.
+        unsafe { bindings::mutex_unlock(&raw mut (*self.op.gem.obj).gpuva.lock) };
+        OpMapped {
+            _invariant: self._invariant,
+        }
+    }
+}
+
+/// Represents a completed [`OpMap`] operation.
+pub struct OpMapped<'op, T> {
+    _invariant: PhantomData<*mut &'op mut T>,
+}
+
+/// ```
+/// struct drm_gpuva_op_unmap {
+/// 	/**
+/// 	 * @va: the &drm_gpuva to unmap
+/// 	 */
+/// 	struct drm_gpuva *va;
+///
+/// 	/**
+/// 	 * @keep:
+/// 	 *
+/// 	 * Indicates whether this &drm_gpuva is physically contiguous with the
+/// 	 * original mapping request.
+/// 	 *
+/// 	 * Optionally, if &keep is set, drivers may keep the actual page table
+/// 	 * mappings for this &drm_gpuva, adding the missing page table entries
+/// 	 * only and update the &drm_gpuvm accordingly.
+/// 	 */
+/// 	bool keep;
+/// };
+/// ```
+pub struct OpUnmap<'op, T: DriverGpuVm> {
+    op: &'op bindings::drm_gpuva_op_unmap,
+    _invariant: PhantomData<*mut &'op mut T>,
+}
+
+impl<'op, T: DriverGpuVm> OpUnmap<'op, T> {
+    /// Indicates whether this `drm_gpuva` is physically contiguous with the
+    /// original mapping request.
+    ///
+    /// Optionally, if `keep` is set, drivers may keep the actual page table
+    /// mappings for this `drm_gpuva`, adding the missing page table entries
+    /// only and update the `drm_gpuvm` accordingly.
+    pub fn keep(&self) -> bool {
+        self.op.keep
+    }
+
+    /// The range being unmapped.
+    pub fn va(&self) -> &GpuVa<T> {
+        // SAFETY: This is a valid va.
+        unsafe { GpuVa::<T>::from_raw(self.op.va) }
+    }
+
+    /// Remove the VA.
+    pub fn remove(self) -> (OpUnmapped<'op, T>, GpuVaRemoved<T>) {
+        // SAFETY: The op references a valid drm_gpuva in the GPUVM.
+        unsafe { bindings::drm_gpuva_unmap(self.op) };
+        // SAFETY: The va is no longer in the interval tree so we may unlink it.
+        unsafe { bindings::drm_gpuva_unlink_defer(self.op.va) };
+
+        // SAFETY: We just removed this va from the `GpuVm<T>`.
+        let va = unsafe { GpuVaRemoved::from_raw(self.op.va) };
+
+        (
+            OpUnmapped {
+                _invariant: self._invariant,
+            },
+            va,
+        )
+    }
+}
+
+/// Represents a completed [`OpUnmap`] operation.
+pub struct OpUnmapped<'op, T> {
+    _invariant: PhantomData<*mut &'op mut T>,
+}
+
+/// ```
+/// struct drm_gpuva_op_remap {
+/// 	/**
+/// 	 * @prev: the preceding part of a split mapping
+/// 	 */
+/// 	struct drm_gpuva_op_map *prev;
+///
+/// 	/**
+/// 	 * @next: the subsequent part of a split mapping
+/// 	 */
+/// 	struct drm_gpuva_op_map *next;
+///
+/// 	/**
+/// 	 * @unmap: the unmap operation for the original existing mapping
+/// 	 */
+/// 	struct drm_gpuva_op_unmap *unmap;
+/// };
+/// ```
+pub struct OpRemap<'op, T: DriverGpuVm> {
+    op: &'op bindings::drm_gpuva_op_remap,
+    _invariant: PhantomData<*mut &'op mut T>,
+}
+
+impl<'op, T: DriverGpuVm> OpRemap<'op, T> {
+    /// The preceding part of a split mapping.
+    #[inline]
+    pub fn prev(&self) -> Option<&OpRemapMapData> {
+        // SAFETY: We checked for null, so the pointer must be valid.
+        NonNull::new(self.op.prev).map(|ptr| unsafe { OpRemapMapData::from_raw(ptr) })
+    }
+
+    /// The subsequent part of a split mapping.
+    #[inline]
+    pub fn next(&self) -> Option<&OpRemapMapData> {
+        // SAFETY: We checked for null, so the pointer must be valid.
+        NonNull::new(self.op.next).map(|ptr| unsafe { OpRemapMapData::from_raw(ptr) })
+    }
+
+    /// Indicates whether the `drm_gpuva` being removed is physically contiguous with the original
+    /// mapping request.
+    ///
+    /// Optionally, if `keep` is set, drivers may keep the actual page table mappings for this
+    /// `drm_gpuva`, adding the missing page table entries only and update the `drm_gpuvm`
+    /// accordingly.
+    #[inline]
+    pub fn keep(&self) -> bool {
+        // SAFETY: The unmap pointer is always valid.
+        unsafe { (*self.op.unmap).keep }
+    }
+
+    /// The range being unmapped.
+    #[inline]
+    pub fn va_to_unmap(&self) -> &GpuVa<T> {
+        // SAFETY: This is a valid va.
+        unsafe { GpuVa::<T>::from_raw((*self.op.unmap).va) }
+    }
+
+    /// The [`drm_gem_object`](crate::gem::Object) whose VA is being remapped.
+    #[inline]
+    pub fn obj(&self) -> &T::Object {
+        self.va_to_unmap().obj()
+    }
+
+    /// The [`GpuVmBo`] that is being remapped.
+    #[inline]
+    pub fn vm_bo(&self) -> &GpuVmBo<T> {
+        self.va_to_unmap().vm_bo()
+    }
+
+    /// Update the GPUVM to perform the remapping.
+    pub fn remap(
+        self,
+        va_alloc: [GpuVaAlloc<T>; 2],
+        prev_data: impl PinInit<T::VaData>,
+        next_data: impl PinInit<T::VaData>,
+    ) -> (OpRemapped<'op, T>, OpRemapRet<T>) {
+        let [va1, va2] = va_alloc;
+
+        let mut unused_va = None;
+        let mut prev_ptr = ptr::null_mut();
+        let mut next_ptr = ptr::null_mut();
+        if self.prev().is_some() {
+            prev_ptr = va1.prepare(prev_data);
+        } else {
+            unused_va = Some(va1);
+        }
+        if self.next().is_some() {
+            next_ptr = va2.prepare(next_data);
+        } else {
+            unused_va = Some(va2);
+        }
+
+        // SAFETY: the pointers are non-null when required
+        unsafe { bindings::drm_gpuva_remap(prev_ptr, next_ptr, self.op) };
+
+        // SAFETY: The GEM object is valid, so the mutex is properly initialized.
+        unsafe { bindings::mutex_lock(&raw mut (*self.obj().as_raw()).gpuva.lock) };
+        if !prev_ptr.is_null() {
+            // SAFETY: The prev_ptr is a valid drm_gpuva prepared for insertion. The vm_bo is still
+            // valid as the not-yet-unlinked gpuva holds a refcount on the vm_bo.
+            unsafe { bindings::drm_gpuva_link(prev_ptr, self.vm_bo().as_raw()) };
+        }
+        if !next_ptr.is_null() {
+            // SAFETY: The next_ptr is a valid drm_gpuva prepared for insertion. The vm_bo is still
+            // valid as the not-yet-unlinked gpuva holds a refcount on the vm_bo.
+            unsafe { bindings::drm_gpuva_link(next_ptr, self.vm_bo().as_raw()) };
+        }
+        // SAFETY: We took the mutex above, so we may unlock it.
+        unsafe { bindings::mutex_unlock(&raw mut (*self.obj().as_raw()).gpuva.lock) };
+        // SAFETY: The va is no longer in the interval tree so we may unlink it.
+        unsafe { bindings::drm_gpuva_unlink_defer((*self.op.unmap).va) };
+
+        (
+            OpRemapped {
+                _invariant: self._invariant,
+            },
+            OpRemapRet {
+                // SAFETY: We just removed this va from the `GpuVm<T>`.
+                unmapped_va: unsafe { GpuVaRemoved::from_raw((*self.op.unmap).va) },
+                unused_va,
+            },
+        )
+    }
+}
+
+/// Part of an [`OpRemap`] that represents a new mapping.
+#[repr(transparent)]
+pub struct OpRemapMapData(bindings::drm_gpuva_op_map);
+
+impl OpRemapMapData {
+    /// # Safety
+    /// Must reference a valid `drm_gpuva_op_map` for duration of `'a`.
+    unsafe fn from_raw<'a>(ptr: NonNull<bindings::drm_gpuva_op_map>) -> &'a Self {
+        // SAFETY: ok per safety requirements
+        unsafe { ptr.cast().as_ref() }
+    }
+
+    /// The base address of the new mapping.
+    pub fn addr(&self) -> u64 {
+        self.0.va.addr
+    }
+
+    /// The length of the new mapping.
+    pub fn length(&self) -> u64 {
+        self.0.va.range
+    }
+
+    /// The offset within the [`drm_gem_object`](crate::gem::Object).
+    pub fn gem_offset(&self) -> u64 {
+        self.0.gem.offset
+    }
+}
+
+/// Struct containing objects removed or not used by [`OpRemap::remap`].
+pub struct OpRemapRet<T: DriverGpuVm> {
+    /// The `drm_gpuva` that was removed.
+    pub unmapped_va: GpuVaRemoved<T>,
+    /// If the remap did not split the region into two pieces, then the unused `drm_gpuva` is
+    /// returned here.
+    pub unused_va: Option<GpuVaAlloc<T>>,
+}
+
+/// Represents a completed [`OpRemap`] operation.
+pub struct OpRemapped<'op, T> {
+    _invariant: PhantomData<*mut &'op mut T>,
+}
+
+impl<T: DriverGpuVm> GpuVmCore<T> {
+    /// Create a mapping, removing or remapping anything that overlaps.
+    #[inline]
+    pub fn sm_map(&mut self, req: OpMapRequest<'_, T>) -> Result {
+        let gpuvm = self.gpuvm().as_raw();
+        let raw_req = req.raw_request();
+        let mut p = SmMapData {
+            sm_data: SmData {
+                gpuvm: self,
+                user_context: req.context,
+            },
+            vm_bo: req.vm_bo,
+        };
+        // SAFETY:
+        // * raw_request() creates a valid request.
+        // * The private data is valid to be interpreted as both SmData and SmMapData since the
+        //   first field of SmMapData is SmData.
+        to_result(unsafe {
+            bindings::drm_gpuvm_sm_map(gpuvm, (&raw mut p).cast(), &raw const raw_req)
+        })
+    }
+
+    /// Remove any mappings in the given region.
+    #[inline]
+    pub fn sm_unmap(&mut self, addr: u64, length: u64, context: &mut T::SmContext) -> Result {
+        let gpuvm = self.gpuvm().as_raw();
+        let mut p = SmData {
+            gpuvm: self,
+            user_context: context,
+        };
+        // SAFETY:
+        // * raw_request() creates a valid request.
+        // * The private data is valid to be interpreted as only SmData, but drm_gpuvm_sm_unmap()
+        //   never calls sm_step_map().
+        to_result(unsafe { bindings::drm_gpuvm_sm_unmap(gpuvm, (&raw mut p).cast(), addr, length) })
+    }
+}
+
+impl<T: DriverGpuVm> GpuVm<T> {
+    /// # Safety
+    /// Must be called from `sm_map`.
+    pub(super) unsafe extern "C" fn sm_step_map(
+        op: *mut bindings::drm_gpuva_op,
+        p: *mut c_void,
+    ) -> c_int {
+        // SAFETY: If we reach `sm_step_map` then we were called from `sm_map` which always passes
+        // an `SmMapData` as private data.
+        let p = unsafe { &mut *p.cast::<SmMapData<'_, T>>() };
+        let op = OpMap {
+            // SAFETY: sm_step_map is called with a map operation.
+            op: unsafe { &(*op).__bindgen_anon_1.map },
+            vm_bo: &p.vm_bo,
+            _invariant: PhantomData,
+        };
+        match p.sm_data.gpuvm.sm_step_map(op, p.sm_data.user_context) {
+            Ok(OpMapped { .. }) => 0,
+            Err(err) => err.to_errno(),
+        }
+    }
+    /// # Safety
+    /// Must be called from `sm_map` or `sm_unmap`.
+    pub(super) unsafe extern "C" fn sm_step_unmap(
+        op: *mut bindings::drm_gpuva_op,
+        p: *mut c_void,
+    ) -> c_int {
+        // SAFETY: If we reach `sm_step_unmap` then we were called from `sm_map` or `sm_unmap` which passes either
+        // an `SmMapData` or `SmData` as private data. Both cases can be cast to `SmData`.
+        let p = unsafe { &mut *p.cast::<SmData<'_, T>>() };
+        let op = OpUnmap {
+            // SAFETY: sm_step_unmap is called with an unmap operation.
+            op: unsafe { &(*op).__bindgen_anon_1.unmap },
+            _invariant: PhantomData,
+        };
+        match p.gpuvm.sm_step_unmap(op, p.user_context) {
+            Ok(OpUnmapped { .. }) => 0,
+            Err(err) => err.to_errno(),
+        }
+    }
+    /// # Safety
+    /// Must be called from `sm_map` or `sm_unmap`.
+    pub(super) unsafe extern "C" fn sm_step_remap(
+        op: *mut bindings::drm_gpuva_op,
+        p: *mut c_void,
+    ) -> c_int {
+        // SAFETY: If we reach `sm_step_remap` then we were called from `sm_map` or `sm_unmap` which passes either
+        // an `SmMapData` or `SmData` as private data. Both cases can be cast to `SmData`.
+        let p = unsafe { &mut *p.cast::<SmData<'_, T>>() };
+        let op = OpRemap {
+            // SAFETY: sm_step_remap is called with a remap operation.
+            op: unsafe { &(*op).__bindgen_anon_1.remap },
+            _invariant: PhantomData,
+        };
+        match p.gpuvm.sm_step_remap(op, p.user_context) {
+            Ok(OpRemapped { .. }) => 0,
+            Err(err) => err.to_errno(),
+        }
+    }
+}
diff --git a/rust/kernel/drm/gpuvm/va.rs b/rust/kernel/drm/gpuvm/va.rs
new file mode 100644
index 0000000000000000000000000000000000000000..a31122ff22282186a1d76d4bb085714f6465722b
--- /dev/null
+++ b/rust/kernel/drm/gpuvm/va.rs
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+use super::*;
+
+/// Represents that a range of a GEM object is mapped in this [`GpuVm`] instance.
+///
+/// Does not assume that GEM lock is held.
+///
+/// # Invariants
+///
+/// This is a valid `drm_gpuva` that is resident in the [`GpuVm`] instance.
+#[repr(C)]
+#[pin_data]
+pub struct GpuVa<T: DriverGpuVm> {
+    #[pin]
+    inner: Opaque<bindings::drm_gpuva>,
+    #[pin]
+    data: T::VaData,
+}
+
+impl<T: DriverGpuVm> GpuVa<T> {
+    /// Access this [`GpuVa`] from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// For the duration of `'a`, the pointer must reference a valid `drm_gpuva` associated with a
+    /// [`GpuVm<T>`].
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *mut bindings::drm_gpuva) -> &'a Self {
+        // SAFETY: `drm_gpuva` is first field and `repr(C)`.
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Returns a raw pointer to underlying C value.
+    #[inline]
+    pub fn as_raw(&self) -> *mut bindings::drm_gpuva {
+        self.inner.get()
+    }
+
+    /// Returns the address of this mapping in the GPU virtual address space.
+    #[inline]
+    pub fn addr(&self) -> u64 {
+        // SAFETY: The `va.addr` field of `drm_gpuva` is immutable.
+        unsafe { (*self.as_raw()).va.addr }
+    }
+
+    /// Returns the length of this mapping.
+    #[inline]
+    pub fn length(&self) -> u64 {
+        // SAFETY: The `va.range` field of `drm_gpuva` is immutable.
+        unsafe { (*self.as_raw()).va.range }
+    }
+
+    /// Returns `addr..addr+length`.
+    #[inline]
+    pub fn range(&self) -> Range<u64> {
+        let addr = self.addr();
+        addr..addr + self.length()
+    }
+
+    /// Returns the offset within the GEM object.
+    #[inline]
+    pub fn gem_offset(&self) -> u64 {
+        // SAFETY: The `gem.offset` field of `drm_gpuva` is immutable.
+        unsafe { (*self.as_raw()).gem.offset }
+    }
+
+    /// Returns the GEM object.
+    #[inline]
+    pub fn obj(&self) -> &T::Object {
+        // SAFETY: The `gem.offset` field of `drm_gpuva` is immutable.
+        unsafe { <T::Object as IntoGEMObject>::from_raw((*self.as_raw()).gem.obj) }
+    }
+
+    /// Returns the underlying [`GpuVmBo`] object that backs this [`GpuVa`].
+    #[inline]
+    pub fn vm_bo(&self) -> &GpuVmBo<T> {
+        // SAFETY: The `vm_bo` field has been set and is immutable for the duration in which this
+        // `drm_gpuva` is resident in the VM.
+        unsafe { GpuVmBo::from_raw((*self.as_raw()).vm_bo) }
+    }
+}
+
+/// A pre-allocated [`GpuVa`] object.
+///
+/// # Invariants
+///
+/// The memory is zeroed.
+pub struct GpuVaAlloc<T: DriverGpuVm>(KBox<MaybeUninit<GpuVa<T>>>);
+
+impl<T: DriverGpuVm> GpuVaAlloc<T> {
+    /// Pre-allocate a [`GpuVa`] object.
+    pub fn new(flags: AllocFlags) -> Result<GpuVaAlloc<T>, AllocError> {
+        // INVARIANTS: Memory allocated with __GFP_ZERO.
+        Ok(GpuVaAlloc(KBox::new_uninit(flags | __GFP_ZERO)?))
+    }
+
+    /// Prepare this `drm_gpuva` for insertion into the GPUVM.
+    pub(super) fn prepare(mut self, va_data: impl PinInit<T::VaData>) -> *mut bindings::drm_gpuva {
+        let va_ptr = MaybeUninit::as_mut_ptr(&mut self.0);
+        // SAFETY: The `data` field is pinned.
+        let Ok(()) = unsafe { va_data.__pinned_init(&raw mut (*va_ptr).data) };
+        KBox::into_raw(self.0).cast()
+    }
+}
+
+/// A [`GpuVa`] object that has been removed.
+///
+/// # Invariants
+///
+/// The `drm_gpuva` is not resident in the [`GpuVm`].
+pub struct GpuVaRemoved<T: DriverGpuVm>(KBox<GpuVa<T>>);
+
+impl<T: DriverGpuVm> GpuVaRemoved<T> {
+    /// Convert a raw pointer into a [`GpuVaRemoved`].
+    ///
+    /// # Safety
+    ///
+    /// Must have been removed from a [`GpuVm<T>`].
+    pub(super) unsafe fn from_raw(ptr: *mut bindings::drm_gpuva) -> Self {
+        // SAFETY: Since it has been removed we can take ownership of allocation.
+        GpuVaRemoved(unsafe { KBox::from_raw(ptr.cast()) })
+    }
+
+    /// Take ownership of the VA data.
+    pub fn into_inner(self) -> T::VaData
+    where
+        T::VaData: Unpin,
+    {
+        KBox::into_inner(self.0).data
+    }
+}
+
+impl<T: DriverGpuVm> Deref for GpuVaRemoved<T> {
+    type Target = T::VaData;
+    fn deref(&self) -> &T::VaData {
+        &self.0.data
+    }
+}
+
+impl<T: DriverGpuVm> DerefMut for GpuVaRemoved<T>
+where
+    T::VaData: Unpin,
+{
+    fn deref_mut(&mut self) -> &mut T::VaData {
+        &mut self.0.data
+    }
+}
diff --git a/rust/kernel/drm/gpuvm/vm_bo.rs b/rust/kernel/drm/gpuvm/vm_bo.rs
new file mode 100644
index 0000000000000000000000000000000000000000..f21aa17ea4f42c4a2b57b1f3a57a18dd2c3c8b7b
--- /dev/null
+++ b/rust/kernel/drm/gpuvm/vm_bo.rs
@@ -0,0 +1,213 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+use super::*;
+
+/// Represents that a given GEM object has at least one mapping on this [`GpuVm`] instance.
+///
+/// Does not assume that GEM lock is held.
+#[repr(C)]
+#[pin_data]
+pub struct GpuVmBo<T: DriverGpuVm> {
+    #[pin]
+    inner: Opaque<bindings::drm_gpuvm_bo>,
+    #[pin]
+    data: T::VmBoData,
+}
+
+impl<T: DriverGpuVm> GpuVmBo<T> {
+    pub(super) const ALLOC_FN: Option<unsafe extern "C" fn() -> *mut bindings::drm_gpuvm_bo> = {
+        use core::alloc::Layout;
+        let base = Layout::new::<bindings::drm_gpuvm_bo>();
+        let rust = Layout::new::<Self>();
+        assert!(base.size() <= rust.size());
+        if base.size() != rust.size() || base.align() != rust.align() {
+            Some(Self::vm_bo_alloc)
+        } else {
+            // This causes GPUVM to allocate a `GpuVmBo<T>` with `kzalloc(sizeof(drm_gpuvm_bo))`.
+            None
+        }
+    };
+
+    pub(super) const FREE_FN: Option<unsafe extern "C" fn(*mut bindings::drm_gpuvm_bo)> = {
+        if core::mem::needs_drop::<Self>() {
+            Some(Self::vm_bo_free)
+        } else {
+            // This causes GPUVM to free a `GpuVmBo<T>` with `kfree`.
+            None
+        }
+    };
+
+    /// Custom function for allocating a `drm_gpuvm_bo`.
+    ///
+    /// # Safety
+    ///
+    /// Always safe to call. Unsafe to match function pointer type in C struct.
+    unsafe extern "C" fn vm_bo_alloc() -> *mut bindings::drm_gpuvm_bo {
+        KBox::<Self>::new_uninit(GFP_KERNEL | __GFP_ZERO)
+            .map(KBox::into_raw)
+            .unwrap_or(ptr::null_mut())
+            .cast()
+    }
+
+    /// Custom function for freeing a `drm_gpuvm_bo`.
+    ///
+    /// # Safety
+    ///
+    /// The pointer must have been allocated with [`GpuVmBo::ALLOC_FN`], and must not be used after
+    /// this call.
+    unsafe extern "C" fn vm_bo_free(ptr: *mut bindings::drm_gpuvm_bo) {
+        // SAFETY:
+        // * The ptr was allocated from kmalloc with the layout of `GpuVmBo<T>`.
+        // * `ptr->inner` has no destructor.
+        // * `ptr->data` contains a valid `T::VmBoData` that we can drop.
+        drop(unsafe { KBox::<Self>::from_raw(ptr.cast()) });
+    }
+
+    /// Access this [`GpuVmBo`] from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// For the duration of `'a`, the pointer must reference a valid `drm_gpuvm_bo` associated with
+    /// a [`GpuVm<T>`].
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *mut bindings::drm_gpuvm_bo) -> &'a Self {
+        // SAFETY: `drm_gpuvm_bo` is first field and `repr(C)`.
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Returns a raw pointer to underlying C value.
+    #[inline]
+    pub fn as_raw(&self) -> *mut bindings::drm_gpuvm_bo {
+        self.inner.get()
+    }
+
+    /// The [`GpuVm`] that this GEM object is mapped in.
+    #[inline]
+    pub fn gpuvm(&self) -> &GpuVm<T> {
+        // SAFETY: The `obj` pointer is guaranteed to be valid.
+        unsafe { GpuVm::<T>::from_raw((*self.inner.get()).vm) }
+    }
+
+    /// The [`drm_gem_object`](crate::gem::Object) for these mappings.
+    #[inline]
+    pub fn obj(&self) -> &T::Object {
+        // SAFETY: The `obj` pointer is guaranteed to be valid.
+        unsafe { <T::Object as IntoGEMObject>::from_raw((*self.inner.get()).obj) }
+    }
+
+    /// The driver data with this buffer object.
+    #[inline]
+    pub fn data(&self) -> &T::VmBoData {
+        &self.data
+    }
+}
+
+/// A pre-allocated [`GpuVmBo`] object.
+///
+/// # Invariants
+///
+/// Points at a `drm_gpuvm_bo` that contains a valid `T::VmBoData`, has a refcount of one, and is
+/// absent from any gem, extobj, or evict lists.
+pub(super) struct GpuVmBoAlloc<T: DriverGpuVm>(NonNull<GpuVmBo<T>>);
+
+impl<T: DriverGpuVm> GpuVmBoAlloc<T> {
+    /// Create a new pre-allocated [`GpuVmBo`].
+    ///
+    /// It's intentional that the initializer is infallible because `drm_gpuvm_bo_put` will call
+    /// drop on the data, so we don't have a way to free it when the data is missing.
+    #[inline]
+    pub(super) fn new(
+        gpuvm: &GpuVm<T>,
+        gem: &T::Object,
+        value: impl PinInit<T::VmBoData>,
+    ) -> Result<GpuVmBoAlloc<T>, AllocError> {
+        // SAFETY: The provided gpuvm and gem ptrs are valid for the duration of this call.
+        let raw_ptr = unsafe {
+            bindings::drm_gpuvm_bo_create(gpuvm.as_raw(), gem.as_raw()).cast::<GpuVmBo<T>>()
+        };
+        // CAST: `GpuVmBoAlloc::vm_bo_alloc` ensures that this memory was allocated with the layout
+        // of `GpuVmBo<T>`.
+        let ptr = NonNull::new(raw_ptr).ok_or(AllocError)?;
+        // SAFETY: `ptr->data` is a valid pinned location.
+        let Ok(()) = unsafe { value.__pinned_init(&raw mut (*raw_ptr).data) };
+        // INVARIANTS: We just created the vm_bo so it's absent from lists, and the data is valid
+        // as we just initialized it.
+        Ok(GpuVmBoAlloc(ptr))
+    }
+
+    /// Returns a raw pointer to underlying C value.
+    #[inline]
+    pub(super) fn as_raw(&self) -> *mut bindings::drm_gpuvm_bo {
+        // SAFETY: The pointer references a valid `drm_gpuvm_bo`.
+        unsafe { (*self.0.as_ptr()).inner.get() }
+    }
+
+    /// Look up whether there is an existing [`GpuVmBo`] for this gem object.
+    #[inline]
+    pub(super) fn obtain(self) -> GpuVmBoObtain<T> {
+        let me = ManuallyDrop::new(self);
+        // SAFETY: Valid `drm_gpuvm_bo` not already in the lists.
+        let ptr = unsafe { bindings::drm_gpuvm_bo_obtain_prealloc(me.as_raw()) };
+
+        // If the vm_bo does not already exist, ensure that it's in the extobj list.
+        if ptr::eq(ptr, me.as_raw()) && me.gpuvm().is_extobj(me.obj()) {
+            let _resv_lock = me.gpuvm().resv_lock();
+            // SAFETY: We hold the GPUVMs resv lock.
+            unsafe { bindings::drm_gpuvm_bo_extobj_add(ptr) };
+        }
+
+        // INVARIANTS: Valid `drm_gpuvm_bo` in the GEM list.
+        // SAFETY: `drm_gpuvm_bo_obtain_prealloc` always returns a non-null ptr
+        GpuVmBoObtain(unsafe { NonNull::new_unchecked(ptr.cast()) })
+    }
+}
+
+impl<T: DriverGpuVm> Deref for GpuVmBoAlloc<T> {
+    type Target = GpuVmBo<T>;
+    #[inline]
+    fn deref(&self) -> &GpuVmBo<T> {
+        // SAFETY: By the type invariants we may deref while `Self` exists.
+        unsafe { self.0.as_ref() }
+    }
+}
+
+impl<T: DriverGpuVm> Drop for GpuVmBoAlloc<T> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: It's safe to perform a deferred put in any context.
+        unsafe { bindings::drm_gpuvm_bo_put_deferred(self.as_raw()) };
+    }
+}
+
+/// A [`GpuVmBo`] object in the GEM list.
+///
+/// # Invariants
+///
+/// Points at a `drm_gpuvm_bo` that contains a valid `T::VmBoData` and is present in the gem list.
+pub struct GpuVmBoObtain<T: DriverGpuVm>(NonNull<GpuVmBo<T>>);
+
+impl<T: DriverGpuVm> GpuVmBoObtain<T> {
+    /// Returns a raw pointer to underlying C value.
+    #[inline]
+    pub fn as_raw(&self) -> *mut bindings::drm_gpuvm_bo {
+        // SAFETY: The pointer references a valid `drm_gpuvm_bo`.
+        unsafe { (*self.0.as_ptr()).inner.get() }
+    }
+}
+
+impl<T: DriverGpuVm> Deref for GpuVmBoObtain<T> {
+    type Target = GpuVmBo<T>;
+    #[inline]
+    fn deref(&self) -> &GpuVmBo<T> {
+        // SAFETY: By the type invariants we may deref while `Self` exists.
+        unsafe { self.0.as_ref() }
+    }
+}
+
+impl<T: DriverGpuVm> Drop for GpuVmBoObtain<T> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: It's safe to perform a deferred put in any context.
+        unsafe { bindings::drm_gpuvm_bo_put_deferred(self.as_raw()) };
+    }
+}
diff --git a/rust/kernel/drm/mod.rs b/rust/kernel/drm/mod.rs
index 1b82b6945edf25b947afc08300e211bd97150d6b..a4b6c5430198571ec701af2ef452cc9ac55870e6 100644
--- a/rust/kernel/drm/mod.rs
+++ b/rust/kernel/drm/mod.rs
@@ -6,6 +6,7 @@
 pub mod driver;
 pub mod file;
 pub mod gem;
+pub mod gpuvm;
 pub mod ioctl;
 
 pub use self::device::Device;

-- 
2.52.0.487.g5c8c507ade-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction
  2025-11-28 14:14 ` [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction Alice Ryhl
@ 2025-12-01 15:16   ` Daniel Almeida
  2025-12-02  8:39     ` Alice Ryhl
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Almeida @ 2025-12-01 15:16 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Danilo Krummrich, Matthew Brost, Thomas Hellström,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Boris Brezillon, Steven Price, Liviu Dudau,
	Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Frank Binns,
	Matt Coster, Rob Clark, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Sean Paul, Marijn Suijten, Lyude Paul,
	Lucas De Marchi, Rodrigo Vivi, Sumit Semwal, Christian König,
	dri-devel, linux-kernel, rust-for-linux, linux-arm-msm, freedreno,
	nouveau, intel-xe, linux-media, linaro-mm-sig, Asahi Lina

Hi Alice,

I find it a bit weird that we reverted to v1, given that the previous gpuvm
attempt was v3. No big deal though.


> On 28 Nov 2025, at 11:14, Alice Ryhl <aliceryhl@google.com> wrote:
> 
> Add a GPUVM abstraction to be used by Rust GPU drivers.
> 
> GPUVM keeps track of a GPU's virtual address (VA) space and manages the
> corresponding virtual mappings represented by "GPU VA" objects. It also
> keeps track of the gem::Object<T> used to back the mappings through
> GpuVmBo<T>.
> 
> This abstraction is only usable by drivers that wish to use GPUVM in
> immediate mode. This allows us to build the locking scheme into the API
> design. It means that the GEM mutex is used for the GEM gpuva list, and
> that the resv lock is used for the extobj list. The evicted list is not
> yet used in this version.
> 
> This abstraction provides a special handle called the GpuVmCore, which
> is a wrapper around ARef<GpuVm> that provides access to the interval
> tree. Generally, all changes to the address space requires mutable
> access to this unique handle.
> 
> Some of the safety comments are still somewhat WIP, but I think the API
> should be sound as-is.
> 
> Co-developed-by: Asahi Lina <lina+kernel@asahilina.net>
> Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
> Co-developed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> MAINTAINERS                     |   1 +
> rust/bindings/bindings_helper.h |   2 +
> rust/helpers/drm_gpuvm.c        |  43 ++++
> rust/helpers/helpers.c          |   1 +
> rust/kernel/drm/gpuvm/mod.rs    | 394 +++++++++++++++++++++++++++++++++
> rust/kernel/drm/gpuvm/sm_ops.rs | 469 ++++++++++++++++++++++++++++++++++++++++
> rust/kernel/drm/gpuvm/va.rs     | 148 +++++++++++++
> rust/kernel/drm/gpuvm/vm_bo.rs  | 213 ++++++++++++++++++
> rust/kernel/drm/mod.rs          |   1 +
> 9 files changed, 1272 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 952aed4619c25d395c12962e559d6cd3362f64a7..946629eb9ebf19922bbe782fed37be07067d6bf2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8591,6 +8591,7 @@ S: Supported
> T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
> F: drivers/gpu/drm/drm_gpuvm.c
> F: include/drm/drm_gpuvm.h
> +F: rust/kernel/drm/gpuvm/
> 
> DRM LOG
> M: Jocelyn Falempe <jfalempe@redhat.com>
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 2e43c66635a2c9f31bd99b9817bd2d6ab89fbcf2..c776ae198e1db91f010f88ff1d1c888a3036a87f 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -33,6 +33,7 @@
> #include <drm/drm_drv.h>
> #include <drm/drm_file.h>
> #include <drm/drm_gem.h>
> +#include <drm/drm_gpuvm.h>
> #include <drm/drm_ioctl.h>
> #include <kunit/test.h>
> #include <linux/auxiliary_bus.h>
> @@ -103,6 +104,7 @@ const gfp_t RUST_CONST_HELPER___GFP_HIGHMEM = ___GFP_HIGHMEM;
> const gfp_t RUST_CONST_HELPER___GFP_NOWARN = ___GFP_NOWARN;
> const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ROTATIONAL = BLK_FEAT_ROTATIONAL;
> const fop_flags_t RUST_CONST_HELPER_FOP_UNSIGNED_OFFSET = FOP_UNSIGNED_OFFSET;
> +const u32 RUST_CONST_HELPER_DRM_EXEC_INTERRUPTIBLE_WAIT = DRM_EXEC_INTERRUPTIBLE_WAIT;
> 
> const xa_mark_t RUST_CONST_HELPER_XA_PRESENT = XA_PRESENT;
> 
> diff --git a/rust/helpers/drm_gpuvm.c b/rust/helpers/drm_gpuvm.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..18b7dbd2e32c3162455b344e72ec2940c632cc6b
> --- /dev/null
> +++ b/rust/helpers/drm_gpuvm.c
> @@ -0,0 +1,43 @@
> +// SPDX-License-Identifier: GPL-2.0 or MIT
> +
> +#ifdef CONFIG_DRM_GPUVM
> +
> +#include <drm/drm_gpuvm.h>
> +
> +struct drm_gpuvm *rust_helper_drm_gpuvm_get(struct drm_gpuvm *obj)
> +{
> + return drm_gpuvm_get(obj);
> +}
> +
> +void rust_helper_drm_gpuva_init_from_op(struct drm_gpuva *va, struct drm_gpuva_op_map *op)
> +{
> + drm_gpuva_init_from_op(va, op);
> +}
> +
> +struct drm_gpuvm_bo *rust_helper_drm_gpuvm_bo_get(struct drm_gpuvm_bo *vm_bo)
> +{
> + return drm_gpuvm_bo_get(vm_bo);
> +}
> +
> +void rust_helper_drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> +{
> + return drm_gpuvm_exec_unlock(vm_exec);
> +}
> +
> +bool rust_helper_drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
> +     struct drm_gem_object *obj)
> +{
> + return drm_gpuvm_is_extobj(gpuvm, obj);
> +}
> +
> +int rust_helper_dma_resv_lock(struct dma_resv *obj, struct ww_acquire_ctx *ctx)
> +{
> + return dma_resv_lock(obj, ctx);
> +}
> +
> +void rust_helper_dma_resv_unlock(struct dma_resv *obj)
> +{
> + dma_resv_unlock(obj);
> +}
> +
> +#endif // CONFIG_DRM_GPUVM
> diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
> index 551da6c9b5064c324d6f62bafcec672c6c6f5bee..91f45155eb9c2c4e92b56ee1abf7d45188873f3c 100644
> --- a/rust/helpers/helpers.c
> +++ b/rust/helpers/helpers.c
> @@ -26,6 +26,7 @@
> #include "device.c"
> #include "dma.c"
> #include "drm.c"
> +#include "drm_gpuvm.c"
> #include "err.c"
> #include "irq.c"
> #include "fs.c"
> diff --git a/rust/kernel/drm/gpuvm/mod.rs b/rust/kernel/drm/gpuvm/mod.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..9834dbb938a3622e46048e9b8e06bc6bf03aa0d2
> --- /dev/null
> +++ b/rust/kernel/drm/gpuvm/mod.rs
> @@ -0,0 +1,394 @@
> +// SPDX-License-Identifier: GPL-2.0 OR MIT
> +
> +//! DRM GPUVM in immediate mode
> +//!
> +//! Rust abstractions for using GPUVM in immediate mode. This is when the GPUVM state is updated
> +//! during `run_job()`, i.e., in the DMA fence signalling critical path, to ensure that the GPUVM

IMHO: We should initially target synchronous VM_BINDS, which are the opposite
of what you described above.


> +//! and the GPU's virtual address space has the same state at all times.
> +//!
> +//! C header: [`include/drm/drm_gpuvm.h`](srctree/include/drm/drm_gpuvm.h)
> +
> +use kernel::{
> +    alloc::{AllocError, Flags as AllocFlags},
> +    bindings, drm,
> +    drm::gem::IntoGEMObject,
> +    error::to_result,
> +    prelude::*,
> +    sync::aref::{ARef, AlwaysRefCounted},
> +    types::Opaque,
> +};
> +
> +use core::{
> +    cell::UnsafeCell,
> +    marker::PhantomData,
> +    mem::{ManuallyDrop, MaybeUninit},
> +    ops::{Deref, DerefMut, Range},
> +    ptr::{self, NonNull},
> +};
> +
> +mod sm_ops;
> +pub use self::sm_ops::*;
> +
> +mod vm_bo;
> +pub use self::vm_bo::*;
> +
> +mod va;
> +pub use self::va::*;
> +
> +/// A DRM GPU VA manager.
> +///
> +/// This object is refcounted, but the "core" is only accessible using a special unique handle. The

I wonder if `Owned<T>` is a good fit here? IIUC, Owned<T> can be refcounted,
but there is only ever one handle on the Rust side? If so, this seems to be
what we want here?


> +/// core consists of the `core` field and the GPUVM's interval tree.
> +#[repr(C)]
> +#[pin_data]
> +pub struct GpuVm<T: DriverGpuVm> {
> +    #[pin]
> +    vm: Opaque<bindings::drm_gpuvm>,
> +    /// Accessed only through the [`GpuVmCore`] reference.
> +    core: UnsafeCell<T>,

This UnsafeCell has been here since Lina’s version. I must say I never
understood why, and perhaps now is a good time to clarify it given the changes
we’re making w.r.t to the “unique handle” thing.

This is just some driver private data. It’s never shared with C. I am not
sure why we need this wrapper.

> +    /// Shared data not protected by any lock.
> +    #[pin]
> +    shared_data: T::SharedData,

Should we deref to this?

> +}
> +
> +// SAFETY: dox
> +unsafe impl<T: DriverGpuVm> AlwaysRefCounted for GpuVm<T> {
> +    fn inc_ref(&self) {
> +        // SAFETY: dox
> +        unsafe { bindings::drm_gpuvm_get(self.vm.get()) };
> +    }
> +
> +    unsafe fn dec_ref(obj: NonNull<Self>) {
> +        // SAFETY: dox
> +        unsafe { bindings::drm_gpuvm_put((*obj.as_ptr()).vm.get()) };
> +    }
> +}
> +
> +impl<T: DriverGpuVm> GpuVm<T> {
> +    const fn vtable() -> &'static bindings::drm_gpuvm_ops {
> +        &bindings::drm_gpuvm_ops {
> +            vm_free: Some(Self::vm_free),
> +            op_alloc: None,
> +            op_free: None,
> +            vm_bo_alloc: GpuVmBo::<T>::ALLOC_FN,
> +            vm_bo_free: GpuVmBo::<T>::FREE_FN,
> +            vm_bo_validate: None,
> +            sm_step_map: Some(Self::sm_step_map),
> +            sm_step_unmap: Some(Self::sm_step_unmap),
> +            sm_step_remap: Some(Self::sm_step_remap),
> +        }
> +    }
> +
> +    /// Creates a GPUVM instance.
> +    #[expect(clippy::new_ret_no_self)]
> +    pub fn new<E>(
> +        name: &'static CStr,
> +        dev: &drm::Device<T::Driver>,
> +        r_obj: &T::Object,

Can we call this “reservation_object”, or similar?

We should probably briefly explain what it does, perhaps linking to the C docs.


> +        range: Range<u64>,
> +        reserve_range: Range<u64>,
> +        core: T,
> +        shared: impl PinInit<T::SharedData, E>,
> +    ) -> Result<GpuVmCore<T>, E>
> +    where
> +        E: From<AllocError>,
> +        E: From<core::convert::Infallible>,
> +    {
> +        let obj = KBox::try_pin_init::<E>(
> +            try_pin_init!(Self {
> +                core <- UnsafeCell::new(core),
> +                shared_data <- shared,
> +                vm <- Opaque::ffi_init(|vm| {
> +                    // SAFETY: These arguments are valid. `vm` is valid until refcount drops to
> +                    // zero.
> +                    unsafe {
> +                        bindings::drm_gpuvm_init(
> +                            vm,
> +                            name.as_char_ptr(),
> +                            bindings::drm_gpuvm_flags_DRM_GPUVM_IMMEDIATE_MODE
> +                                | bindings::drm_gpuvm_flags_DRM_GPUVM_RESV_PROTECTED,
> +                            dev.as_raw(),
> +                            r_obj.as_raw(),
> +                            range.start,
> +                            range.end - range.start,
> +                            reserve_range.start,
> +                            reserve_range.end - reserve_range.start,
> +                            const { Self::vtable() },
> +                        )
> +                    }
> +                }),
> +            }? E),
> +            GFP_KERNEL,
> +        )?;
> +        // SAFETY: This transfers the initial refcount to the ARef.
> +        Ok(GpuVmCore(unsafe {
> +            ARef::from_raw(NonNull::new_unchecked(KBox::into_raw(
> +                Pin::into_inner_unchecked(obj),
> +            )))
> +        }))
> +    }
> +
> +    /// Access this [`GpuVm`] from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// For the duration of `'a`, the pointer must reference a valid [`GpuVm<T>`].
> +    #[inline]
> +    pub unsafe fn from_raw<'a>(ptr: *mut bindings::drm_gpuvm) -> &'a Self {
> +        // SAFETY: `drm_gpuvm` is first field and `repr(C)`.
> +        unsafe { &*ptr.cast() }
> +    }
> +
> +    /// Get a raw pointer.
> +    #[inline]
> +    pub fn as_raw(&self) -> *mut bindings::drm_gpuvm {
> +        self.vm.get()
> +    }
> +
> +    /// Access the shared data.
> +    #[inline]
> +    pub fn shared(&self) -> &T::SharedData {
> +        &self.shared_data
> +    }
> +
> +    /// The start of the VA space.
> +    #[inline]
> +    pub fn va_start(&self) -> u64 {
> +        // SAFETY: Safe by the type invariant of `GpuVm<T>`.
> +        unsafe { (*self.as_raw()).mm_start }
> +    }
> +
> +    /// The length of the address space
> +    #[inline]
> +    pub fn va_length(&self) -> u64 {
> +        // SAFETY: Safe by the type invariant of `GpuVm<T>`.
> +        unsafe { (*self.as_raw()).mm_range }
> +    }
> +
> +    /// Returns the range of the GPU virtual address space.
> +    #[inline]
> +    pub fn va_range(&self) -> Range<u64> {
> +        let start = self.va_start();
> +        let end = start + self.va_length();
> +        Range { start, end }
> +    }
> +


I wonder if we should expose the methods below at this moment. We will not need
them in Tyr until we start submitting jobs. This is still a bit in the future.

I say this for a few reasons:

a) Philipp is still working on the fence abstractions,

b) As a result from the above, we are taking raw fence pointers,

c) Onur is working on a WW Mutex abstraction [0] that includes a Rust
implementation of drm_exec (under another name, and useful in other contexts
outside of DRM). Should we use them here?

I think your current design with the ExecToken is also ok and perhaps we should
stick to it, but it's good to at least discuss this with the others.


> +    /// Returns a [`GpuVmBoObtain`] for the provided GEM object.
> +    #[inline]
> +    pub fn obtain(
> +        &self,
> +        obj: &T::Object,
> +        data: impl PinInit<T::VmBoData>,
> +    ) -> Result<GpuVmBoObtain<T>, AllocError> {

Perhaps this should be called GpuVmBo? That’s what you want to “obtain” in the first place.

This is indeed a question, by the way.

> +        Ok(GpuVmBoAlloc::new(self, obj, data)?.obtain())
> +    }
> +
> +    /// Prepare this GPUVM.
> +    #[inline]
> +    pub fn prepare(&self, num_fences: u32) -> impl PinInit<GpuVmExec<'_, T>, Error> {

> +        try_pin_init!(GpuVmExec {
> +            exec <- Opaque::try_ffi_init(|exec: *mut bindings::drm_gpuvm_exec| {
> +                // SAFETY: exec is valid but unused memory, so we can write.
> +                unsafe {
> +                    ptr::write_bytes(exec, 0u8, 1usize);
> +                    ptr::write(&raw mut (*exec).vm, self.as_raw());
> +                    ptr::write(&raw mut (*exec).flags, bindings::DRM_EXEC_INTERRUPTIBLE_WAIT);
> +                    ptr::write(&raw mut (*exec).num_fences, num_fences);
> +                }
> +
> +                // SAFETY: We can prepare the GPUVM.
> +                to_result(unsafe { bindings::drm_gpuvm_exec_lock(exec) })
> +            }),
> +            _gpuvm: PhantomData,
> +        })
> +    }
> +
> +    /// Clean up buffer objects that are no longer used.
> +    #[inline]
> +    pub fn deferred_cleanup(&self) {
> +        // SAFETY: Always safe to perform deferred cleanup.
> +        unsafe { bindings::drm_gpuvm_bo_deferred_cleanup(self.as_raw()) }
> +    }
> +
> +    /// Check if this GEM object is an external object for this GPUVM.
> +    #[inline]
> +    pub fn is_extobj(&self, obj: &T::Object) -> bool {
> +        // SAFETY: We may call this with any GPUVM and GEM object.
> +        unsafe { bindings::drm_gpuvm_is_extobj(self.as_raw(), obj.as_raw()) }
> +    }
> +
> +    /// Free this GPUVM.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Called when refcount hits zero.
> +    unsafe extern "C" fn vm_free(me: *mut bindings::drm_gpuvm) {
> +        // SAFETY: GPUVM was allocated with KBox and can now be freed.
> +        drop(unsafe { KBox::<Self>::from_raw(me.cast()) })
> +    }
> +}
> +
> +/// The manager for a GPUVM.
> +pub trait DriverGpuVm: Sized {
> +    /// Parent `Driver` for this object.
> +    type Driver: drm::Driver;
> +
> +    /// The kind of GEM object stored in this GPUVM.
> +    type Object: IntoGEMObject;
> +
> +    /// Data stored in the [`GpuVm`] that is fully shared.
> +    type SharedData;
> +
> +    /// Data stored with each `struct drm_gpuvm_bo`.
> +    type VmBoData;
> +
> +    /// Data stored with each `struct drm_gpuva`.
> +    type VaData;
> +
> +    /// The private data passed to callbacks.
> +    type SmContext;
> +
> +    /// Indicates that a new mapping should be created.
> +    fn sm_step_map<'op>(
> +        &mut self,
> +        op: OpMap<'op, Self>,
> +        context: &mut Self::SmContext,
> +    ) -> Result<OpMapped<'op, Self>, Error>;
> +
> +    /// Indicates that an existing mapping should be removed.
> +    fn sm_step_unmap<'op>(
> +        &mut self,
> +        op: OpUnmap<'op, Self>,
> +        context: &mut Self::SmContext,
> +    ) -> Result<OpUnmapped<'op, Self>, Error>;
> +
> +    /// Indicates that an existing mapping should be split up.
> +    fn sm_step_remap<'op>(
> +        &mut self,
> +        op: OpRemap<'op, Self>,
> +        context: &mut Self::SmContext,
> +    ) -> Result<OpRemapped<'op, Self>, Error>;
> +}
> +
> +/// The core of the DRM GPU VA manager.
> +///
> +/// This object is the reference to the GPUVM that
> +///
> +/// # Invariants
> +///
> +/// This object owns the core.
> +pub struct GpuVmCore<T: DriverGpuVm>(ARef<GpuVm<T>>);
> +
> +impl<T: DriverGpuVm> GpuVmCore<T> {
> +    /// Get a reference without access to `core`.
> +    #[inline]
> +    pub fn gpuvm(&self) -> &GpuVm<T> {
> +        &self.0
> +    }
> +}
> +
> +impl<T: DriverGpuVm> Deref for GpuVmCore<T> {
> +    type Target = T;
> +    #[inline]
> +    fn deref(&self) -> &T {
> +        // SAFETY: By the type invariants we may access `core`.
> +        unsafe { &*self.0.core.get() }
> +    }
> +}
> +
> +impl<T: DriverGpuVm> DerefMut for GpuVmCore<T> {
> +    #[inline]
> +    fn deref_mut(&mut self) -> &mut T {
> +        // SAFETY: By the type invariants we may access `core`.
> +        unsafe { &mut *self.0.core.get() }
> +    }
> +}
> +
> +/// The exec token for preparing the objects.
> +#[pin_data(PinnedDrop)]
> +pub struct GpuVmExec<'a, T: DriverGpuVm> {
> +    #[pin]
> +    exec: Opaque<bindings::drm_gpuvm_exec>,
> +    _gpuvm: PhantomData<&'a mut GpuVm<T>>,
> +}
> +
> +impl<'a, T: DriverGpuVm> GpuVmExec<'a, T> {
> +    /// Add a fence.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `fence` arg must be valid.
> +    pub unsafe fn resv_add_fence(
> +        &self,
> +        // TODO: use a safe fence abstraction
> +        fence: *mut bindings::dma_fence,
> +        private_usage: DmaResvUsage,
> +        extobj_usage: DmaResvUsage,
> +    ) {
> +        // SAFETY: Caller ensures fence is ok.
> +        unsafe {
> +            bindings::drm_gpuvm_resv_add_fence(
> +                (*self.exec.get()).vm,
> +                &raw mut (*self.exec.get()).exec,
> +                fence,
> +                private_usage as u32,
> +                extobj_usage as u32,
> +            )
> +        }
> +    }
> +}
> +
> +#[pinned_drop]
> +impl<'a, T: DriverGpuVm> PinnedDrop for GpuVmExec<'a, T> {
> +    fn drop(self: Pin<&mut Self>) {
> +        // SAFETY: We hold the lock, so it's safe to unlock.
> +        unsafe { bindings::drm_gpuvm_exec_unlock(self.exec.get()) };
> +    }
> +}
> +
> +/// How the fence will be used.
> +#[repr(u32)]
> +pub enum DmaResvUsage {
> +    /// For in kernel memory management only (e.g. copying, clearing memory).
> +    Kernel = bindings::dma_resv_usage_DMA_RESV_USAGE_KERNEL,
> +    /// Implicit write synchronization for userspace submissions.
> +    Write = bindings::dma_resv_usage_DMA_RESV_USAGE_WRITE,
> +    /// Implicit read synchronization for userspace submissions.
> +    Read = bindings::dma_resv_usage_DMA_RESV_USAGE_READ,
> +    /// No implicit sync (e.g. preemption fences, page table updates, TLB flushes).
> +    Bookkeep = bindings::dma_resv_usage_DMA_RESV_USAGE_BOOKKEEP,
> +}
> +
> +/// A lock guard for the GPUVM's resv lock.
> +///
> +/// This guard provides access to the extobj and evicted lists.

Should we bother with evicted objects at this stage?

> +///
> +/// # Invariants
> +///
> +/// Holds the GPUVM resv lock.
> +pub struct GpuvmResvLockGuard<'a, T: DriverGpuVm>(&'a GpuVm<T>);
> +
> +impl<T: DriverGpuVm> GpuVm<T> {
> +    /// Lock the VM's resv lock.

More docs here would be nice.

> +    #[inline]
> +    pub fn resv_lock(&self) -> GpuvmResvLockGuard<'_, T> {
> +        // SAFETY: It's always ok to lock the resv lock.
> +        unsafe { bindings::dma_resv_lock(self.raw_resv_lock(), ptr::null_mut()) };
> +        // INVARIANTS: We took the lock.
> +        GpuvmResvLockGuard(self)
> +    }

You can call this more than once and deadlock. Perhaps we should warn about this, or forbid it?

i.e.:

if self.resv_is_locked {
  return Err(EALREADY)
}

> +
> +    #[inline]
> +    fn raw_resv_lock(&self) -> *mut bindings::dma_resv {
> +        // SAFETY: `r_obj` is immutable and valid for duration of GPUVM.
> +        unsafe { (*(*self.as_raw()).r_obj).resv }
> +    }
> +}
> +
> +impl<'a, T: DriverGpuVm> Drop for GpuvmResvLockGuard<'a, T> {
> +    #[inline]
> +    fn drop(&mut self) {
> +        // SAFETY: We hold the lock so we can release it.
> +        unsafe { bindings::dma_resv_unlock(self.0.raw_resv_lock()) };
> +    }
> +}
> diff --git a/rust/kernel/drm/gpuvm/sm_ops.rs b/rust/kernel/drm/gpuvm/sm_ops.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..c0dbd4675de644a3b1cbe7d528194ca7fb471848
> --- /dev/null
> +++ b/rust/kernel/drm/gpuvm/sm_ops.rs
> @@ -0,0 +1,469 @@
> +// SPDX-License-Identifier: GPL-2.0 OR MIT
> +#![allow(clippy::tabs_in_doc_comments)]
> +
> +use super::*;
> +
> +struct SmData<'a, T: DriverGpuVm> {
> +    gpuvm: &'a mut GpuVmCore<T>,
> +    user_context: &'a mut T::SmContext,
> +}
> +
> +#[repr(C)]
> +struct SmMapData<'a, T: DriverGpuVm> {
> +    sm_data: SmData<'a, T>,
> +    vm_bo: GpuVmBoObtain<T>,
> +}
> +
> +/// The argument for [`GpuVmCore::sm_map`].
> +pub struct OpMapRequest<'a, T: DriverGpuVm> {
> +    /// Address in GPU virtual address space.
> +    pub addr: u64,
> +    /// Length of mapping to create.
> +    pub range: u64,
> +    /// Offset in GEM object.

"in the GEM object."

> +    pub offset: u64,
> +    /// The GEM object to map.
> +    pub vm_bo: GpuVmBoObtain<T>,
> +    /// The user-provided context type.
> +    pub context: &'a mut T::SmContext,
> +}
> +
> +impl<'a, T: DriverGpuVm> OpMapRequest<'a, T> {
> +    fn raw_request(&self) -> bindings::drm_gpuvm_map_req {
> +        bindings::drm_gpuvm_map_req {
> +            map: bindings::drm_gpuva_op_map {
> +                va: bindings::drm_gpuva_op_map__bindgen_ty_1 {
> +                    addr: self.addr,
> +                    range: self.range,
> +                },
> +                gem: bindings::drm_gpuva_op_map__bindgen_ty_2 {
> +                    offset: self.offset,
> +                    obj: self.vm_bo.obj().as_raw(),
> +                },
> +            },
> +        }
> +    }
> +}
> +
> +/// ```
> +/// struct drm_gpuva_op_map {
> +/// /**
> +/// * @va: structure containing address and range of a map
> +/// * operation
> +/// */
> +/// struct {
> +/// /**
> +/// * @va.addr: the base address of the new mapping
> +/// */
> +/// u64 addr;
> +///
> +/// /**
> +/// * @va.range: the range of the new mapping
> +/// */
> +/// u64 range;
> +/// } va;
> +///
> +/// /**
> +/// * @gem: structure containing the &drm_gem_object and it's offset
> +/// */
> +/// struct {
> +/// /**
> +/// * @gem.offset: the offset within the &drm_gem_object
> +/// */
> +/// u64 offset;
> +///
> +/// /**
> +/// * @gem.obj: the &drm_gem_object to map
> +/// */
> +/// struct drm_gem_object *obj;
> +/// } gem;
> +/// };
> +/// ```

I think we can improve the docs above a bit.

> +pub struct OpMap<'op, T: DriverGpuVm> {
> +    op: &'op bindings::drm_gpuva_op_map,
> +    // Since these abstractions are designed for immediate mode, the VM BO needs to be
> +    // pre-allocated, so we always have it available when we reach this point.
> +    vm_bo: &'op GpuVmBo<T>,
> +    _invariant: PhantomData<*mut &'op mut T>,
> +}
> +
> +impl<'op, T: DriverGpuVm> OpMap<'op, T> {
> +    /// The base address of the new mapping.
> +    pub fn addr(&self) -> u64 {
> +        self.op.va.addr
> +    }
> +
> +    /// The length of the new mapping.
> +    pub fn length(&self) -> u64 {
> +        self.op.va.range
> +    }
> +
> +    /// The offset within the [`drm_gem_object`](crate::gem::Object).
> +    pub fn gem_offset(&self) -> u64 {
> +        self.op.gem.offset
> +    }
> +
> +    /// The [`drm_gem_object`](crate::gem::Object) to map.
> +    pub fn obj(&self) -> &T::Object {
> +        // SAFETY: The `obj` pointer is guaranteed to be valid.
> +        unsafe { <T::Object as IntoGEMObject>::from_raw(self.op.gem.obj) }
> +    }
> +
> +    /// The [`GpuVmBo`] that the new VA will be associated with.
> +    pub fn vm_bo(&self) -> &GpuVmBo<T> {
> +        self.vm_bo
> +    }
> +
> +    /// Use the pre-allocated VA to carry out this map operation.
> +    pub fn insert(self, va: GpuVaAlloc<T>, va_data: impl PinInit<T::VaData>) -> OpMapped<'op, T> {
> +        let va = va.prepare(va_data);
> +        // SAFETY: By the type invariants we may access the interval tree.
> +        unsafe { bindings::drm_gpuva_map(self.vm_bo.gpuvm().as_raw(), va, self.op) };
> +        // SAFETY: The GEM object is valid, so the mutex is properly initialized.

> +        unsafe { bindings::mutex_lock(&raw mut (*self.op.gem.obj).gpuva.lock) };

Should we use Fujita’s might_sleep() support here?

> +        // SAFETY: The va is prepared for insertion, and we hold the GEM lock.
> +        unsafe { bindings::drm_gpuva_link(va, self.vm_bo.as_raw()) };
> +        // SAFETY: We took the mutex above, so we may unlock it.
> +        unsafe { bindings::mutex_unlock(&raw mut (*self.op.gem.obj).gpuva.lock) };
> +        OpMapped {
> +            _invariant: self._invariant,
> +        }
> +    }
> +}
> +
> +/// Represents a completed [`OpMap`] operation.
> +pub struct OpMapped<'op, T> {
> +    _invariant: PhantomData<*mut &'op mut T>,
> +}
> +
> +/// ```
> +/// struct drm_gpuva_op_unmap {
> +/// /**
> +/// * @va: the &drm_gpuva to unmap
> +/// */
> +/// struct drm_gpuva *va;
> +///
> +/// /**
> +/// * @keep:
> +/// *
> +/// * Indicates whether this &drm_gpuva is physically contiguous with the
> +/// * original mapping request.
> +/// *
> +/// * Optionally, if &keep is set, drivers may keep the actual page table
> +/// * mappings for this &drm_gpuva, adding the missing page table entries
> +/// * only and update the &drm_gpuvm accordingly.
> +/// */
> +/// bool keep;
> +/// };
> +/// ```

I think the docs could improve here ^

> +pub struct OpUnmap<'op, T: DriverGpuVm> {
> +    op: &'op bindings::drm_gpuva_op_unmap,
> +    _invariant: PhantomData<*mut &'op mut T>,
> +}
> +
> +impl<'op, T: DriverGpuVm> OpUnmap<'op, T> {
> +    /// Indicates whether this `drm_gpuva` is physically contiguous with the
> +    /// original mapping request.
> +    ///
> +    /// Optionally, if `keep` is set, drivers may keep the actual page table
> +    /// mappings for this `drm_gpuva`, adding the missing page table entries
> +    /// only and update the `drm_gpuvm` accordingly.
> +    pub fn keep(&self) -> bool {
> +        self.op.keep
> +    }
> +
> +    /// The range being unmapped.
> +    pub fn va(&self) -> &GpuVa<T> {
> +        // SAFETY: This is a valid va.
> +        unsafe { GpuVa::<T>::from_raw(self.op.va) }
> +    }
> +
> +    /// Remove the VA.
> +    pub fn remove(self) -> (OpUnmapped<'op, T>, GpuVaRemoved<T>) {
> +        // SAFETY: The op references a valid drm_gpuva in the GPUVM.
> +        unsafe { bindings::drm_gpuva_unmap(self.op) };
> +        // SAFETY: The va is no longer in the interval tree so we may unlink it.
> +        unsafe { bindings::drm_gpuva_unlink_defer(self.op.va) };
> +
> +        // SAFETY: We just removed this va from the `GpuVm<T>`.
> +        let va = unsafe { GpuVaRemoved::from_raw(self.op.va) };
> +
> +        (
> +            OpUnmapped {
> +                _invariant: self._invariant,
> +            },
> +            va,
> +        )
> +    }
> +}
> +
> +/// Represents a completed [`OpUnmap`] operation.
> +pub struct OpUnmapped<'op, T> {
> +    _invariant: PhantomData<*mut &'op mut T>,
> +}
> +
> +/// ```
> +/// struct drm_gpuva_op_remap {
> +/// /**
> +/// * @prev: the preceding part of a split mapping
> +/// */
> +/// struct drm_gpuva_op_map *prev;
> +///
> +/// /**
> +/// * @next: the subsequent part of a split mapping
> +/// */
> +/// struct drm_gpuva_op_map *next;
> +///
> +/// /**
> +/// * @unmap: the unmap operation for the original existing mapping
> +/// */
> +/// struct drm_gpuva_op_unmap *unmap;
> +/// };
> +/// ```
> +pub struct OpRemap<'op, T: DriverGpuVm> {
> +    op: &'op bindings::drm_gpuva_op_remap,
> +    _invariant: PhantomData<*mut &'op mut T>,
> +}
> +
> +impl<'op, T: DriverGpuVm> OpRemap<'op, T> {
> +    /// The preceding part of a split mapping.
> +    #[inline]
> +    pub fn prev(&self) -> Option<&OpRemapMapData> {
> +        // SAFETY: We checked for null, so the pointer must be valid.
> +        NonNull::new(self.op.prev).map(|ptr| unsafe { OpRemapMapData::from_raw(ptr) })
> +    }
> +
> +    /// The subsequent part of a split mapping.
> +    #[inline]
> +    pub fn next(&self) -> Option<&OpRemapMapData> {
> +        // SAFETY: We checked for null, so the pointer must be valid.
> +        NonNull::new(self.op.next).map(|ptr| unsafe { OpRemapMapData::from_raw(ptr) })
> +    }
> +
> +    /// Indicates whether the `drm_gpuva` being removed is physically contiguous with the original
> +    /// mapping request.
> +    ///
> +    /// Optionally, if `keep` is set, drivers may keep the actual page table mappings for this
> +    /// `drm_gpuva`, adding the missing page table entries only and update the `drm_gpuvm`
> +    /// accordingly.
> +    #[inline]
> +    pub fn keep(&self) -> bool {
> +        // SAFETY: The unmap pointer is always valid.
> +        unsafe { (*self.op.unmap).keep }
> +    }
> +
> +    /// The range being unmapped.
> +    #[inline]
> +    pub fn va_to_unmap(&self) -> &GpuVa<T> {
> +        // SAFETY: This is a valid va.
> +        unsafe { GpuVa::<T>::from_raw((*self.op.unmap).va) }
> +    }
> +
> +    /// The [`drm_gem_object`](crate::gem::Object) whose VA is being remapped.
> +    #[inline]
> +    pub fn obj(&self) -> &T::Object {
> +        self.va_to_unmap().obj()
> +    }
> +
> +    /// The [`GpuVmBo`] that is being remapped.
> +    #[inline]
> +    pub fn vm_bo(&self) -> &GpuVmBo<T> {
> +        self.va_to_unmap().vm_bo()
> +    }
> +
> +    /// Update the GPUVM to perform the remapping.
> +    pub fn remap(
> +        self,
> +        va_alloc: [GpuVaAlloc<T>; 2],
> +        prev_data: impl PinInit<T::VaData>,
> +        next_data: impl PinInit<T::VaData>,
> +    ) -> (OpRemapped<'op, T>, OpRemapRet<T>) {
> +        let [va1, va2] = va_alloc;
> +
> +        let mut unused_va = None;
> +        let mut prev_ptr = ptr::null_mut();
> +        let mut next_ptr = ptr::null_mut();
> +        if self.prev().is_some() {
> +            prev_ptr = va1.prepare(prev_data);
> +        } else {
> +            unused_va = Some(va1);
> +        }
> +        if self.next().is_some() {
> +            next_ptr = va2.prepare(next_data);
> +        } else {
> +            unused_va = Some(va2);
> +        }
> +
> +        // SAFETY: the pointers are non-null when required
> +        unsafe { bindings::drm_gpuva_remap(prev_ptr, next_ptr, self.op) };
> +
> +        // SAFETY: The GEM object is valid, so the mutex is properly initialized.
> +        unsafe { bindings::mutex_lock(&raw mut (*self.obj().as_raw()).gpuva.lock) };
> +        if !prev_ptr.is_null() {
> +            // SAFETY: The prev_ptr is a valid drm_gpuva prepared for insertion. The vm_bo is still
> +            // valid as the not-yet-unlinked gpuva holds a refcount on the vm_bo.
> +            unsafe { bindings::drm_gpuva_link(prev_ptr, self.vm_bo().as_raw()) };
> +        }
> +        if !next_ptr.is_null() {
> +            // SAFETY: The next_ptr is a valid drm_gpuva prepared for insertion. The vm_bo is still
> +            // valid as the not-yet-unlinked gpuva holds a refcount on the vm_bo.
> +            unsafe { bindings::drm_gpuva_link(next_ptr, self.vm_bo().as_raw()) };
> +        }
> +        // SAFETY: We took the mutex above, so we may unlock it.
> +        unsafe { bindings::mutex_unlock(&raw mut (*self.obj().as_raw()).gpuva.lock) };
> +        // SAFETY: The va is no longer in the interval tree so we may unlink it.
> +        unsafe { bindings::drm_gpuva_unlink_defer((*self.op.unmap).va) };
> +
> +        (
> +            OpRemapped {
> +                _invariant: self._invariant,
> +            },
> +            OpRemapRet {
> +                // SAFETY: We just removed this va from the `GpuVm<T>`.
> +                unmapped_va: unsafe { GpuVaRemoved::from_raw((*self.op.unmap).va) },
> +                unused_va,
> +            },
> +        )
> +    }
> +}
> +
> +/// Part of an [`OpRemap`] that represents a new mapping.
> +#[repr(transparent)]
> +pub struct OpRemapMapData(bindings::drm_gpuva_op_map);
> +
> +impl OpRemapMapData {
> +    /// # Safety
> +    /// Must reference a valid `drm_gpuva_op_map` for duration of `'a`.
> +    unsafe fn from_raw<'a>(ptr: NonNull<bindings::drm_gpuva_op_map>) -> &'a Self {
> +        // SAFETY: ok per safety requirements
> +        unsafe { ptr.cast().as_ref() }
> +    }
> +
> +    /// The base address of the new mapping.
> +    pub fn addr(&self) -> u64 {
> +        self.0.va.addr
> +    }
> +
> +    /// The length of the new mapping.
> +    pub fn length(&self) -> u64 {
> +        self.0.va.range
> +    }
> +
> +    /// The offset within the [`drm_gem_object`](crate::gem::Object).
> +    pub fn gem_offset(&self) -> u64 {
> +        self.0.gem.offset
> +    }
> +}
> +
> +/// Struct containing objects removed or not used by [`OpRemap::remap`].
> +pub struct OpRemapRet<T: DriverGpuVm> {
> +    /// The `drm_gpuva` that was removed.
> +    pub unmapped_va: GpuVaRemoved<T>,
> +    /// If the remap did not split the region into two pieces, then the unused `drm_gpuva` is
> +    /// returned here.
> +    pub unused_va: Option<GpuVaAlloc<T>>,
> +}
> +
> +/// Represents a completed [`OpRemap`] operation.
> +pub struct OpRemapped<'op, T> {
> +    _invariant: PhantomData<*mut &'op mut T>,
> +}
> +
> +impl<T: DriverGpuVm> GpuVmCore<T> {
> +    /// Create a mapping, removing or remapping anything that overlaps.
> +    #[inline]
> +    pub fn sm_map(&mut self, req: OpMapRequest<'_, T>) -> Result {

I wonder if we should keep this “sm” prefix. Perhaps
“map_region” or “map_range” would be better names IMHO.

> +        let gpuvm = self.gpuvm().as_raw();
> +        let raw_req = req.raw_request();
> +        let mut p = SmMapData {
> +            sm_data: SmData {
> +                gpuvm: self,
> +                user_context: req.context,
> +            },
> +            vm_bo: req.vm_bo,
> +        };
> +        // SAFETY:
> +        // * raw_request() creates a valid request.
> +        // * The private data is valid to be interpreted as both SmData and SmMapData since the
> +        //   first field of SmMapData is SmData.
> +        to_result(unsafe {
> +            bindings::drm_gpuvm_sm_map(gpuvm, (&raw mut p).cast(), &raw const raw_req)
> +        })
> +    }
> +
> +    /// Remove any mappings in the given region.
> +    #[inline]
> +    pub fn sm_unmap(&mut self, addr: u64, length: u64, context: &mut T::SmContext) -> Result {

Same here

> +        let gpuvm = self.gpuvm().as_raw();
> +        let mut p = SmData {
> +            gpuvm: self,
> +            user_context: context,
> +        };
> +        // SAFETY:
> +        // * raw_request() creates a valid request.
> +        // * The private data is valid to be interpreted as only SmData, but drm_gpuvm_sm_unmap()
> +        //   never calls sm_step_map().
> +        to_result(unsafe { bindings::drm_gpuvm_sm_unmap(gpuvm, (&raw mut p).cast(), addr, length) })
> +    }
> +}
> +
> +impl<T: DriverGpuVm> GpuVm<T> {
> +    /// # Safety
> +    /// Must be called from `sm_map`.
> +    pub(super) unsafe extern "C" fn sm_step_map(
> +        op: *mut bindings::drm_gpuva_op,
> +        p: *mut c_void,
> +    ) -> c_int {
> +        // SAFETY: If we reach `sm_step_map` then we were called from `sm_map` which always passes
> +        // an `SmMapData` as private data.
> +        let p = unsafe { &mut *p.cast::<SmMapData<'_, T>>() };
> +        let op = OpMap {
> +            // SAFETY: sm_step_map is called with a map operation.
> +            op: unsafe { &(*op).__bindgen_anon_1.map },
> +            vm_bo: &p.vm_bo,
> +            _invariant: PhantomData,
> +        };
> +        match p.sm_data.gpuvm.sm_step_map(op, p.sm_data.user_context) {
> +            Ok(OpMapped { .. }) => 0,
> +            Err(err) => err.to_errno(),
> +        }
> +    }
> +    /// # Safety
> +    /// Must be called from `sm_map` or `sm_unmap`.
> +    pub(super) unsafe extern "C" fn sm_step_unmap(
> +        op: *mut bindings::drm_gpuva_op,
> +        p: *mut c_void,
> +    ) -> c_int {
> +        // SAFETY: If we reach `sm_step_unmap` then we were called from `sm_map` or `sm_unmap` which passes either
> +        // an `SmMapData` or `SmData` as private data. Both cases can be cast to `SmData`.
> +        let p = unsafe { &mut *p.cast::<SmData<'_, T>>() };
> +        let op = OpUnmap {
> +            // SAFETY: sm_step_unmap is called with an unmap operation.
> +            op: unsafe { &(*op).__bindgen_anon_1.unmap },
> +            _invariant: PhantomData,
> +        };
> +        match p.gpuvm.sm_step_unmap(op, p.user_context) {
> +            Ok(OpUnmapped { .. }) => 0,
> +            Err(err) => err.to_errno(),
> +        }
> +    }
> +    /// # Safety
> +    /// Must be called from `sm_map` or `sm_unmap`.
> +    pub(super) unsafe extern "C" fn sm_step_remap(
> +        op: *mut bindings::drm_gpuva_op,
> +        p: *mut c_void,
> +    ) -> c_int {
> +        // SAFETY: If we reach `sm_step_remap` then we were called from `sm_map` or `sm_unmap` which passes either
> +        // an `SmMapData` or `SmData` as private data. Both cases can be cast to `SmData`.
> +        let p = unsafe { &mut *p.cast::<SmData<'_, T>>() };
> +        let op = OpRemap {
> +            // SAFETY: sm_step_remap is called with a remap operation.
> +            op: unsafe { &(*op).__bindgen_anon_1.remap },
> +            _invariant: PhantomData,
> +        };
> +        match p.gpuvm.sm_step_remap(op, p.user_context) {
> +            Ok(OpRemapped { .. }) => 0,
> +            Err(err) => err.to_errno(),
> +        }
> +    }
> +}
> diff --git a/rust/kernel/drm/gpuvm/va.rs b/rust/kernel/drm/gpuvm/va.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..a31122ff22282186a1d76d4bb085714f6465722b
> --- /dev/null
> +++ b/rust/kernel/drm/gpuvm/va.rs
> @@ -0,0 +1,148 @@
> +// SPDX-License-Identifier: GPL-2.0 OR MIT
> +
> +use super::*;
> +
> +/// Represents that a range of a GEM object is mapped in this [`GpuVm`] instance.
> +///
> +/// Does not assume that GEM lock is held.
> +///
> +/// # Invariants
> +///
> +/// This is a valid `drm_gpuva` that is resident in the [`GpuVm`] instance.
> +#[repr(C)]
> +#[pin_data]
> +pub struct GpuVa<T: DriverGpuVm> {
> +    #[pin]
> +    inner: Opaque<bindings::drm_gpuva>,
> +    #[pin]
> +    data: T::VaData,
> +}
> +
> +impl<T: DriverGpuVm> GpuVa<T> {
> +    /// Access this [`GpuVa`] from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// For the duration of `'a`, the pointer must reference a valid `drm_gpuva` associated with a
> +    /// [`GpuVm<T>`].
> +    #[inline]
> +    pub unsafe fn from_raw<'a>(ptr: *mut bindings::drm_gpuva) -> &'a Self {
> +        // SAFETY: `drm_gpuva` is first field and `repr(C)`.
> +        unsafe { &*ptr.cast() }
> +    }
> +
> +    /// Returns a raw pointer to underlying C value.
> +    #[inline]
> +    pub fn as_raw(&self) -> *mut bindings::drm_gpuva {
> +        self.inner.get()
> +    }
> +
> +    /// Returns the address of this mapping in the GPU virtual address space.
> +    #[inline]
> +    pub fn addr(&self) -> u64 {
> +        // SAFETY: The `va.addr` field of `drm_gpuva` is immutable.
> +        unsafe { (*self.as_raw()).va.addr }
> +    }
> +
> +    /// Returns the length of this mapping.
> +    #[inline]
> +    pub fn length(&self) -> u64 {
> +        // SAFETY: The `va.range` field of `drm_gpuva` is immutable.
> +        unsafe { (*self.as_raw()).va.range }
> +    }
> +
> +    /// Returns `addr..addr+length`.
> +    #[inline]
> +    pub fn range(&self) -> Range<u64> {
> +        let addr = self.addr();
> +        addr..addr + self.length()
> +    }
> +
> +    /// Returns the offset within the GEM object.
> +    #[inline]
> +    pub fn gem_offset(&self) -> u64 {
> +        // SAFETY: The `gem.offset` field of `drm_gpuva` is immutable.
> +        unsafe { (*self.as_raw()).gem.offset }
> +    }
> +
> +    /// Returns the GEM object.
> +    #[inline]
> +    pub fn obj(&self) -> &T::Object {
> +        // SAFETY: The `gem.offset` field of `drm_gpuva` is immutable.
> +        unsafe { <T::Object as IntoGEMObject>::from_raw((*self.as_raw()).gem.obj) }
> +    }
> +
> +    /// Returns the underlying [`GpuVmBo`] object that backs this [`GpuVa`].
> +    #[inline]
> +    pub fn vm_bo(&self) -> &GpuVmBo<T> {
> +        // SAFETY: The `vm_bo` field has been set and is immutable for the duration in which this
> +        // `drm_gpuva` is resident in the VM.
> +        unsafe { GpuVmBo::from_raw((*self.as_raw()).vm_bo) }
> +    }
> +}
> +
> +/// A pre-allocated [`GpuVa`] object.
> +///
> +/// # Invariants
> +///
> +/// The memory is zeroed.
> +pub struct GpuVaAlloc<T: DriverGpuVm>(KBox<MaybeUninit<GpuVa<T>>>);
> +
> +impl<T: DriverGpuVm> GpuVaAlloc<T> {
> +    /// Pre-allocate a [`GpuVa`] object.
> +    pub fn new(flags: AllocFlags) -> Result<GpuVaAlloc<T>, AllocError> {
> +        // INVARIANTS: Memory allocated with __GFP_ZERO.
> +        Ok(GpuVaAlloc(KBox::new_uninit(flags | __GFP_ZERO)?))
> +    }
> +
> +    /// Prepare this `drm_gpuva` for insertion into the GPUVM.
> +    pub(super) fn prepare(mut self, va_data: impl PinInit<T::VaData>) -> *mut bindings::drm_gpuva {
> +        let va_ptr = MaybeUninit::as_mut_ptr(&mut self.0);
> +        // SAFETY: The `data` field is pinned.
> +        let Ok(()) = unsafe { va_data.__pinned_init(&raw mut (*va_ptr).data) };
> +        KBox::into_raw(self.0).cast()
> +    }
> +}
> +
> +/// A [`GpuVa`] object that has been removed.
> +///
> +/// # Invariants
> +///
> +/// The `drm_gpuva` is not resident in the [`GpuVm`].
> +pub struct GpuVaRemoved<T: DriverGpuVm>(KBox<GpuVa<T>>);
> +
> +impl<T: DriverGpuVm> GpuVaRemoved<T> {
> +    /// Convert a raw pointer into a [`GpuVaRemoved`].
> +    ///
> +    /// # Safety
> +    ///
> +    /// Must have been removed from a [`GpuVm<T>`].
> +    pub(super) unsafe fn from_raw(ptr: *mut bindings::drm_gpuva) -> Self {
> +        // SAFETY: Since it has been removed we can take ownership of allocation.
> +        GpuVaRemoved(unsafe { KBox::from_raw(ptr.cast()) })
> +    }
> +
> +    /// Take ownership of the VA data.
> +    pub fn into_inner(self) -> T::VaData
> +    where
> +        T::VaData: Unpin,
> +    {
> +        KBox::into_inner(self.0).data
> +    }
> +}
> +
> +impl<T: DriverGpuVm> Deref for GpuVaRemoved<T> {
> +    type Target = T::VaData;
> +    fn deref(&self) -> &T::VaData {
> +        &self.0.data
> +    }
> +}
> +
> +impl<T: DriverGpuVm> DerefMut for GpuVaRemoved<T>
> +where
> +    T::VaData: Unpin,
> +{
> +    fn deref_mut(&mut self) -> &mut T::VaData {
> +        &mut self.0.data
> +    }
> +}
> diff --git a/rust/kernel/drm/gpuvm/vm_bo.rs b/rust/kernel/drm/gpuvm/vm_bo.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..f21aa17ea4f42c4a2b57b1f3a57a18dd2c3c8b7b
> --- /dev/null
> +++ b/rust/kernel/drm/gpuvm/vm_bo.rs
> @@ -0,0 +1,213 @@
> +// SPDX-License-Identifier: GPL-2.0 OR MIT
> +
> +use super::*;
> +
> +/// Represents that a given GEM object has at least one mapping on this [`GpuVm`] instance.
> +///
> +/// Does not assume that GEM lock is held.
> +#[repr(C)]
> +#[pin_data]
> +pub struct GpuVmBo<T: DriverGpuVm> {

Oh, we already have GpuVmBo, and GpuVmBoObtain. I see.

> +    #[pin]
> +    inner: Opaque<bindings::drm_gpuvm_bo>,
> +    #[pin]
> +    data: T::VmBoData,
> +}
> +
> +impl<T: DriverGpuVm> GpuVmBo<T> {
> +    pub(super) const ALLOC_FN: Option<unsafe extern "C" fn() -> *mut bindings::drm_gpuvm_bo> = {
> +        use core::alloc::Layout;
> +        let base = Layout::new::<bindings::drm_gpuvm_bo>();
> +        let rust = Layout::new::<Self>();
> +        assert!(base.size() <= rust.size());

We should default to something else instead of panicking IMHO.

> +        if base.size() != rust.size() || base.align() != rust.align() {
> +            Some(Self::vm_bo_alloc)
> +        } else {
> +            // This causes GPUVM to allocate a `GpuVmBo<T>` with `kzalloc(sizeof(drm_gpuvm_bo))`.
> +            None
> +        }
> +    };
> +
> +    pub(super) const FREE_FN: Option<unsafe extern "C" fn(*mut bindings::drm_gpuvm_bo)> = {
> +        if core::mem::needs_drop::<Self>() {
> +            Some(Self::vm_bo_free)
> +        } else {
> +            // This causes GPUVM to free a `GpuVmBo<T>` with `kfree`.
> +            None
> +        }
> +    };
> +
> +    /// Custom function for allocating a `drm_gpuvm_bo`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Always safe to call. Unsafe to match function pointer type in C struct.
> +    unsafe extern "C" fn vm_bo_alloc() -> *mut bindings::drm_gpuvm_bo {
> +        KBox::<Self>::new_uninit(GFP_KERNEL | __GFP_ZERO)
> +            .map(KBox::into_raw)
> +            .unwrap_or(ptr::null_mut())
> +            .cast()
> +    }
> +
> +    /// Custom function for freeing a `drm_gpuvm_bo`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The pointer must have been allocated with [`GpuVmBo::ALLOC_FN`], and must not be used after
> +    /// this call.
> +    unsafe extern "C" fn vm_bo_free(ptr: *mut bindings::drm_gpuvm_bo) {
> +        // SAFETY:
> +        // * The ptr was allocated from kmalloc with the layout of `GpuVmBo<T>`.
> +        // * `ptr->inner` has no destructor.
> +        // * `ptr->data` contains a valid `T::VmBoData` that we can drop.
> +        drop(unsafe { KBox::<Self>::from_raw(ptr.cast()) });
> +    }
> +
> +    /// Access this [`GpuVmBo`] from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// For the duration of `'a`, the pointer must reference a valid `drm_gpuvm_bo` associated with
> +    /// a [`GpuVm<T>`].
> +    #[inline]
> +    pub unsafe fn from_raw<'a>(ptr: *mut bindings::drm_gpuvm_bo) -> &'a Self {
> +        // SAFETY: `drm_gpuvm_bo` is first field and `repr(C)`.
> +        unsafe { &*ptr.cast() }
> +    }
> +
> +    /// Returns a raw pointer to underlying C value.
> +    #[inline]
> +    pub fn as_raw(&self) -> *mut bindings::drm_gpuvm_bo {
> +        self.inner.get()
> +    }
> +
> +    /// The [`GpuVm`] that this GEM object is mapped in.
> +    #[inline]
> +    pub fn gpuvm(&self) -> &GpuVm<T> {
> +        // SAFETY: The `obj` pointer is guaranteed to be valid.
> +        unsafe { GpuVm::<T>::from_raw((*self.inner.get()).vm) }
> +    }
> +
> +    /// The [`drm_gem_object`](crate::gem::Object) for these mappings.
> +    #[inline]
> +    pub fn obj(&self) -> &T::Object {
> +        // SAFETY: The `obj` pointer is guaranteed to be valid.
> +        unsafe { <T::Object as IntoGEMObject>::from_raw((*self.inner.get()).obj) }
> +    }
> +
> +    /// The driver data with this buffer object.
> +    #[inline]
> +    pub fn data(&self) -> &T::VmBoData {
> +        &self.data
> +    }
> +}
> +
> +/// A pre-allocated [`GpuVmBo`] object.
> +///
> +/// # Invariants
> +///
> +/// Points at a `drm_gpuvm_bo` that contains a valid `T::VmBoData`, has a refcount of one, and is
> +/// absent from any gem, extobj, or evict lists.
> +pub(super) struct GpuVmBoAlloc<T: DriverGpuVm>(NonNull<GpuVmBo<T>>);
> +
> +impl<T: DriverGpuVm> GpuVmBoAlloc<T> {
> +    /// Create a new pre-allocated [`GpuVmBo`].
> +    ///
> +    /// It's intentional that the initializer is infallible because `drm_gpuvm_bo_put` will call
> +    /// drop on the data, so we don't have a way to free it when the data is missing.
> +    #[inline]
> +    pub(super) fn new(
> +        gpuvm: &GpuVm<T>,
> +        gem: &T::Object,
> +        value: impl PinInit<T::VmBoData>,
> +    ) -> Result<GpuVmBoAlloc<T>, AllocError> {
> +        // SAFETY: The provided gpuvm and gem ptrs are valid for the duration of this call.
> +        let raw_ptr = unsafe {
> +            bindings::drm_gpuvm_bo_create(gpuvm.as_raw(), gem.as_raw()).cast::<GpuVmBo<T>>()
> +        };
> +        // CAST: `GpuVmBoAlloc::vm_bo_alloc` ensures that this memory was allocated with the layout
> +        // of `GpuVmBo<T>`.
> +        let ptr = NonNull::new(raw_ptr).ok_or(AllocError)?;
> +        // SAFETY: `ptr->data` is a valid pinned location.
> +        let Ok(()) = unsafe { value.__pinned_init(&raw mut (*raw_ptr).data) };
> +        // INVARIANTS: We just created the vm_bo so it's absent from lists, and the data is valid
> +        // as we just initialized it.
> +        Ok(GpuVmBoAlloc(ptr))
> +    }
> +
> +    /// Returns a raw pointer to underlying C value.
> +    #[inline]
> +    pub(super) fn as_raw(&self) -> *mut bindings::drm_gpuvm_bo {
> +        // SAFETY: The pointer references a valid `drm_gpuvm_bo`.
> +        unsafe { (*self.0.as_ptr()).inner.get() }
> +    }
> +
> +    /// Look up whether there is an existing [`GpuVmBo`] for this gem object.
> +    #[inline]
> +    pub(super) fn obtain(self) -> GpuVmBoObtain<T> {
> +        let me = ManuallyDrop::new(self);
> +        // SAFETY: Valid `drm_gpuvm_bo` not already in the lists.
> +        let ptr = unsafe { bindings::drm_gpuvm_bo_obtain_prealloc(me.as_raw()) };
> +
> +        // If the vm_bo does not already exist, ensure that it's in the extobj list.
> +        if ptr::eq(ptr, me.as_raw()) && me.gpuvm().is_extobj(me.obj()) {
> +            let _resv_lock = me.gpuvm().resv_lock();
> +            // SAFETY: We hold the GPUVMs resv lock.
> +            unsafe { bindings::drm_gpuvm_bo_extobj_add(ptr) };
> +        }
> +
> +        // INVARIANTS: Valid `drm_gpuvm_bo` in the GEM list.
> +        // SAFETY: `drm_gpuvm_bo_obtain_prealloc` always returns a non-null ptr
> +        GpuVmBoObtain(unsafe { NonNull::new_unchecked(ptr.cast()) })
> +    }
> +}
> +
> +impl<T: DriverGpuVm> Deref for GpuVmBoAlloc<T> {
> +    type Target = GpuVmBo<T>;
> +    #[inline]
> +    fn deref(&self) -> &GpuVmBo<T> {
> +        // SAFETY: By the type invariants we may deref while `Self` exists.
> +        unsafe { self.0.as_ref() }
> +    }
> +}
> +
> +impl<T: DriverGpuVm> Drop for GpuVmBoAlloc<T> {
> +    #[inline]
> +    fn drop(&mut self) {
> +        // SAFETY: It's safe to perform a deferred put in any context.
> +        unsafe { bindings::drm_gpuvm_bo_put_deferred(self.as_raw()) };
> +    }
> +}
> +
> +/// A [`GpuVmBo`] object in the GEM list.
> +///
> +/// # Invariants
> +///
> +/// Points at a `drm_gpuvm_bo` that contains a valid `T::VmBoData` and is present in the gem list.
> +pub struct GpuVmBoObtain<T: DriverGpuVm>(NonNull<GpuVmBo<T>>);
> +
> +impl<T: DriverGpuVm> GpuVmBoObtain<T> {
> +    /// Returns a raw pointer to underlying C value.
> +    #[inline]
> +    pub fn as_raw(&self) -> *mut bindings::drm_gpuvm_bo {
> +        // SAFETY: The pointer references a valid `drm_gpuvm_bo`.
> +        unsafe { (*self.0.as_ptr()).inner.get() }
> +    }
> +}
> +
> +impl<T: DriverGpuVm> Deref for GpuVmBoObtain<T> {
> +    type Target = GpuVmBo<T>;
> +    #[inline]
> +    fn deref(&self) -> &GpuVmBo<T> {
> +        // SAFETY: By the type invariants we may deref while `Self` exists.
> +        unsafe { self.0.as_ref() }
> +    }
> +}
> +
> +impl<T: DriverGpuVm> Drop for GpuVmBoObtain<T> {
> +    #[inline]
> +    fn drop(&mut self) {
> +        // SAFETY: It's safe to perform a deferred put in any context.
> +        unsafe { bindings::drm_gpuvm_bo_put_deferred(self.as_raw()) };
> +    }
> +}
> diff --git a/rust/kernel/drm/mod.rs b/rust/kernel/drm/mod.rs
> index 1b82b6945edf25b947afc08300e211bd97150d6b..a4b6c5430198571ec701af2ef452cc9ac55870e6 100644
> --- a/rust/kernel/drm/mod.rs
> +++ b/rust/kernel/drm/mod.rs
> @@ -6,6 +6,7 @@
> pub mod driver;
> pub mod file;
> pub mod gem;
> +pub mod gpuvm;
> pub mod ioctl;
> 
> pub use self::device::Device;
> 
> -- 
> 2.52.0.487.g5c8c507ade-goog
> 
> 


My overall opinion is that we’re adding a lot of things that will only be
relevant when we’re more advanced on the job submission front. This
includes the things that Phillip is working on (i.e.: Fences + JobQueue).

Perhaps we should keep this iteration downstream (so we’re sure it works
when the time comes) and focus on synchronous VM_BINDS upstream.
The Tyr demo that you’ve tested this on is very helpful for this purpose.

Thoughts?

— Daniel

[0]: https://lore.kernel.org/rust-for-linux/20251201102855.4413-1-work@onurozkan.dev/T/#m43dc9256c4b18dc8955df968bf0712fc7a9d24c6


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction
  2025-12-01 15:16   ` Daniel Almeida
@ 2025-12-02  8:39     ` Alice Ryhl
  2025-12-02 13:42       ` Daniel Almeida
  0 siblings, 1 reply; 12+ messages in thread
From: Alice Ryhl @ 2025-12-02  8:39 UTC (permalink / raw)
  To: Daniel Almeida
  Cc: Danilo Krummrich, Matthew Brost, Thomas Hellström,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Boris Brezillon, Steven Price, Liviu Dudau,
	Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Frank Binns,
	Matt Coster, Rob Clark, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Sean Paul, Marijn Suijten, Lyude Paul,
	Lucas De Marchi, Rodrigo Vivi, Sumit Semwal, Christian König,
	dri-devel, linux-kernel, rust-for-linux, linux-arm-msm, freedreno,
	nouveau, intel-xe, linux-media, linaro-mm-sig, Asahi Lina

On Mon, Dec 01, 2025 at 12:16:09PM -0300, Daniel Almeida wrote:
> Hi Alice,
> 
> I find it a bit weird that we reverted to v1, given that the previous gpuvm
> attempt was v3. No big deal though.
> 
> 
> > On 28 Nov 2025, at 11:14, Alice Ryhl <aliceryhl@google.com> wrote:
> > 
> > Add a GPUVM abstraction to be used by Rust GPU drivers.
> > 
> > GPUVM keeps track of a GPU's virtual address (VA) space and manages the
> > corresponding virtual mappings represented by "GPU VA" objects. It also
> > keeps track of the gem::Object<T> used to back the mappings through
> > GpuVmBo<T>.
> > 
> > This abstraction is only usable by drivers that wish to use GPUVM in
> > immediate mode. This allows us to build the locking scheme into the API
> > design. It means that the GEM mutex is used for the GEM gpuva list, and
> > that the resv lock is used for the extobj list. The evicted list is not
> > yet used in this version.
> > 
> > This abstraction provides a special handle called the GpuVmCore, which
> > is a wrapper around ARef<GpuVm> that provides access to the interval
> > tree. Generally, all changes to the address space requires mutable
> > access to this unique handle.
> > 
> > Some of the safety comments are still somewhat WIP, but I think the API
> > should be sound as-is.
> > 
> > Co-developed-by: Asahi Lina <lina+kernel@asahilina.net>
> > Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
> > Co-developed-by: Daniel Almeida <daniel.almeida@collabora.com>
> > Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>

> > +//! DRM GPUVM in immediate mode
> > +//!
> > +//! Rust abstractions for using GPUVM in immediate mode. This is when the GPUVM state is updated
> > +//! during `run_job()`, i.e., in the DMA fence signalling critical path, to ensure that the GPUVM
> 
> IMHO: We should initially target synchronous VM_BINDS, which are the opposite
> of what you described above.

Immediate mode is a locking scheme. We have to pick one of them
regardless of whether we do async VM_BIND yet.

(Well ok immediate mode is not just a locking scheme: it also determines
whether vm_bo cleanup is postponed or not.)

> > +/// A DRM GPU VA manager.
> > +///
> > +/// This object is refcounted, but the "core" is only accessible using a special unique handle. The
> 
> I wonder if `Owned<T>` is a good fit here? IIUC, Owned<T> can be refcounted,
> but there is only ever one handle on the Rust side? If so, this seems to be
> what we want here?

Yes, Owned<T> is probably a good fit.

> > +/// core consists of the `core` field and the GPUVM's interval tree.
> > +#[repr(C)]
> > +#[pin_data]
> > +pub struct GpuVm<T: DriverGpuVm> {
> > +    #[pin]
> > +    vm: Opaque<bindings::drm_gpuvm>,
> > +    /// Accessed only through the [`GpuVmCore`] reference.
> > +    core: UnsafeCell<T>,
> 
> This UnsafeCell has been here since Lina’s version. I must say I never
> understood why, and perhaps now is a good time to clarify it given the changes
> we’re making w.r.t to the “unique handle” thing.
> 
> This is just some driver private data. It’s never shared with C. I am not
> sure why we need this wrapper.

The sm_step_* methods receive a `&mut T`. This is UB if other code has
an `&GpuVm<T>` and the `T` is not wrapped in an `UnsafeCell` because
`&GpuVm<T>` implies that the data is not modified.

> > +    /// Shared data not protected by any lock.
> > +    #[pin]
> > +    shared_data: T::SharedData,
> 
> Should we deref to this?

We can do that.

> > +    /// Creates a GPUVM instance.
> > +    #[expect(clippy::new_ret_no_self)]
> > +    pub fn new<E>(
> > +        name: &'static CStr,
> > +        dev: &drm::Device<T::Driver>,
> > +        r_obj: &T::Object,
> 
> Can we call this “reservation_object”, or similar?
> 
> We should probably briefly explain what it does, perhaps linking to the C docs.

Yeah agreed, more docs are probably warranted here.

> I wonder if we should expose the methods below at this moment. We will not need
> them in Tyr until we start submitting jobs. This is still a bit in the future.
> 
> I say this for a few reasons:
> 
> a) Philipp is still working on the fence abstractions,
> 
> b) As a result from the above, we are taking raw fence pointers,
> 
> c) Onur is working on a WW Mutex abstraction [0] that includes a Rust
> implementation of drm_exec (under another name, and useful in other contexts
> outside of DRM). Should we use them here?
> 
> I think your current design with the ExecToken is also ok and perhaps we should
> stick to it, but it's good to at least discuss this with the others.

I don't think we can postpone adding the "obtain" method. It's required
to call sm_map, which is needed for VM_BIND.

> > +    /// Returns a [`GpuVmBoObtain`] for the provided GEM object.
> > +    #[inline]
> > +    pub fn obtain(
> > +        &self,
> > +        obj: &T::Object,
> > +        data: impl PinInit<T::VmBoData>,
> > +    ) -> Result<GpuVmBoObtain<T>, AllocError> {
> 
> Perhaps this should be called GpuVmBo? That’s what you want to “obtain” in the first place.
> 
> This is indeed a question, by the way.

One could possibly use Owned<_> here.

> > +/// A lock guard for the GPUVM's resv lock.
> > +///
> > +/// This guard provides access to the extobj and evicted lists.
> 
> Should we bother with evicted objects at this stage?

The abstractions don't actually support them right now. The resv lock is
currently only here because it's used internally in these abstractions.
It won't be useful to drivers until we add evicted objects.

> > +///
> > +/// # Invariants
> > +///
> > +/// Holds the GPUVM resv lock.
> > +pub struct GpuvmResvLockGuard<'a, T: DriverGpuVm>(&'a GpuVm<T>);
> > +
> > +impl<T: DriverGpuVm> GpuVm<T> {
> > +    /// Lock the VM's resv lock.
> 
> More docs here would be nice.
> 
> > +    #[inline]
> > +    pub fn resv_lock(&self) -> GpuvmResvLockGuard<'_, T> {
> > +        // SAFETY: It's always ok to lock the resv lock.
> > +        unsafe { bindings::dma_resv_lock(self.raw_resv_lock(), ptr::null_mut()) };
> > +        // INVARIANTS: We took the lock.
> > +        GpuvmResvLockGuard(self)
> > +    }
> 
> You can call this more than once and deadlock. Perhaps we should warn about this, or forbid it?

Same as any other lock. I don't think we need to do anything special.

> > +    /// Use the pre-allocated VA to carry out this map operation.
> > +    pub fn insert(self, va: GpuVaAlloc<T>, va_data: impl PinInit<T::VaData>) -> OpMapped<'op, T> {
> > +        let va = va.prepare(va_data);
> > +        // SAFETY: By the type invariants we may access the interval tree.
> > +        unsafe { bindings::drm_gpuva_map(self.vm_bo.gpuvm().as_raw(), va, self.op) };
> > +        // SAFETY: The GEM object is valid, so the mutex is properly initialized.
> 
> > +        unsafe { bindings::mutex_lock(&raw mut (*self.op.gem.obj).gpuva.lock) };
> 
> Should we use Fujita’s might_sleep() support here?

Could make sense yeah.

> > +/// ```
> > +/// struct drm_gpuva_op_unmap {
> > +/// /**
> > +/// * @va: the &drm_gpuva to unmap
> > +/// */
> > +/// struct drm_gpuva *va;
> > +///
> > +/// /**
> > +/// * @keep:
> > +/// *
> > +/// * Indicates whether this &drm_gpuva is physically contiguous with the
> > +/// * original mapping request.
> > +/// *
> > +/// * Optionally, if &keep is set, drivers may keep the actual page table
> > +/// * mappings for this &drm_gpuva, adding the missing page table entries
> > +/// * only and update the &drm_gpuvm accordingly.
> > +/// */
> > +/// bool keep;
> > +/// };
> > +/// ```
> 
> I think the docs could improve here ^

Yeah I can look at it.

> > +impl<T: DriverGpuVm> GpuVmCore<T> {
> > +    /// Create a mapping, removing or remapping anything that overlaps.
> > +    #[inline]
> > +    pub fn sm_map(&mut self, req: OpMapRequest<'_, T>) -> Result {
> 
> I wonder if we should keep this “sm” prefix. Perhaps
> “map_region” or “map_range” would be better names IMHO.

I'll wait for Danilo to weigh in on this. I'm not sure where "sm"
actually comes from.

> > +/// Represents that a given GEM object has at least one mapping on this [`GpuVm`] instance.
> > +///
> > +/// Does not assume that GEM lock is held.
> > +#[repr(C)]
> > +#[pin_data]
> > +pub struct GpuVmBo<T: DriverGpuVm> {
> 
> Oh, we already have GpuVmBo, and GpuVmBoObtain. I see.

Yeah, GpuVmBoObtain and GpuVmBoAlloc are pointers to GpuVmBo.

> > +    #[pin]
> > +    inner: Opaque<bindings::drm_gpuvm_bo>,
> > +    #[pin]
> > +    data: T::VmBoData,
> > +}
> > +
> > +impl<T: DriverGpuVm> GpuVmBo<T> {
> > +    pub(super) const ALLOC_FN: Option<unsafe extern "C" fn() -> *mut bindings::drm_gpuvm_bo> = {
> > +        use core::alloc::Layout;
> > +        let base = Layout::new::<bindings::drm_gpuvm_bo>();
> > +        let rust = Layout::new::<Self>();
> > +        assert!(base.size() <= rust.size());
> 
> We should default to something else instead of panicking IMHO.

This is const context, which makes it a build assertion.

> My overall opinion is that we’re adding a lot of things that will only be
> relevant when we’re more advanced on the job submission front. This
> includes the things that Phillip is working on (i.e.: Fences + JobQueue).
> 
> Perhaps we should keep this iteration downstream (so we’re sure it works
> when the time comes) and focus on synchronous VM_BINDS upstream.
> The Tyr demo that you’ve tested this on is very helpful for this purpose.

Yeah let's split out the prepare, GpuVmExec, and resv_add_fence stuff to
a separate patch.

I don't think sync vs async VM_BIND changes much in which methods or
structs are required here. Only difference is whether you call the
methods from a workqueue or not.

Alice

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction
  2025-12-02  8:39     ` Alice Ryhl
@ 2025-12-02 13:42       ` Daniel Almeida
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Almeida @ 2025-12-02 13:42 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Danilo Krummrich, Matthew Brost, Thomas Hellström,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Boris Brezillon, Steven Price, Liviu Dudau,
	Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Frank Binns,
	Matt Coster, Rob Clark, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Sean Paul, Marijn Suijten, Lyude Paul,
	Lucas De Marchi, Rodrigo Vivi, Sumit Semwal, Christian König,
	dri-devel, linux-kernel, rust-for-linux, linux-arm-msm, freedreno,
	nouveau, intel-xe, linux-media, linaro-mm-sig, Asahi Lina



> On 2 Dec 2025, at 05:39, Alice Ryhl <aliceryhl@google.com> wrote:
> 
> On Mon, Dec 01, 2025 at 12:16:09PM -0300, Daniel Almeida wrote:
>> Hi Alice,
>> 
>> I find it a bit weird that we reverted to v1, given that the previous gpuvm
>> attempt was v3. No big deal though.
>> 
>> 
>>> On 28 Nov 2025, at 11:14, Alice Ryhl <aliceryhl@google.com> wrote:
>>> 
>>> Add a GPUVM abstraction to be used by Rust GPU drivers.
>>> 
>>> GPUVM keeps track of a GPU's virtual address (VA) space and manages the
>>> corresponding virtual mappings represented by "GPU VA" objects. It also
>>> keeps track of the gem::Object<T> used to back the mappings through
>>> GpuVmBo<T>.
>>> 
>>> This abstraction is only usable by drivers that wish to use GPUVM in
>>> immediate mode. This allows us to build the locking scheme into the API
>>> design. It means that the GEM mutex is used for the GEM gpuva list, and
>>> that the resv lock is used for the extobj list. The evicted list is not
>>> yet used in this version.
>>> 
>>> This abstraction provides a special handle called the GpuVmCore, which
>>> is a wrapper around ARef<GpuVm> that provides access to the interval
>>> tree. Generally, all changes to the address space requires mutable
>>> access to this unique handle.
>>> 
>>> Some of the safety comments are still somewhat WIP, but I think the API
>>> should be sound as-is.
>>> 
>>> Co-developed-by: Asahi Lina <lina+kernel@asahilina.net>
>>> Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
>>> Co-developed-by: Daniel Almeida <daniel.almeida@collabora.com>
>>> Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
>>> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> 
>>> +//! DRM GPUVM in immediate mode
>>> +//!
>>> +//! Rust abstractions for using GPUVM in immediate mode. This is when the GPUVM state is updated
>>> +//! during `run_job()`, i.e., in the DMA fence signalling critical path, to ensure that the GPUVM
>> 
>> IMHO: We should initially target synchronous VM_BINDS, which are the opposite
>> of what you described above.
> 
> Immediate mode is a locking scheme. We have to pick one of them
> regardless of whether we do async VM_BIND yet.
> 
> (Well ok immediate mode is not just a locking scheme: it also determines
> whether vm_bo cleanup is postponed or not.)
> 
>>> +/// A DRM GPU VA manager.
>>> +///
>>> +/// This object is refcounted, but the "core" is only accessible using a special unique handle. The
>> 
>> I wonder if `Owned<T>` is a good fit here? IIUC, Owned<T> can be refcounted,
>> but there is only ever one handle on the Rust side? If so, this seems to be
>> what we want here?
> 
> Yes, Owned<T> is probably a good fit.
> 
>>> +/// core consists of the `core` field and the GPUVM's interval tree.
>>> +#[repr(C)]
>>> +#[pin_data]
>>> +pub struct GpuVm<T: DriverGpuVm> {
>>> +    #[pin]
>>> +    vm: Opaque<bindings::drm_gpuvm>,
>>> +    /// Accessed only through the [`GpuVmCore`] reference.
>>> +    core: UnsafeCell<T>,
>> 
>> This UnsafeCell has been here since Lina’s version. I must say I never
>> understood why, and perhaps now is a good time to clarify it given the changes
>> we’re making w.r.t to the “unique handle” thing.
>> 
>> This is just some driver private data. It’s never shared with C. I am not
>> sure why we need this wrapper.
> 
> The sm_step_* methods receive a `&mut T`. This is UB if other code has
> an `&GpuVm<T>` and the `T` is not wrapped in an `UnsafeCell` because
> `&GpuVm<T>` implies that the data is not modified.
> 
>>> +    /// Shared data not protected by any lock.
>>> +    #[pin]
>>> +    shared_data: T::SharedData,
>> 
>> Should we deref to this?
> 
> We can do that.
> 
>>> +    /// Creates a GPUVM instance.
>>> +    #[expect(clippy::new_ret_no_self)]
>>> +    pub fn new<E>(
>>> +        name: &'static CStr,
>>> +        dev: &drm::Device<T::Driver>,
>>> +        r_obj: &T::Object,
>> 
>> Can we call this “reservation_object”, or similar?
>> 
>> We should probably briefly explain what it does, perhaps linking to the C docs.
> 
> Yeah agreed, more docs are probably warranted here.
> 
>> I wonder if we should expose the methods below at this moment. We will not need
>> them in Tyr until we start submitting jobs. This is still a bit in the future.
>> 
>> I say this for a few reasons:
>> 
>> a) Philipp is still working on the fence abstractions,
>> 
>> b) As a result from the above, we are taking raw fence pointers,
>> 
>> c) Onur is working on a WW Mutex abstraction [0] that includes a Rust
>> implementation of drm_exec (under another name, and useful in other contexts
>> outside of DRM). Should we use them here?
>> 
>> I think your current design with the ExecToken is also ok and perhaps we should
>> stick to it, but it's good to at least discuss this with the others.
> 
> I don't think we can postpone adding the "obtain" method. It's required
> to call sm_map, which is needed for VM_BIND.
> 
>>> +    /// Returns a [`GpuVmBoObtain`] for the provided GEM object.
>>> +    #[inline]
>>> +    pub fn obtain(
>>> +        &self,
>>> +        obj: &T::Object,
>>> +        data: impl PinInit<T::VmBoData>,
>>> +    ) -> Result<GpuVmBoObtain<T>, AllocError> {
>> 
>> Perhaps this should be called GpuVmBo? That’s what you want to “obtain” in the first place.
>> 
>> This is indeed a question, by the way.
> 
> One could possibly use Owned<_> here.
> 
>>> +/// A lock guard for the GPUVM's resv lock.
>>> +///
>>> +/// This guard provides access to the extobj and evicted lists.
>> 
>> Should we bother with evicted objects at this stage?
> 
> The abstractions don't actually support them right now. The resv lock is
> currently only here because it's used internally in these abstractions.
> It won't be useful to drivers until we add evicted objects.
> 
>>> +///
>>> +/// # Invariants
>>> +///
>>> +/// Holds the GPUVM resv lock.
>>> +pub struct GpuvmResvLockGuard<'a, T: DriverGpuVm>(&'a GpuVm<T>);
>>> +
>>> +impl<T: DriverGpuVm> GpuVm<T> {
>>> +    /// Lock the VM's resv lock.
>> 
>> More docs here would be nice.
>> 
>>> +    #[inline]
>>> +    pub fn resv_lock(&self) -> GpuvmResvLockGuard<'_, T> {
>>> +        // SAFETY: It's always ok to lock the resv lock.
>>> +        unsafe { bindings::dma_resv_lock(self.raw_resv_lock(), ptr::null_mut()) };
>>> +        // INVARIANTS: We took the lock.
>>> +        GpuvmResvLockGuard(self)
>>> +    }
>> 
>> You can call this more than once and deadlock. Perhaps we should warn about this, or forbid it?
> 
> Same as any other lock. I don't think we need to do anything special.
> 
>>> +    /// Use the pre-allocated VA to carry out this map operation.
>>> +    pub fn insert(self, va: GpuVaAlloc<T>, va_data: impl PinInit<T::VaData>) -> OpMapped<'op, T> {
>>> +        let va = va.prepare(va_data);
>>> +        // SAFETY: By the type invariants we may access the interval tree.
>>> +        unsafe { bindings::drm_gpuva_map(self.vm_bo.gpuvm().as_raw(), va, self.op) };
>>> +        // SAFETY: The GEM object is valid, so the mutex is properly initialized.
>> 
>>> +        unsafe { bindings::mutex_lock(&raw mut (*self.op.gem.obj).gpuva.lock) };
>> 
>> Should we use Fujita’s might_sleep() support here?
> 
> Could make sense yeah.
> 
>>> +/// ```
>>> +/// struct drm_gpuva_op_unmap {
>>> +/// /**
>>> +/// * @va: the &drm_gpuva to unmap
>>> +/// */
>>> +/// struct drm_gpuva *va;
>>> +///
>>> +/// /**
>>> +/// * @keep:
>>> +/// *
>>> +/// * Indicates whether this &drm_gpuva is physically contiguous with the
>>> +/// * original mapping request.
>>> +/// *
>>> +/// * Optionally, if &keep is set, drivers may keep the actual page table
>>> +/// * mappings for this &drm_gpuva, adding the missing page table entries
>>> +/// * only and update the &drm_gpuvm accordingly.
>>> +/// */
>>> +/// bool keep;
>>> +/// };
>>> +/// ```
>> 
>> I think the docs could improve here ^
> 
> Yeah I can look at it.
> 
>>> +impl<T: DriverGpuVm> GpuVmCore<T> {
>>> +    /// Create a mapping, removing or remapping anything that overlaps.
>>> +    #[inline]
>>> +    pub fn sm_map(&mut self, req: OpMapRequest<'_, T>) -> Result {
>> 
>> I wonder if we should keep this “sm” prefix. Perhaps
>> “map_region” or “map_range” would be better names IMHO.
> 
> I'll wait for Danilo to weigh in on this. I'm not sure where "sm"
> actually comes from.

sm probably is a reference to “split/merge”.

> 
>>> +/// Represents that a given GEM object has at least one mapping on this [`GpuVm`] instance.
>>> +///
>>> +/// Does not assume that GEM lock is held.
>>> +#[repr(C)]
>>> +#[pin_data]
>>> +pub struct GpuVmBo<T: DriverGpuVm> {
>> 
>> Oh, we already have GpuVmBo, and GpuVmBoObtain. I see.
> 
> Yeah, GpuVmBoObtain and GpuVmBoAlloc are pointers to GpuVmBo.
> 
>>> +    #[pin]
>>> +    inner: Opaque<bindings::drm_gpuvm_bo>,
>>> +    #[pin]
>>> +    data: T::VmBoData,
>>> +}
>>> +
>>> +impl<T: DriverGpuVm> GpuVmBo<T> {
>>> +    pub(super) const ALLOC_FN: Option<unsafe extern "C" fn() -> *mut bindings::drm_gpuvm_bo> = {
>>> +        use core::alloc::Layout;
>>> +        let base = Layout::new::<bindings::drm_gpuvm_bo>();
>>> +        let rust = Layout::new::<Self>();
>>> +        assert!(base.size() <= rust.size());
>> 
>> We should default to something else instead of panicking IMHO.
> 
> This is const context, which makes it a build assertion.
> 
>> My overall opinion is that we’re adding a lot of things that will only be
>> relevant when we’re more advanced on the job submission front. This
>> includes the things that Phillip is working on (i.e.: Fences + JobQueue).
>> 
>> Perhaps we should keep this iteration downstream (so we’re sure it works
>> when the time comes) and focus on synchronous VM_BINDS upstream.
>> The Tyr demo that you’ve tested this on is very helpful for this purpose.
> 
> Yeah let's split out the prepare, GpuVmExec, and resv_add_fence stuff to
> a separate patch.

Ack

> 
> I don't think sync vs async VM_BIND changes much in which methods or
> structs are required here. Only difference is whether you call the
> methods from a workqueue or not.
> 
> Alice



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-12-02 13:43 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-28 14:14 [PATCH 0/4] Rust GPUVM support Alice Ryhl
2025-11-28 14:14 ` [PATCH 1/4] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc() Alice Ryhl
2025-11-28 14:24   ` Boris Brezillon
2025-12-01  9:55     ` Alice Ryhl
2025-11-28 14:14 ` [PATCH 2/4] drm/gpuvm: drm_gpuvm_bo_obtain() requires lock and staged mode Alice Ryhl
2025-11-28 14:25   ` Boris Brezillon
2025-11-28 14:14 ` [PATCH 3/4] drm/gpuvm: use const for drm_gpuva_op_* ptrs Alice Ryhl
2025-11-28 14:27   ` Boris Brezillon
2025-11-28 14:14 ` [PATCH 4/4] rust: drm: add GPUVM immediate mode abstraction Alice Ryhl
2025-12-01 15:16   ` Daniel Almeida
2025-12-02  8:39     ` Alice Ryhl
2025-12-02 13:42       ` Daniel Almeida

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).