* [PATCH 1/4] drm/xe/vm: Validate userptr during gpu vma prefetching
From: Thomas Hellström @ 2025-02-26 15:33 UTC (permalink / raw)
To: intel-xe; +Cc: Thomas Hellström, Matthew Brost, stable

If a userptr vma subject to prefetching was already invalidated
or invalidated during the prefetch operation, the operation would
repeatedly return -EAGAIN which would typically cause an infinite
loop.

Validate the userptr to ensure this doesn't happen.

Fixes: 5bd24e78829a ("drm/xe/vm: Subclass userptr vmas")
Fixes: 617eebb9c480 ("drm/xe: Fix array of binds")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.9+
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 996000f2424e..4c1ca47667ad 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2307,7 +2307,14 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops,
 		}
 	case DRM_GPUVA_OP_UNMAP:
 	case DRM_GPUVA_OP_PREFETCH:
-		/* FIXME: Need to skip some prefetch ops */
+		vma = gpuva_to_vma(op->base.prefetch.va);
+
+		if (xe_vma_is_userptr(vma)) {
+			err = xe_vma_userptr_pin_pages(to_userptr_vma(vma));
+			if (err)
+				return err;
+		}
+
 		xe_vma_ops_incr_pt_update_ops(vops, op->tile_mask);
 		break;
 	default:
--
2.48.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 1/4] drm/xe/vm: Validate userptr during gpu vma prefetching
From: Matthew Brost @ 2025-02-26 15:40 UTC (permalink / raw)
To: Thomas Hellström; +Cc: intel-xe, stable

On Wed, Feb 26, 2025 at 04:33:41PM +0100, Thomas Hellström wrote:
> If a userptr vma subject to prefetching was already invalidated
> or invalidated during the prefetch operation, the operation would
> repeatedly return -EAGAIN which would typically cause an infinite
> loop.
>
> Validate the userptr to ensure this doesn't happen.
>
[snip]
> @@ -2307,7 +2307,14 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops,
>  		}
>  	case DRM_GPUVA_OP_UNMAP:
>  	case DRM_GPUVA_OP_PREFETCH:
> -		/* FIXME: Need to skip some prefetch ops */

The UNMAP case statement is falling through to the prefetch case, which I
believe is not the intent.

So I think:

	case DRM_GPUVA_OP_UNMAP:
		xe_vma_ops_incr_pt_update_ops(vops, op->tile_mask);
		break;
	case DRM_GPUVA_OP_PREFETCH:
		<new code>

Matt

> +		vma = gpuva_to_vma(op->base.prefetch.va);
> +
> +		if (xe_vma_is_userptr(vma)) {
> +			err = xe_vma_userptr_pin_pages(to_userptr_vma(vma));
> +			if (err)
> +				return err;
> +		}
> +
>  		xe_vma_ops_incr_pt_update_ops(vops, op->tile_mask);
>  		break;
>  	default:
* Re: [PATCH 1/4] drm/xe/vm: Validate userptr during gpu vma prefetching
From: Thomas Hellström @ 2025-02-26 15:46 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, stable

On Wed, 2025-02-26 at 07:40 -0800, Matthew Brost wrote:
> The UNMAP case statement is falling through to the prefetch case,
> which I believe is not the intent.
>
> So I think:
>
> 	case DRM_GPUVA_OP_UNMAP:
> 		xe_vma_ops_incr_pt_update_ops(vops, op->tile_mask);
> 		break;
> 	case DRM_GPUVA_OP_PREFETCH:
> 		<new code>
>
> Matt

Right. Will fix.

/Thomas
* [PATCH 2/4] drm/xe/vm: Fix a misplaced #endif
From: Thomas Hellström @ 2025-02-26 15:33 UTC (permalink / raw)
To: intel-xe
Cc: Thomas Hellström, Maarten Lankhorst, José Roberto de Souza, stable

Fix a (harmless) misplaced #endif leading to declarations
appearing multiple times.

Fixes: 0eb2a18a8fad ("drm/xe: Implement VM snapshot support for BO's and userptr")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: <stable@vger.kernel.org> # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_vm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index f66075f8a6fe..7c8e39049223 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -282,9 +282,9 @@ static inline void vm_dbg(const struct drm_device *dev,
 			  const char *format, ...)
 { /* noop */ }
 #endif
-#endif
 
 struct xe_vm_snapshot *xe_vm_snapshot_capture(struct xe_vm *vm);
 void xe_vm_snapshot_capture_delayed(struct xe_vm_snapshot *snap);
 void xe_vm_snapshot_print(struct xe_vm_snapshot *snap, struct drm_printer *p);
 void xe_vm_snapshot_free(struct xe_vm_snapshot *snap);
+#endif
--
2.48.1
* Re: [PATCH 2/4] drm/xe/vm: Fix a misplaced #endif
From: Lucas De Marchi @ 2025-02-26 17:03 UTC (permalink / raw)
To: Thomas Hellström
Cc: intel-xe, Maarten Lankhorst, José Roberto de Souza, stable

On Wed, Feb 26, 2025 at 04:33:42PM +0100, Thomas Hellström wrote:
>Fix a (harmless) misplaced #endif leading to declarations
>appearing multiple times.
>
>Fixes: 0eb2a18a8fad ("drm/xe: Implement VM snapshot support for BO's and userptr")
>Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>Cc: José Roberto de Souza <jose.souza@intel.com>
>Cc: <stable@vger.kernel.org> # v6.9+

not sure if it deserves to be propagated to stable since it's harmless.
As prep for patch 3, maybe just do v6.12+, but up to you

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi
* RE: [PATCH 2/4] drm/xe/vm: Fix a misplaced #endif
From: Upadhyay, Tejas @ 2025-02-27  5:04 UTC (permalink / raw)
To: Thomas Hellström, intel-xe@lists.freedesktop.org
Cc: Maarten Lankhorst, Souza, Jose, stable@vger.kernel.org

> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of Thomas
> Hellström
> Sent: Wednesday, February 26, 2025 9:04 PM
> To: intel-xe@lists.freedesktop.org
> Subject: [PATCH 2/4] drm/xe/vm: Fix a misplaced #endif
>
> Fix a (harmless) misplaced #endif leading to declarations appearing multiple
> times.
>
> Fixes: 0eb2a18a8fad ("drm/xe: Implement VM snapshot support for BO's and
> userptr")

LGTM,
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

Tejas
* [PATCH 3/4] drm/xe: Fix fault mode invalidation with unbind
From: Thomas Hellström @ 2025-02-26 15:33 UTC (permalink / raw)
To: intel-xe; +Cc: Thomas Hellström, Matthew Brost, Matthew Auld, stable

Fix fault mode invalidation racing with unbind leading to the
PTE zapping potentially traversing an invalid page-table tree.
Do this by holding the notifier lock across PTE zapping. This
might transfer any contention waiting on the notifier seqlock
read side to the notifier lock read side, but that shouldn't be
a major problem.

At the same time get rid of the open-coded invalidation in the bind
code by relying on the notifier even when the vma bind is not
yet committed.

Finally let userptr invalidation call a dedicated xe_vm function
performing a full invalidation.

Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v6.12+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_pt.c       | 38 ++++------------
 drivers/gpu/drm/xe/xe_vm.c       | 78 ++++++++++++++++++++------------
 drivers/gpu/drm/xe/xe_vm.h       |  8 ++++
 drivers/gpu/drm/xe/xe_vm_types.h |  4 +-
 4 files changed, 68 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 1ddcc7e79a93..12a627a23eb4 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -1213,42 +1213,22 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
 		return 0;
 
 	uvma = to_userptr_vma(vma);
-	notifier_seq = uvma->userptr.notifier_seq;
+	if (xe_pt_userptr_inject_eagain(uvma))
+		xe_vma_userptr_force_invalidate(uvma);
 
-	if (uvma->userptr.initial_bind && !xe_vm_in_fault_mode(vm))
-		return 0;
+	notifier_seq = uvma->userptr.notifier_seq;
 
 	if (!mmu_interval_read_retry(&uvma->userptr.notifier,
-				     notifier_seq) &&
-	    !xe_pt_userptr_inject_eagain(uvma))
+				     notifier_seq))
 		return 0;
 
-	if (xe_vm_in_fault_mode(vm)) {
+	if (xe_vm_in_fault_mode(vm))
 		return -EAGAIN;
-	} else {
-		spin_lock(&vm->userptr.invalidated_lock);
-		list_move_tail(&uvma->userptr.invalidate_link,
-			       &vm->userptr.invalidated);
-		spin_unlock(&vm->userptr.invalidated_lock);
-
-		if (xe_vm_in_preempt_fence_mode(vm)) {
-			struct dma_resv_iter cursor;
-			struct dma_fence *fence;
-			long err;
-
-			dma_resv_iter_begin(&cursor, xe_vm_resv(vm),
-					    DMA_RESV_USAGE_BOOKKEEP);
-			dma_resv_for_each_fence_unlocked(&cursor, fence)
-				dma_fence_enable_sw_signaling(fence);
-			dma_resv_iter_end(&cursor);
-
-			err = dma_resv_wait_timeout(xe_vm_resv(vm),
-						    DMA_RESV_USAGE_BOOKKEEP,
-						    false, MAX_SCHEDULE_TIMEOUT);
-			XE_WARN_ON(err <= 0);
-		}
-	}
 
+	/*
+	 * Just continue the operation since exec or rebind worker
+	 * will take care of rebinding.
+	 */
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 4c1ca47667ad..37d773c0b729 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -580,51 +580,26 @@ static void preempt_rebind_work_func(struct work_struct *w)
 	trace_xe_vm_rebind_worker_exit(vm);
 }
 
-static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
-				   const struct mmu_notifier_range *range,
-				   unsigned long cur_seq)
+static void __vma_userptr_invalidate(struct xe_vm *vm, struct xe_userptr_vma *uvma)
 {
-	struct xe_userptr *userptr = container_of(mni, typeof(*userptr), notifier);
-	struct xe_userptr_vma *uvma = container_of(userptr, typeof(*uvma), userptr);
+	struct xe_userptr *userptr = &uvma->userptr;
 	struct xe_vma *vma = &uvma->vma;
-	struct xe_vm *vm = xe_vma_vm(vma);
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 	long err;
 
-	xe_assert(vm->xe, xe_vma_is_userptr(vma));
-	trace_xe_vma_userptr_invalidate(vma);
-
-	if (!mmu_notifier_range_blockable(range))
-		return false;
-
-	vm_dbg(&xe_vma_vm(vma)->xe->drm,
-	       "NOTIFIER: addr=0x%016llx, range=0x%016llx",
-		xe_vma_start(vma), xe_vma_size(vma));
-
-	down_write(&vm->userptr.notifier_lock);
-	mmu_interval_set_seq(mni, cur_seq);
-
-	/* No need to stop gpu access if the userptr is not yet bound. */
-	if (!userptr->initial_bind) {
-		up_write(&vm->userptr.notifier_lock);
-		return true;
-	}
-
 	/*
 	 * Tell exec and rebind worker they need to repin and rebind this
 	 * userptr.
 	 */
 	if (!xe_vm_in_fault_mode(vm) &&
-	    !(vma->gpuva.flags & XE_VMA_DESTROYED) && vma->tile_present) {
+	    !(vma->gpuva.flags & XE_VMA_DESTROYED)) {
 		spin_lock(&vm->userptr.invalidated_lock);
 		list_move_tail(&userptr->invalidate_link,
 			       &vm->userptr.invalidated);
 		spin_unlock(&vm->userptr.invalidated_lock);
 	}
 
-	up_write(&vm->userptr.notifier_lock);
-
 	/*
 	 * Preempt fences turn into schedule disables, pipeline these.
 	 * Note that even in fault mode, we need to wait for binds and
@@ -642,11 +617,35 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
 				    false, MAX_SCHEDULE_TIMEOUT);
 	XE_WARN_ON(err <= 0);
 
-	if (xe_vm_in_fault_mode(vm)) {
+	if (xe_vm_in_fault_mode(vm) && userptr->initial_bind) {
 		err = xe_vm_invalidate_vma(vma);
 		XE_WARN_ON(err);
 	}
+}
+
+static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
+				   const struct mmu_notifier_range *range,
+				   unsigned long cur_seq)
+{
+	struct xe_userptr_vma *uvma = container_of(mni, typeof(*uvma), userptr.notifier);
+	struct xe_vma *vma = &uvma->vma;
+	struct xe_vm *vm = xe_vma_vm(vma);
+
+	xe_assert(vm->xe, xe_vma_is_userptr(vma));
+	trace_xe_vma_userptr_invalidate(vma);
+
+	if (!mmu_notifier_range_blockable(range))
+		return false;
 
+	vm_dbg(&xe_vma_vm(vma)->xe->drm,
+	       "NOTIFIER: addr=0x%016llx, range=0x%016llx",
+	       xe_vma_start(vma), xe_vma_size(vma));
+
+	down_write(&vm->userptr.notifier_lock);
+	mmu_interval_set_seq(mni, cur_seq);
+
+	__vma_userptr_invalidate(vm, uvma);
+	up_write(&vm->userptr.notifier_lock);
 	trace_xe_vma_userptr_invalidate_complete(vma);
 
 	return true;
@@ -656,6 +655,27 @@ static const struct mmu_interval_notifier_ops vma_userptr_notifier_ops = {
 	.invalidate = vma_userptr_invalidate,
 };
 
+#if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
+/**
+ * xe_vma_userptr_force_invalidate() - force invalidate a userptr
+ * @uvma: The userptr vma to invalidate
+ *
+ * Perform a forced userptr invalidation for testing purposes.
+ */
+void xe_vma_userptr_force_invalidate(struct xe_userptr_vma *uvma)
+{
+	struct xe_vm *vm = xe_vma_vm(&uvma->vma);
+
+	lockdep_assert_held_write(&vm->lock);
+	lockdep_assert_held(&vm->userptr.notifier_lock);
+
+	if (!mmu_interval_read_retry(&uvma->userptr.notifier,
+				     uvma->userptr.notifier_seq))
+		uvma->userptr.notifier_seq -= 2;
+	__vma_userptr_invalidate(vm, uvma);
+}
+#endif
+
 int xe_vm_userptr_pin(struct xe_vm *vm)
 {
 	struct xe_userptr_vma *uvma, *next;
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 7c8e39049223..f5d835271350 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -287,4 +287,12 @@ struct xe_vm_snapshot *xe_vm_snapshot_capture(struct xe_vm *vm);
 void xe_vm_snapshot_capture_delayed(struct xe_vm_snapshot *snap);
 void xe_vm_snapshot_print(struct xe_vm_snapshot *snap, struct drm_printer *p);
 void xe_vm_snapshot_free(struct xe_vm_snapshot *snap);
+
+#if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
+void xe_vma_userptr_force_invalidate(struct xe_userptr_vma *uvma);
+#else
+static inline void xe_vma_userptr_force_invalidate(struct xe_userptr_vma *uvma)
+{
+}
+#endif
 #endif
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 52467b9b5348..1fe79bf23b6b 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -228,8 +228,8 @@ struct xe_vm {
 		 * up for revalidation. Protected from access with the
 		 * @invalidated_lock. Removing items from the list
 		 * additionally requires @lock in write mode, and adding
-		 * items to the list requires the @userptr.notifer_lock in
-		 * write mode.
+		 * items to the list requires either the @userptr.notifer_lock in
+		 * write mode, OR @lock in write mode.
 		 */
 		struct list_head invalidated;
 	} userptr;
--
2.48.1
* Re: [PATCH 3/4] drm/xe: Fix fault mode invalidation with unbind
From: Matthew Brost @ 2025-02-27  7:03 UTC (permalink / raw)
To: Thomas Hellström; +Cc: intel-xe, Matthew Auld, stable

On Wed, Feb 26, 2025 at 04:33:43PM +0100, Thomas Hellström wrote:
> Fix fault mode invalidation racing with unbind leading to the
> PTE zapping potentially traversing an invalid page-table tree.
> Do this by holding the notifier lock across PTE zapping. This
> might transfer any contention waiting on the notifier seqlock
> read side to the notifier lock read side, but that shouldn't be
> a major problem.
>
> At the same time get rid of the open-coded invalidation in the bind
> code by relying on the notifier even when the vma bind is not
> yet committed.
>
> Finally let userptr invalidation call a dedicated xe_vm function
> performing a full invalidation.
>
> Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: <stable@vger.kernel.org> # v6.12+
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[snip]

^ permalink raw reply	[flat|nested] 10+ messages in thread
* [PATCH 4/4] drm/xe: Add staging tree for VM binds
From: Thomas Hellström @ 2025-02-26 15:33 UTC (permalink / raw)
To: intel-xe; +Cc: Matthew Brost, Thomas Hellström, stable

From: Matthew Brost <matthew.brost@intel.com>

Concurrent VM bind staging and zapping of PTEs from a userptr
notifier do not work because the view of PTEs is not stable. VM
binds cannot acquire the notifier lock during staging, as memory
allocations are required. To resolve this race condition, use a
staging tree for VM binds that is committed only under the userptr
notifier lock during the final step of the bind. This ensures a
consistent view of the PTEs in the userptr notifier.

A follow up may only use staging for VM in fault mode as this is
the only mode in which the above race exists.
v3:
- Drop zap PTE change (Thomas)
- s/xe_pt_entry/xe_pt_entry_staging (Thomas)

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: <stable@vger.kernel.org>
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_pt.c      | 58 +++++++++++++++++++++++----------
 drivers/gpu/drm/xe/xe_pt_walk.c |  3 +-
 drivers/gpu/drm/xe/xe_pt_walk.h |  4 +++
 3 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 12a627a23eb4..dc24baa84092 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -28,6 +28,8 @@ struct xe_pt_dir {
 	struct xe_pt pt;
 	/** @children: Array of page-table child nodes */
 	struct xe_ptw *children[XE_PDES];
+	/** @staging: Array of page-table staging nodes */
+	struct xe_ptw *staging[XE_PDES];
 };
 
 #if IS_ENABLED(CONFIG_DRM_XE_DEBUG_VM)
@@ -48,9 +50,10 @@ static struct xe_pt_dir *as_xe_pt_dir(struct xe_pt *pt)
 	return container_of(pt, struct xe_pt_dir, pt);
 }
 
-static struct xe_pt *xe_pt_entry(struct xe_pt_dir *pt_dir, unsigned int index)
+static struct xe_pt *
+xe_pt_entry_staging(struct xe_pt_dir *pt_dir, unsigned int index)
 {
-	return container_of(pt_dir->children[index], struct xe_pt, base);
+	return container_of(pt_dir->staging[index], struct xe_pt, base);
 }
 
 static u64 __xe_pt_empty_pte(struct xe_tile *tile, struct xe_vm *vm,
@@ -125,6 +128,7 @@ struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_tile *tile,
 	}
 	pt->bo = bo;
 	pt->base.children = level ? as_xe_pt_dir(pt)->children : NULL;
+	pt->base.staging = level ? as_xe_pt_dir(pt)->staging : NULL;
 
 	if (vm->xef)
 		xe_drm_client_add_bo(vm->xef->client, pt->bo);
@@ -206,8 +210,8 @@ void xe_pt_destroy(struct xe_pt *pt, u32 flags, struct llist_head *deferred)
 		struct xe_pt_dir *pt_dir = as_xe_pt_dir(pt);
 
 		for (i = 0; i < XE_PDES; i++) {
-			if (xe_pt_entry(pt_dir, i))
-				xe_pt_destroy(xe_pt_entry(pt_dir, i), flags,
+			if (xe_pt_entry_staging(pt_dir, i))
+				xe_pt_destroy(xe_pt_entry_staging(pt_dir, i), flags,
 					      deferred);
 		}
 	}
@@ -376,8 +380,10 @@ xe_pt_insert_entry(struct xe_pt_stage_bind_walk *xe_walk, struct xe_pt *parent,
 		/* Continue building a non-connected subtree. */
 		struct iosys_map *map = &parent->bo->vmap;
 
-		if (unlikely(xe_child))
+		if (unlikely(xe_child)) {
 			parent->base.children[offset] = &xe_child->base;
+			parent->base.staging[offset] = &xe_child->base;
+		}
 
 		xe_pt_write(xe_walk->vm->xe, map, offset, pte);
 		parent->num_live++;
@@ -614,6 +620,7 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
 			.ops = &xe_pt_stage_bind_ops,
 			.shifts = xe_normal_pt_shifts,
 			.max_level = XE_PT_HIGHEST_LEVEL,
+			.staging = true,
 		},
 		.vm = xe_vma_vm(vma),
 		.tile = tile,
@@ -873,7 +880,7 @@ static void xe_pt_cancel_bind(struct xe_vma *vma,
 	}
 }
 
-static void xe_pt_commit_locks_assert(struct xe_vma *vma)
+static void xe_pt_commit_prepare_locks_assert(struct xe_vma *vma)
 {
 	struct xe_vm *vm = xe_vma_vm(vma);
 
@@ -885,6 +892,16 @@ static void xe_pt_commit_locks_assert(struct xe_vma *vma)
 	xe_vm_assert_held(vm);
 }
 
+static void xe_pt_commit_locks_assert(struct xe_vma *vma)
+{
+	struct xe_vm *vm = xe_vma_vm(vma);
+
+	xe_pt_commit_prepare_locks_assert(vma);
+
+	if (xe_vma_is_userptr(vma))
+		lockdep_assert_held_read(&vm->userptr.notifier_lock);
+}
+
 static void xe_pt_commit(struct xe_vma *vma,
 			 struct xe_vm_pgtable_update *entries,
 			 u32 num_entries, struct llist_head *deferred)
@@ -895,13 +912,17 @@ static void xe_pt_commit(struct xe_vma *vma,
 
 	for (i = 0; i < num_entries; i++) {
 		struct xe_pt *pt = entries[i].pt;
+		struct xe_pt_dir *pt_dir;
 
 		if (!pt->level)
 			continue;
 
+		pt_dir = as_xe_pt_dir(pt);
 		for (j = 0; j < entries[i].qwords; j++) {
 			struct xe_pt *oldpte = entries[i].pt_entries[j].pt;
+			int j_ = j + entries[i].ofs;
 
+			pt_dir->children[j_] = pt_dir->staging[j_];
 			xe_pt_destroy(oldpte, xe_vma_vm(vma)->flags, deferred);
 		}
 	}
@@ -913,7 +934,7 @@ static void xe_pt_abort_bind(struct xe_vma *vma,
 {
 	int i, j;
 
-	xe_pt_commit_locks_assert(vma);
+	xe_pt_commit_prepare_locks_assert(vma);
 
 	for (i = num_entries - 1; i >= 0; --i) {
 		struct xe_pt *pt = entries[i].pt;
@@ -928,10 +949,10 @@ static void xe_pt_abort_bind(struct xe_vma *vma,
 		pt_dir = as_xe_pt_dir(pt);
 		for (j = 0; j < entries[i].qwords; j++) {
 			u32 j_ = j + entries[i].ofs;
-			struct xe_pt *newpte = xe_pt_entry(pt_dir, j_);
+			struct xe_pt *newpte = xe_pt_entry_staging(pt_dir, j_);
 			struct xe_pt *oldpte = entries[i].pt_entries[j].pt;
 
-			pt_dir->children[j_] = oldpte ? &oldpte->base : 0;
+			pt_dir->staging[j_] = oldpte ? &oldpte->base : 0;
 			xe_pt_destroy(newpte, xe_vma_vm(vma)->flags, NULL);
 		}
 	}
@@ -943,7 +964,7 @@ static void xe_pt_commit_prepare_bind(struct xe_vma *vma,
 {
 	u32 i, j;
 
-	xe_pt_commit_locks_assert(vma);
+	xe_pt_commit_prepare_locks_assert(vma);
 
 	for (i = 0; i < num_entries; i++) {
 		struct xe_pt *pt = entries[i].pt;
@@ -961,10 +982,10 @@ static void xe_pt_commit_prepare_bind(struct xe_vma *vma,
 			struct xe_pt *newpte = entries[i].pt_entries[j].pt;
 			struct xe_pt *oldpte = NULL;
 
-			if (xe_pt_entry(pt_dir, j_))
-				oldpte = xe_pt_entry(pt_dir, j_);
+			if (xe_pt_entry_staging(pt_dir, j_))
+				oldpte = xe_pt_entry_staging(pt_dir, j_);
 
-			pt_dir->children[j_] = &newpte->base;
+			pt_dir->staging[j_] = &newpte->base;
 			entries[i].pt_entries[j].pt = oldpte;
 		}
 	}
@@ -1494,6 +1515,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile, struct xe_vma *vma,
 			.ops = &xe_pt_stage_unbind_ops,
 			.shifts = xe_normal_pt_shifts,
 			.max_level = XE_PT_HIGHEST_LEVEL,
+			.staging = true,
 		},
 		.tile = tile,
 		.modified_start = xe_vma_start(vma),
@@ -1535,7 +1557,7 @@ static void xe_pt_abort_unbind(struct xe_vma *vma,
 {
 	int i, j;
 
-	xe_pt_commit_locks_assert(vma);
+	xe_pt_commit_prepare_locks_assert(vma);
 
 	for (i = num_entries - 1; i >= 0; --i) {
 		struct xe_vm_pgtable_update *entry = &entries[i];
@@ -1548,7 +1570,7 @@ static void xe_pt_abort_unbind(struct xe_vma *vma,
 			continue;
 
 		for (j = entry->ofs; j < entry->ofs + entry->qwords; j++)
-			pt_dir->children[j] =
+			pt_dir->staging[j] =
 				entries[i].pt_entries[j - entry->ofs].pt ?
 				&entries[i].pt_entries[j - entry->ofs].pt->base : NULL;
 	}
@@ -1561,7 +1583,7 @@ xe_pt_commit_prepare_unbind(struct xe_vma *vma,
 {
 	int i, j;
 
-	xe_pt_commit_locks_assert(vma);
+	xe_pt_commit_prepare_locks_assert(vma);
 
 	for (i = 0; i < num_entries; ++i) {
 		struct xe_vm_pgtable_update *entry = &entries[i];
@@ -1575,8 +1597,8 @@ xe_pt_commit_prepare_unbind(struct xe_vma *vma,
 		pt_dir = as_xe_pt_dir(pt);
 		for (j = entry->ofs; j < entry->ofs + entry->qwords; j++) {
 			entry->pt_entries[j - entry->ofs].pt =
-				xe_pt_entry(pt_dir, j);
-			pt_dir->children[j] = NULL;
+				xe_pt_entry_staging(pt_dir, j);
+			pt_dir->staging[j] = NULL;
 		}
 	}
 }
diff --git a/drivers/gpu/drm/xe/xe_pt_walk.c b/drivers/gpu/drm/xe/xe_pt_walk.c
index b8b3d2aea492..be602a763ff3 100644
--- a/drivers/gpu/drm/xe/xe_pt_walk.c
+++ b/drivers/gpu/drm/xe/xe_pt_walk.c
@@ -74,7 +74,8 @@ int xe_pt_walk_range(struct xe_ptw *parent, unsigned int level,
 		     u64 addr, u64 end, struct xe_pt_walk *walk)
 {
 	pgoff_t offset = xe_pt_offset(addr, level, walk);
-	struct xe_ptw **entries = parent->children ? parent->children : NULL;
+	struct xe_ptw **entries = walk->staging ? (parent->staging ?: NULL) :
+		(parent->children ?: NULL);
 	const struct xe_pt_walk_ops *ops = walk->ops;
 	enum page_walk_action action;
 	struct xe_ptw *child;
diff --git a/drivers/gpu/drm/xe/xe_pt_walk.h b/drivers/gpu/drm/xe/xe_pt_walk.h
index 5ecc4d2f0f65..5c02c244f7de 100644
--- a/drivers/gpu/drm/xe/xe_pt_walk.h
+++ b/drivers/gpu/drm/xe/xe_pt_walk.h
@@ -11,12 +11,14 @@
 /**
  * struct xe_ptw - base class for driver pagetable subclassing.
  * @children: Pointer to an array of children if any.
+ * @staging: Pointer to an array of staging if any.
  *
  * Drivers could subclass this, and if it's a page-directory, typically
  * embed an array of xe_ptw pointers.
  */
 struct xe_ptw {
 	struct xe_ptw **children;
+	struct xe_ptw **staging;
 };
 
 /**
@@ -41,6 +43,8 @@ struct xe_pt_walk {
 	 * as shared pagetables.
 	 */
 	bool shared_pt_mode;
+	/** @staging: Walk staging PT structure */
+	bool staging;
 };
 
 /**
-- 
2.48.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] drm/xe: Add staging tree for VM binds
  2025-02-26 15:33 ` [PATCH 4/4] drm/xe: Add staging tree for VM binds Thomas Hellström
@ 2025-02-26 16:25   ` Thomas Hellström
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Hellström @ 2025-02-26 16:25 UTC (permalink / raw)
  To: intel-xe; +Cc: Matthew Brost, stable

On Wed, 2025-02-26 at 16:33 +0100, Thomas Hellström wrote:
> From: Matthew Brost <matthew.brost@intel.com>
> 
> Concurrent VM bind staging and zapping of PTEs from a userptr notifier
> do not work because the view of PTEs is not stable. VM binds cannot
> acquire the notifier lock during staging, as memory allocations are
> required. To resolve this race condition, use a staging tree for VM
> binds that is committed only under the userptr notifier lock during the
> final step of the bind. This ensures a consistent view of the PTEs in
> the userptr notifier.
> 
> A follow up may only use staging for VM in fault mode as this is the
> only mode in which the above race exists.
> 
> v3:
> - Drop zap PTE change (Thomas)
> - s/xe_pt_entry/xe_pt_entry_staging (Thomas)
> 
> Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: <stable@vger.kernel.org>
> Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
> Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

> [quoted diff snipped; identical to the patch above]

^ permalink raw reply	[flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-02-27 7:02 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20250226153344.58175-1-thomas.hellstrom@linux.intel.com>
2025-02-26 15:33 ` [PATCH 1/4] drm/xe/vm: Validate userptr during gpu vma prefetching Thomas Hellström
2025-02-26 15:40 ` Matthew Brost
2025-02-26 15:46 ` Thomas Hellström
2025-02-26 15:33 ` [PATCH 2/4] drm/xe/vm: Fix a misplaced #endif Thomas Hellström
2025-02-26 17:03 ` Lucas De Marchi
2025-02-27 5:04 ` Upadhyay, Tejas
2025-02-26 15:33 ` [PATCH 3/4] drm/xe: Fix fault mode invalidation with unbind Thomas Hellström
2025-02-27 7:03 ` Matthew Brost
2025-02-26 15:33 ` [PATCH 4/4] drm/xe: Add staging tree for VM binds Thomas Hellström
2025-02-26 16:25 ` Thomas Hellström