Re: [RFC v2 3/7] drm/xe/svm: Clear CPU_AUTORESET_ACTIVE on first GPU fault

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Matthew Brost <matthew.brost@intel.com>
To: Arvind Yadav <arvind.yadav@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	<himal.prasad.ghimiray@intel.com>,
	<thomas.hellstrom@linux.intel.com>
Subject: Re: [RFC v2 3/7] drm/xe/svm: Clear CPU_AUTORESET_ACTIVE on first GPU fault
Date: Wed, 29 Apr 2026 21:26:50 -0700	[thread overview]
Message-ID: <afLaCi80ZDcEFKB+@gsse-cloud1.jf.intel.com> (raw)
In-Reply-To: <20260406085830.1118431-4-arvind.yadav@intel.com>

On Mon, Apr 06, 2026 at 02:28:26PM +0530, Arvind Yadav wrote:
> CPU address mirror VMAs start with cpu_autoreset_active set, indicating
> they are still CPU-only.
> 
> Clear cpu_autoreset_active after the first successful GPU fault so
> subsequent munmap follows the SVM path instead of the autoreset path.
> 
> Do this in xe_svm_handle_pagefault() on the success path only.
> Prefetch faults that install no PTEs must not transition this state.
> 
> v2:
>   - Move xe_vma_gpu_touch() to the success path in
>     xe_svm_handle_pagefault() so prefetch faults that find no range do
>     not transition the state. (Matt)
>   - Add xe_vma_gpu_touch() helper in xe_vm.h and use
>     vma->cpu_autoreset_active instead of vma->gpuva.flags. (Matt)
> 
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_svm.c | 11 +++++++++++
>  drivers/gpu/drm/xe/xe_vm.h  | 13 +++++++++++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index 5933b2b6392b..fd57c9d41db8 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -1388,6 +1388,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
>  			    bool atomic)
>  {
>  	int need_vram, ret;
> +
> +	lockdep_assert_held_write(&vm->lock);
> +

This looks unrelated... and will change here [1] to read mode.

[1] https://patchwork.freedesktop.org/patch/707294/?series=162167&rev=4

So I'd say drop this part.

>  retry:
>  	need_vram = xe_vma_need_vram_for_atomic(vm->xe, vma, atomic);
>  	if (need_vram < 0)
> @@ -1406,6 +1409,14 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
>  
>  		goto retry;
>  	}
> +
> +	/*
> +	 * Mark the VMA as GPU-touched only after a successful fault-in.
> +	 * Prefetch faults that find no range must not transition this state.
> +	 */
> +	if (!ret && xe_vma_has_cpu_autoreset_active(vma))
> +		xe_vma_gpu_touch(vma);
> +
>  	return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> index bdf42083da86..8d45f896f90b 100644
> --- a/drivers/gpu/drm/xe/xe_vm.h
> +++ b/drivers/gpu/drm/xe/xe_vm.h
> @@ -441,4 +441,17 @@ static inline struct drm_exec *xe_vm_validation_exec(struct xe_vm *vm)
>  	((READ_ONCE(tile_present) & ~READ_ONCE(tile_invalidated)) & BIT((tile)->id))
>  
>  void xe_vma_mem_attr_copy(struct xe_vma_mem_attr *to, struct xe_vma_mem_attr *from);
> +
> +/**
> + * xe_vma_gpu_touch() - Mark a VMA as no longer CPU-only
> + * @vma: VMA to update
> + *
> + * Clear cpu_autoreset_active after the first successful GPU fault-in.
> + * Caller must hold vm->lock in write mode.
> + */
> +static inline void xe_vma_gpu_touch(struct xe_vma *vma)
> +{
> +	lockdep_assert_held_write(&xe_vma_vm(vma)->lock);
> +	vma->cpu_autoreset_active = false;

Write mode VM lock seems fine for now, but will likely have to change
this to vma->fault_lock with [1]...

Also IMO the caller adjusting this is too late. Adjusting auto-reset
likely should be basically be the first thing handler does, moved to the
xe_pagefault layer... I noticed in [1] the VMA more than likely needs to
be refcounted but that's a different issue.

Lastly - when we call xe_vma_gpu_touch and go from cpu_autoreset_active
1 -> 0 shouldn't we remove the auto-reset notifier [2] [3]? Keeping
notifier around is a non-zero cost as these are entiries in interval
tree the core MM walks for every notifier trigger, so I think deleting
the notifier part makes sense if this is possible - still wrapping my
head around [2] [3], so take this as a suggestion.

Matt

[2] https://patchwork.freedesktop.org/patch/716552/?series=161815&rev=2
[3] https://patchwork.freedesktop.org/patch/716553/?series=161815&rev=2

> +}
>  #endif
> -- 
> 2.43.0
>

next prev parent reply	other threads:[~2026-04-30  4:26 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-06  8:58 [RFC v2 0/7] drm/xe/svm: Add MMU notifier-based madvise autoreset on munmap Arvind Yadav
2026-04-06  8:58 ` [RFC v2 1/7] drm/xe/vm: Track CPU_AUTORESET state in xe_vma Arvind Yadav
2026-04-30  4:07   ` Matthew Brost
2026-06-04  5:31     ` Yadav, Arvind
2026-04-06  8:58 ` [RFC v2 2/7] drm/xe/vm: Preserve cpu_autoreset_active across GPUVA operations Arvind Yadav
2026-04-30  4:29   ` Matthew Brost
2026-06-04  5:40     ` Yadav, Arvind
2026-04-06  8:58 ` [RFC v2 3/7] drm/xe/svm: Clear CPU_AUTORESET_ACTIVE on first GPU fault Arvind Yadav
2026-04-30  4:26   ` Matthew Brost [this message]
2026-06-04  5:38     ` Yadav, Arvind
2026-04-06  8:58 ` [RFC v2 4/7] drm/xe/vm: Add madvise autoreset interval notifier worker infrastructure Arvind Yadav
2026-04-06  8:58 ` [RFC v2 5/7] drm/xe/vm: Deactivate madvise notifier on GPU touch Arvind Yadav
2026-04-06  8:58 ` [RFC v2 6/7] drm/xe/vm: Wire MADVISE_AUTORESET notifiers into VM lifecycle Arvind Yadav
2026-04-06  8:58 ` [RFC v2 7/7] drm/xe/svm: Correct memory attribute reset for partial unmap Arvind Yadav
2026-04-30  5:02   ` Matthew Brost
2026-04-30  5:08     ` Matthew Brost
2026-06-04  6:05       ` Yadav, Arvind
2026-04-06  9:04 ` ✗ CI.checkpatch: warning for drm/xe/svm: Add MMU notifier-based madvise autoreset on munmap (rev2) Patchwork
2026-04-06  9:06 ` ✓ CI.KUnit: success " Patchwork
2026-04-06  9:54 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-06 12:36 ` ✓ Xe.CI.FULL: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afLaCi80ZDcEFKB+@gsse-cloud1.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=arvind.yadav@intel.com \
    --cc=himal.prasad.ghimiray@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.