intel-xe.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org,
	Joonas Lahtinen	 <joonas.lahtinen@linux.intel.com>,
	Jani Nikula <jani.nikula@intel.com>,
	 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Matthew Auld <matthew.auld@intel.com>
Subject: Re: [PATCH v2 09/16] drm/xe: Convert the CPU fault handler for exhaustive eviction
Date: Wed, 27 Aug 2025 16:16:25 +0200	[thread overview]
Message-ID: <cc739f6d910cf772a10b4304c205b971f15bf7dc.camel@linux.intel.com> (raw)
In-Reply-To: <aK46+muNkR2uAq75@lstrano-desk.jf.intel.com>

On Tue, 2025-08-26 at 15:53 -0700, Matthew Brost wrote:
> On Fri, Aug 22, 2025 at 11:40:23AM +0200, Thomas Hellström wrote:
> > The CPU fault handler may populate bos and migrate, and in doing
> > so might interfere with other tasks validating.
> > 
> > Rework the CPU fault handler completely into a fastpath
> > and a slowpath. The fastpath trylocks only the validation lock
> > in read-mode. If that fails, there's a fallback to the
> > slowpath, where we do a full validation transaction.
> > 
> > This mandates open-coding of bo locking, bo idling and
> > bo populating, but we still call into TTM for fault
> > finalizing.
> > 
> > v2:
> > - Rework the CPU fault handler to actually take part in
> >   the exhaustive eviction scheme (Matthew Brost).
> > 
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_bo.c         | 191 ++++++++++++++++++++++++-
> > ----
> >  drivers/gpu/drm/xe/xe_validation.c |   3 +-
> >  2 files changed, 163 insertions(+), 31 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_bo.c
> > b/drivers/gpu/drm/xe/xe_bo.c
> > index 76e9c93826a2..686ca5d6038a 100644
> > --- a/drivers/gpu/drm/xe/xe_bo.c
> > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > @@ -1713,57 +1713,188 @@ static void xe_gem_object_close(struct
> > drm_gem_object *obj,
> >  	}
> >  }
> >  
> > -static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
> > +static vm_fault_t __xe_bo_cpu_fault(struct vm_fault *vmf, struct
> > xe_device *xe, struct xe_bo *bo)
> > +{
> > +	vm_fault_t ret;
> > +
> > +	trace_xe_bo_cpu_fault(bo);
> > +
> > +	ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma-
> > >vm_page_prot,
> > +				       TTM_BO_VM_NUM_PREFAULT);
> > +	if (ret == VM_FAULT_NOPAGE &&
> > +	    mem_type_is_vram(bo->ttm.resource->mem_type)) {
> > +		mutex_lock(&xe->mem_access.vram_userfault.lock);
> > +		if (list_empty(&bo->vram_userfault_link))
> > +			list_add(&bo->vram_userfault_link,
> > +				 &xe-
> > >mem_access.vram_userfault.list);
> > +		mutex_unlock(&xe->mem_access.vram_userfault.lock);
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static vm_fault_t xe_err_to_fault_t(int err)
> > +{
> > +	switch (err) {
> > +	case 0:
> > +	case -EINTR:
> > +	case -ERESTARTSYS:
> > +	case -EAGAIN:
> > +		return VM_FAULT_NOPAGE;
> > +	case -ENOMEM:
> > +	case -ENOSPC:
> > +		return VM_FAULT_OOM;
> > +	default:
> > +		break;
> > +	}
> > +	return VM_FAULT_SIGBUS;
> > +}
> > +
> > +static vm_fault_t xe_bo_cpu_fault_fastpath(struct vm_fault *vmf,
> > struct xe_device *xe,
> > +					   struct xe_bo *bo, bool
> > needs_rpm)
> > +{
> > +	struct ttm_buffer_object *tbo = &bo->ttm;
> > +	vm_fault_t ret = VM_FAULT_RETRY;
> > +	struct xe_validation_ctx ctx;
> > +	int err;
> > +
> > +	if (needs_rpm && !xe_pm_runtime_get_if_active(xe))
> > +		return VM_FAULT_RETRY;
> > +
> > +	err = xe_validation_ctx_init(&ctx, &xe->val, NULL,
> > +				     (struct xe_val_flags) {
> > +					     .interruptible =
> > true,
> > +					     .no_block = true
> > +				     });
> > +	if (err)
> > +		goto out_pm;
> > +
> > +	if (!dma_resv_trylock(tbo->base.resv))
> > +		goto out_validation;
> > +
> > +	if (!dma_resv_test_signaled(tbo->base.resv,
> > DMA_RESV_USAGE_KERNEL))
> > +		goto out_unlock;
> > +
> > +	if (!tbo->resource->bus.is_iomem) {
> > +		struct ttm_operation_ctx ctx = {
> > +			.interruptible = true,
> > +			.no_wait_gpu = true,
> > +			.gfp_retry_mayfail = true,
> > +		};
> > +
> > +		err = ttm_bo_populate(tbo, &ctx);
> 
> The version of the fault handler before didn't have a ttm_bo_populate
> call. Can you explain why it is added here?

It's called from within ttm_bo_vm_fault_reserved() but with a blocking
ttm_operation_ctx. Here we call it non-blocking and if it succeeds the
version in ttm_bo_vm_fault_reserved() will be a NOP.

The functionality is to bring in bos from swap if needed. Or to be able
to access bos with deferred backing store allocation.

> 
> Also in we have this code in ttm_bo_vm_reserve which rejects external
> object marked as unmappable. Do we need something like this?
> 
>         if (bo->ttm && (bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL))
> {
>                 if (!(bo->ttm->page_flags &
> TTM_TT_FLAG_EXTERNAL_MAPPABLE)) {
>                         dma_resv_unlock(bo->base.resv);
>                         return VM_FAULT_SIGBUS;
>                 }
>         }

Ah, right. This essentially blocks imported dma-bufs from being mapped
here. I'll fix. This reminds me to add a comment to rework the ttm
helpers to move this stuff to TTM.

/Thomas

> 
> Matt
> 
> > +		if (err) {
> > +			if (err != -ENOMEM && err != -ENOSPC)
> > +				ret = xe_err_to_fault_t(err);
> > +			goto out_unlock;
> > +		}
> > +	}
> > +
> > +	ret = __xe_bo_cpu_fault(vmf, xe, bo);
> > +
> > +out_unlock:
> > +	dma_resv_unlock(tbo->base.resv);
> > +out_validation:
> > +	xe_validation_ctx_fini(&ctx);
> > +out_pm:
> > +	if (needs_rpm)
> > +		xe_pm_runtime_put(xe);
> > +
> > +	return ret;
> > +}
> > +
> > +static vm_fault_t xe_bo_cpu_fault(struct vm_fault *vmf)
> >  {
> >  	struct ttm_buffer_object *tbo = vmf->vma->vm_private_data;
> >  	struct drm_device *ddev = tbo->base.dev;
> >  	struct xe_device *xe = to_xe_device(ddev);
> >  	struct xe_bo *bo = ttm_to_xe_bo(tbo);
> >  	bool needs_rpm = bo->flags & XE_BO_FLAG_VRAM_MASK;
> > -	struct drm_exec *exec;
> > +	bool retry_after_wait = false;
> > +	struct xe_validation_ctx ctx;
> > +	struct drm_exec exec;
> >  	vm_fault_t ret;
> > +	int err = 0;
> >  	int idx;
> >  
> > +	if (!drm_dev_enter(&xe->drm, &idx))
> > +		return ttm_bo_vm_dummy_page(vmf, vmf->vma-
> > >vm_page_prot);
> > +
> > +	ret = xe_bo_cpu_fault_fastpath(vmf, xe, bo, needs_rpm);
> > +	if (ret != VM_FAULT_RETRY)
> > +		goto out;
> > +
> > +	if (fault_flag_allow_retry_first(vmf->flags)) {
> > +		if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
> > +			goto out;
> > +		retry_after_wait = true;
> > +		xe_bo_get(bo);
> > +		mmap_read_unlock(vmf->vma->vm_mm);
> > +	} else {
> > +		ret = VM_FAULT_NOPAGE;
> > +	}
> > +
> > +	/*
> > +	 * The fastpath failed and we were not required to return
> > and retry immediately.
> > +	 * We're now running in one of two modes:
> > +	 *
> > +	 * 1) retry_after_wait == true: The mmap_read_lock() is
> > dropped, and we're trying
> > +	 * to resolve blocking waits. But we can't resolve the
> > fault since the
> > +	 * mmap_read_lock() is dropped. After retrying the fault,
> > the aim is that the fastpath
> > +	 * should succeed. But it may fail since we drop the bo
> > lock.
> > +	 *
> > +	 * 2) retry_after_wait == false: The fastpath failed,
> > typically even after
> > +	 * a retry. Do whatever's necessary to resolve the fault.
> > +	 *
> > +	 * This construct is recommended to avoid excessive waits
> > under the mmap_lock.
> > +	 */
> > +
> >  	if (needs_rpm)
> >  		xe_pm_runtime_get(xe);
> >  
> > -	exec = XE_VALIDATION_UNIMPLEMENTED;
> > -	ret = ttm_bo_vm_reserve(tbo, vmf);
> > -	if (ret)
> > -		goto out;
> > +	xe_validation_guard(&ctx, &xe->val, &exec, (struct
> > xe_val_flags) {.interruptible = true},
> > +			    err) {
> > +		long lerr;
> >  
> > -	if (drm_dev_enter(ddev, &idx)) {
> > -		trace_xe_bo_cpu_fault(bo);
> > +		err = drm_exec_lock_obj(&exec, &tbo->base);
> > +		drm_exec_retry_on_contention(&exec);
> > +		if (err)
> > +			break;
> >  
> > -		xe_validation_assert_exec(xe, exec, &tbo->base);
> > -		ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma-
> > >vm_page_prot,
> > -					      
> > TTM_BO_VM_NUM_PREFAULT);
> > -		drm_dev_exit(idx);
> > +		lerr = dma_resv_wait_timeout(tbo->base.resv,
> > +					    
> > DMA_RESV_USAGE_KERNEL, true,
> > +					    
> > MAX_SCHEDULE_TIMEOUT);
> > +		if (lerr < 0) {
> > +			err = lerr;
> > +			break;
> > +		}
> >  
> > -		if (ret == VM_FAULT_RETRY &&
> > -		    !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
> > -			goto out;
> > +		if (!tbo->resource->bus.is_iomem) {
> > +			struct ttm_operation_ctx tctx = {
> > +				.interruptible = true,
> > +				.no_wait_gpu = false,
> > +				.gfp_retry_mayfail = true,
> > +			};
> >  
> > -		/*
> > -		 * ttm_bo_vm_reserve() already has dma_resv_lock.
> > -		 */
> > -		if (ret == VM_FAULT_NOPAGE &&
> > -		    mem_type_is_vram(tbo->resource->mem_type)) {
> > -			mutex_lock(&xe-
> > >mem_access.vram_userfault.lock);
> > -			if (list_empty(&bo->vram_userfault_link))
> > -				list_add(&bo->vram_userfault_link,
> > -					 &xe-
> > >mem_access.vram_userfault.list);
> > -			mutex_unlock(&xe-
> > >mem_access.vram_userfault.lock);
> > +			err = ttm_bo_populate(tbo, &tctx);
> > +			xe_validation_retry_on_oom(&ctx, &err);
> > +			if (err && (err == -EINTR || err == -
> > ERESTARTSYS))
> > +				break;
> >  		}
> > -	} else {
> > -		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma-
> > >vm_page_prot);
> > +		if (!retry_after_wait)
> > +			ret = __xe_bo_cpu_fault(vmf, xe, bo);
> >  	}
> > +	if (err)
> > +		ret = xe_err_to_fault_t(err);
> >  
> > -	dma_resv_unlock(tbo->base.resv);
> > -out:
> >  	if (needs_rpm)
> >  		xe_pm_runtime_put(xe);
> >  
> > +	if (retry_after_wait)
> > +		xe_bo_put(bo);
> > +out:
> > +	drm_dev_exit(idx);
> > +
> >  	return ret;
> >  }
> >  
> > @@ -1807,7 +1938,7 @@ int xe_bo_read(struct xe_bo *bo, u64 offset,
> > void *dst, int size)
> >  }
> >  
> >  static const struct vm_operations_struct xe_gem_vm_ops = {
> > -	.fault = xe_gem_fault,
> > +	.fault = xe_bo_cpu_fault,
> >  	.open = ttm_bo_vm_open,
> >  	.close = ttm_bo_vm_close,
> >  	.access = xe_bo_vm_access,
> > diff --git a/drivers/gpu/drm/xe/xe_validation.c
> > b/drivers/gpu/drm/xe/xe_validation.c
> > index b90fda3dd5f4..826cd09966ef 100644
> > --- a/drivers/gpu/drm/xe/xe_validation.c
> > +++ b/drivers/gpu/drm/xe/xe_validation.c
> > @@ -241,7 +241,8 @@ int xe_validation_exec_lock(struct
> > xe_validation_ctx *ctx,
> >   */
> >  void xe_validation_ctx_fini(struct xe_validation_ctx *ctx)
> >  {
> > -	drm_exec_fini(ctx->exec);
> > +	if (ctx->exec)
> > +		drm_exec_fini(ctx->exec);
> >  	xe_validation_unlock(ctx);
> >  }
> >  
> > -- 
> > 2.50.1
> > 


  reply	other threads:[~2025-08-27 14:16 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-22  9:40 [PATCH v2 00/16] Driver-managed exhaustive eviction Thomas Hellström
2025-08-22  9:40 ` [PATCH v2 01/16] drm/xe/vm: Don't pin the vm_resv during validation Thomas Hellström
2025-08-22  9:40 ` [PATCH v2 02/16] drm/xe/tests/xe_dma_buf: Set the drm_object::dma_buf member Thomas Hellström
2025-08-22  9:40 ` [PATCH v2 03/16] drm/xe/vm: Clear the scratch_pt pointer on error Thomas Hellström
2025-08-22  9:40 ` [PATCH v2 04/16] drm/xe: Pass down drm_exec context to validation Thomas Hellström
2025-08-22 19:59   ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 05/16] drm/xe: Introduce an xe_validation wrapper around drm_exec Thomas Hellström
2025-08-26 20:42   ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 06/16] drm/xe: Convert xe_bo_create_user() for exhaustive eviction Thomas Hellström
2025-08-23  9:32   ` Simon Richter
2025-08-22  9:40 ` [PATCH v2 07/16] drm/xe: Convert SVM validation " Thomas Hellström
2025-08-22 19:13   ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 08/16] drm/xe: Convert existing drm_exec transactions " Thomas Hellström
2025-08-22  9:40 ` [PATCH v2 09/16] drm/xe: Convert the CPU fault handler " Thomas Hellström
2025-08-26 22:53   ` Matthew Brost
2025-08-27 14:16     ` Thomas Hellström [this message]
2025-08-27 15:52       ` Matthew Brost
2025-08-28  6:18         ` Thomas Hellström
2025-08-22  9:40 ` [PATCH v2 10/16] drm/xe/display: Convert __xe_pin_fb_vma() Thomas Hellström
2025-08-26 21:29   ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 11/16] drm/xe: Convert xe_dma_buf.c for exhaustive eviction Thomas Hellström
2025-08-26 21:16   ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 12/16] drm/xe: Rename ___xe_bo_create_locked() Thomas Hellström
2025-08-22  9:40 ` [PATCH v2 13/16] drm/xe: Convert xe_bo_create_pin_map_at() for exhaustive eviction Thomas Hellström
2025-08-26 21:27   ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 14/16] drm/xe: Convert xe_bo_create_pin_map() " Thomas Hellström
2025-08-26 21:52   ` Matthew Brost
2025-09-02 13:32     ` Thomas Hellström
2025-09-03 18:03       ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 15/16] drm/xe/sriov: Convert pf_provision_vf_lmem " Thomas Hellström
2025-08-22 19:35   ` Matthew Brost
2025-08-22  9:40 ` [PATCH v2 16/16] drm/xe: Convert pinned suspend eviction " Thomas Hellström
2025-08-26 22:08   ` Matthew Brost
2025-08-22 10:50 ` ✗ CI.checkpatch: warning for Driver-managed exhaustive eviction (rev2) Patchwork
2025-08-22 10:51 ` ✓ CI.KUnit: success " Patchwork
2025-08-22 11:31 ` ✓ Xe.CI.BAT: " Patchwork
2025-08-23  4:17 ` ✗ Xe.CI.Full: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cc739f6d910cf772a10b4304c205b971f15bf7dc.camel@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jani.nikula@intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=matthew.auld@intel.com \
    --cc=matthew.brost@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).