From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Brost <matthew.brost@intel.com>, intel-xe@lists.freedesktop.org
Cc: fei.yang@intel.com, rodrigo.vivi@intel.com
Subject: Re: [PATCH] drm/xe: Invalidate userptr VMA on page pin fault
Date: Mon, 11 Mar 2024 14:29:26 +0100
Message-ID: <de53c4465285bb4cc25ad4ebd23a950ce1dd12eb.camel@linux.intel.com>
In-Reply-To: <8203b1a23e65a876d02a5929a8e489eb9ad387de.camel@linux.intel.com>

On Mon, 2024-03-11 at 11:55 +0100, Thomas Hellström wrote:
> Hi, Matthew
> 
> On Fri, 2024-03-08 at 13:37 -0800, Matthew Brost wrote:
> > Rather than return an error to the user or ban the VM when userptr
> > VMA
> > page pin fails with -EFAULT, invalidate VMA mappings. This supports
> > the
> > UMD use case of freeing userptr while still having bindings.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_gt_pagefault.c |  4 ++--
> >  drivers/gpu/drm/xe/xe_trace.h        |  2 +-
> >  drivers/gpu/drm/xe/xe_vm.c           | 20 +++++++++++++-------
> >  drivers/gpu/drm/xe/xe_vm_types.h     |  7 ++-----
> >  4 files changed, 18 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > index 73c535193a98..241c294270d9 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > @@ -69,7 +69,7 @@ static bool access_is_atomic(enum access_type
> > access_type)
> >  static bool vma_is_valid(struct xe_tile *tile, struct xe_vma *vma)
> >  {
> >  	return BIT(tile->id) & vma->tile_present &&
> > -		!(BIT(tile->id) & vma->usm.tile_invalidated);
> > +		!(BIT(tile->id) & vma->tile_invalidated);
> >  }
> >  
> >  static bool vma_matches(struct xe_vma *vma, u64 page_addr)
> > @@ -226,7 +226,7 @@ static int handle_pagefault(struct xe_gt *gt,
> > struct pagefault *pf)
> >  
> >  	if (xe_vma_is_userptr(vma))
> >  		ret =
> > xe_vma_userptr_check_repin(to_userptr_vma(vma));
> > -	vma->usm.tile_invalidated &= ~BIT(tile->id);
> > +	vma->tile_invalidated &= ~BIT(tile->id);
> >  
> >  unlock_dma_resv:
> >  	drm_exec_fini(&exec);
> > diff --git a/drivers/gpu/drm/xe/xe_trace.h
> > b/drivers/gpu/drm/xe/xe_trace.h
> > index 4ddc55527f9a..846f14507d5f 100644
> > --- a/drivers/gpu/drm/xe/xe_trace.h
> > +++ b/drivers/gpu/drm/xe/xe_trace.h
> > @@ -468,7 +468,7 @@ DEFINE_EVENT(xe_vma, xe_vma_userptr_invalidate,
> >  	     TP_ARGS(vma)
> >  );
> >  
> > -DEFINE_EVENT(xe_vma, xe_vma_usm_invalidate,
> > +DEFINE_EVENT(xe_vma, xe_vma_invalidate,
> >  	     TP_PROTO(struct xe_vma *vma),
> >  	     TP_ARGS(vma)
> >  );
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c
> > b/drivers/gpu/drm/xe/xe_vm.c
> > index 643b3701a738..9a19044f7ef6 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -724,11 +724,18 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
> >  	list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
> >  				 userptr.repin_link) {
> >  		err = xe_vma_userptr_pin_pages(uvma);
> > -		if (err < 0)
> > -			return err;
> > -
> >  		list_del_init(&uvma->userptr.repin_link);
> > -		list_move_tail(&uvma->vma.combined_links.rebind,
> > &vm->rebind_list);
> > +		if (err == -EFAULT) {
> > +			err = xe_vm_invalidate_vma(&uvma->vma);
> 
> I think we need to check for FAULT_MODE here. If we hit this path in
> FAULT_MODE, we already have an invalid gpu access and can kill the
> VM.
> 
> In preempt-fence mode, we should probably be calling
> xe_vm_unbind_vma(), because xe_vm_invalidate_vma() isn't safe to call
> outside of the mmu_notifier, or while BOOKKEEP fences are still
> pending; see the asserts in that function.


Actually, xe_vm_invalidate_vma() would probably work if we grabbed the
vm resv, waited for the bookkeep fences first, and updated the asserts.

But then xe_vm_unbind_vma() might still be better since we also clean
up the page-tables.
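
As a rough illustration of the ordering discussed above, here is a small
userspace C model, not actual xe code. Every name in it (model_vm,
model_vma, model_handle_pin_result() and so on) is hypothetical. It only
mimics the idea that in fault mode a failed pin can kill the VM, while in
preempt-fence mode the invalidation would run with the vm resv held and
after the bookkeep fences have been waited on, so that the asserts hold.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the real xe structures. */
enum model_vm_mode { MODEL_FAULT_MODE, MODEL_PREEMPT_FENCE_MODE };

struct model_vm {
	enum model_vm_mode mode;
	bool resv_locked;            /* models holding the vm resv */
	int pending_bookkeep_fences; /* models unsignaled BOOKKEEP fences */
	bool banned;                 /* models a killed VM */
};

struct model_vma {
	struct model_vm *vm;
	bool bound;       /* models page-table entries still being present */
	bool invalidated; /* models vma->tile_invalidated being set */
};

static void model_vm_lock_resv(struct model_vm *vm)
{
	vm->resv_locked = true;
}

static void model_vm_unlock_resv(struct model_vm *vm)
{
	vm->resv_locked = false;
}

/* Models waiting for all bookkeep fences; only legal with the resv held. */
static void model_wait_bookkeep(struct model_vm *vm)
{
	assert(vm->resv_locked);
	vm->pending_bookkeep_fences = 0;
}

/* Models the invalidation with the "updated asserts" idea applied. */
static int model_invalidate_vma(struct model_vma *vma)
{
	assert(vma->vm->resv_locked);
	assert(vma->vm->pending_bookkeep_fences == 0);
	vma->invalidated = true; /* note: bound stays true, nothing is freed */
	return 0;
}

/* Decide what to do with the result of a (modelled) userptr page pin. */
static int model_handle_pin_result(struct model_vma *vma, int pin_err)
{
	struct model_vm *vm = vma->vm;
	int err;

	if (pin_err != -EFAULT)
		return pin_err < 0 ? pin_err : 0;

	if (vm->mode == MODEL_FAULT_MODE) {
		/* Already an invalid GPU access: kill the VM. */
		vm->banned = true;
		return -EFAULT;
	}

	/* Preempt-fence mode: lock, wait for fences, then invalidate. */
	model_vm_lock_resv(vm);
	model_wait_bookkeep(vm);
	err = model_invalidate_vma(vma);
	model_vm_unlock_resv(vm);

	return err;
}
```

The model leaves `bound` set after invalidation, which is the point made
above about xe_vm_unbind_vma() possibly being the better choice, since an
unbind would also clean up the page-tables.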

/Thomas


> 
> > +			if (err)
> > +				return err;
> > +		} else {
> > +			if (err < 0)
> > +				return err;
> > +
> > +			list_move_tail(&uvma->vma.combined_links.rebind,
> > +				       &vm->rebind_list);
> > +		}
> >  	}
> >  
> >  	return 0;
> > @@ -3214,9 +3221,8 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
> >  	u8 id;
> >  	int ret;
> >  
> > -	xe_assert(xe, xe_vm_in_fault_mode(xe_vma_vm(vma)));
> >  	xe_assert(xe, !xe_vma_is_null(vma));
> > -	trace_xe_vma_usm_invalidate(vma);
> > +	trace_xe_vma_invalidate(vma);
> >  
> >  	/* Check that we don't race with page-table updates */
> >  	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
> > @@ -3254,7 +3260,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
> >  		}
> >  	}
> >  
> > -	vma->usm.tile_invalidated = vma->tile_mask;
> > +	vma->tile_invalidated = vma->tile_mask;
> >  
> >  	return 0;
> >  }
> > diff --git a/drivers/gpu/drm/xe/xe_vm_types.h
> > b/drivers/gpu/drm/xe/xe_vm_types.h
> > index 79b5cab57711..ae5fb565f6bf 100644
> > --- a/drivers/gpu/drm/xe/xe_vm_types.h
> > +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> > @@ -84,11 +84,8 @@ struct xe_vma {
> >  		struct work_struct destroy_work;
> >  	};
> >  
> > -	/** @usm: unified shared memory state */
> > -	struct {
> > -		/** @tile_invalidated: VMA has been invalidated */
> > -		u8 tile_invalidated;
> > -	} usm;
> > +	/** @tile_invalidated: VMA has been invalidated */
> > +	u8 tile_invalidated;
> 
> Add a comment in the commit message about removing the usm struct?
> /Thomas
> 
> 
> >  
> >  	/** @tile_mask: Tile mask of where to create binding for
> > this VMA */
> >  	u8 tile_mask;
> 

