From: Matthew Brost <matthew.brost@intel.com>
To: "Zeng, Oak" <oak.zeng@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 05/13] drm/xe: Use xe_vma_ops to implement xe_vm_rebind
Date: Fri, 19 Apr 2024 04:14:19 +0000
Message-ID: <ZiHvm7C6I9n5XoG6@DUT025-TGLU.fm.intel.com>
In-Reply-To: <SA1PR11MB699140DE94F7D680AADD7A3E920D2@SA1PR11MB6991.namprd11.prod.outlook.com>
On Thu, Apr 18, 2024 at 09:43:06PM -0600, Zeng, Oak wrote:
>
>
> > -----Original Message-----
> > From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> > Matthew Brost
> > Sent: Wednesday, April 10, 2024 1:41 AM
> > To: intel-xe@lists.freedesktop.org
> > Cc: Brost, Matthew <matthew.brost@intel.com>
> > Subject: [PATCH 05/13] drm/xe: Use xe_vma_ops to implement xe_vm_rebind
> >
> > All page table updates are moving to an xe_vma_ops interface to
> > implement 1 job per VM bind IOCTL.
>
> Just want to make sure I understand it correctly: after this patch, a rebind is still many jobs (one job per VMA), right?
>
Yes. A follow-on series will convert this to one job for the entire rebind list.
>
> > Convert xe_vm_rebind to use an
> > xe_vma_ops-based interface.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_vm.c | 78 +++++++++++++++++++++++++++++++-------
> > 1 file changed, 64 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index 4cd485d5bc0a..9d82396cf5d5 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -811,37 +811,87 @@ int xe_vm_userptr_check_repin(struct xe_vm *vm)
> > list_empty_careful(&vm->userptr.invalidated)) ? 0 : -EAGAIN;
> > }
> >
> > -static struct dma_fence *
> > -xe_vm_bind_vma(struct xe_vma *vma, struct xe_exec_queue *q,
> > - struct xe_sync_entry *syncs, u32 num_syncs,
> > - bool first_op, bool last_op);
> > +static void xe_vm_populate_rebind(struct xe_vma_op *op, struct xe_vma *vma,
> > +				  u8 tile_mask)
> > +{
> > + INIT_LIST_HEAD(&op->link);
> > + op->base.op = DRM_GPUVA_OP_MAP;
> > + op->base.map.va.addr = vma->gpuva.va.addr;
> > + op->base.map.va.range = vma->gpuva.va.range;
> > + op->base.map.gem.obj = vma->gpuva.gem.obj;
> > + op->base.map.gem.offset = vma->gpuva.gem.offset;
> > + op->map.vma = vma;
> > + op->map.immediate = true;
> > + op->map.dumpable = vma->gpuva.flags & XE_VMA_DUMPABLE;
> > + op->map.is_null = xe_vma_is_null(vma);
> > +}
> > +
> > +static int xe_vm_ops_add_rebind(struct xe_vma_ops *vops, struct xe_vma *vma,
> > +				u8 tile_mask)
> > +{
> > + struct xe_vma_op *op;
> > +
> > + op = kzalloc(sizeof(*op), GFP_KERNEL);
> > + if (!op)
> > + return -ENOMEM;
> > +
> > + xe_vm_populate_rebind(op, vma, tile_mask);
> > + list_add_tail(&op->link, &vops->list);
> > +
> > + return 0;
> > +}
> > +
> > +static struct dma_fence *ops_execute(struct xe_vm *vm,
> > + struct xe_vma_ops *vops,
> > + bool cleanup);
> > +static void xe_vma_ops_init(struct xe_vma_ops *vops);
> >
> > int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
> > {
> > struct dma_fence *fence;
> > struct xe_vma *vma, *next;
> > + struct xe_vma_ops vops;
> > + struct xe_vma_op *op, *next_op;
> > + int err;
> >
> > lockdep_assert_held(&vm->lock);
> > - if (xe_vm_in_lr_mode(vm) && !rebind_worker)
> > + if ((xe_vm_in_lr_mode(vm) && !rebind_worker) ||
> > + list_empty(&vm->rebind_list))
> > return 0;
> >
> > + xe_vma_ops_init(&vops);
> > +
> > xe_vm_assert_held(vm);
> > - list_for_each_entry_safe(vma, next, &vm->rebind_list,
> > - combined_links.rebind) {
> > +	list_for_each_entry(vma, &vm->rebind_list, combined_links.rebind) {
> > xe_assert(vm->xe, vma->tile_present);
> >
> > - list_del_init(&vma->combined_links.rebind);
> > if (rebind_worker)
> > trace_xe_vma_rebind_worker(vma);
> > else
> > trace_xe_vma_rebind_exec(vma);
> > - fence = xe_vm_bind_vma(vma, NULL, NULL, 0, false, false);
> > - if (IS_ERR(fence))
> > - return PTR_ERR(fence);
> > +
> > + err = xe_vm_ops_add_rebind(&vops, vma,
> > + vma->tile_present);
> > + if (err)
> > + goto free_ops;
> > + }
> > +
> > + fence = ops_execute(vm, &vops, false);
> > + if (IS_ERR(fence)) {
> > + err = PTR_ERR(fence);
>
> So here, if the above ops_execute partially succeeds (some VMA binds fail, some succeed), the VMAs that were successfully bound are kept on the VM's rebind_list. Is this the correct behavior? Next time we will rebind them again....
>
The VM is killed if any VMA op fails, so it doesn't really matter; it is
also safe to issue a rebind twice.

In the follow-up series, once we have one job for the whole rebind list,
we can cope with errors without killing the VM. In that case we must
leave everything on the rebind list.

So this patch is correct both now and for the follow-on series.
Matt
>
> Oak
>
>
> > + } else {
> > dma_fence_put(fence);
> > + list_for_each_entry_safe(vma, next, &vm->rebind_list,
> > + combined_links.rebind)
> > + list_del_init(&vma->combined_links.rebind);
> > + }
> > +free_ops:
> > + list_for_each_entry_safe(op, next_op, &vops.list, link) {
> > + list_del(&op->link);
> > + kfree(op);
> > }
> >
> > - return 0;
> > + return err;
> > }
> >
> > static void xe_vma_free(struct xe_vma *vma)
> > @@ -2516,7 +2566,7 @@ static struct dma_fence *op_execute(struct xe_vm *vm, struct xe_vma *vma,
> > {
> > struct dma_fence *fence = NULL;
> >
> > - lockdep_assert_held_write(&vm->lock);
> > + lockdep_assert_held(&vm->lock);
> >
> > xe_vm_assert_held(vm);
> > xe_bo_assert_held(xe_vma_bo(vma));
> > @@ -2635,7 +2685,7 @@ xe_vma_op_execute(struct xe_vm *vm, struct xe_vma_op *op)
> > {
> > struct dma_fence *fence = ERR_PTR(-ENOMEM);
> >
> > - lockdep_assert_held_write(&vm->lock);
> > + lockdep_assert_held(&vm->lock);
> >
> > switch (op->base.op) {
> > case DRM_GPUVA_OP_MAP:
> > --
> > 2.34.1
>