Intel-XE Archive on lore.kernel.org
From: Francois Dugast <francois.dugast@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v7 07/13] drm/xe/hw_engine_group: Add helper to wait for dma fence jobs
Date: Thu, 8 Aug 2024 18:51:04 +0200
Message-ID: <ZrT3eEzISmlL2ScQ@fdugast-desk>
In-Reply-To: <ZrQ2Dsv5wx/7ryFI@DUT025-TGLU.fm.intel.com>

On Thu, Aug 08, 2024 at 03:05:50AM +0000, Matthew Brost wrote:
> On Wed, Aug 07, 2024 at 06:23:36PM +0200, Francois Dugast wrote:
> > This is a required feature for faulting long running jobs not to be
> > submitted while dma fence jobs are running on the hw engine group.
> > 
> > v2: Switch to lockdep_assert_held_write in worker, get a proper reference
> >     for the last fence (Matt Brost)
> > 
> > Signed-off-by: Francois Dugast <francois.dugast@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_hw_engine_group.c | 33 +++++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_hw_engine_group.c b/drivers/gpu/drm/xe/xe_hw_engine_group.c
> > index 3f74ff577a4c..955451960a3d 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_engine_group.c
> > +++ b/drivers/gpu/drm/xe/xe_hw_engine_group.c
> > @@ -180,3 +180,36 @@ static void xe_hw_engine_group_suspend_faulting_lr_jobs(struct xe_hw_engine_grou
> >  		q->ops->suspend_wait(q);
> >  	}
> >  }
> > +
> > +/**
> > + * xe_hw_engine_group_wait_for_dma_fence_jobs() - Wait for dma fence jobs to complete
> > + * @group: The hw engine group
> > + *
> > + * This function is not meant to be called directly from a user IOCTL as dma_fence_wait()
> > + * is not interruptible.
> > + *
> > + * Return: 0 on success,
> > + *	   -ETIME if waiting for one job failed
> > + */
> > +static int xe_hw_engine_group_wait_for_dma_fence_jobs(struct xe_hw_engine_group *group)
> > +{
> > +	long timeout;
> > +	struct xe_exec_queue *q;
> > +	struct dma_fence *fence;
> > +
> > +	lockdep_assert_held_write(&group->mode_sem);
> > +
> > +	list_for_each_entry(q, &group->exec_queue_list, hw_engine_group_link) {
> > +		if (xe_vm_in_lr_mode(q->vm))
> > +			continue;
> > +
> > +		fence = xe_exec_queue_last_fence_get_for_resume(q, q->vm);
> > +		timeout = dma_fence_wait(fence, false);
> > +		xe_exec_queue_last_fence_put_for_resume(q, q->vm);
> 
> Missed this earlier.
> 
> s/xe_exec_queue_last_fence_put_for_resume/dma_fence_put

Thanks for catching this, will do.

> 
> xe_exec_queue_last_fence_get_for_resume gets ref to a fence which can be
> dropped via dma_fence_put. I think this might be the source of CI
> failures [1] [2] too. But neither DG2 or ADL should be triggering this
> code path unless something else is going wrong. Can you look into these
> CI failures too?
> 
> Matt
> 
> [1] https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-136192v7/shard-dg2-433/igt@xe_module_load@reload.html
> [2] https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-136192v7/shard-adlp-1/igt@xe_module_load@unload.html

It seems there is currently an issue causing xe_module_load to fail
independently of this series; see for example [3] #rev2 and #rev3.

Francois

[3] https://patchwork.freedesktop.org/series/136891/

> 
> > +
> > +		if (timeout < 0)
> > +			return -ETIME;
> > +	}
> > +
> > +	return 0;
> > +}
> > -- 
> > 2.43.0
> > 

