Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	<thomas.hellstrom@linux.intel.com>, <rodrigo.vivi@intel.com>,
	<lucas.demarchi@intel.com>
Subject: Re: [PATCH 1/3] drm/xe: Use ordered wq for preempt fence waiting
Date: Mon, 1 Apr 2024 19:37:47 +0000	[thread overview]
Message-ID: <ZgsNC3oP5x2SpNO/@DUT025-TGLU.fm.intel.com> (raw)
In-Reply-To: <ZgW+a4AV0vd+Klop@DUT025-TGLU.fm.intel.com>

On Thu, Mar 28, 2024 at 07:00:59PM +0000, Matthew Brost wrote:
> On Thu, Mar 28, 2024 at 11:56:48AM -0700, Matt Roper wrote:
> > On Thu, Mar 28, 2024 at 11:21:45AM -0700, Matthew Brost wrote:
> > > Preempt fences can sleep waiting for an exec queue suspend operation to
> > > complete. Use a device private work queue to avoid hogging system
> > > resources. Even though suspend operations can complete out-of-order, all
> > > suspend operations within a VM need to complete before the preempt
> > > rebind worker can start. With that, use a device private ordered wq for
> > > preempt fence waiting.
> > > 
> > > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_device.c        | 7 ++++++-
> > >  drivers/gpu/drm/xe/xe_device_types.h  | 3 +++
> > >  drivers/gpu/drm/xe/xe_preempt_fence.c | 2 +-
> > >  3 files changed, 10 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > index 01bd5ccf05ca..559bd72fde57 100644
> > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > @@ -226,6 +226,9 @@ static void xe_device_destroy(struct drm_device *dev, void *dummy)
> > >  {
> > >  	struct xe_device *xe = to_xe_device(dev);
> > >  
> > > +	if (xe->preempt_fence_wq)
> > > +		destroy_workqueue(xe->preempt_fence_wq);
> > > +
> > >  	if (xe->ordered_wq)
> > >  		destroy_workqueue(xe->ordered_wq);
> > >  
> > > @@ -291,9 +294,11 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> > >  	INIT_LIST_HEAD(&xe->pinned.external_vram);
> > >  	INIT_LIST_HEAD(&xe->pinned.evicted);
> > >  
> > > +	xe->preempt_fence_wq = alloc_ordered_workqueue("xe-preempt-fence-wq", 0);
> > >  	xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
> > >  	xe->unordered_wq = alloc_workqueue("xe-unordered-wq", 0, 0);
> > > -	if (!xe->ordered_wq || !xe->unordered_wq) {
> > > +	if (!xe->ordered_wq || !xe->unordered_wq ||
> > > +	    !xe->preempt_fence_wq) {
> > >  		drm_err(&xe->drm, "Failed to allocate xe workqueues\n");
> > >  		err = -ENOMEM;
> > >  		goto err;
> > 
> > Not the fault of this patch, but in cases where some of the workqueues
> > are allocated successfully, but at least one allocation fails, is the
> > error handling in this function correct?  It looks like it might be
> > leaking the ones that were actually allocated?  Same if
> > xe_display_create() fails later in the function.
> > 
> > 
> 
> I did notice this while posting; I put this one together quickly late
> last night. I suppose I can fix the work queue error handling in this
> patch too. Let's make sure the actual fix is good first, then I'll
> respin.
> 

This is actually handled by calling drmm_add_action_or_reset with
xe_device_destroy above.

Matt

> Matt B.
> 
> > Matt
> > 
> > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > > index 1df3dcc17d75..c710cec835a7 100644
> > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > @@ -363,6 +363,9 @@ struct xe_device {
> > >  	/** @ufence_wq: user fence wait queue */
> > >  	wait_queue_head_t ufence_wq;
> > >  
> > > +	/** @preempt_fence_wq: used to serialize preempt fences */
> > > +	struct workqueue_struct *preempt_fence_wq;
> > > +
> > >  	/** @ordered_wq: used to serialize compute mode resume */
> > >  	struct workqueue_struct *ordered_wq;
> > >  
> > > diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
> > > index 7bce2a332603..7d50c6e89d8e 100644
> > > --- a/drivers/gpu/drm/xe/xe_preempt_fence.c
> > > +++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
> > > @@ -49,7 +49,7 @@ static bool preempt_fence_enable_signaling(struct dma_fence *fence)
> > >  	struct xe_exec_queue *q = pfence->q;
> > >  
> > >  	pfence->error = q->ops->suspend(q);
> > > -	queue_work(system_unbound_wq, &pfence->preempt_work);
> > > +	queue_work(q->vm->xe->preempt_fence_wq, &pfence->preempt_work);
> > >  	return true;
> > >  }
> > >  
> > > -- 
> > > 2.34.1
> > > 
> > 
> > -- 
> > Matt Roper
> > Graphics Software Engineer
> > Linux GPU Platform Enablement
> > Intel Corporation

Thread overview: 19+ messages
2024-03-28 18:21 [PATCH 0/3] Rework work queue usage Matthew Brost
2024-03-28 18:21 ` [PATCH 1/3] drm/xe: Use ordered wq for preempt fence waiting Matthew Brost
2024-03-28 18:56   ` Matt Roper
2024-03-28 19:00     ` Matthew Brost
2024-04-01 19:37       ` Matthew Brost [this message]
2024-03-28 18:21 ` [PATCH 2/3] drm/xe: Use device, gt ordered work queues for resource cleanup Matthew Brost
2024-03-28 18:21 ` [PATCH 3/3] drm/xe: Use ordered WQ for TLB invalidation fences Matthew Brost
2024-03-28 19:02 ` [PATCH 0/3] Rework work queue usage Lucas De Marchi
2024-03-28 19:13   ` htejun
2024-03-28 19:30     ` Matthew Brost
2024-03-28 19:40       ` Tejun Heo
2024-03-29 16:52         ` Matthew Brost
2024-03-29  1:59 ` ✓ CI.Patch_applied: success for " Patchwork
2024-03-29  2:00 ` ✓ CI.checkpatch: " Patchwork
2024-03-29  2:00 ` ✓ CI.KUnit: " Patchwork
2024-03-29  2:12 ` ✓ CI.Build: " Patchwork
2024-03-29  2:15 ` ✓ CI.Hooks: " Patchwork
2024-03-29  2:16 ` ✓ CI.checksparse: " Patchwork
2024-03-29  2:53 ` ✗ CI.BAT: failure " Patchwork
