Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <lucas.demarchi@intel.com>
Subject: Re: [PATCH 2/2] drm/xe: Record utilization before destroying the exec queue
Date: Tue, 25 Jun 2024 17:41:01 +0000	[thread overview]
Message-ID: <ZnsBLb8x/Ma+JIxl@DUT025-TGLU.fm.intel.com> (raw)
In-Reply-To: <Znr9Mk8r1e2gvUja@DUT025-TGLU.fm.intel.com>

On Tue, Jun 25, 2024 at 05:24:02PM +0000, Matthew Brost wrote:
> On Wed, Jun 26, 2024 at 12:58:12AM +0800, Umesh Nerlige Ramappa wrote:
> > Current code captures utilization at the exec queue level whenever a job
> > is completed and then accumulates it into the xe file stats whenever the
> > user queries for per client engine utilization. There is a case where
> > utilization may be lost if the exec queue is destroyed before the user
> > queries the utilization. To overcome that, record the utilization when
> > the exec queue is destroyed.
> > 
> > To do so
> > 
> > 1) Wait for release of all other references to the exec queue. The wait
> >    uses the same timeout as the job scheduling timeout. On timeout, only
> >    a debug message is printed out since this is just a best effort to
> >    capture the utilization prior to destroying the queue.
> > 2) Before releasing the last reference in xe_exec_queue_destroy_ioctl(),
> >    record the utilization in the xe file stats.
> > 
> > Fixes: ce62827bc294 ("drm/xe: Do not access xe file when updating exec queue run_ticks")
> > Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_exec_queue.c       | 11 +++++++++++
> >  drivers/gpu/drm/xe/xe_exec_queue_types.h |  2 ++
> >  2 files changed, 13 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> > index 4d90a16745d2..f1028eaf2d7f 100644
> > --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> > @@ -69,6 +69,7 @@ static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe,
> >  	q->ops = gt->exec_queue_ops;
> >  	INIT_LIST_HEAD(&q->lr.link);
> >  	INIT_LIST_HEAD(&q->multi_gt_link);
> > +	init_waitqueue_head(&q->wq);
> >  
> >  	q->sched_props.timeslice_us = hwe->eclass->sched_props.timeslice_us;
> >  	q->sched_props.preempt_timeout_us =
> > @@ -825,6 +826,7 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
> >  	struct xe_file *xef = to_xe_file(file);
> >  	struct drm_xe_exec_queue_destroy *args = data;
> >  	struct xe_exec_queue *q;
> > +	int ret;
> >  
> >  	if (XE_IOCTL_DBG(xe, args->pad) ||
> >  	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
> > @@ -838,6 +840,15 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
> >  
> >  	xe_exec_queue_kill(q);
> >  
> > +	ret = wait_event_timeout(q->wq, kref_read(&q->refcount) == 1,
> > +			(q->sched_props.job_timeout_ms/1000) * HZ);
> > +	if (!ret)
> > +		drm_dbg(&xe->drm, "Timedout waiting for exec queue run ticks update\n");
> > +
> 
> I can't say I love adding a wait to the destroy IOCTL just to accurately
> report some stats. If we can't preempt this exec queue, the kill flow
> can take nearly 0.7s, which would be pretty bad to block in the IOCTL.

Oops, my timing calculation is wrong here as I forgot that we reduce the
preemption timeout to 1ms upon kill. But my point still stands: from an arch
perspective, blocking in an IOCTL to collect stats doesn't seem right to me.
Maybe others have a different opinion?

> Even the common case isn't great either...
> 
> Beyond that, this code breaks the assumption that kref_read(&q->refcount) == 1
> means all jobs are done, e.g. the wedging code (mode == 2) takes an extra
> ref to the exec queue, so this IOCTL will hang forever even we wedge the
> device.

s/even we/if we/

Matt

> 
> Could this be added to the exec queue backend code that handles the final
> destroy? e.g. Add it to __guc_exec_queue_fini_async via
> xe_exec_queue_fini? Or does that not work because we don't have the xef?
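
Something like this completely untested sketch is roughly what I have in
mind, assuming we keep an xef pointer (plus some form of reference) on the
queue until the final fini - which is exactly the access pattern commit
ce62827bc294 stepped away from, so treat the q->xef field and the helper
name below as hypothetical:

	/* Hypothetical helper, called from the final teardown path,
	 * e.g. __guc_exec_queue_fini_async -> xe_exec_queue_fini.
	 */
	static void xe_exec_queue_record_run_ticks(struct xe_exec_queue *q)
	{
		struct xe_file *xef = q->xef;	/* assumes queue keeps xef valid */

		if (!xef)
			return;

		mutex_lock(&xef->exec_queue.lock);
		xef->run_ticks[q->class] += xe_exec_queue_delta_run_ticks(q);
		mutex_unlock(&xef->exec_queue.lock);
	}

That would keep the accounting out of the destroy IOCTL path entirely.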
> 
> Regardless of whether my suggestion works, surely we can do something to
> avoid waiting in this IOCTL. I suggest exploring another solution.
> 
> Matt
> 
> > +	mutex_lock(&xef->exec_queue.lock);
> > +	xef->run_ticks[q->class] += xe_exec_queue_delta_run_ticks(q);
> > +	mutex_unlock(&xef->exec_queue.lock);
> > +
> >  	trace_xe_exec_queue_close(q);
> >  	xe_exec_queue_put(q);
> >  
> > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > index 201588ec33c3..2ae4221d2f61 100644
> > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > @@ -143,6 +143,8 @@ struct xe_exec_queue {
> >  	u64 old_run_ticks;
> >  	/** @run_ticks: hw engine class run time in ticks for this exec queue */
> >  	u64 run_ticks;
> > +	/** @wq: wait queue to wait for cleanup */
> > +	wait_queue_head_t wq;
> >  	/** @lrc: logical ring context for this exec queue */
> >  	struct xe_lrc *lrc[];
> >  };
> > -- 
> > 2.34.1
> > 


Thread overview: 19+ messages
2024-06-25 16:58 [PATCH 0/2] Capture per client stats when destroying an exec queue Umesh Nerlige Ramappa
2024-06-25 16:58 ` [PATCH 1/2] drm/xe: Use a helper to get delta run ticks from the " Umesh Nerlige Ramappa
2024-06-25 17:13   ` Matthew Brost
2024-06-25 16:58 ` [PATCH 2/2] drm/xe: Record utilization before destroying " Umesh Nerlige Ramappa
2024-06-25 17:24   ` Matthew Brost
2024-06-25 17:41     ` Matthew Brost [this message]
2024-06-26  3:21       ` Umesh Nerlige Ramappa
2024-06-27  6:01         ` Matthew Brost
2024-06-27 20:58           ` Lucas De Marchi
2024-06-29  0:47             ` Matthew Brost
2024-06-27 20:45       ` Lucas De Marchi
2024-06-25 17:03 ` ✓ CI.Patch_applied: success for Capture per client stats when destroying an " Patchwork
2024-06-25 17:03 ` ✗ CI.checkpatch: warning " Patchwork
2024-06-25 17:04 ` ✓ CI.KUnit: success " Patchwork
2024-06-25 17:16 ` ✓ CI.Build: " Patchwork
2024-06-25 17:18 ` ✗ CI.Hooks: failure " Patchwork
2024-06-25 17:20 ` ✓ CI.checksparse: success " Patchwork
2024-06-25 17:49 ` ✗ CI.BAT: failure " Patchwork
2024-06-25 21:06 ` ✗ CI.FULL: " Patchwork
