Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
	<intel-xe@lists.freedesktop.org>,
	Jonathan Cavitt <jonathan.cavitt@intel.com>
Subject: Re: [PATCH 2/3] drm/xe: Accumulate exec queue timestamp on destroy
Date: Mon, 28 Oct 2024 21:59:18 +0000	[thread overview]
Message-ID: <ZyAJNgDg6H4V7xAm@DUT025-TGLU.fm.intel.com> (raw)
In-Reply-To: <Zx/1BZ93nLSC1pFs@orsosgc001>

On Mon, Oct 28, 2024 at 01:33:09PM -0700, Umesh Nerlige Ramappa wrote:
> On Sat, Oct 26, 2024 at 12:08:47PM -0500, Lucas De Marchi wrote:
> > When the exec queue is destroyed, there's a race between a query to the
> > fdinfo and the exec queue value being updated: after the destroy ioctl,
> > if the fdinfo is queried before a call to guc_exec_queue_free_job(),
> > the wrong utilization is reported: it's not accumulated on the query
> > since the queue was removed from the array, and the value wasn't updated
> > yet by the free_job().
> > 
> > Explicitly accumulate the engine utilization so the right value is
> > visible after the ioctl returns.
> > 
> > Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2667
> > Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_exec_queue.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> > index d098d2dd1b2d..b15ca84b2422 100644
> > --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> > @@ -829,6 +829,14 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
> > 
> > 	xe_exec_queue_kill(q);
> > 
> > +	/*
> > +	 * After killing and destroying the exec queue, make sure userspace has
> > +	 * an updated view of the run ticks, regardless if this was the last
> > +	 * ref: since the exec queue is removed from xef->exec_queue.xa, a
> > +	 * query to fdinfo after this returns could not account for this load.
> > +	 */
> > +	xe_exec_queue_update_run_ticks(q);
> > +
> 
> At this point we may or may not have the updated LRC timestamp.
> 
> fwiu, xe_exec_queue_kill() is an async call. It will queue a work item
> that will disable guc scheduling on the context. Once the guc notifies
> the KMD that scheduling is disabled on this context, the KMD knows for
> sure that the context has switched out and the lrc timestamp is updated
> for this context. That may work well for contexts that switch frequently,
> but not for contexts that seldom switch or never destroy their exec
> queue.
> 
> I still believe calling it from job free is the right thing to do. As
> for the ~120 Hz updates, these are just memory updates, so I'm not sure
> it's a huge performance impact.
> 

I agree with Umesh here - unsure why it is a big deal to update the
busyness in free_job.

Also, the timestamp counters can wrap, and if they are not sampled
frequently enough a wrap will be missed.
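To illustrate the wrap concern: with unsigned 32-bit arithmetic, a single
wrap between two samples is still accounted for correctly, but a whole
counter period is silently lost whenever two or more wraps occur between
samples. A minimal userspace sketch of that accumulation pattern (the
function name and types are illustrative, not the actual xe helpers):

```c
#include <stdint.h>

/*
 * Illustrative sketch, not the actual xe code: fold a 32-bit hardware
 * timestamp counter into a 64-bit running total. The unsigned 32-bit
 * subtraction yields the correct delta even when new_ts has wrapped
 * past zero, provided at most one wrap occurred since the last sample.
 * Sample too infrequently and full counter periods go missing, which
 * is the argument for accumulating on every job free.
 */
static uint64_t accumulate_ticks(uint64_t total, uint32_t old_ts,
				 uint32_t new_ts)
{
	return total + (uint32_t)(new_ts - old_ts);
}
```

With this pattern, frequent accumulation (as in free_job) keeps each
sampled delta well under one counter period, so wraps are never missed.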

Matt

> If the ftrace buffer is getting filled up, we could throttle that.
> 
> Thanks,
> Umesh
> 
> > 	trace_xe_exec_queue_close(q);
> > 	xe_exec_queue_put(q);
> > 
> > -- 
> > 2.47.0
> > 


Thread overview: 28+ messages
2024-10-26 17:08 [PATCH 0/3] drm/xe: Fix races on fdinfo Lucas De Marchi
2024-10-26 17:08 ` [PATCH 1/3] drm/xe: Add trace to lrc timestamp update Lucas De Marchi
2024-10-28 14:37   ` Cavitt, Jonathan
2024-10-26 17:08 ` [PATCH 2/3] drm/xe: Accumulate exec queue timestamp on destroy Lucas De Marchi
2024-10-28 14:38   ` Cavitt, Jonathan
2024-10-28 20:33   ` Umesh Nerlige Ramappa
2024-10-28 21:59     ` Matthew Brost [this message]
2024-10-28 22:17       ` Cavitt, Jonathan
2024-10-28 23:05       ` Lucas De Marchi
2024-10-28 22:32     ` Lucas De Marchi
2024-10-29 17:27       ` Umesh Nerlige Ramappa
2024-10-29 17:58         ` Lucas De Marchi
2024-10-29 19:03           ` Umesh Nerlige Ramappa
2024-10-29 19:12           ` Lucas De Marchi
2024-10-29 19:31             ` Umesh Nerlige Ramappa
2024-10-29 19:53               ` Lucas De Marchi
2024-10-26 17:08 ` [PATCH 3/3] drm/xe: Stop accumulating LRC timestamp on job_free Lucas De Marchi
2024-10-28 17:23   ` Nirmoy Das
2024-10-28 20:29   ` Cavitt, Jonathan
2024-10-26 17:15 ` ✓ CI.Patch_applied: success for drm/xe: Fix races on fdinfo Patchwork
2024-10-26 17:16 ` ✗ CI.checkpatch: warning " Patchwork
2024-10-26 17:17 ` ✓ CI.KUnit: success " Patchwork
2024-10-26 17:29 ` ✓ CI.Build: " Patchwork
2024-10-26 17:31 ` ✓ CI.Hooks: " Patchwork
2024-10-26 17:32 ` ✓ CI.checksparse: " Patchwork
2024-10-26 17:57 ` ✓ CI.BAT: " Patchwork
2024-10-27 16:45 ` ✗ CI.FULL: failure " Patchwork
2024-10-29 20:02 ` ✗ CI.Patch_applied: failure for drm/xe: Fix races on fdinfo (rev2) Patchwork
