Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org,
	Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>,
	 Jonathan Cavitt <jonathan.cavitt@intel.com>
Subject: Re: [PATCH] drm/xe/oa: Disallow OA from being enabled on active exec_queue's
Date: Tue, 19 Nov 2024 13:08:49 -0800	[thread overview]
Message-ID: <851pz76ie6.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <ZzykYzGwr1GXSqoV@lstrano-desk.jf.intel.com>

On Tue, 19 Nov 2024 06:44:51 -0800, Matthew Brost wrote:
>
> On Mon, Nov 18, 2024 at 05:32:56PM -0800, Ashutosh Dixit wrote:
> > Enabling OA on an exec_queue toggles the OAC_CONTEXT_ENABLE bit in
> > CTXT_SR_CTL register. Toggling this bit changes the size and layout of the
> > underlying HW context image. Therefore, enabling OA on an already active
> > exec_queue (as currently implemented in xe) is an invalid operation and can
> > cause hangs. Therefore, disallow OA from being enabled on active
> > exec_queue's (here, by active we mean a context on which submissions have
> > previously happened).
> >
>
> This is something we will need to keep on eye on then because in various
> experimental code I've played around enabling exec queues upon creation.
> e.g., If we want to allocate a doorbell. I seem to recall Habana wanting
> to enable exec queues upon creation too.

The real requirement here is that HW context image should not have been
loaded before OA is enabled on the exec queue. That is what happens today
in the ENABLED state, correct, when user space submissions start?

If operations such as doorbell are only management requests to GuC (which
don't cause HW context image to be loaded) and if we can name a new state
when the exec queue is handed off to userspace for starting submissions, we
should be able to stay with this approach.

> Just curious if it was ever explored having exec queue creation
> extension which enables OA? It seems like this is something we may need
> at some point if our exec queue creation semantics change of course
> being careful to not break existing flows.

Yeah I did think of it but didn't want to change the uapi.

Also, a different implementation is possible which avoids this resizing of
the context image altogether. It requires the kernel OA code submit its
submissions on the user exec queue (and use that exec queue's VM, currently
OA code uses a kernel exec queue). There are some reasons I don't want to
implement that just yet, but worst case, we can do that if absolutely
needed.

Thanks.
--
Ashutosh

> > Transition from 1 -> 0 for this bit was disallowed in
> > '0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream
> > close")'. Here we disallow the 0 -> 1 transition on active contexts.
> >
> > v2: Don't export exec_queue_enabled, define new xe_exec_queue_op (M Brost)
> >     Directly check OAC_CONTEXT_ENABLE bit from context image (J Cavitt)
> >
> > Bspec: 60314
> > Fixes: 2f4a730fcd2d ("drm/xe/oa: Add OAR support")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_exec_queue_types.h |  2 ++
> >  drivers/gpu/drm/xe/xe_guc_submit.c       |  1 +
> >  drivers/gpu/drm/xe/xe_oa.c               | 13 +++++++++++++
> >  3 files changed, 16 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > index 1158b6062a6cd..b88d617c37b33 100644
> > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > @@ -184,6 +184,8 @@ struct xe_exec_queue_ops {
> >	void (*resume)(struct xe_exec_queue *q);
> >	/** @reset_status: check exec queue reset status */
> >	bool (*reset_status)(struct xe_exec_queue *q);
> > +	/** @enabled: check if exec queue is in enabled state */
> > +	bool (*enabled)(struct xe_exec_queue *q);
> >  };
> >
> >  #endif
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index f9ecee5364d82..b9b9cdb6f768b 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -1660,6 +1660,7 @@ static const struct xe_exec_queue_ops guc_exec_queue_ops = {
> >	.suspend_wait = guc_exec_queue_suspend_wait,
> >	.resume = guc_exec_queue_resume,
> >	.reset_status = guc_exec_queue_reset_status,
> > +	.enabled = exec_queue_enabled,
> >  };
> >
> >  static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> > index 8dd55798ab312..4a7440c40978c 100644
> > --- a/drivers/gpu/drm/xe/xe_oa.c
> > +++ b/drivers/gpu/drm/xe/xe_oa.c
> > @@ -2066,6 +2066,19 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f
> >		if (XE_IOCTL_DBG(oa->xe, !param.exec_q))
> >			return -ENOENT;
> >
> > +		/*
> > +		 * Disallow OA from being enabled on active exec_queue's. Enabling OA sets the
> > +		 * OAC_CONTEXT_ENABLE bit in CTXT_SR_CTL register. Toggling the bit changes
> > +		 * the size and layout of the underlying HW context image and can cause hangs.
> > +		 */
> > +		if (XE_IOCTL_DBG(oa->xe,
> > +				 !(xe_lrc_read_ctx_reg(param.exec_q->lrc[0],
> > +				       CTX_CONTEXT_CONTROL) & CTX_CTRL_OAC_CONTEXT_ENABLE) &&
> > +				 param.exec_q->ops->enabled(param.exec_q))) {
> > +			ret = -EADDRINUSE;
> > +			goto err_exec_q;
> > +		}
> > +
> >		if (param.exec_q->width > 1)
> >			drm_dbg(&oa->xe->drm, "exec_q->width > 1, programming only exec_q->lrc[0]\n");
> >	}
> > --
> > 2.41.0
> >

  reply	other threads:[~2024-11-19 21:08 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-19  1:32 [PATCH] drm/xe/oa: Disallow OA from being enabled on active exec_queue's Ashutosh Dixit
2024-11-19  2:10 ` ✓ CI.Patch_applied: success for " Patchwork
2024-11-19  2:10 ` ✓ CI.checkpatch: " Patchwork
2024-11-19  2:11 ` ✓ CI.KUnit: " Patchwork
2024-11-19  2:29 ` ✓ CI.Build: " Patchwork
2024-11-19  2:32 ` ✓ CI.Hooks: " Patchwork
2024-11-19  2:33 ` ✓ CI.checksparse: " Patchwork
2024-11-19  3:00 ` ✓ CI.BAT: " Patchwork
2024-11-19 14:44 ` [PATCH] " Matthew Brost
2024-11-19 21:08   ` Dixit, Ashutosh [this message]
2024-11-19 22:09     ` Matthew Brost
2024-11-21  4:22       ` Dixit, Ashutosh
2024-11-21  5:16         ` Matthew Brost
2024-11-21 21:04           ` Dixit, Ashutosh
2024-11-21 22:06     ` Umesh Nerlige Ramappa
2024-11-22  2:38       ` Dixit, Ashutosh
2024-11-22 17:49         ` Dixit, Ashutosh
2024-11-19 15:19 ` Cavitt, Jonathan
2024-11-19 15:46 ` ✗ CI.FULL: failure for " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=851pz76ie6.wl-ashutosh.dixit@intel.com \
    --to=ashutosh.dixit@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jonathan.cavitt@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox