From: Matthew Brost <matthew.brost@intel.com>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: "Cavitt, Jonathan" <jonathan.cavitt@intel.com>,
"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Francois Dugast" <francois.dugast@intel.com>,
"Souza, Jose" <jose.souza@intel.com>
Subject: Re: [PATCH] drm/xe: Drop EXEC_QUEUE_FLAG_BANNED
Date: Thu, 6 Jun 2024 02:47:13 +0000 [thread overview]
Message-ID: <ZmEjMXSuOL36tiQ0@DUT025-TGLU.fm.intel.com> (raw)
In-Reply-To: <ZmDg8mpQ1J0lF9D7@intel.com>
On Wed, Jun 05, 2024 at 06:04:34PM -0400, Rodrigo Vivi wrote:
> On Wed, Jun 05, 2024 at 09:01:06PM +0000, Cavitt, Jonathan wrote:
> > -----Original Message-----
> > From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of Matthew Brost
> > Sent: Tuesday, June 4, 2024 11:47 AM
> > To: intel-xe@lists.freedesktop.org
> > Cc: Brost, Matthew <matthew.brost@intel.com>
> > Subject: [PATCH] drm/xe: Drop EXEC_QUEUE_FLAG_BANNED
>
> please prefer drm/xe/uapi: when the changes impact uapi.
>
Got it. Will be more careful going forward.
> > >
> > > Clean up laying violation of setting q->flags EXEC_QUEUE_FLAG_BANNED bit
> > > in GuC backend. Move banned to GuC owned bit and report banned status to
> > > upper layers via reset_status vfunc. This is a slight change in behavior
> > > as reset_status returns true if wedged or killed bits set too, but in
> > > all of these cases submission to queue is no longer allowed.
>
> This is an uapi change that is in use by mesa and we cannot regress.
Agree.
> We need to ensure that no user space is really using that before we can
> apply anything like this.
>
I don't think affects the uAPI. The killed bit is only set after
removing exec queue from the FD or the FD closing. In either case, the
exec queue is not accessible to the user by the time this bit is set.
The wedged bit is slight could be a change behavior but I'd argue this
is actually fixing a bug. If we set the wedged bit, we skip setting the
banned bit. IMO this fixing bug in the wedged series - when we wedge an
exec queue it should not longer be available for the user to used.
Now that I'm typing, I realize beyond that all IOCTLs return -ECANCELED
once the device is wedged so this change likely isn't even visible to the
user.
In any of these cases - killed, wedged, or banned the exec queue is no
longer available for use by a user too.
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
>
Again will be more diligent about Cc stakeholders on uAPI changes. Let's
get everyones input here.
Matt
> > >
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_exec.c | 2 +-
> > > drivers/gpu/drm/xe/xe_exec_queue.c | 2 +-
> > > drivers/gpu/drm/xe/xe_exec_queue_types.h | 12 +++++-------
> > > drivers/gpu/drm/xe/xe_guc_submit.c | 10 ++++++----
> > > 4 files changed, 13 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
> > > index 97eeb973e897..4cf6c6ab4866 100644
> > > --- a/drivers/gpu/drm/xe/xe_exec.c
> > > +++ b/drivers/gpu/drm/xe/xe_exec.c
> > > @@ -141,7 +141,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> > > q->width != args->num_batch_buffer))
> > > return -EINVAL;
> > >
> > > - if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_BANNED)) {
> > > + if (XE_IOCTL_DBG(xe, q->ops->reset_status(q))) {
> > > err = -ECANCELED;
> > > goto err_exec_queue;
> > > }
> > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> > > index 27215075c799..cf45df0328da 100644
> > > --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> > > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> > > @@ -677,7 +677,7 @@ int xe_exec_queue_get_property_ioctl(struct drm_device *dev, void *data,
> > >
> > > switch (args->property) {
> > > case DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN:
> > > - args->value = !!(q->flags & EXEC_QUEUE_FLAG_BANNED);
> > > + args->value = q->ops->reset_status(q);
> >
> > LGTM.
> >
> > Maybe migrating over to using q->ops->reset_status could be done later, and
> > instead we could just check the EXEC_QUEUE_STATE_BANNED flag directly for
> > now, saving the change to reset_status for a separate patch. That way, we'd
> > have more room to justify this change in the commit message separately from
> > the one made to the EXEC_QUEUE_FLAG_BANNED. But it's not strictly
> > necessary, IMO.
> >
> > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > -Jonathan Cavitt
> >
> > > ret = 0;
> > > break;
> > > default:
> > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > index 18d8b2a60928..f0c5f82ce7e3 100644
> > > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > @@ -70,18 +70,16 @@ struct xe_exec_queue {
> > > */
> > > struct dma_fence *last_fence;
> > >
> > > -/* queue no longer allowed to submit */
> > > -#define EXEC_QUEUE_FLAG_BANNED BIT(0)
> > > /* queue used for kernel submission only */
> > > -#define EXEC_QUEUE_FLAG_KERNEL BIT(1)
> > > +#define EXEC_QUEUE_FLAG_KERNEL BIT(0)
> > > /* kernel engine only destroyed at driver unload */
> > > -#define EXEC_QUEUE_FLAG_PERMANENT BIT(2)
> > > +#define EXEC_QUEUE_FLAG_PERMANENT BIT(1)
> > > /* for VM jobs. Caller needs to hold rpm ref when creating queue with this flag */
> > > -#define EXEC_QUEUE_FLAG_VM BIT(3)
> > > +#define EXEC_QUEUE_FLAG_VM BIT(2)
> > > /* child of VM queue for multi-tile VM jobs */
> > > -#define EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD BIT(4)
> > > +#define EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD BIT(3)
> > > /* kernel exec_queue only, set priority to highest level */
> > > -#define EXEC_QUEUE_FLAG_HIGH_PRIORITY BIT(5)
> > > +#define EXEC_QUEUE_FLAG_HIGH_PRIORITY BIT(4)
> > >
> > > /**
> > > * @flags: flags for this exec queue, should statically setup aside from ban
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > index 47aab04cf34f..4464ba337d12 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > @@ -61,6 +61,7 @@ exec_queue_to_guc(struct xe_exec_queue *q)
> > > #define EXEC_QUEUE_STATE_RESET (1 << 6)
> > > #define EXEC_QUEUE_STATE_KILLED (1 << 7)
> > > #define EXEC_QUEUE_STATE_WEDGED (1 << 8)
> > > +#define EXEC_QUEUE_STATE_BANNED (1 << 9)
> > >
> > > static bool exec_queue_registered(struct xe_exec_queue *q)
> > > {
> > > @@ -134,12 +135,12 @@ static void set_exec_queue_destroyed(struct xe_exec_queue *q)
> > >
> > > static bool exec_queue_banned(struct xe_exec_queue *q)
> > > {
> > > - return (q->flags & EXEC_QUEUE_FLAG_BANNED);
> > > + return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_BANNED;
> > > }
> > >
> > > static void set_exec_queue_banned(struct xe_exec_queue *q)
> > > {
> > > - q->flags |= EXEC_QUEUE_FLAG_BANNED;
> > > + atomic_or(EXEC_QUEUE_STATE_BANNED, &q->guc->state);
> > > }
> > >
> > > static bool exec_queue_suspended(struct xe_exec_queue *q)
> > > @@ -189,8 +190,9 @@ static void set_exec_queue_wedged(struct xe_exec_queue *q)
> > >
> > > static bool exec_queue_killed_or_banned_or_wedged(struct xe_exec_queue *q)
> > > {
> > > - return exec_queue_banned(q) || (atomic_read(&q->guc->state) &
> > > - (EXEC_QUEUE_STATE_WEDGED | EXEC_QUEUE_STATE_KILLED));
> > > + return (atomic_read(&q->guc->state) &
> > > + (EXEC_QUEUE_STATE_WEDGED | EXEC_QUEUE_STATE_KILLED |
> > > + EXEC_QUEUE_STATE_BANNED));
> > > }
> > >
> > > #ifdef CONFIG_PROVE_LOCKING
> > > --
> > > 2.34.1
> > >
> > >
next prev parent reply other threads:[~2024-06-06 2:48 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-04 18:47 [PATCH] drm/xe: Drop EXEC_QUEUE_FLAG_BANNED Matthew Brost
2024-06-05 4:16 ` ✓ CI.Patch_applied: success for " Patchwork
2024-06-05 4:16 ` ✓ CI.checkpatch: " Patchwork
2024-06-05 4:17 ` ✓ CI.KUnit: " Patchwork
2024-06-05 4:29 ` ✓ CI.Build: " Patchwork
2024-06-05 4:29 ` ✗ CI.Hooks: failure " Patchwork
2024-06-05 4:30 ` ✓ CI.checksparse: success " Patchwork
2024-06-05 4:58 ` ✓ CI.BAT: " Patchwork
2024-06-05 14:12 ` ✗ CI.FULL: failure " Patchwork
2024-06-05 21:01 ` [PATCH] " Cavitt, Jonathan
2024-06-05 22:04 ` Rodrigo Vivi
2024-06-06 2:47 ` Matthew Brost [this message]
2024-06-06 9:59 ` Francois Dugast
2024-06-07 16:18 ` Rodrigo Vivi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZmEjMXSuOL36tiQ0@DUT025-TGLU.fm.intel.com \
--to=matthew.brost@intel.com \
--cc=francois.dugast@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=jonathan.cavitt@intel.com \
--cc=jose.souza@intel.com \
--cc=rodrigo.vivi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox