From: Matthew Brost <matthew.brost@intel.com>
To: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <thomas.hellstrom@linux.intel.com>
Subject: Re: [PATCH] drm/xe/multi_queue: preempt primary on queue group suspend
Date: Wed, 10 Jun 2026 17:15:35 -0700 [thread overview]
Message-ID: <ain+J8bCGhLtnCJB@gsse-cloud1.jf.intel.com> (raw)
In-Reply-To: <ainYa3M6FkFWUpaG@nvishwa1-desk>
On Wed, Jun 10, 2026 at 02:34:35PM -0700, Niranjana Vishwanathapura wrote:
> On Wed, Jun 10, 2026 at 12:56:41PM -0700, Matthew Brost wrote:
> > On Sun, Jun 07, 2026 at 07:11:00PM -0700, Niranjana Vishwanathapura wrote:
> > > In a multi-queue group only the group's primary queue interfaces with
> > > GuC for scheduling; suspend/resume of secondary queues is handled
> > > internally and is not forwarded to GuC. As a result, suspending a
> > > secondary queue alone (e.g. on its preempt fence signalling) does not
> > > disable the primary's GuC context, so in-flight GPU work of the group
> > > is not actually preempted.
> > >
> > > Route group suspend/resume through the primary using a group level
> > > reference count. The primary's GuC context is disabled on the first
> > > suspend of any group member and re-enabled only after every suspended
> > > member is resumed. A per-queue preempt_suspended flag makes the
> > > resume-all-queues-each-rebind-cycle behavior idempotent and keeps the
> > > count balanced, including on queue teardown. The secondary's own
> > > scheduler state machine is still suspended/resumed internally so its
> > > state stays consistent.
> > >
> >
> > Overall I believe this makes sense. A few comments below.
I just reviewed Thomas’s patch here [1], which is conceptually similar.
I’m wondering whether you should first pull into Thomas’s patch [1],
rebase on top of it, and make the suspend count logic congruent between
the two patches.
Matt
[1] https://patchwork.freedesktop.org/patch/727811/?series=167032&rev=3
> >
> > > Assisted-by: Github-Copilot:Claude-opus-4.8
> > > Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_exec_queue.c | 1 +
> > > drivers/gpu/drm/xe/xe_exec_queue_types.h | 17 +++
> > > drivers/gpu/drm/xe/xe_guc_submit.c | 173 +++++++++++++++++++++--
> > > 3 files changed, 181 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> > > index 1b5ca3ce578a..df95855a3d61 100644
> > > --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> > > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> > > @@ -842,6 +842,7 @@ static int xe_exec_queue_group_init(struct xe_device *xe, struct xe_exec_queue *
> > > group->primary = q;
> > > group->cgp_bo = bo;
> > > INIT_LIST_HEAD(&group->list);
> > > + spin_lock_init(&group->suspend_lock);
> > > xa_init_flags(&group->xa, XA_FLAGS_ALLOC1);
> > > mutex_init(&group->list_lock);
> > > q->multi_queue.group = group;
> > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > index 2f5ccf294675..30c5dfcc58fb 100644
> > > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > @@ -62,6 +62,16 @@ struct xe_exec_queue_group {
> > > struct list_head list;
> > > /** @list_lock: Secondary queue list lock */
> > > struct mutex list_lock;
> > > + /** @suspend_lock: Protects @suspend_count and queues' @preempt_suspended */
> > > + spinlock_t suspend_lock;
> >
> > Can you mention suspend_lock is nested outside of the queue's message lock?
>
> Ok, sure.
>
> >
> > > + /**
> > > + * @suspend_count: Number of queues in the group currently preempt
> > > + * suspended (e.g. due to their VM's userptr invalidation). Only the
> > > + * primary queue interfaces with GuC, so the primary's GuC context is
> > > + * kept disabled while @suspend_count is non-zero. Protected by
> > > + * @suspend_lock.
> > > + */
> > > + u32 suspend_count;
> > > /** @sync_pending: CGP_SYNC_DONE g2h response pending */
> > > bool sync_pending;
> > > /** @banned: Group banned */
> > > @@ -176,6 +186,13 @@ struct xe_exec_queue {
> > > u8 valid:1;
> > > /** @multi_queue.is_primary: Is primary queue (Q0) of the group */
> > > u8 is_primary:1;
> >
> > For valid and is_primary, could you mention that these values remain
> > static from initialization to destruction, since part of the bitfield is
> > dynamic and protected by a lock?
>
> Ok, will mention.
>
> >
> > > + /**
> > > + * @multi_queue.preempt_suspended: This queue is currently preempt
> > > + * suspended (e.g. due to its VM's userptr invalidation) and is
> > > + * accounted in the group's @xe_exec_queue_group.suspend_count.
> > > + * Protected by @xe_exec_queue_group.suspend_lock.
> > > + */
> > > + u8 preempt_suspended:1;
> > > } multi_queue;
> > >
> > > /** @sched_props: scheduling properties */
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > index 4b247a3019d2..8ec3ab24048f 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > @@ -1657,11 +1657,22 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > > return DRM_GPU_SCHED_STAT_NO_HANG;
> > > }
> > >
> > > +static void guc_exec_queue_multi_queue_drop_suspend(struct xe_exec_queue *q);
> > > +
> > > static void guc_exec_queue_fini(struct xe_exec_queue *q)
> > > {
> > > struct xe_guc_exec_queue *ge = q->guc;
> > > struct xe_guc *guc = exec_queue_to_guc(q);
> > >
> > > + /*
> > > + * A queue can leave the group while still preempt suspended (e.g.
> > > + * xe_vm_remove_compute_exec_queue() forces its preempt fence to signal,
> > > + * which suspends it). Drop its contribution to the group suspend count
> > > + * and resume the primary if this was the last suspended queue.
> > > + */
> > > + if (xe_exec_queue_is_multi_queue(q))
> > > + guc_exec_queue_multi_queue_drop_suspend(q);
> > > +
> > > if (xe_exec_queue_is_multi_queue_secondary(q)) {
> > > struct xe_exec_queue_group *group = q->multi_queue.group;
> > >
> > > @@ -2147,20 +2158,82 @@ static int guc_exec_queue_suspend(struct xe_exec_queue *q)
> > > if (exec_queue_killed_or_banned_or_wedged(q))
> > > return -EINVAL;
> > >
> > > - xe_sched_msg_lock(sched);
> > > - if (guc_exec_queue_try_add_msg(q, msg, SUSPEND))
> > > - q->guc->suspend_pending = true;
> > > - xe_sched_msg_unlock(sched);
> > > + if (!xe_exec_queue_is_multi_queue(q)) {
> > > + xe_sched_msg_lock(sched);
> > > + if (guc_exec_queue_try_add_msg(q, msg, SUSPEND))
> > > + q->guc->suspend_pending = true;
> > > + xe_sched_msg_unlock(sched);
> > > +
> > > + return 0;
> > > + }
> > > +
> > > + /*
> > > + * For a multi-queue group only the primary queue interfaces with GuC
> > > + * for scheduling, so the group is preempted by disabling the primary's
> > > + * GuC context. Route all group suspends through the primary using a
> > > + * group level reference count: disable the primary's GuC context on the
> > > + * first suspend so the GPU is actually preempted, and keep it disabled
> > > + * until every suspended queue in the group is resumed.
> > > + *
> > > + * Set the primary's suspend_pending while holding @suspend_lock so a
> > > + * concurrent suspender of the same group cannot observe a not-yet-pending
> > > + * primary and have its suspend_wait() return before the GPU is preempted.
> > > + */
> > > + scoped_guard(spinlock, &q->multi_queue.group->suspend_lock) {
> > > + struct xe_exec_queue_group *group = q->multi_queue.group;
> > > + struct xe_exec_queue *primary;
> > > + struct xe_gpu_scheduler *psched;
> > > + struct xe_sched_msg *pmsg;
> > > +
> > > + if (q->multi_queue.preempt_suspended)
> > > + break;
> > > +
> > > + q->multi_queue.preempt_suspended = true;
> > > +
> > > + /*
> > > + * Disable this queue's own scheduler state machine too. For a
> > > + * secondary this is handled internally (not forwarded to GuC),
> > > + * but updating its state is still required. The primary's own
> > > + * state is updated by the GuC suspend issued below on the first
> > > + * group suspend.
> > > + */
> > > + if (xe_exec_queue_is_multi_queue_secondary(q)) {
> > > + xe_sched_msg_lock(sched);
> > > + if (guc_exec_queue_try_add_msg(q, msg, SUSPEND))
> > > + q->guc->suspend_pending = true;
> > > + xe_sched_msg_unlock(sched);
> > > + }
> > > +
> > > + if (group->suspend_count++)
> > > + break;
> > > +
> > > + primary = xe_exec_queue_multi_queue_primary(q);
> > > + psched = &primary->guc->sched;
> > > + pmsg = primary->guc->static_msgs + STATIC_MSG_SUSPEND;
> > > +
> > > + xe_sched_msg_lock(psched);
> > > + if (guc_exec_queue_try_add_msg(primary, pmsg, SUSPEND))
> > > + primary->guc->suspend_pending = true;
> > > + xe_sched_msg_unlock(psched);
> > > + }
> >
> > Maybe a static helper for this code? e.g., guc_exec_queue_multi_queue_suspend?
> >
> > Also assert is multi_queue in the helper.
>
> Ok, will do.
>
> >
> > >
> > > return 0;
> > > }
> > >
> > > static int guc_exec_queue_suspend_wait(struct xe_exec_queue *q)
> > > {
> > > - struct xe_guc *guc = exec_queue_to_guc(q);
> > > - struct xe_device *xe = guc_to_xe(guc);
> > > + struct xe_guc *guc;
> > > + struct xe_device *xe;
> >
> > This should be able to left as is given guc, xe shouldn't change based
> > secondary vs primary, right?
> >
>
> Yah, makes sense.
>
> > > int ret;
> > >
> > > + /*
> > > + * In multi-queue mode the primary owns the GuC scheduling context for
> > > + * the whole group, so wait on the primary's suspend to complete.
> > > + */
> > > + q = xe_exec_queue_multi_queue_primary(q);
> > > + guc = exec_queue_to_guc(q);
> > > + xe = guc_to_xe(guc);
> > > +
> > > /*
> > > * Likely don't need to check exec_queue_killed() as we clear
> > > * suspend_pending upon kill but to be paranoid but races in which
> > > @@ -2204,11 +2277,91 @@ static void guc_exec_queue_resume(struct xe_exec_queue *q)
> > > struct xe_sched_msg *msg = q->guc->static_msgs + STATIC_MSG_RESUME;
> > > struct xe_guc *guc = exec_queue_to_guc(q);
> > >
> > > - xe_gt_assert(guc_to_gt(guc), !q->guc->suspend_pending);
> > > + if (!xe_exec_queue_is_multi_queue(q)) {
> > > + xe_gt_assert(guc_to_gt(guc), !q->guc->suspend_pending);
> > >
> > > - xe_sched_msg_lock(sched);
> > > - guc_exec_queue_try_add_msg(q, msg, RESUME);
> > > - xe_sched_msg_unlock(sched);
> > > + xe_sched_msg_lock(sched);
> > > + guc_exec_queue_try_add_msg(q, msg, RESUME);
> > > + xe_sched_msg_unlock(sched);
> > > +
> > > + return;
> > > + }
> > > +
> > > + /*
> > > + * Resume is called for every queue of the VM on each rebind cycle, so
> > > + * only act if this queue was actually suspended. Re-enable the primary's
> > > + * GuC context once the last suspended queue of the group is resumed.
> > > + */
> > > + scoped_guard(spinlock, &q->multi_queue.group->suspend_lock) {
> > > + struct xe_exec_queue *primary;
> > > + struct xe_gpu_scheduler *psched;
> > > + struct xe_sched_msg *pmsg;
> > > +
> > > + if (!q->multi_queue.preempt_suspended)
> > > + break;
> > > +
> > > + q->multi_queue.preempt_suspended = false;
> > > +
> > > + /*
> > > + * Re-enable this queue's own scheduler state machine that was
> > > + * disabled on suspend. For a secondary this is handled
> > > + * internally (not forwarded to GuC); the primary's own state is
> > > + * re-enabled by the GuC resume issued below on the last group
> > > + * resume.
> > > + */
> > > + if (xe_exec_queue_is_multi_queue_secondary(q)) {
> > > + xe_sched_msg_lock(sched);
> > > + guc_exec_queue_try_add_msg(q, msg, RESUME);
> > > + xe_sched_msg_unlock(sched);
> > > + }
> > > +
> > > + if (--q->multi_queue.group->suspend_count)
> > > + break;
> > > +
> > > + primary = xe_exec_queue_multi_queue_primary(q);
> > > + psched = &primary->guc->sched;
> > > + pmsg = primary->guc->static_msgs + STATIC_MSG_RESUME;
> > > +
> > > + xe_gt_assert(guc_to_gt(guc), !primary->guc->suspend_pending);
> > > +
> > > + xe_sched_msg_lock(psched);
> > > + guc_exec_queue_try_add_msg(primary, pmsg, RESUME);
> > > + xe_sched_msg_unlock(psched);
> > > + }
> >
> > guc_exec_queue_multi_queue_resume?
> >
>
> Ok, will add.
>
> Thanks,
> Niranjana
>
> > Matt
> >
> > > +}
> > > +
> > > +/*
> > > + * Drop a leaving multi-queue member's contribution to the group preempt
> > > + * suspend count and resume the primary if it was the last suspended queue.
> > > + * See guc_exec_queue_fini().
> > > + */
> > > +static void guc_exec_queue_multi_queue_drop_suspend(struct xe_exec_queue *q)
> > > +{
> > > + scoped_guard(spinlock, &q->multi_queue.group->suspend_lock) {
> > > + struct xe_exec_queue_group *group = q->multi_queue.group;
> > > + struct xe_exec_queue *primary;
> > > + struct xe_gpu_scheduler *psched;
> > > + struct xe_sched_msg *pmsg;
> > > +
> > > + if (!q->multi_queue.preempt_suspended)
> > > + break;
> > > +
> > > + q->multi_queue.preempt_suspended = false;
> > > + if (--group->suspend_count)
> > > + break;
> > > +
> > > + primary = xe_exec_queue_multi_queue_primary(q);
> > > + if (primary == q ||
> > > + exec_queue_killed_or_banned_or_wedged(primary))
> > > + break;
> > > +
> > > + psched = &primary->guc->sched;
> > > + pmsg = primary->guc->static_msgs + STATIC_MSG_RESUME;
> > > +
> > > + xe_sched_msg_lock(psched);
> > > + guc_exec_queue_try_add_msg(primary, pmsg, RESUME);
> > > + xe_sched_msg_unlock(psched);
> > > + }
> > > }
> > >
> > > static bool guc_exec_queue_reset_status(struct xe_exec_queue *q)
> > > --
> > > 2.43.0
> > >
next prev parent reply other threads:[~2026-06-11 0:15 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-08 2:11 [PATCH] drm/xe/multi_queue: preempt primary on queue group suspend Niranjana Vishwanathapura
2026-06-08 2:18 ` ✓ CI.KUnit: success for " Patchwork
2026-06-08 3:02 ` ✓ Xe.CI.BAT: " Patchwork
2026-06-08 4:15 ` ✓ Xe.CI.FULL: " Patchwork
2026-06-10 19:56 ` [PATCH] " Matthew Brost
2026-06-10 21:34 ` Niranjana Vishwanathapura
2026-06-11 0:15 ` Matthew Brost [this message]
2026-06-11 4:19 ` Niranjana Vishwanathapura
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ain+J8bCGhLtnCJB@gsse-cloud1.jf.intel.com \
--to=matthew.brost@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=niranjana.vishwanathapura@intel.com \
--cc=thomas.hellstrom@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.