From: Raag Jadav <raag.jadav@intel.com>
To: "Dong, Zhanjun" <zhanjun.dong@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [PATCH v3 03/10] drm/xe/guc_submit: Support cancelling submission
Date: Sun, 15 Mar 2026 10:58:57 +0100 [thread overview]
Message-ID: <abaC4Y80fFU18eCx@black.igk.intel.com> (raw)
In-Reply-To: <91958e39-1ad7-42e6-b4d0-86e2fe38ca30@intel.com>
On Fri, Mar 13, 2026 at 11:37:03AM -0400, Dong, Zhanjun wrote:
> On 2026-03-08 9:55 a.m., Raag Jadav wrote:
> > In preparation of usecases which require cancelling submission before
> > PCIe FLR, introduce xe_guc_submit_cancel() helper. This cancels and
> > frees any in-flight jobs on the scheduler.
>
> Could you put more info on why add new cancel functions rather than call
> existing xe_sched_submission_stop?
> From commit message, it looks very similar to stop, which also do stop +
> free action.
IIUC submission_stop() doesn't free any jobs, it just stops the scheduler
and cancels wq used to run jobs. But this leaves the jobs on scheduler's
pending list behind if they're not on the wq yet, which results in timeout.
So perhaps I used the terminology wrong, will update this.
Also, I know it's a bit hacky to directly bork the scheduler's pending list
so this can definitely use some standardization. Open to suggestions.
Raag
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> > v3: Cancel in-flight jobs before FLR
> > ---
> > drivers/gpu/drm/xe/xe_gpu_scheduler.c | 11 +++++++++++
> > drivers/gpu/drm/xe/xe_gpu_scheduler.h | 1 +
> > drivers/gpu/drm/xe/xe_guc_submit.c | 24 ++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_guc_submit.h | 1 +
> > 4 files changed, 37 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > index 9c8004d5dd91..c012dbe84540 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > @@ -90,6 +90,17 @@ void xe_sched_fini(struct xe_gpu_scheduler *sched)
> > drm_sched_fini(&sched->base);
> > }
> > +void xe_sched_submission_cancel(struct xe_gpu_scheduler *sched)
> > +{
> > + struct drm_gpu_scheduler *base = &sched->base;
> > + struct drm_sched_job *job, *tmp;
> > +
> > + list_for_each_entry_safe_reverse(job, tmp, &base->pending_list, list) {
> > + list_del(&job->list);
> > + base->ops->free_job(job);
> > + }
> > +}
> > +
> > void xe_sched_submission_start(struct xe_gpu_scheduler *sched)
> > {
> > drm_sched_wqueue_start(&sched->base);
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > index 664c2db56af3..ba7892db8428 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > @@ -19,6 +19,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
> > struct device *dev);
> > void xe_sched_fini(struct xe_gpu_scheduler *sched);
> > +void xe_sched_submission_cancel(struct xe_gpu_scheduler *sched);
> > void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
> > void xe_sched_submission_stop(struct xe_gpu_scheduler *sched);
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index de716c1fb18e..cba544cc185c 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -2399,6 +2399,30 @@ void xe_guc_submit_stop(struct xe_guc *guc)
> > }
> > +/**
> > + * xe_guc_submit_cancel - Cancel all runs of submission tasks on given GuC.
> > + * @guc: the &xe_guc struct instance whose scheduler is to be cancelled
> > + */
> > +void xe_guc_submit_cancel(struct xe_guc *guc)
> > +{
> > + struct xe_exec_queue *q;
> > + unsigned long index;
> > +
> > + mutex_lock(&guc->submission_state.lock);
> > +
> > + xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) {
> > + struct xe_gpu_scheduler *sched = &q->guc->sched;
> > +
> > + /* Prevent redundant attempts to cancel parallel queues */
> > + if (q->guc->id != index)
> > + continue;
> > +
> > + xe_sched_submission_cancel(sched);
> > + }
> > +
> > + mutex_unlock(&guc->submission_state.lock);
> > +}
> > +
> > static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
> > struct xe_exec_queue *q)
> > {
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.h b/drivers/gpu/drm/xe/xe_guc_submit.h
> > index b3839a90c142..f361a6d32fd3 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.h
> > @@ -16,6 +16,7 @@ int xe_guc_submit_init(struct xe_guc *guc, unsigned int num_ids);
> > int xe_guc_submit_enable(struct xe_guc *guc);
> > void xe_guc_submit_disable(struct xe_guc *guc);
> > +void xe_guc_submit_cancel(struct xe_guc *guc);
> > int xe_guc_submit_reset_prepare(struct xe_guc *guc);
> > void xe_guc_submit_reset_wait(struct xe_guc *guc);
> > void xe_guc_submit_stop(struct xe_guc *guc);
>
next prev parent reply other threads:[~2026-03-15 9:59 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-08 13:55 [PATCH v3 00/10] Introduce Xe PCIe FLR Raag Jadav
2026-03-08 13:55 ` [PATCH v3 01/10] drm/xe/uc_fw: Allow re-initializing firmware Raag Jadav
2026-03-08 13:55 ` [PATCH v3 02/10] drm/xe/hw_fence: Synchronize fence irq before destroying the job Raag Jadav
2026-03-08 13:55 ` [PATCH v3 03/10] drm/xe/guc_submit: Support cancelling submission Raag Jadav
2026-03-13 15:37 ` Dong, Zhanjun
2026-03-15 9:58 ` Raag Jadav [this message]
2026-03-16 2:31 ` Matthew Brost
2026-03-08 13:55 ` [PATCH v3 04/10] drm/xe/gt: Introduce FLR helpers Raag Jadav
2026-03-08 13:55 ` [PATCH v3 05/10] drm/xe/irq: Introduce xe_irq_disable() Raag Jadav
2026-03-08 13:55 ` [PATCH v3 06/10] drm/xe: Introduce xe_device_assert_lmem_ready() Raag Jadav
2026-03-08 13:55 ` [PATCH v3 07/10] drm/xe/bo_evict: Introduce xe_bo_restore_map() Raag Jadav
2026-03-08 13:55 ` [PATCH v3 08/10] drm/xe/exec_queue: Introduce xe_exec_queue_reinit() Raag Jadav
2026-03-08 13:55 ` [PATCH v3 09/10] drm/xe/migrate: Introduce xe_migrate_reinit() Raag Jadav
2026-03-08 13:55 ` [PATCH v3 10/10] drm/xe/pci: Introduce PCIe FLR Raag Jadav
2026-03-08 14:08 ` ✗ CI.checkpatch: warning for Introduce Xe PCIe FLR (rev3) Patchwork
2026-03-08 14:09 ` ✓ CI.KUnit: success " Patchwork
2026-03-08 14:50 ` ✓ Xe.CI.BAT: " Patchwork
2026-03-08 15:49 ` ✓ Xe.CI.FULL: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=abaC4Y80fFU18eCx@black.igk.intel.com \
--to=raag.jadav@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=zhanjun.dong@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox