From: Matthew Brost <matthew.brost@intel.com>
To: "Zbigniew Kempczyński" <zbigniew.kempczynski@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
Carlos Santa <carlos.santa@intel.com>
Subject: Re: [PATCH] drm/xe: Do not preempt fence signaling CS instructions
Date: Thu, 26 Feb 2026 09:35:56 -0800 [thread overview]
Message-ID: <aaCEfObwLbCrN/3W@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <aXkc+DtEtRWSRAyk@lstrano-desk.jf.intel.com>
On Tue, Jan 27, 2026 at 12:15:52PM -0800, Matthew Brost wrote:
> On Tue, Jan 27, 2026 at 08:20:36AM +0100, Zbigniew Kempczyński wrote:
> > On Thu, Jan 22, 2026 at 09:44:14AM +0100, Zbigniew Kempczyński wrote:
> > > On Tue, Jan 20, 2026 at 01:11:18PM -0800, Matthew Brost wrote:
> > > > On Mon, Jan 19, 2026 at 01:01:34PM +0100, Zbigniew Kempczyński wrote:
> > > > > On Fri, Jan 16, 2026 at 01:05:01PM -0800, Matthew Brost wrote:
> > > > > > On Fri, Jan 16, 2026 at 10:45:39AM +0100, Zbigniew Kempczyński wrote:
> > > > > > > On Wed, Jan 14, 2026 at 04:45:46PM -0800, Matthew Brost wrote:
> > > > > > > > If a batch buffer is complete, it makes little sense to preempt the
> > > > > > > > fence signaling instructions in the ring, as the largest portion of the
> > > > > > > > work (the batch buffer) is already done and fence signaling consists of
> > > > > > > > only a few instructions. If these instructions are preempted, the GuC
> > > > > > > > would need to perform a context switch just to signal the fence, which
> > > > > > > > is costly and delays fence signaling. Avoid this scenario by disabling
> > > > > > > > preemption immediately after the BB start instruction and re-enabling it
> > > > > > > > after executing the fence signaling instructions.
> > > > > > > >
> > > > > > > > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > > > > > > > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > > > > > > > Cc: Carlos Santa <carlos.santa@intel.com>
> > > > > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > > > > > ---
> > > > > > > > drivers/gpu/drm/xe/xe_ring_ops.c | 9 +++++++++
> > > > > > > > 1 file changed, 9 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > > > > > index a1fd99f2d539..cd645ee400b9 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > > > > > @@ -282,6 +282,9 @@ static void __emit_job_gen12_simple(struct xe_sched_job *job, struct xe_lrc *lrc
> > > > > > > >
> > > > > > > > i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);
> > > > > > > >
> > > > > > > > + /* Don't preempt fence signaling */
> > > > > > > > + dw[i++] = MI_ARB_ON_OFF | MI_ARB_DISABLE;
> > > > > > > > +
> > > > > > > > if (job->user_fence.used) {
> > > > > > > > i = emit_flush_dw(dw, i);
> > > > > > > > i = emit_store_imm_ppgtt_posted(job->user_fence.addr,
> > > > > > > > @@ -347,6 +350,9 @@ static void __emit_job_gen12_video(struct xe_sched_job *job, struct xe_lrc *lrc,
> > > > > > > >
> > > > > > > > i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);
> > > > > > > >
> > > > > > > > + /* Don't preempt fence signaling */
> > > > > > > > + dw[i++] = MI_ARB_ON_OFF | MI_ARB_DISABLE;
> > > > > > > > +
> > > > > > > > if (job->user_fence.used) {
> > > > > > > > i = emit_flush_dw(dw, i);
> > > > > > > > i = emit_store_imm_ppgtt_posted(job->user_fence.addr,
> > > > > > > > @@ -399,6 +405,9 @@ static void __emit_job_gen12_render_compute(struct xe_sched_job *job,
> > > > > > > >
> > > > > > > > i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);
> > > > > > > >
> > > > > > > > + /* Don't preempt fence signaling */
> > > > > > > > + dw[i++] = MI_ARB_ON_OFF | MI_ARB_DISABLE;
> > > > > > > > +
> > > > > > >
> > > > > > > IGT tests which calls compute-walker, then bbe are asynchronous (don't
> > > > > > > wait for completion, pipe-control is necessary to wait on
> > > > > > > compute-walker).
> > > > > > >
> > > > > >
> > > > > > This asynchronous behavior may explain things. Is this a common use
> > > > > > case?
> > > > >
> > > > > Compute runtime if I'm not wrong uses pipe-control explicitely. IGT are
> > > > > not doing this relying on kmd.
> > > > >
> > > >
> > > > It would be good confirm this.
> > > >
> > > > > >
> > > > > > Also do you know if render engines have similar asynchronous behaviors
> > > > > > or is this specific to compute engines?
> > > > >
> > > > > I don't know, I think Mesa folks may know the answer.
> > > > >
> > > >
> > > > Let me ask around about this.
> > > >
> > > > > >
> > > > > > Lastly, the i915 disables preemption on both render / compute engines
> > > > > > immediately after the BB before emitting the pipe control. Is this async
> > > > > > behavior a new few feature in Xe2 parts which only the Xe driver
> > > > > > supports? This might explain why the i915 works and Xe does not.
> >
> > We don't have platforms which supports WMTP in i915 driver. This imo explains why
> > there're no issues observed there. I mean regardless preemption state
> > disabled/enabled all instructions soon or later completes.
> >
> > > > >
> > > > > Test exercises WMTP and this is supported starting at Xe2+. Probably
> > > > > what test is doing has a meaning in this case. First compute-walker
> > > > > submits kernel which loops until it will observe some memory write.
> > > > > Second job executes compute-walker with kernel which does some quick job.
> > > > > But first occupies all EU's so second job can be preempted only when
> > > > > preemption occurs and SIP will be executed. So if we disable preemption
> > > > > immediately we submit compute-walker I think we have no change to enter
> > > > > SIP and switch. Even if I add pipe-control to batch level according
> > > > > to Daniele comment job it is still preemptable and we move pipe-control
> > > > > location from kmd -> batch level..
> > > >
> > > > Does the test work with a pipe control in the batch? That would be a
> > > > good data point - if let me know how I can change the test I can test
> > > > out too.
> > >
> > > I'm going to add appropriate pipe-control to pipeline and send the
> > > patch.
> >
> > This change: https://patchwork.freedesktop.org/series/160484/
> >
> > along with your patch passes the test.
> >
>
> Thanks! Great data point. Now just need to test out Mesa / Compute. I
> suspect they properly insert pipe controls in batches as preemption
> points but need to confirm.
>
Both Mesa / Compute have completed CI runs with this patch and reported
no regressions.
Can I get an RB on this patch to merge?
Zbigniew - did you ever get the IGT change merged?
Matt
> Matt
>
> > --
> > Zbigniew
> >
> > >
> > > --
> > > Zbigniew
> > >
> > > >
> > > > It is fine the batch preempts, we actually want that. Mid-batch
> > > > preemption is not what we are trying to prevent - we are trying to
> > > > prevent preemption of the fence signaling after the batch is done. As
> > > > long as we don't regress any user applications we can safely make this
> > > > change - it is fine to break IGTs which are not doing the right thing
> > > > wrt to batch programming.
> > > >
> > > > Matt
> > > >
> > > > >
> > > > > --
> > > > > Zbigniew
> > > > >
> > > > > >
> > > > > > Matt
> > > > > >
> > > > > > > May you try to put arb disable after emit_render_cache_flush?
> > > > > > >
> > > > > > > --
> > > > > > > Zbigniew
> > > > > > >
> > > > > > > > i = emit_render_cache_flush(job, dw, i);
> > > > > > > >
> > > > > > > > if (job->user_fence.used)
> > > > > > > > --
> > > > > > > > 2.34.1
> > > > > > > >
next prev parent reply other threads:[~2026-02-26 17:36 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-15 0:45 [PATCH] drm/xe: Do not preempt fence signaling CS instructions Matthew Brost
2026-01-15 0:52 ` ✓ CI.KUnit: success for " Patchwork
2026-01-15 1:35 ` ✓ Xe.CI.BAT: " Patchwork
2026-01-15 6:13 ` ✗ Xe.CI.Full: failure " Patchwork
2026-01-16 9:45 ` [PATCH] " Zbigniew Kempczyński
2026-01-16 10:12 ` Francois Dugast
2026-01-16 16:43 ` Daniele Ceraolo Spurio
2026-01-16 19:51 ` Summers, Stuart
2026-01-16 20:44 ` Matthew Brost
2026-01-16 21:07 ` Summers, Stuart
2026-01-16 21:19 ` Matthew Brost
2026-01-16 21:05 ` Matthew Brost
2026-01-19 12:01 ` Zbigniew Kempczyński
2026-01-20 21:10 ` Daniele Ceraolo Spurio
2026-01-20 21:26 ` Matthew Brost
2026-01-20 21:27 ` Matthew Brost
2026-01-22 9:22 ` Zbigniew Kempczyński
2026-01-22 17:47 ` Daniele Ceraolo Spurio
2026-01-27 20:14 ` Matthew Brost
2026-01-20 21:11 ` Matthew Brost
2026-01-22 8:44 ` Zbigniew Kempczyński
2026-01-27 7:20 ` Zbigniew Kempczyński
2026-01-27 20:15 ` Matthew Brost
2026-02-26 17:35 ` Matthew Brost [this message]
2026-02-26 17:49 ` Daniele Ceraolo Spurio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aaCEfObwLbCrN/3W@lstrano-desk.jf.intel.com \
--to=matthew.brost@intel.com \
--cc=carlos.santa@intel.com \
--cc=daniele.ceraolospurio@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=zbigniew.kempczynski@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox