From: Harish Chegondi <harish.chegondi@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
Gustavo Sousa <gustavo.sousa@intel.com>,
Michal Wajdeczko <michal.wajdeczko@intel.com>
Subject: Re: [PATCH 1/1] drm/xe/xe2lpg: Extend Wa_18041344222 to graphics IP 20.04
Date: Mon, 2 Feb 2026 16:16:44 -0800
Message-ID: <aYE-bPdOqtNpx0uJ@intel.com>
In-Reply-To: <20260130182814.GB458797@mdroper-desk1.amr.corp.intel.com>

On Fri, Jan 30, 2026 at 10:28:14AM -0800, Matt Roper wrote:
> On Thu, Jan 29, 2026 at 04:48:46PM -0800, Harish Chegondi wrote:
> > On Tue, Jan 27, 2026 at 12:50:04PM -0800, Matt Roper wrote:
> > > On Mon, Jan 26, 2026 at 11:49:50PM -0800, Harish Chegondi wrote:
> > > > Apply WA 18041344222 to Xe2 LPG graphics IP version 20.04 too.
> > > >
> > > > Bspec: 56024
> > > > Cc: Matt Roper <matthew.d.roper@intel.com>
> > > > Cc: Gustavo Sousa <gustavo.sousa@intel.com>
> > > > Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_wa.c | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c
> > > > index a991ee2b8781..1153a7363cff 100644
> > > > --- a/drivers/gpu/drm/xe/xe_wa.c
> > > > +++ b/drivers/gpu/drm/xe/xe_wa.c
> > > > @@ -535,6 +535,13 @@ static const struct xe_rtp_entry_sr engine_was[] = {
> > > > FUNC(xe_rtp_match_first_render_or_compute)),
> > > > XE_RTP_ACTIONS(SET(TDL_TSL_CHICKEN, RES_CHK_SPR_DIS))
> > > > },
> > > > + { XE_RTP_NAME("18041344222"),
> > > > + XE_RTP_RULES(GRAPHICS_VERSION(2004),
> > > > + FUNC(xe_rtp_match_first_render_or_compute),
> > > > + FUNC(xe_rtp_match_not_sriov_vf),
> > >
> > > We don't need this; nothing on the engine_was[] list applies to SRIOV
> > > VFs so we never apply any of them. This RTP match function is intended
> > > for OOB functions (and possibly LRC workarounds).
> >
> > I submitted a patch series https://patchwork.freedesktop.org/series/160865/
> > to remove the RTP match function xe_rtp_match_not_sriov_vf() from the
> > earlier WA patches and the SRIOV IGT tests are failing.
> >
> > Michal mentioned that with the below patch, WAs are applied for VFs
> > 92a5bd302458a1663 (drm/xe/vf: Unblock xe_rtp_process_to_sr for VFs)
>
> It looks like that failure is coming from the 'process' stage because of
> the logic in your xe_rtp_match_gt_has_discontiguous_dss_groups()
> function, not because we're actually trying to apply the workaround.
> Adding the 'FUNC(xe_rtp_match_not_sriov_vf)' rule happens to avoid the
> crash because of short-circuiting, but you'd still be getting the crash
> if you'd put that after your custom FUNC() rule instead.
>
>
> There are two steps for workarounds/tuning programming:
> - Step 1: process the RTP table that describes all workarounds to
> identify the subset that are relevant to the current device, and
> compile a reg_sr list containing that subset
> - Step 2: apply the contents of the reg_sr list to the hardware by
> writing the registers
>
> Of the three main classes of workarounds we have (GT, engine, and LRC),
> only the LRC workarounds are relevant to SRIOV VFs; VFs don't have
> access to read or update the registers in GT and engine workarounds.
> So the commit Michal referenced allowed Step 1 to go forward because
> VFs do legitimately need to process one of the workaround tables (the
> LRC table). However, there's also commit c19e705ec981 ("drm/xe/vf: Stop
> applying save-restore MMIOs if VF") that blocks Step 2 for SRIOV VFs for
> the GT and engine workarounds (LRC workarounds are applied in a
> different way and not affected by that).
>
> So the result of those two commits is that we'll process the RTP tables
> for all three types of workarounds and generate reg_sr lists of
> workarounds that we initially think we need for the current device. For
> the SRIOV VF case, the reg_sr list that gets generated for GT/engine is
> bogus (since we _don't_ actually want or need the workarounds it
> identified), but it mostly doesn't really matter since we effectively
> throw the list away and never apply it.
>
> This mostly works, but there are a couple hitches:
>
> - Even if we'll never apply the workarounds for GT/engine in an
> SRIOV-VF, we still spend the time processing their RTP tables to
> needlessly generate a reg_sr. This is a waste of time, but usually
> harmless.
>
> - If any of the RULE conditions in an RTP entry try to read a register
> that an SRIOV_VF doesn't have access to, it will get back a value of
> zero. On its own this is harmless, but if the logic in the rule
> fails to account for 0 as a possible value, you can wind up with
> things like divide-by-zero errors, which is happening with your
> specific workaround.
>
> - Even though we never apply the register programming in the driver, we
> still report all the engine workarounds' registers to the GuC as part
> of the ADS regset, which will cause the GuC to attempt to
> save/restore them around resets. This is probably harmless since the
> VF GuC won't be able to read/write these registers, so nothing will
> happen (and the PF's KMD+GuC are probably already handling these
> appropriately on their own save/restore list anyway). But it's still a
> bit misleading/confusing.
>
> - I think (haven't checked) that since we filled out a reg_sr with a
> bunch of workarounds that we're not actually going to apply, some of
> the debugfs entries like 'register-save-restore' and 'workarounds'
> probably report misleading/incorrect information when run in a VF.
> Not a huge problem since it's a developer-only debug interface, but
> it could cause confusion.
>
>
> So maybe what we really want to do is block the processing of RTP => SR
> at the GT and engine callsites (but not the LRC callsite) for SRIOV VF.
> Then we won't waste time processing the rules when we already know they
> won't be applied, we won't run into problems with FUNC() rules that
> can't cope with SRIOV VF environments, and we won't report misleading
> information to the GuC and debugfs.
>
Thanks for the detailed explanation of how the workarounds are processed
and applied. I agree with your suggestion to block the processing of
RTP to SR at the GT and engine callsites for SRIOV VF. For this
workaround, I plan to do the following:

1. Add a check for SRIOV VF at the beginning of
   xe_rtp_match_gt_has_discontiguous_dss_groups(). All the functions in
   xe_gt_mcr.c and others that access steering information have the
   IS_SRIOV_VF() check, so I think
   xe_rtp_match_gt_has_discontiguous_dss_groups() should have it too.
2. Remove xe_rtp_match_not_sriov_vf() from all applications of this
   workaround.
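
To illustrate item 1, here is a minimal userspace sketch of the kind of
early-return guard described above. It is not the driver code: struct
model_gt, its fields, and model_has_discontiguous_dss_groups() are
hypothetical stand-ins, and the 32-bit DSS mask and the "per-group count
reads back as zero on a VF" behavior are assumptions made only for this
model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GT state; not the real struct xe_gt. */
struct model_gt {
	bool is_sriov_vf;           /* true when running as an SRIOV VF */
	unsigned int dss_mask;      /* bit n set => DSS n is present */
	unsigned int dss_per_group; /* assumed to read back as 0 on a VF */
};

/*
 * Returns true if at least one DSS group is skipped entirely, i.e. the
 * populated groups are not contiguous starting from group 0.
 */
static bool model_has_discontiguous_dss_groups(const struct model_gt *gt)
{
	int last_group = -1;

	assert(gt != NULL);

	/*
	 * Guard first: a VF has no usable steering information, so bail
	 * out before dss_per_group is ever used as a divisor.
	 */
	if (gt->is_sriov_vf || gt->dss_per_group == 0)
		return false;

	for (unsigned int dss = 0; dss < 32; dss++) {
		int group;

		if (!(gt->dss_mask & (1u << dss)))
			continue;

		group = (int)(dss / gt->dss_per_group);
		if (group - last_group > 1)
			return true; /* at least one whole group is missing */
		last_group = group;
	}

	return false;
}
```

With the guard as the first statement, the result no longer depends on
where this rule sits relative to other rules in the entry, which is the
ordering problem pointed out earlier in the thread.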
The optimizations to skip processing of the GT and engine workarounds
for SRIOV VF will go in a separate patch.
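
For that separate patch, the gating could look roughly like the sketch
below. This is only a model of the idea discussed in this thread (skip
RTP to SR processing for the GT and engine tables on a VF, keep it for
the LRC table); the enum and both function names are hypothetical and
do not correspond to existing driver symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three workaround classes in this thread. */
enum model_wa_class {
	MODEL_WA_GT,
	MODEL_WA_ENGINE,
	MODEL_WA_LRC,
};

/* Decide whether a table should be processed into a save-restore list. */
static bool model_should_process_rtp(bool is_sriov_vf,
				     enum model_wa_class cls)
{
	/* Native and PF: process every table as before. */
	if (!is_sriov_vf)
		return true;

	/* VF: only the LRC table is relevant. */
	return cls == MODEL_WA_LRC;
}

/*
 * Model of a callsite: returns how many entries end up in the
 * save-restore list, given how many table entries would match.
 */
static size_t model_process_rtp_to_sr(bool is_sriov_vf,
				      enum model_wa_class cls,
				      size_t matching_entries)
{
	/* Gated tables never evaluate their rules at all. */
	if (!model_should_process_rtp(is_sriov_vf, cls))
		return 0;

	return matching_entries;
}
```

Because the gated tables are never walked, no FUNC() rule in them is
evaluated on a VF, and nothing from them is added to the list.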
Please let me know if there are any issues with this approach.
Thank You
Harish.
> Matt
>
> >
> > Thank You
> > Harish.
> >
> > >
> > >
> > > Matt
> > >
> > > > + FUNC(xe_rtp_match_gt_has_discontiguous_dss_groups)),
> > > > + XE_RTP_ACTIONS(SET(TDL_CHICKEN, EUSTALL_PERF_SAMPLING_DISABLE))
> > > > + },
> > > >
> > > > /* Xe2_HPG */
> > > >
> > > > --
> > > > 2.43.0
> > > >
> > >
> > > --
> > > Matt Roper
> > > Graphics Software Engineer
> > > Linux GPU Platform Enablement
> > > Intel Corporation
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation