From: Harish Chegondi <harish.chegondi@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
Gustavo Sousa <gustavo.sousa@intel.com>,
Michal Wajdeczko <michal.wajdeczko@intel.com>
Subject: Re: [PATCH 1/1] drm/xe/xe2lpg: Extend Wa_18041344222 to graphics IP 20.04
Date: Mon, 2 Feb 2026 16:16:44 -0800
Message-ID: <aYE-bPdOqtNpx0uJ@intel.com>
In-Reply-To: <20260130182814.GB458797@mdroper-desk1.amr.corp.intel.com>
On Fri, Jan 30, 2026 at 10:28:14AM -0800, Matt Roper wrote:
> On Thu, Jan 29, 2026 at 04:48:46PM -0800, Harish Chegondi wrote:
> > On Tue, Jan 27, 2026 at 12:50:04PM -0800, Matt Roper wrote:
> > > On Mon, Jan 26, 2026 at 11:49:50PM -0800, Harish Chegondi wrote:
> > > > Apply WA 18041344222 to Xe2 LPG graphics IP version 20.04 too.
> > > >
> > > > Bspec: 56024
> > > > Cc: Matt Roper <matthew.d.roper@intel.com>
> > > > Cc: Gustavo Sousa <gustavo.sousa@intel.com>
> > > > Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_wa.c | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c
> > > > index a991ee2b8781..1153a7363cff 100644
> > > > --- a/drivers/gpu/drm/xe/xe_wa.c
> > > > +++ b/drivers/gpu/drm/xe/xe_wa.c
> > > > @@ -535,6 +535,13 @@ static const struct xe_rtp_entry_sr engine_was[] = {
> > > > FUNC(xe_rtp_match_first_render_or_compute)),
> > > > XE_RTP_ACTIONS(SET(TDL_TSL_CHICKEN, RES_CHK_SPR_DIS))
> > > > },
> > > > + { XE_RTP_NAME("18041344222"),
> > > > + XE_RTP_RULES(GRAPHICS_VERSION(2004),
> > > > + FUNC(xe_rtp_match_first_render_or_compute),
> > > > + FUNC(xe_rtp_match_not_sriov_vf),
> > >
> > > We don't need this; nothing on the engine_was[] list applies to SRIOV
> > > VFs so we never apply any of them. This RTP match function is intended
> > > for OOB functions (and possibly LRC workarounds).
> >
> > I submitted a patch series https://patchwork.freedesktop.org/series/160865/
> > to remove the RTP match function xe_rtp_match_not_sriov_vf() from the
> > earlier WA patches and the SRIOV IGT tests are failing.
> >
> > Michal mentioned that with the below patch, WAs are applied for VFs
> > 92a5bd302458a1663 (drm/xe/vf: Unblock xe_rtp_process_to_sr for VFs)
>
> It looks like that failure is coming from the 'process' stage because of
> the logic in your xe_rtp_match_gt_has_discontiguous_dss_groups()
> function, not because we're actually trying to apply the workaround.
> Adding the 'FUNC(xe_rtp_match_not_sriov_vf)' rule happens to avoid the
> crash because of short-circuiting, but you'd still be getting the crash
> if you'd put that after your custom FUNC() rule instead.
>
>
> There are two steps for workarounds/tuning programming:
> - Step 1: process the RTP table that describes all workarounds to
> identify the subset that are relevant to the current device, and
> compile a reg_sr list containing that subset
> - Step 2: apply the contents of the reg_sr list to the hardware by
> writing the registers
>
> Of the three main classes of workarounds we have (GT, engine, and LRC),
> only the LRC workarounds are relevant to SRIOV VFs; VFs don't have
> access to read or update the registers in GT and engine workarounds.
> So the commit Michal referenced allowed Step 1 to go forward because
> VFs do legitimately need to process one of the workaround tables (the
> LRC table). However there's also commit c19e705ec981 ("drm/xe/vf: Stop
> applying save-restore MMIOs if VF") that blocks Step 2 for SRIOV VFs for
> the GT and engine workarounds (LRC workarounds are applied in a
> different way and not affected by that).
>
> So the result of those two commits is that we'll process the RTP tables
> for all three types of workarounds and generate reg_sr lists of
> workarounds that we initially think we need for the current device. For
> the SRIOV VF case, the reg_sr list that gets generated for GT/engine is
> bogus (since we _don't_ actually want or need the workarounds it
> identified), but it mostly doesn't really matter since we effectively
> throw the list away and never apply it.
>
> This mostly works, but there are a couple hitches:
>
> - Even if we'll never apply the workarounds for GT/engine in an
> SRIOV-VF, we still spend the time processing their RTP tables to
> needlessly generate a reg_sr. This is a waste of time, but usually
> harmless.
>
> - If any of the RULE conditions in an RTP entry try to read a register
> that an SRIOV_VF doesn't have access to, it will get back a value of
> zero. On its own this is harmless, but if the logic in the rule
> fails to account for 0 as a possible value, you can wind up with
> things like divide-by-zero errors, which is happening with your
> specific workaround.
>
> - Even though we never apply the register programming in the driver, we
> still report all the engine workarounds' registers to the GuC as part
> of the ADS regset, which will cause the GuC to attempt to
> save/restore them around resets. This is probably harmless since the
> VF GuC won't be able to read/write these registers, so nothing will
> happen (and the PF's KMD+GuC are probably already handling these
> appropriately on their own save/restore list anyway). But it's still a
> bit misleading/confusing.
>
> - I think (haven't checked) that since we filled out a reg_sr with a
> bunch of workarounds that we're not actually going to apply, some of
> the debugfs entries like 'register-save-restore' and 'workarounds'
> probably report misleading/incorrect information when run in a VF.
> Not a huge problem since it's a developer-only debug interface, but
> it could cause confusion.
>
>
> So maybe what we really want to do is block the processing of RTP => SR
> at the GT and engine callsites (but not the LRC callsite) for SRIOV VF.
> Then we won't waste time processing the rules when we already know they
> won't be applied, we won't run into problems with FUNC() rules that
> can't cope with SRIOV VF environments, and we won't report misleading
> information to the GuC and debugfs.
>
Thanks for the detailed explanation of how the workarounds are processed
and applied. I agree with your suggestion to block the processing of
RTP to SR at the GT and engine callsites for SRIOV VF. For this
workaround, I plan to do the following:
1. Add a check for SRIOV VF at the beginning of the function
   xe_rtp_match_gt_has_discontiguous_dss_groups(). All the functions in
   xe_gt_mcr.c and others that access steering information have the
   IS_SRIOV_VF() check. I think
   xe_rtp_match_gt_has_discontiguous_dss_groups() should have it too.
2. Remove xe_rtp_match_not_sriov_vf() from all applications of this
   workaround.
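
To make item 1 concrete, here is a rough, standalone sketch of the kind of
early return I have in mind. It is not the driver code: struct mock_gt, its
fields and mock_has_discontiguous_dss_groups() are hypothetical stand-ins,
and the loop body is only an assumed shape for a DSS-group walk that divides
by a per-group count which a VF would see as zero.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GT state; NOT the real struct xe_gt. */
struct mock_gt {
	bool is_sriov_vf;           /* plays the role of IS_SRIOV_VF(xe) */
	unsigned int dss_mask;      /* one bit per enabled DSS */
	unsigned int dss_per_group; /* assumed to read back as 0 on a VF */
};

/* Assumed logic: report true when an entire DSS group is skipped. */
static bool mock_has_discontiguous_dss_groups(const struct mock_gt *gt)
{
	int last_group = -1;

	/* Guard first: a VF must not reach the division below. */
	if (gt->is_sriov_vf)
		return false;

	for (unsigned int dss = 0; dss < 32; dss++) {
		int group;

		if (!(gt->dss_mask & (1u << dss)))
			continue;

		/* Would be a divide-by-zero if dss_per_group were 0. */
		group = (int)(dss / gt->dss_per_group);
		if (group - last_group > 1)
			return true;
		last_group = group;
	}

	return false;
}
```

With the guard inside the match function itself, the result no longer
depends on where FUNC(xe_rtp_match_not_sriov_vf) sits in the rule list, so
that rule can be dropped.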
The optimizations to skip processing of the GT and engine workarounds
for SRIOV VF will go in a separate patch.
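
For that separate patch, the idea as I understand it could look roughly like
the sketch below. Everything here is a mock written for illustration (the
enum, struct mock_reg_sr and the mock_* helpers are hypothetical names, not
existing xe functions); it only shows the intended gating: on a VF, skip the
RTP to SR processing for the GT and engine classes and keep it for LRC.

```c
#include <stdbool.h>
#include <stddef.h>

/* The three workaround classes discussed above (mock enum). */
enum mock_wa_class { MOCK_WA_GT, MOCK_WA_ENGINE, MOCK_WA_LRC };

/* Stand-in for a reg_sr list; only tracks how many entries it holds. */
struct mock_reg_sr {
	size_t count;
};

/* Step 1 stand-in: pretend every RTP table entry matches the device. */
static void mock_rtp_process_to_sr(size_t table_entries,
				   struct mock_reg_sr *sr)
{
	sr->count += table_entries;
}

/* Only LRC workarounds are relevant to an SRIOV VF. */
static bool mock_wa_class_relevant(bool is_sriov_vf, enum mock_wa_class cls)
{
	return !is_sriov_vf || cls == MOCK_WA_LRC;
}

/* Callsite-level gate: skip Step 1 entirely when it cannot matter. */
static void mock_wa_process(bool is_sriov_vf, enum mock_wa_class cls,
			    size_t table_entries, struct mock_reg_sr *sr)
{
	if (!mock_wa_class_relevant(is_sriov_vf, cls))
		return;

	mock_rtp_process_to_sr(table_entries, sr);
}
```

Gating at the callsite rather than per rule means the GT/engine reg_sr lists
stay empty on a VF, so nothing bogus would be handed to the GuC or shown in
debugfs.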
Please let me know if there are any issues with this approach.
Thank You
Harish.
> Matt
>
> >
> > Thank You
> > Harish.
> >
> > >
> > >
> > > Matt
> > >
> > > > + FUNC(xe_rtp_match_gt_has_discontiguous_dss_groups)),
> > > > + XE_RTP_ACTIONS(SET(TDL_CHICKEN, EUSTALL_PERF_SAMPLING_DISABLE))
> > > > + },
> > > >
> > > > /* Xe2_HPG */
> > > >
> > > > --
> > > > 2.43.0
> > > >
> > >
> > > --
> > > Matt Roper
> > > Graphics Software Engineer
> > > Linux GPU Platform Enablement
> > > Intel Corporation
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation