Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Harish Chegondi <harish.chegondi@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 1/1] drm/xe/eustall: Add WA 14027054324 support for graphics IP 35.11
Date: Wed, 1 Jul 2026 11:31:27 -0700	[thread overview]
Message-ID: <akVc_4Pp1MyCM5IH@intel.com> (raw)
In-Reply-To: <20260624205851.GP6214@mdroper-desk1.amr.corp.intel.com>

On Wed, Jun 24, 2026 at 01:58:51PM -0700, Matt Roper wrote:
> On Tue, Jun 16, 2026 at 04:00:18PM -0700, Harish Chegondi wrote:
> > WA 14027054324 is implemented in the firmware and is applied before
> > EU stall sampling and reverted after EU stall sampling. The driver
> > needs to notify the firmware whenever EU stall sampling is being
> > enabled/disabled so that the firmware takes the necessary action.
> > The driver uses a scratch pad register to communicate with the firmware.
> > 
> > Before enabling EU stall sampling, write 0x20 to the SWF scratch pad
> > register to request the firmware to apply the workaround. The firmware
> > applies the workaround and sets the scratch pad register to 0x60
> > as an ACK.
> > 
> > Before disabling EU stall sampling, write 0x40 to the SWF scratch pad
> > register to request the firmware to revert the workaround. The firmware
> > reverts the workaround and sets the scratch pad register to 0 as an ACK.
> > 
> > Bspec update for the SWF scratch pad register is still pending, but has
> > been confirmed offline with the firmware team.
> > 
> > Bspec: 53188
> > Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> > ---
> >  drivers/gpu/drm/xe/regs/xe_gt_regs.h |  3 +++
> >  drivers/gpu/drm/xe/xe_eu_stall.c     | 37 +++++++++++++++++++++++++---
> >  drivers/gpu/drm/xe/xe_wa_oob.rules   |  1 +
> >  3 files changed, 38 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > index 08251c7a1a4b..2a649d036660 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > @@ -616,6 +616,9 @@
> >  #define   CCS_MODE_CSLICE(cslice, ccs) \
> >  	((ccs) << ((cslice) * CCS_MODE_CSLICE_WIDTH))
> >  
> > +#define SWF_SCRATCHPAD				XE_REG(0x4f000)
> 
> This register isn't part of the GT, so we probably don't want the
> definition in this file.  xe_regs.h would be a better spot for a non-GT
> register that doesn't fall into one of the other unit-specific files.
> 
> Also, even if we're just using one of the scratchpad registers at the
> moment, there are technically 36 of them in hardware, so it might be
> best to make this a parameterized macro and use SWF_SCRATCHPAD(0) to
> make it explicit which one we're referring to (and to support future
> code that might need to use some of the other ones).

I will make the above changes in the next version of the patch.

> 
> > +#define   SWF_EUSTALL_MASK			REG_GENMASK(6, 5)
> > +
> >  #define FORCEWAKE_ACK_GT			XE_REG(0x130044)
> >  
> >  /* Applicable for all FORCEWAKE_DOMAIN and FORCEWAKE_ACK_DOMAIN regs */
> > diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> > index d37770c58c5d..fa54187de5dc 100644
> > --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> > +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/fs.h>
> >  #include <linux/poll.h>
> >  #include <linux/types.h>
> > +#include <linux/iopoll.h>
> >  
> >  #include <drm/drm_drv.h>
> >  #include <generated/xe_wa_oob.h>
> > @@ -20,6 +21,7 @@
> >  #include "xe_gt_printk.h"
> >  #include "xe_gt_topology.h"
> >  #include "xe_macros.h"
> > +#include "xe_mmio.h"
> >  #include "xe_observation.h"
> >  #include "xe_pm.h"
> >  #include "xe_trace.h"
> > @@ -30,6 +32,11 @@
> >  
> >  #define POLL_PERIOD_MS	5
> >  
> > +#define REQ_EUSTALL_ENABLE	0x20
> > +#define ACK_EUSTALL_ENABLE	0x60
> > +#define REQ_EUSTALL_DISABLE	0x40
> > +#define ACK_EUSTALL_DISABLE	0
> > +
> >  static size_t per_xecore_buf_size = SZ_512K;
> >  
> >  struct per_xecore_buf {
> > @@ -682,7 +689,7 @@ static int xe_eu_stall_stream_enable(struct xe_eu_stall_data_stream *stream)
> >  	struct per_xecore_buf *xecore_buf;
> >  	struct xe_gt *gt = stream->gt;
> >  	u16 group, instance;
> > -	int xecore;
> > +	int xecore, ret = 0;
> >  
> >  	/* Take runtime pm ref and forcewake to disable RC6 */
> >  	xe_pm_runtime_get(gt_to_xe(gt));
> > @@ -693,6 +700,19 @@ static int xe_eu_stall_stream_enable(struct xe_eu_stall_data_stream *stream)
> >  		return -ETIMEDOUT;
> >  	}
> >  
> > +	if (XE_GT_WA(gt, 14027054324)) {
> > +		/* Request the firmware to apply the workaround and wait for an ACK */
> > +		xe_mmio_rmw32(&gt->mmio, SWF_SCRATCHPAD, SWF_EUSTALL_MASK, REQ_EUSTALL_ENABLE);
> 
> Since the workaround details still aren't updated yet, I think we need
> to get someone to explicitly confirm what the expectations are for this
> register:

I have requested for the workaround to be updated and they did update.
> 
>  * If they guarantee that this workaround is the sole owner of this
>    register (none of the other register bits will ever be used for other
>    purposes and/or other software/firmware clients), then we can go back
>    to just doing a direct write here.

I have sought clarification on this and I was told that as of now this
workaround is the sole user of this register. I was told that the Bspec
page of this scratchpad register will be updated soon.
> 
>  * If they can't guarantee that other bits of this register won't be
>    used for anything else, then they need to provide some coordination
>    strategy for updates.  Read-modify-write is racy, and it's not just
>    other KMD code running on the CPU that we're potentially racing with.
>    There could be firmware-to-firmware communication going on in other
>    bits that we have no visiblity of, so no way of knowing when it's
>    "safe" to do an RMW.  That's why things like MCR steering have
>    special "semaphore" registers that are used to temporarily acquire
>    exclusive access to the real register.

I was told that in theory the other bits of this register can be used
for a different purpose in the future but there are several of these
registers and usually other register instances are used for a different
purpose. Also, as per the updated workaround, the initial condition is:
SWF_0 == 0x00. So, is it safe to assume this workaround is the sole
user? If so, I will change the patch to doing a direct write.

> 
> > +		ret = read_poll_timeout(xe_mmio_read32, reg_value,
> > +					(reg_value & SWF_EUSTALL_MASK) == ACK_EUSTALL_ENABLE,
> > +					1000, 1000000, true, &gt->mmio, SWF_SCRATCHPAD);
> 
> Was the 1 second timeout here suggested by the firmware team or
> workaround owners, or are we just picking something that sounds like a
> long timeout and hoping that it's long enough?  If we're waiting for a
> formal update to the workaround text, it would be good if we could ask
> them to provide worst case timing information so that we know for sure
> what an appropriate timeout value would be.

I just picked 1 second hoping that it is long enough. I have requested
the worst case timing information to be updated in the workaround, but
it hasn't been updated yet. But it is mentioned in the workaround that
the system controller polls this register every 1ms. I was told that the
firmware response time to apply the workaround should be 1ms too. I
think the timeout can be reduced to maybe 100 milliseconds. I will
follow up on getting the worst case timeout into the workaround details.

Thank You
Harish.

> 
> 
> Matt
> 
> > +		if (ret) {
> > +			xe_gt_err(gt, "Timeout polling for EU stall enable ACK from firmware\n");
> > +			xe_force_wake_put(gt_to_fw(gt), stream->fw_ref);
> > +			xe_pm_runtime_put(gt_to_xe(gt));
> > +			return ret;
> > +		}
> > +	}
> >  	if (XE_GT_WA(gt, 22016596838))
> >  		xe_gt_mcr_multicast_write(gt, ROW_CHICKEN2,
> >  					  REG_MASKED_FIELD_ENABLE(DISABLE_DOP_GATING));
> > @@ -730,7 +750,7 @@ static int xe_eu_stall_stream_enable(struct xe_eu_stall_data_stream *stream)
> >  	reg_value |= XEHPC_EUSTALL_BASE_ENABLE_SAMPLING;
> >  	xe_gt_mcr_multicast_write(gt, XEHPC_EUSTALL_BASE, reg_value);
> >  
> > -	return 0;
> > +	return ret;
> >  }
> >  
> >  static void eu_stall_data_buf_poll_work_fn(struct work_struct *work)
> > @@ -840,6 +860,8 @@ static int xe_eu_stall_enable_locked(struct xe_eu_stall_data_stream *stream)
> >  static int xe_eu_stall_disable_locked(struct xe_eu_stall_data_stream *stream)
> >  {
> >  	struct xe_gt *gt = stream->gt;
> > +	u32 reg_value;
> > +	int ret = 0;
> >  
> >  	if (!stream->enabled)
> >  		return 0;
> > @@ -853,11 +875,20 @@ static int xe_eu_stall_disable_locked(struct xe_eu_stall_data_stream *stream)
> >  	if (XE_GT_WA(gt, 22016596838))
> >  		xe_gt_mcr_multicast_write(gt, ROW_CHICKEN2,
> >  					  REG_MASKED_FIELD_DISABLE(DISABLE_DOP_GATING));
> > +	if (XE_GT_WA(gt, 14027054324)) {
> > +		/* Request the firmware to revert the workaround and wait for an ACK */
> > +		xe_mmio_rmw32(&gt->mmio, SWF_SCRATCHPAD, SWF_EUSTALL_MASK, REQ_EUSTALL_DISABLE);
> > +		ret = read_poll_timeout(xe_mmio_read32, reg_value,
> > +					(reg_value & SWF_EUSTALL_MASK) == ACK_EUSTALL_DISABLE,
> > +					1000, 1000000, true, &gt->mmio, SWF_SCRATCHPAD);
> > +		if (ret)
> > +			xe_gt_err(gt, "Timeout polling for EU stall disable ACK from firmware\n");
> > +	}
> >  
> >  	xe_force_wake_put(gt_to_fw(gt), stream->fw_ref);
> >  	xe_pm_runtime_put(gt_to_xe(gt));
> >  
> > -	return 0;
> > +	return ret;
> >  }
> >  
> >  static long xe_eu_stall_stream_ioctl_locked(struct xe_eu_stall_data_stream *stream,
> > diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
> > index 9027365f0043..41734b9672c5 100644
> > --- a/drivers/gpu/drm/xe/xe_wa_oob.rules
> > +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
> > @@ -66,3 +66,4 @@
> >  14025883347	MEDIA_VERSION_RANGE(1301, 3503)
> >  		GRAPHICS_VERSION_RANGE(2004, 3005)
> >  16029380221	MEDIA_VERSION(3500)
> > +14027054324	GRAPHICS_VERSION(3511)
> > -- 
> > 2.43.0
> > 
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation

      reply	other threads:[~2026-07-01 18:31 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16 23:00 [PATCH 1/1] drm/xe/eustall: Add WA 14027054324 support for graphics IP 35.11 Harish Chegondi
2026-06-16 23:07 ` ✓ CI.KUnit: success for series starting with [1/1] " Patchwork
2026-06-17  0:07 ` ✓ Xe.CI.BAT: " Patchwork
2026-06-17  5:01 ` ✓ Xe.CI.FULL: " Patchwork
2026-06-24 20:58 ` [PATCH 1/1] " Matt Roper
2026-07-01 18:31   ` Harish Chegondi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akVc_4Pp1MyCM5IH@intel.com \
    --to=harish.chegondi@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.d.roper@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox