Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Sk Anirban <sk.anirban@intel.com>,
	Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	Lucas De Marchi <lucas.demarchi@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	thomas.hellstrom@linux.intel.com, rodrigo.vivi@intel.com,
	John.C.Harrison@Intel.com, badal.nilawar@intel.com,
	nitin.r.gote@intel.com, alexandre.f.demers@gmail.com,
	intel-xe@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.17] drm/xe/ptl: Apply Wa_16026007364
Date: Sat, 25 Oct 2025 11:54:57 -0400	[thread overview]
Message-ID: <20251025160905.3857885-66-sashal@kernel.org> (raw)
In-Reply-To: <20251025160905.3857885-1-sashal@kernel.org>

From: Sk Anirban <sk.anirban@intel.com>

[ Upstream commit d72779c29d82c6e371cea8b427550bd6923c2577 ]

As part of this WA GuC will save and restore value of two XE3_Media
control registers that were not included in the HW power context.

Signed-off-by: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://lore.kernel.org/r/20250716101622.3421480-2-sk.anirban@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

- What changed (files and specifics)
  - drivers/gpu/drm/xe/abi/guc_klvs_abi.h: Adds a new GuC WA KLV key
    enum, `GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG = 0x900c` (near
    existing WA keys at drivers/gpu/drm/xe/abi/guc_klvs_abi.h:350). This
    is a pure additive identifier so existing paths are unaffected.
  - drivers/gpu/drm/xe/xe_guc_ads.c: Introduces
    `guc_waklv_enable_two_word(...)`, a helper to emit a KLV with LEN=2
    and two 32‑bit payload values, mirroring the existing one‑word and
    simple (LEN=0) helpers (see the existing one‑word helper at
    drivers/gpu/drm/xe/xe_guc_ads.c:288 and `guc_waklv_init` at
    drivers/gpu/drm/xe/xe_guc_ads.c:338).
    - In `guc_waklv_init`, it conditionally emits the new WA KLV only
      when both conditions hold:
      - GuC firmware release version is >= `MAKE_GUC_VER(70, 47, 0)`.
      - Platform WA bit `XE_WA(gt, 16026007364)` is set.
    - The KLV is sent with two dwords `0x0` and `0xF`, via
      `guc_waklv_enable_two_word(...)`, and appended into the existing
      ADS WA KLV buffer using the same offset/remain logic and
      `xe_map_memcpy_to(...)` used by other KLV helpers; on insufficient
      space it only `drm_warn(...)`, matching style already used for
      other entries.
  - drivers/gpu/drm/xe/xe_wa_oob.rules: Adds WA rule `16026007364
    MEDIA_VERSION(3000)`, causing the build to generate
    `XE_WA_OOB_16026007364` for Xe3 Media in `<generated/xe_wa_oob.h>`,
    which then makes `XE_WA(gt, 16026007364)` true on the intended
    hardware.

- Why it matters (bug impact)
  - Commit message: “As part of this WA GuC will save and restore value
    of two XE3_Media control registers that were not included in the HW
    power context.” That means state for two critical media control
    registers is lost across power context save/restore without this WA.
    On affected Xe3 Media platforms, this can cause functional issues on
    power transitions (e.g., RC6 exit, power-gating), impacting users
    running media workloads.

- Scope and risk
  - Small and contained: One enum addition, one helper function, one
    gated call in `guc_waklv_init`, and one WA rule line. No
    architectural changes; no ABI/uAPI changes.
  - Proper gating minimizes regression risk:
    - Firmware gating: The KLV is only sent when
      `GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 47, 0)`, so
      older GuC releases won’t see unknown keys.
    - Hardware gating: The WA is enabled only when `XE_WA(gt,
      16026007364)` is set by the generated OOB WA database for
      `MEDIA_VERSION(3000)`. Other platforms remain untouched.
  - Buffering safety: The WA KLV blob is sized to a full page
    (`guc_ads_waklv_size` returns `SZ_4K`), and the helper checks
    `remain` before writing; failure cases only warn and do not corrupt
    data.
  - Consistency: The new two‑word helper mirrors existing
    one‑word/simple KLV emission patterns
    (drivers/gpu/drm/xe/xe_guc_ads.c:288, 315), and the KLV header
    fields (`GUC_KLV_0_KEY`, `GUC_KLV_0_LEN`) are used consistently with
    other KLV users across the driver.

- Stable backport criteria
  - Fixes a real, user‑visible hardware/firmware interaction bug (lost
    register state on Xe3 Media power transitions).
  - Minimal, localized change within DRM/Xe; no core kernel or
    cross‑subsystem impact.
  - No new features; it only enables a firmware WA when present and
    applicable.
  - Low regression risk due to strict firmware and platform gating.
  - Even if the deployed firmware is older than 70.47.0, the change is
    inert (the KLV is not sent), so it cannot regress those systems and
    transparently benefits systems once they pick up newer GuC firmware.

Given the above, this is a good candidate for stable backport to improve
reliability on affected Xe3 Media platforms with appropriate GuC
firmware.

 drivers/gpu/drm/xe/abi/guc_klvs_abi.h |  1 +
 drivers/gpu/drm/xe/xe_guc_ads.c       | 35 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_wa_oob.rules    |  1 +
 3 files changed, 37 insertions(+)

diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index d7719d0e36ca7..45a321d0099f1 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -421,6 +421,7 @@ enum xe_guc_klv_ids {
 	GUC_WORKAROUND_KLV_ID_BACK_TO_BACK_RCS_ENGINE_RESET				= 0x9009,
 	GUC_WA_KLV_WAKE_POWER_DOMAINS_FOR_OUTBOUND_MMIO					= 0x900a,
 	GUC_WA_KLV_RESET_BB_STACK_PTR_ON_VF_SWITCH					= 0x900b,
+	GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG					= 0x900c,
 };
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
index 131cfc56be00a..8ff8626227ae4 100644
--- a/drivers/gpu/drm/xe/xe_guc_ads.c
+++ b/drivers/gpu/drm/xe/xe_guc_ads.c
@@ -284,6 +284,35 @@ static size_t calculate_golden_lrc_size(struct xe_guc_ads *ads)
 	return total_size;
 }
 
+static void guc_waklv_enable_two_word(struct xe_guc_ads *ads,
+				      enum xe_guc_klv_ids klv_id,
+				      u32 value1,
+				      u32 value2,
+				      u32 *offset, u32 *remain)
+{
+	u32 size;
+	u32 klv_entry[] = {
+			/* 16:16 key/length */
+			FIELD_PREP(GUC_KLV_0_KEY, klv_id) |
+			FIELD_PREP(GUC_KLV_0_LEN, 2),
+			value1,
+			value2,
+			/* 2 dword data */
+	};
+
+	size = sizeof(klv_entry);
+
+	if (*remain < size) {
+		drm_warn(&ads_to_xe(ads)->drm,
+			 "w/a klv buffer too small to add klv id %d\n", klv_id);
+	} else {
+		xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), *offset,
+				 klv_entry, size);
+		*offset += size;
+		*remain -= size;
+	}
+}
+
 static void guc_waklv_enable_one_word(struct xe_guc_ads *ads,
 				      enum xe_guc_klv_ids klv_id,
 				      u32 value,
@@ -381,6 +410,12 @@ static void guc_waklv_init(struct xe_guc_ads *ads)
 		guc_waklv_enable_simple(ads,
 					GUC_WA_KLV_RESET_BB_STACK_PTR_ON_VF_SWITCH,
 					&offset, &remain);
+	if (GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 47, 0) && XE_WA(gt, 16026007364))
+		guc_waklv_enable_two_word(ads,
+					  GUC_WA_KLV_RESTORE_UNSAVED_MEDIA_CONTROL_REG,
+					  0x0,
+					  0xF,
+					  &offset, &remain);
 
 	size = guc_ads_waklv_size(ads) - remain;
 	if (!size)
diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
index 710f4423726c9..48c7a42e2fcad 100644
--- a/drivers/gpu/drm/xe/xe_wa_oob.rules
+++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
@@ -73,3 +73,4 @@ no_media_l3	MEDIA_VERSION(3000)
 14022085890	GRAPHICS_VERSION(2001)
 
 15015404425_disable	PLATFORM(PANTHERLAKE), MEDIA_STEP(B0, FOREVER)
+16026007364    MEDIA_VERSION(3000)
-- 
2.51.0


  parent reply	other threads:[~2025-10-25 16:12 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20251025160905.3857885-1-sashal@kernel.org>
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17] drm/xe/pcode: Initialize data0 for pcode read routine Sasha Levin
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17] drm/xe: improve dma-resv handling for backup object Sasha Levin
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17] drm/xe: Extend wa_13012615864 to additional Xe2 and Xe3 platforms Sasha Levin
2025-10-25 15:54 ` Sasha Levin [this message]
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17] drm/xe: Set GT as wedged before sending wedged uevent Sasha Levin
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17] drm/xe/i2c: Enable bus mastering Sasha Levin
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17] drm/xe/configfs: Enforce canonical device names Sasha Levin
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17] drm/xe: Extend Wa_22021007897 to Xe3 platforms Sasha Levin
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17] drm/xe: Cancel pending TLB inval workers on teardown Sasha Levin
2025-10-25 15:57 ` [PATCH AUTOSEL 6.17-6.12] drm/xe/guc: Increase GuC crash dump buffer size Sasha Levin
2025-10-25 15:57 ` [PATCH AUTOSEL 6.17] drm/xe/wcl: Extend L3bank mask workaround Sasha Levin
2025-10-25 15:57 ` [PATCH AUTOSEL 6.17-6.12] drm/xe/guc: Set upper limit of H2G retries over CTB Sasha Levin
2025-10-25 15:57 ` [PATCH AUTOSEL 6.17] drm/xe: Make page size consistent in loop Sasha Levin
2025-10-25 15:57 ` [PATCH AUTOSEL 6.17] drm/xe/guc: Add devm release action to safely tear down CT Sasha Levin
2025-10-25 15:57 ` [PATCH AUTOSEL 6.17] drm/xe/pf: Program LMTT directory pointer on all GTs within a tile Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17] drm/xe/guc: Always add CT disable action during second init step Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17] drm/xe/pf: Don't resume device from restart worker Sasha Levin
2025-10-25 15:59 ` [PATCH AUTOSEL 6.17-6.12] drm/xe/guc: Return an error code if the GuC load fails Sasha Levin
2025-10-25 15:59 ` [PATCH AUTOSEL 6.17] drm/xe: Ensure GT is in C0 during resumes Sasha Levin
2025-10-25 15:59 ` [PATCH AUTOSEL 6.17] drm/xe: rework PDE PAT index selection Sasha Levin
2025-10-25 16:01 ` [PATCH AUTOSEL 6.17-6.12] drm/xe/guc: Add more GuC load error status codes Sasha Levin
2025-10-25 16:01 ` [PATCH AUTOSEL 6.17-6.12] drm/xe: Fix oops in xe_gem_fault when running core_hotunplug test Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251025160905.3857885-66-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=John.C.Harrison@Intel.com \
    --cc=alexandre.f.demers@gmail.com \
    --cc=badal.nilawar@intel.com \
    --cc=daniele.ceraolospurio@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=lucas.demarchi@intel.com \
    --cc=nitin.r.gote@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=rodrigo.vivi@intel.com \
    --cc=sk.anirban@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox