From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Zhanjun Dong <zhanjun.dong@intel.com>,
	Stuart Summers <stuart.summers@intel.com>,
	John Harrison <John.C.Harrison@Intel.com>,
	Sasha Levin <sashal@kernel.org>,
	lucas.demarchi@intel.com, thomas.hellstrom@linux.intel.com,
	rodrigo.vivi@intel.com, intel-xe@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.17-6.12] drm/xe/guc: Increase GuC crash dump buffer size
Date: Sat, 25 Oct 2025 11:57:10 -0400
Message-ID: <20251025160905.3857885-199-sashal@kernel.org>
In-Reply-To: <20251025160905.3857885-1-sashal@kernel.org>

From: Zhanjun Dong <zhanjun.dong@intel.com>

[ Upstream commit ad83b1da5b786ee2d245e41ce55cb1c71fed7c22 ]

Some platforms already have a maximum dump size of 12KB. To avoid
truncating data, increase the GuC crash dump buffer size to 16KB.

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250829160427.1245732-1-zhanjun.dong@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

- What changed
  - The non-debug GuC crash log buffer was doubled from 8 KB to 16 KB by
    changing `CRASH_BUFFER_SIZE` from `SZ_8K` to `SZ_16K` in
    drivers/gpu/drm/xe/xe_guc_log.h:20. Debug builds remain unchanged at
    1 MB (drivers/gpu/drm/xe/xe_guc_log.h:16).

- Why it matters (bugfix, not a feature)
  - The commit message states that some platforms produce crash dumps of
    up to 12 KB, which an 8 KB buffer truncates. That is a functional
    defect in diagnostics: incomplete crash logs hinder debugging and
    postmortem analysis. Increasing the buffer to 16 KB removes the
    truncation.

- Containment and safety
  - The size is consumed by the GuC CTL log parameter field using 4 KB
    units unless the size is a multiple of 1 MB. With 16 KB, the unit
    remains 4 KB and the value is encoded via `FIELD_PREP(GUC_LOG_CRASH,
    CRASH_BUFFER_SIZE / LOG_UNIT - 1)` in
    drivers/gpu/drm/xe/xe_guc.c:128, with `LOG_UNIT` set to `SZ_4K` for
    this case (drivers/gpu/drm/xe/xe_guc.c:101-107).
  - The GuC register field for the crash buffer size is 2 bits
    (`GUC_LOG_CRASH` is `REG_GENMASK(5, 4)`,
    drivers/gpu/drm/xe/xe_guc_fwif.h:94), encoding sizes of 4 KB, 8 KB,
    12 KB, and 16 KB. Setting 16 KB is the maximum representable and
    safely covers platforms needing 12 KB without truncation.
  - Compile-time checks enforce correctness and alignment:
    `BUILD_BUG_ON(!IS_ALIGNED(CRASH_BUFFER_SIZE, LOG_UNIT));` in
    drivers/gpu/drm/xe/xe_guc.c:118. 16 KB is aligned to 4 KB, so it
    passes.
  - The total BO allocation for logs increases by only 8 KB via
    `guc_log_size()` (drivers/gpu/drm/xe/xe_guc_log.c:61), which is
    negligible and localized to this driver. No ABI/API changes.
  - The change does not affect debug builds (`CONFIG_DRM_XE_DEBUG_GUC`),
    which already use 1 MB (drivers/gpu/drm/xe/xe_guc_log.h:16).

- Impact scope
  - Only the Intel Xe driver’s GuC logging path is affected. No
    architectural changes, no critical core subsystems touched. Memory
    impact is minimal and bounded per GT/tile.

- Stable criteria assessment
  - Fixes a real user-facing issue (truncated GuC crash dumps) that
    impairs diagnostics.
  - Small, contained change to a single constant; low regression risk.
  - No new features; no behavioral change beyond preventing truncation.
  - Aligns with hardware encodings and existing compile-time guards.
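
The size encoding described under "Containment and safety" can be
illustrated with a small user-space sketch. This is not the driver code:
`log_unit()`, `encode_crash_size()`, `CRASH_FIELD_SHIFT`, and
`CRASH_FIELD_MASK` are hypothetical names standing in for `LOG_UNIT`,
`FIELD_PREP(GUC_LOG_CRASH, ...)`, and `REG_GENMASK(5, 4)`, and the unit
selection rule is taken from the description above rather than from the
source tree.

```c
#include <assert.h>
#include <stdint.h>

/* Size constants with the same values as the kernel's SZ_* macros. */
#define SZ_4K  0x1000u
#define SZ_8K  0x2000u
#define SZ_16K 0x4000u
#define SZ_1M  0x100000u

/* Hypothetical stand-ins for a 2-bit field at bits 5:4. */
#define CRASH_FIELD_SHIFT 4
#define CRASH_FIELD_MASK  (0x3u << CRASH_FIELD_SHIFT)

/* Compile-time checks in the spirit of the BUILD_BUG_ON() cited above. */
_Static_assert(SZ_16K % SZ_4K == 0, "16 KB must be 4 KB aligned");
_Static_assert(SZ_16K / SZ_4K - 1 <= 3, "16 KB must fit in a 2-bit field");

/* Assumed rule: 1 MB units for multiples of 1 MB, 4 KB units otherwise. */
static uint32_t log_unit(uint32_t size)
{
	return (size % SZ_1M == 0) ? SZ_1M : SZ_4K;
}

/* Encode a non-zero buffer size as (size / unit - 1) in bits 5:4. */
static uint32_t encode_crash_size(uint32_t size)
{
	uint32_t unit = log_unit(size);
	uint32_t count;

	assert(size != 0 && size % unit == 0); /* size must be unit aligned */
	count = size / unit - 1;
	assert(count <= 3);                    /* only 2 bits are available */

	return (count << CRASH_FIELD_SHIFT) & CRASH_FIELD_MASK;
}
```

Under these assumptions, 8 KB encodes as field value 1, 12 KB as 2, and
16 KB as 3, the largest value two bits can hold. A 1 MB buffer switches
to 1 MB units and encodes as 0.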

Given the clear bugfix nature, minimal risk, and confined scope, this is
a good candidate for stable backporting.

 drivers/gpu/drm/xe/xe_guc_log.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_log.h b/drivers/gpu/drm/xe/xe_guc_log.h
index f1e2b0be90a9f..98a47ac42b08f 100644
--- a/drivers/gpu/drm/xe/xe_guc_log.h
+++ b/drivers/gpu/drm/xe/xe_guc_log.h
@@ -17,7 +17,7 @@ struct xe_device;
 #define DEBUG_BUFFER_SIZE       SZ_8M
 #define CAPTURE_BUFFER_SIZE     SZ_2M
 #else
-#define CRASH_BUFFER_SIZE	SZ_8K
+#define CRASH_BUFFER_SIZE	SZ_16K
 #define DEBUG_BUFFER_SIZE	SZ_64K
 #define CAPTURE_BUFFER_SIZE	SZ_1M
 #endif
-- 
2.51.0

