dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx debugbus read
       [not found] <20250902120833.1342615-1-sashal@kernel.org>
@ 2025-09-02 12:08 ` Sasha Levin
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx TPL1 cluster snapshot Sasha Levin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-09-02 12:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Rob Clark, Sasha Levin, linux-arm-msm, dri-devel, freedreno

From: Rob Clark <robin.clark@oss.qualcomm.com>

[ Upstream commit 13ed0a1af263b56a5ebbf38ab7163cbc9dcb009e ]

The bitfield positions changed in a7xx.

v2: Don't open-code the bitfield building
v3: Also fix cx_debugbus

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/666659/
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Bug Fix Analysis

The commit fixes a functional bug in the MSM DRM driver's debugbus
reading functionality for Adreno 7xx GPUs. Specifically:

1. **Incorrect Bitfield Positions**: The code was using A6XX bitfield
   definitions for A7XX hardware, but the hardware changed the bitfield
   positions. Looking at the changes in `a6xx.xml`:
   - A6XX uses bits 15:8 for `PING_BLK_SEL`
   - A7XX uses bits 24:16 for `PING_BLK_SEL`
   This mismatch would cause incorrect values to be written to the debug
bus selection registers, resulting in wrong or garbage data being read.

2. **Affects Core Debugging Functionality**: The debugbus is critical
   for GPU state capture during crashes and debugging. Without this fix,
   debugging A7XX GPUs would be severely impaired as the debugbus reads
   would return incorrect data.

## Stable Criteria Evaluation

1. **Fixes a real bug**: Yes - incorrect register programming leading to
   wrong debug data
2. **Small and contained**: Yes - only 46 lines changed across 2 files
3. **No major side effects**: The changes are conditional based on GPU
   family and only affect debug bus reading
4. **No architectural changes**: Simple conditional logic to use correct
   bitfield definitions
5. **Critical subsystem impact**: Moderate - affects GPU debugging
   capabilities which are important for diagnosing GPU hangs/crashes
6. **Already upstream**: Yes - commit
   13ed0a1af263b56a5ebbf38ab7163cbc9dcb009e is already merged upstream

## Code Safety Analysis

The fix is safe because:
- It adds proper family checks: `if (to_adreno_gpu(gpu)->info->family >=
  ADRENO_7XX_GEN1)`
- Maintains backward compatibility for A6XX hardware
- Only changes register programming for debug bus reads, not normal GPU
  operation
- The fix pattern is applied consistently to both `debugbus_read()` and
  `cx_debugbus_read()` functions

## Impact Assessment

Without this fix, users with A7XX GPUs would experience:
- Incorrect GPU state dumps when GPU crashes occur
- Inability to properly debug GPU issues
- Misleading diagnostic information that could hamper bug resolution

This is a clear bugfix that restores proper functionality for A7XX
hardware without introducing new features or risky changes, making it an
ideal candidate for stable backporting.

 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c   | 32 ++++++++++++++-----
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 14 +++++++-
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index f46bc906ca2a3..61850e2802914 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -174,8 +174,15 @@ static int a6xx_crashdumper_run(struct msm_gpu *gpu,
 static int debugbus_read(struct msm_gpu *gpu, u32 block, u32 offset,
 		u32 *data)
 {
-	u32 reg = A6XX_DBGC_CFG_DBGBUS_SEL_D_PING_INDEX(offset) |
-		A6XX_DBGC_CFG_DBGBUS_SEL_D_PING_BLK_SEL(block);
+	u32 reg;
+
+	if (to_adreno_gpu(gpu)->info->family >= ADRENO_7XX_GEN1) {
+		reg = A7XX_DBGC_CFG_DBGBUS_SEL_D_PING_INDEX(offset) |
+			A7XX_DBGC_CFG_DBGBUS_SEL_D_PING_BLK_SEL(block);
+	} else {
+		reg = A6XX_DBGC_CFG_DBGBUS_SEL_D_PING_INDEX(offset) |
+			A6XX_DBGC_CFG_DBGBUS_SEL_D_PING_BLK_SEL(block);
+	}
 
 	gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_SEL_A, reg);
 	gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_SEL_B, reg);
@@ -198,11 +205,18 @@ static int debugbus_read(struct msm_gpu *gpu, u32 block, u32 offset,
 	readl((ptr) + ((offset) << 2))
 
 /* read a value from the CX debug bus */
-static int cx_debugbus_read(void __iomem *cxdbg, u32 block, u32 offset,
+static int cx_debugbus_read(struct msm_gpu *gpu, void __iomem *cxdbg, u32 block, u32 offset,
 		u32 *data)
 {
-	u32 reg = A6XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_INDEX(offset) |
-		A6XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_BLK_SEL(block);
+	u32 reg;
+
+	if (to_adreno_gpu(gpu)->info->family >= ADRENO_7XX_GEN1) {
+		reg = A7XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_INDEX(offset) |
+			A7XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_BLK_SEL(block);
+	} else {
+		reg = A6XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_INDEX(offset) |
+			A6XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_BLK_SEL(block);
+	}
 
 	cxdbg_write(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_SEL_A, reg);
 	cxdbg_write(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_SEL_B, reg);
@@ -315,7 +329,8 @@ static void a6xx_get_debugbus_block(struct msm_gpu *gpu,
 		ptr += debugbus_read(gpu, block->id, i, ptr);
 }
 
-static void a6xx_get_cx_debugbus_block(void __iomem *cxdbg,
+static void a6xx_get_cx_debugbus_block(struct msm_gpu *gpu,
+		void __iomem *cxdbg,
 		struct a6xx_gpu_state *a6xx_state,
 		const struct a6xx_debugbus_block *block,
 		struct a6xx_gpu_state_obj *obj)
@@ -330,7 +345,7 @@ static void a6xx_get_cx_debugbus_block(void __iomem *cxdbg,
 	obj->handle = block;
 
 	for (ptr = obj->data, i = 0; i < block->count; i++)
-		ptr += cx_debugbus_read(cxdbg, block->id, i, ptr);
+		ptr += cx_debugbus_read(gpu, cxdbg, block->id, i, ptr);
 }
 
 static void a6xx_get_debugbus_blocks(struct msm_gpu *gpu,
@@ -527,7 +542,8 @@ static void a6xx_get_debugbus(struct msm_gpu *gpu,
 			int i;
 
 			for (i = 0; i < nr_cx_debugbus_blocks; i++)
-				a6xx_get_cx_debugbus_block(cxdbg,
+				a6xx_get_cx_debugbus_block(gpu,
+					cxdbg,
 					a6xx_state,
 					&cx_debugbus_blocks[i],
 					&a6xx_state->cx_debugbus[i]);
diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
index d860fd94feae8..86fab2750ba7b 100644
--- a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
+++ b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
@@ -594,10 +594,14 @@ by a particular renderpass/blit.
 	<reg32 offset="0x0600" name="DBGC_CFG_DBGBUS_SEL_A"/>
 	<reg32 offset="0x0601" name="DBGC_CFG_DBGBUS_SEL_B"/>
 	<reg32 offset="0x0602" name="DBGC_CFG_DBGBUS_SEL_C"/>
-	<reg32 offset="0x0603" name="DBGC_CFG_DBGBUS_SEL_D">
+	<reg32 offset="0x0603" name="DBGC_CFG_DBGBUS_SEL_D" variants="A6XX">
 		<bitfield high="7" low="0" name="PING_INDEX"/>
 		<bitfield high="15" low="8" name="PING_BLK_SEL"/>
 	</reg32>
+	<reg32 offset="0x0603" name="DBGC_CFG_DBGBUS_SEL_D" variants="A7XX-">
+		<bitfield high="7" low="0" name="PING_INDEX"/>
+		<bitfield high="24" low="16" name="PING_BLK_SEL"/>
+	</reg32>
 	<reg32 offset="0x0604" name="DBGC_CFG_DBGBUS_CNTLT">
 		<bitfield high="5" low="0" name="TRACEEN"/>
 		<bitfield high="14" low="12" name="GRANU"/>
@@ -3796,6 +3800,14 @@ by a particular renderpass/blit.
 	<reg32 offset="0x0030" name="CFG_DBGBUS_TRACE_BUF2"/>
 </domain>
 
+<domain name="A7XX_CX_DBGC" width="32">
+	<!-- Bitfields shifted, but otherwise the same: -->
+	<reg32 offset="0x0000" name="CFG_DBGBUS_SEL_A" variants="A7XX-">
+		<bitfield high="7" low="0" name="PING_INDEX"/>
+		<bitfield high="24" low="16" name="PING_BLK_SEL"/>
+	</reg32>
+</domain>
+
 <domain name="A6XX_CX_MISC" width="32" prefix="variant" varset="chip">
 	<reg32 offset="0x0001" name="SYSTEM_CACHE_CNTL_0"/>
 	<reg32 offset="0x0002" name="SYSTEM_CACHE_CNTL_1"/>
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx TPL1 cluster snapshot
       [not found] <20250902120833.1342615-1-sashal@kernel.org>
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx debugbus read Sasha Levin
@ 2025-09-02 12:08 ` Sasha Levin
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix debugbus snapshot Sasha Levin
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix order of selector programming in cluster snapshot Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-09-02 12:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Rob Clark, Sasha Levin, linux-arm-msm, dri-devel, freedreno

From: Rob Clark <robin.clark@oss.qualcomm.com>

[ Upstream commit e9621ef610c4a600678da5d8020d4a0dfe686faa ]

Later gens have both a PIPE_BR and PIPE_NONE section.  The snapshot tool
seems to expect this for x1-85 as well.  I guess this was just a bug in
downstream kgsl, which went unnoticed?

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/666662/
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees. Here's the
extensive analysis:

## Technical Analysis

### Nature of the Fix
The commit fixes a bug in GPU crash dump snapshot generation for Adreno
a7xx GPUs (specifically the x1-85 variant). The issue is in the TPL1
(Texture Processing Level 1) cluster snapshot register definitions where
the PIPE_NONE section was missing.

### Code Changes Analysis

1. **In `adreno_gen7_0_0_snapshot.h`**:
   - The original code had only
     `gen7_0_0_tpl1_noncontext_pipe_br_registers` which included all
     registers (0x0b600-0x0b633)
   - The fix splits this into two separate arrays:
     - `gen7_0_0_tpl1_noncontext_pipe_none_registers`: Contains the bulk
       of registers (0x0b600-0x0b633)
     - `gen7_0_0_tpl1_noncontext_pipe_br_registers`: Now only contains
       register 0x0b600

2. **In `adreno_gen7_2_0_snapshot.h`**:
   - Adds a new entry to the `gen7_2_0_sptp_clusters` array:
     ```c
     { A7XX_CLUSTER_NONE, A7XX_TP0_NCTX_REG, A7XX_PIPE_NONE, 0,
     A7XX_USPTP,
     gen7_0_0_tpl1_noncontext_pipe_none_registers, 0xb600 },
     ```
   - This ensures both PIPE_BR and PIPE_NONE sections are captured in
     the snapshot

### Why This is a Bug Fix

1. **Incorrect Snapshot Data**: Without this fix, GPU crash dumps would
   be missing critical register data from the PIPE_NONE section
2. **Tool Compatibility**: The commit message states "The snapshot tool
   seems to expect this for x1-85 as well", indicating external tools
   that parse these snapshots expect this format
3. **Consistency with Later Generations**: The commit aligns x1-85
   (gen7_0_0) behavior with later generations that already have both
   sections

### Backport Suitability Criteria

✅ **Fixes a real bug**: Missing register data in crash dumps affects
debugging capabilities
✅ **Small and contained**: Only modifies snapshot header definitions -
no runtime code changes
✅ **Low risk**: Changes are purely additive to snapshot data structures
✅ **No architectural changes**: Simply corrects register definitions
✅ **No new features**: Only fixes existing snapshot functionality
✅ **Important for stability**: Proper crash dumps are crucial for
debugging GPU issues in production

### Context from Repository

Looking at recent fixes in the same area:
- Multiple snapshot-related fixes have been made recently
  (f28c9fc2c82de, e1d5ccf0a62b8, 06dd5d86c6aef)
- The gen7_9_0 variant already has PIPE_NONE definitions, confirming
  this is the expected pattern
- This appears to be fixing a downstream KGSL (Kernel Graphics Support
  Layer) bug that was carried over

### Risk Assessment

**Very Low Risk** because:
- Only affects devcoredump generation (post-crash debugging data)
- No impact on normal GPU operation
- Changes are data-only (register arrays)
- Aligns with existing patterns in other GPU generations

This is an ideal stable backport candidate as it fixes a specific bug
with minimal code changes and virtually no risk of regression.

 drivers/gpu/drm/msm/adreno/adreno_gen7_0_0_snapshot.h | 11 +++++++++--
 drivers/gpu/drm/msm/adreno/adreno_gen7_2_0_snapshot.h |  2 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gen7_0_0_snapshot.h b/drivers/gpu/drm/msm/adreno/adreno_gen7_0_0_snapshot.h
index cb66ece6606b5..4f305de5d7304 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gen7_0_0_snapshot.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gen7_0_0_snapshot.h
@@ -668,12 +668,19 @@ static const u32 gen7_0_0_sp_noncontext_pipe_lpac_usptp_registers[] = {
 };
 static_assert(IS_ALIGNED(sizeof(gen7_0_0_sp_noncontext_pipe_lpac_usptp_registers), 8));
 
-/* Block: TPl1 Cluster: noncontext Pipeline: A7XX_PIPE_BR */
-static const u32 gen7_0_0_tpl1_noncontext_pipe_br_registers[] = {
+/* Block: TPl1 Cluster: noncontext Pipeline: A7XX_PIPE_NONE */
+static const u32 gen7_0_0_tpl1_noncontext_pipe_none_registers[] = {
 	0x0b600, 0x0b600, 0x0b602, 0x0b602, 0x0b604, 0x0b604, 0x0b608, 0x0b60c,
 	0x0b60f, 0x0b621, 0x0b630, 0x0b633,
 	UINT_MAX, UINT_MAX,
 };
+static_assert(IS_ALIGNED(sizeof(gen7_0_0_tpl1_noncontext_pipe_none_registers), 8));
+
+/* Block: TPl1 Cluster: noncontext Pipeline: A7XX_PIPE_BR */
+static const u32 gen7_0_0_tpl1_noncontext_pipe_br_registers[] = {
+	 0x0b600, 0x0b600,
+	 UINT_MAX, UINT_MAX,
+};
 static_assert(IS_ALIGNED(sizeof(gen7_0_0_tpl1_noncontext_pipe_br_registers), 8));
 
 /* Block: TPl1 Cluster: noncontext Pipeline: A7XX_PIPE_LPAC */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gen7_2_0_snapshot.h b/drivers/gpu/drm/msm/adreno/adreno_gen7_2_0_snapshot.h
index 6f8ad50f32ce1..8d44b9377207c 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gen7_2_0_snapshot.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gen7_2_0_snapshot.h
@@ -573,6 +573,8 @@ static struct gen7_sptp_cluster_registers gen7_2_0_sptp_clusters[] = {
 		gen7_0_0_sp_noncontext_pipe_lpac_usptp_registers, 0xaf80 },
 	{ A7XX_CLUSTER_NONE, A7XX_TP0_NCTX_REG, A7XX_PIPE_BR, 0, A7XX_USPTP,
 		gen7_0_0_tpl1_noncontext_pipe_br_registers, 0xb600 },
+	{ A7XX_CLUSTER_NONE, A7XX_TP0_NCTX_REG, A7XX_PIPE_NONE, 0, A7XX_USPTP,
+		gen7_0_0_tpl1_noncontext_pipe_none_registers, 0xb600 },
 	{ A7XX_CLUSTER_NONE, A7XX_TP0_NCTX_REG, A7XX_PIPE_LPAC, 0, A7XX_USPTP,
 		gen7_0_0_tpl1_noncontext_pipe_lpac_registers, 0xb780 },
 	{ A7XX_CLUSTER_SP_PS, A7XX_SP_CTX0_3D_CPS_REG, A7XX_PIPE_BR, 0, A7XX_HLSQ_STATE,
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix debugbus snapshot
       [not found] <20250902120833.1342615-1-sashal@kernel.org>
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx debugbus read Sasha Levin
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx TPL1 cluster snapshot Sasha Levin
@ 2025-09-02 12:08 ` Sasha Levin
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix order of selector programming in cluster snapshot Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-09-02 12:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Rob Clark, Connor Abbott, Sasha Levin, linux-arm-msm, dri-devel,
	freedreno

From: Rob Clark <robin.clark@oss.qualcomm.com>

[ Upstream commit a814ba2d7b847cff15565bbab781df89e190619c ]

We weren't setting the # of captured debugbus blocks.

Reported-by: Connor Abbott <cwabbott0@gmail.com>
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/666660/
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Bug Impact Analysis

1. **Clear bug fix**: The code analysis shows that
   `a7xx_get_debugbus_blocks()` allocates memory for
   `total_debugbus_blocks` worth of debugbus data and populates all of
   them, but critically fails to set `a6xx_state->nr_debugbus =
   total_debugbus_blocks`. This is clearly visible at line 442 of the
   fixed code.

2. **User-visible impact**: Without setting `nr_debugbus`, the function
   `a6xx_show_debugbus()` at line 1949 will iterate 0 times (`for (i =
   0; i < a6xx_state->nr_debugbus; i++)`), meaning NO debugbus data will
   be shown in GPU crash dumps for a7xx GPUs. This severely impacts
   debugging capabilities when GPU hangs or crashes occur.

3. **Regression timeline**: This bug was introduced in commit
   64d6255650d4e0 ("drm/msm: More fully implement devcoredump for a7xx")
   from January 2024, which added the `a7xx_get_debugbus_blocks()`
   function but forgot to set the counter. The a6xx version of this
   function correctly sets `nr_debugbus` at lines 372 and 384.

## Stable Tree Criteria Met

1. **Real bug affecting users**: Yes - debugbus data is completely
   missing from a7xx GPU crash dumps
2. **Small and contained fix**: Yes - single line addition:
   `a6xx_state->nr_debugbus = total_debugbus_blocks;`
3. **No architectural changes**: The fix simply sets an existing counter
   variable that was forgotten
4. **Minimal regression risk**: The change only affects the specific
   code path for a7xx GPUs and simply enables already-allocated and
   populated data to be displayed
5. **Critical debugging functionality**: GPU crash dumps are essential
   for debugging graphics driver issues in production

## Technical Details

The bug is in the a7xx-specific path where:
- Memory is allocated for `total_debugbus_blocks` entries (line 426)
- All blocks are populated via `a6xx_get_debugbus_block()` calls (lines
  430-439)
- But `nr_debugbus` is never set, leaving it at 0
- This causes `a6xx_show_debugbus()` to skip all debugbus output since
  it loops from 0 to `nr_debugbus`

The fix correctly sets `nr_debugbus = total_debugbus_blocks` after
populating all the data, matching the pattern used in the a6xx
equivalent function.

This is a perfect candidate for stable backporting as it fixes a clear
functional regression in debugging infrastructure without any risk of
destabilizing the system.

 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index a85d3df7a5fac..f46bc906ca2a3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -423,8 +423,9 @@ static void a7xx_get_debugbus_blocks(struct msm_gpu *gpu,
 				a6xx_state, &a7xx_debugbus_blocks[gbif_debugbus_blocks[i]],
 				&a6xx_state->debugbus[i + debugbus_blocks_count]);
 		}
-	}
 
+		a6xx_state->nr_debugbus = total_debugbus_blocks;
+	}
 }
 
 static void a6xx_get_debugbus(struct msm_gpu *gpu,
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix order of selector programming in cluster snapshot
       [not found] <20250902120833.1342615-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix debugbus snapshot Sasha Levin
@ 2025-09-02 12:08 ` Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-09-02 12:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Rob Clark, Akhil P Oommen, Sasha Levin, linux-arm-msm, dri-devel,
	freedreno

From: Rob Clark <robin.clark@oss.qualcomm.com>

[ Upstream commit a506578d8909e7e6f0d545af9850ccd4318bf6cf ]

Program the selector _after_ selecting the aperture.  This aligns with
the downstream driver, and fixes a case where we were failing to capture
ctx0 regs (and presumably what we thought were ctx1 regs were actually
ctx0).

Suggested-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/666655/
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees. Here's my
extensive analysis:

## Bug Fix Nature

The commit fixes a **clear functional bug** in the GPU crash
dump/snapshot functionality for Qualcomm Adreno GPUs (specifically the
a7xx series). The bug causes incorrect register capture during GPU crash
dumps:
- **Symptom**: Failing to capture ctx0 (context 0) registers correctly
- **Impact**: ctx1 registers were actually ctx0 registers, meaning
  critical debugging information was lost or misrepresented

## Code Analysis

The fix is **extremely simple and surgical** - it only reorders two
blocks of code in the `a7xx_get_cluster()` function:

**Before the fix:**
```c
/* Some clusters need a selector register to be programmed too */
if (cluster->sel)
    in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);

in += CRASHDUMP_WRITE(in, REG_A7XX_CP_APERTURE_CNTL_CD, ...);
```

**After the fix:**
```c
in += CRASHDUMP_WRITE(in, REG_A7XX_CP_APERTURE_CNTL_CD, ...);

/* Some clusters need a selector register to be programmed too */
if (cluster->sel)
    in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);
```

The aperture control register (`REG_A7XX_CP_APERTURE_CNTL_CD`) must be
programmed **before** the selector register. This is a classic hardware
programming sequence issue where register order matters.

## Stable Backport Criteria Met

1. **Fixes a real bug**: Yes - incorrect crash dump data affects
   debugging capability
2. **Small and contained**: Yes - only 4 lines moved, no logic changes
3. **No architectural changes**: Correct - purely a reordering fix
4. **Minimal regression risk**: Yes - only affects crash dump path, not
   normal operation
5. **Aligns with vendor driver**: The commit explicitly states it
   "aligns with the downstream driver"
6. **No new features**: Correct - purely a bug fix

## Additional Supporting Evidence

- The commit has already been marked with "Upstream commit
  a506578d8909..." suggesting it's been accepted upstream
- Similar fixes in the same file (e.g., `f28c9fc2c82de drm/msm: Fix
  debugbus snapshot`) show a pattern of fixing crash dump issues
- The fix is isolated to the crash dump code path
  (`a7xx_get_cluster()`), which is only executed during GPU error
  recovery
- The author (Rob Clark) is a maintainer of the MSM DRM driver, lending
  credibility to the fix

This is an ideal stable backport candidate - it fixes a clear bug with
minimal code change and virtually no risk of regression.

 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 61850e2802914..6e8dbd27addbe 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -776,15 +776,15 @@ static void a7xx_get_cluster(struct msm_gpu *gpu,
 	size_t datasize;
 	int i, regcount = 0;
 
-	/* Some clusters need a selector register to be programmed too */
-	if (cluster->sel)
-		in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);
-
 	in += CRASHDUMP_WRITE(in, REG_A7XX_CP_APERTURE_CNTL_CD,
 		A7XX_CP_APERTURE_CNTL_CD_PIPE(cluster->pipe_id) |
 		A7XX_CP_APERTURE_CNTL_CD_CLUSTER(cluster->cluster_id) |
 		A7XX_CP_APERTURE_CNTL_CD_CONTEXT(cluster->context_id));
 
+	/* Some clusters need a selector register to be programmed too */
+	if (cluster->sel)
+		in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);
+
 	for (i = 0; cluster->regs[i] != UINT_MAX; i += 2) {
 		int count = RANGE(cluster->regs, i);
 
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-09-02 12:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20250902120833.1342615-1-sashal@kernel.org>
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx debugbus read Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx TPL1 cluster snapshot Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix debugbus snapshot Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix order of selector programming in cluster snapshot Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).