From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Rob Clark <robin.clark@oss.qualcomm.com>,
Akhil P Oommen <akhilpo@oss.qualcomm.com>,
Sasha Levin <sashal@kernel.org>,
linux-arm-msm@vger.kernel.org, dri-devel@lists.freedesktop.org,
freedreno@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix order of selector programming in cluster snapshot
Date: Tue, 2 Sep 2025 08:08:26 -0400 [thread overview]
Message-ID: <20250902120833.1342615-15-sashal@kernel.org> (raw)
In-Reply-To: <20250902120833.1342615-1-sashal@kernel.org>
From: Rob Clark <robin.clark@oss.qualcomm.com>
[ Upstream commit a506578d8909e7e6f0d545af9850ccd4318bf6cf ]
Program the selector _after_ selecting the aperture. This aligns with
the downstream driver, and fixes a case where we were failing to capture
ctx0 regs (and presumably what we thought were ctx1 regs were actually
ctx0).
Suggested-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/666655/
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my
extensive analysis:
## Bug Fix Nature
The commit fixes a **clear functional bug** in the GPU crash
dump/snapshot functionality for Qualcomm Adreno GPUs (specifically the
a7xx series). The bug causes incorrect register capture during GPU crash
dumps:
- **Symptom**: Failing to capture ctx0 (context 0) registers correctly
- **Impact**: ctx1 registers were actually ctx0 registers, meaning
critical debugging information was lost or misrepresented
## Code Analysis
The fix is **extremely simple and surgical** - it only reorders two
blocks of code in the `a7xx_get_cluster()` function:
**Before the fix:**
```c
/* Some clusters need a selector register to be programmed too */
if (cluster->sel)
in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);
in += CRASHDUMP_WRITE(in, REG_A7XX_CP_APERTURE_CNTL_CD, ...);
```
**After the fix:**
```c
in += CRASHDUMP_WRITE(in, REG_A7XX_CP_APERTURE_CNTL_CD, ...);
/* Some clusters need a selector register to be programmed too */
if (cluster->sel)
in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);
```
The aperture control register (`REG_A7XX_CP_APERTURE_CNTL_CD`) must be
programmed **before** the selector register. This is a classic hardware
programming sequence issue where register order matters.
## Stable Backport Criteria Met
1. **Fixes a real bug**: Yes - incorrect crash dump data affects
debugging capability
2. **Small and contained**: Yes - only 4 lines moved, no logic changes
3. **No architectural changes**: Correct - purely a reordering fix
4. **Minimal regression risk**: Yes - only affects crash dump path, not
normal operation
5. **Aligns with vendor driver**: The commit explicitly states it
"aligns with the downstream driver"
6. **No new features**: Correct - purely a bug fix
## Additional Supporting Evidence
- The commit has already been marked with "Upstream commit
a506578d8909..." suggesting it's been accepted upstream
- Similar fixes in the same file (e.g., `f28c9fc2c82de drm/msm: Fix
debugbus snapshot`) show a pattern of fixing crash dump issues
- The fix is isolated to the crash dump code path
(`a7xx_get_cluster()`), which is only executed during GPU error
recovery
- The author (Rob Clark) is a maintainer of the MSM DRM driver, lending
credibility to the fix
This is an ideal stable backport candidate - it fixes a clear bug with
minimal code change and virtually no risk of regression.
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 61850e2802914..6e8dbd27addbe 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -776,15 +776,15 @@ static void a7xx_get_cluster(struct msm_gpu *gpu,
size_t datasize;
int i, regcount = 0;
- /* Some clusters need a selector register to be programmed too */
- if (cluster->sel)
- in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);
-
in += CRASHDUMP_WRITE(in, REG_A7XX_CP_APERTURE_CNTL_CD,
A7XX_CP_APERTURE_CNTL_CD_PIPE(cluster->pipe_id) |
A7XX_CP_APERTURE_CNTL_CD_CLUSTER(cluster->cluster_id) |
A7XX_CP_APERTURE_CNTL_CD_CONTEXT(cluster->context_id));
+ /* Some clusters need a selector register to be programmed too */
+ if (cluster->sel)
+ in += CRASHDUMP_WRITE(in, cluster->sel->cd_reg, cluster->sel->val);
+
for (i = 0; cluster->regs[i] != UINT_MAX; i += 2) {
int count = RANGE(cluster->regs, i);
--
2.50.1
next prev parent reply other threads:[~2025-09-02 12:08 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-02 12:08 [PATCH AUTOSEL 6.16] gpiolib: acpi: Add quirk for ASUS ProArt PX13 Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16] pinctrl: meson: Fix typo in device table macro Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-5.15] HID: intel-ish-hid: Increase ISHTP resume ack timeout to 300ms Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-5.4] gpio: timberdale: fix off-by-one in IRQ type boundary check Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx debugbus read Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.1] ata: ahci_xgene: Use int type for 'rc' to store error codes Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16] HID: elecom: add support for ELECOM M-DT2DRBK Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] hid: fix I2C read buffer overflow in raw_event() for mcp2221 Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix a7xx TPL1 cluster snapshot Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.1] virtio_input: Improve freeze handling Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] regulator: pm8008: fix probe failure due to negative voltage selector Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.12] drm/msm: Fix debugbus snapshot Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.6] HID: quirks: add support for Legion Go dual dinput modes Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-6.6] HID: logitech: Add ids for G PRO 2 LIGHTSPEED Sasha Levin
2025-09-02 12:08 ` Sasha Levin [this message]
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16-5.4] HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version() Sasha Levin
2025-09-02 12:08 ` [PATCH AUTOSEL 6.16] virtio_net: adjust the execution order of function `virtnet_close` during freeze Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250902120833.1342615-15-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=akhilpo@oss.qualcomm.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=freedreno@lists.freedesktop.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=patches@lists.linux.dev \
--cc=robin.clark@oss.qualcomm.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).