AMD-GFX Archive on lore.kernel.org
* [PATCH AUTOSEL 7.0] drm/amdgpu/userq: remove queue from doorbell xarray
       [not found] <20260420131539.986432-1-sashal@kernel.org>
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.12] drm/amd/display: Fix dcn401_optimize_bandwidth Sasha Levin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Sunil Khatri, Christian König, Alex Deucher, Sasha Levin,
	airlied, simona, amd-gfx, dri-devel, linux-kernel

From: Sunil Khatri <sunil.khatri@amd.com>

[ Upstream commit f0e46fd06c3f7590b0f06aea3c877f441c2cbccc ]

In case of failure in xa_alloc, remove the queue during
clean up from the userq_doorbell_xa.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Here is my complete analysis:

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `drm/amdgpu/userq`
- Action verb: "remove" (corrective: removing a stale entry from data
  structure on error path)
- Summary: Remove queue from doorbell xarray during cleanup when
  xa_alloc fails

**Step 1.2: Tags**
- `Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>` - author, regular
  AMD GPU contributor
- `Reviewed-by: Christian König <christian.koenig@amd.com>` - AMD GPU
  subsystem co-maintainer
- `Signed-off-by: Alex Deucher <alexander.deucher@amd.com>` - AMD GPU
  subsystem maintainer
- No Fixes: tag (expected for manual review candidates)
- No Cc: stable tag

**Step 1.3: Commit Body Analysis**
The message says: "In case of failure in xa_alloc, remove the queue
during clean up from the userq_doorbell_xa." This clearly describes a
missing cleanup in an error path: the queue gets stored in
`userq_doorbell_xa` but is not removed if the subsequent `xa_alloc`
fails.

**Step 1.4: Hidden Bug Fix Detection**
This is not hidden - it's straightforwardly described as an error path
cleanup fix. However, the severity is greater than the commit message
suggests. It's actually a **use-after-free**: the queue is freed
(`kfree(queue)`) but its pointer remains in the doorbell xarray.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- Single file: `drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c`
- 1 line added: `xa_erase_irq(&adev->userq_doorbell_xa, index);`
- Function modified: `amdgpu_userq_create()`
- Scope: Single-file surgical fix, error path only

**Step 2.2: Code Flow Change**
In `amdgpu_userq_create()`:

1. Line 863: `xa_store_irq(&adev->userq_doorbell_xa, index, queue,
   GFP_KERNEL)` - stores queue in global doorbell xarray
2. Line 870: `xa_alloc(&uq_mgr->userq_xa, &qid, queue, ...)` - attempts
   to allocate a queue ID
3. If xa_alloc fails (line 872): error path calls
   `amdgpu_userq_fence_driver_free()`, `mqd_destroy()`, `kfree(queue)` -
   but did NOT remove the entry from `userq_doorbell_xa`
4. Fix adds `xa_erase_irq(&adev->userq_doorbell_xa, index)` before the
   kfree

**Step 2.3: Bug Mechanism**
This is a **use-after-free** caused by a missing cleanup on error path.
After `kfree(queue)`, the freed pointer remains in `userq_doorbell_xa`.
Six functions in the same file iterate `userq_doorbell_xa` with
`xa_for_each()` and dereference the queue pointer:
- `amdgpu_userq_suspend()` (line 1445)
- `amdgpu_userq_resume()` (line 1471)
- `amdgpu_userq_stop_sched_for_enforce_isolation()` (line 1501)
- `amdgpu_userq_start_sched_for_enforce_isolation()` (line 1535)
- `amdgpu_userq_pre_reset()` (line 1589)
- `amdgpu_userq_post_reset()` (line 1617)

All would crash or corrupt memory by accessing `queue->userq_mgr`,
`queue->state`, etc.
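
The mechanism can be modeled in user space. The following is a minimal sketch, not the driver code: `pool`, `doorbell_table`, `queue_create()` and `count_stale()` are hypothetical stand-ins, the "free" is modeled by clearing an `in_use` flag so the sketch never touches released memory, and no XArray API is used.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 4

/* Hypothetical stand-in for a driver queue object. */
struct queue {
	int in_use;	/* 0 models "already freed" */
};

static struct queue pool[SLOTS];		/* models the allocator */
static struct queue *doorbell_table[SLOTS];	/* models the shared doorbell table */

/* Hand out the first unused pool entry, or NULL if none is left. */
struct queue *queue_alloc(void)
{
	for (size_t i = 0; i < SLOTS; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			return &pool[i];
		}
	}
	return NULL;
}

/* Models kfree(): the object is gone, but pointers to it may still exist. */
void queue_free(struct queue *q)
{
	q->in_use = 0;
}

/*
 * Publish the queue in the shared table, then attempt a second step that
 * may fail (fail_id_alloc != 0). With erase_on_error == 0 the error path
 * frees the queue but leaves its pointer published.
 */
struct queue *queue_create(size_t index, int fail_id_alloc, int erase_on_error)
{
	struct queue *q;

	assert(index < SLOTS);
	q = queue_alloc();
	if (!q)
		return NULL;

	doorbell_table[index] = q;

	if (fail_id_alloc) {
		if (erase_on_error)
			doorbell_table[index] = NULL;	/* unpublish before freeing */
		queue_free(q);
		return NULL;
	}
	return q;
}

/* Count table entries that still point at a freed queue. */
int count_stale(void)
{
	int n = 0;

	for (size_t i = 0; i < SLOTS; i++)
		if (doorbell_table[i] && !doorbell_table[i]->in_use)
			n++;
	return n;
}
```

In this model, a failed `queue_create(1, 1, 0)` leaves `count_stale()` at 1, while the same failure with `erase_on_error` set leaves it at 0, which is the effect the one-line fix is described as having on the real table.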

**Step 2.4: Fix Quality**
The fix is obviously correct: it mirrors the cleanup in the normal
destroy path (`amdgpu_userq_cleanup()` at line 463 already calls
`xa_erase_irq(&adev->userq_doorbell_xa, queue->doorbell_index)`). The
fix uses `xa_erase_irq()` matching the `xa_store_irq()` used during
insertion. No regression risk.

## PHASE 3: GIT HISTORY

**Step 3.1: Blame**
The buggy error path (xa_alloc failure without doorbell cleanup) was
introduced by commit `f18719ef4bb7b` (Jesse.Zhang, October 2025) which
converted IDR to XArray. When the XArray conversion added
`xa_store_irq(&adev->userq_doorbell_xa, ...)` before `xa_alloc`, the
error cleanup for `xa_alloc` failure was not updated to remove the
doorbell entry.

**Step 3.2: Fixes Target**
No Fixes: tag present. The bug was introduced by f18719ef4bb7b.

**Step 3.3: Related Changes**
The companion commit `65b5c326ce410` ("refcount userqueues to avoid race
conditions") was cherry-picked with `Cc: stable@vger.kernel.org`,
showing this area is actively being stabilized for stable trees.

**Step 3.4: Author**
Sunil Khatri is a regular AMD GPU contributor with 11 commits to this
file alone. He also authored the refcount fix.

**Step 3.5: Dependencies**
This fix is standalone - it requires only that the doorbell xarray code
(from f18719ef4bb7b) exists, which it does in 7.0.

## PHASE 4: MAILING LIST RESEARCH

The exact commit "remove queue from doorbell xarray" was not found
independently on lore.kernel.org. However, the v2 series of the refcount
patchset (20260302180852) was a 2-patch series that included related
doorbell cleanup work. The patch was reviewed by Christian König (AMD
GPU co-maintainer) and committed through Alex Deucher (AMD GPU
maintainer).

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Modified Function**
`amdgpu_userq_create()` - creates usermode queues for AMD GPU userspace
queue submission.

**Step 5.2: Callers**
`amdgpu_userq_create()` is called from `amdgpu_userq_ioctl()` via
`AMDGPU_USERQ_OP_CREATE`, which is a DRM IOCTL callable from userspace.
This is a user-triggerable code path.

**Step 5.3-5.4: Impact Surface**
The stale entry in `userq_doorbell_xa` will be accessed by any
subsequent suspend, resume, reset, or isolation enforcement operation on
the GPU device. These are triggered by:
- System suspend/resume (S3, S0ix) - common laptop operation
- GPU reset (triggered by hang detection)
- Enforce isolation mode changes

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code in Stable Trees**
The file `amdgpu_userq.c` was first added in the v6.16/6.17 timeframe.
The `userq_doorbell_xa` was introduced by f18719ef4bb7b in the kernel
7.0 development cycle. This code exists in the 7.0 stable tree.

**Step 6.2: Backport Complexity**
This is a clean 1-line addition with no context conflicts expected.

## PHASE 7: SUBSYSTEM CONTEXT

- Subsystem: `drivers/gpu/drm/amd/amdgpu` - AMD GPU driver
- Criticality: IMPORTANT - widely used GPU driver on desktop/laptop
  systems
- Actively developed subsystem with heavy recent churn

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Who is Affected**
Users with AMD GPUs using the usermode queue feature (GFX/Compute/DMA
queues).

**Step 8.2: Trigger Conditions**
The bug requires:
1. A userqueue creation attempt where `xa_store_irq` succeeds but
   `xa_alloc` fails (e.g., hitting `AMDGPU_MAX_USERQ_COUNT` limit or
   memory pressure)
2. Followed by any suspend, resume, reset, or isolation operation

The `xa_alloc` failure can be triggered by resource exhaustion or
concurrent queue creation under memory pressure.

**Step 8.3: Failure Mode Severity**
- **CRITICAL**: Use-after-free on freed queue pointer leads to kernel
  crash (oops/panic), potential memory corruption, or security
  vulnerability. The freed pointer is dereferenced to access
  `queue->userq_mgr`, `queue->state`, etc.

**Step 8.4: Risk-Benefit Ratio**
- BENEFIT: Very high - prevents UAF crash on a common GPU driver
- RISK: Very low - single line addition, mirrors existing cleanup
  pattern, obviously correct
- Ratio: Strongly favorable

## PHASE 9: FINAL SYNTHESIS

**Evidence FOR backporting:**
- Fixes a real use-after-free bug (stale pointer in xarray after kfree)
- Single-line fix, surgically targeted to one error path
- Reviewed by Christian König (AMD GPU co-maintainer)
- Obviously correct (mirrors cleanup in normal destroy path)
- Related refcount commit already tagged Cc: stable
- User-triggerable code path (IOCTL)
- Failure mode is kernel crash or memory corruption

**Evidence AGAINST backporting:**
- Relatively new code (userq feature added in v6.16-7.0 timeframe)
- Trigger requires specific failure condition (xa_alloc failure)

**Stable Rules Checklist:**
1. Obviously correct and tested? YES - mirrors existing cleanup,
   reviewed by maintainer
2. Fixes a real bug? YES - use-after-free
3. Important issue? YES - potential crash, memory corruption, security
4. Small and contained? YES - 1 line added
5. No new features? YES - pure bug fix
6. Can apply to stable? YES - clean 1-line addition

## Verification

- [Phase 1] Parsed tags: Reviewed-by Christian König, Signed-off-by Alex
  Deucher and Sunil Khatri
- [Phase 2] Diff analysis: 1 line added in xa_alloc error path of
  amdgpu_userq_create(), adds missing xa_erase_irq() for doorbell xarray
- [Phase 3] git blame: buggy code introduced in f18719ef4bb7b (Oct
  2025), XArray conversion commit
- [Phase 3] git show f18719ef4bb7b: confirmed it added xa_store_irq to
  doorbell_xa without corresponding xa_erase_irq in xa_alloc failure
  path
- [Phase 3] Related commit 65b5c326ce410 already has Cc: stable
  (confirmed via git show)
- [Phase 4] b4 dig -c 65b5c326ce410: found patch at
  https://patch.msgid.link/20260303120849.2583663-1-sunil.khatri@amd.com
- [Phase 4] Mailing list archive: found v2 series and review discussion,
  Christian König reviewed
- [Phase 5] Grep for userq_doorbell_xa: found 6 xa_for_each iteration
  sites that would dereference the freed pointer
- [Phase 5] amdgpu_userq_create called from amdgpu_userq_ioctl
  (user-triggerable via DRM IOCTL)
- [Phase 6] amdgpu_userq.c first added around v6.16, doorbell_xa added
  in v7.0 cycle
- [Phase 8] Failure mode: UAF -> kernel crash/memory corruption,
  severity CRITICAL

This is a small, surgical, obviously correct fix for a real
use-after-free bug in the AMD GPU driver's error handling path. It was
reviewed by the subsystem co-maintainer and carries essentially zero
regression risk.

**YES**

 drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
index 0a1b93259887a..819c4c26416c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
@@ -872,6 +872,7 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args)
 	if (r) {
 		drm_file_err(uq_mgr->file, "Failed to allocate a queue id\n");
 		amdgpu_userq_fence_driver_free(queue);
+		xa_erase_irq(&adev->userq_doorbell_xa, index);
 		uq_funcs->mqd_destroy(queue);
 		kfree(queue);
 		r = -ENOMEM;
-- 
2.53.0



* [PATCH AUTOSEL 7.0-6.12] drm/amd/display: Fix dcn401_optimize_bandwidth
       [not found] <20260420131539.986432-1-sashal@kernel.org>
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0] drm/amdgpu/userq: remove queue from doorbell xarray Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:09 ` [PATCH AUTOSEL 6.18] drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size Sasha Levin
  2026-04-20 13:09 ` [PATCH AUTOSEL 7.0-6.12] drm/amdgpu: fix DF NULL pointer issue for soc24 Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Charlene Liu, Leo Chen, Dmytro Laktyushkin, Ivan Lipski,
	Dan Wheeler, Alex Deucher, Sasha Levin, harry.wentland,
	sunpeng.li, christian.koenig, airlied, simona, amd-gfx, dri-devel,
	linux-kernel

From: Charlene Liu <Charlene.Liu@amd.com>

[ Upstream commit 002f32db0d4292f117994c330928d2374887b28e ]

[Why&How]
We should check for != zstate disallow and programming extend blank from a
different struct.

Reviewed-by: Leo Chen <leo.chen@amd.com>
Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have a complete picture. Let me compile the full analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `drm/amd/display` (AMD GPU display controller)
- Action verb: "Fix"
- Summary: Fix dcn401_optimize_bandwidth function

Record: [drm/amd/display] [Fix] [Corrects zstate condition and struct
access in dcn401_optimize_bandwidth]

**Step 1.2: Tags**
- Reviewed-by: Leo Chen, Dmytro Laktyushkin (AMD display engineers)
- Tested-by: Dan Wheeler (AMD test engineer)
- Signed-off-by: Charlene Liu (author), Ivan Lipski (submitter), Alex
  Deucher (AMD GPU maintainer)
- No Fixes: tag, no Cc: stable (expected for autosel candidates)

Record: Two reviewers, one tester, submitted through AMD's standard
display patch process. Author is an AMD display team member.

**Step 1.3: Commit Body**
The message says "We should check for != zstate disallow and programming
extend blank from a different struct." This describes two bugs:
1. The zstate check condition was too restrictive
2. The extended blank register value was being read from the wrong
   struct

Record: Two bugs described. Both relate to incorrect programming of
extended blank during bandwidth optimization for DCN4.01 display
hardware.

**Step 1.4: Hidden Bug Fix Detection**
Not hidden — explicitly labeled as "Fix" with clear description.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- 1 file changed:
  `drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c`
- 2 lines changed (1 condition, 1 struct access)
- Function modified: `dcn401_optimize_bandwidth`
- Scope: Single-file, single-function surgical fix

**Step 2.2: Code Flow Change**

Change 1 (line 1476→1486):
- Before: `if (zstate_support == DCN_ZSTATE_SUPPORT_ALLOW)`, which only
  enters the extended blank programming block when zstate is exactly
  `ALLOW`
- After: `if (zstate_support != DCN_ZSTATE_SUPPORT_DISALLOW)`, which
  enters for every state other than `DISALLOW`

The enum `dcn_zstate_support_state` has values: UNKNOWN, ALLOW,
ALLOW_Z8_ONLY, ALLOW_Z8_Z10_ONLY, ALLOW_Z10_ONLY, DISALLOW. The old code
missed Z8_ONLY, Z8_Z10_ONLY, and Z10_ONLY modes.
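
The difference between the two conditions can be enumerated directly. This is a small sketch under stated assumptions: `enum zstate_support` and its `ZSTATE_*` names are a local model of the six states listed above, not the kernel's definition, and `old_check()`, `new_check()` and `count_accepted()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Local model of the six states listed in the text (names are illustrative). */
enum zstate_support {
	ZSTATE_UNKNOWN,
	ZSTATE_ALLOW,
	ZSTATE_ALLOW_Z8_ONLY,
	ZSTATE_ALLOW_Z8_Z10_ONLY,
	ZSTATE_ALLOW_Z10_ONLY,
	ZSTATE_DISALLOW,
	ZSTATE_COUNT	/* number of states, used only for iteration */
};

/* Condition before the patch: only the plain ALLOW state qualifies. */
bool old_check(enum zstate_support s)
{
	return s == ZSTATE_ALLOW;
}

/* Condition after the patch: everything except DISALLOW qualifies. */
bool new_check(enum zstate_support s)
{
	return s != ZSTATE_DISALLOW;
}

/* Count how many of the modeled states a given condition accepts. */
int count_accepted(bool (*check)(enum zstate_support))
{
	int n = 0;

	for (int s = ZSTATE_UNKNOWN; s < ZSTATE_COUNT; s++)
		if (check((enum zstate_support)s))
			n++;
	return n;
}
```

In this model the old condition accepts one state and the new one accepts five. The fifth is the `UNKNOWN` state, which `!= DISALLOW` also lets through; whether that state can occur at this point in the driver is not established by the text above.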

Change 2 (line 1484):
- Before: `pipe_ctx->dlg_regs.min_dst_y_next_start` — reads from old
  DML1 struct `_vcs_dpi_display_dlg_regs_st`
- After: `pipe_ctx->hubp_regs.dlg_regs.min_dst_y_next_start` — reads
  from DML2.1 struct `dml2_dchub_per_pipe_register_set`

**Step 2.3: Bug Mechanism**

Bug 1 (Logic/correctness): Wrong condition excludes valid z-state power
modes from extended blank programming. This is a logic bug where the
original code was copied from DCN20 which only had ALLOW/DISALLOW, but
DCN401 has additional modes (Z8, Z10, etc.).

Bug 2 (Wrong struct access): DCN401 uses DML2.1 which populates
`pipe_ctx->hubp_regs`. The old `pipe_ctx->dlg_regs` was previously
populated by `dml21_update_pipe_ctx_dchub_regs`, which was removed in
commit 63ab80d9ac0a. Reading from the wrong struct yields zero/stale
data.
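
Bug 2 is easy to miss because both structs expose a field with the same name. A minimal sketch, assuming a simplified layout: `struct pipe_ctx_model` and the helper functions are hypothetical and only mirror the two access paths named in the diff.

```c
#include <assert.h>
#include <string.h>

/* Simplified models of the two register structs; only one field is kept. */
struct legacy_dlg_regs {
	unsigned int min_dst_y_next_start;
};

struct current_dlg_regs {
	unsigned int min_dst_y_next_start;
};

struct hubp_register_set {
	struct current_dlg_regs dlg_regs;
};

/* Hypothetical pipe context carrying both the old and the new location. */
struct pipe_ctx_model {
	struct legacy_dlg_regs dlg_regs;	/* nothing writes this in the model */
	struct hubp_register_set hubp_regs;	/* written by the current path */
};

/* Models the current calculation path, which fills hubp_regs only. */
void populate_current_path(struct pipe_ctx_model *p, unsigned int value)
{
	p->hubp_regs.dlg_regs.min_dst_y_next_start = value;
}

/* Access path used before the patch. */
unsigned int read_old_location(const struct pipe_ctx_model *p)
{
	return p->dlg_regs.min_dst_y_next_start;
}

/* Access path used after the patch. */
unsigned int read_new_location(const struct pipe_ctx_model *p)
{
	return p->hubp_regs.dlg_regs.min_dst_y_next_start;
}
```

With a zero-initialized context and one call to `populate_current_path()`, the old location still reads 0 while the new location returns the programmed value. Both reads compile without warnings, which is why this kind of mistake tends to surface only through review or on hardware.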

**Step 2.4: Fix Quality**
- Obviously correct: aligns with how all other DCN clk_mgr code checks
  zstate (`!= DISALLOW`), and aligns with how all other DCN401 code
  accesses hubp_regs
- Minimal/surgical: exactly 2 lines
- No regression risk: both changes move the code to the intended behavior

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
The buggy code was introduced by commit `70839da636050` (Aurabindo
Pillai, 2024-04-19) "drm/amd/display: Add new DCN401 sources". This
commit first appeared in v6.11.

The dlg_regs bug was latent until commit `63ab80d9ac0a` (Rafal
Ostrowski, 2024-12-17) "drm/amd/display: DML2.1 Post-Si Cleanup" which
removed `dml21_update_pipe_ctx_dchub_regs` that was populating
`pipe_ctx->dlg_regs`. This commit is in v6.14+ only.

Record: zstate bug exists since v6.11. dlg_regs bug became active in
v6.14 when the function populating the old struct was removed.

**Step 3.2: No Fixes: tag** (expected)

**Step 3.3: File History**
Recent file changes are mostly unrelated (NULL ptr deref fix, DPP
guards, cursor offload). No prerequisites for this fix.

**Step 3.4: Author**
Charlene Liu is a regular AMD display team contributor with many commits
in the subsystem. Ivan Lipski is the submitter who regularly contributes
AMD display fixes.

**Step 3.5: Dependencies**
No dependencies. The fix is standalone — both `hubp_regs` and the zstate
enum values exist since v6.11.

## PHASE 4: MAILING LIST

**Step 4.1: Original Submission**
Found at:
`https://yhbt.net/lore/amd-gfx/20260218165116.155001-6-IVAN.LIPSKI@amd.com/`
Part of a 9-patch series "DC Patches February 18, 2026". This is patch
5/9. The series includes other display fixes (cursor, DPIA, compiler
warnings).

**Step 4.2: Reviewers**
Appropriate AMD display maintainers and engineers were CC'd: Harry
Wentland, Leo Li, Aurabindo Pillai, Wayne Lin, Tom Chung, and the
specific reviewers Leo Chen and Dmytro Laktyushkin.

**Step 4.3: Bug Report**
No specific bug report link found. This appears to have been found
during internal code review/testing at AMD.

**Step 4.4: Series Independence**
Other patches in the series (cursor pos fix, DPIA fix, type mismatch)
are independent. This fix is standalone.

**Step 4.5: No stable-specific discussion found.**

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Modified Function**
`dcn401_optimize_bandwidth` — called during display bandwidth
optimization after clock updates.

**Step 5.2: Callers**
`dcn401_optimize_bandwidth` is a hardware sequencer function registered
in the DCN401 init table. It is called during DC bandwidth optimization,
which occurs on display mode changes, resolution changes, and power
state transitions.

**Step 5.3: Callees**
The affected code calls
`pipe_ctx->plane_res.hubp->funcs->program_extended_blank()` which
programs display hardware registers for extended blanking during z-state
power saving.

**Step 5.4: Call Chain**
Reachable via standard display mode setting paths (user-triggered via
xrandr, display manager, etc.) and power management transitions. This is
not an obscure path.

**Step 5.5: Similar Patterns**
DCN20 (`dcn20_hwseq.c:2458`) has the same `== DCN_ZSTATE_SUPPORT_ALLOW`
pattern, which is potentially also a bug but uses the old DML path where
`pipe_ctx->dlg_regs` IS populated.

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code Existence**
- DCN401 code exists since v6.11
- `hubp_regs` field exists since v6.11 (commit 00c391102abc)
- The dlg_regs regression was introduced in v6.14 (commit 63ab80d9ac0a)
- Both bugs are present in v7.0

**Step 6.2: Backport Difficulty**
The patch should apply cleanly to v7.0.y — the modified lines are
unchanged from when they were introduced. For v6.14.y+, both fixes
apply. For v6.12.y and earlier, only the zstate fix applies (dlg_regs is
still populated there).

**Step 6.3: No related fixes in stable.**

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

**Step 7.1:** drm/amd/display — IMPORTANT subsystem. DCN401 is AMD's
display controller for newer GPUs. Display bugs directly affect user
experience.

**Step 7.2:** Actively developed subsystem with frequent commits.

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1:** Affects users of AMD GPUs with DCN4.01 display controller
hardware.

**Step 8.2:** Triggers during power management transitions (z-state
changes) — common on laptops and energy-efficient desktops. The zstate
bug triggers whenever Z8/Z10 power modes are selected instead of full
Z-state.

**Step 8.3:** Failure mode is incorrect display hardware programming:
- Without extended blank programming during z-states: display artifacts,
  flickering, or incorrect power management behavior
- With wrong/zero value for min_dst_y_next_start: incorrect blank
  timing, potential display corruption
- Severity: HIGH (visible display issues for affected hardware)

**Step 8.4:**
- Benefit: HIGH — fixes real display hardware programming bugs for
  DCN401 users
- Risk: VERY LOW — 2-line change, obviously correct, reviewed and tested
- Ratio: Strongly favors backporting

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: Evidence**

FOR backporting:
- Fixes two real bugs: incorrect zstate condition and wrong struct
  access
- Small, surgical fix (2 lines)
- Reviewed by two AMD display engineers, tested
- Affects real hardware (DCN401 GPUs)
- Display bugs directly affect user experience
- Consistent with how other DCN code handles zstate and hubp_regs

AGAINST backporting:
- DCN401 is relatively new hardware (v6.11+)
- No user bug reports (found during code review)
- The dlg_regs bug only manifests in v6.14+ (after DML2.1 cleanup)

**Step 9.2: Stable Rules**
1. Obviously correct? YES — aligns with all other zstate checks and
   hubp_regs usage
2. Fixes real bug? YES — wrong hardware programming values
3. Important issue? YES — display correctness on affected hardware
4. Small and contained? YES — 2 lines, 1 function, 1 file
5. No new features? CORRECT — pure bug fix
6. Applies to stable? YES — cleanly for v7.0.y

**Step 9.3:** No exception category — standard bug fix.

**Step 9.4:** The evidence clearly supports backporting.

## Verification

- [Phase 1] Parsed tags: Reviewed-by Leo Chen and Dmytro Laktyushkin,
  Tested-by Dan Wheeler, author Charlene Liu
- [Phase 2] Diff analysis: 2 lines changed in
  dcn401_optimize_bandwidth(): broadened zstate condition and fixed
  struct access
- [Phase 2] Verified enum values at `dc.h:665-671`: 6 states including
  ALLOW, Z8_ONLY, Z8_Z10_ONLY, Z10_ONLY, DISALLOW
- [Phase 2] Verified `pipe_ctx->hubp_regs` is `struct
  dml2_dchub_per_pipe_register_set` containing `dlg_regs` with
  `min_dst_y_next_start` field
- [Phase 2] Verified `pipe_ctx->dlg_regs` is the OLD
  `_vcs_dpi_display_dlg_regs_st` struct (different type)
- [Phase 3] git blame: buggy code introduced in commit 70839da636050
  (v6.11, "Add new DCN401 sources")
- [Phase 3] Confirmed `hubp_regs` added in commit 00c391102abc (also
  v6.11)
- [Phase 3] Confirmed dlg_regs regression introduced by 63ab80d9ac0a
  (v6.14, "DML2.1 Post-Si Cleanup") which removed
  `dml21_update_pipe_ctx_dchub_regs`
- [Phase 3] Verified `dml21_utils.c:225` populates `pipe_ctx->hubp_regs`
  via memcpy for DCN401 (DML2.1 path)
- [Phase 3] Verified DCN401 uses DML2.1 (`dcn401_resource.c` has
  `.using_dml21 = true`)
- [Phase 3] Verified other DCN401 code uses
  `pipe_ctx->hubp_regs.dlg_regs` (dcn401_hwseq.c:2889-2892)
- [Phase 4] Found original submission at lore (patch 5/9 in DC Patches
  Feb 18 2026 series)
- [Phase 4] Appropriate AMD display maintainers were CC'd
- [Phase 5] Confirmed `program_extended_blank` is registered in hubp
  function tables for DCN31/DCN35
- [Phase 6] Verified 63ab80d9ac0a is NOT in v6.12/v6.13 but IS in v6.14+
- [Phase 6] Both bugs present in v7.0
- [Phase 8] Display correctness issue for DCN401 GPU users during power
  state transitions

The fix is small, surgical, obviously correct, reviewed, tested, and
fixes real display hardware programming bugs for DCN401 users. It meets
all stable kernel criteria.

**YES**

 drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c
index 4973d36aadc5d..a3d33d10853b8 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c
@@ -1492,7 +1492,7 @@ void dcn401_optimize_bandwidth(
 			dc->clk_mgr,
 			context,
 			true);
-	if (context->bw_ctx.bw.dcn.clk.zstate_support == DCN_ZSTATE_SUPPORT_ALLOW) {
+	if (context->bw_ctx.bw.dcn.clk.zstate_support != DCN_ZSTATE_SUPPORT_DISALLOW) {
 		for (i = 0; i < dc->res_pool->pipe_count; ++i) {
 			struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
 
@@ -1500,7 +1500,7 @@ void dcn401_optimize_bandwidth(
 				&& pipe_ctx->stream->adjust.v_total_min == pipe_ctx->stream->adjust.v_total_max
 				&& pipe_ctx->stream->adjust.v_total_max > pipe_ctx->stream->timing.v_total)
 					pipe_ctx->plane_res.hubp->funcs->program_extended_blank(pipe_ctx->plane_res.hubp,
-						pipe_ctx->dlg_regs.min_dst_y_next_start);
+						pipe_ctx->hubp_regs.dlg_regs.min_dst_y_next_start);
 		}
 	}
 }
-- 
2.53.0



* [PATCH AUTOSEL 6.18] drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size
       [not found] <20260420131539.986432-1-sashal@kernel.org>
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0] drm/amdgpu/userq: remove queue from doorbell xarray Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.12] drm/amd/display: Fix dcn401_optimize_bandwidth Sasha Levin
@ 2026-04-20 13:09 ` Sasha Levin
  2026-04-20 13:09 ` [PATCH AUTOSEL 7.0-6.12] drm/amdgpu: fix DF NULL pointer issue for soc24 Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2026-04-20 13:09 UTC (permalink / raw)
  To: patches, stable
  Cc: Donet Tom, Felix Kuehling, Alex Deucher, Sasha Levin,
	Felix.Kuehling, christian.koenig, Xinhui.Pan, airlied, daniel,
	amd-gfx, dri-devel, linux-kernel

From: Donet Tom <donettom@linux.ibm.com>

[ Upstream commit 78746a474e92fc7aaed12219bec7c78ae1bd6156 ]

The control stack size is calculated based on the number of CUs and
waves, and is then aligned to PAGE_SIZE. When the resulting control
stack size is aligned to 64 KB, GPU hangs and queue preemption
failures are observed while running RCCL unit tests on systems with
more than two GPUs.

amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: GPU reset begin!. Source: 4
amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: Failed to restore process queues

This issue is observed on both 4 KB and 64 KB system page-size
configurations.

This patch fixes the issue by aligning the control stack size to
AMDGPU_GPU_PAGE_SIZE instead of PAGE_SIZE, so the control stack size
will not be 64 KB on systems with a 64 KB page size and queue
preemption works correctly.

Additionally, In the current code, wg_data_size is aligned to PAGE_SIZE,
which can waste memory if the system page size is large. In this patch,
wg_data_size is aligned to AMDGPU_GPU_PAGE_SIZE. The cwsr_size, calculated
from wg_data_size and the control stack size, is aligned to PAGE_SIZE.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a3e14436304392fbada359edd0f1d1659850c9b7)
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis
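
The alignment change described in the commit message can be illustrated with plain arithmetic. This is a rough sketch, not the driver function: the raw input sizes used below are hypothetical, `align_up()`, `sizes_before()` and `sizes_after()` are illustrative names, the GFX10 control stack clamp visible in the diff context is left out, and 4096 is used for the GPU page size (the value of `AMDGPU_GPU_PAGE_SIZE`).

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u	/* value of AMDGPU_GPU_PAGE_SIZE */

struct cwsr_sizes {
	uint32_t ctl_stack_size;
	uint32_t wg_data_size;
	uint32_t cwsr_size;
};

/* Round x up to a multiple of a; a must be a power of two. */
uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Before the patch: both parts are aligned to the system page size. */
struct cwsr_sizes sizes_before(uint32_t ctl_raw, uint32_t wg_raw, uint32_t sys_page)
{
	struct cwsr_sizes s;

	s.ctl_stack_size = align_up(ctl_raw, sys_page);
	s.wg_data_size = align_up(wg_raw, sys_page);
	s.cwsr_size = s.ctl_stack_size + s.wg_data_size;
	return s;
}

/*
 * After the patch: both parts are aligned to the GPU page size, and only
 * the combined size is aligned to the system page size.
 */
struct cwsr_sizes sizes_after(uint32_t ctl_raw, uint32_t wg_raw, uint32_t sys_page)
{
	struct cwsr_sizes s;

	s.ctl_stack_size = align_up(ctl_raw, GPU_PAGE_SIZE);
	s.wg_data_size = align_up(wg_raw, GPU_PAGE_SIZE);
	s.cwsr_size = align_up(s.ctl_stack_size + s.wg_data_size, sys_page);
	return s;
}
```

For hypothetical raw sizes of 0x6100 and 0x21000 bytes on a 64 KB page system, the old arithmetic yields a 0x10000 control stack and a 0x40000 total, while the new arithmetic yields a 0x7000 control stack and a 0x30000 total. That is the kind of difference the commit message describes: the control stack size is no longer forced to a 64 KB multiple.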

 drivers/gpu/drm/amd/amdkfd/kfd_queue.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index 2822c90bd7be4..b97f4a51db6e3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -444,10 +444,11 @@ void kfd_queue_ctx_save_restore_size(struct kfd_topology_device *dev)
 		    min(cu_num * 40, props->array_count / props->simd_arrays_per_engine * 512)
 		    : cu_num * 32;
 
-	wg_data_size = ALIGN(cu_num * WG_CONTEXT_DATA_SIZE_PER_CU(gfxv, props), PAGE_SIZE);
+	wg_data_size = ALIGN(cu_num * WG_CONTEXT_DATA_SIZE_PER_CU(gfxv, props),
+				AMDGPU_GPU_PAGE_SIZE);
 	ctl_stack_size = wave_num * CNTL_STACK_BYTES_PER_WAVE(gfxv) + 8;
 	ctl_stack_size = ALIGN(SIZEOF_HSA_USER_CONTEXT_SAVE_AREA_HEADER + ctl_stack_size,
-			       PAGE_SIZE);
+			       AMDGPU_GPU_PAGE_SIZE);
 
 	if ((gfxv / 10000 * 10000) == 100000) {
 		/* HW design limits control stack size to 0x7000.
@@ -459,7 +460,7 @@ void kfd_queue_ctx_save_restore_size(struct kfd_topology_device *dev)
 
 	props->ctl_stack_size = ctl_stack_size;
 	props->debug_memory_size = ALIGN(wave_num * DEBUGGER_BYTES_PER_WAVE, DEBUGGER_BYTES_ALIGN);
-	props->cwsr_size = ctl_stack_size + wg_data_size;
+	props->cwsr_size = ALIGN(ctl_stack_size + wg_data_size, PAGE_SIZE);
 
 	if (gfxv == 80002)	/* GFX_VERSION_TONGA */
 		props->eop_buffer_size = 0x8000;
-- 
2.53.0



* [PATCH AUTOSEL 7.0-6.12] drm/amdgpu: fix DF NULL pointer issue for soc24
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2026-04-20 13:09 ` [PATCH AUTOSEL 6.18] drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size Sasha Levin
@ 2026-04-20 13:09 ` Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2026-04-20 13:09 UTC (permalink / raw)
  To: patches, stable
  Cc: Likun Gao, Hawking Zhang, Alex Deucher, Sasha Levin,
	christian.koenig, airlied, daniel, amd-gfx, dri-devel,
	linux-kernel

From: Likun Gao <Likun.Gao@amd.com>

[ Upstream commit 50808826a64b4957b7088c789e539dd0a75a1560 ]

If DF function not initialized, NULL pointer issue
will happen on soc24.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information I need. Let me compile the full analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: PARSE THE SUBJECT LINE
- **Subsystem:** `drm/amdgpu` (AMD GPU driver)
- **Action verb:** "fix" - explicitly a bug fix
- **Summary:** Fix NULL pointer dereference when DF functions are not
  initialized on soc24

Record: [drm/amdgpu] [fix] [NULL pointer dereference when df.funcs is
NULL in soc24_common_hw_init]

### Step 1.2: PARSE ALL COMMIT MESSAGE TAGS
- **Signed-off-by:** Likun Gao (author)
- **Reviewed-by:** Hawking Zhang (AMD architect/maintainer for amdgpu)
- **Signed-off-by:** Alex Deucher (amdgpu subsystem maintainer)
- No Fixes: tag, no Cc: stable tag, no Reported-by (expected for manual
  review)

Record: Reviewed by Hawking Zhang (AMD subsystem architect) and merged
by Alex Deucher (amdgpu maintainer). Strong trust signal.

### Step 1.3: ANALYZE THE COMMIT BODY TEXT
The body says: "If DF function not initialized, NULL pointer issue will
happen on soc24." This describes a concrete crash scenario: when
`adev->df.funcs` is NULL and code dereferences it to check `->hw_init`.

Record: [Bug: NULL pointer dereference] [Symptom: kernel oops/crash
during GPU hw_init or resume] [Root cause: missing NULL check before
dereferencing df.funcs pointer]

### Step 1.4: DETECT HIDDEN BUG FIXES
Not hidden - this is an explicit "fix" for a NULL pointer dereference.

Record: Not a hidden fix, explicitly labeled as a fix.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: INVENTORY THE CHANGES
- **Files:** 1 file changed (`drivers/gpu/drm/amd/amdgpu/soc24.c`)
- **Lines:** 1 line modified (replacing one condition with a guarded
  condition)
- **Function:** `soc24_common_hw_init()`
- **Scope:** Single-line surgical fix

Record: [soc24.c: 1 line changed in soc24_common_hw_init()] [Scope:
single-line surgical fix]

### Step 2.2: UNDERSTAND THE CODE FLOW CHANGE
Before (`drivers/gpu/drm/amd/amdgpu/soc24.c`, line 481):

```c
        if (adev->df.funcs->hw_init)
```

After:
```c
        if (adev->df.funcs && adev->df.funcs->hw_init)
```

The code was dereferencing `adev->df.funcs` (which can be NULL) to check
`hw_init`. The fix adds a NULL guard.

Record: [Before: unconditional dereference of df.funcs -> After: guarded
dereference with NULL check first]
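
The effect of the added guard can be reproduced outside the kernel. The
sketch below is a user-space mock, not the amdgpu code: `struct
mock_df_funcs`, `struct mock_adev`, `mock_df_funcs_table` and
`mock_hw_init_guarded()` are hypothetical names that only mirror the
shape of `adev->df.funcs->hw_init`. It relies on the C rule that `&&`
does not evaluate its right operand when the left operand is false.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the amdgpu structures; only the shape matters. */
struct mock_adev;

struct mock_df_funcs {
	void (*hw_init)(struct mock_adev *adev);
};

struct mock_adev {
	struct {
		const struct mock_df_funcs *funcs; /* may be NULL */
	} df;
	int df_hw_init_calls; /* counts how often the callback ran */
};

static void mock_df_hw_init(struct mock_adev *adev)
{
	adev->df_hw_init_calls++;
}

static const struct mock_df_funcs mock_df_funcs_table = {
	.hw_init = mock_df_hw_init,
};

/* Guarded form of the condition: safe when df.funcs is NULL. */
static int mock_hw_init_guarded(struct mock_adev *adev)
{
	assert(adev != NULL);

	/* && short-circuits, so funcs->hw_init is read only if funcs != NULL. */
	if (adev->df.funcs && adev->df.funcs->hw_init)
		adev->df.funcs->hw_init(adev);

	return 0;
}
```

With `df.funcs` left NULL the call returns 0 and the callback is
skipped; with `df.funcs` pointing at `mock_df_funcs_table` the callback
runs once, the same as it would without the guard.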

### Step 2.3: IDENTIFY THE BUG MECHANISM
**Category: NULL pointer dereference**
- `adev->df.funcs` can be NULL if the DF IP version doesn't match any
  known version in `amdgpu_discovery.c`
- The code dereferences this NULL pointer to check `->hw_init`
- This causes a kernel oops

Record: [NULL pointer dereference] [df.funcs can be NULL when DF IP
version is unrecognized; the fix adds a standard guard consistent with
soc15.c patterns]

### Step 2.4: ASSESS THE FIX QUALITY
- Obviously correct: the pattern `if (ptr && ptr->member)` is idiomatic
  C null-guard
- Consistent: `soc15.c` already uses `if (adev->df.funcs &&
  adev->df.funcs->hw_init)` and `if (adev->df.funcs &&
  adev->df.funcs->sw_init)` - the exact same pattern
- Minimal: single condition addition, no behavior change when df.funcs
  is non-NULL
- Regression risk: zero - the only change is skipping the call when
  funcs is NULL (which would crash otherwise)

Record: [Obviously correct, minimal, zero regression risk. Matches
existing patterns in soc15.c]

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: BLAME THE CHANGED LINES
The buggy line was introduced by commit `73048bda46c308` ("drm/amdgpu:
Fix atomics on GFX12") by David Belanger on 2024-06-10. This commit
added the DF hw_init call to soc24_common_hw_init but omitted the NULL
check for `adev->df.funcs`, unlike how soc15.c handles it.

Record: [Buggy code introduced by 73048bda46c308, present in v6.11+]

### Step 3.2: FOLLOW THE FIXES TAG
No Fixes: tag present (expected for review candidates).

Record: [No Fixes: tag, but the implicit fix target is 73048bda46c308]

### Step 3.3: CHECK FILE HISTORY
soc24.c was created by `98b912c50e449` (first in v6.11). The buggy
commit `73048bda46c308` was also added in v6.11. The file has had ~20
subsequent commits (refactoring handle pointers, etc.), but none touched
this specific df.funcs line.

Record: [Standalone fix, no prerequisites beyond the original buggy
commit]

### Step 3.4: CHECK THE AUTHOR
Likun Gao is a regular AMD GPU contributor with multiple recent commits
to the amdgpu subsystem. The reviewer (Hawking Zhang) is the original
soc24.c creator and AMD architect.

Record: [Author is a regular AMD contributor; reviewer is the subsystem
architect]

### Step 3.5: DEPENDENCIES
This fix has no dependencies. It simply adds a NULL check guard to an
existing conditional. It will apply to any tree that contains commit
`73048bda46c308`.

Record: [No dependencies, standalone fix]

---

## PHASE 4: MAILING LIST RESEARCH

### Step 4.1: PATCH DISCUSSION
Found the original submission at
[spinics](https://www.spinics.net/lists/amd-gfx/msg138858.html).
Submitted by Alex Deucher on March 6, 2026. No objections or NAKs
visible. It was part of a batch of AMD GPU fixes.

Record: [Found submission on amd-gfx list, no objections, submitted in a
batch of fixes by the maintainer]

### Step 4.2: REVIEWERS
Reviewed by Hawking Zhang (AMD architect), signed off by Alex Deucher
(subsystem maintainer). Both are the top-level amdgpu maintainers.

Record: [Reviewed and merged by subsystem maintainers]

### Step 4.3-4.5: BUG REPORT / RELATED PATCHES / STABLE HISTORY
No syzbot report, no explicit bug report URL. This appears to be an
internally-discovered issue at AMD.

Record: [Internal AMD finding, no external bug report]

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1-5.2: KEY FUNCTIONS AND CALLERS
`soc24_common_hw_init()` is called:
1. During GPU initialization via the `amd_ip_funcs` table (line 588:
   `.hw_init = soc24_common_hw_init`)
2. During resume via `soc24_common_resume()` (lines 524-527)

Both are common execution paths for any system with soc24 hardware.
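
Because resume reuses the init routine, one guard inside that routine
covers both entry points. A minimal user-space sketch of that call
shape, assuming hypothetical mock names (`mock_adev`,
`mock_common_hw_init()`, `mock_common_resume()`) rather than the real
amdgpu types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock types; names and fields are illustrative only. */
struct mock_adev;

struct mock_df_funcs {
	void (*hw_init)(struct mock_adev *adev);
};

struct mock_adev {
	const struct mock_df_funcs *df_funcs; /* NULL if no callbacks were set */
};

/* Shared init routine: the single place where df_funcs is dereferenced. */
static int mock_common_hw_init(struct mock_adev *adev)
{
	assert(adev != NULL);

	if (adev->df_funcs && adev->df_funcs->hw_init)
		adev->df_funcs->hw_init(adev);

	return 0;
}

/* Resume reuses the init routine, so it inherits the same guard. */
static int mock_common_resume(struct mock_adev *adev)
{
	return mock_common_hw_init(adev);
}
```

In this mock, both `mock_common_hw_init()` and `mock_common_resume()`
return 0 for a device whose `df_funcs` is NULL.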

### Step 5.3-5.4: WHY df.funcs CAN BE NULL
In `amdgpu_discovery.c`, `adev->df.funcs` is set in a switch on
`DF_HWIP` version. The default case is `break` (no assignment). If a
soc24 device has a DF IP version not in the list, `df.funcs` remains
NULL. This is the exact trigger.
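
The way a `default: break` leaves the pointer unset can be sketched in
a few lines. This is an illustrative mock under stated assumptions: the
enum values, `mock_df_listed_funcs` and `mock_set_df_funcs()` are
hypothetical, and the real version list in `amdgpu_discovery.c` is not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock types and version codes, for illustration only. */
struct mock_df_funcs {
	int id;
};

struct mock_adev {
	const struct mock_df_funcs *df_funcs; /* starts out NULL */
};

enum mock_df_version { MOCK_DF_LISTED, MOCK_DF_UNLISTED };

static const struct mock_df_funcs mock_df_listed_funcs = { .id = 1 };

/* Assign callbacks by version; unlisted versions fall through untouched. */
static void mock_set_df_funcs(struct mock_adev *adev, enum mock_df_version ver)
{
	assert(adev != NULL);

	switch (ver) {
	case MOCK_DF_LISTED:
		adev->df_funcs = &mock_df_listed_funcs;
		break;
	default:
		/* No assignment here, so df_funcs keeps its previous value. */
		break;
	}
}
```

Starting from a zero-initialized `struct mock_adev`, calling
`mock_set_df_funcs()` with `MOCK_DF_UNLISTED` leaves `df_funcs` NULL,
which is the state any later unguarded dereference would trip over.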

### Step 5.5: SIMILAR PATTERNS
Verified: `soc15.c` consistently uses the guarded pattern:
- Line 1253: `if (adev->df.funcs && adev->df.funcs->sw_init)`
- Line 1264: `if (adev->df.funcs && adev->df.funcs->sw_fini)`
- Line 1498: `if (adev->df.funcs &&
  adev->df.funcs->get_clockgating_state)`
- `gmc_v9_0.c` also guards with `if (adev->df.funcs && ...)`

soc24.c is the ONLY file missing this guard.

Record: [All other callers guard df.funcs with NULL check; soc24.c is
the sole exception]

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: BUGGY CODE IN STABLE TREES
- `soc24.c` first appeared in v6.11
- The buggy commit `73048bda46c308` is in v6.11+
- Therefore the bug exists in stable trees: **6.11.y, 6.12.y, 7.0.y**
- Not present in 6.6.y or earlier (soc24.c doesn't exist there)

Record: [Bug exists in 6.11.y, 6.12.y, 7.0.y]

### Step 6.2: BACKPORT COMPLICATIONS
The fix is a single-line change. No conflicting refactoring has touched
this specific line. Clean apply expected.

Record: [Expected clean apply to all affected stable trees]

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

### Step 7.1: SUBSYSTEM CRITICALITY
- **Subsystem:** GPU driver (drm/amdgpu) - IMPORTANT
- AMD GPUs are extremely common in desktop and laptop systems
- soc24 corresponds to RDNA4 generation (GC 12.0.x) - recent and
  actively shipping hardware

Record: [drm/amdgpu] [IMPORTANT - affects users of recent AMD GPUs]

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: WHO IS AFFECTED
All users with soc24 (RDNA4) AMD GPU hardware where the DF IP version
doesn't match a listed version in discovery.

### Step 8.2: TRIGGER CONDITIONS
The crash triggers during:
- GPU hardware initialization (every boot)
- GPU resume from suspend (every suspend/resume cycle)

These are unavoidable common paths.

### Step 8.3: FAILURE MODE SEVERITY
**CRITICAL** - NULL pointer dereference causes a kernel oops, crashing
the system during boot or resume. The GPU driver is essential for
display output.

### Step 8.4: RISK-BENEFIT RATIO
- **Benefit:** Prevents kernel oops on boot/resume for soc24 users -
  VERY HIGH
- **Risk:** 1-line addition of a NULL check, zero chance of regression -
  VERY LOW
- **Ratio:** Extremely favorable

Record: [Benefit: VERY HIGH (prevents crash), Risk: VERY LOW (1-line
NULL guard)]

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: EVIDENCE COMPILATION

**FOR backporting:**
- Fixes a NULL pointer dereference (kernel oops/crash)
- Affects GPU init and resume paths (common, unavoidable paths)
- Single-line fix, obviously correct
- Matches existing defensive patterns in soc15.c, gmc_v9_0.c,
  amdgpu_ras.c
- Reviewed by AMD architect (Hawking Zhang), merged by subsystem
  maintainer (Alex Deucher)
- Zero regression risk
- Affects actively-shipping hardware (RDNA4)

**AGAINST backporting:**
- None identified

### Step 9.2: STABLE RULES CHECKLIST
1. Obviously correct and tested? **YES** - standard NULL guard, matches
   existing patterns, reviewed by maintainers
2. Fixes a real bug? **YES** - NULL pointer dereference → kernel oops
3. Important issue? **YES** - system crash during GPU init/resume
   (CRITICAL)
4. Small and contained? **YES** - 1 line changed in 1 file
5. No new features or APIs? **YES** - purely defensive check
6. Can apply to stable? **YES** - clean apply expected for 6.11.y+

### Step 9.3: EXCEPTION CATEGORIES
Not an exception case; this is a standard bug fix that meets all normal
criteria.

### Step 9.4: DECISION
This is a textbook stable backport candidate: a one-line NULL pointer
check that prevents a kernel oops during GPU initialization and resume
on recent AMD hardware.

---

## Verification

- [Phase 1] Parsed tags: Reviewed-by: Hawking Zhang, SOB: Likun Gao,
  Alex Deucher
- [Phase 2] Diff analysis: 1 line changed in soc24_common_hw_init(),
  adds `adev->df.funcs &&` guard
- [Phase 3] git blame: buggy code introduced in commit 73048bda46c308
  (v6.11), "Fix atomics on GFX12"
- [Phase 3] git merge-base: confirmed 73048bda46c308 is in v6.11+,
  soc24.c first added in v6.11
- [Phase 4] Found original submission at spinics amd-gfx list, no NAKs
  or concerns
- [Phase 5] Grep for `adev->df.funcs`: soc15.c, gmc_v9_0.c, amdgpu_ras.c
  all use NULL guard; soc24.c is the only exception
- [Phase 5] amdgpu_discovery.c: df.funcs assignment has a `default:
  break` that leaves funcs NULL for unrecognized DF versions
- [Phase 5] soc24_common_resume() calls soc24_common_hw_init(),
  confirming crash on both init and resume
- [Phase 6] Bug exists in 6.11.y, 6.12.y, 7.0.y; not in 6.6.y or earlier
  (soc24.c not present)
- [Phase 8] Failure mode: NULL deref -> kernel oops during GPU
  init/resume, severity CRITICAL

**YES**

 drivers/gpu/drm/amd/amdgpu/soc24.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc24.c b/drivers/gpu/drm/amd/amdgpu/soc24.c
index ecb6c3fcfbd15..984262936545f 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc24.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc24.c
@@ -484,7 +484,7 @@ static int soc24_common_hw_init(struct amdgpu_ip_block *ip_block)
 	if (adev->nbio.funcs->remap_hdp_registers)
 		adev->nbio.funcs->remap_hdp_registers(adev);
 
-	if (adev->df.funcs->hw_init)
+	if (adev->df.funcs && adev->df.funcs->hw_init)
 		adev->df.funcs->hw_init(adev);
 
 	/* enable the doorbell aperture */
-- 
2.53.0




Thread overview: 4+ messages
     [not found] <20260420131539.986432-1-sashal@kernel.org>
2026-04-20 13:08 ` [PATCH AUTOSEL 7.0] drm/amdgpu/userq: remove queue from doorbell xarray Sasha Levin
2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.12] drm/amd/display: Fix dcn401_optimize_bandwidth Sasha Levin
2026-04-20 13:09 ` [PATCH AUTOSEL 6.18] drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size Sasha Levin
2026-04-20 13:09 ` [PATCH AUTOSEL 7.0-6.12] drm/amdgpu: fix DF NULL pointer issue for soc24 Sasha Levin
