public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Ce Sun <cesun102@amd.com>, Lijo Lazar <lijo.lazar@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Sasha Levin <sashal@kernel.org>,
	christian.koenig@amd.com, airlied@gmail.com, simona@ffwll.ch,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19-5.15] drm/amdgpu: Adjust usleep_range in fence wait
Date: Mon, 23 Feb 2026 07:37:21 -0500	[thread overview]
Message-ID: <20260223123738.1532940-16-sashal@kernel.org> (raw)
In-Reply-To: <20260223123738.1532940-1-sashal@kernel.org>

From: Ce Sun <cesun102@amd.com>

[ Upstream commit 3ee1c72606bd2842f0f377fd4b118362af0323ae ]

Tune the sleep interval in the PSP fence wait loop from 10-100us to
60-100us.This adjustment results in an overall wait window of 1.2s
(60us * 20000 iterations) to 2 seconds (100us * 20000 iterations),
which guarantees that we can retrieve the correct fence value

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the key information. Let me analyze this commit:

## Analysis

### What the commit does
This is a one-line change that increases the minimum sleep time in the
PSP (Platform Security Processor) fence wait loop from `usleep_range(10,
100)` to `usleep_range(60, 100)`.

### Context
The `psp_cmd_submit_buf()` function submits a command to AMD's Platform
Security Processor and waits for completion by polling a fence buffer.
The loop runs up to `psp_timeout = 20000` iterations.

**Previous behavior:** With `usleep_range(10, 100)`, the total wait
window was approximately:
- Minimum: 10us * 20000 = 200ms
- Maximum: 100us * 20000 = 2000ms (2s)

**New behavior:** With `usleep_range(60, 100)`, the total wait window
is:
- Minimum: 60us * 20000 = 1200ms (1.2s)
- Maximum: 100us * 20000 = 2000ms (2s)

### Bug Analysis
The commit message says this "guarantees that we can retrieve the
correct fence value." This implies that with the 10us minimum, the
kernel scheduler could wake up after approximately 10us each iteration,
causing the 20000 iterations to complete in as little as ~200ms —
**before the PSP hardware has finished its work**. The fix increases the
minimum wait to ensure the total poll window is at least 1.2s.

This is fixing a **real bug**: a timeout that is too short, causing
legitimate PSP commands to fail spuriously. When PSP fence wait fails,
line 762-764 shows it returns `-EINVAL`, which causes firmware load
failures and GPU initialization failures. This can make the GPU non-
functional.

### Stable criteria evaluation
1. **Fixes a real bug**: Yes — PSP commands can time out prematurely,
   causing GPU initialization failures.
2. **Obviously correct**: Yes — it's a one-line change to a sleep range,
   reviewed by AMD engineer Lijo Lazar.
3. **Small and contained**: Yes — single line change in one file.
4. **No new features**: Correct — only adjusts a timing parameter.
5. **Risk**: Very low. The maximum sleep time (100us) is unchanged. The
   only effect is that the minimum sleep per iteration increases, making
   the total wait time longer and more reliable. This cannot cause
   regressions — it only makes the timeout more generous.

### Concerns
- The change is a timing adjustment, which can sometimes be considered a
  "tuning" change rather than a bug fix. However, the commit message
  explicitly states it fixes a correctness problem (failing to retrieve
  the correct fence value), and the consequence of the bug is GPU
  initialization failure.
- No explicit `Fixes:` tag, but that is expected for autosel candidates.

### User Impact
PSP fence wait failures manifest as GPU initialization failures. Users
affected would see their AMD GPU fail to initialize, making it unusable.
This is a high-severity issue for affected users.

## Verification
- Confirmed `psp_timeout = 20000` at line 268 of `amdgpu_psp.c` —
  verified the iteration count claim in the commit message.
- Confirmed the wait loop structure at lines 718-731 — the loop polls
  with `usleep_range` and breaks on timeout.
- Confirmed that timeout failure leads to `-EINVAL` return at line 763,
  causing command failure.
- Confirmed the change is a single-line edit from `usleep_range(10,
  100)` to `usleep_range(60, 100)`.
- Verified the commit is reviewed by Lijo Lazar (AMD PSP engineer) and
  signed off by Alex Deucher (AMD GPU maintainer).
- The math checks out: 10us * 20000 = 200ms (too short), 60us * 20000 =
  1.2s (sufficient per commit message).

This is a small, obvious, low-risk fix for a real bug that causes GPU
initialization failures. It meets all stable kernel criteria.

**YES**

 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 0b10497d487c3..81bdd6aaad2a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -726,7 +726,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		ras_intr = amdgpu_ras_intr_triggered();
 		if (ras_intr)
 			break;
-		usleep_range(10, 100);
+		usleep_range(60, 100);
 		amdgpu_device_invalidate_hdp(psp->adev, NULL);
 	}
 
-- 
2.51.0


  parent reply	other threads:[~2026-02-23 12:38 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-23 12:37 [PATCH AUTOSEL 6.19-6.1] drm/amd/display: Remove conditional for shaper 3DLUT power-on Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] ASoC: rt721-sdca: Fix issue of fail to detect OMTP jack type Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] ALSA: hda/tas2781: Ignore reset check for SPI device Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-5.15] btrfs: replace BUG() with error handling in __btrfs_balance() Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-5.15] ALSA: usb-audio: Add sanity check for OOB writes at silencing Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.12] drm/amd/display: Fix system resume lag issue Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.12] arm64: hugetlbpage: avoid unused-but-set-parameter warning (gcc-16) Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.12] drm/amd/display: Fix writeback on DCN 3.2+ Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] drm/amdgpu: Skip vcn poison irq release on VF Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] drm/amdgpu: return when ras table checksum is error Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] regulator: core: Remove regulator supply_name length limit Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-5.10] ARM: 9467/1: mm: Don't use %pK through printk Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-5.10] drm/radeon: Add HAINAN clock adjustment Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] drm/amdgpu: avoid sdma ring reset in sriov Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.12] spi: spidev: fix lock inversion between spi_lock and buf_lock Sasha Levin
2026-02-23 12:37 ` Sasha Levin [this message]
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19] mshv: Ignore second stats page map result failure Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19] btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure() Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] ALSA: hda/hdmi: Add quirk for TUXEDO IBS14G6 Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19] drm/amd/display: set enable_legacy_fast_update to false for DCN36 Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19] x86/hyperv: Move hv crash init after hypercall pg setup Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] mshv: clear eventfd counter on irqfd shutdown Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-5.10] drm/amd/display: Avoid updating surface with the same surface under MPO Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-5.15] ALSA: usb-audio: Update the number of packets properly at receiving Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.12] drm/amd/display: bypass post csc for additional color spaces in dal Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] ASoC: amd: amd_sdw: add machine driver quirk for Lenovo models Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-6.18] ALSA: hda/realtek: Fix headset mic on ASUS Zenbook 14 UX3405MA Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19] Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT Sasha Levin
2026-02-23 12:37 ` [PATCH AUTOSEL 6.19-5.10] drm/amdgpu: Add HAINAN clock adjustment Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260223123738.1532940-16-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=airlied@gmail.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=cesun102@amd.com \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=lijo.lazar@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=simona@ffwll.ch \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox