From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C65053FCB3F; Mon, 20 Apr 2026 13:26:27 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776691587; cv=none; b=qBx/fGXffBpA0prR/iLvS0qyUXsUW1DzqrcABqeoOMcf8BchwazkKjRSO+Vwwj8TlMgv4Pu84IFR/tfetodT3dBwAIC48GL5DKzcvt9NAkRZhUj3kc+CpBH9rMDgOAqBCuEtv6THIUiu+px9MvrWaFZrSiH2KWgjVHaHA9wbVbo=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776691587; c=relaxed/simple; bh=KA9zXBhWuGhCX1Bk2eIy/h8nhJSg+lrpozaPxXRjTgQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=eB5aLcQAuvuN/Ln0M63vEd53sqcBpJ8IuDgNE7r0K6EIy9nwgGoflhSk/ISvqKeQnZScF5LT35PvHr7hnwg+zHTPp92cr2px1kJ/fIAVXo1Pf1VQlnvmU39gtz3x6SZ2T423ZwRl4/wrLgkeLV6LahjneB2x+XGSM4sEDEXkmKw=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=b2L3vTmF; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="b2L3vTmF"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2280EC2BCB8; Mon, 20 Apr 2026 13:26:26 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1776691587; bh=KA9zXBhWuGhCX1Bk2eIy/h8nhJSg+lrpozaPxXRjTgQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=b2L3vTmFW3d/hmyI8yCQqqcVfny2sL+4tfDdwoSWcqxm75GMwONIllFusloZMuMYB IePbvZDRtg/7yiqWpNg4pTuBQaxM+z/Ey8q5LpgWjHeMjfcSTvn0IuPBBQ+uMBF4rA Q0BfIbs5NKm5ZFw1+MFqnOFEoIf6DYYKmhOeRReEt+esJJSemj/SuVUy4XW2gAv+qY
	BtMPK6yQNFT6YpExz0uB/KN5MGsuPLiLqidNuNmCOxpoMCgL7tavt0+6zDv6ydA1EO qTsMtC66U435Fl32cykxATZF+MlZxF7jPD8GuAyy0J3X3SVW+sc4QlS8su7Fcpc5Tr JwqFE66Ku1zxw==
From: Sasha Levin
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: "Jesse.Zhang", Vitaly Prosyak, Christian König, Jesse Zhang, Alex Deucher, Sasha Levin, airlied@gmail.com, simona@ffwll.ch, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 7.0-5.10] drm/amdgpu: validate fence_count in wait_fences ioctl
Date: Mon, 20 Apr 2026 09:18:31 -0400
Message-ID: <20260420132314.1023554-117-sashal@kernel.org>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260420132314.1023554-1-sashal@kernel.org>
References: <20260420132314.1023554-1-sashal@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 7.0
Content-Transfer-Encoding: 8bit

From: "Jesse.Zhang"

[ Upstream commit 2cef848812a071991c20090cbe051a0a96c50a0c ]

Add an early parameter check in amdgpu_cs_wait_fences_ioctl() to reject a zero fence_count with -EINVAL.

dma_fence_wait_any_timeout() requires count > 0. When userspace passes fence_count == 0, the call propagates down to dma_fence core which does not expect a zero-length array and triggers a WARN_ON.

Return -EINVAL immediately so the caller gets a clear error instead of hitting an unexpected warning in the DMA fence subsystem.

No functional change for well-formed userspace callers.

v2:
- Reworked commit message to clarify the parameter validation rationale
- Removed verbose crash log from commit description
- Simplified inline code comment

Reviewed-by: Vitaly Prosyak
Reviewed-by: Christian König
Signed-off-by: Jesse Zhang
Signed-off-by: Alex Deucher
Signed-off-by: Sasha Levin
---

LLM Generated explanations, may be completely bogus:

I now have comprehensive information to complete the full analysis. Let me compile the findings.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `drm/amdgpu`
- Action verb: "validate" (adding missing input validation = bug fix)
- Summary: Validates `fence_count` parameter in `wait_fences` ioctl to reject zero.
Record: [drm/amdgpu] [validate] [Adding missing input validation for fence_count in wait_fences ioctl]

**Step 1.2: Tags**
- Reviewed-by: Vitaly Prosyak (AMD developer)
- Reviewed-by: Christian König (DMA-fence subsystem maintainer)
- Signed-off-by: Jesse Zhang (author)
- Signed-off-by: Alex Deucher (AMD DRM maintainer, committer)
- No Fixes: tag (expected for autosel review)
- No Cc: stable (expected)
Record: Strong review signal — Christian König is the DRM scheduler / DMA-fence maintainer. Alex Deucher is the AMD GPU maintainer.

**Step 1.3: Commit Body**
Bug: Userspace passes `fence_count == 0` to `AMDGPU_WAIT_FENCES` ioctl. This propagates to `dma_fence_wait_any_timeout()`, which has `WARN_ON(!count)`. The WARN_ON fires in the DMA fence subsystem.
Symptom: Kernel WARN_ON triggered from userspace input. On `panic_on_warn` systems, this causes a kernel crash.
Fix: Return `-EINVAL` early when `fence_count == 0`.
Record: [Bug: WARN_ON trigger from userspace-controlled input] [Symptom: kernel warning/crash] [Author's root cause: dma_fence_wait_any_timeout requires count > 0]

**Step 1.4: Hidden Bug Fix Detection**
"Validate" = adding missing parameter check. This IS a bug fix: it prevents a WARN_ON (and potential crash) from userspace-controlled input.
Record: [Yes, this is a bug fix — adds missing input validation to prevent WARN_ON from ioctl with zero count]

---

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- Single file modified: `drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c`
- +7 lines (4-line comment + 2 lines of code + 1 blank line)
- Function modified: `amdgpu_cs_wait_fences_ioctl()`
Record: [1 file, +7 lines, single function, surgical fix]

**Step 2.2: Code Flow Change**
BEFORE: `fence_count == 0` passes through to `memdup_array_user` (returns ZERO_SIZE_PTR), then reaches `dma_fence_wait_any_timeout()` which fires `WARN_ON(!count)`.
AFTER: `fence_count == 0` is caught at ioctl entry, returns `-EINVAL` immediately.
Record: [Before: WARN_ON triggered. After: clean EINVAL return]

**Step 2.3: Bug Mechanism**
Category: Missing input validation / parameter check.
Mechanism: The ioctl fails to validate a user-controlled parameter before passing it to a core kernel API that has a `WARN_ON` precondition. Verified at line 894 of `dma-fence.c`:

```894:894:drivers/dma-buf/dma-fence.c
	if (WARN_ON(!fences || !count || timeout < 0))
```

Record: [Missing input validation] [User-controlled count==0 triggers WARN_ON in dma_fence_wait_any_timeout]

**Step 2.4: Fix Quality**
- Obviously correct: a simple zero-check before further processing.
- Minimal/surgical: 7 lines total including comments.
- Regression risk: None. `fence_count == 0` is meaningless ("wait for zero fences"), and the ioctl already failed (with WARN) in this case. Returning `-EINVAL` is the correct behavior.
- No API change for well-formed callers (as stated in the commit message).
Record: [Fix is obviously correct, minimal, no regression risk]

---

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
`amdgpu_cs_wait_fences_ioctl` was introduced by commit `eef18a827a9ec5` (Junwei Zhang, 2016-11-04, "drm/amdgpu: add the interface of waiting multiple fences (v4)"). This is v4.10-era code, present in ALL active stable trees.
Record: [Bug introduced in eef18a827a9ec5, v4.10 timeframe, present in all stable trees]

**Step 3.2: Fixes Tag**
No Fixes: tag present. The implicit target is `eef18a827a9ec5` which introduced the ioctl without the validation.
Record: [No Fixes: tag. Original code from 2016.]

**Step 3.3: File History**
Recent changes to `amdgpu_cs.c` include `dea75df7afe14` (memdup_array_user conversion) and `69050f8d6d075` (kzalloc_objs treewide change). These are cosmetic/API modernizations that don't affect the bug or fix logic.
Record: [Recent changes are cosmetic. Fix is standalone.]

**Step 3.4: Author**
Jesse Zhang is a regular AMD GPU contributor with multiple fix commits in the subsystem (SDMA fixes, out-of-bounds fixes, etc.).
Record: [Active AMD subsystem contributor]

**Step 3.5: Dependencies**
The fix adds a simple `if` check at the start of the function, before any recently-changed code. It does NOT depend on patches 2/3 in the series (which touch different files/functions entirely). The series patches are independent input validation improvements.
Record: [Standalone fix, no dependencies on other patches]

---

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1: Original Submission**
Found at https://lists.freedesktop.org/archives/amd-gfx/2026-March/140748.html. This is v2 of the patch with reworked commit message. No NAKs found.
Record: [Patch submitted March 2026. v2 incorporated review feedback on commit message clarity.]

**Step 4.2: Reviewers**
Reviewed by Vitaly Prosyak (AMD) and Christian König (DMA-fence/DRM scheduler maintainer). Committed by Alex Deucher (AMD DRM maintainer).
Record: [Subsystem maintainer reviewed and approved]

**Step 4.3: Bug Report**
No separate bug report link. The bug was found by code inspection (the WARN_ON contract in `dma_fence_wait_any_timeout` is explicit).
Record: [Found by code review, not user report]

**Step 4.4: Series Context**
3-patch series, all independent input validation improvements.
Patch 2/3 changes WARN to DRM_ERROR in `amdgpu_sched_ioctl` (separate file/function). Each is standalone.
Record: [Independent patches in the series. This one is self-contained.]

**Step 4.5: Stable Discussion**
No explicit stable nomination found in the thread.
Record: [No explicit stable discussion, which is expected for autosel candidates.]

---

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Functions Modified**
Only `amdgpu_cs_wait_fences_ioctl()`.

**Step 5.2: Callers**
This is a DRM ioctl handler registered via `DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_FENCES, ...)` with `DRM_AUTH|DRM_RENDER_ALLOW`. It is reachable by any process with access to `/dev/dri/renderDNN` — no special privileges required beyond DRM_AUTH.
Record: [Ioctl handler, reachable from unprivileged userspace via render node]

**Step 5.3-5.4: Call Chain**
Userspace ioctl -> `drm_ioctl` -> `amdgpu_cs_wait_fences_ioctl` -> (if !wait_all) `amdgpu_cs_wait_any_fence` -> `dma_fence_wait_any_timeout` -> `WARN_ON(!count)`.
Record: [Direct ioctl path, user-controlled trigger, WARN_ON reached with fence_count=0]

**Step 5.5: Similar Patterns**
The `amdgpu_cs_wait_all_fences` path with count==0 doesn't hit a WARN_ON (the for loop simply doesn't execute), but returns success for a meaningless request. The fix correctly catches both paths by validating at the ioctl entry point.
Record: [Fix covers both wait_all and wait_any paths]

---

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Buggy Code in Stable**
The `amdgpu_cs_wait_fences_ioctl` function has existed since `eef18a827a9ec5` (v4.10, November 2016). It exists in ALL active stable trees.
Record: [Bug exists in all stable trees (v5.4, v5.10, v5.15, v6.1, v6.6, v6.12)]

**Step 6.2: Backport Complications**
The fix adds a check at the very start of the function body, before any code that has been recently modified. In older stable trees, the `memdup_array_user` line would be `memdup_user` with a manual size calculation instead, but the added check comes BEFORE that line.
Minor context adjustment may be needed for the surrounding `memdup` call, but the fix itself is trivially applicable.
Record: [Clean or near-clean apply expected. Minor context difference in older trees.]

**Step 6.3: Related Fixes in Stable**
No prior fix for this zero-count issue was found.
Record: [No prior fix exists in stable.]

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

**Step 7.1: Subsystem**
DRM/AMDGPU — a widely used GPU driver (AMD is one of two major discrete GPU vendors on Linux). Criticality: IMPORTANT.
Record: [drm/amdgpu, IMPORTANT — widely used GPU driver]

**Step 7.2: Activity**
Actively developed (many recent commits). The file has had multiple changes since v6.6.
Record: [Very active subsystem]

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Affected Users**
All users with AMD GPUs (a very large population). Any process with DRM render node access.
Record: [All AMD GPU users, unprivileged trigger]

**Step 8.2: Trigger Conditions**
Any process that opens `/dev/dri/renderDNN` and issues the `AMDGPU_WAIT_FENCES` ioctl with `fence_count == 0`. This is trivially triggerable — no special timing, no race, no specific hardware needed. A single malformed ioctl call from any render-node-capable process triggers it.
Record: [Trivially triggerable from unprivileged userspace. Deterministic, no race needed.]

**Step 8.3: Failure Mode**
- `WARN_ON` fires in `dma_fence_wait_any_timeout()`: produces stack trace in kernel log.
- On systems with `panic_on_warn=1` (common in syzbot testing, some hardened deployments): full kernel panic.
- Even without `panic_on_warn`, repeated triggering can flood kernel logs and potentially be used for denial of service.
- Severity: **MEDIUM-HIGH** (WARN from unprivileged userspace, potential crash with panic_on_warn)
Record: [WARN_ON from userspace, CRITICAL with panic_on_warn, MEDIUM otherwise]

**Step 8.4: Risk-Benefit**
- BENEFIT: High — prevents kernel warning/crash from trivial unprivileged userspace input on all AMD GPU systems.
- RISK: Very low — 2 lines of actual code (a simple zero-check), obviously correct, no functional change for valid callers, reviewed by subsystem maintainer.
Record: [High benefit, very low risk]

---

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: Evidence**
FOR backporting:
- Fixes a WARN_ON triggerable from unprivileged userspace via ioctl
- Trivial to trigger (deterministic, single ioctl call)
- Crash on `panic_on_warn` systems
- Reviewed by DMA-fence subsystem maintainer (Christian König) and AMD maintainer
- 7 lines total, 2 lines of logic — minimal and obviously correct
- Bug exists since v4.10 (2016), affects all stable trees
- Standalone fix with no dependencies
- No functional change for well-formed callers

AGAINST backporting:
- Not a crash for default kernel configuration (WARN, not BUG)
- No user report (found by code inspection)
- Minor context may differ in older stable trees

**Step 9.2: Stable Rules Checklist**
1. Obviously correct and tested? **YES** — trivial zero-check, reviewed by maintainer
2. Fixes a real bug? **YES** — WARN_ON from userspace input
3. Important issue? **YES** — userspace-triggerable warning, crash with panic_on_warn
4. Small and contained? **YES** — 7 lines in one function
5. No new features or APIs? **YES** — purely input validation
6. Can apply to stable? **YES** — should apply cleanly or near-cleanly

**Step 9.3: Exception Categories**
N/A — this is a straightforward bug fix, no exception needed.

**Step 9.4: Decision**
The fix is small, surgical, obviously correct, reviewed by the subsystem maintainer, and prevents a WARN_ON (potential crash) triggerable from unprivileged userspace.
The bug has existed in all stable trees since 2016. The risk is negligible and the benefit is clear.

---

## Verification

- [Phase 1] Parsed tags: Reviewed-by Christian König (DMA-fence maintainer), Reviewed-by Vitaly Prosyak, Signed-off Alex Deucher (AMD DRM maintainer)
- [Phase 2] Diff analysis: +7 lines in `amdgpu_cs_wait_fences_ioctl()`, adds `if (!wait->in.fence_count) return -EINVAL` before allocation
- [Phase 2] Confirmed WARN_ON at dma-fence.c:894: `WARN_ON(!fences || !count || timeout < 0)` — verified by reading the source
- [Phase 2] Confirmed `memdup_array_user(ptr, 0, size)` returns ZERO_SIZE_PTR (not error), so zero count passes through to `dma_fence_wait_any_timeout`
- [Phase 3] git blame: `amdgpu_cs_wait_fences_ioctl` introduced in eef18a827a9ec5 (2016-11-04), confirmed present in p-5.10, p-5.15 tags
- [Phase 3] git log: no prior fix for this zero-count issue in file history
- [Phase 4] Found original submission at lists.freedesktop.org amd-gfx March 2026. v2 with reworked commit message. No NAKs.
- [Phase 4] Series is 3 independent patches; patch 2/3 touches different file (amdgpu_sched.c). This patch is standalone.
- [Phase 5] Ioctl registered with DRM_AUTH|DRM_RENDER_ALLOW — confirmed reachable from unprivileged userspace
- [Phase 5] Traced call chain: ioctl -> wait_any_fence -> dma_fence_wait_any_timeout -> WARN_ON(!count)
- [Phase 6] Bug exists in all active stable trees (code from 2016)
- [Phase 6] Fix applies before any recently-changed code; near-clean apply expected
- [Phase 8] Severity: WARN_ON from unprivileged userspace, crash with panic_on_warn; benefit high, risk very low

**YES**

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 24e4b4fc91564..142022295fe15 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1747,6 +1747,13 @@ int amdgpu_cs_wait_fences_ioctl(struct drm_device *dev, void *data,
 	struct drm_amdgpu_fence *fences;
 	int r;
 
+	/*
+	 * fence_count must be non-zero; dma_fence_wait_any_timeout()
+	 * does not accept an empty fence array.
+	 */
+	if (!wait->in.fence_count)
+		return -EINVAL;
+
 	/* Get the fences from userspace */
 	fences = memdup_array_user(u64_to_user_ptr(wait->in.fences),
 				   wait->in.fence_count,
-- 
2.53.0