AMD-GFX Archive on lore.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Eric Huang <jinhuieric.huang@amd.com>,
	Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Sasha Levin <sashal@kernel.org>,
	Felix.Kuehling@amd.com, amd-gfx@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.17-6.1] drm/amdkfd: fix vram allocation failure for a special case
Date: Sat, 25 Oct 2025 12:00:34 -0400
Message-ID: <20251025160905.3857885-403-sashal@kernel.org>
In-Reply-To: <20251025160905.3857885-1-sashal@kernel.org>

From: Eric Huang <jinhuieric.huang@amd.com>

[ Upstream commit 93aa919ca05bec544b17ee9a1bfe394ce6c94bd8 ]

When it only allocates vram without va, which is 0, and a
SVM range allocated stays in this range, the vram allocation
returns failure. It should be skipped for this case from
SVM usage check.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

This is a small, targeted bug fix that prevents erroneous -EADDRINUSE
failures when userspace allocates a VRAM buffer without providing a VA
(i.e., `va_addr == 0`). The change is confined to the KFD ioctl path and
poses minimal regression risk while fixing a real user-visible issue.

What changed
- In `drivers/gpu/drm/amd/amdkfd/kfd_chardev.c:1045`, inside
  `kfd_ioctl_alloc_memory_of_gpu`, the SVM overlap check was amended to
  skip a special case:
  - New guard added at `drivers/gpu/drm/amd/amdkfd/kfd_chardev.c:1071`:
    - `if (!(!args->va_addr && (flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM))
      && interval_tree_iter_first(...)) { ... return -EADDRINUSE; }`
  - Practically, this means the SVM interval-tree overlap check is
    bypassed only when:
    - `args->va_addr == 0` (no VA requested), and
    - `flags` includes `KFD_IOC_ALLOC_MEM_FLAGS_VRAM`.
  - Previously, the overlap check was unconditional, which could falsely
    report “Address already allocated by SVM” when VA is 0 (see the
    surrounding context at
    `drivers/gpu/drm/amd/amdkfd/kfd_chardev.c:1064-1079`).
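
The guard condition can be looked at in isolation. The following is a
minimal userspace sketch, not the kernel code: the `SKETCH_*` flag names
and values and the function `sketch_needs_svm_check` are illustrative
assumptions, and only the boolean structure is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the UAPI allocation flags (assumed values). */
#define SKETCH_ALLOC_MEM_FLAGS_VRAM (1u << 0)
#define SKETCH_ALLOC_MEM_FLAGS_GTT  (1u << 1)

/*
 * Returns true when the SVM overlap check should still run.
 * Mirrors the patch's `!(!va_addr && (flags & VRAM))` term: the check
 * is skipped only for a VRAM allocation that carries no VA.
 */
bool sketch_needs_svm_check(uint64_t va_addr, uint32_t flags)
{
	bool vram_without_va = (va_addr == 0) &&
			       (flags & SKETCH_ALLOC_MEM_FLAGS_VRAM) != 0;

	return !vram_without_va;
}
```

Under these assumptions, a request with VA 0 and the VRAM flag skips the
check, while VA 0 with a non-VRAM flag, or any non-zero VA, still goes
through the interval-tree lookup.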

Why it’s a bug fix
- The commit message accurately describes a failure mode: when
  allocating VRAM-only without a VA (VA=0) and there exists an SVM range
  that falls in that [0, size) range, the ioctl incorrectly returns
  `-EADDRINUSE`. For VRAM-only allocations without a VA, SVM
  address-range conflicts are irrelevant and should not block allocation.
- The code change corrects this by skipping the SVM overlap check for
  that specific case, avoiding a false-positive error.
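
To see why VA=0 can trip the overlap check, here is a rough sketch of
the page-interval arithmetic used in the lookup (`va_addr >> PAGE_SHIFT`
through `(va_addr + size - 1) >> PAGE_SHIFT`). The 4 KiB page size, the
struct, and the helper names are assumptions made for illustration; the
kernel uses an interval tree rather than a pairwise comparison.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Inclusive range of page numbers. */
struct sketch_page_range {
	uint64_t first;
	uint64_t last;
};

/* Page range covered by an allocation request; size is assumed non-zero. */
struct sketch_page_range sketch_alloc_pages(uint64_t va_addr, uint64_t size)
{
	struct sketch_page_range r = {
		.first = va_addr >> SKETCH_PAGE_SHIFT,
		.last = (va_addr + size - 1) >> SKETCH_PAGE_SHIFT,
	};

	return r;
}

/* Two inclusive ranges overlap when each starts before the other ends. */
bool sketch_ranges_overlap(struct sketch_page_range a,
			   struct sketch_page_range b)
{
	return a.first <= b.last && b.first <= a.last;
}
```

With VA=0 and a 2 MiB size, the request covers pages 0 through 511, so
any SVM range registered within those pages would be reported as a
conflict even though the allocation never asked for that address range.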

Safety and scope
- Minimal, localized change: It adds a single conditional guard and
  comment in one function. No ABI or architectural changes.
- Confined to AMD KFD user memory allocation path; does not touch core
  MM, scheduler, or unrelated GPU subsystems.
- Consistency with mapping rules: mapping requires a non-zero VA. In
  `kfd_mem_attach` (called during mapping), mapping with `mem->va == 0`
  is rejected
  (`drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:858-930`, check at
  “if (!va) { ... return -EINVAL; }”). This ensures that skipping the
  SVM check for VA=0 can’t accidentally permit an overlapping SVM GPU-VA
  mapping later: mapping at VA=0 is inherently invalid and denied. Thus
  the change strictly avoids a spurious allocation-time error without
  enabling unsafe mappings.
- Flags behavior matches UAPI: `KFD_IOC_ALLOC_MEM_FLAGS_VRAM` is
  intended for VRAM allocations (`include/uapi/linux/kfd_ioctl.h:407`).
  VRAM-only allocations with VA=0 are valid for certain use cases (e.g.,
  export or CPU-visible VRAM on large BAR), and should not be blocked by
  SVM interval checks.

Stable backport criteria
- Fixes a real bug affecting users (spurious -EADDRINUSE on valid
  VRAM-only allocations).
- Change is small and contained, with clear intent and low regression
  risk.
- No new features or architectural shifts.
- Touches only driver code in a single path
  (`kfd_ioctl_alloc_memory_of_gpu`), no widespread side effects.

Conclusion
- This is a clear, minimal bug fix that prevents erroneous allocation
  failures and aligns with the mapping semantics already enforced
  elsewhere. It is suitable for stable backport.

 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 79ed3be63d0dd..43115a3744694 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1070,7 +1070,12 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep,
 	svm_range_list_lock_and_flush_work(&p->svms, current->mm);
 	mutex_lock(&p->svms.lock);
 	mmap_write_unlock(current->mm);
-	if (interval_tree_iter_first(&p->svms.objects,
+
+	/* Skip a special case that allocates VRAM without VA,
+	 * VA will be invalid of 0.
+	 */
+	if (!(!args->va_addr && (flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM)) &&
+	    interval_tree_iter_first(&p->svms.objects,
 				     args->va_addr >> PAGE_SHIFT,
 				     (args->va_addr + args->size - 1) >> PAGE_SHIFT)) {
 		pr_err("Address: 0x%llx already allocated by SVM\n",
-- 
2.51.0

