Re: [PATCH 3/3] drm/amdkfd: block THP for non-replayable SVM ranges

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Yitao Jiang" <jytscientist@hotmail.com>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 3/3] drm/amdkfd: block THP for non-replayable SVM ranges
Date: Thu, 25 Jun 2026 11:11:34 +0000	[thread overview]
Message-ID: <20260625111135.44DBB1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <SY1PR01MB10596F9476BAA5333F02F5A6BC0EC2@SY1PR01MB10596.ausprd01.prod.outlook.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:

New issues:
- [High] Error handling in `svm_range_set_attr` aborts the transaction mid-flight without rollback, leaving overlapping ranges in the SVM interval tree and leaking memory.
- [High] `svm_range_add_notifier_locked` silently ignores failures when registering the MMU interval notifier, leading to WARN_ON_ONCE and interval tree corruption.
- [High] Clearing the `KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED` attribute permanently leaks a THP block node and permanently disables THP for the virtual address range.
- [High] Dynamically disabling XNACK via ioctl does not update SVM range notifier flags, leaving THP unblocked and risking GPU page faults.

Pre-existing issues:
- [High] If a user maps SVM memory at virtual address 0, its MMU interval notifier is never removed, causing a Use-After-Free.
--

commit f4df270ac94ba46190c602a81dedff1aa60cc86a
Author: Yitao Jiang <jytscientist@hotmail.com>

drm/amdkfd: block THP for non-replayable SVM ranges

KFD SVM ranges on processes without XNACK, and ranges requested as
GPU_ALWAYS_MAPPED, cannot rely on replayable GPU faults after a CPU THP
remap of the registered VA range. Keep those ranges backed by base
pages while their interval notifier is active.

Opt those SVM interval notifiers into MMU_INTERVAL_NOTIFIER_BLOCK_THP
and update the flag when SVM attributes change. XNACK-enabled ranges
that can handle remaps through replayable faults remain eligible for
THP unless GPU_ALWAYS_MAPPED is requested.

> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 3841943da5ece..0d0feba7b7852 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

[ ... ]

> @@ -81,6 +82,26 @@ static const struct mmu_interval_notifier_ops svm_range_mn_ops = {
>  	.invalidate = svm_range_cpu_invalidate_pagetables,
>  };
>  
> +static unsigned int
> +svm_range_mn_flags(struct svm_range *prange)
> +{
> +	struct kfd_process *p = container_of(prange->svms, struct kfd_process,
> +					     svms);
> +
> +	if (!p->xnack_enabled ||
> +	    (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED))
> +		return MMU_INTERVAL_NOTIFIER_BLOCK_THP;
> +
> +	return 0;
> +}

[Severity: High]
When a process dynamically disables XNACK using the KFD_IOC_SMI_EVENTS ioctl,
does this leave THP unblocked on existing ranges?

In svm_range_switch_xnack_reserve_mem(), p->xnack_enabled is updated
dynamically:

        p->xnack_enabled = xnack_enabled;

However, it does not seem to update the interval notifier flags for existing
SVM ranges. Because the existing ranges remain with THP unblocked, could
khugepaged collapse THPs and generate unrecoverable GPU page faults since
XNACK is disabled?

> +
> +static int
> +svm_range_update_mn_flags_locked(struct svm_range *prange)
> +{
> +	return mmu_interval_notifier_set_flags_locked(&prange->notifier,
> +						      svm_range_mn_flags(prange));
> +}

[Severity: High]
If userspace clears the KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag or XNACK
is enabled, does this permanently leak a THP block node?

When svm_range_update_mn_flags_locked() calls
mmu_interval_notifier_set_flags_locked() with flags = 0, the
interval_sub->flags are zeroed out locally. However, since there is no
teardown path to call mmu_interval_notifier_unblock_thp_locked(), the THP
block node remains.

When the SVM range is eventually freed, mmu_interval_notifier_remove() will
also skip the unblock path because the flag was cleared, resulting in a
permanent memory leak and permanently disabling THP for the virtual address
range.

[ ... ]

> @@ -112,10 +133,11 @@ svm_range_add_notifier_locked(struct mm_struct *mm, struct svm_range *prange)
>  	pr_debug("svms 0x%p prange 0x%p [0x%lx 0x%lx]\n", prange->svms,
>  		 prange, prange->start, prange->last);
>  
> -	mmu_interval_notifier_insert_locked(&prange->notifier, mm,
> -				     prange->start << PAGE_SHIFT,
> -				     prange->npages << PAGE_SHIFT,
> -				     &svm_range_mn_ops);
> +	mmu_interval_notifier_insert_locked_flags(&prange->notifier, mm,
> +						  prange->start << PAGE_SHIFT,
> +						  prange->npages << PAGE_SHIFT,
> +						  &svm_range_mn_ops,
> +						  svm_range_mn_flags(prange));
>  }

[Severity: High]
Does this silently ignore failures during interval notifier registration?

If mmu_interval_notifier_insert_locked_flags() fails, it returns an error
code and prange->notifier.mm remains NULL. Because the return value is
discarded here, subsequent code in svm_range_set_attr() will call
svm_range_update_mn_flags_locked() on the same transaction. This passes the
uninitialized notifier to mmu_interval_notifier_set_flags_locked(), which
will hit a WARN_ON_ONCE(!mm), return -EINVAL, and cause the transaction to
abort, potentially corrupting the interval tree.

[ ... ]

> @@ -3763,6 +3785,12 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
>  	}
>  	list_for_each_entry(prange, &update_list, update_list) {
>  		svm_range_apply_attrs(p, prange, nattr, attrs, &update_mapping);
> +		r = svm_range_update_mn_flags_locked(prange);
> +		if (r) {
> +			mutex_unlock(&svms->lock);
> +			mmap_write_unlock(mm);
> +			goto out;
> +		}

[Severity: High]
Does aborting the transaction here leak memory and corrupt the interval tree?

If svm_range_update_mn_flags_locked() fails, the goto out path completely
bypasses the remove_list processing located further down in the function.
This leaves old cloned ranges permanently in the interval tree alongside
their newly inserted replacements that were added during the insert_list
loop. Additionally, because remove_list is a stack variable, skipping its
teardown leaks the ranges and leaves dangling pointers in the update_list
links.

>  		/* TODO: unmap ranges from GPU that lost access */
>  	}
>  	update_mapping |= !p->xnack_enabled && !list_empty(&remap_list);

[Severity: High]
This is a pre-existing issue, but does svm_range_remove_notifier() fail to
remove the notifier if the SVM range starts at virtual address 0?

In drivers/gpu/drm/amd/amdkfd/kfd_svm.c:svm_range_remove_notifier():

	if (prange->notifier.interval_tree.start != 0 &&
	    prange->notifier.interval_tree.last != 0)
		mmu_interval_notifier_remove(&prange->notifier);

If an SVM range is mapped starting at virtual address 0 (which is permitted
if vm.mmap_min_addr is 0 or if the user has CAP_SYS_RAWIO), the check
evaluates to false and incorrectly skips mmu_interval_notifier_remove().
The prange is subsequently freed, but its embedded notifier structure
remains linked in the mm->notifier_subscriptions->itree, which will cause a
use-after-free the next time the tree is modified.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/SY1PR01MB10596EB75463208A8E1EBBA0FC0EC2@SY1PR01MB10596.ausprd01.prod.outlook.com?part=3

next prev parent reply	other threads:[~2026-06-25 11:11 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-25 10:59 [PATCH 0/3] mm/mmu_notifier, drm/amdgpu: block THP for GPU user mappings Yitao Jiang
2026-06-25 10:59 ` [PATCH 1/3] mm/mmu_notifier: let interval notifiers block THP Yitao Jiang
2026-06-25 11:18   ` sashiko-bot
2026-06-25 11:50   ` David Hildenbrand (Arm)
2026-06-25 11:58   ` Lorenzo Stoakes
2026-06-25 10:59 ` [PATCH 2/3] drm/amdgpu: block THP for HSA userptr notifiers Yitao Jiang
2026-06-25 11:26   ` sashiko-bot
2026-06-25 12:36   ` Christian König
2026-06-25 10:59 ` [PATCH 3/3] drm/amdkfd: block THP for non-replayable SVM ranges Yitao Jiang
2026-06-25 11:11   ` sashiko-bot [this message]
2026-06-25 11:47 ` [PATCH 0/3] mm/mmu_notifier, drm/amdgpu: block THP for GPU user mappings David Hildenbrand (Arm)
2026-06-25 11:54   ` Lorenzo Stoakes
2026-06-25 12:14     ` 回复: " 蒋 亦韬
2026-06-25 12:35 ` Christian König
2026-06-25 13:01   ` 回复: " 蒋 亦韬
2026-06-25 13:06     ` Christian König
2026-06-25 20:51       ` Kuehling, Felix

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260625111135.44DBB1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jytscientist@hotmail.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.