* [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
@ 2025-07-22 16:24 Kent Russell
2025-07-22 22:46 ` Chen, Xiaogang
0 siblings, 1 reply; 13+ messages in thread
From: Kent Russell @ 2025-07-22 16:24 UTC (permalink / raw)
To: amd-gfx; +Cc: felix.kuehling, Kent Russell
HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
svm_range_validate_and_map will continue to retry, silently, giving the
impression of a hang.
If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyways, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.
Signed-off-by: Kent Russell <kent.russell@amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e23b5a0f31f2..10b70b941b11 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1713,6 +1713,24 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
next = min(vma->vm_end, end);
npages = (next - addr) >> PAGE_SHIFT;
+ /* HMM requires at least READ permissions. If provided with PROT_NONE,
+ * unmap the memory. If it's not already mapped, this is a no-op
+ * If PROT_WRITE is provided without READ, warn first then unmap
+ */
+ if (!(vma->vm_flags & VM_READ)) {
+ unsigned long e, s;
+
+ if (vma->vm_flags & VM_WRITE)
+ pr_warn("VM_WRITE without VM_READ is not supported");
+ s = max(start, prange->start);
+ e = min(end, prange->last);
+ if (e >= s)
+ svm_range_unmap_from_gpus(prange, s, e,
+ KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
+ addr = next;
+ continue;
+ }
+
WRITE_ONCE(p->svms.faulting_task, current);
r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
readonly, owner, NULL,
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-07-22 16:24 Kent Russell
@ 2025-07-22 22:46 ` Chen, Xiaogang
2025-07-23 13:30 ` Russell, Kent
0 siblings, 1 reply; 13+ messages in thread
From: Chen, Xiaogang @ 2025-07-22 22:46 UTC (permalink / raw)
To: Kent Russell, amd-gfx; +Cc: felix.kuehling
[-- Attachment #1: Type: text/plain, Size: 2784 bytes --]
On 7/22/2025 11:24 AM, Kent Russell wrote:
> HMM assumes that pages have READ permissions by default. Inside
> svm_range_validate_and_map, we add READ permissions then add WRITE
> permissions if the VMA isn't read-only. This will conflict with regions
> that only have PROT_WRITE or have PROT_NONE. When that happens,
Why read-only conflict with PROT_WRITE or have PROT_NONE? They are
vma->vm_flags that specifies the vma protection. User can change its
value at runtime. Is user not allowed to change it from read-only to
PROT_NONE?
> svm_range_validate_and_map will continue to retry, silently, giving the
> impression of a hang.
>
> If pages don't have READ permissions, simply unmap them and continue. If
> they weren't mapped in the first place, this would be a no-op. Since x86
> doesn't support write-only, and PROT_NONE doesn't allow reads or writes
> anyways, this will allow the svm range validation to continue without
> getting stuck in a loop forever on mappings we can't use with HMM.
>
> Signed-off-by: Kent Russell<kent.russell@amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index e23b5a0f31f2..10b70b941b11 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1713,6 +1713,24 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
>
> next = min(vma->vm_end, end);
> npages = (next - addr) >> PAGE_SHIFT;
> + /* HMM requires at least READ permissions. If provided with PROT_NONE,
> + * unmap the memory. If it's not already mapped, this is a no-op
> + * If PROT_WRITE is provided without READ, warn first then unmap
> + */
> + if (!(vma->vm_flags & VM_READ)) {
> + unsigned long e, s;
> +
> + if (vma->vm_flags & VM_WRITE)
> + pr_warn("VM_WRITE without VM_READ is not supported");
> + s = max(start, prange->start);
> + e = min(end, prange->last);
> + if (e >= s)
> + svm_range_unmap_from_gpus(prange, s, e,
> + KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
> + addr = next;
> + continue;
> + }
> +
> WRITE_ONCE(p->svms.faulting_task, current);
> r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
> readonly, owner, NULL,
It seems the real problem is at amdgpu_hmm_range_get_pages. It always
set HMM_PFN_REQ_FAULT to hmm_range->default_flags. HMM_PFN_REQ_FAULT
means the page is faultable and a future call with HMM_PFN_REQ_FAULT
could succeed. When vma->vm_flags is PROT_NONE the vma is not faultable,
so hmm_range->default_flags should be not set to HMM_PFN_REQ_FAULT to
avoid hmm_range_fault fault this vma.
Regards
Xiaogang
[-- Attachment #2: Type: text/html, Size: 3610 bytes --]
^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-07-22 22:46 ` Chen, Xiaogang
@ 2025-07-23 13:30 ` Russell, Kent
2025-07-23 21:08 ` Chen, Xiaogang
0 siblings, 1 reply; 13+ messages in thread
From: Russell, Kent @ 2025-07-23 13:30 UTC (permalink / raw)
To: Chen, Xiaogang, amd-gfx@lists.freedesktop.org; +Cc: Kuehling, Felix
[-- Attachment #1: Type: text/plain, Size: 4096 bytes --]
[Public]
The big thing is that HMM doesn’t support a lack of read permissions. So does it make sense to make any HMM calls or set any HMM flags if it’s an unsupported configuration that it will fault on? That was why I went down this path, since trying to remove the READ permissions here while still using HMM calls could go sideways, if HMM will automatically fail if READ permissions are missing.
Kent
From: Chen, Xiaogang <Xiaogang.Chen@amd.com>
Sent: Tuesday, July 22, 2025 6:46 PM
To: Russell, Kent <Kent.Russell@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
On 7/22/2025 11:24 AM, Kent Russell wrote:
HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
Why read-only conflict with PROT_WRITE or have PROT_NONE? They are vma->vm_flags that specifies the vma protection. User can change its value at runtime. Is user not allowed to change it from read-only to PROT_NONE?
svm_range_validate_and_map will continue to retry, silently, giving the
impression of a hang.
If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyways, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.
Signed-off-by: Kent Russell <kent.russell@amd.com><mailto:kent.russell@amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e23b5a0f31f2..10b70b941b11 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1713,6 +1713,24 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
next = min(vma->vm_end, end);
npages = (next - addr) >> PAGE_SHIFT;
+ /* HMM requires at least READ permissions. If provided with PROT_NONE,
+ * unmap the memory. If it's not already mapped, this is a no-op
+ * If PROT_WRITE is provided without READ, warn first then unmap
+ */
+ if (!(vma->vm_flags & VM_READ)) {
+ unsigned long e, s;
+
+ if (vma->vm_flags & VM_WRITE)
+ pr_warn("VM_WRITE without VM_READ is not supported");
+ s = max(start, prange->start);
+ e = min(end, prange->last);
+ if (e >= s)
+ svm_range_unmap_from_gpus(prange, s, e,
+ KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
+ addr = next;
+ continue;
+ }
+
WRITE_ONCE(p->svms.faulting_task, current);
r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
readonly, owner, NULL,
It seems the real problem is at amdgpu_hmm_range_get_pages. It always set HMM_PFN_REQ_FAULT to hmm_range->default_flags. HMM_PFN_REQ_FAULT means the page is faultable and a future call with HMM_PFN_REQ_FAULT could succeed. When vma->vm_flags is PROT_NONE the vma is not faultable, so hmm_range->default_flags should be not set to HMM_PFN_REQ_FAULT to avoid hmm_range_fault fault this vma.
Regards
Xiaogang
[-- Attachment #2: Type: text/html, Size: 16471 bytes --]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-07-23 13:30 ` Russell, Kent
@ 2025-07-23 21:08 ` Chen, Xiaogang
0 siblings, 0 replies; 13+ messages in thread
From: Chen, Xiaogang @ 2025-07-23 21:08 UTC (permalink / raw)
To: Russell, Kent, amd-gfx@lists.freedesktop.org; +Cc: Kuehling, Felix
[-- Attachment #1: Type: text/plain, Size: 6010 bytes --]
On 7/23/2025 8:30 AM, Russell, Kent wrote:
>
> [Public]
>
>
> The big thing is that HMM doesn’t support a lack of read permissions.
> So does it make sense to make any HMM calls or set any HMM flags if
> it’s an unsupported configuration that it will fault on? That was why
> I went down this path, since trying to remove the READ permissions
> here while still using HMM calls could go sideways, if HMM will
> automatically fail if READ permissions are missing.
>
I do not see HMM can only work if read permission is specified. From
code comments "If the vma does not allow read access, then assume that
it does not allow write access either".
hmm_range_fault uses cpu page fault to valid pages. If it cannot do that
it return error code specified the reasons. Driver needs check the error
code to decide what to do after. Current driver checks if read is
allowed, if not, driver assumes it is write permission. That does not
consider VM_NONE that specifies the vma is not accessible, then the vms
is not fault-able.
This issue not only exists in svm code. For user buffer registration and
validation current driver does not consider VM_NONE either. I think part
of the changes is: before let hmm_range_fault do page validation check
vma->flags to decide if can set hmm_range->default_flags to
HMM_PFN_REQ_FAULT or not. Do not request hmm fault on area that is not
faultable.
Regards
Xiaogang
> Kent
>
> *From:*Chen, Xiaogang <Xiaogang.Chen@amd.com>
> *Sent:* Tuesday, July 22, 2025 6:46 PM
> *To:* Russell, Kent <Kent.Russell@amd.com>; amd-gfx@lists.freedesktop.org
> *Cc:* Kuehling, Felix <Felix.Kuehling@amd.com>
> *Subject:* Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in
> SVM mapping
>
> On 7/22/2025 11:24 AM, Kent Russell wrote:
>
> HMM assumes that pages have READ permissions by default. Inside
>
> svm_range_validate_and_map, we add READ permissions then add WRITE
>
> permissions if the VMA isn't read-only. This will conflict with
> regions
>
> that only have PROT_WRITE or have PROT_NONE. When that happens,
>
> Why read-only conflict with PROT_WRITE or have PROT_NONE? They are
> vma->vm_flags that specifies the vma protection. User can change its
> value at runtime. Is user not allowed to change it from read-only to
> PROT_NONE?
>
> svm_range_validate_and_map will continue to retry, silently,
> giving the
>
> impression of a hang.
>
> If pages don't have READ permissions, simply unmap them and
> continue. If
>
> they weren't mapped in the first place, this would be a no-op.
> Since x86
>
> doesn't support write-only, and PROT_NONE doesn't allow reads or
> writes
>
> anyways, this will allow the svm range validation to continue without
>
> getting stuck in a loop forever on mappings we can't use with HMM.
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> <mailto:kent.russell@amd.com>
>
> ---
>
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++
>
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
> index e23b5a0f31f2..10b70b941b11 100644
>
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
> @@ -1713,6 +1713,24 @@ static int
> svm_range_validate_and_map(struct mm_struct *mm,
>
> next = min(vma->vm_end, end);
>
> npages = (next - addr) >> PAGE_SHIFT;
>
> + /* HMM requires at least READ permissions.
> If provided with PROT_NONE,
>
> + * unmap the memory. If it's not already
> mapped, this is a no-op
>
> + * If PROT_WRITE is provided without READ,
> warn first then unmap
>
> + */
>
> + if (!(vma->vm_flags & VM_READ)) {
>
> + unsigned long e, s;
>
> +
>
> + if (vma->vm_flags & VM_WRITE)
>
> + pr_warn("VM_WRITE without
> VM_READ is not supported");
>
> + s = max(start, prange->start);
>
> + e = min(end, prange->last);
>
> + if (e >= s)
>
> + svm_range_unmap_from_gpus(prange, s, e,
>
> +
> KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
>
> + addr = next;
>
> + continue;
>
> + }
>
> +
>
> WRITE_ONCE(p->svms.faulting_task, current);
>
> r =
> amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
>
> readonly,
> owner, NULL,
>
> It seems the real problem is at amdgpu_hmm_range_get_pages. It always
> set HMM_PFN_REQ_FAULT to hmm_range->default_flags. HMM_PFN_REQ_FAULT
> means the page is faultable and a future call with HMM_PFN_REQ_FAULT
> could succeed. When vma->vm_flags is PROT_NONE the vma is not
> faultable, so hmm_range->default_flags should be not set to
> HMM_PFN_REQ_FAULT to avoid hmm_range_fault fault this vma.
>
> Regards
>
> Xiaogang
>
[-- Attachment #2: Type: text/html, Size: 20189 bytes --]
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
@ 2025-07-24 13:59 Kent Russell
2025-07-28 15:47 ` Felix Kuehling
0 siblings, 1 reply; 13+ messages in thread
From: Kent Russell @ 2025-07-24 13:59 UTC (permalink / raw)
To: amd-gfx; +Cc: felix.kuehling, xiaogang.chen, Kent Russell
HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
svm_range_restore_work will continue to retry, silently, giving the
impression of a hang if pr_debug isn't enabled to show the retries..
If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyways, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.
Signed-off-by: Kent Russell <kent.russell@amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e23b5a0f31f2..597b8ac92848 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1713,6 +1713,24 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
next = min(vma->vm_end, end);
npages = (next - addr) >> PAGE_SHIFT;
+ /* HMM requires at least READ permissions. If provided with PROT_NONE,
+ * unmap the memory. If it's not already mapped, this is a no-op
+ * If PROT_WRITE is provided without READ, warn first then unmap
+ */
+ if (!(vma->vm_flags & VM_READ)) {
+ unsigned long e, s;
+
+ if (vma->vm_flags & VM_WRITE)
+ pr_debug("VM_WRITE without VM_READ is not supported");
+ s = max(start, prange->start);
+ e = min(end, prange->last);
+ if (e >= s)
+ svm_range_unmap_from_gpus(prange, s, e,
+ KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
+ addr = next;
+ continue;
+ }
+
WRITE_ONCE(p->svms.faulting_task, current);
r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
readonly, owner, NULL,
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-07-24 13:59 Kent Russell
@ 2025-07-28 15:47 ` Felix Kuehling
2025-07-28 16:34 ` Russell, Kent
0 siblings, 1 reply; 13+ messages in thread
From: Felix Kuehling @ 2025-07-28 15:47 UTC (permalink / raw)
To: Kent Russell, amd-gfx; +Cc: xiaogang.chen
On 2025-07-24 09:59, Kent Russell wrote:
> HMM assumes that pages have READ permissions by default. Inside
> svm_range_validate_and_map, we add READ permissions then add WRITE
> permissions if the VMA isn't read-only. This will conflict with regions
> that only have PROT_WRITE or have PROT_NONE. When that happens,
> svm_range_restore_work will continue to retry, silently, giving the
> impression of a hang if pr_debug isn't enabled to show the retries..
>
> If pages don't have READ permissions, simply unmap them and continue. If
> they weren't mapped in the first place, this would be a no-op. Since x86
> doesn't support write-only, and PROT_NONE doesn't allow reads or writes
> anyways, this will allow the svm range validation to continue without
> getting stuck in a loop forever on mappings we can't use with HMM.
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index e23b5a0f31f2..597b8ac92848 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1713,6 +1713,24 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
>
> next = min(vma->vm_end, end);
> npages = (next - addr) >> PAGE_SHIFT;
> + /* HMM requires at least READ permissions. If provided with PROT_NONE,
> + * unmap the memory. If it's not already mapped, this is a no-op
> + * If PROT_WRITE is provided without READ, warn first then unmap
> + */
> + if (!(vma->vm_flags & VM_READ)) {
> + unsigned long e, s;
> +
> + if (vma->vm_flags & VM_WRITE)
> + pr_debug("VM_WRITE without VM_READ is not supported");
> + s = max(start, prange->start);
> + e = min(end, prange->last);
> + if (e >= s)
You need to take svm_range_lock(prange) around svm_range_unmap_from_gpus
to be safe.
If svm_range_unmap_from_gpus returns an error, we should return that to
the caller. In that case svm_range_restore_work should retry.
Regards,
Felix
> + svm_range_unmap_from_gpus(prange, s, e,
> + KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
> + addr = next;
> + continue;
> + }
> +
> WRITE_ONCE(p->svms.faulting_task, current);
> r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
> readonly, owner, NULL,
^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-07-28 15:47 ` Felix Kuehling
@ 2025-07-28 16:34 ` Russell, Kent
0 siblings, 0 replies; 13+ messages in thread
From: Russell, Kent @ 2025-07-28 16:34 UTC (permalink / raw)
To: Kuehling, Felix, amd-gfx@lists.freedesktop.org; +Cc: Chen, Xiaogang
[AMD Official Use Only - AMD Internal Distribution Only]
> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@amd.com>
> Sent: Monday, July 28, 2025 11:47 AM
> To: Russell, Kent <Kent.Russell@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Chen, Xiaogang <Xiaogang.Chen@amd.com>
> Subject: Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM
> mapping
>
>
> On 2025-07-24 09:59, Kent Russell wrote:
> > HMM assumes that pages have READ permissions by default. Inside
> > svm_range_validate_and_map, we add READ permissions then add WRITE
> > permissions if the VMA isn't read-only. This will conflict with regions
> > that only have PROT_WRITE or have PROT_NONE. When that happens,
> > svm_range_restore_work will continue to retry, silently, giving the
> > impression of a hang if pr_debug isn't enabled to show the retries..
> >
> > If pages don't have READ permissions, simply unmap them and continue. If
> > they weren't mapped in the first place, this would be a no-op. Since x86
> > doesn't support write-only, and PROT_NONE doesn't allow reads or writes
> > anyways, this will allow the svm range validation to continue without
> > getting stuck in a loop forever on mappings we can't use with HMM.
> >
> > Signed-off-by: Kent Russell <kent.russell@amd.com>
> > ---
> > drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > index e23b5a0f31f2..597b8ac92848 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > @@ -1713,6 +1713,24 @@ static int svm_range_validate_and_map(struct
> mm_struct *mm,
> >
> > next = min(vma->vm_end, end);
> > npages = (next - addr) >> PAGE_SHIFT;
> > + /* HMM requires at least READ permissions. If provided with
> PROT_NONE,
> > + * unmap the memory. If it's not already mapped, this is a no-
> op
> > + * If PROT_WRITE is provided without READ, warn first then
> unmap
> > + */
> > + if (!(vma->vm_flags & VM_READ)) {
> > + unsigned long e, s;
> > +
> > + if (vma->vm_flags & VM_WRITE)
> > + pr_debug("VM_WRITE without VM_READ is
> not supported");
> > + s = max(start, prange->start);
> > + e = min(end, prange->last);
> > + if (e >= s)
>
> You need to take svm_range_lock(prange) around svm_range_unmap_from_gpus
> to be safe.
>
> If svm_range_unmap_from_gpus returns an error, we should return that to
> the caller. In that case svm_range_restore_work should retry.
>
Got it. Thanks for that!
Kent
> Regards,
> Felix
>
>
> > + svm_range_unmap_from_gpus(prange, s, e,
> > +
> KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
> > + addr = next;
> > + continue;
> > + }
> > +
> > WRITE_ONCE(p->svms.faulting_task, current);
> > r = amdgpu_hmm_range_get_pages(&prange->notifier, addr,
> npages,
> > readonly, owner, NULL,
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
@ 2025-08-05 14:57 Kent Russell
2025-08-08 19:33 ` Felix Kuehling
0 siblings, 1 reply; 13+ messages in thread
From: Kent Russell @ 2025-08-05 14:57 UTC (permalink / raw)
To: amd-gfx; +Cc: felix.kuehling, Kent Russell
HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
svm_range_restore_work will continue to retry, silently, giving the
impression of a hang if pr_debug isn't enabled to show the retries..
If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyways, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.
Signed-off-by: Kent Russell <kent.russell@amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e23b5a0f31f2..449595aab433 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1713,6 +1713,28 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
next = min(vma->vm_end, end);
npages = (next - addr) >> PAGE_SHIFT;
+ /* HMM requires at least READ permissions. If provided with PROT_NONE,
+ * unmap the memory. If it's not already mapped, this is a no-op
+ * If PROT_WRITE is provided without READ, warn first then unmap
+ */
+ if (!(vma->vm_flags & VM_READ)) {
+ unsigned long e, s;
+
+ svm_range_lock(prange);
+ if (vma->vm_flags & VM_WRITE)
+ pr_debug("VM_WRITE without VM_READ is not supported");
+ s = max(start, prange->start);
+ e = min(end, prange->last);
+ if (e >= s)
+ r = svm_range_unmap_from_gpus(prange, s, e,
+ KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
+ addr = next;
+ svm_range_unlock(prange);
+ if (r)
+ return r;
+ continue;
+ }
+
WRITE_ONCE(p->svms.faulting_task, current);
r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
readonly, owner, NULL,
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-08-05 14:57 Kent Russell
@ 2025-08-08 19:33 ` Felix Kuehling
2025-08-08 19:47 ` Russell, Kent
0 siblings, 1 reply; 13+ messages in thread
From: Felix Kuehling @ 2025-08-08 19:33 UTC (permalink / raw)
To: Kent Russell, amd-gfx
On 2025-08-05 10:57, Kent Russell wrote:
> HMM assumes that pages have READ permissions by default. Inside
> svm_range_validate_and_map, we add READ permissions then add WRITE
> permissions if the VMA isn't read-only. This will conflict with regions
> that only have PROT_WRITE or have PROT_NONE. When that happens,
> svm_range_restore_work will continue to retry, silently, giving the
> impression of a hang if pr_debug isn't enabled to show the retries..
>
> If pages don't have READ permissions, simply unmap them and continue. If
> they weren't mapped in the first place, this would be a no-op. Since x86
> doesn't support write-only, and PROT_NONE doesn't allow reads or writes
> anyways, this will allow the svm range validation to continue without
> getting stuck in a loop forever on mappings we can't use with HMM.
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index e23b5a0f31f2..449595aab433 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1713,6 +1713,28 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
>
> next = min(vma->vm_end, end);
> npages = (next - addr) >> PAGE_SHIFT;
> + /* HMM requires at least READ permissions. If provided with PROT_NONE,
> + * unmap the memory. If it's not already mapped, this is a no-op
> + * If PROT_WRITE is provided without READ, warn first then unmap
> + */
> + if (!(vma->vm_flags & VM_READ)) {
> + unsigned long e, s;
> +
> + svm_range_lock(prange);
> + if (vma->vm_flags & VM_WRITE)
> + pr_debug("VM_WRITE without VM_READ is not supported");
> + s = max(start, prange->start);
> + e = min(end, prange->last);
> + if (e >= s)
> + r = svm_range_unmap_from_gpus(prange, s, e,
> + KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
> + addr = next;
Maybe move this as the last statement before continue below.
> + svm_range_unlock(prange);
> + if (r)
> + return r;
This will skip some cleanup, including svm_range_unreserve_bos and
kfree(ctx). I think you can just continue in any case. If r != 0 the
loop will terminate.
Regards,
Felix
> + continue;
> + }
> +
> WRITE_ONCE(p->svms.faulting_task, current);
> r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
> readonly, owner, NULL,
^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-08-08 19:33 ` Felix Kuehling
@ 2025-08-08 19:47 ` Russell, Kent
2025-08-08 19:56 ` Felix Kuehling
0 siblings, 1 reply; 13+ messages in thread
From: Russell, Kent @ 2025-08-08 19:47 UTC (permalink / raw)
To: Kuehling, Felix, amd-gfx@lists.freedesktop.org
[AMD Official Use Only - AMD Internal Distribution Only]
> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@amd.com>
> Sent: Friday, August 8, 2025 3:34 PM
> To: Russell, Kent <Kent.Russell@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM
> mapping
>
> On 2025-08-05 10:57, Kent Russell wrote:
> > HMM assumes that pages have READ permissions by default. Inside
> > svm_range_validate_and_map, we add READ permissions then add WRITE
> > permissions if the VMA isn't read-only. This will conflict with regions
> > that only have PROT_WRITE or have PROT_NONE. When that happens,
> > svm_range_restore_work will continue to retry, silently, giving the
> > impression of a hang if pr_debug isn't enabled to show the retries..
> >
> > If pages don't have READ permissions, simply unmap them and continue. If
> > they weren't mapped in the first place, this would be a no-op. Since x86
> > doesn't support write-only, and PROT_NONE doesn't allow reads or writes
> > anyways, this will allow the svm range validation to continue without
> > getting stuck in a loop forever on mappings we can't use with HMM.
> >
> > Signed-off-by: Kent Russell <kent.russell@amd.com>
> > ---
> > drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > index e23b5a0f31f2..449595aab433 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > @@ -1713,6 +1713,28 @@ static int svm_range_validate_and_map(struct
> mm_struct *mm,
> >
> > next = min(vma->vm_end, end);
> > npages = (next - addr) >> PAGE_SHIFT;
> > + /* HMM requires at least READ permissions. If provided with
> PROT_NONE,
> > + * unmap the memory. If it's not already mapped, this is a no-
> op
> > + * If PROT_WRITE is provided without READ, warn first then
> unmap
> > + */
> > + if (!(vma->vm_flags & VM_READ)) {
> > + unsigned long e, s;
> > +
> > + svm_range_lock(prange);
> > + if (vma->vm_flags & VM_WRITE)
> > + pr_debug("VM_WRITE without VM_READ is
> not supported");
> > + s = max(start, prange->start);
> > + e = min(end, prange->last);
> > + if (e >= s)
> > + r = svm_range_unmap_from_gpus(prange, s,
> e,
> > +
> KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
> > + addr = next;
>
> Maybe move this as the last statement before continue below.
Do you mean just the addr=next line? IE Not worrying about setting addr while holding the prange lock?
>
>
> > + svm_range_unlock(prange);
> > + if (r)
> > + return r;
>
> This will skip some cleanup, including svm_range_unreserve_bos and
> kfree(ctx). I think you can just continue in any case. If r != 0 the
> loop will terminate.
Thanks, I missed the !r in the for loop conditions.
Kent
>
> Regards,
> Felix
>
>
> > + continue;
> > + }
> > +
> > WRITE_ONCE(p->svms.faulting_task, current);
> > r = amdgpu_hmm_range_get_pages(&prange->notifier, addr,
> npages,
> > readonly, owner, NULL,
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-08-08 19:47 ` Russell, Kent
@ 2025-08-08 19:56 ` Felix Kuehling
0 siblings, 0 replies; 13+ messages in thread
From: Felix Kuehling @ 2025-08-08 19:56 UTC (permalink / raw)
To: Russell, Kent, amd-gfx@lists.freedesktop.org
On 2025-08-08 15:47, Russell, Kent wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>> -----Original Message-----
>> From: Kuehling, Felix <Felix.Kuehling@amd.com>
>> Sent: Friday, August 8, 2025 3:34 PM
>> To: Russell, Kent <Kent.Russell@amd.com>; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM
>> mapping
>>
>> On 2025-08-05 10:57, Kent Russell wrote:
>>> HMM assumes that pages have READ permissions by default. Inside
>>> svm_range_validate_and_map, we add READ permissions then add WRITE
>>> permissions if the VMA isn't read-only. This will conflict with regions
>>> that only have PROT_WRITE or have PROT_NONE. When that happens,
>>> svm_range_restore_work will continue to retry, silently, giving the
>>> impression of a hang if pr_debug isn't enabled to show the retries..
>>>
>>> If pages don't have READ permissions, simply unmap them and continue. If
>>> they weren't mapped in the first place, this would be a no-op. Since x86
>>> doesn't support write-only, and PROT_NONE doesn't allow reads or writes
>>> anyways, this will allow the svm range validation to continue without
>>> getting stuck in a loop forever on mappings we can't use with HMM.
>>>
>>> Signed-off-by: Kent Russell <kent.russell@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 22 ++++++++++++++++++++++
>>> 1 file changed, 22 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> index e23b5a0f31f2..449595aab433 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>> @@ -1713,6 +1713,28 @@ static int svm_range_validate_and_map(struct
>> mm_struct *mm,
>>> next = min(vma->vm_end, end);
>>> npages = (next - addr) >> PAGE_SHIFT;
>>> + /* HMM requires at least READ permissions. If provided with
>> PROT_NONE,
>>> + * unmap the memory. If it's not already mapped, this is a no-
>> op
>>> + * If PROT_WRITE is provided without READ, warn first then
>> unmap
>>> + */
>>> + if (!(vma->vm_flags & VM_READ)) {
>>> + unsigned long e, s;
>>> +
>>> + svm_range_lock(prange);
>>> + if (vma->vm_flags & VM_WRITE)
>>> + pr_debug("VM_WRITE without VM_READ is
>> not supported");
>>> + s = max(start, prange->start);
>>> + e = min(end, prange->last);
>>> + if (e >= s)
>>> + r = svm_range_unmap_from_gpus(prange, s,
>> e,
>>> +
>> KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
>>> + addr = next;
>> Maybe move this as the last statement before continue below.
> Do you mean just the addr=next line? IE Not worrying about setting addr while holding the prange lock?
Yes. Setting addr=next sets things up for the next loop iteration.
Therefore it seems logical to see this just before the continue, rather
than "hiding" it between locking and error handling.
Regards,
Felix
>
>>> + svm_range_unlock(prange);
>>> + if (r)
>>> + return r;
>> This will skip some cleanup, including svm_range_unreserve_bos and
>> kfree(ctx). I think you can just continue in any case. If r != 0 the
>> loop will terminate.
> Thanks, I missed the !r in the for loop conditions.
>
> Kent
>
>> Regards,
>> Felix
>>
>>
>>> + continue;
>>> + }
>>> +
>>> WRITE_ONCE(p->svms.faulting_task, current);
>>> r = amdgpu_hmm_range_get_pages(&prange->notifier, addr,
>> npages,
>>> readonly, owner, NULL,
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
@ 2025-08-08 20:28 Kent Russell
2025-08-08 20:57 ` Felix Kuehling
0 siblings, 1 reply; 13+ messages in thread
From: Kent Russell @ 2025-08-08 20:28 UTC (permalink / raw)
To: amd-gfx; +Cc: felix.kuehling, Kent Russell
HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
svm_range_restore_work will continue to retry, silently, giving the
impression of a hang if pr_debug isn't enabled to show the retries..
If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyways, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.
Signed-off-by: Kent Russell <kent.russell@amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e23b5a0f31f2..521c14c7a789 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1713,6 +1713,29 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
next = min(vma->vm_end, end);
npages = (next - addr) >> PAGE_SHIFT;
+ /* HMM requires at least READ permissions. If provided with PROT_NONE,
+ * unmap the memory. If it's not already mapped, this is a no-op
+ * If PROT_WRITE is provided without READ, warn first then unmap
+ */
+ if (!(vma->vm_flags & VM_READ)) {
+ unsigned long e, s;
+
+ svm_range_lock(prange);
+ if (vma->vm_flags & VM_WRITE)
+ pr_debug("VM_WRITE without VM_READ is not supported");
+ s = max(start, prange->start);
+ e = min(end, prange->last);
+ if (e >= s)
+ r = svm_range_unmap_from_gpus(prange, s, e,
+ KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
+ svm_range_unlock(prange);
+ /* If unmap returns non-zero, we'll bail on the next for loop
+ * iteration, so just leave r and continue
+ */
+ addr = next;
+ continue;
+ }
+
WRITE_ONCE(p->svms.faulting_task, current);
r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
readonly, owner, NULL,
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping
2025-08-08 20:28 [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping Kent Russell
@ 2025-08-08 20:57 ` Felix Kuehling
0 siblings, 0 replies; 13+ messages in thread
From: Felix Kuehling @ 2025-08-08 20:57 UTC (permalink / raw)
To: Kent Russell, amd-gfx
On 2025-08-08 16:28, Kent Russell wrote:
> HMM assumes that pages have READ permissions by default. Inside
> svm_range_validate_and_map, we add READ permissions then add WRITE
> permissions if the VMA isn't read-only. This will conflict with regions
> that only have PROT_WRITE or have PROT_NONE. When that happens,
> svm_range_restore_work will continue to retry, silently, giving the
> impression of a hang if pr_debug isn't enabled to show the retries..
>
> If pages don't have READ permissions, simply unmap them and continue. If
> they weren't mapped in the first place, this would be a no-op. Since x86
> doesn't support write-only, and PROT_NONE doesn't allow reads or writes
> anyways, this will allow the svm range validation to continue without
> getting stuck in a loop forever on mappings we can't use with HMM.
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index e23b5a0f31f2..521c14c7a789 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1713,6 +1713,29 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
>
> next = min(vma->vm_end, end);
> npages = (next - addr) >> PAGE_SHIFT;
> + /* HMM requires at least READ permissions. If provided with PROT_NONE,
> + * unmap the memory. If it's not already mapped, this is a no-op
> + * If PROT_WRITE is provided without READ, warn first then unmap
> + */
> + if (!(vma->vm_flags & VM_READ)) {
> + unsigned long e, s;
> +
> + svm_range_lock(prange);
> + if (vma->vm_flags & VM_WRITE)
> + pr_debug("VM_WRITE without VM_READ is not supported");
> + s = max(start, prange->start);
> + e = min(end, prange->last);
> + if (e >= s)
> + r = svm_range_unmap_from_gpus(prange, s, e,
> + KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
> + svm_range_unlock(prange);
> + /* If unmap returns non-zero, we'll bail on the next for loop
> + * iteration, so just leave r and continue
> + */
> + addr = next;
> + continue;
> + }
> +
> WRITE_ONCE(p->svms.faulting_task, current);
> r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
> readonly, owner, NULL,
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-08-08 20:57 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-08 20:28 [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping Kent Russell
2025-08-08 20:57 ` Felix Kuehling
-- strict thread matches above, loose matches on Subject: below --
2025-08-05 14:57 Kent Russell
2025-08-08 19:33 ` Felix Kuehling
2025-08-08 19:47 ` Russell, Kent
2025-08-08 19:56 ` Felix Kuehling
2025-07-24 13:59 Kent Russell
2025-07-28 15:47 ` Felix Kuehling
2025-07-28 16:34 ` Russell, Kent
2025-07-22 16:24 Kent Russell
2025-07-22 22:46 ` Chen, Xiaogang
2025-07-23 13:30 ` Russell, Kent
2025-07-23 21:08 ` Chen, Xiaogang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).