[PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault
@ 2025-03-06  6:03 Emily Deng
  2025-03-06 15:49 ` Felix Kuehling
  2025-03-06 17:00 ` Chen, Xiaogang
  0 siblings, 2 replies; 7+ messages in thread
From: Emily Deng @ 2025-03-06  6:03 UTC (permalink / raw)
  To: amd-gfx; +Cc: Emily Deng

Issue:
In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts
 has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu
is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages
continues execution and reaches svm_range_from_addr. This results in
a "failed to find prange..." error, causing the page recovery to fail.

How to fix:
Move the timestamp check code under the protection of svm->lock.

v2:
Make sure all right locks are released before go out.

v3:
Directly goto out_unlock_svms, and return -EAGAIN.

v4:
Refine code.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index d04725583f19..83ac14bf7a7a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 		goto out;
 	}
 
-	/* check if this page fault time stamp is before svms->checkpoint_ts */
-	if (svms->checkpoint_ts[gpuidx] != 0) {
-		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
-			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
-			r = 0;
-			goto out;
-		} else
-			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
-			 * to zero to avoid following ts wrap around give wrong comparing
-			 */
-			svms->checkpoint_ts[gpuidx] = 0;
-	}
-
 	if (!p->xnack_enabled) {
 		pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
 		r = -EFAULT;
@@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 	mmap_read_lock(mm);
 retry_write_locked:
 	mutex_lock(&svms->lock);
+
+	/* check if this page fault time stamp is before svms->checkpoint_ts */
+	if (svms->checkpoint_ts[gpuidx] != 0) {
+		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
+			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
+			r = -EAGAIN;
+			goto out_unlock_svms;
+		} else
+			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
+			 * to zero to avoid following ts wrap around give wrong comparing
+			 */
+			svms->checkpoint_ts[gpuidx] = 0;
+	}
+
 	prange = svm_range_from_addr(svms, addr, NULL);
 	if (!prange) {
 		pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
@@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 	mutex_unlock(&svms->lock);
 	mmap_read_unlock(mm);
 
-	svm_range_count_fault(node, p, gpuidx);
+	if (r != -EAGAIN)
+		svm_range_count_fault(node, p, gpuidx);
 
 	mmput(mm);
 out:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault
  2025-03-06  6:03 [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault Emily Deng
@ 2025-03-06 15:49 ` Felix Kuehling
  2025-03-06 17:00 ` Chen, Xiaogang
  1 sibling, 0 replies; 7+ messages in thread
From: Felix Kuehling @ 2025-03-06 15:49 UTC (permalink / raw)
  To: Emily Deng, amd-gfx


On 2025-03-06 1:03, Emily Deng wrote:
> Issue:
> In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts
>  has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu
> is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages
> continues execution and reaches svm_range_from_addr. This results in
> a "failed to find prange..." error, causing the page recovery to fail.
>
> How to fix:
> Move the timestamp check code under the protection of svm->lock.
>
> v2:
> Make sure all right locks are released before go out.
>
> v3:
> Directly goto out_unlock_svms, and return -EAGAIN.
>
> v4:
> Refine code.
>
> Signed-off-by: Emily Deng <Emily.Deng@amd.com>

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------
>  1 file changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index d04725583f19..83ac14bf7a7a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>  		goto out;
>  	}
>  
> -	/* check if this page fault time stamp is before svms->checkpoint_ts */
> -	if (svms->checkpoint_ts[gpuidx] != 0) {
> -		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
> -			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
> -			r = 0;
> -			goto out;
> -		} else
> -			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
> -			 * to zero to avoid following ts wrap around give wrong comparing
> -			 */
> -			svms->checkpoint_ts[gpuidx] = 0;
> -	}
> -
>  	if (!p->xnack_enabled) {
>  		pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
>  		r = -EFAULT;
> @@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>  	mmap_read_lock(mm);
>  retry_write_locked:
>  	mutex_lock(&svms->lock);
> +
> +	/* check if this page fault time stamp is before svms->checkpoint_ts */
> +	if (svms->checkpoint_ts[gpuidx] != 0) {
> +		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
> +			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
> +			r = -EAGAIN;
> +			goto out_unlock_svms;
> +		} else
> +			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
> +			 * to zero to avoid following ts wrap around give wrong comparing
> +			 */
> +			svms->checkpoint_ts[gpuidx] = 0;
> +	}
> +
>  	prange = svm_range_from_addr(svms, addr, NULL);
>  	if (!prange) {
>  		pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
> @@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>  	mutex_unlock(&svms->lock);
>  	mmap_read_unlock(mm);
>  
> -	svm_range_count_fault(node, p, gpuidx);
> +	if (r != -EAGAIN)
> +		svm_range_count_fault(node, p, gpuidx);
>  
>  	mmput(mm);
>  out:

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault
  2025-03-06  6:03 [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault Emily Deng
  2025-03-06 15:49 ` Felix Kuehling
@ 2025-03-06 17:00 ` Chen, Xiaogang
  2025-03-07  1:27   ` Deng, Emily
  1 sibling, 1 reply; 7+ messages in thread
From: Chen, Xiaogang @ 2025-03-06 17:00 UTC (permalink / raw)
  To: Emily Deng, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 3314 bytes --]

Thanks for catch up and fix this race condition. It looks good to me. 
One minor thing below:

On 3/6/2025 12:03 AM, Emily Deng wrote:
> Issue:
> In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts
>   has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu
> is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages
> continues execution and reaches svm_range_from_addr. This results in
> a "failed to find prange..." error, causing the page recovery to fail.
>
> How to fix:
> Move the timestamp check code under the protection of svm->lock.
>
> v2:
> Make sure all right locks are released before go out.
>
> v3:
> Directly goto out_unlock_svms, and return -EAGAIN.
>
> v4:
> Refine code.
>
> Signed-off-by: Emily Deng<Emily.Deng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------
>   1 file changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index d04725583f19..83ac14bf7a7a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>   		goto out;
>   	}
>   
> -	/* check if this page fault time stamp is before svms->checkpoint_ts */
> -	if (svms->checkpoint_ts[gpuidx] != 0) {
> -		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
> -			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
> -			r = 0;
> -			goto out;
> -		} else
> -			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
> -			 * to zero to avoid following ts wrap around give wrong comparing
> -			 */
> -			svms->checkpoint_ts[gpuidx] = 0;
> -	}
> -
>   	if (!p->xnack_enabled) {
>   		pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
>   		r = -EFAULT;
> @@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>   	mmap_read_lock(mm);
>   retry_write_locked:
>   	mutex_lock(&svms->lock);
> +
> +	/* check if this page fault time stamp is before svms->checkpoint_ts */
> +	if (svms->checkpoint_ts[gpuidx] != 0) {
> +		if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
> +			pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
> +			r = -EAGAIN;

We drop page fault because it is stale, not mean to handle it again. if 
return -EAGAIN we do amdgpu_gmc_filter_faults_remove. If after unmap, 
user map same range again we should treat page fault happened at same 
range as new one.

Regards

Xiaogang

> +			goto out_unlock_svms;
> +		} else
> +			/* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
> +			 * to zero to avoid following ts wrap around give wrong comparing
> +			 */
> +			svms->checkpoint_ts[gpuidx] = 0;
> +	}
> +
>   	prange = svm_range_from_addr(svms, addr, NULL);
>   	if (!prange) {
>   		pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
> @@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>   	mutex_unlock(&svms->lock);
>   	mmap_read_unlock(mm);
>   
> -	svm_range_count_fault(node, p, gpuidx);
> +	if (r != -EAGAIN)
> +		svm_range_count_fault(node, p, gpuidx);
>   
>   	mmput(mm);
>   out:

[-- Attachment #2: Type: text/html, Size: 3941 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault
  2025-03-06 17:00 ` Chen, Xiaogang
@ 2025-03-07  1:27   ` Deng, Emily
  2025-03-08  0:37     ` Chen, Xiaogang
  0 siblings, 1 reply; 7+ messages in thread
From: Deng, Emily @ 2025-03-07  1:27 UTC (permalink / raw)
  To: Chen, Xiaogang, amd-gfx@lists.freedesktop.org

[-- Attachment #1: Type: text/plain, Size: 4352 bytes --]

[AMD Official Use Only - AMD Internal Distribution Only]



From: Chen, Xiaogang <Xiaogang.Chen@amd.com>
Sent: Friday, March 7, 2025 1:01 AM
To: Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault


Thanks for catch up and fix this race condition. It looks good to me. One minor thing below:
On 3/6/2025 12:03 AM, Emily Deng wrote:

Issue:

In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts

 has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu

is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages

continues execution and reaches svm_range_from_addr. This results in

a "failed to find prange..." error, causing the page recovery to fail.



How to fix:

Move the timestamp check code under the protection of svm->lock.



v2:

Make sure all right locks are released before go out.



v3:

Directly goto out_unlock_svms, and return -EAGAIN.



v4:

Refine code.



Signed-off-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>

---

 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------

 1 file changed, 16 insertions(+), 14 deletions(-)



diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index d04725583f19..83ac14bf7a7a 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

@@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,

                goto out;

        }



-       /* check if this page fault time stamp is before svms->checkpoint_ts */

-       if (svms->checkpoint_ts[gpuidx] != 0) {

-               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {

-                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);

-                       r = 0;

-                       goto out;

-               } else

-                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts

-                        * to zero to avoid following ts wrap around give wrong comparing

-                        */

-                svms->checkpoint_ts[gpuidx] = 0;

-       }

-

        if (!p->xnack_enabled) {

                pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);

                r = -EFAULT;

@@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,

        mmap_read_lock(mm);

 retry_write_locked:

        mutex_lock(&svms->lock);

+

+       /* check if this page fault time stamp is before svms->checkpoint_ts */

+       if (svms->checkpoint_ts[gpuidx] != 0) {

+               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {

+                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);

+                       r = -EAGAIN;

We drop page fault because it is stale, not mean to handle it again. if return -EAGAIN we do amdgpu_gmc_filter_faults_remove. If after unmap, user map same range again we should treat page fault happened at same range as new one.

Regards

Xiaogang

Sorry, I didn't quite catch that. So, you think we shouldn't remove the fault from amdgpu_gmc_filter_faults_remove?

Emily Deng
Best Wishes






+                       goto out_unlock_svms;

+               } else

+                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts

+                        * to zero to avoid following ts wrap around give wrong comparing

+                        */

+                svms->checkpoint_ts[gpuidx] = 0;

+       }

+

        prange = svm_range_from_addr(svms, addr, NULL);

        if (!prange) {

                pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",

@@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,

        mutex_unlock(&svms->lock);

        mmap_read_unlock(mm);



-       svm_range_count_fault(node, p, gpuidx);

+       if (r != -EAGAIN)

+               svm_range_count_fault(node, p, gpuidx);



        mmput(mm);

 out:

[-- Attachment #2: Type: text/html, Size: 12617 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault
  2025-03-07  1:27   ` Deng, Emily
@ 2025-03-08  0:37     ` Chen, Xiaogang
  2025-03-10  0:51       ` Deng, Emily
  0 siblings, 1 reply; 7+ messages in thread
From: Chen, Xiaogang @ 2025-03-08  0:37 UTC (permalink / raw)
  To: Deng, Emily, amd-gfx@lists.freedesktop.org

[-- Attachment #1: Type: text/plain, Size: 5817 bytes --]


On 3/6/2025 7:27 PM, Deng, Emily wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> *From:*Chen, Xiaogang <Xiaogang.Chen@amd.com>
> *Sent:* Friday, March 7, 2025 1:01 AM
> *To:* Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
> *Subject:* Re: [PATCH v4] drm/amdgpu: Fix the race condition for 
> draining retry fault
>
> Thanks for catch up and fix this race condition. It looks good to me. 
> One minor thing below:
>
> On 3/6/2025 12:03 AM, Emily Deng wrote:
>
>     Issue:
>
>     In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts
>
>       has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu
>
>     is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages
>
>     continues execution and reaches svm_range_from_addr. This results in
>
>     a "failed to find prange..." error, causing the page recovery to fail.
>
>     How to fix:
>
>     Move the timestamp check code under the protection of svm->lock.
>
>     v2:
>
>     Make sure all right locks are released before go out.
>
>     v3:
>
>     Directly goto out_unlock_svms, and return -EAGAIN.
>
>     v4:
>
>     Refine code.
>
>     Signed-off-by: Emily Deng<Emily.Deng@amd.com> <mailto:Emily.Deng@amd.com>
>
>     ---
>
>       drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------
>
>       1 file changed, 16 insertions(+), 14 deletions(-)
>
>     diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>     index d04725583f19..83ac14bf7a7a 100644
>
>     --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>     +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>     @@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>                      goto out;
>
>              }
>
>       
>
>     -       /* check if this page fault time stamp is before svms->checkpoint_ts */
>
>     -       if (svms->checkpoint_ts[gpuidx] != 0) {
>
>     -               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
>
>     -                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
>
>     -                       r = 0;
>
>     -                       goto out;
>
>     -               } else
>
>     -                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
>
>     -                        * to zero to avoid following ts wrap around give wrong comparing
>
>     -                        */
>
>     -                svms->checkpoint_ts[gpuidx] = 0;
>
>     -       }
>
>     -
>
>              if (!p->xnack_enabled) {
>
>                      pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
>
>                      r = -EFAULT;
>
>     @@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>              mmap_read_lock(mm);
>
>       retry_write_locked:
>
>              mutex_lock(&svms->lock);
>
>     +
>
>     +       /* check if this page fault time stamp is before svms->checkpoint_ts */
>
>     +       if (svms->checkpoint_ts[gpuidx] != 0) {
>
>     +               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
>
>     +                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
>
>     +                       r = -EAGAIN;
>
> We drop page fault because it is stale, not mean to handle it again. 
> if return -EAGAIN we do amdgpu_gmc_filter_faults_remove. If after 
> unmap, user map same range again we should treat page fault happened 
> at same range as new one.
>
> Regards
>
> Xiaogang
>
> Sorry, I didn't quite catch that. So, you think we shouldn't 
> remove the fault from amdgpu_gmc_filter_faults_remove?
>
I think return "-EAGAIN" means a page fault with an addr and a pasid is 
not handled this time. Same page fault will come back again and kfd will 
handle it, so remove it from filter to make sure it will be handled next 
time.

In this case this page fault is stale and we do not need or cannot 
handle it. There is no indication it will come back again and we need 
handle it.  I am not sure if should remove it from filter. I just 
concern if should return "-EAGAIN" in this case.

Regards

Xiaogang

> Emily Deng
>
> Best Wishes
>
>     +                       goto out_unlock_svms;
>
>     +               } else
>
>     +                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
>
>     +                        * to zero to avoid following ts wrap around give wrong comparing
>
>     +                        */
>
>     +                svms->checkpoint_ts[gpuidx] = 0;
>
>     +       }
>
>     +
>
>              prange = svm_range_from_addr(svms, addr, NULL);
>
>              if (!prange) {
>
>                      pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
>
>     @@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>              mutex_unlock(&svms->lock);
>
>              mmap_read_unlock(mm);
>
>       
>
>     -       svm_range_count_fault(node, p, gpuidx);
>
>     +       if (r != -EAGAIN)
>
>     +               svm_range_count_fault(node, p, gpuidx);
>
>       
>
>              mmput(mm);
>
>       out:
>

[-- Attachment #2: Type: text/html, Size: 15261 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault
  2025-03-08  0:37     ` Chen, Xiaogang
@ 2025-03-10  0:51       ` Deng, Emily
  2025-03-12 22:40         ` Felix Kuehling
  0 siblings, 1 reply; 7+ messages in thread
From: Deng, Emily @ 2025-03-10  0:51 UTC (permalink / raw)
  To: Chen, Xiaogang, amd-gfx@lists.freedesktop.org

[-- Attachment #1: Type: text/plain, Size: 5484 bytes --]

[AMD Official Use Only - AMD Internal Distribution Only]



From: Chen, Xiaogang <Xiaogang.Chen@amd.com>
Sent: Saturday, March 8, 2025 8:38 AM
To: Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault



On 3/6/2025 7:27 PM, Deng, Emily wrote:

[AMD Official Use Only - AMD Internal Distribution Only]



From: Chen, Xiaogang <Xiaogang.Chen@amd.com><mailto:Xiaogang.Chen@amd.com>
Sent: Friday, March 7, 2025 1:01 AM
To: Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault


Thanks for catch up and fix this race condition. It looks good to me. One minor thing below:
On 3/6/2025 12:03 AM, Emily Deng wrote:

Issue:

In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts

 has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu

is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages

continues execution and reaches svm_range_from_addr. This results in

a "failed to find prange..." error, causing the page recovery to fail.



How to fix:

Move the timestamp check code under the protection of svm->lock.



v2:

Make sure all right locks are released before go out.



v3:

Directly goto out_unlock_svms, and return -EAGAIN.



v4:

Refine code.



Signed-off-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>

---

 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------

 1 file changed, 16 insertions(+), 14 deletions(-)



diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index d04725583f19..83ac14bf7a7a 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

@@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,

                goto out;

        }



-       /* check if this page fault time stamp is before svms->checkpoint_ts */

-       if (svms->checkpoint_ts[gpuidx] != 0) {

-               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {

-                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);

-                       r = 0;

-                       goto out;

-               } else

-                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts

-                        * to zero to avoid following ts wrap around give wrong comparing

-                        */

-                svms->checkpoint_ts[gpuidx] = 0;

-       }

-

        if (!p->xnack_enabled) {

                pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);

                r = -EFAULT;

@@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,

        mmap_read_lock(mm);

 retry_write_locked:

        mutex_lock(&svms->lock);

+

+       /* check if this page fault time stamp is before svms->checkpoint_ts */

+       if (svms->checkpoint_ts[gpuidx] != 0) {

+               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {

+                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);

+                       r = -EAGAIN;

We drop page fault because it is stale, not mean to handle it again. if return -EAGAIN we do amdgpu_gmc_filter_faults_remove. If after unmap, user map same range again we should treat page fault happened at same range as new one.

Regards

Xiaogang

Sorry, I didn't quite catch that. So, you think we shouldn't remove the fault from amdgpu_gmc_filter_faults_remove?

I think return "-EAGAIN" means a page fault with an addr and a pasid is not handled this time. Same page fault will come back again and kfd will handle it, so remove it from filter to make sure it will be handled next time.

Yes, I also think we should remove it to make sure it will be handled next time. From this point of view, it should be removed from filter.

Emily Deng
Best Wishes




In this case this page fault is stale and we do not need or cannot handle it. There is no indication it will come back again and we need handle it.  I am not sure if should remove it from filter. I just concern if should return "-EAGAIN" in this case.

Regards

Xiaogang



Emily Deng
Best Wishes






+                       goto out_unlock_svms;

+               } else

+                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts

+                        * to zero to avoid following ts wrap around give wrong comparing

+                        */

+                svms->checkpoint_ts[gpuidx] = 0;

+       }

+

        prange = svm_range_from_addr(svms, addr, NULL);

        if (!prange) {

                pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",

@@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,

        mutex_unlock(&svms->lock);

        mmap_read_unlock(mm);



-       svm_range_count_fault(node, p, gpuidx);

+       if (r != -EAGAIN)

+               svm_range_count_fault(node, p, gpuidx);



        mmput(mm);

 out:

[-- Attachment #2: Type: text/html, Size: 15908 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault
  2025-03-10  0:51       ` Deng, Emily
@ 2025-03-12 22:40         ` Felix Kuehling
  0 siblings, 0 replies; 7+ messages in thread
From: Felix Kuehling @ 2025-03-12 22:40 UTC (permalink / raw)
  To: Deng, Emily, Chen, Xiaogang, amd-gfx@lists.freedesktop.org

[-- Attachment #1: Type: text/plain, Size: 7168 bytes --]


On 2025-03-09 20:51, Deng, Emily wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> *From:*Chen, Xiaogang <Xiaogang.Chen@amd.com>
> *Sent:* Saturday, March 8, 2025 8:38 AM
> *To:* Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
> *Subject:* Re: [PATCH v4] drm/amdgpu: Fix the race condition for 
> draining retry fault
>
> On 3/6/2025 7:27 PM, Deng, Emily wrote:
>
>     [AMD Official Use Only - AMD Internal Distribution Only]
>
>     *From:*Chen, Xiaogang <Xiaogang.Chen@amd.com>
>     <mailto:Xiaogang.Chen@amd.com>
>     *Sent:* Friday, March 7, 2025 1:01 AM
>     *To:* Deng, Emily <Emily.Deng@amd.com>
>     <mailto:Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
>     *Subject:* Re: [PATCH v4] drm/amdgpu: Fix the race condition for
>     draining retry fault
>
>     Thanks for catch up and fix this race condition. It looks good to
>     me. One minor thing below:
>
>     On 3/6/2025 12:03 AM, Emily Deng wrote:
>
>         Issue:
>
>         In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts
>
>           has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu
>
>         is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages
>
>         continues execution and reaches svm_range_from_addr. This results in
>
>         a "failed to find prange..." error, causing the page recovery to fail.
>
>           
>
>         How to fix:
>
>         Move the timestamp check code under the protection of svm->lock.
>
>           
>
>         v2:
>
>         Make sure all right locks are released before go out.
>
>           
>
>         v3:
>
>         Directly goto out_unlock_svms, and return -EAGAIN.
>
>           
>
>         v4:
>
>         Refine code.
>
>           
>
>         Signed-off-by: Emily Deng<Emily.Deng@amd.com>  <mailto:Emily.Deng@amd.com>
>
>         ---
>
>           drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------
>
>           1 file changed, 16 insertions(+), 14 deletions(-)
>
>           
>
>         diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>         index d04725583f19..83ac14bf7a7a 100644
>
>         --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>         +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>         @@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>                          goto out;
>
>                  }
>
>           
>
>         -       /* check if this page fault time stamp is before svms->checkpoint_ts */
>
>         -       if (svms->checkpoint_ts[gpuidx] != 0) {
>
>         -               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
>
>         -                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
>
>         -                       r = 0;
>
>         -                       goto out;
>
>         -               } else
>
>         -                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
>
>         -                        * to zero to avoid following ts wrap around give wrong comparing
>
>         -                        */
>
>         -                svms->checkpoint_ts[gpuidx] = 0;
>
>         -       }
>
>         -
>
>                  if (!p->xnack_enabled) {
>
>                          pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
>
>                          r = -EFAULT;
>
>         @@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>                  mmap_read_lock(mm);
>
>           retry_write_locked:
>
>                  mutex_lock(&svms->lock);
>
>         +
>
>         +       /* check if this page fault time stamp is before svms->checkpoint_ts */
>
>         +       if (svms->checkpoint_ts[gpuidx] != 0) {
>
>         +               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
>
>         +                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
>
>         +                       r = -EAGAIN;
>
>     We drop page fault because it is stale, not mean to handle it
>     again. if return -EAGAIN we do amdgpu_gmc_filter_faults_remove. If
>     after unmap, user map same range again we should treat page fault
>     happened at same range as new one.
>
>     Regards
>
>     Xiaogang
>
>     Sorry, I didn't quite catch that. So, you think we shouldn't
>     remove the fault from amdgpu_gmc_filter_faults_remove?
>
> I think return "-EAGAIN" means a page fault with an addr and a pasid 
> is not handled this time. Same page fault will come back again and kfd 
> will handle it, so remove it from filter to make sure it will be 
> handled next time.
>
> Yes, I also think we should remove it to make sure it will be handled 
> next time. From this point of view, it should be removed from filter.
>
I agree. Without removing it from the filter, we would ignore later 
faults on the same address, which would result in stalling the 
application indefinitely.

Regards,
   Felix


> Emily Deng
>
> Best Wishes
>
> In this case this page fault is stale and we do not need or cannot 
> handle it. There is no indication it will come back again and we need 
> handle it.  I am not sure if should remove it from filter. I just 
> concern if should return "-EAGAIN" in this case.
>
> Regards
>
> Xiaogang
>
>     Emily Deng
>
>     Best Wishes
>
>           
>
>         +                       goto out_unlock_svms;
>
>         +               } else
>
>         +                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
>
>         +                        * to zero to avoid following ts wrap around give wrong comparing
>
>         +                        */
>
>         +                svms->checkpoint_ts[gpuidx] = 0;
>
>         +       }
>
>         +
>
>                  prange = svm_range_from_addr(svms, addr, NULL);
>
>                  if (!prange) {
>
>                          pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
>
>         @@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>                  mutex_unlock(&svms->lock);
>
>                  mmap_read_unlock(mm);
>
>           
>
>         -       svm_range_count_fault(node, p, gpuidx);
>
>         +       if (r != -EAGAIN)
>
>         +               svm_range_count_fault(node, p, gpuidx);
>
>           
>
>                  mmput(mm);
>
>           out:
>

[-- Attachment #2: Type: text/html, Size: 20452 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-03-12 22:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-06  6:03 [PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault Emily Deng
2025-03-06 15:49 ` Felix Kuehling
2025-03-06 17:00 ` Chen, Xiaogang
2025-03-07  1:27   ` Deng, Emily
2025-03-08  0:37     ` Chen, Xiaogang
2025-03-10  0:51       ` Deng, Emily
2025-03-12 22:40         ` Felix Kuehling

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.