All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] drm/amdgpu: fix cleaner shader IB size and entity cleanup
@ 2026-06-22 14:23 Wentao Liang
  2026-06-22 14:35 ` sashiko-bot
  2026-06-22 14:48 ` Christian König
  0 siblings, 2 replies; 3+ messages in thread
From: Wentao Liang @ 2026-06-22 14:23 UTC (permalink / raw)
  To: Alex Deucher, Christian König
  Cc: amd-gfx, dri-devel, linux-kernel, Wentao Liang, stable

Fix two issues in amdgpu_gfx_run_cleaner_shader_job():

1. IB buffer overflow: The indirect buffer is hardcoded to 64 bytes,
   but the initialization loop writes up to (align_mask + 1) dwords.
   On modern GFX rings with align_mask = 0xff, this writes 1024 bytes,
   overflowing the 64-byte allocation and corrupting memory.

2. Scheduler entity leak: The drm_sched_entity is not cleaned up on
   the error path after amdgpu_job_alloc_with_ib() fails.

Fix by:
- Dynamically calculating IB size based on ring->funcs->align_mask
- Adding drm_sched_entity_destroy() to the error path

Cc: stable@vger.kernel.org
Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader")
Fixes: 256576ed6895 ("drm/amdgpu: give each kernel job a unique id")
Fixes: 559a285816af ("drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner shader")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index b8ca876694ff..b50ec1a5c645 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -1651,6 +1651,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
 	struct amdgpu_job *job;
 	struct amdgpu_ib *ib;
 	void *owner;
+	unsigned int ib_size;
 	int i, r;
 
 	/* Initialize the scheduler entity */
@@ -1658,7 +1659,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
 				  &sched, 1, NULL);
 	if (r) {
 		dev_err(adev->dev, "Failed setting up GFX kernel entity.\n");
-		goto err;
+		return r;
 	}
 
 	/*
@@ -1668,8 +1669,15 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
 	 */
 	owner = (void *)(unsigned long)atomic_inc_return(&counter);
 
+	/*
+	 * Allocate IB with enough space for align_mask + 1 dwords.
+	 * The initialization loop below writes exactly this many dwords.
+	 * Each dword is 4 bytes.
+	 */
+	ib_size = (ring->funcs->align_mask + 1) * sizeof(uint32_t);
+
 	r = amdgpu_job_alloc_with_ib(ring->adev, &entity, owner,
-				     64, 0, &job,
+				     ib_size, 0, &job,
 				     AMDGPU_KERNEL_JOB_ID_CLEANER_SHADER);
 	if (r)
 		goto err;
@@ -1686,8 +1694,6 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
 	f = amdgpu_job_submit(job);
 
 	r = dma_fence_wait(f, false);
-	if (r)
-		goto err;
 
 	dma_fence_put(f);
 
@@ -1696,6 +1702,8 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
 	return 0;
 
 err:
+	/* Clean up the scheduler entity */
+	drm_sched_entity_destroy(&entity);
 	return r;
 }
 
-- 
2.39.5 (Apple Git-154)


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] drm/amdgpu: fix cleaner shader IB size and entity cleanup
  2026-06-22 14:23 [PATCH v2] drm/amdgpu: fix cleaner shader IB size and entity cleanup Wentao Liang
@ 2026-06-22 14:35 ` sashiko-bot
  2026-06-22 14:48 ` Christian König
  1 sibling, 0 replies; 3+ messages in thread
From: sashiko-bot @ 2026-06-22 14:35 UTC (permalink / raw)
  To: Wentao Liang; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] The patch leaves an unused assignment to `r` after removing the error check for `dma_fence_wait`, resulting in a dead store.
--

commit 24bfae44dd9f0251a61d22e1439ca564bf70d1a3
Author: Wentao Liang <vulab@iscas.ac.cn>

drm/amdgpu: fix cleaner shader IB size and entity cleanup

This commit dynamically calculates the indirect buffer size based on
align_mask to prevent memory corruption. It also ensures the scheduler
entity is properly destroyed on error paths and removes an unneeded error
check after waiting on the fence.

Cc: stable@vger.kernel.org

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index b8ca876694ff8..2da917e18d93b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c

[ ... ]

> @@ -1686,8 +1694,6 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  	f = amdgpu_job_submit(job);
>  
>  	r = dma_fence_wait(f, false);
> -	if (r)
> -		goto err;

[Severity: Low]
Does this leave an unused assignment to r in amdgpu_gfx_run_cleaner_shader_job()?

Since the error check was correctly removed (as dma_fence_wait with intr=false
does not return an error), the assigned value is never read before the
function unconditionally returns 0. Could this cause a dead store compiler
warning?

>  
>  	dma_fence_put(f);
>

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260622142305.45791-1-vulab@iscas.ac.cn?part=1

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] drm/amdgpu: fix cleaner shader IB size and entity cleanup
  2026-06-22 14:23 [PATCH v2] drm/amdgpu: fix cleaner shader IB size and entity cleanup Wentao Liang
  2026-06-22 14:35 ` sashiko-bot
@ 2026-06-22 14:48 ` Christian König
  1 sibling, 0 replies; 3+ messages in thread
From: Christian König @ 2026-06-22 14:48 UTC (permalink / raw)
  To: Wentao Liang, Alex Deucher; +Cc: amd-gfx, dri-devel, linux-kernel, stable



On 6/22/26 16:23, Wentao Liang wrote:
> Fix two issues in amdgpu_gfx_run_cleaner_shader_job():
> 
> 1. IB buffer overflow: The indirect buffer is hardcoded to 64 bytes,
>    but the initialization loop writes up to (align_mask + 1) dwords.
>    On modern GFX rings with align_mask = 0xff, this writes 1024 bytes,
>    overflowing the 64-byte allocation and corrupting memory.
> 
> 2. Scheduler entity leak: The drm_sched_entity is not cleaned up on
>    the error path after amdgpu_job_alloc_with_ib() fails.
> 
> Fix by:
> - Dynamically calculating IB size based on ring->funcs->align_mask
> - Adding drm_sched_entity_destroy() to the error path
> 
> Cc: stable@vger.kernel.org
> Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader")
> Fixes: 256576ed6895 ("drm/amdgpu: give each kernel job a unique id")
> Fixes: 559a285816af ("drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner shader")
> Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index b8ca876694ff..b50ec1a5c645 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -1651,6 +1651,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  	struct amdgpu_job *job;
>  	struct amdgpu_ib *ib;
>  	void *owner;
> +	unsigned int ib_size;
>  	int i, r;
>  
>  	/* Initialize the scheduler entity */
> @@ -1658,7 +1659,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  				  &sched, 1, NULL);
>  	if (r) {
>  		dev_err(adev->dev, "Failed setting up GFX kernel entity.\n");
> -		goto err;
> +		return r;
>  	}
>  
>  	/*
> @@ -1668,8 +1669,15 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  	 */
>  	owner = (void *)(unsigned long)atomic_inc_return(&counter);
>  
> +	/*
> +	 * Allocate IB with enough space for align_mask + 1 dwords.
> +	 * The initialization loop below writes exactly this many dwords.
> +	 * Each dword is 4 bytes.
> +	 */
> +	ib_size = (ring->funcs->align_mask + 1) * sizeof(uint32_t);
> +

Please completely drop that.

The ring->funcs->align_mask is the ring alignment mask and has no technical relevance for the IB.

>  	r = amdgpu_job_alloc_with_ib(ring->adev, &entity, owner,
> -				     64, 0, &job,
> +				     ib_size, 0, &job,
>  				     AMDGPU_KERNEL_JOB_ID_CLEANER_SHADER);
>  	if (r)
>  		goto err;
> @@ -1686,8 +1694,6 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  	f = amdgpu_job_submit(job);
>  
>  	r = dma_fence_wait(f, false);

Please drop assigning r as well.

Regards,
Christian.

> -	if (r)
> -		goto err;
>  
>  	dma_fence_put(f);
>  
> @@ -1696,6 +1702,8 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>  	return 0;
>  
>  err:
> +	/* Clean up the scheduler entity */
> +	drm_sched_entity_destroy(&entity);
>  	return r;
>  }
>  


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-06-22 14:49 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-22 14:23 [PATCH v2] drm/amdgpu: fix cleaner shader IB size and entity cleanup Wentao Liang
2026-06-22 14:35 ` sashiko-bot
2026-06-22 14:48 ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.