amd-gfx.lists.freedesktop.org archive mirror
* [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings
@ 2025-09-01 10:00 Timur Kristóf
  2025-09-01 10:00 ` [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments Timur Kristóf
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Timur Kristóf @ 2025-09-01 10:00 UTC (permalink / raw)
  To: amd-gfx; +Cc: alexander.deucher, christian.koenig, Timur Kristóf

The amdgpu_bo_create_kernel function takes a byte count,
so we need to multiply the extra dword count by four.
(The ring_size is already in bytes so that one is correct here.)

Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for jpeg ring")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 6379bb25bf5c..13f0f0209cbe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -364,11 +364,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
 
 	/* Allocate ring buffer */
 	if (ring->ring_obj == NULL) {
-		r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_dw, PAGE_SIZE,
-					    AMDGPU_GEM_DOMAIN_GTT,
-					    &ring->ring_obj,
-					    &ring->gpu_addr,
-					    (void **)&ring->ring);
+		r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_dw * 4,
+						PAGE_SIZE,
+						AMDGPU_GEM_DOMAIN_GTT,
+						&ring->ring_obj,
+						&ring->gpu_addr,
+						(void **)&ring->ring);
 		if (r) {
 			dev_err(adev->dev, "(%d) ring create failed\n", r);
 			kvfree(ring->ring_backup);
-- 
2.51.0
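
The unit conversion in this patch can be sketched in plain userspace C.
This is only an illustration under stated assumptions: ring_alloc_bytes()
is a hypothetical helper, not a function from the amdgpu driver, and it
assumes (as the commit message says) that the ring size is already in
bytes while the extra space is counted in dwords.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of amdgpu): number of bytes needed for
 * a ring of ring_size_bytes plus extra_dw additional dwords.
 */
static size_t ring_alloc_bytes(size_t ring_size_bytes, size_t extra_dw)
{
	/* Guard against overflow of the multiplication and the sum. */
	assert(extra_dw <= (SIZE_MAX - ring_size_bytes) / sizeof(uint32_t));

	/* One dword is four bytes, so convert before adding. */
	return ring_size_bytes + extra_dw * sizeof(uint32_t);
}
```

For example, a 4096-byte ring with 64 extra dwords needs 4096 + 64 * 4
bytes, whereas adding the dword count directly would request only
4096 + 64 bytes.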



* [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments
  2025-09-01 10:00 [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Timur Kristóf
@ 2025-09-01 10:00 ` Timur Kristóf
  2025-09-02  6:43   ` Christian König
  2025-09-01 10:00 ` [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs Timur Kristóf
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Timur Kristóf @ 2025-09-01 10:00 UTC (permalink / raw)
  To: amd-gfx; +Cc: alexander.deucher, christian.koenig, Timur Kristóf

To avoid confusion with dwords.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 122a88294883..9ffadc029ef8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -220,7 +220,7 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain)
  * amdgpu_bo_create_reserved - create reserved BO for kernel use
  *
  * @adev: amdgpu device object
- * @size: size for the new BO
+ * @size: size for the new BO in bytes
  * @align: alignment for the new BO
  * @domain: where to place it
  * @bo_ptr: used to initialize BOs in structures
@@ -317,7 +317,7 @@ int amdgpu_bo_create_reserved(struct amdgpu_device *adev,
  * amdgpu_bo_create_kernel - create BO for kernel use
  *
  * @adev: amdgpu device object
- * @size: size for the new BO
+ * @size: size for the new BO in bytes
  * @align: alignment for the new BO
  * @domain: where to place it
  * @bo_ptr:  used to initialize BOs in structures
-- 
2.51.0



* [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-01 10:00 [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Timur Kristóf
  2025-09-01 10:00 ` [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments Timur Kristóf
@ 2025-09-01 10:00 ` Timur Kristóf
  2025-09-01 10:13   ` Tvrtko Ursulin
  2025-09-02  6:39   ` Christian König
  2025-09-01 10:00 ` [PATCH 4/4] drm/amdgpu: Set SDMA v3 copy_max_bytes to 0x3fff00 Timur Kristóf
  2025-09-02  6:41 ` [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Christian König
  3 siblings, 2 replies; 21+ messages in thread
From: Timur Kristóf @ 2025-09-01 10:00 UTC (permalink / raw)
  To: amd-gfx; +Cc: alexander.deucher, christian.koenig, Timur Kristóf

Technically not necessary, but clear the extra dwords too,
so that the command processors don't read uninitialized memory.

Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for jpeg ring")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7670f5d82b9e..6a55a85744a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -474,6 +474,11 @@ static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
 	while (i <= ring->buf_mask)
 		ring->ring[i++] = ring->funcs->nop;
 
+	/* Technically not necessary, but clear the extra dwords too,
+	 * so that the command processors don't read uninitialized memory.
+	 */
+	for (i = 0; i < ring->funcs->extra_dw; i++)
+		ring->ring[ring->ring_size / 4 + i] = ring->funcs->nop;
 }
 
 static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
-- 
2.51.0



* [PATCH 4/4] drm/amdgpu: Set SDMA v3 copy_max_bytes to 0x3fff00
  2025-09-01 10:00 [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Timur Kristóf
  2025-09-01 10:00 ` [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments Timur Kristóf
  2025-09-01 10:00 ` [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs Timur Kristóf
@ 2025-09-01 10:00 ` Timur Kristóf
  2025-09-02  6:45   ` Christian König
  2025-09-02  6:41 ` [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Christian König
  3 siblings, 1 reply; 21+ messages in thread
From: Timur Kristóf @ 2025-09-01 10:00 UTC (permalink / raw)
  To: amd-gfx; +Cc: alexander.deucher, christian.koenig, Timur Kristóf

SDMA v3-v5 can copy almost 4 MiB in a single copy operation.
Use the same value as PAL and Mesa for copy_max_bytes.

For reference, see oss2DmaCmdBuffer.cpp in PAL:
"Due to HW limitation, the maximum count may not be 2^n-1,
can only be 2^n - 1 - start_addr[4:2]"

See also sid.h in Mesa:
"There is apparently an undocumented HW limitation that
prevents the HW from copying the last 255 bytes of (1 << 22) - 1"

Fixes: dfe5c2b76b2a ("drm/amdgpu: Correct bytes limit for SDMA 3.0 copy and fill")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 1c076bd1cf73..9302cf0b5e4b 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -1659,11 +1659,11 @@ static void sdma_v3_0_emit_fill_buffer(struct amdgpu_ib *ib,
 }
 
 static const struct amdgpu_buffer_funcs sdma_v3_0_buffer_funcs = {
-	.copy_max_bytes = 0x3fffe0, /* not 0x3fffff due to HW limitation */
+	.copy_max_bytes = 0x3fff00, /* not 0x3fffff due to HW limitation */
 	.copy_num_dw = 7,
 	.emit_copy_buffer = sdma_v3_0_emit_copy_buffer,
 
-	.fill_max_bytes = 0x3fffe0, /* not 0x3fffff due to HW limitation */
+	.fill_max_bytes = 0x3fff00, /* not 0x3fffff due to HW limitation */
 	.fill_num_dw = 5,
 	.emit_fill_buffer = sdma_v3_0_emit_fill_buffer,
 };
-- 
2.51.0
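
The new limit matches the Mesa comment quoted above: (1 << 22) - 1 - 255
is 0x3fff00. The sketch below shows how a caller might split a large
transfer into packets that respect such a limit. It is an illustration
only: sketch_num_copies() and sketch_chunk_size() are hypothetical names,
and the splitting strategy is an assumption rather than the driver's
actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Value from the patch: the last 255 bytes of (1 << 22) - 1 are unusable. */
#define SKETCH_COPY_MAX_BYTES 0x3fff00u

/* Hypothetical helper: number of copy packets needed for total_bytes. */
static uint64_t sketch_num_copies(uint64_t total_bytes, uint32_t max_bytes)
{
	assert(max_bytes > 0);
	return (total_bytes + max_bytes - 1) / max_bytes;
}

/* Hypothetical helper: size of the packet that starts at offset. */
static uint32_t sketch_chunk_size(uint64_t total_bytes, uint64_t offset,
				  uint32_t max_bytes)
{
	uint64_t remaining;

	assert(offset <= total_bytes);
	remaining = total_bytes - offset;
	return remaining < max_bytes ? (uint32_t)remaining : max_bytes;
}
```

With this limit an 8 MiB transfer needs three packets: two full ones of
0x3fff00 bytes each and a final one carrying the remaining 512 bytes.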



* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-01 10:00 ` [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs Timur Kristóf
@ 2025-09-01 10:13   ` Tvrtko Ursulin
  2025-09-02 11:30     ` Timur Kristóf
  2025-09-02  6:39   ` Christian König
  1 sibling, 1 reply; 21+ messages in thread
From: Tvrtko Ursulin @ 2025-09-01 10:13 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher, christian.koenig


Hi,

On 01/09/2025 11:00, Timur Kristóf wrote:
> Technically not necessary, but clear the extra dwords too,
> so that the command processors don't read uninitialized memory.
> 
> Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for jpeg ring")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 7670f5d82b9e..6a55a85744a9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -474,6 +474,11 @@ static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>   	while (i <= ring->buf_mask)
>   		ring->ring[i++] = ring->funcs->nop;
>   
> +	/* Technically not necessary, but clear the extra dwords too,
> +	 * so that the command processors don't read uninitialized memory.
> +	 */
> +	for (i = 0; i < ring->funcs->extra_dw; i++)
> +		ring->ring[ring->ring_size / 4 + i] = ring->funcs->nop;

Should I resend this maybe?

commit 11b0b5d942fe46bfb01f021cdb0616c8385d5ea8
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Date:   Thu Dec 26 16:12:37 2024 +0000

     drm/amdgpu: Use memset32 for ring clearing

     Use memset32 instead of open coding it, just because it is
     a tiny bit nicer.

     Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
     Cc: Christian König <christian.koenig@amd.com>
     Cc: Sunil Khatri <sunil.khatri@amd.com>

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index dee5a1b4e572..96bfc0c23413 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -369,10 +369,7 @@ static inline void 
amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,

  static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
  {
-       int i = 0;
-       while (i <= ring->buf_mask)
-               ring->ring[i++] = ring->funcs->nop;
-
+       memset32(ring->ring, ring->funcs->nop, ring->buf_mask + 1);
  }

  static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)

Looks like with two loops it would make even more sense to consolidate.

Regards,

Tvrtko

>   }
>   
>   static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
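
The consolidation suggested above can be modelled in userspace. This is a
sketch under assumptions: fill32() is a hypothetical stand-in for the
kernel's memset32(), and struct sketch_ring contains only the fields
needed here, not the real struct amdgpu_ring layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for memset32(): store value into count dwords. */
static void fill32(uint32_t *dst, uint32_t value, size_t count)
{
	for (size_t i = 0; i < count; i++)
		dst[i] = value;
}

/* Minimal model of a ring; hypothetical, reduced to three fields. */
struct sketch_ring {
	uint32_t *ring;     /* ring storage, buf_mask + 1 dwords */
	uint32_t buf_mask;  /* number of dwords minus one, 2^n - 1 */
	uint32_t nop;       /* NOP packet value for this ring type */
};

/* Fill the ring proper with NOPs in a single call instead of a loop. */
static void sketch_clear_ring(struct sketch_ring *r)
{
	/* buf_mask + 1 is expected to be a power of two. */
	assert((((size_t)r->buf_mask + 1) & r->buf_mask) == 0);
	fill32(r->ring, r->nop, (size_t)r->buf_mask + 1);
}
```

The single call covers exactly buf_mask + 1 dwords, the same range as the
open-coded while loop it replaces.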




* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-01 10:00 ` [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs Timur Kristóf
  2025-09-01 10:13   ` Tvrtko Ursulin
@ 2025-09-02  6:39   ` Christian König
  2025-09-02  7:42     ` Timur Kristóf
  1 sibling, 1 reply; 21+ messages in thread
From: Christian König @ 2025-09-02  6:39 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher

On 01.09.25 12:00, Timur Kristóf wrote:
> Technically not necessary, but clear the extra dwords too,
> so that the command processors don't read uninitialized memory.

That is most likely a really bad idea.

The extra DWs are filled with a specific pattern for a HW workaround.

Clearing them to NOPs makes no sense at all and potentially even breaks the HW workaround.

Regards,
Christian.

> 
> Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for jpeg ring")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 7670f5d82b9e..6a55a85744a9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -474,6 +474,11 @@ static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>  	while (i <= ring->buf_mask)
>  		ring->ring[i++] = ring->funcs->nop;
>  
> +	/* Technically not necessary, but clear the extra dwords too,
> +	 * so that the command processors don't read uninitialized memory.
> +	 */
> +	for (i = 0; i < ring->funcs->extra_dw; i++)
> +		ring->ring[ring->ring_size / 4 + i] = ring->funcs->nop;
>  }
>  
>  static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)



* Re: [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings
  2025-09-01 10:00 [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Timur Kristóf
                   ` (2 preceding siblings ...)
  2025-09-01 10:00 ` [PATCH 4/4] drm/amdgpu: Set SDMA v3 copy_max_bytes to 0x3fff00 Timur Kristóf
@ 2025-09-02  6:41 ` Christian König
  2025-09-02  8:26   ` Timur Kristóf
  3 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2025-09-02  6:41 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher

On 01.09.25 12:00, Timur Kristóf wrote:
> The amdgpu_bo_create_kernel function takes a byte count,
> so we need to multiply the extra dword count by four.
> (The ring_size is already in bytes so that one is correct here.)

Good catch, it just doesn't make a difference in practice since everything is rounded up to 4k anyway.

But I'm really wondering if we shouldn't replace the extra_dw with extra_bytes instead.

It should only be used by some multimedia engines anyway.

Regards,
Christian.

> 
> Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for jpeg ring")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 6379bb25bf5c..13f0f0209cbe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -364,11 +364,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>  
>  	/* Allocate ring buffer */
>  	if (ring->ring_obj == NULL) {
> -		r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_dw, PAGE_SIZE,
> -					    AMDGPU_GEM_DOMAIN_GTT,
> -					    &ring->ring_obj,
> -					    &ring->gpu_addr,
> -					    (void **)&ring->ring);
> +		r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_dw * 4,
> +						PAGE_SIZE,
> +						AMDGPU_GEM_DOMAIN_GTT,
> +						&ring->ring_obj,
> +						&ring->gpu_addr,
> +						(void **)&ring->ring);
>  		if (r) {
>  			dev_err(adev->dev, "(%d) ring create failed\n", r);
>  			kvfree(ring->ring_backup);
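
Why the missing multiplication made no practical difference can be shown
with a small rounding sketch. It assumes 4 KiB pages and uses hypothetical
helper names; it is not the allocator's real code path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Round a byte count up to the next multiple of the page size. */
static size_t round_up_to_page(size_t bytes)
{
	assert(bytes <= SIZE_MAX - (SKETCH_PAGE_SIZE - 1));
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE *
	       SKETCH_PAGE_SIZE;
}

/* Returns 1 when the old request (extra_dw added as if it were bytes)
 * and the patched request (extra_dw * 4) round to the same size.
 */
static int same_backing_size(size_t ring_size_bytes, size_t extra_dw)
{
	size_t old_size = round_up_to_page(ring_size_bytes + extra_dw);
	size_t new_size = round_up_to_page(ring_size_bytes + extra_dw * 4);

	return old_size == new_size;
}
```

For a 4096-byte ring with 64 extra dwords both requests round up to 8192
bytes. The two only diverge once the extra area is large enough to cross
a page boundary, for example with 2048 extra dwords.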



* Re: [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments
  2025-09-01 10:00 ` [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments Timur Kristóf
@ 2025-09-02  6:43   ` Christian König
  2025-09-02  8:08     ` Timur Kristóf
  0 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2025-09-02  6:43 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher

On 01.09.25 12:00, Timur Kristóf wrote:
> To avoid confusion with dwords.
> 
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 122a88294883..9ffadc029ef8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -220,7 +220,7 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain)
>   * amdgpu_bo_create_reserved - create reserved BO for kernel use
>   *
>   * @adev: amdgpu device object
> - * @size: size for the new BO
> + * @size: size for the new BO in bytes

That is actually incorrect. The size is in arbitrary units.

For OA and GWS it is the number of HW engines they represent, for GDS it is in bytes and for VRAM and GTT it is in bytes but rounded up to a page size.

Regards,
Christian.

>   * @align: alignment for the new BO
>   * @domain: where to place it
>   * @bo_ptr: used to initialize BOs in structures
> @@ -317,7 +317,7 @@ int amdgpu_bo_create_reserved(struct amdgpu_device *adev,
>   * amdgpu_bo_create_kernel - create BO for kernel use
>   *
>   * @adev: amdgpu device object
> - * @size: size for the new BO
> + * @size: size for the new BO in bytes
>   * @align: alignment for the new BO
>   * @domain: where to place it
>   * @bo_ptr:  used to initialize BOs in structures



* Re: [PATCH 4/4] drm/amdgpu: Set SDMA v3 copy_max_bytes to 0x3fff00
  2025-09-01 10:00 ` [PATCH 4/4] drm/amdgpu: Set SDMA v3 copy_max_bytes to 0x3fff00 Timur Kristóf
@ 2025-09-02  6:45   ` Christian König
  2025-09-02  7:52     ` Timur Kristóf
  0 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2025-09-02  6:45 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher

On 01.09.25 12:00, Timur Kristóf wrote:
> SDMA v3-v5 can copy almost 4 MiB in a single copy operation.
> Use the same value as PAL and Mesa for copy_max_bytes.
> 
> For reference, see oss2DmaCmdBuffer.cpp in PAL:
> "Due to HW limitation, the maximum count may not be 2^n-1,
> can only be 2^n - 1 - start_addr[4:2]"

Is that publicly available? If yes, better to reference AMDVLK here.

Apart from that looks good to me.

Regards,
Christian.

> 
> See also sid.h in Mesa:
> "There is apparently an undocumented HW limitation that
> prevents the HW from copying the last 255 bytes of (1 << 22) - 1"
> 
> Fixes: dfe5c2b76b2a ("drm/amdgpu: Correct bytes limit for SDMA 3.0 copy and fill")
> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> index 1c076bd1cf73..9302cf0b5e4b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> @@ -1659,11 +1659,11 @@ static void sdma_v3_0_emit_fill_buffer(struct amdgpu_ib *ib,
>  }
>  
>  static const struct amdgpu_buffer_funcs sdma_v3_0_buffer_funcs = {
> -	.copy_max_bytes = 0x3fffe0, /* not 0x3fffff due to HW limitation */
> +	.copy_max_bytes = 0x3fff00, /* not 0x3fffff due to HW limitation */
>  	.copy_num_dw = 7,
>  	.emit_copy_buffer = sdma_v3_0_emit_copy_buffer,
>  
> -	.fill_max_bytes = 0x3fffe0, /* not 0x3fffff due to HW limitation */
> +	.fill_max_bytes = 0x3fff00, /* not 0x3fffff due to HW limitation */
>  	.fill_num_dw = 5,
>  	.emit_fill_buffer = sdma_v3_0_emit_fill_buffer,
>  };



* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-02  6:39   ` Christian König
@ 2025-09-02  7:42     ` Timur Kristóf
  2025-09-02  9:54       ` Christian König
  0 siblings, 1 reply; 21+ messages in thread
From: Timur Kristóf @ 2025-09-02  7:42 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: alexander.deucher

On Tue, 2025-09-02 at 08:39 +0200, Christian König wrote:
> On 01.09.25 12:00, Timur Kristóf wrote:
> > Technically not necessary, but clear the extra dwords too,
> > so that the command processors don't read uninitialized memory.
> 
> That is most likely a really bad idea.
> 
> The extra DWs are filled with a specific pattern for a HW workaround.

I was unaware of that, so it looked to me like the extra dwords just
remain uninitialized.

Where in the code does that happen?
And what is the issue that is being worked around?

Also, while we are at it, how was it possible to initialize that
without causing VM faults? Considering that the allocated BO was not
sufficiently large to hold the extra dwords. (That is fixed by the
first patch of this series.)

> 
> Clearing them to NOPs makes no sense at all and potentially even
> breaks the HW workaround.
> 
> Regards,
> Christian.
> 
> > 
> > Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for
> > jpeg ring")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > index 7670f5d82b9e..6a55a85744a9 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > @@ -474,6 +474,11 @@ static inline void
> > amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> >  	while (i <= ring->buf_mask)
> >  		ring->ring[i++] = ring->funcs->nop;
> >  
> > +	/* Technically not necessary, but clear the extra dwords
> > too,
> > +	 * so that the command processors don't read uninitialized
> > memory.
> > +	 */
> > +	for (i = 0; i < ring->funcs->extra_dw; i++)
> > +		ring->ring[ring->ring_size / 4 + i] = ring->funcs-
> > >nop;
> >  }
> >  
> >  static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
> > uint32_t v)


* Re: [PATCH 4/4] drm/amdgpu: Set SDMA v3 copy_max_bytes to 0x3fff00
  2025-09-02  6:45   ` Christian König
@ 2025-09-02  7:52     ` Timur Kristóf
  0 siblings, 0 replies; 21+ messages in thread
From: Timur Kristóf @ 2025-09-02  7:52 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: alexander.deucher

On Tue, 2025-09-02 at 08:45 +0200, Christian König wrote:
> On 01.09.25 12:00, Timur Kristóf wrote:
> > SDMA v3-v5 can copy almost 4 MiB in a single copy operation.
> > Use the same value as PAL and Mesa for copy_max_bytes.
> > 
> > For reference, see oss2DmaCmdBuffer.cpp in PAL:
> > "Due to HW limitation, the maximum count may not be 2^n-1,
> > can only be 2^n - 1 - start_addr[4:2]"
> 
> Is that publicly available? If yes, better to reference AMDVLK here.

I only have access to public code, so that's all I can reference.

AMDVLK uses PAL so I assume the same would apply to it as well.
The above comment is in oss2DmaCmdBuffer.cpp here:
https://github.com/GPUOpen-Drivers/pal/blob/bcec463efe5260776d486a5e3da0c549bc0a75d2/src/core/hw/ossip/oss2/oss2DmaCmdBuffer.cpp#L308



> 
> Apart from that looks good to me.
> 
> Regards,
> Christian.
> 
> > 
> > See also sid.h in Mesa:
> > "There is apparently an undocumented HW limitation that
> > prevents the HW from copying the last 255 bytes of (1 << 22) - 1"
> > 
> > Fixes: dfe5c2b76b2a ("drm/amdgpu: Correct bytes limit for SDMA 3.0
> > copy and fill")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > index 1c076bd1cf73..9302cf0b5e4b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > @@ -1659,11 +1659,11 @@ static void
> > sdma_v3_0_emit_fill_buffer(struct amdgpu_ib *ib,
> >  }
> >  
> >  static const struct amdgpu_buffer_funcs sdma_v3_0_buffer_funcs = {
> > -	.copy_max_bytes = 0x3fffe0, /* not 0x3fffff due to HW
> > limitation */
> > +	.copy_max_bytes = 0x3fff00, /* not 0x3fffff due to HW
> > limitation */
> >  	.copy_num_dw = 7,
> >  	.emit_copy_buffer = sdma_v3_0_emit_copy_buffer,
> >  
> > -	.fill_max_bytes = 0x3fffe0, /* not 0x3fffff due to HW
> > limitation */
> > +	.fill_max_bytes = 0x3fff00, /* not 0x3fffff due to HW
> > limitation */
> >  	.fill_num_dw = 5,
> >  	.emit_fill_buffer = sdma_v3_0_emit_fill_buffer,
> >  };


* Re: [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments
  2025-09-02  6:43   ` Christian König
@ 2025-09-02  8:08     ` Timur Kristóf
  2025-09-02 11:10       ` Christian König
  0 siblings, 1 reply; 21+ messages in thread
From: Timur Kristóf @ 2025-09-02  8:08 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: alexander.deucher

On Tue, 2025-09-02 at 08:43 +0200, Christian König wrote:
> On 01.09.25 12:00, Timur Kristóf wrote:
> > To avoid confusion with dwords.
> > 
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > index 122a88294883..9ffadc029ef8 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > @@ -220,7 +220,7 @@ void amdgpu_bo_placement_from_domain(struct
> > amdgpu_bo *abo, u32 domain)
> >   * amdgpu_bo_create_reserved - create reserved BO for kernel use
> >   *
> >   * @adev: amdgpu device object
> > - * @size: size for the new BO
> > + * @size: size for the new BO in bytes
> 
> That is actually incorrect. The size is in arbitrary units.
> 
> For OA and GWS it is the number of HW engines they represent, for GDS
> it is in bytes and for VRAM and GTT it is in bytes but rounded up to
> a page size.
> 

Can you point me to the part of the code where this is actually
determined?

If it's that complicated, then I think the function could definitely
benefit from more documentation. All I could find was that both
ttm_resource::size and drm_gem_object::size are documented to be in
bytes.


> 
> >   * @align: alignment for the new BO
> >   * @domain: where to place it
> >   * @bo_ptr: used to initialize BOs in structures
> > @@ -317,7 +317,7 @@ int amdgpu_bo_create_reserved(struct
> > amdgpu_device *adev,
> >   * amdgpu_bo_create_kernel - create BO for kernel use
> >   *
> >   * @adev: amdgpu device object
> > - * @size: size for the new BO
> > + * @size: size for the new BO in bytes
> >   * @align: alignment for the new BO
> >   * @domain: where to place it
> >   * @bo_ptr:  used to initialize BOs in structures


* Re: [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings
  2025-09-02  6:41 ` [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Christian König
@ 2025-09-02  8:26   ` Timur Kristóf
  2025-09-02 11:20     ` Christian König
  0 siblings, 1 reply; 21+ messages in thread
From: Timur Kristóf @ 2025-09-02  8:26 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: alexander.deucher

On Tue, 2025-09-02 at 08:41 +0200, Christian König wrote:
> On 01.09.25 12:00, Timur Kristóf wrote:
> > The amdgpu_bo_create_kernel function takes a byte count,
> > so we need to multiply the extra dword count by four.
> > (The ring_size is already in bytes so that one is correct here.)
> 
> Good catch, it just doesn't make a difference in practice since
> everything is rounded up to 4k anyway.

Yes. It looks like extra_dw is only used by jpeg_v1_0 and vcn_v4_0 and
fortunately both of these are small enough that it doesn't cause a
difference in practice.

I'd still like to fix it though to avoid potential issues in the
future.

> 
> But I'm really wondering if we shouldn't replace the extra_dw with
> extra_bytes instead.

It's up to you. I'm happy to change it to extra_bytes if you think
that's better. I would prefer to keep extra_dw for two reasons:

- Avoid unnecessary churn
- Most of the code related to rings is in dwords, so IMO it's better
to have the extra space in dwords too for consistency.

> 
> It should only be used by some multimedia engines anyway.

Right. vcn_v4_0 has this:
.extra_dw = sizeof(struct amdgpu_vcn_rb_metadata)
Which would also need to be fixed because it's not really in dwords.



> > 
> > Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for
> > jpeg ring")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > index 6379bb25bf5c..13f0f0209cbe 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > @@ -364,11 +364,12 @@ int amdgpu_ring_init(struct amdgpu_device
> > *adev, struct amdgpu_ring *ring,
> >  
> >  	/* Allocate ring buffer */
> >  	if (ring->ring_obj == NULL) {
> > -		r = amdgpu_bo_create_kernel(adev, ring->ring_size
> > + ring->funcs->extra_dw, PAGE_SIZE,
> > -					    AMDGPU_GEM_DOMAIN_GTT,
> > -					    &ring->ring_obj,
> > -					    &ring->gpu_addr,
> > -					    (void **)&ring->ring);
> > +		r = amdgpu_bo_create_kernel(adev, ring->ring_size
> > + ring->funcs->extra_dw * 4,
> > +						PAGE_SIZE,
> > +						AMDGPU_GEM_DOMAIN_
> > GTT,
> > +						&ring->ring_obj,
> > +						&ring->gpu_addr,
> > +						(void **)&ring-
> > >ring);
> >  		if (r) {
> >  			dev_err(adev->dev, "(%d) ring create
> > failed\n", r);
> >  			kvfree(ring->ring_backup);
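
The sizeof() mismatch mentioned above comes down to a unit conversion. The
sketch below shows one way a byte count could be turned into a dword
count. Everything in it is hypothetical: struct sketch_rb_metadata is a
made-up stand-in (the real struct amdgpu_vcn_rb_metadata is not
reproduced here) and bytes_to_dwords() is not a driver function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up metadata layout, 32 bytes in total, for illustration only. */
struct sketch_rb_metadata {
	uint32_t size;
	uint32_t version;
	uint32_t payload[6];
};

/* Convert a byte count to dwords, rounding up to a whole dword. */
static size_t bytes_to_dwords(size_t bytes)
{
	assert(bytes <= SIZE_MAX - (sizeof(uint32_t) - 1));
	return (bytes + sizeof(uint32_t) - 1) / sizeof(uint32_t);
}
```

With this made-up layout, sizeof() yields 32 but the structure occupies
only 8 dwords. Storing 32 in a field that is later multiplied by four
would reserve 128 bytes for it.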


* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-02  7:42     ` Timur Kristóf
@ 2025-09-02  9:54       ` Christian König
  2025-09-02 10:10         ` Timur Kristóf
  0 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2025-09-02  9:54 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher

On 02.09.25 09:42, Timur Kristóf wrote:
> On Tue, 2025-09-02 at 08:39 +0200, Christian König wrote:
>> On 01.09.25 12:00, Timur Kristóf wrote:
>>> Technically not necessary, but clear the extra dwords too,
>>> so that the command processors don't read uninitialized memory.
>>
>> That is most likely a really bad idea.
>>
>> The extra DWs are filled with a specific pattern for a HW workaround.
> 
> I was unaware so it looked to me like the extra dwords just remain
> uninitialized.
> 
> Where in the code does that happen?

See vcn_v4_0_init_ring_metadata() for an example.

> And what is the issue that is being worked around?

IIRC on the VCN4 engine there is a HW bug which is triggered when some metadata for the ring is not in the same X megabyte segment as the ring buffer. So we just put this small metadata structure directly after the ring.

On some JPEG engines you had to insert commands after the end of the ring to make the ring work reliably. See jpeg_v1_0_decode_ring_set_patch_ring().

> Also, while we are at it, how was it possible to initialize that
> without causing VM faults? Considering that the allocated BO was not
> sufficiently large to hold the extra dwords. (That is fixed by the
> first patch of this series.)

BOs for VRAM and GTT are always rounded up to the next 4 KiB. The workaround needs something like 100 bytes or 64 DW, so that always worked.

BTW: I just found that vcn_v4_0_unified_ring_vm_funcs() is using extra_dw as bytes already :)

Regards,
Christian.

> 
>>
>> Clearing them to NOPs makes no sense at all and potentially even
>> breaks the HW workaround.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for
>>> jpeg ring")
>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index 7670f5d82b9e..6a55a85744a9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -474,6 +474,11 @@ static inline void
>>> amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>  	while (i <= ring->buf_mask)
>>>  		ring->ring[i++] = ring->funcs->nop;
>>>  
>>> +	/* Technically not necessary, but clear the extra dwords
>>> too,
>>> +	 * so that the command processors don't read uninitialized
>>> memory.
>>> +	 */
>>> +	for (i = 0; i < ring->funcs->extra_dw; i++)
>>> +		ring->ring[ring->ring_size / 4 + i] = ring->funcs-
>>>> nop;
>>>  }
>>>  
>>>  static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
>>> uint32_t v)



* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-02  9:54       ` Christian König
@ 2025-09-02 10:10         ` Timur Kristóf
  0 siblings, 0 replies; 21+ messages in thread
From: Timur Kristóf @ 2025-09-02 10:10 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: alexander.deucher

On Tue, 2025-09-02 at 11:54 +0200, Christian König wrote:
> On 02.09.25 09:42, Timur Kristóf wrote:
> > On Tue, 2025-09-02 at 08:39 +0200, Christian König wrote:
> > > On 01.09.25 12:00, Timur Kristóf wrote:
> > > > Technically not necessary, but clear the extra dwords too,
> > > > so that the command processors don't read uninitialized memory.
> > > 
> > > That is most likely a really bad idea.
> > > 
> > > The extra DWs are filled with a specific pattern for a HW
> > > workaround.
> > 
> > I was unaware so it looked to me like the extra dwords just remain
> > uninitialized.
> > 
> > Where in the code does that happen?
> 
> See vcn_v4_0_init_ring_metadata() for an example.

I see, thanks for explaining that.

So basically it's up to the ring initialization to fill the extra
dwords with something, and it shouldn't be done when clearing the ring.
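
As a sketch of that split (not the driver code): clearing touches only the buf_mask + 1 ring dwords and leaves whatever ring initialization stored behind them alone. The struct and function names below are hypothetical stand-ins, and fill32() is a user-space substitute for the kernel's memset32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ring state; not the real struct amdgpu_ring. */
struct sketch_ring {
	uint32_t *ring;     /* ring dwords, followed by the extra dwords */
	uint32_t buf_mask;  /* number of ring dwords minus one */
	uint32_t nop;       /* NOP packet value for this engine */
};

/* User-space substitute for the kernel's memset32(): store v into n dwords. */
static void fill32(uint32_t *dst, uint32_t v, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i] = v;
}

/* Clear only the ring proper; the dwords past buf_mask are left untouched. */
static void sketch_clear_ring(struct sketch_ring *r)
{
	assert(r->ring != NULL);
	fill32(r->ring, r->nop, (size_t)r->buf_mask + 1);
}
```

Any pattern that ring init placed after the last ring dword survives a later clear, which is the behaviour the HW workarounds rely on.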

> 
> > And what is the issue that is being worked around?
> 
> IIRC on the VCN4 engine there is a HW bug which triggered when some
> metadata for the ring was not in the same X megabyte segment of the
> ring buffer. So we just put this small metadata structure directly
> after the ring.
> 
> On some jpeg engine you had to insert commands after the end of the
> ring to actually make the ring work reliable. See
> jpeg_v1_0_decode_ring_set_patch_ring().

Thanks!

> 
> > Also, while we are at it, how was it possible to initialize that
> > without causing VM faults? Considering that the allocated BO was
> > not
> > sufficiently large to hold the extra dwords. (That is fixed by the
> > first patch of this series.)
> 
> BOs for VRAM and GTT are always rounded to the next 4KiB. The
> workaround need something like 100 bytes or 64 DW, so that always
> worked.
> 
> BTW: I just found that vcn_v4_0_unified_ring_vm_funcs() is using
> extra_dw as bytes already :)

I can fix that and include the fix in the next version of this series.

> 


> > > 
> > > Clearing them to NOPs makes no sense at all and potentially even
> > > breaks the HW workaround.
> > > 
> > > Regards,
> > > Christian.
> > > 
> > > > 
> > > > Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword
> > > > for
> > > > jpeg ring")
> > > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > index 7670f5d82b9e..6a55a85744a9 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > @@ -474,6 +474,11 @@ static inline void
> > > > amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> > > >  	while (i <= ring->buf_mask)
> > > >  		ring->ring[i++] = ring->funcs->nop;
> > > >  
> > > > +	/* Technically not necessary, but clear the extra
> > > > dwords
> > > > too,
> > > > +	 * so that the command processors don't read
> > > > uninitialized
> > > > memory.
> > > > +	 */
> > > > +	for (i = 0; i < ring->funcs->extra_dw; i++)
> > > > +		ring->ring[ring->ring_size / 4 + i] = ring-
> > > > >funcs-
> > > > > nop;
> > > >  }
> > > >  
> > > >  static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
> > > > uint32_t v)


* Re: [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments
  2025-09-02  8:08     ` Timur Kristóf
@ 2025-09-02 11:10       ` Christian König
  2025-09-02 11:28         ` Timur Kristóf
  0 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2025-09-02 11:10 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher

On 02.09.25 10:08, Timur Kristóf wrote:
> On Tue, 2025-09-02 at 08:43 +0200, Christian König wrote:
>> On 01.09.25 12:00, Timur Kristóf wrote:
>>> To avoid confusion with dwords.
>>>
>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 122a88294883..9ffadc029ef8 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -220,7 +220,7 @@ void amdgpu_bo_placement_from_domain(struct
>>> amdgpu_bo *abo, u32 domain)
>>>   * amdgpu_bo_create_reserved - create reserved BO for kernel use
>>>   *
>>>   * @adev: amdgpu device object
>>> - * @size: size for the new BO
>>> + * @size: size for the new BO in bytes
>>
>> That is actually incorrect. The size is in arbitrary units.
>>
>> For OA and GWS it is the number of HW engines they represent, for GDS
>> it is in bytes and for VRAM and GTT it is in bytes but rounded up to
>> a page size.
>>
> 
> Can you point me to the part of the code where this is actually
> determined?

See amdgpu_bo_create():

        /* Note that GDS/GWS/OA allocates 1 page per byte/resource. */
        if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) {
                /* GWS and OA don't need any alignment. */
                page_align = bp->byte_align;
                size <<= PAGE_SHIFT;

        } else if (bp->domain & AMDGPU_GEM_DOMAIN_GDS) {
                /* Both size and alignment must be a multiple of 4. */
                page_align = ALIGN(bp->byte_align, 4);
                size = ALIGN(size, 4) << PAGE_SHIFT;
        } else {
                /* Memory should be aligned at least to a page size. */
                page_align = ALIGN(bp->byte_align, PAGE_SIZE) >> PAGE_SHIFT;
                size = ALIGN(size, PAGE_SIZE);
        }

The GDS/GWS/OA handling here is actually a hack I've been trying to remove for years. It's just that GEM/TTM assumes that all BOs are PAGE_SIZE based objects and so we have to do this tricky workaround.
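
To make the units concrete, here is a user-space sketch that mirrors only the size handling of the snippet above. It assumes 4 KiB pages; the enum values are placeholders rather than the AMDGPU_GEM_DOMAIN_* flags, and sketch_bo_size() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 4 KiB pages; the domain values are placeholders, not UAPI flags. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ull << SKETCH_PAGE_SHIFT)

enum sketch_domain { SKETCH_GTT, SKETCH_VRAM, SKETCH_GDS, SKETCH_GWS, SKETCH_OA };

/* Round v up to a multiple of a, where a must be a power of two. */
static uint64_t align_up(uint64_t v, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (v + a - 1) & ~(a - 1);
}

/* Returns the size that ends up being handed to GEM/TTM for each domain. */
static uint64_t sketch_bo_size(enum sketch_domain domain, uint64_t size)
{
	switch (domain) {
	case SKETCH_GWS:
	case SKETCH_OA:
		/* size counts resources: one page per resource */
		return size << SKETCH_PAGE_SHIFT;
	case SKETCH_GDS:
		/* size counts bytes, multiple of 4: one page per byte */
		return align_up(size, 4) << SKETCH_PAGE_SHIFT;
	default:
		/* GTT/VRAM: size counts bytes, rounded up to a page */
		return align_up(size, SKETCH_PAGE_SIZE);
	}
}
```

So a request of 3 means three pages for GWS or OA, while for GTT it means 3 bytes rounded up to a single page.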

> If it's that complicated, then I think the function could definitely
> benefit from more documentation. All I could find was that both
> ttm_resource::size and drm_gem_object::size are documented to be in
> bytes.

IIRC we have actually documented that quite extensively. I'm just not sure offhand where, @Alex?

Maybe the UAPI? Probably best to add a reference to that to the functions creating a kernel BO.

Regards,
Christian.

> 
> 
>>
>>>   * @align: alignment for the new BO
>>>   * @domain: where to place it
>>>   * @bo_ptr: used to initialize BOs in structures
>>> @@ -317,7 +317,7 @@ int amdgpu_bo_create_reserved(struct
>>> amdgpu_device *adev,
>>>   * amdgpu_bo_create_kernel - create BO for kernel use
>>>   *
>>>   * @adev: amdgpu device object
>>> - * @size: size for the new BO
>>> + * @size: size for the new BO in bytes
>>>   * @align: alignment for the new BO
>>>   * @domain: where to place it
>>>   * @bo_ptr:  used to initialize BOs in structures



* Re: [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings
  2025-09-02  8:26   ` Timur Kristóf
@ 2025-09-02 11:20     ` Christian König
  0 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2025-09-02 11:20 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher

On 02.09.25 10:26, Timur Kristóf wrote:
> On Tue, 2025-09-02 at 08:41 +0200, Christian König wrote:
>> On 01.09.25 12:00, Timur Kristóf wrote:
>>> The amdgpu_bo_create_kernel function takes a byte count,
>>> so we need to multiply the extra dword count by four.
>>> (The ring_size is already in bytes so that one is correct here.)
>>
>> Good catch, it just doesn't make a difference in practice since
>> everything is rounded up to 4k anyway.
> 
> Yes. It looks like extra_dw is only used by jpeg_v1_0 and vcn_v4_0 and
> fortunately both of these are small enough that it doesn't cause a
> difference in practice.
> 
> I'd still like to fix it though to avoid potential issues in the
> future.

Yeah, clearly agree. 

> 
>>
>> But I'm really wondering if we shouldn't replace the extra_dw with
>> extra_bytes instead.
> 
> It's up to you. I'm happy to change it to extra_bytes if you think
> that's better.

The VCN code already specifies the extra_dw in bytes.

So I think it is better to rename the field and add an "* 4" to the calculation in the jpeg code.

> I would prefer to keep extra_dw for two reasons:
> 
> - Avoid unnecessary churn
> - Most of the code related to rings are in dwords, so IMO it's better
> to have the extra space in dwords too for consistency.

That's more or less just a leftover from the early radeon days.

The GFX/SDMA/UVD engines worked with DW packets, but starting with VCE and then later VCN, PSP and MES the FW teams moved over to using C structs which are obviously counted in bytes.
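
A small sketch of the two accounting styles, assuming the field is renamed to extra_bytes as suggested above. The struct below is a hypothetical example, not the real struct amdgpu_vcn_rb_metadata, and both helper names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical firmware-style trailer; NOT the real struct amdgpu_vcn_rb_metadata. */
struct sketch_rb_metadata {
	uint32_t size;
	uint32_t version;
	uint8_t  payload[24];
};

/* DW-packet style engines count dwords, so convert before allocating. */
static size_t dw_to_bytes(size_t dw)
{
	return dw * sizeof(uint32_t);
}

/* Bytes to request for the ring BO when the extra space is tracked in bytes. */
static size_t ring_alloc_bytes(size_t ring_size_bytes, size_t extra_bytes)
{
	assert(ring_size_bytes % sizeof(uint32_t) == 0);
	return ring_size_bytes + extra_bytes;
}
```

A dword-based engine would pass dw_to_bytes(n) as extra_bytes, while a struct-based engine can pass sizeof() of its trailer directly, so neither needs a conversion at the allocation site.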

> 
>>
>> It should only be used by some multimedia engines anyway.
> 
> Right. vcn_v4_0 has this:
> .extra_dw = sizeof(struct amdgpu_vcn_rb_metadata)
> Which would also need to be fixed because it's not really in dwords.

Yeah, exactly that.

Regards,
Christian.

> 
> 
> 
>>>
>>> Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for
>>> jpeg ring")
>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 11 ++++++-----
>>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 6379bb25bf5c..13f0f0209cbe 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -364,11 +364,12 @@ int amdgpu_ring_init(struct amdgpu_device
>>> *adev, struct amdgpu_ring *ring,
>>>  
>>>  	/* Allocate ring buffer */
>>>  	if (ring->ring_obj == NULL) {
>>> -		r = amdgpu_bo_create_kernel(adev, ring->ring_size
>>> + ring->funcs->extra_dw, PAGE_SIZE,
>>> -					    AMDGPU_GEM_DOMAIN_GTT,
>>> -					    &ring->ring_obj,
>>> -					    &ring->gpu_addr,
>>> -					    (void **)&ring->ring);
>>> +		r = amdgpu_bo_create_kernel(adev, ring->ring_size
>>> + ring->funcs->extra_dw * 4,
>>> +						PAGE_SIZE,
>>> +						AMDGPU_GEM_DOMAIN_
>>> GTT,
>>> +						&ring->ring_obj,
>>> +						&ring->gpu_addr,
>>> +						(void **)&ring-
>>>> ring);
>>>  		if (r) {
>>>  			dev_err(adev->dev, "(%d) ring create
>>> failed\n", r);
>>>  			kvfree(ring->ring_backup);



* Re: [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments
  2025-09-02 11:10       ` Christian König
@ 2025-09-02 11:28         ` Timur Kristóf
  0 siblings, 0 replies; 21+ messages in thread
From: Timur Kristóf @ 2025-09-02 11:28 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: alexander.deucher

On Tue, 2025-09-02 at 13:10 +0200, Christian König wrote:
> On 02.09.25 10:08, Timur Kristóf wrote:
> > On Tue, 2025-09-02 at 08:43 +0200, Christian König wrote:
> > > On 01.09.25 12:00, Timur Kristóf wrote:
> > > > To avoid confusion with dwords.
> > > > 
> > > > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > index 122a88294883..9ffadc029ef8 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > @@ -220,7 +220,7 @@ void amdgpu_bo_placement_from_domain(struct
> > > > amdgpu_bo *abo, u32 domain)
> > > >   * amdgpu_bo_create_reserved - create reserved BO for kernel
> > > > use
> > > >   *
> > > >   * @adev: amdgpu device object
> > > > - * @size: size for the new BO
> > > > + * @size: size for the new BO in bytes
> > > 
> > > That is actually incorrect. The size is in arbitrary units.
> > > 
> > > For OA and GWS it is the number of HW engines they represent, for
> > > GDS
> > > it is in bytes and for VRAM and GTT it is in bytes but rounded up
> > > to
> > > a page size.
> > > 
> > 
> > Can you point me to the part of the code where this is actually
> > determined?
> 
> See amdgpu_bo_create():
> 
>         /* Note that GDS/GWS/OA allocates 1 page per byte/resource.
> */
>         if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS |
> AMDGPU_GEM_DOMAIN_OA)) {
>                 /* GWS and OA don't need any alignment. */
>                 page_align = bp->byte_align;
>                 size <<= PAGE_SHIFT;
> 
>         } else if (bp->domain & AMDGPU_GEM_DOMAIN_GDS) {
>                 /* Both size and alignment must be a multiple of 4.
> */
>                 page_align = ALIGN(bp->byte_align, 4);
>                 size = ALIGN(size, 4) << PAGE_SHIFT;
>         } else {
>                 /* Memory should be aligned at least to a page size.
> */
>                 page_align = ALIGN(bp->byte_align, PAGE_SIZE) >>
> PAGE_SHIFT;
>                 size = ALIGN(size, PAGE_SIZE);
>         }
> 
> The GDS/GWS/OA handling here is actually a hack I'm trying to remove
> for years. It's just that GEM/TTM assumes that all BOs are PAGE_SIZE
> based objects and so we have to do this tricky workaround.
> 
> > If it's that complicated, then I think the function could
> > definitely
> > benefit from more documentation. All I could find was that both
> > ttm_resource::size and drm_gem_object::size are documented to be in
> > bytes.
> 
> IIRC we have actually documented that quite extensively. I'm just not
> sure of hand where, @Alex?
> 
> Maybe the UAPI? Probably best to add a reference to that to the
> functions creating a kernel BO.

Thanks Christian.

If this is already documented, then I agree it's best to refer to the
pre-existing docs. I'll do that (once we figure out where it is).

> 
> Regards,
> Christian.
> 
> > 
> > 
> > > 
> > > >   * @align: alignment for the new BO
> > > >   * @domain: where to place it
> > > >   * @bo_ptr: used to initialize BOs in structures
> > > > @@ -317,7 +317,7 @@ int amdgpu_bo_create_reserved(struct
> > > > amdgpu_device *adev,
> > > >   * amdgpu_bo_create_kernel - create BO for kernel use
> > > >   *
> > > >   * @adev: amdgpu device object
> > > > - * @size: size for the new BO
> > > > + * @size: size for the new BO in bytes
> > > >   * @align: alignment for the new BO
> > > >   * @domain: where to place it
> > > >   * @bo_ptr:  used to initialize BOs in structures


* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-01 10:13   ` Tvrtko Ursulin
@ 2025-09-02 11:30     ` Timur Kristóf
  2025-09-02 11:34       ` Christian König
  2025-09-02 11:39       ` Tvrtko Ursulin
  0 siblings, 2 replies; 21+ messages in thread
From: Timur Kristóf @ 2025-09-02 11:30 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx; +Cc: alexander.deucher, christian.koenig

On Mon, 2025-09-01 at 11:13 +0100, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 01/09/2025 11:00, Timur Kristóf wrote:
> > Technically not necessary, but clear the extra dwords too,
> > so that the command processors don't read uninitialized memory.
> > 
> > Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for
> > jpeg ring")
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
> >   1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > index 7670f5d82b9e..6a55a85744a9 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > @@ -474,6 +474,11 @@ static inline void
> > amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> >   	while (i <= ring->buf_mask)
> >   		ring->ring[i++] = ring->funcs->nop;
> >   
> > +	/* Technically not necessary, but clear the extra dwords
> > too,
> > +	 * so that the command processors don't read uninitialized
> > memory.
> > +	 */
> > +	for (i = 0; i < ring->funcs->extra_dw; i++)
> > +		ring->ring[ring->ring_size / 4 + i] = ring->funcs-
> > >nop;
> 
> Should I resend this maybe?


Hi Tvrtko,

The patch you commented on is going to be dropped.

However, your patch makes good sense, so I can include it in the next
version of this series if that's OK.

@Christian - does that sound alright to you?


> 
> commit 11b0b5d942fe46bfb01f021cdb0616c8385d5ea8
> Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Date:   Thu Dec 26 16:12:37 2024 +0000
> 
>      drm/amdgpu: Use memset32 for ring clearing
> 
>      Use memset32 instead of open coding it, just because it is
>      a tiny bit nicer.
> 
>      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>      Cc: Christian König <christian.koenig@amd.com>
>      Cc: Sunil Khatri <sunil.khatri@amd.com>
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index dee5a1b4e572..96bfc0c23413 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -369,10 +369,7 @@ static inline void 
> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
> 
>   static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>   {
> -       int i = 0;
> -       while (i <= ring->buf_mask)
> -               ring->ring[i++] = ring->funcs->nop;
> -
> +       memset32(ring->ring, ring->funcs->nop, ring->buf_mask + 1);
>   }
> 
>   static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
> uint32_t v)
> 
> Looks like with two loops it would made even more sense to
> consolidate.
> 
> Regards,
> 
> Tvrtko
> 
> >   }
> >   
> >   static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
> > uint32_t v)
> 


* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-02 11:30     ` Timur Kristóf
@ 2025-09-02 11:34       ` Christian König
  2025-09-02 11:39       ` Tvrtko Ursulin
  1 sibling, 0 replies; 21+ messages in thread
From: Christian König @ 2025-09-02 11:34 UTC (permalink / raw)
  To: Timur Kristóf, Tvrtko Ursulin, amd-gfx; +Cc: alexander.deucher

On 02.09.25 13:30, Timur Kristóf wrote:
> On Mon, 2025-09-01 at 11:13 +0100, Tvrtko Ursulin wrote:
>>
>> Hi,
>>
>> On 01/09/2025 11:00, Timur Kristóf wrote:
>>> Technically not necessary, but clear the extra dwords too,
>>> so that the command processors don't read uninitialized memory.
>>>
>>> Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for
>>> jpeg ring")
>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
>>>   1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index 7670f5d82b9e..6a55a85744a9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -474,6 +474,11 @@ static inline void
>>> amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>   	while (i <= ring->buf_mask)
>>>   		ring->ring[i++] = ring->funcs->nop;
>>>   
>>> +	/* Technically not necessary, but clear the extra dwords
>>> too,
>>> +	 * so that the command processors don't read uninitialized
>>> memory.
>>> +	 */
>>> +	for (i = 0; i < ring->funcs->extra_dw; i++)
>>> +		ring->ring[ring->ring_size / 4 + i] = ring->funcs-
>>>> nop;
>>
>> Should I resend this maybe?
> 
> 
> Hi Tvrtko,
> 
> The patch you commented on is going to be dropped.
> 
> However, your patch makes good sense, so I can include it in the next
> version of this series if that's OK.
> 
> @Christian - does that sound alright to you?

Works for me.

Regards,
Christian.

> 
> 
>>
>> commit 11b0b5d942fe46bfb01f021cdb0616c8385d5ea8
>> Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Date:   Thu Dec 26 16:12:37 2024 +0000
>>
>>      drm/amdgpu: Use memset32 for ring clearing
>>
>>      Use memset32 instead of open coding it, just because it is
>>      a tiny bit nicer.
>>
>>      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>      Cc: Christian König <christian.koenig@amd.com>
>>      Cc: Sunil Khatri <sunil.khatri@amd.com>
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index dee5a1b4e572..96bfc0c23413 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -369,10 +369,7 @@ static inline void 
>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>
>>   static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>   {
>> -       int i = 0;
>> -       while (i <= ring->buf_mask)
>> -               ring->ring[i++] = ring->funcs->nop;
>> -
>> +       memset32(ring->ring, ring->funcs->nop, ring->buf_mask + 1);
>>   }
>>
>>   static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
>> uint32_t v)
>>
>> Looks like with two loops it would made even more sense to
>> consolidate.
>>
>> Regards,
>>
>> Tvrtko
>>
>>>   }
>>>   
>>>   static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
>>> uint32_t v)
>>



* Re: [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs
  2025-09-02 11:30     ` Timur Kristóf
  2025-09-02 11:34       ` Christian König
@ 2025-09-02 11:39       ` Tvrtko Ursulin
  1 sibling, 0 replies; 21+ messages in thread
From: Tvrtko Ursulin @ 2025-09-02 11:39 UTC (permalink / raw)
  To: Timur Kristóf, amd-gfx; +Cc: alexander.deucher, christian.koenig


On 02/09/2025 12:30, Timur Kristóf wrote:
> On Mon, 2025-09-01 at 11:13 +0100, Tvrtko Ursulin wrote:
>>
>> Hi,
>>
>> On 01/09/2025 11:00, Timur Kristóf wrote:
>>> Technically not necessary, but clear the extra dwords too,
>>> so that the command processors don't read uninitialized memory.
>>>
>>> Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword for
>>> jpeg ring")
>>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
>>>    1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index 7670f5d82b9e..6a55a85744a9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -474,6 +474,11 @@ static inline void
>>> amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>    	while (i <= ring->buf_mask)
>>>    		ring->ring[i++] = ring->funcs->nop;
>>>    
>>> +	/* Technically not necessary, but clear the extra dwords
>>> too,
>>> +	 * so that the command processors don't read uninitialized
>>> memory.
>>> +	 */
>>> +	for (i = 0; i < ring->funcs->extra_dw; i++)
>>> +		ring->ring[ring->ring_size / 4 + i] = ring->funcs-
>>>> nop;
>>
>> Should I resend this maybe?
> 
> 
> Hi Tvrtko,
> 
> The patch you commented on is going to be dropped.
> 
> However, your patch makes good sense, so I can include it in the next
> version of this series if that's OK.

Yes, (more than) fine by me. Thanks!

Tvrtko

> 
> @Christian - does that sound alright to you?
> 
> 
>>
>> commit 11b0b5d942fe46bfb01f021cdb0616c8385d5ea8
>> Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Date:   Thu Dec 26 16:12:37 2024 +0000
>>
>>       drm/amdgpu: Use memset32 for ring clearing
>>
>>       Use memset32 instead of open coding it, just because it is
>>       a tiny bit nicer.
>>
>>       Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>       Cc: Christian König <christian.koenig@amd.com>
>>       Cc: Sunil Khatri <sunil.khatri@amd.com>
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index dee5a1b4e572..96bfc0c23413 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -369,10 +369,7 @@ static inline void
>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>
>>    static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>    {
>> -       int i = 0;
>> -       while (i <= ring->buf_mask)
>> -               ring->ring[i++] = ring->funcs->nop;
>> -
>> +       memset32(ring->ring, ring->funcs->nop, ring->buf_mask + 1);
>>    }
>>
>>    static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
>> uint32_t v)
>>
>> Looks like with two loops it would made even more sense to
>> consolidate.
>>
>> Regards,
>>
>> Tvrtko
>>
>>>    }
>>>    
>>>    static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
>>> uint32_t v)
>>



Thread overview: 21+ messages
2025-09-01 10:00 [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Timur Kristóf
2025-09-01 10:00 ` [PATCH 2/4] drm/amdgpu: Clarify that BO size is in bytes in comments Timur Kristóf
2025-09-02  6:43   ` Christian König
2025-09-02  8:08     ` Timur Kristóf
2025-09-02 11:10       ` Christian König
2025-09-02 11:28         ` Timur Kristóf
2025-09-01 10:00 ` [PATCH 3/4] drm/amdgpu: Fill extra dwords with NOPs Timur Kristóf
2025-09-01 10:13   ` Tvrtko Ursulin
2025-09-02 11:30     ` Timur Kristóf
2025-09-02 11:34       ` Christian König
2025-09-02 11:39       ` Tvrtko Ursulin
2025-09-02  6:39   ` Christian König
2025-09-02  7:42     ` Timur Kristóf
2025-09-02  9:54       ` Christian König
2025-09-02 10:10         ` Timur Kristóf
2025-09-01 10:00 ` [PATCH 4/4] drm/amdgpu: Set SDMA v3 copy_max_bytes to 0x3fff00 Timur Kristóf
2025-09-02  6:45   ` Christian König
2025-09-02  7:52     ` Timur Kristóf
2025-09-02  6:41 ` [PATCH 1/4] drm/amdgpu: Fix allocating extra dwords for rings Christian König
2025-09-02  8:26   ` Timur Kristóf
2025-09-02 11:20     ` Christian König
