* [v4 1/3] drm/amdgpu: Fix SDMA engine reset with logical instance ID
@ 2025-06-11 10:04 Jesse Zhang
2025-06-11 10:04 ` [v4 2/3] drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations Jesse Zhang
2025-06-11 10:04 ` [v4 3/3] drm/amdgpu: Add soft reset callback to SDMA v4.4.x Jesse Zhang
0 siblings, 2 replies; 4+ messages in thread
From: Jesse Zhang @ 2025-06-11 10:04 UTC (permalink / raw)
To: amd-gfx
Cc: Alexander.Deucher, Christian Koenig, jonathan.kim, jiadong.zhu,
Jesse Zhang, Jesse Zhang
This commit makes the following improvements to SDMA engine reset handling:
1. Clarifies in the function documentation that instance_id refers to a logical ID
2. Adds conversion from logical to physical instance ID before performing the reset,
using the GET_INST(SDMA0, instance_id) macro
3. Improves error messaging to indicate when a logical instance reset fails
4. Adds better code organization with blank lines for readability
The change ensures proper SDMA engine reset by using the correct physical
instance ID while maintaining the logical ID interface for callers.
V2: Remove harvest_config check and convert directly to physical instance (Lijo)
Suggested-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index 6716ac281c49..9b54a1ece447 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -540,8 +540,10 @@ static int amdgpu_sdma_soft_reset(struct amdgpu_device *adev, u32 instance_id)
case IP_VERSION(4, 4, 2):
case IP_VERSION(4, 4, 4):
case IP_VERSION(4, 4, 5):
- /* For SDMA 4.x, use the existing DPM interface for backward compatibility */
- r = amdgpu_dpm_reset_sdma(adev, 1 << instance_id);
+ /* For SDMA 4.x, use the existing DPM interface for backward compatibility,
+ * we need to convert the logical instance ID to physical instance ID before reset.
+ */
+ r = amdgpu_dpm_reset_sdma(adev, 1 << GET_INST(SDMA0, instance_id));
break;
case IP_VERSION(5, 0, 0):
case IP_VERSION(5, 0, 1):
@@ -568,7 +570,7 @@ static int amdgpu_sdma_soft_reset(struct amdgpu_device *adev, u32 instance_id)
/**
* amdgpu_sdma_reset_engine - Reset a specific SDMA engine
* @adev: Pointer to the AMDGPU device
- * @instance_id: ID of the SDMA engine instance to reset
+ * @instance_id: Logical ID of the SDMA engine instance to reset
*
* Returns: 0 on success, or a negative error code on failure.
*/
@@ -601,7 +603,7 @@ int amdgpu_sdma_reset_engine(struct amdgpu_device *adev, uint32_t instance_id)
/* Perform the SDMA reset for the specified instance */
ret = amdgpu_sdma_soft_reset(adev, instance_id);
if (ret) {
- dev_err(adev->dev, "Failed to reset SDMA instance %u\n", instance_id);
+ dev_err(adev->dev, "Failed to reset SDMA logical instance %u\n", instance_id);
goto exit;
}
--
2.34.1
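
As a rough illustration of the logical-to-physical conversion described in this patch, the sketch below builds a one-hot reset mask from a logical instance ID. It is not driver code: the table `logical_to_phys`, the helper `get_phys_inst()` (a stand-in for the GET_INST(SDMA0, ...) macro, whose real mapping is not shown in this thread) and `reset_mask_for()` are hypothetical names, and the table contents are made up.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SDMA_INST 8 /* hypothetical table size */

/* Hypothetical mapping: physical instances 1 and 3 are assumed absent,
 * so logical IDs 0..5 map to physical IDs 0, 2, 4, 5, 6, 7.
 */
static const uint32_t logical_to_phys[MAX_SDMA_INST] = { 0, 2, 4, 5, 6, 7, 0, 0 };
static const uint32_t num_logical = 6;

/* Stand-in for GET_INST(SDMA0, id): translate a logical ID to a physical one. */
static uint32_t get_phys_inst(uint32_t logical_id)
{
	assert(logical_id < num_logical);
	return logical_to_phys[logical_id];
}

/* Build the one-hot mask that a reset interface would take for one instance. */
static uint32_t reset_mask_for(uint32_t logical_id)
{
	return 1u << get_phys_inst(logical_id);
}
```

With this made-up table, `reset_mask_for(1)` selects physical instance 2, whereas shifting by the logical ID directly (`1u << 1`) would select physical instance 1, which the table treats as absent.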
* [v4 2/3] drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations
2025-06-11 10:04 [v4 1/3] drm/amdgpu: Fix SDMA engine reset with logical instance ID Jesse Zhang
@ 2025-06-11 10:04 ` Jesse Zhang
2025-06-11 10:04 ` [v4 3/3] drm/amdgpu: Add soft reset callback to SDMA v4.4.x Jesse Zhang
1 sibling, 0 replies; 4+ messages in thread
From: Jesse Zhang @ 2025-06-11 10:04 UTC (permalink / raw)
To: amd-gfx
Cc: Alexander.Deucher, Christian Koenig, jonathan.kim, jiadong.zhu,
Jesse Zhang, Lijo Lazar, Jesse Zhang
Simplify SDMA v4_4_2 queue reset and stop operations by:
1. Removing GET_INST(SDMA0) conversion for ring->me
2. Using the logical instance ID (ring->me) directly
3. Maintaining consistent behavior with other SDMA queue operations
This change aligns with the existing queue handling logic where
ring->me already represents the correct instance identifier.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index 9c169112a5e7..3de125062ee3 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -1670,7 +1670,7 @@ static bool sdma_v4_4_2_page_ring_is_guilty(struct amdgpu_ring *ring)
static int sdma_v4_4_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
{
struct amdgpu_device *adev = ring->adev;
- u32 id = GET_INST(SDMA0, ring->me);
+ u32 id = ring->me;
int r;
if (!(adev->sdma.supported_reset & AMDGPU_RESET_TYPE_PER_QUEUE))
@@ -1686,7 +1686,7 @@ static int sdma_v4_4_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
static int sdma_v4_4_2_stop_queue(struct amdgpu_ring *ring)
{
struct amdgpu_device *adev = ring->adev;
- u32 instance_id = GET_INST(SDMA0, ring->me);
+ u32 instance_id = ring->me;
u32 inst_mask;
uint64_t rptr;
--
2.34.1
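
One plausible reading of why this patch drops GET_INST() in the callers: once the reset path converts the logical ID itself (patch 1/3), a caller that has already converted `ring->me` would cause the mapping to be applied twice. The sketch below shows that effect with a made-up table. All names (`demo_map`, `demo_to_phys`, `callee_mask` and the two `mask_*` helpers) are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_INST 4 /* hypothetical number of logical instances */

/* Hypothetical logical-to-physical mapping. */
static const uint32_t demo_map[DEMO_NUM_INST] = { 0, 2, 3, 5 };

static uint32_t demo_to_phys(uint32_t logical_id)
{
	assert(logical_id < DEMO_NUM_INST);
	return demo_map[logical_id];
}

/* Callee that expects a logical ID and converts it once. */
static uint32_t callee_mask(uint32_t logical_id)
{
	return 1u << demo_to_phys(logical_id);
}

/* Caller converts first, so the callee converts an already-physical ID. */
static uint32_t mask_double_converted(uint32_t ring_me)
{
	return callee_mask(demo_to_phys(ring_me));
}

/* Caller passes the logical ID through unchanged. */
static uint32_t mask_single_converted(uint32_t ring_me)
{
	return callee_mask(ring_me);
}
```

For `ring_me == 1` in this toy table, the single conversion targets physical instance 2, while the double conversion targets physical instance 3.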
* [v4 3/3] drm/amdgpu: Add soft reset callback to SDMA v4.4.x
2025-06-11 10:04 [v4 1/3] drm/amdgpu: Fix SDMA engine reset with logical instance ID Jesse Zhang
2025-06-11 10:04 ` [v4 2/3] drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations Jesse Zhang
@ 2025-06-11 10:04 ` Jesse Zhang
2025-06-11 14:16 ` Deucher, Alexander
1 sibling, 1 reply; 4+ messages in thread
From: Jesse Zhang @ 2025-06-11 10:04 UTC (permalink / raw)
To: amd-gfx
Cc: Alexander.Deucher, Christian Koenig, jonathan.kim, jiadong.zhu,
Lijo Lazar
From: Lijo Lazar <lijo.lazar@amd.com>
Implement the soft reset engine callback for SDMA 4.4.x IPs. This avoids the
IP version check in the generic implementation.
V2: Correct physical instance ID calculation in soft_reset_engine (Jesse)
v4: keep original comments (Lijo)
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 34 +++---------------------
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 12 +++++++++
2 files changed, 16 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index 9b54a1ece447..a1e54bcef495 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -534,37 +534,11 @@ bool amdgpu_sdma_is_shared_inv_eng(struct amdgpu_device *adev, struct amdgpu_rin
static int amdgpu_sdma_soft_reset(struct amdgpu_device *adev, u32 instance_id)
{
struct amdgpu_sdma_instance *sdma_instance = &adev->sdma.instance[instance_id];
- int r = -EOPNOTSUPP;
-
- switch (amdgpu_ip_version(adev, SDMA0_HWIP, 0)) {
- case IP_VERSION(4, 4, 2):
- case IP_VERSION(4, 4, 4):
- case IP_VERSION(4, 4, 5):
- /* For SDMA 4.x, use the existing DPM interface for backward compatibility,
- * we need to convert the logical instance ID to physical instance ID before reset.
- */
- r = amdgpu_dpm_reset_sdma(adev, 1 << GET_INST(SDMA0, instance_id));
- break;
- case IP_VERSION(5, 0, 0):
- case IP_VERSION(5, 0, 1):
- case IP_VERSION(5, 0, 2):
- case IP_VERSION(5, 0, 5):
- case IP_VERSION(5, 2, 0):
- case IP_VERSION(5, 2, 2):
- case IP_VERSION(5, 2, 4):
- case IP_VERSION(5, 2, 5):
- case IP_VERSION(5, 2, 6):
- case IP_VERSION(5, 2, 3):
- case IP_VERSION(5, 2, 1):
- case IP_VERSION(5, 2, 7):
- if (sdma_instance->funcs->soft_reset_kernel_queue)
- r = sdma_instance->funcs->soft_reset_kernel_queue(adev, instance_id);
- break;
- default:
- break;
- }
- return r;
+ if (sdma_instance->funcs->soft_reset_kernel_queue)
+ return sdma_instance->funcs->soft_reset_kernel_queue(adev, instance_id);
+
+ return -EOPNOTSUPP;
}
/**
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index 3de125062ee3..35b0a7fb37b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -109,6 +109,8 @@ static void sdma_v4_4_2_set_ras_funcs(struct amdgpu_device *adev);
static void sdma_v4_4_2_update_reset_mask(struct amdgpu_device *adev);
static int sdma_v4_4_2_stop_queue(struct amdgpu_ring *ring);
static int sdma_v4_4_2_restore_queue(struct amdgpu_ring *ring);
+static int sdma_v4_4_2_soft_reset_engine(struct amdgpu_device *adev,
+ u32 instance_id);
static u32 sdma_v4_4_2_get_reg_offset(struct amdgpu_device *adev,
u32 instance, u32 offset)
@@ -1337,6 +1339,7 @@ static bool sdma_v4_4_2_fw_support_paging_queue(struct amdgpu_device *adev)
static const struct amdgpu_sdma_funcs sdma_v4_4_2_sdma_funcs = {
.stop_kernel_queue = &sdma_v4_4_2_stop_queue,
.start_kernel_queue = &sdma_v4_4_2_restore_queue,
+ .soft_reset_kernel_queue = &sdma_v4_4_2_soft_reset_engine,
};
static int sdma_v4_4_2_early_init(struct amdgpu_ip_block *ip_block)
@@ -1745,6 +1748,15 @@ static int sdma_v4_4_2_restore_queue(struct amdgpu_ring *ring)
return sdma_v4_4_2_inst_start(adev, inst_mask, true);
}
+static int sdma_v4_4_2_soft_reset_engine(struct amdgpu_device *adev,
+ u32 instance_id)
+{
+ /* For SDMA 4.x, use the existing DPM interface for backward compatibility
+ * we need to convert the logical instance ID to physical instance ID before reset.
+ */
+ return amdgpu_dpm_reset_sdma(adev, 1 << GET_INST(SDMA0, instance_id));
+}
+
static int sdma_v4_4_2_set_trap_irq_state(struct amdgpu_device *adev,
struct amdgpu_irq_src *source,
unsigned type,
--
2.34.1
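
The dispatch pattern this patch moves to can be sketched in isolation: the generic code calls an optional per-IP callback and reports -EOPNOTSUPP when none is registered, instead of switching on the IP version. The types and functions below (`demo_sdma_funcs`, `demo_sdma_instance`, `demo_soft_reset`, `demo_v4_soft_reset`) are hypothetical simplifications, not the driver's actual structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-IP function table; the callback may be NULL. */
struct demo_sdma_funcs {
	int (*soft_reset)(uint32_t instance_id);
};

struct demo_sdma_instance {
	const struct demo_sdma_funcs *funcs;
};

/* Hypothetical IP-specific implementation that always succeeds. */
static int demo_v4_soft_reset(uint32_t instance_id)
{
	(void)instance_id;
	return 0;
}

static const struct demo_sdma_funcs demo_v4_funcs = {
	.soft_reset = demo_v4_soft_reset,
};

/* An IP that registers no soft reset callback. */
static const struct demo_sdma_funcs demo_no_reset_funcs = {
	.soft_reset = NULL,
};

/* Generic entry point: no version switch, only a callback check. */
static int demo_soft_reset(const struct demo_sdma_instance *inst,
			   uint32_t instance_id)
{
	assert(inst != NULL);
	if (inst->funcs && inst->funcs->soft_reset)
		return inst->funcs->soft_reset(instance_id);

	return -EOPNOTSUPP;
}
```

Under this arrangement, supporting a new IP version means filling in its function table rather than extending a case list in the generic file.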
* RE: [v4 3/3] drm/amdgpu: Add soft reset callback to SDMA v4.4.x
2025-06-11 10:04 ` [v4 3/3] drm/amdgpu: Add soft reset callback to SDMA v4.4.x Jesse Zhang
@ 2025-06-11 14:16 ` Deucher, Alexander
0 siblings, 0 replies; 4+ messages in thread
From: Deucher, Alexander @ 2025-06-11 14:16 UTC (permalink / raw)
To: Zhang, Jesse(Jie), amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian, Kim, Jonathan, Zhu, Jiadong, Lazar, Lijo
[Public]
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Jesse
> Zhang
> Sent: Wednesday, June 11, 2025 6:05 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>; Kim, Jonathan <Jonathan.Kim@amd.com>; Zhu,
> Jiadong <Jiadong.Zhu@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>
> Subject: [v4 3/3] drm/amdgpu: Add soft reset callback to SDMA v4.4.x
>
> From: Lijo Lazar <lijo.lazar@amd.com>
>
> Implement soft reset engine callback for SDMA 4.4.x IPs. This avoids IP version
> check in generic implementation.
>
> V2: Correct physical instance ID calculation in soft_reset_engine (Jesse)
> v4: keep origin comments (Lijo)
>
> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Series is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 34 +++---------------------
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 12 +++++++++
> 2 files changed, 16 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> index 9b54a1ece447..a1e54bcef495 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> @@ -534,37 +534,11 @@ bool amdgpu_sdma_is_shared_inv_eng(struct amdgpu_device *adev, struct amdgpu_rin
> static int amdgpu_sdma_soft_reset(struct amdgpu_device *adev, u32 instance_id)
> {
> struct amdgpu_sdma_instance *sdma_instance = &adev->sdma.instance[instance_id];
> - int r = -EOPNOTSUPP;
> -
> - switch (amdgpu_ip_version(adev, SDMA0_HWIP, 0)) {
> - case IP_VERSION(4, 4, 2):
> - case IP_VERSION(4, 4, 4):
> - case IP_VERSION(4, 4, 5):
> - /* For SDMA 4.x, use the existing DPM interface for backward compatibility,
> - * we need to convert the logical instance ID to physical instance ID before reset.
> - */
> - r = amdgpu_dpm_reset_sdma(adev, 1 << GET_INST(SDMA0, instance_id));
> - break;
> - case IP_VERSION(5, 0, 0):
> - case IP_VERSION(5, 0, 1):
> - case IP_VERSION(5, 0, 2):
> - case IP_VERSION(5, 0, 5):
> - case IP_VERSION(5, 2, 0):
> - case IP_VERSION(5, 2, 2):
> - case IP_VERSION(5, 2, 4):
> - case IP_VERSION(5, 2, 5):
> - case IP_VERSION(5, 2, 6):
> - case IP_VERSION(5, 2, 3):
> - case IP_VERSION(5, 2, 1):
> - case IP_VERSION(5, 2, 7):
> - if (sdma_instance->funcs->soft_reset_kernel_queue)
> - r = sdma_instance->funcs->soft_reset_kernel_queue(adev, instance_id);
> - break;
> - default:
> - break;
> - }
>
> - return r;
> + if (sdma_instance->funcs->soft_reset_kernel_queue)
> + return sdma_instance->funcs->soft_reset_kernel_queue(adev, instance_id);
> +
> + return -EOPNOTSUPP;
> }
>
> /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> index 3de125062ee3..35b0a7fb37b9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> @@ -109,6 +109,8 @@ static void sdma_v4_4_2_set_ras_funcs(struct amdgpu_device *adev);
> static void sdma_v4_4_2_update_reset_mask(struct amdgpu_device *adev);
> static int sdma_v4_4_2_stop_queue(struct amdgpu_ring *ring);
> static int sdma_v4_4_2_restore_queue(struct amdgpu_ring *ring);
> +static int sdma_v4_4_2_soft_reset_engine(struct amdgpu_device *adev,
> + u32 instance_id);
>
> static u32 sdma_v4_4_2_get_reg_offset(struct amdgpu_device *adev,
> u32 instance, u32 offset)
> @@ -1337,6 +1339,7 @@ static bool sdma_v4_4_2_fw_support_paging_queue(struct amdgpu_device *adev)
> static const struct amdgpu_sdma_funcs sdma_v4_4_2_sdma_funcs = {
> .stop_kernel_queue = &sdma_v4_4_2_stop_queue,
> .start_kernel_queue = &sdma_v4_4_2_restore_queue,
> + .soft_reset_kernel_queue = &sdma_v4_4_2_soft_reset_engine,
> };
>
> static int sdma_v4_4_2_early_init(struct amdgpu_ip_block *ip_block)
> @@ -1745,6 +1748,15 @@ static int sdma_v4_4_2_restore_queue(struct amdgpu_ring *ring)
> return sdma_v4_4_2_inst_start(adev, inst_mask, true);
> }
>
> +static int sdma_v4_4_2_soft_reset_engine(struct amdgpu_device *adev,
> + u32 instance_id)
> +{
> + /* For SDMA 4.x, use the existing DPM interface for backward compatibility
> + * we need to convert the logical instance ID to physical instance ID before reset.
> + */
> + return amdgpu_dpm_reset_sdma(adev, 1 << GET_INST(SDMA0, instance_id));
> +}
> +
> static int sdma_v4_4_2_set_trap_irq_state(struct amdgpu_device *adev,
> struct amdgpu_irq_src *source,
> unsigned type,
> --
> 2.34.1