amd-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2 V2] drm/amdgpu: Fix SDMA engine reset with logical instance ID
@ 2025-06-11  7:16 Jesse Zhang
  2025-06-11  7:16 ` [PATCH 2/2 V2] drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations Jesse Zhang
  0 siblings, 1 reply; 2+ messages in thread
From: Jesse Zhang @ 2025-06-11  7:16 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alexander.Deucher, Christian Koenig, jonathan.kim, jiadong.zhu,
	Lijo Lazar, Jesse Zhang, Jesse Zhang

This commit makes the following improvements to SDMA engine reset handling:

1. Clarifies in the function documentation that instance_id refers to a logical ID
2. Adds conversion from logical to physical instance ID before performing reset
   using GET_INST(SDMA0, instance_id) macro
3. Improves error messaging to indicate when a logical instance reset fails
4. Adds better code organization with blank lines for readability

The change ensures proper SDMA engine reset by using the correct physical
instance ID while maintaining the logical ID interface for callers.

V2: Remove harvest_config check and convert directly to physical instance (Lijo)

Suggested-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index 6716ac281c49..59085e208044 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -568,7 +568,7 @@ static int amdgpu_sdma_soft_reset(struct amdgpu_device *adev, u32 instance_id)
 /**
  * amdgpu_sdma_reset_engine - Reset a specific SDMA engine
  * @adev: Pointer to the AMDGPU device
- * @instance_id: ID of the SDMA engine instance to reset
+ * @instance_id: Logical ID of the SDMA engine instance to reset
  *
  * Returns: 0 on success, or a negative error code on failure.
  */
@@ -581,6 +581,7 @@ int amdgpu_sdma_reset_engine(struct amdgpu_device *adev, uint32_t instance_id)
 	bool gfx_sched_stopped = false, page_sched_stopped = false;
 
 	mutex_lock(&sdma_instance->engine_reset_mutex);
+
 	/* Stop the scheduler's work queue for the GFX and page rings if they are running.
 	* This ensures that no new tasks are submitted to the queues while
 	* the reset is in progress.
@@ -598,10 +599,12 @@ int amdgpu_sdma_reset_engine(struct amdgpu_device *adev, uint32_t instance_id)
 	if (sdma_instance->funcs->stop_kernel_queue)
 		sdma_instance->funcs->stop_kernel_queue(gfx_ring);
 
-	/* Perform the SDMA reset for the specified instance */
-	ret = amdgpu_sdma_soft_reset(adev, instance_id);
+	/* we need to convert the logical instance ID to physical instance ID before reset.
+	 * then perform the SDMA reset for the specified physical instance.
+	 */
+	ret = amdgpu_sdma_soft_reset(adev, GET_INST(SDMA0, instance_id));
 	if (ret) {
-		dev_err(adev->dev, "Failed to reset SDMA instance %u\n", instance_id);
+		dev_err(adev->dev, "Failed to reset SDMA logical instance %u\n", instance_id);
 		goto exit;
 	}
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [PATCH 2/2 V2] drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations
  2025-06-11  7:16 [PATCH 1/2 V2] drm/amdgpu: Fix SDMA engine reset with logical instance ID Jesse Zhang
@ 2025-06-11  7:16 ` Jesse Zhang
  0 siblings, 0 replies; 2+ messages in thread
From: Jesse Zhang @ 2025-06-11  7:16 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alexander.Deucher, Christian Koenig, jonathan.kim, jiadong.zhu,
	Lijo Lazar, Jesse Zhang, Jesse Zhang

Simplify SDMA v4_4_2 queue reset and stop operations by:
1. Removing GET_INST(SDMA0) conversion for ring->me
2. Using the logical instance ID (ring->me) directly
3. Maintaining consistent behavior with other SDMA queue operations

This change aligns with the existing queue handling logic where
ring->me already represents the correct instance identifier.

Signed-off-by:  Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index 9c169112a5e7..3de125062ee3 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -1670,7 +1670,7 @@ static bool sdma_v4_4_2_page_ring_is_guilty(struct amdgpu_ring *ring)
 static int sdma_v4_4_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
 {
 	struct amdgpu_device *adev = ring->adev;
-	u32 id = GET_INST(SDMA0, ring->me);
+	u32 id = ring->me;
 	int r;
 
 	if (!(adev->sdma.supported_reset & AMDGPU_RESET_TYPE_PER_QUEUE))
@@ -1686,7 +1686,7 @@ static int sdma_v4_4_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
 static int sdma_v4_4_2_stop_queue(struct amdgpu_ring *ring)
 {
 	struct amdgpu_device *adev = ring->adev;
-	u32 instance_id = GET_INST(SDMA0, ring->me);
+	u32 instance_id = ring->me;
 	u32 inst_mask;
 	uint64_t rptr;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2025-06-11  7:17 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-11  7:16 [PATCH 1/2 V2] drm/amdgpu: Fix SDMA engine reset with logical instance ID Jesse Zhang
2025-06-11  7:16 ` [PATCH 2/2 V2] drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations Jesse Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).