Igt-dev Archive on lore.kernel.org
* [PATCH i-g-t] lib/amdgpu: Handle -ENODATA in amdgpu_wait_memory
@ 2025-01-15  7:03 Jesse.zhang@amd.com
  2025-01-15  7:30 ` ✗ GitLab.Pipeline: warning for " Patchwork
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Jesse.zhang@amd.com @ 2025-01-15  7:03 UTC
  To: igt-dev; +Cc: Vitaly Prosyak, Alex Deucher, Christian Koenig,
	Jesse.zhang@amd.com

The amdgpu_wait_memory function currently asserts when the return value
is neither zero nor -ECANCELED. However, -ENODATA is also a valid
error code that can be returned during GPU job timeout recovery,
particularly for queue resets. Update both checks in the function to
also accept -ENODATA as a non-fatal error condition.

This change aligns with recent updates in the AMDGPU kernel driver
where -ENODATA is used to indicate queue-specific resets during
timeout recovery, while -ECANCELED or -ETIME is used for full GPU
resets. For more details, see the kernel discussion:
https://lists.freedesktop.org/archives/amd-gfx/2025-January/118795.html
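
For illustration only, the acceptance rule described above can be sketched
as a small predicate. The helper name wait_result_is_nonfatal is
hypothetical and does not exist in IGT; the patch itself open-codes the
comparison at both call sites. Note that -ETIME is mentioned above as a
full-reset code but is not accepted by this patch, so the sketch treats it
as fatal.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of IGT): returns true when a return
 * value from the submit loop or the fence query should NOT trip the
 * assertion once this patch is applied.
 */
static bool wait_result_is_nonfatal(int r)
{
	switch (r) {
	case 0:			/* success */
	case -ECANCELED:	/* full GPU reset, already tolerated */
	case -ENODATA:		/* queue-specific reset, newly tolerated */
		return true;
	default:		/* anything else, including -ETIME, still asserts */
		return false;
	}
}
```

With such a helper, each check would reduce to a single
igt_assert(wait_result_is_nonfatal(r)) call; whether that is preferable to
the open-coded form is a style choice left to the maintainers.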

Cc: Vitaly Prosyak <vitaly.prosyak@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alexander Deucher <alexander.deucher@amd.com>

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
---
 lib/amdgpu/amd_deadlock_helpers.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/amdgpu/amd_deadlock_helpers.c b/lib/amdgpu/amd_deadlock_helpers.c
index 8ac6abf8f..f274a6365 100644
--- a/lib/amdgpu/amd_deadlock_helpers.c
+++ b/lib/amdgpu/amd_deadlock_helpers.c
@@ -142,7 +142,7 @@ amdgpu_wait_memory(amdgpu_device_handle device_handle, unsigned int ip_type, uin
 		job_count++;
 	} while (r == 0 && job_count < MAX_JOB_COUNT);
 
-	if (r != 0 && r != -ECANCELED)
+	if (r != 0 && r != -ECANCELED && r != -ENODATA)
 		igt_assert(0);
 
 
@@ -156,7 +156,7 @@ amdgpu_wait_memory(amdgpu_device_handle device_handle, unsigned int ip_type, uin
 
 	r = amdgpu_cs_query_fence_status(&fence_status, AMDGPU_TIMEOUT_INFINITE, 0,
 			&expired);
-	if (r != 0 && r != -ECANCELED)
+	if (r != 0 && r != -ECANCELED && r != -ENODATA)
 		igt_assert(0);
 
 	/* send signal to modify the memory we wait for */
-- 
2.25.1




Thread overview: 7+ messages
2025-01-15  7:03 [PATCH i-g-t] lib/amdgpu: Handle -ENODATA in amdgpu_wait_memory Jesse.zhang@amd.com
2025-01-15  7:30 ` ✗ GitLab.Pipeline: warning for " Patchwork
2025-01-15  7:54 ` ✓ Xe.CI.BAT: success " Patchwork
2025-01-15  7:56 ` ✓ i915.CI.BAT: " Patchwork
2025-01-15 14:50 ` ✗ Xe.CI.Full: failure " Patchwork
2025-01-15 20:50 ` ✗ i915.CI.Full: " Patchwork
2025-01-15 21:05 ` [PATCH i-g-t] " vitaly prosyak
