* [PATCH] drm/amdkfd: Skip locking KFD when unbinding GPU
@ 2023-11-06  7:14 Lawrence Yiu
  2023-11-06 23:10 ` Felix Kuehling
  0 siblings, 1 reply; 4+ messages in thread
From: Lawrence Yiu @ 2023-11-06  7:14 UTC
  To: amd-gfx, Felix.Kuehling
  Cc: alexander.deucher, Xinhui.Pan, christian.koenig, Lawrence Yiu

After unbinding a GPU, KFD becomes locked and unusable: applications can
no longer use ROCm for compute, and rocminfo prints the following error
message:

ROCk module is loaded
Unable to open /dev/kfd read-write: Invalid argument

KFD remains locked even after the same GPU is rebound, and a system
reboot is required to unlock it. Fix this by not locking KFD during the
GPU unbind process.

Closes: https://github.com/RadeonOpenCompute/ROCm/issues/629
Signed-off-by: Lawrence Yiu <lawyiu.dev@gmail.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 0a9cf9dfc224..c9436039e619 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -949,8 +949,8 @@ void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm)
 	if (!kfd->init_complete)
 		return;
 
-	/* for runtime suspend, skip locking kfd */
-	if (!run_pm) {
+	/* for runtime suspend or GPU unbind, skip locking kfd */
+	if (!run_pm && !drm_dev_is_unplugged(adev_to_drm(kfd->adev))) {
 		mutex_lock(&kfd_processes_mutex);
 		count = ++kfd_locked;
 		mutex_unlock(&kfd_processes_mutex);
-- 
2.34.1
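
For readers following the thread, the counter behaviour that the patch
changes can be sketched in user space. This is a toy model, not driver
code: only kfd_locked and run_pm come from the patch above. The
model_* and locked_after_* helpers, the unplugged and skip_on_unplug
parameters, and the assumption that a resume path decrements the counter
while an unbind is never followed by a resume are all illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the driver's global lock counter (no mutex here). */
static int kfd_locked;

/*
 * Hypothetical model of the suspend path.
 * unplugged      stands in for drm_dev_is_unplugged() in the patch.
 * skip_on_unplug selects the patched condition (true) or the old one.
 */
static void model_suspend(bool run_pm, bool unplugged, bool skip_on_unplug)
{
	if (!run_pm && !(skip_on_unplug && unplugged))
		++kfd_locked;
}

/* Assumed counterpart: a non-runtime resume drops the lock count again. */
static void model_resume(bool run_pm)
{
	if (!run_pm) {
		--kfd_locked;
		assert(kfd_locked >= 0);
	}
}

/* Unbind: suspend is called on an unplugged device, no resume follows. */
static int locked_after_unbind(bool skip_on_unplug)
{
	kfd_locked = 0;
	model_suspend(false, true, skip_on_unplug);
	return kfd_locked;
}

/* Ordinary system suspend followed by resume on a present device. */
static int locked_after_suspend_resume(bool skip_on_unplug)
{
	kfd_locked = 0;
	model_suspend(false, false, skip_on_unplug);
	model_resume(false);
	return kfd_locked;
}
```

Under these assumptions, locked_after_unbind(false) leaves the counter
at 1 (the stuck state described in the commit message), while
locked_after_unbind(true) leaves it at 0. The suspend/resume pair is
balanced in both variants, so the extra check only affects the unbind
case.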


Thread overview: 4+ messages
2023-11-06  7:14 [PATCH] drm/amdkfd: Skip locking KFD when unbinding GPU Lawrence Yiu
2023-11-06 23:10 ` Felix Kuehling
2023-11-07 22:03   ` Alex Deucher
2023-11-07 22:16     ` Felix Kuehling
