All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference
@ 2023-11-07 16:58 Felix Kuehling
  2023-11-07 16:58 ` [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs Felix Kuehling
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 16:58 UTC (permalink / raw)
  To: amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

mem = bo->tbo.resource may be NULL in amdgpu_vm_bo_update.

Fixes: 180253782038 ("drm/ttm: stop allocating dummy resources during BO creation")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7b2a7c9156f0..1442d97ddd0f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1015,8 +1015,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 				bo = gem_to_amdgpu_bo(gobj);
 		}
 		mem = bo->tbo.resource;
-		if (mem->mem_type == TTM_PL_TT ||
-		    mem->mem_type == AMDGPU_PL_PREEMPT)
+		if (mem && (mem->mem_type == TTM_PL_TT ||
+			    mem->mem_type == AMDGPU_PL_PREEMPT))
 			pages_addr = bo->tbo.ttm->dma_address;
 	}
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs
  2023-11-07 16:58 [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Felix Kuehling
@ 2023-11-07 16:58 ` Felix Kuehling
  2023-11-07 22:11   ` Felix Kuehling
  2023-11-07 16:58 ` [PATCH 3/6] drm/amdgpu: Auto-validate DMABuf imports in compute VMs Felix Kuehling
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 16:58 UTC (permalink / raw)
  To: amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

Create a new VM state to track user BOs that are in the system domain.
In the next patch this will be used do conditionally re-validate them in
amdgpu_vm_handle_moved.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 1442d97ddd0f..0d685577243c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -232,6 +232,22 @@ static void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
 	spin_unlock(&vm_bo->vm->status_lock);
 }
 
+/**
+ * amdgpu_vm_bo_evicted_user - vm_bo is evicted
+ *
+ * @vm_bo: vm_bo which is evicted
+ *
+ * State for BOs used by user mode queues which are not at the location they
+ * should be.
+ */
+static void amdgpu_vm_bo_evicted_user(struct amdgpu_vm_bo_base *vm_bo)
+{
+	vm_bo->moved = true;
+	spin_lock(&vm_bo->vm->status_lock);
+	list_move(&vm_bo->vm_status, &vm_bo->vm->evicted_user);
+	spin_unlock(&vm_bo->vm->status_lock);
+}
+
 /**
  * amdgpu_vm_bo_relocated - vm_bo is reloacted
  *
@@ -2110,6 +2126,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, int32_t xcp
 	for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
 		vm->reserved_vmid[i] = NULL;
 	INIT_LIST_HEAD(&vm->evicted);
+	INIT_LIST_HEAD(&vm->evicted_user);
 	INIT_LIST_HEAD(&vm->relocated);
 	INIT_LIST_HEAD(&vm->moved);
 	INIT_LIST_HEAD(&vm->idle);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 12cac06fa289..939d0c2219c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -286,9 +286,12 @@ struct amdgpu_vm {
 	/* Lock to protect vm_bo add/del/move on all lists of vm */
 	spinlock_t		status_lock;
 
-	/* BOs who needs a validation */
+	/* Per VM and PT BOs who needs a validation */
 	struct list_head	evicted;
 
+	/* BOs for user mode queues that need a validation */
+	struct list_head	evicted_user;
+
 	/* PT BOs which relocated and their parent need an update */
 	struct list_head	relocated;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/6] drm/amdgpu: Auto-validate DMABuf imports in compute VMs
  2023-11-07 16:58 [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Felix Kuehling
  2023-11-07 16:58 ` [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs Felix Kuehling
@ 2023-11-07 16:58 ` Felix Kuehling
  2023-11-07 20:24   ` Joshi, Mukul
  2023-11-07 16:58 ` [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles Felix Kuehling
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 16:58 UTC (permalink / raw)
  To: amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the
process_info->kfd_bo_list. There is no explicit KFD API call to validate
them or add eviction fences to them.

This patch automatically validates and fences dymanic DMABuf imports when
they are added to a compute VM. Revalidation after evictions is handled
in the VM code.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |   3 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  15 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  26 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 117 +++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   6 +-
 7 files changed, 164 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index fcf8a98ad15e..68d534a89942 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -178,6 +178,9 @@ int amdgpu_queue_mask_bit_to_set_resource_bit(struct amdgpu_device *adev,
 struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 context,
 				struct mm_struct *mm,
 				struct svm_range_bo *svm_bo);
+int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
+					uint32_t domain,
+					struct dma_fence *fence);
 #if defined(CONFIG_DEBUG_FS)
 int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void *data);
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 2e302956a279..0c1cb6048259 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -423,9 +423,9 @@ static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 	return ret;
 }
 
-static int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
-					       uint32_t domain,
-					       struct dma_fence *fence)
+int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
+					uint32_t domain,
+					struct dma_fence *fence)
 {
 	int ret = amdgpu_bo_reserve(bo, false);
 
@@ -2948,7 +2948,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
 		struct amdgpu_device *adev = amdgpu_ttm_adev(
 			peer_vm->root.bo->tbo.bdev);
 
-		ret = amdgpu_vm_handle_moved(adev, peer_vm, &ctx.ticket);
+		ret = amdgpu_vm_handle_moved(adev, peer_vm, &ctx.ticket, true);
 		if (ret) {
 			pr_debug("Memory eviction: handle moved failed. Try again\n");
 			goto validate_map_fail;
@@ -3001,7 +3001,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
 				   &process_info->eviction_fence->base,
 				   DMA_RESV_USAGE_BOOKKEEP);
 	}
-	/* Attach eviction fence to PD / PT BOs */
+	/* Attach eviction fence to PD / PT BOs and DMABuf imports */
 	list_for_each_entry(peer_vm, &process_info->vm_list_head,
 			    vm_list_node) {
 		struct amdgpu_bo *bo = peer_vm->root.bo;
@@ -3009,6 +3009,11 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
 		dma_resv_add_fence(bo->tbo.base.resv,
 				   &process_info->eviction_fence->base,
 				   DMA_RESV_USAGE_BOOKKEEP);
+
+		ret = amdgpu_vm_fence_imports(peer_vm, &ctx.ticket,
+					      &process_info->eviction_fence->base);
+		if (ret)
+			break;
 	}
 
 validate_map_fail:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index aafedb344c1b..20f4be8cd635 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1165,7 +1165,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 			return r;
 	}
 
-	r = amdgpu_vm_handle_moved(adev, vm, &p->ticket);
+	r = amdgpu_vm_handle_moved(adev, vm, &p->ticket, false);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index e7e87a3b2601..234244704f27 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -373,6 +373,10 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
 	struct amdgpu_vm_bo_base *bo_base;
 	int r;
 
+	/* FIXME: This should be after the "if", but needs a fix to make sure
+	 * DMABuf imports are initialized in the right VM list.
+	 */
+	amdgpu_vm_bo_invalidate(adev, bo, false);
 	if (!bo->tbo.resource || bo->tbo.resource->mem_type == TTM_PL_SYSTEM)
 		return;
 
@@ -409,7 +413,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
 		if (!r)
 			r = amdgpu_vm_clear_freed(adev, vm, NULL);
 		if (!r)
-			r = amdgpu_vm_handle_moved(adev, vm, ticket);
+			r = amdgpu_vm_handle_moved(adev, vm, ticket, false);
 
 		if (r && r != -EBUSY)
 			DRM_ERROR("Failed to invalidate VM page tables (%d))\n",
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 849fffbb367d..755cc3c559f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -186,6 +186,32 @@ static int amdgpu_gem_object_open(struct drm_gem_object *obj,
 	else
 		++bo_va->ref_count;
 	amdgpu_bo_unreserve(abo);
+
+	/* Validate and add eviction fence to DMABuf imports with dymanic
+	 * attachment in compute VMs. Re-validation will be done by
+	 * amdgpu_vm_handle_moved and the fence will be updated by
+	 * amdgpu_vm_fence_imports in amdgpu_amdkfd_gpuvm_restore_process_bos.
+	 *
+	 * Nested locking below for the case that a GEM object is opened in
+	 * kfd_mem_export_dmabuf. Since the lock below is only taken for imports,
+	 * but not for export, this is a different lock class that cannot lead to
+	 * circular lock dependencies.
+	 */
+	if (!vm->is_compute_context || !vm->process_info)
+		return 0;
+	if (!obj->import_attach ||
+	    !dma_buf_is_dynamic(obj->import_attach->dmabuf))
+		return 0;
+	mutex_lock_nested(&vm->process_info->lock, 1);
+	if (!WARN_ON(!vm->process_info->eviction_fence)) {
+		r = amdgpu_amdkfd_bo_validate_and_fence(abo, AMDGPU_GEM_DOMAIN_GTT,
+							&vm->process_info->eviction_fence->base);
+		if (r)
+			dev_warn(adev->dev, "%d: validate_and_fence failed: %d\n",
+				 vm->task_info.pid, r);
+	}
+	mutex_unlock(&vm->process_info->lock);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0d685577243c..b2c7449ab561 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1307,6 +1307,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
  * @adev: amdgpu_device pointer
  * @vm: requested vm
  * @ticket: optional reservation ticket used to reserve the VM
+ * @valitate: whether to auto-validate invalid DMABuf imports
  *
  * Make sure all BOs which are moved are updated in the PTs.
  *
@@ -1317,7 +1318,8 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
  */
 int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
 			   struct amdgpu_vm *vm,
-			   struct ww_acquire_ctx *ticket)
+			   struct ww_acquire_ctx *ticket,
+			   bool validate)
 {
 	struct amdgpu_bo_va *bo_va;
 	struct dma_resv *resv;
@@ -1337,6 +1339,12 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
 		spin_lock(&vm->status_lock);
 	}
 
+	/* If we're validating user BOs, splice all evicted user BOs into
+	 * the list of invalid BOs for revalidation
+	 */
+	if (validate)
+		list_splice_init(&vm->evicted_user, &vm->invalidated);
+
 	while (!list_empty(&vm->invalidated)) {
 		bo_va = list_first_entry(&vm->invalidated, struct amdgpu_bo_va,
 					 base.vm_status);
@@ -1357,17 +1365,120 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
 			unlock = false;
 		}
 
+		/* Automatically validate DMABuf imports in compute VMs, if we
+		 * have a reservation, or remember them for later validation.
+		 */
+		if (vm->is_compute_context && bo_va->base.bo &&
+		    bo_va->base.bo->tbo.base.import_attach &&
+		    (!bo_va->base.bo->tbo.resource ||
+		     bo_va->base.bo->tbo.resource->mem_type == TTM_PL_SYSTEM)) {
+			struct ttm_operation_ctx ctx = { true, false };
+			struct amdgpu_bo *bo = bo_va->base.bo;
+
+			if (!validate) {
+				r = amdgpu_vm_bo_update(adev, bo_va, clear);
+				if (!r)
+					amdgpu_vm_bo_evicted_user(&bo_va->base);
+				goto unlock;
+			}
+
+			if (clear) {
+				pr_warn_ratelimited("Invalid DMABuf import is busy in pid %d\n", vm->task_info.pid);
+				r = -EBUSY;
+				goto unlock;
+			}
+
+			amdgpu_bo_placement_from_domain(bo,
+							bo->preferred_domains);
+			r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
+			if (r)
+				goto unlock;
+			r = amdgpu_bo_sync_wait(bo, AMDGPU_FENCE_OWNER_KFD,
+						true);
+			if (r)
+				goto unlock;
+		}
+
 		r = amdgpu_vm_bo_update(adev, bo_va, clear);
+unlock:
+		if (unlock)
+			dma_resv_unlock(resv);
 		if (r)
 			return r;
+		spin_lock(&vm->status_lock);
+	}
+	spin_unlock(&vm->status_lock);
+
+	return 0;
+}
+
+/**
+ * amdgpu_vm_fence_imports - add fence to valid DMABuf imports
+ *
+ * @vm: requested vm
+ * @ticket: optional reservation ticket used to reserve the VM
+ * @fence: fence to add
+ *
+ * Add the specified fence to all dymanic DMABuf imports that are valid.
+ *
+ * Returns:
+ * 0 for success.
+ */
+int amdgpu_vm_fence_imports(struct amdgpu_vm *vm,
+			    struct ww_acquire_ctx *ticket,
+			    struct dma_fence *fence)
+{
+	struct amdgpu_bo_va *bo_va, *tmp;
+	struct dma_resv *resv;
+	LIST_HEAD(imports);
+	bool unlock;
+	int ret = 0;
+
+	if (!vm->is_compute_context)
+		return 0;
+
+	/* Move all the DMABuf imports to a private list so I can reserve
+	 * them while not holding te status_lock.
+	 */
+	spin_lock(&vm->status_lock);
+	list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) {
+		if (bo_va->base.bo && bo_va->base.bo->tbo.base.import_attach &&
+		    dma_buf_is_dynamic(bo_va->base.bo->tbo.base.import_attach->dmabuf))
+			list_move(&bo_va->base.vm_status, &imports);
+	}
+	spin_unlock(&vm->status_lock);
+
+	list_for_each_entry(bo_va, &imports, base.vm_status) {
+		resv = bo_va->base.bo->tbo.base.resv;
+
+		/* Try to reserve the BO */
+		if (dma_resv_trylock(resv)) {
+			unlock = true;
+		/* The caller is already holding the reservation lock */
+		} else if (ticket && dma_resv_locking_ctx(resv) == ticket) {
+			unlock = false;
+		} else {
+			WARN_ONCE(1, "Failed to reserve DMABuf import");
+			ret = -EBUSY;
+			break;
+		}
+
+		ret = dma_resv_reserve_fences(resv, 1);
+		if (!ret)
+			dma_resv_add_fence(resv, fence,
+					   DMA_RESV_USAGE_BOOKKEEP);
 
 		if (unlock)
 			dma_resv_unlock(resv);
-		spin_lock(&vm->status_lock);
+		if (ret)
+			break;
 	}
+
+	spin_lock(&vm->status_lock);
+	list_splice(&imports, &vm->idle);
 	spin_unlock(&vm->status_lock);
 
-	return 0;
+	return ret;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 939d0c2219c0..2db04b8fef97 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -443,7 +443,11 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			  struct dma_fence **fence);
 int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
 			   struct amdgpu_vm *vm,
-			   struct ww_acquire_ctx *ticket);
+			   struct ww_acquire_ctx *ticket,
+			   bool validate);
+int amdgpu_vm_fence_imports(struct amdgpu_vm *vm,
+			    struct ww_acquire_ctx *ticket,
+			    struct dma_fence *fence);
 int amdgpu_vm_flush_compute_tlb(struct amdgpu_device *adev,
 				struct amdgpu_vm *vm,
 				uint32_t flush_type,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles
  2023-11-07 16:58 [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Felix Kuehling
  2023-11-07 16:58 ` [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs Felix Kuehling
  2023-11-07 16:58 ` [PATCH 3/6] drm/amdgpu: Auto-validate DMABuf imports in compute VMs Felix Kuehling
@ 2023-11-07 16:58 ` Felix Kuehling
  2023-11-07 19:44   ` Errabolu, Ramesh
  2023-11-16 21:53   ` Felix Kuehling
  2023-11-07 16:58 ` [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM Felix Kuehling
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 16:58 UTC (permalink / raw)
  To: amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM
handles are created in a drm_client_dev context to avoid exposing them
in user mode contexts through a DMABuf import.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 11 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  5 +++
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 33 +++++++++++++++----
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  4 +--
 4 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 6ab17330a6ed..b0a67f16540a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -142,6 +142,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 {
 	int i;
 	int last_valid_bit;
+	int ret;
 
 	amdgpu_amdkfd_gpuvm_init_mem_limits();
 
@@ -160,6 +161,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 			.enable_mes = adev->enable_mes,
 		};
 
+		ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", NULL);
+		if (ret) {
+			dev_err(adev->dev, "Failed to init DRM client: %d\n", ret);
+			return;
+		}
+
 		/* this is going to have a few of the MSBs set that we need to
 		 * clear
 		 */
@@ -198,6 +205,10 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 
 		adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
 							&gpu_resources);
+		if (adev->kfd.init_complete)
+			drm_client_register(&adev->kfd.client);
+		else
+			drm_client_release(&adev->kfd.client);
 
 		amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 68d534a89942..4caf8cece028 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -32,6 +32,7 @@
 #include <linux/mmu_notifier.h>
 #include <linux/memremap.h>
 #include <kgd_kfd_interface.h>
+#include <drm/drm_client.h>
 #include <drm/ttm/ttm_execbuf_util.h>
 #include "amdgpu_sync.h"
 #include "amdgpu_vm.h"
@@ -84,6 +85,7 @@ struct kgd_mem {
 
 	struct amdgpu_sync sync;
 
+	uint32_t gem_handle;
 	bool aql_queue;
 	bool is_imported;
 };
@@ -106,6 +108,9 @@ struct amdgpu_kfd_dev {
 
 	/* HMM page migration MEMORY_DEVICE_PRIVATE mapping */
 	struct dev_pagemap pgmap;
+
+	/* Client for KFD BO GEM handle allocations */
+	struct drm_client_dev client;
 };
 
 enum kgd_engine_type {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 0c1cb6048259..4bb8b5fd7598 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -25,6 +25,7 @@
 #include <linux/pagemap.h>
 #include <linux/sched/mm.h>
 #include <linux/sched/task.h>
+#include <linux/fdtable.h>
 #include <drm/ttm/ttm_tt.h>
 
 #include "amdgpu_object.h"
@@ -804,13 +805,22 @@ kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,
 static int kfd_mem_export_dmabuf(struct kgd_mem *mem)
 {
 	if (!mem->dmabuf) {
-		struct dma_buf *ret = amdgpu_gem_prime_export(
-			&mem->bo->tbo.base,
+		struct amdgpu_device *bo_adev;
+		struct dma_buf *dmabuf;
+		int r, fd;
+
+		bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
+		r = drm_gem_prime_handle_to_fd(&bo_adev->ddev, bo_adev->kfd.client.file,
+					       mem->gem_handle,
 			mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
-				DRM_RDWR : 0);
-		if (IS_ERR(ret))
-			return PTR_ERR(ret);
-		mem->dmabuf = ret;
+					       DRM_RDWR : 0, &fd);
+		if (r)
+			return r;
+		dmabuf = dma_buf_get(fd);
+		close_fd(fd);
+		if (WARN_ON_ONCE(IS_ERR(dmabuf)))
+			return PTR_ERR(dmabuf);
+		mem->dmabuf = dmabuf;
 	}
 
 	return 0;
@@ -1826,6 +1836,9 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 		pr_debug("Failed to allow vma node access. ret %d\n", ret);
 		goto err_node_allow;
 	}
+	ret = drm_gem_handle_create(adev->kfd.client.file, gobj, &(*mem)->gem_handle);
+	if (ret)
+		goto err_gem_handle_create;
 	bo = gem_to_amdgpu_bo(gobj);
 	if (bo_type == ttm_bo_type_sg) {
 		bo->tbo.sg = sg;
@@ -1877,6 +1890,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 err_pin_bo:
 err_validate_bo:
 	remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info);
+	drm_gem_handle_delete(adev->kfd.client.file, (*mem)->gem_handle);
+err_gem_handle_create:
 	drm_vma_node_revoke(&gobj->vma_node, drm_priv);
 err_node_allow:
 	/* Don't unreserve system mem limit twice */
@@ -1991,8 +2006,12 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
 
 	/* Free the BO*/
 	drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
-	if (mem->dmabuf)
+	if (!mem->is_imported)
+		drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
+	if (mem->dmabuf) {
 		dma_buf_put(mem->dmabuf);
+		mem->dmabuf = NULL;
+	}
 	mutex_destroy(&mem->lock);
 
 	/* If this releases the last reference, it will end up calling
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 06988cf1db51..4417a9863cd0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1855,8 +1855,8 @@ static uint32_t get_process_num_bos(struct kfd_process *p)
 	return num_of_bos;
 }
 
-static int criu_get_prime_handle(struct kgd_mem *mem, int flags,
-				      u32 *shared_fd)
+static int criu_get_prime_handle(struct kgd_mem *mem,
+				 int flags, u32 *shared_fd)
 {
 	struct dma_buf *dmabuf;
 	int ret;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM
  2023-11-07 16:58 [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Felix Kuehling
                   ` (2 preceding siblings ...)
  2023-11-07 16:58 ` [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles Felix Kuehling
@ 2023-11-07 16:58 ` Felix Kuehling
  2023-11-08  0:01   ` Errabolu, Ramesh
                     ` (2 more replies)
  2023-11-07 16:58 ` [PATCH 6/6] drm/amdkfd: Bump KFD ioctl version Felix Kuehling
  2023-11-08 12:11 ` [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Christian König
  5 siblings, 3 replies; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 16:58 UTC (permalink / raw)
  To: amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This
ensures that a GEM handle is created on import and that obj->dma_buf
will be set and remain set as long as the object is imported into KFD.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  9 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 64 +++++++++++++------
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 15 ++---
 3 files changed, 52 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 4caf8cece028..88a0e0734270 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -318,11 +318,10 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
 					    struct dma_fence **ef);
 int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
 					      struct kfd_vm_fault_info *info);
-int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
-				      struct dma_buf *dmabuf,
-				      uint64_t va, void *drm_priv,
-				      struct kgd_mem **mem, uint64_t *size,
-				      uint64_t *mmap_offset);
+int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
+					 uint64_t va, void *drm_priv,
+					 struct kgd_mem **mem, uint64_t *size,
+					 uint64_t *mmap_offset);
 int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem,
 				      struct dma_buf **dmabuf);
 void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 4bb8b5fd7598..1077de8bced2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2006,8 +2006,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
 
 	/* Free the BO*/
 	drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
-	if (!mem->is_imported)
-		drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
+	drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
 	if (mem->dmabuf) {
 		dma_buf_put(mem->dmabuf);
 		mem->dmabuf = NULL;
@@ -2363,34 +2362,26 @@ int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
 	return 0;
 }
 
-int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
-				      struct dma_buf *dma_buf,
-				      uint64_t va, void *drm_priv,
-				      struct kgd_mem **mem, uint64_t *size,
-				      uint64_t *mmap_offset)
+static int import_obj_create(struct amdgpu_device *adev,
+			     struct dma_buf *dma_buf,
+			     struct drm_gem_object *obj,
+			     uint64_t va, void *drm_priv,
+			     struct kgd_mem **mem, uint64_t *size,
+			     uint64_t *mmap_offset)
 {
 	struct amdgpu_vm *avm = drm_priv_to_vm(drm_priv);
-	struct drm_gem_object *obj;
 	struct amdgpu_bo *bo;
 	int ret;
 
-	obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
 	bo = gem_to_amdgpu_bo(obj);
 	if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
-				    AMDGPU_GEM_DOMAIN_GTT))) {
+				    AMDGPU_GEM_DOMAIN_GTT)))
 		/* Only VRAM and GTT BOs are supported */
-		ret = -EINVAL;
-		goto err_put_obj;
-	}
+		return -EINVAL;
 
 	*mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
-	if (!*mem) {
-		ret = -ENOMEM;
-		goto err_put_obj;
-	}
+	if (!*mem)
+		return -ENOMEM;
 
 	ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
 	if (ret)
@@ -2440,8 +2431,41 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
 	drm_vma_node_revoke(&obj->vma_node, drm_priv);
 err_free_mem:
 	kfree(*mem);
+	return ret;
+}
+
+int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
+					 uint64_t va, void *drm_priv,
+					 struct kgd_mem **mem, uint64_t *size,
+					 uint64_t *mmap_offset)
+{
+	struct drm_gem_object *obj;
+	uint32_t handle;
+	int ret;
+
+	ret = drm_gem_prime_fd_to_handle(&adev->ddev, adev->kfd.client.file, fd,
+					 &handle);
+	if (ret)
+		return ret;
+	obj = drm_gem_object_lookup(adev->kfd.client.file, handle);
+	if (!obj) {
+		ret = -EINVAL;
+		goto err_release_handle;
+	}
+
+	ret = import_obj_create(adev, obj->dma_buf, obj, va, drm_priv, mem, size,
+				mmap_offset);
+	if (ret)
+		goto err_put_obj;
+
+	(*mem)->gem_handle = handle;
+
+	return 0;
+
 err_put_obj:
 	drm_gem_object_put(obj);
+err_release_handle:
+	drm_gem_handle_delete(adev->kfd.client.file, handle);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 4417a9863cd0..1a2e9f564b7f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1564,16 +1564,11 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
 {
 	struct kfd_ioctl_import_dmabuf_args *args = data;
 	struct kfd_process_device *pdd;
-	struct dma_buf *dmabuf;
 	int idr_handle;
 	uint64_t size;
 	void *mem;
 	int r;
 
-	dmabuf = dma_buf_get(args->dmabuf_fd);
-	if (IS_ERR(dmabuf))
-		return PTR_ERR(dmabuf);
-
 	mutex_lock(&p->mutex);
 	pdd = kfd_process_device_data_by_id(p, args->gpu_id);
 	if (!pdd) {
@@ -1587,10 +1582,10 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
 		goto err_unlock;
 	}
 
-	r = amdgpu_amdkfd_gpuvm_import_dmabuf(pdd->dev->adev, dmabuf,
-					      args->va_addr, pdd->drm_priv,
-					      (struct kgd_mem **)&mem, &size,
-					      NULL);
+	r = amdgpu_amdkfd_gpuvm_import_dmabuf_fd(pdd->dev->adev, args->dmabuf_fd,
+						 args->va_addr, pdd->drm_priv,
+						 (struct kgd_mem **)&mem, &size,
+						 NULL);
 	if (r)
 		goto err_unlock;
 
@@ -1601,7 +1596,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
 	}
 
 	mutex_unlock(&p->mutex);
-	dma_buf_put(dmabuf);
 
 	args->handle = MAKE_HANDLE(args->gpu_id, idr_handle);
 
@@ -1612,7 +1606,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
 					       pdd->drm_priv, NULL);
 err_unlock:
 	mutex_unlock(&p->mutex);
-	dma_buf_put(dmabuf);
 	return r;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/6] drm/amdkfd: Bump KFD ioctl version
  2023-11-07 16:58 [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Felix Kuehling
                   ` (3 preceding siblings ...)
  2023-11-07 16:58 ` [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM Felix Kuehling
@ 2023-11-07 16:58 ` Felix Kuehling
  2023-11-08 12:11 ` [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Christian König
  5 siblings, 0 replies; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 16:58 UTC (permalink / raw)
  To: amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

This is not strictly a change in the IOCTL API. This version bump is meant
to indicate to user mode the presence of a number of changes and fixes
that enable the management of VA mappings in compute VMs using the GEM_VA
ioctl for DMABufs exported from KFD.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 include/uapi/linux/kfd_ioctl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index f0ed68974c54..9ce46edc62a5 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -40,9 +40,10 @@
  * - 1.12 - Add DMA buf export ioctl
  * - 1.13 - Add debugger API
  * - 1.14 - Update kfd_event_data
+ * - 1.15 - Enable managing mappings in compute VMs with GEM_VA ioctl
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 14
+#define KFD_IOCTL_MINOR_VERSION 15
 
 struct kfd_ioctl_get_version_args {
 	__u32 major_version;	/* from KFD */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* RE: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles
  2023-11-07 16:58 ` [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles Felix Kuehling
@ 2023-11-07 19:44   ` Errabolu, Ramesh
  2023-11-07 19:56     ` Felix Kuehling
  2023-11-16 21:53   ` Felix Kuehling
  1 sibling, 1 reply; 23+ messages in thread
From: Errabolu, Ramesh @ 2023-11-07 19:44 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx@lists.freedesktop.org
  Cc: Chen, Xiaogang, Koenig, Christian

[AMD Official Use Only - General]

My queries inline.

Regards,
Ramesh

-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@amd.com>
Sent: Tuesday, November 7, 2023 10:28 PM
To: amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian <Christian.Koenig@amd.com>; Chen, Xiaogang <Xiaogang.Chen@amd.com>; Errabolu, Ramesh <Ramesh.Errabolu@amd.com>
Subject: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles

Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM handles are created in a drm_client_dev context to avoid exposing them in user mode contexts through a DMABuf import.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 11 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  5 +++
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 33 +++++++++++++++----
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  4 +--
 4 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 6ab17330a6ed..b0a67f16540a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -142,6 +142,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)  {
        int i;
        int last_valid_bit;
+       int ret;

        amdgpu_amdkfd_gpuvm_init_mem_limits();

@@ -160,6 +161,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
                        .enable_mes = adev->enable_mes,
                };

+               ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", NULL);
+               if (ret) {
+                       dev_err(adev->dev, "Failed to init DRM client: %d\n", ret);
+                       return;
+               }
+
                /* this is going to have a few of the MSBs set that we need to
                 * clear
                 */
@@ -198,6 +205,10 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)

                adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
                                                        &gpu_resources);
+               if (adev->kfd.init_complete)
+                       drm_client_register(&adev->kfd.client);
+               else
+                       drm_client_release(&adev->kfd.client);

                amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 68d534a89942..4caf8cece028 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -32,6 +32,7 @@
 #include <linux/mmu_notifier.h>
 #include <linux/memremap.h>
 #include <kgd_kfd_interface.h>
+#include <drm/drm_client.h>
 #include <drm/ttm/ttm_execbuf_util.h>
 #include "amdgpu_sync.h"
 #include "amdgpu_vm.h"
@@ -84,6 +85,7 @@ struct kgd_mem {

        struct amdgpu_sync sync;

+       uint32_t gem_handle;
        bool aql_queue;
        bool is_imported;
 };
@@ -106,6 +108,9 @@ struct amdgpu_kfd_dev {

        /* HMM page migration MEMORY_DEVICE_PRIVATE mapping */
        struct dev_pagemap pgmap;
+
+       /* Client for KFD BO GEM handle allocations */
+       struct drm_client_dev client;
 };

 enum kgd_engine_type {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 0c1cb6048259..4bb8b5fd7598 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -25,6 +25,7 @@
 #include <linux/pagemap.h>
 #include <linux/sched/mm.h>
 #include <linux/sched/task.h>
+#include <linux/fdtable.h>
 #include <drm/ttm/ttm_tt.h>

 #include "amdgpu_object.h"
@@ -804,13 +805,22 @@ kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,  static int kfd_mem_export_dmabuf(struct kgd_mem *mem)  {
        if (!mem->dmabuf) {
-               struct dma_buf *ret = amdgpu_gem_prime_export(
-                       &mem->bo->tbo.base,
+               struct amdgpu_device *bo_adev;
+               struct dma_buf *dmabuf;
+               int r, fd;
+
+               bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
+               r = drm_gem_prime_handle_to_fd(&bo_adev->ddev, bo_adev->kfd.client.file,
+                                              mem->gem_handle,
                        mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
-                               DRM_RDWR : 0);
-               if (IS_ERR(ret))
-                       return PTR_ERR(ret);
-               mem->dmabuf = ret;
+                                              DRM_RDWR : 0, &fd);
+               if (r)
+                       return r;
+               dmabuf = dma_buf_get(fd);
+               close_fd(fd);
+               if (WARN_ON_ONCE(IS_ERR(dmabuf)))
+                       return PTR_ERR(dmabuf);
+               mem->dmabuf = dmabuf;
        }

        return 0;
@@ -1826,6 +1836,9 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
                pr_debug("Failed to allow vma node access. ret %d\n", ret);
                goto err_node_allow;
        }
+       ret = drm_gem_handle_create(adev->kfd.client.file, gobj, &(*mem)->gem_handle);
+       if (ret)
+               goto err_gem_handle_create;
        bo = gem_to_amdgpu_bo(gobj);
        if (bo_type == ttm_bo_type_sg) {
                bo->tbo.sg = sg;
@@ -1877,6 +1890,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 err_pin_bo:
 err_validate_bo:
        remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info);
+       drm_gem_handle_delete(adev->kfd.client.file, (*mem)->gem_handle);
+err_gem_handle_create:
        drm_vma_node_revoke(&gobj->vma_node, drm_priv);
 err_node_allow:
        /* Don't unreserve system mem limit twice */ @@ -1991,8 +2006,12 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(

        /* Free the BO*/
        drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
-       if (mem->dmabuf)
+       if (!mem->is_imported)
+               drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
+       if (mem->dmabuf) {
                dma_buf_put(mem->dmabuf);
+               mem->dmabuf = NULL;
+       }
        mutex_destroy(&mem->lock);

Ramesh: What happens if user invokes "free" using KFD IOCTL while the BO is imported and mapped on the device using DRM IOCTLs. Could it lead to inconsistent state?. For example the call drm_gem_handle_delete() will remove Dmabuf, close the GEM object. If user invokes free() using KFD Apis then the underlying object could be removed. It is not clear if this impacts devices that have just imported.

        /* If this releases the last reference, it will end up calling diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 06988cf1db51..4417a9863cd0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1855,8 +1855,8 @@ static uint32_t get_process_num_bos(struct kfd_process *p)
        return num_of_bos;
 }

-static int criu_get_prime_handle(struct kgd_mem *mem, int flags,
-                                     u32 *shared_fd)
+static int criu_get_prime_handle(struct kgd_mem *mem,
+                                int flags, u32 *shared_fd)
 {
        struct dma_buf *dmabuf;
        int ret;
--
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles
  2023-11-07 19:44   ` Errabolu, Ramesh
@ 2023-11-07 19:56     ` Felix Kuehling
  2023-11-08  2:08       ` Errabolu, Ramesh
  0 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 19:56 UTC (permalink / raw)
  To: Errabolu, Ramesh, amd-gfx@lists.freedesktop.org
  Cc: Chen, Xiaogang, Koenig, Christian


On 2023-11-07 14:44, Errabolu, Ramesh wrote:
> [AMD Official Use Only - General]
>
> My queries inline.
>
> Regards,
> Ramesh
>
> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@amd.com>
> Sent: Tuesday, November 7, 2023 10:28 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian <Christian.Koenig@amd.com>; Chen, Xiaogang <Xiaogang.Chen@amd.com>; Errabolu, Ramesh <Ramesh.Errabolu@amd.com>
> Subject: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles
>
> Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM handles are created in a drm_client_dev context to avoid exposing them in user mode contexts through a DMABuf import.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 11 +++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  5 +++
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 33 +++++++++++++++----
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  4 +--
>   4 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 6ab17330a6ed..b0a67f16540a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -142,6 +142,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)  {
>          int i;
>          int last_valid_bit;
> +       int ret;
>
>          amdgpu_amdkfd_gpuvm_init_mem_limits();
>
> @@ -160,6 +161,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>                          .enable_mes = adev->enable_mes,
>                  };
>
> +               ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", NULL);
> +               if (ret) {
> +                       dev_err(adev->dev, "Failed to init DRM client: %d\n", ret);
> +                       return;
> +               }
> +
>                  /* this is going to have a few of the MSBs set that we need to
>                   * clear
>                   */
> @@ -198,6 +205,10 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>
>                  adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
>                                                          &gpu_resources);
> +               if (adev->kfd.init_complete)
> +                       drm_client_register(&adev->kfd.client);
> +               else
> +                       drm_client_release(&adev->kfd.client);
>
>                  amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 68d534a89942..4caf8cece028 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -32,6 +32,7 @@
>   #include <linux/mmu_notifier.h>
>   #include <linux/memremap.h>
>   #include <kgd_kfd_interface.h>
> +#include <drm/drm_client.h>
>   #include <drm/ttm/ttm_execbuf_util.h>
>   #include "amdgpu_sync.h"
>   #include "amdgpu_vm.h"
> @@ -84,6 +85,7 @@ struct kgd_mem {
>
>          struct amdgpu_sync sync;
>
> +       uint32_t gem_handle;
>          bool aql_queue;
>          bool is_imported;
>   };
> @@ -106,6 +108,9 @@ struct amdgpu_kfd_dev {
>
>          /* HMM page migration MEMORY_DEVICE_PRIVATE mapping */
>          struct dev_pagemap pgmap;
> +
> +       /* Client for KFD BO GEM handle allocations */
> +       struct drm_client_dev client;
>   };
>
>   enum kgd_engine_type {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 0c1cb6048259..4bb8b5fd7598 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -25,6 +25,7 @@
>   #include <linux/pagemap.h>
>   #include <linux/sched/mm.h>
>   #include <linux/sched/task.h>
> +#include <linux/fdtable.h>
>   #include <drm/ttm/ttm_tt.h>
>
>   #include "amdgpu_object.h"
> @@ -804,13 +805,22 @@ kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,  static int kfd_mem_export_dmabuf(struct kgd_mem *mem)  {
>          if (!mem->dmabuf) {
> -               struct dma_buf *ret = amdgpu_gem_prime_export(
> -                       &mem->bo->tbo.base,
> +               struct amdgpu_device *bo_adev;
> +               struct dma_buf *dmabuf;
> +               int r, fd;
> +
> +               bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
> +               r = drm_gem_prime_handle_to_fd(&bo_adev->ddev, bo_adev->kfd.client.file,
> +                                              mem->gem_handle,
>                          mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
> -                               DRM_RDWR : 0);
> -               if (IS_ERR(ret))
> -                       return PTR_ERR(ret);
> -               mem->dmabuf = ret;
> +                                              DRM_RDWR : 0, &fd);
> +               if (r)
> +                       return r;
> +               dmabuf = dma_buf_get(fd);
> +               close_fd(fd);
> +               if (WARN_ON_ONCE(IS_ERR(dmabuf)))
> +                       return PTR_ERR(dmabuf);
> +               mem->dmabuf = dmabuf;
>          }
>
>          return 0;
> @@ -1826,6 +1836,9 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>                  pr_debug("Failed to allow vma node access. ret %d\n", ret);
>                  goto err_node_allow;
>          }
> +       ret = drm_gem_handle_create(adev->kfd.client.file, gobj, &(*mem)->gem_handle);
> +       if (ret)
> +               goto err_gem_handle_create;
>          bo = gem_to_amdgpu_bo(gobj);
>          if (bo_type == ttm_bo_type_sg) {
>                  bo->tbo.sg = sg;
> @@ -1877,6 +1890,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   err_pin_bo:
>   err_validate_bo:
>          remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info);
> +       drm_gem_handle_delete(adev->kfd.client.file, (*mem)->gem_handle);
> +err_gem_handle_create:
>          drm_vma_node_revoke(&gobj->vma_node, drm_priv);
>   err_node_allow:
>          /* Don't unreserve system mem limit twice */ @@ -1991,8 +2006,12 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>
>          /* Free the BO*/
>          drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
> -       if (mem->dmabuf)
> +       if (!mem->is_imported)
> +               drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
> +       if (mem->dmabuf) {
>                  dma_buf_put(mem->dmabuf);
> +               mem->dmabuf = NULL;
> +       }
>          mutex_destroy(&mem->lock);
>
> Ramesh: What happens if user invokes "free" using KFD IOCTL while the BO is imported and mapped on the device using DRM IOCTLs. Could it lead to inconsistent state?. For example the call drm_gem_handle_delete() will remove Dmabuf, close the GEM object. If user invokes free() using KFD Apis then the underlying object could be removed. It is not clear if this impacts devices that have just imported.

Reference counting should prevent any such problems. Both the dmabuf 
itself or the import in a DRM device hold a reference to the underlying 
GEM object. Using the KFD free ioctl only drops the KFD reference to the 
GEM object and cleans up the handle and kgd_mem object in KFD.

When the BO gets imported in a DRM render node, that creates a new GEM 
handle in that context.

Regards,
   Felix


>
>          /* If this releases the last reference, it will end up calling diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 06988cf1db51..4417a9863cd0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1855,8 +1855,8 @@ static uint32_t get_process_num_bos(struct kfd_process *p)
>          return num_of_bos;
>   }
>
> -static int criu_get_prime_handle(struct kgd_mem *mem, int flags,
> -                                     u32 *shared_fd)
> +static int criu_get_prime_handle(struct kgd_mem *mem,
> +                                int flags, u32 *shared_fd)
>   {
>          struct dma_buf *dmabuf;
>          int ret;
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH 3/6] drm/amdgpu: Auto-validate DMABuf imports in compute VMs
  2023-11-07 16:58 ` [PATCH 3/6] drm/amdgpu: Auto-validate DMABuf imports in compute VMs Felix Kuehling
@ 2023-11-07 20:24   ` Joshi, Mukul
  0 siblings, 0 replies; 23+ messages in thread
From: Joshi, Mukul @ 2023-11-07 20:24 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx@lists.freedesktop.org
  Cc: Chen, Xiaogang, Errabolu, Ramesh, Koenig, Christian

[AMD Official Use Only - General]

> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Felix
> Kuehling
> Sent: Tuesday, November 7, 2023 11:58 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chen, Xiaogang <Xiaogang.Chen@amd.com>; Errabolu, Ramesh
> <Ramesh.Errabolu@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>
> Subject: [PATCH 3/6] drm/amdgpu: Auto-validate DMABuf imports in
> compute VMs
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> DMABuf imports in compute VMs are not wrapped in a kgd_mem object on
> the process_info->kfd_bo_list. There is no explicit KFD API call to validate them
> or add eviction fences to them.
>
> This patch automatically validates and fences dymanic DMABuf imports when
> they are added to a compute VM. Revalidation after evictions is handled in the
> VM code.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |   3 +
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  15 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c   |   6 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  26 ++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 117
> +++++++++++++++++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   6 +-
>  7 files changed, 164 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index fcf8a98ad15e..68d534a89942 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -178,6 +178,9 @@ int
> amdgpu_queue_mask_bit_to_set_resource_bit(struct amdgpu_device *adev,
> struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 context,
>                                 struct mm_struct *mm,
>                                 struct svm_range_bo *svm_bo);
> +int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
> +                                       uint32_t domain,
> +                                       struct dma_fence *fence);
>  #if defined(CONFIG_DEBUG_FS)
>  int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void *data);  #endif diff --
> git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 2e302956a279..0c1cb6048259 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -423,9 +423,9 @@ static int amdgpu_amdkfd_bo_validate(struct
> amdgpu_bo *bo, uint32_t domain,
>         return ret;
>  }
>
> -static int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
> -                                              uint32_t domain,
> -                                              struct dma_fence *fence)
> +int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
> +                                       uint32_t domain,
> +                                       struct dma_fence *fence)
>  {
>         int ret = amdgpu_bo_reserve(bo, false);
>
> @@ -2948,7 +2948,7 @@ int
> amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence
> **ef)
>                 struct amdgpu_device *adev = amdgpu_ttm_adev(
>                         peer_vm->root.bo->tbo.bdev);
>
> -               ret = amdgpu_vm_handle_moved(adev, peer_vm, &ctx.ticket);
> +               ret = amdgpu_vm_handle_moved(adev, peer_vm, &ctx.ticket,
> + true);
>                 if (ret) {
>                         pr_debug("Memory eviction: handle moved failed. Try again\n");
>                         goto validate_map_fail; @@ -3001,7 +3001,7 @@ int
> amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence
> **ef)
>                                    &process_info->eviction_fence->base,
>                                    DMA_RESV_USAGE_BOOKKEEP);
>         }
> -       /* Attach eviction fence to PD / PT BOs */
> +       /* Attach eviction fence to PD / PT BOs and DMABuf imports */
>         list_for_each_entry(peer_vm, &process_info->vm_list_head,
>                             vm_list_node) {
>                 struct amdgpu_bo *bo = peer_vm->root.bo; @@ -3009,6 +3009,11
> @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct
> dma_fence **ef)
>                 dma_resv_add_fence(bo->tbo.base.resv,
>                                    &process_info->eviction_fence->base,
>                                    DMA_RESV_USAGE_BOOKKEEP);
> +
> +               ret = amdgpu_vm_fence_imports(peer_vm, &ctx.ticket,
> +                                             &process_info->eviction_fence->base);
> +               if (ret)
> +                       break;
>         }
>
>  validate_map_fail:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index aafedb344c1b..20f4be8cd635 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1165,7 +1165,7 @@ static int amdgpu_cs_vm_handling(struct
> amdgpu_cs_parser *p)
>                         return r;
>         }
>
> -       r = amdgpu_vm_handle_moved(adev, vm, &p->ticket);
> +       r = amdgpu_vm_handle_moved(adev, vm, &p->ticket, false);
>         if (r)
>                 return r;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index e7e87a3b2601..234244704f27 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -373,6 +373,10 @@ amdgpu_dma_buf_move_notify(struct
> dma_buf_attachment *attach)
>         struct amdgpu_vm_bo_base *bo_base;
>         int r;
>
> +       /* FIXME: This should be after the "if", but needs a fix to make sure
> +        * DMABuf imports are initialized in the right VM list.
> +        */
> +       amdgpu_vm_bo_invalidate(adev, bo, false);
>         if (!bo->tbo.resource || bo->tbo.resource->mem_type ==
> TTM_PL_SYSTEM)
>                 return;
>
> @@ -409,7 +413,7 @@ amdgpu_dma_buf_move_notify(struct
> dma_buf_attachment *attach)
>                 if (!r)
>                         r = amdgpu_vm_clear_freed(adev, vm, NULL);
>                 if (!r)
> -                       r = amdgpu_vm_handle_moved(adev, vm, ticket);
> +                       r = amdgpu_vm_handle_moved(adev, vm, ticket,
> + false);
>
>                 if (r && r != -EBUSY)
>                         DRM_ERROR("Failed to invalidate VM page tables (%d))\n", diff --
> git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 849fffbb367d..755cc3c559f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -186,6 +186,32 @@ static int amdgpu_gem_object_open(struct
> drm_gem_object *obj,
>         else
>                 ++bo_va->ref_count;
>         amdgpu_bo_unreserve(abo);
> +
> +       /* Validate and add eviction fence to DMABuf imports with dymanic

dymanic -> dynamic

> +        * attachment in compute VMs. Re-validation will be done by
> +        * amdgpu_vm_handle_moved and the fence will be updated by
> +        * amdgpu_vm_fence_imports in
> amdgpu_amdkfd_gpuvm_restore_process_bos.
> +        *
> +        * Nested locking below for the case that a GEM object is opened in
> +        * kfd_mem_export_dmabuf. Since the lock below is only taken for
> imports,
> +        * but not for export, this is a different lock class that cannot lead to
> +        * circular lock dependencies.
> +        */
> +       if (!vm->is_compute_context || !vm->process_info)
> +               return 0;
> +       if (!obj->import_attach ||
> +           !dma_buf_is_dynamic(obj->import_attach->dmabuf))
> +               return 0;
> +       mutex_lock_nested(&vm->process_info->lock, 1);
> +       if (!WARN_ON(!vm->process_info->eviction_fence)) {
> +               r = amdgpu_amdkfd_bo_validate_and_fence(abo,
> AMDGPU_GEM_DOMAIN_GTT,
> +                                                       &vm->process_info->eviction_fence->base);
> +               if (r)
> +                       dev_warn(adev->dev, "%d: validate_and_fence failed: %d\n",
> +                                vm->task_info.pid, r);
> +       }
> +       mutex_unlock(&vm->process_info->lock);
> +
>         return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 0d685577243c..b2c7449ab561 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1307,6 +1307,7 @@ int amdgpu_vm_clear_freed(struct
> amdgpu_device *adev,
>   * @adev: amdgpu_device pointer
>   * @vm: requested vm
>   * @ticket: optional reservation ticket used to reserve the VM
> + * @valitate: whether to auto-validate invalid DMABuf imports

valitate -> validate

>   *
>   * Make sure all BOs which are moved are updated in the PTs.
>   *
> @@ -1317,7 +1318,8 @@ int amdgpu_vm_clear_freed(struct
> amdgpu_device *adev,
>   */
>  int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>                            struct amdgpu_vm *vm,
> -                          struct ww_acquire_ctx *ticket)
> +                          struct ww_acquire_ctx *ticket,
> +                          bool validate)
>  {
>         struct amdgpu_bo_va *bo_va;
>         struct dma_resv *resv;
> @@ -1337,6 +1339,12 @@ int amdgpu_vm_handle_moved(struct
> amdgpu_device *adev,
>                 spin_lock(&vm->status_lock);
>         }
>
> +       /* If we're validating user BOs, splice all evicted user BOs into
> +        * the list of invalid BOs for revalidation
> +        */
> +       if (validate)
> +               list_splice_init(&vm->evicted_user, &vm->invalidated);
> +
>         while (!list_empty(&vm->invalidated)) {
>                 bo_va = list_first_entry(&vm->invalidated, struct amdgpu_bo_va,
>                                          base.vm_status); @@ -1357,17 +1365,120 @@ int
> amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>                         unlock = false;
>                 }
>
> +               /* Automatically validate DMABuf imports in compute VMs, if we
> +                * have a reservation, or remember them for later validation.
> +                */
> +               if (vm->is_compute_context && bo_va->base.bo &&
> +                   bo_va->base.bo->tbo.base.import_attach &&
> +                   (!bo_va->base.bo->tbo.resource ||
> +                    bo_va->base.bo->tbo.resource->mem_type == TTM_PL_SYSTEM))
> {
> +                       struct ttm_operation_ctx ctx = { true, false };
> +                       struct amdgpu_bo *bo = bo_va->base.bo;
> +
> +                       if (!validate) {
> +                               r = amdgpu_vm_bo_update(adev, bo_va, clear);
> +                               if (!r)
> +                                       amdgpu_vm_bo_evicted_user(&bo_va->base);
> +                               goto unlock;
> +                       }
> +
> +                       if (clear) {
> +                               pr_warn_ratelimited("Invalid DMABuf import is busy in pid
> %d\n", vm->task_info.pid);
> +                               r = -EBUSY;
> +                               goto unlock;
> +                       }
> +
> +                       amdgpu_bo_placement_from_domain(bo,
> +                                                       bo->preferred_domains);
> +                       r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> +                       if (r)
> +                               goto unlock;
> +                       r = amdgpu_bo_sync_wait(bo, AMDGPU_FENCE_OWNER_KFD,
> +                                               true);
> +                       if (r)
> +                               goto unlock;
> +               }
> +
>                 r = amdgpu_vm_bo_update(adev, bo_va, clear);
> +unlock:
> +               if (unlock)
> +                       dma_resv_unlock(resv);
>                 if (r)
>                         return r;
> +               spin_lock(&vm->status_lock);
> +       }
> +       spin_unlock(&vm->status_lock);
> +
> +       return 0;
> +}
> +
> +/**
> + * amdgpu_vm_fence_imports - add fence to valid DMABuf imports
> + *
> + * @vm: requested vm
> + * @ticket: optional reservation ticket used to reserve the VM
> + * @fence: fence to add
> + *
> + * Add the specified fence to all dymanic DMABuf imports that are valid.
> + *
> + * Returns:
> + * 0 for success.
> + */
> +int amdgpu_vm_fence_imports(struct amdgpu_vm *vm,
> +                           struct ww_acquire_ctx *ticket,
> +                           struct dma_fence *fence) {
> +       struct amdgpu_bo_va *bo_va, *tmp;
> +       struct dma_resv *resv;
> +       LIST_HEAD(imports);
> +       bool unlock;
> +       int ret = 0;
> +
> +       if (!vm->is_compute_context)
> +               return 0;
> +
> +       /* Move all the DMABuf imports to a private list so I can reserve
> +        * them while not holding te status_lock.
> +        */
> +       spin_lock(&vm->status_lock);
> +       list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) {
> +               if (bo_va->base.bo && bo_va->base.bo->tbo.base.import_attach &&
> +                   dma_buf_is_dynamic(bo_va->base.bo->tbo.base.import_attach-
> >dmabuf))
> +                       list_move(&bo_va->base.vm_status, &imports);
> +       }
> +       spin_unlock(&vm->status_lock);
> +
> +       list_for_each_entry(bo_va, &imports, base.vm_status) {
> +               resv = bo_va->base.bo->tbo.base.resv;
> +
> +               /* Try to reserve the BO */
> +               if (dma_resv_trylock(resv)) {
> +                       unlock = true;
> +               /* The caller is already holding the reservation lock */
> +               } else if (ticket && dma_resv_locking_ctx(resv) == ticket) {
> +                       unlock = false;
> +               } else {
> +                       WARN_ONCE(1, "Failed to reserve DMABuf import");
> +                       ret = -EBUSY;
> +                       break;
> +               }
> +
> +               ret = dma_resv_reserve_fences(resv, 1);
> +               if (!ret)
> +                       dma_resv_add_fence(resv, fence,
> +                                          DMA_RESV_USAGE_BOOKKEEP);
>
>                 if (unlock)
>                         dma_resv_unlock(resv);
> -               spin_lock(&vm->status_lock);
> +               if (ret)
> +                       break;
>         }
> +
> +       spin_lock(&vm->status_lock);
> +       list_splice(&imports, &vm->idle);
>         spin_unlock(&vm->status_lock);
>
> -       return 0;
> +       return ret;
>  }
>
>  /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 939d0c2219c0..2db04b8fef97 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -443,7 +443,11 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
> *adev,
>                           struct dma_fence **fence);  int
> amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>                            struct amdgpu_vm *vm,
> -                          struct ww_acquire_ctx *ticket);
> +                          struct ww_acquire_ctx *ticket,
> +                          bool validate); int
> +amdgpu_vm_fence_imports(struct amdgpu_vm *vm,
> +                           struct ww_acquire_ctx *ticket,
> +                           struct dma_fence *fence);
>  int amdgpu_vm_flush_compute_tlb(struct amdgpu_device *adev,
>                                 struct amdgpu_vm *vm,
>                                 uint32_t flush_type,
> --
> 2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs
  2023-11-07 16:58 ` [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs Felix Kuehling
@ 2023-11-07 22:11   ` Felix Kuehling
  2023-11-08 12:28     ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2023-11-07 22:11 UTC (permalink / raw)
  To: amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

Hi Christian,

I know you have objected to this patch before. I still think this is the 
best solution for what I need. I can talk you through my reasoning by 
email or offline. If I can't convince you, I will need your guidance for 
a better solution.

Thanks,
   Felix


On 2023-11-07 11:58, Felix Kuehling wrote:
> Create a new VM state to track user BOs that are in the system domain.
> In the next patch this will be used do conditionally re-validate them in
> amdgpu_vm_handle_moved.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++++-
>   2 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 1442d97ddd0f..0d685577243c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -232,6 +232,22 @@ static void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
>   	spin_unlock(&vm_bo->vm->status_lock);
>   }
>   
> +/**
> + * amdgpu_vm_bo_evicted_user - vm_bo is evicted
> + *
> + * @vm_bo: vm_bo which is evicted
> + *
> + * State for BOs used by user mode queues which are not at the location they
> + * should be.
> + */
> +static void amdgpu_vm_bo_evicted_user(struct amdgpu_vm_bo_base *vm_bo)
> +{
> +	vm_bo->moved = true;
> +	spin_lock(&vm_bo->vm->status_lock);
> +	list_move(&vm_bo->vm_status, &vm_bo->vm->evicted_user);
> +	spin_unlock(&vm_bo->vm->status_lock);
> +}
> +
>   /**
>    * amdgpu_vm_bo_relocated - vm_bo is reloacted
>    *
> @@ -2110,6 +2126,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, int32_t xcp
>   	for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
>   		vm->reserved_vmid[i] = NULL;
>   	INIT_LIST_HEAD(&vm->evicted);
> +	INIT_LIST_HEAD(&vm->evicted_user);
>   	INIT_LIST_HEAD(&vm->relocated);
>   	INIT_LIST_HEAD(&vm->moved);
>   	INIT_LIST_HEAD(&vm->idle);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 12cac06fa289..939d0c2219c0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -286,9 +286,12 @@ struct amdgpu_vm {
>   	/* Lock to protect vm_bo add/del/move on all lists of vm */
>   	spinlock_t		status_lock;
>   
> -	/* BOs who needs a validation */
> +	/* Per VM and PT BOs who needs a validation */
>   	struct list_head	evicted;
>   
> +	/* BOs for user mode queues that need a validation */
> +	struct list_head	evicted_user;
> +
>   	/* PT BOs which relocated and their parent need an update */
>   	struct list_head	relocated;
>   

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM
  2023-11-07 16:58 ` [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM Felix Kuehling
@ 2023-11-08  0:01   ` Errabolu, Ramesh
  2023-11-08 23:20   ` Chen, Xiaogang
  2023-11-09  8:16   ` Christian König
  2 siblings, 0 replies; 23+ messages in thread
From: Errabolu, Ramesh @ 2023-11-08  0:01 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx@lists.freedesktop.org
  Cc: Chen, Xiaogang, Koenig, Christian

[AMD Official Use Only - General]

Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>

-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@amd.com>
Sent: Tuesday, November 7, 2023 10:28 PM
To: amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian <Christian.Koenig@amd.com>; Chen, Xiaogang <Xiaogang.Chen@amd.com>; Errabolu, Ramesh <Ramesh.Errabolu@amd.com>
Subject: [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM

Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This ensures that a GEM handle is created on import and that obj->dma_buf will be set and remain set as long as the object is imported into KFD.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  9 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 64 +++++++++++++------
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 15 ++---
 3 files changed, 52 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 4caf8cece028..88a0e0734270 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -318,11 +318,10 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
                                            struct dma_fence **ef);
 int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
                                              struct kfd_vm_fault_info *info); -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
-                                     struct dma_buf *dmabuf,
-                                     uint64_t va, void *drm_priv,
-                                     struct kgd_mem **mem, uint64_t *size,
-                                     uint64_t *mmap_offset);
+int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
+                                        uint64_t va, void *drm_priv,
+                                        struct kgd_mem **mem, uint64_t *size,
+                                        uint64_t *mmap_offset);
 int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem,
                                      struct dma_buf **dmabuf);
 void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 4bb8b5fd7598..1077de8bced2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2006,8 +2006,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(

        /* Free the BO*/
        drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
-       if (!mem->is_imported)
-               drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
+       drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
        if (mem->dmabuf) {
                dma_buf_put(mem->dmabuf);
                mem->dmabuf = NULL;
@@ -2363,34 +2362,26 @@ int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
        return 0;
 }

-int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
-                                     struct dma_buf *dma_buf,
-                                     uint64_t va, void *drm_priv,
-                                     struct kgd_mem **mem, uint64_t *size,
-                                     uint64_t *mmap_offset)
+static int import_obj_create(struct amdgpu_device *adev,
+                            struct dma_buf *dma_buf,
+                            struct drm_gem_object *obj,
+                            uint64_t va, void *drm_priv,
+                            struct kgd_mem **mem, uint64_t *size,
+                            uint64_t *mmap_offset)
 {
        struct amdgpu_vm *avm = drm_priv_to_vm(drm_priv);
-       struct drm_gem_object *obj;
        struct amdgpu_bo *bo;
        int ret;

-       obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
-       if (IS_ERR(obj))
-               return PTR_ERR(obj);
-
        bo = gem_to_amdgpu_bo(obj);
        if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
-                                   AMDGPU_GEM_DOMAIN_GTT))) {
+                                   AMDGPU_GEM_DOMAIN_GTT)))
                /* Only VRAM and GTT BOs are supported */
-               ret = -EINVAL;
-               goto err_put_obj;
-       }
+               return -EINVAL;

        *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
-       if (!*mem) {
-               ret = -ENOMEM;
-               goto err_put_obj;
-       }
+       if (!*mem)
+               return -ENOMEM;

        ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
        if (ret)
@@ -2440,8 +2431,41 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
        drm_vma_node_revoke(&obj->vma_node, drm_priv);
 err_free_mem:
        kfree(*mem);
+       return ret;
+}
+
+int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
+                                        uint64_t va, void *drm_priv,
+                                        struct kgd_mem **mem, uint64_t *size,
+                                        uint64_t *mmap_offset)
+{
+       struct drm_gem_object *obj;
+       uint32_t handle;
+       int ret;
+
+       ret = drm_gem_prime_fd_to_handle(&adev->ddev, adev->kfd.client.file, fd,
+                                        &handle);
+       if (ret)
+               return ret;
+       obj = drm_gem_object_lookup(adev->kfd.client.file, handle);
+       if (!obj) {
+               ret = -EINVAL;
+               goto err_release_handle;
+       }
+
+       ret = import_obj_create(adev, obj->dma_buf, obj, va, drm_priv, mem, size,
+                               mmap_offset);
+       if (ret)
+               goto err_put_obj;
+
+       (*mem)->gem_handle = handle;
+
+       return 0;
+
 err_put_obj:
        drm_gem_object_put(obj);
+err_release_handle:
+       drm_gem_handle_delete(adev->kfd.client.file, handle);
        return ret;
 }

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 4417a9863cd0..1a2e9f564b7f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1564,16 +1564,11 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,  {
        struct kfd_ioctl_import_dmabuf_args *args = data;
        struct kfd_process_device *pdd;
-       struct dma_buf *dmabuf;
        int idr_handle;
        uint64_t size;
        void *mem;
        int r;

-       dmabuf = dma_buf_get(args->dmabuf_fd);
-       if (IS_ERR(dmabuf))
-               return PTR_ERR(dmabuf);
-
        mutex_lock(&p->mutex);
        pdd = kfd_process_device_data_by_id(p, args->gpu_id);
        if (!pdd) {
@@ -1587,10 +1582,10 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
                goto err_unlock;
        }

-       r = amdgpu_amdkfd_gpuvm_import_dmabuf(pdd->dev->adev, dmabuf,
-                                             args->va_addr, pdd->drm_priv,
-                                             (struct kgd_mem **)&mem, &size,
-                                             NULL);
+       r = amdgpu_amdkfd_gpuvm_import_dmabuf_fd(pdd->dev->adev, args->dmabuf_fd,
+                                                args->va_addr, pdd->drm_priv,
+                                                (struct kgd_mem **)&mem, &size,
+                                                NULL);
        if (r)
                goto err_unlock;

@@ -1601,7 +1596,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
        }

        mutex_unlock(&p->mutex);
-       dma_buf_put(dmabuf);

        args->handle = MAKE_HANDLE(args->gpu_id, idr_handle);

@@ -1612,7 +1606,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
                                               pdd->drm_priv, NULL);
 err_unlock:
        mutex_unlock(&p->mutex);
-       dma_buf_put(dmabuf);
        return r;
 }

--
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* RE: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles
  2023-11-07 19:56     ` Felix Kuehling
@ 2023-11-08  2:08       ` Errabolu, Ramesh
  0 siblings, 0 replies; 23+ messages in thread
From: Errabolu, Ramesh @ 2023-11-08  2:08 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx@lists.freedesktop.org
  Cc: Chen, Xiaogang, Koenig, Christian

[AMD Official Use Only - General]

Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>

-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@amd.com>
Sent: Wednesday, November 8, 2023 1:27 AM
To: Errabolu, Ramesh <Ramesh.Errabolu@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian <Christian.Koenig@amd.com>; Chen, Xiaogang <Xiaogang.Chen@amd.com>
Subject: Re: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles


On 2023-11-07 14:44, Errabolu, Ramesh wrote:
> [AMD Official Use Only - General]
>
> My queries inline.
>
> Regards,
> Ramesh
>
> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@amd.com>
> Sent: Tuesday, November 7, 2023 10:28 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian <Christian.Koenig@amd.com>; Chen, Xiaogang
> <Xiaogang.Chen@amd.com>; Errabolu, Ramesh <Ramesh.Errabolu@amd.com>
> Subject: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM
> handles
>
> Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM handles are created in a drm_client_dev context to avoid exposing them in user mode contexts through a DMABuf import.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 11 +++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  5 +++
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 33 +++++++++++++++----
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  4 +--
>   4 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 6ab17330a6ed..b0a67f16540a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -142,6 +142,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)  {
>          int i;
>          int last_valid_bit;
> +       int ret;
>
>          amdgpu_amdkfd_gpuvm_init_mem_limits();
>
> @@ -160,6 +161,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>                          .enable_mes = adev->enable_mes,
>                  };
>
> +               ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", NULL);
> +               if (ret) {
> +                       dev_err(adev->dev, "Failed to init DRM client: %d\n", ret);
> +                       return;
> +               }
> +
>                  /* this is going to have a few of the MSBs set that we need to
>                   * clear
>                   */
> @@ -198,6 +205,10 @@ void amdgpu_amdkfd_device_init(struct
> amdgpu_device *adev)
>
>                  adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
>
> &gpu_resources);
> +               if (adev->kfd.init_complete)
> +                       drm_client_register(&adev->kfd.client);
> +               else
> +                       drm_client_release(&adev->kfd.client);
>
>                  amdgpu_amdkfd_total_mem_size +=
> adev->gmc.real_vram_size;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 68d534a89942..4caf8cece028 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -32,6 +32,7 @@
>   #include <linux/mmu_notifier.h>
>   #include <linux/memremap.h>
>   #include <kgd_kfd_interface.h>
> +#include <drm/drm_client.h>
>   #include <drm/ttm/ttm_execbuf_util.h>
>   #include "amdgpu_sync.h"
>   #include "amdgpu_vm.h"
> @@ -84,6 +85,7 @@ struct kgd_mem {
>
>          struct amdgpu_sync sync;
>
> +       uint32_t gem_handle;
>          bool aql_queue;
>          bool is_imported;
>   };
> @@ -106,6 +108,9 @@ struct amdgpu_kfd_dev {
>
>          /* HMM page migration MEMORY_DEVICE_PRIVATE mapping */
>          struct dev_pagemap pgmap;
> +
> +       /* Client for KFD BO GEM handle allocations */
> +       struct drm_client_dev client;
>   };
>
>   enum kgd_engine_type {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 0c1cb6048259..4bb8b5fd7598 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -25,6 +25,7 @@
>   #include <linux/pagemap.h>
>   #include <linux/sched/mm.h>
>   #include <linux/sched/task.h>
> +#include <linux/fdtable.h>
>   #include <drm/ttm/ttm_tt.h>
>
>   #include "amdgpu_object.h"
> @@ -804,13 +805,22 @@ kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,  static int kfd_mem_export_dmabuf(struct kgd_mem *mem)  {
>          if (!mem->dmabuf) {
> -               struct dma_buf *ret = amdgpu_gem_prime_export(
> -                       &mem->bo->tbo.base,
> +               struct amdgpu_device *bo_adev;
> +               struct dma_buf *dmabuf;
> +               int r, fd;
> +
> +               bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
> +               r = drm_gem_prime_handle_to_fd(&bo_adev->ddev, bo_adev->kfd.client.file,
> +                                              mem->gem_handle,
>                          mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
> -                               DRM_RDWR : 0);
> -               if (IS_ERR(ret))
> -                       return PTR_ERR(ret);
> -               mem->dmabuf = ret;
> +                                              DRM_RDWR : 0, &fd);
> +               if (r)
> +                       return r;
> +               dmabuf = dma_buf_get(fd);
> +               close_fd(fd);
> +               if (WARN_ON_ONCE(IS_ERR(dmabuf)))
> +                       return PTR_ERR(dmabuf);
> +               mem->dmabuf = dmabuf;
>          }
>
>          return 0;
> @@ -1826,6 +1836,9 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>                  pr_debug("Failed to allow vma node access. ret %d\n", ret);
>                  goto err_node_allow;
>          }
> +       ret = drm_gem_handle_create(adev->kfd.client.file, gobj, &(*mem)->gem_handle);
> +       if (ret)
> +               goto err_gem_handle_create;
>          bo = gem_to_amdgpu_bo(gobj);
>          if (bo_type == ttm_bo_type_sg) {
>                  bo->tbo.sg = sg;
> @@ -1877,6 +1890,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   err_pin_bo:
>   err_validate_bo:
>          remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info);
> +       drm_gem_handle_delete(adev->kfd.client.file,
> +(*mem)->gem_handle);
> +err_gem_handle_create:
>          drm_vma_node_revoke(&gobj->vma_node, drm_priv);
>   err_node_allow:
>          /* Don't unreserve system mem limit twice */ @@ -1991,8
> +2006,12 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>
>          /* Free the BO*/
>          drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
> -       if (mem->dmabuf)
> +       if (!mem->is_imported)
> +               drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
> +       if (mem->dmabuf) {
>                  dma_buf_put(mem->dmabuf);
> +               mem->dmabuf = NULL;
> +       }
>          mutex_destroy(&mem->lock);
>
> Ramesh: What happens if user invokes "free" using KFD IOCTL while the BO is imported and mapped on the device using DRM IOCTLs. Could it lead to inconsistent state?. For example the call drm_gem_handle_delete() will remove Dmabuf, close the GEM object. If user invokes free() using KFD Apis then the underlying object could be removed. It is not clear if this impacts devices that have just imported.

Reference counting should prevent any such problems. Both the dmabuf itself or the import in a DRM device hold a reference to the underlying GEM object. Using the KFD free ioctl only drops the KFD reference to the GEM object and cleans up the handle and kgd_mem object in KFD.

When the BO gets imported in a DRM render node, that creates a new GEM handle in that context.

Regards,
   Felix


>
>          /* If this releases the last reference, it will end up
> calling diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 06988cf1db51..4417a9863cd0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1855,8 +1855,8 @@ static uint32_t get_process_num_bos(struct kfd_process *p)
>          return num_of_bos;
>   }
>
> -static int criu_get_prime_handle(struct kgd_mem *mem, int flags,
> -                                     u32 *shared_fd)
> +static int criu_get_prime_handle(struct kgd_mem *mem,
> +                                int flags, u32 *shared_fd)
>   {
>          struct dma_buf *dmabuf;
>          int ret;
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference
  2023-11-07 16:58 [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Felix Kuehling
                   ` (4 preceding siblings ...)
  2023-11-07 16:58 ` [PATCH 6/6] drm/amdkfd: Bump KFD ioctl version Felix Kuehling
@ 2023-11-08 12:11 ` Christian König
  5 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2023-11-08 12:11 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

Am 07.11.23 um 17:58 schrieb Felix Kuehling:
> mem = bo->tbo.resource may be NULL in amdgpu_vm_bo_update.
>
> Fixes: 180253782038 ("drm/ttm: stop allocating dummy resources during BO creation")
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 7b2a7c9156f0..1442d97ddd0f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1015,8 +1015,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   				bo = gem_to_amdgpu_bo(gobj);
>   		}
>   		mem = bo->tbo.resource;
> -		if (mem->mem_type == TTM_PL_TT ||
> -		    mem->mem_type == AMDGPU_PL_PREEMPT)
> +		if (mem && (mem->mem_type == TTM_PL_TT ||
> +			    mem->mem_type == AMDGPU_PL_PREEMPT))
>   			pages_addr = bo->tbo.ttm->dma_address;
>   	}
>   


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs
  2023-11-07 22:11   ` Felix Kuehling
@ 2023-11-08 12:28     ` Christian König
  2023-11-08 21:23       ` Felix Kuehling
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2023-11-08 12:28 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

Not necessary objections to this patch here, but rather how this new 
state is used later on.

The fundamental problem is that re-validating things in 
amdgpu_vm_handle_moved() won't work in all cases.

The general approach for both classic CS IOCTL as well as user queues is 
the following:

1. Grab all the necessary locks
     The VM lock + either everything inside the VM (user queues) or just 
the BOs in the submission (CS IOCTL).

2. Validate everything locked
     It can be that additional BO locks are grabbed for eviction and 
even locked BOs are shuffled around during that.

3. Update the PDEs and PTEs
     This can be done only after validating everything because things 
can still shuffle around during validation.

4. Do your CS or make the userqueu / process runable etc...

5. Attach fences to the locked BOs.

6. Unlock everything again.

I think that is part of the problem why handling all updates in 
amdgpu_vm_handle_moved() for the CS doesn't work and sooner or later you 
might run into the same issue during process restore as well.

If I'm not completely mistaken this should be solvable by moving all 
validation to amdgpu_vm_validate_pt_bos() instead (but let us rename 
that function).

Regards,
Christian.

Am 07.11.23 um 23:11 schrieb Felix Kuehling:
> Hi Christian,
>
> I know you have objected to this patch before. I still think this is 
> the best solution for what I need. I can talk you through my reasoning 
> by email or offline. If I can't convince you, I will need your 
> guidance for a better solution.
>
> Thanks,
>   Felix
>
>
> On 2023-11-07 11:58, Felix Kuehling wrote:
>> Create a new VM state to track user BOs that are in the system domain.
>> In the next patch this will be used do conditionally re-validate them in
>> amdgpu_vm_handle_moved.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++++-
>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 1442d97ddd0f..0d685577243c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -232,6 +232,22 @@ static void amdgpu_vm_bo_invalidated(struct 
>> amdgpu_vm_bo_base *vm_bo)
>>       spin_unlock(&vm_bo->vm->status_lock);
>>   }
>>   +/**
>> + * amdgpu_vm_bo_evicted_user - vm_bo is evicted
>> + *
>> + * @vm_bo: vm_bo which is evicted
>> + *
>> + * State for BOs used by user mode queues which are not at the 
>> location they
>> + * should be.
>> + */
>> +static void amdgpu_vm_bo_evicted_user(struct amdgpu_vm_bo_base *vm_bo)
>> +{
>> +    vm_bo->moved = true;
>> +    spin_lock(&vm_bo->vm->status_lock);
>> +    list_move(&vm_bo->vm_status, &vm_bo->vm->evicted_user);
>> +    spin_unlock(&vm_bo->vm->status_lock);
>> +}
>> +
>>   /**
>>    * amdgpu_vm_bo_relocated - vm_bo is reloacted
>>    *
>> @@ -2110,6 +2126,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, 
>> struct amdgpu_vm *vm, int32_t xcp
>>       for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
>>           vm->reserved_vmid[i] = NULL;
>>       INIT_LIST_HEAD(&vm->evicted);
>> +    INIT_LIST_HEAD(&vm->evicted_user);
>>       INIT_LIST_HEAD(&vm->relocated);
>>       INIT_LIST_HEAD(&vm->moved);
>>       INIT_LIST_HEAD(&vm->idle);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> index 12cac06fa289..939d0c2219c0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> @@ -286,9 +286,12 @@ struct amdgpu_vm {
>>       /* Lock to protect vm_bo add/del/move on all lists of vm */
>>       spinlock_t        status_lock;
>>   -    /* BOs who needs a validation */
>> +    /* Per VM and PT BOs who needs a validation */
>>       struct list_head    evicted;
>>   +    /* BOs for user mode queues that need a validation */
>> +    struct list_head    evicted_user;
>> +
>>       /* PT BOs which relocated and their parent need an update */
>>       struct list_head    relocated;


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs
  2023-11-08 12:28     ` Christian König
@ 2023-11-08 21:23       ` Felix Kuehling
  2023-11-09  8:12         ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2023-11-08 21:23 UTC (permalink / raw)
  To: Christian König, amd-gfx
  Cc: xiaogang.chen, ramesh.errabolu, christian.koenig


On 2023-11-08 07:28, Christian König wrote:
> Not necessary objections to this patch here, but rather how this new 
> state is used later on.
>
> The fundamental problem is that re-validating things in 
> amdgpu_vm_handle_moved() won't work in all cases.
>
> The general approach for both classic CS IOCTL as well as user queues 
> is the following:
>
> 1. Grab all the necessary locks
>     The VM lock + either everything inside the VM (user queues) or 
> just the BOs in the submission (CS IOCTL).
>
> 2. Validate everything locked
>     It can be that additional BO locks are grabbed for eviction and 
> even locked BOs are shuffled around during that.
>
> 3. Update the PDEs and PTEs
>     This can be done only after validating everything because things 
> can still shuffle around during validation.
>
> 4. Do your CS or make the userqueu / process runable etc...
>
> 5. Attach fences to the locked BOs.
>
> 6. Unlock everything again.
>
> I think that is part of the problem why handling all updates in 
> amdgpu_vm_handle_moved() for the CS doesn't work and sooner or later 
> you might run into the same issue during process restore as well.
>
> If I'm not completely mistaken this should be solvable by moving all 
> validation to amdgpu_vm_validate_pt_bos() instead (but let us rename 
> that function).

I'm trying to understand what the problem is. As far as I can tell, 
amdgpu_vm_handle_moved just runs a bit later in the CS and 
process-restore code than amdgpu_vm_validate_pt_bos. But it runs with 
all the same reservations locked. My current implementation in 
amdgpu_vm_handle_moved has some failure cases when reservations cannot 
be acquired. I think your drm_exec patch series would make it possible 
to handle this more robustly. That said, in case of KFD process restore, 
we can retry in case of failures already.

Anyway, moving my re-validation code to a renamed version of 
amdgpu_vm_validate_pt_bos shouldn't be a big problem. I just don't 
understand what problem that solves. Maybe it just makes the code 
cleaner to handle both evicted states in one place? Or maybe you're 
pointing to a way to merge the two states into one?

Regards,
   Felix


>
> Regards,
> Christian.
>
> Am 07.11.23 um 23:11 schrieb Felix Kuehling:
>> Hi Christian,
>>
>> I know you have objected to this patch before. I still think this is 
>> the best solution for what I need. I can talk you through my 
>> reasoning by email or offline. If I can't convince you, I will need 
>> your guidance for a better solution.
>>
>> Thanks,
>>   Felix
>>
>>
>> On 2023-11-07 11:58, Felix Kuehling wrote:
>>> Create a new VM state to track user BOs that are in the system domain.
>>> In the next patch this will be used do conditionally re-validate 
>>> them in
>>> amdgpu_vm_handle_moved.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++++-
>>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 1442d97ddd0f..0d685577243c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -232,6 +232,22 @@ static void amdgpu_vm_bo_invalidated(struct 
>>> amdgpu_vm_bo_base *vm_bo)
>>>       spin_unlock(&vm_bo->vm->status_lock);
>>>   }
>>>   +/**
>>> + * amdgpu_vm_bo_evicted_user - vm_bo is evicted
>>> + *
>>> + * @vm_bo: vm_bo which is evicted
>>> + *
>>> + * State for BOs used by user mode queues which are not at the 
>>> location they
>>> + * should be.
>>> + */
>>> +static void amdgpu_vm_bo_evicted_user(struct amdgpu_vm_bo_base *vm_bo)
>>> +{
>>> +    vm_bo->moved = true;
>>> +    spin_lock(&vm_bo->vm->status_lock);
>>> +    list_move(&vm_bo->vm_status, &vm_bo->vm->evicted_user);
>>> +    spin_unlock(&vm_bo->vm->status_lock);
>>> +}
>>> +
>>>   /**
>>>    * amdgpu_vm_bo_relocated - vm_bo is reloacted
>>>    *
>>> @@ -2110,6 +2126,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, 
>>> struct amdgpu_vm *vm, int32_t xcp
>>>       for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
>>>           vm->reserved_vmid[i] = NULL;
>>>       INIT_LIST_HEAD(&vm->evicted);
>>> +    INIT_LIST_HEAD(&vm->evicted_user);
>>>       INIT_LIST_HEAD(&vm->relocated);
>>>       INIT_LIST_HEAD(&vm->moved);
>>>       INIT_LIST_HEAD(&vm->idle);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index 12cac06fa289..939d0c2219c0 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -286,9 +286,12 @@ struct amdgpu_vm {
>>>       /* Lock to protect vm_bo add/del/move on all lists of vm */
>>>       spinlock_t        status_lock;
>>>   -    /* BOs who needs a validation */
>>> +    /* Per VM and PT BOs who needs a validation */
>>>       struct list_head    evicted;
>>>   +    /* BOs for user mode queues that need a validation */
>>> +    struct list_head    evicted_user;
>>> +
>>>       /* PT BOs which relocated and their parent need an update */
>>>       struct list_head    relocated;
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM
  2023-11-07 16:58 ` [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM Felix Kuehling
  2023-11-08  0:01   ` Errabolu, Ramesh
@ 2023-11-08 23:20   ` Chen, Xiaogang
  2023-11-08 23:26     ` Felix Kuehling
  2023-11-09  8:16   ` Christian König
  2 siblings, 1 reply; 23+ messages in thread
From: Chen, Xiaogang @ 2023-11-08 23:20 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx; +Cc: ramesh.errabolu, christian.koenig


On 11/7/2023 10:58 AM, Felix Kuehling wrote:
> Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This
> ensures that a GEM handle is created on import and that obj->dma_buf
> will be set and remain set as long as the object is imported into KFD.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  9 ++-
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 64 +++++++++++++------
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 15 ++---
>   3 files changed, 52 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 4caf8cece028..88a0e0734270 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -318,11 +318,10 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
>   					    struct dma_fence **ef);
>   int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>   					      struct kfd_vm_fault_info *info);
> -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
> -				      struct dma_buf *dmabuf,
> -				      uint64_t va, void *drm_priv,
> -				      struct kgd_mem **mem, uint64_t *size,
> -				      uint64_t *mmap_offset);
> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
> +					 uint64_t va, void *drm_priv,
> +					 struct kgd_mem **mem, uint64_t *size,
> +					 uint64_t *mmap_offset);
>   int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem,
>   				      struct dma_buf **dmabuf);
>   void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 4bb8b5fd7598..1077de8bced2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -2006,8 +2006,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>   
>   	/* Free the BO*/
>   	drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
> -	if (!mem->is_imported)
> -		drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
> +	drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);

A minor thing for this patch: I think this is a correction for last 
patch " Export DMABufs from KFD using GEM handles". mem->gem_handle is 
created unconditionally at amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu.  
drm_gem_handle_delete should be put at the lat patch.

Regards

Xiaogang

>   	if (mem->dmabuf) {
>   		dma_buf_put(mem->dmabuf);
>   		mem->dmabuf = NULL;
> @@ -2363,34 +2362,26 @@ int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>   	return 0;
>   }
>   
> -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
> -				      struct dma_buf *dma_buf,
> -				      uint64_t va, void *drm_priv,
> -				      struct kgd_mem **mem, uint64_t *size,
> -				      uint64_t *mmap_offset)
> +static int import_obj_create(struct amdgpu_device *adev,
> +			     struct dma_buf *dma_buf,
> +			     struct drm_gem_object *obj,
> +			     uint64_t va, void *drm_priv,
> +			     struct kgd_mem **mem, uint64_t *size,
> +			     uint64_t *mmap_offset)
>   {
>   	struct amdgpu_vm *avm = drm_priv_to_vm(drm_priv);
> -	struct drm_gem_object *obj;
>   	struct amdgpu_bo *bo;
>   	int ret;
>   
> -	obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
> -	if (IS_ERR(obj))
> -		return PTR_ERR(obj);
> -
>   	bo = gem_to_amdgpu_bo(obj);
>   	if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
> -				    AMDGPU_GEM_DOMAIN_GTT))) {
> +				    AMDGPU_GEM_DOMAIN_GTT)))
>   		/* Only VRAM and GTT BOs are supported */
> -		ret = -EINVAL;
> -		goto err_put_obj;
> -	}
> +		return -EINVAL;
>   
>   	*mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
> -	if (!*mem) {
> -		ret = -ENOMEM;
> -		goto err_put_obj;
> -	}
> +	if (!*mem)
> +		return -ENOMEM;
>   
>   	ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
>   	if (ret)
> @@ -2440,8 +2431,41 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
>   	drm_vma_node_revoke(&obj->vma_node, drm_priv);
>   err_free_mem:
>   	kfree(*mem);
> +	return ret;
> +}
> +
> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
> +					 uint64_t va, void *drm_priv,
> +					 struct kgd_mem **mem, uint64_t *size,
> +					 uint64_t *mmap_offset)
> +{
> +	struct drm_gem_object *obj;
> +	uint32_t handle;
> +	int ret;
> +
> +	ret = drm_gem_prime_fd_to_handle(&adev->ddev, adev->kfd.client.file, fd,
> +					 &handle);
> +	if (ret)
> +		return ret;
> +	obj = drm_gem_object_lookup(adev->kfd.client.file, handle);
> +	if (!obj) {
> +		ret = -EINVAL;
> +		goto err_release_handle;
> +	}
> +
> +	ret = import_obj_create(adev, obj->dma_buf, obj, va, drm_priv, mem, size,
> +				mmap_offset);
> +	if (ret)
> +		goto err_put_obj;
> +
> +	(*mem)->gem_handle = handle;
> +
> +	return 0;
> +
>   err_put_obj:
>   	drm_gem_object_put(obj);
> +err_release_handle:
> +	drm_gem_handle_delete(adev->kfd.client.file, handle);
>   	return ret;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 4417a9863cd0..1a2e9f564b7f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1564,16 +1564,11 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   {
>   	struct kfd_ioctl_import_dmabuf_args *args = data;
>   	struct kfd_process_device *pdd;
> -	struct dma_buf *dmabuf;
>   	int idr_handle;
>   	uint64_t size;
>   	void *mem;
>   	int r;
>   
> -	dmabuf = dma_buf_get(args->dmabuf_fd);
> -	if (IS_ERR(dmabuf))
> -		return PTR_ERR(dmabuf);
> -
>   	mutex_lock(&p->mutex);
>   	pdd = kfd_process_device_data_by_id(p, args->gpu_id);
>   	if (!pdd) {
> @@ -1587,10 +1582,10 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   		goto err_unlock;
>   	}
>   
> -	r = amdgpu_amdkfd_gpuvm_import_dmabuf(pdd->dev->adev, dmabuf,
> -					      args->va_addr, pdd->drm_priv,
> -					      (struct kgd_mem **)&mem, &size,
> -					      NULL);
> +	r = amdgpu_amdkfd_gpuvm_import_dmabuf_fd(pdd->dev->adev, args->dmabuf_fd,
> +						 args->va_addr, pdd->drm_priv,
> +						 (struct kgd_mem **)&mem, &size,
> +						 NULL);
>   	if (r)
>   		goto err_unlock;
>   
> @@ -1601,7 +1596,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   	}
>   
>   	mutex_unlock(&p->mutex);
> -	dma_buf_put(dmabuf);
>   
>   	args->handle = MAKE_HANDLE(args->gpu_id, idr_handle);
>   
> @@ -1612,7 +1606,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   					       pdd->drm_priv, NULL);
>   err_unlock:
>   	mutex_unlock(&p->mutex);
> -	dma_buf_put(dmabuf);
>   	return r;
>   }
>   

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM
  2023-11-08 23:20   ` Chen, Xiaogang
@ 2023-11-08 23:26     ` Felix Kuehling
  2023-11-08 23:41       ` Chen, Xiaogang
  0 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2023-11-08 23:26 UTC (permalink / raw)
  To: Chen, Xiaogang, amd-gfx; +Cc: ramesh.errabolu, christian.koenig

On 2023-11-08 18:20, Chen, Xiaogang wrote:
>
> On 11/7/2023 10:58 AM, Felix Kuehling wrote:
>> Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This
>> ensures that a GEM handle is created on import and that obj->dma_buf
>> will be set and remain set as long as the object is imported into KFD.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  9 ++-
>>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 64 +++++++++++++------
>>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 15 ++---
>>   3 files changed, 52 insertions(+), 36 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> index 4caf8cece028..88a0e0734270 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> @@ -318,11 +318,10 @@ int 
>> amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
>>                           struct dma_fence **ef);
>>   int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>>                             struct kfd_vm_fault_info *info);
>> -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
>> -                      struct dma_buf *dmabuf,
>> -                      uint64_t va, void *drm_priv,
>> -                      struct kgd_mem **mem, uint64_t *size,
>> -                      uint64_t *mmap_offset);
>> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, 
>> int fd,
>> +                     uint64_t va, void *drm_priv,
>> +                     struct kgd_mem **mem, uint64_t *size,
>> +                     uint64_t *mmap_offset);
>>   int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem,
>>                         struct dma_buf **dmabuf);
>>   void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 4bb8b5fd7598..1077de8bced2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -2006,8 +2006,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>>         /* Free the BO*/
>>       drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
>> -    if (!mem->is_imported)
>> -        drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
>> +    drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
>
> A minor thing for this patch: I think this is a correction for last 
> patch " Export DMABufs from KFD using GEM handles". mem->gem_handle is 
> created unconditionally at amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu.  
> drm_gem_handle_delete should be put at the lat patch.

This change was intentional. Without this patch, imported DMABufs didn't 
get a GEM handle, so I didn't need to delete one. With this patch, I now 
have a GEM handle for imported BOs, so I delete the GEM handle 
unconditionally.

Regards,
   Felix


>
> Regards
>
> Xiaogang
>
>>       if (mem->dmabuf) {
>>           dma_buf_put(mem->dmabuf);
>>           mem->dmabuf = NULL;
>> @@ -2363,34 +2362,26 @@ int 
>> amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>>       return 0;
>>   }
>>   -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
>> -                      struct dma_buf *dma_buf,
>> -                      uint64_t va, void *drm_priv,
>> -                      struct kgd_mem **mem, uint64_t *size,
>> -                      uint64_t *mmap_offset)
>> +static int import_obj_create(struct amdgpu_device *adev,
>> +                 struct dma_buf *dma_buf,
>> +                 struct drm_gem_object *obj,
>> +                 uint64_t va, void *drm_priv,
>> +                 struct kgd_mem **mem, uint64_t *size,
>> +                 uint64_t *mmap_offset)
>>   {
>>       struct amdgpu_vm *avm = drm_priv_to_vm(drm_priv);
>> -    struct drm_gem_object *obj;
>>       struct amdgpu_bo *bo;
>>       int ret;
>>   -    obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
>> -    if (IS_ERR(obj))
>> -        return PTR_ERR(obj);
>> -
>>       bo = gem_to_amdgpu_bo(obj);
>>       if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
>> -                    AMDGPU_GEM_DOMAIN_GTT))) {
>> +                    AMDGPU_GEM_DOMAIN_GTT)))
>>           /* Only VRAM and GTT BOs are supported */
>> -        ret = -EINVAL;
>> -        goto err_put_obj;
>> -    }
>> +        return -EINVAL;
>>         *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
>> -    if (!*mem) {
>> -        ret = -ENOMEM;
>> -        goto err_put_obj;
>> -    }
>> +    if (!*mem)
>> +        return -ENOMEM;
>>         ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
>>       if (ret)
>> @@ -2440,8 +2431,41 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct 
>> amdgpu_device *adev,
>>       drm_vma_node_revoke(&obj->vma_node, drm_priv);
>>   err_free_mem:
>>       kfree(*mem);
>> +    return ret;
>> +}
>> +
>> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, 
>> int fd,
>> +                     uint64_t va, void *drm_priv,
>> +                     struct kgd_mem **mem, uint64_t *size,
>> +                     uint64_t *mmap_offset)
>> +{
>> +    struct drm_gem_object *obj;
>> +    uint32_t handle;
>> +    int ret;
>> +
>> +    ret = drm_gem_prime_fd_to_handle(&adev->ddev, 
>> adev->kfd.client.file, fd,
>> +                     &handle);
>> +    if (ret)
>> +        return ret;
>> +    obj = drm_gem_object_lookup(adev->kfd.client.file, handle);
>> +    if (!obj) {
>> +        ret = -EINVAL;
>> +        goto err_release_handle;
>> +    }
>> +
>> +    ret = import_obj_create(adev, obj->dma_buf, obj, va, drm_priv, 
>> mem, size,
>> +                mmap_offset);
>> +    if (ret)
>> +        goto err_put_obj;
>> +
>> +    (*mem)->gem_handle = handle;
>> +
>> +    return 0;
>> +
>>   err_put_obj:
>>       drm_gem_object_put(obj);
>> +err_release_handle:
>> +    drm_gem_handle_delete(adev->kfd.client.file, handle);
>>       return ret;
>>   }
>>   diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 4417a9863cd0..1a2e9f564b7f 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -1564,16 +1564,11 @@ static int kfd_ioctl_import_dmabuf(struct 
>> file *filep,
>>   {
>>       struct kfd_ioctl_import_dmabuf_args *args = data;
>>       struct kfd_process_device *pdd;
>> -    struct dma_buf *dmabuf;
>>       int idr_handle;
>>       uint64_t size;
>>       void *mem;
>>       int r;
>>   -    dmabuf = dma_buf_get(args->dmabuf_fd);
>> -    if (IS_ERR(dmabuf))
>> -        return PTR_ERR(dmabuf);
>> -
>>       mutex_lock(&p->mutex);
>>       pdd = kfd_process_device_data_by_id(p, args->gpu_id);
>>       if (!pdd) {
>> @@ -1587,10 +1582,10 @@ static int kfd_ioctl_import_dmabuf(struct 
>> file *filep,
>>           goto err_unlock;
>>       }
>>   -    r = amdgpu_amdkfd_gpuvm_import_dmabuf(pdd->dev->adev, dmabuf,
>> -                          args->va_addr, pdd->drm_priv,
>> -                          (struct kgd_mem **)&mem, &size,
>> -                          NULL);
>> +    r = amdgpu_amdkfd_gpuvm_import_dmabuf_fd(pdd->dev->adev, 
>> args->dmabuf_fd,
>> +                         args->va_addr, pdd->drm_priv,
>> +                         (struct kgd_mem **)&mem, &size,
>> +                         NULL);
>>       if (r)
>>           goto err_unlock;
>>   @@ -1601,7 +1596,6 @@ static int kfd_ioctl_import_dmabuf(struct 
>> file *filep,
>>       }
>>         mutex_unlock(&p->mutex);
>> -    dma_buf_put(dmabuf);
>>         args->handle = MAKE_HANDLE(args->gpu_id, idr_handle);
>>   @@ -1612,7 +1606,6 @@ static int kfd_ioctl_import_dmabuf(struct 
>> file *filep,
>>                              pdd->drm_priv, NULL);
>>   err_unlock:
>>       mutex_unlock(&p->mutex);
>> -    dma_buf_put(dmabuf);
>>       return r;
>>   }

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM
  2023-11-08 23:26     ` Felix Kuehling
@ 2023-11-08 23:41       ` Chen, Xiaogang
  0 siblings, 0 replies; 23+ messages in thread
From: Chen, Xiaogang @ 2023-11-08 23:41 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx; +Cc: ramesh.errabolu, christian.koenig


On 11/8/2023 5:26 PM, Felix Kuehling wrote:
> On 2023-11-08 18:20, Chen, Xiaogang wrote:
>>
>> On 11/7/2023 10:58 AM, Felix Kuehling wrote:
>>> Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This
>>> ensures that a GEM handle is created on import and that obj->dma_buf
>>> will be set and remain set as long as the object is imported into KFD.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  9 ++-
>>>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 64 
>>> +++++++++++++------
>>>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 15 ++---
>>>   3 files changed, 52 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>>> index 4caf8cece028..88a0e0734270 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>>> @@ -318,11 +318,10 @@ int 
>>> amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
>>>                           struct dma_fence **ef);
>>>   int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>>>                             struct kfd_vm_fault_info *info);
>>> -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
>>> -                      struct dma_buf *dmabuf,
>>> -                      uint64_t va, void *drm_priv,
>>> -                      struct kgd_mem **mem, uint64_t *size,
>>> -                      uint64_t *mmap_offset);
>>> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device 
>>> *adev, int fd,
>>> +                     uint64_t va, void *drm_priv,
>>> +                     struct kgd_mem **mem, uint64_t *size,
>>> +                     uint64_t *mmap_offset);
>>>   int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem,
>>>                         struct dma_buf **dmabuf);
>>>   void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> index 4bb8b5fd7598..1077de8bced2 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> @@ -2006,8 +2006,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>>>         /* Free the BO*/
>>> drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
>>> -    if (!mem->is_imported)
>>> -        drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
>>> +    drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
>>
>> A minor thing for this patch: I think this is a correction for last 
>> patch " Export DMABufs from KFD using GEM handles". mem->gem_handle 
>> is created unconditionally at 
>> amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu.  drm_gem_handle_delete 
>> should be put at the lat patch.
>
> This change was intentional. Without this patch, imported DMABufs 
> didn't get a GEM handle, so I didn't need to delete one. With this 
> patch, I now have a GEM handle for imported BOs, so I delete the GEM 
> handle unconditionally.
>
oh, imported mem->gem_handle is generated at 
amdgpu_amdkfd_gpuvm_import_dmabuf_fd. Need refresh my memory after a 
while.  This patch is:

Reviwed-by: Xiaogang.Chen <Xiaogang.Chen@amd.com>

> Regards,
>   Felix
>
>
>>
>> Regards
>>
>> Xiaogang
>>
>>>       if (mem->dmabuf) {
>>>           dma_buf_put(mem->dmabuf);
>>>           mem->dmabuf = NULL;
>>> @@ -2363,34 +2362,26 @@ int 
>>> amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>>>       return 0;
>>>   }
>>>   -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
>>> -                      struct dma_buf *dma_buf,
>>> -                      uint64_t va, void *drm_priv,
>>> -                      struct kgd_mem **mem, uint64_t *size,
>>> -                      uint64_t *mmap_offset)
>>> +static int import_obj_create(struct amdgpu_device *adev,
>>> +                 struct dma_buf *dma_buf,
>>> +                 struct drm_gem_object *obj,
>>> +                 uint64_t va, void *drm_priv,
>>> +                 struct kgd_mem **mem, uint64_t *size,
>>> +                 uint64_t *mmap_offset)
>>>   {
>>>       struct amdgpu_vm *avm = drm_priv_to_vm(drm_priv);
>>> -    struct drm_gem_object *obj;
>>>       struct amdgpu_bo *bo;
>>>       int ret;
>>>   -    obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
>>> -    if (IS_ERR(obj))
>>> -        return PTR_ERR(obj);
>>> -
>>>       bo = gem_to_amdgpu_bo(obj);
>>>       if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
>>> -                    AMDGPU_GEM_DOMAIN_GTT))) {
>>> +                    AMDGPU_GEM_DOMAIN_GTT)))
>>>           /* Only VRAM and GTT BOs are supported */
>>> -        ret = -EINVAL;
>>> -        goto err_put_obj;
>>> -    }
>>> +        return -EINVAL;
>>>         *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
>>> -    if (!*mem) {
>>> -        ret = -ENOMEM;
>>> -        goto err_put_obj;
>>> -    }
>>> +    if (!*mem)
>>> +        return -ENOMEM;
>>>         ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
>>>       if (ret)
>>> @@ -2440,8 +2431,41 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct 
>>> amdgpu_device *adev,
>>>       drm_vma_node_revoke(&obj->vma_node, drm_priv);
>>>   err_free_mem:
>>>       kfree(*mem);
>>> +    return ret;
>>> +}
>>> +
>>> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device 
>>> *adev, int fd,
>>> +                     uint64_t va, void *drm_priv,
>>> +                     struct kgd_mem **mem, uint64_t *size,
>>> +                     uint64_t *mmap_offset)
>>> +{
>>> +    struct drm_gem_object *obj;
>>> +    uint32_t handle;
>>> +    int ret;
>>> +
>>> +    ret = drm_gem_prime_fd_to_handle(&adev->ddev, 
>>> adev->kfd.client.file, fd,
>>> +                     &handle);
>>> +    if (ret)
>>> +        return ret;
>>> +    obj = drm_gem_object_lookup(adev->kfd.client.file, handle);
>>> +    if (!obj) {
>>> +        ret = -EINVAL;
>>> +        goto err_release_handle;
>>> +    }
>>> +
>>> +    ret = import_obj_create(adev, obj->dma_buf, obj, va, drm_priv, 
>>> mem, size,
>>> +                mmap_offset);
>>> +    if (ret)
>>> +        goto err_put_obj;
>>> +
>>> +    (*mem)->gem_handle = handle;
>>> +
>>> +    return 0;
>>> +
>>>   err_put_obj:
>>>       drm_gem_object_put(obj);
>>> +err_release_handle:
>>> +    drm_gem_handle_delete(adev->kfd.client.file, handle);
>>>       return ret;
>>>   }
>>>   diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>> index 4417a9863cd0..1a2e9f564b7f 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>> @@ -1564,16 +1564,11 @@ static int kfd_ioctl_import_dmabuf(struct 
>>> file *filep,
>>>   {
>>>       struct kfd_ioctl_import_dmabuf_args *args = data;
>>>       struct kfd_process_device *pdd;
>>> -    struct dma_buf *dmabuf;
>>>       int idr_handle;
>>>       uint64_t size;
>>>       void *mem;
>>>       int r;
>>>   -    dmabuf = dma_buf_get(args->dmabuf_fd);
>>> -    if (IS_ERR(dmabuf))
>>> -        return PTR_ERR(dmabuf);
>>> -
>>>       mutex_lock(&p->mutex);
>>>       pdd = kfd_process_device_data_by_id(p, args->gpu_id);
>>>       if (!pdd) {
>>> @@ -1587,10 +1582,10 @@ static int kfd_ioctl_import_dmabuf(struct 
>>> file *filep,
>>>           goto err_unlock;
>>>       }
>>>   -    r = amdgpu_amdkfd_gpuvm_import_dmabuf(pdd->dev->adev, dmabuf,
>>> -                          args->va_addr, pdd->drm_priv,
>>> -                          (struct kgd_mem **)&mem, &size,
>>> -                          NULL);
>>> +    r = amdgpu_amdkfd_gpuvm_import_dmabuf_fd(pdd->dev->adev, 
>>> args->dmabuf_fd,
>>> +                         args->va_addr, pdd->drm_priv,
>>> +                         (struct kgd_mem **)&mem, &size,
>>> +                         NULL);
>>>       if (r)
>>>           goto err_unlock;
>>>   @@ -1601,7 +1596,6 @@ static int kfd_ioctl_import_dmabuf(struct 
>>> file *filep,
>>>       }
>>>         mutex_unlock(&p->mutex);
>>> -    dma_buf_put(dmabuf);
>>>         args->handle = MAKE_HANDLE(args->gpu_id, idr_handle);
>>>   @@ -1612,7 +1606,6 @@ static int kfd_ioctl_import_dmabuf(struct 
>>> file *filep,
>>>                              pdd->drm_priv, NULL);
>>>   err_unlock:
>>>       mutex_unlock(&p->mutex);
>>> -    dma_buf_put(dmabuf);
>>>       return r;
>>>   }

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs
  2023-11-08 21:23       ` Felix Kuehling
@ 2023-11-09  8:12         ` Christian König
  2023-11-14 22:26           ` Felix Kuehling
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2023-11-09  8:12 UTC (permalink / raw)
  To: Felix Kuehling, Christian König, amd-gfx
  Cc: xiaogang.chen, ramesh.errabolu

Am 08.11.23 um 22:23 schrieb Felix Kuehling:
>
> On 2023-11-08 07:28, Christian König wrote:
>> Not necessary objections to this patch here, but rather how this new 
>> state is used later on.
>>
>> The fundamental problem is that re-validating things in 
>> amdgpu_vm_handle_moved() won't work in all cases.
>>
>> The general approach for both classic CS IOCTL as well as user queues 
>> is the following:
>>
>> 1. Grab all the necessary locks
>>     The VM lock + either everything inside the VM (user queues) or 
>> just the BOs in the submission (CS IOCTL).
>>
>> 2. Validate everything locked
>>     It can be that additional BO locks are grabbed for eviction and 
>> even locked BOs are shuffled around during that.
>>
>> 3. Update the PDEs and PTEs
>>     This can be done only after validating everything because things 
>> can still shuffle around during validation.
>>
>> 4. Do your CS or make the userqueu / process runable etc...
>>
>> 5. Attach fences to the locked BOs.
>>
>> 6. Unlock everything again.
>>
>> I think that is part of the problem why handling all updates in 
>> amdgpu_vm_handle_moved() for the CS doesn't work and sooner or later 
>> you might run into the same issue during process restore as well.
>>
>> If I'm not completely mistaken this should be solvable by moving all 
>> validation to amdgpu_vm_validate_pt_bos() instead (but let us rename 
>> that function).
>
> I'm trying to understand what the problem is. As far as I can tell, 
> amdgpu_vm_handle_moved just runs a bit later in the CS and 
> process-restore code than amdgpu_vm_validate_pt_bos. But it runs with 
> all the same reservations locked. My current implementation in 
> amdgpu_vm_handle_moved has some failure cases when reservations cannot 
> be acquired. I think your drm_exec patch series would make it possible 
> to handle this more robustly. That said, in case of KFD process 
> restore, we can retry in case of failures already.
>
> Anyway, moving my re-validation code to a renamed version of 
> amdgpu_vm_validate_pt_bos shouldn't be a big problem. I just don't 
> understand what problem that solves. Maybe it just makes the code 
> cleaner to handle both evicted states in one place? Or maybe you're 
> pointing to a way to merge the two states into one?

Well yeah merging both state into one might be nice to have, but that 
isn't the fundamental problem.

What basically happens during validation of a BO is that this BO is move 
to a desired place (as described by the placement parameter). During 
this it can happen that we shuffle out other BOs.

Now the issue is that when you validate in amdgpu_vm_handle_moved() it 
can be that by validating BO we shuffle out other BOs which then end up 
of the evicted list again.

This most likely won't happen with KFD, but for GFX CS it's certainly 
possible. So the idea is that we first validate everything and then 
update all the page tables and not inter mix the two operations.

Regards,
Christian.

>
> Regards,
>   Felix
>
>
>>
>> Regards,
>> Christian.
>>
>> Am 07.11.23 um 23:11 schrieb Felix Kuehling:
>>> Hi Christian,
>>>
>>> I know you have objected to this patch before. I still think this is 
>>> the best solution for what I need. I can talk you through my 
>>> reasoning by email or offline. If I can't convince you, I will need 
>>> your guidance for a better solution.
>>>
>>> Thanks,
>>>   Felix
>>>
>>>
>>> On 2023-11-07 11:58, Felix Kuehling wrote:
>>>> Create a new VM state to track user BOs that are in the system domain.
>>>> In the next patch this will be used do conditionally re-validate 
>>>> them in
>>>> amdgpu_vm_handle_moved.
>>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++++-
>>>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 1442d97ddd0f..0d685577243c 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -232,6 +232,22 @@ static void amdgpu_vm_bo_invalidated(struct 
>>>> amdgpu_vm_bo_base *vm_bo)
>>>>       spin_unlock(&vm_bo->vm->status_lock);
>>>>   }
>>>>   +/**
>>>> + * amdgpu_vm_bo_evicted_user - vm_bo is evicted
>>>> + *
>>>> + * @vm_bo: vm_bo which is evicted
>>>> + *
>>>> + * State for BOs used by user mode queues which are not at the 
>>>> location they
>>>> + * should be.
>>>> + */
>>>> +static void amdgpu_vm_bo_evicted_user(struct amdgpu_vm_bo_base 
>>>> *vm_bo)
>>>> +{
>>>> +    vm_bo->moved = true;
>>>> +    spin_lock(&vm_bo->vm->status_lock);
>>>> +    list_move(&vm_bo->vm_status, &vm_bo->vm->evicted_user);
>>>> +    spin_unlock(&vm_bo->vm->status_lock);
>>>> +}
>>>> +
>>>>   /**
>>>>    * amdgpu_vm_bo_relocated - vm_bo is reloacted
>>>>    *
>>>> @@ -2110,6 +2126,7 @@ int amdgpu_vm_init(struct amdgpu_device 
>>>> *adev, struct amdgpu_vm *vm, int32_t xcp
>>>>       for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
>>>>           vm->reserved_vmid[i] = NULL;
>>>>       INIT_LIST_HEAD(&vm->evicted);
>>>> +    INIT_LIST_HEAD(&vm->evicted_user);
>>>>       INIT_LIST_HEAD(&vm->relocated);
>>>>       INIT_LIST_HEAD(&vm->moved);
>>>>       INIT_LIST_HEAD(&vm->idle);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> index 12cac06fa289..939d0c2219c0 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> @@ -286,9 +286,12 @@ struct amdgpu_vm {
>>>>       /* Lock to protect vm_bo add/del/move on all lists of vm */
>>>>       spinlock_t        status_lock;
>>>>   -    /* BOs who needs a validation */
>>>> +    /* Per VM and PT BOs who needs a validation */
>>>>       struct list_head    evicted;
>>>>   +    /* BOs for user mode queues that need a validation */
>>>> +    struct list_head    evicted_user;
>>>> +
>>>>       /* PT BOs which relocated and their parent need an update */
>>>>       struct list_head    relocated;
>>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM
  2023-11-07 16:58 ` [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM Felix Kuehling
  2023-11-08  0:01   ` Errabolu, Ramesh
  2023-11-08 23:20   ` Chen, Xiaogang
@ 2023-11-09  8:16   ` Christian König
  2 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2023-11-09  8:16 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx; +Cc: xiaogang.chen, ramesh.errabolu, christian.koenig

Am 07.11.23 um 17:58 schrieb Felix Kuehling:
> Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This
> ensures that a GEM handle is created on import and that obj->dma_buf
> will be set and remain set as long as the object is imported into KFD.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  9 ++-
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 64 +++++++++++++------
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 15 ++---
>   3 files changed, 52 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 4caf8cece028..88a0e0734270 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -318,11 +318,10 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
>   					    struct dma_fence **ef);
>   int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>   					      struct kfd_vm_fault_info *info);
> -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
> -				      struct dma_buf *dmabuf,
> -				      uint64_t va, void *drm_priv,
> -				      struct kgd_mem **mem, uint64_t *size,
> -				      uint64_t *mmap_offset);
> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
> +					 uint64_t va, void *drm_priv,
> +					 struct kgd_mem **mem, uint64_t *size,
> +					 uint64_t *mmap_offset);
>   int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem,
>   				      struct dma_buf **dmabuf);
>   void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 4bb8b5fd7598..1077de8bced2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -2006,8 +2006,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>   
>   	/* Free the BO*/
>   	drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
> -	if (!mem->is_imported)
> -		drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
> +	drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
>   	if (mem->dmabuf) {
>   		dma_buf_put(mem->dmabuf);
>   		mem->dmabuf = NULL;
> @@ -2363,34 +2362,26 @@ int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
>   	return 0;
>   }
>   
> -int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
> -				      struct dma_buf *dma_buf,
> -				      uint64_t va, void *drm_priv,
> -				      struct kgd_mem **mem, uint64_t *size,
> -				      uint64_t *mmap_offset)
> +static int import_obj_create(struct amdgpu_device *adev,
> +			     struct dma_buf *dma_buf,
> +			     struct drm_gem_object *obj,
> +			     uint64_t va, void *drm_priv,
> +			     struct kgd_mem **mem, uint64_t *size,
> +			     uint64_t *mmap_offset)
>   {
>   	struct amdgpu_vm *avm = drm_priv_to_vm(drm_priv);
> -	struct drm_gem_object *obj;
>   	struct amdgpu_bo *bo;
>   	int ret;
>   
> -	obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
> -	if (IS_ERR(obj))
> -		return PTR_ERR(obj);
> -
>   	bo = gem_to_amdgpu_bo(obj);
>   	if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
> -				    AMDGPU_GEM_DOMAIN_GTT))) {
> +				    AMDGPU_GEM_DOMAIN_GTT)))
>   		/* Only VRAM and GTT BOs are supported */
> -		ret = -EINVAL;
> -		goto err_put_obj;
> -	}
> +		return -EINVAL;
>   
>   	*mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
> -	if (!*mem) {
> -		ret = -ENOMEM;
> -		goto err_put_obj;
> -	}
> +	if (!*mem)
> +		return -ENOMEM;
>   
>   	ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
>   	if (ret)
> @@ -2440,8 +2431,41 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
>   	drm_vma_node_revoke(&obj->vma_node, drm_priv);
>   err_free_mem:
>   	kfree(*mem);
> +	return ret;
> +}
> +
> +int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct amdgpu_device *adev, int fd,
> +					 uint64_t va, void *drm_priv,
> +					 struct kgd_mem **mem, uint64_t *size,
> +					 uint64_t *mmap_offset)
> +{
> +	struct drm_gem_object *obj;
> +	uint32_t handle;
> +	int ret;
> +
> +	ret = drm_gem_prime_fd_to_handle(&adev->ddev, adev->kfd.client.file, fd,
> +					 &handle);
> +	if (ret)
> +		return ret;
> +	obj = drm_gem_object_lookup(adev->kfd.client.file, handle);
> +	if (!obj) {
> +		ret = -EINVAL;
> +		goto err_release_handle;
> +	}
> +
> +	ret = import_obj_create(adev, obj->dma_buf, obj, va, drm_priv, mem, size,
> +				mmap_offset);
> +	if (ret)
> +		goto err_put_obj;
> +
> +	(*mem)->gem_handle = handle;
> +
> +	return 0;
> +
>   err_put_obj:
>   	drm_gem_object_put(obj);
> +err_release_handle:
> +	drm_gem_handle_delete(adev->kfd.client.file, handle);
>   	return ret;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 4417a9863cd0..1a2e9f564b7f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1564,16 +1564,11 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   {
>   	struct kfd_ioctl_import_dmabuf_args *args = data;
>   	struct kfd_process_device *pdd;
> -	struct dma_buf *dmabuf;
>   	int idr_handle;
>   	uint64_t size;
>   	void *mem;
>   	int r;
>   
> -	dmabuf = dma_buf_get(args->dmabuf_fd);
> -	if (IS_ERR(dmabuf))
> -		return PTR_ERR(dmabuf);
> -
>   	mutex_lock(&p->mutex);
>   	pdd = kfd_process_device_data_by_id(p, args->gpu_id);
>   	if (!pdd) {
> @@ -1587,10 +1582,10 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   		goto err_unlock;
>   	}
>   
> -	r = amdgpu_amdkfd_gpuvm_import_dmabuf(pdd->dev->adev, dmabuf,
> -					      args->va_addr, pdd->drm_priv,
> -					      (struct kgd_mem **)&mem, &size,
> -					      NULL);
> +	r = amdgpu_amdkfd_gpuvm_import_dmabuf_fd(pdd->dev->adev, args->dmabuf_fd,
> +						 args->va_addr, pdd->drm_priv,
> +						 (struct kgd_mem **)&mem, &size,
> +						 NULL);
>   	if (r)
>   		goto err_unlock;
>   
> @@ -1601,7 +1596,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   	}
>   
>   	mutex_unlock(&p->mutex);
> -	dma_buf_put(dmabuf);
>   
>   	args->handle = MAKE_HANDLE(args->gpu_id, idr_handle);
>   
> @@ -1612,7 +1606,6 @@ static int kfd_ioctl_import_dmabuf(struct file *filep,
>   					       pdd->drm_priv, NULL);
>   err_unlock:
>   	mutex_unlock(&p->mutex);
> -	dma_buf_put(dmabuf);
>   	return r;
>   }
>   


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs
  2023-11-09  8:12         ` Christian König
@ 2023-11-14 22:26           ` Felix Kuehling
  0 siblings, 0 replies; 23+ messages in thread
From: Felix Kuehling @ 2023-11-14 22:26 UTC (permalink / raw)
  To: Christian König, Christian König, amd-gfx
  Cc: xiaogang.chen, ramesh.errabolu


On 2023-11-09 03:12, Christian König wrote:
> Am 08.11.23 um 22:23 schrieb Felix Kuehling:
>>
>> On 2023-11-08 07:28, Christian König wrote:
>>> Not necessary objections to this patch here, but rather how this new 
>>> state is used later on.
>>>
>>> The fundamental problem is that re-validating things in 
>>> amdgpu_vm_handle_moved() won't work in all cases.
>>>
>>> The general approach for both classic CS IOCTL as well as user 
>>> queues is the following:
>>>
>>> 1. Grab all the necessary locks
>>>     The VM lock + either everything inside the VM (user queues) or 
>>> just the BOs in the submission (CS IOCTL).
>>>
>>> 2. Validate everything locked
>>>     It can be that additional BO locks are grabbed for eviction and 
>>> even locked BOs are shuffled around during that.
>>>
>>> 3. Update the PDEs and PTEs
>>>     This can be done only after validating everything because things 
>>> can still shuffle around during validation.
>>>
>>> 4. Do your CS or make the userqueu / process runable etc...
>>>
>>> 5. Attach fences to the locked BOs.
>>>
>>> 6. Unlock everything again.
>>>
>>> I think that is part of the problem why handling all updates in 
>>> amdgpu_vm_handle_moved() for the CS doesn't work and sooner or later 
>>> you might run into the same issue during process restore as well.
>>>
>>> If I'm not completely mistaken this should be solvable by moving all 
>>> validation to amdgpu_vm_validate_pt_bos() instead (but let us rename 
>>> that function).
>>
>> I'm trying to understand what the problem is. As far as I can tell, 
>> amdgpu_vm_handle_moved just runs a bit later in the CS and 
>> process-restore code than amdgpu_vm_validate_pt_bos. But it runs with 
>> all the same reservations locked. My current implementation in 
>> amdgpu_vm_handle_moved has some failure cases when reservations 
>> cannot be acquired. I think your drm_exec patch series would make it 
>> possible to handle this more robustly. That said, in case of KFD 
>> process restore, we can retry in case of failures already.
>>
>> Anyway, moving my re-validation code to a renamed version of 
>> amdgpu_vm_validate_pt_bos shouldn't be a big problem. I just don't 
>> understand what problem that solves. Maybe it just makes the code 
>> cleaner to handle both evicted states in one place? Or maybe you're 
>> pointing to a way to merge the two states into one?
>
> Well yeah merging both state into one might be nice to have, but that 
> isn't the fundamental problem.
>
> What basically happens during validation of a BO is that this BO is 
> move to a desired place (as described by the placement parameter). 
> During this it can happen that we shuffle out other BOs.
>
> Now the issue is that when you validate in amdgpu_vm_handle_moved() it 
> can be that by validating BO we shuffle out other BOs which then end 
> up of the evicted list again.
>
> This most likely won't happen with KFD, but for GFX CS it's certainly 
> possible. So the idea is that we first validate everything and then 
> update all the page tables and not inter mix the two operations.

I have something working now. But I think the fundamental problem is 
what you say, that BOs get can evicted again before we update the page 
tables. It won't be a problem for KFD, because I only care about DMABuf 
imports, and they don't get evicted if I have the original allocations 
reserved. I just have to make sure I validate/move the original 
allocations before I validate their DMABuf imports. This could still be 
a problem for future graphics interop.

The problem is, that I don't currently have a way to keep the validated 
BOs reserved to prevent their eviction. That is on amd-staging-drm-next, 
where I don't have your drm_exec patch series yet. Maybe this will get 
easier once the branch gets rebased on Linux 6.5 or newer.

Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Regards,
>>   Felix
>>
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 07.11.23 um 23:11 schrieb Felix Kuehling:
>>>> Hi Christian,
>>>>
>>>> I know you have objected to this patch before. I still think this 
>>>> is the best solution for what I need. I can talk you through my 
>>>> reasoning by email or offline. If I can't convince you, I will need 
>>>> your guidance for a better solution.
>>>>
>>>> Thanks,
>>>>   Felix
>>>>
>>>>
>>>> On 2023-11-07 11:58, Felix Kuehling wrote:
>>>>> Create a new VM state to track user BOs that are in the system 
>>>>> domain.
>>>>> In the next patch this will be used do conditionally re-validate 
>>>>> them in
>>>>> amdgpu_vm_handle_moved.
>>>>>
>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++++++++++++++++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++++-
>>>>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index 1442d97ddd0f..0d685577243c 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -232,6 +232,22 @@ static void amdgpu_vm_bo_invalidated(struct 
>>>>> amdgpu_vm_bo_base *vm_bo)
>>>>>       spin_unlock(&vm_bo->vm->status_lock);
>>>>>   }
>>>>>   +/**
>>>>> + * amdgpu_vm_bo_evicted_user - vm_bo is evicted
>>>>> + *
>>>>> + * @vm_bo: vm_bo which is evicted
>>>>> + *
>>>>> + * State for BOs used by user mode queues which are not at the 
>>>>> location they
>>>>> + * should be.
>>>>> + */
>>>>> +static void amdgpu_vm_bo_evicted_user(struct amdgpu_vm_bo_base 
>>>>> *vm_bo)
>>>>> +{
>>>>> +    vm_bo->moved = true;
>>>>> +    spin_lock(&vm_bo->vm->status_lock);
>>>>> +    list_move(&vm_bo->vm_status, &vm_bo->vm->evicted_user);
>>>>> +    spin_unlock(&vm_bo->vm->status_lock);
>>>>> +}
>>>>> +
>>>>>   /**
>>>>>    * amdgpu_vm_bo_relocated - vm_bo is reloacted
>>>>>    *
>>>>> @@ -2110,6 +2126,7 @@ int amdgpu_vm_init(struct amdgpu_device 
>>>>> *adev, struct amdgpu_vm *vm, int32_t xcp
>>>>>       for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
>>>>>           vm->reserved_vmid[i] = NULL;
>>>>>       INIT_LIST_HEAD(&vm->evicted);
>>>>> +    INIT_LIST_HEAD(&vm->evicted_user);
>>>>>       INIT_LIST_HEAD(&vm->relocated);
>>>>>       INIT_LIST_HEAD(&vm->moved);
>>>>>       INIT_LIST_HEAD(&vm->idle);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> index 12cac06fa289..939d0c2219c0 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> @@ -286,9 +286,12 @@ struct amdgpu_vm {
>>>>>       /* Lock to protect vm_bo add/del/move on all lists of vm */
>>>>>       spinlock_t        status_lock;
>>>>>   -    /* BOs who needs a validation */
>>>>> +    /* Per VM and PT BOs who needs a validation */
>>>>>       struct list_head    evicted;
>>>>>   +    /* BOs for user mode queues that need a validation */
>>>>> +    struct list_head    evicted_user;
>>>>> +
>>>>>       /* PT BOs which relocated and their parent need an update */
>>>>>       struct list_head    relocated;
>>>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles
  2023-11-07 16:58 ` [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles Felix Kuehling
  2023-11-07 19:44   ` Errabolu, Ramesh
@ 2023-11-16 21:53   ` Felix Kuehling
  2023-11-17 16:30     ` Christian König
  1 sibling, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2023-11-16 21:53 UTC (permalink / raw)
  To: amd-gfx, Maling list - DRI developers, Thomas Zimmermann
  Cc: xiaogang.chen, ramesh.errabolu, christian.koenig


On 2023-11-07 11:58, Felix Kuehling wrote:
> Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM
> handles are created in a drm_client_dev context to avoid exposing them
> in user mode contexts through a DMABuf import.
This patch (and the next one) won't apply upstream because Thomas 
Zimmerman just made drm_gem_prime_fd_to_handle and 
drm_gem_prime_handle_to_fd private because nobody was using them. 
(drm/prime: Unexport helpers for fd/handle conversion)

Is it OK to export those APIs again? Or is there a better way for 
drivers to export/import DMABufs without using the GEM ioctls?

Regards,
   Felix


>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 11 +++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  5 +++
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 33 +++++++++++++++----
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  4 +--
>   4 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 6ab17330a6ed..b0a67f16540a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -142,6 +142,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   {
>   	int i;
>   	int last_valid_bit;
> +	int ret;
>   
>   	amdgpu_amdkfd_gpuvm_init_mem_limits();
>   
> @@ -160,6 +161,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   			.enable_mes = adev->enable_mes,
>   		};
>   
> +		ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", NULL);
> +		if (ret) {
> +			dev_err(adev->dev, "Failed to init DRM client: %d\n", ret);
> +			return;
> +		}
> +
>   		/* this is going to have a few of the MSBs set that we need to
>   		 * clear
>   		 */
> @@ -198,6 +205,10 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   
>   		adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
>   							&gpu_resources);
> +		if (adev->kfd.init_complete)
> +			drm_client_register(&adev->kfd.client);
> +		else
> +			drm_client_release(&adev->kfd.client);
>   
>   		amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 68d534a89942..4caf8cece028 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -32,6 +32,7 @@
>   #include <linux/mmu_notifier.h>
>   #include <linux/memremap.h>
>   #include <kgd_kfd_interface.h>
> +#include <drm/drm_client.h>
>   #include <drm/ttm/ttm_execbuf_util.h>
>   #include "amdgpu_sync.h"
>   #include "amdgpu_vm.h"
> @@ -84,6 +85,7 @@ struct kgd_mem {
>   
>   	struct amdgpu_sync sync;
>   
> +	uint32_t gem_handle;
>   	bool aql_queue;
>   	bool is_imported;
>   };
> @@ -106,6 +108,9 @@ struct amdgpu_kfd_dev {
>   
>   	/* HMM page migration MEMORY_DEVICE_PRIVATE mapping */
>   	struct dev_pagemap pgmap;
> +
> +	/* Client for KFD BO GEM handle allocations */
> +	struct drm_client_dev client;
>   };
>   
>   enum kgd_engine_type {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 0c1cb6048259..4bb8b5fd7598 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -25,6 +25,7 @@
>   #include <linux/pagemap.h>
>   #include <linux/sched/mm.h>
>   #include <linux/sched/task.h>
> +#include <linux/fdtable.h>
>   #include <drm/ttm/ttm_tt.h>
>   
>   #include "amdgpu_object.h"
> @@ -804,13 +805,22 @@ kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,
>   static int kfd_mem_export_dmabuf(struct kgd_mem *mem)
>   {
>   	if (!mem->dmabuf) {
> -		struct dma_buf *ret = amdgpu_gem_prime_export(
> -			&mem->bo->tbo.base,
> +		struct amdgpu_device *bo_adev;
> +		struct dma_buf *dmabuf;
> +		int r, fd;
> +
> +		bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
> +		r = drm_gem_prime_handle_to_fd(&bo_adev->ddev, bo_adev->kfd.client.file,
> +					       mem->gem_handle,
>   			mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
> -				DRM_RDWR : 0);
> -		if (IS_ERR(ret))
> -			return PTR_ERR(ret);
> -		mem->dmabuf = ret;
> +					       DRM_RDWR : 0, &fd);
> +		if (r)
> +			return r;
> +		dmabuf = dma_buf_get(fd);
> +		close_fd(fd);
> +		if (WARN_ON_ONCE(IS_ERR(dmabuf)))
> +			return PTR_ERR(dmabuf);
> +		mem->dmabuf = dmabuf;
>   	}
>   
>   	return 0;
> @@ -1826,6 +1836,9 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   		pr_debug("Failed to allow vma node access. ret %d\n", ret);
>   		goto err_node_allow;
>   	}
> +	ret = drm_gem_handle_create(adev->kfd.client.file, gobj, &(*mem)->gem_handle);
> +	if (ret)
> +		goto err_gem_handle_create;
>   	bo = gem_to_amdgpu_bo(gobj);
>   	if (bo_type == ttm_bo_type_sg) {
>   		bo->tbo.sg = sg;
> @@ -1877,6 +1890,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   err_pin_bo:
>   err_validate_bo:
>   	remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info);
> +	drm_gem_handle_delete(adev->kfd.client.file, (*mem)->gem_handle);
> +err_gem_handle_create:
>   	drm_vma_node_revoke(&gobj->vma_node, drm_priv);
>   err_node_allow:
>   	/* Don't unreserve system mem limit twice */
> @@ -1991,8 +2006,12 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>   
>   	/* Free the BO*/
>   	drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
> -	if (mem->dmabuf)
> +	if (!mem->is_imported)
> +		drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
> +	if (mem->dmabuf) {
>   		dma_buf_put(mem->dmabuf);
> +		mem->dmabuf = NULL;
> +	}
>   	mutex_destroy(&mem->lock);
>   
>   	/* If this releases the last reference, it will end up calling
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 06988cf1db51..4417a9863cd0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1855,8 +1855,8 @@ static uint32_t get_process_num_bos(struct kfd_process *p)
>   	return num_of_bos;
>   }
>   
> -static int criu_get_prime_handle(struct kgd_mem *mem, int flags,
> -				      u32 *shared_fd)
> +static int criu_get_prime_handle(struct kgd_mem *mem,
> +				 int flags, u32 *shared_fd)
>   {
>   	struct dma_buf *dmabuf;
>   	int ret;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles
  2023-11-16 21:53   ` Felix Kuehling
@ 2023-11-17 16:30     ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2023-11-17 16:30 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx, Maling list - DRI developers,
	Thomas Zimmermann
  Cc: xiaogang.chen, ramesh.errabolu

Am 16.11.23 um 22:53 schrieb Felix Kuehling:
>
> On 2023-11-07 11:58, Felix Kuehling wrote:
>> Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM
>> handles are created in a drm_client_dev context to avoid exposing them
>> in user mode contexts through a DMABuf import.
> This patch (and the next one) won't apply upstream because Thomas 
> Zimmerman just made drm_gem_prime_fd_to_handle and 
> drm_gem_prime_handle_to_fd private because nobody was using them. 
> (drm/prime: Unexport helpers for fd/handle conversion)
>
> Is it OK to export those APIs again? Or is there a better way for 
> drivers to export/import DMABufs without using the GEM ioctls?

Well let me say so: I think it's the least evil approach :)

So yeah, propose a patch to export them again.

Regards,
Christian.

>
> Regards,
>   Felix
>
>
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 11 +++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  5 +++
>>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 33 +++++++++++++++----
>>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  4 +--
>>   4 files changed, 44 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 6ab17330a6ed..b0a67f16540a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -142,6 +142,7 @@ void amdgpu_amdkfd_device_init(struct 
>> amdgpu_device *adev)
>>   {
>>       int i;
>>       int last_valid_bit;
>> +    int ret;
>>         amdgpu_amdkfd_gpuvm_init_mem_limits();
>>   @@ -160,6 +161,12 @@ void amdgpu_amdkfd_device_init(struct 
>> amdgpu_device *adev)
>>               .enable_mes = adev->enable_mes,
>>           };
>>   +        ret = drm_client_init(&adev->ddev, &adev->kfd.client, 
>> "kfd", NULL);
>> +        if (ret) {
>> +            dev_err(adev->dev, "Failed to init DRM client: %d\n", ret);
>> +            return;
>> +        }
>> +
>>           /* this is going to have a few of the MSBs set that we need to
>>            * clear
>>            */
>> @@ -198,6 +205,10 @@ void amdgpu_amdkfd_device_init(struct 
>> amdgpu_device *adev)
>>             adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
>>                               &gpu_resources);
>> +        if (adev->kfd.init_complete)
>> +            drm_client_register(&adev->kfd.client);
>> +        else
>> +            drm_client_release(&adev->kfd.client);
>>             amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> index 68d534a89942..4caf8cece028 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> @@ -32,6 +32,7 @@
>>   #include <linux/mmu_notifier.h>
>>   #include <linux/memremap.h>
>>   #include <kgd_kfd_interface.h>
>> +#include <drm/drm_client.h>
>>   #include <drm/ttm/ttm_execbuf_util.h>
>>   #include "amdgpu_sync.h"
>>   #include "amdgpu_vm.h"
>> @@ -84,6 +85,7 @@ struct kgd_mem {
>>         struct amdgpu_sync sync;
>>   +    uint32_t gem_handle;
>>       bool aql_queue;
>>       bool is_imported;
>>   };
>> @@ -106,6 +108,9 @@ struct amdgpu_kfd_dev {
>>         /* HMM page migration MEMORY_DEVICE_PRIVATE mapping */
>>       struct dev_pagemap pgmap;
>> +
>> +    /* Client for KFD BO GEM handle allocations */
>> +    struct drm_client_dev client;
>>   };
>>     enum kgd_engine_type {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 0c1cb6048259..4bb8b5fd7598 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/pagemap.h>
>>   #include <linux/sched/mm.h>
>>   #include <linux/sched/task.h>
>> +#include <linux/fdtable.h>
>>   #include <drm/ttm/ttm_tt.h>
>>     #include "amdgpu_object.h"
>> @@ -804,13 +805,22 @@ kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,
>>   static int kfd_mem_export_dmabuf(struct kgd_mem *mem)
>>   {
>>       if (!mem->dmabuf) {
>> -        struct dma_buf *ret = amdgpu_gem_prime_export(
>> -            &mem->bo->tbo.base,
>> +        struct amdgpu_device *bo_adev;
>> +        struct dma_buf *dmabuf;
>> +        int r, fd;
>> +
>> +        bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
>> +        r = drm_gem_prime_handle_to_fd(&bo_adev->ddev, 
>> bo_adev->kfd.client.file,
>> +                           mem->gem_handle,
>>               mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
>> -                DRM_RDWR : 0);
>> -        if (IS_ERR(ret))
>> -            return PTR_ERR(ret);
>> -        mem->dmabuf = ret;
>> +                           DRM_RDWR : 0, &fd);
>> +        if (r)
>> +            return r;
>> +        dmabuf = dma_buf_get(fd);
>> +        close_fd(fd);
>> +        if (WARN_ON_ONCE(IS_ERR(dmabuf)))
>> +            return PTR_ERR(dmabuf);
>> +        mem->dmabuf = dmabuf;
>>       }
>>         return 0;
>> @@ -1826,6 +1836,9 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>>           pr_debug("Failed to allow vma node access. ret %d\n", ret);
>>           goto err_node_allow;
>>       }
>> +    ret = drm_gem_handle_create(adev->kfd.client.file, gobj, 
>> &(*mem)->gem_handle);
>> +    if (ret)
>> +        goto err_gem_handle_create;
>>       bo = gem_to_amdgpu_bo(gobj);
>>       if (bo_type == ttm_bo_type_sg) {
>>           bo->tbo.sg = sg;
>> @@ -1877,6 +1890,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>>   err_pin_bo:
>>   err_validate_bo:
>>       remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info);
>> +    drm_gem_handle_delete(adev->kfd.client.file, (*mem)->gem_handle);
>> +err_gem_handle_create:
>>       drm_vma_node_revoke(&gobj->vma_node, drm_priv);
>>   err_node_allow:
>>       /* Don't unreserve system mem limit twice */
>> @@ -1991,8 +2006,12 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>>         /* Free the BO*/
>>       drm_vma_node_revoke(&mem->bo->tbo.base.vma_node, drm_priv);
>> -    if (mem->dmabuf)
>> +    if (!mem->is_imported)
>> +        drm_gem_handle_delete(adev->kfd.client.file, mem->gem_handle);
>> +    if (mem->dmabuf) {
>>           dma_buf_put(mem->dmabuf);
>> +        mem->dmabuf = NULL;
>> +    }
>>       mutex_destroy(&mem->lock);
>>         /* If this releases the last reference, it will end up calling
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 06988cf1db51..4417a9863cd0 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -1855,8 +1855,8 @@ static uint32_t get_process_num_bos(struct 
>> kfd_process *p)
>>       return num_of_bos;
>>   }
>>   -static int criu_get_prime_handle(struct kgd_mem *mem, int flags,
>> -                      u32 *shared_fd)
>> +static int criu_get_prime_handle(struct kgd_mem *mem,
>> +                 int flags, u32 *shared_fd)
>>   {
>>       struct dma_buf *dmabuf;
>>       int ret;


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2023-11-17 16:30 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-07 16:58 [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Felix Kuehling
2023-11-07 16:58 ` [PATCH 2/6] drm/amdgpu: New VM state for evicted user BOs Felix Kuehling
2023-11-07 22:11   ` Felix Kuehling
2023-11-08 12:28     ` Christian König
2023-11-08 21:23       ` Felix Kuehling
2023-11-09  8:12         ` Christian König
2023-11-14 22:26           ` Felix Kuehling
2023-11-07 16:58 ` [PATCH 3/6] drm/amdgpu: Auto-validate DMABuf imports in compute VMs Felix Kuehling
2023-11-07 20:24   ` Joshi, Mukul
2023-11-07 16:58 ` [PATCH 4/6] drm/amdkfd: Export DMABufs from KFD using GEM handles Felix Kuehling
2023-11-07 19:44   ` Errabolu, Ramesh
2023-11-07 19:56     ` Felix Kuehling
2023-11-08  2:08       ` Errabolu, Ramesh
2023-11-16 21:53   ` Felix Kuehling
2023-11-17 16:30     ` Christian König
2023-11-07 16:58 ` [PATCH 5/6] drm/amdkfd: Import DMABufs for interop through DRM Felix Kuehling
2023-11-08  0:01   ` Errabolu, Ramesh
2023-11-08 23:20   ` Chen, Xiaogang
2023-11-08 23:26     ` Felix Kuehling
2023-11-08 23:41       ` Chen, Xiaogang
2023-11-09  8:16   ` Christian König
2023-11-07 16:58 ` [PATCH 6/6] drm/amdkfd: Bump KFD ioctl version Felix Kuehling
2023-11-08 12:11 ` [PATCH 1/6] drm/amdgpu: Fix possible null pointer dereference Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.