public inbox for amd-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves
@ 2018-09-11  9:55 Christian König
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:55 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

We are going to need this for recoverable page fault handling and it
makes shadow handling during GPU reset much more easier.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index b5f20b42439e..a7e39c9dd14b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1363,7 +1363,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
 {
 	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
 	WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
-		     !bo->pin_count);
+		     !bo->pin_count && bo->tbo.type != ttm_bo_type_kernel);
 	WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
 	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM &&
 		     !(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d9f3201c9e5c..2f32dc692d40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -524,7 +524,11 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
 	if (r)
 		goto error;
 
-	r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
+	/* Always block for VM page tables before committing the new location */
+	if (bo->type == ttm_bo_type_kernel)
+		r = ttm_bo_move_accel_cleanup(bo, fence, true, new_mem);
+	else
+		r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
 	dma_fence_put(fence);
 	return r;
 
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11  9:55   ` Christian König
       [not found]     ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:56   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:55 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Even when GPU recovery is disabled we could run into a manually
triggered recovery.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
 1 file changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index a7e39c9dd14b..7db0040ca145 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -38,31 +38,6 @@
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
 
-/**
- * DOC: amdgpu_object
- *
- * This defines the interfaces to operate on an &amdgpu_bo buffer object which
- * represents memory used by driver (VRAM, system memory, etc.). The driver
- * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
- * to create/destroy/set buffer object which are then managed by the kernel TTM
- * memory manager.
- * The interfaces are also used internally by kernel clients, including gfx,
- * uvd, etc. for kernel managed allocations used by the GPU.
- *
- */
-
-static bool amdgpu_bo_need_backup(struct amdgpu_device *adev)
-{
-	if (adev->flags & AMD_IS_APU)
-		return false;
-
-	if (amdgpu_gpu_recovery == 0 ||
-	    (amdgpu_gpu_recovery == -1  && !amdgpu_sriov_vf(adev)))
-		return false;
-
-	return true;
-}
-
 /**
  * amdgpu_bo_subtract_pin_size - Remove BO from pin_size accounting
  *
@@ -596,7 +571,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 	if (r)
 		return r;
 
-	if ((flags & AMDGPU_GEM_CREATE_SHADOW) && amdgpu_bo_need_backup(adev)) {
+	if ((flags & AMDGPU_GEM_CREATE_SHADOW) && !(adev->flags & AMD_IS_APU)) {
 		if (!bp->resv)
 			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
 							NULL));
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:55   ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
@ 2018-09-11  9:56   ` Christian König
       [not found]     ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:56   ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

They aren't directly used by the hardware.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 7db0040ca145..3a6f92de5504 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -516,7 +516,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 }
 
 static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
-				   unsigned long size, int byte_align,
+				   unsigned long size,
 				   struct amdgpu_bo *bo)
 {
 	struct amdgpu_bo_param bp;
@@ -527,7 +527,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 
 	memset(&bp, 0, sizeof(bp));
 	bp.size = size;
-	bp.byte_align = byte_align;
 	bp.domain = AMDGPU_GEM_DOMAIN_GTT;
 	bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
 		AMDGPU_GEM_CREATE_SHADOW;
@@ -576,7 +575,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
 							NULL));
 
-		r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
+		r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
 
 		if (!bp->resv)
 			reservation_object_unlock((*bo_ptr)->tbo.resv);
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:55   ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
  2018-09-11  9:56   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
@ 2018-09-11  9:56   ` Christian König
       [not found]     ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:56   ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
  2018-09-13  9:28   ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

It shouldn't add much overhead and we should make sure that critical
VRAM content is always restored.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 93476b8c2e72..5eba66ecf668 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2989,7 +2989,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
 }
 
 /**
- * amdgpu_device_handle_vram_lost - Handle the loss of VRAM contents
+ * amdgpu_device_recover_vram - Recover some VRAM contents
  *
  * @adev: amdgpu_device pointer
  *
@@ -2998,7 +2998,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
  * the contents of VRAM might be lost.
  * Returns 0 on success, 1 on failure.
  */
-static int amdgpu_device_handle_vram_lost(struct amdgpu_device *adev)
+static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
 {
 	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
 	struct amdgpu_bo *bo, *tmp;
@@ -3125,8 +3125,8 @@ static int amdgpu_device_reset(struct amdgpu_device *adev)
 		}
 	}
 
-	if (!r && ((need_full_reset && !(adev->flags & AMD_IS_APU)) || vram_lost))
-		r = amdgpu_device_handle_vram_lost(adev);
+	if (!r)
+		r = amdgpu_device_recover_vram(adev);
 
 	return r;
 }
@@ -3172,7 +3172,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 	amdgpu_virt_release_full_gpu(adev, true);
 	if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
 		atomic_inc(&adev->vram_lost_counter);
-		r = amdgpu_device_handle_vram_lost(adev);
+		r = amdgpu_device_recover_vram(adev);
 	}
 
 	return r;
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2018-09-11  9:56   ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
@ 2018-09-11  9:56   ` Christian König
       [not found]     ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-13  9:28   ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Don't grab the reservation lock any more and simplify the handling quite
a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 ++++++++---------------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
 3 files changed, 43 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5eba66ecf668..20bb702f5c7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2940,54 +2940,6 @@ static int amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
 	return 0;
 }
 
-/**
- * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM buffers
- *
- * @adev: amdgpu_device pointer
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: amdgpu_bo buffer whose shadow is being restored
- * @fence: dma_fence associated with the operation
- *
- * Restores the VRAM buffer contents from the shadow in GTT.  Used to
- * restore things like GPUVM page tables after a GPU reset where
- * the contents of VRAM might be lost.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
-						  struct amdgpu_ring *ring,
-						  struct amdgpu_bo *bo,
-						  struct dma_fence **fence)
-{
-	uint32_t domain;
-	int r;
-
-	if (!bo->shadow)
-		return 0;
-
-	r = amdgpu_bo_reserve(bo, true);
-	if (r)
-		return r;
-	domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
-	/* if bo has been evicted, then no need to recover */
-	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
-		r = amdgpu_bo_validate(bo->shadow);
-		if (r) {
-			DRM_ERROR("bo validate failed!\n");
-			goto err;
-		}
-
-		r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
-						 NULL, fence, true);
-		if (r) {
-			DRM_ERROR("recover page table failed!\n");
-			goto err;
-		}
-	}
-err:
-	amdgpu_bo_unreserve(bo);
-	return r;
-}
-
 /**
  * amdgpu_device_recover_vram - Recover some VRAM contents
  *
@@ -2996,16 +2948,15 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
  * Restores the contents of VRAM buffers from the shadows in GTT.  Used to
  * restore things like GPUVM page tables after a GPU reset where
  * the contents of VRAM might be lost.
- * Returns 0 on success, 1 on failure.
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
  */
 static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
 {
-	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
-	struct amdgpu_bo *bo, *tmp;
 	struct dma_fence *fence = NULL, *next = NULL;
-	long r = 1;
-	int i = 0;
-	long tmo;
+	struct amdgpu_bo *shadow;
+	long r = 1, tmo;
 
 	if (amdgpu_sriov_runtime(adev))
 		tmo = msecs_to_jiffies(8000);
@@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
 
 	DRM_INFO("recover vram bo from shadow start\n");
 	mutex_lock(&adev->shadow_list_lock);
-	list_for_each_entry_safe(bo, tmp, &adev->shadow_list, shadow_list) {
-		next = NULL;
-		amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
+	list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
+
+		/* No need to recover an evicted BO */
+		if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
+		    shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
+			continue;
+
+		r = amdgpu_bo_restore_shadow(shadow, &next);
+		if (r)
+			break;
+
 		if (fence) {
 			r = dma_fence_wait_timeout(fence, false, tmo);
-			if (r == 0)
-				pr_err("wait fence %p[%d] timeout\n", fence, i);
-			else if (r < 0)
-				pr_err("wait fence %p[%d] interrupted\n", fence, i);
-			if (r < 1) {
-				dma_fence_put(fence);
-				fence = next;
+			dma_fence_put(fence);
+			fence = next;
+			if (r <= 0)
 				break;
-			}
-			i++;
+		} else {
+			fence = next;
 		}
-
-		dma_fence_put(fence);
-		fence = next;
 	}
 	mutex_unlock(&adev->shadow_list_lock);
 
-	if (fence) {
-		r = dma_fence_wait_timeout(fence, false, tmo);
-		if (r == 0)
-			pr_err("wait fence %p[%d] timeout\n", fence, i);
-		else if (r < 0)
-			pr_err("wait fence %p[%d] interrupted\n", fence, i);
-
-	}
+	if (fence)
+		tmo = dma_fence_wait_timeout(fence, false, tmo);
 	dma_fence_put(fence);
 
-	if (r > 0)
-		DRM_INFO("recover vram bo from shadow done\n");
-	else
+	if (r <= 0 || tmo <= 0) {
 		DRM_ERROR("recover vram bo from shadow failed\n");
+		return -EIO;
+	}
 
-	return (r > 0) ? 0 : 1;
+	DRM_INFO("recover vram bo from shadow done\n");
+	return 0;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 3a6f92de5504..5b6d5fcc2151 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 	if (!r) {
 		bo->shadow->parent = amdgpu_bo_ref(bo);
 		mutex_lock(&adev->shadow_list_lock);
-		list_add_tail(&bo->shadow_list, &adev->shadow_list);
+		list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
 		mutex_unlock(&adev->shadow_list_lock);
 	}
 
@@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
 }
 
 /**
- * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
- * @adev: amdgpu device object
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: &amdgpu_bo buffer to be restored
- * @resv: reservation object with embedded fence
+ * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
+ *
+ * @shadow: &amdgpu_bo shadow to be restored
  * @fence: dma_fence associated with the operation
- * @direct: whether to submit the job directly
  *
  * Copies a buffer object's shadow content back to the object.
  * This is used for recovering a buffer from its shadow in case of a gpu
@@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
  * Returns:
  * 0 for success or a negative error code on failure.
  */
-int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
-				  struct amdgpu_ring *ring,
-				  struct amdgpu_bo *bo,
-				  struct reservation_object *resv,
-				  struct dma_fence **fence,
-				  bool direct)
+int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct dma_fence **fence)
 
 {
-	struct amdgpu_bo *shadow = bo->shadow;
-	uint64_t bo_addr, shadow_addr;
-	int r;
-
-	if (!shadow)
-		return -EINVAL;
-
-	bo_addr = amdgpu_bo_gpu_offset(bo);
-	shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
-
-	r = reservation_object_reserve_shared(bo->tbo.resv);
-	if (r)
-		goto err;
+	struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
+	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
+	uint64_t shadow_addr, parent_addr;
 
-	r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
-			       amdgpu_bo_size(bo), resv, fence,
-			       direct, false);
-	if (!r)
-		amdgpu_bo_fence(bo, *fence, true);
+	shadow_addr = amdgpu_bo_gpu_offset(shadow);
+	parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
 
-err:
-	return r;
+	return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
+				  amdgpu_bo_size(shadow), NULL, fence,
+				  true, false);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 907fdf46d895..363db417fb2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
 			       struct reservation_object *resv,
 			       struct dma_fence **fence, bool direct);
 int amdgpu_bo_validate(struct amdgpu_bo *bo);
-int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
-				  struct amdgpu_ring *ring,
-				  struct amdgpu_bo *bo,
-				  struct reservation_object *resv,
-				  struct dma_fence **fence,
-				  bool direct);
+int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
+			     struct dma_fence **fence);
 uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device *adev,
 					    uint32_t domain);
 
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found]     ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11 12:29       ` Michel Dänzer
       [not found]         ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michel Dänzer @ 2018-09-11 12:29 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-09-11 11:55 a.m., Christian König wrote:
> Even when GPU recovery is disabled we could run into a manually
> triggered recovery.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
>  1 file changed, 1 insertion(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index a7e39c9dd14b..7db0040ca145 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -38,31 +38,6 @@
>  #include "amdgpu_trace.h"
>  #include "amdgpu_amdkfd.h"
>  
> -/**
> - * DOC: amdgpu_object
> - *
> - * This defines the interfaces to operate on an &amdgpu_bo buffer object which
> - * represents memory used by driver (VRAM, system memory, etc.). The driver
> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
> - * to create/destroy/set buffer object which are then managed by the kernel TTM
> - * memory manager.
> - * The interfaces are also used internally by kernel clients, including gfx,
> - * uvd, etc. for kernel managed allocations used by the GPU.
> - *
> - */

This comment shouldn't be removed, should it? Otherwise the commit log
needs to explain this, and the reference in Documentation/gpu/amdgpu.rst
needs to be removed as well.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found]         ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2018-09-11 13:21           ` Christian König
       [not found]             ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 13:21 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 11.09.2018 um 14:29 schrieb Michel Dänzer:
> On 2018-09-11 11:55 a.m., Christian König wrote:
>> Even when GPU recovery is disabled we could run into a manually
>> triggered recovery.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
>>   1 file changed, 1 insertion(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index a7e39c9dd14b..7db0040ca145 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -38,31 +38,6 @@
>>   #include "amdgpu_trace.h"
>>   #include "amdgpu_amdkfd.h"
>>   
>> -/**
>> - * DOC: amdgpu_object
>> - *
>> - * This defines the interfaces to operate on an &amdgpu_bo buffer object which
>> - * represents memory used by driver (VRAM, system memory, etc.). The driver
>> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
>> - * to create/destroy/set buffer object which are then managed by the kernel TTM
>> - * memory manager.
>> - * The interfaces are also used internally by kernel clients, including gfx,
>> - * uvd, etc. for kernel managed allocations used by the GPU.
>> - *
>> - */
> This comment shouldn't be removed, should it? Otherwise the commit log
> needs to explain this, and the reference in Documentation/gpu/amdgpu.rst
> needs to be removed as well.

Oh, that was indeed an accident. Going to fix that.

Thanks,
Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found]             ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-09-12 12:37               ` Deng, Emily
  0 siblings, 0 replies; 15+ messages in thread
From: Deng, Emily @ 2018-09-12 12:37 UTC (permalink / raw)
  To: Koenig, Christian, Michel Dänzer
  Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

Acked-by: Emily Deng <Emily.Deng@amd.com>

>-----Original Message-----
>From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
>Christian König
>Sent: Tuesday, September 11, 2018 9:22 PM
>To: Michel Dänzer <michel@daenzer.net>
>Cc: amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
>
>Am 11.09.2018 um 14:29 schrieb Michel Dänzer:
>> On 2018-09-11 11:55 a.m., Christian König wrote:
>>> Even when GPU recovery is disabled we could run into a manually
>>> triggered recovery.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +-----------------------
>---
>>>   1 file changed, 1 insertion(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index a7e39c9dd14b..7db0040ca145 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -38,31 +38,6 @@
>>>   #include "amdgpu_trace.h"
>>>   #include "amdgpu_amdkfd.h"
>>>
>>> -/**
>>> - * DOC: amdgpu_object
>>> - *
>>> - * This defines the interfaces to operate on an &amdgpu_bo buffer
>>> object which
>>> - * represents memory used by driver (VRAM, system memory, etc.). The
>>> driver
>>> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these
>>> interfaces
>>> - * to create/destroy/set buffer object which are then managed by the
>>> kernel TTM
>>> - * memory manager.
>>> - * The interfaces are also used internally by kernel clients,
>>> including gfx,
>>> - * uvd, etc. for kernel managed allocations used by the GPU.
>>> - *
>>> - */
>> This comment shouldn't be removed, should it? Otherwise the commit log
>> needs to explain this, and the reference in
>> Documentation/gpu/amdgpu.rst needs to be removed as well.
>
>Oh, that was indeed an accident. Going to fix that.
>
>Thanks,
>Christian.
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2018-09-11  9:56   ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
@ 2018-09-13  9:28   ` Zhang, Jerry(Junwei)
  4 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:28 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


On 09/11/2018 05:55 PM, Christian König wrote:
> We are going to need this for recoverable page fault handling and it
> makes shadow handling during GPU reset much more easier.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 6 +++++-
>   2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index b5f20b42439e..a7e39c9dd14b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1363,7 +1363,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
>   {
>   	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
>   	WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
> -		     !bo->pin_count);
> +		     !bo->pin_count && bo->tbo.type != ttm_bo_type_kernel);
>   	WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
>   	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM &&
>   		     !(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d9f3201c9e5c..2f32dc692d40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -524,7 +524,11 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
>   	if (r)
>   		goto error;
>   
> -	r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
> +	/* Always block for VM page tables before committing the new location */
> +	if (bo->type == ttm_bo_type_kernel)
> +		r = ttm_bo_move_accel_cleanup(bo, fence, true, new_mem);
> +	else
> +		r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
>   	dma_fence_put(fence);
>   	return r;
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery
       [not found]     ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13  9:28       ` Zhang, Jerry(Junwei)
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:28 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/11/2018 05:56 PM, Christian König wrote:
> It shouldn't add much overhead and we should make sure that critical
> VRAM content is always restored.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 93476b8c2e72..5eba66ecf668 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2989,7 +2989,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>   }
>   
>   /**
> - * amdgpu_device_handle_vram_lost - Handle the loss of VRAM contents
> + * amdgpu_device_recover_vram - Recover some VRAM contents
>    *
>    * @adev: amdgpu_device pointer
>    *
> @@ -2998,7 +2998,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>    * the contents of VRAM might be lost.
>    * Returns 0 on success, 1 on failure.
>    */
> -static int amdgpu_device_handle_vram_lost(struct amdgpu_device *adev)
> +static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>   {
>   	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>   	struct amdgpu_bo *bo, *tmp;
> @@ -3125,8 +3125,8 @@ static int amdgpu_device_reset(struct amdgpu_device *adev)
>   		}
>   	}
>   
> -	if (!r && ((need_full_reset && !(adev->flags & AMD_IS_APU)) || vram_lost))
> -		r = amdgpu_device_handle_vram_lost(adev);
> +	if (!r)
> +		r = amdgpu_device_recover_vram(adev);
>   
>   	return r;
>   }
> @@ -3172,7 +3172,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
>   	amdgpu_virt_release_full_gpu(adev, true);
>   	if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
>   		atomic_inc(&adev->vram_lost_counter);
> -		r = amdgpu_device_handle_vram_lost(adev);
> +		r = amdgpu_device_recover_vram(adev);
>   	}
>   
>   	return r;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found]     ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13  9:29       ` Zhang, Jerry(Junwei)
       [not found]         ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:29 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/11/2018 05:56 PM, Christian König wrote:
> Don't grab the reservation lock any more and simplify the handling quite
> a bit.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 ++++++++---------------------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
>   3 files changed, 43 insertions(+), 120 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5eba66ecf668..20bb702f5c7f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2940,54 +2940,6 @@ static int amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>   	return 0;
>   }
>   
> -/**
> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM buffers
> - *
> - * @adev: amdgpu_device pointer
> - * @ring: amdgpu_ring for the engine handling the buffer operations
> - * @bo: amdgpu_bo buffer whose shadow is being restored
> - * @fence: dma_fence associated with the operation
> - *
> - * Restores the VRAM buffer contents from the shadow in GTT.  Used to
> - * restore things like GPUVM page tables after a GPU reset where
> - * the contents of VRAM might be lost.
> - * Returns 0 on success, negative error code on failure.
> - */
> -static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
> -						  struct amdgpu_ring *ring,
> -						  struct amdgpu_bo *bo,
> -						  struct dma_fence **fence)
> -{
> -	uint32_t domain;
> -	int r;
> -
> -	if (!bo->shadow)
> -		return 0;
> -
> -	r = amdgpu_bo_reserve(bo, true);
> -	if (r)
> -		return r;
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> -	/* if bo has been evicted, then no need to recover */
> -	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
> -		r = amdgpu_bo_validate(bo->shadow);
> -		if (r) {
> -			DRM_ERROR("bo validate failed!\n");
> -			goto err;
> -		}
> -
> -		r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
> -						 NULL, fence, true);
> -		if (r) {
> -			DRM_ERROR("recover page table failed!\n");
> -			goto err;
> -		}
> -	}
> -err:
> -	amdgpu_bo_unreserve(bo);
> -	return r;
> -}
> -
>   /**
>    * amdgpu_device_recover_vram - Recover some VRAM contents
>    *
> @@ -2996,16 +2948,15 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>    * Restores the contents of VRAM buffers from the shadows in GTT.  Used to
>    * restore things like GPUVM page tables after a GPU reset where
>    * the contents of VRAM might be lost.
> - * Returns 0 on success, 1 on failure.
> + *
> + * Returns:
> + * 0 on success, negative error code on failure.
>    */
>   static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>   {
> -	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> -	struct amdgpu_bo *bo, *tmp;
>   	struct dma_fence *fence = NULL, *next = NULL;
> -	long r = 1;
> -	int i = 0;
> -	long tmo;
> +	struct amdgpu_bo *shadow;
> +	long r = 1, tmo;
>   
>   	if (amdgpu_sriov_runtime(adev))
>   		tmo = msecs_to_jiffies(8000);
> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>   
>   	DRM_INFO("recover vram bo from shadow start\n");
>   	mutex_lock(&adev->shadow_list_lock);
> -	list_for_each_entry_safe(bo, tmp, &adev->shadow_list, shadow_list) {
> -		next = NULL;
> -		amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
> +	list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
> +
> +		/* No need to recover an evicted BO */
> +		if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
> +		    shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
is there a change that shadow bo evicted to other domain?
like SYSTEM?

Regards,
Jerry
> +			continue;
> +
> +		r = amdgpu_bo_restore_shadow(shadow, &next);
> +		if (r)
> +			break;
> +
>   		if (fence) {
>   			r = dma_fence_wait_timeout(fence, false, tmo);
> -			if (r == 0)
> -				pr_err("wait fence %p[%d] timeout\n", fence, i);
> -			else if (r < 0)
> -				pr_err("wait fence %p[%d] interrupted\n", fence, i);
> -			if (r < 1) {
> -				dma_fence_put(fence);
> -				fence = next;
> +			dma_fence_put(fence);
> +			fence = next;
> +			if (r <= 0)
>   				break;
> -			}
> -			i++;
> +		} else {
> +			fence = next;
>   		}
> -
> -		dma_fence_put(fence);
> -		fence = next;
>   	}
>   	mutex_unlock(&adev->shadow_list_lock);
>   
> -	if (fence) {
> -		r = dma_fence_wait_timeout(fence, false, tmo);
> -		if (r == 0)
> -			pr_err("wait fence %p[%d] timeout\n", fence, i);
> -		else if (r < 0)
> -			pr_err("wait fence %p[%d] interrupted\n", fence, i);
> -
> -	}
> +	if (fence)
> +		tmo = dma_fence_wait_timeout(fence, false, tmo);
>   	dma_fence_put(fence);
>   
> -	if (r > 0)
> -		DRM_INFO("recover vram bo from shadow done\n");
> -	else
> +	if (r <= 0 || tmo <= 0) {
>   		DRM_ERROR("recover vram bo from shadow failed\n");
> +		return -EIO;
> +	}
>   
> -	return (r > 0) ? 0 : 1;
> +	DRM_INFO("recover vram bo from shadow done\n");
> +	return 0;
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 3a6f92de5504..5b6d5fcc2151 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
>   	if (!r) {
>   		bo->shadow->parent = amdgpu_bo_ref(bo);
>   		mutex_lock(&adev->shadow_list_lock);
> -		list_add_tail(&bo->shadow_list, &adev->shadow_list);
> +		list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>   		mutex_unlock(&adev->shadow_list_lock);
>   	}
>   
> @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>   }
>   
>   /**
> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
> - * @adev: amdgpu device object
> - * @ring: amdgpu_ring for the engine handling the buffer operations
> - * @bo: &amdgpu_bo buffer to be restored
> - * @resv: reservation object with embedded fence
> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
> + *
> + * @shadow: &amdgpu_bo shadow to be restored
>    * @fence: dma_fence associated with the operation
> - * @direct: whether to submit the job directly
>    *
>    * Copies a buffer object's shadow content back to the object.
>    * This is used for recovering a buffer from its shadow in case of a gpu
> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>    * Returns:
>    * 0 for success or a negative error code on failure.
>    */
> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> -				  struct amdgpu_ring *ring,
> -				  struct amdgpu_bo *bo,
> -				  struct reservation_object *resv,
> -				  struct dma_fence **fence,
> -				  bool direct)
> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct dma_fence **fence)
>   
>   {
> -	struct amdgpu_bo *shadow = bo->shadow;
> -	uint64_t bo_addr, shadow_addr;
> -	int r;
> -
> -	if (!shadow)
> -		return -EINVAL;
> -
> -	bo_addr = amdgpu_bo_gpu_offset(bo);
> -	shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
> -
> -	r = reservation_object_reserve_shared(bo->tbo.resv);
> -	if (r)
> -		goto err;
> +	struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
> +	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> +	uint64_t shadow_addr, parent_addr;
>   
> -	r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
> -			       amdgpu_bo_size(bo), resv, fence,
> -			       direct, false);
> -	if (!r)
> -		amdgpu_bo_fence(bo, *fence, true);
> +	shadow_addr = amdgpu_bo_gpu_offset(shadow);
> +	parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>   
> -err:
> -	return r;
> +	return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
> +				  amdgpu_bo_size(shadow), NULL, fence,
> +				  true, false);
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 907fdf46d895..363db417fb2e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
>   			       struct reservation_object *resv,
>   			       struct dma_fence **fence, bool direct);
>   int amdgpu_bo_validate(struct amdgpu_bo *bo);
> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> -				  struct amdgpu_ring *ring,
> -				  struct amdgpu_bo *bo,
> -				  struct reservation_object *resv,
> -				  struct dma_fence **fence,
> -				  bool direct);
> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
> +			     struct dma_fence **fence);
>   uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device *adev,
>   					    uint32_t domain);
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
       [not found]     ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13  9:30       ` Zhang, Jerry(Junwei)
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:30 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/11/2018 05:56 PM, Christian König wrote:
> They aren't directly used by the hardware.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 7db0040ca145..3a6f92de5504 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -516,7 +516,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>   }
>   
>   static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
> -				   unsigned long size, int byte_align,
> +				   unsigned long size,
>   				   struct amdgpu_bo *bo)
>   {
>   	struct amdgpu_bo_param bp;
> @@ -527,7 +527,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
>   
>   	memset(&bp, 0, sizeof(bp));
>   	bp.size = size;
> -	bp.byte_align = byte_align;
>   	bp.domain = AMDGPU_GEM_DOMAIN_GTT;
>   	bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
>   		AMDGPU_GEM_CREATE_SHADOW;
> @@ -576,7 +575,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>   			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
>   							NULL));
>   
> -		r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
> +		r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
>   
>   		if (!bp->resv)
>   			reservation_object_unlock((*bo_ptr)->tbo.resv);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found]         ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-14 11:54           ` Christian König
       [not found]             ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-14 11:54 UTC (permalink / raw)
  To: Zhang, Jerry(Junwei), amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 13.09.2018 um 11:29 schrieb Zhang, Jerry(Junwei):
> On 09/11/2018 05:56 PM, Christian König wrote:
>> Don't grab the reservation lock any more and simplify the handling quite
>> a bit.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 
>> ++++++++---------------------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
>>   3 files changed, 43 insertions(+), 120 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 5eba66ecf668..20bb702f5c7f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2940,54 +2940,6 @@ static int 
>> amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>>       return 0;
>>   }
>>   -/**
>> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM 
>> buffers
>> - *
>> - * @adev: amdgpu_device pointer
>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>> - * @bo: amdgpu_bo buffer whose shadow is being restored
>> - * @fence: dma_fence associated with the operation
>> - *
>> - * Restores the VRAM buffer contents from the shadow in GTT. Used to
>> - * restore things like GPUVM page tables after a GPU reset where
>> - * the contents of VRAM might be lost.
>> - * Returns 0 on success, negative error code on failure.
>> - */
>> -static int amdgpu_device_recover_vram_from_shadow(struct 
>> amdgpu_device *adev,
>> -                          struct amdgpu_ring *ring,
>> -                          struct amdgpu_bo *bo,
>> -                          struct dma_fence **fence)
>> -{
>> -    uint32_t domain;
>> -    int r;
>> -
>> -    if (!bo->shadow)
>> -        return 0;
>> -
>> -    r = amdgpu_bo_reserve(bo, true);
>> -    if (r)
>> -        return r;
>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>> -    /* if bo has been evicted, then no need to recover */
>> -    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>> -        r = amdgpu_bo_validate(bo->shadow);
>> -        if (r) {
>> -            DRM_ERROR("bo validate failed!\n");
>> -            goto err;
>> -        }
>> -
>> -        r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
>> -                         NULL, fence, true);
>> -        if (r) {
>> -            DRM_ERROR("recover page table failed!\n");
>> -            goto err;
>> -        }
>> -    }
>> -err:
>> -    amdgpu_bo_unreserve(bo);
>> -    return r;
>> -}
>> -
>>   /**
>>    * amdgpu_device_recover_vram - Recover some VRAM contents
>>    *
>> @@ -2996,16 +2948,15 @@ static int 
>> amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>>    * Restores the contents of VRAM buffers from the shadows in GTT.  
>> Used to
>>    * restore things like GPUVM page tables after a GPU reset where
>>    * the contents of VRAM might be lost.
>> - * Returns 0 on success, 1 on failure.
>> + *
>> + * Returns:
>> + * 0 on success, negative error code on failure.
>>    */
>>   static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>>   {
>> -    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>> -    struct amdgpu_bo *bo, *tmp;
>>       struct dma_fence *fence = NULL, *next = NULL;
>> -    long r = 1;
>> -    int i = 0;
>> -    long tmo;
>> +    struct amdgpu_bo *shadow;
>> +    long r = 1, tmo;
>>         if (amdgpu_sriov_runtime(adev))
>>           tmo = msecs_to_jiffies(8000);
>> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct 
>> amdgpu_device *adev)
>>         DRM_INFO("recover vram bo from shadow start\n");
>>       mutex_lock(&adev->shadow_list_lock);
>> -    list_for_each_entry_safe(bo, tmp, &adev->shadow_list, 
>> shadow_list) {
>> -        next = NULL;
>> -        amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
>> +    list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
>> +
>> +        /* No need to recover an evicted BO */
>> +        if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
>> +            shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
> is there a change that shadow bo evicted to other domain?
> like SYSTEM?

Yes, that's why I test "!= TTM_PL_TT" here.

What can happen is that either the shadow or the page table or page 
directory is evicted.

But in this case we don't need to restore anything because of patch #1 
in this series.

Regards,
Christian.

>
> Regards,
> Jerry
>> +            continue;
>> +
>> +        r = amdgpu_bo_restore_shadow(shadow, &next);
>> +        if (r)
>> +            break;
>> +
>>           if (fence) {
>>               r = dma_fence_wait_timeout(fence, false, tmo);
>> -            if (r == 0)
>> -                pr_err("wait fence %p[%d] timeout\n", fence, i);
>> -            else if (r < 0)
>> -                pr_err("wait fence %p[%d] interrupted\n", fence, i);
>> -            if (r < 1) {
>> -                dma_fence_put(fence);
>> -                fence = next;
>> +            dma_fence_put(fence);
>> +            fence = next;
>> +            if (r <= 0)
>>                   break;
>> -            }
>> -            i++;
>> +        } else {
>> +            fence = next;
>>           }
>> -
>> -        dma_fence_put(fence);
>> -        fence = next;
>>       }
>>       mutex_unlock(&adev->shadow_list_lock);
>>   -    if (fence) {
>> -        r = dma_fence_wait_timeout(fence, false, tmo);
>> -        if (r == 0)
>> -            pr_err("wait fence %p[%d] timeout\n", fence, i);
>> -        else if (r < 0)
>> -            pr_err("wait fence %p[%d] interrupted\n", fence, i);
>> -
>> -    }
>> +    if (fence)
>> +        tmo = dma_fence_wait_timeout(fence, false, tmo);
>>       dma_fence_put(fence);
>>   -    if (r > 0)
>> -        DRM_INFO("recover vram bo from shadow done\n");
>> -    else
>> +    if (r <= 0 || tmo <= 0) {
>>           DRM_ERROR("recover vram bo from shadow failed\n");
>> +        return -EIO;
>> +    }
>>   -    return (r > 0) ? 0 : 1;
>> +    DRM_INFO("recover vram bo from shadow done\n");
>> +    return 0;
>>   }
>>     /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 3a6f92de5504..5b6d5fcc2151 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct 
>> amdgpu_device *adev,
>>       if (!r) {
>>           bo->shadow->parent = amdgpu_bo_ref(bo);
>>           mutex_lock(&adev->shadow_list_lock);
>> -        list_add_tail(&bo->shadow_list, &adev->shadow_list);
>> +        list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>>           mutex_unlock(&adev->shadow_list_lock);
>>       }
>>   @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>   }
>>     /**
>> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
>> - * @adev: amdgpu device object
>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>> - * @bo: &amdgpu_bo buffer to be restored
>> - * @resv: reservation object with embedded fence
>> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
>> + *
>> + * @shadow: &amdgpu_bo shadow to be restored
>>    * @fence: dma_fence associated with the operation
>> - * @direct: whether to submit the job directly
>>    *
>>    * Copies a buffer object's shadow content back to the object.
>>    * This is used for recovering a buffer from its shadow in case of 
>> a gpu
>> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>    * Returns:
>>    * 0 for success or a negative error code on failure.
>>    */
>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>> -                  struct amdgpu_ring *ring,
>> -                  struct amdgpu_bo *bo,
>> -                  struct reservation_object *resv,
>> -                  struct dma_fence **fence,
>> -                  bool direct)
>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct 
>> dma_fence **fence)
>>     {
>> -    struct amdgpu_bo *shadow = bo->shadow;
>> -    uint64_t bo_addr, shadow_addr;
>> -    int r;
>> -
>> -    if (!shadow)
>> -        return -EINVAL;
>> -
>> -    bo_addr = amdgpu_bo_gpu_offset(bo);
>> -    shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
>> -
>> -    r = reservation_object_reserve_shared(bo->tbo.resv);
>> -    if (r)
>> -        goto err;
>> +    struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
>> +    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>> +    uint64_t shadow_addr, parent_addr;
>>   -    r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
>> -                   amdgpu_bo_size(bo), resv, fence,
>> -                   direct, false);
>> -    if (!r)
>> -        amdgpu_bo_fence(bo, *fence, true);
>> +    shadow_addr = amdgpu_bo_gpu_offset(shadow);
>> +    parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>>   -err:
>> -    return r;
>> +    return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
>> +                  amdgpu_bo_size(shadow), NULL, fence,
>> +                  true, false);
>>   }
>>     /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> index 907fdf46d895..363db417fb2e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct 
>> amdgpu_device *adev,
>>                      struct reservation_object *resv,
>>                      struct dma_fence **fence, bool direct);
>>   int amdgpu_bo_validate(struct amdgpu_bo *bo);
>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>> -                  struct amdgpu_ring *ring,
>> -                  struct amdgpu_bo *bo,
>> -                  struct reservation_object *resv,
>> -                  struct dma_fence **fence,
>> -                  bool direct);
>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
>> +                 struct dma_fence **fence);
>>   uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device 
>> *adev,
>>                           uint32_t domain);
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
       [not found] ` <20180914134257.2196-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-14 13:42   ` Christian König
  0 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2018-09-14 13:42 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

They aren't directly used by the hardware.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index d8e8d653d518..650c45c896f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -526,7 +526,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 }
 
 static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
-				   unsigned long size, int byte_align,
+				   unsigned long size,
 				   struct amdgpu_bo *bo)
 {
 	struct amdgpu_bo_param bp;
@@ -537,7 +537,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 
 	memset(&bp, 0, sizeof(bp));
 	bp.size = size;
-	bp.byte_align = byte_align;
 	bp.domain = AMDGPU_GEM_DOMAIN_GTT;
 	bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
 		AMDGPU_GEM_CREATE_SHADOW;
@@ -586,7 +585,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
 							NULL));
 
-		r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
+		r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
 
 		if (!bp->resv)
 			reservation_object_unlock((*bo_ptr)->tbo.resv);
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found]             ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-09-18  6:15               ` Zhang, Jerry(Junwei)
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-18  6:15 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/14/2018 07:54 PM, Christian König wrote:
> Am 13.09.2018 um 11:29 schrieb Zhang, Jerry(Junwei):
>> On 09/11/2018 05:56 PM, Christian König wrote:
>>> Don't grab the reservation lock any more and simplify the handling 
>>> quite
>>> a bit.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 
>>> ++++++++---------------------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
>>>   3 files changed, 43 insertions(+), 120 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 5eba66ecf668..20bb702f5c7f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2940,54 +2940,6 @@ static int 
>>> amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>>>       return 0;
>>>   }
>>>   -/**
>>> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM 
>>> buffers
>>> - *
>>> - * @adev: amdgpu_device pointer
>>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>>> - * @bo: amdgpu_bo buffer whose shadow is being restored
>>> - * @fence: dma_fence associated with the operation
>>> - *
>>> - * Restores the VRAM buffer contents from the shadow in GTT. Used to
>>> - * restore things like GPUVM page tables after a GPU reset where
>>> - * the contents of VRAM might be lost.
>>> - * Returns 0 on success, negative error code on failure.
>>> - */
>>> -static int amdgpu_device_recover_vram_from_shadow(struct 
>>> amdgpu_device *adev,
>>> -                          struct amdgpu_ring *ring,
>>> -                          struct amdgpu_bo *bo,
>>> -                          struct dma_fence **fence)
>>> -{
>>> -    uint32_t domain;
>>> -    int r;
>>> -
>>> -    if (!bo->shadow)
>>> -        return 0;
>>> -
>>> -    r = amdgpu_bo_reserve(bo, true);
>>> -    if (r)
>>> -        return r;
>>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>>> -    /* if bo has been evicted, then no need to recover */
>>> -    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>> -        r = amdgpu_bo_validate(bo->shadow);
>>> -        if (r) {
>>> -            DRM_ERROR("bo validate failed!\n");
>>> -            goto err;
>>> -        }
>>> -
>>> -        r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
>>> -                         NULL, fence, true);
>>> -        if (r) {
>>> -            DRM_ERROR("recover page table failed!\n");
>>> -            goto err;
>>> -        }
>>> -    }
>>> -err:
>>> -    amdgpu_bo_unreserve(bo);
>>> -    return r;
>>> -}
>>> -
>>>   /**
>>>    * amdgpu_device_recover_vram - Recover some VRAM contents
>>>    *
>>> @@ -2996,16 +2948,15 @@ static int 
>>> amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>>>    * Restores the contents of VRAM buffers from the shadows in GTT.  
>>> Used to
>>>    * restore things like GPUVM page tables after a GPU reset where
>>>    * the contents of VRAM might be lost.
>>> - * Returns 0 on success, 1 on failure.
>>> + *
>>> + * Returns:
>>> + * 0 on success, negative error code on failure.
>>>    */
>>>   static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>>>   {
>>> -    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>>> -    struct amdgpu_bo *bo, *tmp;
>>>       struct dma_fence *fence = NULL, *next = NULL;
>>> -    long r = 1;
>>> -    int i = 0;
>>> -    long tmo;
>>> +    struct amdgpu_bo *shadow;
>>> +    long r = 1, tmo;
>>>         if (amdgpu_sriov_runtime(adev))
>>>           tmo = msecs_to_jiffies(8000);
>>> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct 
>>> amdgpu_device *adev)
>>>         DRM_INFO("recover vram bo from shadow start\n");
>>>       mutex_lock(&adev->shadow_list_lock);
>>> -    list_for_each_entry_safe(bo, tmp, &adev->shadow_list, 
>>> shadow_list) {
>>> -        next = NULL;
>>> -        amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
>>> +    list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
>>> +
>>> +        /* No need to recover an evicted BO */
>>> +        if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
>>> +            shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
>> is there a change that shadow bo evicted to other domain?
>> like SYSTEM?
>
> Yes, that's why I test "!= TTM_PL_TT" here.
>
> What can happen is that either the shadow or the page table or page 
> directory is evicted.
>
> But in this case we don't need to restore anything because of patch #1 
> in this series.

Thanks, then it's
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>

Regards,
Jerry

>
> Regards,
> Christian.
>
>>
>> Regards,
>> Jerry
>>> +            continue;
>>> +
>>> +        r = amdgpu_bo_restore_shadow(shadow, &next);
>>> +        if (r)
>>> +            break;
>>> +
>>>           if (fence) {
>>>               r = dma_fence_wait_timeout(fence, false, tmo);
>>> -            if (r == 0)
>>> -                pr_err("wait fence %p[%d] timeout\n", fence, i);
>>> -            else if (r < 0)
>>> -                pr_err("wait fence %p[%d] interrupted\n", fence, i);
>>> -            if (r < 1) {
>>> -                dma_fence_put(fence);
>>> -                fence = next;
>>> +            dma_fence_put(fence);
>>> +            fence = next;
>>> +            if (r <= 0)
>>>                   break;
>>> -            }
>>> -            i++;
>>> +        } else {
>>> +            fence = next;
>>>           }
>>> -
>>> -        dma_fence_put(fence);
>>> -        fence = next;
>>>       }
>>>       mutex_unlock(&adev->shadow_list_lock);
>>>   -    if (fence) {
>>> -        r = dma_fence_wait_timeout(fence, false, tmo);
>>> -        if (r == 0)
>>> -            pr_err("wait fence %p[%d] timeout\n", fence, i);
>>> -        else if (r < 0)
>>> -            pr_err("wait fence %p[%d] interrupted\n", fence, i);
>>> -
>>> -    }
>>> +    if (fence)
>>> +        tmo = dma_fence_wait_timeout(fence, false, tmo);
>>>       dma_fence_put(fence);
>>>   -    if (r > 0)
>>> -        DRM_INFO("recover vram bo from shadow done\n");
>>> -    else
>>> +    if (r <= 0 || tmo <= 0) {
>>>           DRM_ERROR("recover vram bo from shadow failed\n");
>>> +        return -EIO;
>>> +    }
>>>   -    return (r > 0) ? 0 : 1;
>>> +    DRM_INFO("recover vram bo from shadow done\n");
>>> +    return 0;
>>>   }
>>>     /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 3a6f92de5504..5b6d5fcc2151 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct 
>>> amdgpu_device *adev,
>>>       if (!r) {
>>>           bo->shadow->parent = amdgpu_bo_ref(bo);
>>>           mutex_lock(&adev->shadow_list_lock);
>>> -        list_add_tail(&bo->shadow_list, &adev->shadow_list);
>>> +        list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>>>           mutex_unlock(&adev->shadow_list_lock);
>>>       }
>>>   @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>>   }
>>>     /**
>>> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
>>> - * @adev: amdgpu device object
>>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>>> - * @bo: &amdgpu_bo buffer to be restored
>>> - * @resv: reservation object with embedded fence
>>> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
>>> + *
>>> + * @shadow: &amdgpu_bo shadow to be restored
>>>    * @fence: dma_fence associated with the operation
>>> - * @direct: whether to submit the job directly
>>>    *
>>>    * Copies a buffer object's shadow content back to the object.
>>>    * This is used for recovering a buffer from its shadow in case of 
>>> a gpu
>>> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>>    * Returns:
>>>    * 0 for success or a negative error code on failure.
>>>    */
>>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>>> -                  struct amdgpu_ring *ring,
>>> -                  struct amdgpu_bo *bo,
>>> -                  struct reservation_object *resv,
>>> -                  struct dma_fence **fence,
>>> -                  bool direct)
>>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct 
>>> dma_fence **fence)
>>>     {
>>> -    struct amdgpu_bo *shadow = bo->shadow;
>>> -    uint64_t bo_addr, shadow_addr;
>>> -    int r;
>>> -
>>> -    if (!shadow)
>>> -        return -EINVAL;
>>> -
>>> -    bo_addr = amdgpu_bo_gpu_offset(bo);
>>> -    shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
>>> -
>>> -    r = reservation_object_reserve_shared(bo->tbo.resv);
>>> -    if (r)
>>> -        goto err;
>>> +    struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
>>> +    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>>> +    uint64_t shadow_addr, parent_addr;
>>>   -    r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
>>> -                   amdgpu_bo_size(bo), resv, fence,
>>> -                   direct, false);
>>> -    if (!r)
>>> -        amdgpu_bo_fence(bo, *fence, true);
>>> +    shadow_addr = amdgpu_bo_gpu_offset(shadow);
>>> +    parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>>>   -err:
>>> -    return r;
>>> +    return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
>>> +                  amdgpu_bo_size(shadow), NULL, fence,
>>> +                  true, false);
>>>   }
>>>     /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 907fdf46d895..363db417fb2e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct 
>>> amdgpu_device *adev,
>>>                      struct reservation_object *resv,
>>>                      struct dma_fence **fence, bool direct);
>>>   int amdgpu_bo_validate(struct amdgpu_bo *bo);
>>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>>> -                  struct amdgpu_ring *ring,
>>> -                  struct amdgpu_bo *bo,
>>> -                  struct reservation_object *resv,
>>> -                  struct dma_fence **fence,
>>> -                  bool direct);
>>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
>>> +                 struct dma_fence **fence);
>>>   uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device 
>>> *adev,
>>>                           uint32_t domain);
>>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-09-18  6:15 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-09-11  9:55 [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Christian König
     [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11  9:55   ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
     [not found]     ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 12:29       ` Michel Dänzer
     [not found]         ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
2018-09-11 13:21           ` Christian König
     [not found]             ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-09-12 12:37               ` Deng, Emily
2018-09-11  9:56   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
     [not found]     ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13  9:30       ` Zhang, Jerry(Junwei)
2018-09-11  9:56   ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
     [not found]     ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13  9:28       ` Zhang, Jerry(Junwei)
2018-09-11  9:56   ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
     [not found]     ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13  9:29       ` Zhang, Jerry(Junwei)
     [not found]         ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
2018-09-14 11:54           ` Christian König
     [not found]             ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-09-18  6:15               ` Zhang, Jerry(Junwei)
2018-09-13  9:28   ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
  -- strict thread matches above, loose matches on Subject: below --
2018-09-14 13:42 Christian König
     [not found] ` <20180914134257.2196-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-14 13:42   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox