* [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves
@ 2018-09-11 9:55 Christian König
[not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 9:55 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
We are going to need this for recoverable page fault handling and it
makes shadow handling during GPU reset much more easier.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +++++-
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index b5f20b42439e..a7e39c9dd14b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1363,7 +1363,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
{
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
- !bo->pin_count);
+ !bo->pin_count && bo->tbo.type != ttm_bo_type_kernel);
WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM &&
!(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d9f3201c9e5c..2f32dc692d40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -524,7 +524,11 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
if (r)
goto error;
- r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
+ /* Always block for VM page tables before committing the new location */
+ if (bo->type == ttm_bo_type_kernel)
+ r = ttm_bo_move_accel_cleanup(bo, fence, true, new_mem);
+ else
+ r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
dma_fence_put(fence);
return r;
--
2.14.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/5] drm/amdgpu: always enable shadow BOs
[not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11 9:55 ` Christian König
[not found] ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 9:56 ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 9:55 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Even when GPU recovery is disabled we could run into a manually
triggered recovery.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
1 file changed, 1 insertion(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index a7e39c9dd14b..7db0040ca145 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -38,31 +38,6 @@
#include "amdgpu_trace.h"
#include "amdgpu_amdkfd.h"
-/**
- * DOC: amdgpu_object
- *
- * This defines the interfaces to operate on an &amdgpu_bo buffer object which
- * represents memory used by driver (VRAM, system memory, etc.). The driver
- * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
- * to create/destroy/set buffer object which are then managed by the kernel TTM
- * memory manager.
- * The interfaces are also used internally by kernel clients, including gfx,
- * uvd, etc. for kernel managed allocations used by the GPU.
- *
- */
-
-static bool amdgpu_bo_need_backup(struct amdgpu_device *adev)
-{
- if (adev->flags & AMD_IS_APU)
- return false;
-
- if (amdgpu_gpu_recovery == 0 ||
- (amdgpu_gpu_recovery == -1 && !amdgpu_sriov_vf(adev)))
- return false;
-
- return true;
-}
-
/**
* amdgpu_bo_subtract_pin_size - Remove BO from pin_size accounting
*
@@ -596,7 +571,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
if (r)
return r;
- if ((flags & AMDGPU_GEM_CREATE_SHADOW) && amdgpu_bo_need_backup(adev)) {
+ if ((flags & AMDGPU_GEM_CREATE_SHADOW) && !(adev->flags & AMD_IS_APU)) {
if (!bp->resv)
WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
NULL));
--
2.14.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
[not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 9:55 ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
@ 2018-09-11 9:56 ` Christian König
[not found] ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 9:56 ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
` (2 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 9:56 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
They aren't directly used by the hardware.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 7db0040ca145..3a6f92de5504 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -516,7 +516,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
}
static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
- unsigned long size, int byte_align,
+ unsigned long size,
struct amdgpu_bo *bo)
{
struct amdgpu_bo_param bp;
@@ -527,7 +527,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
memset(&bp, 0, sizeof(bp));
bp.size = size;
- bp.byte_align = byte_align;
bp.domain = AMDGPU_GEM_DOMAIN_GTT;
bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
AMDGPU_GEM_CREATE_SHADOW;
@@ -576,7 +575,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
NULL));
- r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
+ r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
if (!bp->resv)
reservation_object_unlock((*bo_ptr)->tbo.resv);
--
2.14.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery
[not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 9:55 ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
2018-09-11 9:56 ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
@ 2018-09-11 9:56 ` Christian König
[not found] ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 9:56 ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
2018-09-13 9:28 ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 9:56 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
It shouldn't add much overhead and we should make sure that critical
VRAM content is always restored.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 93476b8c2e72..5eba66ecf668 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2989,7 +2989,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
}
/**
- * amdgpu_device_handle_vram_lost - Handle the loss of VRAM contents
+ * amdgpu_device_recover_vram - Recover some VRAM contents
*
* @adev: amdgpu_device pointer
*
@@ -2998,7 +2998,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
* the contents of VRAM might be lost.
* Returns 0 on success, 1 on failure.
*/
-static int amdgpu_device_handle_vram_lost(struct amdgpu_device *adev)
+static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
{
struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
struct amdgpu_bo *bo, *tmp;
@@ -3125,8 +3125,8 @@ static int amdgpu_device_reset(struct amdgpu_device *adev)
}
}
- if (!r && ((need_full_reset && !(adev->flags & AMD_IS_APU)) || vram_lost))
- r = amdgpu_device_handle_vram_lost(adev);
+ if (!r)
+ r = amdgpu_device_recover_vram(adev);
return r;
}
@@ -3172,7 +3172,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
amdgpu_virt_release_full_gpu(adev, true);
if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
atomic_inc(&adev->vram_lost_counter);
- r = amdgpu_device_handle_vram_lost(adev);
+ r = amdgpu_device_recover_vram(adev);
}
return r;
--
2.14.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
[not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
` (2 preceding siblings ...)
2018-09-11 9:56 ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
@ 2018-09-11 9:56 ` Christian König
[not found] ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13 9:28 ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 9:56 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Don't grab the reservation lock any more and simplify the handling quite
a bit.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 ++++++++---------------------
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 46 ++++--------
drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 8 +--
3 files changed, 43 insertions(+), 120 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5eba66ecf668..20bb702f5c7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2940,54 +2940,6 @@ static int amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
return 0;
}
-/**
- * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM buffers
- *
- * @adev: amdgpu_device pointer
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: amdgpu_bo buffer whose shadow is being restored
- * @fence: dma_fence associated with the operation
- *
- * Restores the VRAM buffer contents from the shadow in GTT. Used to
- * restore things like GPUVM page tables after a GPU reset where
- * the contents of VRAM might be lost.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
- struct amdgpu_ring *ring,
- struct amdgpu_bo *bo,
- struct dma_fence **fence)
-{
- uint32_t domain;
- int r;
-
- if (!bo->shadow)
- return 0;
-
- r = amdgpu_bo_reserve(bo, true);
- if (r)
- return r;
- domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
- /* if bo has been evicted, then no need to recover */
- if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
- r = amdgpu_bo_validate(bo->shadow);
- if (r) {
- DRM_ERROR("bo validate failed!\n");
- goto err;
- }
-
- r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
- NULL, fence, true);
- if (r) {
- DRM_ERROR("recover page table failed!\n");
- goto err;
- }
- }
-err:
- amdgpu_bo_unreserve(bo);
- return r;
-}
-
/**
* amdgpu_device_recover_vram - Recover some VRAM contents
*
@@ -2996,16 +2948,15 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
* Restores the contents of VRAM buffers from the shadows in GTT. Used to
* restore things like GPUVM page tables after a GPU reset where
* the contents of VRAM might be lost.
- * Returns 0 on success, 1 on failure.
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
*/
static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
{
- struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
- struct amdgpu_bo *bo, *tmp;
struct dma_fence *fence = NULL, *next = NULL;
- long r = 1;
- int i = 0;
- long tmo;
+ struct amdgpu_bo *shadow;
+ long r = 1, tmo;
if (amdgpu_sriov_runtime(adev))
tmo = msecs_to_jiffies(8000);
@@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
DRM_INFO("recover vram bo from shadow start\n");
mutex_lock(&adev->shadow_list_lock);
- list_for_each_entry_safe(bo, tmp, &adev->shadow_list, shadow_list) {
- next = NULL;
- amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
+ list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
+
+ /* No need to recover an evicted BO */
+ if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
+ shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
+ continue;
+
+ r = amdgpu_bo_restore_shadow(shadow, &next);
+ if (r)
+ break;
+
if (fence) {
r = dma_fence_wait_timeout(fence, false, tmo);
- if (r == 0)
- pr_err("wait fence %p[%d] timeout\n", fence, i);
- else if (r < 0)
- pr_err("wait fence %p[%d] interrupted\n", fence, i);
- if (r < 1) {
- dma_fence_put(fence);
- fence = next;
+ dma_fence_put(fence);
+ fence = next;
+ if (r <= 0)
break;
- }
- i++;
+ } else {
+ fence = next;
}
-
- dma_fence_put(fence);
- fence = next;
}
mutex_unlock(&adev->shadow_list_lock);
- if (fence) {
- r = dma_fence_wait_timeout(fence, false, tmo);
- if (r == 0)
- pr_err("wait fence %p[%d] timeout\n", fence, i);
- else if (r < 0)
- pr_err("wait fence %p[%d] interrupted\n", fence, i);
-
- }
+ if (fence)
+ tmo = dma_fence_wait_timeout(fence, false, tmo);
dma_fence_put(fence);
- if (r > 0)
- DRM_INFO("recover vram bo from shadow done\n");
- else
+ if (r <= 0 || tmo <= 0) {
DRM_ERROR("recover vram bo from shadow failed\n");
+ return -EIO;
+ }
- return (r > 0) ? 0 : 1;
+ DRM_INFO("recover vram bo from shadow done\n");
+ return 0;
}
/**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 3a6f92de5504..5b6d5fcc2151 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
if (!r) {
bo->shadow->parent = amdgpu_bo_ref(bo);
mutex_lock(&adev->shadow_list_lock);
- list_add_tail(&bo->shadow_list, &adev->shadow_list);
+ list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
mutex_unlock(&adev->shadow_list_lock);
}
@@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
}
/**
- * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
- * @adev: amdgpu device object
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: &amdgpu_bo buffer to be restored
- * @resv: reservation object with embedded fence
+ * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
+ *
+ * @shadow: &amdgpu_bo shadow to be restored
* @fence: dma_fence associated with the operation
- * @direct: whether to submit the job directly
*
* Copies a buffer object's shadow content back to the object.
* This is used for recovering a buffer from its shadow in case of a gpu
@@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
* Returns:
* 0 for success or a negative error code on failure.
*/
-int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
- struct amdgpu_ring *ring,
- struct amdgpu_bo *bo,
- struct reservation_object *resv,
- struct dma_fence **fence,
- bool direct)
+int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct dma_fence **fence)
{
- struct amdgpu_bo *shadow = bo->shadow;
- uint64_t bo_addr, shadow_addr;
- int r;
-
- if (!shadow)
- return -EINVAL;
-
- bo_addr = amdgpu_bo_gpu_offset(bo);
- shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
-
- r = reservation_object_reserve_shared(bo->tbo.resv);
- if (r)
- goto err;
+ struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
+ struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
+ uint64_t shadow_addr, parent_addr;
- r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
- amdgpu_bo_size(bo), resv, fence,
- direct, false);
- if (!r)
- amdgpu_bo_fence(bo, *fence, true);
+ shadow_addr = amdgpu_bo_gpu_offset(shadow);
+ parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
-err:
- return r;
+ return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
+ amdgpu_bo_size(shadow), NULL, fence,
+ true, false);
}
/**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 907fdf46d895..363db417fb2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
struct reservation_object *resv,
struct dma_fence **fence, bool direct);
int amdgpu_bo_validate(struct amdgpu_bo *bo);
-int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
- struct amdgpu_ring *ring,
- struct amdgpu_bo *bo,
- struct reservation_object *resv,
- struct dma_fence **fence,
- bool direct);
+int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
+ struct dma_fence **fence);
uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device *adev,
uint32_t domain);
--
2.14.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
[not found] ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11 12:29 ` Michel Dänzer
[not found] ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Michel Dänzer @ 2018-09-11 12:29 UTC (permalink / raw)
To: Christian König; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 2018-09-11 11:55 a.m., Christian König wrote:
> Even when GPU recovery is disabled we could run into a manually
> triggered recovery.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
> 1 file changed, 1 insertion(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index a7e39c9dd14b..7db0040ca145 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -38,31 +38,6 @@
> #include "amdgpu_trace.h"
> #include "amdgpu_amdkfd.h"
>
> -/**
> - * DOC: amdgpu_object
> - *
> - * This defines the interfaces to operate on an &amdgpu_bo buffer object which
> - * represents memory used by driver (VRAM, system memory, etc.). The driver
> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
> - * to create/destroy/set buffer object which are then managed by the kernel TTM
> - * memory manager.
> - * The interfaces are also used internally by kernel clients, including gfx,
> - * uvd, etc. for kernel managed allocations used by the GPU.
> - *
> - */
This comment shouldn't be removed, should it? Otherwise the commit log
needs to explain this, and the reference in Documentation/gpu/amdgpu.rst
needs to be removed as well.
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
[not found] ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2018-09-11 13:21 ` Christian König
[not found] ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 13:21 UTC (permalink / raw)
To: Michel Dänzer; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Am 11.09.2018 um 14:29 schrieb Michel Dänzer:
> On 2018-09-11 11:55 a.m., Christian König wrote:
>> Even when GPU recovery is disabled we could run into a manually
>> triggered recovery.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
>> 1 file changed, 1 insertion(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index a7e39c9dd14b..7db0040ca145 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -38,31 +38,6 @@
>> #include "amdgpu_trace.h"
>> #include "amdgpu_amdkfd.h"
>>
>> -/**
>> - * DOC: amdgpu_object
>> - *
>> - * This defines the interfaces to operate on an &amdgpu_bo buffer object which
>> - * represents memory used by driver (VRAM, system memory, etc.). The driver
>> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
>> - * to create/destroy/set buffer object which are then managed by the kernel TTM
>> - * memory manager.
>> - * The interfaces are also used internally by kernel clients, including gfx,
>> - * uvd, etc. for kernel managed allocations used by the GPU.
>> - *
>> - */
> This comment shouldn't be removed, should it? Otherwise the commit log
> needs to explain this, and the reference in Documentation/gpu/amdgpu.rst
> needs to be removed as well.
Oh, that was indeed an accident. Going to fix that.
Thanks,
Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
[not found] ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-09-12 12:37 ` Deng, Emily
0 siblings, 0 replies; 15+ messages in thread
From: Deng, Emily @ 2018-09-12 12:37 UTC (permalink / raw)
To: Koenig, Christian, Michel Dänzer
Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Acked-by: Emily Deng <Emily.Deng@amd.com>
>-----Original Message-----
>From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
>Christian König
>Sent: Tuesday, September 11, 2018 9:22 PM
>To: Michel Dänzer <michel@daenzer.net>
>Cc: amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
>
>Am 11.09.2018 um 14:29 schrieb Michel Dänzer:
>> On 2018-09-11 11:55 a.m., Christian König wrote:
>>> Even when GPU recovery is disabled we could run into a manually
>>> triggered recovery.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +-----------------------
>---
>>> 1 file changed, 1 insertion(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index a7e39c9dd14b..7db0040ca145 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -38,31 +38,6 @@
>>> #include "amdgpu_trace.h"
>>> #include "amdgpu_amdkfd.h"
>>>
>>> -/**
>>> - * DOC: amdgpu_object
>>> - *
>>> - * This defines the interfaces to operate on an &amdgpu_bo buffer
>>> object which
>>> - * represents memory used by driver (VRAM, system memory, etc.). The
>>> driver
>>> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these
>>> interfaces
>>> - * to create/destroy/set buffer object which are then managed by the
>>> kernel TTM
>>> - * memory manager.
>>> - * The interfaces are also used internally by kernel clients,
>>> including gfx,
>>> - * uvd, etc. for kernel managed allocations used by the GPU.
>>> - *
>>> - */
>> This comment shouldn't be removed, should it? Otherwise the commit log
>> needs to explain this, and the reference in
>> Documentation/gpu/amdgpu.rst needs to be removed as well.
>
>Oh, that was indeed an accident. Going to fix that.
>
>Thanks,
>Christian.
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves
[not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
` (3 preceding siblings ...)
2018-09-11 9:56 ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
@ 2018-09-13 9:28 ` Zhang, Jerry(Junwei)
4 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13 9:28 UTC (permalink / raw)
To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 09/11/2018 05:55 PM, Christian König wrote:
> We are going to need this for recoverable page fault handling and it
> makes shadow handling during GPU reset much more easier.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +++++-
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index b5f20b42439e..a7e39c9dd14b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1363,7 +1363,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
> {
> WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
> WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
> - !bo->pin_count);
> + !bo->pin_count && bo->tbo.type != ttm_bo_type_kernel);
> WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
> WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM &&
> !(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d9f3201c9e5c..2f32dc692d40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -524,7 +524,11 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
> if (r)
> goto error;
>
> - r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
> + /* Always block for VM page tables before committing the new location */
> + if (bo->type == ttm_bo_type_kernel)
> + r = ttm_bo_move_accel_cleanup(bo, fence, true, new_mem);
> + else
> + r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
> dma_fence_put(fence);
> return r;
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery
[not found] ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13 9:28 ` Zhang, Jerry(Junwei)
0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13 9:28 UTC (permalink / raw)
To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 09/11/2018 05:56 PM, Christian König wrote:
> It shouldn't add much overhead and we should make sure that critical
> VRAM content is always restored.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 93476b8c2e72..5eba66ecf668 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2989,7 +2989,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
> }
>
> /**
> - * amdgpu_device_handle_vram_lost - Handle the loss of VRAM contents
> + * amdgpu_device_recover_vram - Recover some VRAM contents
> *
> * @adev: amdgpu_device pointer
> *
> @@ -2998,7 +2998,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
> * the contents of VRAM might be lost.
> * Returns 0 on success, 1 on failure.
> */
> -static int amdgpu_device_handle_vram_lost(struct amdgpu_device *adev)
> +static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
> {
> struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> struct amdgpu_bo *bo, *tmp;
> @@ -3125,8 +3125,8 @@ static int amdgpu_device_reset(struct amdgpu_device *adev)
> }
> }
>
> - if (!r && ((need_full_reset && !(adev->flags & AMD_IS_APU)) || vram_lost))
> - r = amdgpu_device_handle_vram_lost(adev);
> + if (!r)
> + r = amdgpu_device_recover_vram(adev);
>
> return r;
> }
> @@ -3172,7 +3172,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
> amdgpu_virt_release_full_gpu(adev, true);
> if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
> atomic_inc(&adev->vram_lost_counter);
> - r = amdgpu_device_handle_vram_lost(adev);
> + r = amdgpu_device_recover_vram(adev);
> }
>
> return r;
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
[not found] ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13 9:29 ` Zhang, Jerry(Junwei)
[not found] ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13 9:29 UTC (permalink / raw)
To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 09/11/2018 05:56 PM, Christian König wrote:
> Don't grab the reservation lock any more and simplify the handling quite
> a bit.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 ++++++++---------------------
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 46 ++++--------
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 8 +--
> 3 files changed, 43 insertions(+), 120 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5eba66ecf668..20bb702f5c7f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2940,54 +2940,6 @@ static int amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
> return 0;
> }
>
> -/**
> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM buffers
> - *
> - * @adev: amdgpu_device pointer
> - * @ring: amdgpu_ring for the engine handling the buffer operations
> - * @bo: amdgpu_bo buffer whose shadow is being restored
> - * @fence: dma_fence associated with the operation
> - *
> - * Restores the VRAM buffer contents from the shadow in GTT. Used to
> - * restore things like GPUVM page tables after a GPU reset where
> - * the contents of VRAM might be lost.
> - * Returns 0 on success, negative error code on failure.
> - */
> -static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
> - struct amdgpu_ring *ring,
> - struct amdgpu_bo *bo,
> - struct dma_fence **fence)
> -{
> - uint32_t domain;
> - int r;
> -
> - if (!bo->shadow)
> - return 0;
> -
> - r = amdgpu_bo_reserve(bo, true);
> - if (r)
> - return r;
> - domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> - /* if bo has been evicted, then no need to recover */
> - if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
> - r = amdgpu_bo_validate(bo->shadow);
> - if (r) {
> - DRM_ERROR("bo validate failed!\n");
> - goto err;
> - }
> -
> - r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
> - NULL, fence, true);
> - if (r) {
> - DRM_ERROR("recover page table failed!\n");
> - goto err;
> - }
> - }
> -err:
> - amdgpu_bo_unreserve(bo);
> - return r;
> -}
> -
> /**
> * amdgpu_device_recover_vram - Recover some VRAM contents
> *
> @@ -2996,16 +2948,15 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
> * Restores the contents of VRAM buffers from the shadows in GTT. Used to
> * restore things like GPUVM page tables after a GPU reset where
> * the contents of VRAM might be lost.
> - * Returns 0 on success, 1 on failure.
> + *
> + * Returns:
> + * 0 on success, negative error code on failure.
> */
> static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
> {
> - struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> - struct amdgpu_bo *bo, *tmp;
> struct dma_fence *fence = NULL, *next = NULL;
> - long r = 1;
> - int i = 0;
> - long tmo;
> + struct amdgpu_bo *shadow;
> + long r = 1, tmo;
>
> if (amdgpu_sriov_runtime(adev))
> tmo = msecs_to_jiffies(8000);
> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>
> DRM_INFO("recover vram bo from shadow start\n");
> mutex_lock(&adev->shadow_list_lock);
> - list_for_each_entry_safe(bo, tmp, &adev->shadow_list, shadow_list) {
> - next = NULL;
> - amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
> + list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
> +
> + /* No need to recover an evicted BO */
> + if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
> + shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
is there a change that shadow bo evicted to other domain?
like SYSTEM?
Regards,
Jerry
> + continue;
> +
> + r = amdgpu_bo_restore_shadow(shadow, &next);
> + if (r)
> + break;
> +
> if (fence) {
> r = dma_fence_wait_timeout(fence, false, tmo);
> - if (r == 0)
> - pr_err("wait fence %p[%d] timeout\n", fence, i);
> - else if (r < 0)
> - pr_err("wait fence %p[%d] interrupted\n", fence, i);
> - if (r < 1) {
> - dma_fence_put(fence);
> - fence = next;
> + dma_fence_put(fence);
> + fence = next;
> + if (r <= 0)
> break;
> - }
> - i++;
> + } else {
> + fence = next;
> }
> -
> - dma_fence_put(fence);
> - fence = next;
> }
> mutex_unlock(&adev->shadow_list_lock);
>
> - if (fence) {
> - r = dma_fence_wait_timeout(fence, false, tmo);
> - if (r == 0)
> - pr_err("wait fence %p[%d] timeout\n", fence, i);
> - else if (r < 0)
> - pr_err("wait fence %p[%d] interrupted\n", fence, i);
> -
> - }
> + if (fence)
> + tmo = dma_fence_wait_timeout(fence, false, tmo);
> dma_fence_put(fence);
>
> - if (r > 0)
> - DRM_INFO("recover vram bo from shadow done\n");
> - else
> + if (r <= 0 || tmo <= 0) {
> DRM_ERROR("recover vram bo from shadow failed\n");
> + return -EIO;
> + }
>
> - return (r > 0) ? 0 : 1;
> + DRM_INFO("recover vram bo from shadow done\n");
> + return 0;
> }
>
> /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 3a6f92de5504..5b6d5fcc2151 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
> if (!r) {
> bo->shadow->parent = amdgpu_bo_ref(bo);
> mutex_lock(&adev->shadow_list_lock);
> - list_add_tail(&bo->shadow_list, &adev->shadow_list);
> + list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
> mutex_unlock(&adev->shadow_list_lock);
> }
>
> @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
> }
>
> /**
> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
> - * @adev: amdgpu device object
> - * @ring: amdgpu_ring for the engine handling the buffer operations
> - * @bo: &amdgpu_bo buffer to be restored
> - * @resv: reservation object with embedded fence
> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
> + *
> + * @shadow: &amdgpu_bo shadow to be restored
> * @fence: dma_fence associated with the operation
> - * @direct: whether to submit the job directly
> *
> * Copies a buffer object's shadow content back to the object.
> * This is used for recovering a buffer from its shadow in case of a gpu
> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
> * Returns:
> * 0 for success or a negative error code on failure.
> */
> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> - struct amdgpu_ring *ring,
> - struct amdgpu_bo *bo,
> - struct reservation_object *resv,
> - struct dma_fence **fence,
> - bool direct)
> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct dma_fence **fence)
>
> {
> - struct amdgpu_bo *shadow = bo->shadow;
> - uint64_t bo_addr, shadow_addr;
> - int r;
> -
> - if (!shadow)
> - return -EINVAL;
> -
> - bo_addr = amdgpu_bo_gpu_offset(bo);
> - shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
> -
> - r = reservation_object_reserve_shared(bo->tbo.resv);
> - if (r)
> - goto err;
> + struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
> + struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> + uint64_t shadow_addr, parent_addr;
>
> - r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
> - amdgpu_bo_size(bo), resv, fence,
> - direct, false);
> - if (!r)
> - amdgpu_bo_fence(bo, *fence, true);
> + shadow_addr = amdgpu_bo_gpu_offset(shadow);
> + parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>
> -err:
> - return r;
> + return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
> + amdgpu_bo_size(shadow), NULL, fence,
> + true, false);
> }
>
> /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 907fdf46d895..363db417fb2e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
> struct reservation_object *resv,
> struct dma_fence **fence, bool direct);
> int amdgpu_bo_validate(struct amdgpu_bo *bo);
> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> - struct amdgpu_ring *ring,
> - struct amdgpu_bo *bo,
> - struct reservation_object *resv,
> - struct dma_fence **fence,
> - bool direct);
> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
> + struct dma_fence **fence);
> uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device *adev,
> uint32_t domain);
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
[not found] ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13 9:30 ` Zhang, Jerry(Junwei)
0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13 9:30 UTC (permalink / raw)
To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 09/11/2018 05:56 PM, Christian König wrote:
> They aren't directly used by the hardware.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 7db0040ca145..3a6f92de5504 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -516,7 +516,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
> }
>
> static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
> - unsigned long size, int byte_align,
> + unsigned long size,
> struct amdgpu_bo *bo)
> {
> struct amdgpu_bo_param bp;
> @@ -527,7 +527,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
>
> memset(&bp, 0, sizeof(bp));
> bp.size = size;
> - bp.byte_align = byte_align;
> bp.domain = AMDGPU_GEM_DOMAIN_GTT;
> bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
> AMDGPU_GEM_CREATE_SHADOW;
> @@ -576,7 +575,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
> WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
> NULL));
>
> - r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
> + r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
>
> if (!bp->resv)
> reservation_object_unlock((*bo_ptr)->tbo.resv);
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
[not found] ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-14 11:54 ` Christian König
[not found] ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-14 11:54 UTC (permalink / raw)
To: Zhang, Jerry(Junwei), amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Am 13.09.2018 um 11:29 schrieb Zhang, Jerry(Junwei):
> On 09/11/2018 05:56 PM, Christian König wrote:
>> Don't grab the reservation lock any more and simplify the handling quite
>> a bit.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109
>> ++++++++---------------------
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 46 ++++--------
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 8 +--
>> 3 files changed, 43 insertions(+), 120 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 5eba66ecf668..20bb702f5c7f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2940,54 +2940,6 @@ static int
>> amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>> return 0;
>> }
>> -/**
>> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM
>> buffers
>> - *
>> - * @adev: amdgpu_device pointer
>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>> - * @bo: amdgpu_bo buffer whose shadow is being restored
>> - * @fence: dma_fence associated with the operation
>> - *
>> - * Restores the VRAM buffer contents from the shadow in GTT. Used to
>> - * restore things like GPUVM page tables after a GPU reset where
>> - * the contents of VRAM might be lost.
>> - * Returns 0 on success, negative error code on failure.
>> - */
>> -static int amdgpu_device_recover_vram_from_shadow(struct
>> amdgpu_device *adev,
>> - struct amdgpu_ring *ring,
>> - struct amdgpu_bo *bo,
>> - struct dma_fence **fence)
>> -{
>> - uint32_t domain;
>> - int r;
>> -
>> - if (!bo->shadow)
>> - return 0;
>> -
>> - r = amdgpu_bo_reserve(bo, true);
>> - if (r)
>> - return r;
>> - domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>> - /* if bo has been evicted, then no need to recover */
>> - if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>> - r = amdgpu_bo_validate(bo->shadow);
>> - if (r) {
>> - DRM_ERROR("bo validate failed!\n");
>> - goto err;
>> - }
>> -
>> - r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
>> - NULL, fence, true);
>> - if (r) {
>> - DRM_ERROR("recover page table failed!\n");
>> - goto err;
>> - }
>> - }
>> -err:
>> - amdgpu_bo_unreserve(bo);
>> - return r;
>> -}
>> -
>> /**
>> * amdgpu_device_recover_vram - Recover some VRAM contents
>> *
>> @@ -2996,16 +2948,15 @@ static int
>> amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>> * Restores the contents of VRAM buffers from the shadows in GTT.
>> Used to
>> * restore things like GPUVM page tables after a GPU reset where
>> * the contents of VRAM might be lost.
>> - * Returns 0 on success, 1 on failure.
>> + *
>> + * Returns:
>> + * 0 on success, negative error code on failure.
>> */
>> static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>> {
>> - struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>> - struct amdgpu_bo *bo, *tmp;
>> struct dma_fence *fence = NULL, *next = NULL;
>> - long r = 1;
>> - int i = 0;
>> - long tmo;
>> + struct amdgpu_bo *shadow;
>> + long r = 1, tmo;
>> if (amdgpu_sriov_runtime(adev))
>> tmo = msecs_to_jiffies(8000);
>> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct
>> amdgpu_device *adev)
>> DRM_INFO("recover vram bo from shadow start\n");
>> mutex_lock(&adev->shadow_list_lock);
>> - list_for_each_entry_safe(bo, tmp, &adev->shadow_list,
>> shadow_list) {
>> - next = NULL;
>> - amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
>> + list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
>> +
>> + /* No need to recover an evicted BO */
>> + if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
>> + shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
> is there a change that shadow bo evicted to other domain?
> like SYSTEM?
Yes, that's why I test "!= TTM_PL_TT" here.
What can happen is that either the shadow or the page table or page
directory is evicted.
But in this case we don't need to restore anything because of patch #1
in this series.
Regards,
Christian.
>
> Regards,
> Jerry
>> + continue;
>> +
>> + r = amdgpu_bo_restore_shadow(shadow, &next);
>> + if (r)
>> + break;
>> +
>> if (fence) {
>> r = dma_fence_wait_timeout(fence, false, tmo);
>> - if (r == 0)
>> - pr_err("wait fence %p[%d] timeout\n", fence, i);
>> - else if (r < 0)
>> - pr_err("wait fence %p[%d] interrupted\n", fence, i);
>> - if (r < 1) {
>> - dma_fence_put(fence);
>> - fence = next;
>> + dma_fence_put(fence);
>> + fence = next;
>> + if (r <= 0)
>> break;
>> - }
>> - i++;
>> + } else {
>> + fence = next;
>> }
>> -
>> - dma_fence_put(fence);
>> - fence = next;
>> }
>> mutex_unlock(&adev->shadow_list_lock);
>> - if (fence) {
>> - r = dma_fence_wait_timeout(fence, false, tmo);
>> - if (r == 0)
>> - pr_err("wait fence %p[%d] timeout\n", fence, i);
>> - else if (r < 0)
>> - pr_err("wait fence %p[%d] interrupted\n", fence, i);
>> -
>> - }
>> + if (fence)
>> + tmo = dma_fence_wait_timeout(fence, false, tmo);
>> dma_fence_put(fence);
>> - if (r > 0)
>> - DRM_INFO("recover vram bo from shadow done\n");
>> - else
>> + if (r <= 0 || tmo <= 0) {
>> DRM_ERROR("recover vram bo from shadow failed\n");
>> + return -EIO;
>> + }
>> - return (r > 0) ? 0 : 1;
>> + DRM_INFO("recover vram bo from shadow done\n");
>> + return 0;
>> }
>> /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 3a6f92de5504..5b6d5fcc2151 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct
>> amdgpu_device *adev,
>> if (!r) {
>> bo->shadow->parent = amdgpu_bo_ref(bo);
>> mutex_lock(&adev->shadow_list_lock);
>> - list_add_tail(&bo->shadow_list, &adev->shadow_list);
>> + list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>> mutex_unlock(&adev->shadow_list_lock);
>> }
>> @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>> }
>> /**
>> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
>> - * @adev: amdgpu device object
>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>> - * @bo: &amdgpu_bo buffer to be restored
>> - * @resv: reservation object with embedded fence
>> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
>> + *
>> + * @shadow: &amdgpu_bo shadow to be restored
>> * @fence: dma_fence associated with the operation
>> - * @direct: whether to submit the job directly
>> *
>> * Copies a buffer object's shadow content back to the object.
>> * This is used for recovering a buffer from its shadow in case of
>> a gpu
>> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>> * Returns:
>> * 0 for success or a negative error code on failure.
>> */
>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>> - struct amdgpu_ring *ring,
>> - struct amdgpu_bo *bo,
>> - struct reservation_object *resv,
>> - struct dma_fence **fence,
>> - bool direct)
>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct
>> dma_fence **fence)
>> {
>> - struct amdgpu_bo *shadow = bo->shadow;
>> - uint64_t bo_addr, shadow_addr;
>> - int r;
>> -
>> - if (!shadow)
>> - return -EINVAL;
>> -
>> - bo_addr = amdgpu_bo_gpu_offset(bo);
>> - shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
>> -
>> - r = reservation_object_reserve_shared(bo->tbo.resv);
>> - if (r)
>> - goto err;
>> + struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
>> + struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>> + uint64_t shadow_addr, parent_addr;
>> - r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
>> - amdgpu_bo_size(bo), resv, fence,
>> - direct, false);
>> - if (!r)
>> - amdgpu_bo_fence(bo, *fence, true);
>> + shadow_addr = amdgpu_bo_gpu_offset(shadow);
>> + parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>> -err:
>> - return r;
>> + return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
>> + amdgpu_bo_size(shadow), NULL, fence,
>> + true, false);
>> }
>> /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> index 907fdf46d895..363db417fb2e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct
>> amdgpu_device *adev,
>> struct reservation_object *resv,
>> struct dma_fence **fence, bool direct);
>> int amdgpu_bo_validate(struct amdgpu_bo *bo);
>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>> - struct amdgpu_ring *ring,
>> - struct amdgpu_bo *bo,
>> - struct reservation_object *resv,
>> - struct dma_fence **fence,
>> - bool direct);
>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
>> + struct dma_fence **fence);
>> uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device
>> *adev,
>> uint32_t domain);
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
[not found] ` <20180914134257.2196-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-14 13:42 ` Christian König
0 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2018-09-14 13:42 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
They aren't directly used by the hardware.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index d8e8d653d518..650c45c896f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -526,7 +526,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
}
static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
- unsigned long size, int byte_align,
+ unsigned long size,
struct amdgpu_bo *bo)
{
struct amdgpu_bo_param bp;
@@ -537,7 +537,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
memset(&bp, 0, sizeof(bp));
bp.size = size;
- bp.byte_align = byte_align;
bp.domain = AMDGPU_GEM_DOMAIN_GTT;
bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
AMDGPU_GEM_CREATE_SHADOW;
@@ -586,7 +585,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
NULL));
- r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
+ r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
if (!bp->resv)
reservation_object_unlock((*bo_ptr)->tbo.resv);
--
2.14.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
[not found] ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-09-18 6:15 ` Zhang, Jerry(Junwei)
0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-18 6:15 UTC (permalink / raw)
To: christian.koenig-5C7GfCeVMHo,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 09/14/2018 07:54 PM, Christian König wrote:
> Am 13.09.2018 um 11:29 schrieb Zhang, Jerry(Junwei):
>> On 09/11/2018 05:56 PM, Christian König wrote:
>>> Don't grab the reservation lock any more and simplify the handling
>>> quite
>>> a bit.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109
>>> ++++++++---------------------
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 46 ++++--------
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 8 +--
>>> 3 files changed, 43 insertions(+), 120 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 5eba66ecf668..20bb702f5c7f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2940,54 +2940,6 @@ static int
>>> amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>>> return 0;
>>> }
>>> -/**
>>> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM
>>> buffers
>>> - *
>>> - * @adev: amdgpu_device pointer
>>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>>> - * @bo: amdgpu_bo buffer whose shadow is being restored
>>> - * @fence: dma_fence associated with the operation
>>> - *
>>> - * Restores the VRAM buffer contents from the shadow in GTT. Used to
>>> - * restore things like GPUVM page tables after a GPU reset where
>>> - * the contents of VRAM might be lost.
>>> - * Returns 0 on success, negative error code on failure.
>>> - */
>>> -static int amdgpu_device_recover_vram_from_shadow(struct
>>> amdgpu_device *adev,
>>> - struct amdgpu_ring *ring,
>>> - struct amdgpu_bo *bo,
>>> - struct dma_fence **fence)
>>> -{
>>> - uint32_t domain;
>>> - int r;
>>> -
>>> - if (!bo->shadow)
>>> - return 0;
>>> -
>>> - r = amdgpu_bo_reserve(bo, true);
>>> - if (r)
>>> - return r;
>>> - domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>>> - /* if bo has been evicted, then no need to recover */
>>> - if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>> - r = amdgpu_bo_validate(bo->shadow);
>>> - if (r) {
>>> - DRM_ERROR("bo validate failed!\n");
>>> - goto err;
>>> - }
>>> -
>>> - r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
>>> - NULL, fence, true);
>>> - if (r) {
>>> - DRM_ERROR("recover page table failed!\n");
>>> - goto err;
>>> - }
>>> - }
>>> -err:
>>> - amdgpu_bo_unreserve(bo);
>>> - return r;
>>> -}
>>> -
>>> /**
>>> * amdgpu_device_recover_vram - Recover some VRAM contents
>>> *
>>> @@ -2996,16 +2948,15 @@ static int
>>> amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>>> * Restores the contents of VRAM buffers from the shadows in GTT.
>>> Used to
>>> * restore things like GPUVM page tables after a GPU reset where
>>> * the contents of VRAM might be lost.
>>> - * Returns 0 on success, 1 on failure.
>>> + *
>>> + * Returns:
>>> + * 0 on success, negative error code on failure.
>>> */
>>> static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>>> {
>>> - struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>>> - struct amdgpu_bo *bo, *tmp;
>>> struct dma_fence *fence = NULL, *next = NULL;
>>> - long r = 1;
>>> - int i = 0;
>>> - long tmo;
>>> + struct amdgpu_bo *shadow;
>>> + long r = 1, tmo;
>>> if (amdgpu_sriov_runtime(adev))
>>> tmo = msecs_to_jiffies(8000);
>>> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct
>>> amdgpu_device *adev)
>>> DRM_INFO("recover vram bo from shadow start\n");
>>> mutex_lock(&adev->shadow_list_lock);
>>> - list_for_each_entry_safe(bo, tmp, &adev->shadow_list,
>>> shadow_list) {
>>> - next = NULL;
>>> - amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
>>> + list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
>>> +
>>> + /* No need to recover an evicted BO */
>>> + if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
>>> + shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
>> is there a change that shadow bo evicted to other domain?
>> like SYSTEM?
>
> Yes, that's why I test "!= TTM_PL_TT" here.
>
> What can happen is that either the shadow or the page table or page
> directory is evicted.
>
> But in this case we don't need to restore anything because of patch #1
> in this series.
Thanks, then it's
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
Regards,
Jerry
>
> Regards,
> Christian.
>
>>
>> Regards,
>> Jerry
>>> + continue;
>>> +
>>> + r = amdgpu_bo_restore_shadow(shadow, &next);
>>> + if (r)
>>> + break;
>>> +
>>> if (fence) {
>>> r = dma_fence_wait_timeout(fence, false, tmo);
>>> - if (r == 0)
>>> - pr_err("wait fence %p[%d] timeout\n", fence, i);
>>> - else if (r < 0)
>>> - pr_err("wait fence %p[%d] interrupted\n", fence, i);
>>> - if (r < 1) {
>>> - dma_fence_put(fence);
>>> - fence = next;
>>> + dma_fence_put(fence);
>>> + fence = next;
>>> + if (r <= 0)
>>> break;
>>> - }
>>> - i++;
>>> + } else {
>>> + fence = next;
>>> }
>>> -
>>> - dma_fence_put(fence);
>>> - fence = next;
>>> }
>>> mutex_unlock(&adev->shadow_list_lock);
>>> - if (fence) {
>>> - r = dma_fence_wait_timeout(fence, false, tmo);
>>> - if (r == 0)
>>> - pr_err("wait fence %p[%d] timeout\n", fence, i);
>>> - else if (r < 0)
>>> - pr_err("wait fence %p[%d] interrupted\n", fence, i);
>>> -
>>> - }
>>> + if (fence)
>>> + tmo = dma_fence_wait_timeout(fence, false, tmo);
>>> dma_fence_put(fence);
>>> - if (r > 0)
>>> - DRM_INFO("recover vram bo from shadow done\n");
>>> - else
>>> + if (r <= 0 || tmo <= 0) {
>>> DRM_ERROR("recover vram bo from shadow failed\n");
>>> + return -EIO;
>>> + }
>>> - return (r > 0) ? 0 : 1;
>>> + DRM_INFO("recover vram bo from shadow done\n");
>>> + return 0;
>>> }
>>> /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 3a6f92de5504..5b6d5fcc2151 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct
>>> amdgpu_device *adev,
>>> if (!r) {
>>> bo->shadow->parent = amdgpu_bo_ref(bo);
>>> mutex_lock(&adev->shadow_list_lock);
>>> - list_add_tail(&bo->shadow_list, &adev->shadow_list);
>>> + list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>>> mutex_unlock(&adev->shadow_list_lock);
>>> }
>>> @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>> }
>>> /**
>>> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
>>> - * @adev: amdgpu device object
>>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>>> - * @bo: &amdgpu_bo buffer to be restored
>>> - * @resv: reservation object with embedded fence
>>> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
>>> + *
>>> + * @shadow: &amdgpu_bo shadow to be restored
>>> * @fence: dma_fence associated with the operation
>>> - * @direct: whether to submit the job directly
>>> *
>>> * Copies a buffer object's shadow content back to the object.
>>> * This is used for recovering a buffer from its shadow in case of
>>> a gpu
>>> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>> * Returns:
>>> * 0 for success or a negative error code on failure.
>>> */
>>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>>> - struct amdgpu_ring *ring,
>>> - struct amdgpu_bo *bo,
>>> - struct reservation_object *resv,
>>> - struct dma_fence **fence,
>>> - bool direct)
>>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct
>>> dma_fence **fence)
>>> {
>>> - struct amdgpu_bo *shadow = bo->shadow;
>>> - uint64_t bo_addr, shadow_addr;
>>> - int r;
>>> -
>>> - if (!shadow)
>>> - return -EINVAL;
>>> -
>>> - bo_addr = amdgpu_bo_gpu_offset(bo);
>>> - shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
>>> -
>>> - r = reservation_object_reserve_shared(bo->tbo.resv);
>>> - if (r)
>>> - goto err;
>>> + struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
>>> + struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>>> + uint64_t shadow_addr, parent_addr;
>>> - r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
>>> - amdgpu_bo_size(bo), resv, fence,
>>> - direct, false);
>>> - if (!r)
>>> - amdgpu_bo_fence(bo, *fence, true);
>>> + shadow_addr = amdgpu_bo_gpu_offset(shadow);
>>> + parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>>> -err:
>>> - return r;
>>> + return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
>>> + amdgpu_bo_size(shadow), NULL, fence,
>>> + true, false);
>>> }
>>> /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 907fdf46d895..363db417fb2e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct
>>> amdgpu_device *adev,
>>> struct reservation_object *resv,
>>> struct dma_fence **fence, bool direct);
>>> int amdgpu_bo_validate(struct amdgpu_bo *bo);
>>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>>> - struct amdgpu_ring *ring,
>>> - struct amdgpu_bo *bo,
>>> - struct reservation_object *resv,
>>> - struct dma_fence **fence,
>>> - bool direct);
>>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
>>> + struct dma_fence **fence);
>>> uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device
>>> *adev,
>>> uint32_t domain);
>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-09-18 6:15 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-09-11 9:55 [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Christian König
[not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 9:55 ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
[not found] ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 12:29 ` Michel Dänzer
[not found] ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
2018-09-11 13:21 ` Christian König
[not found] ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-09-12 12:37 ` Deng, Emily
2018-09-11 9:56 ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
[not found] ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13 9:30 ` Zhang, Jerry(Junwei)
2018-09-11 9:56 ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
[not found] ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13 9:28 ` Zhang, Jerry(Junwei)
2018-09-11 9:56 ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
[not found] ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13 9:29 ` Zhang, Jerry(Junwei)
[not found] ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
2018-09-14 11:54 ` Christian König
[not found] ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-09-18 6:15 ` Zhang, Jerry(Junwei)
2018-09-13 9:28 ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
-- strict thread matches above, loose matches on Subject: below --
2018-09-14 13:42 Christian König
[not found] ` <20180914134257.2196-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-14 13:42 ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox