dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] Visible VRAM Management Improvements
@ 2017-06-28  2:33 John Brooks
  2017-06-28  2:33 ` [PATCH 1/5] drm/amdgpu: Add vis_vramlimit module parameter John Brooks
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: John Brooks @ 2017-06-28  2:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Michel Dänzer,
	Marek Olšák, Christian König
  Cc: David Airlie, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Changes in this version:
- Dropped former patch 1 ("Separate placements and busy placements") as it was
  no longer necessary
- Dropped former patch 4 ("Don't force BOs into visible VRAM if they can go to
  GTT instead") as it was unnecessary and had too many side effects (Thanks
  Christian)
- Dropped former patches 8 and 9 ("Asynchronously move BOs to visible VRAM" and
  an associated optimization), as there may be better ways to address the root
  of the problem
- Added a u64 cast to patch 1 (Thanks Christian)
- Optimized patch 2 in the case where visible VRAM is not smaller than total
  VRAM (Thanks Michel)

The series as it is now fixes Dying Light and improves DiRT Rally frame times.
Without the asynchronous moves, there is still occasional stalling in DiRT
Rally.


The "Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS" patch received
objections, but I have not removed it yet as there seems to be some support for
the idea. The precise details of when to set and clear the flag may need further
discussion.

I note that if the patch is removed (but retaining the preventative measures
against repeated moves back and forth between GTT and invisible VRAM), there
are very few evictions but the average framerate drops from ~89 to ~82.
Presumably we'll see either higher memory presure or lower framerates depending
on whether userspace over- or under-estimates when to set this flag. What we do
in the kernel may depend on which side of this tradeoff we want to end up on
for now.

--
John Brooks (Frogging101)

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/5] drm/amdgpu: Add vis_vramlimit module parameter
  2017-06-28  2:33 [PATCH v2] Visible VRAM Management Improvements John Brooks
@ 2017-06-28  2:33 ` John Brooks
  2017-06-28  2:33 ` [PATCH 2/5] drm/amdgpu: Throttle visible VRAM moves separately John Brooks
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: John Brooks @ 2017-06-28  2:33 UTC (permalink / raw)
  To: amd-gfx, Michel Dänzer, Marek Olšák,
	Christian König
  Cc: dri-devel

Allow specifying a limit on visible VRAM via a module parameter. This is
helpful for testing performance under visible VRAM pressure.

v2: Add cast to 64-bit (Christian König)

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h     | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 ++++++++
 3 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 12d61ed..06ed45c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -74,6 +74,7 @@
  */
 extern int amdgpu_modeset;
 extern int amdgpu_vram_limit;
+extern int amdgpu_vis_vram_limit;
 extern int amdgpu_gart_size;
 extern int amdgpu_moverate;
 extern int amdgpu_benchmarking;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 4c7c262..a14f458 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -73,6 +73,7 @@
 #define KMS_DRIVER_PATCHLEVEL	0
 
 int amdgpu_vram_limit = 0;
+int amdgpu_vis_vram_limit = 0;
 int amdgpu_gart_size = -1; /* auto */
 int amdgpu_moverate = -1; /* auto */
 int amdgpu_benchmarking = 0;
@@ -119,6 +120,9 @@ int amdgpu_lbpw = -1;
 MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
 module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
 
+MODULE_PARM_DESC(vis_vramlimit, "Restrict visible VRAM for testing, in megabytes");
+module_param_named(vis_vramlimit, amdgpu_vis_vram_limit, int, 0444);
+
 MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 64, etc., -1 = auto)");
 module_param_named(gartsize, amdgpu_gart_size, int, 0600);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c9b131b..7d14ae1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1095,6 +1095,7 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
 int amdgpu_ttm_init(struct amdgpu_device *adev)
 {
 	int r;
+	u64 vis_vram_limit;
 
 	r = amdgpu_ttm_global_init(adev);
 	if (r) {
@@ -1118,6 +1119,13 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 		DRM_ERROR("Failed initializing VRAM heap.\n");
 		return r;
 	}
+
+	/* Reduce size of CPU-visible VRAM if requested */
+	vis_vram_limit = (u64)amdgpu_vis_vram_limit * 1024 * 1024;
+	if (amdgpu_vis_vram_limit > 0 &&
+	    vis_vram_limit <= adev->mc.visible_vram_size)
+		adev->mc.visible_vram_size = vis_vram_limit;
+
 	/* Change the size here instead of the init above so only lpfn is affected */
 	amdgpu_ttm_set_active_vram_size(adev, adev->mc.visible_vram_size);
 
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/5] drm/amdgpu: Throttle visible VRAM moves separately
  2017-06-28  2:33 [PATCH v2] Visible VRAM Management Improvements John Brooks
  2017-06-28  2:33 ` [PATCH 1/5] drm/amdgpu: Add vis_vramlimit module parameter John Brooks
@ 2017-06-28  2:33 ` John Brooks
  2017-06-28  2:33 ` [PATCH 3/5] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo John Brooks
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: John Brooks @ 2017-06-28  2:33 UTC (permalink / raw)
  To: amd-gfx, Michel Dänzer, Marek Olšák,
	Christian König
  Cc: dri-devel

The BO move throttling code is designed to allow VRAM to fill quickly if it
is relatively empty. However, this does not take into account situations
where the visible VRAM is smaller than total VRAM, and total VRAM may not
be close to full but the visible VRAM segment is under pressure. In such
situations, visible VRAM would experience unrestricted swapping and
performance would drop.

Add a separate counter specifically for moves involving visible VRAM, and
check it before moving BOs there.

v2: Only perform calculations for separate counter if visible VRAM is
    smaller than total VRAM. (Michel Dänzer)

Fixes: 95844d20ae02 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2))
Signed-off-by: John Brooks <john@fastquake.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 84 +++++++++++++++++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 ++--
 3 files changed, 78 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 06ed45c..7366115 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1157,7 +1157,9 @@ struct amdgpu_cs_parser {
 	struct list_head		validated;
 	struct dma_fence		*fence;
 	uint64_t			bytes_moved_threshold;
+	uint64_t			bytes_moved_vis_threshold;
 	uint64_t			bytes_moved;
+	uint64_t			bytes_moved_vis;
 	struct amdgpu_bo_list_entry	*evictable;
 
 	/* user fence */
@@ -1591,6 +1593,7 @@ struct amdgpu_device {
 		spinlock_t		lock;
 		s64			last_update_us;
 		s64			accum_us; /* accumulated microseconds */
+		s64			accum_us_vis; /* for visible VRAM */
 		u32			log2_max_MBps;
 	} mm_stats;
 
@@ -1926,7 +1929,8 @@ bool amdgpu_need_post(struct amdgpu_device *adev);
 void amdgpu_update_display_priority(struct amdgpu_device *adev);
 
 int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data);
-void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes);
+void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
+				  u64 num_vis_bytes);
 void amdgpu_ttm_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
 bool amdgpu_ttm_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
 int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index aeee684..1dfa847 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -223,11 +223,13 @@ static s64 bytes_to_us(struct amdgpu_device *adev, u64 bytes)
  * ticks. The accumulated microseconds (us) are converted to bytes and
  * returned.
  */
-static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
+static void amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev,
+					      u64 *max_bytes,
+					      u64 *max_vis_bytes)
 {
 	s64 time_us, increment_us;
-	u64 max_bytes;
 	u64 free_vram, total_vram, used_vram;
+	u64 free_vis_vram = 0, total_vis_vram = 0, used_vis_vram = 0;
 
 	/* Allow a maximum of 200 accumulated ms. This is basically per-IB
 	 * throttling.
@@ -238,13 +240,23 @@ static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
 	 */
 	const s64 us_upper_bound = 200000;
 
-	if (!adev->mm_stats.log2_max_MBps)
-		return 0;
+	if (!adev->mm_stats.log2_max_MBps) {
+		*max_bytes = 0;
+		*max_vis_bytes = 0;
+		return;
+	}
 
 	total_vram = adev->mc.real_vram_size - adev->vram_pin_size;
 	used_vram = atomic64_read(&adev->vram_usage);
 	free_vram = used_vram >= total_vram ? 0 : total_vram - used_vram;
 
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {
+		total_vis_vram = adev->mc.visible_vram_size;
+		used_vis_vram = atomic64_read(&adev->vram_vis_usage);
+		free_vis_vram = used_vis_vram >= total_vis_vram ?
+			0 : total_vis_vram - used_vis_vram;
+	}
+
 	spin_lock(&adev->mm_stats.lock);
 
 	/* Increase the amount of accumulated us. */
@@ -252,7 +264,11 @@ static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
 	increment_us = time_us - adev->mm_stats.last_update_us;
 	adev->mm_stats.last_update_us = time_us;
 	adev->mm_stats.accum_us = min(adev->mm_stats.accum_us + increment_us,
-                                      us_upper_bound);
+				      us_upper_bound);
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {
+		adev->mm_stats.accum_us_vis = min(adev->mm_stats.accum_us_vis +
+			increment_us, us_upper_bound);
+	}
 
 	/* This prevents the short period of low performance when the VRAM
 	 * usage is low and the driver is in debt or doesn't have enough
@@ -280,23 +296,36 @@ static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
 		adev->mm_stats.accum_us = max(min_us, adev->mm_stats.accum_us);
 	}
 
-	/* This returns 0 if the driver is in debt to disallow (optional)
+	/* Do the same for visible VRAM if half of it is free */
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+	    free_vis_vram >= total_vis_vram / 2) {
+		adev->mm_stats.accum_us_vis = max(bytes_to_us(adev,
+						  free_vis_vram / 2),
+						  adev->mm_stats.accum_us_vis);
+	}
+
+	/* This is set to 0 if the driver is in debt to disallow (optional)
 	 * buffer moves.
 	 */
-	max_bytes = us_to_bytes(adev, adev->mm_stats.accum_us);
+	*max_bytes = us_to_bytes(adev, adev->mm_stats.accum_us);
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size)
+		*max_vis_bytes = us_to_bytes(adev, adev->mm_stats.accum_us_vis);
+	else
+		*max_vis_bytes = 0;
 
 	spin_unlock(&adev->mm_stats.lock);
-	return max_bytes;
 }
 
 /* Report how many bytes have really been moved for the last command
  * submission. This can result in a debt that can stop buffer migrations
  * temporarily.
  */
-void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes)
+void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
+				  u64 num_vis_bytes)
 {
 	spin_lock(&adev->mm_stats.lock);
 	adev->mm_stats.accum_us -= bytes_to_us(adev, num_bytes);
+	adev->mm_stats.accum_us_vis -= bytes_to_us(adev, num_vis_bytes);
 	spin_unlock(&adev->mm_stats.lock);
 }
 
@@ -304,7 +333,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 				 struct amdgpu_bo *bo)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-	u64 initial_bytes_moved;
+	u64 initial_bytes_moved, bytes_moved;
 	uint32_t domain;
 	int r;
 
@@ -314,17 +343,34 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 	/* Don't move this buffer if we have depleted our allowance
 	 * to move it. Don't move anything if the threshold is zero.
 	 */
-	if (p->bytes_moved < p->bytes_moved_threshold)
-		domain = bo->prefered_domains;
-	else
+	if (p->bytes_moved < p->bytes_moved_threshold) {
+		if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+		    (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) {
+			/* And don't move a CPU_ACCESS_REQUIRED BO to limited
+			 * visible VRAM if we've depleted our allowance to do
+			 * that.
+			 */
+			if (p->bytes_moved_vis < p->bytes_moved_vis_threshold)
+				domain = bo->prefered_domains;
+			else
+				domain = bo->allowed_domains;
+		} else {
+			domain = bo->prefered_domains;
+		}
+	} else {
 		domain = bo->allowed_domains;
+	}
 
 retry:
 	amdgpu_ttm_placement_from_domain(bo, domain);
 	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
 	r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
-	p->bytes_moved += atomic64_read(&adev->num_bytes_moved) -
-		initial_bytes_moved;
+	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
+		      initial_bytes_moved;
+	p->bytes_moved += bytes_moved;
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+	    bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
+		p->bytes_moved_vis += bytes_moved;
 
 	if (unlikely(r == -ENOMEM) && domain != bo->allowed_domains) {
 		domain = bo->allowed_domains;
@@ -554,8 +600,10 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 		list_splice(&need_pages, &p->validated);
 	}
 
-	p->bytes_moved_threshold = amdgpu_cs_get_threshold_for_moves(p->adev);
+	amdgpu_cs_get_threshold_for_moves(p->adev, &p->bytes_moved_threshold,
+					  &p->bytes_moved_vis_threshold);
 	p->bytes_moved = 0;
+	p->bytes_moved_vis = 0;
 	p->evictable = list_last_entry(&p->validated,
 				       struct amdgpu_bo_list_entry,
 				       tv.head);
@@ -579,8 +627,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 		goto error_validate;
 	}
 
-	amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved);
-
+	amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved,
+				     p->bytes_moved_vis);
 	fpriv->vm.last_eviction_counter =
 		atomic64_read(&p->adev->num_evictions);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8ee6965..dcf1ddb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -322,7 +322,7 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	struct amdgpu_bo *bo;
 	enum ttm_bo_type type;
 	unsigned long page_align;
-	u64 initial_bytes_moved;
+	u64 initial_bytes_moved, bytes_moved;
 	size_t acc_size;
 	int r;
 
@@ -398,8 +398,12 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
 				 &bo->placement, page_align, !kernel, NULL,
 				 acc_size, sg, resv, &amdgpu_ttm_bo_destroy);
-	amdgpu_cs_report_moved_bytes(adev,
-		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
+	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
+		      initial_bytes_moved;
+	if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
+		amdgpu_cs_report_moved_bytes(adev, bytes_moved, bytes_moved);
+	else
+		amdgpu_cs_report_moved_bytes(adev, bytes_moved, 0);
 
 	if (unlikely(r != 0))
 		return r;
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/5] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo
  2017-06-28  2:33 [PATCH v2] Visible VRAM Management Improvements John Brooks
  2017-06-28  2:33 ` [PATCH 1/5] drm/amdgpu: Add vis_vramlimit module parameter John Brooks
  2017-06-28  2:33 ` [PATCH 2/5] drm/amdgpu: Throttle visible VRAM moves separately John Brooks
@ 2017-06-28  2:33 ` John Brooks
  2017-06-28 13:06   ` Christian König
       [not found] ` <1498617201-24557-1-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
  2017-06-29  2:33 ` [PATCH v2] Visible VRAM Management Improvements Michel Dänzer
  4 siblings, 1 reply; 25+ messages in thread
From: John Brooks @ 2017-06-28  2:33 UTC (permalink / raw)
  To: amd-gfx, Michel Dänzer, Marek Olšák,
	Christian König
  Cc: dri-devel

Signed-off-by: John Brooks <john@fastquake.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 5 +++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7366115..34c293a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -428,6 +428,9 @@ struct amdgpu_bo {
 	void				*metadata;
 	u32				metadata_size;
 	unsigned			prime_shared_count;
+	unsigned long			last_page_fault_jiffies;
+	unsigned long			last_cs_move_jiffies;
+
 	/* list of all virtual address to which this bo
 	 * is associated to
 	 */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 1dfa847..071b592 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -335,6 +335,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 	u64 initial_bytes_moved, bytes_moved;
 	uint32_t domain;
+	uint32_t old_mem;
 	int r;
 
 	if (bo->pin_count)
@@ -364,6 +365,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 retry:
 	amdgpu_ttm_placement_from_domain(bo, domain);
 	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
+	old_mem = bo->tbo.mem.mem_type;
 	r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
 	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
 		      initial_bytes_moved;
@@ -377,6 +379,9 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 		goto retry;
 	}
 
+	if (bo->tbo.mem.mem_type != old_mem)
+		bo->last_cs_move_jiffies = jiffies;
+
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index dcf1ddb..b71775c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -953,6 +953,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 	if (bo->mem.mem_type != TTM_PL_VRAM)
 		return 0;
 
+	abo->last_page_fault_jiffies = jiffies;
+
 	size = bo->mem.num_pages << PAGE_SHIFT;
 	offset = bo->mem.start << PAGE_SHIFT;
 	/* TODO: figure out how to map scattered VRAM to the CPU */
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found] ` <1498617201-24557-1-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
@ 2017-06-28  2:33   ` John Brooks
       [not found]     ` <1498617201-24557-5-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
  2017-06-28  2:33   ` [PATCH 5/5] drm/amdgpu: Don't force BOs into visible VRAM for page faults John Brooks
  1 sibling, 1 reply; 25+ messages in thread
From: John Brooks @ 2017-06-28  2:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Michel Dänzer,
	Marek Olšák, Christian König
  Cc: David Airlie, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
it should only be treated as a hint to initially place a BO somewhere CPU
accessible, rather than having a permanent effect on BO placement.

Instead of the flag being set in stone at BO creation, set the flag when a
page fault occurs so that it goes somewhere CPU-visible, and clear it when
the BO is requested by the GPU.

However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
invisible VRAM, where they may promptly generate another page fault. When
BOs are constantly moved back and forth like this, it is highly detrimental
to performance. Only clear the flag on CS if:

- The BO wasn't page faulted for a certain amount of time (currently 10
  seconds), and
- its last page fault didn't occur too soon (currently 500ms) after its
  last CS request, or vice versa.

Setting the flag in amdgpu_fault_reserve_notify() also means that we can
remove the loop to restrict lpfn to the end of visible VRAM, because
amdgpu_ttm_placement_init() will do it for us.

Signed-off-by: John Brooks <john@fastquake.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 45 +++++++++++++++++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
 3 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 071b592..1ac3c2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -341,6 +341,8 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 	if (bo->pin_count)
 		return 0;
 
+	amdgpu_bo_clear_cpu_access_required(bo);
+
 	/* Don't move this buffer if we have depleted our allowance
 	 * to move it. Don't move anything if the threshold is zero.
 	 */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index b71775c..658d7b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -943,8 +943,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
 	struct amdgpu_bo *abo;
-	unsigned long offset, size, lpfn;
-	int i, r;
+	unsigned long offset, size;
+	int r;
 
 	if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
 		return 0;
@@ -967,15 +967,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 
 	/* hurrah the memory is not visible ! */
 	atomic64_inc(&adev->num_vram_cpu_page_faults);
+	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
 	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
-	lpfn =	adev->mc.visible_vram_size >> PAGE_SHIFT;
-	for (i = 0; i < abo->placement.num_placement; i++) {
-		/* Force into visible VRAM */
-		if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
-		    (!abo->placements[i].lpfn ||
-		     abo->placements[i].lpfn > lpfn))
-			abo->placements[i].lpfn = lpfn;
-	}
+
 	r = ttm_bo_validate(bo, &abo->placement, false, false);
 	if (unlikely(r == -ENOMEM)) {
 		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
@@ -1033,3 +1027,34 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
 
 	return bo->tbo.offset;
 }
+
+/**
+ * amdgpu_bo_clear_cpu_access_required
+ * @bo: BO to update
+ *
+ * Clears CPU_ACCESS_REQUIRED flag if the BO hasn't had a page fault in a while
+ * and it didn't have a page fault too soon after the last time it was moved to
+ * VRAM.
+ *
+ * Caller should have bo reserved.
+ *
+ */
+void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo)
+{
+	const unsigned int page_fault_timeout_ms = 10000;
+	const unsigned int min_period_ms = 500;
+	unsigned int ms_since_pf, period_ms;
+
+	ms_since_pf = jiffies_to_msecs(jiffies - bo->last_page_fault_jiffies);
+	period_ms = jiffies_to_msecs(abs(bo->last_page_fault_jiffies -
+					 bo->last_cs_move_jiffies));
+
+	/*
+	 * Try to avoid a revolving door between GTT and VRAM. Clearing the
+	 * flag may move this BO back to VRAM, so don't clear it if it's likely
+	 * to page fault and go right back to GTT.
+	 */
+	if ((!bo->last_page_fault_jiffies || !bo->last_cs_move_jiffies) ||
+	    (ms_since_pf > page_fault_timeout_ms && period_ms > min_period_ms))
+		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 3824851..b0cb137 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -182,6 +182,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
 				  struct reservation_object *resv,
 				  struct dma_fence **fence,
 				  bool direct);
+void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo);
 
 
 /*
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/5] drm/amdgpu: Don't force BOs into visible VRAM for page faults
       [not found] ` <1498617201-24557-1-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
  2017-06-28  2:33   ` [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS John Brooks
@ 2017-06-28  2:33   ` John Brooks
  2017-06-29  2:30     ` Michel Dänzer
  1 sibling, 1 reply; 25+ messages in thread
From: John Brooks @ 2017-06-28  2:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Michel Dänzer,
	Marek Olšák, Christian König
  Cc: David Airlie, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

There is no need for page faults to force BOs into visible VRAM if it's
full, and the time it takes to do so is great enough to cause noticeable
stuttering. Add GTT as a possible placement so that if visible VRAM is
full, page faults move BOs to GTT instead of evicting other BOs from VRAM.

Signed-off-by: John Brooks <john@fastquake.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 658d7b1..a215d8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -968,19 +968,21 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 	/* hurrah the memory is not visible ! */
 	atomic64_inc(&adev->num_vram_cpu_page_faults);
 	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
-	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
+	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
+					 AMDGPU_GEM_DOMAIN_GTT);
+
+	/* Avoid costly evictions; only set GTT as a busy placement */
+	abo->placement.num_busy_placement = 1;
+	abo->placement.busy_placement = &abo->placements[1];
 
 	r = ttm_bo_validate(bo, &abo->placement, false, false);
-	if (unlikely(r == -ENOMEM)) {
-		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
-		return ttm_bo_validate(bo, &abo->placement, false, false);
-	} else if (unlikely(r != 0)) {
+	if (unlikely(r != 0))
 		return r;
-	}
 
 	offset = bo->mem.start << PAGE_SHIFT;
 	/* this should never happen */
-	if ((offset + size) > adev->mc.visible_vram_size)
+	if (bo->mem.mem_type == TTM_PL_VRAM &&
+	    (offset + size) > adev->mc.visible_vram_size)
 		return -EINVAL;
 
 	return 0;
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]     ` <1498617201-24557-5-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
@ 2017-06-28 13:05       ` Christian König
  2017-06-28 23:26         ` John Brooks
  2017-06-29  3:18       ` Michel Dänzer
  1 sibling, 1 reply; 25+ messages in thread
From: Christian König @ 2017-06-28 13:05 UTC (permalink / raw)
  To: John Brooks, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Michel Dänzer, Marek Olšák
  Cc: David Airlie, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 28.06.2017 um 04:33 schrieb John Brooks:
> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
> it should only be treated as a hint to initially place a BO somewhere CPU
> accessible, rather than having a permanent effect on BO placement.
>
> Instead of the flag being set in stone at BO creation, set the flag when a
> page fault occurs so that it goes somewhere CPU-visible, and clear it when
> the BO is requested by the GPU.
>
> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> invisible VRAM, where they may promptly generate another page fault. When
> BOs are constantly moved back and forth like this, it is highly detrimental
> to performance. Only clear the flag on CS if:
>
> - The BO wasn't page faulted for a certain amount of time (currently 10
>    seconds), and
> - its last page fault didn't occur too soon (currently 500ms) after its
>    last CS request, or vice versa.
>
> Setting the flag in amdgpu_fault_reserve_notify() also means that we can
> remove the loop to restrict lpfn to the end of visible VRAM, because
> amdgpu_ttm_placement_init() will do it for us.

I'm fine with the general approach, but I'm still absolutely not keen 
about clearing the flag when userspace has originally specified it.

Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT 
or something like this.

Regards,
Christian.

>
> Signed-off-by: John Brooks <john@fastquake.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 45 +++++++++++++++++++++++-------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
>   3 files changed, 38 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 071b592..1ac3c2e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -341,6 +341,8 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>   	if (bo->pin_count)
>   		return 0;
>   
> +	amdgpu_bo_clear_cpu_access_required(bo);
> +
>   	/* Don't move this buffer if we have depleted our allowance
>   	 * to move it. Don't move anything if the threshold is zero.
>   	 */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index b71775c..658d7b1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -943,8 +943,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>   {
>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
>   	struct amdgpu_bo *abo;
> -	unsigned long offset, size, lpfn;
> -	int i, r;
> +	unsigned long offset, size;
> +	int r;
>   
>   	if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
>   		return 0;
> @@ -967,15 +967,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>   
>   	/* hurrah the memory is not visible ! */
>   	atomic64_inc(&adev->num_vram_cpu_page_faults);
> +	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>   	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
> -	lpfn =	adev->mc.visible_vram_size >> PAGE_SHIFT;
> -	for (i = 0; i < abo->placement.num_placement; i++) {
> -		/* Force into visible VRAM */
> -		if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
> -		    (!abo->placements[i].lpfn ||
> -		     abo->placements[i].lpfn > lpfn))
> -			abo->placements[i].lpfn = lpfn;
> -	}
> +
>   	r = ttm_bo_validate(bo, &abo->placement, false, false);
>   	if (unlikely(r == -ENOMEM)) {
>   		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
> @@ -1033,3 +1027,34 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
>   
>   	return bo->tbo.offset;
>   }
> +
> +/**
> + * amdgpu_bo_clear_cpu_access_required
> + * @bo: BO to update
> + *
> + * Clears CPU_ACCESS_REQUIRED flag if the BO hasn't had a page fault in a while
> + * and it didn't have a page fault too soon after the last time it was moved to
> + * VRAM.
> + *
> + * Caller should have bo reserved.
> + *
> + */
> +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo)
> +{
> +	const unsigned int page_fault_timeout_ms = 10000;
> +	const unsigned int min_period_ms = 500;
> +	unsigned int ms_since_pf, period_ms;
> +
> +	ms_since_pf = jiffies_to_msecs(jiffies - bo->last_page_fault_jiffies);
> +	period_ms = jiffies_to_msecs(abs(bo->last_page_fault_jiffies -
> +					 bo->last_cs_move_jiffies));
> +
> +	/*
> +	 * Try to avoid a revolving door between GTT and VRAM. Clearing the
> +	 * flag may move this BO back to VRAM, so don't clear it if it's likely
> +	 * to page fault and go right back to GTT.
> +	 */
> +	if ((!bo->last_page_fault_jiffies || !bo->last_cs_move_jiffies) ||
> +	    (ms_since_pf > page_fault_timeout_ms && period_ms > min_period_ms))
> +		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 3824851..b0cb137 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -182,6 +182,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>   				  struct reservation_object *resv,
>   				  struct dma_fence **fence,
>   				  bool direct);
> +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo);
>   
>   
>   /*


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo
  2017-06-28  2:33 ` [PATCH 3/5] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo John Brooks
@ 2017-06-28 13:06   ` Christian König
       [not found]     ` <e3d7be62-e63b-f370-f159-147aaf8d7c50-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2017-06-28 13:06 UTC (permalink / raw)
  To: John Brooks, amd-gfx, Michel Dänzer, Marek Olšák; +Cc: dri-devel

Am 28.06.2017 um 04:33 schrieb John Brooks:
> Signed-off-by: John Brooks <john@fastquake.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 3 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 5 +++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 ++
>   3 files changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 7366115..34c293a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -428,6 +428,9 @@ struct amdgpu_bo {
>   	void				*metadata;
>   	u32				metadata_size;
>   	unsigned			prime_shared_count;
> +	unsigned long			last_page_fault_jiffies;
> +	unsigned long			last_cs_move_jiffies;

Please use jiffies64 here, apart from that the patch looks good to me.

Christian.

> +
>   	/* list of all virtual address to which this bo
>   	 * is associated to
>   	 */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 1dfa847..071b592 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -335,6 +335,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>   	u64 initial_bytes_moved, bytes_moved;
>   	uint32_t domain;
> +	uint32_t old_mem;
>   	int r;
>   
>   	if (bo->pin_count)
> @@ -364,6 +365,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>   retry:
>   	amdgpu_ttm_placement_from_domain(bo, domain);
>   	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
> +	old_mem = bo->tbo.mem.mem_type;
>   	r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
>   	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
>   		      initial_bytes_moved;
> @@ -377,6 +379,9 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>   		goto retry;
>   	}
>   
> +	if (bo->tbo.mem.mem_type != old_mem)
> +		bo->last_cs_move_jiffies = jiffies;
> +
>   	return r;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index dcf1ddb..b71775c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -953,6 +953,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>   	if (bo->mem.mem_type != TTM_PL_VRAM)
>   		return 0;
>   
> +	abo->last_page_fault_jiffies = jiffies;
> +
>   	size = bo->mem.num_pages << PAGE_SHIFT;
>   	offset = bo->mem.start << PAGE_SHIFT;
>   	/* TODO: figure out how to map scattered VRAM to the CPU */


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo
       [not found]     ` <e3d7be62-e63b-f370-f159-147aaf8d7c50-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2017-06-28 22:59       ` John Brooks
  2017-06-29  8:11         ` Christian König
  0 siblings, 1 reply; 25+ messages in thread
From: John Brooks @ 2017-06-28 22:59 UTC (permalink / raw)
  To: Christian König
  Cc: David Airlie, Michel Dänzer,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Marek Olšák,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Wed, Jun 28, 2017 at 03:06:47PM +0200, Christian König wrote:
> Am 28.06.2017 um 04:33 schrieb John Brooks:
> >Signed-off-by: John Brooks <john@fastquake.com>
> >---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 3 +++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 5 +++++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 ++
> >  3 files changed, 10 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >index 7366115..34c293a 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >@@ -428,6 +428,9 @@ struct amdgpu_bo {
> >  	void				*metadata;
> >  	u32				metadata_size;
> >  	unsigned			prime_shared_count;
> >+	unsigned long			last_page_fault_jiffies;
> >+	unsigned long			last_cs_move_jiffies;
> 
> Please use jiffies64 here, apart from that the patch looks good to me.
> 
> Christian.
> 

I'm not sure I understand. Do you mean change these variables to u64 and use
get_jiffies_64() instead of the plain jiffies variable below? I believe
jiffies_64 can be slower than jiffies also.

John

> >+
> >  	/* list of all virtual address to which this bo
> >  	 * is associated to
> >  	 */
> >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >index 1dfa847..071b592 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >@@ -335,6 +335,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
> >  	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> >  	u64 initial_bytes_moved, bytes_moved;
> >  	uint32_t domain;
> >+	uint32_t old_mem;
> >  	int r;
> >  	if (bo->pin_count)
> >@@ -364,6 +365,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
> >  retry:
> >  	amdgpu_ttm_placement_from_domain(bo, domain);
> >  	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
> >+	old_mem = bo->tbo.mem.mem_type;
> >  	r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
> >  	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
> >  		      initial_bytes_moved;
> >@@ -377,6 +379,9 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
> >  		goto retry;
> >  	}
> >+	if (bo->tbo.mem.mem_type != old_mem)
> >+		bo->last_cs_move_jiffies = jiffies;
> >+
> >  	return r;
> >  }
> >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >index dcf1ddb..b71775c 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >@@ -953,6 +953,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
> >  	if (bo->mem.mem_type != TTM_PL_VRAM)
> >  		return 0;
> >+	abo->last_page_fault_jiffies = jiffies;
> >+
> >  	size = bo->mem.num_pages << PAGE_SHIFT;
> >  	offset = bo->mem.start << PAGE_SHIFT;
> >  	/* TODO: figure out how to map scattered VRAM to the CPU */
> 
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
  2017-06-28 13:05       ` Christian König
@ 2017-06-28 23:26         ` John Brooks
       [not found]           ` <20170628232639.GB11762-6hIufAJW0g7Gr8qjsLp7YGXnswh1EIUO@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: John Brooks @ 2017-06-28 23:26 UTC (permalink / raw)
  To: Christian König; +Cc: Michel Dänzer, dri-devel, amd-gfx

On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
> Am 28.06.2017 um 04:33 schrieb John Brooks:
> >When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
> >it should only be treated as a hint to initially place a BO somewhere CPU
> >accessible, rather than having a permanent effect on BO placement.
> >
> >Instead of the flag being set in stone at BO creation, set the flag when a
> >page fault occurs so that it goes somewhere CPU-visible, and clear it when
> >the BO is requested by the GPU.
> >
> >However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> >invisible VRAM, where they may promptly generate another page fault. When
> >BOs are constantly moved back and forth like this, it is highly detrimental
> >to performance. Only clear the flag on CS if:
> >
> >- The BO wasn't page faulted for a certain amount of time (currently 10
> >   seconds), and
> >- its last page fault didn't occur too soon (currently 500ms) after its
> >   last CS request, or vice versa.
> >
> >Setting the flag in amdgpu_fault_reserve_notify() also means that we can
> >remove the loop to restrict lpfn to the end of visible VRAM, because
> >amdgpu_ttm_placement_init() will do it for us.
> 
> I'm fine with the general approach, but I'm still absolutely not keen about
> clearing the flag when userspace has originally specified it.
> 
> Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or
> something like this.
> 
> Regards,
> Christian.

Is it the fact that we clear a flag that userspace expects not to have changed
if it queries it later? I think that's the only effect of this that's directly
visible to userspace code. Would it be permissible to change the handling
of the flag without clearing it?

As for a new "hint" flag, I assume this new flag would be an alternative to the
existing CPU_ACCESS_REQUIRED flag, and we'd change Mesa et al to use it in
situations where the kernel *should* place a BO somewhere CPU-accessible, but
is permitted to move it elsewhere. Is that correct?

John
> 
> >
> >Signed-off-by: John Brooks <john@fastquake.com>
> >---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |  2 ++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 45 +++++++++++++++++++++++-------
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
> >  3 files changed, 38 insertions(+), 10 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >index 071b592..1ac3c2e 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >@@ -341,6 +341,8 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
> >  	if (bo->pin_count)
> >  		return 0;
> >+	amdgpu_bo_clear_cpu_access_required(bo);
> >+
> >  	/* Don't move this buffer if we have depleted our allowance
> >  	 * to move it. Don't move anything if the threshold is zero.
> >  	 */
> >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >index b71775c..658d7b1 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >@@ -943,8 +943,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
> >  {
> >  	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
> >  	struct amdgpu_bo *abo;
> >-	unsigned long offset, size, lpfn;
> >-	int i, r;
> >+	unsigned long offset, size;
> >+	int r;
> >  	if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
> >  		return 0;
> >@@ -967,15 +967,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
> >  	/* hurrah the memory is not visible ! */
> >  	atomic64_inc(&adev->num_vram_cpu_page_faults);
> >+	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> >  	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
> >-	lpfn =	adev->mc.visible_vram_size >> PAGE_SHIFT;
> >-	for (i = 0; i < abo->placement.num_placement; i++) {
> >-		/* Force into visible VRAM */
> >-		if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
> >-		    (!abo->placements[i].lpfn ||
> >-		     abo->placements[i].lpfn > lpfn))
> >-			abo->placements[i].lpfn = lpfn;
> >-	}
> >+
> >  	r = ttm_bo_validate(bo, &abo->placement, false, false);
> >  	if (unlikely(r == -ENOMEM)) {
> >  		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
> >@@ -1033,3 +1027,34 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
> >  	return bo->tbo.offset;
> >  }
> >+
> >+/**
> >+ * amdgpu_bo_clear_cpu_access_required
> >+ * @bo: BO to update
> >+ *
> >+ * Clears CPU_ACCESS_REQUIRED flag if the BO hasn't had a page fault in a while
> >+ * and it didn't have a page fault too soon after the last time it was moved to
> >+ * VRAM.
> >+ *
> >+ * Caller should have bo reserved.
> >+ *
> >+ */
> >+void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo)
> >+{
> >+	const unsigned int page_fault_timeout_ms = 10000;
> >+	const unsigned int min_period_ms = 500;
> >+	unsigned int ms_since_pf, period_ms;
> >+
> >+	ms_since_pf = jiffies_to_msecs(jiffies - bo->last_page_fault_jiffies);
> >+	period_ms = jiffies_to_msecs(abs(bo->last_page_fault_jiffies -
> >+					 bo->last_cs_move_jiffies));
> >+
> >+	/*
> >+	 * Try to avoid a revolving door between GTT and VRAM. Clearing the
> >+	 * flag may move this BO back to VRAM, so don't clear it if it's likely
> >+	 * to page fault and go right back to GTT.
> >+	 */
> >+	if ((!bo->last_page_fault_jiffies || !bo->last_cs_move_jiffies) ||
> >+	    (ms_since_pf > page_fault_timeout_ms && period_ms > min_period_ms))
> >+		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> >+}
> >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >index 3824851..b0cb137 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >@@ -182,6 +182,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> >  				  struct reservation_object *resv,
> >  				  struct dma_fence **fence,
> >  				  bool direct);
> >+void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo);
> >  /*
> 
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] drm/amdgpu: Don't force BOs into visible VRAM for page faults
  2017-06-28  2:33   ` [PATCH 5/5] drm/amdgpu: Don't force BOs into visible VRAM for page faults John Brooks
@ 2017-06-29  2:30     ` Michel Dänzer
  0 siblings, 0 replies; 25+ messages in thread
From: Michel Dänzer @ 2017-06-29  2:30 UTC (permalink / raw)
  To: John Brooks, Marek Olšák, Christian König
  Cc: dri-devel, amd-gfx

On 28/06/17 11:33 AM, John Brooks wrote:
> There is no need for page faults to force BOs into visible VRAM if it's
> full, and the time it takes to do so is great enough to cause noticeable
> stuttering. Add GTT as a possible placement so that if visible VRAM is
> full, page faults move BOs to GTT instead of evicting other BOs from VRAM.
> 
> Signed-off-by: John Brooks <john@fastquake.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 658d7b1..a215d8c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -968,19 +968,21 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>  	/* hurrah the memory is not visible ! */
>  	atomic64_inc(&adev->num_vram_cpu_page_faults);
>  	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> -	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
> +	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
> +					 AMDGPU_GEM_DOMAIN_GTT);
> +
> +	/* Avoid costly evictions; only set GTT as a busy placement */
> +	abo->placement.num_busy_placement = 1;
> +	abo->placement.busy_placement = &abo->placements[1];
>  
>  	r = ttm_bo_validate(bo, &abo->placement, false, false);
> -	if (unlikely(r == -ENOMEM)) {
> -		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
> -		return ttm_bo_validate(bo, &abo->placement, false, false);
> -	} else if (unlikely(r != 0)) {
> +	if (unlikely(r != 0))
>  		return r;
> -	}
>  
>  	offset = bo->mem.start << PAGE_SHIFT;
>  	/* this should never happen */
> -	if ((offset + size) > adev->mc.visible_vram_size)
> +	if (bo->mem.mem_type == TTM_PL_VRAM &&
> +	    (offset + size) > adev->mc.visible_vram_size)
>  		return -EINVAL;
>  
>  	return 0;
> 

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2] Visible VRAM Management Improvements
  2017-06-28  2:33 [PATCH v2] Visible VRAM Management Improvements John Brooks
                   ` (3 preceding siblings ...)
       [not found] ` <1498617201-24557-1-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
@ 2017-06-29  2:33 ` Michel Dänzer
  4 siblings, 0 replies; 25+ messages in thread
From: Michel Dänzer @ 2017-06-29  2:33 UTC (permalink / raw)
  To: John Brooks, Marek Olšák, Christian König
  Cc: dri-devel, amd-gfx

On 28/06/17 11:33 AM, John Brooks wrote:
> Changes in this version:
> - Dropped former patch 1 ("Separate placements and busy placements") as it was
>   no longer necessary
> - Dropped former patch 4 ("Don't force BOs into visible VRAM if they can go to
>   GTT instead") as it was unnecessary and had too many side effects (Thanks
>   Christian)
> - Dropped former patches 8 and 9 ("Asynchronously move BOs to visible VRAM" and
>   an associated optimization), as there may be better ways to address the root
>   of the problem
> - Added a u64 cast to patch 1 (Thanks Christian)
> - Optimized patch 2 in the case where visible VRAM is not smaller than total
>   VRAM (Thanks Michel)
> 
> The series as it is now fixes Dying Light and improves DiRT Rally frame times.
> Without the asynchronous moves, there is still occasional stalling in DiRT
> Rally.
> 
> 
> The "Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS" patch received
> objections, but I have not removed it yet as there seems to be some support for
> the idea. The precise details of when to set and clear the flag may need further
> discussion.

Have you tried my idea of only clearing the flag when a BO is moved to
VRAM? If that works well enough, patch 3 should not be necessary either.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]           ` <20170628232639.GB11762-6hIufAJW0g7Gr8qjsLp7YGXnswh1EIUO@public.gmane.org>
@ 2017-06-29  2:35             ` Michel Dänzer
  2017-06-29  8:23               ` Christian König
  2017-06-30  1:56               ` John Brooks
  0 siblings, 2 replies; 25+ messages in thread
From: Michel Dänzer @ 2017-06-29  2:35 UTC (permalink / raw)
  To: John Brooks, Christian König
  Cc: David Airlie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Marek Olšák, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 29/06/17 08:26 AM, John Brooks wrote:
> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>> Am 28.06.2017 um 04:33 schrieb John Brooks:
>>> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
>>> it should only be treated as a hint to initially place a BO somewhere CPU
>>> accessible, rather than having a permanent effect on BO placement.
>>>
>>> Instead of the flag being set in stone at BO creation, set the flag when a
>>> page fault occurs so that it goes somewhere CPU-visible, and clear it when
>>> the BO is requested by the GPU.
>>>
>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
>>> invisible VRAM, where they may promptly generate another page fault. When
>>> BOs are constantly moved back and forth like this, it is highly detrimental
>>> to performance. Only clear the flag on CS if:
>>>
>>> - The BO wasn't page faulted for a certain amount of time (currently 10
>>>   seconds), and
>>> - its last page fault didn't occur too soon (currently 500ms) after its
>>>   last CS request, or vice versa.
>>>
>>> Setting the flag in amdgpu_fault_reserve_notify() also means that we can
>>> remove the loop to restrict lpfn to the end of visible VRAM, because
>>> amdgpu_ttm_placement_init() will do it for us.
>>
>> I'm fine with the general approach, but I'm still absolutely not keen about
>> clearing the flag when userspace has originally specified it.

Is there any specific concern you have about that?


>> Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or
>> something like this.
> 
> Is it the fact that we clear a flag that userspace expects not to have changed
> if it queries it later? I think that's the only effect of this that's directly
> visible to userspace code.

I don't see any way for userspace to query that.


> As for a new "hint" flag, I assume this new flag would be an alternative to the
> existing CPU_ACCESS_REQUIRED flag, and we'd change Mesa et al to use it in
> situations where the kernel *should* place a BO somewhere CPU-accessible, but
> is permitted to move it elsewhere. Is that correct?

That seems silly. The userspace flag could never be more than a hint.
Unfortunately we named it to suggest differently, but we have to live
with that.

If we do need a new hint flag internally in the driver, we should simply
translate AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to the new flag in
amdgpu_gem_create_ioctl, and not expose the new flag to userspace.


But other than the question in my followup to the cover letter, this
patch looks good to me as is.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]     ` <1498617201-24557-5-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
  2017-06-28 13:05       ` Christian König
@ 2017-06-29  3:18       ` Michel Dänzer
  1 sibling, 0 replies; 25+ messages in thread
From: Michel Dänzer @ 2017-06-29  3:18 UTC (permalink / raw)
  To: John Brooks, Marek Olšák, Christian König
  Cc: David Airlie, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 28/06/17 11:33 AM, John Brooks wrote:
> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
> it should only be treated as a hint to initially place a BO somewhere CPU
> accessible, rather than having a permanent effect on BO placement.
> 
> Instead of the flag being set in stone at BO creation, set the flag when a
> page fault occurs so that it goes somewhere CPU-visible, and clear it when
> the BO is requested by the GPU.
> 
> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> invisible VRAM, where they may promptly generate another page fault. When
> BOs are constantly moved back and forth like this, it is highly detrimental
> to performance. Only clear the flag on CS if:
> 
> - The BO wasn't page faulted for a certain amount of time (currently 10
>   seconds), and
> - its last page fault didn't occur too soon (currently 500ms) after its
>   last CS request, or vice versa.
> 
> Setting the flag in amdgpu_fault_reserve_notify() also means that we can
> remove the loop to restrict lpfn to the end of visible VRAM, because
> amdgpu_ttm_placement_init() will do it for us.
> 
> Signed-off-by: John Brooks <john@fastquake.com>

[...]

> @@ -967,15 +967,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>  
>  	/* hurrah the memory is not visible ! */
>  	atomic64_inc(&adev->num_vram_cpu_page_faults);
> +	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;

Talking about this on IRC reminded me that I think we should set the
flag in this function regardless of where the BO is currently located.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo
  2017-06-28 22:59       ` John Brooks
@ 2017-06-29  8:11         ` Christian König
  0 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2017-06-29  8:11 UTC (permalink / raw)
  To: John Brooks; +Cc: Michel Dänzer, dri-devel, amd-gfx

Am 29.06.2017 um 00:59 schrieb John Brooks:
> On Wed, Jun 28, 2017 at 03:06:47PM +0200, Christian König wrote:
>> Am 28.06.2017 um 04:33 schrieb John Brooks:
>>> Signed-off-by: John Brooks <john@fastquake.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 3 +++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 5 +++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 ++
>>>   3 files changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 7366115..34c293a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -428,6 +428,9 @@ struct amdgpu_bo {
>>>   	void				*metadata;
>>>   	u32				metadata_size;
>>>   	unsigned			prime_shared_count;
>>> +	unsigned long			last_page_fault_jiffies;
>>> +	unsigned long			last_cs_move_jiffies;
>> Please use jiffies64 here, apart from that the patch looks good to me.
>>
>> Christian.
>>
> I'm not sure I understand. Do you mean change these variables to u64 and use
> get_jiffies_64() instead of the plain jiffies variable below?

Yes, exactly.

> I believe jiffies_64 can be slower than jiffies also.

Yeah, but it doesn't matter on 64bit systems and they don't wrap around 
every 49 days on 32bit systems :)

Christian.

>
> John
>
>>> +
>>>   	/* list of all virtual address to which this bo
>>>   	 * is associated to
>>>   	 */
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 1dfa847..071b592 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -335,6 +335,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>>>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>>   	u64 initial_bytes_moved, bytes_moved;
>>>   	uint32_t domain;
>>> +	uint32_t old_mem;
>>>   	int r;
>>>   	if (bo->pin_count)
>>> @@ -364,6 +365,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>>>   retry:
>>>   	amdgpu_ttm_placement_from_domain(bo, domain);
>>>   	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
>>> +	old_mem = bo->tbo.mem.mem_type;
>>>   	r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
>>>   	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
>>>   		      initial_bytes_moved;
>>> @@ -377,6 +379,9 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>>>   		goto retry;
>>>   	}
>>> +	if (bo->tbo.mem.mem_type != old_mem)
>>> +		bo->last_cs_move_jiffies = jiffies;
>>> +
>>>   	return r;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index dcf1ddb..b71775c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -953,6 +953,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>>>   	if (bo->mem.mem_type != TTM_PL_VRAM)
>>>   		return 0;
>>> +	abo->last_page_fault_jiffies = jiffies;
>>> +
>>>   	size = bo->mem.num_pages << PAGE_SHIFT;
>>>   	offset = bo->mem.start << PAGE_SHIFT;
>>>   	/* TODO: figure out how to map scattered VRAM to the CPU */
>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
  2017-06-29  2:35             ` Michel Dänzer
@ 2017-06-29  8:23               ` Christian König
  2017-06-29  9:58                 ` Michel Dänzer
  2017-06-30  1:56               ` John Brooks
  1 sibling, 1 reply; 25+ messages in thread
From: Christian König @ 2017-06-29  8:23 UTC (permalink / raw)
  To: Michel Dänzer, John Brooks; +Cc: amd-gfx, dri-devel

Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
> On 29/06/17 08:26 AM, John Brooks wrote:
>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>> Am 28.06.2017 um 04:33 schrieb John Brooks:
>>>> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
>>>> it should only be treated as a hint to initially place a BO somewhere CPU
>>>> accessible, rather than having a permanent effect on BO placement.
>>>>
>>>> Instead of the flag being set in stone at BO creation, set the flag when a
>>>> page fault occurs so that it goes somewhere CPU-visible, and clear it when
>>>> the BO is requested by the GPU.
>>>>
>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
>>>> invisible VRAM, where they may promptly generate another page fault. When
>>>> BOs are constantly moved back and forth like this, it is highly detrimental
>>>> to performance. Only clear the flag on CS if:
>>>>
>>>> - The BO wasn't page faulted for a certain amount of time (currently 10
>>>>    seconds), and
>>>> - its last page fault didn't occur too soon (currently 500ms) after its
>>>>    last CS request, or vice versa.
>>>>
>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that we can
>>>> remove the loop to restrict lpfn to the end of visible VRAM, because
>>>> amdgpu_ttm_placement_init() will do it for us.
>>> I'm fine with the general approach, but I'm still absolutely not keen about
>>> clearing the flag when userspace has originally specified it.
> Is there any specific concern you have about that?

Yeah, quite a bunch actually. We want to use this flag for P2P buffer 
sharing in the future as well and I don't intent to add another one like 
CPU_ACCESS_REALLY_REQUIRED or something like this.

>>> Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or
>>> something like this.
>> Is it the fact that we clear a flag that userspace expects not to have changed
>> if it queries it later? I think that's the only effect of this that's directly
>> visible to userspace code.
> I don't see any way for userspace to query that.
>
>
>> As for a new "hint" flag, I assume this new flag would be an alternative to the
>> existing CPU_ACCESS_REQUIRED flag, and we'd change Mesa et al to use it in
>> situations where the kernel *should* place a BO somewhere CPU-accessible, but
>> is permitted to move it elsewhere. Is that correct?
> That seems silly. The userspace flag could never be more than a hint.
> Unfortunately we named it to suggest differently, but we have to live
> with that.

No, just the other way around. The CPU_ACCESS_REQUIRED flag was 
introduced to note that it is MANDATORY to always have CPU access to the 
buffer.

It's just mesa which uses the flag as a hint to we could get CPU access.

Regards,
Christian.

> If we do need a new hint flag internally in the driver, we should simply
> translate AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to the new flag in
> amdgpu_gem_create_ioctl, and not expose the new flag to userspace.
>
>
> But other than the question in my followup to the cover letter, this
> patch looks good to me as is.


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
  2017-06-29  8:23               ` Christian König
@ 2017-06-29  9:58                 ` Michel Dänzer
  2017-06-29 10:05                   ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Michel Dänzer @ 2017-06-29  9:58 UTC (permalink / raw)
  To: Christian König, John Brooks; +Cc: dri-devel, amd-gfx

On 29/06/17 05:23 PM, Christian König wrote:
> Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
>> On 29/06/17 08:26 AM, John Brooks wrote:
>>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>>>>
>>>>> Instead of the flag being set in stone at BO creation, set the flag
>>>>> when a
>>>>> page fault occurs so that it goes somewhere CPU-visible, and clear
>>>>> it when
>>>>> the BO is requested by the GPU.
>>>>>
>>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
>>>>> invisible VRAM, where they may promptly generate another page
>>>>> fault. When
>>>>> BOs are constantly moved back and forth like this, it is highly
>>>>> detrimental
>>>>> to performance. Only clear the flag on CS if:
>>>>>
>>>>> - The BO wasn't page faulted for a certain amount of time
>>>>> (currently 10
>>>>>    seconds), and
>>>>> - its last page fault didn't occur too soon (currently 500ms) after
>>>>> its
>>>>>    last CS request, or vice versa.
>>>>>
>>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that
>>>>> we can
>>>>> remove the loop to restrict lpfn to the end of visible VRAM, because
>>>>> amdgpu_ttm_placement_init() will do it for us.
>>>> I'm fine with the general approach, but I'm still absolutely not
>>>> keen about
>>>> clearing the flag when userspace has originally specified it.
>> Is there any specific concern you have about that?
> 
> Yeah, quite a bunch actually. We want to use this flag for P2P buffer
> sharing in the future as well and I don't intent to add another one like
> CPU_ACCESS_REALLY_REQUIRED or something like this.

Won't a BO need to be pinned while it's being shared with another device?


>>>> Please add a new flag something like
>>>> AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or
>>>> something like this.
>>> Is it the fact that we clear a flag that userspace expects not to
>>> have changed
>>> if it queries it later? I think that's the only effect of this that's
>>> directly
>>> visible to userspace code.
>> I don't see any way for userspace to query that.
>>
>>
>>> As for a new "hint" flag, I assume this new flag would be an
>>> alternative to the
>>> existing CPU_ACCESS_REQUIRED flag, and we'd change Mesa et al to use
>>> it in
>>> situations where the kernel *should* place a BO somewhere
>>> CPU-accessible, but
>>> is permitted to move it elsewhere. Is that correct?
>> That seems silly. The userspace flag could never be more than a hint.
>> Unfortunately we named it to suggest differently, but we have to live
>> with that.
> 
> No, just the other way around. The CPU_ACCESS_REQUIRED flag was
> introduced to note that it is MANDATORY to always have CPU access to the
> buffer.
> 
> It's just mesa which uses the flag as a hint to we could get CPU access.

Can you describe a specific scenario where userspace would actually need
the strict semantics? Otherwise I'm afraid what you're demanding boils
down to userspace churn for no benefit.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
  2017-06-29  9:58                 ` Michel Dänzer
@ 2017-06-29 10:05                   ` Daniel Vetter
       [not found]                     ` <20170629100523.khsozocltct7tnfu-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2017-06-29 10:05 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: John Brooks, amd-gfx, dri-devel

On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
> On 29/06/17 05:23 PM, Christian König wrote:
> > Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
> >> On 29/06/17 08:26 AM, John Brooks wrote:
> >>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
> >>>>>
> >>>>> Instead of the flag being set in stone at BO creation, set the flag
> >>>>> when a
> >>>>> page fault occurs so that it goes somewhere CPU-visible, and clear
> >>>>> it when
> >>>>> the BO is requested by the GPU.
> >>>>>
> >>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> >>>>> invisible VRAM, where they may promptly generate another page
> >>>>> fault. When
> >>>>> BOs are constantly moved back and forth like this, it is highly
> >>>>> detrimental
> >>>>> to performance. Only clear the flag on CS if:
> >>>>>
> >>>>> - The BO wasn't page faulted for a certain amount of time
> >>>>> (currently 10
> >>>>>    seconds), and
> >>>>> - its last page fault didn't occur too soon (currently 500ms) after
> >>>>> its
> >>>>>    last CS request, or vice versa.
> >>>>>
> >>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that
> >>>>> we can
> >>>>> remove the loop to restrict lpfn to the end of visible VRAM, because
> >>>>> amdgpu_ttm_placement_init() will do it for us.
> >>>> I'm fine with the general approach, but I'm still absolutely not
> >>>> keen about
> >>>> clearing the flag when userspace has originally specified it.
> >> Is there any specific concern you have about that?
> > 
> > Yeah, quite a bunch actually. We want to use this flag for P2P buffer
> > sharing in the future as well and I don't intent to add another one like
> > CPU_ACCESS_REALLY_REQUIRED or something like this.
> 
> Won't a BO need to be pinned while it's being shared with another device?

That's an artifact of the current kernel implementation, I think we could
do better (but for current use-cases where we share a bunch of scanouts
and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never
changing.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
  2017-06-29  2:35             ` Michel Dänzer
  2017-06-29  8:23               ` Christian König
@ 2017-06-30  1:56               ` John Brooks
       [not found]                 ` <20170630015638.GA735-6hIufAJW0g7Gr8qjsLp7YGXnswh1EIUO@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: John Brooks @ 2017-06-30  1:56 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: amd-gfx, dri-devel

On Thu, Jun 29, 2017 at 11:35:53AM +0900, Michel Dänzer wrote:
> On 29/06/17 08:26 AM, John Brooks wrote:
> > On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
> >> Am 28.06.2017 um 04:33 schrieb John Brooks:
> >>> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
> >>> it should only be treated as a hint to initially place a BO somewhere CPU
> >>> accessible, rather than having a permanent effect on BO placement.
> >>>
> >>> Instead of the flag being set in stone at BO creation, set the flag when a
> >>> page fault occurs so that it goes somewhere CPU-visible, and clear it when
> >>> the BO is requested by the GPU.
> >>>
> >>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> >>> invisible VRAM, where they may promptly generate another page fault. When
> >>> BOs are constantly moved back and forth like this, it is highly detrimental
> >>> to performance. Only clear the flag on CS if:
> >>>
> >>> - The BO wasn't page faulted for a certain amount of time (currently 10
> >>>   seconds), and
> >>> - its last page fault didn't occur too soon (currently 500ms) after its
> >>>   last CS request, or vice versa.
> >>>
> >>> Setting the flag in amdgpu_fault_reserve_notify() also means that we can
> >>> remove the loop to restrict lpfn to the end of visible VRAM, because
> >>> amdgpu_ttm_placement_init() will do it for us.
> >>
> >> I'm fine with the general approach, but I'm still absolutely not keen about
> >> clearing the flag when userspace has originally specified it.
> 
> Is there any specific concern you have about that?
> 
> 
> >> Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or
> >> something like this.
> > 
> > Is it the fact that we clear a flag that userspace expects not to have changed
> > if it queries it later? I think that's the only effect of this that's directly
> > visible to userspace code.
> 
> I don't see any way for userspace to query that.

I was looking at amdgpu_gem_metadata_ioctl(). It looked like it was possible to
query a BO's flags through that method. I don't know if it actually matters; it
was just a stab in the dark for any possibly tangible effect on userspace that
might arise from the kernel changing the flag.

John

> 
> 
> > As for a new "hint" flag, I assume this new flag would be an alternative to the
> > existing CPU_ACCESS_REQUIRED flag, and we'd change Mesa et al to use it in
> > situations where the kernel *should* place a BO somewhere CPU-accessible, but
> > is permitted to move it elsewhere. Is that correct?
> 
> That seems silly. The userspace flag could never be more than a hint.
> Unfortunately we named it to suggest differently, but we have to live
> with that.
> 
> If we do need a new hint flag internally in the driver, we should simply
> translate AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to the new flag in
> amdgpu_gem_create_ioctl, and not expose the new flag to userspace.
> 
> 
> But other than the question in my followup to the cover letter, this
> patch looks good to me as is.
> 
> 
> -- 
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]                 ` <20170630015638.GA735-6hIufAJW0g7Gr8qjsLp7YGXnswh1EIUO@public.gmane.org>
@ 2017-06-30  2:16                   ` John Brooks
  0 siblings, 0 replies; 25+ messages in thread
From: John Brooks @ 2017-06-30  2:16 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: David Airlie, Christian König,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Marek Olšák,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2017-06-29 09:56 PM, John Brooks wrote:
> On Thu, Jun 29, 2017 at 11:35:53AM +0900, Michel Dänzer wrote:
>> On 29/06/17 08:26 AM, John Brooks wrote:
>>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>>> Am 28.06.2017 um 04:33 schrieb John Brooks:
>>>>> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
>>>>> it should only be treated as a hint to initially place a BO somewhere CPU
>>>>> accessible, rather than having a permanent effect on BO placement.
>>>>>
>>>>> Instead of the flag being set in stone at BO creation, set the flag when a
>>>>> page fault occurs so that it goes somewhere CPU-visible, and clear it when
>>>>> the BO is requested by the GPU.
>>>>>
>>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
>>>>> invisible VRAM, where they may promptly generate another page fault. When
>>>>> BOs are constantly moved back and forth like this, it is highly detrimental
>>>>> to performance. Only clear the flag on CS if:
>>>>>
>>>>> - The BO wasn't page faulted for a certain amount of time (currently 10
>>>>>    seconds), and
>>>>> - its last page fault didn't occur too soon (currently 500ms) after its
>>>>>    last CS request, or vice versa.
>>>>>
>>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that we can
>>>>> remove the loop to restrict lpfn to the end of visible VRAM, because
>>>>> amdgpu_ttm_placement_init() will do it for us.
>>>> I'm fine with the general approach, but I'm still absolutely not keen about
>>>> clearing the flag when userspace has originally specified it.
>> Is there any specific concern you have about that?
>>
>>
>>>> Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or
>>>> something like this.
>>> Is it the fact that we clear a flag that userspace expects not to have changed
>>> if it queries it later? I think that's the only effect of this that's directly
>>> visible to userspace code.
>> I don't see any way for userspace to query that.
> I was looking at amdgpu_gem_metadata_ioctl(). It looked like it was possible to
> query a BO's flags through that method. I don't know if it actually matters; it
> was just a stab in the dark for any possibly tangible effect on userspace that
> might arise from the kernel changing the flag.
>
> John
And it looks like I am not correct about this as I misread it.
It exposes bo->metadata_flags, not bo->flags.

John

>>
>>> As for a new "hint" flag, I assume this new flag would be an alternative to the
>>> existing CPU_ACCESS_REQUIRED flag, and we'd change Mesa et al to use it in
>>> situations where the kernel *should* place a BO somewhere CPU-accessible, but
>>> is permitted to move it elsewhere. Is that correct?
>> That seems silly. The userspace flag could never be more than a hint.
>> Unfortunately we named it to suggest differently, but we have to live
>> with that.
>>
>> If we do need a new hint flag internally in the driver, we should simply
>> translate AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to the new flag in
>> amdgpu_gem_create_ioctl, and not expose the new flag to userspace.
>>
>>
>> But other than the question in my followup to the cover letter, this
>> patch looks good to me as is.
>>
>>
>> -- 
>> Earthling Michel Dänzer               |               http://www.amd.com
>> Libre software enthusiast             |             Mesa and X developer
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]                     ` <20170629100523.khsozocltct7tnfu-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2017-06-30  2:24                       ` Michel Dänzer
       [not found]                         ` <e1568d15-42da-4720-dff8-3a6e373f51d8-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Michel Dänzer @ 2017-06-30  2:24 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: John Brooks, Christian König,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 29/06/17 07:05 PM, Daniel Vetter wrote:
> On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
>> On 29/06/17 05:23 PM, Christian König wrote:
>>> Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
>>>> On 29/06/17 08:26 AM, John Brooks wrote:
>>>>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>>>>>>
>>>>>>> Instead of the flag being set in stone at BO creation, set the flag
>>>>>>> when a
>>>>>>> page fault occurs so that it goes somewhere CPU-visible, and clear
>>>>>>> it when
>>>>>>> the BO is requested by the GPU.
>>>>>>>
>>>>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
>>>>>>> invisible VRAM, where they may promptly generate another page
>>>>>>> fault. When
>>>>>>> BOs are constantly moved back and forth like this, it is highly
>>>>>>> detrimental
>>>>>>> to performance. Only clear the flag on CS if:
>>>>>>>
>>>>>>> - The BO wasn't page faulted for a certain amount of time
>>>>>>> (currently 10
>>>>>>>    seconds), and
>>>>>>> - its last page fault didn't occur too soon (currently 500ms) after
>>>>>>> its
>>>>>>>    last CS request, or vice versa.
>>>>>>>
>>>>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that
>>>>>>> we can
>>>>>>> remove the loop to restrict lpfn to the end of visible VRAM, because
>>>>>>> amdgpu_ttm_placement_init() will do it for us.
>>>>>> I'm fine with the general approach, but I'm still absolutely not
>>>>>> keen about
>>>>>> clearing the flag when userspace has originally specified it.
>>>> Is there any specific concern you have about that?
>>>
>>> Yeah, quite a bunch actually. We want to use this flag for P2P buffer
>>> sharing in the future as well and I don't intent to add another one like
>>> CPU_ACCESS_REALLY_REQUIRED or something like this.
>>
>> Won't a BO need to be pinned while it's being shared with another device?
> 
> That's an artifact of the current kernel implementation, I think we could
> do better (but for current use-cases where we share a bunch of scanouts
> and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never
> changing.

Surely there will need to be some kind of transaction though to let the
driver know when a BO starts/stops being shared with another device?
Either via the existing dma-buf callbacks, or something similar. We
can't rely on userspace setting a "CPU access" flag to make sure a BO
can be shared with other devices?


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]                         ` <e1568d15-42da-4720-dff8-3a6e373f51d8-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2017-06-30  6:47                           ` Christian König
       [not found]                             ` <c9b732c1-4bdd-a2ce-1dc2-6abbaf89ce5a-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
  2017-07-07 14:47                             ` Marek Olšák
  0 siblings, 2 replies; 25+ messages in thread
From: Christian König @ 2017-06-30  6:47 UTC (permalink / raw)
  To: Michel Dänzer, Daniel Vetter
  Cc: John Brooks, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 30.06.2017 um 04:24 schrieb Michel Dänzer:
> On 29/06/17 07:05 PM, Daniel Vetter wrote:
>> On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
>>> On 29/06/17 05:23 PM, Christian König wrote:
>>>> Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
>>>>> On 29/06/17 08:26 AM, John Brooks wrote:
>>>>>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>>>>>>> Instead of the flag being set in stone at BO creation, set the flag
>>>>>>>> when a
>>>>>>>> page fault occurs so that it goes somewhere CPU-visible, and clear
>>>>>>>> it when
>>>>>>>> the BO is requested by the GPU.
>>>>>>>>
>>>>>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
>>>>>>>> invisible VRAM, where they may promptly generate another page
>>>>>>>> fault. When
>>>>>>>> BOs are constantly moved back and forth like this, it is highly
>>>>>>>> detrimental
>>>>>>>> to performance. Only clear the flag on CS if:
>>>>>>>>
>>>>>>>> - The BO wasn't page faulted for a certain amount of time
>>>>>>>> (currently 10
>>>>>>>>     seconds), and
>>>>>>>> - its last page fault didn't occur too soon (currently 500ms) after
>>>>>>>> its
>>>>>>>>     last CS request, or vice versa.
>>>>>>>>
>>>>>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that
>>>>>>>> we can
>>>>>>>> remove the loop to restrict lpfn to the end of visible VRAM, because
>>>>>>>> amdgpu_ttm_placement_init() will do it for us.
>>>>>>> I'm fine with the general approach, but I'm still absolutely not
>>>>>>> keen about
>>>>>>> clearing the flag when userspace has originally specified it.
>>>>> Is there any specific concern you have about that?
>>>> Yeah, quite a bunch actually. We want to use this flag for P2P buffer
>>>> sharing in the future as well and I don't intent to add another one like
>>>> CPU_ACCESS_REALLY_REQUIRED or something like this.
>>> Won't a BO need to be pinned while it's being shared with another device?
>> That's an artifact of the current kernel implementation, I think we could
>> do better (but for current use-cases where we share a bunch of scanouts
>> and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never
>> changing.
> Surely there will need to be some kind of transaction though to let the
> driver know when a BO starts/stops being shared with another device?
> Either via the existing dma-buf callbacks, or something similar. We
> can't rely on userspace setting a "CPU access" flag to make sure a BO
> can be shared with other devices?

Well, the flag was never intended to be used by userspace.

See the history was more like we need something in the kernel to place 
the BO in CPU accessible VRAM.

Then the closed source UMD came along and said hey we have the concept 
of two different heaps for visible and invisible VRAM, how does that 
maps to amdgpu?

I unfortunately was to tired to push back hard enough on this....

Christian.


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]                             ` <c9b732c1-4bdd-a2ce-1dc2-6abbaf89ce5a-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2017-06-30 12:39                               ` Daniel Vetter
       [not found]                                 ` <20170630123904.afbsnmxkxxzuydr2-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2017-06-30 12:39 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, John Brooks,
	Michel Dänzer, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Daniel Vetter

On Fri, Jun 30, 2017 at 08:47:27AM +0200, Christian König wrote:
> Am 30.06.2017 um 04:24 schrieb Michel Dänzer:
> > On 29/06/17 07:05 PM, Daniel Vetter wrote:
> > > On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
> > > > On 29/06/17 05:23 PM, Christian König wrote:
> > > > > Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
> > > > > > On 29/06/17 08:26 AM, John Brooks wrote:
> > > > > > > On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
> > > > > > > > > Instead of the flag being set in stone at BO creation, set the flag
> > > > > > > > > when a
> > > > > > > > > page fault occurs so that it goes somewhere CPU-visible, and clear
> > > > > > > > > it when
> > > > > > > > > the BO is requested by the GPU.
> > > > > > > > > 
> > > > > > > > > However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> > > > > > > > > invisible VRAM, where they may promptly generate another page
> > > > > > > > > fault. When
> > > > > > > > > BOs are constantly moved back and forth like this, it is highly
> > > > > > > > > detrimental
> > > > > > > > > to performance. Only clear the flag on CS if:
> > > > > > > > > 
> > > > > > > > > - The BO wasn't page faulted for a certain amount of time
> > > > > > > > > (currently 10
> > > > > > > > >     seconds), and
> > > > > > > > > - its last page fault didn't occur too soon (currently 500ms) after
> > > > > > > > > its
> > > > > > > > >     last CS request, or vice versa.
> > > > > > > > > 
> > > > > > > > > Setting the flag in amdgpu_fault_reserve_notify() also means that
> > > > > > > > > we can
> > > > > > > > > remove the loop to restrict lpfn to the end of visible VRAM, because
> > > > > > > > > amdgpu_ttm_placement_init() will do it for us.
> > > > > > > > I'm fine with the general approach, but I'm still absolutely not
> > > > > > > > keen about
> > > > > > > > clearing the flag when userspace has originally specified it.
> > > > > > Is there any specific concern you have about that?
> > > > > Yeah, quite a bunch actually. We want to use this flag for P2P buffer
> > > > > sharing in the future as well and I don't intent to add another one like
> > > > > CPU_ACCESS_REALLY_REQUIRED or something like this.
> > > > Won't a BO need to be pinned while it's being shared with another device?
> > > That's an artifact of the current kernel implementation, I think we could
> > > do better (but for current use-cases where we share a bunch of scanouts
> > > and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never
> > > changing.
> > Surely there will need to be some kind of transaction though to let the
> > driver know when a BO starts/stops being shared with another device?
> > Either via the existing dma-buf callbacks, or something similar. We
> > can't rely on userspace setting a "CPU access" flag to make sure a BO
> > can be shared with other devices?

Well I just jumped into the middle of this that it's not entirely out of
the question as an idea, but yeah we'd need to rework the dma-buf stuff
with probably a callback to evict mappings/stall for outstanding rendering
or something like that.

> Well, the flag was never intended to be used by userspace.
> 
> See the history was more like we need something in the kernel to place the
> BO in CPU accessible VRAM.
> 
> Then the closed source UMD came along and said hey we have the concept of
> two different heaps for visible and invisible VRAM, how does that maps to
> amdgpu?
> 
> I unfortunately was to tired to push back hard enough on this....

Ehrm, are you saying you have uapi for the closed source stack only?

I can help with the push back on this with a revert, no problem :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
       [not found]                                 ` <20170630123904.afbsnmxkxxzuydr2-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2017-06-30 12:46                                   ` Christian König
  0 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2017-06-30 12:46 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: John Brooks, Michel Dänzer,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 30.06.2017 um 14:39 schrieb Daniel Vetter:
> On Fri, Jun 30, 2017 at 08:47:27AM +0200, Christian König wrote:
>> Am 30.06.2017 um 04:24 schrieb Michel Dänzer:
>>> On 29/06/17 07:05 PM, Daniel Vetter wrote:
>>>> On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
>>>>> On 29/06/17 05:23 PM, Christian König wrote:
>>>>>> Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
>>>>>>> On 29/06/17 08:26 AM, John Brooks wrote:
>>>>>>>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>>>>>>>>> Instead of the flag being set in stone at BO creation, set the flag
>>>>>>>>>> when a
>>>>>>>>>> page fault occurs so that it goes somewhere CPU-visible, and clear
>>>>>>>>>> it when
>>>>>>>>>> the BO is requested by the GPU.
>>>>>>>>>>
>>>>>>>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
>>>>>>>>>> invisible VRAM, where they may promptly generate another page
>>>>>>>>>> fault. When
>>>>>>>>>> BOs are constantly moved back and forth like this, it is highly
>>>>>>>>>> detrimental
>>>>>>>>>> to performance. Only clear the flag on CS if:
>>>>>>>>>>
>>>>>>>>>> - The BO wasn't page faulted for a certain amount of time
>>>>>>>>>> (currently 10
>>>>>>>>>>      seconds), and
>>>>>>>>>> - its last page fault didn't occur too soon (currently 500ms) after
>>>>>>>>>> its
>>>>>>>>>>      last CS request, or vice versa.
>>>>>>>>>>
>>>>>>>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that
>>>>>>>>>> we can
>>>>>>>>>> remove the loop to restrict lpfn to the end of visible VRAM, because
>>>>>>>>>> amdgpu_ttm_placement_init() will do it for us.
>>>>>>>>> I'm fine with the general approach, but I'm still absolutely not
>>>>>>>>> keen about
>>>>>>>>> clearing the flag when userspace has originally specified it.
>>>>>>> Is there any specific concern you have about that?
>>>>>> Yeah, quite a bunch actually. We want to use this flag for P2P buffer
>>>>>> sharing in the future as well and I don't intent to add another one like
>>>>>> CPU_ACCESS_REALLY_REQUIRED or something like this.
>>>>> Won't a BO need to be pinned while it's being shared with another device?
>>>> That's an artifact of the current kernel implementation, I think we could
>>>> do better (but for current use-cases where we share a bunch of scanouts
>>>> and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never
>>>> changing.
>>> Surely there will need to be some kind of transaction though to let the
>>> driver know when a BO starts/stops being shared with another device?
>>> Either via the existing dma-buf callbacks, or something similar. We
>>> can't rely on userspace setting a "CPU access" flag to make sure a BO
>>> can be shared with other devices?
> Well I just jumped into the middle of this that it's not entirely out of
> the question as an idea, but yeah we'd need to rework the dma-buf stuff
> with probably a callback to evict mappings/stall for outstanding rendering
> or something like that.
>
>> Well, the flag was never intended to be used by userspace.
>>
>> See the history was more like we need something in the kernel to place the
>> BO in CPU accessible VRAM.
>>
>> Then the closed source UMD came along and said hey we have the concept of
>> two different heaps for visible and invisible VRAM, how does that maps to
>> amdgpu?
>>
>> I unfortunately was to tired to push back hard enough on this....
> Ehrm, are you saying you have uapi for the closed source stack only?

No, Mesa is using that flag as well.

What I'm saying is that we have a flag which became uapi because I was 
to lazy to distinct between uapi and kernel internal flags.

> I can help with the push back on this with a revert, no problem :-)

That would break Mesa and is not an option (unfortunately :).

Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
  2017-06-30  6:47                           ` Christian König
       [not found]                             ` <c9b732c1-4bdd-a2ce-1dc2-6abbaf89ce5a-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2017-07-07 14:47                             ` Marek Olšák
  1 sibling, 0 replies; 25+ messages in thread
From: Marek Olšák @ 2017-07-07 14:47 UTC (permalink / raw)
  To: Christian König
  Cc: John Brooks, Michel Dänzer, amd-gfx mailing list, dri-devel

On Fri, Jun 30, 2017 at 8:47 AM, Christian König
<deathsimple@vodafone.de> wrote:
> Am 30.06.2017 um 04:24 schrieb Michel Dänzer:
>>
>> On 29/06/17 07:05 PM, Daniel Vetter wrote:
>>>
>>> On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
>>>>
>>>> On 29/06/17 05:23 PM, Christian König wrote:
>>>>>
>>>>> Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
>>>>>>
>>>>>> On 29/06/17 08:26 AM, John Brooks wrote:
>>>>>>>
>>>>>>> On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:
>>>>>>>>>
>>>>>>>>> Instead of the flag being set in stone at BO creation, set the flag
>>>>>>>>> when a
>>>>>>>>> page fault occurs so that it goes somewhere CPU-visible, and clear
>>>>>>>>> it when
>>>>>>>>> the BO is requested by the GPU.
>>>>>>>>>
>>>>>>>>> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT
>>>>>>>>> to
>>>>>>>>> invisible VRAM, where they may promptly generate another page
>>>>>>>>> fault. When
>>>>>>>>> BOs are constantly moved back and forth like this, it is highly
>>>>>>>>> detrimental
>>>>>>>>> to performance. Only clear the flag on CS if:
>>>>>>>>>
>>>>>>>>> - The BO wasn't page faulted for a certain amount of time
>>>>>>>>> (currently 10
>>>>>>>>>     seconds), and
>>>>>>>>> - its last page fault didn't occur too soon (currently 500ms) after
>>>>>>>>> its
>>>>>>>>>     last CS request, or vice versa.
>>>>>>>>>
>>>>>>>>> Setting the flag in amdgpu_fault_reserve_notify() also means that
>>>>>>>>> we can
>>>>>>>>> remove the loop to restrict lpfn to the end of visible VRAM,
>>>>>>>>> because
>>>>>>>>> amdgpu_ttm_placement_init() will do it for us.
>>>>>>>>
>>>>>>>> I'm fine with the general approach, but I'm still absolutely not
>>>>>>>> keen about
>>>>>>>> clearing the flag when userspace has originally specified it.
>>>>>>
>>>>>> Is there any specific concern you have about that?
>>>>>
>>>>> Yeah, quite a bunch actually. We want to use this flag for P2P buffer
>>>>> sharing in the future as well and I don't intent to add another one
>>>>> like
>>>>> CPU_ACCESS_REALLY_REQUIRED or something like this.
>>>>
>>>> Won't a BO need to be pinned while it's being shared with another
>>>> device?
>>>
>>> That's an artifact of the current kernel implementation, I think we could
>>> do better (but for current use-cases where we share a bunch of scanouts
>>> and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this
>>> never
>>> changing.
>>
>> Surely there will need to be some kind of transaction though to let the
>> driver know when a BO starts/stops being shared with another device?
>> Either via the existing dma-buf callbacks, or something similar. We
>> can't rely on userspace setting a "CPU access" flag to make sure a BO
>> can be shared with other devices?
>
>
> Well, the flag was never intended to be used by userspace.
>
> See the history was more like we need something in the kernel to place the
> BO in CPU accessible VRAM.
>
> Then the closed source UMD came along and said hey we have the concept of
> two different heaps for visible and invisible VRAM, how does that maps to
> amdgpu?

Mesa stopped using CPU_ACCESS_REQUIRED a couple of days ago.

Marek
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-07-07 14:47 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-28  2:33 [PATCH v2] Visible VRAM Management Improvements John Brooks
2017-06-28  2:33 ` [PATCH 1/5] drm/amdgpu: Add vis_vramlimit module parameter John Brooks
2017-06-28  2:33 ` [PATCH 2/5] drm/amdgpu: Throttle visible VRAM moves separately John Brooks
2017-06-28  2:33 ` [PATCH 3/5] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo John Brooks
2017-06-28 13:06   ` Christian König
     [not found]     ` <e3d7be62-e63b-f370-f159-147aaf8d7c50-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-06-28 22:59       ` John Brooks
2017-06-29  8:11         ` Christian König
     [not found] ` <1498617201-24557-1-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
2017-06-28  2:33   ` [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS John Brooks
     [not found]     ` <1498617201-24557-5-git-send-email-john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
2017-06-28 13:05       ` Christian König
2017-06-28 23:26         ` John Brooks
     [not found]           ` <20170628232639.GB11762-6hIufAJW0g7Gr8qjsLp7YGXnswh1EIUO@public.gmane.org>
2017-06-29  2:35             ` Michel Dänzer
2017-06-29  8:23               ` Christian König
2017-06-29  9:58                 ` Michel Dänzer
2017-06-29 10:05                   ` Daniel Vetter
     [not found]                     ` <20170629100523.khsozocltct7tnfu-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2017-06-30  2:24                       ` Michel Dänzer
     [not found]                         ` <e1568d15-42da-4720-dff8-3a6e373f51d8-otUistvHUpPR7s880joybQ@public.gmane.org>
2017-06-30  6:47                           ` Christian König
     [not found]                             ` <c9b732c1-4bdd-a2ce-1dc2-6abbaf89ce5a-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-06-30 12:39                               ` Daniel Vetter
     [not found]                                 ` <20170630123904.afbsnmxkxxzuydr2-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2017-06-30 12:46                                   ` Christian König
2017-07-07 14:47                             ` Marek Olšák
2017-06-30  1:56               ` John Brooks
     [not found]                 ` <20170630015638.GA735-6hIufAJW0g7Gr8qjsLp7YGXnswh1EIUO@public.gmane.org>
2017-06-30  2:16                   ` John Brooks
2017-06-29  3:18       ` Michel Dänzer
2017-06-28  2:33   ` [PATCH 5/5] drm/amdgpu: Don't force BOs into visible VRAM for page faults John Brooks
2017-06-29  2:30     ` Michel Dänzer
2017-06-29  2:33 ` [PATCH v2] Visible VRAM Management Improvements Michel Dänzer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).