* [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7
@ 2014-05-07 5:21 Ben Widawsky
2014-05-07 5:21 ` [PATCH 2/9] drm/i915: Extract node allocation from bind Ben Widawsky
` (8 more replies)
0 siblings, 9 replies; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
It was always the intention to do the topdown allocation for context
objects (Chris' idea originally). Unfortunately, I never managed to land
the patch, but someone else did, so now we can use it.
As a reminder, hardware contexts never need to be in the precious GTT
aperture space - which is what happens with the normal bottom-up
allocation we do today. Doing a top-down allocation increases the odds
that the HW contexts can get out of the way, especially with per-FD
contexts as is done in full PPGTT.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f6354e0..66fcfc9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1035,8 +1035,7 @@ alloc:
&ppgtt->node, GEN6_PD_SIZE,
GEN6_PD_ALIGN, 0,
0, dev_priv->gtt.base.total,
- DRM_MM_SEARCH_DEFAULT,
- DRM_MM_CREATE_DEFAULT);
+ DRM_MM_TOPDOWN);
if (ret == -ENOSPC && !retried) {
ret = i915_gem_evict_something(dev, &dev_priv->gtt.base,
GEN6_PD_SIZE, GEN6_PD_ALIGN,
--
1.9.2
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 2/9] drm/i915: Extract node allocation from bind
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 7:02 ` Chris Wilson
2014-05-07 5:21 ` [PATCH 3/9] drm/i915: WARN on unexpected return from drm_mm Ben Widawsky
` (7 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
The DRM node allocation code was already an ugly bit of code within a
complex function. Extracting it cleans the function up. More
importantly, it provides a way for a preallocated (address space) VMA to
easily skip this code - something we're very likely to need.
This is essentially a wrapper for DRM node allocation with an eviction +
retry. It is something which could be moved to core DRM eventually, if
other drivers had similar eviction semantics.
This removes a goto used for something other than error unwinding - a
generally reviled (I am neutral) practice - and replaces it with a call
to the function itself, which should have the same effect. Note that the
recursion is bounded, as all the problem set reduction is done in the
eviction code.
Some might say this change is worse than before because we are using
stack for each subsequent call. Frankly, I'd rather overflow the stack
and blow it up than loop forever. In either case, this is addressed in
the next patch.
I believe, and intend, that other than the stack usage, there is no
functional change here.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem.c | 45 +++++++++++++++++++++++++++++------------
1 file changed, 32 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dae51c3..2a07fa1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3216,6 +3216,34 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
#endif
}
+static int
+i915_gem_find_vm_space(struct i915_address_space *vm,
+ struct drm_mm_node *node,
+ unsigned long size,
+ unsigned long align,
+ unsigned long color,
+ unsigned long start,
+ unsigned long end,
+ uint32_t flags)
+{
+ int ret;
+ ret = drm_mm_insert_node_in_range_generic(&vm->mm, node,
+ size, align, color,
+ start, end,
+ DRM_MM_SEARCH_DEFAULT,
+ DRM_MM_CREATE_DEFAULT);
+ if (ret) {
+ ret = i915_gem_evict_something(vm->dev, vm, size, align, color,
+ flags);
+ if (ret == 0)
+ return i915_gem_find_vm_space(vm, node,
+ size, align, color,
+ start, end, flags);
+ }
+
+ return ret;
+}
+
/**
* Finds free space in the GTT aperture and binds the object there.
*/
@@ -3275,20 +3303,11 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
if (IS_ERR(vma))
goto err_unpin;
-search_free:
- ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
- size, alignment,
- obj->cache_level, 0, gtt_max,
- DRM_MM_SEARCH_DEFAULT,
- DRM_MM_CREATE_DEFAULT);
- if (ret) {
- ret = i915_gem_evict_something(dev, vm, size, alignment,
- obj->cache_level, flags);
- if (ret == 0)
- goto search_free;
-
+ ret = i915_gem_find_vm_space(vm, &vma->node, size, alignment,
+ obj->cache_level, 0, gtt_max, flags);
+ if (ret)
goto err_free_vma;
- }
+
if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
obj->cache_level))) {
ret = -EINVAL;
--
1.9.2
* [PATCH 3/9] drm/i915: WARN on unexpected return from drm_mm
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
2014-05-07 5:21 ` [PATCH 2/9] drm/i915: Extract node allocation from bind Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 5:21 ` [PATCH 4/9] drm/i915: Limit the number of node allocation retries Ben Widawsky
` (6 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
We only actually want to evict and retry if the failure mode was not
enough space (-ENOSPC). This will help us notice quickly in case we
missed a change in the common drm code.
NOTE: A similar check is already in place for the GEN7 PPGTT code.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2a07fa1..b6965a2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3233,6 +3233,9 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
DRM_MM_SEARCH_DEFAULT,
DRM_MM_CREATE_DEFAULT);
if (ret) {
+ if (WARN_ON(ret != -ENOSPC))
+ return ret;
+
ret = i915_gem_evict_something(vm->dev, vm, size, align, color,
flags);
if (ret == 0)
--
1.9.2
* [PATCH 4/9] drm/i915: Limit the number of node allocation retries
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
2014-05-07 5:21 ` [PATCH 2/9] drm/i915: Extract node allocation from bind Ben Widawsky
2014-05-07 5:21 ` [PATCH 3/9] drm/i915: WARN on unexpected return from drm_mm Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 7:49 ` Daniel Vetter
2014-05-07 5:21 ` [PATCH 5/9] drm/i915: Use new drm node allocator for PPGTT Ben Widawsky
` (5 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
AFAICT, it's impossible to actually retry the allocation infinitely in
our current code. However, a small oversight on my part, a slight bug,
or a future bug could easily change that.
To avoid potential breakage, a simple 8-bit retry counter will prevent
us from running forever.
Retry is limited to 100 as a mostly arbitrary number, with some
consideration given to stack usage. As an example, if we allowed 256
retries on a 32b arch (and if memory serves, all arguments are passed on
the stack for such architectures), that's 33 bytes * 256 retries +
(fudge for return address and such)... it's way more than we want to use
already. 64b architectures might be slightly better, since 6? of the
first args will get passed through registers, but it's still bad.
If anything, we might want to do way less than 100, like 3.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b6965a2..de98b26 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3216,6 +3216,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
#endif
}
+#define MAX_VMA_FIND_RETRY 100
static int
i915_gem_find_vm_space(struct i915_address_space *vm,
struct drm_mm_node *node,
@@ -3224,7 +3225,8 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
unsigned long color,
unsigned long start,
unsigned long end,
- uint32_t flags)
+ uint32_t flags,
+ uint8_t retry)
{
int ret;
ret = drm_mm_insert_node_in_range_generic(&vm->mm, node,
@@ -3232,7 +3234,7 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
start, end,
DRM_MM_SEARCH_DEFAULT,
DRM_MM_CREATE_DEFAULT);
- if (ret) {
+ if (ret && (retry < MAX_VMA_FIND_RETRY)) {
if (WARN_ON(ret != -ENOSPC))
return ret;
@@ -3241,7 +3243,8 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
if (ret == 0)
return i915_gem_find_vm_space(vm, node,
size, align, color,
- start, end, flags);
+ start, end, flags,
+ retry + 1);
}
return ret;
@@ -3306,8 +3309,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
if (IS_ERR(vma))
goto err_unpin;
- ret = i915_gem_find_vm_space(vm, &vma->node, size, alignment,
- obj->cache_level, 0, gtt_max, flags);
+ ret = i915_gem_find_vm_space(vm, &vma->node,
+ size, alignment, obj->cache_level,
+ 0, gtt_max, flags, 1);
if (ret)
goto err_free_vma;
--
1.9.2
* [PATCH 5/9] drm/i915: Use new drm node allocator for PPGTT
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
` (2 preceding siblings ...)
2014-05-07 5:21 ` [PATCH 4/9] drm/i915: Limit the number of node allocation retries Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 5:21 ` [PATCH 6/9] drm/i915: Wrap VMA binding Ben Widawsky
` (4 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
The two users were already really similar. By adding the flags
arguments (I hope you like functions with a lot of arguments), we can
satisfy both callers quite well.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 16 ++++++++++++++++
drivers/gpu/drm/i915/i915_gem.c | 34 +++++++++++++++++-----------------
drivers/gpu/drm/i915/i915_gem_gtt.c | 24 +++++++-----------------
3 files changed, 40 insertions(+), 34 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5f4f631..c1d2bea 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2069,6 +2069,22 @@ void i915_init_vm(struct drm_i915_private *dev_priv,
void i915_gem_free_object(struct drm_gem_object *obj);
void i915_gem_vma_destroy(struct i915_vma *vma);
+#define MAX_VMA_FIND_RETRY 100
+int i915_gem_find_vm_space_generic(struct i915_address_space *vm,
+ struct drm_mm_node *node,
+ unsigned long size,
+ unsigned long align,
+ unsigned long color,
+ unsigned long start,
+ unsigned long end,
+ uint32_t flags,
+ enum drm_mm_search_flags sflags,
+ enum drm_mm_allocator_flags aflags,
+ uint8_t retry);
+#define i915_gem_find_vm_space(vm, node, size, align, color, start, end, flags, retry) \
+ i915_gem_find_vm_space_generic(vm, node, size, align, color, \
+ start, end, flags, DRM_MM_BOTTOMUP, retry)
+
#define PIN_MAPPABLE 0x1
#define PIN_NONBLOCK 0x2
#define PIN_GLOBAL 0x4
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index de98b26..e9599e9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3216,24 +3216,23 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
#endif
}
-#define MAX_VMA_FIND_RETRY 100
-static int
-i915_gem_find_vm_space(struct i915_address_space *vm,
- struct drm_mm_node *node,
- unsigned long size,
- unsigned long align,
- unsigned long color,
- unsigned long start,
- unsigned long end,
- uint32_t flags,
- uint8_t retry)
+int i915_gem_find_vm_space_generic(struct i915_address_space *vm,
+ struct drm_mm_node *node,
+ unsigned long size,
+ unsigned long align,
+ unsigned long color,
+ unsigned long start,
+ unsigned long end,
+ uint32_t flags,
+ enum drm_mm_search_flags sflags,
+ enum drm_mm_allocator_flags aflags,
+ uint8_t retry)
{
int ret;
ret = drm_mm_insert_node_in_range_generic(&vm->mm, node,
size, align, color,
start, end,
- DRM_MM_SEARCH_DEFAULT,
- DRM_MM_CREATE_DEFAULT);
+ sflags, aflags);
if (ret && (retry < MAX_VMA_FIND_RETRY)) {
if (WARN_ON(ret != -ENOSPC))
return ret;
@@ -3241,10 +3240,11 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
ret = i915_gem_evict_something(vm->dev, vm, size, align, color,
flags);
if (ret == 0)
- return i915_gem_find_vm_space(vm, node,
- size, align, color,
- start, end, flags,
- retry++);
+ return i915_gem_find_vm_space_generic(vm, node,
+ size, align, color,
+ start, end,
+ flags, sflags, aflags,
+ retry + 1);
}
return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 66fcfc9..79ae83b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1022,7 +1022,6 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
{
struct drm_device *dev = ppgtt->base.dev;
struct drm_i915_private *dev_priv = dev->dev_private;
- bool retried = false;
int ret;
/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
@@ -1030,22 +1029,13 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
* size. We allocate at the top of the GTT to avoid fragmentation.
*/
BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
-alloc:
- ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
- &ppgtt->node, GEN6_PD_SIZE,
- GEN6_PD_ALIGN, 0,
- 0, dev_priv->gtt.base.total,
- DRM_MM_TOPDOWN);
- if (ret == -ENOSPC && !retried) {
- ret = i915_gem_evict_something(dev, &dev_priv->gtt.base,
- GEN6_PD_SIZE, GEN6_PD_ALIGN,
- I915_CACHE_NONE, 0);
- if (ret)
- return ret;
-
- retried = true;
- goto alloc;
- }
+ ret = i915_gem_find_vm_space_generic(&dev_priv->gtt.base, &ppgtt->node,
+ GEN6_PD_SIZE, GEN6_PD_ALIGN, 0,
+ 0, dev_priv->gtt.base.total,
+ 0, DRM_MM_TOPDOWN,
+ MAX_VMA_FIND_RETRY-1);
+ if (ret)
+ return ret;
if (ppgtt->node.start < dev_priv->gtt.mappable_end)
DRM_DEBUG("Forced to use aperture for PDEs\n");
--
1.9.2
* [PATCH 6/9] drm/i915: Wrap VMA binding
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
` (3 preceding siblings ...)
2014-05-07 5:21 ` [PATCH 5/9] drm/i915: Use new drm node allocator for PPGTT Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 7:55 ` Daniel Vetter
2014-05-07 5:21 ` [PATCH 7/9] drm/i915: Make aliasing a 2nd class VM Ben Widawsky
` (3 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
This will be useful for some upcoming patches which do more
platform-specific work. Having it in one central place just makes things
a bit cleaner and easier.
There is one small functional change: the tracepoints are now called in
more places.
NOTE: I didn't actually end up using this patch for the intended purpose, but I
thought it was a nice patch to keep around.
v2: s/i915_gem_bind_vma/i915_gem_vma_bind/
s/i915_gem_unbind_vma/i915_gem_vma_unbind/
(Chris)
v3: Missed one spot
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 3 +++
drivers/gpu/drm/i915/i915_gem.c | 13 ++++++-------
drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++--
drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++++++++--
5 files changed, 27 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c1d2bea..30cde3d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2264,6 +2264,9 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
struct i915_address_space *vm);
unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
struct i915_address_space *vm);
+void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level,
+ unsigned flags);
+void i915_gem_vma_unbind(struct i915_vma *vma);
struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
struct i915_address_space *vm);
struct i915_vma *
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e9599e9..1567911 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2771,7 +2771,7 @@ int i915_vma_unbind(struct i915_vma *vma)
trace_i915_vma_unbind(vma);
- vma->unbind_vma(vma);
+ i915_gem_vma_unbind(vma);
i915_gem_gtt_finish_object(obj);
@@ -3342,9 +3342,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
- trace_i915_vma_bind(vma, flags);
- vma->bind_vma(vma, obj->cache_level,
- flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
+ i915_gem_vma_bind(vma, obj->cache_level,
+ flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
i915_gem_verify_gtt(dev);
return vma;
@@ -3548,8 +3547,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
list_for_each_entry(vma, &obj->vma_list, vma_link)
if (drm_mm_node_allocated(&vma->node))
- vma->bind_vma(vma, cache_level,
- obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
+ i915_gem_vma_bind(vma, cache_level,
+ obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
}
list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3914,7 +3913,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
}
if (flags & PIN_GLOBAL && !obj->has_global_gtt_mapping)
- vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+ i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
vma->pin_count++;
if (flags & PIN_MAPPABLE)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f77b4c1..00c7d88 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -655,7 +655,7 @@ static int do_switch(struct intel_ring_buffer *ring,
if (!to->obj->has_global_gtt_mapping) {
struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
&dev_priv->gtt.base);
- vma->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+ i915_gem_vma_bind(vma, to->obj->cache_level, GLOBAL_BIND);
}
if (!to->is_initialized || i915_gem_context_is_default(to))
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 6cc004f..211bbbd 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -369,7 +369,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
struct i915_vma *vma =
list_first_entry(&target_i915_obj->vma_list,
typeof(*vma), vma_link);
- vma->bind_vma(vma, target_i915_obj->cache_level, GLOBAL_BIND);
+ i915_gem_vma_bind(vma, target_i915_obj->cache_level,
+ GLOBAL_BIND);
}
/* Validate that the target is in a valid r/w GPU domain */
@@ -1266,7 +1267,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
* allocate space first */
struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
BUG_ON(!vma);
- vma->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+ i915_gem_vma_bind(vma, batch_obj->cache_level, GLOBAL_BIND);
}
if (flags & I915_DISPATCH_SECURE)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 79ae83b..a66cb6a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1319,10 +1319,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
* without telling our object about it. So we need to fake it.
*/
obj->has_global_gtt_mapping = 0;
- vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+ i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
}
-
if (INTEL_INFO(dev)->gen >= 8) {
gen8_setup_private_ppat(dev_priv);
return;
@@ -1986,6 +1985,19 @@ int i915_gem_gtt_init(struct drm_device *dev)
return 0;
}
+void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
+ unsigned flags)
+{
+ trace_i915_vma_bind(vma, flags);
+ vma->bind_vma(vma, cache_level, flags);
+}
+
+void i915_gem_vma_unbind(struct i915_vma *vma)
+{
+ trace_i915_vma_unbind(vma);
+ vma->unbind_vma(vma);
+}
+
static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
struct i915_address_space *vm)
{
--
1.9.2
* [PATCH 7/9] drm/i915: Make aliasing a 2nd class VM
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
` (4 preceding siblings ...)
2014-05-07 5:21 ` [PATCH 6/9] drm/i915: Wrap VMA binding Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 7:56 ` Daniel Vetter
2014-05-07 5:21 ` [PATCH 8/9] drm/i915: Make pin global flags explicit Ben Widawsky
` (2 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
There is a good debate to be had about how best to fit the aliasing
PPGTT into the code. However, as it stands right now, getting aliasing
PPGTT bindings is a hack done through implicit arguments. To make this
absolutely clear, WARN and return an error if a driver writer tries to
do something they shouldn't.
I have no issue with an eventual revert of this patch. It makes sense
for what we have today.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1567911..c54668c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3877,9 +3877,13 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
uint32_t alignment,
unsigned flags)
{
+ struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
struct i915_vma *vma;
int ret;
+ if (WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base))
+ return -ENODEV;
+
if (WARN_ON(flags & (PIN_GLOBAL | PIN_MAPPABLE) && !i915_is_ggtt(vm)))
return -EINVAL;
--
1.9.2
* [PATCH 8/9] drm/i915: Make pin global flags explicit
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
` (5 preceding siblings ...)
2014-05-07 5:21 ` [PATCH 7/9] drm/i915: Make aliasing a 2nd class VM Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 5:21 ` [PATCH 9/9] drm/i915: Split out aliasing binds Ben Widawsky
2014-05-07 7:44 ` [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Daniel Vetter
8 siblings, 0 replies; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
The driver currently lets callers pin global, and then tries to do
things correctly inside the function. Doing so has two downsides:
1. It's not possible to pin exclusively into the global, or the
aliasing, address space.
2. It's difficult to read and understand.
The eventual goal, when realized, should fix both of these issues. This
patch, which should have no functional impact, begins to address them
without intentionally breaking things.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 4 +++-
drivers/gpu/drm/i915/i915_gem.c | 33 ++++++++++++++++++++++--------
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 ++++++--
drivers/gpu/drm/i915/i915_gem_gtt.c | 12 +++++++++--
drivers/gpu/drm/i915/i915_gem_gtt.h | 4 ++++
5 files changed, 48 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30cde3d..413b114 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2088,6 +2088,8 @@ int i915_gem_find_vm_space_generic(struct i915_address_space *vm,
#define PIN_MAPPABLE 0x1
#define PIN_NONBLOCK 0x2
#define PIN_GLOBAL 0x4
+#define PIN_ALIASING 0x8
+#define PIN_GLOBAL_ALIASED (PIN_ALIASING | PIN_GLOBAL)
int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
struct i915_address_space *vm,
uint32_t alignment,
@@ -2314,7 +2316,7 @@ i915_gem_obj_ggtt_pin(struct drm_i915_gem_object *obj,
uint32_t alignment,
unsigned flags)
{
- return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, flags | PIN_GLOBAL);
+ return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, flags | PIN_GLOBAL_ALIASED);
}
static inline int
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c54668c..86cec5c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3265,8 +3265,12 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
size_t gtt_max =
flags & PIN_MAPPABLE ? dev_priv->gtt.mappable_end : vm->total;
struct i915_vma *vma;
+ u32 vma_bind_flags = 0;
int ret;
+ if (WARN_ON((flags & (PIN_MAPPABLE | PIN_GLOBAL)) == PIN_MAPPABLE))
+ flags |= PIN_GLOBAL;
+
fence_size = i915_gem_get_gtt_size(dev,
obj->base.size,
obj->tiling_mode);
@@ -3342,8 +3346,10 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
- i915_gem_vma_bind(vma, obj->cache_level,
- flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
+ if (flags & PIN_GLOBAL_ALIASED)
+ vma_bind_flags = GLOBAL_BIND | ALIASING_BIND;
+
+ i915_gem_vma_bind(vma, obj->cache_level, vma_bind_flags);
i915_gem_verify_gtt(dev);
return vma;
@@ -3546,9 +3552,14 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
}
list_for_each_entry(vma, &obj->vma_list, vma_link)
- if (drm_mm_node_allocated(&vma->node))
- i915_gem_vma_bind(vma, cache_level,
- obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
+ if (drm_mm_node_allocated(&vma->node)) {
+ u32 bind_flags = 0;
+ if (obj->has_global_gtt_mapping)
+ bind_flags |= GLOBAL_BIND;
+ if (obj->has_aliasing_ppgtt_mapping)
+ bind_flags |= ALIASING_BIND;
+ i915_gem_vma_bind(vma, cache_level, bind_flags);
+ }
}
list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3884,7 +3895,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
if (WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base))
return -ENODEV;
- if (WARN_ON(flags & (PIN_GLOBAL | PIN_MAPPABLE) && !i915_is_ggtt(vm)))
+ if (WARN_ON(flags & (PIN_GLOBAL_ALIASED | PIN_MAPPABLE) && !i915_is_ggtt(vm)))
return -EINVAL;
vma = i915_gem_obj_to_vma(obj, vm);
@@ -3916,8 +3927,14 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
return PTR_ERR(vma);
}
- if (flags & PIN_GLOBAL && !obj->has_global_gtt_mapping)
- i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
+ if (flags & PIN_GLOBAL_ALIASED) {
+ u32 bind_flags = 0;
+ if (flags & PIN_GLOBAL && !obj->has_global_gtt_mapping)
+ bind_flags |= GLOBAL_BIND;
+ if (flags & PIN_ALIASING && !obj->has_aliasing_ppgtt_mapping)
+ bind_flags |= ALIASING_BIND;
+ i915_gem_vma_bind(vma, obj->cache_level, bind_flags);
+ }
vma->pin_count++;
if (flags & PIN_MAPPABLE)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 211bbbd..bcb3ae8 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -370,7 +370,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
list_first_entry(&target_i915_obj->vma_list,
typeof(*vma), vma_link);
i915_gem_vma_bind(vma, target_i915_obj->cache_level,
- GLOBAL_BIND);
+ GLOBAL_BIND | ALIASING_BIND);
}
/* Validate that the target is in a valid r/w GPU domain */
@@ -558,6 +558,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
if (need_fence || need_reloc_mappable(vma))
flags |= PIN_MAPPABLE;
+ /* FIXME: What kind of bind does Chris want? */
if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
flags |= PIN_GLOBAL;
@@ -1267,7 +1268,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
* allocate space first */
struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
BUG_ON(!vma);
- i915_gem_vma_bind(vma, batch_obj->cache_level, GLOBAL_BIND);
+ /* FIXME: Current secure dispatch code actually uses PPGTT. We
+ * need to fix this eventually */
+ i915_gem_vma_bind(vma, batch_obj->cache_level,
+ GLOBAL_BIND | ALIASING_BIND);
}
if (flags & I915_DISPATCH_SECURE)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a66cb6a..3f2f84e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1318,8 +1318,16 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
* Unfortunately above, we've just wiped out the mappings
* without telling our object about it. So we need to fake it.
*/
- obj->has_global_gtt_mapping = 0;
- i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
+ if (obj->has_global_gtt_mapping || obj->has_aliasing_ppgtt_mapping) {
+ u32 bind_flags = 0;
+ if (obj->has_global_gtt_mapping)
+ bind_flags |= GLOBAL_BIND;
+ if (obj->has_aliasing_ppgtt_mapping)
+ bind_flags |= ALIASING_BIND;
+ obj->has_global_gtt_mapping = 0;
+ obj->has_aliasing_ppgtt_mapping = 0;
+ i915_gem_vma_bind(vma, obj->cache_level, bind_flags);
+ }
}
if (INTEL_INFO(dev)->gen >= 8) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b5e8ac0..907da1b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -152,7 +152,11 @@ struct i915_vma {
* setting the valid PTE entries to a reserved scratch page. */
void (*unbind_vma)(struct i915_vma *vma);
/* Map an object into an address space with the given cache flags. */
+
+/* Only use this if you know you want a strictly global binding */
#define GLOBAL_BIND (1<<0)
+/* Only use this if you know you want a strictly aliased binding */
+#define ALIASING_BIND (1<<1)
void (*bind_vma)(struct i915_vma *vma,
enum i915_cache_level cache_level,
u32 flags);
--
1.9.2
* [PATCH 9/9] drm/i915: Split out aliasing binds
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
` (6 preceding siblings ...)
2014-05-07 5:21 ` [PATCH 8/9] drm/i915: Make pin global flags explicit Ben Widawsky
@ 2014-05-07 5:21 ` Ben Widawsky
2014-05-07 7:59 ` Daniel Vetter
2014-05-07 7:44 ` [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Daniel Vetter
8 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 5:21 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
This patch finishes off actually separating the aliasing and global
binds. Prior to this, all global binds would also be aliased. Now if
aliasing binds are required, they must be explicitly asked for. So far,
we have no users of this outside of execbuf - but Mika has already
submitted a patch requiring just this.
A nice benefit of this is that we should no longer be able to clobber
GTT-only objects from the aliasing PPGTT.
TEST=gem_storedw_batches_loop
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 2 +-
drivers/gpu/drm/i915/i915_gem.c | 6 ++++--
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +-
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
4 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 413b114..ba897f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2316,7 +2316,7 @@ i915_gem_obj_ggtt_pin(struct drm_i915_gem_object *obj,
uint32_t alignment,
unsigned flags)
{
- return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, flags | PIN_GLOBAL_ALIASED);
+ return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, flags | PIN_GLOBAL);
}
static inline int
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 86cec5c..b2056f3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3346,8 +3346,10 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
- if (flags & PIN_GLOBAL_ALIASED)
- vma_bind_flags = GLOBAL_BIND | ALIASING_BIND;
+ if (flags & PIN_ALIASING)
+ vma_bind_flags = ALIASING_BIND;
+ if (flags & PIN_GLOBAL)
+ vma_bind_flags = GLOBAL_BIND;
i915_gem_vma_bind(vma, obj->cache_level, vma_bind_flags);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index bcb3ae8..60049b4 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -549,7 +549,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
unsigned flags;
int ret;
- flags = 0;
+ flags = PIN_ALIASING;
need_fence =
has_fenced_gpu_access &&
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3f2f84e..846b6ee 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1569,6 +1569,9 @@ static void ggtt_bind_vma(struct i915_vma *vma,
}
}
+ if (!(flags & ALIASING_BIND))
+ return;
+
if (dev_priv->mm.aliasing_ppgtt &&
(!obj->has_aliasing_ppgtt_mapping ||
(cache_level != obj->cache_level))) {
--
1.9.2
^ permalink raw reply related [flat|nested] 23+ messages in thread
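To make the new bind semantics concrete, here is a standalone sketch of the
pin-to-bind flag translation introduced above. The flag values and the helper
name are illustrative stand-ins, not the driver's actual definitions; only the
translation logic mirrors the quoted diff.

```c
#include <assert.h>

/* Illustrative stand-ins for the driver's pin and bind flags; the
 * real i915 values and names live in i915_drv.h / i915_gem_gtt.c. */
#define PIN_GLOBAL    (1u << 0)
#define PIN_ALIASING  (1u << 1)
#define GLOBAL_BIND   (1u << 0)
#define ALIASING_BIND (1u << 1)

/* Mirrors the flag translation in i915_gem_object_bind_to_vm() from
 * the diff above: aliasing binds must now be requested explicitly,
 * and the plain assignment means PIN_GLOBAL wins when both are set. */
static unsigned int pin_to_bind_flags(unsigned int flags)
{
	unsigned int vma_bind_flags = 0;

	if (flags & PIN_ALIASING)
		vma_bind_flags = ALIASING_BIND;
	if (flags & PIN_GLOBAL)
		vma_bind_flags = GLOBAL_BIND;

	return vma_bind_flags;
}
```

With this split, a caller asking for PIN_GLOBAL no longer implies an
aliasing-PPGTT mapping, which is the behavior change the diff to
ggtt_bind_vma() enforces.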
* Re: [PATCH 2/9] drm/i915: Extract node allocation from bind
2014-05-07 5:21 ` [PATCH 2/9] drm/i915: Extract node allocation from bind Ben Widawsky
@ 2014-05-07 7:02 ` Chris Wilson
2014-05-07 15:45 ` Ben Widawsky
0 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2014-05-07 7:02 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Tue, May 06, 2014 at 10:21:31PM -0700, Ben Widawsky wrote:
> The DRM node allocation code was already an ugly bit of code
> within a complex function. Removing it serves the purpose of cleaning
> the function up. More importantly, it provides a way to have a
> preallocated (address space) VMA to easily skip this code. Something
> we're very likely to need.
>
> This is essentially a wrapper for DRM node allocation with an eviction +
> retry. It is something which could be moved to core DRM eventually, if
> other drivers had similar eviction semantics.
>
> This removes a goto used for something other than error unwinding, a
> generally reviled (I am neutral) practice, and replaces it with a
> function call to itself that should have the same effect. Note that it's
> not a recursive function as all the problem set reduction is done in the
> eviction code.
>
> Some might say this change is worse than before because we are using
> stack for each subsequent call. Frankly, I'd rather overflow the stack
> and blow it up than loop forever. In either case, this is addressed in
> the next patch.
>
> I believe, and intend, that other than the stack usage, there is no
> functional change here.
Nope, but decoupling evict_something() further does introduce lots more
implied magic.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
` (7 preceding siblings ...)
2014-05-07 5:21 ` [PATCH 9/9] drm/i915: Split out aliasing binds Ben Widawsky
@ 2014-05-07 7:44 ` Daniel Vetter
8 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2014-05-07 7:44 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Tue, May 06, 2014 at 10:21:30PM -0700, Ben Widawsky wrote:
> It was always the intention to do the topdown allocation for context
> objects (Chris' idea originally). Unfortunately, I never managed to land
> the patch, but someone else did, so now we can use it.
>
> As a reminder, hardware contexts never need to be in the precious GTT
> aperture space - which is what happens with the normal bottom-up
> allocation we do today. Doing a top-down allocation increases the odds
> that the HW contexts can get out of the way, especially with per-FD
> contexts as is done in full PPGTT.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Queued for -next, thanks for the patch.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index f6354e0..66fcfc9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1035,8 +1035,7 @@ alloc:
> &ppgtt->node, GEN6_PD_SIZE,
> GEN6_PD_ALIGN, 0,
> 0, dev_priv->gtt.base.total,
> - DRM_MM_SEARCH_DEFAULT,
> - DRM_MM_CREATE_DEFAULT);
> + DRM_MM_TOPDOWN);
> if (ret == -ENOSPC && !retried) {
> ret = i915_gem_evict_something(dev, &dev_priv->gtt.base,
> GEN6_PD_SIZE, GEN6_PD_ALIGN,
> --
> 1.9.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 23+ messages in thread
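The aperture argument in the commit message above can be illustrated with a
toy single-range allocator: bottom-up hands out the low (mappable-aperture)
addresses first, while top-down leaves them free for buffers that actually
need the aperture. This is a simplified model of the effect of DRM_MM_TOPDOWN,
not drm_mm's actual search logic.

```c
#include <assert.h>

/* Toy single-range address space [0, total): allocations grow either
 * from the bottom or from the top; only the watermarks are tracked.
 * A crude model of what DRM_MM_TOPDOWN buys for the gen6/7 PD. */
struct toy_vm {
	unsigned long total;
	unsigned long bottom;	/* first free byte from the bottom */
	unsigned long top;	/* first used byte at the top */
};

static void toy_vm_init(struct toy_vm *vm, unsigned long total)
{
	vm->total = total;
	vm->bottom = 0;
	vm->top = total;
}

/* Returns the allocated offset, or (unsigned long)-1 for -ENOSPC. */
static unsigned long toy_alloc(struct toy_vm *vm, unsigned long size,
			       int topdown)
{
	if (vm->top - vm->bottom < size)
		return (unsigned long)-1;
	if (topdown) {
		vm->top -= size;
		return vm->top;
	}
	vm->bottom += size;
	return vm->bottom - size;
}
```

In a 4096-byte toy space, a 512-byte top-down allocation lands at offset 3584
and leaves the low addresses (the "aperture" in this model) untouched, while
the bottom-up equivalent would sit at offset 0.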
* Re: [PATCH 4/9] drm/i915: Limit the number of node allocation retries
2014-05-07 5:21 ` [PATCH 4/9] drm/i915: Limit the number of node allocation retries Ben Widawsky
@ 2014-05-07 7:49 ` Daniel Vetter
2014-05-07 15:21 ` Ben Widawsky
0 siblings, 1 reply; 23+ messages in thread
From: Daniel Vetter @ 2014-05-07 7:49 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Tue, May 06, 2014 at 10:21:33PM -0700, Ben Widawsky wrote:
> AFAICT, it's impossible to actually infinitely retry the allocation in
> our current code. However, a small oversight on my part, slight bug, or
> future bug, could easily change that.
>
> To avoid a potential breakage, a simple 8-bit retry counter will
> help us prevent retrying forever.
>
> Retry is limited to 100 as a mostly random number. Some consideration
> about stack usage was taken into account. As an example, if we allowed
> 256 retries on a 32b arch (and my memory serves that all arguments are
> passed on the stack for such architectures), that's 33 bytes * 256
> retries + (fudge for return address and such)... it's way more than we
> want to use already. 64b architectures might be slightly better, since
> 6? of the first args will get passed through registers, but it's still
> bad.
>
> If anything, we might want to do way less than 100, like 3.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
So you replace the retry loop with a tailrecursive version in patch 2 and
then add duct-tape later on here? Nope.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_gem.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b6965a2..de98b26 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3216,6 +3216,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
> #endif
> }
>
> +#define MAX_VMA_FIND_RETRY 100
> static int
> i915_gem_find_vm_space(struct i915_address_space *vm,
> struct drm_mm_node *node,
> @@ -3224,7 +3225,8 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
> unsigned long color,
> unsigned long start,
> unsigned long end,
> - uint32_t flags)
> + uint32_t flags,
> + uint8_t retry)
> {
> int ret;
> ret = drm_mm_insert_node_in_range_generic(&vm->mm, node,
> @@ -3232,7 +3234,7 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
> start, end,
> DRM_MM_SEARCH_DEFAULT,
> DRM_MM_CREATE_DEFAULT);
> - if (ret) {
> + if (ret && (retry < MAX_VMA_FIND_RETRY)) {
> if (WARN_ON(ret != -ENOSPC))
> return ret;
>
> @@ -3241,7 +3243,8 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
> if (ret == 0)
> return i915_gem_find_vm_space(vm, node,
> size, align, color,
> - start, end, flags);
> + start, end, flags,
> retry + 1);
> }
>
> return ret;
> @@ -3306,8 +3309,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> if (IS_ERR(vma))
> goto err_unpin;
>
> - ret = i915_gem_find_vm_space(vm, &vma->node, size, alignment,
> - obj->cache_level, 0, gtt_max, flags);
> + ret = i915_gem_find_vm_space(vm, &vma->node,
> + size, alignment, obj->cache_level,
> + 0, gtt_max, flags, 1);
> if (ret)
> goto err_free_vma;
>
> --
> 1.9.2
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 6/9] drm/i915: Wrap VMA binding
2014-05-07 5:21 ` [PATCH 6/9] drm/i915: Wrap VMA binding Ben Widawsky
@ 2014-05-07 7:55 ` Daniel Vetter
2014-05-07 15:54 ` Ben Widawsky
0 siblings, 1 reply; 23+ messages in thread
From: Daniel Vetter @ 2014-05-07 7:55 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Tue, May 06, 2014 at 10:21:35PM -0700, Ben Widawsky wrote:
> This will be useful for some upcoming patches which do more platform
> specific work. Having it in one central place just makes things a bit
> cleaner and easier.
>
> There is a small functional change here. There are more calls to the
> tracepoints.
>
> NOTE: I didn't actually end up using this patch for the intended purpose, but I
> thought it was a nice patch to keep around.
>
> v2: s/i915_gem_bind_vma/i915_gem_vma_bind/
> s/i915_gem_unbind_vma/i915_gem_vma_unbind/
> (Chris)
>
> v3: Missed one spot
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
You change the semantics of the vma bind/unbind tracepoints - previously
they've meant we reserve/unreserve a ppgtt address for an object; now
they're called every time we touch the ptes (which is what
vma_bind|unbind actually do). That doesn't look too terribly useful for
tracing any more.
What's the use case for these added tracepoints?
-Daniel
> ---
> drivers/gpu/drm/i915/i915_drv.h | 3 +++
> drivers/gpu/drm/i915/i915_gem.c | 13 ++++++-------
> drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++--
> drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++++++++--
> 5 files changed, 27 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c1d2bea..30cde3d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2264,6 +2264,9 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> struct i915_address_space *vm);
> unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> struct i915_address_space *vm);
> +void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level,
> + unsigned flags);
> +void i915_gem_vma_unbind(struct i915_vma *vma);
> struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> struct i915_address_space *vm);
> struct i915_vma *
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e9599e9..1567911 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2771,7 +2771,7 @@ int i915_vma_unbind(struct i915_vma *vma)
>
> trace_i915_vma_unbind(vma);
>
> - vma->unbind_vma(vma);
> + i915_gem_vma_unbind(vma);
>
> i915_gem_gtt_finish_object(obj);
>
> @@ -3342,9 +3342,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>
> WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
>
> - trace_i915_vma_bind(vma, flags);
> - vma->bind_vma(vma, obj->cache_level,
> - flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
> + i915_gem_vma_bind(vma, obj->cache_level,
> + flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
>
> i915_gem_verify_gtt(dev);
> return vma;
> @@ -3548,8 +3547,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>
> list_for_each_entry(vma, &obj->vma_list, vma_link)
> if (drm_mm_node_allocated(&vma->node))
> - vma->bind_vma(vma, cache_level,
> - obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
> + i915_gem_vma_bind(vma, cache_level,
> + obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
> }
>
> list_for_each_entry(vma, &obj->vma_list, vma_link)
> @@ -3914,7 +3913,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> }
>
> if (flags & PIN_GLOBAL && !obj->has_global_gtt_mapping)
> - vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> + i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
>
> vma->pin_count++;
> if (flags & PIN_MAPPABLE)
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index f77b4c1..00c7d88 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -655,7 +655,7 @@ static int do_switch(struct intel_ring_buffer *ring,
> if (!to->obj->has_global_gtt_mapping) {
> struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
> &dev_priv->gtt.base);
> - vma->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
> + i915_gem_vma_bind(vma, to->obj->cache_level, GLOBAL_BIND);
> }
>
> if (!to->is_initialized || i915_gem_context_is_default(to))
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 6cc004f..211bbbd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -369,7 +369,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> struct i915_vma *vma =
> list_first_entry(&target_i915_obj->vma_list,
> typeof(*vma), vma_link);
> - vma->bind_vma(vma, target_i915_obj->cache_level, GLOBAL_BIND);
> + i915_gem_vma_bind(vma, target_i915_obj->cache_level,
> + GLOBAL_BIND);
> }
>
> /* Validate that the target is in a valid r/w GPU domain */
> @@ -1266,7 +1267,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> * allocate space first */
> struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> BUG_ON(!vma);
> - vma->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> + i915_gem_vma_bind(vma, batch_obj->cache_level, GLOBAL_BIND);
> }
>
> if (flags & I915_DISPATCH_SECURE)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 79ae83b..a66cb6a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1319,10 +1319,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> * without telling our object about it. So we need to fake it.
> */
> obj->has_global_gtt_mapping = 0;
> - vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> + i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
> }
>
> -
> if (INTEL_INFO(dev)->gen >= 8) {
> gen8_setup_private_ppat(dev_priv);
> return;
> @@ -1986,6 +1985,19 @@ int i915_gem_gtt_init(struct drm_device *dev)
> return 0;
> }
>
> +void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> + unsigned flags)
> +{
> + trace_i915_vma_bind(vma, flags);
> + vma->bind_vma(vma, cache_level, flags);
> +}
> +
> +void i915_gem_vma_unbind(struct i915_vma *vma)
> +{
> + trace_i915_vma_unbind(vma);
> + vma->unbind_vma(vma);
> +}
> +
> static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
> struct i915_address_space *vm)
> {
> --
> 1.9.2
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 7/9] drm/i915: Make aliasing a 2nd class VM
2014-05-07 5:21 ` [PATCH 7/9] drm/i915: Make aliasing a 2nd class VM Ben Widawsky
@ 2014-05-07 7:56 ` Daniel Vetter
0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2014-05-07 7:56 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Tue, May 06, 2014 at 10:21:36PM -0700, Ben Widawsky wrote:
> There is a good debate to be had about how best to fit the aliasing
> PPGTT into the code. However, as it stands right now, getting aliasing
> PPGTT bindings is a hack, and done through implicit arguments. To make
> this absolutely clear, WARN and return an error if a driver writer tries
> to do something they shouldn't.
>
> I have no issue with an eventual revert of this patch. It makes sense
> for what we have today.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Queued for -next, thanks for the patch.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1567911..c54668c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3877,9 +3877,13 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> uint32_t alignment,
> unsigned flags)
> {
> + struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> struct i915_vma *vma;
> int ret;
>
> + if (WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base))
> + return -ENODEV;
> +
> if (WARN_ON(flags & (PIN_GLOBAL | PIN_MAPPABLE) && !i915_is_ggtt(vm)))
> return -EINVAL;
>
> --
> 1.9.2
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 9/9] drm/i915: Split out aliasing binds
2014-05-07 5:21 ` [PATCH 9/9] drm/i915: Split out aliasing binds Ben Widawsky
@ 2014-05-07 7:59 ` Daniel Vetter
0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2014-05-07 7:59 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Tue, May 06, 2014 at 10:21:38PM -0700, Ben Widawsky wrote:
> This patch finishes off actually separating the aliasing and global
> finds. Prior to this, all global binds would be aliased. Now if aliasing
> binds are required, they must be explicitly asked for. So far, we have
> no users of this outside of execbuf - but Mika has already submitted a
> patch requiring just this.
>
> A nice benefit of this is that we should no longer be able to clobber
> GTT-only objects from the aliasing PPGTT.
>
> TEST=gem_storedw_batches_loop
Hm, how does this testcase test this bug? I'm surprised ...
Also the usual format is
Testcase: igt/gem_stored_batches_loop
i.e. the testcase name as used by piglit.
Anyway I want these last two patches from this series since they fix up a
regression introduced by full ppgtt. Which is blocking the cmd parser work
(among other stuff). But I'm not sold on the earlier parts, so can you
please rebase just these two?
Also we need to triple-check that we have testcases for all the rebind and
set_cache_level corner-case fun in igt.
-Daniel
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 2 +-
> drivers/gpu/drm/i915/i915_gem.c | 6 ++++--
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +-
> drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
> 4 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 413b114..ba897f0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2316,7 +2316,7 @@ i915_gem_obj_ggtt_pin(struct drm_i915_gem_object *obj,
> uint32_t alignment,
> unsigned flags)
> {
> - return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, flags | PIN_GLOBAL_ALIASED);
> + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, flags | PIN_GLOBAL);
> }
>
> static inline int
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 86cec5c..b2056f3 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3346,8 +3346,10 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>
> WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
>
> - if (flags & PIN_GLOBAL_ALIASED)
> - vma_bind_flags = GLOBAL_BIND | ALIASING_BIND;
> + if (flags & PIN_ALIASING)
> + vma_bind_flags = ALIASING_BIND;
> + if (flags & PIN_GLOBAL)
> + vma_bind_flags = GLOBAL_BIND;
>
> i915_gem_vma_bind(vma, obj->cache_level, vma_bind_flags);
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index bcb3ae8..60049b4 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -549,7 +549,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> unsigned flags;
> int ret;
>
> - flags = 0;
> + flags = PIN_ALIASING;
>
> need_fence =
> has_fenced_gpu_access &&
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 3f2f84e..846b6ee 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1569,6 +1569,9 @@ static void ggtt_bind_vma(struct i915_vma *vma,
> }
> }
>
> + if (!(flags & ALIASING_BIND))
> + return;
> +
> if (dev_priv->mm.aliasing_ppgtt &&
> (!obj->has_aliasing_ppgtt_mapping ||
> (cache_level != obj->cache_level))) {
> --
> 1.9.2
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/9] drm/i915: Limit the number of node allocation retries
2014-05-07 7:49 ` Daniel Vetter
@ 2014-05-07 15:21 ` Ben Widawsky
0 siblings, 0 replies; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 15:21 UTC (permalink / raw)
To: Daniel Vetter; +Cc: Intel GFX, Ben Widawsky
On Wed, May 07, 2014 at 09:49:57AM +0200, Daniel Vetter wrote:
> On Tue, May 06, 2014 at 10:21:33PM -0700, Ben Widawsky wrote:
> > AFAICT, it's impossible to actually infinitely retry the allocation in
> > our current code. However, a small oversight on my part, slight bug, or
> > future bug, could easily change that.
> >
> > To avoid a potential breakage, a simple 8-bit retry counter will
> > help us prevent retrying forever.
> >
> > Retry is limited to 100 as a mostly random number. Some consideration
> > about stack usage was taken into account. As an example, if we allowed
> > 256 retries on a 32b arch (and my memory serves that all arguments are
> > passed on the stack for such architectures), that's 33 bytes * 256
> > retries + (fudge for return address and such)... it's way more than we
> > want to use already. 64b architectures might be slightly better, since
> > 6? of the first args will get passed through registers, but it's still
> > bad.
> >
> > If anything, we might want to do way less than 100, like 3.
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> So you replace the retry loop with a tailrecursive version in patch 2 and
> then add duct-tape later on here? Nope.
> -Daniel
>
I feel like you're implying the retry loop will ever end. The retry
limit should probably come first though; I agree with that much.
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 14 +++++++++-----
> > 1 file changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index b6965a2..de98b26 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3216,6 +3216,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
> > #endif
> > }
> >
> > +#define MAX_VMA_FIND_RETRY 100
> > static int
> > i915_gem_find_vm_space(struct i915_address_space *vm,
> > struct drm_mm_node *node,
> > @@ -3224,7 +3225,8 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
> > unsigned long color,
> > unsigned long start,
> > unsigned long end,
> > - uint32_t flags)
> > + uint32_t flags,
> > + uint8_t retry)
> > {
> > int ret;
> > ret = drm_mm_insert_node_in_range_generic(&vm->mm, node,
> > @@ -3232,7 +3234,7 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
> > start, end,
> > DRM_MM_SEARCH_DEFAULT,
> > DRM_MM_CREATE_DEFAULT);
> > - if (ret) {
> > + if (ret && (retry < MAX_VMA_FIND_RETRY)) {
> > if (WARN_ON(ret != -ENOSPC))
> > return ret;
> >
> > @@ -3241,7 +3243,8 @@ i915_gem_find_vm_space(struct i915_address_space *vm,
> > if (ret == 0)
> > return i915_gem_find_vm_space(vm, node,
> > size, align, color,
> > - start, end, flags);
> > + start, end, flags,
> > > retry + 1);
> > }
> >
> > return ret;
> > @@ -3306,8 +3309,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> > if (IS_ERR(vma))
> > goto err_unpin;
> >
> > - ret = i915_gem_find_vm_space(vm, &vma->node, size, alignment,
> > - obj->cache_level, 0, gtt_max, flags);
> > + ret = i915_gem_find_vm_space(vm, &vma->node,
> > + size, alignment, obj->cache_level,
> > + 0, gtt_max, flags, 1);
> > if (ret)
> > goto err_free_vma;
> >
> > --
> > 1.9.2
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 23+ messages in thread
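The allocate-evict-retry shape being debated above can also be written as a
plain loop with an explicit bound, avoiding both the goto and the tail call.
The sketch below is standalone: try_alloc and try_evict are hypothetical
stand-ins for drm_mm_insert_node_in_range_generic() and
i915_gem_evict_something(), and the stub helpers exist only for the usage
example.

```c
#include <assert.h>
#include <errno.h>

#define MAX_VMA_FIND_RETRY 3

/* Loop-based sketch of i915_gem_find_vm_space(): try to allocate;
 * on -ENOSPC, evict something and retry, at most MAX_VMA_FIND_RETRY
 * times. Bounded, and no stack growth per retry. */
static int find_space_bounded(int (*try_alloc)(void *ctx),
			      int (*try_evict)(void *ctx), void *ctx)
{
	int retry;

	for (retry = 0; retry < MAX_VMA_FIND_RETRY; retry++) {
		int ret = try_alloc(ctx);

		if (ret != -ENOSPC)
			return ret;	/* success, or a hard error */
		ret = try_evict(ctx);
		if (ret)
			return ret;	/* nothing evictable: give up */
	}
	return -ENOSPC;
}

/* Stubs for the usage example: allocation succeeds on the third try. */
static int stub_calls;

static int stub_alloc(void *ctx)
{
	(void)ctx;
	return ++stub_calls < 3 ? -ENOSPC : 0;
}

static int stub_evict(void *ctx)
{
	(void)ctx;
	return 0;
}
```

Three tries fit within the bound here, so find_space_bounded() returns 0 with
these stubs; with MAX_VMA_FIND_RETRY lowered to 2 the same stubs would make it
return -ENOSPC instead.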
* Re: [PATCH 2/9] drm/i915: Extract node allocation from bind
2014-05-07 7:02 ` Chris Wilson
@ 2014-05-07 15:45 ` Ben Widawsky
2014-05-07 15:53 ` Chris Wilson
0 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 15:45 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Wed, May 07, 2014 at 08:02:38AM +0100, Chris Wilson wrote:
> On Tue, May 06, 2014 at 10:21:31PM -0700, Ben Widawsky wrote:
> > The DRM node allocation code was already an ugly bit of code
> > within a complex function. Removing it serves the purpose of cleaning
> > the function up. More importantly, it provides a way to have a
> > preallocated (address space) VMA to easily skip this code. Something
> > we're very likely to need.
> >
> > This is essentially a wrapper for DRM node allocation with an eviction +
> > retry. It is something which could be moved to core DRM eventually, if
> > other drivers had similar eviction semantics.
> >
> > This removes a goto used for something other than error unwinding, a
> > generally reviled (I am neutral) practice, and replaces it with a
> > function call to itself that should have the same effect. Note that it's
> > not a recursive function as all the problem set reduction is done in the
> > eviction code.
> >
> > Some might say this change is worse than before because we are using
> > stack for each subsequent call. Frankly, I'd rather overflow the stack
> > and blow it up than loop forever. In either case, this is addressed in
> > the next patch.
> >
> > I believe, and intend, that other than the stack usage, there is no
> > functional change here.
>
> Nope, but decoupling evict_something() further does introduce lots more
> implied magic.
> -Chris
>
Hmm. I really don't see what's actually upsetting. Can you be a bit more
explicit about what's so bothersome to you for my edification?
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/9] drm/i915: Extract node allocation from bind
2014-05-07 15:45 ` Ben Widawsky
@ 2014-05-07 15:53 ` Chris Wilson
2014-05-07 16:00 ` Ben Widawsky
0 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2014-05-07 15:53 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, May 07, 2014 at 08:45:38AM -0700, Ben Widawsky wrote:
> On Wed, May 07, 2014 at 08:02:38AM +0100, Chris Wilson wrote:
> > On Tue, May 06, 2014 at 10:21:31PM -0700, Ben Widawsky wrote:
> > > The DRM node allocation code was already an ugly bit of code
> > > within a complex function. Removing it serves the purpose of cleaning
> > > the function up. More importantly, it provides a way to have a
> > > preallocated (address space) VMA to easily skip this code. Something
> > > we're very likely to need.
> > >
> > > This is essentially a wrapper for DRM node allocation with an eviction +
> > > retry. It is something which could be moved to core DRM eventually, if
> > > other drivers had similar eviction semantics.
> > >
> > > This removes a goto used for something other than error unwinding, a
> > > generally reviled (I am neutral) practice, and replaces it with a
> > > function call to itself that should have the same effect. Note that it's
> > > not a recursive function as all the problem set reduction is done in the
> > > eviction code.
> > >
> > > Some might say this change is worse than before because we are using
> > > stack for each subsequent call. Frankly, I'd rather overflow the stack
> > > and blow it up than loop forever. In either case, this is addressed in
> > > the next patch.
> > >
> > > I believe, and intend, that other than the stack usage, there is no
> > > functional change here.
> >
> > Nope, but decoupling evict_something() further does introduce lots more
> > implied magic.
> > -Chris
> >
>
> Hmm. I really don't see what's actually upsetting. Can you be a bit more
> explicit about what's so bothersome to you for my edification?
evict_something() makes assumptions about the ranges of the vm which it
searches. At the moment, they mirror its parent function's.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 6/9] drm/i915: Wrap VMA binding
2014-05-07 7:55 ` Daniel Vetter
@ 2014-05-07 15:54 ` Ben Widawsky
2014-05-07 16:09 ` Daniel Vetter
0 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 15:54 UTC (permalink / raw)
To: Daniel Vetter; +Cc: Intel GFX, Ben Widawsky
On Wed, May 07, 2014 at 09:55:49AM +0200, Daniel Vetter wrote:
> On Tue, May 06, 2014 at 10:21:35PM -0700, Ben Widawsky wrote:
> > This will be useful for some upcoming patches which do more platform
> > specific work. Having it in one central place just makes things a bit
> > cleaner and easier.
> >
> > There is a small functional change here. There are more calls to the
> > tracepoints.
> >
> > NOTE: I didn't actually end up using this patch for the intended purpose, but I
> > thought it was a nice patch to keep around.
> >
> > v2: s/i915_gem_bind_vma/i915_gem_vma_bind/
> > s/i915_gem_unbind_vma/i915_gem_vma_unbind/
> > (Chris)
> >
> > v3: Missed one spot
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> You change the semantics of the vma bind/unbind tracepoints - previously
> they've meant we reserve/unreserve a ppgtt address for an object; now
> they're called every time we touch the ptes (which is what
> vma_bind|unbind actually do). That doesn't look too terribly useful for
> tracing any more.
>
> What's the use case for these added tracepoints?
> -Daniel
>
You're right. The tracepoint is incredibly useful for debug purposes,
but not as an actual tracepoint. The current tracepoint is badly named,
IMO, but that's fine. I am not sure it's worth fixing though, since
nobody has really spoken up for the patch.
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 3 +++
> > drivers/gpu/drm/i915/i915_gem.c | 13 ++++++-------
> > drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++--
> > drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++++++++--
> > 5 files changed, 27 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index c1d2bea..30cde3d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2264,6 +2264,9 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > struct i915_address_space *vm);
> > unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > struct i915_address_space *vm);
> > +void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level,
> > + unsigned flags);
> > +void i915_gem_vma_unbind(struct i915_vma *vma);
> > struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > struct i915_address_space *vm);
> > struct i915_vma *
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index e9599e9..1567911 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2771,7 +2771,7 @@ int i915_vma_unbind(struct i915_vma *vma)
> >
> > trace_i915_vma_unbind(vma);
> >
> > - vma->unbind_vma(vma);
> > + i915_gem_vma_unbind(vma);
> >
> > i915_gem_gtt_finish_object(obj);
> >
> > @@ -3342,9 +3342,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >
> > WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
> >
> > - trace_i915_vma_bind(vma, flags);
> > - vma->bind_vma(vma, obj->cache_level,
> > - flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
> > + i915_gem_vma_bind(vma, obj->cache_level,
> > + flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
> >
> > i915_gem_verify_gtt(dev);
> > return vma;
> > @@ -3548,8 +3547,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >
> > list_for_each_entry(vma, &obj->vma_list, vma_link)
> > if (drm_mm_node_allocated(&vma->node))
> > - vma->bind_vma(vma, cache_level,
> > - obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
> > + i915_gem_vma_bind(vma, cache_level,
> > + obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
> > }
> >
> > list_for_each_entry(vma, &obj->vma_list, vma_link)
> > @@ -3914,7 +3913,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > }
> >
> > if (flags & PIN_GLOBAL && !obj->has_global_gtt_mapping)
> > - vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> > + i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
> >
> > vma->pin_count++;
> > if (flags & PIN_MAPPABLE)
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index f77b4c1..00c7d88 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -655,7 +655,7 @@ static int do_switch(struct intel_ring_buffer *ring,
> > if (!to->obj->has_global_gtt_mapping) {
> > struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
> > &dev_priv->gtt.base);
> > - vma->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
> > + i915_gem_vma_bind(vma, to->obj->cache_level, GLOBAL_BIND);
> > }
> >
> > if (!to->is_initialized || i915_gem_context_is_default(to))
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 6cc004f..211bbbd 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -369,7 +369,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > struct i915_vma *vma =
> > list_first_entry(&target_i915_obj->vma_list,
> > typeof(*vma), vma_link);
> > - vma->bind_vma(vma, target_i915_obj->cache_level, GLOBAL_BIND);
> > + i915_gem_vma_bind(vma, target_i915_obj->cache_level,
> > + GLOBAL_BIND);
> > }
> >
> > /* Validate that the target is in a valid r/w GPU domain */
> > @@ -1266,7 +1267,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > * allocate space first */
> > struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> > BUG_ON(!vma);
> > - vma->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > + i915_gem_vma_bind(vma, batch_obj->cache_level, GLOBAL_BIND);
> > }
> >
> > if (flags & I915_DISPATCH_SECURE)
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 79ae83b..a66cb6a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -1319,10 +1319,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > * without telling our object about it. So we need to fake it.
> > */
> > obj->has_global_gtt_mapping = 0;
> > - vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> > + i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
> > }
> >
> > -
> > if (INTEL_INFO(dev)->gen >= 8) {
> > gen8_setup_private_ppat(dev_priv);
> > return;
> > @@ -1986,6 +1985,19 @@ int i915_gem_gtt_init(struct drm_device *dev)
> > return 0;
> > }
> >
> > +void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> > + unsigned flags)
> > +{
> > + trace_i915_vma_bind(vma, flags);
> > + vma->bind_vma(vma, cache_level, flags);
> > +}
> > +
> > +void i915_gem_vma_unbind(struct i915_vma *vma)
> > +{
> > + trace_i915_vma_unbind(vma);
> > + vma->unbind_vma(vma);
> > +}
> > +
> > static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > struct i915_address_space *vm)
> > {
> > --
> > 1.9.2
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/9] drm/i915: Extract node allocation from bind
2014-05-07 15:53 ` Chris Wilson
@ 2014-05-07 16:00 ` Ben Widawsky
2014-05-07 16:55 ` Chris Wilson
0 siblings, 1 reply; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 16:00 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Wed, May 07, 2014 at 04:53:08PM +0100, Chris Wilson wrote:
> On Wed, May 07, 2014 at 08:45:38AM -0700, Ben Widawsky wrote:
> > On Wed, May 07, 2014 at 08:02:38AM +0100, Chris Wilson wrote:
> > > On Tue, May 06, 2014 at 10:21:31PM -0700, Ben Widawsky wrote:
> > > > The DRM node allocation code was already a bit of an ugly bit of code
> > > > within a complex function. Removing it serves the purpose of cleaning
> > > > the function up. More importantly, it provides a way to have a
> > > > preallocated (address space) VMA to easily skip this code. Something
> > > > we're very likely to need.
> > > >
> > > > This is essentially a wrapper for DRM node allocation with an eviction +
> > > > retry. It is something which could be moved to core DRM eventually, if
> > > > other drivers had similar eviction semantics.
> > > >
> > > > This removes a goto used for something other than error unwinding, a
> > > > generally reviled (I am neutral) practice, and replaces it with a
> > > > function call to itself that should have the same effect. Note that it's
> > > > not a recursive function as all the problem set reduction is done in the
> > > > eviction code.
> > > >
> > > > Some might say this change is worse than before because we are using
> > > > stack for each subsequent call. Frankly, I'd rather overflow the stack
> > > > and blow it up than loop forever. In either case, this is addressed in
> > > > the next patch.
> > > >
> > > > I believe, and intend, that other than the stack usage, there is no
> > > > functional change here.
> > >
> > > Nope, but decoupling evict_something() further does introduce lots more
> > > implied magic.
> > > -Chris
> > >
> >
> > Hmm. I really don't see what's actually upsetting. Can you be a bit more
> > explicit about what's so bothersome to you for my edification?
>
> evict_something() makes assumptions about the ranges of the vm which it
> searches. At the moment, they mirror its parent function's.
> -Chris
>
Ah, thanks. So is plumbing a starting eviction offset into evict_something() an
appealing solution here?
> --
> Chris Wilson, Intel Open Source Technology Centre
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 6/9] drm/i915: Wrap VMA binding
2014-05-07 15:54 ` Ben Widawsky
@ 2014-05-07 16:09 ` Daniel Vetter
0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2014-05-07 16:09 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, May 07, 2014 at 08:54:56AM -0700, Ben Widawsky wrote:
> On Wed, May 07, 2014 at 09:55:49AM +0200, Daniel Vetter wrote:
> > On Tue, May 06, 2014 at 10:21:35PM -0700, Ben Widawsky wrote:
> > > This will be useful for some upcoming patches which do more platform
> > > specific work. Having it in one central place just makes things a bit
> > > cleaner and easier.
> > >
> > > There is a small functional change here. There are more calls to the
> > > tracepoints.
> > >
> > > NOTE: I didn't actually end up using this patch for the intended purpose, but I
> > > thought it was a nice patch to keep around.
> > >
> > > v2: s/i915_gem_bind_vma/i915_gem_vma_bind/
> > > s/i915_gem_unbind_vma/i915_gem_vma_unbind/
> > > (Chris)
> > >
> > > v3: Missed one spot
> > >
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >
> > You change the semantics of the vma bind/unbind tracepoints - previously
> > they've meant we reserve/unreserve a ppgtt address for an object; now
> > they're called every time we touch the ptes (which is what
> > vma_bind|unbind actually do). That doesn't look too terribly useful for
> > tracing any more.
> >
> > What's the use case for these added tracepoints?
> > -Daniel
> >
>
> You're right. The tracepoint is incredibly useful for debug purposes,
> but not as an actual tracepoint. The current tracepoint is badly named,
> IMO, but that's fine. I am not sure it's worth fixing though, since
> nobody has really spoken up for the patch.
Yeah, I think a new tracepoint that fires every time we frob ptes could
indeed be useful. But conflating that operation with the vm operation isn't
good, especially since the vm operation lacks crucial information like the
actual caching bits we're using with the ptes.
-Daniel
>
> > > ---
> > > drivers/gpu/drm/i915/i915_drv.h | 3 +++
> > > drivers/gpu/drm/i915/i915_gem.c | 13 ++++++-------
> > > drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
> > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++--
> > > drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++++++++--
> > > 5 files changed, 27 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index c1d2bea..30cde3d 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -2264,6 +2264,9 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > > struct i915_address_space *vm);
> > > unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > > struct i915_address_space *vm);
> > > +void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level,
> > > + unsigned flags);
> > > +void i915_gem_vma_unbind(struct i915_vma *vma);
> > > struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > > struct i915_address_space *vm);
> > > struct i915_vma *
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index e9599e9..1567911 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2771,7 +2771,7 @@ int i915_vma_unbind(struct i915_vma *vma)
> > >
> > > trace_i915_vma_unbind(vma);
> > >
> > > - vma->unbind_vma(vma);
> > > + i915_gem_vma_unbind(vma);
> > >
> > > i915_gem_gtt_finish_object(obj);
> > >
> > > @@ -3342,9 +3342,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> > >
> > > WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
> > >
> > > - trace_i915_vma_bind(vma, flags);
> > > - vma->bind_vma(vma, obj->cache_level,
> > > - flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
> > > + i915_gem_vma_bind(vma, obj->cache_level,
> > > + flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
> > >
> > > i915_gem_verify_gtt(dev);
> > > return vma;
> > > @@ -3548,8 +3547,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >
> > > list_for_each_entry(vma, &obj->vma_list, vma_link)
> > > if (drm_mm_node_allocated(&vma->node))
> > > - vma->bind_vma(vma, cache_level,
> > > - obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
> > > + i915_gem_vma_bind(vma, cache_level,
> > > + obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
> > > }
> > >
> > > list_for_each_entry(vma, &obj->vma_list, vma_link)
> > > @@ -3914,7 +3913,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > }
> > >
> > > if (flags & PIN_GLOBAL && !obj->has_global_gtt_mapping)
> > > - vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> > > + i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
> > >
> > > vma->pin_count++;
> > > if (flags & PIN_MAPPABLE)
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index f77b4c1..00c7d88 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -655,7 +655,7 @@ static int do_switch(struct intel_ring_buffer *ring,
> > > if (!to->obj->has_global_gtt_mapping) {
> > > struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
> > > &dev_priv->gtt.base);
> > > - vma->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
> > > + i915_gem_vma_bind(vma, to->obj->cache_level, GLOBAL_BIND);
> > > }
> > >
> > > if (!to->is_initialized || i915_gem_context_is_default(to))
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index 6cc004f..211bbbd 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -369,7 +369,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > > struct i915_vma *vma =
> > > list_first_entry(&target_i915_obj->vma_list,
> > > typeof(*vma), vma_link);
> > > - vma->bind_vma(vma, target_i915_obj->cache_level, GLOBAL_BIND);
> > > + i915_gem_vma_bind(vma, target_i915_obj->cache_level,
> > > + GLOBAL_BIND);
> > > }
> > >
> > > /* Validate that the target is in a valid r/w GPU domain */
> > > @@ -1266,7 +1267,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > * allocate space first */
> > > struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> > > BUG_ON(!vma);
> > > - vma->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > > + i915_gem_vma_bind(vma, batch_obj->cache_level, GLOBAL_BIND);
> > > }
> > >
> > > if (flags & I915_DISPATCH_SECURE)
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 79ae83b..a66cb6a 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -1319,10 +1319,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > > * without telling our object about it. So we need to fake it.
> > > */
> > > obj->has_global_gtt_mapping = 0;
> > > - vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> > > + i915_gem_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
> > > }
> > >
> > > -
> > > if (INTEL_INFO(dev)->gen >= 8) {
> > > gen8_setup_private_ppat(dev_priv);
> > > return;
> > > @@ -1986,6 +1985,19 @@ int i915_gem_gtt_init(struct drm_device *dev)
> > > return 0;
> > > }
> > >
> > > +void i915_gem_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> > > + unsigned flags)
> > > +{
> > > + trace_i915_vma_bind(vma, flags);
> > > + vma->bind_vma(vma, cache_level, flags);
> > > +}
> > > +
> > > +void i915_gem_vma_unbind(struct i915_vma *vma)
> > > +{
> > > + trace_i915_vma_unbind(vma);
> > > + vma->unbind_vma(vma);
> > > +}
> > > +
> > > static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > > struct i915_address_space *vm)
> > > {
> > > --
> > > 1.9.2
> > >
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
> --
> Ben Widawsky, Intel Open Source Technology Center
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/9] drm/i915: Extract node allocation from bind
2014-05-07 16:00 ` Ben Widawsky
@ 2014-05-07 16:55 ` Chris Wilson
2014-05-07 17:30 ` Ben Widawsky
0 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2014-05-07 16:55 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, May 07, 2014 at 09:00:16AM -0700, Ben Widawsky wrote:
> On Wed, May 07, 2014 at 04:53:08PM +0100, Chris Wilson wrote:
> > On Wed, May 07, 2014 at 08:45:38AM -0700, Ben Widawsky wrote:
> > > Hmm. I really don't see what's actually upsetting. Can you be a bit more
> > > explicit about what's so bothersome to you for my edification?
> >
> > evict_something() makes assumptions about the ranges of the vm which it
> > searches. At the moment, they mirror its parent function's.
>
> Ah, thanks. So is plumbing a starting eviction offset into evict_something() an
> appealing solution here?
Yes. I have to admit to not being overly sold on the code migration yet.
I guess you have an ulterior motive... Evictable vm?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/9] drm/i915: Extract node allocation from bind
2014-05-07 16:55 ` Chris Wilson
@ 2014-05-07 17:30 ` Ben Widawsky
0 siblings, 0 replies; 23+ messages in thread
From: Ben Widawsky @ 2014-05-07 17:30 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Wed, May 07, 2014 at 05:55:00PM +0100, Chris Wilson wrote:
> On Wed, May 07, 2014 at 09:00:16AM -0700, Ben Widawsky wrote:
> > On Wed, May 07, 2014 at 04:53:08PM +0100, Chris Wilson wrote:
> > > On Wed, May 07, 2014 at 08:45:38AM -0700, Ben Widawsky wrote:
> > > > Hmm. I really don't see what's actually upsetting. Can you be a bit more
> > > > explicit about what's so bothersome to you for my edification?
> > >
> > > evict_something() makes assumptions about the ranges of the vm which it
> > > searches. At the moment, they mirror its parent function's.
> >
> > Ah, thanks. So is plumbing a starting eviction offset into evict_something() an
> > appealing solution here?
>
> Yes. I have to admit to not being overly sold on the code migration yet.
> I guess you have an ulterior motive... Evictable vm?
> -Chris
>
The only immediate goal is to be able to get this bit of logic out of
bind_to_vm, so I can use "preallocated" nodes a bit more cleanly. At
least for the present, I have no plans for it other than that. I did
like the reuse of the PDE allocation for gen7.
If I think of a better reason, I'll let you know.
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2014-05-07 17:31 UTC | newest]
Thread overview: 23+ messages
2014-05-07 5:21 [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Ben Widawsky
2014-05-07 5:21 ` [PATCH 2/9] drm/i915: Extract node allocation from bind Ben Widawsky
2014-05-07 7:02 ` Chris Wilson
2014-05-07 15:45 ` Ben Widawsky
2014-05-07 15:53 ` Chris Wilson
2014-05-07 16:00 ` Ben Widawsky
2014-05-07 16:55 ` Chris Wilson
2014-05-07 17:30 ` Ben Widawsky
2014-05-07 5:21 ` [PATCH 3/9] drm/i915: WARN on unexpected return from drm_mm Ben Widawsky
2014-05-07 5:21 ` [PATCH 4/9] drm/i915: Limit the number of node allocation retries Ben Widawsky
2014-05-07 7:49 ` Daniel Vetter
2014-05-07 15:21 ` Ben Widawsky
2014-05-07 5:21 ` [PATCH 5/9] drm/i915: Use new drm node allocator for PPGTT Ben Widawsky
2014-05-07 5:21 ` [PATCH 6/9] drm/i915: Wrap VMA binding Ben Widawsky
2014-05-07 7:55 ` Daniel Vetter
2014-05-07 15:54 ` Ben Widawsky
2014-05-07 16:09 ` Daniel Vetter
2014-05-07 5:21 ` [PATCH 7/9] drm/i915: Make aliasing a 2nd class VM Ben Widawsky
2014-05-07 7:56 ` Daniel Vetter
2014-05-07 5:21 ` [PATCH 8/9] drm/i915: Make pin global flags explicit Ben Widawsky
2014-05-07 5:21 ` [PATCH 9/9] drm/i915: Split out aliasing binds Ben Widawsky
2014-05-07 7:59 ` Daniel Vetter
2014-05-07 7:44 ` [PATCH 1/9] drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 Daniel Vetter