* [PATCH 3/8] drm/i915: evict VM instead of everything
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
@ 2013-08-30 23:43 ` Ben Widawsky
2013-08-31 0:04 ` Chris Wilson
0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
When reserving objects during execbuf, it is possible to come across an
object which will not fit given the current fragmentation of the address
space. We do not have any defragmentation support in drm_mm, so the
strategy is instead to evict everything and reallocate the objects.
With the upcoming addition of multiple VMs, there is no point in evicting
everything, since doing so is overkill for the specific case mentioned
above.
Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem_evict.c | 17 ++++++++++++++++-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 +++++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0602c4b..6e8ade0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2088,6 +2088,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev,
unsigned cache_level,
bool mappable,
bool nonblock);
+int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);
int i915_gem_evict_everything(struct drm_device *dev);
/* i915_gem_stolen.c */
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index e9033f0..a3e279d 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -155,7 +155,22 @@ found:
return ret;
}
-static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
+/**
+ * i915_gem_evict_vm - Try to free up VM space
+ *
+ * @vm: Address space to evict from
+ * @do_idle: Whether to idle the GPU (and retire requests) before evicting.
+ *
+ * VM eviction is about freeing up virtual address space. If one wants
+ * fine-grained eviction, see i915_gem_evict_something() instead. In terms of
+ * freeing up actual system memory, this function may not accomplish the
+ * desired result. An object may be shared across multiple address spaces, and
+ * this function does not guarantee that such objects are freed.
+ *
+ * Using do_idle results in a more complete eviction because it retires and
+ * inactivates the currently active BOs.
+ */
+int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
{
struct i915_vma *vma, *next;
int ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index ea4bacb..f0a0bd7 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -546,10 +546,16 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
{
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
+ struct i915_address_space *vm;
struct list_head ordered_vmas;
bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
int retry;
+ if (list_empty(vmas))
+ return 0;
+
+ vm = list_first_entry(vmas, struct i915_vma, exec_list)->vm;
+
INIT_LIST_HEAD(&ordered_vmas);
while (!list_empty(vmas)) {
struct drm_i915_gem_exec_object2 *entry;
@@ -638,7 +644,7 @@ err: /* Decrement pin count for bound objects */
if (ret != -ENOSPC || retry++)
return ret;
- ret = i915_gem_evict_everything(ring->dev);
+ ret = i915_gem_evict_vm(vm, true);
if (ret)
return ret;
} while (1);
--
1.8.4
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 3/8] drm/i915: evict VM instead of everything
2013-08-30 23:43 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
@ 2013-08-31 0:04 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2013-08-31 0:04 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:56PM -0700, Ben Widawsky wrote:
> When reserving objects during execbuf, it is possible to come across an
> object which will not fit given the current fragmentation of the address
> space. We do not have any defragmentation support in drm_mm, so the
> strategy is instead to evict everything and reallocate the objects.
>
> With the upcoming addition of multiple VMs, there is no point in evicting
> everything, since doing so is overkill for the specific case mentioned
> above.
>
> Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
This loses the ENOSPC notification, and the tracepoints. I've just seen you
add tracepoints later.
I'm pretty certain this will be caught by an extra pass through reserve,
and probably worth it for the simplification. Fortunately, we do have
i-g-t coverage for this.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
@ 2013-09-11 21:57 Ben Widawsky
2013-09-11 21:57 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
` (6 more replies)
0 siblings, 7 replies; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
lifted from Daniel:
pread/pwrite isn't about the object's domain at all, but purely about
synchronizing for outstanding rendering. Replacing the call to
set_to_gtt_domain with a wait_rendering would imo improve code
readability. Furthermore we could pimp pread to only block for
outstanding writes and not for reads.
Since you're not the first one to trip over this: Can I volunteer you
for a follow-up patch to fix this?
Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
Notes:
v2: Switch the pwrite patch to use !read_only. This was a typo in the original code. (Chris, Daniel)
drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5b510a3..0a99979 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -41,6 +41,9 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *o
static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj,
bool force);
static __must_check int
+i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
+ bool readonly);
+static __must_check int
i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
struct i915_address_space *vm,
unsigned alignment,
@@ -430,11 +433,9 @@ i915_gem_shmem_pread(struct drm_device *dev,
* optimizes for the case when the gpu will dirty the data
* anyway again before the next pread happens. */
needs_clflush = !cpu_cache_is_coherent(dev, obj->cache_level);
- if (i915_gem_obj_bound_any(obj)) {
- ret = i915_gem_object_set_to_gtt_domain(obj, false);
- if (ret)
- return ret;
- }
+ ret = i915_gem_object_wait_rendering(obj, true);
+ if (ret)
+ return ret;
}
ret = i915_gem_object_get_pages(obj);
@@ -746,11 +747,9 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
* optimizes for the case when the gpu will use the data
* right away and we therefore have to clflush anyway. */
needs_clflush_after = cpu_write_needs_clflush(obj);
- if (i915_gem_obj_bound_any(obj)) {
- ret = i915_gem_object_set_to_gtt_domain(obj, true);
- if (ret)
- return ret;
- }
+ ret = i915_gem_object_wait_rendering(obj, false);
+ if (ret)
+ return ret;
}
/* Same trick applies to invalidate partially written cachelines read
* before writing. */
--
1.8.4
* [PATCH 2/8] drm/i915: Extract vm specific part of eviction
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
2013-09-11 21:57 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
` (5 subsequent siblings)
6 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
As we'll see in the next patch, being able to evict just a single VM is
handy.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_evict.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index cc8974f..e9033f0 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -155,12 +155,31 @@ found:
return ret;
}
+static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
+{
+ struct i915_vma *vma, *next;
+ int ret;
+
+ if (do_idle) {
+ ret = i915_gpu_idle(vm->dev);
+ if (ret)
+ return ret;
+
+ i915_gem_retire_requests(vm->dev);
+ }
+
+ list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
+ if (vma->obj->pin_count == 0)
+ WARN_ON(i915_vma_unbind(vma));
+
+ return 0;
+}
+
int
i915_gem_evict_everything(struct drm_device *dev)
{
drm_i915_private_t *dev_priv = dev->dev_private;
struct i915_address_space *vm;
- struct i915_vma *vma, *next;
bool lists_empty = true;
int ret;
@@ -187,11 +206,8 @@ i915_gem_evict_everything(struct drm_device *dev)
i915_gem_retire_requests(dev);
/* Having flushed everything, unbind() should never raise an error */
- list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
- list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
- if (vma->obj->pin_count == 0)
- WARN_ON(i915_vma_unbind(vma));
- }
+ list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+ WARN_ON(i915_gem_evict_vm(vm, false));
return 0;
}
--
1.8.4
* [PATCH 3/8] drm/i915: evict VM instead of everything
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
2013-09-11 21:57 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
2013-09-11 22:45 ` Daniel Vetter
2013-09-11 21:57 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
` (4 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
When reserving objects during execbuf, it is possible to come across an
object which will not fit given the current fragmentation of the address
space. We do not have any defragmentation support in drm_mm, so the
strategy is instead to evict everything and reallocate the objects.
With the upcoming addition of multiple VMs, there is no point in evicting
everything, since doing so is overkill for the specific case mentioned
above.
Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem_evict.c | 17 ++++++++++++++++-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 +++++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 81ba5bb..7caf71d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2106,6 +2106,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev,
unsigned cache_level,
bool mappable,
bool nonblock);
+int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);
int i915_gem_evict_everything(struct drm_device *dev);
/* i915_gem_stolen.c */
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index e9033f0..a3e279d 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -155,7 +155,22 @@ found:
return ret;
}
-static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
+/**
+ * i915_gem_evict_vm - Try to free up VM space
+ *
+ * @vm: Address space to evict from
+ * @do_idle: Whether to idle the GPU (and retire requests) before evicting.
+ *
+ * VM eviction is about freeing up virtual address space. If one wants
+ * fine-grained eviction, see i915_gem_evict_something() instead. In terms of
+ * freeing up actual system memory, this function may not accomplish the
+ * desired result. An object may be shared across multiple address spaces, and
+ * this function does not guarantee that such objects are freed.
+ *
+ * Using do_idle results in a more complete eviction because it retires and
+ * inactivates the currently active BOs.
+ */
+int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
{
struct i915_vma *vma, *next;
int ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c8a01c1..ee93357 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -549,10 +549,16 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
{
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
+ struct i915_address_space *vm;
struct list_head ordered_vmas;
bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
int retry;
+ if (list_empty(vmas))
+ return 0;
+
+ vm = list_first_entry(vmas, struct i915_vma, exec_list)->vm;
+
INIT_LIST_HEAD(&ordered_vmas);
while (!list_empty(vmas)) {
struct drm_i915_gem_exec_object2 *entry;
@@ -641,7 +647,7 @@ err: /* Decrement pin count for bound objects */
if (ret != -ENOSPC || retry++)
return ret;
- ret = i915_gem_evict_everything(ring->dev);
+ ret = i915_gem_evict_vm(vm, true);
if (ret)
return ret;
} while (1);
--
1.8.4
* [PATCH 4/8] drm/i915: trace vm eviction instead of everything
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
2013-09-11 21:57 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
2013-09-11 21:57 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
2013-09-11 22:02 ` Chris Wilson
2013-09-11 21:57 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
` (3 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
VM eviction is really the event we care about tracing. For the cases where
we evict everything, we will still get the trace.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_evict.c | 2 ++
drivers/gpu/drm/i915/i915_trace.h | 15 +++++++++++++++
2 files changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index a3e279d..b46f7b9 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -175,6 +175,8 @@ int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
struct i915_vma *vma, *next;
int ret;
+ trace_i915_gem_evict_vm(vm);
+
if (do_idle) {
ret = i915_gpu_idle(vm->dev);
if (ret)
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index e2c5ee6..c1a24af 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -233,6 +233,21 @@ TRACE_EVENT(i915_gem_evict_everything,
TP_printk("dev=%d", __entry->dev)
);
+TRACE_EVENT(i915_gem_evict_vm,
+ TP_PROTO(struct i915_address_space *vm),
+ TP_ARGS(vm),
+
+ TP_STRUCT__entry(
+ __field(struct i915_address_space *, vm)
+ ),
+
+ TP_fast_assign(
+ __entry->vm = vm;
+ ),
+
+ TP_printk("vm=%p", __entry->vm)
+);
+
TRACE_EVENT(i915_gem_ring_dispatch,
TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags),
TP_ARGS(ring, seqno, flags),
--
1.8.4
* [PATCH 5/8] drm/i915: Convert active API to VMA
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
` (2 preceding siblings ...)
2013-09-11 21:57 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
2013-09-11 22:14 ` Chris Wilson
2013-09-11 21:57 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
` (2 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
Even though we track activity per object rather than per VMA, the
active_list is based on the VM, so it makes the most sense to use VMAs
in the APIs.
NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
now, leave them be.
v2: Remove leftover hunk from the previous patch which didn't keep
i915_gem_object_move_to_active. That patch had to rely on the ring to
get the dev instead of the obj. (Chris)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 5 ++---
drivers/gpu/drm/i915/i915_gem.c | 9 ++++++++-
drivers/gpu/drm/i915/i915_gem_context.c | 8 ++++----
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +--
4 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7caf71d..09a72c8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1884,9 +1884,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
int i915_gem_object_sync(struct drm_i915_gem_object *obj,
struct intel_ring_buffer *to);
-void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
- struct intel_ring_buffer *ring);
-
+void i915_vma_move_to_active(struct i915_vma *vma,
+ struct intel_ring_buffer *ring);
int i915_gem_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
struct drm_mode_create_dumb *args);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0a99979..5796e31 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1913,7 +1913,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
return 0;
}
-void
+static void
i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
struct intel_ring_buffer *ring)
{
@@ -1952,6 +1952,13 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
}
}
+void i915_vma_move_to_active(struct i915_vma *vma,
+ struct intel_ring_buffer *ring)
+{
+ list_move_tail(&vma->mm_list, &vma->vm->active_list);
+ return i915_gem_object_move_to_active(vma->obj, ring);
+}
+
static void
i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
{
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 26c3fcc..b71649a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -436,11 +436,11 @@ static int do_switch(struct i915_hw_context *to)
* MI_SET_CONTEXT instead of when the next seqno has completed.
*/
if (from != NULL) {
- struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
- struct i915_address_space *ggtt = &dev_priv->gtt.base;
+ struct drm_i915_private *dev_priv = ring->dev->dev_private;
+ struct i915_vma *vma =
+ i915_gem_obj_to_vma(from->obj, &dev_priv->gtt.base);
from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
- list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
- i915_gem_object_move_to_active(from->obj, ring);
+ i915_vma_move_to_active(vma, ring);
/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
* whole damn pipeline, we don't need to explicitly mark the
* object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index ee93357..b26d979 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -872,8 +872,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
obj->base.read_domains = obj->base.pending_read_domains;
obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
- list_move_tail(&vma->mm_list, &vma->vm->active_list);
- i915_gem_object_move_to_active(obj, ring);
+ i915_vma_move_to_active(vma, ring);
if (obj->base.write_domain) {
obj->dirty = 1;
obj->last_write_seqno = intel_ring_get_seqno(ring);
--
1.8.4
* [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
` (3 preceding siblings ...)
2013-09-11 21:57 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
2013-09-11 22:25 ` Chris Wilson
2013-09-11 21:57 ` [PATCH 7/8] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-09-11 21:57 ` [PATCH 8/8] drm/i915: eliminate vm->insert_entries() Ben Widawsky
6 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply call vm->bind_vma(), without having to worry about
distinguishing PPGTT vs. GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
Notes:
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read (Ville).
Change logic in unbind to only unbind ggtt when there is a global mapping, and
to remove a redundant check if the aliasing ppgtt exists.
drivers/gpu/drm/i915/i915_drv.h | 69 +++++++++++++-----------
drivers/gpu/drm/i915/i915_gem_gtt.c | 102 ++++++++++++++++++++++++++++++++++++
2 files changed, 141 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 09a72c8..a67b9a0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -465,6 +465,36 @@ enum i915_cache_level {
typedef uint32_t gen6_gtt_pte_t;
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+ struct drm_mm_node node;
+ struct drm_i915_gem_object *obj;
+ struct i915_address_space *vm;
+
+ /** This object's place on the active/inactive lists */
+ struct list_head mm_list;
+
+ struct list_head vma_link; /* Link in the object's VMA list */
+
+ /** This vma's place in the batchbuffer or on the eviction list */
+ struct list_head exec_list;
+
+ /**
+ * Used for performing relocations during execbuffer insertion.
+ */
+ struct hlist_node exec_node;
+ unsigned long exec_handle;
+ struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
struct i915_address_space {
struct drm_mm mm;
struct drm_device *dev;
@@ -503,9 +533,18 @@ struct i915_address_space {
/* FIXME: Need a more generic return type */
gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
enum i915_cache_level level);
+
+ /** Unmap an object from an address space. This usually consists of
+ * setting the valid PTE entries to a reserved scratch page. */
+ void (*unbind_vma)(struct i915_vma *vma);
void (*clear_range)(struct i915_address_space *vm,
unsigned int first_entry,
unsigned int num_entries);
+ /* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+ void (*bind_vma)(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags);
void (*insert_entries)(struct i915_address_space *vm,
struct sg_table *st,
unsigned int first_entry,
@@ -552,36 +591,6 @@ struct i915_hw_ppgtt {
int (*enable)(struct drm_device *dev);
};
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
- struct drm_mm_node node;
- struct drm_i915_gem_object *obj;
- struct i915_address_space *vm;
-
- /** This object's place on the active/inactive lists */
- struct list_head mm_list;
-
- struct list_head vma_link; /* Link in the object's VMA list */
-
- /** This vma's place in the batchbuffer or on the eviction list */
- struct list_head exec_list;
-
- /**
- * Used for performing relocations during execbuffer insertion.
- */
- struct hlist_node exec_node;
- unsigned long exec_handle;
- struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
struct i915_ctx_hang_stats {
/* This context had batch pending when hang was declared */
unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 212f6d8..1d4e430 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -57,6 +57,11 @@
#define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
enum i915_cache_level level)
{
@@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
ppgtt->enable = gen6_ppgtt_enable;
+ ppgtt->base.unbind_vma = NULL;
ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+ ppgtt->base.bind_vma = NULL;
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
cache_level);
}
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ WARN_ON(flags);
+
+ gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
struct drm_i915_gem_object *obj)
{
@@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
obj->base.size >> PAGE_SHIFT);
}
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ gen6_ppgtt_clear_range(vma->vm, entry,
+ vma->obj->base.size >> PAGE_SHIFT);
+}
+
extern int intel_iommu_gfx_mapped;
/* Certain Gen5 chipsets require require idling the GPU before
* unmapping anything from the GTT when VT-d is enabled.
@@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
}
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 unused)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+ unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+ AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+ BUG_ON(!i915_is_ggtt(vma->vm));
+ intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+ vma->obj->has_global_gtt_mapping = 1;
+}
+
static void i915_ggtt_clear_range(struct i915_address_space *vm,
unsigned int first_entry,
unsigned int num_entries)
@@ -599,6 +639,43 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
intel_gtt_clear_range(first_entry, num_entries);
}
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+ const unsigned int first = vma->node.start >> PAGE_SHIFT;
+ const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+ BUG_ON(!i915_is_ggtt(vma->vm));
+ vma->obj->has_global_gtt_mapping = 0;
+ intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+ struct drm_device *dev = vma->vm->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+ * the global, just use aliasing */
+ if (!dev_priv->mm.aliasing_ppgtt ||
+ (flags & GLOBAL_BIND) ||
+ obj->has_global_gtt_mapping) {
+ gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
+ cache_level);
+ obj->has_global_gtt_mapping = 1;
+ }
+
+ /* Put the mapping in the aliasing PPGTT as well when we have one, even
+ * if we also bound it globally above. */
+ if (dev_priv->mm.aliasing_ppgtt) {
+ gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+ vma->obj->pages, entry, cache_level);
+ vma->obj->has_aliasing_ppgtt_mapping = 1;
+ }
+}
void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
enum i915_cache_level cache_level)
@@ -627,6 +704,27 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
obj->has_global_gtt_mapping = 0;
}
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+ struct drm_device *dev = vma->vm->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ if (obj->has_global_gtt_mapping) {
+ gen6_ggtt_clear_range(vma->vm, entry,
+ vma->obj->base.size >> PAGE_SHIFT);
+ obj->has_global_gtt_mapping = 0;
+ }
+
+ if (obj->has_aliasing_ppgtt_mapping) {
+ gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+ entry,
+ obj->base.size >> PAGE_SHIFT);
+ obj->has_aliasing_ppgtt_mapping = 0;
+ }
+}
+
void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
{
struct drm_device *dev = obj->base.dev;
@@ -860,7 +958,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
DRM_ERROR("Scratch setup failed\n");
dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+ dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+ dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
return ret;
}
@@ -892,7 +992,9 @@ static int i915_gmch_probe(struct drm_device *dev,
dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+ dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+ dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
return 0;
}
--
1.8.4
* [PATCH 7/8] drm/i915: Use the new vm [un]bind functions
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
` (4 preceding siblings ...)
2013-09-11 21:57 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
2013-09-11 22:39 ` Chris Wilson
2013-09-11 21:57 ` [PATCH 8/8] drm/i915: eliminate vm->insert_entries() Ben Widawsky
6 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart GGTT which can do aliasing as needed.
Make sure we bind to the global GTT when mappable and fenceable; I thought
we could get away without this initially, but we cannot.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 9 ------
drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++-----------
drivers/gpu/drm/i915/i915_gem_context.c | 8 +++--
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++----------
drivers/gpu/drm/i915/i915_gem_gtt.c | 48 ++----------------------------
5 files changed, 34 insertions(+), 91 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a67b9a0..8dfe6e6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2083,17 +2083,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
/* i915_gem_gtt.c */
void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj);
-
void i915_gem_restore_gtt_mappings(struct drm_device *dev);
int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
void i915_gem_init_global_gtt(struct drm_device *dev);
void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5796e31..3cd6c21 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2696,12 +2696,8 @@ int i915_vma_unbind(struct i915_vma *vma)
trace_i915_vma_unbind(vma);
- if (obj->has_global_gtt_mapping)
- i915_gem_gtt_unbind_object(obj);
- if (obj->has_aliasing_ppgtt_mapping) {
- i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
- obj->has_aliasing_ppgtt_mapping = 0;
- }
+ vma->vm->unbind_vma(vma);
+
i915_gem_gtt_finish_object(obj);
i915_gem_object_unpin_pages(obj);
@@ -3428,7 +3424,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
enum i915_cache_level cache_level)
{
struct drm_device *dev = obj->base.dev;
- drm_i915_private_t *dev_priv = dev->dev_private;
struct i915_vma *vma;
int ret;
@@ -3467,11 +3462,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
return ret;
}
- if (obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(obj, cache_level);
- if (obj->has_aliasing_ppgtt_mapping)
- i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
- obj, cache_level);
+ list_for_each_entry(vma, &obj->vma_list, vma_link)
+ vma->vm->bind_vma(vma, cache_level, 0);
}
list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3799,6 +3791,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
bool map_and_fenceable,
bool nonblocking)
{
+ const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
struct i915_vma *vma;
int ret;
@@ -3827,20 +3820,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
}
if (!i915_gem_obj_bound(obj, vm)) {
- struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
map_and_fenceable,
nonblocking);
if (ret)
return ret;
- if (!dev_priv->mm.aliasing_ppgtt)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
- }
+ vma = i915_gem_obj_to_vma(obj, vm);
+ vm->bind_vma(vma, obj->cache_level, flags);
+ } else
+ vma = i915_gem_obj_to_vma(obj, vm);
+ /* Objects are created map and fenceable. If we bind an object
+ * the first time, and we had aliasing PPGTT (and didn't request
+ * GLOBAL), we'll need to do this on the second bind. */
if (!obj->has_global_gtt_mapping && map_and_fenceable)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
obj->pin_count++;
obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b71649a..ce62a4a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
static int do_switch(struct i915_hw_context *to)
{
struct intel_ring_buffer *ring = to->ring;
+ struct drm_i915_private *dev_priv = ring->dev->dev_private;
struct i915_hw_context *from = ring->last_context;
u32 hw_flags = 0;
int ret;
@@ -415,8 +416,11 @@ static int do_switch(struct i915_hw_context *to)
return ret;
}
- if (!to->obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+ if (!to->obj->has_global_gtt_mapping) {
+ struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+ &dev_priv->gtt.base);
+ vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+ }
if (!to->is_initialized || is_default_context(to))
hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..bfe8cef 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
if (unlikely(IS_GEN6(dev) &&
reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
!target_i915_obj->has_global_gtt_mapping)) {
- i915_gem_gtt_bind_object(target_i915_obj,
- target_i915_obj->cache_level);
+ struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+ vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+ GLOBAL_BIND);
}
/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
struct intel_ring_buffer *ring,
bool *need_reloc)
{
- struct drm_i915_private *dev_priv = ring->dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
bool need_fence, need_mappable;
- struct drm_i915_gem_object *obj = vma->obj;
+ u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+ !vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
int ret;
need_fence =
@@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
}
}
- /* Ensure ppgtt mapping exists if needed */
- if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
- i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
- obj, obj->cache_level);
-
- obj->has_aliasing_ppgtt_mapping = 1;
- }
-
if (entry->offset != vma->node.start) {
entry->offset = vma->node.start;
*need_reloc = true;
@@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
}
- if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
- !obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vma->vm->bind_vma(vma, obj->cache_level, flags);
return 0;
}
@@ -1117,8 +1109,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
* batch" bit. Hence we need to pin secure batches into the global gtt.
* hsw should have this fixed, but let's be paranoid and do it
* unconditionally for now. */
- if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+ if (flags & I915_DISPATCH_SECURE &&
+ !batch_obj->has_global_gtt_mapping) {
+ struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
+ vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+ }
ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1d4e430..714f836 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
dev_priv->mm.aliasing_ppgtt = NULL;
}
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level)
-{
- ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
- i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
- cache_level);
-}
-
static void __always_unused
gen6_ppgtt_bind_vma(struct i915_vma *vma,
enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
}
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj)
-{
- ppgtt->base.clear_range(&ppgtt->base,
- i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
- obj->base.size >> PAGE_SHIFT);
-}
-
static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
{
const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
dev_priv->gtt.base.total / PAGE_SIZE);
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+ struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+ &dev_priv->gtt.base);
i915_gem_clflush_object(obj, obj->pin_display);
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vma->vm->bind_vma(vma, obj->cache_level, 0);
}
i915_gem_chipset_flush(dev);
@@ -677,33 +662,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
}
}
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level)
-{
- struct drm_device *dev = obj->base.dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
- dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
- entry,
- cache_level);
-
- obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
- struct drm_device *dev = obj->base.dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
- dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
- entry,
- obj->base.size >> PAGE_SHIFT);
-
- obj->has_global_gtt_mapping = 0;
-}
-
static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
{
struct drm_device *dev = vma->vm->dev;
--
1.8.4
* [PATCH 8/8] drm/i915: eliminate vm->insert_entries()
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
` (5 preceding siblings ...)
2013-09-11 21:57 ` [PATCH 7/8] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
6 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
With bind/unbind function pointers in place, we no longer need
insert_entries. We would like to remove clear_range as well, but that is
not straightforward at this point, since it is still used in a couple of
places that don't deal only in objects: setup, ppgtt init, and restoring
gtt mappings.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 17 +----------------
1 file changed, 1 insertion(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 714f836..2dc1057 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -339,8 +339,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->enable = gen6_ppgtt_enable;
ppgtt->base.unbind_vma = NULL;
ppgtt->base.clear_range = gen6_ppgtt_clear_range;
- ppgtt->base.bind_vma = NULL;
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+ ppgtt->base.bind_vma = NULL;
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
ppgtt->base.scratch = dev_priv->gtt.base.scratch;
ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
@@ -591,19 +591,6 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
readl(gtt_base);
}
-
-static void i915_ggtt_insert_entries(struct i915_address_space *vm,
- struct sg_table *st,
- unsigned int pg_start,
- enum i915_cache_level cache_level)
-{
- unsigned int flags = (cache_level == I915_CACHE_NONE) ?
- AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
-
- intel_gtt_insert_sg_entries(st, pg_start, flags);
-
-}
-
static void i915_ggtt_bind_vma(struct i915_vma *vma,
enum i915_cache_level cache_level,
u32 unused)
@@ -917,7 +904,6 @@ static int gen6_gmch_probe(struct drm_device *dev,
dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
- dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
return ret;
@@ -951,7 +937,6 @@ static int i915_gmch_probe(struct drm_device *dev,
dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
- dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
return 0;
--
1.8.4
* Re: [PATCH 4/8] drm/i915: trace vm eviction instead of everything
2013-09-11 21:57 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
@ 2013-09-11 22:02 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2013-09-11 22:02 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 02:57:51PM -0700, Ben Widawsky wrote:
> Tracing vm eviction is really the event we care about. For the cases we
> evict everything, we still will get the trace.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
> drivers/gpu/drm/i915/i915_gem_evict.c | 2 ++
> drivers/gpu/drm/i915/i915_trace.h | 15 +++++++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index a3e279d..b46f7b9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -175,6 +175,8 @@ int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
> struct i915_vma *vma, *next;
> int ret;
>
> + trace_i915_gem_evict_vm(vm);
> +
> if (do_idle) {
> ret = i915_gpu_idle(vm->dev);
> if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index e2c5ee6..c1a24af 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -233,6 +233,21 @@ TRACE_EVENT(i915_gem_evict_everything,
> TP_printk("dev=%d", __entry->dev)
> );
>
> +TRACE_EVENT(i915_gem_evict_vm,
> + TP_PROTO(struct i915_address_space *vm),
> + TP_ARGS(vm),
> +
> + TP_STRUCT__entry(
> + __field(struct i915_address_space *, vm)
> + ),
> +
> + TP_fast_assign(
> + __entry->vm = vm;
> + ),
> +
> + TP_printk("vm=%p", __entry->vm)
> +);
Jesse drilled into me that we must never assume that we are the solitary
device on the system. Do we want to add dev here or remove all the
others?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 5/8] drm/i915: Convert active API to VMA
2013-09-11 21:57 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
@ 2013-09-11 22:14 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2013-09-11 22:14 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 02:57:52PM -0700, Ben Widawsky wrote:
> if (from != NULL) {
> - struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
> - struct i915_address_space *ggtt = &dev_priv->gtt.base;
> + struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct i915_vma *vma =
> + i915_gem_obj_to_vma(from->obj, &dev_priv->gtt.base);
> from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> - list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
> - i915_gem_object_move_to_active(from->obj, ring);
> + i915_vma_move_to_active(vma, ring);
We fairly often look up the ggtt vma for an obj. Is this worth a
specialised function call?
i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->obj), ring);
struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
{
struct i915_vma *vma;
if (WARN_ON(list_empty(&obj->vma_list)))
return NULL;
vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
if (WARN_ON(vma->vm != &to_i915(obj->base.dev)->gtt.base))
return NULL;
return vma;
}
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-11 21:57 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-09-11 22:25 ` Chris Wilson
2013-09-11 22:49 ` Daniel Vetter
0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2013-09-11 22:25 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 02:57:53PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
>
> As we plumb the code with more VM information, it has become more
> obvious that the easiest way to deal with bind and unbind is to simply
> put the function pointers in the vm, and let those choose the correct
> way to handle the page table updates. This change allows many places in
> the code to simply be vm->bind, and not have to worry about
> distinguishing PPGTT vs GGTT.
>
> Notice that this patch has no impact on functionality. I've decided to
> save the actual change until the next patch because I think it's easier
> to review that way. I'm happy to squash the two, or let Daniel do it on
> merge.
Oh, this is disappointing. The GLOBAL_BIND flag is a let down. Is there
any way we can have the aliasing_ppgtt as a separate vm and so keep the
purity?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 7/8] drm/i915: Use the new vm [un]bind functions
2013-09-11 21:57 ` [PATCH 7/8] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-09-11 22:39 ` Chris Wilson
2013-09-14 1:08 ` Ben Widawsky
0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2013-09-11 22:39 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 02:57:54PM -0700, Ben Widawsky wrote:
> @@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> struct intel_ring_buffer *ring,
> bool *need_reloc)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct drm_i915_gem_object *obj = vma->obj;
> struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
> bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
> bool need_fence, need_mappable;
> - struct drm_i915_gem_object *obj = vma->obj;
> + u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
> + !vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
> int ret;
>
> need_fence =
> @@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> }
> }
>
> - /* Ensure ppgtt mapping exists if needed */
> - if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> - i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> - obj, obj->cache_level);
> -
> - obj->has_aliasing_ppgtt_mapping = 1;
> - }
> -
> if (entry->offset != vma->node.start) {
> entry->offset = vma->node.start;
> *need_reloc = true;
> @@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
> }
>
> - if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
> - !obj->has_global_gtt_mapping)
> - i915_gem_gtt_bind_object(obj, obj->cache_level);
> + vma->vm->bind_vma(vma, obj->cache_level, flags);
This results in us rebinding into the vma unconditionally once/twice for
every object on every execbuffer, right?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 3/8] drm/i915: evict VM instead of everything
2013-09-11 21:57 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
@ 2013-09-11 22:45 ` Daniel Vetter
0 siblings, 0 replies; 18+ messages in thread
From: Daniel Vetter @ 2013-09-11 22:45 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 02:57:50PM -0700, Ben Widawsky wrote:
> When reserving objects during execbuf, it is possible to come across an
> object which will not fit given the current fragmentation of the address
> space. We do not have any defragment in drm_mm, so the strategy is to
> instead evict everything, and reallocate objects.
>
> With the upcoming addition of multiple VMs, there is no point to evict
> everything since doing so is overkill for the specific case mentioned
> above.
>
> Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Merged the first three patches (with a tiny fixup for this one, since
you forgot to update one comment). This leaves us with a short-lived lack
of a tracepoint for evict_vm, but Chris has imo a good point.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_drv.h | 1 +
> drivers/gpu/drm/i915/i915_gem_evict.c | 17 ++++++++++++++++-
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 +++++++-
> 3 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 81ba5bb..7caf71d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2106,6 +2106,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev,
> unsigned cache_level,
> bool mappable,
> bool nonblock);
> +int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);
> int i915_gem_evict_everything(struct drm_device *dev);
>
> /* i915_gem_stolen.c */
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index e9033f0..a3e279d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -155,7 +155,22 @@ found:
> return ret;
> }
>
> -static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
> +/**
> + * i915_gem_evict_vm - Try to free up VM space
> + *
> + * @vm: Address space to evict from
> + * @do_idle: Boolean directing whether to idle first.
> + *
> + * VM eviction is about freeing up virtual address space. If one wants fine
> + * grained eviction, they should see evict something for more details. In terms
> + * of freeing up actual system memory, this function may not accomplish the
> + * desired result. An object may be shared in multiple address space, and this
> + * function will not assert those objects be freed.
> + *
> + * Using do_idle will result in a more complete eviction because it retires, and
> + * inactivates current BOs.
> + */
> +int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
> {
> struct i915_vma *vma, *next;
> int ret;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index c8a01c1..ee93357 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -549,10 +549,16 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> {
> struct drm_i915_gem_object *obj;
> struct i915_vma *vma;
> + struct i915_address_space *vm;
> struct list_head ordered_vmas;
> bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
> int retry;
>
> + if (list_empty(vmas))
> + return 0;
> +
> + vm = list_first_entry(vmas, struct i915_vma, exec_list)->vm;
> +
> INIT_LIST_HEAD(&ordered_vmas);
> while (!list_empty(vmas)) {
> struct drm_i915_gem_exec_object2 *entry;
> @@ -641,7 +647,7 @@ err: /* Decrement pin count for bound objects */
> if (ret != -ENOSPC || retry++)
> return ret;
>
> - ret = i915_gem_evict_everything(ring->dev);
> + ret = i915_gem_evict_vm(vm, true);
> if (ret)
> return ret;
> } while (1);
> --
> 1.8.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-11 22:25 ` Chris Wilson
@ 2013-09-11 22:49 ` Daniel Vetter
0 siblings, 0 replies; 18+ messages in thread
From: Daniel Vetter @ 2013-09-11 22:49 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 11:25:50PM +0100, Chris Wilson wrote:
> On Wed, Sep 11, 2013 at 02:57:53PM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> >
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> >
> > Notice that this patch has no impact on functionality. I've decided to
> > save the actual change until the next patch because I think it's easier
> > to review that way. I'm happy to squash the two, or let Daniel do it on
> > merge.
>
> Oh, this is disappointing. The GLOBAL_BIND flag is a let down. Is there
> any way we can have the aliasing_ppgtt as a separate vm and so keep the
> purity?
The problem is that the vm manages the address space, and it's paramount
on snb that we can line up objects into both the ggtt and ppgtt at will at
the exact same offset. Otherwise at least pipe control w/a won't work and
reading between the lines of the bspec suggests more things that will break.
Adding complex logic to punt out such holes doesn't sound cleaner or
simpler than the magic flags we're using here. And I have to admit that I
couldn't come up with another idea (besides flat-out killing aliasing
ppgtt on snb). So ideas welcome ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH 7/8] drm/i915: Use the new vm [un]bind functions
2013-09-11 22:39 ` Chris Wilson
@ 2013-09-14 1:08 ` Ben Widawsky
2013-09-14 15:31 ` Chris Wilson
0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2013-09-14 1:08 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Wed, Sep 11, 2013 at 11:39:30PM +0100, Chris Wilson wrote:
> On Wed, Sep 11, 2013 at 02:57:54PM -0700, Ben Widawsky wrote:
> > @@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> > struct intel_ring_buffer *ring,
> > bool *need_reloc)
> > {
> > - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > + struct drm_i915_gem_object *obj = vma->obj;
> > struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
> > bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
> > bool need_fence, need_mappable;
> > - struct drm_i915_gem_object *obj = vma->obj;
> > + u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
> > + !vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
> > int ret;
> >
> > need_fence =
> > @@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> > }
> > }
> >
> > - /* Ensure ppgtt mapping exists if needed */
> > - if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> > - i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > - obj, obj->cache_level);
> > -
> > - obj->has_aliasing_ppgtt_mapping = 1;
> > - }
> > -
> > if (entry->offset != vma->node.start) {
> > entry->offset = vma->node.start;
> > *need_reloc = true;
> > @@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> > obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
> > }
> >
> > - if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
> > - !obj->has_global_gtt_mapping)
> > - i915_gem_gtt_bind_object(obj, obj->cache_level);
> > + vma->vm->bind_vma(vma, obj->cache_level, flags);
>
> This results in us rebinding into the vma unconditionally once/twice for
> every object on every execbuffer, right?
> -Chris
>
Hmm, how does this suit you (untested)? I think it addresses your concern.
Also, I suspect squashing this patch with the previous is probably a good thing
to do.
static void gen6_ggtt_bind_vma(struct i915_vma *vma,
enum i915_cache_level cache_level,
u32 flags)
{
struct drm_device *dev = vma->vm->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_i915_gem_object *obj = vma->obj;
const unsigned long entry = vma->node.start >> PAGE_SHIFT;
/* Bind into the global GTT when there is no aliasing PPGTT, or when
 * the caller explicitly asked for a global binding. */
if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
/* If the object is unbound, or we're changing the cache bits */
if (!obj->has_global_gtt_mapping ||
(cache_level != obj->cache_level)) {
gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
cache_level);
obj->has_global_gtt_mapping = 1;
}
}
/* When we have an aliasing PPGTT, also put the mapping there, even
 * when the caller requested a global binding. */
if (dev_priv->mm.aliasing_ppgtt &&
(!obj->has_aliasing_ppgtt_mapping ||
(cache_level != obj->cache_level))) {
gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
vma->obj->pages, entry, cache_level);
vma->obj->has_aliasing_ppgtt_mapping = 1;
}
}
--
Ben Widawsky, Intel Open Source Technology Center
* Re: [PATCH 7/8] drm/i915: Use the new vm [un]bind functions
2013-09-14 1:08 ` Ben Widawsky
@ 2013-09-14 15:31 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2013-09-14 15:31 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Sep 13, 2013 at 06:08:17PM -0700, Ben Widawsky wrote:
> On Wed, Sep 11, 2013 at 11:39:30PM +0100, Chris Wilson wrote:
> > On Wed, Sep 11, 2013 at 02:57:54PM -0700, Ben Widawsky wrote:
> > > @@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> > > struct intel_ring_buffer *ring,
> > > bool *need_reloc)
> > > {
> > > - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > + struct drm_i915_gem_object *obj = vma->obj;
> > > struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
> > > bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
> > > bool need_fence, need_mappable;
> > > - struct drm_i915_gem_object *obj = vma->obj;
> > > + u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
> > > + !vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
> > > int ret;
> > >
> > > need_fence =
> > > @@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> > > }
> > > }
> > >
> > > - /* Ensure ppgtt mapping exists if needed */
> > > - if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> > > - i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > > - obj, obj->cache_level);
> > > -
> > > - obj->has_aliasing_ppgtt_mapping = 1;
> > > - }
> > > -
> > > if (entry->offset != vma->node.start) {
> > > entry->offset = vma->node.start;
> > > *need_reloc = true;
> > > @@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> > > obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
> > > }
> > >
> > > - if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
> > > - !obj->has_global_gtt_mapping)
> > > - i915_gem_gtt_bind_object(obj, obj->cache_level);
> > > + vma->vm->bind_vma(vma, obj->cache_level, flags);
> >
> > This results in us rebinding into the vma unconditionally once/twice for
> > every object on every execbuffer, right?
>
> Hmm, how does this suit you (untested)? I think it addresses your concern.
> I also suspect that squashing this patch into the previous one is probably
> a good idea.
That looks like it should do the trick. However, would it not be cleaner
to move the vma->vm->bind_vma() into the pin()?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
End of thread (newest message: 2013-09-14 15:31 UTC). Thread overview: 18+ messages.
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
2013-09-11 21:57 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
2013-09-11 21:57 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
2013-09-11 22:45 ` Daniel Vetter
2013-09-11 21:57 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
2013-09-11 22:02 ` Chris Wilson
2013-09-11 21:57 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
2013-09-11 22:14 ` Chris Wilson
2013-09-11 21:57 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
2013-09-11 22:25 ` Chris Wilson
2013-09-11 22:49 ` Daniel Vetter
2013-09-11 21:57 ` [PATCH 7/8] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-09-11 22:39 ` Chris Wilson
2013-09-14 1:08 ` Ben Widawsky
2013-09-14 15:31 ` Chris Wilson
2013-09-11 21:57 ` [PATCH 8/8] drm/i915: eliminate vm->insert_entries() Ben Widawsky