* [PATCH 0/8] [RESEND] VMA patches
@ 2013-08-30 23:43 Ben Widawsky
2013-08-30 23:43 ` [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
` (7 more replies)
0 siblings, 8 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
Now that [hopefully all] of the fallout from the last round of VMA merging has
been taken care of, I am resending the remaining VMA patches, which are required
for full PPGTT support.
Some of these patches are new as a result of the last round of review. For the
ones which are just a RESEND, I've only rebased and run them on an IVB.
Ben Widawsky (8):
drm/i915: Synchronize pread/pwrite with wait_rendering
drm/i915: Extract vm specific part of eviction
drm/i915: evict VM instead of everything
drm/i915: trace vm eviction instead of everything
drm/i915: Convert active API to VMA
drm/i915: Add bind/unbind object functions to VM
drm/i915: Use the new vm [un]bind functions
drm/i915: eliminate vm->insert_entries()
drivers/gpu/drm/i915/i915_drv.h | 84 ++++++++++-----------
drivers/gpu/drm/i915/i915_gem.c | 61 +++++++--------
drivers/gpu/drm/i915/i915_gem_context.c | 16 ++--
drivers/gpu/drm/i915/i915_gem_evict.c | 47 ++++++++++--
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 40 +++++-----
drivers/gpu/drm/i915/i915_gem_gtt.c | 116 ++++++++++++++++++++---------
drivers/gpu/drm/i915/i915_trace.h | 12 +--
7 files changed, 228 insertions(+), 148 deletions(-)
--
1.8.4
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
@ 2013-08-30 23:43 ` Ben Widawsky
2013-08-30 23:50 ` Chris Wilson
2013-08-30 23:43 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
` (6 subsequent siblings)
7 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
lifted from Daniel:
pread/pwrite isn't about the object's domain at all, but purely about
synchronizing for outstanding rendering. Replacing the call to
set_to_gtt_domain with a wait_rendering would imo improve code
readability. Furthermore we could pimp pread to only block for
outstanding writes and not for reads.
Since you're not the first one to trip over this: Can I volunteer you
for a follow-up patch to fix this?
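Daniel's point can be condensed into a tiny model: a readonly wait (pread) only needs to block on outstanding GPU writes, while a full wait (pwrite) blocks on any outstanding access. This is a toy sketch only; `sketch_obj` and `must_wait` are invented names, not i915 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of outstanding GPU access on an object. */
struct sketch_obj {
	bool gpu_reading;	/* outstanding GPU reads */
	bool gpu_writing;	/* outstanding GPU writes */
};

/* Mirrors the semantics of a wait_rendering(obj, readonly) call:
 * a readonly wait only blocks on GPU writes, a full wait blocks
 * on any outstanding GPU access. */
static bool must_wait(const struct sketch_obj *obj, bool readonly)
{
	if (readonly)
		return obj->gpu_writing;	/* reads only race writes */
	return obj->gpu_reading || obj->gpu_writing;
}
```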
Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f026153..a839bcb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -41,6 +41,9 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *o
static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj,
bool force);
static __must_check int
+i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
+ bool readonly);
+static __must_check int
i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
struct i915_address_space *vm,
unsigned alignment,
@@ -430,11 +433,9 @@ i915_gem_shmem_pread(struct drm_device *dev,
* optimizes for the case when the gpu will dirty the data
* anyway again before the next pread happens. */
needs_clflush = !cpu_cache_is_coherent(dev, obj->cache_level);
- if (i915_gem_obj_bound_any(obj)) {
- ret = i915_gem_object_set_to_gtt_domain(obj, false);
- if (ret)
- return ret;
- }
+ ret = i915_gem_object_wait_rendering(obj, false);
+ if (ret)
+ return ret;
}
ret = i915_gem_object_get_pages(obj);
@@ -746,11 +747,9 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
* optimizes for the case when the gpu will use the data
* right away and we therefore have to clflush anyway. */
needs_clflush_after = cpu_write_needs_clflush(obj);
- if (i915_gem_obj_bound_any(obj)) {
- ret = i915_gem_object_set_to_gtt_domain(obj, true);
- if (ret)
- return ret;
- }
+ ret = i915_gem_object_wait_rendering(obj, false);
+ if (ret)
+ return ret;
}
/* Same trick applies to invalidate partially written cachelines read
* before writing. */
--
1.8.4
* [PATCH 2/8] drm/i915: Extract vm specific part of eviction
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
2013-08-30 23:43 ` [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
@ 2013-08-30 23:43 ` Ben Widawsky
2013-08-30 23:52 ` Chris Wilson
2013-08-30 23:43 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
` (5 subsequent siblings)
7 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
As we'll see in the next patch, being able to evict for just 1 VM is
handy.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_evict.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index cc8974f..e9033f0 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -155,12 +155,31 @@ found:
return ret;
}
+static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
+{
+ struct i915_vma *vma, *next;
+ int ret;
+
+ if (do_idle) {
+ ret = i915_gpu_idle(vm->dev);
+ if (ret)
+ return ret;
+
+ i915_gem_retire_requests(vm->dev);
+ }
+
+ list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
+ if (vma->obj->pin_count == 0)
+ WARN_ON(i915_vma_unbind(vma));
+
+ return 0;
+}
+
int
i915_gem_evict_everything(struct drm_device *dev)
{
drm_i915_private_t *dev_priv = dev->dev_private;
struct i915_address_space *vm;
- struct i915_vma *vma, *next;
bool lists_empty = true;
int ret;
@@ -187,11 +206,8 @@ i915_gem_evict_everything(struct drm_device *dev)
i915_gem_retire_requests(dev);
/* Having flushed everything, unbind() should never raise an error */
- list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
- list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
- if (vma->obj->pin_count == 0)
- WARN_ON(i915_vma_unbind(vma));
- }
+ list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+ WARN_ON(i915_gem_evict_vm(vm, false));
return 0;
}
--
1.8.4
* [PATCH 3/8] drm/i915: evict VM instead of everything
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
2013-08-30 23:43 ` [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
2013-08-30 23:43 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
@ 2013-08-30 23:43 ` Ben Widawsky
2013-08-31 0:04 ` Chris Wilson
2013-08-30 23:43 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
` (4 subsequent siblings)
7 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
When reserving objects during execbuf, it is possible to come across an
object which will not fit given the current fragmentation of the address
space. We do not have any defragmentation support in drm_mm, so the strategy is
to instead evict everything and reallocate the objects.
With the upcoming addition of multiple VMs, there is no point in evicting
everything, since doing so is overkill for the specific case mentioned
above.
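The resulting retry structure — try to reserve, and on -ENOSPC evict only the one VM and retry — can be sketched with a toy model. `toy_vm`, `toy_reserve`, and `toy_evict_vm` are illustrative stand-ins, not the actual i915 functions.

```c
#include <assert.h>
#include <errno.h>

/* Toy address space: space is measured in pages. */
struct toy_vm {
	int free_pages;	/* space left in this address space */
	int evictable;	/* pages reclaimable by unbinding unpinned VMAs */
};

/* Unbind everything unpinned in one VM, reclaiming its space. */
static void toy_evict_vm(struct toy_vm *vm)
{
	vm->free_pages += vm->evictable;
	vm->evictable = 0;
}

/* Try to reserve; on failure, evict just this VM and retry once,
 * mirroring the retry loop shape in execbuffer reservation. */
static int toy_reserve(struct toy_vm *vm, int pages)
{
	int retry = 0;

	do {
		if (vm->free_pages >= pages) {
			vm->free_pages -= pages;
			return 0;
		}
		if (retry++)
			return -ENOSPC;
		toy_evict_vm(vm);
	} while (1);
}
```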
Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem_evict.c | 17 ++++++++++++++++-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 +++++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0602c4b..6e8ade0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2088,6 +2088,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev,
unsigned cache_level,
bool mappable,
bool nonblock);
+int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);
int i915_gem_evict_everything(struct drm_device *dev);
/* i915_gem_stolen.c */
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index e9033f0..a3e279d 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -155,7 +155,22 @@ found:
return ret;
}
-static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
+/**
+ * i915_gem_evict_vm - Try to free up VM space
+ *
+ * @vm: Address space to evict from
+ * @do_idle: Boolean directing whether to idle first.
+ *
+ * VM eviction is about freeing up virtual address space. If one wants fine
+ * grained eviction, see i915_gem_evict_something() for more details. In terms
+ * of freeing up actual system memory, this function may not accomplish the
+ * desired result: an object may be shared across multiple address spaces, and
+ * this function will not ensure those objects get freed.
+ *
+ * Using do_idle will result in a more complete eviction because it first
+ * retires requests and moves active BOs to the inactive list.
+ */
+int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
{
struct i915_vma *vma, *next;
int ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index ea4bacb..f0a0bd7 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -546,10 +546,16 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
{
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
+ struct i915_address_space *vm;
struct list_head ordered_vmas;
bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
int retry;
+ if (list_empty(vmas))
+ return 0;
+
+ vm = list_first_entry(vmas, struct i915_vma, exec_list)->vm;
+
INIT_LIST_HEAD(&ordered_vmas);
while (!list_empty(vmas)) {
struct drm_i915_gem_exec_object2 *entry;
@@ -638,7 +644,7 @@ err: /* Decrement pin count for bound objects */
if (ret != -ENOSPC || retry++)
return ret;
- ret = i915_gem_evict_everything(ring->dev);
+ ret = i915_gem_evict_vm(vm, true);
if (ret)
return ret;
} while (1);
--
1.8.4
* [PATCH 4/8] drm/i915: trace vm eviction instead of everything
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
` (2 preceding siblings ...)
2013-08-30 23:43 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
@ 2013-08-30 23:43 ` Ben Widawsky
2013-08-31 0:06 ` Chris Wilson
2013-08-30 23:43 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
` (3 subsequent siblings)
7 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
VM eviction is really the event we care about tracing. For the cases where we
evict everything, we will still get the trace.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_evict.c | 4 ++--
drivers/gpu/drm/i915/i915_trace.h | 12 ++++++------
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index a3e279d..ff91ac6 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -175,6 +175,8 @@ int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
struct i915_vma *vma, *next;
int ret;
+ trace_i915_gem_evict_vm(vm);
+
if (do_idle) {
ret = i915_gpu_idle(vm->dev);
if (ret)
@@ -208,8 +210,6 @@ i915_gem_evict_everything(struct drm_device *dev)
if (lists_empty)
return -ENOSPC;
- trace_i915_gem_evict_everything(dev);
-
/* The gpu_idle will flush everything in the write domain to the
* active list. Then we must move everything off the active list
* with retire requests.
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index e2c5ee6..d2a5502 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -218,19 +218,19 @@ TRACE_EVENT(i915_gem_evict,
__entry->mappable ? ", mappable" : "")
);
-TRACE_EVENT(i915_gem_evict_everything,
- TP_PROTO(struct drm_device *dev),
- TP_ARGS(dev),
+TRACE_EVENT(i915_gem_evict_vm,
+ TP_PROTO(struct i915_address_space *vm),
+ TP_ARGS(vm),
TP_STRUCT__entry(
- __field(u32, dev)
+ __field(struct i915_address_space *, vm)
),
TP_fast_assign(
- __entry->dev = dev->primary->index;
+ __entry->vm = vm;
),
- TP_printk("dev=%d", __entry->dev)
+ TP_printk("vm=%p", __entry->vm)
);
TRACE_EVENT(i915_gem_ring_dispatch,
--
1.8.4
* [PATCH 5/8] drm/i915: Convert active API to VMA
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
` (3 preceding siblings ...)
2013-08-30 23:43 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
@ 2013-08-30 23:43 ` Ben Widawsky
2013-08-31 0:08 ` Chris Wilson
2013-08-30 23:43 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
` (2 subsequent siblings)
7 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
Even though we track activity per object and not per VMA, the active_list is
kept per VM, so it makes the most sense to use VMAs in the
APIs.
NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
now, leave them be.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Conflicts:
drivers/gpu/drm/i915/i915_gem_execbuffer.c
---
drivers/gpu/drm/i915/i915_drv.h | 5 ++---
drivers/gpu/drm/i915/i915_gem.c | 11 +++++++++--
drivers/gpu/drm/i915/i915_gem_context.c | 8 ++++----
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +--
4 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6e8ade0..c9ed77a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1866,9 +1866,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
int i915_gem_object_sync(struct drm_i915_gem_object *obj,
struct intel_ring_buffer *to);
-void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
- struct intel_ring_buffer *ring);
-
+void i915_vma_move_to_active(struct i915_vma *vma,
+ struct intel_ring_buffer *ring);
int i915_gem_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
struct drm_mode_create_dumb *args);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a839bcb..8547b97 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1881,11 +1881,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
return 0;
}
-void
+static void
i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
struct intel_ring_buffer *ring)
{
- struct drm_device *dev = obj->base.dev;
+ struct drm_device *dev = ring->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
u32 seqno = intel_ring_get_seqno(ring);
@@ -1920,6 +1920,13 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
}
}
+void i915_vma_move_to_active(struct i915_vma *vma,
+ struct intel_ring_buffer *ring)
+{
+ list_move_tail(&vma->mm_list, &vma->vm->active_list);
+ return i915_gem_object_move_to_active(vma->obj, ring);
+}
+
static void
i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
{
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 403309c..6a0f568 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -436,11 +436,11 @@ static int do_switch(struct i915_hw_context *to)
* MI_SET_CONTEXT instead of when the next seqno has completed.
*/
if (from != NULL) {
- struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
- struct i915_address_space *ggtt = &dev_priv->gtt.base;
+ struct drm_i915_private *dev_priv = ring->dev->dev_private;
+ struct i915_vma *vma =
+ i915_gem_obj_to_vma(from->obj, &dev_priv->gtt.base);
from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
- list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
- i915_gem_object_move_to_active(from->obj, ring);
+ i915_vma_move_to_active(vma, ring);
/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
* whole damn pipeline, we don't need to explicitly mark the
* object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f0a0bd7..928d37b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -869,8 +869,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
obj->base.read_domains = obj->base.pending_read_domains;
obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
- list_move_tail(&vma->mm_list, &vma->vm->active_list);
- i915_gem_object_move_to_active(obj, ring);
+ i915_vma_move_to_active(vma, ring);
if (obj->base.write_domain) {
obj->dirty = 1;
obj->last_write_seqno = intel_ring_get_seqno(ring);
--
1.8.4
* [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
` (4 preceding siblings ...)
2013-08-30 23:43 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
@ 2013-08-30 23:43 ` Ben Widawsky
2013-08-31 0:12 ` Chris Wilson
2013-09-02 12:46 ` Ville Syrjälä
2013-08-30 23:44 ` [PATCH 7/8] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-08-30 23:44 ` [PATCH 8/8] drm/i915: eliminate vm->insert_entries() Ben Widawsky
7 siblings, 2 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:43 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply call vm->bind_vma(), without having to worry about
distinguishing PPGTT from GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
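The dispatch pattern this patch introduces — the address space carrying its own bind/unbind function pointers so callers need no GGTT/PPGTT branch — can be sketched minimally. The struct and function names below are simplified stand-ins, not the actual i915 declarations.

```c
#include <assert.h>

struct sketch_vma;

/* The address space owns the knowledge of how to bind/unbind. */
struct sketch_address_space {
	const char *name;
	void (*bind_vma)(struct sketch_vma *vma, int cache_level, unsigned flags);
	void (*unbind_vma)(struct sketch_vma *vma);
};

struct sketch_vma {
	struct sketch_address_space *vm;
	int bound;
};

static void ggtt_bind(struct sketch_vma *vma, int cache_level, unsigned flags)
{
	(void)cache_level; (void)flags;
	vma->bound = 1;		/* would write global GTT PTEs here */
}

static void generic_unbind(struct sketch_vma *vma)
{
	vma->bound = 0;		/* would point PTEs at the scratch page */
}

/* A caller just dispatches through the vm, with no PPGTT-vs-GGTT branch. */
static void bind_any(struct sketch_vma *vma)
{
	vma->vm->bind_vma(vma, 0, 0);
}
```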
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels
Use VMA for bind/unbind (Daniel, Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 69 +++++++++++++-----------
drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
2 files changed, 140 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c9ed77a..377ca63 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -461,6 +461,36 @@ enum i915_cache_level {
typedef uint32_t gen6_gtt_pte_t;
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an object's lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+ struct drm_mm_node node;
+ struct drm_i915_gem_object *obj;
+ struct i915_address_space *vm;
+
+ /** This object's place on the active/inactive lists */
+ struct list_head mm_list;
+
+ struct list_head vma_link; /* Link in the object's VMA list */
+
+ /** This vma's place in the batchbuffer or on the eviction list */
+ struct list_head exec_list;
+
+ /**
+ * Used for performing relocations during execbuffer insertion.
+ */
+ struct hlist_node exec_node;
+ unsigned long exec_handle;
+ struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
struct i915_address_space {
struct drm_mm mm;
struct drm_device *dev;
@@ -499,9 +529,18 @@ struct i915_address_space {
/* FIXME: Need a more generic return type */
gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
enum i915_cache_level level);
+
+ /** Unmap an object from an address space. This usually consists of
+ * setting the valid PTE entries to a reserved scratch page. */
+ void (*unbind_vma)(struct i915_vma *vma);
void (*clear_range)(struct i915_address_space *vm,
unsigned int first_entry,
unsigned int num_entries);
+ /* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+ void (*bind_vma)(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags);
void (*insert_entries)(struct i915_address_space *vm,
struct sg_table *st,
unsigned int first_entry,
@@ -548,36 +587,6 @@ struct i915_hw_ppgtt {
int (*enable)(struct drm_device *dev);
};
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
- struct drm_mm_node node;
- struct drm_i915_gem_object *obj;
- struct i915_address_space *vm;
-
- /** This object's place on the active/inactive lists */
- struct list_head mm_list;
-
- struct list_head vma_link; /* Link in the object's VMA list */
-
- /** This vma's place in the batchbuffer or on the eviction list */
- struct list_head exec_list;
-
- /**
- * Used for performing relocations during execbuffer insertion.
- */
- struct hlist_node exec_node;
- unsigned long exec_handle;
- struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
struct i915_ctx_hang_stats {
/* This context had batch pending when hang was declared */
unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 212f6d8..d5a4580 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -57,6 +57,11 @@
#define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
enum i915_cache_level level)
{
@@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
ppgtt->enable = gen6_ppgtt_enable;
+ ppgtt->base.unbind_vma = NULL;
ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+ ppgtt->base.bind_vma = NULL;
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
cache_level);
}
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ WARN_ON(flags);
+
+ gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
struct drm_i915_gem_object *obj)
{
@@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
obj->base.size >> PAGE_SHIFT);
}
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ gen6_ppgtt_clear_range(vma->vm, entry,
+ vma->obj->base.size >> PAGE_SHIFT);
+}
+
extern int intel_iommu_gfx_mapped;
/* Certain Gen5 chipsets require require idling the GPU before
* unmapping anything from the GTT when VT-d is enabled.
@@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
}
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 unused)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+ unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+ AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+ BUG_ON(!i915_is_ggtt(vma->vm));
+ intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+ vma->obj->has_global_gtt_mapping = 1;
+}
+
static void i915_ggtt_clear_range(struct i915_address_space *vm,
unsigned int first_entry,
unsigned int num_entries)
@@ -599,6 +639,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
intel_gtt_clear_range(first_entry, num_entries);
}
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+ const unsigned int first = vma->node.start >> PAGE_SHIFT;
+ const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+ BUG_ON(!i915_is_ggtt(vma->vm));
+ vma->obj->has_global_gtt_mapping = 0;
+ intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+ struct drm_device *dev = vma->vm->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+ * the global, just use aliasing */
+ if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
+ !obj->has_global_gtt_mapping) {
+ gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+ vma->obj->pages, entry, cache_level);
+ vma->obj->has_aliasing_ppgtt_mapping = 1;
+ return;
+ }
+
+ gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
+ obj->has_global_gtt_mapping = 1;
+
+ /* Put the mapping in the aliasing PPGTT as well as the global GTT when we
+ * have aliasing but the user requested a global mapping. */
+ if (dev_priv->mm.aliasing_ppgtt) {
+ gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+ vma->obj->pages, entry, cache_level);
+ vma->obj->has_aliasing_ppgtt_mapping = 1;
+ }
+}
void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
enum i915_cache_level cache_level)
@@ -627,6 +707,23 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
obj->has_global_gtt_mapping = 0;
}
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+ struct drm_device *dev = vma->vm->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ gen6_ggtt_clear_range(vma->vm, entry,
+ vma->obj->base.size >> PAGE_SHIFT);
+ vma->obj->has_global_gtt_mapping = 0;
+ if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
+ gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+ entry,
+ vma->obj->base.size >> PAGE_SHIFT);
+ vma->obj->has_aliasing_ppgtt_mapping = 0;
+ }
+}
+
void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
{
struct drm_device *dev = obj->base.dev;
@@ -860,7 +957,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
DRM_ERROR("Scratch setup failed\n");
dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+ dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+ dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
return ret;
}
@@ -892,7 +991,9 @@ static int i915_gmch_probe(struct drm_device *dev,
dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+ dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+ dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
return 0;
}
--
1.8.4
* [PATCH 7/8] drm/i915: Use the new vm [un]bind functions
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
` (5 preceding siblings ...)
2013-08-30 23:43 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-08-30 23:44 ` Ben Widawsky
2013-08-30 23:44 ` [PATCH 8/8] drm/i915: eliminate vm->insert_entries() Ben Widawsky
7 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:44 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
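The v2 binding decision — with an aliasing PPGTT present, a plain bind goes into the aliasing PPGTT only, unless GLOBAL_BIND was requested or a global mapping already exists — can be condensed into a small sketch. The enum, flag, and function names are illustrative, not i915 code.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_GLOBAL_BIND (1 << 0)

enum sketch_target {
	BIND_ALIASING_ONLY,		/* aliasing PPGTT PTEs only */
	BIND_GLOBAL_AND_ALIASING,	/* both global GTT and aliasing PPGTT */
	BIND_GLOBAL_ONLY,		/* no aliasing PPGTT exists */
};

/* Condensed form of the decision made when binding through the GGTT. */
static enum sketch_target
choose_bind_target(bool have_aliasing_ppgtt, bool has_global_mapping,
		   unsigned flags)
{
	if (have_aliasing_ppgtt && !(flags & SKETCH_GLOBAL_BIND) &&
	    !has_global_mapping)
		return BIND_ALIASING_ONLY;
	return have_aliasing_ppgtt ? BIND_GLOBAL_AND_ALIASING
				   : BIND_GLOBAL_ONLY;
}
```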
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 9 ------
drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++-----------
drivers/gpu/drm/i915/i915_gem_context.c | 8 +++--
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++----------
drivers/gpu/drm/i915/i915_gem_gtt.c | 48 ++----------------------------
5 files changed, 34 insertions(+), 91 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 377ca63..94cbd0a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2065,17 +2065,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
/* i915_gem_gtt.c */
void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj);
-
void i915_gem_restore_gtt_mappings(struct drm_device *dev);
int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
void i915_gem_init_global_gtt(struct drm_device *dev);
void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8547b97..0082c81 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2645,12 +2645,8 @@ int i915_vma_unbind(struct i915_vma *vma)
trace_i915_vma_unbind(vma);
- if (obj->has_global_gtt_mapping)
- i915_gem_gtt_unbind_object(obj);
- if (obj->has_aliasing_ppgtt_mapping) {
- i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
- obj->has_aliasing_ppgtt_mapping = 0;
- }
+ vma->vm->unbind_vma(vma);
+
i915_gem_gtt_finish_object(obj);
i915_gem_object_unpin_pages(obj);
@@ -3377,7 +3373,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
enum i915_cache_level cache_level)
{
struct drm_device *dev = obj->base.dev;
- drm_i915_private_t *dev_priv = dev->dev_private;
struct i915_vma *vma;
int ret;
@@ -3416,11 +3411,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
return ret;
}
- if (obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(obj, cache_level);
- if (obj->has_aliasing_ppgtt_mapping)
- i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
- obj, cache_level);
+ list_for_each_entry(vma, &obj->vma_list, vma_link)
+ vma->vm->bind_vma(vma, cache_level, 0);
}
list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3748,6 +3740,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
bool map_and_fenceable,
bool nonblocking)
{
+ const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
struct i915_vma *vma;
int ret;
@@ -3776,20 +3769,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
}
if (!i915_gem_obj_bound(obj, vm)) {
- struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
map_and_fenceable,
nonblocking);
if (ret)
return ret;
- if (!dev_priv->mm.aliasing_ppgtt)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
- }
+ vma = i915_gem_obj_to_vma(obj, vm);
+ vm->bind_vma(vma, obj->cache_level, flags);
+ } else
+ vma = i915_gem_obj_to_vma(obj, vm);
+ /* Objects are created map and fenceable. If we bind an object
+ * the first time, and we had aliasing PPGTT (and didn't request
+ * GLOBAL), we'll need to do this on the second bind.*/
if (!obj->has_global_gtt_mapping && map_and_fenceable)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
obj->pin_count++;
obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 6a0f568..91e9b90 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
static int do_switch(struct i915_hw_context *to)
{
struct intel_ring_buffer *ring = to->ring;
+ struct drm_i915_private *dev_priv = ring->dev->dev_private;
struct i915_hw_context *from = ring->last_context;
u32 hw_flags = 0;
int ret;
@@ -415,8 +416,11 @@ static int do_switch(struct i915_hw_context *to)
return ret;
}
- if (!to->obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+ if (!to->obj->has_global_gtt_mapping) {
+ struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+ &dev_priv->gtt.base);
+ vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+ }
if (!to->is_initialized || is_default_context(to))
hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 928d37b..f91b20e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
if (unlikely(IS_GEN6(dev) &&
reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
!target_i915_obj->has_global_gtt_mapping)) {
- i915_gem_gtt_bind_object(target_i915_obj,
- target_i915_obj->cache_level);
+ struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+ vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+ GLOBAL_BIND);
}
/* Validate that the target is in a valid r/w GPU domain */
@@ -461,11 +462,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
struct intel_ring_buffer *ring,
bool *need_reloc)
{
- struct drm_i915_private *dev_priv = ring->dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
bool need_fence, need_mappable;
- struct drm_i915_gem_object *obj = vma->obj;
+ u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+ !vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
int ret;
need_fence =
@@ -494,14 +496,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
}
}
- /* Ensure ppgtt mapping exists if needed */
- if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
- i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
- obj, obj->cache_level);
-
- obj->has_aliasing_ppgtt_mapping = 1;
- }
-
if (entry->offset != vma->node.start) {
entry->offset = vma->node.start;
*need_reloc = true;
@@ -512,9 +506,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
}
- if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
- !obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vma->vm->bind_vma(vma, obj->cache_level, flags);
return 0;
}
@@ -1113,8 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
* batch" bit. Hence we need to pin secure batches into the global gtt.
* hsw should have this fixed, but let's be paranoid and do it
* unconditionally for now. */
- if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
- i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+ if (flags & I915_DISPATCH_SECURE &&
+ !batch_obj->has_global_gtt_mapping) {
+ struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
+ vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+ }
ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d5a4580..3c752ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
dev_priv->mm.aliasing_ppgtt = NULL;
}
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level)
-{
- ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
- i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
- cache_level);
-}
-
static void __always_unused
gen6_ppgtt_bind_vma(struct i915_vma *vma,
enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
}
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_object *obj)
-{
- ppgtt->base.clear_range(&ppgtt->base,
- i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
- obj->base.size >> PAGE_SHIFT);
-}
-
static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
{
const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
dev_priv->gtt.base.total / PAGE_SIZE);
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+ struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+ &dev_priv->gtt.base);
i915_gem_clflush_object(obj, obj->pin_display);
- i915_gem_gtt_bind_object(obj, obj->cache_level);
+ vma->vm->bind_vma(vma, obj->cache_level, 0);
}
i915_gem_chipset_flush(dev);
@@ -680,33 +665,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
}
}
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
- enum i915_cache_level cache_level)
-{
- struct drm_device *dev = obj->base.dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
- dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
- entry,
- cache_level);
-
- obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
- struct drm_device *dev = obj->base.dev;
- struct drm_i915_private *dev_priv = dev->dev_private;
- const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
- dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
- entry,
- obj->base.size >> PAGE_SHIFT);
-
- obj->has_global_gtt_mapping = 0;
-}
-
static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
{
struct drm_device *dev = vma->vm->dev;
--
1.8.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 8/8] drm/i915: eliminate vm->insert_entries()
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
` (6 preceding siblings ...)
2013-08-30 23:44 ` [PATCH 7/8] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-08-30 23:44 ` Ben Widawsky
7 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-08-30 23:44 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range as well; however,
that's not entirely easy at this point, since it's still used in a couple of
places that don't deal only in objects: setup, ppgtt init, and restoring the
gtt mappings.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 17 +----------------
1 file changed, 1 insertion(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3c752ad..72c8b8a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -339,8 +339,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->enable = gen6_ppgtt_enable;
ppgtt->base.unbind_vma = NULL;
ppgtt->base.clear_range = gen6_ppgtt_clear_range;
- ppgtt->base.bind_vma = NULL;
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+ ppgtt->base.bind_vma = NULL;
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
ppgtt->base.scratch = dev_priv->gtt.base.scratch;
ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
@@ -591,19 +591,6 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
readl(gtt_base);
}
-
-static void i915_ggtt_insert_entries(struct i915_address_space *vm,
- struct sg_table *st,
- unsigned int pg_start,
- enum i915_cache_level cache_level)
-{
- unsigned int flags = (cache_level == I915_CACHE_NONE) ?
- AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
-
- intel_gtt_insert_sg_entries(st, pg_start, flags);
-
-}
-
static void i915_ggtt_bind_vma(struct i915_vma *vma,
enum i915_cache_level cache_level,
u32 unused)
@@ -916,7 +903,6 @@ static int gen6_gmch_probe(struct drm_device *dev,
dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
- dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
return ret;
@@ -950,7 +936,6 @@ static int i915_gmch_probe(struct drm_device *dev,
dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
- dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
return 0;
--
1.8.4
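The call-site simplification this series is after can be illustrated with a small userspace mock. Everything below (the struct names, `demo()`, the `bound` field) is an illustrative stand-in for the driver's structures, not the actual i915 code; it only shows the dispatch pattern where each address space supplies its own bind/unbind routines:

```c
#include <assert.h>

/* Illustrative stand-ins for the driver structures; not the real i915 types. */
struct vma;

struct address_space {
	/* Each address space supplies its own page-table update routines,
	 * so generic code never branches on PPGTT vs. GGTT. */
	void (*bind_vma)(struct vma *vma, int cache_level, unsigned int flags);
	void (*unbind_vma)(struct vma *vma);
};

struct vma {
	struct address_space *vm;
	int bound;
};

#define GLOBAL_BIND (1u << 0)

static void ggtt_bind_vma(struct vma *vma, int cache_level, unsigned int flags)
{
	(void)cache_level;
	(void)flags;
	vma->bound = 1;		/* a real implementation would write PTEs here */
}

static void ggtt_unbind_vma(struct vma *vma)
{
	vma->bound = 0;		/* a real one would point PTEs at scratch */
}

/* Generic call site: just vma->vm->bind_vma(), no address-space branching. */
static int demo(void)
{
	struct address_space ggtt = { ggtt_bind_vma, ggtt_unbind_vma };
	struct vma vma = { &ggtt, 0 };

	vma.vm->bind_vma(&vma, 0, GLOBAL_BIND);
	if (!vma.bound)
		return -1;
	vma.vm->unbind_vma(&vma);
	return vma.bound;	/* 0 on success */
}
```

With this shape, removing insert_entries is mechanical: once every caller goes through the per-VM function pointers, the generic entry point has no remaining users.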
* Re: [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-08-30 23:43 ` [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
@ 2013-08-30 23:50 ` Chris Wilson
2013-08-31 3:39 ` Ben Widawsky
0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-08-30 23:50 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> lifted from Daniel:
> pread/pwrite isn't about the object's domain at all, but purely about
> synchronizing for outstanding rendering. Replacing the call to
> set_to_gtt_domain with a wait_rendering would imo improve code
> readability. Furthermore we could pimp pread to only block for
> outstanding writes and not for reads.
>
> Since you're not the first one to trip over this: Can I volunteer you
> for a follow-up patch to fix this?
>
> Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
This should fail i-g-t...
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 2/8] drm/i915: Extract vm specific part of eviction
2013-08-30 23:43 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
@ 2013-08-30 23:52 ` Chris Wilson
2013-08-31 3:39 ` Ben Widawsky
0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-08-30 23:52 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:55PM -0700, Ben Widawsky wrote:
> As we'll see in the next patch, being able to evict for just 1 VM is
> handy.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
> drivers/gpu/drm/i915/i915_gem_evict.c | 28 ++++++++++++++++++++++------
> 1 file changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index cc8974f..e9033f0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -155,12 +155,31 @@ found:
> return ret;
> }
>
> +static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
> +{
> + struct i915_vma *vma, *next;
> + int ret;
> +
> + if (do_idle) {
> + ret = i915_gpu_idle(vm->dev);
> + if (ret)
> + return ret;
> +
> + i915_gem_retire_requests(vm->dev);
> + }
> +
> + list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> + if (vma->obj->pin_count == 0)
> + WARN_ON(i915_vma_unbind(vma));
> +
> + return 0;
> +}
> +
> int
> i915_gem_evict_everything(struct drm_device *dev)
> {
> drm_i915_private_t *dev_priv = dev->dev_private;
> struct i915_address_space *vm;
> - struct i915_vma *vma, *next;
> bool lists_empty = true;
> int ret;
>
> @@ -187,11 +206,8 @@ i915_gem_evict_everything(struct drm_device *dev)
> i915_gem_retire_requests(dev);
>
> /* Having flushed everything, unbind() should never raise an error */
> - list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> - list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> - if (vma->obj->pin_count == 0)
> - WARN_ON(i915_vma_unbind(vma));
> - }
> + list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> + WARN_ON(i915_gem_evict_vm(vm, false));
Why not use do_idle here?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 3/8] drm/i915: evict VM instead of everything
2013-08-30 23:43 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
@ 2013-08-31 0:04 ` Chris Wilson
0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2013-08-31 0:04 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:56PM -0700, Ben Widawsky wrote:
> When reserving objects during execbuf, it is possible to come across an
> object which will not fit given the current fragmentation of the address
> space. We do not have any defragmentation in drm_mm, so the strategy is to
> instead evict everything, and reallocate objects.
>
> With the upcoming addition of multiple VMs, there is no point in evicting
> everything, since doing so is overkill for the specific case mentioned
> above.
>
> Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
This loses the ENOSPC notification, and the tracepoints. I've just seen you
add tracepoints later...
I'm pretty certain this will be caught by an extra pass through reserve,
and it's probably worth it for the simplification. Fortunately, we do have
i-g-t coverage for this.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
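The "extra pass through reserve" Chris mentions can be sketched as a hedged userspace mock. All names here (`mock_reserve`, `mock_evict_vm`, the page accounting) are invented for illustration and stand in for the execbuffer reserve path; the only point is the shape of the ENOSPC-then-evict-then-retry flow:

```c
#include <assert.h>
#include <errno.h>

/* Illustrative mock of the reserve -> evict -> retry flow; not the i915 code. */
struct mock_vm {
	int free_pages;
};

static int mock_reserve(struct mock_vm *vm, int need)
{
	if (vm->free_pages < need)
		return -ENOSPC;	/* fragmentation or no space: caller may evict */
	vm->free_pages -= need;
	return 0;
}

static void mock_evict_vm(struct mock_vm *vm, int total)
{
	vm->free_pages = total;	/* everything in this VM unbound */
}

/* On ENOSPC, evict only this VM (not every VM) and give reserve one retry. */
static int mock_reserve_with_retry(struct mock_vm *vm, int total, int need)
{
	int ret = mock_reserve(vm, need);

	if (ret == -ENOSPC) {
		mock_evict_vm(vm, total);
		ret = mock_reserve(vm, need);
	}
	return ret;
}
```

If the object can never fit even in an empty VM, the retry still fails with -ENOSPC, which is the notification the caller ultimately needs.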
* Re: [PATCH 4/8] drm/i915: trace vm eviction instead of everything
2013-08-30 23:43 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
@ 2013-08-31 0:06 ` Chris Wilson
2013-08-31 3:40 ` Ben Widawsky
0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-08-31 0:06 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:57PM -0700, Ben Widawsky wrote:
> VM eviction is really the event we care about tracing. In the cases where
> we evict everything, we will still get the trace.
Keep both until you retire evict_everything for good. Sometimes you need
flashing neon lights to explain things in a trace.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 5/8] drm/i915: Convert active API to VMA
2013-08-30 23:43 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
@ 2013-08-31 0:08 ` Chris Wilson
0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2013-08-31 0:08 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:58PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
>
> Even though we track object activity and not VMAs, because the
> active_list is based on the VM, it makes the most sense to use VMAs in
> the APIs.
>
> NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
> now, leave them be.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> Conflicts:
> drivers/gpu/drm/i915/i915_gem_execbuffer.c
There's a silly chunk here, but otherwise ok.
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 ++---
> drivers/gpu/drm/i915/i915_gem.c | 11 +++++++++--
> drivers/gpu/drm/i915/i915_gem_context.c | 8 ++++----
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +--
> 4 files changed, 16 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 6e8ade0..c9ed77a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1866,9 +1866,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
> int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> struct intel_ring_buffer *to);
> -void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> - struct intel_ring_buffer *ring);
> -
> +void i915_vma_move_to_active(struct i915_vma *vma,
> + struct intel_ring_buffer *ring);
> int i915_gem_dumb_create(struct drm_file *file_priv,
> struct drm_device *dev,
> struct drm_mode_create_dumb *args);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a839bcb..8547b97 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1881,11 +1881,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> return 0;
> }
>
> -void
> +static void
> i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> struct intel_ring_buffer *ring)
> {
> - struct drm_device *dev = obj->base.dev;
> + struct drm_device *dev = ring->dev;
? Are you trying to anger the static checkers?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-08-30 23:43 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-08-31 0:12 ` Chris Wilson
2013-08-31 3:40 ` Ben Widawsky
2013-09-02 12:46 ` Ville Syrjälä
1 sibling, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-08-31 0:12 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
>
> As we plumb the code with more VM information, it has become more
> obvious that the easiest way to deal with bind and unbind is to simply
> put the function pointers in the vm, and let those choose the correct
> way to handle the page table updates. This change allows many places in
> the code to simply be vm->bind, and not have to worry about
> distinguishing PPGTT vs GGTT.
>
> Notice that this patch has no impact on functionality. I've decided to
> save the actual change until the next patch because I think it's easier
> to review that way. I'm happy to squash the two, or let Daniel do it on
> merge.
>
> v2:
> Make ggtt handle the quirky aliasing ppgtt
> Add flags to bind object to support above
> Don't ever call bind/unbind directly for PPGTT until we have real, full
> PPGTT (use NULLs to assert this)
> Make sure we rebind the ggtt if there already is a ggtt binding. This
> happens on set cache levels
> Use VMA for bind/unbind (Daniel, Ben)
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
You like polka-dot paint for its inconsistency? Other than the interesting
alternation of styles, I see nothing wrong with the logic.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-08-30 23:50 ` Chris Wilson
@ 2013-08-31 3:39 ` Ben Widawsky
2013-09-02 6:32 ` Daniel Vetter
0 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-08-31 3:39 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX, Daniel Vetter
On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > lifted from Daniel:
> > pread/pwrite isn't about the object's domain at all, but purely about
> > synchronizing for outstanding rendering. Replacing the call to
> > set_to_gtt_domain with a wait_rendering would imo improve code
> > readability. Furthermore we could pimp pread to only block for
> > outstanding writes and not for reads.
> >
> > Since you're not the first one to trip over this: Can I volunteer you
> > for a follow-up patch to fix this?
> >
> > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> This should fail i-g-t...
> -Chris
>
Daniel, how have I failed your plan?
--
Ben Widawsky, Intel Open Source Technology Center
* Re: [PATCH 2/8] drm/i915: Extract vm specific part of eviction
2013-08-30 23:52 ` Chris Wilson
@ 2013-08-31 3:39 ` Ben Widawsky
0 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-08-31 3:39 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Sat, Aug 31, 2013 at 12:52:03AM +0100, Chris Wilson wrote:
> On Fri, Aug 30, 2013 at 04:43:55PM -0700, Ben Widawsky wrote:
> > As we'll see in the next patch, being able to evict for just 1 VM is
> > handy.
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> > drivers/gpu/drm/i915/i915_gem_evict.c | 28 ++++++++++++++++++++++------
> > 1 file changed, 22 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index cc8974f..e9033f0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -155,12 +155,31 @@ found:
> > return ret;
> > }
> >
> > +static int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
> > +{
> > + struct i915_vma *vma, *next;
> > + int ret;
> > +
> > + if (do_idle) {
> > + ret = i915_gpu_idle(vm->dev);
> > + if (ret)
> > + return ret;
> > +
> > + i915_gem_retire_requests(vm->dev);
> > + }
> > +
> > + list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> > + if (vma->obj->pin_count == 0)
> > + WARN_ON(i915_vma_unbind(vma));
> > +
> > + return 0;
> > +}
> > +
> > int
> > i915_gem_evict_everything(struct drm_device *dev)
> > {
> > drm_i915_private_t *dev_priv = dev->dev_private;
> > struct i915_address_space *vm;
> > - struct i915_vma *vma, *next;
> > bool lists_empty = true;
> > int ret;
> >
> > @@ -187,11 +206,8 @@ i915_gem_evict_everything(struct drm_device *dev)
> > i915_gem_retire_requests(dev);
> >
> > /* Having flushed everything, unbind() should never raise an error */
> > - list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > - list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> > - if (vma->obj->pin_count == 0)
> > - WARN_ON(i915_vma_unbind(vma));
> > - }
> > + list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > + WARN_ON(i915_gem_evict_vm(vm, false));
>
> Wny not use do_idle here?
> -Chris
>
The point of the extraction was exactly to avoid repeated calls to idle +
retire, although looking back at it, I suppose the cost wouldn't be much,
since the first idle would make subsequent calls to idle + retire just
about free.
Just dropping the do_idle parameter and always idling is probably what
you want.
--
Ben Widawsky, Intel Open Source Technology Center
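Ben's suggested simplification — drop the do_idle parameter and always idle before walking the inactive list — would look roughly like the mock below. This is an illustrative userspace sketch, not the driver code: `mock_gpu_idle` stands in for i915_gpu_idle() plus i915_gem_retire_requests(), the singly-linked list stands in for the kernel list_head walk, and clearing `bound` stands in for i915_vma_unbind():

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative mock of the per-VM eviction walk; not the real i915 code. */
struct mock_vma {
	struct mock_vma *next;
	int pin_count;
	int bound;
};

struct mock_vm {
	struct mock_vma *inactive_list;
	int idled;
};

static void mock_gpu_idle(struct mock_vm *vm)
{
	vm->idled = 1;	/* stand-in for i915_gpu_idle() + retire_requests() */
}

/* Always idle first, then unbind every unpinned VMA on the inactive list. */
static int mock_evict_vm(struct mock_vm *vm)
{
	struct mock_vma *vma, *next;

	mock_gpu_idle(vm);

	for (vma = vm->inactive_list; vma; vma = next) {
		next = vma->next;	/* safe walk: vma may drop off the list */
		if (vma->pin_count == 0)
			vma->bound = 0;	/* stand-in for i915_vma_unbind() */
	}
	return 0;
}
```

Under this shape, evict_everything collapses to looping the per-VM helper over every VM, and repeating the idle per VM is nearly free after the first call.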
* Re: [PATCH 4/8] drm/i915: trace vm eviction instead of everything
2013-08-31 0:06 ` Chris Wilson
@ 2013-08-31 3:40 ` Ben Widawsky
0 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-08-31 3:40 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Sat, Aug 31, 2013 at 01:06:02AM +0100, Chris Wilson wrote:
> On Fri, Aug 30, 2013 at 04:43:57PM -0700, Ben Widawsky wrote:
> > Tracing vm eviction is really the event we care about. For the cases we
> > evict everything, we still will get the trace.
>
> Keep both until you retire evict_everything for good. Sometimes you need
> flashing neon lights to explain things in a trace.
> -Chris
>
Got it. Will resend after Daniel has a chance to look. Do you want this
squashed in with the previous patch to keep the trace events as well?
--
Ben Widawsky, Intel Open Source Technology Center
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-08-31 0:12 ` Chris Wilson
@ 2013-08-31 3:40 ` Ben Widawsky
2013-09-03 14:48 ` Chris Wilson
0 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-08-31 3:40 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Sat, Aug 31, 2013 at 01:12:55AM +0100, Chris Wilson wrote:
> On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> >
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> >
> > Notice that this patch has no impact on functionality. I've decided to
> > save the actual change until the next patch because I think it's easier
> > to review that way. I'm happy to squash the two, or let Daniel do it on
> > merge.
> >
> > v2:
> > Make ggtt handle the quirky aliasing ppgtt
> > Add flags to bind object to support above
> > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > PPGTT (use NULLs to assert this)
> > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > happens on set cache levels
> > Use VMA for bind/unbind (Daniel, Ben)
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> You like polka-dot paint for its inconsistency? Other than the interesting
> alternation of styles, I see nothing wrong with the logic.
> -Chris
>
To what are you referring? I'm probably more than willing to change
whatever displeases you.
--
Ben Widawsky, Intel Open Source Technology Center
* Re: [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-08-31 3:39 ` Ben Widawsky
@ 2013-09-02 6:32 ` Daniel Vetter
2013-09-02 13:14 ` Chris Wilson
0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2013-09-02 6:32 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > lifted from Daniel:
> > > pread/pwrite isn't about the object's domain at all, but purely about
> > > synchronizing for outstanding rendering. Replacing the call to
> > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > readability. Furthermore we could pimp pread to only block for
> > > outstanding writes and not for reads.
> > >
> > > Since you're not the first one to trip over this: Can I volunteer you
> > > for a follow-up patch to fix this?
> > >
> > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >
> > This should fail i-g-t...
> > -Chris
> >
>
> Daniel, how have I failed your plan?
It should work... Since the enclosing if-block checks for the !cpu domain
(for either reads or writes), that implies that going into the gtt domain
is a no-op (or rather, should be) wrt clflushing, and we only wait for
outstanding gpu rendering. wait_rendering is an interface that was added
afterwards. Unfortunately I've failed to explain this trickery in either
a comment or the commit message. Bad me ;-)
What does QA's patch test system say on a non-llc machine here?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-08-30 23:43 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
2013-08-31 0:12 ` Chris Wilson
@ 2013-09-02 12:46 ` Ville Syrjälä
2013-09-04 0:20 ` Ben Widawsky
1 sibling, 1 reply; 31+ messages in thread
From: Ville Syrjälä @ 2013-09-02 12:46 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
>
> As we plumb the code with more VM information, it has become more
> obvious that the easiest way to deal with bind and unbind is to simply
> put the function pointers in the vm, and let those choose the correct
> way to handle the page table updates. This change allows many places in
> the code to simply be vm->bind, and not have to worry about
> distinguishing PPGTT vs GGTT.
>
> Notice that this patch has no impact on functionality. I've decided to
> save the actual change until the next patch because I think it's easier
> to review that way. I'm happy to squash the two, or let Daniel do it on
> merge.
>
> v2:
> Make ggtt handle the quirky aliasing ppgtt
> Add flags to bind object to support above
> Don't ever call bind/unbind directly for PPGTT until we have real, full
> PPGTT (use NULLs to assert this)
> Make sure we rebind the ggtt if there already is a ggtt binding. This
> happens on set cache levels
> Use VMA for bind/unbind (Daniel, Ben)
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 69 +++++++++++++-----------
> drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 140 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c9ed77a..377ca63 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -461,6 +461,36 @@ enum i915_cache_level {
>
> typedef uint32_t gen6_gtt_pte_t;
>
> +/**
> + * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> + * VMA's presence cannot be guaranteed before binding, or after unbinding the
> + * object into/from the address space.
> + *
> + * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> + * will always be <= an objects lifetime. So object refcounting should cover us.
> + */
> +struct i915_vma {
> + struct drm_mm_node node;
> + struct drm_i915_gem_object *obj;
> + struct i915_address_space *vm;
> +
> + /** This object's place on the active/inactive lists */
> + struct list_head mm_list;
> +
> + struct list_head vma_link; /* Link in the object's VMA list */
> +
> + /** This vma's place in the batchbuffer or on the eviction list */
> + struct list_head exec_list;
> +
> + /**
> + * Used for performing relocations during execbuffer insertion.
> + */
> + struct hlist_node exec_node;
> + unsigned long exec_handle;
> + struct drm_i915_gem_exec_object2 *exec_entry;
> +
> +};
> +
> struct i915_address_space {
> struct drm_mm mm;
> struct drm_device *dev;
> @@ -499,9 +529,18 @@ struct i915_address_space {
> /* FIXME: Need a more generic return type */
> gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> enum i915_cache_level level);
> +
> + /** Unmap an object from an address space. This usually consists of
> + * setting the valid PTE entries to a reserved scratch page. */
> + void (*unbind_vma)(struct i915_vma *vma);
> void (*clear_range)(struct i915_address_space *vm,
> unsigned int first_entry,
> unsigned int num_entries);
> + /* Map an object into an address space with the given cache flags. */
> +#define GLOBAL_BIND (1<<0)
> + void (*bind_vma)(struct i915_vma *vma,
> + enum i915_cache_level cache_level,
> + u32 flags);
> void (*insert_entries)(struct i915_address_space *vm,
> struct sg_table *st,
> unsigned int first_entry,
> @@ -548,36 +587,6 @@ struct i915_hw_ppgtt {
> int (*enable)(struct drm_device *dev);
> };
>
> -/**
> - * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> - * VMA's presence cannot be guaranteed before binding, or after unbinding the
> - * object into/from the address space.
> - *
> - * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> - * will always be <= an objects lifetime. So object refcounting should cover us.
> - */
> -struct i915_vma {
> - struct drm_mm_node node;
> - struct drm_i915_gem_object *obj;
> - struct i915_address_space *vm;
> -
> - /** This object's place on the active/inactive lists */
> - struct list_head mm_list;
> -
> - struct list_head vma_link; /* Link in the object's VMA list */
> -
> - /** This vma's place in the batchbuffer or on the eviction list */
> - struct list_head exec_list;
> -
> - /**
> - * Used for performing relocations during execbuffer insertion.
> - */
> - struct hlist_node exec_node;
> - unsigned long exec_handle;
> - struct drm_i915_gem_exec_object2 *exec_entry;
> -
> -};
> -
> struct i915_ctx_hang_stats {
> /* This context had batch pending when hang was declared */
> unsigned batch_pending;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 212f6d8..d5a4580 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -57,6 +57,11 @@
> #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
> #define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)
>
> +static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
> + enum i915_cache_level cache_level,
> + u32 flags);
> +static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
> +
> static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
> enum i915_cache_level level)
> {
> @@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
> ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> ppgtt->enable = gen6_ppgtt_enable;
> + ppgtt->base.unbind_vma = NULL;
> ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> + ppgtt->base.bind_vma = NULL;
> ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> @@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> cache_level);
> }
>
> +static void __always_unused
> +gen6_ppgtt_bind_vma(struct i915_vma *vma,
> + enum i915_cache_level cache_level,
> + u32 flags)
> +{
> + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> + WARN_ON(flags);
> +
> + gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> +}
> +
> void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> struct drm_i915_gem_object *obj)
> {
> @@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> obj->base.size >> PAGE_SHIFT);
> }
>
> +static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
> +{
> + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> + gen6_ppgtt_clear_range(vma->vm, entry,
> + vma->obj->base.size >> PAGE_SHIFT);
> +}
> +
> extern int intel_iommu_gfx_mapped;
> /* Certain Gen5 chipsets require require idling the GPU before
> * unmapping anything from the GTT when VT-d is enabled.
> @@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
>
> }
>
> +static void i915_ggtt_bind_vma(struct i915_vma *vma,
> + enum i915_cache_level cache_level,
> + u32 unused)
> +{
> + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> + unsigned int flags = (cache_level == I915_CACHE_NONE) ?
> + AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
> +
> + BUG_ON(!i915_is_ggtt(vma->vm));
> + intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> + vma->obj->has_global_gtt_mapping = 1;
> +}
> +
> static void i915_ggtt_clear_range(struct i915_address_space *vm,
> unsigned int first_entry,
> unsigned int num_entries)
> @@ -599,6 +639,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
> intel_gtt_clear_range(first_entry, num_entries);
> }
>
> +static void i915_ggtt_unbind_vma(struct i915_vma *vma)
> +{
> + const unsigned int first = vma->node.start >> PAGE_SHIFT;
> + const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
> +
> + BUG_ON(!i915_is_ggtt(vma->vm));
> + vma->obj->has_global_gtt_mapping = 0;
> + intel_gtt_clear_range(first, size);
> +}
> +
> +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> + enum i915_cache_level cache_level,
> + u32 flags)
> +{
> + struct drm_device *dev = vma->vm->dev;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_gem_object *obj = vma->obj;
> + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> + /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> + * the global, just use aliasing */
> + if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
> + !obj->has_global_gtt_mapping) {
> + gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> + vma->obj->pages, entry, cache_level);
> + vma->obj->has_aliasing_ppgtt_mapping = 1;
> + return;
> + }
> +
> + gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
> + obj->has_global_gtt_mapping = 1;
> +
> +	/* Put the mapping in the aliasing PPGTT as well as the global GTT if we
> +	 * have aliasing, but the user requested global. */
> + if (dev_priv->mm.aliasing_ppgtt) {
> + gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> + vma->obj->pages, entry, cache_level);
> + vma->obj->has_aliasing_ppgtt_mapping = 1;
> + }
> +}
There's a bit of duplication here. How about this:
{
if (!aliasing_ppgtt ||
flags & GLOBAL_BIND ||
    has_ggtt_mapping) {
gen6_ppgtt_insert_entries()
has_global_gtt_mapping = true;
}
if (aliasing_ppgtt) {
ppgtt_insert
has_aliasing_ppgtt_mapping = true;
}
}
Not sure if this was the same thing that bothered Chris.
>
> void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> enum i915_cache_level cache_level)
> @@ -627,6 +707,23 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> obj->has_global_gtt_mapping = 0;
> }
>
> +static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
> +{
> + struct drm_device *dev = vma->vm->dev;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> + gen6_ggtt_clear_range(vma->vm, entry,
> + vma->obj->base.size >> PAGE_SHIFT);
> + vma->obj->has_global_gtt_mapping = 0;
> + if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
> + gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
> + entry,
> + vma->obj->base.size >> PAGE_SHIFT);
> + vma->obj->has_aliasing_ppgtt_mapping = 0;
> + }
> +}
gen6_ggtt_bind_vma() might not have added the ggtt mapping, so I
suppose you should check has_global_gtt_mapping to avoid useless
work. Also the dev_priv->mm.aliasing_ppgtt check is a bit
pointless as there's no way has_aliasing_ppgtt_mapping would be
set if there's no aliasing ppgtt.
> +
> void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
> {
> struct drm_device *dev = obj->base.dev;
> @@ -860,7 +957,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
> DRM_ERROR("Scratch setup failed\n");
>
> dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
> + dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
> dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
> + dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
>
> return ret;
> }
> @@ -892,7 +991,9 @@ static int i915_gmch_probe(struct drm_device *dev,
>
> dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
> + dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
> dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
> + dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
>
> return 0;
> }
> --
> 1.8.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-09-02 6:32 ` Daniel Vetter
@ 2013-09-02 13:14 ` Chris Wilson
2013-09-02 14:12 ` Daniel Vetter
0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-09-02 13:14 UTC (permalink / raw)
To: Daniel Vetter; +Cc: Ben Widawsky, Intel GFX, Ben Widawsky
On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > lifted from Daniel:
> > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > synchronizing for outstanding rendering. Replacing the call to
> > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > readability. Furthermore we could pimp pread to only block for
> > > > outstanding writes and not for reads.
> > > >
> > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > for a follow-up patch to fix this?
> > > >
> > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > >
> > > This should fail i-g-t...
> > > -Chris
> > >
> >
> > Daniel, how have I failed your plan?
>
> It should work ... Since the enclosing if-block checks for !cpu domain
> (for either reads or writes) that implies that going into the gtt domain
> is a noop (or better should be) wrt clflushing and we only wait for
> outstanding gpu rendering. wait_rendering is an interface that's been
> added afterwards. Unfortunately I've failed to explain this trickery in
> either a comment or the commit message. Bad me ;-)
The issue is that in the patch pwrite is not waiting for any outstanding
GPU reads.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-09-02 13:14 ` Chris Wilson
@ 2013-09-02 14:12 ` Daniel Vetter
2013-09-03 16:08 ` Daniel Vetter
0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2013-09-02 14:12 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter, Ben Widawsky, Ben Widawsky,
Intel GFX
On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote:
> On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > > lifted from Daniel:
> > > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > > synchronizing for outstanding rendering. Replacing the call to
> > > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > > readability. Furthermore we could pimp pread to only block for
> > > > > outstanding writes and not for reads.
> > > > >
> > > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > > for a follow-up patch to fix this?
> > > > >
> > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > >
> > > > This should fail i-g-t...
> > > > -Chris
> > > >
> > >
> > > Daniel, how have I failed your plan?
> >
> > It should work ... Since the enclosing if-block checks for !cpu domain
> > (for either reads or writes) that implies that going into the gtt domain
> > is a noop (or better should be) wrt clflushing and we only wait for
> > outstanding gpu rendering. wait_rendering is an interface that's been
> > added afterwards. Unfortunately I've failed to explain this trickery in
> > either a comment or the commit message. Bad me ;-)
>
> The issue is that in the patch pwrite is not waiting for any outstanding
> GPU reads.
Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in.
This /should/ have been caught by the gem_concurrent_blt subtests that
exercise pwrites ...
Ben can you please check that this indeed blew up on igt? Should fail on
any platform, no special caching mode required.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-08-31 3:40 ` Ben Widawsky
@ 2013-09-03 14:48 ` Chris Wilson
0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2013-09-03 14:48 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Fri, Aug 30, 2013 at 08:40:36PM -0700, Ben Widawsky wrote:
> On Sat, Aug 31, 2013 at 01:12:55AM +0100, Chris Wilson wrote:
> > On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > > From: Ben Widawsky <ben@bwidawsk.net>
> > >
> > > As we plumb the code with more VM information, it has become more
> > > obvious that the easiest way to deal with bind and unbind is to simply
> > > put the function pointers in the vm, and let those choose the correct
> > > way to handle the page table updates. This change allows many places in
> > > the code to simply be vm->bind, and not have to worry about
> > > distinguishing PPGTT vs GGTT.
> > >
> > > Notice that this patch has no impact on functionality. I've decided to
> > > save the actual change until the next patch because I think it's easier
> > > to review that way. I'm happy to squash the two, or let Daniel do it on
> > > merge.
> > >
> > > v2:
> > > Make ggtt handle the quirky aliasing ppgtt
> > > Add flags to bind object to support above
> > > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > > PPGTT (use NULLs to assert this)
> > > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > > happens on set cache levels
> > > Use VMA for bind/unbind (Daniel, Ben)
> > >
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >
> > You like polka-dot paint for its inconsistency? Other than interesting
> > alternation of styles, I see nothing wrong with the logic.
> > -Chris
> >
>
> To what are you referring? I'm probably more than willing to change
> whatever displeases you.
You were alternating between using temporary variables like
const unsigned long entry = vma->node.start >> PAGE_SHIFT;
and doing the computation inline, with no consistency as far as I could
see. It was so very minor, but it looks like cut'n'paste code from
multiple sources.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-09-02 14:12 ` Daniel Vetter
@ 2013-09-03 16:08 ` Daniel Vetter
2013-09-03 23:53 ` Ben Widawsky
0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2013-09-03 16:08 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter, Ben Widawsky, Ben Widawsky,
Intel GFX
On Mon, Sep 02, 2013 at 04:12:36PM +0200, Daniel Vetter wrote:
> On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote:
> > On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> > > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > > > lifted from Daniel:
> > > > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > > > synchronizing for outstanding rendering. Replacing the call to
> > > > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > > > readability. Furthermore we could pimp pread to only block for
> > > > > > outstanding writes and not for reads.
> > > > > >
> > > > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > > > for a follow-up patch to fix this?
> > > > > >
> > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > >
> > > > > This should fail i-g-t...
> > > > > -Chris
> > > > >
> > > >
> > > > Daniel, how have I failed your plan?
> > >
> > > It should work ... Since the enclosing if-block checks for !cpu domain
> > > (for either reads or writes) that implies that going into the gtt domain
> > > is a noop (or better should be) wrt clflushing and we only wait for
> > > outstanding gpu rendering. wait_rendering is an interface that's been
> > > added afterwards. Unfortunately I've failed to explain this trickery in
> > > either a comment or the commit message. Bad me ;-)
> >
> > The issue is that in the patch pwrite is not waiting for any outstanding
> > GPU reads.
>
> Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in.
> This /should/ have been caught by the gem_concurrent_blt subtests that
> exercise pwrites ...
>
> Ben can you please check that this indeed blew up on igt? Should fail on
> any platform, no special caching mode required.
Actually it won't blow up since you always set readonly = false. But it'll
kill the neat read-read optimization ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering
2013-09-03 16:08 ` Daniel Vetter
@ 2013-09-03 23:53 ` Ben Widawsky
0 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-09-03 23:53 UTC (permalink / raw)
To: Daniel Vetter; +Cc: Intel GFX, Ben Widawsky
On Tue, Sep 03, 2013 at 06:08:19PM +0200, Daniel Vetter wrote:
> On Mon, Sep 02, 2013 at 04:12:36PM +0200, Daniel Vetter wrote:
> > On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote:
> > > On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> > > > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > > > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > > > > lifted from Daniel:
> > > > > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > > > > synchronizing for outstanding rendering. Replacing the call to
> > > > > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > > > > readability. Furthermore we could pimp pread to only block for
> > > > > > > outstanding writes and not for reads.
> > > > > > >
> > > > > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > > > > for a follow-up patch to fix this?
> > > > > > >
> > > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > > >
> > > > > > This should fail i-g-t...
> > > > > > -Chris
> > > > > >
> > > > >
> > > > > Daniel, how have I failed your plan?
> > > >
> > > > It should work ... Since the enclosing if-block checks for !cpu domain
> > > > (for either reads or writes) that implies that going into the gtt domain
> > > > is a noop (or better should be) wrt clflushing and we only wait for
> > > > outstanding gpu rendering. wait_rendering is an interface that's been
> > > > added afterwards. Unfortunately I've failed to explain this trickery in
> > > > either a comment or the commit message. Bad me ;-)
> > >
> > > The issue is that in the patch pwrite is not waiting for any outstanding
> > > GPU reads.
> >
> > Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in.
> > This /should/ have been caught by the gem_concurrent_blt subtests that
> > exercise pwrites ...
> >
> > Ben can you please check that this indeed blew up on igt? Should fail on
> > any platform, no special caching mode required.
>
> Actually it won't blow up since you always set readonly = false. But it'll
> kill the neat read-read optimization ...
> -Daniel
Doh! Sorry about this. Fixed locally.
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-02 12:46 ` Ville Syrjälä
@ 2013-09-04 0:20 ` Ben Widawsky
2013-09-04 7:31 ` Ville Syrjälä
0 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-09-04 0:20 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: Intel GFX, Ben Widawsky
On Mon, Sep 02, 2013 at 03:46:52PM +0300, Ville Syrjälä wrote:
> On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> >
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> >
> > Notice that this patch has no impact on functionality. I've decided to
> > save the actual change until the next patch because I think it's easier
> > to review that way. I'm happy to squash the two, or let Daniel do it on
> > merge.
> >
> > v2:
> > Make ggtt handle the quirky aliasing ppgtt
> > Add flags to bind object to support above
> > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > PPGTT (use NULLs to assert this)
> > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > happens on set cache levels
> > Use VMA for bind/unbind (Daniel, Ben)
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 69 +++++++++++++-----------
> > drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
> > 2 files changed, 140 insertions(+), 30 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index c9ed77a..377ca63 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -461,6 +461,36 @@ enum i915_cache_level {
> >
> > typedef uint32_t gen6_gtt_pte_t;
> >
> > +/**
> > + * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > + * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > + * object into/from the address space.
> > + *
> > + * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > + * will always be <= an objects lifetime. So object refcounting should cover us.
> > + */
> > +struct i915_vma {
> > + struct drm_mm_node node;
> > + struct drm_i915_gem_object *obj;
> > + struct i915_address_space *vm;
> > +
> > + /** This object's place on the active/inactive lists */
> > + struct list_head mm_list;
> > +
> > + struct list_head vma_link; /* Link in the object's VMA list */
> > +
> > + /** This vma's place in the batchbuffer or on the eviction list */
> > + struct list_head exec_list;
> > +
> > + /**
> > + * Used for performing relocations during execbuffer insertion.
> > + */
> > + struct hlist_node exec_node;
> > + unsigned long exec_handle;
> > + struct drm_i915_gem_exec_object2 *exec_entry;
> > +
> > +};
> > +
> > struct i915_address_space {
> > struct drm_mm mm;
> > struct drm_device *dev;
> > @@ -499,9 +529,18 @@ struct i915_address_space {
> > /* FIXME: Need a more generic return type */
> > gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > enum i915_cache_level level);
> > +
> > + /** Unmap an object from an address space. This usually consists of
> > + * setting the valid PTE entries to a reserved scratch page. */
> > + void (*unbind_vma)(struct i915_vma *vma);
> > void (*clear_range)(struct i915_address_space *vm,
> > unsigned int first_entry,
> > unsigned int num_entries);
> > + /* Map an object into an address space with the given cache flags. */
> > +#define GLOBAL_BIND (1<<0)
> > + void (*bind_vma)(struct i915_vma *vma,
> > + enum i915_cache_level cache_level,
> > + u32 flags);
> > void (*insert_entries)(struct i915_address_space *vm,
> > struct sg_table *st,
> > unsigned int first_entry,
> > @@ -548,36 +587,6 @@ struct i915_hw_ppgtt {
> > int (*enable)(struct drm_device *dev);
> > };
> >
> > -/**
> > - * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > - * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > - * object into/from the address space.
> > - *
> > - * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > - * will always be <= an objects lifetime. So object refcounting should cover us.
> > - */
> > -struct i915_vma {
> > - struct drm_mm_node node;
> > - struct drm_i915_gem_object *obj;
> > - struct i915_address_space *vm;
> > -
> > - /** This object's place on the active/inactive lists */
> > - struct list_head mm_list;
> > -
> > - struct list_head vma_link; /* Link in the object's VMA list */
> > -
> > - /** This vma's place in the batchbuffer or on the eviction list */
> > - struct list_head exec_list;
> > -
> > - /**
> > - * Used for performing relocations during execbuffer insertion.
> > - */
> > - struct hlist_node exec_node;
> > - unsigned long exec_handle;
> > - struct drm_i915_gem_exec_object2 *exec_entry;
> > -
> > -};
> > -
> > struct i915_ctx_hang_stats {
> > /* This context had batch pending when hang was declared */
> > unsigned batch_pending;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 212f6d8..d5a4580 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -57,6 +57,11 @@
> > #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
> > #define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)
> >
> > +static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > + enum i915_cache_level cache_level,
> > + u32 flags);
> > +static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
> > +
> > static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
> > enum i915_cache_level level)
> > {
> > @@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
> > ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> > ppgtt->enable = gen6_ppgtt_enable;
> > + ppgtt->base.unbind_vma = NULL;
> > ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > + ppgtt->base.bind_vma = NULL;
> > ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> > ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> > ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > @@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > cache_level);
> > }
> >
> > +static void __always_unused
> > +gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > + enum i915_cache_level cache_level,
> > + u32 flags)
> > +{
> > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > + WARN_ON(flags);
> > +
> > + gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> > +}
> > +
> > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > struct drm_i915_gem_object *obj)
> > {
> > @@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > obj->base.size >> PAGE_SHIFT);
> > }
> >
> > +static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
> > +{
> > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > + gen6_ppgtt_clear_range(vma->vm, entry,
> > + vma->obj->base.size >> PAGE_SHIFT);
> > +}
> > +
> > extern int intel_iommu_gfx_mapped;
> > /* Certain Gen5 chipsets require require idling the GPU before
> > * unmapping anything from the GTT when VT-d is enabled.
> > @@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> >
> > }
> >
> > +static void i915_ggtt_bind_vma(struct i915_vma *vma,
> > + enum i915_cache_level cache_level,
> > + u32 unused)
> > +{
> > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > + unsigned int flags = (cache_level == I915_CACHE_NONE) ?
> > + AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
> > +
> > + BUG_ON(!i915_is_ggtt(vma->vm));
> > + intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> > + vma->obj->has_global_gtt_mapping = 1;
> > +}
> > +
> > static void i915_ggtt_clear_range(struct i915_address_space *vm,
> > unsigned int first_entry,
> > unsigned int num_entries)
> > @@ -599,6 +639,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
> > intel_gtt_clear_range(first_entry, num_entries);
> > }
> >
> > +static void i915_ggtt_unbind_vma(struct i915_vma *vma)
> > +{
> > + const unsigned int first = vma->node.start >> PAGE_SHIFT;
> > + const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
> > +
> > + BUG_ON(!i915_is_ggtt(vma->vm));
> > + vma->obj->has_global_gtt_mapping = 0;
> > + intel_gtt_clear_range(first, size);
> > +}
> > +
> > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > + enum i915_cache_level cache_level,
> > + u32 flags)
> > +{
> > + struct drm_device *dev = vma->vm->dev;
> > + struct drm_i915_private *dev_priv = dev->dev_private;
> > + struct drm_i915_gem_object *obj = vma->obj;
> > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > + /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > + * the global, just use aliasing */
> > + if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
> > + !obj->has_global_gtt_mapping) {
> > + gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > + vma->obj->pages, entry, cache_level);
> > + vma->obj->has_aliasing_ppgtt_mapping = 1;
> > + return;
> > + }
> > +
> > + gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
> > + obj->has_global_gtt_mapping = 1;
> > +
> > +	/* Put the mapping in the aliasing PPGTT as well as the global GTT if we
> > +	 * have aliasing, but the user requested global. */
> > + if (dev_priv->mm.aliasing_ppgtt) {
> > + gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > + vma->obj->pages, entry, cache_level);
> > + vma->obj->has_aliasing_ppgtt_mapping = 1;
> > + }
> > +}
>
> There's a bit of duplication here. How about this:
> {
> if (!aliasing_ppgtt ||
> flags & GLOBAL_BIND ||
>     has_ggtt_mapping) {
> gen6_ppgtt_insert_entries()
^ I'll assume you meant ggtt
> has_global_gtt_mapping = true;
> }
>
> if (aliasing_ppgtt) {
> ppgtt_insert
> has_aliasing_ppgtt_mapping = true;
> }
> }
This looks like an improvement to me. Consider it done.
>
> Not sure if this was the same thing that bothered Chris.
>
> >
> > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > enum i915_cache_level cache_level)
> > @@ -627,6 +707,23 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> > obj->has_global_gtt_mapping = 0;
> > }
> >
> > +static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
> > +{
> > + struct drm_device *dev = vma->vm->dev;
> > + struct drm_i915_private *dev_priv = dev->dev_private;
> > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > + gen6_ggtt_clear_range(vma->vm, entry,
> > + vma->obj->base.size >> PAGE_SHIFT);
> > + vma->obj->has_global_gtt_mapping = 0;
> > + if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
> > + gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
> > + entry,
> > + vma->obj->base.size >> PAGE_SHIFT);
> > + vma->obj->has_aliasing_ppgtt_mapping = 0;
> > + }
> > +}
>
> gen6_ggtt_bind_vma() might not have added the ggtt mapping, so I
> suppose you should check has_global_gtt_mapping to avoid useless
> work. Also the dev_priv->mm.aliasing_ppgtt check is a bit
> pointless as there's no way has_aliasing_ppgtt_mapping would be
> set then if there's no aliasing ppgtt.
>
Good points for both. Hopefully changing the first one doesn't expose
any pre-existing bugs.
> > +
> > void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
> > {
> > struct drm_device *dev = obj->base.dev;
> > @@ -860,7 +957,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
> > DRM_ERROR("Scratch setup failed\n");
> >
> > dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
> > + dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
> > dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
> > + dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
> >
> > return ret;
> > }
> > @@ -892,7 +991,9 @@ static int i915_gmch_probe(struct drm_device *dev,
> >
> > dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> > dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
> > + dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
> > dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
> > + dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
> >
> > return 0;
> > }
> > --
> > 1.8.4
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Ville Syrjälä
> Intel OTC
--
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-04 0:20 ` Ben Widawsky
@ 2013-09-04 7:31 ` Ville Syrjälä
0 siblings, 0 replies; 31+ messages in thread
From: Ville Syrjälä @ 2013-09-04 7:31 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Tue, Sep 03, 2013 at 05:20:08PM -0700, Ben Widawsky wrote:
> On Mon, Sep 02, 2013 at 03:46:52PM +0300, Ville Syrjälä wrote:
> > On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > > From: Ben Widawsky <ben@bwidawsk.net>
> > >
> > > As we plumb the code with more VM information, it has become more
> > > obvious that the easiest way to deal with bind and unbind is to simply
> > > put the function pointers in the vm, and let those choose the correct
> > > way to handle the page table updates. This change allows many places in
> > > the code to simply be vm->bind, and not have to worry about
> > > distinguishing PPGTT vs GGTT.
> > >
> > > Notice that this patch has no impact on functionality. I've decided to
> > > save the actual change until the next patch because I think it's easier
> > > to review that way. I'm happy to squash the two, or let Daniel do it on
> > > merge.
> > >
> > > v2:
> > > Make ggtt handle the quirky aliasing ppgtt
> > > Add flags to bind object to support above
> > > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > > PPGTT (use NULLs to assert this)
> > > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > > happens on set cache levels
> > > Use VMA for bind/unbind (Daniel, Ben)
> > >
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > > drivers/gpu/drm/i915/i915_drv.h | 69 +++++++++++++-----------
> > > drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 140 insertions(+), 30 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index c9ed77a..377ca63 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -461,6 +461,36 @@ enum i915_cache_level {
> > >
> > > typedef uint32_t gen6_gtt_pte_t;
> > >
> > > +/**
> > > + * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > > + * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > > + * object into/from the address space.
> > > + *
> > > + * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > > + * will always be <= an objects lifetime. So object refcounting should cover us.
> > > + */
> > > +struct i915_vma {
> > > + struct drm_mm_node node;
> > > + struct drm_i915_gem_object *obj;
> > > + struct i915_address_space *vm;
> > > +
> > > + /** This object's place on the active/inactive lists */
> > > + struct list_head mm_list;
> > > +
> > > + struct list_head vma_link; /* Link in the object's VMA list */
> > > +
> > > + /** This vma's place in the batchbuffer or on the eviction list */
> > > + struct list_head exec_list;
> > > +
> > > + /**
> > > + * Used for performing relocations during execbuffer insertion.
> > > + */
> > > + struct hlist_node exec_node;
> > > + unsigned long exec_handle;
> > > + struct drm_i915_gem_exec_object2 *exec_entry;
> > > +
> > > +};
> > > +
> > > struct i915_address_space {
> > > struct drm_mm mm;
> > > struct drm_device *dev;
> > > @@ -499,9 +529,18 @@ struct i915_address_space {
> > > /* FIXME: Need a more generic return type */
> > > gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > > enum i915_cache_level level);
> > > +
> > > + /** Unmap an object from an address space. This usually consists of
> > > + * setting the valid PTE entries to a reserved scratch page. */
> > > + void (*unbind_vma)(struct i915_vma *vma);
> > > void (*clear_range)(struct i915_address_space *vm,
> > > unsigned int first_entry,
> > > unsigned int num_entries);
> > > + /* Map an object into an address space with the given cache flags. */
> > > +#define GLOBAL_BIND (1<<0)
> > > + void (*bind_vma)(struct i915_vma *vma,
> > > + enum i915_cache_level cache_level,
> > > + u32 flags);
> > > void (*insert_entries)(struct i915_address_space *vm,
> > > struct sg_table *st,
> > > unsigned int first_entry,
> > > @@ -548,36 +587,6 @@ struct i915_hw_ppgtt {
> > > int (*enable)(struct drm_device *dev);
> > > };
> > >
> > > -/**
> > > - * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > > - * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > > - * object into/from the address space.
> > > - *
> > > - * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > > - * will always be <= an objects lifetime. So object refcounting should cover us.
> > > - */
> > > -struct i915_vma {
> > > - struct drm_mm_node node;
> > > - struct drm_i915_gem_object *obj;
> > > - struct i915_address_space *vm;
> > > -
> > > - /** This object's place on the active/inactive lists */
> > > - struct list_head mm_list;
> > > -
> > > - struct list_head vma_link; /* Link in the object's VMA list */
> > > -
> > > - /** This vma's place in the batchbuffer or on the eviction list */
> > > - struct list_head exec_list;
> > > -
> > > - /**
> > > - * Used for performing relocations during execbuffer insertion.
> > > - */
> > > - struct hlist_node exec_node;
> > > - unsigned long exec_handle;
> > > - struct drm_i915_gem_exec_object2 *exec_entry;
> > > -
> > > -};
> > > -
> > > struct i915_ctx_hang_stats {
> > > /* This context had batch pending when hang was declared */
> > > unsigned batch_pending;
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 212f6d8..d5a4580 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -57,6 +57,11 @@
> > > #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
> > > #define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)
> > >
> > > +static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > > + enum i915_cache_level cache_level,
> > > + u32 flags);
> > > +static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
> > > +
> > > static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
> > > enum i915_cache_level level)
> > > {
> > > @@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > > ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
> > > ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> > > ppgtt->enable = gen6_ppgtt_enable;
> > > + ppgtt->base.unbind_vma = NULL;
> > > ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > > + ppgtt->base.bind_vma = NULL;
> > > ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> > > ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> > > ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > > @@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > > cache_level);
> > > }
> > >
> > > +static void __always_unused
> > > +gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > > + enum i915_cache_level cache_level,
> > > + u32 flags)
> > > +{
> > > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > + WARN_ON(flags);
> > > +
> > > + gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> > > +}
> > > +
> > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > > struct drm_i915_gem_object *obj)
> > > {
> > > @@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > > obj->base.size >> PAGE_SHIFT);
> > > }
> > >
> > > +static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
> > > +{
> > > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > + gen6_ppgtt_clear_range(vma->vm, entry,
> > > + vma->obj->base.size >> PAGE_SHIFT);
> > > +}
> > > +
> > > extern int intel_iommu_gfx_mapped;
> > > /* Certain Gen5 chipsets require require idling the GPU before
> > > * unmapping anything from the GTT when VT-d is enabled.
> > > @@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> > >
> > > }
> > >
> > > +static void i915_ggtt_bind_vma(struct i915_vma *vma,
> > > + enum i915_cache_level cache_level,
> > > + u32 unused)
> > > +{
> > > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > + unsigned int flags = (cache_level == I915_CACHE_NONE) ?
> > > + AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
> > > +
> > > + BUG_ON(!i915_is_ggtt(vma->vm));
> > > + intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> > > + vma->obj->has_global_gtt_mapping = 1;
> > > +}
> > > +
> > > static void i915_ggtt_clear_range(struct i915_address_space *vm,
> > > unsigned int first_entry,
> > > unsigned int num_entries)
> > > @@ -599,6 +639,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
> > > intel_gtt_clear_range(first_entry, num_entries);
> > > }
> > >
> > > +static void i915_ggtt_unbind_vma(struct i915_vma *vma)
> > > +{
> > > + const unsigned int first = vma->node.start >> PAGE_SHIFT;
> > > + const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
> > > +
> > > + BUG_ON(!i915_is_ggtt(vma->vm));
> > > + vma->obj->has_global_gtt_mapping = 0;
> > > + intel_gtt_clear_range(first, size);
> > > +}
> > > +
> > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > + enum i915_cache_level cache_level,
> > > + u32 flags)
> > > +{
> > > + struct drm_device *dev = vma->vm->dev;
> > > + struct drm_i915_private *dev_priv = dev->dev_private;
> > > + struct drm_i915_gem_object *obj = vma->obj;
> > > + const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > + /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > + * the global, just use aliasing */
> > > + if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
> > > + !obj->has_global_gtt_mapping) {
> > > + gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > > + vma->obj->pages, entry, cache_level);
> > > + vma->obj->has_aliasing_ppgtt_mapping = 1;
> > > + return;
> > > + }
> > > +
> > > + gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
> > > + obj->has_global_gtt_mapping = 1;
> > > +
> > > + /* Put the mapping in the aliasing PPGTT as well as the global GTT if
> > > + * we have aliasing, but the user requested global. */
> > > + if (dev_priv->mm.aliasing_ppgtt) {
> > > + gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > > + vma->obj->pages, entry, cache_level);
> > > + vma->obj->has_aliasing_ppgtt_mapping = 1;
> > > + }
> > > +}
> >
> > There's a bit of duplication here. How about this:
> > {
> > if (!aliasing_ppgtt ||
> > flags & GLOBAL_BIND ||
> > > has_ggtt_mapping) {
> > gen6_ppgtt_insert_entries()
> ^ I'll assume you meant ggtt
Yes.
Just got a funny mental image from this incident:
"Look here boy, this is how it's done!" -> crash and burn
<snip the rest>
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
@ 2013-09-11 21:57 ` Ben Widawsky
2013-09-11 22:25 ` Chris Wilson
0 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-09-11 21:57 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky
From: Ben Widawsky <ben@bwidawsk.net>
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
Notes:
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read (Ville).
Change logic in unbind to only unbind ggtt when there is a global mapping, and
to remove a redundant check if the aliasing ppgtt exists.
drivers/gpu/drm/i915/i915_drv.h | 69 +++++++++++++-----------
drivers/gpu/drm/i915/i915_gem_gtt.c | 102 ++++++++++++++++++++++++++++++++++++
2 files changed, 141 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 09a72c8..a67b9a0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -465,6 +465,36 @@ enum i915_cache_level {
typedef uint32_t gen6_gtt_pte_t;
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+ struct drm_mm_node node;
+ struct drm_i915_gem_object *obj;
+ struct i915_address_space *vm;
+
+ /** This object's place on the active/inactive lists */
+ struct list_head mm_list;
+
+ struct list_head vma_link; /* Link in the object's VMA list */
+
+ /** This vma's place in the batchbuffer or on the eviction list */
+ struct list_head exec_list;
+
+ /**
+ * Used for performing relocations during execbuffer insertion.
+ */
+ struct hlist_node exec_node;
+ unsigned long exec_handle;
+ struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
struct i915_address_space {
struct drm_mm mm;
struct drm_device *dev;
@@ -503,9 +533,18 @@ struct i915_address_space {
/* FIXME: Need a more generic return type */
gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
enum i915_cache_level level);
+
+ /** Unmap an object from an address space. This usually consists of
+ * setting the valid PTE entries to a reserved scratch page. */
+ void (*unbind_vma)(struct i915_vma *vma);
void (*clear_range)(struct i915_address_space *vm,
unsigned int first_entry,
unsigned int num_entries);
+ /* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+ void (*bind_vma)(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags);
void (*insert_entries)(struct i915_address_space *vm,
struct sg_table *st,
unsigned int first_entry,
@@ -552,36 +591,6 @@ struct i915_hw_ppgtt {
int (*enable)(struct drm_device *dev);
};
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
- struct drm_mm_node node;
- struct drm_i915_gem_object *obj;
- struct i915_address_space *vm;
-
- /** This object's place on the active/inactive lists */
- struct list_head mm_list;
-
- struct list_head vma_link; /* Link in the object's VMA list */
-
- /** This vma's place in the batchbuffer or on the eviction list */
- struct list_head exec_list;
-
- /**
- * Used for performing relocations during execbuffer insertion.
- */
- struct hlist_node exec_node;
- unsigned long exec_handle;
- struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
struct i915_ctx_hang_stats {
/* This context had batch pending when hang was declared */
unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 212f6d8..1d4e430 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -57,6 +57,11 @@
#define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
enum i915_cache_level level)
{
@@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
ppgtt->enable = gen6_ppgtt_enable;
+ ppgtt->base.unbind_vma = NULL;
ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+ ppgtt->base.bind_vma = NULL;
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
cache_level);
}
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ WARN_ON(flags);
+
+ gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
struct drm_i915_gem_object *obj)
{
@@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
obj->base.size >> PAGE_SHIFT);
}
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ gen6_ppgtt_clear_range(vma->vm, entry,
+ vma->obj->base.size >> PAGE_SHIFT);
+}
+
extern int intel_iommu_gfx_mapped;
/* Certain Gen5 chipsets require require idling the GPU before
* unmapping anything from the GTT when VT-d is enabled.
@@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
}
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 unused)
+{
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+ unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+ AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+ BUG_ON(!i915_is_ggtt(vma->vm));
+ intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+ vma->obj->has_global_gtt_mapping = 1;
+}
+
static void i915_ggtt_clear_range(struct i915_address_space *vm,
unsigned int first_entry,
unsigned int num_entries)
@@ -599,6 +639,43 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
intel_gtt_clear_range(first_entry, num_entries);
}
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+ const unsigned int first = vma->node.start >> PAGE_SHIFT;
+ const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+ BUG_ON(!i915_is_ggtt(vma->vm));
+ vma->obj->has_global_gtt_mapping = 0;
+ intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+ struct drm_device *dev = vma->vm->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+ * the global, just use aliasing */
+ if (!dev_priv->mm.aliasing_ppgtt ||
+ (flags & GLOBAL_BIND) ||
+ obj->has_global_gtt_mapping) {
+ gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
+ cache_level);
+ obj->has_global_gtt_mapping = 1;
+ }
+
+ /* Put the mapping in the aliasing PPGTT as well as the global GTT if
+ * we have aliasing, but the user requested global. */
+ if (dev_priv->mm.aliasing_ppgtt) {
+ gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+ vma->obj->pages, entry, cache_level);
+ vma->obj->has_aliasing_ppgtt_mapping = 1;
+ }
+}
void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
enum i915_cache_level cache_level)
@@ -627,6 +704,27 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
obj->has_global_gtt_mapping = 0;
}
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+ struct drm_device *dev = vma->vm->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_gem_object *obj = vma->obj;
+ const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+ if (obj->has_global_gtt_mapping) {
+ gen6_ggtt_clear_range(vma->vm, entry,
+ vma->obj->base.size >> PAGE_SHIFT);
+ obj->has_global_gtt_mapping = 0;
+ }
+
+ if (obj->has_aliasing_ppgtt_mapping) {
+ gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+ entry,
+ obj->base.size >> PAGE_SHIFT);
+ obj->has_aliasing_ppgtt_mapping = 0;
+ }
+}
+
void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
{
struct drm_device *dev = obj->base.dev;
@@ -860,7 +958,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
DRM_ERROR("Scratch setup failed\n");
dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+ dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+ dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
return ret;
}
@@ -892,7 +992,9 @@ static int i915_gmch_probe(struct drm_device *dev,
dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+ dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+ dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
return 0;
}
--
1.8.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-11 21:57 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-09-11 22:25 ` Chris Wilson
2013-09-11 22:49 ` Daniel Vetter
0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-09-11 22:25 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 02:57:53PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
>
> As we plumb the code with more VM information, it has become more
> obvious that the easiest way to deal with bind and unbind is to simply
> put the function pointers in the vm, and let those choose the correct
> way to handle the page table updates. This change allows many places in
> the code to simply be vm->bind, and not have to worry about
> distinguishing PPGTT vs GGTT.
>
> Notice that this patch has no impact on functionality. I've decided to
> save the actual change until the next patch because I think it's easier
> to review that way. I'm happy to squash the two, or let Daniel do it on
> merge.
Oh, this is disappointing. The GLOBAL_BIND flag is a let down. Is there
any way we can have the aliasing_ppgtt as a separate vm and so keep the
purity?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM
2013-09-11 22:25 ` Chris Wilson
@ 2013-09-11 22:49 ` Daniel Vetter
0 siblings, 0 replies; 31+ messages in thread
From: Daniel Vetter @ 2013-09-11 22:49 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX, Ben Widawsky
On Wed, Sep 11, 2013 at 11:25:50PM +0100, Chris Wilson wrote:
> On Wed, Sep 11, 2013 at 02:57:53PM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> >
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> >
> > Notice that this patch has no impact on functionality. I've decided to
> > save the actual change until the next patch because I think it's easier
> > to review that way. I'm happy to squash the two, or let Daniel do it on
> > merge.
>
> Oh, this is disappointing. The GLOBAL_BIND flag is a let down. Is there
> any way we can have the aliasing_ppgtt as a separate vm and so keep the
> purity?
The problem is that the vm manages the address space, and it's paramount
on snb that we can line up objects into both the ggtt and ppgtt at will at
the exact same offset. Otherwise at least pipe control w/a won't work and
reading between bspec gives more promises of things that will break.
Adding complex logic to punt out such holes doesn't sound cleaner nor
simpler than the magic flags we're using here. And I have to admit that I
couldn't come up with another idea (besides flat-out killing aliasing
ppgtt on snb). So ideas welcome ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2013-09-11 22:49 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
2013-08-30 23:43 [PATCH 0/8] [RESEND] VMA patches Ben Widawsky
2013-08-30 23:43 ` [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
2013-08-30 23:50 ` Chris Wilson
2013-08-31 3:39 ` Ben Widawsky
2013-09-02 6:32 ` Daniel Vetter
2013-09-02 13:14 ` Chris Wilson
2013-09-02 14:12 ` Daniel Vetter
2013-09-03 16:08 ` Daniel Vetter
2013-09-03 23:53 ` Ben Widawsky
2013-08-30 23:43 ` [PATCH 2/8] drm/i915: Extract vm specific part of eviction Ben Widawsky
2013-08-30 23:52 ` Chris Wilson
2013-08-31 3:39 ` Ben Widawsky
2013-08-30 23:43 ` [PATCH 3/8] drm/i915: evict VM instead of everything Ben Widawsky
2013-08-31 0:04 ` Chris Wilson
2013-08-30 23:43 ` [PATCH 4/8] drm/i915: trace vm eviction " Ben Widawsky
2013-08-31 0:06 ` Chris Wilson
2013-08-31 3:40 ` Ben Widawsky
2013-08-30 23:43 ` [PATCH 5/8] drm/i915: Convert active API to VMA Ben Widawsky
2013-08-31 0:08 ` Chris Wilson
2013-08-30 23:43 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
2013-08-31 0:12 ` Chris Wilson
2013-08-31 3:40 ` Ben Widawsky
2013-09-03 14:48 ` Chris Wilson
2013-09-02 12:46 ` Ville Syrjälä
2013-09-04 0:20 ` Ben Widawsky
2013-09-04 7:31 ` Ville Syrjälä
2013-08-30 23:44 ` [PATCH 7/8] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-08-30 23:44 ` [PATCH 8/8] drm/i915: eliminate vm->insert_entries() Ben Widawsky
-- strict thread matches above, loose matches on Subject: below --
2013-09-11 21:57 [PATCH 1/8] drm/i915: Synchronize pread/pwrite with wait_rendering Ben Widawsky
2013-09-11 21:57 ` [PATCH 6/8] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
2013-09-11 22:25 ` Chris Wilson
2013-09-11 22:49 ` Daniel Vetter