[RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object

intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed

* [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object
@ 2014-10-30 16:39 Tvrtko Ursulin
  2014-10-30 16:39 ` [PATCH 1/5] drm/i915: Move flags describing VMA mappings into the VMA Tvrtko Ursulin
                   ` (5 more replies)
  0 siblings, 6 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-10-30 16:39 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Assorted preparatory patches towards enabling multiple VMA and GGTT
views for a single GEM object.

Tvrtko Ursulin (12):
  drm/i915: Move flags describing VMA mappings into the VMA
  drm/i915: Prepare for the idea there can be multiple VMA per VM.
  drm/i915: Infrastructure for supporting different GGTT views per
    object
  drm/i915: Make scatter-gather helper available to the driver
  drm/i915: Make intel_pin_and_fence_fb_obj take plane and framebuffer

-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/5] drm/i915: Move flags describing VMA mappings into the VMA
  2014-10-30 16:39 [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object Tvrtko Ursulin
@ 2014-10-30 16:39 ` Tvrtko Ursulin
  2014-10-30 16:39 ` [RFC 2/5] drm/i915: Prepare for the idea there can be multiple VMA per VM Tvrtko Ursulin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-10-30 16:39 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If these flags are on the object level it will be more difficult to allow
for multiple VMAs per object.

v2: Simplification and cleanup after code review comments (Chris Wilson).

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  2 +-
 drivers/gpu/drm/i915/i915_drv.h            |  2 --
 drivers/gpu/drm/i915/i915_gem.c            |  4 ++--
 drivers/gpu/drm/i915/i915_gem_context.c    | 10 +++++-----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  9 +++------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 24 ++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_gtt.h        |  8 ++++++--
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      | 10 ++++++----
 9 files changed, 36 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d8d08fc..6a15fb6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -118,7 +118,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj)
 
 static inline const char *get_global_flag(struct drm_i915_gem_object *obj)
 {
-	return obj->has_global_gtt_mapping ? "g" : " ";
+	return i915_gem_obj_to_ggtt(obj) ? "g" : " ";
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ce3f01f..a830b85 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1965,8 +1965,6 @@ struct drm_i915_gem_object {
 	unsigned long gt_ro:1;
 	unsigned int cache_level:3;
 
-	unsigned int has_aliasing_ppgtt_mapping:1;
-	unsigned int has_global_gtt_mapping:1;
 	unsigned int has_dma_mapping:1;
 
 	unsigned int frontbuffer_bits:INTEL_FRONTBUFFER_BITS;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index edab8f1..255a431 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3700,7 +3700,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
 			if (drm_mm_node_allocated(&vma->node))
 				vma->bind_vma(vma, cache_level,
-					      obj->has_global_gtt_mapping ? GLOBAL_BIND : 0);
+						vma->bound & GLOBAL_BIND);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -4096,7 +4096,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 			return PTR_ERR(vma);
 	}
 
-	if (flags & PIN_GLOBAL && !obj->has_global_gtt_mapping)
+	if (flags & PIN_GLOBAL && !(vma->bound & GLOBAL_BIND))
 		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	vma->pin_count++;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index bf41f41..5ff6e94 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -525,6 +525,7 @@ static int do_switch(struct intel_engine_cs *ring,
 	struct intel_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	bool uninitialized = false;
+	struct i915_vma *vma;
 	int ret, i;
 
 	if (from != NULL && ring == &dev_priv->ring[RCS]) {
@@ -574,11 +575,10 @@ static int do_switch(struct intel_engine_cs *ring,
 	if (ret)
 		goto unpin_out;
 
-	if (!to->legacy_hw_ctx.rcs_state->has_global_gtt_mapping) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(to->legacy_hw_ctx.rcs_state,
-							   &dev_priv->gtt.base);
-		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level, GLOBAL_BIND);
-	}
+	vma = i915_gem_obj_to_ggtt(to->legacy_hw_ctx.rcs_state);
+	if (!(vma->bound & GLOBAL_BIND))
+		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level,
+				GLOBAL_BIND);
 
 	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f8cc935..8d56d5b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -357,12 +357,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	 * through the ppgtt for non_secure batchbuffers. */
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
-	    !target_i915_obj->has_global_gtt_mapping)) {
-		struct i915_vma *vma =
-			list_first_entry(&target_i915_obj->vma_list,
-					 typeof(*vma), vma_link);
-		vma->bind_vma(vma, target_i915_obj->cache_level, GLOBAL_BIND);
-	}
+	    !(target_vma->bound & GLOBAL_BIND)))
+		target_vma->bind_vma(target_vma, target_i915_obj->cache_level,
+				GLOBAL_BIND);
 
 	/* Validate that the target is in a valid r/w GPU domain */
 	if (unlikely(reloc->write_domain & (reloc->write_domain - 1))) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b80ab86..814a805 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1324,7 +1324,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 		 * Unfortunately above, we've just wiped out the mappings
 		 * without telling our object about it. So we need to fake it.
 		 */
-		obj->has_global_gtt_mapping = 0;
+		vma->bound &= ~GLOBAL_BIND;
 		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 	}
 
@@ -1521,7 +1521,7 @@ static void i915_ggtt_bind_vma(struct i915_vma *vma,
 
 	BUG_ON(!i915_is_ggtt(vma->vm));
 	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
-	vma->obj->has_global_gtt_mapping = 1;
+	vma->bound = GLOBAL_BIND;
 }
 
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
@@ -1540,7 +1540,7 @@ static void i915_ggtt_unbind_vma(struct i915_vma *vma)
 	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
 
 	BUG_ON(!i915_is_ggtt(vma->vm));
-	vma->obj->has_global_gtt_mapping = 0;
+	vma->bound = 0;
 	intel_gtt_clear_range(first, size);
 }
 
@@ -1568,24 +1568,24 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	 * flags. At all other times, the GPU will use the aliasing PPGTT.
 	 */
 	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
-		if (!obj->has_global_gtt_mapping ||
+		if (!(vma->bound & GLOBAL_BIND) ||
 		    (cache_level != obj->cache_level)) {
 			vma->vm->insert_entries(vma->vm, obj->pages,
 						vma->node.start,
 						cache_level, flags);
-			obj->has_global_gtt_mapping = 1;
+			vma->bound |= GLOBAL_BIND;
 		}
 	}
 
 	if (dev_priv->mm.aliasing_ppgtt &&
-	    (!obj->has_aliasing_ppgtt_mapping ||
+	    (!(vma->bound & LOCAL_BIND) ||
 	     (cache_level != obj->cache_level))) {
 		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
 		appgtt->base.insert_entries(&appgtt->base,
 					    vma->obj->pages,
 					    vma->node.start,
 					    cache_level, flags);
-		vma->obj->has_aliasing_ppgtt_mapping = 1;
+		vma->bound |= LOCAL_BIND;
 	}
 }
 
@@ -1595,21 +1595,21 @@ static void ggtt_unbind_vma(struct i915_vma *vma)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj = vma->obj;
 
-	if (obj->has_global_gtt_mapping) {
+	if (vma->bound & GLOBAL_BIND) {
 		vma->vm->clear_range(vma->vm,
 				     vma->node.start,
 				     obj->base.size,
 				     true);
-		obj->has_global_gtt_mapping = 0;
+		vma->bound &= ~GLOBAL_BIND;
 	}
 
-	if (obj->has_aliasing_ppgtt_mapping) {
+	if (vma->bound & LOCAL_BIND) {
 		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
 		appgtt->base.clear_range(&appgtt->base,
 					 vma->node.start,
 					 obj->base.size,
 					 true);
-		obj->has_aliasing_ppgtt_mapping = 0;
+		vma->bound &= ~LOCAL_BIND;
 	}
 }
 
@@ -1687,7 +1687,7 @@ int i915_gem_setup_global_gtt(struct drm_device *dev,
 			DRM_DEBUG_KMS("Reservation failed: %i\n", ret);
 			return ret;
 		}
-		obj->has_global_gtt_mapping = 1;
+		vma->bound |= GLOBAL_BIND;
 	}
 
 	dev_priv->gtt.base.start = start;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d5c14af..d0562d0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -123,6 +123,12 @@ struct i915_vma {
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
 
+	/** Flags and address space this VMA is bound to */
+#define GLOBAL_BIND	(1<<0)
+#define LOCAL_BIND	(1<<1)
+#define PTE_READ_ONLY	(1<<2)
+	unsigned int bound : 4;
+
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
 
@@ -155,8 +161,6 @@ struct i915_vma {
 	 * setting the valid PTE entries to a reserved scratch page. */
 	void (*unbind_vma)(struct i915_vma *vma);
 	/* Map an object into an address space with the given cache flags. */
-#define GLOBAL_BIND (1<<0)
-#define PTE_READ_ONLY (1<<1)
 	void (*bind_vma)(struct i915_vma *vma,
 			 enum i915_cache_level cache_level,
 			 u32 flags);
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 85fda6b..c388918 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -533,7 +533,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 		}
 	}
 
-	obj->has_global_gtt_mapping = 1;
+	vma->bound |= GLOBAL_BIND;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 053d99e..80f0c80 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -567,6 +567,7 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 			 struct i915_address_space *vm)
 {
 	struct drm_i915_error_object *dst;
+	struct i915_vma *vma = NULL;
 	int num_pages;
 	bool use_ggtt;
 	int i = 0;
@@ -587,16 +588,17 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 		dst->gtt_offset = -1;
 
 	reloc_offset = dst->gtt_offset;
+	if (i915_is_ggtt(vm))
+		vma = i915_gem_obj_to_ggtt(src);
 	use_ggtt = (src->cache_level == I915_CACHE_NONE &&
-		    i915_is_ggtt(vm) &&
-		    src->has_global_gtt_mapping &&
-		    reloc_offset + num_pages * PAGE_SIZE <= dev_priv->gtt.mappable_end);
+		   vma && (vma->bound & GLOBAL_BIND) &&
+		   reloc_offset + num_pages * PAGE_SIZE <= dev_priv->gtt.mappable_end);
 
 	/* Cannot access stolen address directly, try to use the aperture */
 	if (src->stolen) {
 		use_ggtt = true;
 
-		if (!src->has_global_gtt_mapping)
+		if (!(vma && vma->bound & GLOBAL_BIND))
 			goto unwind;
 
 		reloc_offset = i915_gem_obj_ggtt_offset(src);
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC 2/5] drm/i915: Prepare for the idea there can be multiple VMA per VM.
  2014-10-30 16:39 [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object Tvrtko Ursulin
  2014-10-30 16:39 ` [PATCH 1/5] drm/i915: Move flags describing VMA mappings into the VMA Tvrtko Ursulin
@ 2014-10-30 16:39 ` Tvrtko Ursulin
  2014-10-30 16:39 ` [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-10-30 16:39 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

With multiple VMA per object and per address space we need some sort of
key to distinguish between then for which there is a new 'id' field.

Old API which does not know or care about this will use a default id.

I have tried to identify all places which assume there can only be one.
It compiles and runs the most basic stuff but this is more for
soliciting feedback and review.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c   |  4 ++--
 drivers/gpu/drm/i915/i915_gem.c       | 27 +++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  6 ++++--
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  7 +++++++
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 ++------
 5 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6a15fb6..73d98cc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -156,8 +156,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 			seq_puts(m, " (pp");
 		else
 			seq_puts(m, " (g");
-		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
-			   vma->node.start, vma->node.size);
+		seq_printf(m, "gtt offset: %08lx, size: %08lx, id: %u)",
+			   vma->node.start, vma->node.size, vma->id);
 	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 255a431..7f283d8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2009,8 +2009,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 			drm_gem_object_reference(&obj->base);
 
 			list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
-				if (i915_vma_unbind(vma))
-					break;
+				i915_vma_unbind(vma);
 
 			if (i915_gem_object_put_pages(obj) == 0)
 				count += obj->base.size >> PAGE_SHIFT;
@@ -2211,17 +2210,14 @@ void i915_vma_move_to_active(struct i915_vma *vma,
 static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
-	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-	struct i915_address_space *vm;
 	struct i915_vma *vma;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		vma = i915_gem_obj_to_vma(obj, vm);
-		if (vma && !list_empty(&vma->mm_list))
-			list_move_tail(&vma->mm_list, &vm->inactive_list);
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (!list_empty(&vma->mm_list))
+			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
 	}
 
 	intel_fb_obj_flush(obj, true);
@@ -4493,7 +4489,7 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 {
 	struct i915_vma *vma;
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
-		if (vma->vm == vm)
+		if (vma->vm == vm && vma->id == VMA_ID_DEFAULT)
 			return vma;
 
 	return NULL;
@@ -5188,7 +5184,7 @@ unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
 	WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base);
 
 	list_for_each_entry(vma, &o->vma_list, vma_link) {
-		if (vma->vm == vm)
+		if (vma->vm == vm && vma->id == VMA_ID_DEFAULT)
 			return vma->node.start;
 
 	}
@@ -5336,9 +5332,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
 {
 	struct i915_vma *vma;
 
-	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
-	if (vma->vm != i915_obj_to_ggtt(obj))
-		return NULL;
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (vma->vm != i915_obj_to_ggtt(obj))
+			continue;
+		if (vma->id == VMA_ID_DEFAULT)
+			return vma;
+	}
 
-	return vma;
+	return NULL;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 814a805..e4ee2ac 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2127,7 +2127,8 @@ int i915_gem_gtt_init(struct drm_device *dev)
 }
 
 static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
-					      struct i915_address_space *vm)
+					      struct i915_address_space *vm,
+					      unsigned int id)
 {
 	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
 	if (vma == NULL)
@@ -2138,6 +2139,7 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&vma->exec_list);
 	vma->vm = vm;
 	vma->obj = obj;
+	vma->id = id;
 
 	switch (INTEL_INFO(vm->dev)->gen) {
 	case 9:
@@ -2183,7 +2185,7 @@ i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
 
 	vma = i915_gem_obj_to_vma(obj, vm);
 	if (!vma)
-		vma = __i915_gem_vma_create(obj, vm);
+		vma = __i915_gem_vma_create(obj, vm, VMA_ID_DEFAULT);
 
 	return vma;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d0562d0..66bc44b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -128,6 +128,13 @@ struct i915_vma {
 #define LOCAL_BIND	(1<<1)
 #define PTE_READ_ONLY	(1<<2)
 	unsigned int bound : 4;
+	/**
+	 * With multiple VMA mappings per object id distinguishes between
+	 * the ones in the same address space. The default one of zero is
+	 * for users of the legacy API.
+	 */
+#define VMA_ID_DEFAULT	(0)
+	unsigned int id;
 
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 80f0c80..50f4294 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -719,10 +719,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
 			break;
 
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
-			if (vma->vm == vm && vma->pin_count > 0) {
+			if (vma->vm == vm && vma->pin_count > 0)
 				capture_bo(err++, vma);
-				break;
-			}
 	}
 
 	return err - first;
@@ -1098,10 +1096,8 @@ static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
-			if (vma->vm == vm && vma->pin_count > 0) {
+			if (vma->vm == vm && vma->pin_count > 0)
 				i++;
-				break;
-			}
 	}
 	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
 
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-10-30 16:39 [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object Tvrtko Ursulin
  2014-10-30 16:39 ` [PATCH 1/5] drm/i915: Move flags describing VMA mappings into the VMA Tvrtko Ursulin
  2014-10-30 16:39 ` [RFC 2/5] drm/i915: Prepare for the idea there can be multiple VMA per VM Tvrtko Ursulin
@ 2014-10-30 16:39 ` Tvrtko Ursulin
  2014-10-30 16:41   ` Chris Wilson
  2014-11-03 15:58   ` Daniel Vetter
  2014-10-30 16:39 ` [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver Tvrtko Ursulin
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-10-30 16:39 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Things like reliable GGTT mappings and mirrored 3d display will need to be
to map the same object twice into the GGTT.

Add a ggtt_view field per VMA and select the page view based on the type
of the view.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h            |  3 +++
 drivers/gpu/drm/i915/i915_gem.c            |  9 ++++---
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 38 +++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_gtt.h        | 24 ++++++++++++++++++-
 6 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a830b85..92dec19 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2518,6 +2518,9 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm,
 				     uint32_t alignment,
 				     uint64_t flags);
+
+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
+			u32 flags);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7f283d8..d7027f9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3487,7 +3487,7 @@ search_free:
 	WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
 
 	trace_i915_vma_bind(vma, flags);
-	vma->bind_vma(vma, obj->cache_level,
+	i915_vma_bind(vma, obj->cache_level,
 		      flags & (PIN_MAPPABLE | PIN_GLOBAL) ? GLOBAL_BIND : 0);
 
 	return vma;
@@ -3695,7 +3695,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
 			if (drm_mm_node_allocated(&vma->node))
-				vma->bind_vma(vma, cache_level,
+				i915_vma_bind(vma, cache_level,
 						vma->bound & GLOBAL_BIND);
 	}
 
@@ -4093,7 +4093,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (flags & PIN_GLOBAL && !(vma->bound & GLOBAL_BIND))
-		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
 
 	vma->pin_count++;
 	if (flags & PIN_MAPPABLE)
@@ -4511,6 +4511,9 @@ void i915_gem_vma_destroy(struct i915_vma *vma)
 
 	list_del(&vma->vma_link);
 
+	if (vma->ggtt_view.put_pages)
+		vma->ggtt_view.put_pages(&vma->ggtt_view);
+
 	kfree(vma);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5ff6e94..d365348 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -577,7 +577,7 @@ static int do_switch(struct intel_engine_cs *ring,
 
 	vma = i915_gem_obj_to_ggtt(to->legacy_hw_ctx.rcs_state);
 	if (!(vma->bound & GLOBAL_BIND))
-		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level,
+		i915_vma_bind(vma, to->legacy_hw_ctx.rcs_state->cache_level,
 				GLOBAL_BIND);
 
 	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 8d56d5b..5ba9440 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -358,7 +358,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !(target_vma->bound & GLOBAL_BIND)))
-		target_vma->bind_vma(target_vma, target_i915_obj->cache_level,
+		i915_vma_bind(target_vma, target_i915_obj->cache_level,
 				GLOBAL_BIND);
 
 	/* Validate that the target is in a valid r/w GPU domain */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e4ee2ac..c0e7cd9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -30,6 +30,8 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
+const struct i915_ggtt_view i915_ggtt_view_normal;
+
 static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv);
 static void chv_setup_private_ppat(struct drm_i915_private *dev_priv);
 
@@ -71,7 +73,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 }
 
 
-static void ppgtt_bind_vma(struct i915_vma *vma,
+static void ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			   enum i915_cache_level cache_level,
 			   u32 flags);
 static void ppgtt_unbind_vma(struct i915_vma *vma);
@@ -1194,7 +1196,7 @@ void  i915_ppgtt_release(struct kref *kref)
 }
 
 static void
-ppgtt_bind_vma(struct i915_vma *vma,
+ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 	       enum i915_cache_level cache_level,
 	       u32 flags)
 {
@@ -1202,7 +1204,7 @@ ppgtt_bind_vma(struct i915_vma *vma,
 	if (vma->obj->gt_ro)
 		flags |= PTE_READ_ONLY;
 
-	vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
+	vma->vm->insert_entries(vma->vm, pages, vma->node.start,
 				cache_level, flags);
 }
 
@@ -1325,7 +1327,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 		 * without telling our object about it. So we need to fake it.
 		 */
 		vma->bound &= ~GLOBAL_BIND;
-		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
 	}
 
 
@@ -1511,7 +1513,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 }
 
 
-static void i915_ggtt_bind_vma(struct i915_vma *vma,
+static void i915_ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			       enum i915_cache_level cache_level,
 			       u32 unused)
 {
@@ -1520,7 +1522,7 @@ static void i915_ggtt_bind_vma(struct i915_vma *vma,
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
 	BUG_ON(!i915_is_ggtt(vma->vm));
-	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	intel_gtt_insert_sg_entries(pages, entry, flags);
 	vma->bound = GLOBAL_BIND;
 }
 
@@ -1544,7 +1546,7 @@ static void i915_ggtt_unbind_vma(struct i915_vma *vma)
 	intel_gtt_clear_range(first, size);
 }
 
-static void ggtt_bind_vma(struct i915_vma *vma,
+static void ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			  enum i915_cache_level cache_level,
 			  u32 flags)
 {
@@ -1570,7 +1572,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
 		if (!(vma->bound & GLOBAL_BIND) ||
 		    (cache_level != obj->cache_level)) {
-			vma->vm->insert_entries(vma->vm, obj->pages,
+			vma->vm->insert_entries(vma->vm, pages,
 						vma->node.start,
 						cache_level, flags);
 			vma->bound |= GLOBAL_BIND;
@@ -1582,7 +1584,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	     (cache_level != obj->cache_level))) {
 		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
 		appgtt->base.insert_entries(&appgtt->base,
-					    vma->obj->pages,
+					    pages,
 					    vma->node.start,
 					    cache_level, flags);
 		vma->bound |= LOCAL_BIND;
@@ -2189,3 +2191,21 @@ i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
 
 	return vma;
 }
+
+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
+			u32 flags)
+{
+	struct sg_table *pages;
+
+	if (vma->ggtt_view.get_pages)
+		pages = vma->ggtt_view.get_pages(&vma->ggtt_view, vma->obj);
+	else
+		pages = vma->obj->pages;
+
+	if (pages && !IS_ERR(pages)) {
+		vma->bind_vma(vma, pages, cache_level, flags);
+
+		if (vma->ggtt_view.put_pages)
+			vma->ggtt_view.put_pages(&vma->ggtt_view);
+	}
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 66bc44b..cbaddda 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -109,6 +109,23 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+enum i915_ggtt_view_type {
+	I915_GGTT_VIEW_NORMAL = 0,
+};
+
+struct i915_ggtt_view {
+	enum i915_ggtt_view_type type;
+	unsigned int vma_id;
+	struct sg_table *pages;
+
+	struct sg_table *(*get_pages)(struct i915_ggtt_view *ggtt_view,
+				      struct drm_i915_gem_object *obj);
+
+	void (*put_pages)(struct i915_ggtt_view *ggtt_view);
+};
+
+extern const struct i915_ggtt_view i915_ggtt_view_normal;
+
 enum i915_cache_level;
 /**
  * A VMA represents a GEM BO that is bound into an address space. Therefore, a
@@ -136,6 +153,11 @@ struct i915_vma {
 #define VMA_ID_DEFAULT	(0)
 	unsigned int id;
 
+	/**
+	 * Support different GGTT views into the same object.
+	 */
+	struct i915_ggtt_view ggtt_view;
+
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
 
@@ -168,7 +190,7 @@ struct i915_vma {
 	 * setting the valid PTE entries to a reserved scratch page. */
 	void (*unbind_vma)(struct i915_vma *vma);
 	/* Map an object into an address space with the given cache flags. */
-	void (*bind_vma)(struct i915_vma *vma,
+	void (*bind_vma)(struct i915_vma *vma, struct sg_table *pages,
 			 enum i915_cache_level cache_level,
 			 u32 flags);
 };
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver
  2014-10-30 16:39 [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2014-10-30 16:39 ` [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
@ 2014-10-30 16:39 ` Tvrtko Ursulin
  2014-11-03 16:02   ` Daniel Vetter
  2014-11-03 16:05   ` Daniel Vetter
  2014-10-30 16:39 ` [RFC 5/5] drm/i915: Make intel_pin_and_fence_fb_obj take plane and framebuffer Tvrtko Ursulin
  2014-11-06 14:39 ` [RFC] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
  5 siblings, 2 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-10-30 16:39 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

It will be used by other call sites shortly.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         | 38 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_userptr.c | 43 ++-------------------------------
 drivers/gpu/drm/i915/intel_drv.h        |  4 +++
 3 files changed, 44 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 5b157bb..0b34571 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2070,3 +2070,41 @@ int i915_driver_device_is_agp(struct drm_device *dev)
 {
 	return 1;
 }
+
+#if IS_ENABLED(CONFIG_SWIOTLB)
+#define swiotlb_active() swiotlb_nr_tbl()
+#else
+#define swiotlb_active() 0
+#endif
+
+int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
+{
+	struct scatterlist *sg;
+	int ret, n;
+
+	*st = kmalloc(sizeof(**st), GFP_KERNEL);
+	if (*st == NULL)
+		return -ENOMEM;
+
+	if (swiotlb_active()) {
+		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
+		if (ret)
+			goto err;
+
+		for_each_sg((*st)->sgl, sg, num_pages, n)
+			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
+	} else {
+		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
+						0, num_pages << PAGE_SHIFT,
+						GFP_KERNEL);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	kfree(*st);
+	*st = NULL;
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index d384139..d3d55b3 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -479,45 +479,6 @@ struct get_pages_work {
 	struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-	struct scatterlist *sg;
-	int ret, n;
-
-	*st = kmalloc(sizeof(**st), GFP_KERNEL);
-	if (*st == NULL)
-		return -ENOMEM;
-
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
-
-	return 0;
-
-err:
-	kfree(*st);
-	*st = NULL;
-	return ret;
-}
-
 static void
 __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 {
@@ -557,7 +518,7 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 	if (obj->userptr.work != &work->work) {
 		ret = 0;
 	} else if (pinned == num_pages) {
-		ret = st_set_pages(&obj->pages, pvec, num_pages);
+		ret = i915_st_set_pages(&obj->pages, pvec, num_pages);
 		if (ret == 0) {
 			list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
 			pinned = 0;
@@ -666,7 +627,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 			}
 		}
 	} else {
-		ret = st_set_pages(&obj->pages, pvec, num_pages);
+		ret = i915_st_set_pages(&obj->pages, pvec, num_pages);
 		if (ret == 0) {
 			obj->userptr.work = NULL;
 			pinned = 0;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 787d5af..7e208d6d 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -998,6 +998,10 @@ void vlv_power_sequencer_reset(struct drm_i915_private *dev_priv);
 /* intel_dp_mst.c */
 int intel_dp_mst_encoder_init(struct intel_digital_port *intel_dig_port, int conn_id);
 void intel_dp_mst_encoder_cleanup(struct intel_digital_port *intel_dig_port);
+
+/* intel_dma.c */
+int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages);
+
 /* intel_dsi.c */
 void intel_dsi_init(struct drm_device *dev);
 
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC 5/5] drm/i915: Make intel_pin_and_fence_fb_obj take plane and framebuffer
  2014-10-30 16:39 [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2014-10-30 16:39 ` [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver Tvrtko Ursulin
@ 2014-10-30 16:39 ` Tvrtko Ursulin
  2014-11-03 16:18   ` Daniel Vetter
  2014-11-06 14:39 ` [RFC] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
  5 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-10-30 16:39 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

It will help future code if this function knows something about of the context
of the display setup object is being pinned for.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_display.c | 25 +++++++++++++------------
 drivers/gpu/drm/i915/intel_drv.h     |  4 ++--
 drivers/gpu/drm/i915/intel_fbdev.c   | 20 ++++++++++----------
 drivers/gpu/drm/i915/intel_sprite.c  |  2 +-
 4 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a4737c3..a24f84c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2218,11 +2218,13 @@ intel_fb_align_height(struct drm_device *dev, int bits_per_pixel, int height,
 }
 
 int
-intel_pin_and_fence_fb_obj(struct drm_device *dev,
-			   struct drm_i915_gem_object *obj,
+intel_pin_and_fence_fb_obj(struct drm_plane *plane,
+			   struct drm_framebuffer *fb,
 			   struct intel_engine_cs *pipelined)
 {
+	struct drm_device *dev = fb->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
 	u32 alignment;
 	int ret;
 
@@ -2952,7 +2954,6 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y,
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	enum pipe pipe = intel_crtc->pipe;
 	struct drm_framebuffer *old_fb = crtc->primary->fb;
-	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
 	struct drm_i915_gem_object *old_obj = intel_fb_obj(old_fb);
 	int ret;
 
@@ -2975,9 +2976,9 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y,
 	}
 
 	mutex_lock(&dev->struct_mutex);
-	ret = intel_pin_and_fence_fb_obj(dev, obj, NULL);
+	ret = intel_pin_and_fence_fb_obj(crtc->primary, fb, NULL);
 	if (ret == 0)
-		i915_gem_track_fb(old_obj, obj,
+		i915_gem_track_fb(old_obj, intel_fb_obj(fb),
 				  INTEL_FRONTBUFFER_PRIMARY(pipe));
 	mutex_unlock(&dev->struct_mutex);
 	if (ret != 0) {
@@ -10220,7 +10221,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 		ring = &dev_priv->ring[RCS];
 	}
 
-	ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
+	ret = intel_pin_and_fence_fb_obj(crtc->primary, fb, ring);
 	if (ret)
 		goto cleanup_pending;
 
@@ -11427,9 +11428,7 @@ static int __intel_set_mode(struct drm_crtc *crtc,
 		struct drm_i915_gem_object *obj = intel_fb_obj(fb);
 
 		mutex_lock(&dev->struct_mutex);
-		ret = intel_pin_and_fence_fb_obj(dev,
-						 obj,
-						 NULL);
+		ret = intel_pin_and_fence_fb_obj(crtc->primary, fb, NULL);
 		if (ret != 0) {
 			DRM_ERROR("pin & fence failed\n");
 			mutex_unlock(&dev->struct_mutex);
@@ -12109,7 +12108,7 @@ intel_commit_primary_plane(struct drm_plane *plane,
 		 * fail.
 		 */
 		if (plane->fb != fb) {
-			ret = intel_pin_and_fence_fb_obj(dev, obj, NULL);
+			ret = intel_pin_and_fence_fb_obj(plane, fb, NULL);
 			if (ret) {
 				mutex_unlock(&dev->struct_mutex);
 				return ret;
@@ -13782,7 +13781,9 @@ void intel_modeset_gem_init(struct drm_device *dev)
 		if (obj == NULL)
 			continue;
 
-		if (intel_pin_and_fence_fb_obj(dev, obj, NULL)) {
+		if (intel_pin_and_fence_fb_obj(c->primary,
+					       c->primary->fb,
+					       NULL)) {
 			DRM_ERROR("failed to pin boot fb on pipe %d\n",
 				  to_intel_crtc(c)->pipe);
 			drm_framebuffer_unreference(c->primary->fb);
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 7e208d6d..563d67e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -891,8 +891,8 @@ bool intel_get_load_detect_pipe(struct drm_connector *connector,
 				struct drm_modeset_acquire_ctx *ctx);
 void intel_release_load_detect_pipe(struct drm_connector *connector,
 				    struct intel_load_detect_pipe *old);
-int intel_pin_and_fence_fb_obj(struct drm_device *dev,
-			       struct drm_i915_gem_object *obj,
+int intel_pin_and_fence_fb_obj(struct drm_plane *plane,
+			       struct drm_framebuffer *fb,
 			       struct intel_engine_cs *pipelined);
 void intel_unpin_fb_obj(struct drm_i915_gem_object *obj);
 struct drm_framebuffer *
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 007c606..a2a7e48 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -119,25 +119,25 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 		goto out;
 	}
 
-	/* Flush everything out, we'll be doing GTT only from now on */
-	ret = intel_pin_and_fence_fb_obj(dev, obj, NULL);
-	if (ret) {
-		DRM_ERROR("failed to pin obj: %d\n", ret);
-		goto out_unref;
-	}
-
 	fb = __intel_framebuffer_create(dev, &mode_cmd, obj);
 	if (IS_ERR(fb)) {
 		ret = PTR_ERR(fb);
-		goto out_unpin;
+		goto out_unref;
+	}
+
+	/* Flush everything out, we'll be doing GTT only from now on */
+	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL);
+	if (ret) {
+		DRM_ERROR("failed to pin obj: %d\n", ret);
+		goto out_fb;
 	}
 
 	ifbdev->fb = to_intel_framebuffer(fb);
 
 	return 0;
 
-out_unpin:
-	i915_gem_object_ggtt_unpin(obj);
+out_fb:
+	drm_framebuffer_remove(fb);
 out_unref:
 	drm_gem_object_unreference(&obj->base);
 out:
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 573be95..6f7bfaf 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -1239,7 +1239,7 @@ intel_commit_sprite_plane(struct drm_plane *plane,
 		 * the sprite planes only require 128KiB alignment and 32 PTE
 		 * padding.
 		 */
-		ret = intel_pin_and_fence_fb_obj(dev, obj, NULL);
+		ret = intel_pin_and_fence_fb_obj(plane, fb, NULL);
 		if (ret == 0)
 			i915_gem_track_fb(old_obj, obj,
 					  INTEL_FRONTBUFFER_SPRITE(pipe));
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-10-30 16:39 ` [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
@ 2014-10-30 16:41   ` Chris Wilson
  2014-10-30 16:55     ` Tvrtko Ursulin
  2014-11-03 15:58   ` Daniel Vetter
  1 sibling, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2014-10-30 16:41 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Thu, Oct 30, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Things like reliable GGTT mappings and mirrored 3d display will need to be
> to map the same object twice into the GGTT.

What's a reliable GGTT mapping and how does it differ from today's
notoriously unreliable mappings?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-10-30 16:41   ` Chris Wilson
@ 2014-10-30 16:55     ` Tvrtko Ursulin
  0 siblings, 0 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-10-30 16:55 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx, Daniel Vetter


On 10/30/2014 04:41 PM, Chris Wilson wrote:
> On Thu, Oct 30, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Things like reliable GGTT mappings and mirrored 3d display will need to be
>> to map the same object twice into the GGTT.
>
> What's a reliable GGTT mapping and how does it differ from today's
> notoriously unreliable mappings?

Ask Daniel. :P But you are probably aiming at the short commit message, 
which I justify with the RFC tag. :)

As far as I understand it idea is to allow mapping a subset of an object 
to work around the 256Mb limit of the mappable region. How it was 
described sounded a bit like VM inside a VM but I am more interested in 
the infrastructure than this particular justification.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-10-30 16:39 ` [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
  2014-10-30 16:41   ` Chris Wilson
@ 2014-11-03 15:58   ` Daniel Vetter
  2014-11-03 16:34     ` Tvrtko Ursulin
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 15:58 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Thu, Oct 30, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Things like reliable GGTT mappings and mirrored 3d display will need to be
> to map the same object twice into the GGTT.
> 
> Add a ggtt_view field per VMA and select the page view based on the type
> of the view.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

lgtm overall, some comments for polish in the crucial parts below.

> @@ -2189,3 +2191,21 @@ i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
>  
>  	return vma;
>  }
> +
> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> +			u32 flags)
> +{
> +	struct sg_table *pages;
> +
> +	if (vma->ggtt_view.get_pages)
> +		pages = vma->ggtt_view.get_pages(&vma->ggtt_view, vma->obj);
> +	else
> +		pages = vma->obj->pages;
> +
> +	if (pages && !IS_ERR(pages)) {
> +		vma->bind_vma(vma, pages, cache_level, flags);
> +
> +		if (vma->ggtt_view.put_pages)
> +			vma->ggtt_view.put_pages(&vma->ggtt_view);
> +	}

tbh this looks a bit like overboarding generalism to me ;-) Imo
- drop the ->put_pages callback and just free the sg table if you have
  one.
- drop teh ->get_pages callbacks and replace it with a new
  i915_vma_shuffle_pages or similar you'll call for non-standard ggtt
  views. Two reasons: shuffle is a more accurate name than get since this
  version doesn't grab it's own page references any more, and with
  currently just one internal user for this a vtable is serious overkill.

> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 66bc44b..cbaddda 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -109,6 +109,23 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>  #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
>  #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
>  
> +enum i915_ggtt_view_type {
> +	I915_GGTT_VIEW_NORMAL = 0,
> +};
> +
> +struct i915_ggtt_view {
> +	enum i915_ggtt_view_type type;
> +	unsigned int vma_id;

Imo vma_id and ggtt_view_type are fairly redundant. If you _really_ want
to make this super-generic (i.e. for ppgtt) just call it gtt_view instead
fo ggtt_view. But I wouldn't bother.

Removing vma_id also removes the int, which I'd ask you to replace with an
explicit enum anyway ;-)

> +	struct sg_table *pages;

This seems unused - leftover from a previous patch which kept the sgtable
around?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver
  2014-10-30 16:39 ` [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver Tvrtko Ursulin
@ 2014-11-03 16:02   ` Daniel Vetter
  2014-11-03 16:18     ` Tvrtko Ursulin
  2014-11-03 16:05   ` Daniel Vetter
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 16:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Thu, Oct 30, 2014 at 04:39:37PM +0000, Tvrtko Ursulin wrote:

Just a sideline comment here for now ;-)

> +/* intel_dma.c */
> +int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages);

Comments all over should tell you that i915_dma.c is super legacy
territory and really you should never touch it. I think i915_gem.c is a
better place.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver
  2014-10-30 16:39 ` [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver Tvrtko Ursulin
  2014-11-03 16:02   ` Daniel Vetter
@ 2014-11-03 16:05   ` Daniel Vetter
  2014-11-03 20:30     ` Chris Wilson
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 16:05 UTC (permalink / raw)
  To: Tvrtko Ursulin, Chris Wilson; +Cc: Intel-gfx

On Thu, Oct 30, 2014 at 04:39:37PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> It will be used by other call sites shortly.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c         | 38 +++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 43 ++-------------------------------
>  drivers/gpu/drm/i915/intel_drv.h        |  4 +++
>  3 files changed, 44 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 5b157bb..0b34571 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -2070,3 +2070,41 @@ int i915_driver_device_is_agp(struct drm_device *dev)
>  {
>  	return 1;
>  }
> +
> +#if IS_ENABLED(CONFIG_SWIOTLB)
> +#define swiotlb_active() swiotlb_nr_tbl()
> +#else
> +#define swiotlb_active() 0
> +#endif
> +
> +int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
> +{
> +	struct scatterlist *sg;
> +	int ret, n;
> +
> +	*st = kmalloc(sizeof(**st), GFP_KERNEL);
> +	if (*st == NULL)
> +		return -ENOMEM;
> +
> +	if (swiotlb_active()) {
> +		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> +		if (ret)
> +			goto err;
> +
> +		for_each_sg((*st)->sgl, sg, num_pages, n)
> +			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> +	} else {
> +		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
> +						0, num_pages << PAGE_SHIFT,
> +						GFP_KERNEL);
> +		if (ret)
> +			goto err;
> +	}

Ok, I don't really understand why we don't just always use
sg_alloc_table_from_pages undconditionally - if swiotlb _ever_ gets in
between us and the hw, everything will pretty much fall apart.

git blame doesn't shed light on this particular issue. Chris?
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 5/5] drm/i915: Make intel_pin_and_fence_fb_obj take plane and framebuffer
  2014-10-30 16:39 ` [RFC 5/5] drm/i915: Make intel_pin_and_fence_fb_obj take plane and framebuffer Tvrtko Ursulin
@ 2014-11-03 16:18   ` Daniel Vetter
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 16:18 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Thu, Oct 30, 2014 at 04:39:38PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> It will help future code if this function knows something about of the context
> of the display setup object is being pinned for.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Rebased and applied to dinq, thanks.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver
  2014-11-03 16:02   ` Daniel Vetter
@ 2014-11-03 16:18     ` Tvrtko Ursulin
  2014-11-03 16:53       ` Daniel Vetter
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-03 16:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 11/03/2014 04:02 PM, Daniel Vetter wrote:
> On Thu, Oct 30, 2014 at 04:39:37PM +0000, Tvrtko Ursulin wrote:
>
> Just a sideline comment here for now ;-)
>
>> +/* intel_dma.c */
>> +int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages);
>
> Comments all over should tell you that i915_dma.c is super legacy
> territory and really you should never touch it. I think i915_gem.c is a
> better place.

Didn't spot any comments to this effect, in fact it looks like a 
mish-mash of stuff which is exactly why it looked appropriate for this. 
:) That said I don't mind moving it.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-03 15:58   ` Daniel Vetter
@ 2014-11-03 16:34     ` Tvrtko Ursulin
  2014-11-03 16:52       ` Daniel Vetter
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-03 16:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, Intel-gfx


On 11/03/2014 03:58 PM, Daniel Vetter wrote:
> On Thu, Oct 30, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Things like reliable GGTT mappings and mirrored 3d display will need to be
>> to map the same object twice into the GGTT.
>>
>> Add a ggtt_view field per VMA and select the page view based on the type
>> of the view.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> lgtm overall, some comments for polish in the crucial parts below.
>
>> @@ -2189,3 +2191,21 @@ i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
>>
>>   	return vma;
>>   }
>> +
>> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>> +			u32 flags)
>> +{
>> +	struct sg_table *pages;
>> +
>> +	if (vma->ggtt_view.get_pages)
>> +		pages = vma->ggtt_view.get_pages(&vma->ggtt_view, vma->obj);
>> +	else
>> +		pages = vma->obj->pages;
>> +
>> +	if (pages && !IS_ERR(pages)) {
>> +		vma->bind_vma(vma, pages, cache_level, flags);
>> +
>> +		if (vma->ggtt_view.put_pages)
>> +			vma->ggtt_view.put_pages(&vma->ggtt_view);
>> +	}
>
> tbh this looks a bit like overboarding generalism to me ;-) Imo
> - drop the ->put_pages callback and just free the sg table if you have
>    one.
> - drop teh ->get_pages callbacks and replace it with a new
>    i915_vma_shuffle_pages or similar you'll call for non-standard ggtt
>    views. Two reasons: shuffle is a more accurate name than get since this
>    version doesn't grab it's own page references any more, and with
>    currently just one internal user for this a vtable is serious overkill.

I actually thought I will need semi-persistent view for this in the 
future patch which get_pages()/put_pages() caters for.

More precisely it would be for holding onto created pages during view 
creation across the i915_gem_gtt_prepare_object and i915_vma_bind in 
i915_gem_object_bind_to_vm.

Also, ->put_pages looks much neater to me from i915_gem_vma_destroy 
rather than leaking the knowledge of internal implementation. That is of 
course if you will allow the above justification for making the 
alternative view at least semi-persistent.

As additional bonus it has the same semantics to GEM get/put_pages.

>> +}
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> index 66bc44b..cbaddda 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> @@ -109,6 +109,23 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>>   #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
>>   #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
>>
>> +enum i915_ggtt_view_type {
>> +	I915_GGTT_VIEW_NORMAL = 0,
>> +};
>> +
>> +struct i915_ggtt_view {
>> +	enum i915_ggtt_view_type type;
>> +	unsigned int vma_id;
>
> Imo vma_id and ggtt_view_type are fairly redundant. If you _really_ want
> to make this super-generic (i.e. for ppgtt) just call it gtt_view instead
> fo ggtt_view. But I wouldn't bother.

So you suggest to imply VMA id to be equal to GGTT view type and 
VMA_ID_DEFAULT to be fixed to I915_GGTT_VIEW_NORMAL? Is it not a 
logistics/maintenance problem to sync the two then?

> Removing vma_id also removes the int, which I'd ask you to replace with an
> explicit enum anyway ;-)

Yes I left that for later. I mean, when at stage to be talking about 
such detail it is a happy place.

>> +	struct sg_table *pages;
>
> This seems unused - leftover from a previous patch which kept the sgtable
> around?

Yes, survived refactoring somehow, shouldn't be here.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-03 16:34     ` Tvrtko Ursulin
@ 2014-11-03 16:52       ` Daniel Vetter
  2014-11-03 17:20         ` Tvrtko Ursulin
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 16:52 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Mon, Nov 03, 2014 at 04:34:29PM +0000, Tvrtko Ursulin wrote:
> 
> On 11/03/2014 03:58 PM, Daniel Vetter wrote:
> >On Thu, Oct 30, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>Things like reliable GGTT mappings and mirrored 3d display will need to be
> >>to map the same object twice into the GGTT.
> >>
> >>Add a ggtt_view field per VMA and select the page view based on the type
> >>of the view.
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> >lgtm overall, some comments for polish in the crucial parts below.
> >
> >>@@ -2189,3 +2191,21 @@ i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
> >>
> >>  	return vma;
> >>  }
> >>+
> >>+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> >>+			u32 flags)
> >>+{
> >>+	struct sg_table *pages;
> >>+
> >>+	if (vma->ggtt_view.get_pages)
> >>+		pages = vma->ggtt_view.get_pages(&vma->ggtt_view, vma->obj);
> >>+	else
> >>+		pages = vma->obj->pages;
> >>+
> >>+	if (pages && !IS_ERR(pages)) {
> >>+		vma->bind_vma(vma, pages, cache_level, flags);
> >>+
> >>+		if (vma->ggtt_view.put_pages)
> >>+			vma->ggtt_view.put_pages(&vma->ggtt_view);
> >>+	}
> >
> >tbh this looks a bit like overboarding generalism to me ;-) Imo
> >- drop the ->put_pages callback and just free the sg table if you have
> >   one.
> >- drop teh ->get_pages callbacks and replace it with a new
> >   i915_vma_shuffle_pages or similar you'll call for non-standard ggtt
> >   views. Two reasons: shuffle is a more accurate name than get since this
> >   version doesn't grab it's own page references any more, and with
> >   currently just one internal user for this a vtable is serious overkill.
> 
> I actually thought I will need semi-persistent view for this in the future
> patch which get_pages()/put_pages() caters for.
> 
> More precisely it would be for holding onto created pages during view
> creation across the i915_gem_gtt_prepare_object and i915_vma_bind in
> i915_gem_object_bind_to_vm.
> 
> Also, ->put_pages looks much neater to me from i915_gem_vma_destroy rather
> than leaking the knowledge of internal implementation. That is of course if
> you will allow the above justification for making the alternative view at
> least semi-persistent.

I don't see why the view needs to be semi-persistent. And it certainly
doesn't work by attaching the sg table to the vma, since then the pages
can't survive the vma. And the vma is already cached (i.e. we only ever
throw them away lazily).

And I don't hink caching it longer than the vma is useful since when we
throw away the vma that's usually a much bigger performance disaster,
often involving stalls and chasing down backing storage again. Allocating
a puny sg table and filling it again isn't a lot of work. Especially since
this is only for uncommon cases like scanout buffers or ggtt mmaps thus
far.

> As additional bonus it has the same semantics to GEM get/put_pages.

Well, this isn't the same thing. The _really_ crucial part of
get/put_pages is to grab a reference off the backing storage pages, to
make sure they don't get swapped out.

The comes prepare/finish_gtt, whose job it is to ensure that those pages
are mapped into the VT-d iommu.

Finally there's the job of shuffling the resulting dma_addr_t pages from
prepare_gtt just for the insert_entries/vma_bind functions. You're about
two layers away from get/put_pages and this shuffled sg table here is
really only needed to make insert_entries happy. And not even for the
duration of the entire gtt.
> 
> >>+}
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>index 66bc44b..cbaddda 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>@@ -109,6 +109,23 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
> >>  #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
> >>  #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
> >>
> >>+enum i915_ggtt_view_type {
> >>+	I915_GGTT_VIEW_NORMAL = 0,
> >>+};
> >>+
> >>+struct i915_ggtt_view {
> >>+	enum i915_ggtt_view_type type;
> >>+	unsigned int vma_id;
> >
> >Imo vma_id and ggtt_view_type are fairly redundant. If you _really_ want
> >to make this super-generic (i.e. for ppgtt) just call it gtt_view instead
> >fo ggtt_view. But I wouldn't bother.
> 
> So you suggest to imply VMA id to be equal to GGTT view type and
> VMA_ID_DEFAULT to be fixed to I915_GGTT_VIEW_NORMAL? Is it not a
> logistics/maintenance problem to sync the two then?

Well if we kill one of them completely there's no syncing required, which
is what I think we want. Having two will be a nightmare, and I don't see
any use a few years out even for non-ggtt special views.

We can always add more complexity later on if needed.

> >Removing vma_id also removes the int, which I'd ask you to replace with an
> >explicit enum anyway ;-)
> 
> Yes I left that for later. I mean, when at stage to be talking about such
> detail it is a happy place.

I'm not a fan of preemptive generalization - it tends to be the wrong one
and hurt doubly in the future since you have to remove the wrong one first
before you can add the right stuff.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver
  2014-11-03 16:18     ` Tvrtko Ursulin
@ 2014-11-03 16:53       ` Daniel Vetter
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 16:53 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Mon, Nov 03, 2014 at 04:18:30PM +0000, Tvrtko Ursulin wrote:
> 
> On 11/03/2014 04:02 PM, Daniel Vetter wrote:
> >On Thu, Oct 30, 2014 at 04:39:37PM +0000, Tvrtko Ursulin wrote:
> >
> >Just a sideline comment here for now ;-)
> >
> >>+/* intel_dma.c */
> >>+int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages);
> >
> >Comments all over should tell you that i915_dma.c is super legacy
> >territory and really you should never touch it. I think i915_gem.c is a
> >better place.
> 
> Didn't spot any comments to this effect, in fact it looks like a mish-mash
> of stuff which is exactly why it looked appropriate for this. :) That said I
> don't mind moving it.

There are 1-2 things in there still which needed saving, but overall this
file is doomed ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-03 16:52       ` Daniel Vetter
@ 2014-11-03 17:20         ` Tvrtko Ursulin
  2014-11-03 17:29           ` Daniel Vetter
  2014-11-03 17:35           ` Daniel Vetter
  0 siblings, 2 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-03 17:20 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, Intel-gfx


On 11/03/2014 04:52 PM, Daniel Vetter wrote:
> On Mon, Nov 03, 2014 at 04:34:29PM +0000, Tvrtko Ursulin wrote:
>>
>> On 11/03/2014 03:58 PM, Daniel Vetter wrote:
>>> On Thu, Oct 30, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Things like reliable GGTT mappings and mirrored 3d display will need to be
>>>> to map the same object twice into the GGTT.
>>>>
>>>> Add a ggtt_view field per VMA and select the page view based on the type
>>>> of the view.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>
>>> lgtm overall, some comments for polish in the crucial parts below.
>>>
>>>> @@ -2189,3 +2191,21 @@ i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
>>>>
>>>>   	return vma;
>>>>   }
>>>> +
>>>> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>>>> +			u32 flags)
>>>> +{
>>>> +	struct sg_table *pages;
>>>> +
>>>> +	if (vma->ggtt_view.get_pages)
>>>> +		pages = vma->ggtt_view.get_pages(&vma->ggtt_view, vma->obj);
>>>> +	else
>>>> +		pages = vma->obj->pages;
>>>> +
>>>> +	if (pages && !IS_ERR(pages)) {
>>>> +		vma->bind_vma(vma, pages, cache_level, flags);
>>>> +
>>>> +		if (vma->ggtt_view.put_pages)
>>>> +			vma->ggtt_view.put_pages(&vma->ggtt_view);
>>>> +	}
>>>
>>> tbh this looks a bit like overboarding generalism to me ;-) Imo
>>> - drop the ->put_pages callback and just free the sg table if you have
>>>    one.
>>> - drop teh ->get_pages callbacks and replace it with a new
>>>    i915_vma_shuffle_pages or similar you'll call for non-standard ggtt
>>>    views. Two reasons: shuffle is a more accurate name than get since this
>>>    version doesn't grab it's own page references any more, and with
>>>    currently just one internal user for this a vtable is serious overkill.
>>
>> I actually thought I will need semi-persistent view for this in the future
>> patch which get_pages()/put_pages() caters for.
>>
>> More precisely it would be for holding onto created pages during view
>> creation across the i915_gem_gtt_prepare_object and i915_vma_bind in
>> i915_gem_object_bind_to_vm.
>>
>> Also, ->put_pages looks much neater to me from i915_gem_vma_destroy rather
>> than leaking the knowledge of internal implementation. That is of course if
>> you will allow the above justification for making the alternative view at
>> least semi-persistent.
>
> I don't see why the view needs to be semi-persistent. And it certainly
> doesn't work by attaching the sg table to the vma, since then the pages
> can't survive the vma. And the vma is already cached (i.e. we only ever
> throw them away lazily).
 >
> And I don't hink caching it longer than the vma is useful since when we
> throw away the vma that's usually a much bigger performance disaster,
> often involving stalls and chasing down backing storage again. Allocating
> a puny sg table and filling it again isn't a lot of work. Especially since
> this is only for uncommon cases like scanout buffers or ggtt mmaps thus
> far.

Ah alright, think I see what you mean. Hm.. to explain once more why I 
put that in...

At the moment it is just the means of allow the alternative GGTT view to 
exists (as in sg_table) for the duration of i915_gem_object_bind_to_vm.

Because it does a two stage process there; the gtt_prepare_object and 
insert entries.

I did not like your idea, well I did not think it is feasible - as in 
easily doable, of stealing the DMA addresses since the SG tables between 
view don't have a 1:1 relationship in number of chunk/pages.

So maybe I am missing something, but to me it looked like a handy way of 
achieving that goal.

You don't like the fact that my put_pages is done only in 
i915_gem_vma_destroy in this patch, which only happens lazily, correct?

If the new get/put_pages calls on the view were done explicitly in 
i915_gem_object_bind_to_vm would that be any better? That would show 
explicitly that the SG table is a throw away.

>> So you suggest to imply VMA id to be equal to GGTT view type and
>> VMA_ID_DEFAULT to be fixed to I915_GGTT_VIEW_NORMAL? Is it not a
>> logistics/maintenance problem to sync the two then?
>
> Well if we kill one of them completely there's no syncing required, which
> is what I think we want. Having two will be a nightmare, and I don't see
> any use a few years out even for non-ggtt special views.
>
> We can always add more complexity later on if needed.

Ok I don't see it at the moment since they are two different concepts to 
me but I'll think about it.

>>> Removing vma_id also removes the int, which I'd ask you to replace with an
>>> explicit enum anyway ;-)
>>
>> Yes I left that for later. I mean, when at stage to be talking about such
>> detail it is a happy place.
>
> I'm not a fan of preemptive generalization - it tends to be the wrong one
> and hurt doubly in the future since you have to remove the wrong one first
> before you can add the right stuff.

In general :), as always it is the question of getting the balance 
right. Because the opposite can also happen and maybe even has in some 
parts of the driver.

Anyway, that's why I didn't bother with polishing the enums etc. :)

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-03 17:20         ` Tvrtko Ursulin
@ 2014-11-03 17:29           ` Daniel Vetter
  2014-11-04 11:15             ` Tvrtko Ursulin
  2014-11-03 17:35           ` Daniel Vetter
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 17:29 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 3, 2014 at 6:20 PM, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
> I did not like your idea, well I did not think it is feasible - as in easily
> doable, of stealing the DMA addresses since the SG tables between view don't
> have a 1:1 relationship in number of chunk/pages.

Ok, so sg table coalescing is getting in the way. On the source sg
table stored in obj->pages we can fix this by using one of the
per-page sg table walkers - they'll take care of all the annoying
details. See how we walk the sg tables in the pte insert fucntions for
examples.

On the new target sg table I'd simply not bother with merging and
allocate a full sg table with sg_alloc_table. Due to the rotation
there won't be a lot of contiguous stuff anyway.

That leaves the ugly problem of resorting the table without going
nuts. Either allocate a temporary array (allocated with drm_malloc_ab
since kmalloc won't be enough for this) of dma_addr_t and use your
approach. Or just walk the sg_tables row-vise on the rotate layout and
only fill in the relevant holes for an O(rows*num_pages) complexity. I
don't expect anyone to actually notice this little inefficient tbh ;-)

In short, I don't see a problem ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-03 17:20         ` Tvrtko Ursulin
  2014-11-03 17:29           ` Daniel Vetter
@ 2014-11-03 17:35           ` Daniel Vetter
  2014-11-04 10:25             ` Tvrtko Ursulin
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-03 17:35 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 3, 2014 at 6:20 PM, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>> I'm not a fan of preemptive generalization - it tends to be the wrong one
>> and hurt doubly in the future since you have to remove the wrong one first
>> before you can add the right stuff.
>
>
> In general :), as always it is the question of getting the balance right.
> Because the opposite can also happen and maybe even has in some parts of the
> driver.

Well there are ugly corners, but generally just areas where the proper
abstraction wasn't yet clear. Or areas where we've never gotten around
to implement what we discussed we wanted. The few cases where we've
had overgeneralization tended to hurt us badly and actually the
reasons behind some of the cruft in GEM.

> Anyway, that's why I didn't bother with polishing the enums etc. :)

Well I think if we do add some new abstraction it should be polished.
The point of abstraction is to hide details, and that's only possible
if the abstraction is clean&polished and doesn't leak in any
direction. Given that I've mastered in advanced abstract nonsene (i.e.
math) I might have a bit a screwed view here ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver
  2014-11-03 16:05   ` Daniel Vetter
@ 2014-11-03 20:30     ` Chris Wilson
  2014-11-04 11:56       ` Imre Deak
  0 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2014-11-03 20:30 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx

On Mon, Nov 03, 2014 at 05:05:55PM +0100, Daniel Vetter wrote:
> On Thu, Oct 30, 2014 at 04:39:37PM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > It will be used by other call sites shortly.
> > 
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c         | 38 +++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/i915_gem_userptr.c | 43 ++-------------------------------
> >  drivers/gpu/drm/i915/intel_drv.h        |  4 +++
> >  3 files changed, 44 insertions(+), 41 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 5b157bb..0b34571 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -2070,3 +2070,41 @@ int i915_driver_device_is_agp(struct drm_device *dev)
> >  {
> >  	return 1;
> >  }
> > +
> > +#if IS_ENABLED(CONFIG_SWIOTLB)
> > +#define swiotlb_active() swiotlb_nr_tbl()
> > +#else
> > +#define swiotlb_active() 0
> > +#endif
> > +
> > +int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
> > +{
> > +	struct scatterlist *sg;
> > +	int ret, n;
> > +
> > +	*st = kmalloc(sizeof(**st), GFP_KERNEL);
> > +	if (*st == NULL)
> > +		return -ENOMEM;
> > +
> > +	if (swiotlb_active()) {
> > +		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> > +		if (ret)
> > +			goto err;
> > +
> > +		for_each_sg((*st)->sgl, sg, num_pages, n)
> > +			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> > +	} else {
> > +		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
> > +						0, num_pages << PAGE_SHIFT,
> > +						GFP_KERNEL);
> > +		if (ret)
> > +			goto err;
> > +	}
> 
> Ok, I don't really understand why we don't just always use
> sg_alloc_table_from_pages undconditionally - if swiotlb _ever_ gets in
> between us and the hw, everything will pretty much fall apart.
> 
> git blame doesn't shed light on this particular issue. Chris?

This is Imre's work...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-03 17:35           ` Daniel Vetter
@ 2014-11-04 10:25             ` Tvrtko Ursulin
  0 siblings, 0 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-04 10:25 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


On 11/03/2014 05:35 PM, Daniel Vetter wrote:
>> Anyway, that's why I didn't bother with polishing the enums etc. :)
>
> Well I think if we do add some new abstraction it should be polished.
> The point of abstraction is to hide details, and that's only possible
> if the abstraction is clean&polished and doesn't leak in any
> direction. Given that I've mastered in advanced abstract nonsene (i.e.
> math) I might have a bit a screwed view here ;-)

Don't think we were on the same page there. I just said I didn't bother 
with polish (int vs enum) _while_ the whole thing is in the design RFC 
stage. Not that it doesn't need to be polished when it's done.

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-03 17:29           ` Daniel Vetter
@ 2014-11-04 11:15             ` Tvrtko Ursulin
  2014-11-04 11:23               ` Daniel Vetter
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-04 11:15 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 11/03/2014 05:29 PM, Daniel Vetter wrote:
> On Mon, Nov 3, 2014 at 6:20 PM, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> I did not like your idea, well I did not think it is feasible - as in easily
>> doable, of stealing the DMA addresses since the SG tables between view don't
>> have a 1:1 relationship in number of chunk/pages.
>
> Ok, so sg table coalescing is getting in the way. On the source sg
> table stored in obj->pages we can fix this by using one of the
> per-page sg table walkers - they'll take care of all the annoying
> details. See how we walk the sg tables in the pte insert fucntions for
> examples.

Obviously I am doing the same elsewhere...

> On the new target sg table I'd simply not bother with merging and
> allocate a full sg table with sg_alloc_table. Due to the rotation
> there won't be a lot of contiguous stuff anyway.

For things like mirrored 2d/3d still it could save space. And it kind of 
feels suboptimal to special case and add low level code when there is a 
nice helper functions which handles it all.

> That leaves the ugly problem of resorting the table without going
> nuts. Either allocate a temporary array (allocated with drm_malloc_ab
> since kmalloc won't be enough for this) of dma_addr_t and use your
> approach. Or just walk the sg_tables row-vise on the rotate layout and
> only fill in the relevant holes for an O(rows*num_pages) complexity. I
> don't expect anyone to actually notice this little inefficient tbh ;-)
>
> In short, I don't see a problem ;-)

I just find it a bit ugly and hackish to poke in sg internals only to 
avoid having get_pages/put_pages (Or call them differently if the name 
bothers you the most. get_page_view/put_page_view perhaps?), even if the 
lifetime of that temporary sg_table is limited to VMA creation (as it in 
fact was in my patch).

So I suppose while you see a vtable as a "serious overkill", I saw it as 
a way to avoid messing around in sg internals and reuse existing code. 
Or in other words:

if (ggtt_view->type == NORMAL)
   pages = get_normal();
else if(ggtt_view->type == X1)
   pages = get_x1();
else if(ggtt_view->type == X2)
    pages = get_x2();

etc.. rather than simply:

if (ggtt_view->get_page_view)
   pages = ggtt_view->get_page_view)
else
    pages = obj->pages;

And be done with it. But of course my chances or merging this are slim 
unless I satisfy your taste so please let me know if your stance here is 
still "unshaken".

Also on the question of getting rid of one of VMA id and ggtt_view type, 
how exactly did you envisage that? Simply to imply that the only users 
of multiple VMAs are the GGTT views?

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-04 11:15             ` Tvrtko Ursulin
@ 2014-11-04 11:23               ` Daniel Vetter
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Vetter @ 2014-11-04 11:23 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Nov 04, 2014 at 11:15:16AM +0000, Tvrtko Ursulin wrote:
> 
> On 11/03/2014 05:29 PM, Daniel Vetter wrote:
> >On Mon, Nov 3, 2014 at 6:20 PM, Tvrtko Ursulin
> ><tvrtko.ursulin@linux.intel.com> wrote:
> >>I did not like your idea, well I did not think it is feasible - as in easily
> >>doable, of stealing the DMA addresses since the SG tables between view don't
> >>have a 1:1 relationship in number of chunk/pages.
> >
> >Ok, so sg table coalescing is getting in the way. On the source sg
> >table stored in obj->pages we can fix this by using one of the
> >per-page sg table walkers - they'll take care of all the annoying
> >details. See how we walk the sg tables in the pte insert fucntions for
> >examples.
> 
> Obviously I am doing the same elsewhere...
> 
> >On the new target sg table I'd simply not bother with merging and
> >allocate a full sg table with sg_alloc_table. Due to the rotation
> >there won't be a lot of contiguous stuff anyway.
> 
> For things like mirrored 2d/3d still it could save space. And it kind of
> feels suboptimal to special case and add low level code when there is a nice
> helper functions which handles it all.

If space would be a concern we wouldn't use sg tables. They have a lot of
fluff. Also afaik there's not sg interface to create an sg table from a
list of dma_addr_t, which is what we want.

> >That leaves the ugly problem of resorting the table without going
> >nuts. Either allocate a temporary array (allocated with drm_malloc_ab
> >since kmalloc won't be enough for this) of dma_addr_t and use your
> >approach. Or just walk the sg_tables row-vise on the rotate layout and
> >only fill in the relevant holes for an O(rows*num_pages) complexity. I
> >don't expect anyone to actually notice this little inefficient tbh ;-)
> >
> >In short, I don't see a problem ;-)
> 
> I just find it a bit ugly and hackish to poke in sg internals only to avoid
> having get_pages/put_pages (Or call them differently if the name bothers you
> the most. get_page_view/put_page_view perhaps?), even if the lifetime of
> that temporary sg_table is limited to VMA creation (as it in fact was in my
> patch).

sg tables are just data structures for drivers to map dma memory ranges.
As a driver your supposed to noodle around in them. So maybe I don't
understand your concern?

Wrt naming, get/put imply reference counting, which this doesn't do.
create_page_view/free_page_view I think are better names. Not sure about
the page_view name though.

> So I suppose while you see a vtable as a "serious overkill", I saw it as a
> way to avoid messing around in sg internals and reuse existing code. Or in
> other words:
> 
> if (ggtt_view->type == NORMAL)
>   pages = get_normal();
> else if(ggtt_view->type == X1)
>   pages = get_x1();
> else if(ggtt_view->type == X2)
>    pages = get_x2();
> 
> etc.. rather than simply:
> 
> if (ggtt_view->get_page_view)
>   pages = ggtt_view->get_page_view)
> else
>    pages = obj->pages;
> 
> And be done with it. But of course my chances or merging this are slim
> unless I satisfy your taste so please let me know if your stance here is
> still "unshaken".

Well imo vfuncs are a pain, so we generally add them only when there's
more than 2 implementations. Atm we have one. So if the other two show up
then I'm ok, but before that it just makes chasing callchains more
annoying.

> Also on the question of getting rid of one of VMA id and ggtt_view type, how
> exactly did you envisage that? Simply to imply that the only users of
> multiple VMAs are the GGTT views?

Yeah. We don't have anything else, and I don't expect anything else
really.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver
  2014-11-03 20:30     ` Chris Wilson
@ 2014-11-04 11:56       ` Imre Deak
  0 siblings, 0 replies; 43+ messages in thread
From: Imre Deak @ 2014-11-04 11:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel-gfx

On Mon, 2014-11-03 at 20:30 +0000, Chris Wilson wrote:
> On Mon, Nov 03, 2014 at 05:05:55PM +0100, Daniel Vetter wrote:
> > On Thu, Oct 30, 2014 at 04:39:37PM +0000, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > 
> > > It will be used by other call sites shortly.
> > > 
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_dma.c         | 38 +++++++++++++++++++++++++++++
> > >  drivers/gpu/drm/i915/i915_gem_userptr.c | 43 ++-------------------------------
> > >  drivers/gpu/drm/i915/intel_drv.h        |  4 +++
> > >  3 files changed, 44 insertions(+), 41 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index 5b157bb..0b34571 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -2070,3 +2070,41 @@ int i915_driver_device_is_agp(struct drm_device *dev)
> > >  {
> > >  	return 1;
> > >  }
> > > +
> > > +#if IS_ENABLED(CONFIG_SWIOTLB)
> > > +#define swiotlb_active() swiotlb_nr_tbl()
> > > +#else
> > > +#define swiotlb_active() 0
> > > +#endif
> > > +
> > > +int i915_st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
> > > +{
> > > +	struct scatterlist *sg;
> > > +	int ret, n;
> > > +
> > > +	*st = kmalloc(sizeof(**st), GFP_KERNEL);
> > > +	if (*st == NULL)
> > > +		return -ENOMEM;
> > > +
> > > +	if (swiotlb_active()) {
> > > +		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> > > +		if (ret)
> > > +			goto err;
> > > +
> > > +		for_each_sg((*st)->sgl, sg, num_pages, n)
> > > +			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> > > +	} else {
> > > +		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
> > > +						0, num_pages << PAGE_SHIFT,
> > > +						GFP_KERNEL);
> > > +		if (ret)
> > > +			goto err;
> > > +	}
> > 
> > Ok, I don't really understand why we don't just always use
> > sg_alloc_table_from_pages undconditionally - if swiotlb _ever_ gets in
> > between us and the hw, everything will pretty much fall apart.
> > 
> > git blame doesn't shed light on this particular issue. Chris?
> 
> This is Imre's work...

Afaict, the distinction was made because of swiotlb's limitation, which
prevented us from using compact sg tables produced by
sg_alloc_table_from_pages(). The commit below explains it more:

commit 426729dcc713b3d1ae802e314030e5556a62da53
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Mon Jun 24 11:47:48 2013 -0400

    drm/i915: make compact dma scatter lists creation work with SWIOTLB
backend.
    
    Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4
    ("drm/i915: create compact dma scatter lists for gem objects") makes
    certain assumptions about the under laying DMA API that are not
always
    correct.
    
    On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
    I see:
    
    [drm:intel_pipe_set_base] *ERROR* pin & fence failed
    [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3],
err = -28
    
    Bit of debugging traced it down to dma_map_sg failing (in
    i915_gem_gtt_prepare_object) as some of the SG entries were huge
(3MB).
    
    That unfortunately are sizes that the SWIOTLB is incapable of
handling -
    the maximum it can handle is a an entry of 512KB of virtual
contiguous
    memory for its bounce buffer. (See IO_TLB_SEGSIZE).
    
    Previous to the above mention git commit the SG entries were of 4KB,
and
    the code introduced by above git commit squashed the CPU contiguous
PFNs
    in one big virtual address provided to DMA API.
    
    This patch is a simple semi-revert - were we emulate the old
behavior
    if we detect that SWIOTLB is online. If it is not online then we
continue
    on with the new compact scatter gather mechanism.
    
    An alternative solution would be for the the '.get_pages' and the
    i915_gem_gtt_prepare_object to retry with smaller max gap of the
    amount of PFNs that can be combined together - but with this issue
    discovered during rc7 that might be too risky.
    


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-10-30 16:39 [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2014-10-30 16:39 ` [RFC 5/5] drm/i915: Make intel_pin_and_fence_fb_obj take plane and framebuffer Tvrtko Ursulin
@ 2014-11-06 14:39 ` Tvrtko Ursulin
  2014-11-12 17:02   ` Daniel Vetter
  2014-11-27 14:52   ` [PATCH] " Tvrtko Ursulin
  5 siblings, 2 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-06 14:39 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Things like reliable GGTT mmaps and mirrored 2d-on-3d display will need
to map objects into the same address space multiple times.

This also means that objects now can have multiple VMA entries.

Added a GGTT view concept and linked it with the VMA to distinguish betwen
multiple instances per address space.

New objects and GEM functions which do not take this new view as a parameter
assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
previous behaviour.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  4 +-
 drivers/gpu/drm/i915/i915_drv.h            | 44 +++++++++++++--
 drivers/gpu/drm/i915/i915_gem.c            | 88 ++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 71 ++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_gtt.h        | 29 +++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  8 +--
 8 files changed, 179 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0a69813..3f23c51 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -154,8 +154,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 			seq_puts(m, " (pp");
 		else
 			seq_puts(m, " (g");
-		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
-			   vma->node.start, vma->node.size);
+		seq_printf(m, "gtt offset: %08lx, size: %08lx, type: %u)",
+			   vma->node.start, vma->node.size, vma->ggtt_view.type);
 	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 54691bc..50109fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2420,10 +2420,23 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
 #define PIN_GLOBAL 0x4
 #define PIN_OFFSET_BIAS 0x8
 #define PIN_OFFSET_MASK (~4095)
+int __must_check i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
+					  struct i915_address_space *vm,
+					  uint32_t alignment,
+					  uint64_t flags,
+					  const struct i915_ggtt_view *view);
+static inline
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm,
 				     uint32_t alignment,
-				     uint64_t flags);
+				     uint64_t flags)
+{
+	return i915_gem_object_pin_view(obj, vm, alignment, flags,
+						&i915_ggtt_view_normal);
+}
+
+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
+			u32 flags);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv);
@@ -2570,18 +2583,41 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 
 void i915_gem_restore_fences(struct drm_device *dev);
 
+unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
+				       struct i915_address_space *vm,
+				       enum i915_ggtt_view_type view);
+static inline
 unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
-				  struct i915_address_space *vm);
+				  struct i915_address_space *vm)
+{
+	return i915_gem_obj_offset_view(o, vm, I915_GGTT_VIEW_NORMAL);
+}
 bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
 bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
 			struct i915_address_space *vm);
 unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 				struct i915_address_space *vm);
+struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
+				          struct i915_address_space *vm,
+					  const struct i915_ggtt_view *view);
+static inline
 struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
-				     struct i915_address_space *vm);
+				     struct i915_address_space *vm)
+{
+	return i915_gem_obj_to_vma_view(obj, vm, &i915_ggtt_view_normal);
+}
+struct i915_vma *
+i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
+				       struct i915_address_space *vm,
+				       const struct i915_ggtt_view *view);
+static inline
 struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
-				  struct i915_address_space *vm);
+				  struct i915_address_space *vm)
+{
+	return i915_gem_obj_lookup_or_create_vma_view(obj, vm,
+						&i915_ggtt_view_normal);
+}
 
 struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj);
 static inline bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1de94cc..b46a0f7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2007,8 +2007,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 			/* For the unbound phase, this should be a no-op! */
 			list_for_each_entry_safe(vma, v,
 						 &obj->vma_list, vma_link)
-				if (i915_vma_unbind(vma))
-					break;
+				i915_vma_unbind(vma);
 
 			if (i915_gem_object_put_pages(obj) == 0)
 				count += obj->base.size >> PAGE_SHIFT;
@@ -2209,17 +2208,14 @@ void i915_vma_move_to_active(struct i915_vma *vma,
 static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
-	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-	struct i915_address_space *vm;
 	struct i915_vma *vma;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		vma = i915_gem_obj_to_vma(obj, vm);
-		if (vma && !list_empty(&vma->mm_list))
-			list_move_tail(&vma->mm_list, &vm->inactive_list);
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (!list_empty(&vma->mm_list))
+			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
 	}
 
 	intel_fb_obj_flush(obj, true);
@@ -2930,6 +2926,21 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
 					    old_write_domain);
 }
 
+static unsigned int count_ggtt_mappings(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma;
+	unsigned int count = 0;
+
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (vma->vm != i915_obj_to_ggtt(obj))
+			continue;
+		if (vma->bound & GLOBAL_BIND)
+			count++;
+	}
+
+	return count;
+}
+
 int i915_vma_unbind(struct i915_vma *vma)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -2960,7 +2971,7 @@ int i915_vma_unbind(struct i915_vma *vma)
 	/* Throw away the active reference before moving to the unbound list */
 	i915_gem_object_retire(obj);
 
-	if (i915_is_ggtt(vma->vm)) {
+	if (i915_is_ggtt(vma->vm) && count_ggtt_mappings(obj) == 1) {
 		i915_gem_object_finish_gtt(obj);
 
 		/* release the fence reg _after_ flushing */
@@ -2983,7 +2994,7 @@ int i915_vma_unbind(struct i915_vma *vma)
 	/* Since the unbound list is global, only move to that list if
 	 * no more VMAs exist. */
 	if (list_empty(&obj->vma_list)) {
-		i915_gem_gtt_finish_object(obj);
+		i915_gem_gtt_finish_object(obj, &vma->ggtt_view);
 		list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
 	}
 
@@ -3393,7 +3404,8 @@ static struct i915_vma *
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
-			   uint64_t flags)
+			   uint64_t flags,
+			   const struct i915_ggtt_view *view)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3443,7 +3455,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
+	vma = i915_gem_obj_lookup_or_create_vma_view(obj, vm, view);
 	if (IS_ERR(vma))
 		goto err_unpin;
 
@@ -3469,15 +3481,16 @@ search_free:
 		goto err_remove_node;
 	}
 
-	ret = i915_gem_gtt_prepare_object(obj);
+	ret = i915_gem_gtt_prepare_object(obj, &vma->ggtt_view);
 	if (ret)
 		goto err_remove_node;
 
-	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
+	if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
+		list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&vma->mm_list, &vm->inactive_list);
 
 	trace_i915_vma_bind(vma, flags);
-	vma->bind_vma(vma, obj->cache_level,
+	i915_vma_bind(vma, obj->cache_level,
 		      flags & PIN_GLOBAL ? GLOBAL_BIND : 0);
 
 	return vma;
@@ -3685,7 +3698,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
 			if (drm_mm_node_allocated(&vma->node))
-				vma->bind_vma(vma, cache_level,
+				i915_vma_bind(vma, cache_level,
 						vma->bound & GLOBAL_BIND);
 	}
 
@@ -4040,10 +4053,11 @@ i915_vma_misplaced(struct i915_vma *vma, uint32_t alignment, uint64_t flags)
 }
 
 int
-i915_gem_object_pin(struct drm_i915_gem_object *obj,
-		    struct i915_address_space *vm,
-		    uint32_t alignment,
-		    uint64_t flags)
+i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
+			 struct i915_address_space *vm,
+			 uint32_t alignment,
+			 uint64_t flags,
+			 const struct i915_ggtt_view *view)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	struct i915_vma *vma;
@@ -4059,7 +4073,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	if (WARN_ON((flags & (PIN_MAPPABLE | PIN_GLOBAL)) == PIN_MAPPABLE))
 		return -EINVAL;
 
-	vma = i915_gem_obj_to_vma(obj, vm);
+	vma = i915_gem_obj_to_vma_view(obj, vm, view);
 	if (vma) {
 		if (WARN_ON(vma->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
 			return -EBUSY;
@@ -4069,7 +4083,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 			     "bo is already pinned with incorrect alignment:"
 			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     i915_gem_obj_offset(obj, vm), alignment,
+			     i915_gem_obj_offset_view(obj, vm, view->type), alignment,
 			     !!(flags & PIN_MAPPABLE),
 			     obj->map_and_fenceable);
 			ret = i915_vma_unbind(vma);
@@ -4082,13 +4096,14 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 
 	bound = vma ? vma->bound : 0;
 	if (vma == NULL || !drm_mm_node_allocated(&vma->node)) {
-		vma = i915_gem_object_bind_to_vm(obj, vm, alignment, flags);
+		vma = i915_gem_object_bind_to_vm(obj, vm, alignment,
+							flags, view);
 		if (IS_ERR(vma))
 			return PTR_ERR(vma);
 	}
 
 	if (flags & PIN_GLOBAL && !(vma->bound & GLOBAL_BIND))
-		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
 
 	if ((bound ^ vma->bound) & GLOBAL_BIND) {
 		bool mappable, fenceable;
@@ -4502,12 +4517,13 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	intel_runtime_pm_put(dev_priv);
 }
 
-struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
-				     struct i915_address_space *vm)
+struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
+				          struct i915_address_space *vm,
+					  const struct i915_ggtt_view *view)
 {
 	struct i915_vma *vma;
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
-		if (vma->vm == vm)
+		if (vma->vm == vm && vma->ggtt_view.type == view->type)
 			return vma;
 
 	return NULL;
@@ -5191,8 +5207,9 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 }
 
 /* All the new VM stuff */
-unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
-				  struct i915_address_space *vm)
+unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
+				       struct i915_address_space *vm,
+				       enum i915_ggtt_view_type view)
 {
 	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
 	struct i915_vma *vma;
@@ -5200,7 +5217,7 @@ unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
 	WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base);
 
 	list_for_each_entry(vma, &o->vma_list, vma_link) {
-		if (vma->vm == vm)
+		if (vma->vm == vm && vma->ggtt_view.type == view)
 			return vma->node.start;
 
 	}
@@ -5349,9 +5366,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
 {
 	struct i915_vma *vma;
 
-	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
-	if (vma->vm != i915_obj_to_ggtt(obj))
-		return NULL;
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (vma->vm != i915_obj_to_ggtt(obj))
+			continue;
+		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
+			return vma;
+	}
 
-	return vma;
+	return NULL;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 7d32571..f815565 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -574,7 +574,7 @@ static int do_switch(struct intel_engine_cs *ring,
 
 	vma = i915_gem_obj_to_ggtt(to->legacy_hw_ctx.rcs_state);
 	if (!(vma->bound & GLOBAL_BIND))
-		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level,
+		i915_vma_bind(vma, to->legacy_hw_ctx.rcs_state->cache_level,
 				GLOBAL_BIND);
 
 	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e1ed85a..07e2872 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -358,7 +358,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !(target_vma->bound & GLOBAL_BIND)))
-		target_vma->bind_vma(target_vma, target_i915_obj->cache_level,
+		i915_vma_bind(target_vma, target_i915_obj->cache_level,
 				GLOBAL_BIND);
 
 	/* Validate that the target is in a valid r/w GPU domain */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index de12017..3f8881c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -30,6 +30,8 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
+const struct i915_ggtt_view i915_ggtt_view_normal;
+
 static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv);
 static void chv_setup_private_ppat(struct drm_i915_private *dev_priv);
 
@@ -71,7 +73,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 }
 
 
-static void ppgtt_bind_vma(struct i915_vma *vma,
+static void ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			   enum i915_cache_level cache_level,
 			   u32 flags);
 static void ppgtt_unbind_vma(struct i915_vma *vma);
@@ -1194,7 +1196,7 @@ void  i915_ppgtt_release(struct kref *kref)
 }
 
 static void
-ppgtt_bind_vma(struct i915_vma *vma,
+ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 	       enum i915_cache_level cache_level,
 	       u32 flags)
 {
@@ -1202,7 +1204,7 @@ ppgtt_bind_vma(struct i915_vma *vma,
 	if (vma->obj->gt_ro)
 		flags |= PTE_READ_ONLY;
 
-	vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
+	vma->vm->insert_entries(vma->vm, pages, vma->node.start,
 				cache_level, flags);
 }
 
@@ -1337,7 +1339,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 		 * without telling our object about it. So we need to fake it.
 		 */
 		vma->bound &= ~GLOBAL_BIND;
-		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
 	}
 
 
@@ -1364,9 +1366,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	i915_ggtt_flush(dev_priv);
 }
 
-int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
+int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj,
+				const struct i915_ggtt_view *view)
 {
-	if (obj->has_dma_mapping)
+	if (obj->has_dma_mapping || view->type != I915_GGTT_VIEW_NORMAL)
 		return 0;
 
 	if (!dma_map_sg(&obj->base.dev->pdev->dev,
@@ -1523,7 +1526,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 }
 
 
-static void i915_ggtt_bind_vma(struct i915_vma *vma,
+static void i915_ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			       enum i915_cache_level cache_level,
 			       u32 unused)
 {
@@ -1532,7 +1535,7 @@ static void i915_ggtt_bind_vma(struct i915_vma *vma,
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
 	BUG_ON(!i915_is_ggtt(vma->vm));
-	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	intel_gtt_insert_sg_entries(pages, entry, flags);
 	vma->bound = GLOBAL_BIND;
 }
 
@@ -1556,7 +1559,7 @@ static void i915_ggtt_unbind_vma(struct i915_vma *vma)
 	intel_gtt_clear_range(first, size);
 }
 
-static void ggtt_bind_vma(struct i915_vma *vma,
+static void ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			  enum i915_cache_level cache_level,
 			  u32 flags)
 {
@@ -1582,7 +1585,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
 		if (!(vma->bound & GLOBAL_BIND) ||
 		    (cache_level != obj->cache_level)) {
-			vma->vm->insert_entries(vma->vm, obj->pages,
+			vma->vm->insert_entries(vma->vm, pages,
 						vma->node.start,
 						cache_level, flags);
 			vma->bound |= GLOBAL_BIND;
@@ -1594,7 +1597,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	     (cache_level != obj->cache_level))) {
 		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
 		appgtt->base.insert_entries(&appgtt->base,
-					    vma->obj->pages,
+					    pages,
 					    vma->node.start,
 					    cache_level, flags);
 		vma->bound |= LOCAL_BIND;
@@ -1625,7 +1628,8 @@ static void ggtt_unbind_vma(struct i915_vma *vma)
 	}
 }
 
-void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
+void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj,
+				const struct i915_ggtt_view *view)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1633,7 +1637,7 @@ void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 
 	interruptible = do_idling(dev_priv);
 
-	if (!obj->has_dma_mapping)
+	if (!obj->has_dma_mapping && view->type == I915_GGTT_VIEW_NORMAL)
 		dma_unmap_sg(&dev->pdev->dev,
 			     obj->pages->sgl, obj->pages->nents,
 			     PCI_DMA_BIDIRECTIONAL);
@@ -2135,7 +2139,8 @@ int i915_gem_gtt_init(struct drm_device *dev)
 }
 
 static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
-					      struct i915_address_space *vm)
+					      struct i915_address_space *vm,
+					      const struct i915_ggtt_view *view)
 {
 	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
 	if (vma == NULL)
@@ -2146,6 +2151,7 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&vma->exec_list);
 	vma->vm = vm;
 	vma->obj = obj;
+	vma->ggtt_view = *view;
 
 	switch (INTEL_INFO(vm->dev)->gen) {
 	case 9:
@@ -2184,14 +2190,43 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
 }
 
 struct i915_vma *
-i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
-				  struct i915_address_space *vm)
+i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
+				       struct i915_address_space *vm,
+				       const struct i915_ggtt_view *view)
 {
 	struct i915_vma *vma;
 
-	vma = i915_gem_obj_to_vma(obj, vm);
+	vma = i915_gem_obj_to_vma_view(obj, vm, view);
 	if (!vma)
-		vma = __i915_gem_vma_create(obj, vm);
+		vma = __i915_gem_vma_create(obj, vm, view);
 
 	return vma;
 }
+
+static inline
+struct sg_table *i915_ggtt_view_pages(struct i915_vma *vma)
+{
+	struct sg_table *pages = ERR_PTR(-EINVAL);
+
+	if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
+		pages = vma->obj->pages;
+	else
+		WARN(1, "Not implemented!\n");
+
+	return pages;
+}
+
+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
+			u32 flags)
+{
+	struct sg_table *pages = i915_ggtt_view_pages(vma);
+
+	if (pages && !IS_ERR(pages)) {
+		vma->bind_vma(vma, pages, cache_level, flags);
+
+		if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
+			sg_free_table(pages);
+			kfree(pages);
+		}
+	}
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d0562d0..db6ac60 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -109,7 +109,18 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+enum i915_ggtt_view_type {
+	I915_GGTT_VIEW_NORMAL = 0,
+};
+
+struct i915_ggtt_view {
+	enum i915_ggtt_view_type type;
+};
+
+extern const struct i915_ggtt_view i915_ggtt_view_normal;
+
 enum i915_cache_level;
+
 /**
  * A VMA represents a GEM BO that is bound into an address space. Therefore, a
  * VMA's presence cannot be guaranteed before binding, or after unbinding the
@@ -129,6 +140,15 @@ struct i915_vma {
 #define PTE_READ_ONLY	(1<<2)
 	unsigned int bound : 4;
 
+	/**
+	 * Support different GGTT views into the same object.
+	 * This means there can be multiple VMA mappings per object and per VM.
+	 * i915_ggtt_view_type is used to distinguish between those entries.
+	 * The default one of zero (I915_GGTT_VIEW_NORMAL) is default and also
+	 * assumed in GEM functions which take no ggtt view parameter.
+	 */
+	struct i915_ggtt_view ggtt_view;
+
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
 
@@ -161,7 +181,7 @@ struct i915_vma {
 	 * setting the valid PTE entries to a reserved scratch page. */
 	void (*unbind_vma)(struct i915_vma *vma);
 	/* Map an object into an address space with the given cache flags. */
-	void (*bind_vma)(struct i915_vma *vma,
+	void (*bind_vma)(struct i915_vma *vma, struct sg_table *pages,
 			 enum i915_cache_level cache_level,
 			 u32 flags);
 };
@@ -299,7 +319,10 @@ void i915_check_and_clear_faults(struct drm_device *dev);
 void i915_gem_suspend_gtt_mappings(struct drm_device *dev);
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 
-int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
+int __must_check
+i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj,
+			    const struct i915_ggtt_view *view);
+void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj,
+				const struct i915_ggtt_view *view);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index d17360b..8ebe45b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
 			break;
 
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
-			if (vma->vm == vm && vma->pin_count > 0) {
+			if (vma->vm == vm && vma->pin_count > 0)
 				capture_bo(err++, vma);
-				break;
-			}
 	}
 
 	return err - first;
@@ -1096,10 +1094,8 @@ static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
-			if (vma->vm == vm && vma->pin_count > 0) {
+			if (vma->vm == vm && vma->pin_count > 0)
 				i++;
-				break;
-			}
 	}
 	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
 
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [RFC] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-06 14:39 ` [RFC] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
@ 2014-11-12 17:02   ` Daniel Vetter
  2014-11-24 12:32     ` Tvrtko Ursulin
  2014-11-27 14:52   ` [PATCH] " Tvrtko Ursulin
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-12 17:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Thu, Nov 06, 2014 at 02:39:25PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Things like reliable GGTT mmaps and mirrored 2d-on-3d display will need
> to map objects into the same address space multiple times.
> 
> This also means that objects now can have multiple VMA entries.
> 
> Added a GGTT view concept and linked it with the VMA to distinguish betwen
> multiple instances per address space.
> 
> New objects and GEM functions which do not take this new view as a parameter
> assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
> previous behaviour.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Let's hope I've picked the latest version, not sure. Please either have
in-patch changelogs or coverletters with what's new to help confused
maintainers ;-)

Looks good overal, just a few things around the lifetime rules for sg
tables and related stuff that we've discussed offline. And one nit.

I think with these addressed this is ready for detailed review and merging
(also applies to the -internal series).

Cheers, Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  4 +-
>  drivers/gpu/drm/i915/i915_drv.h            | 44 +++++++++++++--
>  drivers/gpu/drm/i915/i915_gem.c            | 88 ++++++++++++++++++------------
>  drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 71 ++++++++++++++++++------
>  drivers/gpu/drm/i915/i915_gem_gtt.h        | 29 +++++++++-
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  8 +--
>  8 files changed, 179 insertions(+), 69 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 0a69813..3f23c51 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -154,8 +154,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  			seq_puts(m, " (pp");
>  		else
>  			seq_puts(m, " (g");
> -		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
> -			   vma->node.start, vma->node.size);
> +		seq_printf(m, "gtt offset: %08lx, size: %08lx, type: %u)",
> +			   vma->node.start, vma->node.size, vma->ggtt_view.type);
>  	}
>  	if (obj->stolen)
>  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 54691bc..50109fa 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2420,10 +2420,23 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
>  #define PIN_GLOBAL 0x4
>  #define PIN_OFFSET_BIAS 0x8
>  #define PIN_OFFSET_MASK (~4095)
> +int __must_check i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
> +					  struct i915_address_space *vm,
> +					  uint32_t alignment,
> +					  uint64_t flags,
> +					  const struct i915_ggtt_view *view);
> +static inline
>  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm,
>  				     uint32_t alignment,
> -				     uint64_t flags);
> +				     uint64_t flags)
> +{
> +	return i915_gem_object_pin_view(obj, vm, alignment, flags,
> +						&i915_ggtt_view_normal);
> +}
> +
> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> +			u32 flags);
>  int __must_check i915_vma_unbind(struct i915_vma *vma);
>  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv);
> @@ -2570,18 +2583,41 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
>  
>  void i915_gem_restore_fences(struct drm_device *dev);
>  
> +unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
> +				       struct i915_address_space *vm,
> +				       enum i915_ggtt_view_type view);
> +static inline
>  unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> -				  struct i915_address_space *vm);
> +				  struct i915_address_space *vm)
> +{
> +	return i915_gem_obj_offset_view(o, vm, I915_GGTT_VIEW_NORMAL);
> +}
>  bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
>  bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
>  			struct i915_address_space *vm);
>  unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
>  				struct i915_address_space *vm);
> +struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
> +				          struct i915_address_space *vm,
> +					  const struct i915_ggtt_view *view);
> +static inline
>  struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> -				     struct i915_address_space *vm);
> +				     struct i915_address_space *vm)
> +{
> +	return i915_gem_obj_to_vma_view(obj, vm, &i915_ggtt_view_normal);
> +}
> +struct i915_vma *
> +i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
> +				       struct i915_address_space *vm,
> +				       const struct i915_ggtt_view *view);
> +static inline
>  struct i915_vma *
>  i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
> -				  struct i915_address_space *vm);
> +				  struct i915_address_space *vm)
> +{
> +	return i915_gem_obj_lookup_or_create_vma_view(obj, vm,
> +						&i915_ggtt_view_normal);
> +}
>  
>  struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj);
>  static inline bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1de94cc..b46a0f7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2007,8 +2007,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>  			/* For the unbound phase, this should be a no-op! */
>  			list_for_each_entry_safe(vma, v,
>  						 &obj->vma_list, vma_link)
> -				if (i915_vma_unbind(vma))
> -					break;
> +				i915_vma_unbind(vma);
>  
>  			if (i915_gem_object_put_pages(obj) == 0)
>  				count += obj->base.size >> PAGE_SHIFT;
> @@ -2209,17 +2208,14 @@ void i915_vma_move_to_active(struct i915_vma *vma,
>  static void
>  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  {
> -	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> -	struct i915_address_space *vm;
>  	struct i915_vma *vma;
>  
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> -	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> -		vma = i915_gem_obj_to_vma(obj, vm);
> -		if (vma && !list_empty(&vma->mm_list))
> -			list_move_tail(&vma->mm_list, &vm->inactive_list);
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		if (!list_empty(&vma->mm_list))
> +			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
>  	}
>  
>  	intel_fb_obj_flush(obj, true);
> @@ -2930,6 +2926,21 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
>  					    old_write_domain);
>  }
>  
> +static unsigned int count_ggtt_mappings(struct drm_i915_gem_object *obj)
> +{
> +	struct i915_vma *vma;
> +	unsigned int count = 0;
> +
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		if (vma->vm != i915_obj_to_ggtt(obj))
> +			continue;
> +		if (vma->bound & GLOBAL_BIND)
> +			count++;
> +	}
> +
> +	return count;
> +}
> +
>  int i915_vma_unbind(struct i915_vma *vma)
>  {
>  	struct drm_i915_gem_object *obj = vma->obj;
> @@ -2960,7 +2971,7 @@ int i915_vma_unbind(struct i915_vma *vma)
>  	/* Throw away the active reference before moving to the unbound list */
>  	i915_gem_object_retire(obj);
>  
> -	if (i915_is_ggtt(vma->vm)) {
> +	if (i915_is_ggtt(vma->vm) && count_ggtt_mappings(obj) == 1) {

This should probably be a check for view == NORMAL instead of
count_mappings == 1 - finish_gtt is about cpu writes through the gtt and
so only applies to the linear/contiguous mapping.

>  		i915_gem_object_finish_gtt(obj);
>  
>  		/* release the fence reg _after_ flushing */
> @@ -2983,7 +2994,7 @@ int i915_vma_unbind(struct i915_vma *vma)
>  	/* Since the unbound list is global, only move to that list if
>  	 * no more VMAs exist. */
>  	if (list_empty(&obj->vma_list)) {
> -		i915_gem_gtt_finish_object(obj);
> +		i915_gem_gtt_finish_object(obj, &vma->ggtt_view);
>  		list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
>  	}
>  
> @@ -3393,7 +3404,8 @@ static struct i915_vma *
>  i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  			   struct i915_address_space *vm,
>  			   unsigned alignment,
> -			   uint64_t flags)
> +			   uint64_t flags,
> +			   const struct i915_ggtt_view *view)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -3443,7 +3455,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  
>  	i915_gem_object_pin_pages(obj);
>  
> -	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
> +	vma = i915_gem_obj_lookup_or_create_vma_view(obj, vm, view);
>  	if (IS_ERR(vma))
>  		goto err_unpin;
>  
> @@ -3469,15 +3481,16 @@ search_free:
>  		goto err_remove_node;
>  	}
>  
> -	ret = i915_gem_gtt_prepare_object(obj);
> +	ret = i915_gem_gtt_prepare_object(obj, &vma->ggtt_view);
>  	if (ret)
>  		goto err_remove_node;
>  
> -	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> +	if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> +		list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
>  	list_add_tail(&vma->mm_list, &vm->inactive_list);
>  
>  	trace_i915_vma_bind(vma, flags);
> -	vma->bind_vma(vma, obj->cache_level,
> +	i915_vma_bind(vma, obj->cache_level,
>  		      flags & PIN_GLOBAL ? GLOBAL_BIND : 0);
>  
>  	return vma;
> @@ -3685,7 +3698,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  
>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
>  			if (drm_mm_node_allocated(&vma->node))
> -				vma->bind_vma(vma, cache_level,
> +				i915_vma_bind(vma, cache_level,
>  						vma->bound & GLOBAL_BIND);
>  	}
>  
> @@ -4040,10 +4053,11 @@ i915_vma_misplaced(struct i915_vma *vma, uint32_t alignment, uint64_t flags)
>  }
>  
>  int
> -i915_gem_object_pin(struct drm_i915_gem_object *obj,
> -		    struct i915_address_space *vm,
> -		    uint32_t alignment,
> -		    uint64_t flags)
> +i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
> +			 struct i915_address_space *vm,
> +			 uint32_t alignment,
> +			 uint64_t flags,
> +			 const struct i915_ggtt_view *view)
>  {
>  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  	struct i915_vma *vma;
> @@ -4059,7 +4073,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  	if (WARN_ON((flags & (PIN_MAPPABLE | PIN_GLOBAL)) == PIN_MAPPABLE))
>  		return -EINVAL;
>  
> -	vma = i915_gem_obj_to_vma(obj, vm);
> +	vma = i915_gem_obj_to_vma_view(obj, vm, view);
>  	if (vma) {
>  		if (WARN_ON(vma->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
>  			return -EBUSY;
> @@ -4069,7 +4083,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  			     "bo is already pinned with incorrect alignment:"
>  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
>  			     " obj->map_and_fenceable=%d\n",
> -			     i915_gem_obj_offset(obj, vm), alignment,
> +			     i915_gem_obj_offset_view(obj, vm, view->type), alignment,
>  			     !!(flags & PIN_MAPPABLE),
>  			     obj->map_and_fenceable);
>  			ret = i915_vma_unbind(vma);
> @@ -4082,13 +4096,14 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  
>  	bound = vma ? vma->bound : 0;
>  	if (vma == NULL || !drm_mm_node_allocated(&vma->node)) {
> -		vma = i915_gem_object_bind_to_vm(obj, vm, alignment, flags);
> +		vma = i915_gem_object_bind_to_vm(obj, vm, alignment,
> +							flags, view);

Nit: Please align continuations for paramter lists to the opening (
checkpatch --strict can check for that.

>  		if (IS_ERR(vma))
>  			return PTR_ERR(vma);
>  	}
>  
>  	if (flags & PIN_GLOBAL && !(vma->bound & GLOBAL_BIND))
> -		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> +		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
>  
>  	if ((bound ^ vma->bound) & GLOBAL_BIND) {
>  		bool mappable, fenceable;
> @@ -4502,12 +4517,13 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  	intel_runtime_pm_put(dev_priv);
>  }
>  
> -struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> -				     struct i915_address_space *vm)
> +struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
> +				          struct i915_address_space *vm,
> +					  const struct i915_ggtt_view *view)
>  {
>  	struct i915_vma *vma;
>  	list_for_each_entry(vma, &obj->vma_list, vma_link)
> -		if (vma->vm == vm)
> +		if (vma->vm == vm && vma->ggtt_view.type == view->type)
>  			return vma;
>  
>  	return NULL;
> @@ -5191,8 +5207,9 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
>  }
>  
>  /* All the new VM stuff */
> -unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> -				  struct i915_address_space *vm)
> +unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
> +				       struct i915_address_space *vm,
> +				       enum i915_ggtt_view_type view)
>  {
>  	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
>  	struct i915_vma *vma;
> @@ -5200,7 +5217,7 @@ unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
>  	WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base);
>  
>  	list_for_each_entry(vma, &o->vma_list, vma_link) {
> -		if (vma->vm == vm)
> +		if (vma->vm == vm && vma->ggtt_view.type == view)
>  			return vma->node.start;
>  
>  	}
> @@ -5349,9 +5366,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
>  {
>  	struct i915_vma *vma;
>  
> -	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
> -	if (vma->vm != i915_obj_to_ggtt(obj))
> -		return NULL;
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		if (vma->vm != i915_obj_to_ggtt(obj))
> +			continue;
> +		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> +			return vma;
> +	}
>  
> -	return vma;
> +	return NULL;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 7d32571..f815565 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -574,7 +574,7 @@ static int do_switch(struct intel_engine_cs *ring,
>  
>  	vma = i915_gem_obj_to_ggtt(to->legacy_hw_ctx.rcs_state);
>  	if (!(vma->bound & GLOBAL_BIND))
> -		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level,
> +		i915_vma_bind(vma, to->legacy_hw_ctx.rcs_state->cache_level,
>  				GLOBAL_BIND);
>  
>  	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index e1ed85a..07e2872 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -358,7 +358,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  	if (unlikely(IS_GEN6(dev) &&
>  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
>  	    !(target_vma->bound & GLOBAL_BIND)))
> -		target_vma->bind_vma(target_vma, target_i915_obj->cache_level,
> +		i915_vma_bind(target_vma, target_i915_obj->cache_level,
>  				GLOBAL_BIND);
>  
>  	/* Validate that the target is in a valid r/w GPU domain */
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index de12017..3f8881c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -30,6 +30,8 @@
>  #include "i915_trace.h"
>  #include "intel_drv.h"
>  
> +const struct i915_ggtt_view i915_ggtt_view_normal;
> +
>  static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv);
>  static void chv_setup_private_ppat(struct drm_i915_private *dev_priv);
>  
> @@ -71,7 +73,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>  }
>  
>  
> -static void ppgtt_bind_vma(struct i915_vma *vma,
> +static void ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  			   enum i915_cache_level cache_level,
>  			   u32 flags);
>  static void ppgtt_unbind_vma(struct i915_vma *vma);
> @@ -1194,7 +1196,7 @@ void  i915_ppgtt_release(struct kref *kref)
>  }
>  
>  static void
> -ppgtt_bind_vma(struct i915_vma *vma,
> +ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  	       enum i915_cache_level cache_level,
>  	       u32 flags)
>  {
> @@ -1202,7 +1204,7 @@ ppgtt_bind_vma(struct i915_vma *vma,
>  	if (vma->obj->gt_ro)
>  		flags |= PTE_READ_ONLY;
>  
> -	vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
> +	vma->vm->insert_entries(vma->vm, pages, vma->node.start,
>  				cache_level, flags);
>  }
>  
> @@ -1337,7 +1339,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  		 * without telling our object about it. So we need to fake it.
>  		 */
>  		vma->bound &= ~GLOBAL_BIND;
> -		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> +		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
>  	}
>  
>  
> @@ -1364,9 +1366,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  	i915_ggtt_flush(dev_priv);
>  }
>  
> -int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
> +int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj,
> +				const struct i915_ggtt_view *view)
>  {
> -	if (obj->has_dma_mapping)
> +	if (obj->has_dma_mapping || view->type != I915_GGTT_VIEW_NORMAL)

This check and the one for finish_gtt are wrong. prepare/finish_gtt set up
the dma_addr fields in the sg table (if available also the iommu mappings
on big core). And that _must_ be done when the first view shows up, no
matter which one it is. E.g. userspace might display a rotate buffer right
aways without even rendering or writing into it (i.e. leaving it black).

The change around finish_gtt is actually a leak since if the last view
around is the rotate one (which can happen if userspace drops the buffer
but leaves it displayed) we won't clean up (and so leak) the iommu
mapping.

A related comment which only applies to the internal patch which uses all
this: For ->vma_bind you only need to fill in the dma addr, not the page
pointer - the pte writing functions don't need more. Even more some object
types (stolen and some dma-buf imports) don't have any page pointers
assigned, and so the rotation code would oops since the page pointer is
NULL.

>  		return 0;
>  
>  	if (!dma_map_sg(&obj->base.dev->pdev->dev,
> @@ -1523,7 +1526,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>  }
>  
>  
> -static void i915_ggtt_bind_vma(struct i915_vma *vma,
> +static void i915_ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  			       enum i915_cache_level cache_level,
>  			       u32 unused)
>  {
> @@ -1532,7 +1535,7 @@ static void i915_ggtt_bind_vma(struct i915_vma *vma,
>  		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
>  
>  	BUG_ON(!i915_is_ggtt(vma->vm));
> -	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> +	intel_gtt_insert_sg_entries(pages, entry, flags);
>  	vma->bound = GLOBAL_BIND;
>  }
>  
> @@ -1556,7 +1559,7 @@ static void i915_ggtt_unbind_vma(struct i915_vma *vma)
>  	intel_gtt_clear_range(first, size);
>  }
>  
> -static void ggtt_bind_vma(struct i915_vma *vma,
> +static void ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  			  enum i915_cache_level cache_level,
>  			  u32 flags)
>  {
> @@ -1582,7 +1585,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
>  	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
>  		if (!(vma->bound & GLOBAL_BIND) ||
>  		    (cache_level != obj->cache_level)) {
> -			vma->vm->insert_entries(vma->vm, obj->pages,
> +			vma->vm->insert_entries(vma->vm, pages,
>  						vma->node.start,
>  						cache_level, flags);
>  			vma->bound |= GLOBAL_BIND;
> @@ -1594,7 +1597,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
>  	     (cache_level != obj->cache_level))) {
>  		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
>  		appgtt->base.insert_entries(&appgtt->base,
> -					    vma->obj->pages,
> +					    pages,
>  					    vma->node.start,
>  					    cache_level, flags);
>  		vma->bound |= LOCAL_BIND;
> @@ -1625,7 +1628,8 @@ static void ggtt_unbind_vma(struct i915_vma *vma)
>  	}
>  }
>  
> -void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
> +void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj,
> +				const struct i915_ggtt_view *view)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1633,7 +1637,7 @@ void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
>  
>  	interruptible = do_idling(dev_priv);
>  
> -	if (!obj->has_dma_mapping)
> +	if (!obj->has_dma_mapping && view->type == I915_GGTT_VIEW_NORMAL)
>  		dma_unmap_sg(&dev->pdev->dev,
>  			     obj->pages->sgl, obj->pages->nents,
>  			     PCI_DMA_BIDIRECTIONAL);
> @@ -2135,7 +2139,8 @@ int i915_gem_gtt_init(struct drm_device *dev)
>  }
>  
>  static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
> -					      struct i915_address_space *vm)
> +					      struct i915_address_space *vm,
> +					      const struct i915_ggtt_view *view)
>  {
>  	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
>  	if (vma == NULL)
> @@ -2146,6 +2151,7 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  	INIT_LIST_HEAD(&vma->exec_list);
>  	vma->vm = vm;
>  	vma->obj = obj;
> +	vma->ggtt_view = *view;
>  
>  	switch (INTEL_INFO(vm->dev)->gen) {
>  	case 9:
> @@ -2184,14 +2190,43 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  }
>  
>  struct i915_vma *
> -i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
> -				  struct i915_address_space *vm)
> +i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
> +				       struct i915_address_space *vm,
> +				       const struct i915_ggtt_view *view)
>  {
>  	struct i915_vma *vma;
>  
> -	vma = i915_gem_obj_to_vma(obj, vm);
> +	vma = i915_gem_obj_to_vma_view(obj, vm, view);
>  	if (!vma)
> -		vma = __i915_gem_vma_create(obj, vm);
> +		vma = __i915_gem_vma_create(obj, vm, view);
>  
>  	return vma;
>  }
> +
> +static inline
> +struct sg_table *i915_ggtt_view_pages(struct i915_vma *vma)
> +{
> +	struct sg_table *pages = ERR_PTR(-EINVAL);
> +
> +	if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> +		pages = vma->obj->pages;
> +	else
> +		WARN(1, "Not implemented!\n");
> +
> +	return pages;
> +}
> +
> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> +			u32 flags)
> +{
> +	struct sg_table *pages = i915_ggtt_view_pages(vma);
> +
> +	if (pages && !IS_ERR(pages)) {
> +		vma->bind_vma(vma, pages, cache_level, flags);
> +
> +		if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
> +			sg_free_table(pages);
> +			kfree(pages);
> +		}
> +	}
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index d0562d0..db6ac60 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -109,7 +109,18 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>  #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
>  #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
>  
> +enum i915_ggtt_view_type {
> +	I915_GGTT_VIEW_NORMAL = 0,
> +};
> +
> +struct i915_ggtt_view {
> +	enum i915_ggtt_view_type type;
> +};
> +
> +extern const struct i915_ggtt_view i915_ggtt_view_normal;
> +
>  enum i915_cache_level;
> +
>  /**
>   * A VMA represents a GEM BO that is bound into an address space. Therefore, a
>   * VMA's presence cannot be guaranteed before binding, or after unbinding the
> @@ -129,6 +140,15 @@ struct i915_vma {
>  #define PTE_READ_ONLY	(1<<2)
>  	unsigned int bound : 4;
>  
> +	/**
> +	 * Support different GGTT views into the same object.
> +	 * This means there can be multiple VMA mappings per object and per VM.
> +	 * i915_ggtt_view_type is used to distinguish between those entries.
> +	 * The default one of zero (I915_GGTT_VIEW_NORMAL) is default and also
> +	 * assumed in GEM functions which take no ggtt view parameter.
> +	 */
> +	struct i915_ggtt_view ggtt_view;
> +
>  	/** This object's place on the active/inactive lists */
>  	struct list_head mm_list;
>  
> @@ -161,7 +181,7 @@ struct i915_vma {
>  	 * setting the valid PTE entries to a reserved scratch page. */
>  	void (*unbind_vma)(struct i915_vma *vma);
>  	/* Map an object into an address space with the given cache flags. */
> -	void (*bind_vma)(struct i915_vma *vma,
> +	void (*bind_vma)(struct i915_vma *vma, struct sg_table *pages,
>  			 enum i915_cache_level cache_level,
>  			 u32 flags);
>  };
> @@ -299,7 +319,10 @@ void i915_check_and_clear_faults(struct drm_device *dev);
>  void i915_gem_suspend_gtt_mappings(struct drm_device *dev);
>  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
>  
> -int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> -void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
> +int __must_check
> +i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj,
> +			    const struct i915_ggtt_view *view);
> +void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj,
> +				const struct i915_ggtt_view *view);
>  
>  #endif
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index d17360b..8ebe45b 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
>  			break;
>  
>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
> -			if (vma->vm == vm && vma->pin_count > 0) {
> +			if (vma->vm == vm && vma->pin_count > 0)
>  				capture_bo(err++, vma);
> -				break;
> -			}
>  	}
>  
>  	return err - first;
> @@ -1096,10 +1094,8 @@ static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
> -			if (vma->vm == vm && vma->pin_count > 0) {
> +			if (vma->vm == vm && vma->pin_count > 0)
>  				i++;
> -				break;
> -			}
>  	}
>  	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
>  
> -- 
> 2.1.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-12 17:02   ` Daniel Vetter
@ 2014-11-24 12:32     ` Tvrtko Ursulin
  2014-11-24 12:36       ` Chris Wilson
  2014-11-24 13:57       ` Daniel Vetter
  0 siblings, 2 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-24 12:32 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, Intel-gfx


On 11/12/2014 05:02 PM, Daniel Vetter wrote:
> On Thu, Nov 06, 2014 at 02:39:25PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Things like reliable GGTT mmaps and mirrored 2d-on-3d display will need
>> to map objects into the same address space multiple times.
>>
>> This also means that objects now can have multiple VMA entries.
>>
>> Added a GGTT view concept and linked it with the VMA to distinguish betwen
>> multiple instances per address space.
>>
>> New objects and GEM functions which do not take this new view as a parameter
>> assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
>> previous behaviour.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> Let's hope I've picked the latest version, not sure. Please either have
> in-patch changelogs or coverletters with what's new to help confused
> maintainers ;-)

I had a cover letter and version when it was a patch series _and_ the 
concept split into multiple VMA and GGTT view. Once only one patch 
remained from the patch series and it was redesigned to kind of merge 
two concepts into one simplified approach I did not think cover letter 
makes sense any more. Will definitely version as this design goes forward.

> Looks good overal, just a few things around the lifetime rules for sg
> tables and related stuff that we've discussed offline. And one nit.
>
> I think with these addressed this is ready for detailed review and merging
> (also applies to the -internal series).

Excellent, thank you! Just one question below:

>> -int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
>> +int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj,
>> +				const struct i915_ggtt_view *view)
>>   {
>> -	if (obj->has_dma_mapping)
>> +	if (obj->has_dma_mapping || view->type != I915_GGTT_VIEW_NORMAL)
>
> This check and the one for finish_gtt are wrong. prepare/finish_gtt set up
> the dma_addr fields in the sg table (if available also the iommu mappings
> on big core). And that _must_ be done when the first view shows up, no
> matter which one it is. E.g. userspace might display a rotate buffer right
> aways without even rendering or writing into it (i.e. leaving it black).

The plan was to ensure that the "normal" view is always there first. 
Otherwise we can't "steal" DMA addresses and the other views do not have 
a proper SG table to generate DMA addresses from. So the internal code 
was doing that. Am I missing something?

> The change around finish_gtt is actually a leak since if the last view
> around is the rotate one (which can happen if userspace drops the buffer
> but leaves it displayed) we won't clean up (and so leak) the iommu
> mapping.

So I somehow need to ensure finish_gtt on a "normal" view is done last, 
correct? Move it from VMA destructon to object destruction? Would this 
have any implications for the shrinker?

Thanks,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-24 12:32     ` Tvrtko Ursulin
@ 2014-11-24 12:36       ` Chris Wilson
  2014-11-24 13:57       ` Daniel Vetter
  1 sibling, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2014-11-24 12:36 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Mon, Nov 24, 2014 at 12:32:12PM +0000, Tvrtko Ursulin wrote:
> 
> On 11/12/2014 05:02 PM, Daniel Vetter wrote:
> >The change around finish_gtt is actually a leak since if the last view
> >around is the rotate one (which can happen if userspace drops the buffer
> >but leaves it displayed) we won't clean up (and so leak) the iommu
> >mapping.
> 
> So I somehow need to ensure finish_gtt on a "normal" view is done
> last, correct? Move it from VMA destructon to object destruction?
> Would this have any implications for the shrinker?

Just have a obj->dma_pin_count?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-24 12:32     ` Tvrtko Ursulin
  2014-11-24 12:36       ` Chris Wilson
@ 2014-11-24 13:57       ` Daniel Vetter
  1 sibling, 0 replies; 43+ messages in thread
From: Daniel Vetter @ 2014-11-24 13:57 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Mon, Nov 24, 2014 at 12:32:12PM +0000, Tvrtko Ursulin wrote:
> 
> On 11/12/2014 05:02 PM, Daniel Vetter wrote:
> >On Thu, Nov 06, 2014 at 02:39:25PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>Things like reliable GGTT mmaps and mirrored 2d-on-3d display will need
> >>to map objects into the same address space multiple times.
> >>
> >>This also means that objects now can have multiple VMA entries.
> >>
> >>Added a GGTT view concept and linked it with the VMA to distinguish betwen
> >>multiple instances per address space.
> >>
> >>New objects and GEM functions which do not take this new view as a parameter
> >>assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
> >>previous behaviour.
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> >Let's hope I've picked the latest version, not sure. Please either have
> >in-patch changelogs or coverletters with what's new to help confused
> >maintainers ;-)
> 
> I had a cover letter and version when it was a patch series _and_ the
> concept split into multiple VMA and GGTT view. Once only one patch remained
> from the patch series and it was redesigned to kind of merge two concepts
> into one simplified approach I did not think cover letter makes sense any
> more. Will definitely version as this design goes forward.
> 
> >Looks good overal, just a few things around the lifetime rules for sg
> >tables and related stuff that we've discussed offline. And one nit.
> >
> >I think with these addressed this is ready for detailed review and merging
> >(also applies to the -internal series).
> 
> Excellent, thank you! Just one question below:
> 
> >>-int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
> >>+int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj,
> >>+				const struct i915_ggtt_view *view)
> >>  {
> >>-	if (obj->has_dma_mapping)
> >>+	if (obj->has_dma_mapping || view->type != I915_GGTT_VIEW_NORMAL)
> >
> >This check and the one for finish_gtt are wrong. prepare/finish_gtt set up
> >the dma_addr fields in the sg table (if available also the iommu mappings
> >on big core). And that _must_ be done when the first view shows up, no
> >matter which one it is. E.g. userspace might display a rotate buffer right
> >aways without even rendering or writing into it (i.e. leaving it black).
> 
> The plan was to ensure that the "normal" view is always there first.
> Otherwise we can't "steal" DMA addresses and the other views do not have a
> proper SG table to generate DMA addresses from. So the internal code was
> doing that. Am I missing something?
> 
> >The change around finish_gtt is actually a leak since if the last view
> >around is the rotate one (which can happen if userspace drops the buffer
> >but leaves it displayed) we won't clean up (and so leak) the iommu
> >mapping.
> 
> So I somehow need to ensure finish_gtt on a "normal" view is done last,
> correct? Move it from VMA destructon to object destruction? Would this have
> any implications for the shrinker?

Normal or not is irrelevant, all we need is _one_ vma to ensure that the
dma mapping is around. Essentially this amounts to what Chris suggested,
except that we already implicitly track dma_pin_count as

dma_pin_count = 0;
for_each_entry(obj->vma_list, entry)
	dma_pin_count++;

The difference is that we only care about 0->1 and 1->0 transitions, so a
list_empty check is all we really need. And since it doesn't matter
whether we check the head node or a member of the list to check emptiness
we can just look at

list_empty(&vma->vma_list);

in the vma_bind/unbind code to decide whether we should set up or destroy
the dma mapping.

Again the type of view the vma represents (normal ggtt, ppgtt, rotate or
anything else really) is completely irrelevant. I hope that explains the
confusion here a bit.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-06 14:39 ` [RFC] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
  2014-11-12 17:02   ` Daniel Vetter
@ 2014-11-27 14:52   ` Tvrtko Ursulin
  2014-11-28  0:59     ` [PATCH] drm/i915: Infrastructure for supporting shuang.he
  2014-11-28 17:31     ` [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object Daniel Vetter
  1 sibling, 2 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-11-27 14:52 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
to map objects into the same address space multiple times.

Added a GGTT view concept and linked it with the VMA to distinguish between
multiple instances per address space.

New objects and GEM functions which do not take this new view as a parameter
assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
previous behaviour.

This now means that objects can have multiple VMA entries so the code which
assumed there will only be one also had to be modified.

Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
which is DMA mapped on first VMA instantiation and unmapped on the last one
going away.

v2:
    * Removed per view special casing in i915_gem_ggtt_prepare /
      finish_object in favour of creating and destroying DMA mappings
      on first VMA instantiation and last VMA destruction. (Daniel Vetter)
    * Simplified i915_vma_unbind which does not need to count the GGTT views.
      (Daniel Vetter)
    * Also moved obj->map_and_fenceable reset under the same check.
    * Checkpatch cleanups.

Issue: VIZ-4544
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  5 +-
 drivers/gpu/drm/i915/i915_drv.h            | 46 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem.c            | 73 ++++++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  4 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 61 +++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_gtt.h        | 22 ++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  8 +---
 8 files changed, 159 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 319da61..1a9569f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -154,8 +154,9 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 			seq_puts(m, " (pp");
 		else
 			seq_puts(m, " (g");
-		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
-			   vma->node.start, vma->node.size);
+		seq_printf(m, "gtt offset: %08lx, size: %08lx, type: %u)",
+			   vma->node.start, vma->node.size,
+			   vma->ggtt_view.type);
 	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c4f2cb6..6250a2c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2501,10 +2501,23 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
 #define PIN_GLOBAL 0x4
 #define PIN_OFFSET_BIAS 0x8
 #define PIN_OFFSET_MASK (~4095)
+int __must_check i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
+					  struct i915_address_space *vm,
+					  uint32_t alignment,
+					  uint64_t flags,
+					  const struct i915_ggtt_view *view);
+static inline
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm,
 				     uint32_t alignment,
-				     uint64_t flags);
+				     uint64_t flags)
+{
+	return i915_gem_object_pin_view(obj, vm, alignment, flags,
+						&i915_ggtt_view_normal);
+}
+
+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
+		   u32 flags);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv);
@@ -2656,18 +2669,43 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 
 void i915_gem_restore_fences(struct drm_device *dev);
 
+unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
+				       struct i915_address_space *vm,
+				       enum i915_ggtt_view_type view);
+static inline
 unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
-				  struct i915_address_space *vm);
+				  struct i915_address_space *vm)
+{
+	return i915_gem_obj_offset_view(o, vm, I915_GGTT_VIEW_NORMAL);
+}
 bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
 bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
 			struct i915_address_space *vm);
 unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 				struct i915_address_space *vm);
+struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
+					  struct i915_address_space *vm,
+					  const struct i915_ggtt_view *view);
+static inline
 struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
-				     struct i915_address_space *vm);
+				     struct i915_address_space *vm)
+{
+	return i915_gem_obj_to_vma_view(obj, vm, &i915_ggtt_view_normal);
+}
+
+struct i915_vma *
+i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
+				       struct i915_address_space *vm,
+				       const struct i915_ggtt_view *view);
+
+static inline
 struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
-				  struct i915_address_space *vm);
+				  struct i915_address_space *vm)
+{
+	return i915_gem_obj_lookup_or_create_vma_view(obj, vm,
+						&i915_ggtt_view_normal);
+}
 
 struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj);
 static inline bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 86cf428..6213c07 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 			/* For the unbound phase, this should be a no-op! */
 			list_for_each_entry_safe(vma, v,
 						 &obj->vma_list, vma_link)
-				if (i915_vma_unbind(vma))
-					break;
+				i915_vma_unbind(vma);
 
 			if (i915_gem_object_put_pages(obj) == 0)
 				count += obj->base.size >> PAGE_SHIFT;
@@ -2292,17 +2291,14 @@ void i915_vma_move_to_active(struct i915_vma *vma,
 static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
-	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-	struct i915_address_space *vm;
 	struct i915_vma *vma;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		vma = i915_gem_obj_to_vma(obj, vm);
-		if (vma && !list_empty(&vma->mm_list))
-			list_move_tail(&vma->mm_list, &vm->inactive_list);
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (!list_empty(&vma->mm_list))
+			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
 	}
 
 	intel_fb_obj_flush(obj, true);
@@ -3043,7 +3039,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 	/* Throw away the active reference before moving to the unbound list */
 	i915_gem_object_retire(obj);
 
-	if (i915_is_ggtt(vma->vm)) {
+	if (i915_is_ggtt(vma->vm) &&
+	    vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
 		i915_gem_object_finish_gtt(obj);
 
 		/* release the fence reg _after_ flushing */
@@ -3057,7 +3054,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 	vma->unbind_vma(vma);
 
 	list_del_init(&vma->mm_list);
-	if (i915_is_ggtt(vma->vm))
+	if (i915_is_ggtt(vma->vm) &&
+	    vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
 		obj->map_and_fenceable = false;
 
 	drm_mm_remove_node(&vma->node);
@@ -3476,7 +3474,8 @@ static struct i915_vma *
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
-			   uint64_t flags)
+			   uint64_t flags,
+			   const struct i915_ggtt_view *view)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3526,7 +3525,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
+	vma = i915_gem_obj_lookup_or_create_vma_view(obj, vm, view);
 	if (IS_ERR(vma))
 		goto err_unpin;
 
@@ -3560,7 +3559,7 @@ search_free:
 	list_add_tail(&vma->mm_list, &vm->inactive_list);
 
 	trace_i915_vma_bind(vma, flags);
-	vma->bind_vma(vma, obj->cache_level,
+	i915_vma_bind(vma, obj->cache_level,
 		      flags & PIN_GLOBAL ? GLOBAL_BIND : 0);
 
 	return vma;
@@ -3768,8 +3767,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
 			if (drm_mm_node_allocated(&vma->node))
-				vma->bind_vma(vma, cache_level,
-						vma->bound & GLOBAL_BIND);
+				i915_vma_bind(vma, cache_level,
+					      vma->bound & GLOBAL_BIND);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -4123,10 +4122,11 @@ i915_vma_misplaced(struct i915_vma *vma, uint32_t alignment, uint64_t flags)
 }
 
 int
-i915_gem_object_pin(struct drm_i915_gem_object *obj,
-		    struct i915_address_space *vm,
-		    uint32_t alignment,
-		    uint64_t flags)
+i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
+			 struct i915_address_space *vm,
+			 uint32_t alignment,
+			 uint64_t flags,
+			 const struct i915_ggtt_view *view)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	struct i915_vma *vma;
@@ -4142,7 +4142,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	if (WARN_ON((flags & (PIN_MAPPABLE | PIN_GLOBAL)) == PIN_MAPPABLE))
 		return -EINVAL;
 
-	vma = i915_gem_obj_to_vma(obj, vm);
+	vma = i915_gem_obj_to_vma_view(obj, vm, view);
 	if (vma) {
 		if (WARN_ON(vma->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
 			return -EBUSY;
@@ -4152,7 +4152,8 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 			     "bo is already pinned with incorrect alignment:"
 			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     i915_gem_obj_offset(obj, vm), alignment,
+			     i915_gem_obj_offset_view(obj, vm, view->type),
+			     alignment,
 			     !!(flags & PIN_MAPPABLE),
 			     obj->map_and_fenceable);
 			ret = i915_vma_unbind(vma);
@@ -4165,13 +4166,14 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 
 	bound = vma ? vma->bound : 0;
 	if (vma == NULL || !drm_mm_node_allocated(&vma->node)) {
-		vma = i915_gem_object_bind_to_vm(obj, vm, alignment, flags);
+		vma = i915_gem_object_bind_to_vm(obj, vm, alignment,
+						 flags, view);
 		if (IS_ERR(vma))
 			return PTR_ERR(vma);
 	}
 
 	if (flags & PIN_GLOBAL && !(vma->bound & GLOBAL_BIND))
-		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
 
 	if ((bound ^ vma->bound) & GLOBAL_BIND) {
 		bool mappable, fenceable;
@@ -4583,12 +4585,13 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	intel_runtime_pm_put(dev_priv);
 }
 
-struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
-				     struct i915_address_space *vm)
+struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
+					  struct i915_address_space *vm,
+					  const struct i915_ggtt_view *view)
 {
 	struct i915_vma *vma;
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
-		if (vma->vm == vm)
+		if (vma->vm == vm && vma->ggtt_view.type == view->type)
 			return vma;
 
 	return NULL;
@@ -5272,8 +5275,9 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 }
 
 /* All the new VM stuff */
-unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
-				  struct i915_address_space *vm)
+unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
+				       struct i915_address_space *vm,
+				       enum i915_ggtt_view_type view)
 {
 	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
 	struct i915_vma *vma;
@@ -5281,7 +5285,7 @@ unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
 	WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base);
 
 	list_for_each_entry(vma, &o->vma_list, vma_link) {
-		if (vma->vm == vm)
+		if (vma->vm == vm && vma->ggtt_view.type == view)
 			return vma->node.start;
 
 	}
@@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
 {
 	struct i915_vma *vma;
 
-	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
-	if (vma->vm != i915_obj_to_ggtt(obj))
-		return NULL;
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (vma->vm != i915_obj_to_ggtt(obj))
+			continue;
+		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
+			return vma;
+	}
 
-	return vma;
+	return NULL;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index d17ff43..4bf9f79 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -580,8 +580,8 @@ static int do_switch(struct intel_engine_cs *ring,
 
 	vma = i915_gem_obj_to_ggtt(to->legacy_hw_ctx.rcs_state);
 	if (!(vma->bound & GLOBAL_BIND))
-		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level,
-				GLOBAL_BIND);
+		i915_vma_bind(vma, to->legacy_hw_ctx.rcs_state->cache_level,
+			      GLOBAL_BIND);
 
 	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e1ed85a..e79caf9 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -358,8 +358,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !(target_vma->bound & GLOBAL_BIND)))
-		target_vma->bind_vma(target_vma, target_i915_obj->cache_level,
-				GLOBAL_BIND);
+		i915_vma_bind(target_vma, target_i915_obj->cache_level,
+			      GLOBAL_BIND);
 
 	/* Validate that the target is in a valid r/w GPU domain */
 	if (unlikely(reloc->write_domain & (reloc->write_domain - 1))) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index bee5b0a..2144738 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -30,6 +30,8 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
+const struct i915_ggtt_view i915_ggtt_view_normal;
+
 static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv);
 static void chv_setup_private_ppat(struct drm_i915_private *dev_priv);
 
@@ -76,7 +78,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 }
 
 
-static void ppgtt_bind_vma(struct i915_vma *vma,
+static void ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			   enum i915_cache_level cache_level,
 			   u32 flags);
 static void ppgtt_unbind_vma(struct i915_vma *vma);
@@ -1200,7 +1202,7 @@ void  i915_ppgtt_release(struct kref *kref)
 }
 
 static void
-ppgtt_bind_vma(struct i915_vma *vma,
+ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 	       enum i915_cache_level cache_level,
 	       u32 flags)
 {
@@ -1208,7 +1210,7 @@ ppgtt_bind_vma(struct i915_vma *vma,
 	if (vma->obj->gt_ro)
 		flags |= PTE_READ_ONLY;
 
-	vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
+	vma->vm->insert_entries(vma->vm, pages, vma->node.start,
 				cache_level, flags);
 }
 
@@ -1343,7 +1345,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 		 * without telling our object about it. So we need to fake it.
 		 */
 		vma->bound &= ~GLOBAL_BIND;
-		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
+		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
 	}
 
 
@@ -1529,7 +1531,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 }
 
 
-static void i915_ggtt_bind_vma(struct i915_vma *vma,
+static void i915_ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			       enum i915_cache_level cache_level,
 			       u32 unused)
 {
@@ -1538,7 +1540,7 @@ static void i915_ggtt_bind_vma(struct i915_vma *vma,
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
 	BUG_ON(!i915_is_ggtt(vma->vm));
-	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	intel_gtt_insert_sg_entries(pages, entry, flags);
 	vma->bound = GLOBAL_BIND;
 }
 
@@ -1562,7 +1564,7 @@ static void i915_ggtt_unbind_vma(struct i915_vma *vma)
 	intel_gtt_clear_range(first, size);
 }
 
-static void ggtt_bind_vma(struct i915_vma *vma,
+static void ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
 			  enum i915_cache_level cache_level,
 			  u32 flags)
 {
@@ -1588,7 +1590,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
 		if (!(vma->bound & GLOBAL_BIND) ||
 		    (cache_level != obj->cache_level)) {
-			vma->vm->insert_entries(vma->vm, obj->pages,
+			vma->vm->insert_entries(vma->vm, pages,
 						vma->node.start,
 						cache_level, flags);
 			vma->bound |= GLOBAL_BIND;
@@ -1600,7 +1602,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	     (cache_level != obj->cache_level))) {
 		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
 		appgtt->base.insert_entries(&appgtt->base,
-					    vma->obj->pages,
+					    pages,
 					    vma->node.start,
 					    cache_level, flags);
 		vma->bound |= LOCAL_BIND;
@@ -2165,7 +2167,8 @@ int i915_gem_gtt_init(struct drm_device *dev)
 }
 
 static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
-					      struct i915_address_space *vm)
+					      struct i915_address_space *vm,
+					      const struct i915_ggtt_view *view)
 {
 	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
 	if (vma == NULL)
@@ -2176,6 +2179,7 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&vma->exec_list);
 	vma->vm = vm;
 	vma->obj = obj;
+	vma->ggtt_view = *view;
 
 	switch (INTEL_INFO(vm->dev)->gen) {
 	case 9:
@@ -2214,14 +2218,43 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
 }
 
 struct i915_vma *
-i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
-				  struct i915_address_space *vm)
+i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
+				       struct i915_address_space *vm,
+				       const struct i915_ggtt_view *view)
 {
 	struct i915_vma *vma;
 
-	vma = i915_gem_obj_to_vma(obj, vm);
+	vma = i915_gem_obj_to_vma_view(obj, vm, view);
 	if (!vma)
-		vma = __i915_gem_vma_create(obj, vm);
+		vma = __i915_gem_vma_create(obj, vm, view);
 
 	return vma;
 }
+
+static inline
+struct sg_table *i915_ggtt_view_pages(struct i915_vma *vma)
+{
+	struct sg_table *pages = ERR_PTR(-EINVAL);
+
+	if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
+		pages = vma->obj->pages;
+	else
+		WARN(1, "Not implemented!\n");
+
+	return pages;
+}
+
+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
+		   u32 flags)
+{
+	struct sg_table *pages = i915_ggtt_view_pages(vma);
+
+	if (pages && !IS_ERR(pages)) {
+		vma->bind_vma(vma, pages, cache_level, flags);
+
+		if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
+			sg_free_table(pages);
+			kfree(pages);
+		}
+	}
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d0562d0..3534727 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -109,7 +109,18 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
 
+enum i915_ggtt_view_type {
+	I915_GGTT_VIEW_NORMAL = 0,
+};
+
+struct i915_ggtt_view {
+	enum i915_ggtt_view_type type;
+};
+
+extern const struct i915_ggtt_view i915_ggtt_view_normal;
+
 enum i915_cache_level;
+
 /**
  * A VMA represents a GEM BO that is bound into an address space. Therefore, a
  * VMA's presence cannot be guaranteed before binding, or after unbinding the
@@ -129,6 +140,15 @@ struct i915_vma {
 #define PTE_READ_ONLY	(1<<2)
 	unsigned int bound : 4;
 
+	/**
+	 * Support different GGTT views into the same object.
+	 * This means there can be multiple VMA mappings per object and per VM.
+	 * i915_ggtt_view_type is used to distinguish between those entries.
+	 * The default one of zero (I915_GGTT_VIEW_NORMAL) is default and also
+	 * assumed in GEM functions which take no ggtt view parameter.
+	 */
+	struct i915_ggtt_view ggtt_view;
+
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
 
@@ -161,7 +181,7 @@ struct i915_vma {
 	 * setting the valid PTE entries to a reserved scratch page. */
 	void (*unbind_vma)(struct i915_vma *vma);
 	/* Map an object into an address space with the given cache flags. */
-	void (*bind_vma)(struct i915_vma *vma,
+	void (*bind_vma)(struct i915_vma *vma, struct sg_table *pages,
 			 enum i915_cache_level cache_level,
 			 u32 flags);
 };
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 89a2f3d..77f1bdc 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
 			break;
 
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
-			if (vma->vm == vm && vma->pin_count > 0) {
+			if (vma->vm == vm && vma->pin_count > 0)
 				capture_bo(err++, vma);
-				break;
-			}
 	}
 
 	return err - first;
@@ -1096,10 +1094,8 @@ static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		list_for_each_entry(vma, &obj->vma_list, vma_link)
-			if (vma->vm == vm && vma->pin_count > 0) {
+			if (vma->vm == vm && vma->pin_count > 0)
 				i++;
-				break;
-			}
 	}
 	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
 
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting
  2014-11-27 14:52   ` [PATCH] " Tvrtko Ursulin
@ 2014-11-28  0:59     ` shuang.he
  2014-11-28 17:31     ` [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object Daniel Vetter
  1 sibling, 0 replies; 43+ messages in thread
From: shuang.he @ 2014-11-28  0:59 UTC (permalink / raw)
  To: shuang.he, intel-gfx, tvrtko.ursulin

Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -1              364/364              363/364
ILK                 -9              365/366              356/366
SNB                                  450/450              450/450
IVB                                  498/498              498/498
BYT                                  289/289              289/289
HSW                                  564/564              564/564
BDW                                  417/417              417/417
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*PNV  igt_gen3_mixed_blits      PASS(2, M25M7)      CRASH(1, M7)
 ILK  igt_kms_flip_bcs-flip-vs-modeset-interruptible      DMESG_WARN(1, M26)PASS(2, M26)      DMESG_WARN(1, M26)
*ILK  igt_kms_flip_busy-flip-interruptible      PASS(3, M26)      DMESG_WARN(1, M26)
*ILK  igt_kms_flip_flip-vs-panning      PASS(3, M26)      NSPT(1, M26)
*ILK  igt_kms_flip_plain-flip-fb-recreate-interruptible      PASS(3, M26)      DMESG_WARN(1, M26)
 ILK  igt_kms_flip_plain-flip-ts-check-interruptible      DMESG_WARN(1, M26)PASS(2, M26)      DMESG_WARN(1, M26)
*ILK  igt_kms_flip_rcs-flip-vs-dpms      PASS(2, M26)      DMESG_WARN(1, M26)
*ILK  igt_kms_flip_rcs-flip-vs-modeset      NSPT(1, M26)PASS(2, M26)      DMESG_WARN(1, M26)
*ILK  igt_kms_flip_vblank-vs-hang      PASS(3, M26)      DMESG_WARN(1, M26)
 ILK  igt_kms_flip_wf_vblank-vs-modeset-interruptible      DMESG_WARN(1, M26)PASS(3, M26)      DMESG_WARN(1, M26)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-27 14:52   ` [PATCH] " Tvrtko Ursulin
  2014-11-28  0:59     ` [PATCH] drm/i915: Infrastructure for supporting shuang.he
@ 2014-11-28 17:31     ` Daniel Vetter
  2014-12-01 11:32       ` Tvrtko Ursulin
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-11-28 17:31 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, Intel-gfx

On Thu, Nov 27, 2014 at 02:52:44PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
> to map objects into the same address space multiple times.
> 
> Added a GGTT view concept and linked it with the VMA to distinguish between
> multiple instances per address space.
> 
> New objects and GEM functions which do not take this new view as a parameter
> assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
> previous behaviour.
> 
> This now means that objects can have multiple VMA entries so the code which
> assumed there will only be one also had to be modified.
> 
> Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
> which is DMA mapped on first VMA instantiation and unmapped on the last one
> going away.
> 
> v2:
>     * Removed per view special casing in i915_gem_ggtt_prepare /
>       finish_object in favour of creating and destroying DMA mappings
>       on first VMA instantiation and last VMA destruction. (Daniel Vetter)
>     * Simplified i915_vma_unbind which does not need to count the GGTT views.
>       (Daniel Vetter)
>     * Also moved obj->map_and_fenceable reset under the same check.
>     * Checkpatch cleanups.
> 
> Issue: VIZ-4544
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

lgtm overall. Please find someone for detailed review for knowledge
sharing (so different geo/team).

A few comemnts and questions below still. Also could you please do a
follow-up patch which adds a DOC: section with a short exlanation of gtt
views and pulls it into the i915 docbook template? We need to start
somewhere with gem docs ...

Cheers, Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  5 +-
>  drivers/gpu/drm/i915/i915_drv.h            | 46 +++++++++++++++++--
>  drivers/gpu/drm/i915/i915_gem.c            | 73 ++++++++++++++++--------------
>  drivers/gpu/drm/i915/i915_gem_context.c    |  4 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 61 +++++++++++++++++++------
>  drivers/gpu/drm/i915/i915_gem_gtt.h        | 22 ++++++++-
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  8 +---
>  8 files changed, 159 insertions(+), 64 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 319da61..1a9569f 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -154,8 +154,9 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  			seq_puts(m, " (pp");
>  		else
>  			seq_puts(m, " (g");
> -		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
> -			   vma->node.start, vma->node.size);
> +		seq_printf(m, "gtt offset: %08lx, size: %08lx, type: %u)",
> +			   vma->node.start, vma->node.size,
> +			   vma->ggtt_view.type);
>  	}
>  	if (obj->stolen)
>  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c4f2cb6..6250a2c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2501,10 +2501,23 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
>  #define PIN_GLOBAL 0x4
>  #define PIN_OFFSET_BIAS 0x8
>  #define PIN_OFFSET_MASK (~4095)
> +int __must_check i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
> +					  struct i915_address_space *vm,
> +					  uint32_t alignment,
> +					  uint64_t flags,
> +					  const struct i915_ggtt_view *view);
> +static inline
>  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm,
>  				     uint32_t alignment,
> -				     uint64_t flags);
> +				     uint64_t flags)
> +{
> +	return i915_gem_object_pin_view(obj, vm, alignment, flags,
> +						&i915_ggtt_view_normal);
> +}
> +
> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> +		   u32 flags);
>  int __must_check i915_vma_unbind(struct i915_vma *vma);
>  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv);
> @@ -2656,18 +2669,43 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
>  
>  void i915_gem_restore_fences(struct drm_device *dev);
>  
> +unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
> +				       struct i915_address_space *vm,
> +				       enum i915_ggtt_view_type view);
> +static inline
>  unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> -				  struct i915_address_space *vm);
> +				  struct i915_address_space *vm)
> +{
> +	return i915_gem_obj_offset_view(o, vm, I915_GGTT_VIEW_NORMAL);
> +}
>  bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
>  bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
>  			struct i915_address_space *vm);
>  unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
>  				struct i915_address_space *vm);
> +struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
> +					  struct i915_address_space *vm,
> +					  const struct i915_ggtt_view *view);
> +static inline
>  struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> -				     struct i915_address_space *vm);
> +				     struct i915_address_space *vm)
> +{
> +	return i915_gem_obj_to_vma_view(obj, vm, &i915_ggtt_view_normal);
> +}
> +
> +struct i915_vma *
> +i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
> +				       struct i915_address_space *vm,
> +				       const struct i915_ggtt_view *view);
> +
> +static inline
>  struct i915_vma *
>  i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
> -				  struct i915_address_space *vm);
> +				  struct i915_address_space *vm)
> +{
> +	return i915_gem_obj_lookup_or_create_vma_view(obj, vm,
> +						&i915_ggtt_view_normal);
> +}
>  
>  struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj);
>  static inline bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 86cf428..6213c07 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>  			/* For the unbound phase, this should be a no-op! */
>  			list_for_each_entry_safe(vma, v,
>  						 &obj->vma_list, vma_link)
> -				if (i915_vma_unbind(vma))
> -					break;
> +				i915_vma_unbind(vma);

Why drop the early break if a vma_unbind fails? Looks like a superflous
hunk to me.

>  
>  			if (i915_gem_object_put_pages(obj) == 0)
>  				count += obj->base.size >> PAGE_SHIFT;
> @@ -2292,17 +2291,14 @@ void i915_vma_move_to_active(struct i915_vma *vma,
>  static void
>  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  {
> -	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> -	struct i915_address_space *vm;
>  	struct i915_vma *vma;
>  
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> -	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> -		vma = i915_gem_obj_to_vma(obj, vm);
> -		if (vma && !list_empty(&vma->mm_list))
> -			list_move_tail(&vma->mm_list, &vm->inactive_list);
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		if (!list_empty(&vma->mm_list))
> +			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
>  	}
>  
>  	intel_fb_obj_flush(obj, true);
> @@ -3043,7 +3039,8 @@ int i915_vma_unbind(struct i915_vma *vma)
>  	/* Throw away the active reference before moving to the unbound list */
>  	i915_gem_object_retire(obj);
>  
> -	if (i915_is_ggtt(vma->vm)) {
> +	if (i915_is_ggtt(vma->vm) &&
> +	    vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
>  		i915_gem_object_finish_gtt(obj);

If anyone adds subobject gtt mmap fault support we'll need to undo this
again, but makes sense for the current usecase. So imo ok either way right
now.

>  
>  		/* release the fence reg _after_ flushing */
> @@ -3057,7 +3054,8 @@ int i915_vma_unbind(struct i915_vma *vma)
>  	vma->unbind_vma(vma);
>  
>  	list_del_init(&vma->mm_list);
> -	if (i915_is_ggtt(vma->vm))
> +	if (i915_is_ggtt(vma->vm) &&
> +	    vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
>  		obj->map_and_fenceable = false;

Same here.

>  
>  	drm_mm_remove_node(&vma->node);
> @@ -3476,7 +3474,8 @@ static struct i915_vma *
>  i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  			   struct i915_address_space *vm,
>  			   unsigned alignment,
> -			   uint64_t flags)
> +			   uint64_t flags,
> +			   const struct i915_ggtt_view *view)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -3526,7 +3525,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  
>  	i915_gem_object_pin_pages(obj);
>  
> -	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
> +	vma = i915_gem_obj_lookup_or_create_vma_view(obj, vm, view);
>  	if (IS_ERR(vma))
>  		goto err_unpin;
>  
> @@ -3560,7 +3559,7 @@ search_free:
>  	list_add_tail(&vma->mm_list, &vm->inactive_list);
>  
>  	trace_i915_vma_bind(vma, flags);
> -	vma->bind_vma(vma, obj->cache_level,
> +	i915_vma_bind(vma, obj->cache_level,
>  		      flags & PIN_GLOBAL ? GLOBAL_BIND : 0);
>  
>  	return vma;
> @@ -3768,8 +3767,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  
>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
>  			if (drm_mm_node_allocated(&vma->node))
> -				vma->bind_vma(vma, cache_level,
> -						vma->bound & GLOBAL_BIND);
> +				i915_vma_bind(vma, cache_level,
> +					      vma->bound & GLOBAL_BIND);
>  	}
>  
>  	list_for_each_entry(vma, &obj->vma_list, vma_link)
> @@ -4123,10 +4122,11 @@ i915_vma_misplaced(struct i915_vma *vma, uint32_t alignment, uint64_t flags)
>  }
>  
>  int
> -i915_gem_object_pin(struct drm_i915_gem_object *obj,
> -		    struct i915_address_space *vm,
> -		    uint32_t alignment,
> -		    uint64_t flags)
> +i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
> +			 struct i915_address_space *vm,
> +			 uint32_t alignment,
> +			 uint64_t flags,
> +			 const struct i915_ggtt_view *view)
>  {
>  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  	struct i915_vma *vma;
> @@ -4142,7 +4142,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  	if (WARN_ON((flags & (PIN_MAPPABLE | PIN_GLOBAL)) == PIN_MAPPABLE))
>  		return -EINVAL;
>  
> -	vma = i915_gem_obj_to_vma(obj, vm);
> +	vma = i915_gem_obj_to_vma_view(obj, vm, view);
>  	if (vma) {
>  		if (WARN_ON(vma->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
>  			return -EBUSY;
> @@ -4152,7 +4152,8 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  			     "bo is already pinned with incorrect alignment:"
>  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
>  			     " obj->map_and_fenceable=%d\n",
> -			     i915_gem_obj_offset(obj, vm), alignment,
> +			     i915_gem_obj_offset_view(obj, vm, view->type),
> +			     alignment,
>  			     !!(flags & PIN_MAPPABLE),
>  			     obj->map_and_fenceable);
>  			ret = i915_vma_unbind(vma);
> @@ -4165,13 +4166,14 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  
>  	bound = vma ? vma->bound : 0;
>  	if (vma == NULL || !drm_mm_node_allocated(&vma->node)) {
> -		vma = i915_gem_object_bind_to_vm(obj, vm, alignment, flags);
> +		vma = i915_gem_object_bind_to_vm(obj, vm, alignment,
> +						 flags, view);
>  		if (IS_ERR(vma))
>  			return PTR_ERR(vma);
>  	}
>  
>  	if (flags & PIN_GLOBAL && !(vma->bound & GLOBAL_BIND))
> -		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> +		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
>  
>  	if ((bound ^ vma->bound) & GLOBAL_BIND) {
>  		bool mappable, fenceable;
> @@ -4583,12 +4585,13 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  	intel_runtime_pm_put(dev_priv);
>  }
>  
> -struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> -				     struct i915_address_space *vm)
> +struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
> +					  struct i915_address_space *vm,
> +					  const struct i915_ggtt_view *view)
>  {
>  	struct i915_vma *vma;
>  	list_for_each_entry(vma, &obj->vma_list, vma_link)
> -		if (vma->vm == vm)
> +		if (vma->vm == vm && vma->ggtt_view.type == view->type)
>  			return vma;
>  
>  	return NULL;
> @@ -5272,8 +5275,9 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
>  }
>  
>  /* All the new VM stuff */
> -unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> -				  struct i915_address_space *vm)
> +unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
> +				       struct i915_address_space *vm,
> +				       enum i915_ggtt_view_type view)
>  {
>  	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
>  	struct i915_vma *vma;
> @@ -5281,7 +5285,7 @@ unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
>  	WARN_ON(vm == &dev_priv->mm.aliasing_ppgtt->base);
>  
>  	list_for_each_entry(vma, &o->vma_list, vma_link) {
> -		if (vma->vm == vm)
> +		if (vma->vm == vm && vma->ggtt_view.type == view)
>  			return vma->node.start;
>  
>  	}
> @@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
>  {
>  	struct i915_vma *vma;
>  
> -	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
> -	if (vma->vm != i915_obj_to_ggtt(obj))
> -		return NULL;
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		if (vma->vm != i915_obj_to_ggtt(obj))
> +			continue;
> +		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> +			return vma;
> +	}

We fairly put the ggtt vma into the head of the list. Imo better to keep
the head slot reserved for the ggtt normal view, might be some random code
relying upon this.

>  
> -	return vma;
> +	return NULL;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index d17ff43..4bf9f79 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -580,8 +580,8 @@ static int do_switch(struct intel_engine_cs *ring,
>  
>  	vma = i915_gem_obj_to_ggtt(to->legacy_hw_ctx.rcs_state);
>  	if (!(vma->bound & GLOBAL_BIND))
> -		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level,
> -				GLOBAL_BIND);
> +		i915_vma_bind(vma, to->legacy_hw_ctx.rcs_state->cache_level,
> +			      GLOBAL_BIND);
>  
>  	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
>  		hw_flags |= MI_RESTORE_INHIBIT;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index e1ed85a..e79caf9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -358,8 +358,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  	if (unlikely(IS_GEN6(dev) &&
>  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
>  	    !(target_vma->bound & GLOBAL_BIND)))
> -		target_vma->bind_vma(target_vma, target_i915_obj->cache_level,
> -				GLOBAL_BIND);
> +		i915_vma_bind(target_vma, target_i915_obj->cache_level,
> +			      GLOBAL_BIND);
>  
>  	/* Validate that the target is in a valid r/w GPU domain */
>  	if (unlikely(reloc->write_domain & (reloc->write_domain - 1))) {
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index bee5b0a..2144738 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -30,6 +30,8 @@
>  #include "i915_trace.h"
>  #include "intel_drv.h"
>  
> +const struct i915_ggtt_view i915_ggtt_view_normal;
> +
>  static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv);
>  static void chv_setup_private_ppat(struct drm_i915_private *dev_priv);
>  
> @@ -76,7 +78,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>  }
>  
>  
> -static void ppgtt_bind_vma(struct i915_vma *vma,
> +static void ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  			   enum i915_cache_level cache_level,
>  			   u32 flags);
>  static void ppgtt_unbind_vma(struct i915_vma *vma);
> @@ -1200,7 +1202,7 @@ void  i915_ppgtt_release(struct kref *kref)
>  }
>  
>  static void
> -ppgtt_bind_vma(struct i915_vma *vma,
> +ppgtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  	       enum i915_cache_level cache_level,
>  	       u32 flags)
>  {
> @@ -1208,7 +1210,7 @@ ppgtt_bind_vma(struct i915_vma *vma,
>  	if (vma->obj->gt_ro)
>  		flags |= PTE_READ_ONLY;
>  
> -	vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
> +	vma->vm->insert_entries(vma->vm, pages, vma->node.start,
>  				cache_level, flags);
>  }
>  
> @@ -1343,7 +1345,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  		 * without telling our object about it. So we need to fake it.
>  		 */
>  		vma->bound &= ~GLOBAL_BIND;
> -		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> +		i915_vma_bind(vma, obj->cache_level, GLOBAL_BIND);
>  	}
>  
>  
> @@ -1529,7 +1531,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>  }
>  
>  
> -static void i915_ggtt_bind_vma(struct i915_vma *vma,
> +static void i915_ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  			       enum i915_cache_level cache_level,
>  			       u32 unused)
>  {
> @@ -1538,7 +1540,7 @@ static void i915_ggtt_bind_vma(struct i915_vma *vma,
>  		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
>  
>  	BUG_ON(!i915_is_ggtt(vma->vm));
> -	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> +	intel_gtt_insert_sg_entries(pages, entry, flags);
>  	vma->bound = GLOBAL_BIND;
>  }
>  
> @@ -1562,7 +1564,7 @@ static void i915_ggtt_unbind_vma(struct i915_vma *vma)
>  	intel_gtt_clear_range(first, size);
>  }
>  
> -static void ggtt_bind_vma(struct i915_vma *vma,
> +static void ggtt_bind_vma(struct i915_vma *vma, struct sg_table *pages,
>  			  enum i915_cache_level cache_level,
>  			  u32 flags)
>  {
> @@ -1588,7 +1590,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
>  	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
>  		if (!(vma->bound & GLOBAL_BIND) ||
>  		    (cache_level != obj->cache_level)) {
> -			vma->vm->insert_entries(vma->vm, obj->pages,
> +			vma->vm->insert_entries(vma->vm, pages,
>  						vma->node.start,
>  						cache_level, flags);
>  			vma->bound |= GLOBAL_BIND;
> @@ -1600,7 +1602,7 @@ static void ggtt_bind_vma(struct i915_vma *vma,
>  	     (cache_level != obj->cache_level))) {
>  		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
>  		appgtt->base.insert_entries(&appgtt->base,
> -					    vma->obj->pages,
> +					    pages,
>  					    vma->node.start,
>  					    cache_level, flags);
>  		vma->bound |= LOCAL_BIND;
> @@ -2165,7 +2167,8 @@ int i915_gem_gtt_init(struct drm_device *dev)
>  }
>  
>  static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
> -					      struct i915_address_space *vm)
> +					      struct i915_address_space *vm,
> +					      const struct i915_ggtt_view *view)
>  {
>  	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
>  	if (vma == NULL)
> @@ -2176,6 +2179,7 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  	INIT_LIST_HEAD(&vma->exec_list);
>  	vma->vm = vm;
>  	vma->obj = obj;
> +	vma->ggtt_view = *view;
>  
>  	switch (INTEL_INFO(vm->dev)->gen) {
>  	case 9:
> @@ -2214,14 +2218,43 @@ static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  }
>  
>  struct i915_vma *
> -i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
> -				  struct i915_address_space *vm)
> +i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
> +				       struct i915_address_space *vm,
> +				       const struct i915_ggtt_view *view)
>  {
>  	struct i915_vma *vma;
>  
> -	vma = i915_gem_obj_to_vma(obj, vm);
> +	vma = i915_gem_obj_to_vma_view(obj, vm, view);
>  	if (!vma)
> -		vma = __i915_gem_vma_create(obj, vm);
> +		vma = __i915_gem_vma_create(obj, vm, view);
>  
>  	return vma;
>  }
> +
> +static inline
> +struct sg_table *i915_ggtt_view_pages(struct i915_vma *vma)
> +{
> +	struct sg_table *pages = ERR_PTR(-EINVAL);
> +
> +	if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> +		pages = vma->obj->pages;
> +	else
> +		WARN(1, "Not implemented!\n");
> +
> +	return pages;
> +}
> +
> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> +		   u32 flags)
> +{
> +	struct sg_table *pages = i915_ggtt_view_pages(vma);
> +
> +	if (pages && !IS_ERR(pages)) {
> +		vma->bind_vma(vma, pages, cache_level, flags);
> +
> +		if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
> +			sg_free_table(pages);
> +			kfree(pages);
> +		}
> +	}
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index d0562d0..3534727 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -109,7 +109,18 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>  #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
>  #define GEN8_PPAT(i, x)			((uint64_t) (x) << ((i) * 8))
>  
> +enum i915_ggtt_view_type {
> +	I915_GGTT_VIEW_NORMAL = 0,
> +};
> +
> +struct i915_ggtt_view {
> +	enum i915_ggtt_view_type type;
> +};
> +
> +extern const struct i915_ggtt_view i915_ggtt_view_normal;
> +
>  enum i915_cache_level;
> +
>  /**
>   * A VMA represents a GEM BO that is bound into an address space. Therefore, a
>   * VMA's presence cannot be guaranteed before binding, or after unbinding the
> @@ -129,6 +140,15 @@ struct i915_vma {
>  #define PTE_READ_ONLY	(1<<2)
>  	unsigned int bound : 4;
>  
> +	/**
> +	 * Support different GGTT views into the same object.
> +	 * This means there can be multiple VMA mappings per object and per VM.
> +	 * i915_ggtt_view_type is used to distinguish between those entries.
> +	 * The default one of zero (I915_GGTT_VIEW_NORMAL) is default and also
> +	 * assumed in GEM functions which take no ggtt view parameter.
> +	 */
> +	struct i915_ggtt_view ggtt_view;
> +
>  	/** This object's place on the active/inactive lists */
>  	struct list_head mm_list;
>  
> @@ -161,7 +181,7 @@ struct i915_vma {
>  	 * setting the valid PTE entries to a reserved scratch page. */
>  	void (*unbind_vma)(struct i915_vma *vma);
>  	/* Map an object into an address space with the given cache flags. */
> -	void (*bind_vma)(struct i915_vma *vma,
> +	void (*bind_vma)(struct i915_vma *vma, struct sg_table *pages,
>  			 enum i915_cache_level cache_level,
>  			 u32 flags);
>  };
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 89a2f3d..77f1bdc 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
>  			break;
>  
>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
> -			if (vma->vm == vm && vma->pin_count > 0) {
> +			if (vma->vm == vm && vma->pin_count > 0)
>  				capture_bo(err++, vma);
> -				break;

Not fully sure about this one, but can't hurt I guess.

> -			}
>  	}
>  
>  	return err - first;
> @@ -1096,10 +1094,8 @@ static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
> -			if (vma->vm == vm && vma->pin_count > 0) {
> +			if (vma->vm == vm && vma->pin_count > 0)
>  				i++;
> -				break;
> -			}
>  	}
>  	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
>  
> -- 
> 2.1.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-11-28 17:31     ` [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object Daniel Vetter
@ 2014-12-01 11:32       ` Tvrtko Ursulin
  2014-12-01 14:46         ` Tvrtko Ursulin
  2014-12-01 16:07         ` Daniel Vetter
  0 siblings, 2 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-12-01 11:32 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 11/28/2014 05:31 PM, Daniel Vetter wrote:
> On Thu, Nov 27, 2014 at 02:52:44PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
>> to map objects into the same address space multiple times.
>>
>> Added a GGTT view concept and linked it with the VMA to distinguish between
>> multiple instances per address space.
>>
>> New objects and GEM functions which do not take this new view as a parameter
>> assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
>> previous behaviour.
>>
>> This now means that objects can have multiple VMA entries so the code which
>> assumed there will only be one also had to be modified.
>>
>> Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
>> which is DMA mapped on first VMA instantiation and unmapped on the last one
>> going away.
>>
>> v2:
>>      * Removed per view special casing in i915_gem_ggtt_prepare /
>>        finish_object in favour of creating and destroying DMA mappings
>>        on first VMA instantiation and last VMA destruction. (Daniel Vetter)
>>      * Simplified i915_vma_unbind which does not need to count the GGTT views.
>>        (Daniel Vetter)
>>      * Also moved obj->map_and_fenceable reset under the same check.
>>      * Checkpatch cleanups.
>>
>> Issue: VIZ-4544
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> lgtm overall. Please find someone for detailed review for knowledge
> sharing (so different geo/team).

With the current state of who is doing what and where it made most sense 
for Michel to do this.

> A few comemnts and questions below still. Also could you please do a
> follow-up patch which adds a DOC: section with a short exlanation of gtt
> views and pulls it into the i915 docbook template? We need to start
> somewhere with gem docs ...

Sure, I'll find some previous patches from this area to see how roughly 
it should look like.

> Cheers, Daniel
>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c        |  5 +-
>>   drivers/gpu/drm/i915/i915_drv.h            | 46 +++++++++++++++++--
>>   drivers/gpu/drm/i915/i915_gem.c            | 73 ++++++++++++++++--------------
>>   drivers/gpu/drm/i915/i915_gem_context.c    |  4 +-
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 +-
>>   drivers/gpu/drm/i915/i915_gem_gtt.c        | 61 +++++++++++++++++++------
>>   drivers/gpu/drm/i915/i915_gem_gtt.h        | 22 ++++++++-
>>   drivers/gpu/drm/i915/i915_gpu_error.c      |  8 +---
>>   8 files changed, 159 insertions(+), 64 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 319da61..1a9569f 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -154,8 +154,9 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>   			seq_puts(m, " (pp");
>>   		else
>>   			seq_puts(m, " (g");
>> -		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
>> -			   vma->node.start, vma->node.size);
>> +		seq_printf(m, "gtt offset: %08lx, size: %08lx, type: %u)",
>> +			   vma->node.start, vma->node.size,
>> +			   vma->ggtt_view.type);
>>   	}
>>   	if (obj->stolen)
>>   		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index c4f2cb6..6250a2c 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2501,10 +2501,23 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
>>   #define PIN_GLOBAL 0x4
>>   #define PIN_OFFSET_BIAS 0x8
>>   #define PIN_OFFSET_MASK (~4095)
>> +int __must_check i915_gem_object_pin_view(struct drm_i915_gem_object *obj,
>> +					  struct i915_address_space *vm,
>> +					  uint32_t alignment,
>> +					  uint64_t flags,
>> +					  const struct i915_ggtt_view *view);
>> +static inline
>>   int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
>>   				     struct i915_address_space *vm,
>>   				     uint32_t alignment,
>> -				     uint64_t flags);
>> +				     uint64_t flags)
>> +{
>> +	return i915_gem_object_pin_view(obj, vm, alignment, flags,
>> +						&i915_ggtt_view_normal);
>> +}
>> +
>> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>> +		   u32 flags);
>>   int __must_check i915_vma_unbind(struct i915_vma *vma);
>>   int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>>   void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv);
>> @@ -2656,18 +2669,43 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
>>
>>   void i915_gem_restore_fences(struct drm_device *dev);
>>
>> +unsigned long i915_gem_obj_offset_view(struct drm_i915_gem_object *o,
>> +				       struct i915_address_space *vm,
>> +				       enum i915_ggtt_view_type view);
>> +static inline
>>   unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
>> -				  struct i915_address_space *vm);
>> +				  struct i915_address_space *vm)
>> +{
>> +	return i915_gem_obj_offset_view(o, vm, I915_GGTT_VIEW_NORMAL);
>> +}
>>   bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
>>   bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
>>   			struct i915_address_space *vm);
>>   unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
>>   				struct i915_address_space *vm);
>> +struct i915_vma *i915_gem_obj_to_vma_view(struct drm_i915_gem_object *obj,
>> +					  struct i915_address_space *vm,
>> +					  const struct i915_ggtt_view *view);
>> +static inline
>>   struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
>> -				     struct i915_address_space *vm);
>> +				     struct i915_address_space *vm)
>> +{
>> +	return i915_gem_obj_to_vma_view(obj, vm, &i915_ggtt_view_normal);
>> +}
>> +
>> +struct i915_vma *
>> +i915_gem_obj_lookup_or_create_vma_view(struct drm_i915_gem_object *obj,
>> +				       struct i915_address_space *vm,
>> +				       const struct i915_ggtt_view *view);
>> +
>> +static inline
>>   struct i915_vma *
>>   i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
>> -				  struct i915_address_space *vm);
>> +				  struct i915_address_space *vm)
>> +{
>> +	return i915_gem_obj_lookup_or_create_vma_view(obj, vm,
>> +						&i915_ggtt_view_normal);
>> +}
>>
>>   struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj);
>>   static inline bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) {
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 86cf428..6213c07 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>>   			/* For the unbound phase, this should be a no-op! */
>>   			list_for_each_entry_safe(vma, v,
>>   						 &obj->vma_list, vma_link)
>> -				if (i915_vma_unbind(vma))
>> -					break;
>> +				i915_vma_unbind(vma);
>
> Why drop the early break if a vma_unbind fails? Looks like a superflous
> hunk to me.

I wasn't sure about this. (Does it makes sense to try and unbind other 
VMAs if one couldn't be unbound?)

In fact, looking at it now, I am not sure about the unbind flow 
(i915_vma_unbind). Won't i915_gem_object_retire move all VMAs to 
inactive list on first VMA unbind? Retire only on last VMA going away?

>> @@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
>>   {
>>   	struct i915_vma *vma;
>>
>> -	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
>> -	if (vma->vm != i915_obj_to_ggtt(obj))
>> -		return NULL;
>> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>> +		if (vma->vm != i915_obj_to_ggtt(obj))
>> +			continue;
>> +		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
>> +			return vma;
>> +	}
>
> We fairly put the ggtt vma into the head of the list. Imo better to keep
> the head slot reserved for the ggtt normal view, might be some random code
> relying upon this.

Ok.

>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 89a2f3d..77f1bdc 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
>>   			break;
>>
>>   		list_for_each_entry(vma, &obj->vma_list, vma_link)
>> -			if (vma->vm == vm && vma->pin_count > 0) {
>> +			if (vma->vm == vm && vma->pin_count > 0)
>>   				capture_bo(err++, vma);
>> -				break;
>
> Not fully sure about this one, but can't hurt I guess.

Not sure if it is useful at the moment or at all?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 11:32       ` Tvrtko Ursulin
@ 2014-12-01 14:46         ` Tvrtko Ursulin
  2014-12-01 16:01           ` Daniel Vetter
  2014-12-01 16:07         ` Daniel Vetter
  1 sibling, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-12-01 14:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 12/01/2014 11:32 AM, Tvrtko Ursulin wrote:
>>> @@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct
>>> drm_i915_gem_object *obj)
>>>   {
>>>       struct i915_vma *vma;
>>>
>>> -    vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
>>> -    if (vma->vm != i915_obj_to_ggtt(obj))
>>> -        return NULL;
>>> +    list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>> +        if (vma->vm != i915_obj_to_ggtt(obj))
>>> +            continue;
>>> +        if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
>>> +            return vma;
>>> +    }
>>
>> We fairly put the ggtt vma into the head of the list. Imo better to keep
>> the head slot reserved for the ggtt normal view, might be some random
>> code
>> relying upon this.
>
> Ok.

Although on a second thought - I am not sure this makes sense since 
alternative views can exist without the normal one. Thoughts?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 14:46         ` Tvrtko Ursulin
@ 2014-12-01 16:01           ` Daniel Vetter
  2014-12-01 16:39             ` Tvrtko Ursulin
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-12-01 16:01 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Mon, Dec 01, 2014 at 02:46:29PM +0000, Tvrtko Ursulin wrote:
> 
> On 12/01/2014 11:32 AM, Tvrtko Ursulin wrote:
> >>>@@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct
> >>>drm_i915_gem_object *obj)
> >>>  {
> >>>      struct i915_vma *vma;
> >>>
> >>>-    vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
> >>>-    if (vma->vm != i915_obj_to_ggtt(obj))
> >>>-        return NULL;
> >>>+    list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >>>+        if (vma->vm != i915_obj_to_ggtt(obj))
> >>>+            continue;
> >>>+        if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> >>>+            return vma;
> >>>+    }
> >>
> >>We fairly put the ggtt vma into the head of the list. Imo better to keep
> >>the head slot reserved for the ggtt normal view, might be some random
> >>code
> >>relying upon this.
> >
> >Ok.
> 
> Although on a second thought - I am not sure this makes sense since
> alternative views can exist without the normal one. Thoughts?

Yeah, hence we need to put the normal ggtt view at the front and
everything else at the back. I'm just somewhat afraid of something
expecting the normal ggtt view to be the first one and which then
accidentally breaks.

But if you think this is too much fuzz then please split this change out
into a separate patch (i.e. the change to the lookup loop + no longer
inserting ggtt vmas a the front). That way when any regression bisects to
this patch it's clear what's going on.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 11:32       ` Tvrtko Ursulin
  2014-12-01 14:46         ` Tvrtko Ursulin
@ 2014-12-01 16:07         ` Daniel Vetter
  2014-12-01 16:34           ` Tvrtko Ursulin
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-12-01 16:07 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Mon, Dec 01, 2014 at 11:32:42AM +0000, Tvrtko Ursulin wrote:
> On 11/28/2014 05:31 PM, Daniel Vetter wrote:
> >On Thu, Nov 27, 2014 at 02:52:44PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index 86cf428..6213c07 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> >>  			/* For the unbound phase, this should be a no-op! */
> >>  			list_for_each_entry_safe(vma, v,
> >>  						 &obj->vma_list, vma_link)
> >>-				if (i915_vma_unbind(vma))
> >>-					break;
> >>+				i915_vma_unbind(vma);
> >
> >Why drop the early break if a vma_unbind fails? Looks like a superflous
> >hunk to me.
> 
> I wasn't sure about this. (Does it makes sense to try and unbind other VMAs
> if one couldn't be unbound?)
> 
> In fact, looking at it now, I am not sure about the unbind flow
> (i915_vma_unbind). Won't i915_gem_object_retire move all VMAs to inactive
> list on first VMA unbind? Retire only on last VMA going away?

Yeah only the first vma_unbind might fail with the current code. The
problem though is that you ignore all failures.

Aside: In general this is also about reducing the size of the diff to only
have essential changes. I'm happy when people throw in more cleanup, but
it should be done as follow-up/prep patches. This is because often the
only evidence for fixing a bug we have is "it bisects to this commit". So
making commits in complex code like gem minimal is the name of the game.

> >>@@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
> >>  {
> >>  	struct i915_vma *vma;
> >>
> >>-	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
> >>-	if (vma->vm != i915_obj_to_ggtt(obj))
> >>-		return NULL;
> >>+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >>+		if (vma->vm != i915_obj_to_ggtt(obj))
> >>+			continue;
> >>+		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> >>+			return vma;
> >>+	}
> >
> >We fairly put the ggtt vma into the head of the list. Imo better to keep
> >the head slot reserved for the ggtt normal view, might be some random code
> >relying upon this.
> 
> Ok.
> 
> >>diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>index 89a2f3d..77f1bdc 100644
> >>--- a/drivers/gpu/drm/i915/i915_gpu_error.c
> >>+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>@@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
> >>  			break;
> >>
> >>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
> >>-			if (vma->vm == vm && vma->pin_count > 0) {
> >>+			if (vma->vm == vm && vma->pin_count > 0)
> >>  				capture_bo(err++, vma);
> >>-				break;
> >
> >Not fully sure about this one, but can't hurt I guess.
> 
> Not sure if it is useful at the moment or at all?

Probably not useful right now. Otoh if we ever wire up the display fault
registers on modern platforms this migh become useful to cross-check that
the current display plane register settings match up with the
corresponding buffer. Won't hurt either though.

If you feel like make it a separate patch perhaps.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 16:07         ` Daniel Vetter
@ 2014-12-01 16:34           ` Tvrtko Ursulin
  2014-12-01 17:19             ` Daniel Vetter
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-12-01 16:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 12/01/2014 04:07 PM, Daniel Vetter wrote:
> On Mon, Dec 01, 2014 at 11:32:42AM +0000, Tvrtko Ursulin wrote:
>> On 11/28/2014 05:31 PM, Daniel Vetter wrote:
>>> On Thu, Nov 27, 2014 at 02:52:44PM +0000, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>> index 86cf428..6213c07 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>> @@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>>>>   			/* For the unbound phase, this should be a no-op! */
>>>>   			list_for_each_entry_safe(vma, v,
>>>>   						 &obj->vma_list, vma_link)
>>>> -				if (i915_vma_unbind(vma))
>>>> -					break;
>>>> +				i915_vma_unbind(vma);
>>>
>>> Why drop the early break if a vma_unbind fails? Looks like a superflous
>>> hunk to me.
>>
>> I wasn't sure about this. (Does it makes sense to try and unbind other VMAs
>> if one couldn't be unbound?)
>>
>> In fact, looking at it now, I am not sure about the unbind flow
>> (i915_vma_unbind). Won't i915_gem_object_retire move all VMAs to inactive
>> list on first VMA unbind? Retire only on last VMA going away?
>
> Yeah only the first vma_unbind might fail with the current code. The
> problem though is that you ignore all failures.

I am not sure what you mean. Why only the first unbind can fail?

The part I was unsure about was this break removal in the shrinker. 
Whether or not it makes sense to go through all VMAs regardless if one 
failed to unbind? Is there any space to be gained by doing that?

Alternatively, I also looked at it as: If it doesn't make sense to go 
through all of then, then what to do if the first unbind succeeds and 
some other fails?  End results sounds the same as trying to unbind as 
much as possible. So I opted for doing that.

My second concern is that object retire on 1st VMA unbind. Should that 
only be done when the last VMA is going away?

As it stands (in my v2 patch) it can move all VMAs onto the inactive 
list when first one is unbound which looks wrong.

> Aside: In general this is also about reducing the size of the diff to only
> have essential changes. I'm happy when people throw in more cleanup, but
> it should be done as follow-up/prep patches. This is because often the
> only evidence for fixing a bug we have is "it bisects to this commit". So
> making commits in complex code like gem minimal is the name of the game.

Sure, as soon as one understands what is essential and what is not. :)

>>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> index 89a2f3d..77f1bdc 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> @@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
>>>>   			break;
>>>>
>>>>   		list_for_each_entry(vma, &obj->vma_list, vma_link)
>>>> -			if (vma->vm == vm && vma->pin_count > 0) {
>>>> +			if (vma->vm == vm && vma->pin_count > 0)
>>>>   				capture_bo(err++, vma);
>>>> -				break;
>>>
>>> Not fully sure about this one, but can't hurt I guess.
>>
>> Not sure if it is useful at the moment or at all?
>
> Probably not useful right now. Otoh if we ever wire up the display fault
> registers on modern platforms this migh become useful to cross-check that
> the current display plane register settings match up with the
> corresponding buffer. Won't hurt either though.
>
> If you feel like make it a separate patch perhaps.

I don't know, it sounds like an overkill to do that for this short hunk 
so I prefer to leave it in if you don't mind.

Regards,

Tvrtko






_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 16:01           ` Daniel Vetter
@ 2014-12-01 16:39             ` Tvrtko Ursulin
  2014-12-01 17:16               ` Daniel Vetter
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-12-01 16:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 12/01/2014 04:01 PM, Daniel Vetter wrote:
> On Mon, Dec 01, 2014 at 02:46:29PM +0000, Tvrtko Ursulin wrote:
>>
>> On 12/01/2014 11:32 AM, Tvrtko Ursulin wrote:
>>>>> @@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct
>>>>> drm_i915_gem_object *obj)
>>>>>   {
>>>>>       struct i915_vma *vma;
>>>>>
>>>>> -    vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
>>>>> -    if (vma->vm != i915_obj_to_ggtt(obj))
>>>>> -        return NULL;
>>>>> +    list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>> +        if (vma->vm != i915_obj_to_ggtt(obj))
>>>>> +            continue;
>>>>> +        if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
>>>>> +            return vma;
>>>>> +    }
>>>>
>>>> We fairly put the ggtt vma into the head of the list. Imo better to keep
>>>> the head slot reserved for the ggtt normal view, might be some random
>>>> code
>>>> relying upon this.
>>>
>>> Ok.
>>
>> Although on a second thought - I am not sure this makes sense since
>> alternative views can exist without the normal one. Thoughts?
>
> Yeah, hence we need to put the normal ggtt view at the front and
> everything else at the back. I'm just somewhat afraid of something
> expecting the normal ggtt view to be the first one and which then
> accidentally breaks.

No, my point was that we can't guarantee that since the first entry will 
be an alternative view when it is the only item on the list. So in the 
light of that does it make sense to bother with this special casing at all?

> But if you think this is too much fuzz then please split this change out
> into a separate patch (i.e. the change to the lookup loop + no longer
> inserting ggtt vmas a the front). That way when any regression bisects to
> this patch it's clear what's going on.

To double check if I understand what you mean, you lost me with "fuzz":

If you agree with my comment above, then the recommendation is for 
another prep patch to do what you say here? Ie. remove the special case 
for GGTT VMA list_add ?

i915_gem_obj_ti_ggtt can remain untouched in that prep patch, in my 
view, since it will be the only GGTT VMA, no?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 16:39             ` Tvrtko Ursulin
@ 2014-12-01 17:16               ` Daniel Vetter
  2014-12-01 17:24                 ` Tvrtko Ursulin
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-12-01 17:16 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Mon, Dec 01, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
> 
> On 12/01/2014 04:01 PM, Daniel Vetter wrote:
> >On Mon, Dec 01, 2014 at 02:46:29PM +0000, Tvrtko Ursulin wrote:
> >>
> >>On 12/01/2014 11:32 AM, Tvrtko Ursulin wrote:
> >>>>>@@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct
> >>>>>drm_i915_gem_object *obj)
> >>>>>  {
> >>>>>      struct i915_vma *vma;
> >>>>>
> >>>>>-    vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
> >>>>>-    if (vma->vm != i915_obj_to_ggtt(obj))
> >>>>>-        return NULL;
> >>>>>+    list_for_each_entry(vma, &obj->vma_list, vma_link) {
> >>>>>+        if (vma->vm != i915_obj_to_ggtt(obj))
> >>>>>+            continue;
> >>>>>+        if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
> >>>>>+            return vma;
> >>>>>+    }
> >>>>
> >>>>We fairly put the ggtt vma into the head of the list. Imo better to keep
> >>>>the head slot reserved for the ggtt normal view, might be some random
> >>>>code
> >>>>relying upon this.
> >>>
> >>>Ok.
> >>
> >>Although on a second thought - I am not sure this makes sense since
> >>alternative views can exist without the normal one. Thoughts?
> >
> >Yeah, hence we need to put the normal ggtt view at the front and
> >everything else at the back. I'm just somewhat afraid of something
> >expecting the normal ggtt view to be the first one and which then
> >accidentally breaks.
> 
> No, my point was that we can't guarantee that since the first entry will be
> an alternative view when it is the only item on the list. So in the light of
> that does it make sense to bother with this special casing at all?
> 
> >But if you think this is too much fuzz then please split this change out
> >into a separate patch (i.e. the change to the lookup loop + no longer
> >inserting ggtt vmas a the front). That way when any regression bisects to
> >this patch it's clear what's going on.
> 
> To double check if I understand what you mean, you lost me with "fuzz":

Hm I guess my understanding of fuzz isn't quite in line with common usage.
I've meant "hairy work". Oh well ...

> If you agree with my comment above, then the recommendation is for another
> prep patch to do what you say here? Ie. remove the special case for GGTT VMA
> list_add ?

Yes.

> i915_gem_obj_ti_ggtt can remain untouched in that prep patch, in my view,
> since it will be the only GGTT VMA, no?

If you don't insert the ggtt view any more at the front the current
obj_to_ggtt won't work any more. So you must change that in the same patch
(otherwise you break bisectability, which is the entire point of the
split-out).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 16:34           ` Tvrtko Ursulin
@ 2014-12-01 17:19             ` Daniel Vetter
  2014-12-01 17:50               ` Tvrtko Ursulin
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Vetter @ 2014-12-01 17:19 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Mon, Dec 01, 2014 at 04:34:16PM +0000, Tvrtko Ursulin wrote:
> 
> On 12/01/2014 04:07 PM, Daniel Vetter wrote:
> >On Mon, Dec 01, 2014 at 11:32:42AM +0000, Tvrtko Ursulin wrote:
> >>On 11/28/2014 05:31 PM, Daniel Vetter wrote:
> >>>On Thu, Nov 27, 2014 at 02:52:44PM +0000, Tvrtko Ursulin wrote:
> >>>>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>>>index 86cf428..6213c07 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>>>@@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> >>>>  			/* For the unbound phase, this should be a no-op! */
> >>>>  			list_for_each_entry_safe(vma, v,
> >>>>  						 &obj->vma_list, vma_link)
> >>>>-				if (i915_vma_unbind(vma))
> >>>>-					break;
> >>>>+				i915_vma_unbind(vma);
> >>>
> >>>Why drop the early break if a vma_unbind fails? Looks like a superflous
> >>>hunk to me.
> >>
> >>I wasn't sure about this. (Does it makes sense to try and unbind other VMAs
> >>if one couldn't be unbound?)
> >>
> >>In fact, looking at it now, I am not sure about the unbind flow
> >>(i915_vma_unbind). Won't i915_gem_object_retire move all VMAs to inactive
> >>list on first VMA unbind? Retire only on last VMA going away?
> >
> >Yeah only the first vma_unbind might fail with the current code. The
> >problem though is that you ignore all failures.
> 
> I am not sure what you mean. Why only the first unbind can fail?
> 
> The part I was unsure about was this break removal in the shrinker. Whether
> or not it makes sense to go through all VMAs regardless if one failed to
> unbind? Is there any space to be gained by doing that?
> 
> Alternatively, I also looked at it as: If it doesn't make sense to go
> through all of then, then what to do if the first unbind succeeds and some
> other fails?  End results sounds the same as trying to unbind as much as
> possible. So I opted for doing that.
> 
> My second concern is that object retire on 1st VMA unbind. Should that only
> be done when the last VMA is going away?
> 
> As it stands (in my v2 patch) it can move all VMAs onto the inactive list
> when first one is unbound which looks wrong.

Well I've started this discussion by simply asking why we need this. I
think both versions are correct.

> >Aside: In general this is also about reducing the size of the diff to only
> >have essential changes. I'm happy when people throw in more cleanup, but
> >it should be done as follow-up/prep patches. This is because often the
> >only evidence for fixing a bug we have is "it bisects to this commit". So
> >making commits in complex code like gem minimal is the name of the game.
> 
> Sure, as soon as one understands what is essential and what is not. :)
> 
> >>>>diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>>index 89a2f3d..77f1bdc 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>>@@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
> >>>>  			break;
> >>>>
> >>>>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
> >>>>-			if (vma->vm == vm && vma->pin_count > 0) {
> >>>>+			if (vma->vm == vm && vma->pin_count > 0)
> >>>>  				capture_bo(err++, vma);
> >>>>-				break;
> >>>
> >>>Not fully sure about this one, but can't hurt I guess.
> >>
> >>Not sure if it is useful at the moment or at all?
> >
> >Probably not useful right now. Otoh if we ever wire up the display fault
> >registers on modern platforms this migh become useful to cross-check that
> >the current display plane register settings match up with the
> >corresponding buffer. Won't hurt either though.
> >
> >If you feel like make it a separate patch perhaps.
> 
> I don't know, it sounds like an overkill to do that for this short hunk so I
> prefer to leave it in if you don't mind.

Unfortunately "let's just leave this slightly unrelated hunk in the patch
because too much work to split it out" has bitten me countless times in
gem. So if it's really just cleanup (we seem to agree that both old&new
work) but not cleanup enough to justify it's own patch then I'd like to
drop it. Not least because churn for churn's sake just makes everyone's
live more painful (especially backporters).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 17:16               ` Daniel Vetter
@ 2014-12-01 17:24                 ` Tvrtko Ursulin
  0 siblings, 0 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-12-01 17:24 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 12/01/2014 05:16 PM, Daniel Vetter wrote:
> On Mon, Dec 01, 2014 at 04:39:36PM +0000, Tvrtko Ursulin wrote:
>>
>> On 12/01/2014 04:01 PM, Daniel Vetter wrote:
>>> On Mon, Dec 01, 2014 at 02:46:29PM +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 12/01/2014 11:32 AM, Tvrtko Ursulin wrote:
>>>>>>> @@ -5430,9 +5434,12 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct
>>>>>>> drm_i915_gem_object *obj)
>>>>>>>   {
>>>>>>>       struct i915_vma *vma;
>>>>>>>
>>>>>>> -    vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
>>>>>>> -    if (vma->vm != i915_obj_to_ggtt(obj))
>>>>>>> -        return NULL;
>>>>>>> +    list_for_each_entry(vma, &obj->vma_list, vma_link) {
>>>>>>> +        if (vma->vm != i915_obj_to_ggtt(obj))
>>>>>>> +            continue;
>>>>>>> +        if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
>>>>>>> +            return vma;
>>>>>>> +    }
>>>>>>
>>>>>> We fairly put the ggtt vma into the head of the list. Imo better to keep
>>>>>> the head slot reserved for the ggtt normal view, might be some random
>>>>>> code
>>>>>> relying upon this.
>>>>>
>>>>> Ok.
>>>>
>>>> Although on a second thought - I am not sure this makes sense since
>>>> alternative views can exist without the normal one. Thoughts?
>>>
>>> Yeah, hence we need to put the normal ggtt view at the front and
>>> everything else at the back. I'm just somewhat afraid of something
>>> expecting the normal ggtt view to be the first one and which then
>>> accidentally breaks.
>>
>> No, my point was that we can't guarantee that since the first entry will be
>> an alternative view when it is the only item on the list. So in the light of
>> that does it make sense to bother with this special casing at all?
>>
>>> But if you think this is too much fuzz then please split this change out
>>> into a separate patch (i.e. the change to the lookup loop + no longer
>>> inserting ggtt vmas a the front). That way when any regression bisects to
>>> this patch it's clear what's going on.
>>
>> To double check if I understand what you mean, you lost me with "fuzz":
>
> Hm I guess my understanding of fuzz isn't quite in line with common usage.
> I've meant "hairy work". Oh well ...

Since patch(1) sorts out simple mismatches with "fuzz" to me it is a 
synonym for trivial. :)

>> If you agree with my comment above, then the recommendation is for another
>> prep patch to do what you say here? Ie. remove the special case for GGTT VMA
>> list_add ?
>
> Yes.
>
>> i915_gem_obj_ti_ggtt can remain untouched in that prep patch, in my view,
>> since it will be the only GGTT VMA, no?
>
> If you don't insert the ggtt view any more at the front the current
> obj_to_ggtt won't work any more. So you must change that in the same patch
> (otherwise you break bisectability, which is the entire point of the
> split-out).

Ah yes, I overlooked how the current obj_to_ggtt looks. Or to better say 
I only looked at my version and assumed I only added the view type check.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 17:19             ` Daniel Vetter
@ 2014-12-01 17:50               ` Tvrtko Ursulin
  2014-12-02  9:12                 ` Daniel Vetter
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2014-12-01 17:50 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 12/01/2014 05:19 PM, Daniel Vetter wrote:
> On Mon, Dec 01, 2014 at 04:34:16PM +0000, Tvrtko Ursulin wrote:
>>
>> On 12/01/2014 04:07 PM, Daniel Vetter wrote:
>>> On Mon, Dec 01, 2014 at 11:32:42AM +0000, Tvrtko Ursulin wrote:
>>>> On 11/28/2014 05:31 PM, Daniel Vetter wrote:
>>>>> On Thu, Nov 27, 2014 at 02:52:44PM +0000, Tvrtko Ursulin wrote:
>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>>>> index 86cf428..6213c07 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>>>> @@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>>>>>>   			/* For the unbound phase, this should be a no-op! */
>>>>>>   			list_for_each_entry_safe(vma, v,
>>>>>>   						 &obj->vma_list, vma_link)
>>>>>> -				if (i915_vma_unbind(vma))
>>>>>> -					break;
>>>>>> +				i915_vma_unbind(vma);
>>>>>
>>>>> Why drop the early break if a vma_unbind fails? Looks like a superflous
>>>>> hunk to me.
>>>>
>>>> I wasn't sure about this. (Does it makes sense to try and unbind other VMAs
>>>> if one couldn't be unbound?)
>>>>
>>>> In fact, looking at it now, I am not sure about the unbind flow
>>>> (i915_vma_unbind). Won't i915_gem_object_retire move all VMAs to inactive
>>>> list on first VMA unbind? Retire only on last VMA going away?
>>>
>>> Yeah only the first vma_unbind might fail with the current code. The
>>> problem though is that you ignore all failures.
>>
>> I am not sure what you mean. Why only the first unbind can fail?
>>
>> The part I was unsure about was this break removal in the shrinker. Whether
>> or not it makes sense to go through all VMAs regardless if one failed to
>> unbind? Is there any space to be gained by doing that?
>>
>> Alternatively, I also looked at it as: If it doesn't make sense to go
>> through all of then, then what to do if the first unbind succeeds and some
>> other fails?  End results sounds the same as trying to unbind as much as
>> possible. So I opted for doing that.
>>
>> My second concern is that object retire on 1st VMA unbind. Should that only
>> be done when the last VMA is going away?
>>
>> As it stands (in my v2 patch) it can move all VMAs onto the inactive list
>> when first one is unbound which looks wrong.
>
> Well I've started this discussion by simply asking why we need this. I
> think both versions are correct.

OK I'll try to reverse engineer more before v3 to try and establish the 
answers to the retire question.

>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>>> index 89a2f3d..77f1bdc 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>>> @@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
>>>>>>   			break;
>>>>>>
>>>>>>   		list_for_each_entry(vma, &obj->vma_list, vma_link)
>>>>>> -			if (vma->vm == vm && vma->pin_count > 0) {
>>>>>> +			if (vma->vm == vm && vma->pin_count > 0)
>>>>>>   				capture_bo(err++, vma);
>>>>>> -				break;
>>>>>
>>>>> Not fully sure about this one, but can't hurt I guess.
>>>>
>>>> Not sure if it is useful at the moment or at all?
>>>
>>> Probably not useful right now. Otoh if we ever wire up the display fault
>>> registers on modern platforms this migh become useful to cross-check that
>>> the current display plane register settings match up with the
>>> corresponding buffer. Won't hurt either though.
>>>
>>> If you feel like make it a separate patch perhaps.
>>
>> I don't know, it sounds like an overkill to do that for this short hunk so I
>> prefer to leave it in if you don't mind.
>
> Unfortunately "let's just leave this slightly unrelated hunk in the patch
> because too much work to split it out" has bitten me countless times in
> gem. So if it's really just cleanup (we seem to agree that both old&new
> work) but not cleanup enough to justify it's own patch then I'd like to
> drop it. Not least because churn for churn's sake just makes everyone's
> live more painful (especially backporters).

Overkill wasn't the correct word choice - I did not mean it is too much 
work to split it out if you imply laziness. I really saw it as more than 
just cleanup.  And when you said "if you feel like it" I didn't bother 
explaining in detail why I think it makes sense for it to stay together.

It will "work" with or without, correct. Just won't capture more than 
one VMA in the error state, after the patch adding support for multiple 
VMAs. So it kind of does and doesn't work, depends how you look at it.

I did not think that to be a valuable split for the sake of a trivial 
hunk like the one above. But OK, I can split it out no big deal.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object
  2014-12-01 17:50               ` Tvrtko Ursulin
@ 2014-12-02  9:12                 ` Daniel Vetter
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Vetter @ 2014-12-02  9:12 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Mon, Dec 01, 2014 at 05:50:24PM +0000, Tvrtko Ursulin wrote:
> 
> On 12/01/2014 05:19 PM, Daniel Vetter wrote:
> >On Mon, Dec 01, 2014 at 04:34:16PM +0000, Tvrtko Ursulin wrote:
> >>
> >>On 12/01/2014 04:07 PM, Daniel Vetter wrote:
> >>>On Mon, Dec 01, 2014 at 11:32:42AM +0000, Tvrtko Ursulin wrote:
> >>>>On 11/28/2014 05:31 PM, Daniel Vetter wrote:
> >>>>>On Thu, Nov 27, 2014 at 02:52:44PM +0000, Tvrtko Ursulin wrote:
> >>>>>>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>>>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>>>>>index 86cf428..6213c07 100644
> >>>>>>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>>>>>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>>>>>@@ -2090,8 +2090,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> >>>>>>  			/* For the unbound phase, this should be a no-op! */
> >>>>>>  			list_for_each_entry_safe(vma, v,
> >>>>>>  						 &obj->vma_list, vma_link)
> >>>>>>-				if (i915_vma_unbind(vma))
> >>>>>>-					break;
> >>>>>>+				i915_vma_unbind(vma);
> >>>>>
> >>>>>Why drop the early break if a vma_unbind fails? Looks like a superflous
> >>>>>hunk to me.
> >>>>
> >>>>I wasn't sure about this. (Does it makes sense to try and unbind other VMAs
> >>>>if one couldn't be unbound?)
> >>>>
> >>>>In fact, looking at it now, I am not sure about the unbind flow
> >>>>(i915_vma_unbind). Won't i915_gem_object_retire move all VMAs to inactive
> >>>>list on first VMA unbind? Retire only on last VMA going away?
> >>>
> >>>Yeah only the first vma_unbind might fail with the current code. The
> >>>problem though is that you ignore all failures.
> >>
> >>I am not sure what you mean. Why only the first unbind can fail?
> >>
> >>The part I was unsure about was this break removal in the shrinker. Whether
> >>or not it makes sense to go through all VMAs regardless if one failed to
> >>unbind? Is there any space to be gained by doing that?
> >>
> >>Alternatively, I also looked at it as: If it doesn't make sense to go
> >>through all of then, then what to do if the first unbind succeeds and some
> >>other fails?  End results sounds the same as trying to unbind as much as
> >>possible. So I opted for doing that.
> >>
> >>My second concern is that object retire on 1st VMA unbind. Should that only
> >>be done when the last VMA is going away?
> >>
> >>As it stands (in my v2 patch) it can move all VMAs onto the inactive list
> >>when first one is unbound which looks wrong.
> >
> >Well I've started this discussion by simply asking why we need this. I
> >think both versions are correct.
> 
> OK I'll try to reverse engineer more before v3 to try and establish the
> answers to the retire question.

Just to clarify: This comment was in the same context as the longer
explanation below. So three options imo are:
- We really need this (and then it probably needs a comment in the code or
  explanation in the commit message).
- If it's a cleanup, split it out if it's worth it.
- Or drop the hunk completely.

> >>>>>>diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>>>>index 89a2f3d..77f1bdc 100644
> >>>>>>--- a/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>>>>+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>>>>@@ -717,10 +717,8 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
> >>>>>>  			break;
> >>>>>>
> >>>>>>  		list_for_each_entry(vma, &obj->vma_list, vma_link)
> >>>>>>-			if (vma->vm == vm && vma->pin_count > 0) {
> >>>>>>+			if (vma->vm == vm && vma->pin_count > 0)
> >>>>>>  				capture_bo(err++, vma);
> >>>>>>-				break;
> >>>>>
> >>>>>Not fully sure about this one, but can't hurt I guess.
> >>>>
> >>>>Not sure if it is useful at the moment or at all?
> >>>
> >>>Probably not useful right now. Otoh if we ever wire up the display fault
> >>>registers on modern platforms this migh become useful to cross-check that
> >>>the current display plane register settings match up with the
> >>>corresponding buffer. Won't hurt either though.
> >>>
> >>>If you feel like make it a separate patch perhaps.
> >>
> >>I don't know, it sounds like an overkill to do that for this short hunk so I
> >>prefer to leave it in if you don't mind.
> >
> >Unfortunately "let's just leave this slightly unrelated hunk in the patch
> >because too much work to split it out" has bitten me countless times in
> >gem. So if it's really just cleanup (we seem to agree that both old&new
> >work) but not cleanup enough to justify it's own patch then I'd like to
> >drop it. Not least because churn for churn's sake just makes everyone's
> >live more painful (especially backporters).
> 
> Overkill wasn't the correct word choice - I did not mean it is too much work
> to split it out if you imply laziness. I really saw it as more than just
> cleanup.  And when you said "if you feel like it" I didn't bother explaining
> in detail why I think it makes sense for it to stay together.
> 
> It will "work" with or without, correct. Just won't capture more than one
> VMA in the error state, after the patch adding support for multiple VMAs. So
> it kind of does and doesn't work, depends how you look at it.
> 
> I did not think that to be a valuable split for the sake of a trivial hunk
> like the one above. But OK, I can split it out no big deal.

Hm, the code around is a bit a mess and probably did break somewhere a
while ago, which is a bit confusing. But yeah if we have platforms with
special ggtt views and not full ppgtt enabled we'll indeed run the risk of
capturing the wrong bo.

So looks good now to me, maybe with a small comment added to the commit
message for dummies like me?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2014-12-02  9:12 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-30 16:39 [RFC 0/5] Preparation for multiple VMA and GGTT views per GEM object Tvrtko Ursulin
2014-10-30 16:39 ` [PATCH 1/5] drm/i915: Move flags describing VMA mappings into the VMA Tvrtko Ursulin
2014-10-30 16:39 ` [RFC 2/5] drm/i915: Prepare for the idea there can be multiple VMA per VM Tvrtko Ursulin
2014-10-30 16:39 ` [RFC 3/5] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
2014-10-30 16:41   ` Chris Wilson
2014-10-30 16:55     ` Tvrtko Ursulin
2014-11-03 15:58   ` Daniel Vetter
2014-11-03 16:34     ` Tvrtko Ursulin
2014-11-03 16:52       ` Daniel Vetter
2014-11-03 17:20         ` Tvrtko Ursulin
2014-11-03 17:29           ` Daniel Vetter
2014-11-04 11:15             ` Tvrtko Ursulin
2014-11-04 11:23               ` Daniel Vetter
2014-11-03 17:35           ` Daniel Vetter
2014-11-04 10:25             ` Tvrtko Ursulin
2014-10-30 16:39 ` [PATCH 4/5] drm/i915: Make scatter-gather helper available to the driver Tvrtko Ursulin
2014-11-03 16:02   ` Daniel Vetter
2014-11-03 16:18     ` Tvrtko Ursulin
2014-11-03 16:53       ` Daniel Vetter
2014-11-03 16:05   ` Daniel Vetter
2014-11-03 20:30     ` Chris Wilson
2014-11-04 11:56       ` Imre Deak
2014-10-30 16:39 ` [RFC 5/5] drm/i915: Make intel_pin_and_fence_fb_obj take plane and framebuffer Tvrtko Ursulin
2014-11-03 16:18   ` Daniel Vetter
2014-11-06 14:39 ` [RFC] drm/i915: Infrastructure for supporting different GGTT views per object Tvrtko Ursulin
2014-11-12 17:02   ` Daniel Vetter
2014-11-24 12:32     ` Tvrtko Ursulin
2014-11-24 12:36       ` Chris Wilson
2014-11-24 13:57       ` Daniel Vetter
2014-11-27 14:52   ` [PATCH] " Tvrtko Ursulin
2014-11-28  0:59     ` [PATCH] drm/i915: Infrastructure for supporting shuang.he
2014-11-28 17:31     ` [PATCH] drm/i915: Infrastructure for supporting different GGTT views per object Daniel Vetter
2014-12-01 11:32       ` Tvrtko Ursulin
2014-12-01 14:46         ` Tvrtko Ursulin
2014-12-01 16:01           ` Daniel Vetter
2014-12-01 16:39             ` Tvrtko Ursulin
2014-12-01 17:16               ` Daniel Vetter
2014-12-01 17:24                 ` Tvrtko Ursulin
2014-12-01 16:07         ` Daniel Vetter
2014-12-01 16:34           ` Tvrtko Ursulin
2014-12-01 17:19             ` Daniel Vetter
2014-12-01 17:50               ` Tvrtko Ursulin
2014-12-02  9:12                 ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).