intel-gfx.lists.freedesktop.org archive mirror
* [PATCH v7 0/7] Reorganise calls to vmap() GEM objects
@ 2016-03-01 16:33 Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 1/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Alex Dai and Chris Wilson have both recently posted patches to
rationalise the use of vmap() for mapping GEM objects into kernel
virtual space. However, they addressed different areas, with Alex's
patch being derived from the copy_batch() code, whereas Chris' patch
refactored the dma-buf and ringbuffer code.

So this patchset unifies the two, copying Chris' interfaces which
unite pin-and-vmap for convenient lifecycle management, but using
Alex's code underneath to permit partial mappings. And finally
there's a little optimisation I've added for "small" objects, e.g.
ringbuffers and contexts, which are expected to be the objects most
commonly handled by this code.

v5:
  Added another of Chris' patches, introducing drm_malloc_gfp().

  Split Chris' original patch into three, of which two are minor
  unrelated improvements, and only one actually addresses the
  vmap() reorganisation [Tvrtko Ursulin]

  Decided not to hold onto vmappings after the pin count goes to
  zero. This may reduce the benefit of Chris' scheme somewhat, but
  does avoid any increased risk of exhausting kernel vmap space on
  32-bit kernels [Tvrtko Ursulin]. Potentially, the vunmap() could
  be moved back to the put_pages() stage (thus extending the cache
  lifetime) if a suitable notifier were written, but that's not
  included here.

v6:
  Addressed a few review comments by Tvrtko & Chris; the main
  functional change is that i915_gem_object_vmap_range() now takes
  a range in bytes rather than pages, so we use sg_nents_for_len()
  to validate it against the actual scatterlist.

v7:
  Reverted to passing the range specification in pages rather than
  bytes, and allowed npages==0 as a shorthand for "up to the end of
  the object".
  Reordered patch sequence, with the most local changes first and
  the optional ones at the end.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Alex Dai (1):
  drm/i915: introduce and use i915_gem_object_vmap_range()

Chris Wilson (3):
  drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error
    handling
  drm/i915: move locking in i915_gem_unmap_dma_buf()
  drm,i915: introduce drm_malloc_gfp()

Dave Gordon (3):
  drm/i915: optimise i915_gem_object_vmap_range() for small objects
  drm/i915: refactor duplicate object vmap functions (the final rework?)
  drm: add parameter-order checking to drm memory allocators

 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |   2 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c       |  34 ++-------
 drivers/gpu/drm/i915/i915_drv.h              |  26 +++++--
 drivers/gpu/drm/i915/i915_gem.c              | 103 +++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_dmabuf.c       |  53 ++------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  12 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c          |   5 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c      |  15 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c      |  52 +++++---------
 include/drm/drm_mem_util.h                   |  44 +++++++++++-
 10 files changed, 206 insertions(+), 140 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v7 1/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
@ 2016-03-01 16:33 ` Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 2/7] drm/i915: move locking in i915_gem_unmap_dma_buf() Dave Gordon
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

Replace multiple "unpin(); return errno;" sequences with
a branch to a single common label for all the error paths.

Extracted from Chris Wilson's patch:
    drm/i915: Refactor duplicate object vmap functions
in preparation for the reimplementation of the same.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 45ce45a..8f52556 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2098,15 +2098,13 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			return ret;
 
 		ret = i915_gem_object_set_to_cpu_domain(obj, true);
-		if (ret) {
-			i915_gem_object_ggtt_unpin(obj);
-			return ret;
-		}
+		if (ret)
+			goto unpin;
 
 		ringbuf->virtual_start = vmap_obj(obj);
 		if (ringbuf->virtual_start == NULL) {
-			i915_gem_object_ggtt_unpin(obj);
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto unpin;
 		}
 	} else {
 		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
@@ -2114,10 +2112,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			return ret;
 
 		ret = i915_gem_object_set_to_gtt_domain(obj, true);
-		if (ret) {
-			i915_gem_object_ggtt_unpin(obj);
-			return ret;
-		}
+		if (ret)
+			goto unpin;
 
 		/* Access through the GTT requires the device to be awake. */
 		assert_rpm_wakelock_held(dev_priv);
@@ -2125,14 +2121,18 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		ringbuf->virtual_start = ioremap_wc(dev_priv->gtt.mappable_base +
 						    i915_gem_obj_ggtt_offset(obj), ringbuf->size);
 		if (ringbuf->virtual_start == NULL) {
-			i915_gem_object_ggtt_unpin(obj);
-			return -EINVAL;
+			ret = -ENOMEM;
+			goto unpin;
 		}
 	}
 
 	ringbuf->vma = i915_gem_obj_to_ggtt(obj);
 
 	return 0;
+
+unpin:
+	i915_gem_object_ggtt_unpin(obj);
+	return ret;
 }
 
 static void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
-- 
1.9.1


* [PATCH v7 2/7] drm/i915: move locking in i915_gem_unmap_dma_buf()
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 1/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
@ 2016-03-01 16:33 ` Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 3/7] drm,i915: introduce drm_malloc_gfp() Dave Gordon
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

The struct_mutex doesn't need to be (and therefore shouldn't be)
held around the various non-i915 operations in this function. We
only need the struct_mutex when we get to the unpin_pages() call.

Extracted from Chris Wilson's patch:
    drm/i915: Refactor duplicate object vmap functions
in preparation for the reimplementation of the same.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 1f3eef6..616f078 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -95,14 +95,12 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
 
-	mutex_lock(&obj->base.dev->struct_mutex);
-
 	dma_unmap_sg(attachment->dev, sg->sgl, sg->nents, dir);
 	sg_free_table(sg);
 	kfree(sg);
 
+	mutex_lock(&obj->base.dev->struct_mutex);
 	i915_gem_object_unpin_pages(obj);
-
 	mutex_unlock(&obj->base.dev->struct_mutex);
 }
 
-- 
1.9.1


* [PATCH v7 3/7] drm,i915: introduce drm_malloc_gfp()
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 1/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 2/7] drm/i915: move locking in i915_gem_unmap_dma_buf() Dave Gordon
@ 2016-03-01 16:33 ` Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 4/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: Chris Wilson <chris@chris-wilson.co.uk>

I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want
to try a high-order kmalloc() before using a vmalloc().

So refactor my usage into drm_malloc_gfp().
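
For reference, the overflow guard at the top of the new helper can be exercised in isolation. The sketch below is a userspace approximation: checked_array_alloc() is an illustrative name, and plain malloc() stands in for the kmalloc()/__vmalloc() pair used by the real drm_malloc_gfp().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace sketch of the overflow guard in drm_malloc_gfp():
 * reject nmemb * size when the multiplication would wrap around
 * SIZE_MAX, then allocate. malloc() stands in for the kernel's
 * kmalloc()/__vmalloc() fallback pair. */
static void *checked_array_alloc(size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size)
		return NULL;	/* nmemb * size would overflow */
	return malloc(nmemb * size);
}
```

The guard is the same shape as the one in drm_malloc_ab(), so callers converting between the two keep identical overflow behaviour.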

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  8 +++-----
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  5 +++--
 drivers/gpu/drm/i915/i915_gem_userptr.c    | 15 ++++-----------
 include/drm/drm_mem_util.h                 | 19 +++++++++++++++++++
 4 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 1328bc5..f734b3c 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1775,11 +1775,9 @@ static bool only_mappable_for_reloc(unsigned int flags)
 		return -EINVAL;
 	}
 
-	exec2_list = kmalloc(sizeof(*exec2_list)*args->buffer_count,
-			     GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
-	if (exec2_list == NULL)
-		exec2_list = drm_malloc_ab(sizeof(*exec2_list),
-					   args->buffer_count);
+	exec2_list = drm_malloc_gfp(sizeof(*exec2_list),
+				    args->buffer_count,
+				    GFP_TEMPORARY);
 	if (exec2_list == NULL) {
 		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
 			  args->buffer_count);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 49e4f26..6af2462 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3416,8 +3416,9 @@ struct i915_vma *
 	int ret = -ENOMEM;
 
 	/* Allocate a temporary list of source pages for random access. */
-	page_addr_list = drm_malloc_ab(obj->base.size / PAGE_SIZE,
-				       sizeof(dma_addr_t));
+	page_addr_list = drm_malloc_gfp(obj->base.size / PAGE_SIZE,
+					sizeof(dma_addr_t),
+					GFP_TEMPORARY);
 	if (!page_addr_list)
 		return ERR_PTR(ret);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 4b09c84..4c98c8c 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -494,10 +494,7 @@ struct get_pages_work {
 	ret = -ENOMEM;
 	pinned = 0;
 
-	pvec = kmalloc(npages*sizeof(struct page *),
-		       GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
-	if (pvec == NULL)
-		pvec = drm_malloc_ab(npages, sizeof(struct page *));
+	pvec = drm_malloc_gfp(npages, sizeof(struct page *), GFP_TEMPORARY);
 	if (pvec != NULL) {
 		struct mm_struct *mm = obj->userptr.mm->mm;
 
@@ -634,14 +631,10 @@ struct get_pages_work {
 	pvec = NULL;
 	pinned = 0;
 	if (obj->userptr.mm->mm == current->mm) {
-		pvec = kmalloc(num_pages*sizeof(struct page *),
-			       GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
+		pvec = drm_malloc_gfp(num_pages, sizeof(struct page *), GFP_TEMPORARY);
 		if (pvec == NULL) {
-			pvec = drm_malloc_ab(num_pages, sizeof(struct page *));
-			if (pvec == NULL) {
-				__i915_gem_userptr_set_active(obj, false);
-				return -ENOMEM;
-			}
+			__i915_gem_userptr_set_active(obj, false);
+			return -ENOMEM;
 		}
 
 		pinned = __get_user_pages_fast(obj->userptr.ptr, num_pages,
diff --git a/include/drm/drm_mem_util.h b/include/drm/drm_mem_util.h
index e42495a..741ce75 100644
--- a/include/drm/drm_mem_util.h
+++ b/include/drm/drm_mem_util.h
@@ -54,6 +54,25 @@ static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
 			 GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
+static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
+{
+	if (size != 0 && nmemb > SIZE_MAX / size)
+		return NULL;
+
+	if (size * nmemb <= PAGE_SIZE)
+		return kmalloc(nmemb * size, gfp);
+
+	if (gfp & __GFP_RECLAIMABLE) {
+		void *ptr = kmalloc(nmemb * size,
+				    gfp | __GFP_NOWARN | __GFP_NORETRY);
+		if (ptr)
+			return ptr;
+	}
+
+	return __vmalloc(size * nmemb,
+			 gfp | __GFP_HIGHMEM, PAGE_KERNEL);
+}
+
 static __inline void drm_free_large(void *ptr)
 {
 	kvfree(ptr);
-- 
1.9.1


* [PATCH v7 4/7] drm/i915: introduce and use i915_gem_object_vmap_range()
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (2 preceding siblings ...)
  2016-03-01 16:33 ` [PATCH v7 3/7] drm,i915: introduce drm_malloc_gfp() Dave Gordon
@ 2016-03-01 16:33 ` Dave Gordon
  2016-03-01 17:39   ` Tvrtko Ursulin
  2016-03-01 16:33 ` [PATCH v7 5/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects Dave Gordon
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

There are several places inside the driver where a GEM object is
mapped into kernel virtual space. The mapping may be done either for
the whole object or for only a subset of it.

This patch introduces a function i915_gem_object_vmap_range() to
implement the common functionality. The code itself is extracted and
adapted from that in vmap_batch(), but also replaces vmap_obj() and
the open-coded version in i915_gem_dmabuf_vmap().

v2: use obj->pages->nents for iteration within i915_gem_object_vmap;
    break when it finishes all desired pages. The caller must pass
    the actual page count required. [Tvrtko Ursulin]

v4: renamed to i915_gem_object_vmap_range() to make its function
    clearer. [Dave Gordon]

v5: use Chris Wilson's new drm_malloc_gfp() rather than kmalloc() or
    drm_malloc_ab(). [Dave Gordon]

v6: changed range checking to not use pages->nents. [Tvrtko Ursulin]
    Use sg_nents_for_len() for range check instead. [Dave Gordon]
    Pass range parameters in bytes rather than pages (both callers
    were converting from bytes to pages anyway, so this reduces the
    number of places where the conversion is done).

v7: changed range parameters back to pages, and simplified parameter
    validation. [Tvrtko Ursulin] As a convenience for callers, allow
    npages==0 as a shorthand for "up to the end of the object".

With this change, we have only one vmap() in the whole driver :)
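
The byte-to-page conversion that the reworked vmap_batch() now performs before calling the new helper can be checked in isolation. This is an illustrative userspace sketch (byte_range_to_pages() is not a name from the patch) assuming 4KiB pages:

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Convert a byte range [start, start+len) into the (first, npages)
 * page range expected by i915_gem_object_vmap_range(), mirroring
 * the arithmetic in the reworked vmap_batch(). */
static void byte_range_to_pages(unsigned long start, unsigned long len,
				unsigned long *first, unsigned long *npages)
{
	*first = start >> PAGE_SHIFT;
	*npages = DIV_ROUND_UP(start + len, PAGE_SIZE) - *first;
}
```

A range straddling a page boundary maps to every page it touches, e.g. 4KiB starting at offset 0x1800 covers pages 1 and 2.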

Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_cmd_parser.c  | 34 +++----------------
 drivers/gpu/drm/i915/i915_drv.h         |  4 +++
 drivers/gpu/drm/i915/i915_gem.c         | 59 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 15 ++-------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +------------
 5 files changed, 71 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 814d894..4d48617 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -863,37 +863,13 @@ void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring)
 static u32 *vmap_batch(struct drm_i915_gem_object *obj,
 		       unsigned start, unsigned len)
 {
-	int i;
-	void *addr = NULL;
-	struct sg_page_iter sg_iter;
-	int first_page = start >> PAGE_SHIFT;
-	int last_page = (len + start + 4095) >> PAGE_SHIFT;
-	int npages = last_page - first_page;
-	struct page **pages;
-
-	pages = drm_malloc_ab(npages, sizeof(*pages));
-	if (pages == NULL) {
-		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
-		goto finish;
-	}
-
-	i = 0;
-	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) {
-		pages[i++] = sg_page_iter_page(&sg_iter);
-		if (i == npages)
-			break;
-	}
+	unsigned long first, npages;
 
-	addr = vmap(pages, i, 0, PAGE_KERNEL);
-	if (addr == NULL) {
-		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
-		goto finish;
-	}
+	/* Convert [start, len) to pages */
+	first = start >> PAGE_SHIFT;
+	npages = DIV_ROUND_UP(start + len, PAGE_SIZE) - first;
 
-finish:
-	if (pages)
-		drm_free_large(pages);
-	return (u32*)addr;
+	return i915_gem_object_vmap_range(obj, first, npages);
 }
 
 /* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a4dcb74..b3ae191 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2983,6 +2983,10 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 	obj->pages_pin_count--;
 }
 
+void *__must_check i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
+					      unsigned long first,
+					      unsigned long npages);
+
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_engine_cs *to,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3d31d3a..d7c9ccd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2400,6 +2400,65 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
 	return 0;
 }
 
+/**
+ * i915_gem_object_vmap_range - map some or all of a GEM object into kernel space
+ * @obj: the GEM object to be mapped
+ * @first: offset in pages of the start of the range to be mapped
+ * @npages: length in pages of the range to be mapped. For convenience, a
+ *          length of zero is taken to mean "the remainder of the object"
+ *
+ * Map a given range of a GEM object into kernel virtual space.  The caller must
+ * make sure the associated pages are gathered and pinned before calling this
+ * function, and is responsible for unmapping the returned address when it is no
+ * longer required.
+ *
+ * Returns the address at which the object has been mapped, or NULL on failure.
+ */
+void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
+				 unsigned long first,
+				 unsigned long npages)
+{
+	unsigned long max_pages = obj->base.size >> PAGE_SHIFT;
+	struct scatterlist *sg = obj->pages->sgl;
+	struct sg_page_iter sg_iter;
+	struct page **pages;
+	unsigned long i = 0;
+	void *addr = NULL;
+
+	/* Minimal range check */
+	if (first + npages > max_pages) {
+		DRM_DEBUG_DRIVER("Invalid page range\n");
+		return NULL;
+	}
+
+	/* npages==0 is shorthand for "the rest of the object" */
+	if (npages == 0)
+		npages = max_pages - first;
+
+	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
+	if (pages == NULL) {
+		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
+		return NULL;
+	}
+
+	for_each_sg_page(sg, &sg_iter, max_pages, first) {
+		pages[i] = sg_page_iter_page(&sg_iter);
+		if (++i == npages) {
+			addr = vmap(pages, npages, 0, PAGE_KERNEL);
+			break;
+		}
+	}
+
+	/* We should have got here via the 'break' above */
+	WARN_ON(i != npages);
+	if (addr == NULL)
+		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
+
+	drm_free_large(pages);
+
+	return addr;
+}
+
 void i915_vma_move_to_active(struct i915_vma *vma,
 			     struct drm_i915_gem_request *req)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 616f078..3a5d01a 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -108,9 +108,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	struct drm_device *dev = obj->base.dev;
-	struct sg_page_iter sg_iter;
-	struct page **pages;
-	int ret, i;
+	int ret;
 
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
@@ -129,16 +127,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 
 	ret = -ENOMEM;
 
-	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
-	if (pages == NULL)
-		goto err_unpin;
-
-	i = 0;
-	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
-		pages[i++] = sg_page_iter_page(&sg_iter);
-
-	obj->dma_buf_vmapping = vmap(pages, i, 0, PAGE_KERNEL);
-	drm_free_large(pages);
+	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0, 0);
 
 	if (!obj->dma_buf_vmapping)
 		goto err_unpin;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8f52556..58a18e1 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2064,27 +2064,6 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
 	i915_gem_object_ggtt_unpin(ringbuf->obj);
 }
 
-static u32 *vmap_obj(struct drm_i915_gem_object *obj)
-{
-	struct sg_page_iter sg_iter;
-	struct page **pages;
-	void *addr;
-	int i;
-
-	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
-	if (pages == NULL)
-		return NULL;
-
-	i = 0;
-	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
-		pages[i++] = sg_page_iter_page(&sg_iter);
-
-	addr = vmap(pages, i, 0, PAGE_KERNEL);
-	drm_free_large(pages);
-
-	return addr;
-}
-
 int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 				     struct intel_ringbuffer *ringbuf)
 {
@@ -2101,7 +2080,7 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		if (ret)
 			goto unpin;
 
-		ringbuf->virtual_start = vmap_obj(obj);
+		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0, 0);
 		if (ringbuf->virtual_start == NULL) {
 			ret = -ENOMEM;
 			goto unpin;
-- 
1.9.1


* [PATCH v7 5/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (3 preceding siblings ...)
  2016-03-01 16:33 ` [PATCH v7 4/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
@ 2016-03-01 16:33 ` Dave Gordon
  2016-03-01 16:33 ` [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?) Dave Gordon
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx

We're using this function for ringbuffers and other "small" objects, so
it's worth avoiding an extra malloc()/free() cycle if the page array is
small enough to put on the stack. Here we've chosen an arbitrary cutoff
of 32 (4k) pages, which is big enough for a ringbuffer (4 pages) or a
context image (currently up to 22 pages).
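
The stack-or-heap pattern being added here can be sketched in userspace; fill_pages() is an illustrative name, with malloc()/free() standing in for drm_malloc_gfp()/drm_free_large(), and the vmap() step elided:

```c
#include <assert.h>
#include <stdlib.h>

#define STACK_PAGES 32	/* same arbitrary cutoff as the patch */

/* Use an on-stack array for small page counts, falling back to a
 * heap allocation only for larger objects. Returns 1 if a heap
 * allocation was needed, 0 otherwise, -1 on allocation failure. */
static int fill_pages(unsigned long npages)
{
	void *stack_pages[STACK_PAGES];
	void **pages = stack_pages;
	int heap_allocs = 0;

	if (npages > STACK_PAGES) {
		/* Too big for stack -- allocate temporary array instead */
		pages = malloc(npages * sizeof(*pages));
		if (!pages)
			return -1;
		heap_allocs = 1;
	}
	/* ... populate pages[0..npages-1] and vmap() them here ... */
	if (pages != stack_pages)
		free(pages);
	return heap_allocs;
}
```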

v5:
    change name of local array [Chris Wilson]

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d7c9ccd..5b6774b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2421,7 +2421,8 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 	unsigned long max_pages = obj->base.size >> PAGE_SHIFT;
 	struct scatterlist *sg = obj->pages->sgl;
 	struct sg_page_iter sg_iter;
-	struct page **pages;
+	struct page *stack_pages[32];
+	struct page **pages = stack_pages;
 	unsigned long i = 0;
 	void *addr = NULL;
 
@@ -2435,10 +2436,13 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 	if (npages == 0)
 		npages = max_pages - first;
 
-	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
-	if (pages == NULL) {
-		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
-		return NULL;
+	if (npages > ARRAY_SIZE(stack_pages)) {
+		/* Too big for stack -- allocate temporary array instead */
+		pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
+		if (pages == NULL) {
+			DRM_DEBUG_DRIVER("Failed to get space for pages\n");
+			return NULL;
+		}
 	}
 
 	for_each_sg_page(sg, &sg_iter, max_pages, first) {
@@ -2454,7 +2458,8 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 	if (addr == NULL)
 		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
 
-	drm_free_large(pages);
+	if (pages != stack_pages)
+		drm_free_large(pages);
 
 	return addr;
 }
-- 
1.9.1


* [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?)
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (4 preceding siblings ...)
  2016-03-01 16:33 ` [PATCH v7 5/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects Dave Gordon
@ 2016-03-01 16:33 ` Dave Gordon
  2016-03-02 12:08   ` Chris Wilson
  2016-03-01 16:33 ` [PATCH v7 7/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
  2016-03-02  6:54 ` ✗ Fi.CI.BAT: warning for Reorganise calls to vmap() GEM objects (rev5) Patchwork
  7 siblings, 1 reply; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx

This is essentially Chris Wilson's patch of a similar name, reworked on
top of Alex Dai's recent patch:
| drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space

Chris' original commentary said:
| We now have two implementations for vmapping a whole object, one for
| dma-buf and one for the ringbuffer. If we couple the vmapping into
| the obj->pages lifetime, then we can reuse an obj->vmapping for both
| and at the same time couple it into the shrinker.
|
| v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
| v3: Call unpin_vmap from the right dmabuf unmapper

v4: reimplements the same functionality, but now as wrappers round the
    recently-introduced i915_gem_object_vmap_range() from Alex's patch
    mentioned above.

v5: separated from two minor but unrelated changes [Tvrtko Ursulin];
    this is the third and most substantial portion.

    Decided not to hold onto vmappings after the pin count goes to zero.
    This may reduce the benefit of Chris' scheme a bit, but does avoid
    any increased risk of exhausting kernel vmap space on 32-bit kernels
    [Tvrtko Ursulin]. Potentially, the vunmap() could be deferred until
    the put_pages() stage if a suitable notifier were written, but we're
    not doing that here. Nonetheless, the simplification of both dmabuf
    and ringbuffer code makes it worthwhile in its own right.

v6: change BUG_ON() to WARN_ON(). [Tvrtko Ursulin]
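
The pin-counted mapping lifecycle described above can be modelled in a few lines of userspace C. The toy_* names are illustrative only, with malloc()/free() standing in for vmap()/vunmap():

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of the pin_vmap/unpin_vmap lifecycle in this patch:
 * the mapping is created on first pin, shared by later pins, and
 * dropped as soon as the pin count returns to zero (rather than
 * being cached until put_pages). */
struct toy_obj {
	int pages_pin_count;
	void *vmapping;
};

static void *toy_pin_vmap(struct toy_obj *obj)
{
	obj->pages_pin_count++;
	if (!obj->vmapping)
		obj->vmapping = malloc(16);	/* stands in for vmap() */
	return obj->vmapping;
}

static void toy_unpin_vmap(struct toy_obj *obj)
{
	if (--obj->pages_pin_count == 0 && obj->vmapping) {
		free(obj->vmapping);		/* stands in for vunmap() */
		obj->vmapping = NULL;
	}
}
```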

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 22 ++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem.c         | 39 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 36 ++++--------------------------
 drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++----
 4 files changed, 65 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b3ae191..f1ad3b3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2172,10 +2172,7 @@ struct drm_i915_gem_object {
 		struct scatterlist *sg;
 		int last;
 	} get_page;
-
-	/* prime dma-buf support */
-	void *dma_buf_vmapping;
-	int vmapping_count;
+	void *vmapping;
 
 	/** Breadcrumb of last rendering to the buffer.
 	 * There can only be one writer, but we allow for multiple readers.
@@ -2980,7 +2977,22 @@ static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pages_pin_count == 0);
-	obj->pages_pin_count--;
+	if (--obj->pages_pin_count == 0 && obj->vmapping) {
+		/*
+		 * Releasing the vmapping here may yield less benefit than
+		 * if we kept it until put_pages(), but on the other hand
+		 * avoids issues of exhausting kernel vmappable address
+		 * space on 32-bit kernels.
+		 */
+		vunmap(obj->vmapping);
+		obj->vmapping = NULL;
+	}
+}
+
+void *__must_check i915_gem_object_pin_vmap(struct drm_i915_gem_object *obj);
+static inline void i915_gem_object_unpin_vmap(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_unpin_pages(obj);
 }
 
 void *__must_check i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5b6774b..4bca643 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2235,6 +2235,12 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
 	ops->put_pages(obj);
 	obj->pages = NULL;
 
+	/* vmapping should have been dropped when pages_pin_count went to 0 */
+	if (WARN_ON(obj->vmapping)) {
+		vunmap(obj->vmapping);
+		obj->vmapping = NULL;
+	}
+
 	i915_gem_object_invalidate(obj);
 
 	return 0;
@@ -2464,6 +2470,39 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 	return addr;
 }
 
+/**
+ * i915_gem_object_pin_vmap - pin a GEM object and map it into kernel space
+ * @obj: the GEM object to be mapped
+ *
+ * Combines the functions of get_pages(), pin_pages() and vmap_range() on
+ * the whole object.  The caller should release the mapping by calling
+ * i915_gem_object_unpin_vmap() when it is no longer required.
+ *
+ * Returns the address at which the object has been mapped, or an ERR_PTR
+ * on failure.
+ */
+void *i915_gem_object_pin_vmap(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		return ERR_PTR(ret);
+
+	i915_gem_object_pin_pages(obj);
+
+	if (obj->vmapping == NULL) {
+		obj->vmapping = i915_gem_object_vmap_range(obj, 0, 0);
+
+		if (obj->vmapping == NULL) {
+			i915_gem_object_unpin_pages(obj);
+			return ERR_PTR(-ENOMEM);
+		}
+	}
+
+	return obj->vmapping;
+}
+
 void i915_vma_move_to_active(struct i915_vma *vma,
 			     struct drm_i915_gem_request *req)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 3a5d01a..adc7b5e 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -108,40 +108,17 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	struct drm_device *dev = obj->base.dev;
+	void *addr;
 	int ret;
 
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
 		return ERR_PTR(ret);
 
-	if (obj->dma_buf_vmapping) {
-		obj->vmapping_count++;
-		goto out_unlock;
-	}
-
-	ret = i915_gem_object_get_pages(obj);
-	if (ret)
-		goto err;
-
-	i915_gem_object_pin_pages(obj);
-
-	ret = -ENOMEM;
-
-	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0, 0);
-
-	if (!obj->dma_buf_vmapping)
-		goto err_unpin;
-
-	obj->vmapping_count = 1;
-out_unlock:
+	addr = i915_gem_object_pin_vmap(obj);
 	mutex_unlock(&dev->struct_mutex);
-	return obj->dma_buf_vmapping;
 
-err_unpin:
-	i915_gem_object_unpin_pages(obj);
-err:
-	mutex_unlock(&dev->struct_mutex);
-	return ERR_PTR(ret);
+	return addr;
 }
 
 static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
@@ -150,12 +127,7 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
 	struct drm_device *dev = obj->base.dev;
 
 	mutex_lock(&dev->struct_mutex);
-	if (--obj->vmapping_count == 0) {
-		vunmap(obj->dma_buf_vmapping);
-		obj->dma_buf_vmapping = NULL;
-
-		i915_gem_object_unpin_pages(obj);
-	}
+	i915_gem_object_unpin_vmap(obj);
 	mutex_unlock(&dev->struct_mutex);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 58a18e1..47f186e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2056,7 +2056,7 @@ static int init_phys_status_page(struct intel_engine_cs *ring)
 void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
 {
 	if (HAS_LLC(ringbuf->obj->base.dev) && !ringbuf->obj->stolen)
-		vunmap(ringbuf->virtual_start);
+		i915_gem_object_unpin_vmap(ringbuf->obj);
 	else
 		iounmap(ringbuf->virtual_start);
 	ringbuf->virtual_start = NULL;
@@ -2080,9 +2080,10 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		if (ret)
 			goto unpin;
 
-		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0, 0);
-		if (ringbuf->virtual_start == NULL) {
-			ret = -ENOMEM;
+		ringbuf->virtual_start = i915_gem_object_pin_vmap(obj);
+		if (IS_ERR(ringbuf->virtual_start)) {
+			ret = PTR_ERR(ringbuf->virtual_start);
+			ringbuf->virtual_start = NULL;
 			goto unpin;
 		}
 	} else {
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v7 7/7] drm: add parameter-order checking to drm memory allocators
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (5 preceding siblings ...)
  2016-03-01 16:33 ` [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?) Dave Gordon
@ 2016-03-01 16:33 ` Dave Gordon
  2016-03-02 15:00   ` Tvrtko Ursulin
  2016-03-02  6:54 ` ✗ Fi.CI.BAT: warning for Reorganise calls to vmap() GEM objects (rev5) Patchwork
  7 siblings, 1 reply; 17+ messages in thread
From: Dave Gordon @ 2016-03-01 16:33 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Gordon, dri-devel

After the recent addition of drm_malloc_gfp(), it was noticed that
some callers of these functions have swapped the parameters in the
call - it's supposed to be 'number of members' and 'sizeof(element)',
but a few callers had got the size first and the count second. This
isn't otherwise detected because they're both type 'size_t', and
the implementation at present just multiplies them anyway, so the
result is still right. But some future implementation might treat
them differently (e.g. allowing 0 elements but not zero size), so
let's add some compile-time checks and complain if the second (size)
parameter isn't a sizeof() expression, or at least a compile-time
constant.

This patch also fixes those callers where the order was wrong.

v6: removed duplicate BUILD_BUG_ON_MSG(); avoided renaming functions
    by shadowing them with #defines and then calling the function
    (non-recursively!) from inside the #define [Chris Wilson]

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  8 ++++----
 include/drm/drm_mem_util.h                   | 27 ++++++++++++++++++++++++---
 3 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 1aba01a..9ae4a71 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -340,7 +340,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 	 */
 	bos = drm_malloc_ab(args->nr_bos, sizeof(*bos));
 	relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs));
-	stream = drm_malloc_ab(1, args->stream_size);
+	stream = drm_malloc_ab(args->stream_size, sizeof(*stream));
 	cmdbuf = etnaviv_gpu_cmdbuf_new(gpu, ALIGN(args->stream_size, 8) + 8,
 					args->nr_bos);
 	if (!bos || !relocs || !stream || !cmdbuf) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f734b3c..1a136d9 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1686,8 +1686,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
 	}
 
 	/* Copy in the exec list from userland */
-	exec_list = drm_malloc_ab(sizeof(*exec_list), args->buffer_count);
-	exec2_list = drm_malloc_ab(sizeof(*exec2_list), args->buffer_count);
+	exec_list = drm_malloc_ab(args->buffer_count, sizeof(*exec_list));
+	exec2_list = drm_malloc_ab(args->buffer_count, sizeof(*exec2_list));
 	if (exec_list == NULL || exec2_list == NULL) {
 		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
 			  args->buffer_count);
@@ -1775,8 +1775,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
 		return -EINVAL;
 	}
 
-	exec2_list = drm_malloc_gfp(sizeof(*exec2_list),
-				    args->buffer_count,
+	exec2_list = drm_malloc_gfp(args->buffer_count,
+				    sizeof(*exec2_list),
 				    GFP_TEMPORARY);
 	if (exec2_list == NULL) {
 		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
diff --git a/include/drm/drm_mem_util.h b/include/drm/drm_mem_util.h
index 741ce75..5b0111c 100644
--- a/include/drm/drm_mem_util.h
+++ b/include/drm/drm_mem_util.h
@@ -29,7 +29,7 @@
 
 #include <linux/vmalloc.h>
 
-static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
+static __inline__ void *drm_calloc_large(size_t nmemb, const size_t size)
 {
 	if (size != 0 && nmemb > SIZE_MAX / size)
 		return NULL;
@@ -41,8 +41,15 @@ static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
 			 GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO, PAGE_KERNEL);
 }
 
+#define	drm_calloc_large(nmemb, size)					\
+({									\
+	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
+		"Non-constant 'size' - check argument ordering?");	\
+	(drm_calloc_large)(nmemb, size);				\
+})
+
 /* Modeled after cairo's malloc_ab, it's like calloc but without the zeroing. */
-static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
+static __inline__ void *drm_malloc_ab(size_t nmemb, const size_t size)
 {
 	if (size != 0 && nmemb > SIZE_MAX / size)
 		return NULL;
@@ -54,7 +61,14 @@ static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
 			 GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
-static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
+#define	drm_malloc_ab(nmemb, size)					\
+({									\
+	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
+		"Non-constant 'size' - check argument ordering?");	\
+	(drm_malloc_ab)(nmemb, size);					\
+})
+
+static __inline__ void *drm_malloc_gfp(size_t nmemb, const size_t size, gfp_t gfp)
 {
 	if (size != 0 && nmemb > SIZE_MAX / size)
 		return NULL;
@@ -73,6 +87,13 @@ static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
 			 gfp | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
+#define	drm_malloc_gfp(nmemb, size, gfp)				\
+({									\
+	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
+		"Non-constant 'size' - check argument ordering?");	\
+	(drm_malloc_gfp)(nmemb, size, gfp);				\
+})
+
 static __inline void drm_free_large(void *ptr)
 {
 	kvfree(ptr);
-- 
1.9.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [PATCH v7 4/7] drm/i915: introduce and use i915_gem_object_vmap_range()
  2016-03-01 16:33 ` [PATCH v7 4/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
@ 2016-03-01 17:39   ` Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-03-01 17:39 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx



On 01/03/16 16:33, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> There are several places inside driver where a GEM object is mapped
> to kernel virtual space. The mapping may be done either for the whole
> object or only a subset of it.
>
> This patch introduces a function i915_gem_object_vmap_range() to
> implement the common functionality. The code itself is extracted and
> adapted from that in vmap_batch(), but also replaces vmap_obj() and
> the open-coded version in i915_gem_dmabuf_vmap().
>
> v2: use obj->pages->nents for iteration within i915_gem_object_vmap;
>      break when it finishes all desired pages. The caller must pass
>      the actual page count required. [Tvrtko Ursulin]
>
> v4: renamed to i915_gem_object_vmap_range() to make its function
>      clearer. [Dave Gordon]
>
> v5: use Chris Wilson's new drm_malloc_gfp() rather than kmalloc() or
>      drm_malloc_ab(). [Dave Gordon]
>
> v6: changed range checking to not use pages->nents. [Tvrtko Ursulin]
>      Use sg_nents_for_len() for range check instead. [Dave Gordon]
>      Pass range parameters in bytes rather than pages (both callers
>      were converting from bytes to pages anyway, so this reduces the
>      number of places where the conversion is done).
>
> v7: changed range parameters back to pages, and simplified parameter
>      validation. [Tvrtko Ursulin] As a convenience for callers, allow
>      npages==0 as a shorthand for "up to the end of the object".

Looks OK to me,

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


> With this change, we have only one vmap() in the whole driver :)
>
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_cmd_parser.c  | 34 +++----------------
>   drivers/gpu/drm/i915/i915_drv.h         |  4 +++
>   drivers/gpu/drm/i915/i915_gem.c         | 59 +++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 15 ++-------
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +------------
>   5 files changed, 71 insertions(+), 64 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 814d894..4d48617 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -863,37 +863,13 @@ void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring)
>   static u32 *vmap_batch(struct drm_i915_gem_object *obj,
>   		       unsigned start, unsigned len)
>   {
> -	int i;
> -	void *addr = NULL;
> -	struct sg_page_iter sg_iter;
> -	int first_page = start >> PAGE_SHIFT;
> -	int last_page = (len + start + 4095) >> PAGE_SHIFT;
> -	int npages = last_page - first_page;
> -	struct page **pages;
> -
> -	pages = drm_malloc_ab(npages, sizeof(*pages));
> -	if (pages == NULL) {
> -		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
> -		goto finish;
> -	}
> -
> -	i = 0;
> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) {
> -		pages[i++] = sg_page_iter_page(&sg_iter);
> -		if (i == npages)
> -			break;
> -	}
> +	unsigned long first, npages;
>
> -	addr = vmap(pages, i, 0, PAGE_KERNEL);
> -	if (addr == NULL) {
> -		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
> -		goto finish;
> -	}
> +	/* Convert [start, len) to pages */
> +	first = start >> PAGE_SHIFT;
> +	npages = DIV_ROUND_UP(start + len, PAGE_SIZE) - first;
>
> -finish:
> -	if (pages)
> -		drm_free_large(pages);
> -	return (u32*)addr;
> +	return i915_gem_object_vmap_range(obj, first, npages);
>   }
>
>   /* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a4dcb74..b3ae191 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2983,6 +2983,10 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>   	obj->pages_pin_count--;
>   }
>
> +void *__must_check i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
> +					      unsigned long first,
> +					      unsigned long npages);
> +
>   int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>   int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   			 struct intel_engine_cs *to,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3d31d3a..d7c9ccd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2400,6 +2400,65 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>   	return 0;
>   }
>
> +/**
> + * i915_gem_object_vmap_range - map some or all of a GEM object into kernel space
> + * @obj: the GEM object to be mapped
> + * @first: offset in pages of the start of the range to be mapped
> + * @npages: length in pages of the range to be mapped. For convenience, a
> + *          length of zero is taken to mean "the remainder of the object"
> + *
> + * Map a given range of a GEM object into kernel virtual space.  The caller must
> + * make sure the associated pages are gathered and pinned before calling this
> + * function, and is responsible for unmapping the returned address when it is no
> + * longer required.
> + *
> + * Returns the address at which the object has been mapped, or NULL on failure.
> + */
> +void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
> +				 unsigned long first,
> +				 unsigned long npages)
> +{
> +	unsigned long max_pages = obj->base.size >> PAGE_SHIFT;
> +	struct scatterlist *sg = obj->pages->sgl;
> +	struct sg_page_iter sg_iter;
> +	struct page **pages;
> +	unsigned long i = 0;
> +	void *addr = NULL;
> +
> +	/* Minimal range check */
> +	if (first + npages > max_pages) {
> +		DRM_DEBUG_DRIVER("Invalid page range\n");
> +		return NULL;
> +	}
> +
> +	/* npages==0 is shorthand for "the rest of the object" */
> +	if (npages == 0)
> +		npages = max_pages - first;
> +
> +	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
> +	if (pages == NULL) {
> +		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
> +		return NULL;
> +	}
> +
> +	for_each_sg_page(sg, &sg_iter, max_pages, first) {
> +		pages[i] = sg_page_iter_page(&sg_iter);
> +		if (++i == npages) {
> +			addr = vmap(pages, npages, 0, PAGE_KERNEL);
> +			break;
> +		}
> +	}
> +
> +	/* We should have got here via the 'break' above */
> +	WARN_ON(i != npages);
> +	if (addr == NULL)
> +		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
> +
> +	drm_free_large(pages);
> +
> +	return addr;
> +}
> +
>   void i915_vma_move_to_active(struct i915_vma *vma,
>   			     struct drm_i915_gem_request *req)
>   {
> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> index 616f078..3a5d01a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> @@ -108,9 +108,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>   {
>   	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
>   	struct drm_device *dev = obj->base.dev;
> -	struct sg_page_iter sg_iter;
> -	struct page **pages;
> -	int ret, i;
> +	int ret;
>
>   	ret = i915_mutex_lock_interruptible(dev);
>   	if (ret)
> @@ -129,16 +127,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>
>   	ret = -ENOMEM;
>
> -	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
> -	if (pages == NULL)
> -		goto err_unpin;
> -
> -	i = 0;
> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
> -		pages[i++] = sg_page_iter_page(&sg_iter);
> -
> -	obj->dma_buf_vmapping = vmap(pages, i, 0, PAGE_KERNEL);
> -	drm_free_large(pages);
> +	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0, 0);
>
>   	if (!obj->dma_buf_vmapping)
>   		goto err_unpin;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8f52556..58a18e1 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2064,27 +2064,6 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>   	i915_gem_object_ggtt_unpin(ringbuf->obj);
>   }
>
> -static u32 *vmap_obj(struct drm_i915_gem_object *obj)
> -{
> -	struct sg_page_iter sg_iter;
> -	struct page **pages;
> -	void *addr;
> -	int i;
> -
> -	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
> -	if (pages == NULL)
> -		return NULL;
> -
> -	i = 0;
> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
> -		pages[i++] = sg_page_iter_page(&sg_iter);
> -
> -	addr = vmap(pages, i, 0, PAGE_KERNEL);
> -	drm_free_large(pages);
> -
> -	return addr;
> -}
> -
>   int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   				     struct intel_ringbuffer *ringbuf)
>   {
> @@ -2101,7 +2080,7 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   		if (ret)
>   			goto unpin;
>
> -		ringbuf->virtual_start = vmap_obj(obj);
> +		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0, 0);
>   		if (ringbuf->virtual_start == NULL) {
>   			ret = -ENOMEM;
>   			goto unpin;
>

* ✗ Fi.CI.BAT: warning for Reorganise calls to vmap() GEM objects (rev5)
  2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (6 preceding siblings ...)
  2016-03-01 16:33 ` [PATCH v7 7/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
@ 2016-03-02  6:54 ` Patchwork
  2016-03-02 12:38   ` Dave Gordon
  7 siblings, 1 reply; 17+ messages in thread
From: Patchwork @ 2016-03-02  6:54 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: Reorganise calls to vmap() GEM objects (rev5)
URL   : https://patchwork.freedesktop.org/series/3688/
State : warning

== Summary ==

Series 3688v5 Reorganise calls to vmap() GEM objects
http://patchwork.freedesktop.org/api/1.0/series/3688/revisions/5/mbox/

Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> DMESG-WARN (ilk-hp8440p) UNSTABLE
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (ilk-hp8440p) UNSTABLE
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (snb-x220t)
                dmesg-warn -> PASS       (hsw-gt2)
                pass       -> DMESG-WARN (hsw-brixbox)
        Subgroup basic-plain-flip:
                pass       -> DMESG-WARN (hsw-gt2)
Test kms_force_connector_basic:
        Subgroup force-edid:
                pass       -> SKIP       (snb-x220t)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                dmesg-warn -> PASS       (bsw-nuc-2)
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                dmesg-fail -> FAIL       (snb-x220t)
                pass       -> DMESG-WARN (bsw-nuc-2)
        Subgroup basic-rte:
                pass       -> DMESG-WARN (snb-x220t)

bdw-nuci7        total:169  pass:158  dwarn:0   dfail:0   fail:0   skip:11 
bdw-ultra        total:169  pass:155  dwarn:0   dfail:0   fail:0   skip:14 
bsw-nuc-2        total:169  pass:137  dwarn:1   dfail:0   fail:1   skip:30 
byt-nuc          total:169  pass:144  dwarn:0   dfail:0   fail:0   skip:25 
hsw-brixbox      total:169  pass:153  dwarn:1   dfail:0   fail:0   skip:15 
hsw-gt2          total:169  pass:157  dwarn:1   dfail:0   fail:1   skip:10 
ilk-hp8440p      total:169  pass:117  dwarn:1   dfail:0   fail:1   skip:50 
ivb-t430s        total:169  pass:153  dwarn:0   dfail:0   fail:1   skip:15 
skl-i5k-2        total:169  pass:153  dwarn:0   dfail:0   fail:0   skip:16 
skl-i7k-2        total:169  pass:153  dwarn:0   dfail:0   fail:0   skip:16 
snb-dellxps      total:169  pass:144  dwarn:1   dfail:0   fail:1   skip:23 
snb-x220t        total:169  pass:143  dwarn:1   dfail:0   fail:2   skip:23 

Results at /archive/results/CI_IGT_test/Patchwork_1507/

f9cadb616ff17d482312fba07db772b6604ce799 drm-intel-nightly: 2016y-03m-01d-17h-16m-32s UTC integration manifest
e1844f3e272a3699a8366bfac854f4f11add8a3c drm: add parameter-order checking to drm memory allocators
388142f7010f3853bfa87c75d9780874f01bde6c drm/i915: refactor duplicate object vmap functions (the final rework?)
928337a27604203fd575c96b8fc6c66efe851336 drm/i915: optimise i915_gem_object_vmap_range() for small objects
3efd2debe3cea0a63e094d238ef24668972da8ac drm/i915: introduce and use i915_gem_object_vmap_range()
b8e2b5bb8c80207ad5e62aa4a22721ac68f23863 drm,i915: introduce drm_malloc_gfp()
ef1ee90066babd6a4ba876a5cc0ed59c766f50c9 drm/i915: move locking in i915_gem_unmap_dma_buf()
53ac7fe84ca9ba56792401c3395d13cfb3b9225e drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling


* Re: [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?)
  2016-03-01 16:33 ` [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?) Dave Gordon
@ 2016-03-02 12:08   ` Chris Wilson
  2016-03-02 15:40     ` Dave Gordon
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-03-02 12:08 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Tue, Mar 01, 2016 at 04:33:58PM +0000, Dave Gordon wrote:
> This is essentially Chris Wilson's patch of a similar name, reworked on
> top of Alex Dai's recent patch:
> | drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
> 
> Chris' original commentary said:
> | We now have two implementations for vmapping a whole object, one for
> | dma-buf and one for the ringbuffer. If we couple the vmapping into
> | the obj->pages lifetime, then we can reuse an obj->vmapping for both
> | and at the same time couple it into the shrinker.
> |
> | v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
> | v3: Call unpin_vmap from the right dmabuf unmapper
> 
> v4: reimplements the same functionality, but now as wrappers round the
>     recently-introduced i915_gem_object_vmap_range() from Alex's patch
>     mentioned above.
> 
> v5: separated from two minor but unrelated changes [Tvrtko Ursulin];
>     this is the third and most substantial portion.
> 
>     Decided not to hold onto vmappings after the pin count goes to zero.
>     This may reduce the benefit of Chris' scheme a bit, but does avoid
>     any increased risk of exhausting kernel vmap space on 32-bit kernels
>     [Tvrtko Ursulin]. Potentially, the vunmap() could be deferred until
>     the put_pages() stage if a suitable notifier were written, but we're
>     not doing that here. Nonetheless, the simplification of both dmabuf
>     and ringbuffer code makes it worthwhile in its own right.
> 
> v6: change BUG_ON() to WARN_ON(). [Tvrtko Ursulin]
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Alex Dai <yu.dai@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 22 ++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem.c         | 39 +++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 36 ++++--------------------------
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++----
>  4 files changed, 65 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b3ae191..f1ad3b3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2172,10 +2172,7 @@ struct drm_i915_gem_object {
>  		struct scatterlist *sg;
>  		int last;
>  	} get_page;
> -
> -	/* prime dma-buf support */
> -	void *dma_buf_vmapping;
> -	int vmapping_count;
> +	void *vmapping;
>  
>  	/** Breadcrumb of last rendering to the buffer.
>  	 * There can only be one writer, but we allow for multiple readers.
> @@ -2980,7 +2977,22 @@ static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>  static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>  {
>  	BUG_ON(obj->pages_pin_count == 0);
> -	obj->pages_pin_count--;
> +	if (--obj->pages_pin_count == 0 && obj->vmapping) {
> +		/*
> +		 * Releasing the vmapping here may yield less benefit than
> +		 * if we kept it until put_pages(), but on the other hand

Yields no benefit. Makes the patch pointless.
Plus there is also pressure to enable WC vmaps.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: ✗ Fi.CI.BAT: warning for Reorganise calls to vmap() GEM objects (rev5)
  2016-03-02  6:54 ` ✗ Fi.CI.BAT: warning for Reorganise calls to vmap() GEM objects (rev5) Patchwork
@ 2016-03-02 12:38   ` Dave Gordon
  0 siblings, 0 replies; 17+ messages in thread
From: Dave Gordon @ 2016-03-02 12:38 UTC (permalink / raw)
  To: intel-gfx, Chris Wilson, Tvrtko Ursulin

On 02/03/16 06:54, Patchwork wrote:
> == Series Details ==
>
> Series: Reorganise calls to vmap() GEM objects (rev5)
> URL   : https://patchwork.freedesktop.org/series/3688/
> State : warning

All of the issues below are known and unrelated to the patchset.

> == Summary ==
>
> Series 3688v5 Reorganise calls to vmap() GEM objects
> http://patchwork.freedesktop.org/api/1.0/series/3688/revisions/5/mbox/
>
> Test kms_flip:
>          Subgroup basic-flip-vs-dpms:
>                  pass       -> DMESG-WARN (ilk-hp8440p) UNSTABLE
>          Subgroup basic-flip-vs-modeset:
>                  dmesg-warn -> PASS       (ilk-hp8440p) UNSTABLE
>          Subgroup basic-flip-vs-wf_vblank:
>                  fail       -> PASS       (snb-x220t)
>                  dmesg-warn -> PASS       (hsw-gt2)
>                  pass       -> DMESG-WARN (hsw-brixbox)

https://bugs.freedesktop.org/show_bug.cgi?id=94349

>          Subgroup basic-plain-flip:
>                  pass       -> DMESG-WARN (hsw-gt2)

Same again:
https://bugs.freedesktop.org/show_bug.cgi?id=94349

> Test kms_force_connector_basic:
>          Subgroup force-edid:
>                  pass       -> SKIP       (snb-x220t)
> Test kms_pipe_crc_basic:
>          Subgroup suspend-read-crc-pipe-c:
>                  dmesg-warn -> PASS       (bsw-nuc-2)
> Test pm_rpm:
>          Subgroup basic-pci-d3-state:
>                  dmesg-fail -> FAIL       (snb-x220t)

https://bugs.freedesktop.org/show_bug.cgi?id=93123

>                  pass       -> DMESG-WARN (bsw-nuc-2)

https://bugs.freedesktop.org/show_bug.cgi?id=94164

>          Subgroup basic-rte:
>                  pass       -> DMESG-WARN (snb-x220t)

https://bugs.freedesktop.org/show_bug.cgi?id=94349 again

> bdw-nuci7        total:169  pass:158  dwarn:0   dfail:0   fail:0   skip:11
> bdw-ultra        total:169  pass:155  dwarn:0   dfail:0   fail:0   skip:14
> bsw-nuc-2        total:169  pass:137  dwarn:1   dfail:0   fail:1   skip:30
> byt-nuc          total:169  pass:144  dwarn:0   dfail:0   fail:0   skip:25
> hsw-brixbox      total:169  pass:153  dwarn:1   dfail:0   fail:0   skip:15
> hsw-gt2          total:169  pass:157  dwarn:1   dfail:0   fail:1   skip:10
> ilk-hp8440p      total:169  pass:117  dwarn:1   dfail:0   fail:1   skip:50
> ivb-t430s        total:169  pass:153  dwarn:0   dfail:0   fail:1   skip:15
> skl-i5k-2        total:169  pass:153  dwarn:0   dfail:0   fail:0   skip:16
> skl-i7k-2        total:169  pass:153  dwarn:0   dfail:0   fail:0   skip:16
> snb-dellxps      total:169  pass:144  dwarn:1   dfail:0   fail:1   skip:23
> snb-x220t        total:169  pass:143  dwarn:1   dfail:0   fail:2   skip:23
>
> Results at /archive/results/CI_IGT_test/Patchwork_1507/
>
> f9cadb616ff17d482312fba07db772b6604ce799 drm-intel-nightly: 2016y-03m-01d-17h-16m-32s UTC integration manifest
> e1844f3e272a3699a8366bfac854f4f11add8a3c drm: add parameter-order checking to drm memory allocators
> 388142f7010f3853bfa87c75d9780874f01bde6c drm/i915: refactor duplicate object vmap functions (the final rework?)
> 928337a27604203fd575c96b8fc6c66efe851336 drm/i915: optimise i915_gem_object_vmap_range() for small objects
> 3efd2debe3cea0a63e094d238ef24668972da8ac drm/i915: introduce and use i915_gem_object_vmap_range()
> b8e2b5bb8c80207ad5e62aa4a22721ac68f23863 drm,i915: introduce drm_malloc_gfp()
> ef1ee90066babd6a4ba876a5cc0ed59c766f50c9 drm/i915: move locking in i915_gem_unmap_dma_buf()
> 53ac7fe84ca9ba56792401c3395d13cfb3b9225e drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>


* Re: [PATCH v7 7/7] drm: add parameter-order checking to drm memory allocators
  2016-03-01 16:33 ` [PATCH v7 7/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
@ 2016-03-02 15:00   ` Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-03-02 15:00 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx; +Cc: dri-devel


On 01/03/16 16:33, Dave Gordon wrote:
> After the recent addition of drm_malloc_gfp(), it was noticed that
> some callers of these functions have swapped the parameters in the
> call - it's supposed to be 'number of members' and 'sizeof(element)',
> but a few callers had got the size first and the count second. This
> isn't otherwise detected because they're both type 'size_t', and
> the implementation at present just multiplies them anyway, so the
> result is still right. But some future implementation might treat
> them differently (e.g. allowing 0 elements but not zero size), so
> let's add some compile-time checks and complain if the second (size)
> parameter isn't a sizeof() expression, or at least a compile-time
> constant.
>
> This patch also fixes those callers where the order was wrong.
>
> v6: removed duplicate BUILD_BUG_ON_MSG(); avoided renaming functions
>      by shadowing them with #defines and then calling the function
>      (non-recursively!) from inside the #define [Chris Wilson]
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: dri-devel@lists.freedesktop.org

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Daniel, there are two DRM core patches in this series, and the only 
thing missing is convincing Chris that 6/7 does bring some improvement, 
especially looking forward to the GuC refactoring it will enable.

Assuming that gets resolved, I assume that because of the core DRM bits 
I will need to ping you to pick up the series?

Regards,

Tvrtko

> ---
>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  2 +-
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  8 ++++----
>   include/drm/drm_mem_util.h                   | 27 ++++++++++++++++++++++++---
>   3 files changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 1aba01a..9ae4a71 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -340,7 +340,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
>   	 */
>   	bos = drm_malloc_ab(args->nr_bos, sizeof(*bos));
>   	relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs));
> -	stream = drm_malloc_ab(1, args->stream_size);
> +	stream = drm_malloc_ab(args->stream_size, sizeof(*stream));
>   	cmdbuf = etnaviv_gpu_cmdbuf_new(gpu, ALIGN(args->stream_size, 8) + 8,
>   					args->nr_bos);
>   	if (!bos || !relocs || !stream || !cmdbuf) {
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index f734b3c..1a136d9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1686,8 +1686,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
>   	}
>
>   	/* Copy in the exec list from userland */
> -	exec_list = drm_malloc_ab(sizeof(*exec_list), args->buffer_count);
> -	exec2_list = drm_malloc_ab(sizeof(*exec2_list), args->buffer_count);
> +	exec_list = drm_malloc_ab(args->buffer_count, sizeof(*exec_list));
> +	exec2_list = drm_malloc_ab(args->buffer_count, sizeof(*exec2_list));
>   	if (exec_list == NULL || exec2_list == NULL) {
>   		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
>   			  args->buffer_count);
> @@ -1775,8 +1775,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
>   		return -EINVAL;
>   	}
>
> -	exec2_list = drm_malloc_gfp(sizeof(*exec2_list),
> -				    args->buffer_count,
> +	exec2_list = drm_malloc_gfp(args->buffer_count,
> +				    sizeof(*exec2_list),
>   				    GFP_TEMPORARY);
>   	if (exec2_list == NULL) {
>   		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
> diff --git a/include/drm/drm_mem_util.h b/include/drm/drm_mem_util.h
> index 741ce75..5b0111c 100644
> --- a/include/drm/drm_mem_util.h
> +++ b/include/drm/drm_mem_util.h
> @@ -29,7 +29,7 @@
>
>   #include <linux/vmalloc.h>
>
> -static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
> +static __inline__ void *drm_calloc_large(size_t nmemb, const size_t size)
>   {
>   	if (size != 0 && nmemb > SIZE_MAX / size)
>   		return NULL;
> @@ -41,8 +41,15 @@ static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
>   			 GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO, PAGE_KERNEL);
>   }
>
> +#define	drm_calloc_large(nmemb, size)					\
> +({									\
> +	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
> +		"Non-constant 'size' - check argument ordering?");	\
> +	(drm_calloc_large)(nmemb, size);				\
> +})
> +
>   /* Modeled after cairo's malloc_ab, it's like calloc but without the zeroing. */
> -static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
> +static __inline__ void *drm_malloc_ab(size_t nmemb, const size_t size)
>   {
>   	if (size != 0 && nmemb > SIZE_MAX / size)
>   		return NULL;
> @@ -54,7 +61,14 @@ static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
>   			 GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
>   }
>
> -static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
> +#define	drm_malloc_ab(nmemb, size)					\
> +({									\
> +	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
> +		"Non-constant 'size' - check argument ordering?");	\
> +	(drm_malloc_ab)(nmemb, size);					\
> +})
> +
> +static __inline__ void *drm_malloc_gfp(size_t nmemb, const size_t size, gfp_t gfp)
>   {
>   	if (size != 0 && nmemb > SIZE_MAX / size)
>   		return NULL;
> @@ -73,6 +87,13 @@ static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
>   			 gfp | __GFP_HIGHMEM, PAGE_KERNEL);
>   }
>
> +#define	drm_malloc_gfp(nmemb, size, gfp)				\
> +({									\
> +	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
> +		"Non-constant 'size' - check argument ordering?");	\
> +	(drm_malloc_gfp)(nmemb, size, gfp);				\
> +})
> +
>   static __inline void drm_free_large(void *ptr)
>   {
>   	kvfree(ptr);
>


* Re: [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?)
  2016-03-02 12:08   ` Chris Wilson
@ 2016-03-02 15:40     ` Dave Gordon
  2016-03-08  9:43       ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Gordon @ 2016-03-02 15:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx@lists.freedesktop.org

On 02/03/16 12:08, Chris Wilson wrote:
> On Tue, Mar 01, 2016 at 04:33:58PM +0000, Dave Gordon wrote:
>> This is essentially Chris Wilson's patch of a similar name, reworked on
>> top of Alex Dai's recent patch:
>> | drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
>>
>> Chris' original commentary said:
>> | We now have two implementations for vmapping a whole object, one for
>> | dma-buf and one for the ringbuffer. If we couple the vmapping into
>> | the obj->pages lifetime, then we can reuse an obj->vmapping for both
>> | and at the same time couple it into the shrinker.
>> |
>> | v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
>> | v3: Call unpin_vmap from the right dmabuf unmapper
>>
>> v4: reimplements the same functionality, but now as wrappers round the
>>      recently-introduced i915_gem_object_vmap_range() from Alex's patch
>>      mentioned above.
>>
>> v5: separated from two minor but unrelated changes [Tvrtko Ursulin];
>>      this is the third and most substantial portion.
>>
>>      Decided not to hold onto vmappings after the pin count goes to zero.
>>      This may reduce the benefit of Chris' scheme a bit, but does avoid
>>      any increased risk of exhausting kernel vmap space on 32-bit kernels
>>      [Tvrtko Ursulin]. Potentially, the vunmap() could be deferred until
>>      the put_pages() stage if a suitable notifier were written, but we're
>>      not doing that here. Nonetheless, the simplification of both dmabuf
>>      and ringbuffer code makes it worthwhile in its own right.
>>
>> v6: change BUG_ON() to WARN_ON(). [Tvrtko Ursulin]
>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Alex Dai <yu.dai@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         | 22 ++++++++++++++-----
>>   drivers/gpu/drm/i915/i915_gem.c         | 39 +++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 36 ++++--------------------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++----
>>   4 files changed, 65 insertions(+), 41 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index b3ae191..f1ad3b3 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2172,10 +2172,7 @@ struct drm_i915_gem_object {
>>   		struct scatterlist *sg;
>>   		int last;
>>   	} get_page;
>> -
>> -	/* prime dma-buf support */
>> -	void *dma_buf_vmapping;
>> -	int vmapping_count;
>> +	void *vmapping;
>>
>>   	/** Breadcrumb of last rendering to the buffer.
>>   	 * There can only be one writer, but we allow for multiple readers.
>> @@ -2980,7 +2977,22 @@ static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>>   static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>>   {
>>   	BUG_ON(obj->pages_pin_count == 0);
>> -	obj->pages_pin_count--;
>> +	if (--obj->pages_pin_count == 0 && obj->vmapping) {
>> +		/*
>> +		 * Releasing the vmapping here may yield less benefit than
>> +		 * if we kept it until put_pages(), but on the other hand
>
> Yields no benefit. Makes the patch pointless.
> Plus there is also pressure to enable WC vmaps.
> -Chris

The patch is not pointless -- at the very least, it:
+ reduces the size of "struct drm_i915_gem_object" (OK, only by 4 bytes)
+ replaces special-purpose code for dmabufs with more generic code that 
can be reused for other objects (for now, ringbuffers; next, GuC-shared 
objects -- see Alex's patch "drm/i915/guc: Simplify code by keeping vmap 
of guc_client object", which will replace a lot of short-term 
kmap_atomics with persistent kmaps).
+ provides a shorthand for the sequence of { get_pages(), pin_pages(), 
vmap() } so we don't have to open-code it (and deal with all the error 
paths) in several different places.

Thus there is an engineering benefit even if this version doesn't 
provide any performance benefit. And if, as the next step, you want to 
extend the vmap lifetime, you just have to remove those few lines in 
i915_gem_object_unpin_pages() and incorporate the notifier that you 
prototyped earlier -- if it actually provides any performance boost.

.Dave.


* Re: [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?)
  2016-03-02 15:40     ` Dave Gordon
@ 2016-03-08  9:43       ` Tvrtko Ursulin
  2016-03-22 15:25         ` Dave Gordon
  0 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-03-08  9:43 UTC (permalink / raw)
  To: Dave Gordon, Chris Wilson; +Cc: intel-gfx@lists.freedesktop.org


On 02/03/16 15:40, Dave Gordon wrote:
> On 02/03/16 12:08, Chris Wilson wrote:
>> On Tue, Mar 01, 2016 at 04:33:58PM +0000, Dave Gordon wrote:
>>> [patch commit message and diff snipped; quoted in full earlier in the thread]
>>> @@ -2980,7 +2977,22 @@ static inline void
>>> i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>>>   static inline void i915_gem_object_unpin_pages(struct
>>> drm_i915_gem_object *obj)
>>>   {
>>>       BUG_ON(obj->pages_pin_count == 0);
>>> -    obj->pages_pin_count--;
>>> +    if (--obj->pages_pin_count == 0 && obj->vmapping) {
>>> +        /*
>>> +         * Releasing the vmapping here may yield less benefit than
>>> +         * if we kept it until put_pages(), but on the other hand
>>
>> Yields no benefit. Makes the patch pointless.
>> Plus there is also pressure to enable WC vmaps.
>> -Chris
>
> The patch is not pointless -- at the very least, it:
> + reduces the size of "struct drm_i915_gem_object" (OK, only by 4 bytes)
> + replaces special-function code for dmabufs with more generic code that
> can be reused for other objects (for now, ringbuffers; next GuC-shared
> objects -- see Alex's patch "drm/i915/guc: Simplify code by keeping vmap
> of guc_client object" which will eliminate lot of short-term
> kmap_atomics with persistent kmaps).
> + provides a shorthand for the sequence of { get_pages(), pin_pages(),
> vmap() } so we don't have to open-code it (and deal with all the error
> paths) in several different places
>
> Thus there is an engineering benefit even if this version doesn't
> provide any performance benefit. And if, as the next step, you want to
> extend the vmap lifetime, you just have to remove those few lines in
> i915_gem_object_unpin_pages() and incorporate the notifier that you
> prototyped earlier -- if it actually provides any performance boost.

So Chris, do you ack this series on the basis of the above - that it 
consolidates the current code, and that the following GuC patch will be 
another user of the pin_vmap API?

Regards,

Tvrtko


* Re: [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?)
  2016-03-08  9:43       ` Tvrtko Ursulin
@ 2016-03-22 15:25         ` Dave Gordon
  2016-03-23 12:23           ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Gordon @ 2016-03-22 15:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Chris Wilson, Daniel Vetter,
	Intel Graphics Development

On 08/03/16 09:43, Tvrtko Ursulin wrote:
>
> On 02/03/16 15:40, Dave Gordon wrote:
>> On 02/03/16 12:08, Chris Wilson wrote:
>>> On Tue, Mar 01, 2016 at 04:33:58PM +0000, Dave Gordon wrote:
>>>> This is essentially Chris Wilson's patch of a similar name, reworked on
>>>> top of Alex Dai's recent patch:
>>>> | drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
>>>>
>>>> Chris' original commentary said:
>>>> | We now have two implementations for vmapping a whole object, one for
>>>> | dma-buf and one for the ringbuffer. If we couple the vmapping into
>>>> | the obj->pages lifetime, then we can reuse an obj->vmapping for both
>>>> | and at the same time couple it into the shrinker.
>>>> |
>>>> | v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
>>>> | v3: Call unpin_vmap from the right dmabuf unmapper
>>>>
>>>> v4: reimplements the same functionality, but now as wrappers round the
>>>>      recently-introduced i915_gem_object_vmap_range() from Alex's patch
>>>>      mentioned above.
>>>>
>>>> v5: separated from two minor but unrelated changes [Tvrtko Ursulin];
>>>>      this is the third and most substantial portion.
>>>>
>>>>      Decided not to hold onto vmappings after the pin count goes to
>>>> zero.
>>>>      This may reduce the benefit of Chris' scheme a bit, but does avoid
>>>>      any increased risk of exhausting kernel vmap space on 32-bit
>>>> kernels
>>>>      [Tvrtko Ursulin]. Potentially, the vunmap() could be deferred
>>>> until
>>>>      the put_pages() stage if a suitable notifier were written, but
>>>> we're
>>>>      not doing that here. Nonetheless, the simplification of both
>>>> dmabuf
>>>>      and ringbuffer code makes it worthwhile in its own right.
>>>>
>>>> v6: change BUG_ON() to WARN_ON(). [Tvrtko Ursulin]
>>>>
>>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Alex Dai <yu.dai@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_drv.h         | 22 ++++++++++++++-----
>>>>   drivers/gpu/drm/i915/i915_gem.c         | 39
>>>> +++++++++++++++++++++++++++++++++
>>>>   drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 36
>>>> ++++--------------------------
>>>>   drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++----
>>>>   4 files changed, 65 insertions(+), 41 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>> index b3ae191..f1ad3b3 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -2172,10 +2172,7 @@ struct drm_i915_gem_object {
>>>>           struct scatterlist *sg;
>>>>           int last;
>>>>       } get_page;
>>>> -
>>>> -    /* prime dma-buf support */
>>>> -    void *dma_buf_vmapping;
>>>> -    int vmapping_count;
>>>> +    void *vmapping;
>>>>
>>>>       /** Breadcrumb of last rendering to the buffer.
>>>>        * There can only be one writer, but we allow for multiple
>>>> readers.
>>>> @@ -2980,7 +2977,22 @@ static inline void
>>>> i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>>>>   static inline void i915_gem_object_unpin_pages(struct
>>>> drm_i915_gem_object *obj)
>>>>   {
>>>>       BUG_ON(obj->pages_pin_count == 0);
>>>> -    obj->pages_pin_count--;
>>>> +    if (--obj->pages_pin_count == 0 && obj->vmapping) {
>>>> +        /*
>>>> +         * Releasing the vmapping here may yield less benefit than
>>>> +         * if we kept it until put_pages(), but on the other hand
>>>
>>> Yields no benefit. Makes the patch pointless.
>>> Plus there is also pressure to enable WC vmaps.
>>> -Chris
>>
>> The patch is not pointless -- at the very least, it:
>> + reduces the size of "struct drm_i915_gem_object" (OK, only by 4 bytes)
>> + replaces special-function code for dmabufs with more generic code that
>> can be reused for other objects (for now, ringbuffers; next GuC-shared
>> objects -- see Alex's patch "drm/i915/guc: Simplify code by keeping vmap
>> of guc_client object" which will eliminate lot of short-term
>> kmap_atomics with persistent kmaps).
>> + provides a shorthand for the sequence of { get_pages(), pin_pages(),
>> vmap() } so we don't have to open-code it (and deal with all the error
>> paths) in several different places
>>
>> Thus there is an engineering benefit even if this version doesn't
>> provide any performance benefit. And if, as the next step, you want to
>> extend the vmap lifetime, you just have to remove those few lines in
>> i915_gem_object_unpin_pages() and incorporate the notifier that you
>> prototyped earlier -- if it actually provides any performance boost.
>
> So Chris do you ack on this series on the basis of the above - that it
> consolidates the current code and following GuC patch will be another
> user of the pin_vmap API?
>
> Regards,
> Tvrtko

I see that Chris has posted a patch to add a vmap notifier, although it 
hasn't yet got its R-B. So I suggest we merge this patch series now, and 
then update it by moving the vunmap() into put_pages() when Chris has 
the notifier finalised. IIRC you wanted Daniel to merge the new DRM bits 
(patches 3 and 7, which already have their R-Bs) ?

Or we can merge 1-5+7, all of which already have R-Bs, and I can turn
6 into a GuC-private version, without the benefit of simplifying and 
unifying the corresponding object-mapping management in the DMAbuf and 
ringbuffer code.

Or I can repost just the bits that don't rely on drm_malloc_gfp() and 
exclude the final patch so that we can move ahead on the bits we 
actually want for improving the performance of the GuC interface and 
reducing the number of kmap_atomic calls elsewhere, and then the omitted 
bits can be added back once drm_malloc_gfp() has been merged upstream 
and the notifier is working.

.Dave.


* Re: [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?)
  2016-03-22 15:25         ` Dave Gordon
@ 2016-03-23 12:23           ` Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2016-03-23 12:23 UTC (permalink / raw)
  To: Dave Gordon, Chris Wilson, Daniel Vetter,
	Intel Graphics Development


On 22/03/16 15:25, Dave Gordon wrote:
> On 08/03/16 09:43, Tvrtko Ursulin wrote:
>>
>> On 02/03/16 15:40, Dave Gordon wrote:
>>> On 02/03/16 12:08, Chris Wilson wrote:
>>>> On Tue, Mar 01, 2016 at 04:33:58PM +0000, Dave Gordon wrote:
>>>>> [patch commit message and diff snipped; quoted in full earlier in the thread]
>>>>> @@ -2980,7 +2977,22 @@ static inline void
>>>>> i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>>>>>   static inline void i915_gem_object_unpin_pages(struct
>>>>> drm_i915_gem_object *obj)
>>>>>   {
>>>>>       BUG_ON(obj->pages_pin_count == 0);
>>>>> -    obj->pages_pin_count--;
>>>>> +    if (--obj->pages_pin_count == 0 && obj->vmapping) {
>>>>> +        /*
>>>>> +         * Releasing the vmapping here may yield less benefit than
>>>>> +         * if we kept it until put_pages(), but on the other hand
>>>>
>>>> Yields no benefit. Makes the patch pointless.
>>>> Plus there is also pressure to enable WC vmaps.
>>>> -Chris
>>>
>>> The patch is not pointless -- at the very least, it:
>>> + reduces the size of "struct drm_i915_gem_object" (OK, only by 4 bytes)
>>> + replaces special-function code for dmabufs with more generic code that
>>> can be reused for other objects (for now, ringbuffers; next GuC-shared
>>> objects -- see Alex's patch "drm/i915/guc: Simplify code by keeping vmap
>>> of guc_client object" which will eliminate lot of short-term
>>> kmap_atomics with persistent kmaps).
>>> + provides a shorthand for the sequence of { get_pages(), pin_pages(),
>>> vmap() } so we don't have to open-code it (and deal with all the error
>>> paths) in several different places
>>>
>>> Thus there is an engineering benefit even if this version doesn't
>>> provide any performance benefit. And if, as the next step, you want to
>>> extend the vmap lifetime, you just have to remove those few lines in
>>> i915_gem_object_unpin_pages() and incorporate the notifier that you
>>> prototyped earlier -- if it actually provides any performance boost.
>>
>> So Chris do you ack on this series on the basis of the above - that it
>> consolidates the current code and following GuC patch will be another
>> user of the pin_vmap API?
>>
>> Regards,
>> Tvrtko
>
> I see that Chris has posted a patch to add a vmap notifier, although it
> hasn't yet got its R-B. So I suggest we merge this patch series now, and
> then update it by moving the vunmap() into put_pages() when Chris has
> the notifier finalised. IIRC you wanted Daniel to merge the new DRM bits
> (patches 3 and 7, which already have their R-Bs) ?
>
> Or we can merge 1-5+7, all of which already have R-Bs, and I can turn
> 6 into a GuC-private version, without the benefit of simplifying and
> unifying the corresponding object-mapping management in the DMAbuf and
> ringbuffer code.
>
> Or I can repost just the bits that don't rely on drm_malloc_gfp() and
> exclude the final patch so that we can move ahead on the bits we
> actually want for improving the performance of the GuC interface and
> reducing the number of kmap_atomic calls elsewhere, and then the omitted
> bits can be added back once drm_malloc_gfp() has been merged upstream
> and the notifier is working.

I've chatted with Chris and Daniel on IRC; here is the summary and, I 
think, the way forward.

1. Drop 6/7, and probably 7/7 unless you can get etnaviv people to r-b/ack.

2. Add the patch which fixes the actual scheduling-while-atomic issue in
the GuC to the end of the series, with Bugzilla & Testcase tags in that
patch.

(This step should allow Chris to provide an Acked-by.)

3. Cc dri-devel on all patches of the series since some touch DRM core. 
(This is standard recommended practice).

4. Rebase & resend as new series.

5. Review the new patch in the series.

6. Explain CI results.

7. Merge. :)

Regards,

Tvrtko


end of thread, other threads:[~2016-03-23 12:23 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-01 16:33 [PATCH v7 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
2016-03-01 16:33 ` [PATCH v7 1/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
2016-03-01 16:33 ` [PATCH v7 2/7] drm/i915: move locking in i915_gem_unmap_dma_buf() Dave Gordon
2016-03-01 16:33 ` [PATCH v7 3/7] drm,i915: introduce drm_malloc_gfp() Dave Gordon
2016-03-01 16:33 ` [PATCH v7 4/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
2016-03-01 17:39   ` Tvrtko Ursulin
2016-03-01 16:33 ` [PATCH v7 5/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects Dave Gordon
2016-03-01 16:33 ` [PATCH v7 6/7] drm/i915: refactor duplicate object vmap functions (the final rework?) Dave Gordon
2016-03-02 12:08   ` Chris Wilson
2016-03-02 15:40     ` Dave Gordon
2016-03-08  9:43       ` Tvrtko Ursulin
2016-03-22 15:25         ` Dave Gordon
2016-03-23 12:23           ` Tvrtko Ursulin
2016-03-01 16:33 ` [PATCH v7 7/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
2016-03-02 15:00   ` Tvrtko Ursulin
2016-03-02  6:54 ` ✗ Fi.CI.BAT: warning for Reorganise calls to vmap() GEM objects (rev5) Patchwork
2016-03-02 12:38   ` Dave Gordon
