All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/7] Reorganise calls to vmap() GEM objects
@ 2016-02-29 11:13 Dave Gordon
  2016-02-29 11:13 ` [PATCH v5 1/7] drm,i915: introduce drm_malloc_gfp() Dave Gordon
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Alex Dai and Chris Wilson have both recently posted patches to
rationalise the use of vmap() for mapping GEM objects into kernel
virtual space. However, they addressed different areas, with Alex's
patch being derived from the copy_batch() code, whereas Chris' patch
refactored the dma-buf and ringbuffer code.

So this patchset unifies the two, copying Chris' interfaces which
unite pin-and-vmap for convenient lifecycle management, but using
Alex's code underneath to permit partial mappings. And finally
there's a little optimisation I've added for "small" objects e.g.
ringbuffers and contexts, which are expected to be the objects most
commonly handled by this code.

v5:
  Added another of Chris' patches, introducing drm_malloc_gfp().

  Split Chris' original patch into three, of which two are actually
  minor unrelated improvements, and only one is actually addresses
  the vmap() reorganisation [Tvrtko Ursulin]

  Decided not to hold onto vmappings after the pin count goes to
  zero. This may reduce the benefit of Chris' scheme somewhat, but
  does avoid any increased risk of exhausting kernel vmap space on
  32-bit kernels [Tvrtko Ursulin]. Potentially, the vunmap() could
  be moved back to the put_pages() stage (thus extending the cache
  lifetime) if a suitable notifier were written, but that's not
  included here.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Alex Dai (1):
  drm/i915: introduce and use i915_gem_object_vmap_range()

Chris Wilson (3):
  drm,i915: introduce drm_malloc_gfp()
  drm/i915: move locking in i915_gem_unmap_dma_buf()
  drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error
    handling

Dave Gordon (3):
  drm: add parameter-order checking to drm memory allocators
  drm/i915: optimise i915_gem_object_vmap_range() for small objects
  drm/i915: refactor duplicate object vmap functions (reworked some
    more)

 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  2 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c       | 28 +--------
 drivers/gpu/drm/i915/i915_drv.h              | 26 ++++++--
 drivers/gpu/drm/i915/i915_gem.c              | 88 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_dmabuf.c       | 53 ++---------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 12 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c          |  5 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c      | 15 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c      | 52 +++++-----------
 include/drm/drm_mem_util.h                   | 47 ++++++++++++++-
 10 files changed, 190 insertions(+), 138 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 1/7] drm,i915: introduce drm_malloc_gfp()
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
@ 2016-02-29 11:13 ` Dave Gordon
  2016-02-29 11:13 ` [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: Chris Wilson <chris@chris-wilson.co.uk>

I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want
to try a high-order kmalloc() before using a vmalloc().

So refactor my usage into drm_malloc_gfp().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  8 +++-----
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  5 +++--
 drivers/gpu/drm/i915/i915_gem_userptr.c    | 15 ++++-----------
 include/drm/drm_mem_util.h                 | 19 +++++++++++++++++++
 4 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 8fd00d2..18a5dcc 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1776,11 +1776,9 @@ static bool only_mappable_for_reloc(unsigned int flags)
 		return -EINVAL;
 	}
 
-	exec2_list = kmalloc(sizeof(*exec2_list)*args->buffer_count,
-			     GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
-	if (exec2_list == NULL)
-		exec2_list = drm_malloc_ab(sizeof(*exec2_list),
-					   args->buffer_count);
+	exec2_list = drm_malloc_gfp(sizeof(*exec2_list),
+				    args->buffer_count,
+				    GFP_TEMPORARY);
 	if (exec2_list == NULL) {
 		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
 			  args->buffer_count);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9127f8f..2b04c3a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3414,8 +3414,9 @@ struct i915_vma *
 	int ret = -ENOMEM;
 
 	/* Allocate a temporary list of source pages for random access. */
-	page_addr_list = drm_malloc_ab(obj->base.size / PAGE_SIZE,
-				       sizeof(dma_addr_t));
+	page_addr_list = drm_malloc_gfp(obj->base.size / PAGE_SIZE,
+					sizeof(dma_addr_t),
+					GFP_TEMPORARY);
 	if (!page_addr_list)
 		return ERR_PTR(ret);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 7107f2f..b4be458 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -494,10 +494,7 @@ struct get_pages_work {
 	ret = -ENOMEM;
 	pinned = 0;
 
-	pvec = kmalloc(npages*sizeof(struct page *),
-		       GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
-	if (pvec == NULL)
-		pvec = drm_malloc_ab(npages, sizeof(struct page *));
+	pvec = drm_malloc_gfp(npages, sizeof(struct page *), GFP_TEMPORARY);
 	if (pvec != NULL) {
 		struct mm_struct *mm = obj->userptr.mm->mm;
 
@@ -634,14 +631,10 @@ struct get_pages_work {
 	pvec = NULL;
 	pinned = 0;
 	if (obj->userptr.mm->mm == current->mm) {
-		pvec = kmalloc(num_pages*sizeof(struct page *),
-			       GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
+		pvec = drm_malloc_gfp(num_pages, sizeof(struct page *), GFP_TEMPORARY);
 		if (pvec == NULL) {
-			pvec = drm_malloc_ab(num_pages, sizeof(struct page *));
-			if (pvec == NULL) {
-				__i915_gem_userptr_set_active(obj, false);
-				return -ENOMEM;
-			}
+			__i915_gem_userptr_set_active(obj, false);
+			return -ENOMEM;
 		}
 
 		pinned = __get_user_pages_fast(obj->userptr.ptr, num_pages,
diff --git a/include/drm/drm_mem_util.h b/include/drm/drm_mem_util.h
index e42495a..741ce75 100644
--- a/include/drm/drm_mem_util.h
+++ b/include/drm/drm_mem_util.h
@@ -54,6 +54,25 @@ static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
 			 GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
+static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
+{
+	if (size != 0 && nmemb > SIZE_MAX / size)
+		return NULL;
+
+	if (size * nmemb <= PAGE_SIZE)
+	    return kmalloc(nmemb * size, gfp);
+
+	if (gfp & __GFP_RECLAIMABLE) {
+		void *ptr = kmalloc(nmemb * size,
+				    gfp | __GFP_NOWARN | __GFP_NORETRY);
+		if (ptr)
+			return ptr;
+	}
+
+	return __vmalloc(size * nmemb,
+			 gfp | __GFP_HIGHMEM, PAGE_KERNEL);
+}
+
 static __inline void drm_free_large(void *ptr)
 {
 	kvfree(ptr);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
  2016-02-29 11:13 ` [PATCH v5 1/7] drm,i915: introduce drm_malloc_gfp() Dave Gordon
@ 2016-02-29 11:13 ` Dave Gordon
  2016-02-29 16:16   ` Tvrtko Ursulin
  2016-02-29 11:13 ` [PATCH v5 3/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Gordon, dri-devel

After the recent addition of drm_malloc_gfp(), it was noticed that
some callers of these functions has swapped the parameters in the
call - it's supposed to be 'number of members' and 'sizeof(element)',
but a few callers had got the size first and the count second. This
isn't otherwise detected because they're both type 'size_t', and
the implementation at present just multiplies them anyway, so the
result is still right. But some future implementation might treat
them differently (e.g. allowing 0 elements but not zero size), so
let's add some compile-time checks and complain if the second (size)
parameter isn't a sizeof() expression, or at least a compile-time
constant.

This patch also fixes those callers where the order was wrong.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>Cc: dri-
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  8 ++++----
 include/drm/drm_mem_util.h                   | 30 +++++++++++++++++++++++++---
 3 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 1aba01a..9ae4a71 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -340,7 +340,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 	 */
 	bos = drm_malloc_ab(args->nr_bos, sizeof(*bos));
 	relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs));
-	stream = drm_malloc_ab(1, args->stream_size);
+	stream = drm_malloc_ab(args->stream_size, sizeof(*stream));
 	cmdbuf = etnaviv_gpu_cmdbuf_new(gpu, ALIGN(args->stream_size, 8) + 8,
 					args->nr_bos);
 	if (!bos || !relocs || !stream || !cmdbuf) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 18a5dcc..865876d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1687,8 +1687,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
 	}
 
 	/* Copy in the exec list from userland */
-	exec_list = drm_malloc_ab(sizeof(*exec_list), args->buffer_count);
-	exec2_list = drm_malloc_ab(sizeof(*exec2_list), args->buffer_count);
+	exec_list = drm_malloc_ab(args->buffer_count, sizeof(*exec_list));
+	exec2_list = drm_malloc_ab(args->buffer_count, sizeof(*exec2_list));
 	if (exec_list == NULL || exec2_list == NULL) {
 		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
 			  args->buffer_count);
@@ -1776,8 +1776,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
 		return -EINVAL;
 	}
 
-	exec2_list = drm_malloc_gfp(sizeof(*exec2_list),
-				    args->buffer_count,
+	exec2_list = drm_malloc_gfp(args->buffer_count,
+				    sizeof(*exec2_list),
 				    GFP_TEMPORARY);
 	if (exec2_list == NULL) {
 		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
diff --git a/include/drm/drm_mem_util.h b/include/drm/drm_mem_util.h
index 741ce75..886ff0a 100644
--- a/include/drm/drm_mem_util.h
+++ b/include/drm/drm_mem_util.h
@@ -29,7 +29,7 @@
 
 #include <linux/vmalloc.h>
 
-static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
+static __inline__ void *__drm_calloc_large(size_t nmemb, const size_t size)
 {
 	if (size != 0 && nmemb > SIZE_MAX / size)
 		return NULL;
@@ -41,8 +41,15 @@ static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
 			 GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO, PAGE_KERNEL);
 }
 
+#define	drm_calloc_large(nmemb, size)					\
+({									\
+	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
+		"Non-constant 'size' - check argument ordering?");	\
+	__drm_calloc_large(nmemb, size);				\
+})
+
 /* Modeled after cairo's malloc_ab, it's like calloc but without the zeroing. */
-static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
+static __inline__ void *__drm_malloc_ab(size_t nmemb, const size_t size)
 {
 	if (size != 0 && nmemb > SIZE_MAX / size)
 		return NULL;
@@ -54,8 +61,18 @@ static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
 			 GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
-static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
+#define	drm_malloc_ab(nmemb, size)					\
+({									\
+	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
+		"Non-constant 'size' - check argument ordering?");	\
+	__drm_malloc_ab(nmemb, size);					\
+})
+
+static __inline__ void *__drm_malloc_gfp(size_t nmemb, const size_t size, gfp_t gfp)
 {
+	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),
+		"Non-constant 'size' - check argument ordering?");
+
 	if (size != 0 && nmemb > SIZE_MAX / size)
 		return NULL;
 
@@ -73,6 +90,13 @@ static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
 			 gfp | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
+#define	drm_malloc_gfp(nmemb, size, gfp)				\
+({									\
+	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
+		"Non-constant 'size' - check argument ordering?");	\
+	__drm_malloc_gfp(nmemb, size, gfp);				\
+})
+
 static __inline void drm_free_large(void *ptr)
 {
 	kvfree(ptr);
-- 
1.9.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 3/7] drm/i915: introduce and use i915_gem_object_vmap_range()
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
  2016-02-29 11:13 ` [PATCH v5 1/7] drm,i915: introduce drm_malloc_gfp() Dave Gordon
  2016-02-29 11:13 ` [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
@ 2016-02-29 11:13 ` Dave Gordon
  2016-02-29 11:53   ` Tvrtko Ursulin
  2016-02-29 11:13 ` [PATCH v5 4/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects Dave Gordon
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

There are several places inside driver where a GEM object is mapped
to kernel virtual space. The mapping may be done either for the whole
object or only a subset of it.

This patch introduces a function i915_gem_object_vmap_range() to
implement the common functionality. The code itself is extracted and
adapted from that in vmap_batch(), but also replaces vmap_obj() and the
open-coded version in i915_gem_dmabuf_vmap().

v2: Use obj->pages->nents for iteration within i915_gem_object_vmap;
    break when it finishes all desired pages. The caller must pass the
    actual page count required. [Tvrtko Ursulin]

v4: renamed to i915_gem_object_vmap_range() to make its function
    clearer. [Dave Gordon]

v5: use Chris Wilson's new drm_malloc_gfp() rather than kmalloc() or
    drm_malloc_ab(). [Dave Gordon]

Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_cmd_parser.c  | 28 +-------------------
 drivers/gpu/drm/i915/i915_drv.h         |  4 +++
 drivers/gpu/drm/i915/i915_gem.c         | 47 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 16 +++--------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++---------------
 5 files changed, 57 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 814d894..114b55f 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -863,37 +863,11 @@ void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring)
 static u32 *vmap_batch(struct drm_i915_gem_object *obj,
 		       unsigned start, unsigned len)
 {
-	int i;
-	void *addr = NULL;
-	struct sg_page_iter sg_iter;
 	int first_page = start >> PAGE_SHIFT;
 	int last_page = (len + start + 4095) >> PAGE_SHIFT;
 	int npages = last_page - first_page;
-	struct page **pages;
-
-	pages = drm_malloc_ab(npages, sizeof(*pages));
-	if (pages == NULL) {
-		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
-		goto finish;
-	}
-
-	i = 0;
-	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) {
-		pages[i++] = sg_page_iter_page(&sg_iter);
-		if (i == npages)
-			break;
-	}
-
-	addr = vmap(pages, i, 0, PAGE_KERNEL);
-	if (addr == NULL) {
-		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
-		goto finish;
-	}
 
-finish:
-	if (pages)
-		drm_free_large(pages);
-	return (u32*)addr;
+	return i915_gem_object_vmap_range(obj, first_page, npages);
 }
 
 /* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9e76bfc..918b0bb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2983,6 +2983,10 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 	obj->pages_pin_count--;
 }
 
+void *__must_check i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
+					      unsigned int first,
+					      unsigned int npages);
+
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_engine_cs *to,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f68f346..c621b3e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2400,6 +2400,53 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
 	return 0;
 }
 
+/**
+ * i915_gem_object_vmap_range - map some or all of a GEM object into kernel space
+ * @obj: the GEM object to be mapped
+ * @first: index of the first page of the object to be mapped
+ * @npages: length of the mapping to be created, in pages
+ *
+ * Map a given page range of a GEM object into kernel virtual space. The caller
+ * must make sure the associated pages are gathered and pinned before calling
+ * this function, and is responsible for vunmapping the returned address.
+ *
+ * Returns the address at which the object has been mapped, or NULL on failure.
+ */
+void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
+				 unsigned int first,
+				 unsigned int npages)
+{
+	struct sg_page_iter sg_iter;
+	struct page **pages;
+	void *addr;
+	int i;
+
+	if (first + npages > obj->pages->nents) {
+		DRM_DEBUG_DRIVER("Invalid page count\n");
+		return NULL;
+	}
+
+	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
+	if (pages == NULL) {
+		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
+		return NULL;
+	}
+
+	i = 0;
+	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first) {
+		pages[i++] = sg_page_iter_page(&sg_iter);
+		if (i == npages)
+			break;
+	}
+
+	addr = vmap(pages, npages, 0, PAGE_KERNEL);
+	if (addr == NULL)
+		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
+	drm_free_large(pages);
+
+	return addr;
+}
+
 void i915_vma_move_to_active(struct i915_vma *vma,
 			     struct drm_i915_gem_request *req)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 1f3eef6..aee4149 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -110,9 +110,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	struct drm_device *dev = obj->base.dev;
-	struct sg_page_iter sg_iter;
-	struct page **pages;
-	int ret, i;
+	int ret;
 
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
@@ -131,16 +129,8 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 
 	ret = -ENOMEM;
 
-	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
-	if (pages == NULL)
-		goto err_unpin;
-
-	i = 0;
-	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
-		pages[i++] = sg_page_iter_page(&sg_iter);
-
-	obj->dma_buf_vmapping = vmap(pages, i, 0, PAGE_KERNEL);
-	drm_free_large(pages);
+	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0,
+						dma_buf->size >> PAGE_SHIFT);
 
 	if (!obj->dma_buf_vmapping)
 		goto err_unpin;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 45ce45a..434a452 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2064,27 +2064,6 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
 	i915_gem_object_ggtt_unpin(ringbuf->obj);
 }
 
-static u32 *vmap_obj(struct drm_i915_gem_object *obj)
-{
-	struct sg_page_iter sg_iter;
-	struct page **pages;
-	void *addr;
-	int i;
-
-	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
-	if (pages == NULL)
-		return NULL;
-
-	i = 0;
-	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
-		pages[i++] = sg_page_iter_page(&sg_iter);
-
-	addr = vmap(pages, i, 0, PAGE_KERNEL);
-	drm_free_large(pages);
-
-	return addr;
-}
-
 int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 				     struct intel_ringbuffer *ringbuf)
 {
@@ -2103,7 +2082,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			return ret;
 		}
 
-		ringbuf->virtual_start = vmap_obj(obj);
+		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
+						ringbuf->size >> PAGE_SHIFT);
 		if (ringbuf->virtual_start == NULL) {
 			i915_gem_object_ggtt_unpin(obj);
 			return -ENOMEM;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 4/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (2 preceding siblings ...)
  2016-02-29 11:13 ` [PATCH v5 3/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
@ 2016-02-29 11:13 ` Dave Gordon
  2016-02-29 12:01   ` Tvrtko Ursulin
  2016-02-29 11:13 ` [PATCH v5 5/7] drm/i915: move locking in i915_gem_unmap_dma_buf() Dave Gordon
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx

We're using this function for ringbuffers and other "small" objects, so
it's worth avoiding an extra malloc()/free() cycle if the page array is
small enough to put on the stack. Here we've chosen an arbitrary cutoff
of 32 (4k) pages, which is big enough for a ringbuffer (4 pages) or a
context image (currently up to 22 pages).

v5:
    change name of local array [Chris Wilson]

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c621b3e..c126211 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2417,7 +2417,8 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 				 unsigned int npages)
 {
 	struct sg_page_iter sg_iter;
-	struct page **pages;
+	struct page *stack_pages[32];
+	struct page **pages = stack_pages;
 	void *addr;
 	int i;
 
@@ -2426,10 +2427,13 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 		return NULL;
 	}
 
-	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
-	if (pages == NULL) {
-		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
-		return NULL;
+	if (npages > ARRAY_SIZE(stack_pages)) {
+		/* Too big for stack -- allocate temporary array instead */
+		pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
+		if (pages == NULL) {
+			DRM_DEBUG_DRIVER("Failed to get space for pages\n");
+			return NULL;
+		}
 	}
 
 	i = 0;
@@ -2442,7 +2446,8 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 	addr = vmap(pages, npages, 0, PAGE_KERNEL);
 	if (addr == NULL)
 		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
-	drm_free_large(pages);
+	if (pages != stack_pages)
+		drm_free_large(pages);
 
 	return addr;
 }
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 5/7] drm/i915: move locking in i915_gem_unmap_dma_buf()
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (3 preceding siblings ...)
  2016-02-29 11:13 ` [PATCH v5 4/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects Dave Gordon
@ 2016-02-29 11:13 ` Dave Gordon
  2016-02-29 12:07   ` Tvrtko Ursulin
  2016-02-29 11:13 ` [PATCH v5 6/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

The struct_mutex doesn't need to be (and therefore shouldn't be)
held around the various non-i915 operations in this function. We
only need the struct_mutex when we get to the unpin_pages() call.

Extracted from Chris Wilson's patch:
    drm/i915: Refactor duplicate object vmap functions
in preparation for the reimplementation of the same.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index aee4149..68e21d1 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -95,14 +95,12 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
 
-	mutex_lock(&obj->base.dev->struct_mutex);
-
 	dma_unmap_sg(attachment->dev, sg->sgl, sg->nents, dir);
 	sg_free_table(sg);
 	kfree(sg);
 
+	mutex_lock(&obj->base.dev->struct_mutex);
 	i915_gem_object_unpin_pages(obj);
-
 	mutex_unlock(&obj->base.dev->struct_mutex);
 }
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 6/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (4 preceding siblings ...)
  2016-02-29 11:13 ` [PATCH v5 5/7] drm/i915: move locking in i915_gem_unmap_dma_buf() Dave Gordon
@ 2016-02-29 11:13 ` Dave Gordon
  2016-02-29 12:10   ` Tvrtko Ursulin
  2016-02-29 11:13 ` [PATCH v5 7/7] drm/i915: refactor duplicate object vmap functions (reworked some more) Dave Gordon
  2016-02-29 11:28 ` ✗ Fi.CI.BAT: failure for Reorganise calls to vmap() GEM objects (rev3) Patchwork
  7 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

Replace multiple "unpin(); return errno;" sequences with
a branch to a single common label for all the error paths.

Extracted from Chris Wilson's patch:
    drm/i915: Refactor duplicate object vmap functions
in preparation for the reimplementation of the same.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 434a452..15e2d29 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2077,16 +2077,14 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			return ret;
 
 		ret = i915_gem_object_set_to_cpu_domain(obj, true);
-		if (ret) {
-			i915_gem_object_ggtt_unpin(obj);
-			return ret;
-		}
+		if (ret)
+			goto unpin;
 
 		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
 						ringbuf->size >> PAGE_SHIFT);
 		if (ringbuf->virtual_start == NULL) {
-			i915_gem_object_ggtt_unpin(obj);
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto unpin;
 		}
 	} else {
 		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
@@ -2094,10 +2092,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			return ret;
 
 		ret = i915_gem_object_set_to_gtt_domain(obj, true);
-		if (ret) {
-			i915_gem_object_ggtt_unpin(obj);
-			return ret;
-		}
+		if (ret)
+			goto unpin;
 
 		/* Access through the GTT requires the device to be awake. */
 		assert_rpm_wakelock_held(dev_priv);
@@ -2105,14 +2101,18 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		ringbuf->virtual_start = ioremap_wc(dev_priv->gtt.mappable_base +
 						    i915_gem_obj_ggtt_offset(obj), ringbuf->size);
 		if (ringbuf->virtual_start == NULL) {
-			i915_gem_object_ggtt_unpin(obj);
-			return -EINVAL;
+			ret = -ENOMEM;
+			goto unpin;
 		}
 	}
 
 	ringbuf->vma = i915_gem_obj_to_ggtt(obj);
 
 	return 0;
+
+unpin:
+	i915_gem_object_ggtt_unpin(obj);
+	return ret;
 }
 
 static void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 7/7] drm/i915: refactor duplicate object vmap functions (reworked some more)
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (5 preceding siblings ...)
  2016-02-29 11:13 ` [PATCH v5 6/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
@ 2016-02-29 11:13 ` Dave Gordon
  2016-02-29 12:36   ` Tvrtko Ursulin
  2016-02-29 11:28 ` ✗ Fi.CI.BAT: failure for Reorganise calls to vmap() GEM objects (rev3) Patchwork
  7 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 11:13 UTC (permalink / raw)
  To: intel-gfx

This is essentially Chris Wilson's patch of a similar name, reworked on
top of Alex Dai's recent patch:
> drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
Chris' original commentary said:

> We now have two implementations for vmapping a whole object, one for
> dma-buf and one for the ringbuffer. If we couple the vmapping into
> the obj->pages lifetime, then we can reuse an obj->vmapping for both
> and at the same time couple it into the shrinker.
> 
> v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
> v3: Call unpin_vmap from the right dmabuf unmapper

v4: reimplements the same functionality, but now as wrappers round the
    recently-introduced i915_gem_object_vmap_range() from Alex's patch
    mentioned above.

v5: separated from two minor but unrelated changes [Tvrtko Ursulin];
    this is the third and most substantial portion.

    Decided not to hold onto vmappings after the pin count goes to zero.
    This may reduce the benefit of Chris' scheme a bit, but does avoid
    any increased risk of exhausting kernel vmap space on 32-bit kernels
    [Tvrtko Ursulin]. Potentially, the vunmap() could be deferred until
    the put_pages() stage if a suitable notifier were written, but we're
    not doing that here. Nonetheless, the simplification of both dmabuf
    and ringbuffer code makes it worthwhile in its own right.

With this change, we have only one vmap() in the whole driver :)

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 22 +++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem.c         | 36 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 37 ++++-----------------------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 10 ++++-----
 4 files changed, 62 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 918b0bb..7facdb5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2172,10 +2172,7 @@ struct drm_i915_gem_object {
 		struct scatterlist *sg;
 		int last;
 	} get_page;
-
-	/* prime dma-buf support */
-	void *dma_buf_vmapping;
-	int vmapping_count;
+	void *vmapping;
 
 	/** Breadcrumb of last rendering to the buffer.
 	 * There can only be one writer, but we allow for multiple readers.
@@ -2980,7 +2977,22 @@ static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pages_pin_count == 0);
-	obj->pages_pin_count--;
+	if (--obj->pages_pin_count == 0 && obj->vmapping) {
+		/*
+		 * Releasing the vmapping here may yield less benefit than
+		 * if we kept it until put_pages(), but on the other hand
+		 * avoids issues of exhausting kernel vmappable address
+		 * space on 32-bit kernels.
+		 */
+		vunmap(obj->vmapping);
+		obj->vmapping = NULL;
+	}
+}
+
+void *__must_check i915_gem_object_pin_vmap(struct drm_i915_gem_object *obj);
+static inline void i915_gem_object_unpin_vmap(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_unpin_pages(obj);
 }
 
 void *__must_check i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c126211..79d3d5b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2235,6 +2235,8 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
 	ops->put_pages(obj);
 	obj->pages = NULL;
 
+	BUG_ON(obj->vmapping);
+
 	i915_gem_object_invalidate(obj);
 
 	return 0;
@@ -2452,6 +2454,40 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
 	return addr;
 }
 
+/**
+ * i915_gem_object_pin_vmap - pin a GEM object and map it into kernel space
+ * @obj: the GEM object to be mapped
+ *
+ * Combines the functions of get_pages(), pin_pages() and vmap_range() on
+ * the whole object.  The caller should release the mapping by calling
+ * i915_gem_object_unpin_vmap() when it is no longer required.
+ *
+ * Returns the address at which the object has been mapped, or an ERR_PTR
+ * on failure.
+ */
+void *i915_gem_object_pin_vmap(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		return ERR_PTR(ret);
+
+	i915_gem_object_pin_pages(obj);
+
+	if (obj->vmapping == NULL) {
+		obj->vmapping = i915_gem_object_vmap_range(obj, 0,
+					obj->base.size >> PAGE_SHIFT);
+
+		if (obj->vmapping == NULL) {
+			i915_gem_object_unpin_pages(obj);
+			return ERR_PTR(-ENOMEM);
+		}
+	}
+
+	return obj->vmapping;
+}
+
 void i915_vma_move_to_active(struct i915_vma *vma,
 			     struct drm_i915_gem_request *req)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 68e21d1..adc7b5e 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -108,41 +108,17 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	struct drm_device *dev = obj->base.dev;
+	void *addr;
 	int ret;
 
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
 		return ERR_PTR(ret);
 
-	if (obj->dma_buf_vmapping) {
-		obj->vmapping_count++;
-		goto out_unlock;
-	}
-
-	ret = i915_gem_object_get_pages(obj);
-	if (ret)
-		goto err;
-
-	i915_gem_object_pin_pages(obj);
-
-	ret = -ENOMEM;
-
-	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0,
-						dma_buf->size >> PAGE_SHIFT);
-
-	if (!obj->dma_buf_vmapping)
-		goto err_unpin;
-
-	obj->vmapping_count = 1;
-out_unlock:
+	addr = i915_gem_object_pin_vmap(obj);
 	mutex_unlock(&dev->struct_mutex);
-	return obj->dma_buf_vmapping;
 
-err_unpin:
-	i915_gem_object_unpin_pages(obj);
-err:
-	mutex_unlock(&dev->struct_mutex);
-	return ERR_PTR(ret);
+	return addr;
 }
 
 static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
@@ -151,12 +127,7 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
 	struct drm_device *dev = obj->base.dev;
 
 	mutex_lock(&dev->struct_mutex);
-	if (--obj->vmapping_count == 0) {
-		vunmap(obj->dma_buf_vmapping);
-		obj->dma_buf_vmapping = NULL;
-
-		i915_gem_object_unpin_pages(obj);
-	}
+	i915_gem_object_unpin_vmap(obj);
 	mutex_unlock(&dev->struct_mutex);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 15e2d29..47f186e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2056,7 +2056,7 @@ static int init_phys_status_page(struct intel_engine_cs *ring)
 void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
 {
 	if (HAS_LLC(ringbuf->obj->base.dev) && !ringbuf->obj->stolen)
-		vunmap(ringbuf->virtual_start);
+		i915_gem_object_unpin_vmap(ringbuf->obj);
 	else
 		iounmap(ringbuf->virtual_start);
 	ringbuf->virtual_start = NULL;
@@ -2080,10 +2080,10 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		if (ret)
 			goto unpin;
 
-		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
-						ringbuf->size >> PAGE_SHIFT);
-		if (ringbuf->virtual_start == NULL) {
-			ret = -ENOMEM;
+		ringbuf->virtual_start = i915_gem_object_pin_vmap(obj);
+		if (IS_ERR(ringbuf->virtual_start)) {
+			ret = PTR_ERR(ringbuf->virtual_start);
+			ringbuf->virtual_start = NULL;
 			goto unpin;
 		}
 	} else {
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* ✗ Fi.CI.BAT: failure for Reorganise calls to vmap() GEM objects (rev3)
  2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
                   ` (6 preceding siblings ...)
  2016-02-29 11:13 ` [PATCH v5 7/7] drm/i915: refactor duplicate object vmap functions (reworked some more) Dave Gordon
@ 2016-02-29 11:28 ` Patchwork
  7 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2016-02-29 11:28 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: Reorganise calls to vmap() GEM objects (rev3)
URL   : https://patchwork.freedesktop.org/series/3688/
State : failure

== Summary ==

Series 3688v3 Reorganise calls to vmap() GEM objects
http://patchwork.freedesktop.org/api/1.0/series/3688/revisions/3/mbox/

Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> DMESG-WARN (ilk-hp8440p) UNSTABLE
        Subgroup basic-flip-vs-modeset:
                pass       -> DMESG-WARN (ilk-hp8440p) UNSTABLE
Test kms_force_connector_basic:
        Subgroup force-load-detect:
                fail       -> DMESG-FAIL (hsw-gt2)
                fail       -> DMESG-FAIL (snb-dellxps)
                dmesg-fail -> FAIL       (ilk-hp8440p)
                skip       -> DMESG-FAIL (ivb-t430s)
        Subgroup prune-stale-modes:
                pass       -> SKIP       (ilk-hp8440p)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                pass       -> SKIP       (hsw-brixbox)
                dmesg-warn -> PASS       (skl-i7k-2) UNSTABLE
        Subgroup suspend-read-crc-pipe-c:
                pass       -> DMESG-WARN (skl-i7k-2) UNSTABLE
Test prime_self_import:
        Subgroup basic-with_fd_dup:
                dmesg-warn -> PASS       (bsw-nuc-2)

bsw-nuc-2        total:169  pass:138  dwarn:0   dfail:0   fail:1   skip:30 
byt-nuc          total:169  pass:144  dwarn:0   dfail:0   fail:0   skip:25 
hsw-brixbox      total:169  pass:154  dwarn:0   dfail:0   fail:0   skip:15 
hsw-gt2          total:169  pass:158  dwarn:0   dfail:1   fail:0   skip:10 
ilk-hp8440p      total:169  pass:116  dwarn:2   dfail:0   fail:1   skip:50 
ivb-t430s        total:169  pass:154  dwarn:0   dfail:1   fail:0   skip:14 
skl-i7k-2        total:169  pass:152  dwarn:1   dfail:0   fail:0   skip:16 
snb-dellxps      total:169  pass:146  dwarn:0   dfail:1   fail:0   skip:22 

Results at /archive/results/CI_IGT_test/Patchwork_1491/

59c2aa9790ada24e1c13cd582a91fea33dc75b00 drm-intel-nightly: 2016y-02m-29d-09h-14m-18s UTC integration manifest
bdbcace9f788d2ad592a234578cef44fa8084ae7 drm/i915: refactor duplicate object vmap functions (reworked some more)
70d2c94781bb145f39c00061e942e7fdc7f9c296 drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling
ecb5ef8759f861a3fec2e5ee337bc0cfe9b580d5 drm/i915: move locking in i915_gem_unmap_dma_buf()
7044eb4ac1dbf4bc589f8a4d8bb78b0037c8e96b drm/i915: optimise i915_gem_object_vmap_range() for small objects
79b3a05d13ef54b8154d486d3c449c37d096e23f drm/i915: introduce and use i915_gem_object_vmap_range()
3d093d69d1d2bf34555427244036c777e9d2dc60 drm: add parameter-order checking to drm memory allocators
fba73bd2531442d662cba69528647de3caaf895c drm,i915: introduce drm_malloc_gfp()

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 3/7] drm/i915: introduce and use i915_gem_object_vmap_range()
  2016-02-29 11:13 ` [PATCH v5 3/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
@ 2016-02-29 11:53   ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-02-29 11:53 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 29/02/16 11:13, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> There are several places inside driver where a GEM object is mapped
> to kernel virtual space. The mapping may be done either for the whole
> object or only a subset of it.
>
> This patch introduces a function i915_gem_object_vmap_range() to
> implement the common functionality. The code itself is extracted and
> adapted from that in vmap_batch(), but also replaces vmap_obj() and the
> open-coded version in i915_gem_dmabuf_vmap().
>
> v2: Use obj->pages->nents for iteration within i915_gem_object_vmap;
>      break when it finishes all desired pages. The caller must pass the
>      actual page count required. [Tvrtko Ursulin]
>
> v4: renamed to i915_gem_object_vmap_range() to make its function
>      clearer. [Dave Gordon]
>
> v5: use Chris Wilson's new drm_malloc_gfp() rather than kmalloc() or
>      drm_malloc_ab(). [Dave Gordon]
>
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_cmd_parser.c  | 28 +-------------------
>   drivers/gpu/drm/i915/i915_drv.h         |  4 +++
>   drivers/gpu/drm/i915/i915_gem.c         | 47 +++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 16 +++--------
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++---------------
>   5 files changed, 57 insertions(+), 62 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 814d894..114b55f 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -863,37 +863,11 @@ void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring)
>   static u32 *vmap_batch(struct drm_i915_gem_object *obj,
>   		       unsigned start, unsigned len)
>   {
> -	int i;
> -	void *addr = NULL;
> -	struct sg_page_iter sg_iter;
>   	int first_page = start >> PAGE_SHIFT;
>   	int last_page = (len + start + 4095) >> PAGE_SHIFT;
>   	int npages = last_page - first_page;
> -	struct page **pages;
> -
> -	pages = drm_malloc_ab(npages, sizeof(*pages));
> -	if (pages == NULL) {
> -		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
> -		goto finish;
> -	}
> -
> -	i = 0;
> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) {
> -		pages[i++] = sg_page_iter_page(&sg_iter);
> -		if (i == npages)
> -			break;
> -	}
> -
> -	addr = vmap(pages, i, 0, PAGE_KERNEL);
> -	if (addr == NULL) {
> -		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
> -		goto finish;
> -	}
>
> -finish:
> -	if (pages)
> -		drm_free_large(pages);
> -	return (u32*)addr;
> +	return i915_gem_object_vmap_range(obj, first_page, npages);
>   }
>
>   /* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 9e76bfc..918b0bb 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2983,6 +2983,10 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>   	obj->pages_pin_count--;
>   }
>
> +void *__must_check i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
> +					      unsigned int first,
> +					      unsigned int npages);
> +
>   int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>   int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   			 struct intel_engine_cs *to,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f68f346..c621b3e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2400,6 +2400,53 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>   	return 0;
>   }
>
> +/**
> + * i915_gem_object_vmap_range - map some or all of a GEM object into kernel space
> + * @obj: the GEM object to be mapped
> + * @first: index of the first page of the object to be mapped
> + * @npages: length of the mapping to be created, in pages
> + *
> + * Map a given page range of a GEM object into kernel virtual space. The caller
> + * must make sure the associated pages are gathered and pinned before calling
> + * this function, and is responsible for vunmapping the returned address.

Could add an assert into the code for the above? (pages exist and are 
pinned)

> + *
> + * Returns the address at which the object has been mapped, or NULL on failure.
> + */
> +void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
> +				 unsigned int first,
> +				 unsigned int npages)
> +{
> +	struct sg_page_iter sg_iter;
> +	struct page **pages;
> +	void *addr;
> +	int i;
> +
> +	if (first + npages > obj->pages->nents) {
> +		DRM_DEBUG_DRIVER("Invalid page count\n");

This is incorrect I think because nents is less or equal to number of 
pages in an object. So it can fail the attempt incorrectly. Should 
probably check against obj->base.size >> PAGE_SHIFT.

> +		return NULL;
> +	}
> +
> +	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
> +	if (pages == NULL) {
> +		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
> +		return NULL;
> +	}
> +
> +	i = 0;
> +	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first) {
> +		pages[i++] = sg_page_iter_page(&sg_iter);
> +		if (i == npages)
> +			break;
> +	}
> +
> +	addr = vmap(pages, npages, 0, PAGE_KERNEL);
> +	if (addr == NULL)
> +		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
> +	drm_free_large(pages);
> +
> +	return addr;
> +}
> +
>   void i915_vma_move_to_active(struct i915_vma *vma,
>   			     struct drm_i915_gem_request *req)
>   {
> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> index 1f3eef6..aee4149 100644
> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> @@ -110,9 +110,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>   {
>   	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
>   	struct drm_device *dev = obj->base.dev;
> -	struct sg_page_iter sg_iter;
> -	struct page **pages;
> -	int ret, i;
> +	int ret;
>
>   	ret = i915_mutex_lock_interruptible(dev);
>   	if (ret)
> @@ -131,16 +129,8 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>
>   	ret = -ENOMEM;
>
> -	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
> -	if (pages == NULL)
> -		goto err_unpin;
> -
> -	i = 0;
> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
> -		pages[i++] = sg_page_iter_page(&sg_iter);
> -
> -	obj->dma_buf_vmapping = vmap(pages, i, 0, PAGE_KERNEL);
> -	drm_free_large(pages);
> +	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0,
> +						dma_buf->size >> PAGE_SHIFT);
>
>   	if (!obj->dma_buf_vmapping)
>   		goto err_unpin;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 45ce45a..434a452 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2064,27 +2064,6 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>   	i915_gem_object_ggtt_unpin(ringbuf->obj);
>   }
>
> -static u32 *vmap_obj(struct drm_i915_gem_object *obj)
> -{
> -	struct sg_page_iter sg_iter;
> -	struct page **pages;
> -	void *addr;
> -	int i;
> -
> -	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
> -	if (pages == NULL)
> -		return NULL;
> -
> -	i = 0;
> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
> -		pages[i++] = sg_page_iter_page(&sg_iter);
> -
> -	addr = vmap(pages, i, 0, PAGE_KERNEL);
> -	drm_free_large(pages);
> -
> -	return addr;
> -}
> -
>   int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   				     struct intel_ringbuffer *ringbuf)
>   {
> @@ -2103,7 +2082,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   			return ret;
>   		}
>
> -		ringbuf->virtual_start = vmap_obj(obj);
> +		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
> +						ringbuf->size >> PAGE_SHIFT);
>   		if (ringbuf->virtual_start == NULL) {
>   			i915_gem_object_ggtt_unpin(obj);
>   			return -ENOMEM;
>

Rest looks good to me.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 4/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects
  2016-02-29 11:13 ` [PATCH v5 4/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects Dave Gordon
@ 2016-02-29 12:01   ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-02-29 12:01 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 29/02/16 11:13, Dave Gordon wrote:
> We're using this function for ringbuffers and other "small" objects, so
> it's worth avoiding an extra malloc()/free() cycle if the page array is
> small enough to put on the stack. Here we've chosen an arbitrary cutoff
> of 32 (4k) pages, which is big enough for a ringbuffer (4 pages) or a
> context image (currently up to 22 pages).
>
> v5:
>      change name of local array [Chris Wilson]
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Alex Dai <yu.dai@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 17 +++++++++++------
>   1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c621b3e..c126211 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2417,7 +2417,8 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
>   				 unsigned int npages)
>   {
>   	struct sg_page_iter sg_iter;
> -	struct page **pages;
> +	struct page *stack_pages[32];
> +	struct page **pages = stack_pages;
>   	void *addr;
>   	int i;
>
> @@ -2426,10 +2427,13 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
>   		return NULL;
>   	}
>
> -	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
> -	if (pages == NULL) {
> -		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
> -		return NULL;
> +	if (npages > ARRAY_SIZE(stack_pages)) {
> +		/* Too big for stack -- allocate temporary array instead */
> +		pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
> +		if (pages == NULL) {
> +			DRM_DEBUG_DRIVER("Failed to get space for pages\n");
> +			return NULL;
> +		}
>   	}
>
>   	i = 0;
> @@ -2442,7 +2446,8 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
>   	addr = vmap(pages, npages, 0, PAGE_KERNEL);
>   	if (addr == NULL)
>   		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
> -	drm_free_large(pages);
> +	if (pages != stack_pages)
> +		drm_free_large(pages);
>
>   	return addr;
>   }
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 5/7] drm/i915: move locking in i915_gem_unmap_dma_buf()
  2016-02-29 11:13 ` [PATCH v5 5/7] drm/i915: move locking in i915_gem_unmap_dma_buf() Dave Gordon
@ 2016-02-29 12:07   ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-02-29 12:07 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 29/02/16 11:13, Dave Gordon wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> The struct_mutex doesn't need to be (and therefore shouldn't be)
> held around the various non-i915 operations in this function. We
> only need the struct_mutex when we get to the unpin_pages() call.
>
> Extracted from Chris Wilson's patch:
>      drm/i915: Refactor duplicate object vmap functions
> in preparation for the reimplementation of the same.
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_dmabuf.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> index aee4149..68e21d1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> @@ -95,14 +95,12 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
>   {
>   	struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
>
> -	mutex_lock(&obj->base.dev->struct_mutex);
> -
>   	dma_unmap_sg(attachment->dev, sg->sgl, sg->nents, dir);
>   	sg_free_table(sg);
>   	kfree(sg);
>
> +	mutex_lock(&obj->base.dev->struct_mutex);
>   	i915_gem_object_unpin_pages(obj);
> -
>   	mutex_unlock(&obj->base.dev->struct_mutex);
>   }

Looks OK under my superficial understanding of dmabuf. But then 
i915_gem_map_dma_buf holds the mutex for too long as well. Anyway,

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 6/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling
  2016-02-29 11:13 ` [PATCH v5 6/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
@ 2016-02-29 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-02-29 12:10 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx



On 29/02/16 11:13, Dave Gordon wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> Replace multiple "unpin(); return errno;" sequences with
> a branch to a single common label for all the error paths.
>
> Extracted from Chris Wilson's patch:
>      drm/i915: Refactor duplicate object vmap functions
> in preparation for the reimplementation of the same.
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++------------
>   1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 434a452..15e2d29 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2077,16 +2077,14 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   			return ret;
>
>   		ret = i915_gem_object_set_to_cpu_domain(obj, true);
> -		if (ret) {
> -			i915_gem_object_ggtt_unpin(obj);
> -			return ret;
> -		}
> +		if (ret)
> +			goto unpin;
>
>   		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
>   						ringbuf->size >> PAGE_SHIFT);
>   		if (ringbuf->virtual_start == NULL) {
> -			i915_gem_object_ggtt_unpin(obj);
> -			return -ENOMEM;
> +			ret = -ENOMEM;
> +			goto unpin;
>   		}
>   	} else {
>   		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
> @@ -2094,10 +2092,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   			return ret;
>
>   		ret = i915_gem_object_set_to_gtt_domain(obj, true);
> -		if (ret) {
> -			i915_gem_object_ggtt_unpin(obj);
> -			return ret;
> -		}
> +		if (ret)
> +			goto unpin;
>
>   		/* Access through the GTT requires the device to be awake. */
>   		assert_rpm_wakelock_held(dev_priv);
> @@ -2105,14 +2101,18 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   		ringbuf->virtual_start = ioremap_wc(dev_priv->gtt.mappable_base +
>   						    i915_gem_obj_ggtt_offset(obj), ringbuf->size);
>   		if (ringbuf->virtual_start == NULL) {
> -			i915_gem_object_ggtt_unpin(obj);
> -			return -EINVAL;
> +			ret = -ENOMEM;
> +			goto unpin;
>   		}
>   	}
>
>   	ringbuf->vma = i915_gem_obj_to_ggtt(obj);
>
>   	return 0;
> +
> +unpin:
> +	i915_gem_object_ggtt_unpin(obj);
> +	return ret;
>   }
>
>   static void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 7/7] drm/i915: refactor duplicate object vmap functions (reworked some more)
  2016-02-29 11:13 ` [PATCH v5 7/7] drm/i915: refactor duplicate object vmap functions (reworked some more) Dave Gordon
@ 2016-02-29 12:36   ` Tvrtko Ursulin
  2016-02-29 20:42     ` Dave Gordon
  0 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-02-29 12:36 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx


On 29/02/16 11:13, Dave Gordon wrote:
> This is essentially Chris Wilson's patch of a similar name, reworked on
> top of Alex Dai's recent patch:
>> drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
> Chris' original commentary said:
>
>> We now have two implementations for vmapping a whole object, one for
>> dma-buf and one for the ringbuffer. If we couple the vmapping into
>> the obj->pages lifetime, then we can reuse an obj->vmapping for both
>> and at the same time couple it into the shrinker.
>>
>> v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
>> v3: Call unpin_vmap from the right dmabuf unmapper
>
> v4: reimplements the same functionality, but now as wrappers round the
>      recently-introduced i915_gem_object_vmap_range() from Alex's patch
>      mentioned above.
>
> v5: separated from two minor but unrelated changes [Tvrtko Ursulin];
>      this is the third and most substantial portion.
>
>      Decided not to hold onto vmappings after the pin count goes to zero.
>      This may reduce the benefit of Chris' scheme a bit, but does avoid
>      any increased risk of exhausting kernel vmap space on 32-bit kernels
>      [Tvrtko Ursulin]. Potentially, the vunmap() could be deferred until
>      the put_pages() stage if a suitable notifier were written, but we're
>      not doing that here. Nonetheless, the simplification of both dmabuf
>      and ringbuffer code makes it worthwhile in its own right.
>
> With this change, we have only one vmap() in the whole driver :)

The above line does not apply to this patch after the split any more. :)

>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Alex Dai <yu.dai@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         | 22 +++++++++++++++-----
>   drivers/gpu/drm/i915/i915_gem.c         | 36 ++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 37 ++++-----------------------------
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 10 ++++-----
>   4 files changed, 62 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 918b0bb..7facdb5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2172,10 +2172,7 @@ struct drm_i915_gem_object {
>   		struct scatterlist *sg;
>   		int last;
>   	} get_page;
> -
> -	/* prime dma-buf support */
> -	void *dma_buf_vmapping;
> -	int vmapping_count;
> +	void *vmapping;
>
>   	/** Breadcrumb of last rendering to the buffer.
>   	 * There can only be one writer, but we allow for multiple readers.
> @@ -2980,7 +2977,22 @@ static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>   static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>   {
>   	BUG_ON(obj->pages_pin_count == 0);
> -	obj->pages_pin_count--;
> +	if (--obj->pages_pin_count == 0 && obj->vmapping) {
> +		/*
> +		 * Releasing the vmapping here may yield less benefit than
> +		 * if we kept it until put_pages(), but on the other hand
> +		 * avoids issues of exhausting kernel vmappable address
> +		 * space on 32-bit kernels.
> +		 */
> +		vunmap(obj->vmapping);
> +		obj->vmapping = NULL;
> +	}
> +}
> +
> +void *__must_check i915_gem_object_pin_vmap(struct drm_i915_gem_object *obj);
> +static inline void i915_gem_object_unpin_vmap(struct drm_i915_gem_object *obj)
> +{
> +	i915_gem_object_unpin_pages(obj);
>   }
>
>   void *__must_check i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c126211..79d3d5b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2235,6 +2235,8 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>   	ops->put_pages(obj);
>   	obj->pages = NULL;
>
> +	BUG_ON(obj->vmapping);
> +

Do we know for certain kernel will crash and not just maybe leak a 
vmapping? I think not so would be better to just WARN.

Maybe even WARN_ON and return -EBUSY higher up, as the function already 
does for obj->pages_pin_count?

>   	i915_gem_object_invalidate(obj);
>
>   	return 0;
> @@ -2452,6 +2454,40 @@ void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
>   	return addr;
>   }
>
> +/**
> + * i915_gem_object_pin_vmap - pin a GEM object and map it into kernel space
> + * @obj: the GEM object to be mapped
> + *
> + * Combines the functions of get_pages(), pin_pages() and vmap_range() on
> + * the whole object.  The caller should release the mapping by calling
> + * i915_gem_object_unpin_vmap() when it is no longer required.
> + *
> + * Returns the address at which the object has been mapped, or an ERR_PTR
> + * on failure.
> + */
> +void *i915_gem_object_pin_vmap(struct drm_i915_gem_object *obj)
> +{
> +	int ret;
> +
> +	ret = i915_gem_object_get_pages(obj);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	i915_gem_object_pin_pages(obj);
> +
> +	if (obj->vmapping == NULL) {
> +		obj->vmapping = i915_gem_object_vmap_range(obj, 0,
> +					obj->base.size >> PAGE_SHIFT);
> +
> +		if (obj->vmapping == NULL) {
> +			i915_gem_object_unpin_pages(obj);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +	}
> +
> +	return obj->vmapping;
> +}
> +
>   void i915_vma_move_to_active(struct i915_vma *vma,
>   			     struct drm_i915_gem_request *req)
>   {
> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> index 68e21d1..adc7b5e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> @@ -108,41 +108,17 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>   {
>   	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
>   	struct drm_device *dev = obj->base.dev;
> +	void *addr;
>   	int ret;
>
>   	ret = i915_mutex_lock_interruptible(dev);
>   	if (ret)
>   		return ERR_PTR(ret);
>
> -	if (obj->dma_buf_vmapping) {
> -		obj->vmapping_count++;
> -		goto out_unlock;
> -	}
> -
> -	ret = i915_gem_object_get_pages(obj);
> -	if (ret)
> -		goto err;
> -
> -	i915_gem_object_pin_pages(obj);
> -
> -	ret = -ENOMEM;
> -
> -	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0,
> -						dma_buf->size >> PAGE_SHIFT);
> -
> -	if (!obj->dma_buf_vmapping)
> -		goto err_unpin;
> -
> -	obj->vmapping_count = 1;
> -out_unlock:
> +	addr = i915_gem_object_pin_vmap(obj);
>   	mutex_unlock(&dev->struct_mutex);
> -	return obj->dma_buf_vmapping;
>
> -err_unpin:
> -	i915_gem_object_unpin_pages(obj);
> -err:
> -	mutex_unlock(&dev->struct_mutex);
> -	return ERR_PTR(ret);
> +	return addr;
>   }
>
>   static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
> @@ -151,12 +127,7 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
>   	struct drm_device *dev = obj->base.dev;
>
>   	mutex_lock(&dev->struct_mutex);
> -	if (--obj->vmapping_count == 0) {
> -		vunmap(obj->dma_buf_vmapping);
> -		obj->dma_buf_vmapping = NULL;
> -
> -		i915_gem_object_unpin_pages(obj);
> -	}
> +	i915_gem_object_unpin_vmap(obj);
>   	mutex_unlock(&dev->struct_mutex);
>   }
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 15e2d29..47f186e 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2056,7 +2056,7 @@ static int init_phys_status_page(struct intel_engine_cs *ring)
>   void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>   {
>   	if (HAS_LLC(ringbuf->obj->base.dev) && !ringbuf->obj->stolen)
> -		vunmap(ringbuf->virtual_start);
> +		i915_gem_object_unpin_vmap(ringbuf->obj);
>   	else
>   		iounmap(ringbuf->virtual_start);
>   	ringbuf->virtual_start = NULL;
> @@ -2080,10 +2080,10 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>   		if (ret)
>   			goto unpin;
>
> -		ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
> -						ringbuf->size >> PAGE_SHIFT);
> -		if (ringbuf->virtual_start == NULL) {
> -			ret = -ENOMEM;
> +		ringbuf->virtual_start = i915_gem_object_pin_vmap(obj);
> +		if (IS_ERR(ringbuf->virtual_start)) {
> +			ret = PTR_ERR(ringbuf->virtual_start);
> +			ringbuf->virtual_start = NULL;
>   			goto unpin;
>   		}
>   	} else {
>

Looks OK to me. Even the BUG_ON is not critical since it can't happen 
when using the API although I don't like to see them. Either way:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators
  2016-02-29 11:13 ` [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
@ 2016-02-29 16:16   ` Tvrtko Ursulin
  2016-02-29 16:55     ` [Intel-gfx] " Chris Wilson
  2016-02-29 17:00     ` Emil Velikov
  0 siblings, 2 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2016-02-29 16:16 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx; +Cc: dri-devel


On 29/02/16 11:13, Dave Gordon wrote:
> After the recent addition of drm_malloc_gfp(), it was noticed that
> some callers of these functions has swapped the parameters in the
> call - it's supposed to be 'number of members' and 'sizeof(element)',
> but a few callers had got the size first and the count second. This
> isn't otherwise detected because they're both type 'size_t', and
> the implementation at present just multiplies them anyway, so the
> result is still right. But some future implementation might treat
> them differently (e.g. allowing 0 elements but not zero size), so
> let's add some compile-time checks and complain if the second (size)
> parameter isn't a sizeof() expression, or at least a compile-time
> constant.
>
> This patch also fixes those callers where the order was wrong.
>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>Cc: dri-
> Cc: dri-devel@lists.freedesktop.org
> ---
>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  2 +-
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  8 ++++----
>   include/drm/drm_mem_util.h                   | 30 +++++++++++++++++++++++++---
>   3 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 1aba01a..9ae4a71 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -340,7 +340,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
>   	 */
>   	bos = drm_malloc_ab(args->nr_bos, sizeof(*bos));
>   	relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs));
> -	stream = drm_malloc_ab(1, args->stream_size);
> +	stream = drm_malloc_ab(args->stream_size, sizeof(*stream));

I was surprised sizeof(void) == 1. On further research that seems to be 
an GCC extension.

I am not sure how active projects to compile the kernel with for example 
ICC are, just remember some talks about it in the past. Maybe it is even 
possible? In that case it would be better to just leave "1" there to not 
rely on GCC extensions.

And this use of array allocator in Etnaviv is strange since they are 
allocating an unstructured buffer, but whatever, it existing and 
external code. (I am not even sure would I bother touching it, since, is 
the logical view to allocate *one* _buffer_ of a specified size, or 
specified size of bytes make a buffer? :) )

>   	cmdbuf = etnaviv_gpu_cmdbuf_new(gpu, ALIGN(args->stream_size, 8) + 8,
>   					args->nr_bos);
>   	if (!bos || !relocs || !stream || !cmdbuf) {
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 18a5dcc..865876d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1687,8 +1687,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
>   	}
>
>   	/* Copy in the exec list from userland */
> -	exec_list = drm_malloc_ab(sizeof(*exec_list), args->buffer_count);
> -	exec2_list = drm_malloc_ab(sizeof(*exec2_list), args->buffer_count);
> +	exec_list = drm_malloc_ab(args->buffer_count, sizeof(*exec_list));
> +	exec2_list = drm_malloc_ab(args->buffer_count, sizeof(*exec2_list));
>   	if (exec_list == NULL || exec2_list == NULL) {
>   		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
>   			  args->buffer_count);
> @@ -1776,8 +1776,8 @@ static bool only_mappable_for_reloc(unsigned int flags)
>   		return -EINVAL;
>   	}
>
> -	exec2_list = drm_malloc_gfp(sizeof(*exec2_list),
> -				    args->buffer_count,
> +	exec2_list = drm_malloc_gfp(args->buffer_count,
> +				    sizeof(*exec2_list),
>   				    GFP_TEMPORARY);
>   	if (exec2_list == NULL) {
>   		DRM_DEBUG("Failed to allocate exec list for %d buffers\n",
> diff --git a/include/drm/drm_mem_util.h b/include/drm/drm_mem_util.h
> index 741ce75..886ff0a 100644
> --- a/include/drm/drm_mem_util.h
> +++ b/include/drm/drm_mem_util.h
> @@ -29,7 +29,7 @@
>
>   #include <linux/vmalloc.h>
>
> -static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
> +static __inline__ void *__drm_calloc_large(size_t nmemb, const size_t size)
>   {
>   	if (size != 0 && nmemb > SIZE_MAX / size)
>   		return NULL;
> @@ -41,8 +41,15 @@ static __inline__ void *drm_calloc_large(size_t nmemb, size_t size)
>   			 GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO, PAGE_KERNEL);
>   }
>
> +#define	drm_calloc_large(nmemb, size)					\
> +({									\
> +	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
> +		"Non-constant 'size' - check argument ordering?");	\
> +	__drm_calloc_large(nmemb, size);				\
> +})
> +
>   /* Modeled after cairo's malloc_ab, it's like calloc but without the zeroing. */
> -static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
> +static __inline__ void *__drm_malloc_ab(size_t nmemb, const size_t size)
>   {
>   	if (size != 0 && nmemb > SIZE_MAX / size)
>   		return NULL;
> @@ -54,8 +61,18 @@ static __inline__ void *drm_malloc_ab(size_t nmemb, size_t size)
>   			 GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
>   }
>
> -static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
> +#define	drm_malloc_ab(nmemb, size)					\
> +({									\
> +	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
> +		"Non-constant 'size' - check argument ordering?");	\
> +	__drm_malloc_ab(nmemb, size);					\
> +})
> +
> +static __inline__ void *__drm_malloc_gfp(size_t nmemb, const size_t size, gfp_t gfp)
>   {
> +	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),
> +		"Non-constant 'size' - check argument ordering?");
> +
>   	if (size != 0 && nmemb > SIZE_MAX / size)
>   		return NULL;
>
> @@ -73,6 +90,13 @@ static __inline__ void *drm_malloc_gfp(size_t nmemb, size_t size, gfp_t gfp)
>   			 gfp | __GFP_HIGHMEM, PAGE_KERNEL);
>   }
>
> +#define	drm_malloc_gfp(nmemb, size, gfp)				\
> +({									\
> +	BUILD_BUG_ON_MSG(!__builtin_constant_p(size),			\
> +		"Non-constant 'size' - check argument ordering?");	\
> +	__drm_malloc_gfp(nmemb, size, gfp);				\
> +})
> +
>   static __inline void drm_free_large(void *ptr)
>   {
>   	kvfree(ptr);
>

i915 cleanups are good but I am unsure of whether it is good to add this 
constant constraints. All current code seems to use it like that, true, 
but I am not sure that it should be a requirement.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Intel-gfx] [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators
  2016-02-29 16:16   ` Tvrtko Ursulin
@ 2016-02-29 16:55     ` Chris Wilson
  2016-02-29 17:00     ` Emil Velikov
  1 sibling, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2016-02-29 16:55 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Dave Gordon, intel-gfx, dri-devel

On Mon, Feb 29, 2016 at 04:16:57PM +0000, Tvrtko Ursulin wrote:
> i915 cleanups are good but I am unsure of whether it is good to add
> this constant constraints. All current code seems to use it like
> that, true, but I am not sure that it should be a requirement.

The drm_mem_util allocators are written under that presumption in order
to allow gcc to do some constant-expression elimination - but obviously
that is not strictly required.

I like the assertions, they help describe the API and should allow us to
warn about potential bugs.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators
  2016-02-29 16:16   ` Tvrtko Ursulin
  2016-02-29 16:55     ` [Intel-gfx] " Chris Wilson
@ 2016-02-29 17:00     ` Emil Velikov
  1 sibling, 0 replies; 18+ messages in thread
From: Emil Velikov @ 2016-02-29 17:00 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx@lists.freedesktop.org, ML dri-devel

 On 29 February 2016 at 16:16, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> On 29/02/16 11:13, Dave Gordon wrote:
>>
>> After the recent addition of drm_malloc_gfp(), it was noticed that
>> some callers of these functions has swapped the parameters in the
>> call - it's supposed to be 'number of members' and 'sizeof(element)',
>> but a few callers had got the size first and the count second. This
>> isn't otherwise detected because they're both type 'size_t', and
>> the implementation at present just multiplies them anyway, so the
>> result is still right. But some future implementation might treat
>> them differently (e.g. allowing 0 elements but not zero size), so
>> let's add some compile-time checks and complain if the second (size)
>> parameter isn't a sizeof() expression, or at least a compile-time
>> constant.
>>
>> This patch also fixes those callers where the order was wrong.
>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>Cc: dri-
>> Cc: dri-devel@lists.freedesktop.org
>> ---
>>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  2 +-
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  8 ++++----
>>   include/drm/drm_mem_util.h                   | 30
>> +++++++++++++++++++++++++---
>>   3 files changed, 32 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> index 1aba01a..9ae4a71 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> @@ -340,7 +340,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev,
>> void *data,
>>          */
>>         bos = drm_malloc_ab(args->nr_bos, sizeof(*bos));
>>         relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs));
>> -       stream = drm_malloc_ab(1, args->stream_size);
>> +       stream = drm_malloc_ab(args->stream_size, sizeof(*stream));
>
>
> I was surprised sizeof(void) == 1. On further research that seems to be an
> GCC extension.
>
Afaict pointer arithmetic (-Wpointer-arith) has/is been used in the
kernel extensively. In userspace we try to avoid it libdrm and mesa,
there might be a few cases where it's still around. No too sure about
libva{,-intel}.

> I am not sure how active projects to compile the kernel with for example ICC
> are, just remember some talks about it in the past. Maybe it is even
> possible? In that case it would be better to just leave "1" there to not
> rely on GCC extensions.
>
While I'm all for the idea, I doubt we'll get there any time soon. My
last suggestion (do not use zero sized arrays) went down in flames :-(

Regards,
Emil
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 7/7] drm/i915: refactor duplicate object vmap functions (reworked some more)
  2016-02-29 12:36   ` Tvrtko Ursulin
@ 2016-02-29 20:42     ` Dave Gordon
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Gordon @ 2016-02-29 20:42 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

On 29/02/16 12:36, Tvrtko Ursulin wrote:
>
> On 29/02/16 11:13, Dave Gordon wrote:
>> This is essentially Chris Wilson's patch of a similar name, reworked on
>> top of Alex Dai's recent patch:
>>> drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
>> Chris' original commentary said:
>>
>>> We now have two implementations for vmapping a whole object, one for
>>> dma-buf and one for the ringbuffer. If we couple the vmapping into
>>> the obj->pages lifetime, then we can reuse an obj->vmapping for both
>>> and at the same time couple it into the shrinker.
>>>
>>> v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
>>> v3: Call unpin_vmap from the right dmabuf unmapper
>>
>> v4: reimplements the same functionality, but now as wrappers round the
>>      recently-introduced i915_gem_object_vmap_range() from Alex's patch
>>      mentioned above.
>>
>> v5: separated from two minor but unrelated changes [Tvrtko Ursulin];
>>      this is the third and most substantial portion.
>>
>>      Decided not to hold onto vmappings after the pin count goes to zero.
>>      This may reduce the benefit of Chris' scheme a bit, but does avoid
>>      any increased risk of exhausting kernel vmap space on 32-bit kernels
>>      [Tvrtko Ursulin]. Potentially, the vunmap() could be deferred until
>>      the put_pages() stage if a suitable notifier were written, but we're
>>      not doing that here. Nonetheless, the simplification of both dmabuf
>>      and ringbuffer code makes it worthwhile in its own right.
>>
>> With this change, we have only one vmap() in the whole driver :)
>
> The above line does not apply to this patch after the split any more. :)

Yes, actually that comment now applies to patch [3/7] which unified the 
various vmap callers into i915_gem_object_vmap_range() and then this 
patch wraps that and bundles it with get_pages() and pin_pages() ... 
there will still be one more respin, so I'll tweak the message a bit more :)

>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Alex Dai <yu.dai@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         | 22 +++++++++++++++-----
>>   drivers/gpu/drm/i915/i915_gem.c         | 36
>> ++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 37
>> ++++-----------------------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.c | 10 ++++-----
>>   4 files changed, 62 insertions(+), 43 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 918b0bb..7facdb5 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2172,10 +2172,7 @@ struct drm_i915_gem_object {
>>           struct scatterlist *sg;
>>           int last;
>>       } get_page;
>> -
>> -    /* prime dma-buf support */
>> -    void *dma_buf_vmapping;
>> -    int vmapping_count;
>> +    void *vmapping;
>>
>>       /** Breadcrumb of last rendering to the buffer.
>>        * There can only be one writer, but we allow for multiple readers.
>> @@ -2980,7 +2977,22 @@ static inline void
>> i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>>   static inline void i915_gem_object_unpin_pages(struct
>> drm_i915_gem_object *obj)
>>   {
>>       BUG_ON(obj->pages_pin_count == 0);
>> -    obj->pages_pin_count--;
>> +    if (--obj->pages_pin_count == 0 && obj->vmapping) {
>> +        /*
>> +         * Releasing the vmapping here may yield less benefit than
>> +         * if we kept it until put_pages(), but on the other hand
>> +         * avoids issues of exhausting kernel vmappable address
>> +         * space on 32-bit kernels.
>> +         */
>> +        vunmap(obj->vmapping);
>> +        obj->vmapping = NULL;
>> +    }
>> +}
>> +
>> +void *__must_check i915_gem_object_pin_vmap(struct
>> drm_i915_gem_object *obj);
>> +static inline void i915_gem_object_unpin_vmap(struct
>> drm_i915_gem_object *obj)
>> +{
>> +    i915_gem_object_unpin_pages(obj);
>>   }
>>
>>   void *__must_check i915_gem_object_vmap_range(struct
>> drm_i915_gem_object *obj,
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index c126211..79d3d5b 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2235,6 +2235,8 @@ static void
>> i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>>       ops->put_pages(obj);
>>       obj->pages = NULL;
>>
>> +    BUG_ON(obj->vmapping);
>> +
>
> Do we know for certain kernel will crash and not just maybe leak a
> vmapping? I think not so would be better to just WARN.
>
> Maybe even WARN_ON and return -EBUSY higher up, as the function already
> does for obj->pages_pin_count?

This BUG_ON marks the point where Chris originally had the vunmap(), 
before we realised it would need a new notifier and decided to drop the 
mapping in i915_gem_object_unpin_pages() instead. It was really just 
there for me to check that you couldn't reach put_pages() without having 
gone through that function (e.g. playing with pages_pin_count).

Hmm ... actually there *is* a path where pages_pin_count gets set to 
zero other than by i915_gem_object_unpin_pages() :(

If you try to free a pinned object, you get a WARNING, but then 
i915_gem_free_object() will force pages_pin_count to 0 anyway, and then 
call i915_gem_object_put_pages(), which would definitely trigger the 
BUG_ON().

So I think I'll make it a WARN_ON() and then release the mapping anyway. 
Then if anyone wants to extend the vmap lifetime back to as Chris had 
originally it, it's just a matter of removing the WARNing here and the 
early-unmap code in i915_gem_object_unpin_pages().

.Dave.

>>       i915_gem_object_invalidate(obj);
>>
>>       return 0;
>> @@ -2452,6 +2454,40 @@ void *i915_gem_object_vmap_range(struct
>> drm_i915_gem_object *obj,
>>       return addr;
>>   }
>>
>> +/**
>> + * i915_gem_object_pin_vmap - pin a GEM object and map it into kernel
>> space
>> + * @obj: the GEM object to be mapped
>> + *
>> + * Combines the functions of get_pages(), pin_pages() and
>> vmap_range() on
>> + * the whole object.  The caller should release the mapping by calling
>> + * i915_gem_object_unpin_vmap() when it is no longer required.
>> + *
>> + * Returns the address at which the object has been mapped, or an
>> ERR_PTR
>> + * on failure.
>> + */
>> +void *i915_gem_object_pin_vmap(struct drm_i915_gem_object *obj)
>> +{
>> +    int ret;
>> +
>> +    ret = i915_gem_object_get_pages(obj);
>> +    if (ret)
>> +        return ERR_PTR(ret);
>> +
>> +    i915_gem_object_pin_pages(obj);
>> +
>> +    if (obj->vmapping == NULL) {
>> +        obj->vmapping = i915_gem_object_vmap_range(obj, 0,
>> +                    obj->base.size >> PAGE_SHIFT);
>> +
>> +        if (obj->vmapping == NULL) {
>> +            i915_gem_object_unpin_pages(obj);
>> +            return ERR_PTR(-ENOMEM);
>> +        }
>> +    }
>> +
>> +    return obj->vmapping;
>> +}
>> +
>>   void i915_vma_move_to_active(struct i915_vma *vma,
>>                    struct drm_i915_gem_request *req)
>>   {
>> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> index 68e21d1..adc7b5e 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> @@ -108,41 +108,17 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf
>> *dma_buf)
>>   {
>>       struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
>>       struct drm_device *dev = obj->base.dev;
>> +    void *addr;
>>       int ret;
>>
>>       ret = i915_mutex_lock_interruptible(dev);
>>       if (ret)
>>           return ERR_PTR(ret);
>>
>> -    if (obj->dma_buf_vmapping) {
>> -        obj->vmapping_count++;
>> -        goto out_unlock;
>> -    }
>> -
>> -    ret = i915_gem_object_get_pages(obj);
>> -    if (ret)
>> -        goto err;
>> -
>> -    i915_gem_object_pin_pages(obj);
>> -
>> -    ret = -ENOMEM;
>> -
>> -    obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0,
>> -                        dma_buf->size >> PAGE_SHIFT);
>> -
>> -    if (!obj->dma_buf_vmapping)
>> -        goto err_unpin;
>> -
>> -    obj->vmapping_count = 1;
>> -out_unlock:
>> +    addr = i915_gem_object_pin_vmap(obj);
>>       mutex_unlock(&dev->struct_mutex);
>> -    return obj->dma_buf_vmapping;
>>
>> -err_unpin:
>> -    i915_gem_object_unpin_pages(obj);
>> -err:
>> -    mutex_unlock(&dev->struct_mutex);
>> -    return ERR_PTR(ret);
>> +    return addr;
>>   }
>>
>>   static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void
>> *vaddr)
>> @@ -151,12 +127,7 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf
>> *dma_buf, void *vaddr)
>>       struct drm_device *dev = obj->base.dev;
>>
>>       mutex_lock(&dev->struct_mutex);
>> -    if (--obj->vmapping_count == 0) {
>> -        vunmap(obj->dma_buf_vmapping);
>> -        obj->dma_buf_vmapping = NULL;
>> -
>> -        i915_gem_object_unpin_pages(obj);
>> -    }
>> +    i915_gem_object_unpin_vmap(obj);
>>       mutex_unlock(&dev->struct_mutex);
>>   }
>>
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 15e2d29..47f186e 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2056,7 +2056,7 @@ static int init_phys_status_page(struct
>> intel_engine_cs *ring)
>>   void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>>   {
>>       if (HAS_LLC(ringbuf->obj->base.dev) && !ringbuf->obj->stolen)
>> -        vunmap(ringbuf->virtual_start);
>> +        i915_gem_object_unpin_vmap(ringbuf->obj);
>>       else
>>           iounmap(ringbuf->virtual_start);
>>       ringbuf->virtual_start = NULL;
>> @@ -2080,10 +2080,10 @@ int intel_pin_and_map_ringbuffer_obj(struct
>> drm_device *dev,
>>           if (ret)
>>               goto unpin;
>>
>> -        ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
>> -                        ringbuf->size >> PAGE_SHIFT);
>> -        if (ringbuf->virtual_start == NULL) {
>> -            ret = -ENOMEM;
>> +        ringbuf->virtual_start = i915_gem_object_pin_vmap(obj);
>> +        if (IS_ERR(ringbuf->virtual_start)) {
>> +            ret = PTR_ERR(ringbuf->virtual_start);
>> +            ringbuf->virtual_start = NULL;
>>               goto unpin;
>>           }
>>       } else {
>>
>
> Looks OK to me. Even the BUG_ON is not critical since it can't happen
> when using the API although I don't like to see them. Either way:
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
>
> Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-02-29 20:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-29 11:13 [PATCH v5 0/7] Reorganise calls to vmap() GEM objects Dave Gordon
2016-02-29 11:13 ` [PATCH v5 1/7] drm,i915: introduce drm_malloc_gfp() Dave Gordon
2016-02-29 11:13 ` [PATCH v5 2/7] drm: add parameter-order checking to drm memory allocators Dave Gordon
2016-02-29 16:16   ` Tvrtko Ursulin
2016-02-29 16:55     ` [Intel-gfx] " Chris Wilson
2016-02-29 17:00     ` Emil Velikov
2016-02-29 11:13 ` [PATCH v5 3/7] drm/i915: introduce and use i915_gem_object_vmap_range() Dave Gordon
2016-02-29 11:53   ` Tvrtko Ursulin
2016-02-29 11:13 ` [PATCH v5 4/7] drm/i915: optimise i915_gem_object_vmap_range() for small objects Dave Gordon
2016-02-29 12:01   ` Tvrtko Ursulin
2016-02-29 11:13 ` [PATCH v5 5/7] drm/i915: move locking in i915_gem_unmap_dma_buf() Dave Gordon
2016-02-29 12:07   ` Tvrtko Ursulin
2016-02-29 11:13 ` [PATCH v5 6/7] drm/i915: deduplicate intel_pin_and_map_ringbuffer_obj() error handling Dave Gordon
2016-02-29 12:10   ` Tvrtko Ursulin
2016-02-29 11:13 ` [PATCH v5 7/7] drm/i915: refactor duplicate object vmap functions (reworked some more) Dave Gordon
2016-02-29 12:36   ` Tvrtko Ursulin
2016-02-29 20:42     ` Dave Gordon
2016-02-29 11:28 ` ✗ Fi.CI.BAT: failure for Reorganise calls to vmap() GEM objects (rev3) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.