public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [RFC 0/3] Increase the utilization of Stolen area on VLV
@ 2014-03-05 11:35 sourab.gupta
  2014-03-05 11:35 ` [RFC 1/3] drm/i915: Added a new 'I915_CPU_MAP_NOT_NEEDED' flag to gem_create ioctl sourab.gupta
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: sourab.gupta @ 2014-03-05 11:35 UTC (permalink / raw)
  To: intel-gfx
  Cc: Cc: Daniel Vetter, Sourab Gupta, Cc: Jesse Barnes,
	Cc: Chris Wilson

From: Sourab Gupta <sourab.gupta@intel.com>

These patches are mainly for increasing the utilization
of stolen memory area on VLV. 
Although similar patches were submitted earlier, but this time the 
allocation from stolen area will be done based on user's input.
We have now exposed a new libdrm flag which user can use while creating
an object to convey driver that direct cpu access to buffer is 
not required. Based on this flag, driver can decide to allocate the 
backing storage for this object from stolen mem area.
Based on earlier comments, we have added the truncation logic for these
objects.

Sourab Gupta (3):
  drm/i915: Added a new 'I915_CPU_MAP_NOT_NEEDED' flag to gem_create
    ioctl.
  i915/drm: Increase the utilization of stolen memory on VLV
  drm/i915: Add the truncation logic for Stolen objects.

 drivers/gpu/drm/i915/i915_drv.h        |    8 ++
 drivers/gpu/drm/i915/i915_gem.c        |   34 ++++++++-
 drivers/gpu/drm/i915/i915_gem_stolen.c |  126 ++++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h            |    3 +-
 4 files changed, 166 insertions(+), 5 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC 1/3] drm/i915: Added a new 'I915_CPU_MAP_NOT_NEEDED' flag to gem_create ioctl.
  2014-03-05 11:35 [RFC 0/3] Increase the utilization of Stolen area on VLV sourab.gupta
@ 2014-03-05 11:35 ` sourab.gupta
  2014-03-05 11:35 ` [RFC 2/3] i915/drm: Increase the utilization of stolen memory on VLV sourab.gupta
  2014-03-05 11:35 ` [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects sourab.gupta
  2 siblings, 0 replies; 13+ messages in thread
From: sourab.gupta @ 2014-03-05 11:35 UTC (permalink / raw)
  To: intel-gfx
  Cc: Cc: Daniel Vetter, Sourab Gupta, Cc: Jesse Barnes, Akash Goel,
	Cc: Chris Wilson

From: Sourab Gupta <sourab.gupta@intel.com>

Adding the flag 'I915_CPU_MAP_NOT_NEEDED' to gem_create ioctl.
This is to indicate the driver that direct cpu access to this
buffer object is not needed and hence Driver can opt to use Stolen
area as a backing store for it.

Testcase: igt/gem_stolen_mem

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |    6 ++++++
 drivers/gpu/drm/i915/i915_gem.c |   12 +++++++++---
 include/uapi/drm/i915_drm.h     |    3 ++-
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 05cfcc1..8a066a7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1714,6 +1714,12 @@ struct drm_i915_gem_object {
 	unsigned int has_global_gtt_mapping:1;
 	unsigned int has_dma_mapping:1;
 
+	/*
+	 * Is the object not required to be accessed from CPU
+	 */
+	unsigned int cpu_map_not_needed:1;
+
+
 	struct sg_table *pages;
 	int pages_pin_count;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3618bb0..8421b80 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -224,7 +224,9 @@ static int
 i915_gem_create(struct drm_file *file,
 		struct drm_device *dev,
 		uint64_t size,
-		uint32_t *handle_p)
+		uint32_t *handle_p,
+		uint32_t flags)
+
 {
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -245,6 +247,10 @@ i915_gem_create(struct drm_file *file,
 	if (ret)
 		return ret;
 
+        if (flags & I915_CPU_MAP_NOT_NEEDED)
+		obj->cpu_map_not_needed = 1;
+
+
 	*handle_p = handle;
 	return 0;
 }
@@ -258,7 +264,7 @@ i915_gem_dumb_create(struct drm_file *file,
 	args->pitch = ALIGN(args->width * DIV_ROUND_UP(args->bpp, 8), 64);
 	args->size = args->pitch * args->height;
 	return i915_gem_create(file, dev,
-			       args->size, &args->handle);
+			       args->size, &args->handle, args->flags);
 }
 
 /**
@@ -271,7 +277,7 @@ i915_gem_create_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_gem_create *args = data;
 
 	return i915_gem_create(file, dev,
-			       args->size, &args->handle);
+			       args->size, &args->handle, args->flags);
 }
 
 static inline int
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 126bfaa..8625f54 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -431,7 +431,8 @@ struct drm_i915_gem_create {
 	 * Object handles are nonzero.
 	 */
 	__u32 handle;
-	__u32 pad;
+	__u32 flags;
+#define I915_CPU_MAP_NOT_NEEDED 0x1
 };
 
 struct drm_i915_gem_pread {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC 2/3] i915/drm: Increase the utilization of stolen memory on VLV
  2014-03-05 11:35 [RFC 0/3] Increase the utilization of Stolen area on VLV sourab.gupta
  2014-03-05 11:35 ` [RFC 1/3] drm/i915: Added a new 'I915_CPU_MAP_NOT_NEEDED' flag to gem_create ioctl sourab.gupta
@ 2014-03-05 11:35 ` sourab.gupta
  2014-03-05 11:35 ` [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects sourab.gupta
  2 siblings, 0 replies; 13+ messages in thread
From: sourab.gupta @ 2014-03-05 11:35 UTC (permalink / raw)
  To: intel-gfx
  Cc: Cc: Daniel Vetter, Sourab Gupta, Cc: Jesse Barnes, Akash Goel,
	Cc: Chris Wilson

From: Sourab Gupta <sourab.gupta@intel.com>

On VLV, 64MB of system memory was being reserved for stolen
area, but ~8MB of it was being utilized.
For the buffer objects which are not cpu mappable, we can allocate
the space from stolen memory, thus increasing the utilization of stolen
memory area.

Testcase: igt/gem_stolen_mem

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h        |    1 +
 drivers/gpu/drm/i915/i915_gem.c        |   11 ++++
 drivers/gpu/drm/i915/i915_gem_stolen.c |   94 ++++++++++++++++++++++++++++++++
 3 files changed, 106 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8a066a7..b5f603f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2441,6 +2441,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 stolen_offset,
 					       u32 gtt_offset,
 					       u32 size);
+void i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
 
 /* i915_gem_tiling.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8421b80..f57ca31 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3854,6 +3854,17 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		}
 	}
 
+	/* Try to allocate the physical space for the GEM object,
+	 * for which the CPU map is not required, from the stolen area.
+	 * But if there is no sufficient free space left in stolen
+	 * area, will fallback to shmem.
+	 */
+	if (obj->cpu_map_not_needed == 1) {
+		if (obj->pages == NULL) {
+			i915_gem_object_move_to_stolen(obj);
+		}
+	}
+
 	if (vma == NULL || !drm_mm_node_allocated(&vma->node)) {
 		vma = i915_gem_object_bind_to_vm(obj, vm, alignment, flags);
 		if (IS_ERR(vma))
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index d58b4e2..6758ba4 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -29,6 +29,7 @@
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
+#include <linux/shmem_fs.h>
 
 /*
  * The BIOS typically reserves some of the system's memory for the exclusive
@@ -447,6 +448,99 @@ err_out:
 }
 
 void
+i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj)
+{
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_mm_node *stolen;
+	u32 size = obj->base.size;
+	int ret = 0;
+
+	if (!IS_VALLEYVIEW(dev)) {
+		return;
+	}
+
+	if (obj->stolen) {
+		BUG_ON(obj->pages == NULL);
+		return;
+	}
+
+	if (!drm_mm_initialized(&dev_priv->mm.stolen))
+		return;
+
+	if (size == 0)
+		return;
+
+	/* Check if already shmem space has been allocated for the object
+	 * or not. We cannot rely upon on the value of 'pages' field for this.
+	 * As even though if the 'pages' field is NULL, it does not actually
+	 * indicate that the backing physical space (shmem) is currently not
+	 * reserved for the object, as the object may not get purged/truncated
+	 * on the calll to 'put_pages_gtt'.
+	 */
+	if (obj->base.filp) {
+		struct inode *inode = file_inode(obj->base.filp);
+		struct shmem_inode_info *info = SHMEM_I(inode);
+		if (!inode)
+			return;
+		spin_lock(&info->lock);
+		/* The alloced field stores how many data pages are
+		 * allocated to the file.
+		 */
+		ret = info->alloced;
+		spin_unlock(&info->lock);
+		if (ret > 0) {
+			DRM_DEBUG_DRIVER(
+				"Already shmem space alloced, %d pges\n", ret);
+			return;
+		}
+	}
+
+	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
+	if (!stolen)
+		return;
+
+	ret = drm_mm_insert_node(&dev_priv->mm.stolen, stolen, size,
+				 4096, DRM_MM_SEARCH_DEFAULT);
+	if (ret) {
+		kfree(stolen);
+		DRM_DEBUG_DRIVER("ran out of stolen space\n");
+		return;
+	}
+
+	/* Set up the object to use the stolen memory,
+	 * backing store no longer managed by shmem layer */
+	drm_gem_object_release(&(obj->base));
+	obj->base.filp = NULL;
+	obj->ops = &i915_gem_object_stolen_ops;
+
+	obj->pages = i915_pages_create_for_stolen(dev,
+						stolen->start, stolen->size);
+	if (obj->pages == NULL)
+		goto cleanup;
+
+	i915_gem_object_pin_pages(obj);
+	list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list);
+	obj->has_dma_mapping = true;
+	obj->stolen = stolen;
+
+	DRM_DEBUG_DRIVER("Obj moved to stolen, ptr = %p, size = %x\n",
+			 obj, size);
+
+	obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
+	obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
+
+	/* No zeroing-out of buffers allocated from stolen area */
+	return;
+
+cleanup:
+	drm_mm_remove_node(stolen);
+	kfree(stolen);
+	return;
+}
+
+
+void
 i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
 {
 	if (obj->stolen) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-05 11:35 [RFC 0/3] Increase the utilization of Stolen area on VLV sourab.gupta
  2014-03-05 11:35 ` [RFC 1/3] drm/i915: Added a new 'I915_CPU_MAP_NOT_NEEDED' flag to gem_create ioctl sourab.gupta
  2014-03-05 11:35 ` [RFC 2/3] i915/drm: Increase the utilization of stolen memory on VLV sourab.gupta
@ 2014-03-05 11:35 ` sourab.gupta
  2014-03-05 11:49   ` Chris Wilson
  2 siblings, 1 reply; 13+ messages in thread
From: sourab.gupta @ 2014-03-05 11:35 UTC (permalink / raw)
  To: intel-gfx
  Cc: Cc: Daniel Vetter, Sourab Gupta, Cc: Jesse Barnes, Akash Goel,
	Cc: Chris Wilson

From: Sourab Gupta <sourab.gupta@intel.com>

Adding the truncation logic for buffer objects with backing storage
from stolen memory. The objects will be truncated when user marks them
as purgeable and they are part of the inactive list.

Testcase: igt/gem_stolen_mem

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h        |    1 +
 drivers/gpu/drm/i915/i915_gem.c        |   11 ++++++++++-
 drivers/gpu/drm/i915/i915_gem_stolen.c |   32 ++++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b5f603f..02142c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2443,6 +2443,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 size);
 void i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
+void i915_gem_object_truncate_stolen(struct drm_i915_gem_object *obj);
 
 /* i915_gem_tiling.c */
 static inline bool i915_gem_object_needs_bit17_swizzle(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f57ca31..d106806 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2071,6 +2071,11 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 	obj->fenced_gpu_access = false;
 
 	obj->active = 0;
+
+	/* Truncate the stolen obj immediately if it is marked as purgeable */
+	if(obj->stolen && (i915_gem_object_is_purgeable(obj) ) )
+		i915_gem_object_truncate_stolen(obj);
+
 	drm_gem_object_unreference(&obj->base);
 
 	WARN_ON(i915_verify_lists(dev));
@@ -4069,6 +4074,10 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	if (i915_gem_object_is_purgeable(obj) && obj->pages == NULL)
 		i915_gem_object_truncate(obj);
 
+	/* if the stolen mem object is no longer active, discard its backing storage */
+	if (obj->stolen && i915_gem_object_is_purgeable(obj) && i915_gem_object_is_inactive(obj))
+		i915_gem_object_truncate_stolen(obj);
+
 	args->retained = obj->madv != __I915_MADV_PURGED;
 
 out:
@@ -4187,7 +4196,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 
 	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
 	 * before progressing. */
-	if (obj->stolen)
+	if ((obj->stolen) && (obj->madv != __I915_MADV_PURGED))
 		i915_gem_object_unpin_pages(obj);
 
 	if (WARN_ON(obj->pages_pin_count))
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 6758ba4..1ecc447 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -471,6 +471,11 @@ i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj)
 	if (size == 0)
 		return;
 
+	if (obj->madv != I915_MADV_WILLNEED) {
+		DRM_ERROR("Attempting to allocate a purgeable object\n");
+		return;
+	}
+
 	/* Check if already shmem space has been allocated for the object
 	 * or not. We cannot rely upon on the value of 'pages' field for this.
 	 * As even though if the 'pages' field is NULL, it does not actually
@@ -549,3 +554,30 @@ i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
 		obj->stolen = NULL;
 	}
 }
+
+/*
+ * This function truncates stolen objects to reclaim their backing storage
+ * from stolen mem area. Currently objects are truncated when they are
+ * marked as purgeable and are in inactive list.
+ * Stolen objects are not considered for truncation by shrinker, as their
+ * physical pages are kept as pinned throughout their existance.
+ */
+void
+i915_gem_object_truncate_stolen(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma_temp,*next;
+
+	BUG_ON(!obj->stolen);
+
+	list_for_each_entry_safe(vma_temp, next, &obj->vma_list, vma_link)
+		WARN_ON(i915_vma_unbind(vma_temp));
+
+	i915_gem_object_unpin_pages(obj);
+	i915_gem_object_put_pages(obj);
+
+	/* Release the stolen space */
+	i915_gem_object_release_stolen(obj);
+
+	obj->madv = __I915_MADV_PURGED;
+
+}
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-05 11:35 ` [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects sourab.gupta
@ 2014-03-05 11:49   ` Chris Wilson
  2014-03-05 12:26     ` Gupta, Sourab
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2014-03-05 11:49 UTC (permalink / raw)
  To: sourab.gupta
  Cc: Cc: Daniel Vetter, intel-gfx, Cc: Jesse Barnes, Akash Goel,
	Cc: Chris Wilson

On Wed, Mar 05, 2014 at 05:05:40PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Adding the truncation logic for buffer objects with backing storage
> from stolen memory. The objects will be truncated when user marks them
> as purgeable and they are part of the inactive list.
> 
> Testcase: igt/gem_stolen_mem

This code still has the same obvious bugs.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-05 11:49   ` Chris Wilson
@ 2014-03-05 12:26     ` Gupta, Sourab
  2014-03-05 12:45       ` Chris Wilson
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Sourab @ 2014-03-05 12:26 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Vetter, Daniel, intel-gfx@lists.freedesktop.org, Barnes, Jesse,
	Goel, Akash, Wilson, Chris

Hi Chris,

Can you please through some light on the bug here, so that we're able to handle it.
Is it about the handling required for the pread/pwrite ioctls, or is it about the format of mentioning the testcase?

Thanks,
Sourab

-----Original Message-----
From: Chris Wilson [mailto:chris@chris-wilson.co.uk] 
Sent: Wednesday, March 05, 2014 5:19 PM
To: Gupta, Sourab
Cc: intel-gfx@lists.freedesktop.org; Vetter, Daniel; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On Wed, Mar 05, 2014 at 05:05:40PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Adding the truncation logic for buffer objects with backing storage 
> from stolen memory. The objects will be truncated when user marks them 
> as purgeable and they are part of the inactive list.
> 
> Testcase: igt/gem_stolen_mem

This code still has the same obvious bugs.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-05 12:26     ` Gupta, Sourab
@ 2014-03-05 12:45       ` Chris Wilson
  2014-03-05 19:24         ` Gupta, Sourab
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2014-03-05 12:45 UTC (permalink / raw)
  To: Gupta, Sourab
  Cc: Vetter, Daniel, intel-gfx@lists.freedesktop.org, Barnes, Jesse,
	Goel, Akash, Wilson, Chris

On Wed, Mar 05, 2014 at 12:26:26PM +0000, Gupta, Sourab wrote:
> Hi Chris,
> 
> Can you please through some light on the bug here, so that we're able to handle it.
> Is it about the handling required for the pread/pwrite ioctls, or is it about the format of mentioning the testcase?

You truncate the stolen region whilst it is still in use and then hand
it out ot a new object. Rinse and repeat.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-05 12:45       ` Chris Wilson
@ 2014-03-05 19:24         ` Gupta, Sourab
  2014-03-05 19:47           ` Daniel Vetter
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Sourab @ 2014-03-05 19:24 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Vetter, Daniel, intel-gfx@lists.freedesktop.org, Barnes, Jesse,
	Goel, Akash, Wilson, Chris

Hi Chris,

We have assumed the following lifecycle of the (stolen)object, w.r.t the truncation usecase:

1) The user creates the (non-cpu mappable)object --> the gem object is created. Shmem filep is allocated. No backing storage present
2) GPU operations performed (after mmap_gtt ) --> object is moved to stolen memory and the backing pages are allocated from stolen mem area.
3) Object is marked as purgeable (by madvise_ioctl)-->  here two scenarios can occur:
	a) Object has not yet been processed by GPU. In this case it will be in the active list. So, madvise_ioctl will mark it as purgeable and return. 
	     Subsequently, when the retire handler runs, and object is being moved to inactive list, then, the purgeable object is truncated and marked as purged.
	b) Object is already processed by the GPU. In this case, it will be in the inactive list. So, madvise_ioctl will do the object truncation and mark it as purged.
4) If object is not marked as purgeable, the backing pages of the object will remain.
5) When object is freed, backing pages are released, if the object has not been purged already.

Is the object being in the inactive list, the necessary and sufficient condition of it not being in use?
We may be missing out on some condition regarding when the object will be in use.
Can you please point us to the flaws in above assumption regarding when we can truncate the object.

Thanks and Regards,
Sourab


-----Original Message-----
From: Chris Wilson [mailto:chris@chris-wilson.co.uk] 
Sent: Wednesday, March 05, 2014 6:15 PM
To: Gupta, Sourab
Cc: intel-gfx@lists.freedesktop.org; Vetter, Daniel; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On Wed, Mar 05, 2014 at 12:26:26PM +0000, Gupta, Sourab wrote:
> Hi Chris,
> 
> Can you please through some light on the bug here, so that we're able to handle it.
> Is it about the handling required for the pread/pwrite ioctls, or is it about the format of mentioning the testcase?

You truncate the stolen region whilst it is still in use and then hand it out ot a new object. Rinse and repeat.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-05 19:24         ` Gupta, Sourab
@ 2014-03-05 19:47           ` Daniel Vetter
  2014-03-06  9:39             ` Gupta, Sourab
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Vetter @ 2014-03-05 19:47 UTC (permalink / raw)
  To: Gupta, Sourab, Chris Wilson
  Cc: intel-gfx@lists.freedesktop.org, Barnes, Jesse, Goel, Akash,
	Wilson, Chris

On 05/03/2014 20:24, Gupta, Sourab wrote:
> We have assumed the following lifecycle of the (stolen)object, w.r.t the truncation usecase:
>
> 1) The user creates the (non-cpu mappable)object --> the gem object is created. Shmem filep is allocated. No backing storage present
> 2) GPU operations performed (after mmap_gtt ) --> object is moved to stolen memory and the backing pages are allocated from stolen mem area.
> 3) Object is marked as purgeable (by madvise_ioctl)-->  here two scenarios can occur:
> 	a) Object has not yet been processed by GPU. In this case it will be in the active list. So, madvise_ioctl will mark it as purgeable and return.
> 	     Subsequently, when the retire handler runs, and object is being moved to inactive list, then, the purgeable object is truncated and marked as purged.
> 	b) Object is already processed by the GPU. In this case, it will be in the inactive list. So, madvise_ioctl will do the object truncation and mark it as purged.
> 4) If object is not marked as purgeable, the backing pages of the object will remain.
> 5) When object is freed, backing pages are released, if the object has not been purged already.
>
> Is the object being in the inactive list, the necessary and sufficient condition of it not being in use?
> We may be missing out on some condition regarding when the object will be in use.
> Can you please point us to the flaws in above assumption regarding when we can truncate the object.
Immediately purging the backing storage is not the idea of the madvise 
ioctl. The idea is to purge it as late as possible, i.e. when we've 
tried to allocate more stolen space but failed. For comparison see how 
purgeable works for normal objects: Those only get purged in the 
shrinker, i.e. when the linux mm is short on memory.
-Daniel
Intel Semiconductor AG
Registered No. 020.30.913.786-7
Registered Office: World Trade Center, Leutschenbachstrasse 95, 8050 Zurich, Switzerland

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-05 19:47           ` Daniel Vetter
@ 2014-03-06  9:39             ` Gupta, Sourab
  2014-03-06  9:48               ` Daniel Vetter
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Sourab @ 2014-03-06  9:39 UTC (permalink / raw)
  To: Vetter, Daniel, Chris Wilson
  Cc: intel-gfx@lists.freedesktop.org, Barnes, Jesse, Goel, Akash,
	Wilson, Chris


Hi Daniel,

We had also considered the two points where the stolen mem objs can be truncated -
	a) at the point of a new bo allocation, if stolen mem space is exhausted. 
	b) at the point when the bo is marked as purgeable (madvise ioctl)

We decided upon the second case because:
1) For normal objects, there is a bo cacheing at libdrm level. So even after a bo is marked as purgeable, 
it can still get reused by libdrm (if not truncated already), thereby reducing the effort in new object creation. 
Currently for stolen objects we have not enabled this cacheing. So once a stolen bo is marked as purgeable by User, 
Driver can immediately reclaim the stolen space and there will be no obvious benefit in deferring the object truncation to 
a later stage (when there is a failure in allocation of a new bo). 
Actually we thought that since Stolen mem is a relatively much smaller space and has a limited usage,  
there may not be any additional benefit gained by enabling bo cacheing on libdrm side. 
Also the effort in getting the backing space from stolen mem shall be much less than from shmem. 

With the first approach, it would have required us to scan the entire list of bound & unbound objects to search for the 
purgeable stolen objects and purge them. With the 2nd approach, we could avoid this & purge the objects we already have
in our hand. 

2) Also, there is no coupling between the linux shrinker & the stolen memory. Shrinker invocation will depend upon the  
system memory usage. So, we cannot rely on shrinker to timely trigger the purging of objects from stolen mem.

We understand that we are diverging from the basic nature of madvise ioctl here (which is to only mark objs
as purgeable and so eligible for truncation if needed), but probably because of the above way of using the stolen memory, 
we thought madvise could also be one of the good checkpoint where we can truncate the stolen bos.

What are your views regarding above.
We will try to revisit our design if its digressing too much from the basic tenets regarding the madvise ioctl 

Thanks,
Sourab


-----Original Message-----
From: Vetter, Daniel 
Sent: Thursday, March 06, 2014 1:18 AM
To: Gupta, Sourab; Chris Wilson
Cc: intel-gfx@lists.freedesktop.org; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On 05/03/2014 20:24, Gupta, Sourab wrote:
> We have assumed the following lifecycle of the (stolen)object, w.r.t the truncation usecase:
>
> 1) The user creates the (non-cpu mappable)object --> the gem object is 
> created. Shmem filep is allocated. No backing storage present
> 2) GPU operations performed (after mmap_gtt ) --> object is moved to stolen memory and the backing pages are allocated from stolen mem area.
> 3) Object is marked as purgeable (by madvise_ioctl)-->  here two scenarios can occur:
> 	a) Object has not yet been processed by GPU. In this case it will be in the active list. So, madvise_ioctl will mark it as purgeable and return.
> 	     Subsequently, when the retire handler runs, and object is being moved to inactive list, then, the purgeable object is truncated and marked as purged.
> 	b) Object is already processed by the GPU. In this case, it will be in the inactive list. So, madvise_ioctl will do the object truncation and mark it as purged.
> 4) If object is not marked as purgeable, the backing pages of the object will remain.
> 5) When object is freed, backing pages are released, if the object has not been purged already.
>
> Is the object being in the inactive list, the necessary and sufficient condition of it not being in use?
> We may be missing out on some condition regarding when the object will be in use.
> Can you please point us to the flaws in above assumption regarding when we can truncate the object.
Immediately purging the backing storage is not the idea of the madvise ioctl. The idea is to purge it as late as possible, i.e. when we've tried to allocate more stolen space but failed. For comparison see how purgeable works for normal objects: Those only get purged in the shrinker, i.e. when the linux mm is short on memory.
-Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-06  9:39             ` Gupta, Sourab
@ 2014-03-06  9:48               ` Daniel Vetter
  2014-03-06 10:48                 ` Gupta, Sourab
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Vetter @ 2014-03-06  9:48 UTC (permalink / raw)
  To: Gupta, Sourab, Chris Wilson
  Cc: intel-gfx@lists.freedesktop.org, Barnes, Jesse, Goel, Akash,
	Wilson, Chris

On 06/03/2014 10:39, Gupta, Sourab wrote:
> Hi Daniel,
>
> We had also considered the two points where the stolen mem objs can be truncated -
> 	a) at the point of a new bo allocation, if stolen mem space is exhausted.
> 	b) at the point when the bo is marked as purgeable (madvise ioctl)
>
> We decided upon the second case because:
> 1) For normal objects, there is a bo cacheing at libdrm level. So even after a bo is marked as purgeable,
> it can still get reused by libdrm (if not truncated already), thereby reducing the effort in new object creation.
> Currently for stolen objects we have not enabled this cacheing. So once a stolen bo is marked as purgeable by User,
> Driver can immediately reclaim the stolen space and there will be no obvious benefit in deferring the object truncation to
> a later stage (when there is a failure in allocation of a new bo).
> Actually we thought that since Stolen mem is a relatively much smaller space and has a limited usage,
> there may not be any additional benefit gained by enabling bo cacheing on libdrm side.
> Also the effort in getting the backing space from stolen mem shall be much less than from shmem.
>
> With the first approach, it would have required us to scan the entire list of bound & unbound objects to search for the
> purgeable stolen objects and purge them. With the 2nd approach, we could avoid this & purge the objects we already have
> in our hand.
Well, approach b) breaks the libdrm cache, since libdrm sets the madvise 
flag to purgeable when insert an unused object into it's cache. So you 
really can't truncate the object right away, since that completely 
defeats the point of the cache.
-Daniel
Intel Semiconductor AG
Registered No. 020.30.913.786-7
Registered Office: World Trade Center, Leutschenbachstrasse 95, 8050 Zurich, Switzerland

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-06  9:48               ` Daniel Vetter
@ 2014-03-06 10:48                 ` Gupta, Sourab
  2014-03-06 11:17                   ` Daniel Vetter
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Sourab @ 2014-03-06 10:48 UTC (permalink / raw)
  To: Vetter, Daniel, Chris Wilson
  Cc: intel-gfx@lists.freedesktop.org, Barnes, Jesse, Goel, Akash,
	Wilson, Chris

Hi Daniel,

For stolen objects, we have not enabled the libdrm cacheing, so that when the user creates bo with this flag, the objects created are not cached.
Then the scenario of breaking the libdrm cacheing will not arise in this case.
This is ensured in the libdrm patches uploaded together with these patches. Can you please have a look at those patches.

Also, do we need to enable cacheing for stolen objects at libdrm ( therefore deferring the stolen object truncation to later) ?

Regards,
Sourab

-----Original Message-----
From: Vetter, Daniel 
Sent: Thursday, March 06, 2014 3:19 PM
To: Gupta, Sourab; Chris Wilson
Cc: intel-gfx@lists.freedesktop.org; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On 06/03/2014 10:39, Gupta, Sourab wrote:
> Hi Daniel,
>
> We had also considered the two points where the stolen mem objs can be truncated -
> 	a) at the point of a new bo allocation, if stolen mem space is exhausted.
> 	b) at the point when the bo is marked as purgeable (madvise ioctl)
>
> We decided upon the second case because:
> 1) For normal objects, there is a bo cacheing at libdrm level. So even 
> after a bo is marked as purgeable, it can still get reused by libdrm (if not truncated already), thereby reducing the effort in new object creation.
> Currently for stolen objects we have not enabled this cacheing. So 
> once a stolen bo is marked as purgeable by User, Driver can 
> immediately reclaim the stolen space and there will be no obvious benefit in deferring the object truncation to a later stage (when there is a failure in allocation of a new bo).
> Actually we thought that since Stolen mem is a relatively much smaller 
> space and has a limited usage, there may not be any additional benefit gained by enabling bo cacheing on libdrm side.
> Also the effort in getting the backing space from stolen mem shall be much less than from shmem.
>
> With the first approach, it would have required us to scan the entire 
> list of bound & unbound objects to search for the purgeable stolen 
> objects and purge them. With the 2nd approach, we could avoid this & purge the objects we already have in our hand.
Well, approach b) breaks the libdrm cache, since libdrm sets the madvise flag to purgeable when insert an unused object into it's cache. So you really can't truncate the object right away, since that completely defeats the point of the cache.
-Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.
  2014-03-06 10:48                 ` Gupta, Sourab
@ 2014-03-06 11:17                   ` Daniel Vetter
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2014-03-06 11:17 UTC (permalink / raw)
  To: Gupta, Sourab
  Cc: intel-gfx@lists.freedesktop.org, Goel, Akash, Barnes, Jesse,
	Wilson, Chris, Vetter, Daniel

On Thu, Mar 6, 2014 at 11:48 AM, Gupta, Sourab <sourab.gupta@intel.com> wrote:
> Also, do we need to enable cacheing for stolen objects at libdrm ( therefore deferring the stolen object truncation to later) ?

That was kinda the entire point of making stolen objects purgeable ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-03-06 11:17 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-05 11:35 [RFC 0/3] Increase the utilization of Stolen area on VLV sourab.gupta
2014-03-05 11:35 ` [RFC 1/3] drm/i915: Added a new 'I915_CPU_MAP_NOT_NEEDED' flag to gem_create ioctl sourab.gupta
2014-03-05 11:35 ` [RFC 2/3] i915/drm: Increase the utilization of stolen memory on VLV sourab.gupta
2014-03-05 11:35 ` [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects sourab.gupta
2014-03-05 11:49   ` Chris Wilson
2014-03-05 12:26     ` Gupta, Sourab
2014-03-05 12:45       ` Chris Wilson
2014-03-05 19:24         ` Gupta, Sourab
2014-03-05 19:47           ` Daniel Vetter
2014-03-06  9:39             ` Gupta, Sourab
2014-03-06  9:48               ` Daniel Vetter
2014-03-06 10:48                 ` Gupta, Sourab
2014-03-06 11:17                   ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox