From: Daniel Vetter <daniel@ffwll.ch>
To: akash.goel@intel.com
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 5/7] drm/i915/vlv: Increase the utilization of stolen memory on VLV.
Date: Thu, 9 Jan 2014 08:27:22 +0100 [thread overview]
Message-ID: <20140109072722.GY4770@phenom.ffwll.local> (raw)
In-Reply-To: <1389245462-11083-1-git-send-email-akash.goel@intel.com>
On Thu, Jan 09, 2014 at 11:01:02AM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> On VLV, 64MB of system memory was being reserved for stolen
> area, but ~8MB of it was being utilized.
> Increased the utilization of Stolen area by allocating
> User created Frame buffers(only X Tiled) from it.
> User Frame buffers are suitable for allocation from stolen area,
> as its very unlikely that they are not accessed from CPU side.
> And its even more unlikely that the Tiled(X) buffers will be
> accessed directly from the CPU side. And any allocation
> from stolen area is not directly CPU accessible, but
> accessible only through the aperture space.
> With 1080p sized frame buffers (32 bpp), the allocation of 6-7
> frame buffers itself gives almost the full utilization of
> stolen area.
>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 1 +
> drivers/gpu/drm/i915/i915_gem.c | 21 ++++++++
> drivers/gpu/drm/i915/i915_gem_stolen.c | 93 ++++++++++++++++++++++++++++++++++
> 3 files changed, 115 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1bcc543..db29537 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2376,6 +2376,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> u32 stolen_offset,
> u32 gtt_offset,
> u32 size);
> +void i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj);
> void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
>
> /* i915_gem_tiling.c */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 656406d..24f93ef 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3915,6 +3915,27 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> }
> }
>
> + /* Try to allocate the physical space for the GEM object,
> + * representing the User frame buffer, from the stolen area.
> + * But if there is no sufficient free space left in stolen
> + * area, will fallback to shmem.
> + */
> + if (obj->user_fb == 1) {
> + if (obj->pages == NULL) {
> + if (obj->tiling_mode == I915_TILING_X) {
> + /* Tiled(X) Scanout buffers are more suitable
> + * for allocation from stolen area, as its very
> + * unlikely that they will be accessed directly
> + * from the CPU side and any allocation from
> + * stolen area is not directly CPU accessible,
> + * but accessible only through the aperture
> + * space.
> + */
> + i915_gem_object_move_to_stolen(obj);
> + }
> + }
> + }
Neat hack ;-) But I think for upstream we need to address a few more
concerns. Random comments:
- The vlv patch subject is a bit misleading here since this is about
stolen memory usage in general across all platforms.
- object_pin is the wrong place to change the backing storage since
someone else could already have pinned the it (e.g. through dma-buf). We
need to do this earlier before calling down into ->get_pages.
- If we allow opportunistical placement of objects into stolen memory I
think we should also fully support purgeable objects so that we can
dynamically scale to workloads. Or at least to be able to kick out stuff
in case the kernel needs the contiguous memory for something more
important. So if we couldn't find a suitable stolen block we'd scan
through all objects with stolen memory backing and check whether any
purgeable objects could be kicked out.
- For your usecase, have you tried to simply reduce the stolen area as
much as possible? Our friendly competition on ARM SoCs seems to have
mostly moved away from gfx reserved chunks towards more flexible
approaches like CMA. Giving stolen back to the linux page allocator
should be possible, but I haven't really looked into that yet all that
much ...
- For upstream I think changing the personality of buffer objects behind
userspace's back is a bit too frisky wrt breaking something. I prefer if
userspace opts-in explicitly by passing a flag to the create ioctl
stating that stolen memory is allowed/preferred for this allocation.
Chris Wilson posted an rfc patch a while back to add a create2 ioctl
(which at the same time also allows us to set the caching, tiling and
any other object parameters).
- We've had some "fun" last time around we've tried to use stolen more
seriously with scanout buffers being strangely offset/corrupted.
Hopefully this is fixed by your sg->offset patch, but I can't find the
reports nor recall the details right now.
Cheers, Daniel
> +
> if (!i915_gem_obj_bound(obj, vm)) {
> ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
> map_and_fenceable,
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 5cf97d6..29c22f9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -29,6 +29,7 @@
> #include <drm/drmP.h>
> #include <drm/i915_drm.h>
> #include "i915_drv.h"
> +#include <linux/shmem_fs.h>
>
> /*
> * The BIOS typically reserves some of the system's memory for the exclusive
> @@ -370,6 +371,98 @@ i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
> return NULL;
> }
>
> +void
> +i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj)
> +{
> + struct drm_device *dev = obj->base.dev;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_mm_node *stolen;
> + u32 size = obj->base.size;
> + int ret = 0;
> +
> + if (!IS_VALLEYVIEW(dev)) {
> + return;
> + }
> +
> + if (obj->stolen) {
> + BUG_ON(obj->pages == NULL);
> + return;
> + }
> +
> + if (!drm_mm_initialized(&dev_priv->mm.stolen))
> + return;
> +
> + if (size == 0)
> + return;
> +
> + /* Check if already shmem space has been allocated for the object
> + * or not. We cannot rely upon on the value of 'pages' field for this.
> + * As even though if the 'pages' field is NULL, it does not actually
> + * indicate that the backing physical space (shmem) is currently not
> + * reserved for the object, as the object may not get purged/truncated
> + * on the calll to 'put_pages_gtt'.
> + */
> + if (obj->base.filp) {
> + struct inode *inode = file_inode(obj->base.filp);
> + struct shmem_inode_info *info = SHMEM_I(inode);
> + if (!inode)
> + return;
> + spin_lock(&info->lock);
> + /* The alloced field stores how many data pages are
> + * allocated to the file.
> + */
> + ret = info->alloced;
> + spin_unlock(&info->lock);
> + if (ret > 0) {
> + DRM_DEBUG_DRIVER(
> + "Already shmem space alloced, %d pges\n", ret);
> + return;
> + }
> + }
> +
> + stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
> + if (!stolen)
> + return;
> +
> + ret = drm_mm_insert_node(&dev_priv->mm.stolen, stolen, size,
> + 4096, DRM_MM_SEARCH_DEFAULT);
> + if (ret) {
> + kfree(stolen);
> + DRM_DEBUG_DRIVER("ran out of stolen space\n");
> + return;
> + }
> +
> + /* Set up the object to use the stolen memory,
> + * backing store no longer managed by shmem layer */
> + drm_gem_object_release(&(obj->base));
> + obj->base.filp = NULL;
> + obj->ops = &i915_gem_object_stolen_ops;
> +
> + obj->pages = i915_pages_create_for_stolen(dev,
> + stolen->start, stolen->size);
> + if (obj->pages == NULL)
> + goto cleanup;
> +
> + i915_gem_object_pin_pages(obj);
> + list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list);
> + obj->has_dma_mapping = true;
> + obj->stolen = stolen;
> +
> + DRM_DEBUG_DRIVER("Obj moved to stolen, ptr = %p, size = %x\n",
> + obj, size);
> +
> + obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
> + obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
> +
> + /* No zeroing-out of buffers allocated from stolen area */
> + return;
> +
> +cleanup:
> + drm_mm_remove_node(stolen);
> + kfree(stolen);
> + return;
> +}
> +
> struct drm_i915_gem_object *
> i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> u32 stolen_offset,
> --
> 1.8.5.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
next prev parent reply other threads:[~2014-01-09 7:26 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-09 5:31 [PATCH 5/7] drm/i915/vlv: Increase the utilization of stolen memory on VLV akash.goel
2014-01-09 7:27 ` Daniel Vetter [this message]
2014-01-09 7:46 ` Daniel Vetter
[not found] ` <8BF5CF93467D8C498F250C96583BC09CC24C49@BGSMSX103.gar.corp.intel.com>
2014-01-09 8:53 ` Daniel Vetter
2014-01-10 8:40 ` Goel, Akash
2014-01-10 9:26 ` Daniel Vetter
2014-01-13 9:17 ` Goel, Akash
2014-01-09 9:48 ` Chris Wilson
2014-01-09 18:09 ` Goel, Akash
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140109072722.GY4770@phenom.ffwll.local \
--to=daniel@ffwll.ch \
--cc=akash.goel@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox