From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: ankitprasad.r.sharma@intel.com, intel-gfx@lists.freedesktop.org
Cc: akash.goel@intel.com, shashidhar.hiremath@intel.com
Subject: Re: [PATCH 1/2] drm/i915: Extend GET_APERTURE ioctl to report available map space
Date: Wed, 08 Jul 2015 14:13:43 +0100 [thread overview]
Message-ID: <559D2207.1060404@linux.intel.com> (raw)
In-Reply-To: <1436338285-23044-2-git-send-email-ankitprasad.r.sharma@intel.com>
Hi,
On 07/08/2015 07:51 AM, ankitprasad.r.sharma@intel.com wrote:
> From: Rodrigo Vivi <rodrigo.vivi at intel.com>
>
> When constructing a batchbuffer, it is sometimes crucial to know the
> largest hole into which we can fit a fenceable buffer (for example when
> handling very large objects on gen2 and gen3). This depends on the
> fragmentation of pinned buffers inside the aperture, a question only the
> kernel can easily answer.
>
> This patch extends the current DRM_I915_GEM_GET_APERTURE ioctl to
> include a couple of new fields in its reply to userspace - the total
> amount of space available in the mappable region of the aperture and
> also the single largest block available.
Since whatever this returns is a transient number is this really that
useful? There are no guarantees that by the time caller tries to act on
it it will still be valid.
> This is not quite what userspace wants to answer the question of whether
> this batch will fit as fences are also required to meet severe alignment
> constraints within the batch. For this purpose, a third conservative
> estimate of largest fence available is also provided. For when userspace
> needs more than one batch, we also provide the culmulative space
> available for fences such that it has some additional guidance to how
> much space it could allocate to fences. Conservatism still wins.
>
> The patch also adds a debugfs file for convenient testing and reporting.
>
> v2: The first object cannot end at offset 0, so we can use last==0 to
> detect the empty list.
>
> v3: Expand all values to 64bit, just in case.
> Report total mappable aperture size for userspace that cannot easily
> determine it by inspecting the PCI device.
>
> v4: (Rodrigo) Fixed rebase conflicts.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 27 +++++++++
> drivers/gpu/drm/i915/i915_gem.c | 116 ++++++++++++++++++++++++++++++++++--
> include/uapi/drm/i915_drm.h | 25 ++++++++
> 3 files changed, 164 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 31d8768..49ec438 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -512,6 +512,32 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
> return 0;
> }
>
> +static int i915_gem_aperture_info(struct seq_file *m, void *data)
> +{
> + struct drm_info_node *node = m->private;
> + struct drm_i915_gem_get_aperture arg;
> + int ret;
> +
> + ret = i915_gem_get_aperture_ioctl(node->minor->dev, &arg, NULL);
> + if (ret)
> + return ret;
> +
> + seq_printf(m, "Total size of the GTT: %llu bytes\n",
> + arg.aper_size);
> + seq_printf(m, "Available space in the GTT: %llu bytes\n",
> + arg.aper_available_size);
> + seq_printf(m, "Available space in the mappable aperture: %llu bytes\n",
> + arg.map_available_size);
> + seq_printf(m, "Single largest space in the mappable aperture: %llu bytes\n",
> + arg.map_largest_size);
> + seq_printf(m, "Available space for fences: %llu bytes\n",
> + arg.fence_available_size);
> + seq_printf(m, "Single largest fence available: %llu bytes\n",
> + arg.fence_largest_size);
> +
> + return 0;
> +}
> +
> static int i915_gem_gtt_info(struct seq_file *m, void *data)
> {
> struct drm_info_node *node = m->private;
> @@ -5030,6 +5056,7 @@ static int i915_debugfs_create(struct dentry *root,
> static const struct drm_info_list i915_debugfs_list[] = {
> {"i915_capabilities", i915_capabilities, 0},
> {"i915_gem_objects", i915_gem_object_info, 0},
> + {"i915_gem_aperture", i915_gem_aperture_info, 0},
> {"i915_gem_gtt", i915_gem_gtt_info, 0},
> {"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
> {"i915_gem_active", i915_gem_object_list_info, 0, (void *) ACTIVE_LIST},
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a2a4a27..ccfc8d3 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -32,6 +32,7 @@
> #include "i915_vgpu.h"
> #include "i915_trace.h"
> #include "intel_drv.h"
> +#include <linux/list_sort.h>
> #include <linux/shmem_fs.h>
> #include <linux/slab.h>
> #include <linux/swap.h>
> @@ -143,6 +144,55 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> return 0;
> }
>
> +static inline bool
> +i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> +{
> + return i915_gem_obj_bound_any(obj) && !obj->active;
> +}
> +
> +static int obj_rank_by_ggtt(void *priv,
> + struct list_head *A,
> + struct list_head *B)
> +{
> + struct drm_i915_gem_object *a = list_entry(A,typeof(*a), obj_exec_link);
> + struct drm_i915_gem_object *b = list_entry(B,typeof(*b), obj_exec_link);
Nitpick - space after comma.
> +
> + return i915_gem_obj_ggtt_offset(a) - i915_gem_obj_ggtt_offset(b);
> +}
> +
> +static u32 __fence_size(struct drm_i915_private *dev_priv, u32 start, u32 end)
> +{
> + u32 size = end - start;
> + u32 fence_size;
> +
> + if (INTEL_INFO(dev_priv)->gen < 4) {
> + u32 fence_max;
> + u32 fence_next;
> +
> + if (IS_GEN3(dev_priv)) {
> + fence_max = I830_FENCE_MAX_SIZE_VAL << 20;
> + fence_next = 1024*1024;
> + } else {
> + fence_max = I830_FENCE_MAX_SIZE_VAL << 19;
> + fence_next = 512*1024;
> + }
> +
> + fence_max = min(fence_max, size);
> + fence_size = 0;
> + while (fence_next <= fence_max) {
> + u32 base = ALIGN(start, fence_next);
> + if (base + fence_next > end)
> + break;
> +
> + fence_size = fence_next;
> + fence_next <<= 1;
> + }
I don't know a lot about fence alignment and size restrictions but since
this loop is relatively complex could use a comment on what it is doing.
> + } else
> + fence_size = size;
I think coding style is for all if/else branches to have braces if one
has - feel free to correct me if I am wrong.
> +
> + return fence_size;
> +}
> +
> int
> i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file)
> @@ -150,17 +200,75 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_i915_gem_get_aperture *args = data;
> struct drm_i915_gem_object *obj;
> - size_t pinned;
> + struct list_head map_list;
> + const u32 map_limit = dev_priv->gtt.mappable_end;
mappable_end is u64 so gcc should complain here.
> + size_t pinned, map_space, map_largest, fence_space, fence_largest;
> + u32 last, size;
And since the ongoing push to standardize all GTT size variables to u64
I think you'l have to change the above as well.
> +
> + INIT_LIST_HEAD(&map_list);
>
> pinned = 0;
> + map_space = map_largest = 0;
> + fence_space = fence_largest = 0;
> +
> mutex_lock(&dev->struct_mutex);
> - list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> - if (i915_gem_obj_is_pinned(obj))
> - pinned += i915_gem_obj_ggtt_size(obj);
Unfortunately you will have to rebase this again since someone managed
to change it in the meantime. Ooops it was me - sorry! :)
> + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
The issue with this is it ignores rotated and partial VMA types so will
give wrong numbers. Rebasing should take care of it.
> + if (vma == NULL || !vma->pin_count)
> + continue;
> +
> + pinned += vma->node.size;
> +
> + if (vma->node.start < map_limit)
> + list_add(&obj->obj_exec_link, &map_list);
> + }
> +
> + last = 0;
> + list_sort(NULL, &map_list, obj_rank_by_ggtt);
I suppose the problems really start with the compare function here.
Since you store objects in the list and not VMAs, compare can't know
which are the correct VMAs to use.
Are there any free list_heads in the VMA you could use? vma->exec_list
looks like it could be used. (Please verify :)
That would make map_list store VMAs and compare function can be change
to use vma->node.start then (and work with normal/rotated/partial views).
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-07-08 13:14 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-08 6:51 [PATCH v3 0/2] Extending GET_APERTURE ioctl ankitprasad.r.sharma
2015-07-08 6:51 ` [PATCH 1/2] drm/i915: Extend GET_APERTURE ioctl to report available map space ankitprasad.r.sharma
2015-07-08 13:13 ` Tvrtko Ursulin [this message]
2015-07-08 13:28 ` Chris Wilson
2015-07-08 13:36 ` Tvrtko Ursulin
2015-07-08 13:53 ` Chris Wilson
2015-07-08 14:24 ` Tvrtko Ursulin
2015-07-08 14:32 ` Chris Wilson
2015-07-08 13:38 ` Chris Wilson
2015-07-08 13:29 ` Tvrtko Ursulin
2015-07-08 6:51 ` [PATCH 2/2] drm/i915: Extend GET_APERTURE ioctl to report size of the stolen region ankitprasad.r.sharma
2015-07-08 13:33 ` Tvrtko Ursulin
-- strict thread matches above, loose matches on Subject: below --
2015-07-01 9:25 [PATCH v2 0/2] Extending GET_APERTURE ioctl ankitprasad.r.sharma
2015-07-01 9:25 ` [PATCH 1/2] drm/i915: Extend GET_APERTURE ioctl to report available map space ankitprasad.r.sharma
2015-07-01 13:39 ` Daniel Vetter
2015-07-01 16:34 ` Ankitprasad Sharma
2015-05-13 12:07 [PATCH 0/2] Extending GET_APERTURE ioctl ankitprasad.r.sharma
2015-05-13 12:07 ` [PATCH 1/2] drm/i915: Extend GET_APERTURE ioctl to report available map space ankitprasad.r.sharma
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=559D2207.1060404@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=akash.goel@intel.com \
--cc=ankitprasad.r.sharma@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
--cc=shashidhar.hiremath@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox