From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 07/22] drm/i915: Pad GTT views of exec objects up to user specified size
Date: Thu, 28 Jul 2016 11:55:21 +0200 [thread overview]
Message-ID: <20160728095521.GC6232@phenom.ffwll.local> (raw)
In-Reply-To: <1469618100-15298-8-git-send-email-chris@chris-wilson.co.uk>
On Wed, Jul 27, 2016 at 12:14:45PM +0100, Chris Wilson wrote:
> Our GPUs impose certain requirements upon buffers that depend upon how
> exactly they are used. Typically this is expressed as that they require
> a larger surface than would be naively computed by pitch * height.
> Normally such requirements are hidden away in the userspace driver, but
> when we accept pointers from strangers and later impose extra conditions
> on them, the original client allocator has no idea about the
> monstrosities in the GPU and we require the userspace driver to inform
> the kernel how many padding pages are required beyond the client
> allocation.
>
> v2: Long time, no see
> v3: Try an anonymous union for uapi struct compatibility
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
This is missing the testcase line. Also short link to the
libva/opencl/whatever patches would be good too.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_drv.h | 6 ++-
> drivers/gpu/drm/i915/i915_gem.c | 79 ++++++++++++++----------------
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +++++-
> include/uapi/drm/i915_drm.h | 8 ++-
> 4 files changed, 62 insertions(+), 47 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 74a31358fd87..1e1369319326 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3032,11 +3032,13 @@ void i915_gem_free_object(struct drm_gem_object *obj);
> int __must_check
> i915_gem_object_pin(struct drm_i915_gem_object *obj,
> struct i915_address_space *vm,
> + u64 size,
> u64 alignment,
> u64 flags);
> int __must_check
> i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
> const struct i915_ggtt_view *view,
> + u64 size,
> u64 alignment,
> u64 flags);
>
> @@ -3313,8 +3315,8 @@ i915_gem_obj_ggtt_pin(struct drm_i915_gem_object *obj,
> struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
> struct i915_ggtt *ggtt = &dev_priv->ggtt;
>
> - return i915_gem_object_pin(obj, &ggtt->base,
> - alignment, flags | PIN_GLOBAL);
> + return i915_gem_object_pin(obj, &ggtt->base, 0, alignment,
> + flags | PIN_GLOBAL);
> }
>
> void i915_gem_object_ggtt_unpin_view(struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c4df44b47cea..2147225e7887 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1689,7 +1689,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> }
>
> /* Now pin it into the GTT if needed */
> - ret = i915_gem_object_ggtt_pin(obj, &view, 0, PIN_MAPPABLE);
> + ret = i915_gem_object_ggtt_pin(obj, &view, 0, 0, PIN_MAPPABLE);
> if (ret)
> goto unlock;
>
> @@ -2969,21 +2969,20 @@ static struct i915_vma *
> i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> struct i915_address_space *vm,
> const struct i915_ggtt_view *ggtt_view,
> + u64 size,
> u64 alignment,
> u64 flags)
> {
> struct drm_device *dev = obj->base.dev;
> struct drm_i915_private *dev_priv = to_i915(dev);
> - struct i915_ggtt *ggtt = &dev_priv->ggtt;
> - u32 fence_alignment, unfenced_alignment;
> - u32 search_flag, alloc_flag;
> u64 start, end;
> - u64 size, fence_size;
> + u32 search_flag, alloc_flag;
> struct i915_vma *vma;
> int ret;
>
> if (i915_is_ggtt(vm)) {
> - u32 view_size;
> + u32 fence_size, fence_alignment, unfenced_alignment;
> + u64 view_size;
>
> if (WARN_ON(!ggtt_view))
> return ERR_PTR(-EINVAL);
> @@ -3001,48 +3000,39 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> view_size,
> obj->tiling_mode,
> false);
> - size = flags & PIN_MAPPABLE ? fence_size : view_size;
> + size = max(size, view_size);
> + if (flags & PIN_MAPPABLE)
> + size = max_t(u64, size, fence_size);
> +
> + if (alignment == 0)
> + alignment = flags & PIN_MAPPABLE ? fence_alignment :
> + unfenced_alignment;
> + if (flags & PIN_MAPPABLE && alignment & (fence_alignment - 1)) {
> + DRM_DEBUG("Invalid object (view type=%u) alignment requested %llx\n",
> + ggtt_view ? ggtt_view->type : 0,
> + (long long)alignment);
> + return ERR_PTR(-EINVAL);
> + }
> } else {
> - fence_size = i915_gem_get_gtt_size(dev,
> - obj->base.size,
> - obj->tiling_mode);
> - fence_alignment = i915_gem_get_gtt_alignment(dev,
> - obj->base.size,
> - obj->tiling_mode,
> - true);
> - unfenced_alignment =
> - i915_gem_get_gtt_alignment(dev,
> - obj->base.size,
> - obj->tiling_mode,
> - false);
> - size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
> + size = max_t(u64, size, obj->base.size);
> + alignment = 4096;
> }
>
> start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> end = vm->total;
> if (flags & PIN_MAPPABLE)
> - end = min_t(u64, end, ggtt->mappable_end);
> + end = min_t(u64, end, dev_priv->ggtt.mappable_end);
> if (flags & PIN_ZONE_4G)
> end = min_t(u64, end, (1ULL << 32) - PAGE_SIZE);
>
> - if (alignment == 0)
> - alignment = flags & PIN_MAPPABLE ? fence_alignment :
> - unfenced_alignment;
> - if (flags & PIN_MAPPABLE && alignment & (fence_alignment - 1)) {
> - DRM_DEBUG("Invalid object (view type=%u) alignment requested %llx\n",
> - ggtt_view ? ggtt_view->type : 0,
> - (long long)alignment);
> - return ERR_PTR(-EINVAL);
> - }
> -
> /* If binding the object/GGTT view requires more space than the entire
> * aperture has, reject it early before evicting everything in a vain
> * attempt to find space.
> */
> if (size > end) {
> - DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
> + DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: request=%llu [object=%zd] > %s aperture=%llu\n",
> ggtt_view ? ggtt_view->type : 0,
> - size,
> + size, obj->base.size,
> flags & PIN_MAPPABLE ? "mappable" : "total",
> end);
> return ERR_PTR(-E2BIG);
> @@ -3536,7 +3526,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> * (e.g. libkms for the bootup splash), we have to ensure that we
> * always use map_and_fenceable for all scanout buffers.
> */
> - ret = i915_gem_object_ggtt_pin(obj, view, alignment,
> + ret = i915_gem_object_ggtt_pin(obj, view, 0, alignment,
> view->type == I915_GGTT_VIEW_NORMAL ?
> PIN_MAPPABLE : 0);
> if (ret)
> @@ -3688,12 +3678,14 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> }
>
> static bool
> -i915_vma_misplaced(struct i915_vma *vma, u64 alignment, u64 flags)
> +i915_vma_misplaced(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
> {
> struct drm_i915_gem_object *obj = vma->obj;
>
> - if (alignment &&
> - vma->node.start & (alignment - 1))
> + if (vma->node.size < size)
> + return true;
> +
> + if (alignment && vma->node.start & (alignment - 1))
> return true;
>
> if (flags & PIN_MAPPABLE && !obj->map_and_fenceable)
> @@ -3737,6 +3729,7 @@ static int
> i915_gem_object_do_pin(struct drm_i915_gem_object *obj,
> struct i915_address_space *vm,
> const struct i915_ggtt_view *ggtt_view,
> + u64 size,
> u64 alignment,
> u64 flags)
> {
> @@ -3764,7 +3757,7 @@ i915_gem_object_do_pin(struct drm_i915_gem_object *obj,
> if (WARN_ON(vma->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> return -EBUSY;
>
> - if (i915_vma_misplaced(vma, alignment, flags)) {
> + if (i915_vma_misplaced(vma, size, alignment, flags)) {
> WARN(vma->pin_count,
> "bo is already pinned in %s with incorrect alignment:"
> " offset=%08x %08x, req.alignment=%llx, req.map_and_fenceable=%d,"
> @@ -3785,8 +3778,8 @@ i915_gem_object_do_pin(struct drm_i915_gem_object *obj,
>
> bound = vma ? vma->bound : 0;
> if (vma == NULL || !drm_mm_node_allocated(&vma->node)) {
> - vma = i915_gem_object_bind_to_vm(obj, vm, ggtt_view, alignment,
> - flags);
> + vma = i915_gem_object_bind_to_vm(obj, vm, ggtt_view,
> + size, alignment, flags);
> if (IS_ERR(vma))
> return PTR_ERR(vma);
> } else {
> @@ -3808,17 +3801,19 @@ i915_gem_object_do_pin(struct drm_i915_gem_object *obj,
> int
> i915_gem_object_pin(struct drm_i915_gem_object *obj,
> struct i915_address_space *vm,
> + u64 size,
> u64 alignment,
> u64 flags)
> {
> return i915_gem_object_do_pin(obj, vm,
> i915_is_ggtt(vm) ? &i915_ggtt_view_normal : NULL,
> - alignment, flags);
> + size, alignment, flags);
> }
>
> int
> i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
> const struct i915_ggtt_view *view,
> + u64 size,
> u64 alignment,
> u64 flags)
> {
> @@ -3829,7 +3824,7 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
> BUG_ON(!view);
>
> return i915_gem_object_do_pin(obj, &ggtt->base, view,
> - alignment, flags | PIN_GLOBAL);
> + size, alignment, flags | PIN_GLOBAL);
> }
>
> void
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 63984c4d8e5a..f40fd7f9e5fa 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -682,10 +682,14 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> flags |= PIN_HIGH;
> }
>
> - ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);
> + ret = i915_gem_object_pin(obj, vma->vm,
> + entry->pad_to_size,
> + entry->alignment,
> + flags);
> if ((ret == -ENOSPC || ret == -E2BIG) &&
> only_mappable_for_reloc(entry->flags))
> ret = i915_gem_object_pin(obj, vma->vm,
> + entry->pad_to_size,
> entry->alignment,
> flags & ~PIN_MAPPABLE);
> if (ret)
> @@ -748,6 +752,9 @@ eb_vma_misplaced(struct i915_vma *vma)
> vma->node.start & (entry->alignment - 1))
> return true;
>
> + if (vma->node.size < entry->pad_to_size)
> + return true;
> +
> if (entry->flags & EXEC_OBJECT_PINNED &&
> vma->node.start != entry->offset)
> return true;
> @@ -1091,6 +1098,13 @@ validate_exec_list(struct drm_device *dev,
> if (exec[i].alignment && !is_power_of_2(exec[i].alignment))
> return -EINVAL;
>
> + /* pad_to_size was once a reserved field, so sanitize it */
> + if (exec[i].flags & EXEC_OBJECT_PAD_TO_SIZE) {
> + if (offset_in_page(exec[i].pad_to_size))
> + return -EINVAL;
> + } else
> + exec[i].pad_to_size = 0;
> +
> /* First check for malicious input causing overflow in
> * the worst case where we need to allocate the entire
> * relocation tree as a single array.
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 33ce5ff9556a..0f292733cffc 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -727,11 +727,15 @@ struct drm_i915_gem_exec_object2 {
> #define EXEC_OBJECT_WRITE (1<<2)
> #define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
> #define EXEC_OBJECT_PINNED (1<<4)
> +#define EXEC_OBJECT_PAD_TO_SIZE (1<<5)
> /* All remaining bits are MBZ and RESERVED FOR FUTURE USE */
> -#define __EXEC_OBJECT_UNKNOWN_FLAGS (-(EXEC_OBJECT_PINNED<<1))
> +#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_PAD_TO_SIZE<<1)
> __u64 flags;
>
> - __u64 rsvd1;
> + union {
> + __u64 rsvd1;
> + __u64 pad_to_size;
> + };
> __u64 rsvd2;
> };
>
> --
> 2.8.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2016-07-28 9:55 UTC|newest]
Thread overview: 95+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-27 11:14 Getting to RCU and exporting fences Chris Wilson
2016-07-27 11:14 ` [PATCH 01/22] drm/i915: Combine loops within i915_gem_evict_something Chris Wilson
2016-07-29 6:17 ` Joonas Lahtinen
2016-07-29 6:31 ` Chris Wilson
2016-07-27 11:14 ` [PATCH 02/22] drm/i915: Remove surplus drm_device parameter to i915_gem_evict_something() Chris Wilson
2016-07-28 8:07 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 03/22] drm/i915: Double check the active status on the batch pool Chris Wilson
2016-07-28 8:14 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 04/22] drm/i915: Remove request retirement before each batch Chris Wilson
2016-07-28 8:32 ` Joonas Lahtinen
2016-07-28 9:32 ` Chris Wilson
2016-07-28 9:53 ` Joonas Lahtinen
2016-07-28 9:54 ` Daniel Vetter
2016-07-28 10:26 ` Chris Wilson
2016-07-28 11:52 ` Daniel Vetter
2016-07-28 12:24 ` Chris Wilson
2016-07-28 14:21 ` Chris Wilson
2016-07-27 11:14 ` [PATCH 05/22] drm/i915: Remove i915_gem_execbuffer_retire_commands() Chris Wilson
2016-07-28 8:46 ` Joonas Lahtinen
2016-07-28 8:55 ` Chris Wilson
2016-07-28 9:54 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 06/22] drm/i915: Fix up vma alignment to be u64 Chris Wilson
2016-07-28 8:59 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 07/22] drm/i915: Pad GTT views of exec objects up to user specified size Chris Wilson
2016-07-28 9:55 ` Daniel Vetter [this message]
2016-07-28 10:33 ` Chris Wilson
2016-07-29 7:59 ` Joonas Lahtinen
2016-07-29 8:08 ` Chris Wilson
2016-07-29 8:55 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 08/22] drm/i915: Reduce WARN(i915_gem_valid_gtt_space) to a debug-only check Chris Wilson
2016-07-28 9:18 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 09/22] drm/i915: Split insertion/binding of an object into the VM Chris Wilson
2016-07-28 9:25 ` Joonas Lahtinen
2016-07-28 9:34 ` Chris Wilson
2016-07-27 11:14 ` [PATCH 10/22] drm/i915: Record allocated vma size Chris Wilson
2016-07-29 6:53 ` Joonas Lahtinen
2016-07-29 7:18 ` Chris Wilson
2016-07-29 10:19 ` [PATCH] drm/i915: Convert 4096 alignment request to 0 for drm_mm allocations Chris Wilson
2016-07-29 10:28 ` Joonas Lahtinen
2016-07-29 10:38 ` Chris Wilson
2016-07-27 11:14 ` [PATCH 11/22] drm/i915: Wrap vma->pin_count accessors with small inline helpers Chris Wilson
2016-07-29 6:59 ` Joonas Lahtinen
2016-07-29 7:23 ` Chris Wilson
2016-07-27 11:14 ` [PATCH 12/22] drm/i915: Start passing around i915_vma from execbuffer Chris Wilson
2016-07-29 8:23 ` Joonas Lahtinen
2016-08-01 7:34 ` Chris Wilson
2016-07-27 11:14 ` [PATCH 13/22] drm/i915: Combine all i915_vma bitfields into a single set of flags Chris Wilson
2016-07-29 7:30 ` Joonas Lahtinen
2016-07-29 7:44 ` Chris Wilson
2016-07-27 11:14 ` [PATCH 14/22] drm/i915: Make i915_vma_pin() small and inline Chris Wilson
2016-07-28 11:06 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 15/22] drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin() Chris Wilson
2016-07-28 10:38 ` Joonas Lahtinen
2016-07-28 11:36 ` Chris Wilson
2016-07-28 11:53 ` Joonas Lahtinen
2016-07-28 16:12 ` Chris Wilson
2016-07-29 9:10 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 16/22] drm/i915: Make fb_tracking.lock a spinlock Chris Wilson
2016-07-28 10:02 ` Daniel Vetter
2016-07-28 10:08 ` Daniel Vetter
2016-07-29 8:25 ` Chris Wilson
2016-07-28 10:19 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 17/22] drm/i915: Use atomics to manipulate obj->frontbuffer_bits Chris Wilson
2016-07-28 9:49 ` Joonas Lahtinen
2016-07-28 10:10 ` Chris Wilson
2016-07-28 10:51 ` Joonas Lahtinen
2016-07-28 10:05 ` Daniel Vetter
2016-07-27 11:14 ` [PATCH 18/22] drm/i915: Use dev_priv consistently through the intel_frontbuffer interface Chris Wilson
2016-07-28 9:36 ` Joonas Lahtinen
2016-07-28 10:06 ` Daniel Vetter
2016-07-27 11:14 ` [PATCH 19/22] drm/i915: Move obj->active:5 to obj->flags Chris Wilson
2016-07-29 7:40 ` Joonas Lahtinen
2016-07-29 8:04 ` Chris Wilson
2016-07-29 8:10 ` Chris Wilson
2016-07-29 9:34 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 20/22] drm/i915: Move i915_gem_object_wait_rendering() Chris Wilson
2016-07-28 9:37 ` Joonas Lahtinen
2016-07-27 11:14 ` [PATCH 21/22] drm/i915: Enable lockless lookup of request tracking via RCU Chris Wilson
2016-07-28 10:23 ` Daniel Vetter
2016-07-28 20:49 ` Chris Wilson
2016-07-29 8:41 ` Daniel Vetter
2016-07-29 8:49 ` Chris Wilson
2016-07-29 9:43 ` Chris Wilson
2016-07-29 9:45 ` Daniel Vetter
2016-07-27 11:15 ` [PATCH 22/22] drm/i915: Export our request as a dma-buf fence on the reservation object Chris Wilson
2016-07-28 10:32 ` Daniel Vetter
2016-07-28 10:40 ` Chris Wilson
2016-07-28 11:59 ` Daniel Vetter
2016-07-28 12:17 ` Chris Wilson
2016-07-28 12:28 ` Daniel Vetter
2016-07-28 12:45 ` Chris Wilson
2016-07-28 20:14 ` Daniel Vetter
2016-07-28 21:08 ` Chris Wilson
2016-07-27 11:23 ` ✗ Ro.CI.BAT: failure for series starting with [01/22] drm/i915: Combine loops within i915_gem_evict_something Patchwork
2016-07-29 10:20 ` ✗ Ro.CI.BAT: failure for series starting with [01/22] drm/i915: Combine loops within i915_gem_evict_something (rev2) Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160728095521.GC6232@phenom.ffwll.local \
--to=daniel@ffwll.ch \
--cc=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox