From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: ankitprasad.r.sharma@intel.com, intel-gfx@lists.freedesktop.org
Cc: akash.goel@intel.com, shashidhar.hiremath@intel.com
Subject: Re: [PATCH 02/10] drm/i915: Use insert_page for pwrite_fast
Date: Tue, 22 Dec 2015 10:44:57 +0000 [thread overview]
Message-ID: <567929A9.6020707@linux.intel.com> (raw)
In-Reply-To: <1450765253-32104-3-git-send-email-ankitprasad.r.sharma@intel.com>
Hi,
On 22/12/15 06:20, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
>
> In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
> we try a nonblocking pin for the whole object (since that is fastest if
> reused), then failing that we try to grab one page in the mappable
> aperture. It also allows us to handle objects larger than the mappable
> aperture (e.g. if we need to pwrite with vGPU restricting the aperture
> to a measely 8MiB or something like that).
>
> v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
>
> v3: Combined loops based on local patch by Chris (Chris)
>
> v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)
>
> v5: Renamed wrapper function for drm_mm_insert_node_in_range (Chris)
>
> v5: Added wrapper for drm_mm_remove_node() (Chris)
>
> v6: Added get_pages call before pinning the pages (Tvrtko)
> Added remove_mappable_node() wrapper for drm_mm_remove_node() (Chris)
>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 94 +++++++++++++++++++++++++++++++----------
> 1 file changed, 72 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bf7f203..f11ec89 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -61,6 +61,23 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
> return obj->pin_display;
> }
>
> +static int
> +insert_mappable_node(struct drm_i915_private *i915,
> + struct drm_mm_node *node)
> +{
> + return drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
> + PAGE_SIZE, 0, 0,
It does not look ideal that the function name is saying it will insert a
node and then it only inserts one page.
In that respect previous version looked better to me but since it is
static and single use - whatever - this has been dragging for too long.
> + 0, i915->gtt.mappable_end,
> + DRM_MM_SEARCH_DEFAULT,
> + DRM_MM_CREATE_DEFAULT);
> +}
> +
> +static void
> +remove_mappable_node(struct drm_mm_node *node)
> +{
> + drm_mm_remove_node(node);
> +}
> +
> /* some bookkeeping */
> static void i915_gem_info_add_obj(struct drm_i915_private *dev_priv,
> size_t size)
> @@ -760,20 +777,34 @@ fast_user_write(struct io_mapping *mapping,
> * user into the GTT, uncached.
> */
> static int
> -i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> +i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
> struct drm_i915_gem_object *obj,
> struct drm_i915_gem_pwrite *args,
> struct drm_file *file)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - ssize_t remain;
> - loff_t offset, page_base;
> + struct drm_mm_node node;
> + uint64_t remain, offset;
> char __user *user_data;
> - int page_offset, page_length, ret;
> + int ret;
>
> ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK);
> - if (ret)
> - goto out;
> + if (ret) {
> + memset(&node, 0, sizeof(node));
> + ret = insert_mappable_node(i915, &node);
> + if (ret)
> + goto out;
> +
> + ret = i915_gem_object_get_pages(obj);
> + if (ret) {
> + remove_mappable_node(&node);
> + goto out;
> + }
> +
> + i915_gem_object_pin_pages(obj);
> + } else {
> + node.start = i915_gem_obj_ggtt_offset(obj);
> + node.allocated = false;
> + }
>
> ret = i915_gem_object_set_to_gtt_domain(obj, true);
> if (ret)
> @@ -783,31 +814,39 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> if (ret)
> goto out_unpin;
>
> - user_data = to_user_ptr(args->data_ptr);
> - remain = args->size;
> -
> - offset = i915_gem_obj_ggtt_offset(obj) + args->offset;
> -
> intel_fb_obj_invalidate(obj, ORIGIN_GTT);
> + obj->dirty = true;
>
> - while (remain > 0) {
> + user_data = to_user_ptr(args->data_ptr);
> + offset = args->offset;
> + remain = args->size;
> + while (remain) {
> /* Operation in this page
> *
> * page_base = page offset within aperture
> * page_offset = offset within page
> * page_length = bytes to copy for this page
> */
> - page_base = offset & PAGE_MASK;
> - page_offset = offset_in_page(offset);
> - page_length = remain;
> - if ((page_offset + remain) > PAGE_SIZE)
> - page_length = PAGE_SIZE - page_offset;
> -
> + u32 page_base = node.start;
Compiler does not complain about possible truncation here? I would
change it to a type of equivalent width just in case. Even before it was
loff_t.
> + unsigned page_offset = offset_in_page(offset);
> + unsigned page_length = PAGE_SIZE - page_offset;
> + page_length = remain < page_length ? remain : page_length;
> + if (node.allocated) {
> + wmb();
> + i915->gtt.base.insert_page(&i915->gtt.base,
> + i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
> + node.start,
> + I915_CACHE_NONE,
> + 0);
> + wmb();
> + } else {
> + page_base += offset & PAGE_MASK;
> + }
> /* If we get a fault while copying data, then (presumably) our
> * source page isn't available. Return the error and we'll
> * retry in the slow path.
> */
> - if (fast_user_write(dev_priv->gtt.mappable, page_base,
> + if (fast_user_write(i915->gtt.mappable, page_base,
> page_offset, user_data, page_length)) {
> ret = -EFAULT;
> goto out_flush;
> @@ -821,7 +860,18 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> out_flush:
> intel_fb_obj_flush(obj, false, ORIGIN_GTT);
> out_unpin:
> - i915_gem_object_ggtt_unpin(obj);
> + if (node.allocated) {
> + wmb();
> + i915->gtt.base.clear_range(&i915->gtt.base,
> + node.start, node.size,
> + true);
> + remove_mappable_node(&node);
> + i915_gem_object_unpin_pages(obj);
> + i915_gem_object_put_pages(obj);
> + }
> + else {
> + i915_gem_object_ggtt_unpin(obj);
> + }
> out:
> return ret;
> }
> @@ -1086,7 +1136,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> if (obj->tiling_mode == I915_TILING_NONE &&
> obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
> cpu_write_needs_clflush(obj)) {
> - ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file);
> + ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file);
> /* Note that the gtt paths might fail with non-page-backed user
> * pointers (e.g. gtt mappings when moving data between
> * textures). Fallback to the shmem path in that case. */
>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-12-22 10:44 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-22 6:20 [PATCH v13 0/10] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
2015-12-22 6:20 ` [PATCH 01/10] drm/i915: Allow use of get_dma_address for stolen " ankitprasad.r.sharma
2015-12-22 10:23 ` Tvrtko Ursulin
2015-12-22 10:39 ` Chris Wilson
2016-01-05 16:47 ` Daniel Vetter
2015-12-22 6:20 ` [PATCH 02/10] drm/i915: Use insert_page for pwrite_fast ankitprasad.r.sharma
2015-12-22 6:55 ` kbuild test robot
2015-12-22 10:44 ` Tvrtko Ursulin [this message]
2015-12-22 11:15 ` Ankitprasad Sharma
2015-12-22 11:52 ` Chris Wilson
2015-12-22 12:03 ` Tvrtko Ursulin
2015-12-22 13:38 ` Chris Wilson
2015-12-22 6:20 ` [PATCH 03/10] drm/i915: Clearing buffer objects via CPU/GTT ankitprasad.r.sharma
2015-12-22 11:09 ` Tvrtko Ursulin
2015-12-22 6:20 ` [PATCH 04/10] drm/i915: Support for creating Stolen memory backed objects ankitprasad.r.sharma
2016-01-12 12:45 ` Chris Wilson
2015-12-22 6:20 ` [PATCH 05/10] drm/i915: Propagating correct error codes to the userspace ankitprasad.r.sharma
2015-12-22 11:20 ` Tvrtko Ursulin
2015-12-22 11:29 ` Ankitprasad Sharma
2015-12-22 12:02 ` Tvrtko Ursulin
2016-01-06 7:45 ` Daniel Vetter
2015-12-22 6:20 ` [PATCH 06/10] drm/i915: Add support for stealing purgable stolen pages ankitprasad.r.sharma
2015-12-22 11:22 ` Tvrtko Ursulin
2015-12-22 6:20 ` [PATCH 07/10] drm/i915: Support for pread/pwrite from/to non shmem backed objects ankitprasad.r.sharma
2015-12-22 11:58 ` Tvrtko Ursulin
2015-12-22 17:40 ` Chris Wilson
2016-01-11 14:21 ` Tvrtko Ursulin
2016-01-11 14:45 ` Chris Wilson
2016-01-11 15:11 ` Tvrtko Ursulin
2016-01-11 17:03 ` Chris Wilson
2016-01-11 17:15 ` Tvrtko Ursulin
2016-01-11 21:29 ` Chris Wilson
2016-01-12 7:50 ` Ankitprasad Sharma
2015-12-22 6:20 ` [PATCH 08/10] drm/i915: Migrate stolen objects before hibernation ankitprasad.r.sharma
2015-12-22 12:33 ` Tvrtko Ursulin
2015-12-22 17:02 ` Chris Wilson
2015-12-22 17:14 ` Tvrtko Ursulin
2016-01-06 7:48 ` Daniel Vetter
2015-12-22 17:23 ` Tvrtko Ursulin
2015-12-22 6:20 ` [PATCH 09/10] acpi: Export acpi_bus_type ankitprasad.r.sharma
2015-12-22 16:41 ` Tvrtko Ursulin
2016-01-06 7:51 ` Daniel Vetter
2015-12-22 6:20 ` [PATCH 10/10] drm/i915: Disable use of stolen area by User when Intel RST is present ankitprasad.r.sharma
2015-12-22 12:44 ` Tvrtko Ursulin
2015-12-22 13:14 ` Chris Wilson
2016-01-06 7:52 ` Daniel Vetter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=567929A9.6020707@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=akash.goel@intel.com \
--cc=ankitprasad.r.sharma@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
--cc=shashidhar.hiremath@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.