From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
Akash Goel <akash.goel@intel.com>
Subject: Re: [PATCH] drm/i915: Support creation of unbound wc user mappings for objects
Date: Wed, 17 Dec 2014 12:46:44 +0000 [thread overview]
Message-ID: <54917B34.9030107@linux.intel.com> (raw)
In-Reply-To: <1414083347-31259-1-git-send-email-chris@chris-wilson.co.uk>
Hi,
On 10/23/2014 05:55 PM, Chris Wilson wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> This patch provides support to create write-combining virtual mappings of
> GEM object. It intends to provide the same funtionality of 'mmap_gtt'
> interface without the constraints and contention of a limited aperture
> space, but requires clients handles the linear to tile conversion on their
> own. This is for improving the CPU write operation performance, as with such
> mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar
> to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache
> flush after update from CPU side, when object is passed onto GPU. This
> type of mapping is specially useful in case of sub-region update,
> i.e. when only a portion of the object is to be updated. Using a CPU mmap
> in such cases would normally incur a clflush of the whole object, and
> using a GTT mmapping would likely require eviction of an active object or
> fence and thus stall. The write-combining CPU mmap avoids both.
>
> To ensure the cache coherency, before using this mapping, the GTT domain
> has been reused here. This provides the required cache flush if the object
> is in CPU domain or synchronization against the concurrent rendering.
> Although the access through an uncached mmap should automatically
> invalidate the cache lines, this may not be true for non-temporal write
> instructions and also not all pages of the object may be updated at any
> given point of time through this mapping. Having a call to get_pages in
> set_to_gtt_domain function, as added in the earlier patch 'drm/i915:
> Broaden application of set-domain(GTT)', would guarantee the clflush and
> so there will be no cachelines holding the data for the object before it
> is accessed through this map.
>
> The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
> extended with a new flags field (defaulting to 0 for existent users). In
> order for userspace to detect the extended ioctl, a new parameter
> I915_PARAM_HAS_EXT_MMAP has been added for versioning the ioctl interface.
>
> v2: Fix error handling, invalid flag detection, renaming (ickle)
>
> The new mmapping is exercised by igt/gem_mmap_wc,
> igt/gem_concurrent_blit and igt/gem_gtt_speed.
>
> Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
> drivers/gpu/drm/i915/i915_dma.c | 3 +++
> drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++++++++++++
> include/uapi/drm/i915_drm.h | 9 +++++++++
> 3 files changed, 31 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 1b398070b230..496fb72e25c8 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1027,6 +1027,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
> case I915_PARAM_CMD_PARSER_VERSION:
> value = i915_cmd_parser_get_version();
> break;
> + case I915_PARAM_MMAP_VERSION:
> + value = 1;
> + break;
> default:
> DRM_DEBUG("Unknown parameter %d\n", param->param);
> return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 895f9881f0aa..7d7265fcee95 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1485,6 +1485,12 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
> struct drm_gem_object *obj;
> unsigned long addr;
>
> + if (args->flags & ~(I915_MMAP_WC))
> + return -EINVAL;
> +
> + if (args->flags & I915_MMAP_WC && !cpu_has_pat)
> + return -ENODEV;
> +
> obj = drm_gem_object_lookup(dev, file, args->handle);
> if (obj == NULL)
> return -ENOENT;
> @@ -1500,6 +1506,19 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
> addr = vm_mmap(obj->filp, 0, args->size,
> PROT_READ | PROT_WRITE, MAP_SHARED,
> args->offset);
> + if (args->flags & I915_MMAP_WC) {
> + struct mm_struct *mm = current->mm;
> + struct vm_area_struct *vma;
> +
> + down_write(&mm->mmap_sem);
> + vma = find_vma(mm, addr);
> + if (vma)
> + vma->vm_page_prot =
> + pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> + else
> + addr = -ENOMEM;
> + up_write(&mm->mmap_sem);
> + }
> drm_gem_object_unreference_unlocked(obj);
> if (IS_ERR((void *)addr))
> return addr;
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index ff57f07c3249..9b3762724a2b 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -340,6 +340,7 @@ typedef struct drm_i915_irq_wait {
> #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26
> #define I915_PARAM_HAS_WT 27
> #define I915_PARAM_CMD_PARSER_VERSION 28
> +#define I915_PARAM_MMAP_VERSION 29
>
> typedef struct drm_i915_getparam {
> int param;
> @@ -487,6 +488,14 @@ struct drm_i915_gem_mmap {
> * This is a fixed-size type for 32/64 compatibility.
> */
> __u64 addr_ptr;
> +
> + /**
> + * Flags for extended behaviour.
> + *
> + * Added in version 2.
> + */
> + __u64 flags;
> +#define I915_MMAP_WC 0x1
> };
>
> struct drm_i915_gem_mmap_gtt {
I had to spend some time reading about stuff but eventually this looks
good to me. So:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Just one thing to clarify - the same write combining speed up effect is
already possible using the mentioned non-temporal instructions, this
just makes it easier to use for a wider group of users?
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2014-12-17 12:46 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-13 13:20 [PATCH] drm/i915: Broaden application of set-domain(GTT) Chris Wilson
2014-10-14 12:47 ` Chris Wilson
2014-10-23 10:33 ` [PATCH] drm/i915: Support to create uncached user mapping for a Gem object akash.goel
2014-10-23 10:37 ` [PATCH] intel: New libdrm interface to create uncached CPU mapping akash.goel
2014-10-23 10:54 ` [PATCH] drm/i915: Support to create uncached user mapping for a Gem object Chris Wilson
2014-10-23 11:03 ` Chris Wilson
2014-10-23 11:56 ` Chris Wilson
2014-10-23 13:23 ` Chris Wilson
2014-12-10 4:48 ` Chad Versace
2014-12-10 8:02 ` Chris Wilson
2014-10-23 16:55 ` [PATCH] drm/i915: Support creation of unbound wc user mappings for objects Chris Wilson
2014-10-23 19:05 ` [PATCH 1/3] igt/gem_evict_(alignment|everything): contend with GPU hangs Chris Wilson
2014-10-23 19:05 ` [PATCH 2/3] lib/core: Check for kernel error messages and WARN if any are found Chris Wilson
2014-10-23 19:05 ` [PATCH 3/3] reg-read-8 Chris Wilson
2014-10-23 19:12 ` [PATCH 1/3] igt/gem_evict_(alignment|everything): contend with GPU hangs Chris Wilson
2014-10-23 19:11 ` [PATCH 1/3] igt/gem_mmap_wc: Exercise mmap(wc) interface Chris Wilson
2014-10-23 19:11 ` [PATCH 2/3] igt/gem_gtt_speed: compare against WC mmaps Chris Wilson
2014-10-23 19:11 ` [PATCH 3/3] igt/gem_concurrent_blit: Exercise wc mappings Chris Wilson
2014-10-25 11:51 ` [PATCH] intel: New libdrm interface to create unbound wc user mappings for objects akash.goel
2014-10-25 12:45 ` Damien Lespiau
2014-10-26 8:36 ` Akash Goel
2014-10-26 8:41 ` Chris Wilson
2014-10-28 13:09 ` [PATCH v3] " akash.goel
2014-10-28 16:11 ` Damien Lespiau
2014-12-03 14:13 ` Damien Lespiau
2014-12-03 18:18 ` Chris Wilson
2014-12-09 10:54 ` Chris Wilson
2014-12-09 16:53 ` Damien Lespiau
2014-12-14 8:41 ` [PATCH v4] " akash.goel
2016-03-09 9:09 ` [PATCH v5] " akash.goel
2016-03-10 8:39 ` Martin Peres
2016-03-14 16:51 ` Martin Peres
2016-03-15 8:41 ` Daniel Vetter
2015-03-03 14:20 ` [PATCH v3] " Damien Lespiau
2015-03-03 17:05 ` Chris Wilson
2015-03-03 17:11 ` Damien Lespiau
2015-03-03 17:13 ` Chris Wilson
2015-03-03 17:16 ` Damien Lespiau
2014-11-05 12:48 ` [PATCH] drm/i915: Support creation of " Chris Wilson
2014-11-06 14:50 ` Daniel Vetter
2014-12-17 12:46 ` Tvrtko Ursulin [this message]
2014-12-17 12:52 ` Chris Wilson
2014-10-24 8:40 ` [PATCH] drm/i915: Support to create uncached user mapping for a Gem object Daniel Vetter
2014-10-24 9:23 ` Chris Wilson
2014-11-05 12:49 ` [PATCH] drm/i915: Broaden application of set-domain(GTT) Chris Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54917B34.9030107@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=akash.goel@intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=daniel.vetter@ffwll.ch \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.