From: "Goel, Akash" <akash.goel@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>, Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: Only move to the CPU write domain if keeping the GTT pages
Date: Sun, 9 Aug 2015 16:23:01 +0530 [thread overview]
Message-ID: <55C7310D.2000703@intel.com> (raw)
In-Reply-To: <20150807080728.GJ17734@phenom.ffwll.local>
On 8/7/2015 1:37 PM, Daniel Vetter wrote:
> On Thu, Aug 06, 2015 at 05:43:39PM +0100, Chris Wilson wrote:
>> We have for a long time been ultra-paranoid about the situation whereby
>> we hand back pages to the system that have been written to by the GPU
>> and potentially simultaneously by the user through a CPU mmapping. We
>> can relax this restriction when we know that the cache domain tracking
>> is true and there can be no stale cacheline invalidatations. This is
>> true if the object has never been CPU mmaped as all internal accesses
>> (i.e. kmap/iomap) are carefully flushed. For a CPU mmaping, one would
>> expect that the invalid cache lines are resolved on PTE/TLB shootdown
>> during munmap(), so the only situation we need to be paranoid about is
>
> That seems a pretty strong assumption given that x86 cache are physically
> indexed - I'd never expect flushing to happen on munmap.
x86 L1 caches are most probably virtually indexed/physically tagged and
generally this is the prevalent & most optimal Cache configuration.
For the virtually indexed/physically tagged caches, the cache flush is
not really required either on Context switch or on page table update
(munmap, PTE/TLB shootdown).
So there could be few cache lines, holding the stale data (due to a
prior CPU mmapping), for the object being purged/truncated.
>
>> when such a CPU mmaping exists at the time of put_pages. Given that we
>> need to treat put_pages carefully as we may return live data to the
>> system that we want to use again in the future (i.e. I915_MADV_WILLNEED
>> pages) we can simply treat a live CPU mmaping as a special case of
>> WILLNEED (which it is!). Any I915_MADV_DONTNEED pages and their
>> mmapings are shotdown immediately following put_pages.
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: "Goel, Akash" <akash.goel@intel.com>
>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
>
> But it's still salvageable I think since we only care about coherency for
> the gpu (where data might be stuck in cpu caches). From the cpu's pov (and
> hence the entire system except the gpu) we should never see inconsistency
> really - as soon as the gpu does a write to a cacheline it'll win, and
> before that nothing in the system can assume anything about the contents
> of these pages.
>
> So imo a simple
>
> /* For purgeable objects we don't care about object contents. */
>
> would be enough.
>
> Well except that there's gl extensions which expose purgeable objects, so
> I guess we can't actually do this.
>
> I presume though you only want to avoid clflush when actually purging an
> object, so maybe we can keep this by purging the shmem backing node first
> and checking here for __I915_MADV_PURGED instead?
An object marked as MADV_DONT_NEED, implies that it will be
purged/truncated right away after the call to put_pages_gtt function.
So doing the other way round by purging first and then checking for
__I915_MADV_PURGED, might be equivalent.
Best regards
Akash
> Oh and some perf data would be good for this.
> -Daniel
>
>> ---
>> drivers/gpu/drm/i915/i915_gem.c | 49 ++++++++++++++++++++++++++++++-----------
>> 1 file changed, 36 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 2dfe707f11d3..24deace364a5 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2047,22 +2047,45 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
>>
>> BUG_ON(obj->madv == __I915_MADV_PURGED);
>>
>> - ret = i915_gem_object_set_to_cpu_domain(obj, true);
>> - if (ret) {
>> - /* In the event of a disaster, abandon all caches and
>> - * hope for the best.
>> - */
>> - WARN_ON(ret != -EIO);
>> - i915_gem_clflush_object(obj, true);
>> - obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
>> - }
>> -
>> i915_gem_gtt_finish_object(obj);
>>
>> - if (i915_gem_object_needs_bit17_swizzle(obj))
>> - i915_gem_object_save_bit_17_swizzle(obj);
>> + /* If we need to access the data in the future, we need to
>> + * be sure that the contents of the object is coherent with
>> + * the CPU prior to releasing the pages back to the system.
>> + * Once we unpin them, the mm is free to move them to different
>> + * zones or even swap them out to disk - all without our
>> + * intervention. (Though we could track such operations with our
>> + * own gemfs, if we ever write one.) As such if we want to keep
>> + * the data, set it to the CPU domain now just in case someone
>> + * else touches it.
>> + *
>> + * For a long time we have been paranoid about handing back
>> + * pages to the system with stale cacheline invalidation. For
>> + * all internal use (kmap/iomap), we know that the domain tracking is
>> + * accurate. However, the userspace API is lax and the user can CPU
>> + * mmap the object and invalidate cachelines without our accurate
>> + * tracking. We have been paranoid to be sure that we always flushed
>> + * the cachelines when we stopped using the pages. However, given
>> + * that the CPU PTE/TLB shootdown must have invalidated the cachelines
>> + * upon munmap(), we only need to be paranoid about a live CPU mmap
>> + * now. For this, we need only treat it as live data see
>> + * discard_backing_storage().
>> + */
>> + if (obj->madv == I915_MADV_WILLNEED) {
>> + ret = i915_gem_object_set_to_cpu_domain(obj, true);
>> + if (ret) {
>> + /* In the event of a disaster, abandon all caches and
>> + * hope for the best.
>> + */
>> + WARN_ON(ret != -EIO);
>> + i915_gem_clflush_object(obj, true);
>> + obj->base.read_domains = I915_GEM_DOMAIN_CPU;
>> + obj->base.write_domain = I915_GEM_DOMAIN_CPU;
>> + }
>>
>> - if (obj->madv == I915_MADV_DONTNEED)
>> + if (i915_gem_object_needs_bit17_swizzle(obj))
>> + i915_gem_object_save_bit_17_swizzle(obj);
>> + } else
>> obj->dirty = 0;
>>
>> st_for_each_page(&iter, obj->pages) {
>> --
>> 2.5.0
>>
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-08-09 10:53 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-06 16:43 [PATCH] drm/i915: Only move to the CPU write domain if keeping the GTT pages Chris Wilson
2015-08-07 8:07 ` Daniel Vetter
2015-08-07 10:10 ` Chris Wilson
2015-08-07 11:55 ` Daniel Vetter
2015-08-07 13:07 ` Chris Wilson
2015-08-12 12:35 ` Daniel Vetter
2015-08-12 12:45 ` Chris Wilson
2015-08-09 10:53 ` Goel, Akash [this message]
2015-08-09 10:55 ` Chris Wilson
2015-08-09 11:41 ` Goel, Akash
2015-08-09 12:49 ` Chris Wilson
2015-08-09 13:32 ` Goel, Akash
2015-08-19 14:24 ` akash goel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55C7310D.2000703@intel.com \
--to=akash.goel@intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=daniel.vetter@ffwll.ch \
--cc=daniel@ffwll.ch \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox