From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Subject: [PATCH 2/4] drm/i915: Update rules for reading cache lines through the LLC
Date: Tue, 6 Aug 2013 13:17:03 +0100 [thread overview]
Message-ID: <1375791425-14978-2-git-send-email-chris@chris-wilson.co.uk> (raw)
In-Reply-To: <1375791425-14978-1-git-send-email-chris@chris-wilson.co.uk>
The LLC is a fun device. The cache is a distinct functional block within
the SA that arbitrates access from both the CPU and GPU cores. As such
all writes to memory land first in the LLC before further action is
taken. For example, an uncached write from either the CPU or GPU will
then proceed to memory and evict the cacheline from the LLC. This means that
a read from the LLC always returns the correct information even if the PTE
bit in the GPU differs from the PAT bit in the CPU. For the older
snooping architecture on non-LLC, the fundamental principle still holds
except that some coordination is required between the CPU and GPU to
explicitly perform the snooping (which is handled by our request
tracking).
The upshot of this is that we know that we can issue a read from either
LLC devices or snoopable memory and trust the contents of the cache -
i.e. we can forgo a clflush before a read in these circumstances.
Writing to memory from the CPU is a little more tricky as we have to
consider that the scanout does not read from the CPU cache at all, but
from main memory. So we have to currently treat all requests to write to
uncached memory as having to be flushed to main memory for coherency
with all consumers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index eec6fcb..5671dab 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -60,6 +60,12 @@ static long i915_gem_purge(struct drm_i915_private *dev_priv, long target);
static void i915_gem_shrink_all(struct drm_i915_private *dev_priv);
static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
+static bool cpu_cache_is_coherent(struct drm_device *dev,
+ enum i915_cache_level level)
+{
+ return HAS_LLC(dev) || level != I915_CACHE_NONE;
+}
+
static inline void i915_gem_object_fence_lost(struct drm_i915_gem_object *obj)
{
if (obj->tiling_mode)
@@ -510,8 +516,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
* read domain and manually flush cachelines (if required). This
* optimizes for the case when the gpu will dirty the data
* anyway again before the next pread happens. */
- if (obj->cache_level == I915_CACHE_NONE)
- needs_clflush = 1;
+ needs_clflush = !cpu_cache_is_coherent(dev, obj->cache_level);
if (i915_gem_obj_ggtt_bound(obj)) {
ret = i915_gem_object_set_to_gtt_domain(obj, false);
if (ret)
@@ -835,11 +840,11 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
return ret;
}
}
- /* Same trick applies for invalidate partially written cachelines before
- * writing. */
- if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU)
- && obj->cache_level == I915_CACHE_NONE)
- needs_clflush_before = 1;
+ /* Same trick applies to invalidate partially written cachelines read
+ * before writing. */
+ if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0)
+ needs_clflush_before =
+ !cpu_cache_is_coherent(dev, obj->cache_level);
ret = i915_gem_object_get_pages(obj);
if (ret)
@@ -3650,7 +3655,8 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
/* Flush the CPU cache if it's still invalid. */
if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
- i915_gem_clflush_object(obj);
+ if (!cpu_cache_is_coherent(obj->base.dev, obj->cache_level))
+ i915_gem_clflush_object(obj);
obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
}
--
1.8.4.rc1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2013-08-06 12:17 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-06 12:17 [PATCH 1/4] drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge Chris Wilson
2013-08-06 12:17 ` Chris Wilson [this message]
2013-08-06 13:31 ` [PATCH 2/4] drm/i915: Update rules for reading cache lines through the LLC Daniel Vetter
2013-08-06 16:20 ` Chris Wilson
2013-08-06 12:17 ` [PATCH 3/4] drm/i915: Update rules for writing through the LLC with the cpu Chris Wilson
2013-08-06 14:24 ` Ville Syrjälä
2013-08-06 16:16 ` Chris Wilson
2013-08-06 18:03 ` Ville Syrjälä
2013-08-06 18:45 ` Chris Wilson
2013-08-06 19:08 ` Ville Syrjälä
2013-08-06 19:26 ` Chris Wilson
2013-08-06 12:17 ` [PATCH 4/4] drm/i915: Only do a chipset flush after a clflush Chris Wilson
2013-08-06 14:25 ` [PATCH 1/4] drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge Ville Syrjälä
2013-08-06 14:36 ` Daniel Vetter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1375791425-14978-2-git-send-email-chris@chris-wilson.co.uk \
--to=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).