* [PATCH] drm/i915: Track dirtying of CPU cache for LLC
@ 2012-02-24 21:21 Chris Wilson
  2012-02-27 16:30 ` Daniel Vetter
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2012-02-24 21:21 UTC (permalink / raw)
  To: intel-gfx

Doing mixed rendering into the front/back scanout buffers led to the
interesting rediscovery of clflushing when page-flipping. A painful
experience indeed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h |    1 +
 drivers/gpu/drm/i915/i915_gem.c |    7 +++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4eee0bf..569475b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -865,6 +865,7 @@ struct drm_i915_gem_object {
 	unsigned int fenced_gpu_access:1;
 
 	unsigned int cache_level:2;
+	unsigned int cache_dirty:2;
 
 	unsigned int has_aliasing_ppgtt_mapping:1;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index eb63c69..8e547dd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2907,12 +2907,15 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj)
 	 * snooping behaviour occurs naturally as the result of our domain
 	 * tracking.
 	 */
-	if (obj->cache_level != I915_CACHE_NONE)
+	if (obj->cache_level != I915_CACHE_NONE) {
+		obj->cache_dirty = obj->cache_level;
 		return;
+	}
 
 	trace_i915_gem_object_clflush(obj);
 
 	drm_clflush_pages(obj->pages, obj->base.size / PAGE_SIZE);
+	obj->cache_dirty = I915_CACHE_NONE;
 }
 
 /** Flushes any GPU write domain for the object if it's dirty. */
@@ -3066,7 +3069,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 					       obj, cache_level);
 	}
 
-	if (cache_level == I915_CACHE_NONE) {
+	if (obj->cache_dirty) {
 		u32 old_read_domains, old_write_domain;
 
 		/* If we're coming from LLC cached, then we haven't
-- 
1.7.9.1

* Re: [PATCH] drm/i915: Track dirtying of CPU cache for LLC
  2012-02-24 21:21 [PATCH] drm/i915: Track dirtying of CPU cache for LLC Chris Wilson
@ 2012-02-27 16:30 ` Daniel Vetter
  2012-02-27 17:10   ` Chris Wilson
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Vetter @ 2012-02-27 16:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Fri, Feb 24, 2012 at 09:21:51PM +0000, Chris Wilson wrote:
> Doing mixed rendering into the front/back scanout buffers led to the
> interesting rediscovery of clflushing when page-flipping. A painful
> experience indeed.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Hm, I might be a bit dense here (again ...) but I don't follow what this
exactly fixes. Care to elaborate a bit?
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

* Re: [PATCH] drm/i915: Track dirtying of CPU cache for LLC
  2012-02-27 16:30 ` Daniel Vetter
@ 2012-02-27 17:10   ` Chris Wilson
  2012-02-27 17:50     ` Daniel Vetter
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2012-02-27 17:10 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 27 Feb 2012 17:30:19 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Fri, Feb 24, 2012 at 09:21:51PM +0000, Chris Wilson wrote:
> > Doing mixed rendering into the front/back scanout buffers led to the
> > interesting rediscovery of clflushing when page-flipping. A painful
> > experience indeed.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Hm, I might be a bit dense here (again ...) but I don't follow what this
> exactly fixes. Care to elaborate a bit?

When we are page-flipping, we take an active render buffer and flush it
to the display plane. This involves a migration into the uncached
domain and a clflush. If we allow the bo to transition back to LLC
cached for fast rendering when it becomes the back-buffer again, we
incur another clflush on the way back to the display plane, even though
we never touch it with the CPU whilst it is in the CPU domain. The same
applies if we are recycling bos used elsewhere as render buffers to be
page-flipped. The patch avoids the defensive clflush by recording when
we ignore transitions in and out of the CPU cache domain due to cache
coherency and replaying those missed clflushes when the object is no
longer cache coherent.
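
As a rough illustration of the record-and-replay bookkeeping described
above, here is a small userspace model. It is only a sketch and not the
driver code: struct bo, bo_clflush(), bo_set_cache_level() and
flushes_after_flips() are hypothetical names, the track_dirty switch
exists only to compare behaviour with and without the tracking, and the
real clflush is replaced by a counter.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's cache levels. */
enum cache_level { CACHE_NONE = 0, CACHE_LLC = 1 };

struct bo {
	enum cache_level cache_level;
	bool cache_dirty;	/* a flush request was skipped while coherent */
	unsigned int flushes;	/* counts the (expensive) simulated clflushes */
};

/* Flush request: skipped, but recorded, while the object is LLC coherent. */
static void bo_clflush(struct bo *bo)
{
	if (bo->cache_level != CACHE_NONE) {
		bo->cache_dirty = true;
		return;
	}

	bo->flushes++;		/* stands in for drm_clflush_pages() */
	bo->cache_dirty = false;
}

/*
 * Change the cache level. When leaving coherency, either flush
 * defensively every time (track_dirty == false) or replay a flush
 * only if one was skipped earlier (track_dirty == true).
 */
static void bo_set_cache_level(struct bo *bo, enum cache_level level,
			       bool track_dirty)
{
	bool leaving_coherency =
		level == CACHE_NONE && bo->cache_level != CACHE_NONE;

	bo->cache_level = level;
	if (!leaving_coherency)
		return;

	if (!track_dirty || bo->cache_dirty)
		bo_clflush(bo);
}

/* Cycle one buffer between back buffer (LLC) and scanout (uncached). */
static unsigned int flushes_after_flips(unsigned int flips, bool track_dirty,
					bool cpu_writes)
{
	struct bo bo = { CACHE_NONE, false, 0 };
	unsigned int i;

	for (i = 0; i < flips; i++) {
		bo_set_cache_level(&bo, CACHE_LLC, track_dirty);
		if (cpu_writes)
			bo_clflush(&bo);	/* skipped, marks the bo dirty */
		bo_set_cache_level(&bo, CACHE_NONE, track_dirty);
	}

	return bo.flushes;
}
```

In this model, a buffer that is only ever rendered by the GPU while LLC
cached costs one flush per flip without the tracking and none with it,
whereas a buffer the CPU did touch still gets its flush replayed when it
returns to the uncached domain.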

The effect of the extra clflushes is quite pronounced (>30% framerate
drop for glxgears) and really does interfere with experiments to manage
cache levels. It's not the only bottleneck, but it is the major one.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Track dirtying of CPU cache for LLC
  2012-02-27 17:10   ` Chris Wilson
@ 2012-02-27 17:50     ` Daniel Vetter
  0 siblings, 0 replies; 4+ messages in thread
From: Daniel Vetter @ 2012-02-27 17:50 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Mon, Feb 27, 2012 at 05:10:24PM +0000, Chris Wilson wrote:
> On Mon, 27 Feb 2012 17:30:19 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Fri, Feb 24, 2012 at 09:21:51PM +0000, Chris Wilson wrote:
> > > Doing mixed rendering into the front/back scanout buffers led to the
> > > interesting rediscovery of clflushing when page-flipping. A painful
> > > experience indeed.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > Hm, I might be a bit dense here (again ...) but I don't follow what this
> > exactly fixes. Care to elaborate a bit?
> 
> When we are page-flipping, we take an active render buffer and flush it
> to the display plane. This involves a migration into the uncached
> domain and a clflush. If we allow the bo to transition back to LLC
> cached for fast rendering when it becomes the back-buffer again, we
> incur another clflush on the way back to the display plane, even though
> we never touch it with the CPU whilst it is in the CPU domain. The same
> applies if we are recycling bos used elsewhere as render buffers to be
> page-flipped. The patch avoids the defensive clflush by recording when
> we ignore transitions in and out of the CPU cache domain due to cache
> coherency and replaying those missed clflushes when the object is no
> longer cache coherent.
> 
> The effect of the extra clflushes is quite pronounced (>30% framerate
> drop for glxgears) and really does interfere with experiments to manage
> cache levels. It's not the only bottleneck, but it is the major one.

Ah, so you're playing around and have fun ;-) I think I'll wait with
picking this up until you come up with the real deal ...
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48
