* [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt
@ 2013-07-30 16:58 Chris Wilson
2013-07-30 16:58 ` [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Chris Wilson @ 2013-07-30 16:58 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
The PTE layouts are the same for both ppgtt and gtt, so we can simplify
the setup for ppgtt by copying the encoding function pointer from gtt.
This prevents bugs where we update one function pointer, but forget the
other.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1294cee..0522d00 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -298,13 +298,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
* now. */
first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
- if (IS_HASWELL(dev)) {
- ppgtt->base.pte_encode = hsw_pte_encode;
- } else if (IS_VALLEYVIEW(dev)) {
- ppgtt->base.pte_encode = byt_pte_encode;
- } else {
- ppgtt->base.pte_encode = gen6_pte_encode;
- }
+ ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
ppgtt->enable = gen6_ppgtt_enable;
ppgtt->base.clear_range = gen6_ppgtt_clear_range;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 16:58 [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt Chris Wilson @ 2013-07-30 16:58 ` Chris Wilson 2013-07-30 17:19 ` Ville Syrjälä 2013-07-30 17:39 ` [PATCH 2/2] " Kenneth Graunke 2013-07-30 17:39 ` [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt Kenneth Graunke 2013-07-30 18:04 ` [PATCH] " Chris Wilson 2 siblings, 2 replies; 19+ messages in thread From: Chris Wilson @ 2013-07-30 16:58 UTC (permalink / raw) To: intel-gfx; +Cc: Ben Widawsky Haswell GT3e has the unique feature of supporting Write-Through cacheing of objects within the eLLC. The purpose of this is to enable the display plane to remain coherent whilst objects lie resident in the eLLC - so that we in theory get the best of both worlds, perfect display and fast access. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 4 +++- drivers/gpu/drm/i915/i915_gem.c | 3 ++- drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++++++- include/uapi/drm/i915_drm.h | 1 + 5 files changed, 19 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 8da0b3d..75989fc 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -976,6 +976,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_LLC: value = HAS_LLC(dev); break; + case I915_PARAM_HAS_WT: + value = HAS_WT(dev); + break; case I915_PARAM_HAS_ALIASING_PPGTT: value = dev_priv->mm.aliasing_ppgtt ? 1 : 0; break; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 34d2b9d..324ea14 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -452,6 +452,7 @@ enum i915_cache_level { I915_CACHE_NONE = 0, I915_CACHE_LLC, I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */ + I915_CACHE_WT, }; typedef uint32_t gen6_gtt_pte_t; @@ -1344,7 +1345,7 @@ struct drm_i915_gem_object { unsigned int pending_fenced_gpu_access:1; unsigned int fenced_gpu_access:1; - unsigned int cache_level:2; + unsigned int cache_level:3; unsigned int has_aliasing_ppgtt_mapping:1; unsigned int has_global_gtt_mapping:1; @@ -1547,6 +1548,7 @@ struct drm_i915_file_private { #define HAS_BLT(dev) (INTEL_INFO(dev)->has_blt_ring) #define HAS_VEBOX(dev) (INTEL_INFO(dev)->has_vebox_ring) #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) +#define HAS_WT(dev) (IS_HASWELL(dev) && ((struct drm_i915_private *)(dev))->ellc_size) #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 5) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 99362f7..cbea7f8 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3565,7 +3565,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * of uncaching, which would allow us to flush all the LLC-cached data * with that bit in the PTE to main memory with just one PIPE_CONTROL. */ - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + ret = i915_gem_object_set_cache_level(obj, + HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 0522d00..072a348 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -54,6 +54,7 @@ (((bits) & 0x8) << (11 - 3))) #define HSW_WB_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x3) #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb) +#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6) static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, enum i915_cache_level level) @@ -116,8 +117,16 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, gen6_gtt_pte_t pte = GEN6_PTE_VALID; pte |= HSW_PTE_ADDR_ENCODE(addr); - if (level != I915_CACHE_NONE) + switch (level) { + case I915_CACHE_NONE: + break; + case I915_CACHE_WT: + pte |= HSW_WT_ELLC_LLC_AGE0; + break; + default: pte |= HSW_WB_ELLC_LLC_AGE0; + break; + } return pte; } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index e47cf00..e831292 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -338,6 +338,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_PINNED_BATCHES 24 #define I915_PARAM_HAS_EXEC_NO_RELOC 25 #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 +#define I915_PARAM_HAS_WT 27 typedef struct drm_i915_getparam { int param; -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 16:58 ` [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris Chris Wilson @ 2013-07-30 17:19 ` Ville Syrjälä 2013-07-30 17:36 ` Chris Wilson 2013-07-30 17:45 ` [PATCH] " Chris Wilson 2013-07-30 17:39 ` [PATCH 2/2] " Kenneth Graunke 1 sibling, 2 replies; 19+ messages in thread From: Ville Syrjälä @ 2013-07-30 17:19 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx, Ben Widawsky On Tue, Jul 30, 2013 at 05:58:36PM +0100, Chris Wilson wrote: > Haswell GT3e has the unique feature of supporting Write-Through cacheing > of objects within the eLLC. The purpose of this is to enable the display > plane to remain coherent whilst objects lie resident in the eLLC - so > that we in theory get the best of both worlds, perfect display and fast > access. The description here talks about eLLC only, but you set the PTE for WT in LLC/eLLC both. > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_dma.c | 3 +++ > drivers/gpu/drm/i915/i915_drv.h | 4 +++- > drivers/gpu/drm/i915/i915_gem.c | 3 ++- > drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++++++- > include/uapi/drm/i915_drm.h | 1 + > 5 files changed, 19 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 8da0b3d..75989fc 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -976,6 +976,9 @@ static int i915_getparam(struct drm_device *dev, void *data, > case I915_PARAM_HAS_LLC: > value = HAS_LLC(dev); > break; > + case I915_PARAM_HAS_WT: > + value = HAS_WT(dev); > + break; > case I915_PARAM_HAS_ALIASING_PPGTT: > value = dev_priv->mm.aliasing_ppgtt ? 1 : 0; > break; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 34d2b9d..324ea14 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -452,6 +452,7 @@ enum i915_cache_level { > I915_CACHE_NONE = 0, > I915_CACHE_LLC, > I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */ > + I915_CACHE_WT, > }; > > typedef uint32_t gen6_gtt_pte_t; > @@ -1344,7 +1345,7 @@ struct drm_i915_gem_object { > unsigned int pending_fenced_gpu_access:1; > unsigned int fenced_gpu_access:1; > > - unsigned int cache_level:2; > + unsigned int cache_level:3; > > unsigned int has_aliasing_ppgtt_mapping:1; > unsigned int has_global_gtt_mapping:1; > @@ -1547,6 +1548,7 @@ struct drm_i915_file_private { > #define HAS_BLT(dev) (INTEL_INFO(dev)->has_blt_ring) > #define HAS_VEBOX(dev) (INTEL_INFO(dev)->has_vebox_ring) > #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) > +#define HAS_WT(dev) (IS_HASWELL(dev) && ((struct drm_i915_private *)(dev))->ellc_size) ->dev_private missing > #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) > > #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 5) > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 99362f7..cbea7f8 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3565,7 +3565,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * of uncaching, which would allow us to flush all the LLC-cached data > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > */ > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > + ret = i915_gem_object_set_cache_level(obj, > + HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE); Don't we need to tweak the write domain like we do for UC to make sure already dirty lines get flushed from caches? > if (ret) > return ret; > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 0522d00..072a348 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -54,6 +54,7 @@ > (((bits) & 0x8) << (11 - 3))) > #define HSW_WB_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x3) > #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb) > +#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6) > > static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, > enum i915_cache_level level) > @@ -116,8 +117,16 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, > gen6_gtt_pte_t pte = GEN6_PTE_VALID; > pte |= HSW_PTE_ADDR_ENCODE(addr); > > - if (level != I915_CACHE_NONE) > + switch (level) { > + case I915_CACHE_NONE: > + break; > + case I915_CACHE_WT: > + pte |= HSW_WT_ELLC_LLC_AGE0; > + break; > + default: > pte |= HSW_WB_ELLC_LLC_AGE0; > + break; > + } > > return pte; > } > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > index e47cf00..e831292 100644 > --- a/include/uapi/drm/i915_drm.h > +++ b/include/uapi/drm/i915_drm.h > @@ -338,6 +338,7 @@ typedef struct drm_i915_irq_wait { > #define I915_PARAM_HAS_PINNED_BATCHES 24 > #define I915_PARAM_HAS_EXEC_NO_RELOC 25 > #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 > +#define I915_PARAM_HAS_WT 27 > > typedef struct drm_i915_getparam { > int param; > -- > 1.8.3.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 17:19 ` Ville Syrjälä @ 2013-07-30 17:36 ` Chris Wilson 2013-07-30 18:01 ` Ville Syrjälä 2013-07-30 17:45 ` [PATCH] " Chris Wilson 1 sibling, 1 reply; 19+ messages in thread From: Chris Wilson @ 2013-07-30 17:36 UTC (permalink / raw) To: Ville Syrjälä; +Cc: intel-gfx, Ben Widawsky On Tue, Jul 30, 2013 at 08:19:28PM +0300, Ville Syrjälä wrote: > On Tue, Jul 30, 2013 at 05:58:36PM +0100, Chris Wilson wrote: > > Haswell GT3e has the unique feature of supporting Write-Through cacheing > > of objects within the eLLC. The purpose of this is to enable the display > > plane to remain coherent whilst objects lie resident in the eLLC - so > > that we in theory get the best of both worlds, perfect display and fast > > access. > > The description here talks about eLLC only, but you set the PTE for > WT in LLC/eLLC both. s/eLLC/LLC/ For some reason, I keep telling myself that it is a magic property of the eLLC otherwise why wouldn't they do it for all LLC! > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > + ret = i915_gem_object_set_cache_level(obj, > > + HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE); > > Don't we need to tweak the write domain like we do for UC to make sure > already dirty lines get flushed from caches? You need to explicitly do the flush, which gets ugly. I choose to ignore the problem as unlike the LLC -> UC transition, I haven't spotted any dirt. However, it doesn't look too bad if we replace it like so: diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ac1b9cd..0e089e2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3362,27 +3362,32 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, i915_gem_obj_ggtt_set_color(obj, cache_level); } - if (cache_level == I915_CACHE_NONE) { - u32 old_read_domains, old_write_domain; - + if (cache_level == I915_CACHE_NONE || + cache_level == I915_CACHE_WT) { /* If we're coming from LLC cached, then we haven't * actually been tracking whether the data is in the * CPU cache or not, since we only allow one bit set * in obj->write_domain and have been skipping the clflushes. - * Just set it to the CPU cache for now. + * Do them now. */ - WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU); - WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU); + if (obj->base.read_domains & I915_GEM_DOMAIN_CPU) { + u32 old_read_domains, old_write_domain; - old_read_domains = obj->base.read_domains; - old_write_domain = obj->base.write_domain; + BUG_ON(obj->pages == NULL); - obj->base.read_domains = I915_GEM_DOMAIN_CPU; - obj->base.write_domain = I915_GEM_DOMAIN_CPU; + trace_i915_gem_object_clflush(obj); + drm_clflush_sg(obj->pages); - trace_i915_gem_object_change_domain(obj, - old_read_domains, - old_write_domain); + old_read_domains = obj->base.read_domains; + old_write_domain = obj->base.write_domain; + + obj->base.read_domains &= ~I915_GEM_DOMAIN_CPU; + obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU; + + trace_i915_gem_object_change_domain(obj, + old_read_domains, + old_write_domain); + } } obj->cache_level = cache_level; -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 17:36 ` Chris Wilson @ 2013-07-30 18:01 ` Ville Syrjälä 2013-07-30 18:10 ` Chris Wilson 0 siblings, 1 reply; 19+ messages in thread From: Ville Syrjälä @ 2013-07-30 18:01 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Ben Widawsky On Tue, Jul 30, 2013 at 06:36:15PM +0100, Chris Wilson wrote: > On Tue, Jul 30, 2013 at 08:19:28PM +0300, Ville Syrjälä wrote: > > On Tue, Jul 30, 2013 at 05:58:36PM +0100, Chris Wilson wrote: > > > Haswell GT3e has the unique feature of supporting Write-Through cacheing > > > of objects within the eLLC. The purpose of this is to enable the display > > > plane to remain coherent whilst objects lie resident in the eLLC - so > > > that we in theory get the best of both worlds, perfect display and fast > > > access. > > > > The description here talks about eLLC only, but you set the PTE for > > WT in LLC/eLLC both. > > s/eLLC/LLC/ > > For some reason, I keep telling myself that it is a magic property of > the eLLC otherwise why wouldn't they do it for all LLC! > > > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > > + ret = i915_gem_object_set_cache_level(obj, > > > + HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE); > > > > Don't we need to tweak the write domain like we do for UC to make sure > > already dirty lines get flushed from caches? > > You need to explicitly do the flush, which gets ugly. Ah right, because i915_gem_clflush_object() checks the cache_level. > I choose to ignore > the problem as unlike the LLC -> UC transition, I haven't spotted any > dirt. However, it doesn't look too bad if we replace it like so: Hmm, and what about CPU access in general? CPU doesn't know about the WT policy, so I'm assuming we'd need to flush after all CPU writes. > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index ac1b9cd..0e089e2 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3362,27 +3362,32 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > i915_gem_obj_ggtt_set_color(obj, cache_level); > } > > - if (cache_level == I915_CACHE_NONE) { > - u32 old_read_domains, old_write_domain; > - > + if (cache_level == I915_CACHE_NONE || > + cache_level == I915_CACHE_WT) { > /* If we're coming from LLC cached, then we haven't > * actually been tracking whether the data is in the > * CPU cache or not, since we only allow one bit set > * in obj->write_domain and have been skipping the clflushes. > - * Just set it to the CPU cache for now. > + * Do them now. > */ > - WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU); > - WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU); > + if (obj->base.read_domains & I915_GEM_DOMAIN_CPU) { > + u32 old_read_domains, old_write_domain; > > - old_read_domains = obj->base.read_domains; > - old_write_domain = obj->base.write_domain; > + BUG_ON(obj->pages == NULL); > > - obj->base.read_domains = I915_GEM_DOMAIN_CPU; > - obj->base.write_domain = I915_GEM_DOMAIN_CPU; > + trace_i915_gem_object_clflush(obj); > + drm_clflush_sg(obj->pages); > > - trace_i915_gem_object_change_domain(obj, > - old_read_domains, > - old_write_domain); > + old_read_domains = obj->base.read_domains; > + old_write_domain = obj->base.write_domain; > + > + obj->base.read_domains &= ~I915_GEM_DOMAIN_CPU; > + obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU; > + > + trace_i915_gem_object_change_domain(obj, > + old_read_domains, > + old_write_domain); > + } > } > > obj->cache_level = cache_level; > > -- > Chris Wilson, Intel Open Source Technology Centre -- Ville Syrjälä Intel OTC ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 18:01 ` Ville Syrjälä @ 2013-07-30 18:10 ` Chris Wilson 2013-07-30 18:29 ` Chris Wilson 0 siblings, 1 reply; 19+ messages in thread From: Chris Wilson @ 2013-07-30 18:10 UTC (permalink / raw) To: Ville Syrjälä; +Cc: intel-gfx, Ben Widawsky On Tue, Jul 30, 2013 at 09:01:22PM +0300, Ville Syrjälä wrote: > Hmm, and what about CPU access in general? CPU doesn't know about the > WT policy, so I'm assuming we'd need to flush after all CPU writes. Hmm, I had assumed the opposite. I've not yet seen signs of cache dirt. Certainly GTT access should fine, and CPU reads. It is just those CPU writes we need to worry about, and now I do indeed worry. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 18:10 ` Chris Wilson @ 2013-07-30 18:29 ` Chris Wilson 0 siblings, 0 replies; 19+ messages in thread From: Chris Wilson @ 2013-07-30 18:29 UTC (permalink / raw) To: Ville Syrjälä, intel-gfx, Ben Widawsky On Tue, Jul 30, 2013 at 07:10:03PM +0100, Chris Wilson wrote: > On Tue, Jul 30, 2013 at 09:01:22PM +0300, Ville Syrjälä wrote: > > Hmm, and what about CPU access in general? CPU doesn't know about the > > WT policy, so I'm assuming we'd need to flush after all CPU writes. > > Hmm, I had assumed the opposite. I've not yet seen signs of cache dirt. > Certainly GTT access should fine, and CPU reads. It is just those CPU > writes we need to worry about, and now I do indeed worry. Ok, didn't take too long to force a few updates through CPU mappings to a WT display, and yes the CPU writes are incoherent. So we need to be much more careful with the flushes here. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 17:19 ` Ville Syrjälä 2013-07-30 17:36 ` Chris Wilson @ 2013-07-30 17:45 ` Chris Wilson 2013-07-30 19:25 ` Chris Wilson 1 sibling, 1 reply; 19+ messages in thread From: Chris Wilson @ 2013-07-30 17:45 UTC (permalink / raw) To: intel-gfx Haswell GT3e has the unique feature of supporting Write-Through cacheing of objects within the eLLC/LLC. The purpose of this is to enable the display plane to remain coherent whilst objects lie resident in the eLLC/LLC - so that we, in theory, get the best of both worlds, perfect display and fast access. v2: Actually do the clflush on transition to WT, nagging by Ville. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 4 +++- drivers/gpu/drm/i915/i915_gem.c | 29 +++++++++++++++-------------- drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++++++- include/uapi/drm/i915_drm.h | 1 + 5 files changed, 32 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 8da0b3d..75989fc 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -976,6 +976,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_LLC: value = HAS_LLC(dev); break; + case I915_PARAM_HAS_WT: + value = HAS_WT(dev); + break; case I915_PARAM_HAS_ALIASING_PPGTT: value = dev_priv->mm.aliasing_ppgtt ? 1 : 0; break; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 34d2b9d..d27a82a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -452,6 +452,7 @@ enum i915_cache_level { I915_CACHE_NONE = 0, I915_CACHE_LLC, I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */ + I915_CACHE_WT, }; typedef uint32_t gen6_gtt_pte_t; @@ -1344,7 +1345,7 @@ struct drm_i915_gem_object { unsigned int pending_fenced_gpu_access:1; unsigned int fenced_gpu_access:1; - unsigned int cache_level:2; + unsigned int cache_level:3; unsigned int has_aliasing_ppgtt_mapping:1; unsigned int has_global_gtt_mapping:1; @@ -1547,6 +1548,7 @@ struct drm_i915_file_private { #define HAS_BLT(dev) (INTEL_INFO(dev)->has_blt_ring) #define HAS_VEBOX(dev) (INTEL_INFO(dev)->has_vebox_ring) #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) +#define HAS_WT(dev) (IS_HASWELL(dev) && ((struct drm_i915_private *)(dev)->dev_private)->ellc_size) #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 5) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 99362f7..6b33494 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3447,27 +3447,27 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, i915_gem_obj_ggtt_set_color(obj, cache_level); } - if (cache_level == I915_CACHE_NONE) { - u32 old_read_domains, old_write_domain; - + if (cache_level == I915_CACHE_NONE || + cache_level == I915_CACHE_WT) { /* If we're coming from LLC cached, then we haven't * actually been tracking whether the data is in the * CPU cache or not, since we only allow one bit set * in obj->write_domain and have been skipping the clflushes. - * Just set it to the CPU cache for now. + * Do them now. */ - WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU); - WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU); + if (obj->pages && !obj->stolen) { + u32 old_write_domain; - old_read_domains = obj->base.read_domains; - old_write_domain = obj->base.write_domain; + trace_i915_gem_object_clflush(obj); + drm_clflush_sg(obj->pages); - obj->base.read_domains = I915_GEM_DOMAIN_CPU; - obj->base.write_domain = I915_GEM_DOMAIN_CPU; + old_write_domain = obj->base.write_domain; + obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU; - trace_i915_gem_object_change_domain(obj, - old_read_domains, - old_write_domain); + trace_i915_gem_object_change_domain(obj, + obj->base.read_domains, + old_write_domain); + } } obj->cache_level = cache_level; @@ -3565,7 +3565,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * of uncaching, which would allow us to flush all the LLC-cached data * with that bit in the PTE to main memory with just one PIPE_CONTROL. */ - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + ret = i915_gem_object_set_cache_level(obj, + HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 0522d00..072a348 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -54,6 +54,7 @@ (((bits) & 0x8) << (11 - 3))) #define HSW_WB_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x3) #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb) +#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6) static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, enum i915_cache_level level) @@ -116,8 +117,16 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, gen6_gtt_pte_t pte = GEN6_PTE_VALID; pte |= HSW_PTE_ADDR_ENCODE(addr); - if (level != I915_CACHE_NONE) + switch (level) { + case I915_CACHE_NONE: + break; + case I915_CACHE_WT: + pte |= HSW_WT_ELLC_LLC_AGE0; + break; + default: pte |= HSW_WB_ELLC_LLC_AGE0; + break; + } return pte; } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index e47cf00..e831292 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -338,6 +338,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_PINNED_BATCHES 24 #define I915_PARAM_HAS_EXEC_NO_RELOC 25 #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 +#define I915_PARAM_HAS_WT 27 typedef struct drm_i915_getparam { int param; -- 1.8.3.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 17:45 ` [PATCH] " Chris Wilson @ 2013-07-30 19:25 ` Chris Wilson 2013-07-31 13:16 ` Ville Syrjälä 0 siblings, 1 reply; 19+ messages in thread From: Chris Wilson @ 2013-07-30 19:25 UTC (permalink / raw) To: intel-gfx Haswell GT3e has the unique feature of supporting Write-Through cacheing of objects within the eLLC/LLC. The purpose of this is to enable the display plane to remain coherent whilst objects lie resident in the eLLC/LLC - so that we, in theory, get the best of both worlds, perfect display and fast access. However, we still need to be careful as the CPU does not see the WT when accessing the cache. In particular, this means that we need to flush the cache lines after writing to an object through the CPU, and on transitioning from a cached state to WT. v2: Actually do the clflush on transition to WT, nagging by Ville. v3: Flush the CPU cache after writes into WT objects. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 4 +++- drivers/gpu/drm/i915/i915_gem.c | 38 ++++++++++++++++++++----------------- drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++++++- include/uapi/drm/i915_drm.h | 1 + 5 files changed, 38 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 8da0b3d..75989fc 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -976,6 +976,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_LLC: value = HAS_LLC(dev); break; + case I915_PARAM_HAS_WT: + value = HAS_WT(dev); + break; case I915_PARAM_HAS_ALIASING_PPGTT: value = dev_priv->mm.aliasing_ppgtt ? 1 : 0; break; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 34d2b9d..d27a82a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -452,6 +452,7 @@ enum i915_cache_level { I915_CACHE_NONE = 0, I915_CACHE_LLC, I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */ + I915_CACHE_WT, }; typedef uint32_t gen6_gtt_pte_t; @@ -1344,7 +1345,7 @@ struct drm_i915_gem_object { unsigned int pending_fenced_gpu_access:1; unsigned int fenced_gpu_access:1; - unsigned int cache_level:2; + unsigned int cache_level:3; unsigned int has_aliasing_ppgtt_mapping:1; unsigned int has_global_gtt_mapping:1; @@ -1547,6 +1548,7 @@ struct drm_i915_file_private { #define HAS_BLT(dev) (INTEL_INFO(dev)->has_blt_ring) #define HAS_VEBOX(dev) (INTEL_INFO(dev)->has_vebox_ring) #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) +#define HAS_WT(dev) (IS_HASWELL(dev) && ((struct drm_i915_private *)(dev)->dev_private)->ellc_size) #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 5) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 99362f7..1c5bcc7 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3286,11 +3286,13 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj) * snooping behaviour occurs naturally as the result of our domain * tracking. */ - if (obj->cache_level != I915_CACHE_NONE) + switch (obj->cache_level) { + case I915_CACHE_LLC: + case I915_CACHE_LLC_MLC: return; + } trace_i915_gem_object_clflush(obj); - drm_clflush_sg(obj->pages); } @@ -3447,27 +3449,27 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, i915_gem_obj_ggtt_set_color(obj, cache_level); } - if (cache_level == I915_CACHE_NONE) { - u32 old_read_domains, old_write_domain; - + if (cache_level == I915_CACHE_NONE || + cache_level == I915_CACHE_WT) { /* If we're coming from LLC cached, then we haven't * actually been tracking whether the data is in the * CPU cache or not, since we only allow one bit set * in obj->write_domain and have been skipping the clflushes. - * Just set it to the CPU cache for now. + * Do them now. */ - WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU); - WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU); + if (obj->pages && !obj->stolen) { + u32 old_write_domain; - old_read_domains = obj->base.read_domains; - old_write_domain = obj->base.write_domain; + trace_i915_gem_object_clflush(obj); + drm_clflush_sg(obj->pages); - obj->base.read_domains = I915_GEM_DOMAIN_CPU; - obj->base.write_domain = I915_GEM_DOMAIN_CPU; + old_write_domain = obj->base.write_domain; + obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU; - trace_i915_gem_object_change_domain(obj, - old_read_domains, - old_write_domain); + trace_i915_gem_object_change_domain(obj, + obj->base.read_domains, + old_write_domain); + } } obj->cache_level = cache_level; @@ -3565,7 +3567,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * of uncaching, which would allow us to flush all the LLC-cached data * with that bit in the PTE to main memory with just one PIPE_CONTROL. */ - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + ret = i915_gem_object_set_cache_level(obj, + HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE); if (ret) return ret; @@ -3638,7 +3641,8 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) /* Flush the CPU cache if it's still invalid. */ if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) { - i915_gem_clflush_object(obj); + if (obj->cache_level == I915_CACHE_NONE) + i915_gem_clflush_object(obj); obj->base.read_domains |= I915_GEM_DOMAIN_CPU; } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 0522d00..072a348 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -54,6 +54,7 @@ (((bits) & 0x8) << (11 - 3))) #define HSW_WB_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x3) #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb) +#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6) static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, enum i915_cache_level level) @@ -116,8 +117,16 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, gen6_gtt_pte_t pte = GEN6_PTE_VALID; pte |= HSW_PTE_ADDR_ENCODE(addr); - if (level != I915_CACHE_NONE) + switch (level) { + case I915_CACHE_NONE: + break; + case I915_CACHE_WT: + pte |= HSW_WT_ELLC_LLC_AGE0; + break; + default: pte |= HSW_WB_ELLC_LLC_AGE0; + break; + } return pte; } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index e47cf00..e831292 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -338,6 +338,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_PINNED_BATCHES 24 #define I915_PARAM_HAS_EXEC_NO_RELOC 25 #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 +#define I915_PARAM_HAS_WT 27 typedef struct drm_i915_getparam { int param; -- 1.8.3.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 19:25 ` Chris Wilson @ 2013-07-31 13:16 ` Ville Syrjälä 2013-07-31 13:36 ` Chris Wilson 0 siblings, 1 reply; 19+ messages in thread From: Ville Syrjälä @ 2013-07-31 13:16 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx On Tue, Jul 30, 2013 at 08:25:56PM +0100, Chris Wilson wrote: > Haswell GT3e has the unique feature of supporting Write-Through cacheing > of objects within the eLLC/LLC. The purpose of this is to enable the display > plane to remain coherent whilst objects lie resident in the eLLC/LLC - so > that we, in theory, get the best of both worlds, perfect display and fast > access. > > However, we still need to be careful as the CPU does not see the WT when > accessing the cache. In particular, this means that we need to flush the > cache lines after writing to an object through the CPU, and on > transitioning from a cached state to WT. > > v2: Actually do the clflush on transition to WT, nagging by Ville. > v3: Flush the CPU cache after writes into WT objects. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Cc: Kenneth Graunke <kenneth@whitecape.org> > --- > drivers/gpu/drm/i915/i915_dma.c | 3 +++ > drivers/gpu/drm/i915/i915_drv.h | 4 +++- > drivers/gpu/drm/i915/i915_gem.c | 38 ++++++++++++++++++++----------------- > drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++++++- > include/uapi/drm/i915_drm.h | 1 + > 5 files changed, 38 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 8da0b3d..75989fc 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -976,6 +976,9 @@ static int i915_getparam(struct drm_device *dev, void *data, > case I915_PARAM_HAS_LLC: > value = HAS_LLC(dev); > break; > + case I915_PARAM_HAS_WT: > + value = HAS_WT(dev); > + break; > case I915_PARAM_HAS_ALIASING_PPGTT: > value = dev_priv->mm.aliasing_ppgtt ? 1 : 0; > break; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 34d2b9d..d27a82a 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -452,6 +452,7 @@ enum i915_cache_level { > I915_CACHE_NONE = 0, > I915_CACHE_LLC, > I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */ > + I915_CACHE_WT, > }; > > typedef uint32_t gen6_gtt_pte_t; > @@ -1344,7 +1345,7 @@ struct drm_i915_gem_object { > unsigned int pending_fenced_gpu_access:1; > unsigned int fenced_gpu_access:1; > > - unsigned int cache_level:2; > + unsigned int cache_level:3; > > unsigned int has_aliasing_ppgtt_mapping:1; > unsigned int has_global_gtt_mapping:1; > @@ -1547,6 +1548,7 @@ struct drm_i915_file_private { > #define HAS_BLT(dev) (INTEL_INFO(dev)->has_blt_ring) > #define HAS_VEBOX(dev) (INTEL_INFO(dev)->has_vebox_ring) > #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) > +#define HAS_WT(dev) (IS_HASWELL(dev) && ((struct drm_i915_private *)(dev)->dev_private)->ellc_size) > #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws) > > #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 5) > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 99362f7..1c5bcc7 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3286,11 +3286,13 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj) > * snooping behaviour occurs naturally as the result of our domain > * tracking. > */ > - if (obj->cache_level != I915_CACHE_NONE) > + switch (obj->cache_level) { > + case I915_CACHE_LLC: > + case I915_CACHE_LLC_MLC: > return; > + } > > trace_i915_gem_object_clflush(obj); > - > drm_clflush_sg(obj->pages); > } > > @@ -3447,27 +3449,27 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > i915_gem_obj_ggtt_set_color(obj, cache_level); > } > > - if (cache_level == I915_CACHE_NONE) { > - u32 old_read_domains, old_write_domain; > - > + if (cache_level == I915_CACHE_NONE || > + cache_level == I915_CACHE_WT) { > /* If we're coming from LLC cached, then we haven't > * actually been tracking whether the data is in the > * CPU cache or not, since we only allow one bit set > * in obj->write_domain and have been skipping the clflushes. > - * Just set it to the CPU cache for now. > + * Do them now. > */ > - WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU); > - WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU); > + if (obj->pages && !obj->stolen) { > + u32 old_write_domain; > > - old_read_domains = obj->base.read_domains; > - old_write_domain = obj->base.write_domain; > + trace_i915_gem_object_clflush(obj); > + drm_clflush_sg(obj->pages); > > - obj->base.read_domains = I915_GEM_DOMAIN_CPU; > - obj->base.write_domain = I915_GEM_DOMAIN_CPU; > + old_write_domain = obj->base.write_domain; > + obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU; > > - trace_i915_gem_object_change_domain(obj, > - old_read_domains, > - old_write_domain); > + trace_i915_gem_object_change_domain(obj, > + obj->base.read_domains, > + old_write_domain); > + } With the change to i915_gem_clflush_object() this hunk could be dropped, no? > } > > obj->cache_level = cache_level; > @@ -3565,7 +3567,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * of uncaching, which would allow us to flush all the LLC-cached data > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > */ > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > + ret = i915_gem_object_set_cache_level(obj, > + HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE); > if (ret) > return ret; > > @@ -3638,7 +3641,8 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) > > /* Flush the CPU cache if it's still invalid. */ > if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) { > - i915_gem_clflush_object(obj); > + if (obj->cache_level == I915_CACHE_NONE) > + i915_gem_clflush_object(obj); And what about pwrite? I must admit that I'm getting confused by our caches again. Based on what I've read I *think* UC GPU accesses will snoop/invalidate the LLC and IA caches, so this kind of flush shouldn't be necessary (on LLC platforms that is). Supposedly the same goes for the pre-copy flushes in pread/pwrite. And the post-copy flush in pwrite is needed due to the display engine only. Should we maybe add a special UC cache level for display to avoid pointless flushes on other UC buffers LLC platforms? That way userland could set most buffers to UC via the ioctl and use MOCS to override them to LLC on IVB. Also while looking through BSpec I noticed a slightly worrying note. Apparently, on HSW at least, L3/not-LLC cacheable surfaces can still evict dirty lines from L3 to LLC. The IVB flow diagrams leave me to think IVB could behave the same way, even though it's not really spelled out there. This would only be an issue when using MOCS since you can't configure such a caching mode through the PTEs alone. > > obj->base.read_domains |= I915_GEM_DOMAIN_CPU; > } > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 0522d00..072a348 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -54,6 +54,7 @@ > (((bits) & 0x8) << (11 - 3))) > #define HSW_WB_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x3) > #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb) > +#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6) > > static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, > enum i915_cache_level level) > @@ -116,8 +117,16 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, > gen6_gtt_pte_t pte = GEN6_PTE_VALID; > pte |= HSW_PTE_ADDR_ENCODE(addr); > > - if (level != I915_CACHE_NONE) > + switch (level) { > + case I915_CACHE_NONE: > + break; > + case I915_CACHE_WT: > + pte |= HSW_WT_ELLC_LLC_AGE0; > + break; > + default: > pte |= HSW_WB_ELLC_LLC_AGE0; > + break; > + } > > return pte; > } > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > index e47cf00..e831292 100644 > --- a/include/uapi/drm/i915_drm.h > +++ b/include/uapi/drm/i915_drm.h > @@ -338,6 +338,7 @@ typedef struct drm_i915_irq_wait { > #define I915_PARAM_HAS_PINNED_BATCHES 24 > #define I915_PARAM_HAS_EXEC_NO_RELOC 25 > #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 > +#define I915_PARAM_HAS_WT 27 > > typedef struct drm_i915_getparam { > int param; > -- > 1.8.3.2 -- Ville Syrjälä Intel OTC ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-31 13:16 ` Ville Syrjälä @ 2013-07-31 13:36 ` Chris Wilson 2013-07-31 15:16 ` Ville Syrjälä 0 siblings, 1 reply; 19+ messages in thread From: Chris Wilson @ 2013-07-31 13:36 UTC (permalink / raw) To: Ville Syrjälä; +Cc: intel-gfx On Wed, Jul 31, 2013 at 04:16:40PM +0300, Ville Syrjälä wrote: > On Tue, Jul 30, 2013 at 08:25:56PM +0100, Chris Wilson wrote: > > + if (cache_level == I915_CACHE_NONE || > > + cache_level == I915_CACHE_WT) { > > /* If we're coming from LLC cached, then we haven't > > * actually been tracking whether the data is in the > > * CPU cache or not, since we only allow one bit set > > * in obj->write_domain and have been skipping the clflushes. > > - * Just set it to the CPU cache for now. > > + * Do them now. > > */ > > - WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU); > > - WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU); > > + if (obj->pages && !obj->stolen) { > > + u32 old_write_domain; > > > > - old_read_domains = obj->base.read_domains; > > - old_write_domain = obj->base.write_domain; > > + trace_i915_gem_object_clflush(obj); > > + drm_clflush_sg(obj->pages); > > > > - obj->base.read_domains = I915_GEM_DOMAIN_CPU; > > - obj->base.write_domain = I915_GEM_DOMAIN_CPU; > > + old_write_domain = obj->base.write_domain; > > + obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU; > > > > - trace_i915_gem_object_change_domain(obj, > > - old_read_domains, > > - old_write_domain); > > + trace_i915_gem_object_change_domain(obj, > > + obj->base.read_domains, > > + old_write_domain); > > + } > > With the change to i915_gem_clflush_object() this hunk could be dropped, > no? I liked the flush being explicit, so kept it in. It can be done either way. > And what about pwrite? If anybody used pwrite to a scanout buffer, they are going to be upset... Yes, it needs to clflush afterwards if WT. > I must admit that I'm getting confused by our caches again. > > Based on what I've read I *think* UC GPU accesses will snoop/invalidate > the LLC and IA caches, so this kind of flush shouldn't be necessary (on > LLC platforms that is). Nope, you have to explicitly enable the snooping in the GPU (that's the cacheable bit on non-LLC platforms), and then it only occurs upon invalidation/flush commands. > Supposedly the same goes for the pre-copy flushes > in pread/pwrite. And the post-copy flush in pwrite is needed due to the > display engine only. > > Should we maybe add a special UC cache level for display to avoid pointless > flushes on other UC buffers LLC platforms? That way userland could set most > buffers to UC via the ioctl and use MOCS to override them to LLC on IVB. No. We need the LLC bit set for coherent CPU mmaps which are now in vogue to avoid having to use the fences for detiling, and always have been preferred for linear buffers (i.e. asynchronous VBO maps and texture streams). > Also while looking through BSpec I noticed a slightly worrying note. > Apparently, on HSW at least, L3/not-LLC cacheable surfaces can > still evict dirty lines from L3 to LLC. The IVB flow diagrams leave me to > think IVB could behave the same way, even though it's not really spelled > out there. This would only be an issue when using MOCS since you can't > configure such a caching mode through the PTEs alone. Afaict, the render write flush is sufficient to write the dirty cache lines to LLC/UC memory, so from the kernel/CPU perspective it never has to worry about L3. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-31 13:36 ` Chris Wilson @ 2013-07-31 15:16 ` Ville Syrjälä 2013-07-31 15:26 ` Chris Wilson 0 siblings, 1 reply; 19+ messages in thread From: Ville Syrjälä @ 2013-07-31 15:16 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Kenneth Graunke On Wed, Jul 31, 2013 at 02:36:39PM +0100, Chris Wilson wrote: > On Wed, Jul 31, 2013 at 04:16:40PM +0300, Ville Syrjälä wrote: > > Also while looking through BSpec I noticed a slightly worrying note. > > Apparently, on HSW at least, L3/not-LLC cacheable surfaces can > > still evict dirty lines from L3 to LLC. The IVB flow diagrams leave me to > > think IVB could behave the same way, even though it's not really spelled > > out there. This would only be an issue when using MOCS since you can't > > configure such a caching mode through the PTEs alone. > > Afaict, the render write flush is sufficient to write the dirty cache > lines to LLC/UC memory, so from the kernel/CPU perspective it never has > to worry about L3. The problem would only occur when we have a an non-LLC cached scanout buffer which gets marked as L3 cacheable via MOCS. BSpec says that if stuff is evicted from L3 it may land in LLC regardless of the LLC cacheability bits. The data would then remain in LLC and would not get flushed to memory as that would require an explicit clflush. And in the end we'd scan out some stale garbage. That's for HSW at least. For IVB I'm not sure. It may be that L3 and LLC are supposed to be inclusive so if you have something in L3 it must also be present in LLC. I've seen some old training material about MLC that stated as much, but I don't know if that design was actually carried over to IVB. -- Ville Syrjälä Intel OTC ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-31 15:16 ` Ville Syrjälä @ 2013-07-31 15:26 ` Chris Wilson 2013-07-31 15:54 ` Ville Syrjälä 0 siblings, 1 reply; 19+ messages in thread From: Chris Wilson @ 2013-07-31 15:26 UTC (permalink / raw) To: Ville Syrjälä; +Cc: intel-gfx On Wed, Jul 31, 2013 at 06:16:14PM +0300, Ville Syrjälä wrote: > On Wed, Jul 31, 2013 at 02:36:39PM +0100, Chris Wilson wrote: > > On Wed, Jul 31, 2013 at 04:16:40PM +0300, Ville Syrjälä wrote: > > > Also while looking through BSpec I noticed a slightly worrying note. > > > Apparently, on HSW at least, L3/not-LLC cacheable surfaces can > > > still evict dirty lines from L3 to LLC. The IVB flow diagrams leave me to > > > think IVB could behave the same way, even though it's not really spelled > > > out there. This would only be an issue when using MOCS since you can't > > > configure such a caching mode through the PTEs alone. > > > > Afaict, the render write flush is sufficient to write the dirty cache > > lines to LLC/UC memory, so from the kernel/CPU perspective it never has > > to worry about L3. > > The problem would only occur when we have a an non-LLC cached scanout buffer > which gets marked as L3 cacheable via MOCS. BSpec says that if stuff is evicted > from L3 it may land in LLC regardless of the LLC cacheability bits. The data > would then remain in LLC and would not get flushed to memory as that > would require an explicit clflush. And in the end we'd scan out some stale > garbage. That's true (for all LLC machines), but that wasn't the point I thought you were making. I thought we were arguing about CPU coherency, and whether the CPU would see the bits evicted from L3 into LLC irrespective of whether the PTE was marked as LLC or UC. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-31 15:26 ` Chris Wilson @ 2013-07-31 15:54 ` Ville Syrjälä 2013-07-31 16:14 ` Chris Wilson 0 siblings, 1 reply; 19+ messages in thread From: Ville Syrjälä @ 2013-07-31 15:54 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Kenneth Graunke On Wed, Jul 31, 2013 at 04:26:41PM +0100, Chris Wilson wrote: > On Wed, Jul 31, 2013 at 06:16:14PM +0300, Ville Syrjälä wrote: > > On Wed, Jul 31, 2013 at 02:36:39PM +0100, Chris Wilson wrote: > > > On Wed, Jul 31, 2013 at 04:16:40PM +0300, Ville Syrjälä wrote: > > > > Also while looking through BSpec I noticed a slightly worrying note. > > > > Apparently, on HSW at least, L3/not-LLC cacheable surfaces can > > > > still evict dirty lines from L3 to LLC. The IVB flow diagrams leave me to > > > > think IVB could behave the same way, even though it's not really spelled > > > > out there. This would only be an issue when using MOCS since you can't > > > > configure such a caching mode through the PTEs alone. > > > > > > Afaict, the render write flush is sufficient to write the dirty cache > > > lines to LLC/UC memory, so from the kernel/CPU perspective it never has > > > to worry about L3. > > > > The problem would only occur when we have a an non-LLC cached scanout buffer > > which gets marked as L3 cacheable via MOCS. BSpec says that if stuff is evicted > > from L3 it may land in LLC regardless of the LLC cacheability bits. The data > > would then remain in LLC and would not get flushed to memory as that > > would require an explicit clflush. And in the end we'd scan out some stale > > garbage. > > That's true (for all LLC machines), Well, I guess all LLC machines w/ L3, ie. IVB+. So basically it means that scanout surfaces should never be marked as either L3 or LLC cacheable via MOCS. I suppose GFDT would save us here, even in the L3 evicition case, except it's not guaranteed to work on HSW :( -- Ville Syrjälä Intel OTC ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-31 15:54 ` Ville Syrjälä @ 2013-07-31 16:14 ` Chris Wilson 0 siblings, 0 replies; 19+ messages in thread From: Chris Wilson @ 2013-07-31 16:14 UTC (permalink / raw) To: Ville Syrjälä; +Cc: intel-gfx On Wed, Jul 31, 2013 at 06:54:07PM +0300, Ville Syrjälä wrote: > On Wed, Jul 31, 2013 at 04:26:41PM +0100, Chris Wilson wrote: > > On Wed, Jul 31, 2013 at 06:16:14PM +0300, Ville Syrjälä wrote: > > > On Wed, Jul 31, 2013 at 02:36:39PM +0100, Chris Wilson wrote: > > > > On Wed, Jul 31, 2013 at 04:16:40PM +0300, Ville Syrjälä wrote: > > > > > Also while looking through BSpec I noticed a slightly worrying note. > > > > > Apparently, on HSW at least, L3/not-LLC cacheable surfaces can > > > > > still evict dirty lines from L3 to LLC. The IVB flow diagrams leave me to > > > > > think IVB could behave the same way, even though it's not really spelled > > > > > out there. This would only be an issue when using MOCS since you can't > > > > > configure such a caching mode through the PTEs alone. > > > > > > > > Afaict, the render write flush is sufficient to write the dirty cache > > > > lines to LLC/UC memory, so from the kernel/CPU perspective it never has > > > > to worry about L3. > > > > > > The problem would only occur when we have a an non-LLC cached scanout buffer > > > which gets marked as L3 cacheable via MOCS. BSpec says that if stuff is evicted > > > from L3 it may land in LLC regardless of the LLC cacheability bits. The data > > > would then remain in LLC and would not get flushed to memory as that > > > would require an explicit clflush. And in the end we'd scan out some stale > > > garbage. > > > > That's true (for all LLC machines), > > Well, I guess all LLC machines w/ L3, ie. IVB+. > > So basically it means that scanout surfaces should never be marked as > either L3 or LLC cacheable via MOCS. I suppose GFDT would save us here, > even in the L3 evicition case, except it's not guaranteed to work on HSW :( Yes. GFDT worked quite well, WT appears much easier to use though. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris 2013-07-30 16:58 ` [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris Chris Wilson 2013-07-30 17:19 ` Ville Syrjälä @ 2013-07-30 17:39 ` Kenneth Graunke 1 sibling, 0 replies; 19+ messages in thread From: Kenneth Graunke @ 2013-07-30 17:39 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx, Ben Widawsky On 07/30/2013 09:58 AM, Chris Wilson wrote: > Haswell GT3e has the unique feature of supporting Write-Through cacheing > of objects within the eLLC. The purpose of this is to enable the display > plane to remain coherent whilst objects lie resident in the eLLC - so > that we in theory get the best of both worlds, perfect display and fast > access. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ben Widawsky <ben@bwidawsk.net> Awesome. Thanks a ton for doing this, Chris! Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt 2013-07-30 16:58 [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt Chris Wilson 2013-07-30 16:58 ` [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris Chris Wilson @ 2013-07-30 17:39 ` Kenneth Graunke 2013-07-30 18:04 ` [PATCH] " Chris Wilson 2 siblings, 0 replies; 19+ messages in thread From: Kenneth Graunke @ 2013-07-30 17:39 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx, Ben Widawsky On 07/30/2013 09:58 AM, Chris Wilson wrote: > The PTE layouts are the same for both ppgtt and gtt, so we can simplify > the setup for ppgtt by copying the encoding function pointer from gtt. > This prevents bugs where we update one function pointer, but forget the > other. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +------- > 1 file changed, 1 insertion(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 1294cee..0522d00 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -298,13 +298,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) > * now. */ > first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt); > > - if (IS_HASWELL(dev)) { > - ppgtt->base.pte_encode = hsw_pte_encode; > - } else if (IS_VALLEYVIEW(dev)) { > - ppgtt->base.pte_encode = byt_pte_encode; > - } else { > - ppgtt->base.pte_encode = gen6_pte_encode; > - } > + ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode; > ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES; > ppgtt->enable = gen6_ppgtt_enable; > ppgtt->base.clear_range = gen6_ppgtt_clear_range; This is much nicer than my old code - thanks! It might be worth mentioning in the commit message that in particular, iris_pte_encode was forgotten here. Either way, Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] drm/i915: Use the same pte_encoding for ppgtt as for gtt 2013-07-30 16:58 [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt Chris Wilson 2013-07-30 16:58 ` [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris Chris Wilson 2013-07-30 17:39 ` [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt Kenneth Graunke @ 2013-07-30 18:04 ` Chris Wilson 2013-07-31 7:42 ` Ben Widawsky 2 siblings, 1 reply; 19+ messages in thread From: Chris Wilson @ 2013-07-30 18:04 UTC (permalink / raw) To: intel-gfx; +Cc: Ben Widawsky The PTE layouts are the same for both ppgtt and gtt, so we can simplify the setup for ppgtt by copying the encoding function pointer from gtt. This prevents bugs where we update one function pointer, but forget the other. For instance, commit 4d15c145a6234d999c0452eec0d275c1fbf0688c Author: Ben Widawsky <ben@bwidawsk.net> Date: Thu Jul 4 11:02:06 2013 -0700 drm/i915: Use eLLC/LLC by default when available only extends the gtt to use eLLC/LLC cacheing and forgets to also update the ppgtt function pointer. v2: Actually mention the bug being fixed (Kenneth) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> --- drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1294cee..0522d00 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -298,13 +298,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) * now. */ first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt); - if (IS_HASWELL(dev)) { - ppgtt->base.pte_encode = hsw_pte_encode; - } else if (IS_VALLEYVIEW(dev)) { - ppgtt->base.pte_encode = byt_pte_encode; - } else { - ppgtt->base.pte_encode = gen6_pte_encode; - } + ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode; ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES; ppgtt->enable = gen6_ppgtt_enable; ppgtt->base.clear_range = gen6_ppgtt_clear_range; -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Use the same pte_encoding for ppgtt as for gtt 2013-07-30 18:04 ` [PATCH] " Chris Wilson @ 2013-07-31 7:42 ` Ben Widawsky 0 siblings, 0 replies; 19+ messages in thread From: Ben Widawsky @ 2013-07-31 7:42 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx On Tue, Jul 30, 2013 at 07:04:37PM +0100, Chris Wilson wrote: > The PTE layouts are the same for both ppgtt and gtt, so we can simplify > the setup for ppgtt by copying the encoding function pointer from gtt. > This prevents bugs where we update one function pointer, but forget the > other. > > For instance, > > commit 4d15c145a6234d999c0452eec0d275c1fbf0688c > Author: Ben Widawsky <ben@bwidawsk.net> > Date: Thu Jul 4 11:02:06 2013 -0700 > > drm/i915: Use eLLC/LLC by default when available > > only extends the gtt to use eLLC/LLC cacheing and forgets to also update > the ppgtt function pointer. > > v2: Actually mention the bug being fixed (Kenneth) > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ben Widawsky <ben@bwidawsk.net> > Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> My bad. I've merged this to my temporary tree for when Daniel returns. -- Ben Widawsky, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2013-07-31 16:14 UTC | newest] Thread overview: 19+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-07-30 16:58 [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt Chris Wilson 2013-07-30 16:58 ` [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris Chris Wilson 2013-07-30 17:19 ` Ville Syrjälä 2013-07-30 17:36 ` Chris Wilson 2013-07-30 18:01 ` Ville Syrjälä 2013-07-30 18:10 ` Chris Wilson 2013-07-30 18:29 ` Chris Wilson 2013-07-30 17:45 ` [PATCH] " Chris Wilson 2013-07-30 19:25 ` Chris Wilson 2013-07-31 13:16 ` Ville Syrjälä 2013-07-31 13:36 ` Chris Wilson 2013-07-31 15:16 ` Ville Syrjälä 2013-07-31 15:26 ` Chris Wilson 2013-07-31 15:54 ` Ville Syrjälä 2013-07-31 16:14 ` Chris Wilson 2013-07-30 17:39 ` [PATCH 2/2] " Kenneth Graunke 2013-07-30 17:39 ` [PATCH 1/2] drm/i915: Use the same pte_encoding for ppgtt as for gtt Kenneth Graunke 2013-07-30 18:04 ` [PATCH] " Chris Wilson 2013-07-31 7:42 ` Ben Widawsky
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.