From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ville Syrjälä
Subject: Re: [PATCH 2/2] drm/i915: Use Write-Through cacheing for the display plane on Iris
Date: Tue, 30 Jul 2013 20:19:28 +0300
Message-ID: <20130730171928.GV5004@intel.com>
References: <1375203516-8023-1-git-send-email-chris@chris-wilson.co.uk> <1375203516-8023-2-git-send-email-chris@chris-wilson.co.uk>
In-Reply-To: <1375203516-8023-2-git-send-email-chris@chris-wilson.co.uk>
To: Chris Wilson
Cc: intel-gfx@lists.freedesktop.org, Ben Widawsky
List-Id: intel-gfx@lists.freedesktop.org

On Tue, Jul 30, 2013 at 05:58:36PM +0100, Chris Wilson wrote:
> Haswell GT3e has the unique feature of supporting Write-Through cacheing
> of objects within the eLLC. The purpose of this is to enable the display
> plane to remain coherent whilst objects lie resident in the eLLC - so
> that we in theory get the best of both worlds, perfect display and fast
> access.

The description here talks about eLLC only, but you set the PTE for WT
in both LLC and eLLC.

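To make the PTE point concrete, here is a rough standalone sketch of the cacheability bits that the iris_pte_encode() hunk further down in the quoted patch would produce. It is not the driver code: the address bits are left out, GEN6_PTE_VALID is assumed to be bit 0, and the low half of HSW_CACHEABILITY_CONTROL() is an assumption, since only the "& 0x8" half is visible in the quoted context. The sketch_* names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t gen6_gtt_pte_t;

enum i915_cache_level {
	I915_CACHE_NONE = 0,
	I915_CACHE_LLC,
	I915_CACHE_LLC_MLC,
	I915_CACHE_WT,
};

/* Assumption: the valid bit is bit 0 of the PTE. */
#define GEN6_PTE_VALID (1u << 0)

/*
 * Assumption: the "& 0x7" half is not shown in the quoted hunk; only the
 * "& 0x8" half is. Control bits 0-2 land in PTE bits 1-3, bit 3 in bit 11.
 */
#define HSW_CACHEABILITY_CONTROL(bits) ((((bits) & 0x7) << 1) | \
					(((bits) & 0x8) << (11 - 3)))
#define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)

/* Hypothetical helper: same switch as the patch, minus the address bits. */
gen6_gtt_pte_t sketch_iris_pte_flags(enum i915_cache_level level)
{
	gen6_gtt_pte_t pte = GEN6_PTE_VALID;

	switch (level) {
	case I915_CACHE_NONE:
		break;			/* uncached: no control bits set */
	case I915_CACHE_WT:
		pte |= HSW_WT_ELLC_LLC_AGE0;
		break;
	default:
		pte |= HSW_WB_ELLC_LLC_AGE0;
		break;
	}
	return pte;
}

/* Hypothetical self-check of the values implied by the macros above. */
void sketch_pte_selfcheck(void)
{
	assert(sketch_iris_pte_flags(I915_CACHE_NONE) == 0x001);
	assert(sketch_iris_pte_flags(I915_CACHE_WT) == 0x00d);
	assert(sketch_iris_pte_flags(I915_CACHE_LLC) == 0x807);
}
```

Under these assumptions the WT encoding differs from the WB one only in the control bits, and the macro name it uses covers both ELLC and LLC, which is why the commit message reads narrower than the code.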
> Signed-off-by: Chris Wilson
> Cc: Ben Widawsky
> ---
>  drivers/gpu/drm/i915/i915_dma.c     |  3 +++
>  drivers/gpu/drm/i915/i915_drv.h     |  4 +++-
>  drivers/gpu/drm/i915/i915_gem.c     |  3 ++-
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++++++-
>  include/uapi/drm/i915_drm.h         |  1 +
>  5 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 8da0b3d..75989fc 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -976,6 +976,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>  	case I915_PARAM_HAS_LLC:
>  		value = HAS_LLC(dev);
>  		break;
> +	case I915_PARAM_HAS_WT:
> +		value = HAS_WT(dev);
> +		break;
>  	case I915_PARAM_HAS_ALIASING_PPGTT:
>  		value = dev_priv->mm.aliasing_ppgtt ? 1 : 0;
>  		break;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 34d2b9d..324ea14 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -452,6 +452,7 @@ enum i915_cache_level {
>  	I915_CACHE_NONE = 0,
>  	I915_CACHE_LLC,
>  	I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */
> +	I915_CACHE_WT,
>  };
>
>  typedef uint32_t gen6_gtt_pte_t;
> @@ -1344,7 +1345,7 @@ struct drm_i915_gem_object {
>  	unsigned int pending_fenced_gpu_access:1;
>  	unsigned int fenced_gpu_access:1;
>
> -	unsigned int cache_level:2;
> +	unsigned int cache_level:3;
>
>  	unsigned int has_aliasing_ppgtt_mapping:1;
>  	unsigned int has_global_gtt_mapping:1;
> @@ -1547,6 +1548,7 @@ struct drm_i915_file_private {
>  #define HAS_BLT(dev) (INTEL_INFO(dev)->has_blt_ring)
>  #define HAS_VEBOX(dev) (INTEL_INFO(dev)->has_vebox_ring)
>  #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc)
> +#define HAS_WT(dev) (IS_HASWELL(dev) && ((struct drm_i915_private *)(dev))->ellc_size)

->dev_private is missing here: as written, the drm_device pointer itself
gets cast to struct drm_i915_private *.

>  #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws)
>
>  #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 5)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 99362f7..cbea7f8 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3565,7 +3565,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  	 * of uncaching, which would allow us to flush all the LLC-cached data
>  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
>  	 */
> -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> +	ret = i915_gem_object_set_cache_level(obj,
> +					      HAS_WT(obj->base.dev) ? I915_CACHE_WT : I915_CACHE_NONE);

Don't we need to tweak the write domain like we do for UC to make sure
already dirty lines get flushed from the caches?

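On the HAS_WT() comment above, something along these lines is what I would expect the macro to look like. This is only a standalone sketch so it compiles outside the kernel: the two structs are cut-down stand-ins, the is_haswell field and the to_i915_sketch()/sketch_has_wt() helpers are made up for illustration, and the real IS_HASWELL() goes through INTEL_INFO(dev) rather than a field like this.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the driver private data; only the fields used below. */
struct drm_i915_private {
	size_t ellc_size;	/* nonzero when eLLC is present */
	int is_haswell;		/* hypothetical replacement for the IS_HASWELL() check */
};

/* Stand-in for struct drm_device. */
struct drm_device {
	void *dev_private;	/* points at the drm_i915_private */
};

/* Hypothetical helper: go through ->dev_private instead of casting dev. */
#define to_i915_sketch(dev) ((struct drm_i915_private *)(dev)->dev_private)

#define IS_HASWELL(dev) (to_i915_sketch(dev)->is_haswell)
#define HAS_WT(dev) (IS_HASWELL(dev) && to_i915_sketch(dev)->ellc_size)

/* Thin wrapper so the macro can be exercised from a test. */
int sketch_has_wt(struct drm_device *dev)
{
	return HAS_WT(dev);
}
```

With the cast applied to (dev) directly, as in the patch, ->ellc_size would be read from somewhere inside the struct drm_device, so the result would depend on whatever happens to live at that offset.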
>  	if (ret)
>  		return ret;
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 0522d00..072a348 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -54,6 +54,7 @@
>  					 (((bits) & 0x8) << (11 - 3)))
>  #define HSW_WB_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x3)
>  #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb)
> +#define HSW_WT_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x6)
>
>  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
>  				      enum i915_cache_level level)
> @@ -116,8 +117,16 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
>  	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
>  	pte |= HSW_PTE_ADDR_ENCODE(addr);
>
> -	if (level != I915_CACHE_NONE)
> +	switch (level) {
> +	case I915_CACHE_NONE:
> +		break;
> +	case I915_CACHE_WT:
> +		pte |= HSW_WT_ELLC_LLC_AGE0;
> +		break;
> +	default:
>  		pte |= HSW_WB_ELLC_LLC_AGE0;
> +		break;
> +	}
>
>  	return pte;
>  }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index e47cf00..e831292 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -338,6 +338,7 @@ typedef struct drm_i915_irq_wait {
>  #define I915_PARAM_HAS_PINNED_BATCHES 24
>  #define I915_PARAM_HAS_EXEC_NO_RELOC 25
>  #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26
> +#define I915_PARAM_HAS_WT 27
>
>  typedef struct drm_i915_getparam {
>  	int param;
> --
> 1.8.3.2

-- 
Ville Syrjälä
Intel OTC