From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: Enable eLLC caching of display buffers for SKL+
Date: Tue, 16 Apr 2019 17:37:39 +0300
Message-ID: <20190416143739.GI24299@intel.com>
In-Reply-To: <108f888a-23c5-4b13-b170-2c92f0bffc87@intel.com>

On Tue, Apr 16, 2019 at 05:28:57PM +0300, Eero Tamminen wrote:
> Hi,
> 
> Based on quick tests with the patch:
> 
> * Results in GfxBench and Unigine (Valley/Heaven) tests were within 
> daily variation on the tested SKL machines
> 
> * SKL GT4e (128MB eLLC) / Wayland / Weston:
>    +15-20% SynMark TexMem512 (512MB of textures)
>     +4-6% SynMark TerrainFly*, CSCloth, ShMapVsm
>    -5-10% SynMark TexMem128 (128MB of textures)

These seem mostly good. The 128MB case regression seems
understandable since we don't quite fit into the eLLC
anymore.
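
To make the capacity argument concrete, here is a back-of-the-envelope
sketch in C. It only counts bytes. The FullHD geometry, 4 bytes per pixel
and the buffer count used in the note below are assumptions (the thread
does not state the Weston output size), and scanout_bytes() and
fits_in_ellc() are hypothetical helpers, not i915 functions. A real eLLC
is shared with other traffic and will not behave like a perfectly packed
buffer, so treat this as an illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Bytes taken by `count` scanout buffers of the given geometry. */
static uint64_t scanout_bytes(uint32_t width, uint32_t height,
			      uint32_t bytes_per_pixel, uint32_t count)
{
	assert(bytes_per_pixel > 0);
	return (uint64_t)width * height * bytes_per_pixel * count;
}

/*
 * True if the textures plus any scanout buffers that are now cached in
 * the eLLC still fit. Pass 0 for cached_scanout_bytes to model the
 * unpatched case, where display buffers bypass the eLLC.
 */
static bool fits_in_ellc(uint64_t texture_bytes, uint64_t cached_scanout_bytes,
			 uint64_t ellc_bytes)
{
	return texture_bytes + cached_scanout_bytes <= ellc_bytes;
}
```

With those assumed numbers, fits_in_ellc(128 * MiB, 0, 128 * MiB) holds,
while adding two FullHD scanout buffers (roughly 16 MB in total) pushes
the sum past 128 MiB, which would be consistent with TexMem128 regressing
on the GT4e.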

> 
> * SKL GT3e (64MB eLLC) / Xorg / Unity:
>    +4-8% GpuTest Triangle fullscreen (FullHD)
>   -5-10% GpuTest Triangle windowed (1/2 screen)

Not quite sure why the windowed case would suffer here :/

> 
> * SKL GT2 (no eLLC) / Xorg / Unity:
>    * Some of the higher FPS SynMark pixel and vertex shader tests
>      are a few percent higher, more than daily variance
>    => Do you see any reason why this machine would be impacted
>       although it doesn't have eLLC?

Can't think of a reason for that. All display buffers should still
be UC on such a machine.
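
A minimal model of why a part without eLLC should be unaffected. It
assumes scanout buffers pick their cache level from HAS_WT(): the
"display eLLC" PPAT entry if set, uncached otherwise. That selection rule
is an assumption about the driver, not something stated in this thread,
and every model_* name below is hypothetical. Only the reduction of
HAS_WT() to HAS_EDRAM() comes from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one drm_i915_private field that matters. */
struct model_i915 {
	unsigned int edram_size_mb;	/* 0 on parts without eLLC, e.g. SKL GT2 */
};

enum model_cache_level {
	MODEL_CACHE_NONE,	/* uncached (UC) */
	MODEL_CACHE_WT,		/* the "display eLLC" PPAT entry */
};

/* Mirrors the patched macros: HAS_WT() reduces to HAS_EDRAM(). */
static bool model_has_wt(const struct model_i915 *i915)
{
	assert(i915);
	return i915->edram_size_mb != 0;
}

/* Assumed selection rule for scanout buffers (not taken from the patch). */
static enum model_cache_level
model_display_cache_level(const struct model_i915 *i915)
{
	return model_has_wt(i915) ? MODEL_CACHE_WT : MODEL_CACHE_NONE;
}
```

Under this model a machine with edram_size_mb == 0 always ends up with
MODEL_CACHE_NONE for display buffers, before and after the patch.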

> 
> (I built it against drm-tip and compared results against previous and 
> next day unpatched drm-tip results that I had otherwise.)
> 
> 
> 	- Eero
> 
> On 15.4.2019 17.16, Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Since SKL the eLLC has been sitting on the far side of the system
> > agent, meaning the display engine can utilize it. Let's enable that.
> > 
> > I chose WB for the caching mode, because my numbers are indicating
> > that WT might actually be WB and WC might actually be UC. I'm not
> > 100% sure that is indeed the case but at least my simple rendercopy
> > based benchmark didn't see any difference in performance.
> > 
> > Also if I configure things to do LLCeLLC+WT I still get cache dirt
> > on my screen, suggesting that is in fact operating in WB mode
> > anyway. This is also the reason I had to fix the MOCS target cache
> > to really say PTE rather than LLC+eLLC.
> > 
> > Caveat: I've not benchmarked any real workloads. IIRC Eero did
> > benchmark an earlier version, but that didn't have the PTE vs.
> > LLC+eLLC MOCS fix so it wasn't actually doing the right thing
> > most likely.
> > 
> > Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.h     | 3 +--
> >   drivers/gpu/drm/i915/i915_gem_gtt.c | 7 +++++--
> >   drivers/gpu/drm/i915/i915_gem_gtt.h | 2 +-
> >   drivers/gpu/drm/i915/intel_mocs.c   | 2 +-
> >   4 files changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 35d0782c077e..2a4f33fa2bba 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2517,8 +2517,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
> >   #define HAS_LLC(dev_priv)	(INTEL_INFO(dev_priv)->has_llc)
> >   #define HAS_SNOOP(dev_priv)	(INTEL_INFO(dev_priv)->has_snoop)
> >   #define HAS_EDRAM(dev_priv)	((dev_priv)->edram_size_mb)
> > -#define HAS_WT(dev_priv)	((IS_HASWELL(dev_priv) || \
> > -				 IS_BROADWELL(dev_priv)) && HAS_EDRAM(dev_priv))
> > +#define HAS_WT(dev_priv)	HAS_EDRAM(dev_priv)
> >   
> >   #define HWS_NEEDS_PHYSICAL(dev_priv)	(INTEL_INFO(dev_priv)->hws_needs_physical)
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 8f460cc4cc1f..038fbf52a997 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -3071,7 +3071,7 @@ static void cnl_setup_private_ppat(struct intel_ppat *ppat)
> >   
> >   	__alloc_ppat_entry(ppat, 0, GEN8_PPAT_WB | GEN8_PPAT_LLC);
> >   	__alloc_ppat_entry(ppat, 1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC);
> > -	__alloc_ppat_entry(ppat, 2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC);
> > +	__alloc_ppat_entry(ppat, 2, GEN8_PPAT_WB | GEN8_PPAT_ELLC_OVERRIDE);
> >   	__alloc_ppat_entry(ppat, 3, GEN8_PPAT_UC);
> >   	__alloc_ppat_entry(ppat, 4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0));
> >   	__alloc_ppat_entry(ppat, 5, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1));
> > @@ -3109,7 +3109,10 @@ static void bdw_setup_private_ppat(struct intel_ppat *ppat)
> >   
> >   	__alloc_ppat_entry(ppat, 0, GEN8_PPAT_WB | GEN8_PPAT_LLC);      /* for normal objects, no eLLC */
> >   	__alloc_ppat_entry(ppat, 1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC);  /* for something pointing to ptes? */
> > -	__alloc_ppat_entry(ppat, 2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC);  /* for scanout with eLLC */
> > +	if (INTEL_GEN(ppat->i915) >= 9)
> > +		__alloc_ppat_entry(ppat, 2, GEN8_PPAT_WB | GEN8_PPAT_ELLC_OVERRIDE); /* for scanout with eLLC */
> > +	else
> > +		__alloc_ppat_entry(ppat, 2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC); /* for scanout with eLLC */
> >   	__alloc_ppat_entry(ppat, 3, GEN8_PPAT_UC);                      /* Uncached objects, mostly for scanout */
> >   	__alloc_ppat_entry(ppat, 4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0));
> >   	__alloc_ppat_entry(ppat, 5, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1));
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> > index f597f35b109b..47adc7268867 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> > @@ -139,7 +139,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
> >   #define PPAT_UNCACHED			(_PAGE_PWT | _PAGE_PCD)
> >   #define PPAT_CACHED_PDE			0 /* WB LLC */
> >   #define PPAT_CACHED			_PAGE_PAT /* WB LLCeLLC */
> > -#define PPAT_DISPLAY_ELLC		_PAGE_PCD /* WT eLLC */
> > +#define PPAT_DISPLAY_ELLC		_PAGE_PCD /* WT LLCeLLC (HSW/BDW) or WB eLLC (SKL+) */
> >   
> >   #define CHV_PPAT_SNOOP			(1<<6)
> >   #define GEN8_PPAT_AGE(x)		((x)<<4)
> > diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
> > index 274ba78500c0..d984ccff94ef 100644
> > --- a/drivers/gpu/drm/i915/intel_mocs.c
> > +++ b/drivers/gpu/drm/i915/intel_mocs.c
> > @@ -115,7 +115,7 @@ struct drm_i915_mocs_table {
> >   		   LE_1_UC | LE_TC_2_LLC_ELLC, \
> >   		   L3_1_UC), \
> >   	MOCS_ENTRY(I915_MOCS_PTE, \
> > -		   LE_0_PAGETABLE | LE_TC_2_LLC_ELLC | LE_LRUM(3), \
> > +		   LE_0_PAGETABLE | LE_TC_0_PAGETABLE | LE_LRUM(3), \
> >   		   L3_3_WB)
> >   
> >   static const struct drm_i915_mocs_entry skylake_mocs_table[] = {
> > 

-- 
Ville Syrjälä
Intel