From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: Move hsw GT w/a to engine initialisation
Date: Fri, 3 Nov 2017 12:25:17 +0200
Message-ID: <20171103102517.GU10981@intel.com>
In-Reply-To: <20171103025628.12733-1-chris@chris-wilson.co.uk>
On Fri, Nov 03, 2017 at 02:56:28AM +0000, Chris Wilson wrote:
> In commit b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to
> avoid it clobbering watermarks") init_clock_gating was called earlier in
> the module load sequence, moving it before we acquired the forcewake
> used to initialise the engines. This revealed that on Haswell, at least,
> some of those GT w/as had been moved into the power context, and so as
> we were now setting them outside of the power context, those settings
> were being lost.
Hmm. Writes shouldn't need forcewake, as they go through the wake FIFO,
and the power context should have been set up by the BIOS. So I'm not
sure that explanation is entirely satisfactory, for masked registers
at least. For the ones that do RMW it could well be a problem.

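To illustrate the masked-versus-RMW distinction, here is a minimal sketch in plain C. It is a toy model, not driver code. The two macros use the same encoding as the i915 _MASKED_BIT_ENABLE()/_MASKED_BIT_DISABLE() macros (upper 16 bits are the write mask), but masked_reg_write(), plain_reg_write() and rmw_set_bits() are hypothetical helpers, and the idea that the readback might be bogus is an assumption made only for the sake of the example.

```c
#include <stdint.h>

/* Same encoding as the i915 masked-register macros: the upper 16 bits
 * select which of the lower 16 bits the write is allowed to change. */
#define MASKED_BIT_ENABLE(bit)  (((uint32_t)(bit) << 16) | (uint32_t)(bit))
#define MASKED_BIT_DISABLE(bit) ((uint32_t)(bit) << 16)

/* Hypothetical model of how a masked register applies a write:
 * only the bits named in the upper-half mask are updated. */
static uint32_t masked_reg_write(uint32_t old, uint32_t val)
{
	uint32_t mask = val >> 16;

	return (old & ~mask) | (val & mask);
}

/* Hypothetical model of an ordinary register: the write replaces
 * the whole value, so the caller must supply every bit. */
static uint32_t plain_reg_write(uint32_t old, uint32_t val)
{
	(void)old;
	return val;
}

/* Read-modify-write as done with I915_READ()/I915_WRITE(): the result
 * is only as good as the value that was read back. */
static uint32_t rmw_set_bits(uint32_t hw_value, uint32_t readback,
			     uint32_t bits)
{
	return plain_reg_write(hw_value, readback | bits);
}
```

In this model a masked write needs no read at all, so it cannot clobber neighbouring bits: masked_reg_write(0x00f0, MASKED_BIT_ENABLE(0x0001)) leaves the 0x00f0 bits intact. rmw_set_bits(), on the other hand, writes back whatever it was handed as readback, so if that value does not match what the hardware actually holds, the other bits in the register are lost.
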
Also, there are some registers on the list that IIRC live in the
logical context, like GT_MODE/CACHE_MODE. I guess if the BIOS has
already enabled rc6, those would be lost until we have a context set up.

This problem doesn't seem like it should be specific to HSW. So I wonder
if we should start by just reverting the offending patch and moving just
the watermark thing out to some earlier position in the sequence.

> Now, strictly we want to set power context registers
> using LRI (that ensures there is a power context loaded!), we can
> restore the earlier behaviour by moving the GT register writes back to
> the same point in the module initialisation sequence.
>
> Reported-by: Mark Janes <mark.a.janes@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103549
> Fixes: b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to avoid it clobbering watermarks")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mark Janes <mark.a.janes@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_pm.c | 38 ------------------------------
> drivers/gpu/drm/i915/intel_ringbuffer.c | 41 +++++++++++++++++++++++++++++++++
> 2 files changed, 41 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 308439dd89d4..8a72526d491e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -8588,49 +8588,11 @@ static void hsw_init_clock_gating(struct drm_i915_private *dev_priv)
> {
> ilk_init_lp_watermarks(dev_priv);
>
> - /* L3 caching of data atomics doesn't work -- disable it. */
> - I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> - I915_WRITE(HSW_ROW_CHICKEN3,
> - _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> -
> /* This is required by WaCatErrorRejectionIssue:hsw */
> I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
> I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
> GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB);
>
> - /* WaVSRefCountFullforceMissDisable:hsw */
> - I915_WRITE(GEN7_FF_THREAD_MODE,
> - I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
> -
> - /* WaDisable_RenderCache_OperationalFlush:hsw */
> - I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> -
> - /* enable HiZ Raw Stall Optimization */
> - I915_WRITE(CACHE_MODE_0_GEN7,
> - _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> -
> - /* WaDisable4x2SubspanOptimization:hsw */
> - I915_WRITE(CACHE_MODE_1,
> - _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> -
> - /*
> - * BSpec recommends 8x4 when MSAA is used,
> - * however in practice 16x4 seems fastest.
> - *
> - * Note that PS/WM thread counts depend on the WIZ hashing
> - * disable bit, which we don't touch here, but it's good
> - * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
> - */
> - I915_WRITE(GEN7_GT_MODE,
> - _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
> -
> - /* WaSampleCChickenBitEnable:hsw */
> - I915_WRITE(HALF_SLICE_CHICKEN3,
> - _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE));
> -
> - /* WaSwitchSolVfFArbitrationPriority:hsw */
> - I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
> -
> /* WaRsPkgCStateDisplayPMReq:hsw */
> I915_WRITE(CHICKEN_PAR1_1,
> I915_READ(CHICKEN_PAR1_1) | FORCE_ARB_IDLE_PLANES);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3321b801e77d..3a2287b0d9f4 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -707,6 +707,47 @@ static int init_render_ring(struct intel_engine_cs *engine)
> if (IS_GEN(dev_priv, 6, 7))
> I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
>
> +
> + if (IS_HASWELL(dev_priv)) {
> + /* L3 caching of data atomics doesn't work -- disable it. */
> + I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> + I915_WRITE(HSW_ROW_CHICKEN3,
> + _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> +
> + /* WaVSRefCountFullforceMissDisable:hsw */
> + I915_WRITE(GEN7_FF_THREAD_MODE,
> + I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
> +
> + /* WaDisable_RenderCache_OperationalFlush:hsw */
> + I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> +
> + /* enable HiZ Raw Stall Optimization */
> + I915_WRITE(CACHE_MODE_0_GEN7,
> + _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> +
> + /* WaDisable4x2SubspanOptimization:hsw */
> + I915_WRITE(CACHE_MODE_1,
> + _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> +
> + /*
> + * BSpec recommends 8x4 when MSAA is used,
> + * however in practice 16x4 seems fastest.
> + *
> + * Note that PS/WM thread counts depend on the WIZ hashing
> + * disable bit, which we don't touch here, but it's good
> + * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
> + */
> + I915_WRITE(GEN7_GT_MODE,
> + _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
> +
> + /* WaSampleCChickenBitEnable:hsw */
> + I915_WRITE(HALF_SLICE_CHICKEN3,
> + _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE));
> +
> + /* WaSwitchSolVfFArbitrationPriority:hsw */
> + I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
> + }
> +
> if (INTEL_INFO(dev_priv)->gen >= 6)
> I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
>
> --
> 2.15.0
--
Ville Syrjälä
Intel OTC