From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org,
Simon Farnsworth <simon@farnz.org.uk>,
stable@vger.kernel.org
Subject: Re: [Intel-gfx] [PATCH 3/3] drm/i915: Disable PMSI sleep messages on all rings around context switches
Date: Mon, 15 Dec 2014 10:41:52 +0100 [thread overview]
Message-ID: <20141215094152.GL27182@phenom.ffwll.local> (raw)
In-Reply-To: <1418285821-12868-3-git-send-email-chris@chris-wilson.co.uk>
On Thu, Dec 11, 2014 at 08:17:01AM +0000, Chris Wilson wrote:
> There exists a current workaround to prevent a hang on context switch
> should the ring go to sleep in the middle of the restore,
> WaProgramMiArbOnOffAroundMiSetContext (applicable to all gen7+). In
> spite of disabling arbitration (which prevents the ring from powering
> down during the critical section) we were still hitting hangs that had
> the hallmarks of the known erratum. That is we are still seeing hangs
> "on the last instruction in the context restore". By comparing -nightly
> (broken) with requests (working), we were able to deduce that it was the
> semaphore LRI cross-talk that reproduced the original failure.
> Explicitly disabling PMSI sleep on the RCS ring was insufficient, all
> the rings had to be awake to prevent the hangs. Fortunately, we can
> reduce the wakelock to the MI_SET_CONTEXT operation itself, and so
> should be able to limit the extra power implications.
>
> Since the MI_ARB_ON_OFF workaround is listed for all gen7 and above
> products, we should apply this extra hammer for all of the same
> platforms despite so far that we have only been able to reproduce the
> hang on certain ivb and hsw models. The last question is whether we want
> to always use the extra hammer or only when we know semaphores are in
> operation. At the moment, we only use LRI on non-RCS rings for
> semaphores, but that may change in the future with the possibility of
> reintroducing this bug under subtle conditions.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80660
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83677
> Cc: Simon Farnsworth <simon@farnz.org.uk>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@vger.kernel.org
Oh dear, but awesome piece of debugging!
First question: How does this work on vlv with the split forcewake domain?
Any bad side effects like longer ctx switch times?
> ---
> drivers/gpu/drm/i915/i915_gem_context.c | 39 ++++++++++++++++++++++++---------
> drivers/gpu/drm/i915/i915_reg.h | 2 ++
> 2 files changed, 31 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 2acf5803cf32..724ccecea06a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -484,7 +484,9 @@ mi_set_context(struct intel_engine_cs *ring,
> u32 hw_flags)
> {
> u32 flags = hw_flags | MI_MM_SPACE_GTT;
> - int ret;
> + struct intel_engine_cs *engine;
> + const int num_rings = i915_semaphore_is_enabled(ring->dev) ? hweight32(INTEL_INFO(ring->dev)->ring_mask) : 0;
> + int len, i, ret;
>
> /* w/a: If Flush TLB Invalidation Mode is enabled, driver must do a TLB
> * invalidation prior to MI_SET_CONTEXT. On GEN6 we don't set the value
> @@ -501,15 +503,26 @@ mi_set_context(struct intel_engine_cs *ring,
> if (!IS_HASWELL(ring->dev) && INTEL_INFO(ring->dev)->gen < 8)
> flags |= (MI_SAVE_EXT_STATE_EN | MI_RESTORE_EXT_STATE_EN);
>
> - ret = intel_ring_begin(ring, 6);
> +
> + len = 4;
> + if (INTEL_INFO(ring->dev)->gen >= 7)
> + len += 4*num_rings + 2;
> +
> + ret = intel_ring_begin(ring, len);
> if (ret)
> return ret;
>
> /* WaProgramMiArbOnOffAroundMiSetContext:ivb,vlv,hsw,bdw,chv */
> - if (INTEL_INFO(ring->dev)->gen >= 7)
> - intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_DISABLE);
> - else
> - intel_ring_emit(ring, MI_NOOP);
> + if (INTEL_INFO(ring->dev)->gen >= 7) {
> + if (num_rings) {
> + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(num_rings));
> + for_each_ring(engine, to_i915(ring->dev), i) {
> + intel_ring_emit(ring, RING_PSMI_CTL(engine->mmio_base));
> + intel_ring_emit(ring, _MASKED_BIT_ENABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
> + }
> + } else
> + intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_DISABLE);
For added paranoia, shouldn't we keep this one here? Or is the only thing
this does frob the idle notifications like PSMI? I kinda prefer we just
keep it if there's no harmful effects ...
Otherwise lgtm.
-Daniel
> + }
>
> intel_ring_emit(ring, MI_NOOP);
> intel_ring_emit(ring, MI_SET_CONTEXT);
> @@ -521,10 +534,16 @@ mi_set_context(struct intel_engine_cs *ring,
> */
> intel_ring_emit(ring, MI_NOOP);
>
> - if (INTEL_INFO(ring->dev)->gen >= 7)
> - intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> - else
> - intel_ring_emit(ring, MI_NOOP);
> + if (INTEL_INFO(ring->dev)->gen >= 7) {
> + if (num_rings) {
> + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(num_rings));
> + for_each_ring(engine, to_i915(ring->dev), i) {
> + intel_ring_emit(ring, RING_PSMI_CTL(engine->mmio_base));
> + intel_ring_emit(ring, _MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
> + }
> + } else
> + intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> + }
>
> intel_ring_advance(ring);
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 0ddef7256d02..1ce303cb8852 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1134,6 +1134,7 @@ enum punit_power_well {
> #define GEN6_VERSYNC (RING_SYNC_1(VEBOX_RING_BASE))
> #define GEN6_VEVSYNC (RING_SYNC_2(VEBOX_RING_BASE))
> #define GEN6_NOSYNC 0
> +#define RING_PSMI_CTL(base) ((base)+0x50)
> #define RING_MAX_IDLE(base) ((base)+0x54)
> #define RING_HWS_PGA(base) ((base)+0x80)
> #define RING_HWS_PGA_GEN6(base) ((base)+0x2080)
> @@ -1464,6 +1465,7 @@ enum punit_power_well {
> #define GEN6_BLITTER_FBC_NOTIFY (1<<3)
>
> #define GEN6_RC_SLEEP_PSMI_CONTROL 0x2050
> +#define GEN6_PSMI_SLEEP_MSG_DISABLE (1 << 0)
> #define GEN8_RC_SEMA_IDLE_MSG_DISABLE (1 << 12)
> #define GEN8_FF_DOP_CLOCK_GATE_DISABLE (1<<10)
>
> --
> 2.1.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
next prev parent reply other threads:[~2014-12-15 9:41 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-11 8:16 [PATCH 1/3] drm/i915: Invalidate media caches on gen7 Chris Wilson
2014-12-11 8:17 ` [PATCH 2/3] drm/i915: Force the CS stall for invalidate flushes Chris Wilson
2014-12-12 9:09 ` [Intel-gfx] " Ville Syrjälä
2014-12-12 9:20 ` Chris Wilson
2014-12-12 11:29 ` Ville Syrjälä
2014-12-15 9:30 ` Daniel Vetter
2014-12-11 8:17 ` [PATCH 3/3] drm/i915: Disable PMSI sleep messages on all rings around context switches Chris Wilson
2014-12-11 12:35 ` shuang.he
2014-12-15 9:41 ` Daniel Vetter [this message]
2014-12-15 16:24 ` [Intel-gfx] " Chris Wilson
2014-12-15 17:03 ` Ville Syrjälä
2014-12-16 9:19 ` Daniel Vetter
2014-12-16 9:55 ` [Intel-gfx] " Chris Wilson
2014-12-16 10:04 ` Daniel Vetter
2014-12-11 12:24 ` [PATCH 1/3] drm/i915: Invalidate media caches on gen7 Ville Syrjälä
2014-12-12 9:23 ` [Intel-gfx] " Chris Wilson
2014-12-15 9:34 ` Daniel Vetter
2014-12-16 8:26 ` Chris Wilson
2014-12-16 9:22 ` Daniel Vetter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141215094152.GL27182@phenom.ffwll.local \
--to=daniel@ffwll.ch \
--cc=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
--cc=simon@farnz.org.uk \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.