From: Imre Deak <imre.deak@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2 2/2] drm/i915: fix HW lockup due to missing RPS IRQ workaround on GEN6
Date: Fri, 19 Dec 2014 16:07:49 +0200
Message-ID: <1418998069.28135.6.camel@intelbox>
In-Reply-To: <20141219134111.GM2711@phenom.ffwll.local>

On Fri, 2014-12-19 at 14:41 +0100, Daniel Vetter wrote:
> On Fri, Dec 19, 2014 at 02:51:57PM +0200, Imre Deak wrote:
> > In
> >
> >   commit dbea3cea69508e9d548ed4a6be13de35492e5d15
> >   Author: Imre Deak <imre.deak@intel.com>
> >   Date:   Mon Dec 15 18:59:28 2014 +0200
> >
> >       drm/i915: sanitize RPS resetting during GPU reset
> >
> > we disable RPS interrupts during GPU resetting, but don't apply the
> > necessary GEN6 HW workaround. This leads to a HW lockup during a
> > subsequent "looping batchbuffer" workload. This is triggered by the
> > testcase that submits exactly this kind of workload after a simulated
> > GPU reset. I'm not sure how likely the bug was to trigger
> > otherwise, since we would have applied the workaround anyway shortly
> > after the GPU reset, when enabling GT powersaving from the deferred
> > work.
> >
> > This may also fix unrelated issues, since during driver loading /
> > suspending we also disable RPS interrupts and so we also had a short
> > window during the rest of the loading / resuming where a similar
> > workload could run without the workaround applied.
> >
> > v2:
> > - split the fix to route RPS interrupts to the CPU on GEN9 too
> >   into a separate patch (Daniel)
> >
> > Bisected-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
> > Testcase: igt/gem_reset_stats/ban-ctx-render
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87429
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_irq.c | 18 ++++++++++++++++--
> > drivers/gpu/drm/i915/intel_drv.h | 1 +
> > drivers/gpu/drm/i915/intel_pm.c | 11 +----------
> > 3 files changed, 18 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index aa3180c..f853f26 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -296,6 +296,21 @@ void gen6_enable_rps_interrupts(struct drm_device *dev)
> >  	spin_unlock_irq(&dev_priv->irq_lock);
> >  }
> >
> > +u32 gen6_sanitize_rps_pm_mask(struct drm_i915_private *dev_priv, u32 mask)
> > +{
> > +	/*
> > +	 * IVB and SNB hard hangs on looping batchbuffer
> > +	 * if GEN6_PM_UP_EI_EXPIRED is masked.
> > +	 */
> > +	if (INTEL_INFO(dev_priv)->gen <= 7 && !IS_HASWELL(dev_priv))
>
> Hm, the comment says snb&ivb, but the code includes byt.

Yes, that code comment is out of date. Do you want me to send a v3 fixing
it?

> We also have an unprotected write to PMINTRMSK still in vlv_set_rps_idle.
> For consistency I think we should either switch to explicitly check for
> only snb/ivb here or adapt the mask in vlv_set_rps_idle too.

gen6_rps_pm_mask() is also used on VLV, so the only remaining unsanitized
write is the one you mention in vlv_set_rps_idle. That can be fixed up
separately.
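
For reference, here is a standalone sketch of the masking logic discussed above. It is not driver code: struct platform and the function names are hypothetical stand-ins for the INTEL_INFO()/IS_HASWELL() checks, and the two bit positions are assumed for illustration and should be checked against i915_reg.h. It shows which bits the helper from the patch clears, and that the refactored tail of gen6_rps_pm_mask() should produce the same value as the old one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed for illustration only; see i915_reg.h for the real ones. */
#define PM_RP_UP_EI_EXPIRED         (1u << 2)
#define PMINTR_REDIRECT_TO_NON_DISP (1u << 31)

/* Hypothetical stand-in for the platform checks made by the driver. */
struct platform {
	int gen;
	bool is_haswell;
};

/* Mirrors gen6_sanitize_rps_pm_mask() from the patch: clear the bits
 * that must never be left set in the value written to PMINTRMSK. */
uint32_t sanitize_rps_pm_mask(const struct platform *p, uint32_t mask)
{
	if (p->gen <= 7 && !p->is_haswell)
		mask &= ~PM_RP_UP_EI_EXPIRED;
	if (p->gen >= 8)
		mask &= ~PMINTR_REDIRECT_TO_NON_DISP;
	return mask;
}

/* Mirrors the pre-patch tail of gen6_rps_pm_mask(): set the same bits
 * in the "enabled" mask, then invert the result. */
uint32_t old_rps_pm_mask_tail(const struct platform *p, uint32_t enabled)
{
	if (p->gen <= 7 && !p->is_haswell)
		enabled |= PM_RP_UP_EI_EXPIRED;
	if (p->gen >= 8)
		enabled |= PMINTR_REDIRECT_TO_NON_DISP;
	return ~enabled;
}

/* By De Morgan, ~(enabled | bit) == ~enabled & ~bit, so both forms agree. */
void check_refactor_is_equivalent(const struct platform *p, uint32_t enabled)
{
	assert(old_rps_pm_mask_tail(p, enabled) ==
	       sanitize_rps_pm_mask(p, ~enabled));
}
```

With these assumed values, sanitizing ~0 gives 0xfffffffb for a gen 6 part, leaves 0xffffffff untouched on Haswell, and gives 0x7fffffff on gen 8. A platform described as gen 7 and not Haswell also takes the first branch, which is why the check as written covers more than SNB and IVB.
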
> I think the latter is preferable since that's the only vlv function
> that doesn't sanitize the rps interrupts with this now. We could later
> on blow through some cycles on vlv to figure out whether it's affected
> by this bug or not.
> -Daniel
>
> > +		mask &= ~GEN6_PM_RP_UP_EI_EXPIRED;
> > +
> > +	if (INTEL_INFO(dev_priv)->gen >= 8)
> > +		mask &= ~GEN8_PMINTR_REDIRECT_TO_NON_DISP;
> > +
> > +	return mask;
> > +}
> > +
> >  void gen6_disable_rps_interrupts(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > @@ -308,8 +323,7 @@ void gen6_disable_rps_interrupts(struct drm_device *dev)
> >
> >  	spin_lock_irq(&dev_priv->irq_lock);
> >
> > -	I915_WRITE(GEN6_PMINTRMSK, INTEL_INFO(dev_priv)->gen >= 8 ?
> > -		   ~GEN8_PMINTR_REDIRECT_TO_NON_DISP : ~0);
> > +	I915_WRITE(GEN6_PMINTRMSK, gen6_sanitize_rps_pm_mask(dev_priv, ~0));
> >
> >  	__gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
> >  	I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) &
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index 588b618..bb871f3 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -795,6 +795,7 @@ void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask);
> > void gen6_reset_rps_interrupts(struct drm_device *dev);
> > void gen6_enable_rps_interrupts(struct drm_device *dev);
> > void gen6_disable_rps_interrupts(struct drm_device *dev);
> > +u32 gen6_sanitize_rps_pm_mask(struct drm_i915_private *dev_priv, u32 mask);
> > void intel_runtime_pm_disable_interrupts(struct drm_i915_private *dev_priv);
> > void intel_runtime_pm_enable_interrupts(struct drm_i915_private *dev_priv);
> > static inline bool intel_irqs_enabled(struct drm_i915_private *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index f1f06d7..4bd1b8b 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3745,16 +3745,7 @@ static u32 gen6_rps_pm_mask(struct drm_i915_private *dev_priv, u8 val)
> >  	mask |= dev_priv->pm_rps_events & (GEN6_PM_RP_DOWN_EI_EXPIRED | GEN6_PM_RP_UP_EI_EXPIRED);
> >  	mask &= dev_priv->pm_rps_events;
> >
> > -	/* IVB and SNB hard hangs on looping batchbuffer
> > -	 * if GEN6_PM_UP_EI_EXPIRED is masked.
> > -	 */
> > -	if (INTEL_INFO(dev_priv->dev)->gen <= 7 && !IS_HASWELL(dev_priv->dev))
> > -		mask |= GEN6_PM_RP_UP_EI_EXPIRED;
> > -
> > -	if (INTEL_INFO(dev_priv)->gen >= 8)
> > -		mask |= GEN8_PMINTR_REDIRECT_TO_NON_DISP;
> > -
> > -	return ~mask;
> > +	return gen6_sanitize_rps_pm_mask(dev_priv, ~mask);
> >  }
> >
> > /* gen6_set_rps is called to update the frequency request, but should also be
> > --
> > 1.8.4
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>