* [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
@ 2014-03-07 20:09 Daniel Vetter
2014-03-07 21:35 ` Daniel Vetter
0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2014-03-07 20:09 UTC (permalink / raw)
To: Intel Graphics Development; +Cc: Daniel Vetter, Mika Kuoppala

Since the gpu reset + full ppgtt merge we have a hard hang on snb when
running the gem_reset_stat tests. Recently Mika also added some stricter
forcewake fifo warnings for gen6/7 in

    commit 20277c60ed08ab4f7237854cc6c2046649f9200f
    Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Date: Wed Mar 5 18:08:19 2014 +0200

        drm/i915: Always set fifo count to zero in gen6_reset

and they _do_ fire right before the final failing reset, which
then results in the machine's ultimate demise.

So use this indicator to fail the gpu reset with an -EIO code,
preventing further command submission, further hangs, and thus the deadly
final gpu reset attempt. It seems to work and my snb survives now.

The gpu is still dead though, unfortunately.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=74100
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
drivers/gpu/drm/i915/intel_uncore.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index c666af8232ef..9e22b11d0b0c 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -989,9 +989,11 @@ static int gen6_do_reset(struct drm_device *dev)
 	if (fw_engine)
 		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_engine);

-	if (IS_GEN6(dev) || IS_GEN7(dev))
-		WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
-			 GT_FIFO_FREE_ENTRIES_MASK) != 0);
+	if (IS_GEN6(dev) || IS_GEN7(dev)) {
+		if (WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
+			     GT_FIFO_FREE_ENTRIES_MASK) != 0))
+			ret = -EIO;
+	}

 	dev_priv->uncore.fifo_count = 0;

--
1.8.1.4
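
A minimal userspace sketch of the control flow in the hunk above, for readers without a kernel tree at hand. It relies on the fact that the kernel's WARN_ON() evaluates to the truth value of its condition, which is what lets the patch wrap it in an if. sketch_warn_on(), sketch_check_fifo() and SKETCH_FIFO_FREE_ENTRIES_MASK are hypothetical names, and the mask value is picked for illustration rather than taken from the i915 headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed mask value, for illustration only; the real
 * GT_FIFO_FREE_ENTRIES_MASK is defined in the i915 register headers. */
#define SKETCH_FIFO_FREE_ENTRIES_MASK 0x7fu

/* Userspace stand-in for WARN_ON(): log when the condition holds and
 * hand the condition back so the caller can branch on it. */
static int sketch_warn_on(int cond, const char *what)
{
	assert(what != NULL);
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond != 0;
}

/* Same shape as the patched hunk: if the masked register field is
 * non-zero, warn and turn the pending return code into -EIO.
 * Otherwise the incoming return code is passed through unchanged. */
static int sketch_check_fifo(uint32_t gtfifoctl, int ret)
{
	if (sketch_warn_on((gtfifoctl & SKETCH_FIFO_FREE_ENTRIES_MASK) != 0,
			   "fifo free-entries field is non-zero"))
		ret = -EIO;
	return ret;
}
```

With these definitions, sketch_check_fifo(0x14, 0) yields -EIO, while a register value with no bits set under the mask leaves ret untouched.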
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
2014-03-07 20:09 [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained Daniel Vetter
@ 2014-03-07 21:35 ` Daniel Vetter
2014-03-08 18:50 ` Ben Widawsky
0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2014-03-07 21:35 UTC (permalink / raw)
To: Intel Graphics Development; +Cc: Daniel Vetter, Mika Kuoppala

On Fri, Mar 07, 2014 at 09:09:03PM +0100, Daniel Vetter wrote:
> Since the gpu reset + full ppgtt merge we have a hard hang on snb when
> running the gem_reset_stat tests. Recently Mika also added some stricter
> forcewake fifo warnings for gen6/7 in
>
>     commit 20277c60ed08ab4f7237854cc6c2046649f9200f
>     Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>     Date: Wed Mar 5 18:08:19 2014 +0200
>
>         drm/i915: Always set fifo count to zero in gen6_reset
>
> and they _do_ fire right before the final failing reset, which
> then results in the machine's ultimate demise.
>
> So use this indicator to fail the gpu reset with an -EIO code,
> preventing further command submission, further hangs, and thus the deadly
> final gpu reset attempt. It seems to work and my snb survives now.
>
> The gpu is still dead though, unfortunately.
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=74100
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
> drivers/gpu/drm/i915/intel_uncore.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index c666af8232ef..9e22b11d0b0c 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -989,9 +989,11 @@ static int gen6_do_reset(struct drm_device *dev)
>  	if (fw_engine)
>  		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_engine);
>
> -	if (IS_GEN6(dev) || IS_GEN7(dev))
> -		WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> -			 GT_FIFO_FREE_ENTRIES_MASK) != 0);
> +	if (IS_GEN6(dev) || IS_GEN7(dev)) {
> +		if (WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> +			     GT_FIFO_FREE_ENTRIES_MASK) != 0))
> +			ret = -EIO;

Chris pointed out that this WARN doesn't make much sense, and testing
confirmed that this completely breaks gpu reset on my machines here.

I've backed out Mika's original patch, this seems to be the wrong path.
-Daniel

> +	}
>
>  	dev_priv->uncore.fifo_count = 0;
>
> --
> 1.8.1.4
>

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
2014-03-07 21:35 ` Daniel Vetter
@ 2014-03-08 18:50 ` Ben Widawsky
2014-03-08 19:58 ` Daniel Vetter
0 siblings, 1 reply; 5+ messages in thread
From: Ben Widawsky @ 2014-03-08 18:50 UTC (permalink / raw)
To: Daniel Vetter, Chris Wilson
Cc: Daniel Vetter, Intel Graphics Development, Mika Kuoppala

On Fri, Mar 07, 2014 at 10:35:56PM +0100, Daniel Vetter wrote:
> On Fri, Mar 07, 2014 at 09:09:03PM +0100, Daniel Vetter wrote:
> > Since the gpu reset + full ppgtt merge we have a hard hang on snb when
> > running the gem_reset_stat tests. Recently Mika also added some stricter
> > forcewake fifo warnings for gen6/7 in
> >
> >     commit 20277c60ed08ab4f7237854cc6c2046649f9200f
> >     Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >     Date: Wed Mar 5 18:08:19 2014 +0200
> >
> >         drm/i915: Always set fifo count to zero in gen6_reset
> >
> > and they _do_ fire right before the final failing reset, which
> > then results in the machine's ultimate demise.
> >
> > So use this indicator to fail the gpu reset with an -EIO code,
> > preventing further command submission, further hangs, and thus the deadly
> > final gpu reset attempt. It seems to work and my snb survives now.
> >
> > The gpu is still dead though, unfortunately.
> >
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=74100
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> > drivers/gpu/drm/i915/intel_uncore.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index c666af8232ef..9e22b11d0b0c 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -989,9 +989,11 @@ static int gen6_do_reset(struct drm_device *dev)
> >  	if (fw_engine)
> >  		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_engine);
> >
> > -	if (IS_GEN6(dev) || IS_GEN7(dev))
> > -		WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> > -			 GT_FIFO_FREE_ENTRIES_MASK) != 0);
> > +	if (IS_GEN6(dev) || IS_GEN7(dev)) {
> > +		if (WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> > +			     GT_FIFO_FREE_ENTRIES_MASK) != 0))
> > +			ret = -EIO;
>
> Chris pointed out that this WARN doesn't make much sense, and testing
> confirmed that this completely breaks gpu reset on my machines here.
>
> I've backed out Mika's original patch, this seems to be the wrong path.
> -Daniel
>
> > +	}
> >
> >  	dev_priv->uncore.fifo_count = 0;
> >

I've seen this too. Though I think the WARN does coincide with what the
docs state - it doesn't seem to match reality. So I totally agree this
is the right course.

However, for my curiosity, Chris, can you elaborate on why you think it
doesn't make sense?

--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
2014-03-08 18:50 ` Ben Widawsky
@ 2014-03-08 19:58 ` Daniel Vetter
2014-03-08 20:02 ` Ben Widawsky
0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2014-03-08 19:58 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel Graphics Development, Mika Kuoppala

On Sat, Mar 8, 2014 at 7:50 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> I've seen this too. Though I think the WARN does coincide with what the
> docs state - it doesn't seem to match reality. So I totally agree this
> is the right course.
>
> However, for my curiosity, Chris, can you elaborate on why you think it
> doesn't make sense?

Our current fifo code would be broken - we stall for the fifo entries
to refill if the value drops below NUM_FIFO_ENTRIES_RESERVED. Hence if
the register value is zero right after reset, something is terribly
broken.
-Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 5+ messages in thread
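
The stall described in the message above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the driver's code: the threshold of 20 reserved entries, the poll budget of 500, and every sketch_* name are hypothetical, and the register read is replaced by a callback so the behaviour can be modelled without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FIFO_RESERVED_ENTRIES 20u /* assumed threshold */
#define SKETCH_FIFO_MAX_POLLS 500        /* assumed poll budget */

/* Hypothetical callback standing in for a read of the free-entries field. */
typedef uint32_t (*sketch_read_free_fn)(void *ctx);

/* Poll until more than the reserved number of entries are free.
 * Returns 0 on success and -1 if the poll budget runs out first.
 * The last value read is stored in *free_out when it is non-NULL. */
static int sketch_wait_for_fifo(sketch_read_free_fn read_free, void *ctx,
				uint32_t *free_out)
{
	int polls = SKETCH_FIFO_MAX_POLLS;
	uint32_t free_entries;

	assert(read_free != NULL);

	free_entries = read_free(ctx);
	while (free_entries <= SKETCH_FIFO_RESERVED_ENTRIES && polls-- > 0)
		free_entries = read_free(ctx);

	if (free_out != NULL)
		*free_out = free_entries;
	return free_entries > SKETCH_FIFO_RESERVED_ENTRIES ? 0 : -1;
}

/* Register model that always reports zero free entries. */
static uint32_t sketch_read_stuck_at_zero(void *ctx)
{
	(void)ctx;
	return 0;
}

/* Register model that gains one free entry per read; ctx points at the
 * current count. */
static uint32_t sketch_read_refilling(void *ctx)
{
	uint32_t *free_entries = (uint32_t *)ctx;

	return (*free_entries)++;
}
```

Polling sketch_read_stuck_at_zero never gets past the threshold and times out, which is the situation every later write would be in if a zero reading right after reset were the normal state. Polling sketch_read_refilling succeeds once the count passes the reserved threshold.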
* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
2014-03-08 19:58 ` Daniel Vetter
@ 2014-03-08 20:02 ` Ben Widawsky
0 siblings, 0 replies; 5+ messages in thread
From: Ben Widawsky @ 2014-03-08 20:02 UTC (permalink / raw)
To: Daniel Vetter; +Cc: Intel Graphics Development, Mika Kuoppala

On Sat, Mar 08, 2014 at 08:58:24PM +0100, Daniel Vetter wrote:
> On Sat, Mar 8, 2014 at 7:50 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > I've seen this too. Though I think the WARN does coincide with what the
> > docs state - it doesn't seem to match reality. So I totally agree this
> > is the right course.
> >
> > However, for my curiosity, Chris, can you elaborate on why you think it
> > doesn't make sense?
>
> Our current fifo code would be broken - we stall for the fifo entries
> to refill if the value drops below NUM_FIFO_ENTRIES_RESERVED. Hence if
> the register value is zero right after reset, something is terribly
> broken.
> -Daniel

Oh that's right. fifo_entries should be MAX, not 0. Wonder if that one
would WARN. Anyway, I'm not actually sure if MAX is always known, so
probably a stupid idea anyway.

--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-03-08 20:02 UTC | newest]

Thread overview: 5+ messages
2014-03-07 20:09 [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained Daniel Vetter
2014-03-07 21:35 ` Daniel Vetter
2014-03-08 18:50 ` Ben Widawsky
2014-03-08 19:58 ` Daniel Vetter
2014-03-08 20:02 ` Ben Widawsky