* [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
@ 2014-03-07 20:09 Daniel Vetter
  2014-03-07 21:35 ` Daniel Vetter
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2014-03-07 20:09 UTC
  To: Intel Graphics Development; +Cc: Daniel Vetter, Mika Kuoppala

Since the gpu reset + full ppgtt merge we have a hard hang on snb when
running the gem_reset_stat tests. Recently Mika also added some stricter
forcewake fifo warnings for gen6/7 in

commit 20277c60ed08ab4f7237854cc6c2046649f9200f
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed Mar 5 18:08:19 2014 +0200

    drm/i915: Always set fifo count to zero in gen6_reset

and they _do_ fire right before the final failing reset, which
then results in the machine's ultimate demise.

So use this indicator to fail the gpu reset with an -EIO code,
preventing further command submission, further hangs and thus the
deadly final gpu reset attempt. It seems to work and my snb survives now.

The gpu is still dead though unfortunately.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=74100
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_uncore.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index c666af8232ef..9e22b11d0b0c 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -989,9 +989,11 @@ static int gen6_do_reset(struct drm_device *dev)
 	if (fw_engine)
 		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_engine);
 
-	if (IS_GEN6(dev) || IS_GEN7(dev))
-		WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
-			 GT_FIFO_FREE_ENTRIES_MASK) != 0);
+	if (IS_GEN6(dev) || IS_GEN7(dev)) {
+		if (WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
+			     GT_FIFO_FREE_ENTRIES_MASK) != 0))
+			ret = -EIO;
+	}
 
 	dev_priv->uncore.fifo_count = 0;
 
-- 
1.8.1.4


* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
  2014-03-07 20:09 [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained Daniel Vetter
@ 2014-03-07 21:35 ` Daniel Vetter
  2014-03-08 18:50   ` Ben Widawsky
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2014-03-07 21:35 UTC
  To: Intel Graphics Development; +Cc: Daniel Vetter, Mika Kuoppala

On Fri, Mar 07, 2014 at 09:09:03PM +0100, Daniel Vetter wrote:
> Since the gpu reset + full ppgtt merge we have a hard hang on snb when
> running the gem_reset_stat tests. Recently Mika also added some stricter
> forcewake fifo warnings for gen6/7 in
> 
> commit 20277c60ed08ab4f7237854cc6c2046649f9200f
> Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Date:   Wed Mar 5 18:08:19 2014 +0200
> 
>     drm/i915: Always set fifo count to zero in gen6_reset
> 
> and they _do_ fire right before the final failing reset, which
> then results in the machine's ultimate demise.
> 
> So use this indicator to fail the gpu reset with an -EIO code,
> preventing further command submission, further hangs and so the deadly
> final gpu reset attempt. It seems to work and my snb survives now.
> 
> The gpu is still dead though unfortunately.
> 
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=74100
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index c666af8232ef..9e22b11d0b0c 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -989,9 +989,11 @@ static int gen6_do_reset(struct drm_device *dev)
>  	if (fw_engine)
>  		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_engine);
>  
> -	if (IS_GEN6(dev) || IS_GEN7(dev))
> -		WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> -			 GT_FIFO_FREE_ENTRIES_MASK) != 0);
> +	if (IS_GEN6(dev) || IS_GEN7(dev)) {
> +		if (WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> +			     GT_FIFO_FREE_ENTRIES_MASK) != 0))
> +		    ret = -EIO;

Chris pointed out that this WARN doesn't make much sense, and testing
confirmed that this completely breaks gpu reset on my machines here.

I've backed out Mika's original patch, this seems to be the wrong path.
-Daniel

> +	}
>  
>  	dev_priv->uncore.fifo_count = 0;
>  
> -- 
> 1.8.1.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
  2014-03-07 21:35 ` Daniel Vetter
@ 2014-03-08 18:50   ` Ben Widawsky
  2014-03-08 19:58     ` Daniel Vetter
  0 siblings, 1 reply; 5+ messages in thread
From: Ben Widawsky @ 2014-03-08 18:50 UTC
  To: Daniel Vetter, Chris Wilson
  Cc: Daniel Vetter, Intel Graphics Development, Mika Kuoppala

On Fri, Mar 07, 2014 at 10:35:56PM +0100, Daniel Vetter wrote:
> On Fri, Mar 07, 2014 at 09:09:03PM +0100, Daniel Vetter wrote:
> > Since the gpu reset + full ppgtt merge we have a hard hang on snb when
> > running the gem_reset_stat tests. Recently Mika also added some stricter
> > forcewake fifo warnings for gen6/7 in
> > 
> > commit 20277c60ed08ab4f7237854cc6c2046649f9200f
> > Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Date:   Wed Mar 5 18:08:19 2014 +0200
> > 
> >     drm/i915: Always set fifo count to zero in gen6_reset
> > 
> > and they _do_ fire right before the final failing reset, which
> > then results in the machine's ultimate demise.
> > 
> > So use this indicator to fail the gpu reset with an -EIO code,
> > preventing further command submission, further hangs and so the deadly
> > final gpu reset attempt. It seems to work and my snb survives now.
> > 
> > The gpu is still dead though unfortunately.
> > 
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=74100
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/i915/intel_uncore.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index c666af8232ef..9e22b11d0b0c 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -989,9 +989,11 @@ static int gen6_do_reset(struct drm_device *dev)
> >  	if (fw_engine)
> >  		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_engine);
> >  
> > -	if (IS_GEN6(dev) || IS_GEN7(dev))
> > -		WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> > -			 GT_FIFO_FREE_ENTRIES_MASK) != 0);
> > +	if (IS_GEN6(dev) || IS_GEN7(dev)) {
> > +		if (WARN_ON((__raw_i915_read32(dev_priv, GTFIFOCTL) &
> > +			     GT_FIFO_FREE_ENTRIES_MASK) != 0))
> > +		    ret = -EIO;
> 
> Chris pointed out that this WARN doesn't make much sense, and testing
> confirmed that this completely breaks gpu reset on my machines here.
> 
> I've backed out Mika's original patch, this seems to be the wrong path.
> -Daniel
> 
> > +	}
> >  
> >  	dev_priv->uncore.fifo_count = 0;
> >  

I've seen this too. Though I think the WARN does coincide with what the
docs state - it doesn't seem to match reality. So I totally agree this
is the right course.

However, for my curiosity, Chris, can you elaborate on why you think it
doesn't make sense?


-- 
Ben Widawsky, Intel Open Source Technology Center


* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
  2014-03-08 18:50   ` Ben Widawsky
@ 2014-03-08 19:58     ` Daniel Vetter
  2014-03-08 20:02       ` Ben Widawsky
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2014-03-08 19:58 UTC
  To: Ben Widawsky; +Cc: Intel Graphics Development, Mika Kuoppala

On Sat, Mar 8, 2014 at 7:50 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> I've seen this too. Though I think the WARN does coincide with what the
> docs state - it doesn't seem to match reality. So I totally agree this
> is the right course.
>
> However, for my curiosity, Chris, can you elaborate on why you think it
> doesn't make sense?

If that WARN were correct, our current fifo code would be broken: we
stall for the fifo entries to refill if the value drops below
NUM_FIFO_ENTRIES_RESERVED. Hence if the register value is zero right
after reset, something is terribly broken.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
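
The stall described above can be modeled in user space. The following is
a minimal sketch, not the driver's code: struct fake_gt,
read_free_entries(), wait_for_fifo(), FIFO_MAX_ENTRIES and MAX_POLLS are
hypothetical names, the numeric values are placeholders, and the constant
name NUM_FIFO_ENTRIES_RESERVED is taken from the mail as written. It only
illustrates why a free-entries count that stays at zero is a failure
condition rather than a healthy post-reset state.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration; the real ones live in the i915 headers. */
#define FIFO_FREE_ENTRIES_MASK		0x7fu
#define NUM_FIFO_ENTRIES_RESERVED	20u
#define FIFO_MAX_ENTRIES		64u	/* hypothetical FIFO depth */
#define MAX_POLLS			500	/* hypothetical poll budget */

/* Stand-in for the hardware: a fake register plus a refill rate. */
struct fake_gt {
	uint32_t gtfifoctl;		/* simulated free-entries register */
	uint32_t refill_per_poll;	/* entries freed between two reads */
};

/* Return the current free-entries count, then let the fake FIFO drain. */
static uint32_t read_free_entries(struct fake_gt *gt)
{
	uint32_t now = gt->gtfifoctl & FIFO_FREE_ENTRIES_MASK;
	uint32_t next = now + gt->refill_per_poll;

	gt->gtfifoctl = next > FIFO_MAX_ENTRIES ? FIFO_MAX_ENTRIES : next;
	return now;
}

/*
 * Claim one FIFO slot before a register write.
 * Returns 0 on success, -1 if the FIFO never refilled above the reserve.
 */
static int wait_for_fifo(struct fake_gt *gt, uint32_t *fifo_count)
{
	assert(gt && fifo_count);

	if (*fifo_count < NUM_FIFO_ENTRIES_RESERVED) {
		int polls = MAX_POLLS;
		uint32_t fifo = read_free_entries(gt);

		/* Stall until enough entries are free or the budget runs out. */
		while (fifo <= NUM_FIFO_ENTRIES_RESERVED && polls-- > 0)
			fifo = read_free_entries(gt);

		*fifo_count = fifo;
		if (fifo <= NUM_FIFO_ENTRIES_RESERVED)
			return -1;	/* e.g. the register is stuck at zero */
	}

	(*fifo_count)--;	/* one slot is consumed by the pending write */
	return 0;
}
```

With a simulated FIFO that refills 8 entries per poll, wait_for_fifo()
returns 0 once more than the reserved number of entries are free; with a
FIFO stuck at zero it gives up after MAX_POLLS polls and returns -1.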


* Re: [PATCH] drm/i915: Fail gpu reset if the forcewake fifo hasn't drained
  2014-03-08 19:58     ` Daniel Vetter
@ 2014-03-08 20:02       ` Ben Widawsky
  0 siblings, 0 replies; 5+ messages in thread
From: Ben Widawsky @ 2014-03-08 20:02 UTC
  To: Daniel Vetter; +Cc: Intel Graphics Development, Mika Kuoppala

On Sat, Mar 08, 2014 at 08:58:24PM +0100, Daniel Vetter wrote:
> On Sat, Mar 8, 2014 at 7:50 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > I've seen this too. Though I think the WARN does coincide with what the
> > docs state - it doesn't seem to match reality. So I totally agree this
> > is the right course.
> >
> > However, for my curiosity, Chris, can you elaborate on why you think it
> > doesn't make sense?
> 
> Our current fifo code would be broken - we stall for the fifo entries
> to refill if the value drops below NUM_FIFO_ENTRIES_RESERVED. Hence if
> the register value is zero right after reset, something is terribly
> broken.
> -Daniel

Oh that's right. fifo_entries should be MAX, not 0. I wonder if that
check would WARN. I'm not actually sure if MAX is always known though,
so it's probably a stupid idea anyway.

-- 
Ben Widawsky, Intel Open Source Technology Center
