public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Don't continually defer the hangcheck
@ 2014-11-07  9:47 Chris Wilson
  2014-11-07 14:28 ` Mika Kuoppala
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2014-11-07  9:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

With multiple rings, we may continue to render on the blitter whilst
executing an infinite shader on the render ring. As we currently, rearm
the timer with each execbuf, in this scenario the hangcheck will never
fire and we will never detect the lockup on the render ring. Instead,
only arm the timer once per hangcheck, so that hangcheck runs more
frequently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 318a6a0724d0..82b4d742aba5 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data)
 void i915_queue_hangcheck(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
+
 	if (!i915.enable_hangcheck)
 		return;
 
-	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
-		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
+	if (timer_pending(timer))
+		return;
+
+	timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES);
+	add_timer(timer);
 }
 
 static void ibx_irq_reset(struct drm_device *dev)
-- 
2.1.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-07  9:47 [PATCH] drm/i915: Don't continually defer the hangcheck Chris Wilson
@ 2014-11-07 14:28 ` Mika Kuoppala
  2014-11-07 14:35   ` Chris Wilson
  0 siblings, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2014-11-07 14:28 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> With multiple rings, we may continue to render on the blitter whilst
> executing an infinite shader on the render ring. As we currently, rearm
> the timer with each execbuf, in this scenario the hangcheck will never
> fire and we will never detect the lockup on the render ring. Instead,
> only arm the timer once per hangcheck, so that hangcheck runs more
> frequently.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 318a6a0724d0..82b4d742aba5 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data)
>  void i915_queue_hangcheck(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
> +
>  	if (!i915.enable_hangcheck)
>  		return;
>  
> -	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
> -		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> +	if (timer_pending(timer))
> +		return;
> +

As this is called from both process and interrupt context, what
keeps us safe from not messing up the timer bookkeepping? The lock in timer code?

I am thinking that the other thread will hit the BUG_ON in add_timer().

-Mika

> +	timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES);
> +	add_timer(timer);
>  }
>  
>  static void ibx_irq_reset(struct drm_device *dev)
> -- 
> 2.1.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-07 14:28 ` Mika Kuoppala
@ 2014-11-07 14:35   ` Chris Wilson
  2014-11-07 15:14     ` Mika Kuoppala
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2014-11-07 14:35 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > With multiple rings, we may continue to render on the blitter whilst
> > executing an infinite shader on the render ring. As we currently, rearm
> > the timer with each execbuf, in this scenario the hangcheck will never
> > fire and we will never detect the lockup on the render ring. Instead,
> > only arm the timer once per hangcheck, so that hangcheck runs more
> > frequently.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 318a6a0724d0..82b4d742aba5 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data)
> >  void i915_queue_hangcheck(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
> > +
> >  	if (!i915.enable_hangcheck)
> >  		return;
> >  
> > -	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
> > -		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> > +	if (timer_pending(timer))
> > +		return;
> > +
> 
> As this is called from both process and interrupt context, what
> keeps us safe from not messing up the timer bookkeepping? The lock in timer code?
> 
> I am thinking that the other thread will hit the BUG_ON in add_timer().

if (!timer_pending(timer))
	timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
mod_timer(timer, timer->expires);
?

Or we just switch to using delayed_work() with a slightly less risky
interface.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-07 14:35   ` Chris Wilson
@ 2014-11-07 15:14     ` Mika Kuoppala
  2014-11-19 10:00       ` Daniel Vetter
  0 siblings, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2014-11-07 15:14 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote:
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > With multiple rings, we may continue to render on the blitter whilst
>> > executing an infinite shader on the render ring. As we currently, rearm
>> > the timer with each execbuf, in this scenario the hangcheck will never
>> > fire and we will never detect the lockup on the render ring. Instead,
>> > only arm the timer once per hangcheck, so that hangcheck runs more
>> > frequently.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_irq.c | 9 +++++++--
>> >  1 file changed, 7 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> > index 318a6a0724d0..82b4d742aba5 100644
>> > --- a/drivers/gpu/drm/i915/i915_irq.c
>> > +++ b/drivers/gpu/drm/i915/i915_irq.c
>> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data)
>> >  void i915_queue_hangcheck(struct drm_device *dev)
>> >  {
>> >  	struct drm_i915_private *dev_priv = dev->dev_private;
>> > +	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
>> > +
>> >  	if (!i915.enable_hangcheck)
>> >  		return;
>> >  
>> > -	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
>> > -		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
>> > +	if (timer_pending(timer))
>> > +		return;
>> > +
>> 
>> As this is called from both process and interrupt context, what
>> keeps us safe from not messing up the timer bookkeepping? The lock in timer code?
>> 
>> I am thinking that the other thread will hit the BUG_ON in add_timer().
>
> if (!timer_pending(timer))
> 	timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> mod_timer(timer, timer->expires);
> ?

With this changed:

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> Or we just switch to using delayed_work() with a slightly less risky
> interface.

We should, after the dust settles.

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] drm/i915: Don't continually defer the hangcheck
@ 2014-11-19  9:47 Chris Wilson
  2014-11-19 10:43 ` Mika Kuoppala
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Chris Wilson @ 2014-11-19  9:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

With multiple rings, we may continue to render on the blitter whilst
executing an infinite shader on the render ring. As we currently, rearm
the timer with each execbuf, in this scenario the hangcheck will never
fire and we will never detect the lockup on the render ring. Instead,
only arm the timer once per hangcheck, so that hangcheck runs more
frequently.

v2: Rearrange code to avoid triggering a BUG_ON in add_timer from
softirq context.

Testcase: igt/gem_reset_stats/defer-hangcheck*
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 729e9a329f76..c88c005433d6 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -988,7 +988,6 @@ static void notify_ring(struct drm_device *dev,
 	trace_i915_gem_request_complete(ring);
 
 	wake_up_all(&ring->irq_queue);
-	i915_queue_hangcheck(dev);
 }
 
 static u32 vlv_c0_residency(struct drm_i915_private *dev_priv,
@@ -3035,11 +3034,15 @@ static void i915_hangcheck_elapsed(unsigned long data)
 void i915_queue_hangcheck(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
+
 	if (!i915.enable_hangcheck)
 		return;
 
-	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
-		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
+	/* Don't continually defer the hangcheck, but make sure it is active */
+	if (!timer_pending(timer))
+		timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES);
+	mod_timer(timer, timer->expires);
 }
 
 static void ibx_irq_reset(struct drm_device *dev)
-- 
2.1.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-07 15:14     ` Mika Kuoppala
@ 2014-11-19 10:00       ` Daniel Vetter
  2014-11-19 10:13         ` Chris Wilson
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2014-11-19 10:00 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Fri, Nov 07, 2014 at 05:14:36PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote:
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> 
> >> > With multiple rings, we may continue to render on the blitter whilst
> >> > executing an infinite shader on the render ring. As we currently, rearm
> >> > the timer with each execbuf, in this scenario the hangcheck will never
> >> > fire and we will never detect the lockup on the render ring. Instead,
> >> > only arm the timer once per hangcheck, so that hangcheck runs more
> >> > frequently.
> >> >
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_irq.c | 9 +++++++--
> >> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >> > index 318a6a0724d0..82b4d742aba5 100644
> >> > --- a/drivers/gpu/drm/i915/i915_irq.c
> >> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> >> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data)
> >> >  void i915_queue_hangcheck(struct drm_device *dev)
> >> >  {
> >> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >> > +	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
> >> > +
> >> >  	if (!i915.enable_hangcheck)
> >> >  		return;
> >> >  
> >> > -	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
> >> > -		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> >> > +	if (timer_pending(timer))
> >> > +		return;
> >> > +
> >> 
> >> As this is called from both process and interrupt context, what
> >> keeps us safe from not messing up the timer bookkeepping? The lock in timer code?
> >> 
> >> I am thinking that the other thread will hit the BUG_ON in add_timer().
> >
> > if (!timer_pending(timer))
> > 	timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> > mod_timer(timer, timer->expires);
> > ?
> 
> With this changed:
> 
> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225

Can you please respin this with mod_timer so that I can slurp it in?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-19 10:00       ` Daniel Vetter
@ 2014-11-19 10:13         ` Chris Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2014-11-19 10:13 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Wed, Nov 19, 2014 at 11:00:08AM +0100, Daniel Vetter wrote:
> On Fri, Nov 07, 2014 at 05:14:36PM +0200, Mika Kuoppala wrote:
> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> > 
> > > On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote:
> > >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> > >> 
> > >> > With multiple rings, we may continue to render on the blitter whilst
> > >> > executing an infinite shader on the render ring. As we currently, rearm
> > >> > the timer with each execbuf, in this scenario the hangcheck will never
> > >> > fire and we will never detect the lockup on the render ring. Instead,
> > >> > only arm the timer once per hangcheck, so that hangcheck runs more
> > >> > frequently.
> > >> >
> > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > >> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > >> > ---
> > >> >  drivers/gpu/drm/i915/i915_irq.c | 9 +++++++--
> > >> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > >> >
> > >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > >> > index 318a6a0724d0..82b4d742aba5 100644
> > >> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > >> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > >> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data)
> > >> >  void i915_queue_hangcheck(struct drm_device *dev)
> > >> >  {
> > >> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >> > +	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
> > >> > +
> > >> >  	if (!i915.enable_hangcheck)
> > >> >  		return;
> > >> >  
> > >> > -	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
> > >> > -		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> > >> > +	if (timer_pending(timer))
> > >> > +		return;
> > >> > +
> > >> 
> > >> As this is called from both process and interrupt context, what
> > >> keeps us safe from not messing up the timer bookkeepping? The lock in timer code?
> > >> 
> > >> I am thinking that the other thread will hit the BUG_ON in add_timer().
> > >
> > > if (!timer_pending(timer))
> > > 	timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> > > mod_timer(timer, timer->expires);
> > > ?
> > 
> > With this changed:
> > 
> > Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225
> 
> Can you please respin this with mod_timer so that I can slurp it in?

I was lazy and didn't look for this msg to reply to:

http://patchwork.freedesktop.org/patch/37111/
1416390439-5724-1-git-send-email-chris@chris-wilson.co.uk
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-19  9:47 Chris Wilson
@ 2014-11-19 10:43 ` Mika Kuoppala
  2014-11-19 10:45 ` Daniel Vetter
  2014-11-19 17:05 ` shuang.he
  2 siblings, 0 replies; 10+ messages in thread
From: Mika Kuoppala @ 2014-11-19 10:43 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> With multiple rings, we may continue to render on the blitter whilst
> executing an infinite shader on the render ring. As we currently, rearm
> the timer with each execbuf, in this scenario the hangcheck will never
> fire and we will never detect the lockup on the render ring. Instead,
> only arm the timer once per hangcheck, so that hangcheck runs more
> frequently.
>
> v2: Rearrange code to avoid triggering a BUG_ON in add_timer from
> softirq context.
>
> Testcase: igt/gem_reset_stats/defer-hangcheck*
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

>  drivers/gpu/drm/i915/i915_irq.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 729e9a329f76..c88c005433d6 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -988,7 +988,6 @@ static void notify_ring(struct drm_device *dev,
>  	trace_i915_gem_request_complete(ring);
>  
>  	wake_up_all(&ring->irq_queue);
> -	i915_queue_hangcheck(dev);
>  }
>  
>  static u32 vlv_c0_residency(struct drm_i915_private *dev_priv,
> @@ -3035,11 +3034,15 @@ static void i915_hangcheck_elapsed(unsigned long data)
>  void i915_queue_hangcheck(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
> +
>  	if (!i915.enable_hangcheck)
>  		return;
>  
> -	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
> -		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
> +	/* Don't continually defer the hangcheck, but make sure it is active */
> +	if (!timer_pending(timer))
> +		timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES);
> +	mod_timer(timer, timer->expires);
>  }
>  
>  static void ibx_irq_reset(struct drm_device *dev)
> -- 
> 2.1.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-19  9:47 Chris Wilson
  2014-11-19 10:43 ` Mika Kuoppala
@ 2014-11-19 10:45 ` Daniel Vetter
  2014-11-19 17:05 ` shuang.he
  2 siblings, 0 replies; 10+ messages in thread
From: Daniel Vetter @ 2014-11-19 10:45 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Mika Kuoppala

On Wed, Nov 19, 2014 at 09:47:19AM +0000, Chris Wilson wrote:
> With multiple rings, we may continue to render on the blitter whilst
> executing an infinite shader on the render ring. As we currently, rearm
> the timer with each execbuf, in this scenario the hangcheck will never
> fire and we will never detect the lockup on the render ring. Instead,
> only arm the timer once per hangcheck, so that hangcheck runs more
> frequently.
> 
> v2: Rearrange code to avoid triggering a BUG_ON in add_timer from
> softirq context.
> 
> Testcase: igt/gem_reset_stats/defer-hangcheck*
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>

I went right ahead and upgraded this to r-b: mika.

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915: Don't continually defer the hangcheck
  2014-11-19  9:47 Chris Wilson
  2014-11-19 10:43 ` Mika Kuoppala
  2014-11-19 10:45 ` Daniel Vetter
@ 2014-11-19 17:05 ` shuang.he
  2 siblings, 0 replies; 10+ messages in thread
From: shuang.he @ 2014-11-19 17:05 UTC (permalink / raw)
  To: shuang.he, intel-gfx, chris

Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -2              369/369              367/369
ILK                 -3              403/403              400/403
SNB                 -1              459/459              458/459
IVB                 -3              535/545              532/545
BYT                 -1              290/290              289/290
HSW                 -4              610/610              606/610
BDW                 -2              451/451              449/451
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
PNV  igt_drv_hangman_error-state-capture-render      PASS(4, M25M7)      TIMEOUT(1, M7)PASS(3, M7)
PNV  igt_drv_missed_irq_hang      TIMEOUT(3, M7)PASS(1, M25)      TIMEOUT(1, M7)PASS(3, M7)
ILK  igt_drv_hangman_error-state-basic      PASS(4, M37)      TIMEOUT(1, M37)PASS(3, M37)
ILK  igt_gem_reset_stats_ban-render      PASS(4, M37)      TIMEOUT(1, M37)PASS(3, M37)
ILK  igt_gem_workarounds_reset      PASS(4, M6M37)      TIMEOUT(1, M37)PASS(3, M37)
SNB  igt_kms_pipe_crc_basic_hang-read-crc-pipe-A      PASS(4, M35)      TIMEOUT(1, M35)PASS(3, M35)
IVB  igt_drv_hangman_error-state-basic      PASS(4, M21M4)      TIMEOUT(1, M4)PASS(3, M4)
IVB  igt_gem_reset_stats_ban-ctx-render      PASS(4, M21M4)      TIMEOUT(1, M4)PASS(3, M4)
IVB  igt_gem_workarounds_reset      PASS(4, M45M4)      TIMEOUT(1, M4)PASS(3, M4)
BYT  igt_drv_missed_irq_hang      TIMEOUT(3, M36)PASS(1, M36)      TIMEOUT(1, M36)PASS(3, M36)
HSW  igt_drv_hangman_error-state-basic      PASS(4, M19M40)      TIMEOUT(1, M40)PASS(3, M40)
HSW  igt_gem_reset_stats_ban-bsd      PASS(4, M19M40)      TIMEOUT(1, M40)PASS(3, M40)
HSW  igt_gem_workarounds_reset      PASS(4, M39M40)      TIMEOUT(1, M40)PASS(3, M40)
HSW  igt_pm_rps_min-max-config-idle      PASS(4, M19M40)      FAIL(1, M40)PASS(3, M40)
BDW  igt_drv_hangman_error-state-basic      PASS(4, M28M30)      TIMEOUT(1, M30)PASS(3, M30)
BDW  igt_gem_reset_stats_ban-blt      PASS(4, M28M30)      TIMEOUT(1, M30)PASS(3, M30)
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-11-19 17:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-07  9:47 [PATCH] drm/i915: Don't continually defer the hangcheck Chris Wilson
2014-11-07 14:28 ` Mika Kuoppala
2014-11-07 14:35   ` Chris Wilson
2014-11-07 15:14     ` Mika Kuoppala
2014-11-19 10:00       ` Daniel Vetter
2014-11-19 10:13         ` Chris Wilson
  -- strict thread matches above, loose matches on Subject: below --
2014-11-19  9:47 Chris Wilson
2014-11-19 10:43 ` Mika Kuoppala
2014-11-19 10:45 ` Daniel Vetter
2014-11-19 17:05 ` shuang.he

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox