* [PATCH] drm/i915: Don't continually defer the hangcheck
@ 2014-11-19 9:47 Chris Wilson
2014-11-19 10:43 ` Mika Kuoppala
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Chris Wilson @ 2014-11-19 9:47 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika Kuoppala
With multiple rings, we may continue to render on the blitter whilst
executing an infinite shader on the render ring. As we currently, rearm
the timer with each execbuf, in this scenario the hangcheck will never
fire and we will never detect the lockup on the render ring. Instead,
only arm the timer once per hangcheck, so that hangcheck runs more
frequently.
v2: Rearrange code to avoid triggering a BUG_ON in add_timer from
softirq context.
Testcase: igt/gem_reset_stats/defer-hangcheck*
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
drivers/gpu/drm/i915/i915_irq.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 729e9a329f76..c88c005433d6 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -988,7 +988,6 @@ static void notify_ring(struct drm_device *dev,
trace_i915_gem_request_complete(ring);
wake_up_all(&ring->irq_queue);
- i915_queue_hangcheck(dev);
}
static u32 vlv_c0_residency(struct drm_i915_private *dev_priv,
@@ -3035,11 +3034,15 @@ static void i915_hangcheck_elapsed(unsigned long data)
void i915_queue_hangcheck(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer;
+
if (!i915.enable_hangcheck)
return;
- mod_timer(&dev_priv->gpu_error.hangcheck_timer,
- round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
+ /* Don't continually defer the hangcheck, but make sure it is active */
+ if (!timer_pending(timer))
+ timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES);
+ mod_timer(timer, timer->expires);
}
static void ibx_irq_reset(struct drm_device *dev)
--
2.1.3
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-19 9:47 [PATCH] drm/i915: Don't continually defer the hangcheck Chris Wilson @ 2014-11-19 10:43 ` Mika Kuoppala 2014-11-19 10:45 ` Daniel Vetter 2014-11-19 17:05 ` shuang.he 2 siblings, 0 replies; 10+ messages in thread From: Mika Kuoppala @ 2014-11-19 10:43 UTC (permalink / raw) To: Chris Wilson, intel-gfx Chris Wilson <chris@chris-wilson.co.uk> writes: > With multiple rings, we may continue to render on the blitter whilst > executing an infinite shader on the render ring. As we currently, rearm > the timer with each execbuf, in this scenario the hangcheck will never > fire and we will never detect the lockup on the render ring. Instead, > only arm the timer once per hangcheck, so that hangcheck runs more > frequently. > > v2: Rearrange code to avoid triggering a BUG_ON in add_timer from > softirq context. > > Testcase: igt/gem_reset_stats/defer-hangcheck* > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > --- Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> > drivers/gpu/drm/i915/i915_irq.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 729e9a329f76..c88c005433d6 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -988,7 +988,6 @@ static void notify_ring(struct drm_device *dev, > trace_i915_gem_request_complete(ring); > > wake_up_all(&ring->irq_queue); > - i915_queue_hangcheck(dev); > } > > static u32 vlv_c0_residency(struct drm_i915_private *dev_priv, > @@ -3035,11 +3034,15 @@ static void i915_hangcheck_elapsed(unsigned long data) > void i915_queue_hangcheck(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > + struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer; > + > if (!i915.enable_hangcheck) > return; > > - mod_timer(&dev_priv->gpu_error.hangcheck_timer, > - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > + /* Don't continually defer the hangcheck, but make sure it is active */ > + if (!timer_pending(timer)) > + timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES); > + mod_timer(timer, timer->expires); > } > > static void ibx_irq_reset(struct drm_device *dev) > -- > 2.1.3 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-19 9:47 [PATCH] drm/i915: Don't continually defer the hangcheck Chris Wilson 2014-11-19 10:43 ` Mika Kuoppala @ 2014-11-19 10:45 ` Daniel Vetter 2014-11-19 17:05 ` shuang.he 2 siblings, 0 replies; 10+ messages in thread From: Daniel Vetter @ 2014-11-19 10:45 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx, Mika Kuoppala On Wed, Nov 19, 2014 at 09:47:19AM +0000, Chris Wilson wrote: > With multiple rings, we may continue to render on the blitter whilst > executing an infinite shader on the render ring. As we currently, rearm > the timer with each execbuf, in this scenario the hangcheck will never > fire and we will never detect the lockup on the render ring. Instead, > only arm the timer once per hangcheck, so that hangcheck runs more > frequently. > > v2: Rearrange code to avoid triggering a BUG_ON in add_timer from > softirq context. > > Testcase: igt/gem_reset_stats/defer-hangcheck* > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> I went right ahead and upgraded this to r-b: mika. Queued for -next, thanks for the patch. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-19 9:47 [PATCH] drm/i915: Don't continually defer the hangcheck Chris Wilson 2014-11-19 10:43 ` Mika Kuoppala 2014-11-19 10:45 ` Daniel Vetter @ 2014-11-19 17:05 ` shuang.he 2 siblings, 0 replies; 10+ messages in thread From: shuang.he @ 2014-11-19 17:05 UTC (permalink / raw) To: shuang.he, intel-gfx, chris Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com) -------------------------------------Summary------------------------------------- Platform Delta drm-intel-nightly Series Applied PNV -2 369/369 367/369 ILK -3 403/403 400/403 SNB -1 459/459 458/459 IVB -3 535/545 532/545 BYT -1 290/290 289/290 HSW -4 610/610 606/610 BDW -2 451/451 449/451 -------------------------------------Detailed------------------------------------- Platform Test drm-intel-nightly Series Applied PNV igt_drv_hangman_error-state-capture-render PASS(4, M25M7) TIMEOUT(1, M7)PASS(3, M7) PNV igt_drv_missed_irq_hang TIMEOUT(3, M7)PASS(1, M25) TIMEOUT(1, M7)PASS(3, M7) ILK igt_drv_hangman_error-state-basic PASS(4, M37) TIMEOUT(1, M37)PASS(3, M37) ILK igt_gem_reset_stats_ban-render PASS(4, M37) TIMEOUT(1, M37)PASS(3, M37) ILK igt_gem_workarounds_reset PASS(4, M6M37) TIMEOUT(1, M37)PASS(3, M37) SNB igt_kms_pipe_crc_basic_hang-read-crc-pipe-A PASS(4, M35) TIMEOUT(1, M35)PASS(3, M35) IVB igt_drv_hangman_error-state-basic PASS(4, M21M4) TIMEOUT(1, M4)PASS(3, M4) IVB igt_gem_reset_stats_ban-ctx-render PASS(4, M21M4) TIMEOUT(1, M4)PASS(3, M4) IVB igt_gem_workarounds_reset PASS(4, M45M4) TIMEOUT(1, M4)PASS(3, M4) BYT igt_drv_missed_irq_hang TIMEOUT(3, M36)PASS(1, M36) TIMEOUT(1, M36)PASS(3, M36) HSW igt_drv_hangman_error-state-basic PASS(4, M19M40) TIMEOUT(1, M40)PASS(3, M40) HSW igt_gem_reset_stats_ban-bsd PASS(4, M19M40) TIMEOUT(1, M40)PASS(3, M40) HSW igt_gem_workarounds_reset PASS(4, M39M40) TIMEOUT(1, M40)PASS(3, M40) HSW igt_pm_rps_min-max-config-idle PASS(4, M19M40) FAIL(1, M40)PASS(3, M40) BDW igt_drv_hangman_error-state-basic PASS(4, M28M30) TIMEOUT(1, M30)PASS(3, M30) BDW igt_gem_reset_stats_ban-blt PASS(4, M28M30) TIMEOUT(1, M30)PASS(3, M30) _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] drm/i915: Don't continually defer the hangcheck @ 2014-11-07 9:47 Chris Wilson 2014-11-07 14:28 ` Mika Kuoppala 0 siblings, 1 reply; 10+ messages in thread From: Chris Wilson @ 2014-11-07 9:47 UTC (permalink / raw) To: intel-gfx; +Cc: Mika Kuoppala With multiple rings, we may continue to render on the blitter whilst executing an infinite shader on the render ring. As we currently, rearm the timer with each execbuf, in this scenario the hangcheck will never fire and we will never detect the lockup on the render ring. Instead, only arm the timer once per hangcheck, so that hangcheck runs more frequently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> --- drivers/gpu/drm/i915/i915_irq.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 318a6a0724d0..82b4d742aba5 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data) void i915_queue_hangcheck(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; + struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer; + if (!i915.enable_hangcheck) return; - mod_timer(&dev_priv->gpu_error.hangcheck_timer, - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); + if (timer_pending(timer)) + return; + + timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES); + add_timer(timer); } static void ibx_irq_reset(struct drm_device *dev) -- 2.1.3 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-07 9:47 Chris Wilson @ 2014-11-07 14:28 ` Mika Kuoppala 2014-11-07 14:35 ` Chris Wilson 0 siblings, 1 reply; 10+ messages in thread From: Mika Kuoppala @ 2014-11-07 14:28 UTC (permalink / raw) To: Chris Wilson, intel-gfx Chris Wilson <chris@chris-wilson.co.uk> writes: > With multiple rings, we may continue to render on the blitter whilst > executing an infinite shader on the render ring. As we currently, rearm > the timer with each execbuf, in this scenario the hangcheck will never > fire and we will never detect the lockup on the render ring. Instead, > only arm the timer once per hangcheck, so that hangcheck runs more > frequently. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > --- > drivers/gpu/drm/i915/i915_irq.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 318a6a0724d0..82b4d742aba5 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data) > void i915_queue_hangcheck(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > + struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer; > + > if (!i915.enable_hangcheck) > return; > > - mod_timer(&dev_priv->gpu_error.hangcheck_timer, > - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > + if (timer_pending(timer)) > + return; > + As this is called from both process and interrupt context, what keeps us safe from not messing up the timer bookkeepping? The lock in timer code? I am thinking that the other thread will hit the BUG_ON in add_timer(). -Mika > + timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES); > + add_timer(timer); > } > > static void ibx_irq_reset(struct drm_device *dev) > -- > 2.1.3 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-07 14:28 ` Mika Kuoppala @ 2014-11-07 14:35 ` Chris Wilson 2014-11-07 15:14 ` Mika Kuoppala 0 siblings, 1 reply; 10+ messages in thread From: Chris Wilson @ 2014-11-07 14:35 UTC (permalink / raw) To: Mika Kuoppala; +Cc: intel-gfx On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > With multiple rings, we may continue to render on the blitter whilst > > executing an infinite shader on the render ring. As we currently, rearm > > the timer with each execbuf, in this scenario the hangcheck will never > > fire and we will never detect the lockup on the render ring. Instead, > > only arm the timer once per hangcheck, so that hangcheck runs more > > frequently. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > > --- > > drivers/gpu/drm/i915/i915_irq.c | 9 +++++++-- > > 1 file changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > > index 318a6a0724d0..82b4d742aba5 100644 > > --- a/drivers/gpu/drm/i915/i915_irq.c > > +++ b/drivers/gpu/drm/i915/i915_irq.c > > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data) > > void i915_queue_hangcheck(struct drm_device *dev) > > { > > struct drm_i915_private *dev_priv = dev->dev_private; > > + struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer; > > + > > if (!i915.enable_hangcheck) > > return; > > > > - mod_timer(&dev_priv->gpu_error.hangcheck_timer, > > - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > > + if (timer_pending(timer)) > > + return; > > + > > As this is called from both process and interrupt context, what > keeps us safe from not messing up the timer bookkeepping? The lock in timer code? > > I am thinking that the other thread will hit the BUG_ON in add_timer(). if (!timer_pending(timer)) timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); mod_timer(timer, timer->expires); ? Or we just switch to using delayed_work() with a slightly less risky interface. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-07 14:35 ` Chris Wilson @ 2014-11-07 15:14 ` Mika Kuoppala 2014-11-19 10:00 ` Daniel Vetter 0 siblings, 1 reply; 10+ messages in thread From: Mika Kuoppala @ 2014-11-07 15:14 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx Chris Wilson <chris@chris-wilson.co.uk> writes: > On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote: >> Chris Wilson <chris@chris-wilson.co.uk> writes: >> >> > With multiple rings, we may continue to render on the blitter whilst >> > executing an infinite shader on the render ring. As we currently, rearm >> > the timer with each execbuf, in this scenario the hangcheck will never >> > fire and we will never detect the lockup on the render ring. Instead, >> > only arm the timer once per hangcheck, so that hangcheck runs more >> > frequently. >> > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> >> > --- >> > drivers/gpu/drm/i915/i915_irq.c | 9 +++++++-- >> > 1 file changed, 7 insertions(+), 2 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c >> > index 318a6a0724d0..82b4d742aba5 100644 >> > --- a/drivers/gpu/drm/i915/i915_irq.c >> > +++ b/drivers/gpu/drm/i915/i915_irq.c >> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data) >> > void i915_queue_hangcheck(struct drm_device *dev) >> > { >> > struct drm_i915_private *dev_priv = dev->dev_private; >> > + struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer; >> > + >> > if (!i915.enable_hangcheck) >> > return; >> > >> > - mod_timer(&dev_priv->gpu_error.hangcheck_timer, >> > - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); >> > + if (timer_pending(timer)) >> > + return; >> > + >> >> As this is called from both process and interrupt context, what >> keeps us safe from not messing up the timer bookkeepping? The lock in timer code? >> >> I am thinking that the other thread will hit the BUG_ON in add_timer(). > > if (!timer_pending(timer)) > timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > mod_timer(timer, timer->expires); > ? With this changed: Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> > Or we just switch to using delayed_work() with a slightly less risky > interface. We should, after the dust settles. > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-07 15:14 ` Mika Kuoppala @ 2014-11-19 10:00 ` Daniel Vetter 2014-11-19 10:13 ` Chris Wilson 0 siblings, 1 reply; 10+ messages in thread From: Daniel Vetter @ 2014-11-19 10:00 UTC (permalink / raw) To: Mika Kuoppala; +Cc: intel-gfx On Fri, Nov 07, 2014 at 05:14:36PM +0200, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote: > >> Chris Wilson <chris@chris-wilson.co.uk> writes: > >> > >> > With multiple rings, we may continue to render on the blitter whilst > >> > executing an infinite shader on the render ring. As we currently, rearm > >> > the timer with each execbuf, in this scenario the hangcheck will never > >> > fire and we will never detect the lockup on the render ring. Instead, > >> > only arm the timer once per hangcheck, so that hangcheck runs more > >> > frequently. > >> > > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > >> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > >> > --- > >> > drivers/gpu/drm/i915/i915_irq.c | 9 +++++++-- > >> > 1 file changed, 7 insertions(+), 2 deletions(-) > >> > > >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > >> > index 318a6a0724d0..82b4d742aba5 100644 > >> > --- a/drivers/gpu/drm/i915/i915_irq.c > >> > +++ b/drivers/gpu/drm/i915/i915_irq.c > >> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data) > >> > void i915_queue_hangcheck(struct drm_device *dev) > >> > { > >> > struct drm_i915_private *dev_priv = dev->dev_private; > >> > + struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer; > >> > + > >> > if (!i915.enable_hangcheck) > >> > return; > >> > > >> > - mod_timer(&dev_priv->gpu_error.hangcheck_timer, > >> > - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > >> > + if (timer_pending(timer)) > >> > + return; > >> > + > >> > >> As this is called from both process and interrupt context, what > >> keeps us safe from not messing up the timer bookkeepping? The lock in timer code? > >> > >> I am thinking that the other thread will hit the BUG_ON in add_timer(). > > > > if (!timer_pending(timer)) > > timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > > mod_timer(timer, timer->expires); > > ? > > With this changed: > > Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225 Can you please respin this with mod_timer so that I can slurp it in? Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915: Don't continually defer the hangcheck 2014-11-19 10:00 ` Daniel Vetter @ 2014-11-19 10:13 ` Chris Wilson 0 siblings, 0 replies; 10+ messages in thread From: Chris Wilson @ 2014-11-19 10:13 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Wed, Nov 19, 2014 at 11:00:08AM +0100, Daniel Vetter wrote: > On Fri, Nov 07, 2014 at 05:14:36PM +0200, Mika Kuoppala wrote: > > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > > > On Fri, Nov 07, 2014 at 04:28:33PM +0200, Mika Kuoppala wrote: > > >> Chris Wilson <chris@chris-wilson.co.uk> writes: > > >> > > >> > With multiple rings, we may continue to render on the blitter whilst > > >> > executing an infinite shader on the render ring. As we currently, rearm > > >> > the timer with each execbuf, in this scenario the hangcheck will never > > >> > fire and we will never detect the lockup on the render ring. Instead, > > >> > only arm the timer once per hangcheck, so that hangcheck runs more > > >> > frequently. > > >> > > > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > >> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > > >> > --- > > >> > drivers/gpu/drm/i915/i915_irq.c | 9 +++++++-- > > >> > 1 file changed, 7 insertions(+), 2 deletions(-) > > >> > > > >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > > >> > index 318a6a0724d0..82b4d742aba5 100644 > > >> > --- a/drivers/gpu/drm/i915/i915_irq.c > > >> > +++ b/drivers/gpu/drm/i915/i915_irq.c > > >> > @@ -3039,11 +3039,16 @@ static void i915_hangcheck_elapsed(unsigned long data) > > >> > void i915_queue_hangcheck(struct drm_device *dev) > > >> > { > > >> > struct drm_i915_private *dev_priv = dev->dev_private; > > >> > + struct timer_list *timer = &dev_priv->gpu_error.hangcheck_timer; > > >> > + > > >> > if (!i915.enable_hangcheck) > > >> > return; > > >> > > > >> > - mod_timer(&dev_priv->gpu_error.hangcheck_timer, > > >> > - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > > >> > + if (timer_pending(timer)) > > >> > + return; > > >> > + > > >> > > >> As this is called from both process and interrupt context, what > > >> keeps us safe from not messing up the timer bookkeepping? The lock in timer code? > > >> > > >> I am thinking that the other thread will hit the BUG_ON in add_timer(). > > > > > > if (!timer_pending(timer)) > > > timer->expires = round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); > > > mod_timer(timer, timer->expires); > > > ? > > > > With this changed: > > > > Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225 > > Can you please respin this with mod_timer so that I can slurp it in? I was lazy and didn't look for this msg to reply to: http://patchwork.freedesktop.org/patch/37111/ 1416390439-5724-1-git-send-email-chris@chris-wilson.co.uk -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-11-19 17:07 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-11-19 9:47 [PATCH] drm/i915: Don't continually defer the hangcheck Chris Wilson 2014-11-19 10:43 ` Mika Kuoppala 2014-11-19 10:45 ` Daniel Vetter 2014-11-19 17:05 ` shuang.he -- strict thread matches above, loose matches on Subject: below -- 2014-11-07 9:47 Chris Wilson 2014-11-07 14:28 ` Mika Kuoppala 2014-11-07 14:35 ` Chris Wilson 2014-11-07 15:14 ` Mika Kuoppala 2014-11-19 10:00 ` Daniel Vetter 2014-11-19 10:13 ` Chris Wilson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox