From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 5/8] drm/i915: Double check hangcheck.seqno after reset
Date: Mon, 03 Oct 2016 16:14:39 +0300
Message-ID: <87fuod4ls0.fsf@gaia.fi.intel.com>
In-Reply-To: <20161003125246.17686-5-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Check that there was not a late recovery between us declaring the GPU
> hung and processing the reset. If the GPU did recover by itself, let the
> request remain on the active list and see if it hangs again!
>

Did you see this in action? It makes sense to recheck
after reset. I don't remember how TDR deals with multiple
resets on the same engine, but we should start tracking the seqno
that caused the reset and make sure we don't get stuck replaying it.
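
A minimal sketch of that tracking idea, written as standalone C rather
than driver code. Everything here (struct reset_tracker,
reset_tracker_note(), MAX_REPLAYS) is hypothetical and is not existing
i915 API; it only illustrates remembering which seqno was blamed for
the last reset and noticing when the same one keeps coming back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-engine bookkeeping; not an i915 structure. */
struct reset_tracker {
	uint32_t last_reset_seqno;  /* seqno blamed for the previous reset */
	unsigned int repeat_count;  /* consecutive resets blamed on that seqno */
	bool valid;                 /* false until the first reset is recorded */
};

/* Assumed threshold: how many replays of one seqno we tolerate. */
#define MAX_REPLAYS 2

/*
 * Record a reset blamed on @seqno.
 *
 * Returns true once the same seqno has been blamed for more than
 * MAX_REPLAYS consecutive resets, i.e. replaying it again is unlikely
 * to make progress and the caller should do something else.
 */
static bool reset_tracker_note(struct reset_tracker *t, uint32_t seqno)
{
	assert(t);

	if (t->valid && t->last_reset_seqno == seqno) {
		t->repeat_count++;
	} else {
		/* A different seqno hung: start counting afresh. */
		t->last_reset_seqno = seqno;
		t->repeat_count = 1;
		t->valid = true;
	}

	return t->repeat_count > MAX_REPLAYS;
}
```

A zero-initialised tracker per engine is enough to start with; what the
caller does once this returns true (ban, skip, give up) is a separate
policy question.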

Do we check for banning on resubmission, and/or do we trust that
the breadcrumb update always succeeds?

I envision that if we get multiple resets on the same seqno, we
just write the breadcrumbs from the CPU and move on. But let's
hope we don't need to and the GPU breadcrumbs are always enough.
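
As a rough illustration of the "write the breadcrumb from the CPU and
move on" fallback, here is a standalone C model. The names
(struct fake_engine, seqno_passed(), skip_guilty_request()) are made up
for this sketch and are not driver functions; the breadcrumb is
modelled as a plain field rather than a hardware status page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an engine; not an i915 structure. */
struct fake_engine {
	uint32_t breadcrumb;  /* last seqno written to the breadcrumb slot */
};

/* Wrap-safe test: has seqno @a reached or gone past seqno @b? */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * If the breadcrumb never reached @guilty, write it from the CPU so
 * that anything waiting on that seqno sees it as completed.
 *
 * Returns true if the CPU had to step in, false if the breadcrumb
 * had already advanced far enough on its own.
 */
static bool skip_guilty_request(struct fake_engine *e, uint32_t guilty)
{
	assert(e);

	if (seqno_passed(e->breadcrumb, guilty))
		return false;

	e->breadcrumb = guilty;
	return true;
}
```

The signed-difference comparison keeps the check correct when the
32-bit seqno wraps around, which a plain >= would not.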

Regardless, it's an improvement and should weed out false positives
on some hangs.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
-Mika

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0cae8acdf906..a89a88922448 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2589,6 +2589,9 @@ static void i915_gem_reset_engine(struct intel_engine_cs *engine)
>  		return;
>  
>  	ring_hung = engine->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG;
> +	if (engine->hangcheck.seqno != intel_engine_get_seqno(engine))
> +		ring_hung = false;
> +
>  	i915_set_reset_status(request->ctx, ring_hung);
>  	if (!ring_hung)
>  		return;
> -- 
> 2.9.3
