public inbox for intel-gfx@lists.freedesktop.org
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: Ignore stuck requests when considering hangs
Date: Mon, 22 Aug 2016 14:39:30 +0300
Message-ID: <87shtxm3rh.fsf@gaia.fi.intel.com>
In-Reply-To: <20160820145408.32180-1-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If the engine isn't being retired (worker starvation?) then it is
> possible for us to repeatedly observe that between consecutive
> hangchecks the seqno on the ring to be the same and there remain
> unretired requests. Ignore these completely and only regard the engine
> as busy for the purpose of hang detection (not stall detection) if there
> are outstanding breadcrumbs.
>
> In recent history we have looked at using both the request and seqno as
> indication of activity on the engine, but that was reduced to just
> inspecting seqno in commit cffa781e5907 ("drm/i915: Simplify check for
> idleness in hangcheck"). However, in commit dcff85c8443e ("drm/i915:
> Enable i915_gem_wait_for_idle() without holding struct_mutex"), I made
> the decision to use the new common lockless function, under the
> assumption that request retirement was more frequent than hangcheck and
> so we would not have a stuck busy check. The flaw there was in
> forgetting that we accumulate the hang score, and so successive checks
> seeing a stuck request, albeit with the GPU advancing elsewhere and so
> not necessary the same stuck request, would eventually trigger the hang.
>
> Fixes: dcff85c8443e ("drm/i915: Enable i915_gem_wait_for_idle()...")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index ebb83d5a448b..7610eca4f687 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -3079,6 +3079,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
>  		bool busy = intel_engine_has_waiter(engine);
>  		u64 acthd;
>  		u32 seqno;
> +		u32 submit;
>  
>  		semaphore_clear_deadlocks(dev_priv);
>  
> @@ -3094,9 +3095,10 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
>  
>  		acthd = intel_engine_get_active_head(engine);
>  		seqno = intel_engine_get_seqno(engine);
> +		submit = READ_ONCE(engine->last_submitted_seqno);
>  
>  		if (engine->hangcheck.seqno == seqno) {
> -			if (!intel_engine_is_active(engine)) {
> +			if (i915_seqno_passed(seqno, submit)) {

Setting busy could be moved into this scope.

Also, the check could be seqno == submit, with a warning if we see
the seqno on the engine past submit.

But the patch fixes what it says it does,

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

>  				engine->hangcheck.action = HANGCHECK_IDLE;
>  				if (busy) {
>  					/* Safeguard against driver failure */
> -- 
> 2.9.3
