public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Michel Thierry <michel.thierry@intel.com>
To: intel-gfx@lists.freedesktop.org
Subject: [PATCH v9 RFC 08/21] drm/i915: Carry on with reset even if hw engine is not ready
Date: Thu, 15 Jun 2017 13:18:15 -0700	[thread overview]
Message-ID: <20170615201828.23144-9-michel.thierry@intel.com> (raw)
In-Reply-To: <20170615201828.23144-1-michel.thierry@intel.com>

We try to get the engines ready/idle before triggering the reset, but it
has been seen that sometimes the hw never acknowledges this.

If we miss the acknowledgment, carry on with the reset instead of
leaving the GPU in a wedged state.

The frequency of missed acknowledgment from hw is low, but it has been
seen at least once in CI.

References: https://intel-gfx-ci.01.org/CI/Trybot_831/
Reported-by: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 1ed3dd8df850..b99b7c69a525 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1630,8 +1630,12 @@ static int gen8_reset_engine_start(struct intel_engine_cs *engine)
 					 RESET_CTL_READY_TO_RESET,
 					 RESET_CTL_READY_TO_RESET,
 					 700);
-	if (ret)
-		DRM_ERROR("%s: reset request timeout\n", engine->name);
+	if (GEM_WARN_ON(ret)) {
+		/* hw did not ack ready-to-reset, reset anyway */
+		DRM_DEBUG_DRIVER("%s: reset request timeout, continue\n",
+				 engine->name);
+		ret = 0;
+	}
 
 	return ret;
 }
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2017-06-15 20:18 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-15 20:18 [PATCH v9 00/21] Gen8+ engine-reset Michel Thierry
2017-06-15 20:18 ` [PATCH v9 01/21] drm/i915: Look for active requests earlier in the reset path Michel Thierry
2017-06-15 20:18 ` [PATCH v9 02/21] drm/i915: Update i915.reset to handle engine resets Michel Thierry
2017-06-15 20:18 ` [PATCH v9 03/21] drm/i915: Modify error handler for per engine hang recovery Michel Thierry
2017-06-15 20:18 ` [PATCH v9 04/21] drm/i915: Include reset engine information in has_gpu_reset getparam Michel Thierry
2017-06-15 20:18 ` [PATCH v9 05/21] drm/i915: Add support for per engine reset recovery Michel Thierry
2017-06-19 12:31   ` Chris Wilson
2017-06-19 12:46   ` Chris Wilson
2017-06-19 18:42     ` Michel Thierry
2017-06-15 20:18 ` [PATCH v9 06/21] drm/i915: Add engine reset count to error state Michel Thierry
2017-06-15 20:18 ` [PATCH v9 07/21] drm/i915: Export per-engine reset count info to debugfs Michel Thierry
2017-06-15 20:18 ` Michel Thierry [this message]
2017-06-15 20:18 ` [PATCH v9 09/21] drm/i915: Enable Engine reset and recovery support Michel Thierry
2017-06-15 20:18 ` [PATCH v9 10/21] drm/i915: Add engine reset count in get-reset-stats ioctl Michel Thierry
2017-06-15 21:14   ` Chris Wilson
2017-06-15 21:23     ` Michel Thierry
2017-06-15 20:18 ` [PATCH v9 11/21] drm/i915/selftests: reset engine self tests Michel Thierry
2017-06-15 20:18 ` [PATCH v9 12/21] drm/i915/guc: fix mmio whitelist mmio_start offset and add reminder Michel Thierry
2017-06-15 20:18 ` [PATCH v9 13/21] drm/i915/guc: Provide register list to be saved/restored during engine reset Michel Thierry
2017-06-15 20:18 ` [PATCH v9 14/21] drm/i915/guc: Rename the function that resets the GuC Michel Thierry
2017-06-15 20:18 ` [PATCH v9 15/21] drm/i915/guc: Add support for reset engine using GuC commands Michel Thierry
2017-06-15 20:18 ` [PATCH v9 16/21] drm/i915: Watchdog timeout: Pass GuC shared data structure during param load Michel Thierry
2017-06-15 20:18 ` [PATCH v9 17/21] drm/i915: Watchdog timeout: IRQ handler for gen8+ Michel Thierry
2017-06-15 20:18 ` [PATCH v9 18/21] drm/i915: Watchdog timeout: Ringbuffer command emission " Michel Thierry
2017-06-15 20:18 ` [PATCH v9 19/21] drm/i915: Watchdog timeout: DRM kernel interface to set the timeout Michel Thierry
2017-06-15 20:18 ` [PATCH v9 20/21] drm/i915: Watchdog timeout: Include threshold value in error state Michel Thierry
2017-06-15 20:18 ` [PATCH v9 21/21] drm/i915: Watchdog timeout: Export media reset count from GuC to debugfs Michel Thierry
2017-06-15 20:42 ` ✓ Fi.CI.BAT: success for Gen8+ engine-reset (rev13) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170615201828.23144-9-michel.thierry@intel.com \
    --to=michel.thierry@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox