public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: Abandon the reset if we fail to stop the engines
Date: Thu, 26 Oct 2017 15:59:05 +0300	[thread overview]
Message-ID: <20171026125905.GP10981@intel.com> (raw)
In-Reply-To: <20171026121212.22900-1-chris@chris-wilson.co.uk>

On Thu, Oct 26, 2017 at 01:12:12PM +0100, Chris Wilson wrote:
> Some machines, *cough* snb *cough*, fail catastrophically if asked to
> reset the GPU under certain conditions.

Did we try skipping the gen6_rps_disable() already?

> The initial guess is that this
> is when the rings are still busy at the time of the reset request
> (because that's a pattern we've seen elsewhere, hence why we do try
> gen3_stop_engines() before reset) so abandon the reset and leave the
> device wedged, if gen3_stop_engines() fails.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103240
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> Whee! Let's see how much breaks!
> -Chris
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 33 ++++++++++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 20e3c65c0999..c9a254b6125f 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1372,20 +1372,23 @@ int i915_reg_read_ioctl(struct drm_device *dev,
>  	return ret;
>  }
>  
> -static void gen3_stop_engine(struct intel_engine_cs *engine)
> +static bool gen3_stop_engine(struct intel_engine_cs *engine)
>  {
>  	struct drm_i915_private *dev_priv = engine->i915;
>  	const u32 base = engine->mmio_base;
>  	const i915_reg_t mode = RING_MI_MODE(base);
>  
> +
>  	I915_WRITE_FW(mode, _MASKED_BIT_ENABLE(STOP_RING));
>  	if (intel_wait_for_register_fw(dev_priv,
>  				       mode,
>  				       MODE_IDLE,
>  				       MODE_IDLE,
> -				       500))
> +				       500)) {
>  		DRM_DEBUG_DRIVER("%s: timed out on STOP_RING\n",
>  				 engine->name);
> +		return false;
> +	}
>  
>  	I915_WRITE_FW(RING_CTL(base), 0);
>  	I915_WRITE_FW(RING_HEAD(base), 0);
> @@ -1395,19 +1398,32 @@ static void gen3_stop_engine(struct intel_engine_cs *engine)
>  	if (I915_READ_FW(RING_HEAD(base)) != 0)
>  		DRM_DEBUG_DRIVER("%s: ring head not parked\n",
>  				 engine->name);
> +
> +	return true;
>  }
>  
> -static void i915_stop_engines(struct drm_i915_private *dev_priv,
> -			      unsigned engine_mask)
> +static int i915_stop_engines(struct drm_i915_private *dev_priv,
> +			     unsigned engine_mask)
>  {
>  	struct intel_engine_cs *engine;
>  	enum intel_engine_id id;
> +	bool idle;
>  
>  	if (INTEL_GEN(dev_priv) < 3)
> -		return;
> +		return true;
>  
> +	idle = true;
>  	for_each_engine_masked(engine, dev_priv, engine_mask, id)
> -		gen3_stop_engine(engine);
> +		idle &= gen3_stop_engine(engine);
> +	if (idle)
> +		return idle;
> +
> +	dev_err(dev_priv->drm.dev, "Failed to stop all engines\n");
> +	for_each_engine_masked(engine, dev_priv, engine_mask, id) {
> +		struct drm_printer p = drm_debug_printer(__func__);
> +		intel_engine_dump(engine, &p);
> +	}
> +	return false;
>  }
>  
>  static bool i915_reset_complete(struct pci_dev *pdev)
> @@ -1768,7 +1784,10 @@ int intel_gpu_reset(struct drm_i915_private *dev_priv, unsigned engine_mask)
>  		 *
>  		 * FIXME: Wa for more modern gens needs to be validated
>  		 */
> -		i915_stop_engines(dev_priv, engine_mask);
> +		if (!i915_stop_engines(dev_priv, engine_mask)) {
> +			ret = -EIO;
> +			break;
> +		}
>  
>  		ret = -ENODEV;
>  		if (reset)
> -- 
> 2.15.0.rc2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2017-10-26 12:59 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-26 12:12 [PATCH] drm/i915: Abandon the reset if we fail to stop the engines Chris Wilson
2017-10-26 12:33 ` ✗ Fi.CI.BAT: failure for " Patchwork
2017-10-26 12:46   ` Chris Wilson
2017-10-26 12:59 ` Ville Syrjälä [this message]
2017-10-26 13:11   ` [PATCH] " Chris Wilson
2017-10-26 13:30     ` Ville Syrjälä

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171026125905.GP10981@intel.com \
    --to=ville.syrjala@linux.intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox