All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff McGee <jeff.mcgee@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v6] drm/i915/guc: Add support for reset engine using GuC commands
Date: Wed, 1 Nov 2017 13:41:13 -0700	[thread overview]
Message-ID: <20171101204113.GA18571@jeffdesk> (raw)
In-Reply-To: <150954468470.23379.11108118716807118148@mail.alporthouse.com>

On Wed, Nov 01, 2017 at 01:58:04PM +0000, Chris Wilson wrote:
> Quoting Michel Thierry (2017-10-31 22:53:09)
> > This patch adds per engine reset and recovery (TDR) support when GuC is
> > used to submit workloads to GPU.
> > 
> > In the case of i915 directly submission to ELSP, driver manages hang
> > detection, recovery and resubmission. With GuC submission these tasks
> > are shared between driver and GuC. i915 is still responsible for detecting
> > a hang, and when it does it only requests GuC to reset that Engine. GuC
> > internally manages acquiring forcewake and idling the engine before
> > resetting it.
> > 
> > Once the reset is successful, i915 takes over again and handles the
> > resubmission. The scheduler in i915 knows which requests are pending so
> > after resetting a engine, pending workloads/requests are resubmitted
> > again.
> > 
> > v2: s/i915_guc_request_engine_reset/i915_guc_reset_engine/ to match the
> > non-guc function names.
> > 
> > v3: Removed debug message about engine restarting from which request,
> > since the new baseline do it regardless of submission mode. (Chris)
> > 
> > v4: Rebase.
> > 
> > v5: Do not pass unnecessary reporting flags to the fw (Jeff);
> > tasklet_schedule(&execlists->irq_tasklet) handles the resubmit; rebase.
> > 
> > v6: Rename the existing reset engine function and share a similar
> > interface between guc and non-guc paths (Chris).
> > 
> > Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c       | 15 +++++++++++++--
> >  drivers/gpu/drm/i915/i915_drv.h       |  2 ++
> >  drivers/gpu/drm/i915/intel_guc.c      | 24 ++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_guc_fwif.h |  1 +
> >  drivers/gpu/drm/i915/intel_uncore.c   |  5 -----
> >  5 files changed, 40 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index af745749509c..359333a423cf 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1950,6 +1950,12 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> >         goto finish;
> >  }
> >  
> > +static inline int intel_gt_reset_engine(struct drm_i915_private *dev_priv,
> > +                                       struct intel_engine_cs *engine)
> > +{
> > +       return intel_gpu_reset(dev_priv, intel_engine_flag(engine));
> > +}
> > +
> >  /**
> >   * i915_reset_engine - reset GPU engine to recover from a hang
> >   * @engine: engine to reset
> > @@ -1984,10 +1990,15 @@ int i915_reset_engine(struct intel_engine_cs *engine, unsigned int flags)
> >                 goto out;
> >         }
> >  
> > -       ret = intel_gpu_reset(engine->i915, intel_engine_flag(engine));
> > +       if (!engine->i915->guc.execbuf_client)
> > +               ret = intel_gt_reset_engine(engine->i915, engine);
> > +       else
> > +               ret = intel_guc_reset_engine(&engine->i915->guc, engine);
> > +
> >         if (ret) {
> >                 /* If we fail here, we expect to fallback to a global reset */
> > -               DRM_DEBUG_DRIVER("Failed to reset %s, ret=%d\n",
> > +               DRM_DEBUG_DRIVER("%sFailed to reset %s, ret=%d\n",
> > +                                (engine->i915->guc.execbuf_client ? "GUC ":""),
> 
> A bit overkill on the parentheses there ;)
> 
> Lgtm, can you please ping, say, Jeff or Daniele for an r-b on the guc
> interaction?
> -Chris

There is one small change needed in the GuC preemption protocol to make it
compatible with GuC engine reset. I will send that shortly.

There are also a couple of corner case bugs with engine reset in our current
firmware versions. We are planning a firmware update to address those. But
the host-side code here is fine. So...

Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2017-11-01 20:47 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-30 18:56 [PATCH 0/3] GuC based reset engine Michel Thierry
2017-10-30 18:56 ` [PATCH 1/3] drm/i915/guc: Rename the function that resets the GuC Michel Thierry
2017-10-30 21:02   ` Chris Wilson
2017-10-30 18:56 ` [PATCH 2/3] drm/i915/guc: Add support for reset engine using GuC commands Michel Thierry
2017-10-30 20:58   ` Chris Wilson
2017-10-30 21:08     ` Michel Thierry
2017-10-30 21:09   ` Chris Wilson
2017-10-31  4:38     ` Michel Thierry
2017-10-31 10:17       ` Chris Wilson
2017-10-31 22:53   ` [PATCH v6] " Michel Thierry
2017-11-01 13:58     ` Chris Wilson
2017-11-01 20:41       ` Jeff McGee [this message]
2017-11-02  8:43         ` Chris Wilson
2017-10-30 18:56 ` [PATCH 3/3] HAX enable GuC submission for CI Michel Thierry
2017-10-30 20:05 ` ✓ Fi.CI.BAT: success for GuC based reset engine Patchwork
2017-10-30 21:14   ` Chris Wilson
2017-10-30 23:20 ` ✗ Fi.CI.IGT: warning " Patchwork
2017-10-31 10:20   ` Chris Wilson
2017-10-31 20:56     ` Michel Thierry
2017-10-31 21:31       ` Chris Wilson
2017-10-31 23:50 ` ✓ Fi.CI.BAT: success for GuC based reset engine (rev2) Patchwork
2017-11-01  0:59 ` ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171101204113.GA18571@jeffdesk \
    --to=jeff.mcgee@intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.