From: Imre Deak <imre.deak@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org, Mika Kuoppala <mika.kuoppala@intel.com>
Subject: Re: [PATCH 1/2] drm/i915: Make sure engines are idle during GPU idling in LR mode
Date: Thu, 03 Nov 2016 22:57:23 +0200 [thread overview]
Message-ID: <1478206643.30482.28.camel@intel.com> (raw)
In-Reply-To: <20161103185927.GC5674@nuc-i3427.alporthouse.com>
On Thu, 2016-11-03 at 18:59 +0000, Chris Wilson wrote:
> On Thu, Nov 03, 2016 at 06:19:37PM +0200, Imre Deak wrote:
> > We assume that the GPU is idle once receiving the seqno via the last
> > request's user interrupt. In execlist mode the corresponding context
> > completed interrupt can be delayed though and until this latter
> > interrupt arrives we consider the request to be pending on the ELSP
> > submit port. This can cause a problem during system suspend where this
> > last request will be seen by the resume code as still pending. Such
> > pending requests are normally replayed after a GPU reset, but during
> > resume we reset both SW and HW tracking of the ring head/tail pointers,
> > so replaying the pending request with its stale tale pointer will leave
> > the ring in an inconsistent state. A subsequent request submission can
> > lead then to the GPU executing from uninitialized area in the ring
> > behind the above stale tail pointer.
> >
> > Fix this by making sure any pending request on the ELSP port is
> > completed before suspending. I used a polling wait since the completion
> > time I measured was <1ms and since normally we only need to wait during
> > system suspend. GPU idling during runtime suspend is scheduled with a
> > delay (currently 50-100ms) after the retirement of the last request at
> > which point the context completed interrupt must have arrived already.
> >
> > The chance of this bug was increased by
> >
> > commit 1c777c5d1dcdf8fa0223fcff35fb387b5bb9517a
> > Author: Imre Deak <imre.deak@intel.com>
> > Date: Wed Oct 12 17:46:37 2016 +0300
> >
> > drm/i915/hsw: Fix GPU hang during resume from S3-devices state
> >
> > but it could happen even without the explicit GPU reset, since we
> > disable interrupts afterwards during the suspend sequence.
> >
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98470
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 3 +++
> > drivers/gpu/drm/i915/intel_lrc.c | 12 ++++++++++++
> > drivers/gpu/drm/i915/intel_lrc.h | 1 +
> > 3 files changed, 16 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 1f995ce..5ff02b5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2766,6 +2766,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
> > if (dev_priv->gt.active_requests)
> > goto out_unlock;
> >
> > + if (i915.enable_execlists)
> > + intel_lr_wait_engines_idle(dev_priv);
>
> Idle work handler... So runtime suspend.
> Anyway this is not an ideal place for a stall under struct_mutex (even if
> 16x10us, it's the principle!).
During runtime suspend this won't add any overhead since the context
done interrupt happened already (unless there is a bug somewhere else).
> Move this to before the first READ_ONCE(dev_priv->gt.active_requests);
> so we stall before taking the lock, and skip if any new requests arrive
> whilst waiting.
>
> (Also i915.enable_execlists is forbidden. But meh)
>
> static struct drm_i915_gem_request *
> execlists_active_port(struct intel_engine_cs *engine)
> {
> struct drm_i915_gem_request *request;
>
> request = READ_ONCE(engine->execlist_port[1]);
> if (request)
> return request;
>
> return READ_ONCE(engine->execlist_port[0]);
> }
>
> /* Wait for execlists to settle, but bail if any new requests come in */
> for_each_engine(engine, dev_priv, id) {
> struct drm_i915_gem_request *request;
>
> request = execlists_active_port(engine);
> if (!request)
> continue;
>
> if (wait_for(execlists_active_port(engine) != request, 10))
> DRM_ERROR("Timeout waiting for %s to idle\n", engine->name);
> }
Hm, but we still need to re-check and bail out if not idle with
struct_mutex held, since gt.active_requests could go 0->1->0 before
taking struct_mutex? I can rewrite things with that check added, using
the above.
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2016-11-03 20:57 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-03 16:19 [PATCH 1/2] drm/i915: Make sure engines are idle during GPU idling in LR mode Imre Deak
2016-11-03 16:19 ` [PATCH 2/2] drm/i915: Add assert for no pending GPU requests during resume " Imre Deak
2016-11-03 19:08 ` Chris Wilson
2016-11-03 16:45 ` ✗ Fi.CI.BAT: warning for series starting with [1/2] drm/i915: Make sure engines are idle during GPU idling " Patchwork
2016-11-03 16:49 ` [PATCH 1/2] " Tvrtko Ursulin
2016-11-03 18:59 ` Chris Wilson
2016-11-03 20:57 ` Imre Deak [this message]
2016-11-03 21:14 ` Chris Wilson
2016-11-04 20:33 ` Imre Deak
2016-11-04 21:01 ` Chris Wilson
2016-11-04 22:32 ` Imre Deak
2016-11-05 0:11 ` Imre Deak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1478206643.30482.28.camel@intel.com \
--to=imre.deak@intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
--cc=mika.kuoppala@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.