From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
intel-gfx@lists.freedesktop.org, stable@vger.kernel.org,
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 5/5] drm/i915: Solve the GPU reset vs. modeset deadlocks with an rw_semaphore
Date: Fri, 30 Jun 2017 18:44:32 +0300 [thread overview]
Message-ID: <20170630154432.GZ12629@intel.com> (raw)
In-Reply-To: <20170630133033.27yqunbfoawuz2wl@phenom.ffwll.local>
On Fri, Jun 30, 2017 at 03:30:33PM +0200, Daniel Vetter wrote:
> On Fri, Jun 30, 2017 at 03:25:58PM +0200, Daniel Vetter wrote:
> > On Thu, Jun 29, 2017 at 04:49:48PM +0300, ville.syrjala@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >
> > > Introduce an rw_semaphore to protect the display commits. All normal
> > > commits use down_read() and hence can proceed in parallel, but GPU reset
> > > will use down_write() making sure no other commits are in progress when
> > > we have to pull the plug on the display engine on pre-g4x platforms.
> > > There are no modeset/gem locks taken inside __intel_atomic_commit_tail()
> > > itself, and we wait for all dependencies before the down_read(), and
> > > thus there is no chance of deadlocks with this scheme.
> >
> > How does this solve the deadlock? Afaiui the deadlock is that the gpu
> > reset stopped unconditionally completing all requests before it did
> > anything else, including anything with the hw or taking modeset locks.
> >
> > This ensured that any outstanding flips (we only had pageflips, no atomic
> > ones back then) could complete (although maybe displaying garbage). The
> > only thing we had to do was grab the locks (to avoid concurrent modesets)
> > and then we could happily nuke the hw (since the flips where lost in the
> > CS anyway), and restore it afterwards.
> >
> > Then the TDR rewrite came around and broke that, which now makes atomic
> > time out waiting for the gpu to complete (because the gpu reset waits for
> > the modeset to complete first). Which means if you want to fix this
> > without breaking TDR, then you need to cancel the pending atomic commits.
> > That seems somewhat what you're attempting here with trying to figure out
> > what the latest hw-committed step is (but that function gets it wrong),
> > and lots more locking and stuff on top.
> >
> > Why exactly can't we just go back to the old world of force-completing
> > dead requests on gen4 and earlier? That would be tons simpler imo instead
> > of throwing a pile of hacks (which really needs a complete rewrite of the
> > atomic code in i915) in as a work-around. We didn't have TDR on gen4 and
> > earlier for years, going back to that isn't going to hurt anyone.
> >
> > Making working gen4 gpu reset contigent on cancellable atomic commits and
> > the full commit queue is imo nuts.
>
> And if the GEM folks insist the old behavior can't be restored, then we
> just need a tailor-made get-out-of-jail card for gen4 gpu reset somewhere
> in i915_sw_fence. Force-completing all render requests atomic updates
> depend upon is imo the simplest solution to this, and we've had a driver
> that worked like that for years.
And it used to break all the time. I think we've had to fix it at least
three times by now. So I tend to think it's better to fix it in a way
that won't break so easily.
--
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2017-06-30 15:44 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-29 13:49 [PATCH 0/5] drm/i915: Fix pre-g4x GPU reset, again ville.syrjala
2017-06-29 13:49 ` [PATCH 1/5] drm/atomic: Refactor drm_atomic_state_realloc_connectors() ville.syrjala
2017-06-30 13:15 ` Daniel Vetter
2017-06-29 13:49 ` [PATCH 2/5] drm/atomic: Introduce drm_atomic_helper_duplicate_committed_state() ville.syrjala
2017-06-29 13:49 ` [PATCH 3/5] drm/i915% Store vma gtt offset in plane state ville.syrjala
2017-06-29 13:49 ` [PATCH 4/5] drm/i915: Refactor __intel_atomic_commit_tail() ville.syrjala
2017-06-29 13:49 ` [PATCH 5/5] drm/i915: Solve the GPU reset vs. modeset deadlocks with an rw_semaphore ville.syrjala
2017-06-29 13:56 ` Chris Wilson
2017-06-29 14:36 ` [PATCH v4 " ville.syrjala
2017-06-29 17:57 ` Chris Wilson
2017-06-29 19:26 ` Ville Syrjälä
2017-06-30 13:35 ` Daniel Vetter
2017-06-30 13:53 ` Ville Syrjälä
2017-06-30 15:30 ` Daniel Vetter
2017-06-30 15:39 ` Chris Wilson
2017-07-03 8:03 ` Daniel Vetter
2017-07-03 8:16 ` Chris Wilson
2017-07-03 16:42 ` Daniel Vetter
2017-06-30 13:25 ` [PATCH " Daniel Vetter
2017-06-30 13:30 ` Daniel Vetter
2017-06-30 15:44 ` Ville Syrjälä [this message]
2017-06-30 18:23 ` Daniel Vetter
2017-06-30 18:46 ` Ville Syrjälä
2017-07-03 7:55 ` Daniel Vetter
2017-07-03 9:30 ` Ville Syrjälä
2017-07-03 16:48 ` Daniel Vetter
2017-06-29 14:24 ` ✓ Fi.CI.BAT: success for drm/i915: Fix pre-g4x GPU reset, again Patchwork
2017-06-29 15:07 ` Ville Syrjälä
2017-06-29 14:55 ` ✓ Fi.CI.BAT: success for drm/i915: Fix pre-g4x GPU reset, again (rev2) Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170630154432.GZ12629@intel.com \
--to=ville.syrjala@linux.intel.com \
--cc=daniel.vetter@ffwll.ch \
--cc=daniel@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).