public inbox for intel-gfx@lists.freedesktop.org
From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Daniel Vetter <daniel@ffwll.ch>, Tomas Elf <tomas.elf@intel.com>,
	Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	intel-gfx@lists.freedesktop.org,
	Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: Re: [PATCH] drm/i915: Reset request handling for gen8+
Date: Mon, 22 Jun 2015 14:50:05 +0200
Message-ID: <20150622125005.GV25769@phenom.ffwll.local>
In-Reply-To: <20150619163045.GA29508@nuc-i3427.alporthouse.com>

On Fri, Jun 19, 2015 at 05:30:45PM +0100, Chris Wilson wrote:
> On Thu, Jun 18, 2015 at 04:58:06PM +0200, Daniel Vetter wrote:
> > On Thu, Jun 18, 2015 at 12:42:55PM +0100, Chris Wilson wrote:
> > > I understand the merit in trying the reset a few times before giving up,
> > > it would just need a bit of restructuring to try the reset before
> > > clearing gem state (trivial) and requeueing the hangcheck. I am just
> > > wary of feature creep before we get stuck into TDR, which promises to
> > > change how we think about resets entirely.
> > 
> > My maintainer concern here is always that we should err on the side of not
> > killing the machine. If the reset failed, or if the gpu reinit failed then
> > marking the gpu as wedged has historically been the safe option. The
> > system will still run, display mostly works and there's a reasonable
> > chance you can gather debug data.
> 
> One thing to bear in mind here is that with this particular "don't
> reset if not ready" logic, repeating the attempt at reset after another
> hangcheck is equivalent to just using a slower hangcheck. (more or less,
> a couple of writes to one register difference) So it is no more likely
> to hang the machine than the original GPU hang.
> 
> We can differentiate the cases here, between say EBUSY, ENODEV, and EIO,
> from the actual reset request to determine which we want to retry
> (i.e. EBUSY).

Tbh I don't want to make the reset code too clever with multiple fallback
paths - it's really tricky code and as-is already suffers from imo
insufficient test coverage and too many bugs. Once we've decided that the gpu
is dead and returned -EIO, this should be a terminal state. Developers can
always manually unwedge through debugfs, but for users it's imo paramount
that we don't automatically run some little-tested path and take down
their box in the process.
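
To make the two positions concrete, here is a minimal sketch of the policy
being discussed. It is not the i915 code: struct gpu_state and
handle_reset_result() are hypothetical names invented for illustration, and
the only assumption taken from the thread is that a reset request can
report -EBUSY, -ENODEV or -EIO. -EBUSY is treated as the one retryable case,
and anything else wedges the gpu for good.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state for illustration; not the i915 structures. */
struct gpu_state {
	bool wedged;           /* terminal: set once a reset has failed */
	unsigned int retries;  /* how many -EBUSY results were deferred */
};

/*
 * Record the outcome of one reset attempt.
 * Returns true if the caller may re-arm hangcheck and try again later.
 */
static bool handle_reset_result(struct gpu_state *gpu, int err)
{
	if (gpu->wedged)
		return false;   /* already terminal: no automatic recovery */

	switch (err) {
	case 0:
		gpu->retries = 0;
		return false;   /* reset succeeded, nothing to retry */
	case -EBUSY:
		gpu->retries++;
		return true;    /* engine not ready: the retry candidate */
	default:
		/* -EIO, -ENODEV or anything unexpected: mark the gpu wedged */
		gpu->wedged = true;
		return false;
	}
}
```

Once wedged is set, even a later -EBUSY is not retried in this sketch, so
only an explicit manual action (the debugfs unwedge mentioned above) would
clear the state.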
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

