public inbox for intel-gfx@lists.freedesktop.org
From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Daniel Vetter <daniel@ffwll.ch>,
	Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	intel-gfx@lists.freedesktop.org, miku@iki.fi
Subject: Re: [PATCH 2/2] drm/i915: Protect engine request list with spinlock
Date: Tue, 24 Feb 2015 13:57:56 +0100
Message-ID: <20150224125756.GY24485@phenom.ffwll.local>
In-Reply-To: <20150224105249.GD12726@nuc-i3427.alporthouse.com>

On Tue, Feb 24, 2015 at 10:52:49AM +0000, Chris Wilson wrote:
> On Tue, Feb 24, 2015 at 11:39:08AM +0100, Daniel Vetter wrote:
> > On Tue, Feb 24, 2015 at 08:31:18AM +0000, Chris Wilson wrote:
> > > On Tue, Feb 24, 2015 at 12:58:19AM +0100, Daniel Vetter wrote:
> > > > On Thu, Feb 19, 2015 at 04:41:12PM +0000, Chris Wilson wrote:
> > > > > On Thu, Feb 19, 2015 at 06:18:55PM +0200, Mika Kuoppala wrote:
> > > > > > There are multiple players interested in the ring->request_list
> > > > > > state. Request submission can happen in kernel or user context,
> > > > > > idle worker is going through request list to free items. And then there
> > > > > > is hangcheck worker which tries to figure out if particular ring is
> > > > > > healthy by peeking at the request list among other things. And if
> > > > > > > judged stuck by hangcheck, error state is collected. Which in turn
> > > > > > needs access to ring->request_list.
> > > > > 
> > > > > We have discussed this before. Hangcheck does not need the lock so long
> > > > > as it is serialised with deletion. List processing with hangcheck during
> > > > > concurrent addition is safe.
> > > > > 
> > > > > For example, I expect the request locking to look like
> > > > > 
> > > > > http://cgit.freedesktop.org/~ickle/linux-2.6/tree/drivers/gpu/drm/i915/i915_gem_request.c#n691
> > > > 
> > > > I think longer-term with per-engine reset and fun stuff like that we
> > > > probably want the spinlock, just to avoid too many headaches with locking
> > > > auditing. For the execbuf fastpath it should just be one more spinlock per
> > > > ioctl, so hopefully bearable.
> > > 
> > > But it is not even the locking bug that breaks capture, so what's the
> > > point?
> > 
> > Oh I've read the patch as general prep work for more finegrained reset
> > support not as a fix for the referenced bug. I guess the bug is just the
> > usual incoherent seqno/irq thing that's been plaguing us ever since gen6?
> 
> I presumed Mika wants to fix that hangcheck and capture may explode as
> requests are completed concurrently. The bug that I expect will remain
> is that we peek at the bo without locks during capture.

Well, my idea was that if we prevent request retiring with some minimal
spinlock, then that should be enough to prevent objects from getting
retired. I hoped that would be all we need to prevent everything else
from going poof too.

But thinking a bit more about this, we also need some additional checks in
the retire code: if it grabs the request spinlock, then it also needs to
check for an in-progress reset. And if one is signalled, it must not retire
any objects.
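
To make the retire-side check concrete, here is a minimal userspace sketch,
not the i915 code. Everything in it is an assumption for illustration: the
struct and function names are hypothetical, a pthread mutex stands in for
the kernel spinlock, and the seqno comparison ignores wraparound.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real i915 structures. */
struct request {
	struct request *next;
	unsigned int seqno;
	bool retired;
};

struct engine {
	pthread_mutex_t request_lock;	/* stands in for the request spinlock */
	struct request *request_list;	/* oldest request first */
	bool reset_in_progress;		/* set by the reset/capture path */
};

static void engine_init(struct engine *e)
{
	pthread_mutex_init(&e->request_lock, NULL);
	e->request_list = NULL;
	e->reset_in_progress = false;
}

/* Append a request at the tail of the list, under the lock. */
static void engine_add_request(struct engine *e, struct request *rq)
{
	struct request **p;

	rq->next = NULL;
	rq->retired = false;
	pthread_mutex_lock(&e->request_lock);
	for (p = &e->request_list; *p; p = &(*p)->next)
		;
	*p = rq;
	pthread_mutex_unlock(&e->request_lock);
}

/* Signal or clear an in-progress reset, under the same lock. */
static void engine_set_reset(struct engine *e, bool in_progress)
{
	pthread_mutex_lock(&e->request_lock);
	e->reset_in_progress = in_progress;
	pthread_mutex_unlock(&e->request_lock);
}

/*
 * Retire every request whose seqno is <= completed_seqno and return how
 * many were retired. If a reset is in progress, nothing is retired, so
 * the list stays intact for whoever is inspecting it.
 */
static int engine_retire_requests(struct engine *e, unsigned int completed_seqno)
{
	int count = 0;

	pthread_mutex_lock(&e->request_lock);
	if (!e->reset_in_progress) {
		while (e->request_list &&
		       e->request_list->seqno <= completed_seqno) {
			struct request *rq = e->request_list;

			e->request_list = rq->next;
			rq->next = NULL;
			rq->retired = true;
			count++;
		}
	}
	pthread_mutex_unlock(&e->request_lock);
	return count;
}
```

The point of checking the flag under the same lock that guards the list is
that the reset path can set it once and then know that no retire pass will
unlink a request afterwards.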

I think for hangcheck itself the spinlock alone should give sufficient
protection, as long as we store any state needed by hangcheck somewhere in
the request struct (or stuff hanging off it like contexts). But there's
indeed more trouble in the error capture code on top of that.
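
As a rough illustration of the hangcheck side, the sketch below copies what
it needs out of the oldest request while holding the lock, then works only
on the copy. It is a userspace approximation under stated assumptions: the
names are hypothetical, a pthread mutex stands in for the spinlock, ctx_id
is a placeholder for "state stored in the request", and the "same oldest
request twice in a row" test is a deliberate simplification of what a real
hangcheck would look at.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real i915 structures. */
struct request {
	struct request *next;
	unsigned int seqno;
	unsigned int ctx_id;	/* state hangcheck needs, kept in the request */
};

struct engine {
	pthread_mutex_t request_lock;	/* stands in for the request spinlock */
	struct request *request_list;	/* oldest request first */
};

/* Private copy of the state, safe to use after the lock is dropped. */
struct hangcheck_sample {
	bool busy;
	unsigned int seqno;
	unsigned int ctx_id;
};

/* Snapshot the oldest outstanding request while holding the lock. */
static struct hangcheck_sample hangcheck_sample_engine(struct engine *e)
{
	struct hangcheck_sample s = { false, 0, 0 };

	pthread_mutex_lock(&e->request_lock);
	if (e->request_list) {
		s.busy = true;
		s.seqno = e->request_list->seqno;
		s.ctx_id = e->request_list->ctx_id;
	}
	pthread_mutex_unlock(&e->request_lock);
	return s;
}

/* Simplified check: no progress if the oldest request has not changed. */
static bool hangcheck_stuck(const struct hangcheck_sample *prev,
			    const struct hangcheck_sample *now)
{
	return prev->busy && now->busy && prev->seqno == now->seqno;
}
```

Because the sample is a copy, hangcheck never dereferences a request pointer
outside the lock. Error capture, which wants to look at the objects
themselves, is not covered by this.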
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch