public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
To: Daniel Vetter <daniel@ffwll.ch>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	intel-gfx@lists.freedesktop.org,
	Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: Re: [PATCH 01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance
Date: Wed, 10 Aug 2016 14:00:51 +0300	[thread overview]
Message-ID: <1470826851.4143.71.camel@linux.intel.com> (raw)
In-Reply-To: <20160810101359.GZ6232@phenom.ffwll.local>

On ke, 2016-08-10 at 12:13 +0200, Daniel Vetter wrote:
> On Wed, Aug 10, 2016 at 12:12:37PM +0200, Daniel Vetter wrote:
> > 
> > On Tue, Aug 09, 2016 at 10:05:30AM +0100, Chris Wilson wrote:
> > > 
> > > On Tue, Aug 09, 2016 at 11:48:56AM +0300, Joonas Lahtinen wrote:
> > > > 
> > > > On ti, 2016-08-09 at 08:14 +0100, Chris Wilson wrote:
> > > > > 
> > > > > On Tue, Aug 09, 2016 at 09:36:48AM +0300, Joonas Lahtinen wrote:
> > > > > > 
> > > > > > 
> > > > > > On ma, 2016-08-08 at 10:45 +0100, Chris Wilson wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On Mon, Aug 08, 2016 at 10:30:25AM +0100, Chris Wilson wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On Mon, Aug 08, 2016 at 11:12:59AM +0200, Daniel Vetter wrote:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Sun, Aug 07, 2016 at 03:45:09PM +0100, Chris Wilson wrote:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > In the debate as to whether the second read of active->request is
> > > > > > > > > > ordered after the dependent reads of the first read of active->request,
> > > > > > > > > > just give in and throw a smp_rmb() in there so that ordering of loads is
> > > > > > > > > > assured.
> > > > > > > > > > 
> > > > > > > > > > v2: Explain the manual smp_rmb()
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Chris Wilson 
> > > > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > > r-b confirmed.
> > > > > > > > It's still fishy that we are implying an SMP effect where we need to
> > > > > > > > mandate the local processor order (that being the order evaluation of
> > > > > > > > request = *active; engine = *request; *active). The two *active are
> > > > > > > > already ordered across SMP, so we are only concered about this cpu. :|
> > > > > > > More second thoughts. rcu_assign_pointer(NULL) is not visible to
> > > > > > > rcu_access_pointer on another CPU without the smp_rmb. 
> > > > > > Should not a RCU read side lock be involved?
> > > > > Yes, we use rcu read lock here. The question here is about visibility of
> > > > > the other processor writes vs the local processor order. Before the
> > > > > other processor can overwrite the request during reallocation, it will
> > > > > have updated the active->request and gone through a wmb. During busy
> > > > > ioctl's read of the request, we want to make sure that the values we
> > > > > read (request->engine, request->seqno) have not been overwritten as we
> > > > > do so - and we do that by serialising the second pointer check with the
> > > > > other cpus.
> > > > As discussed in IRC, some other mechanism than an improvised spinning
> > > > loop + some SMP barriers thrown around would be much preferred.
> > > > 
> > > > You suggested a seqlock, and it would likely be ok.
> > > I was comparing the read latching as they are identical. Using a
> > > read/write seqlock around the request modification does not prevent all
> > > dangers such as using kzalloc() and introduces a second sequence counter
> > > to the one we already have. And for good reason seqlock says to use RCU
> > > here. Which puts us in a bit of a catch-22 and having to guard against
> > > SLAB_DESTROY_BY_RCU.
> > Since I was part of that irc party too: I think this is perfectly fine
> > as-is. Essentially what we do here is plain rcu+kref_get_unless_zero to
> > protect against zombies. This is exactly the design fence_get_rcu is meant
> > to be used in. But for this fastpath we can avoid even the zombie
> > protection if we're careful enough. Trying to make this less scary by
> > using a seqlock instead of plain rcu would run counter to the overall
> > fence_get_rcu design. And since the goal is to share fences widely and
> > far, we don't want to build up slightly different locking scheme here
> > where fence pointers are not really protected by rcu.
> > 
> > Given all that I think the open-coded scary dance (with lots of comments)
> > is acceptable, also since this clearly is a fastpath that userspace loves
> > to beat on.
> Still would like to see Joonas' r-b on the first few patches ofc.

With the extra comments;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

I still think it's fragile, though. But lets see once the dust settles
if we can make improvements.

Regards, Joonas

> -Daniel
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-08-10 11:01 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-07 14:45 First class VMA, take 2 Chris Wilson
2016-08-07 14:45 ` [PATCH 01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance Chris Wilson
2016-08-08  9:12   ` Daniel Vetter
2016-08-08  9:30     ` Chris Wilson
2016-08-08  9:45       ` Chris Wilson
2016-08-09  6:36         ` Joonas Lahtinen
2016-08-09  7:14           ` Chris Wilson
2016-08-09  8:48             ` Joonas Lahtinen
2016-08-09  9:05               ` Chris Wilson
2016-08-10 10:12                 ` Daniel Vetter
2016-08-10 10:13                   ` Daniel Vetter
2016-08-10 11:00                     ` Joonas Lahtinen [this message]
2016-08-12  9:50                       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 02/33] drm/i915: Do not overwrite the request with zero on reallocation Chris Wilson
2016-08-08  9:25   ` Daniel Vetter
2016-08-08  9:56     ` Chris Wilson
2016-08-09  6:32       ` Daniel Vetter
2016-08-07 14:45 ` [PATCH 03/33] drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs Chris Wilson
2016-08-09 14:08   ` [PATCH v2] " Chris Wilson
2016-08-09 14:10   ` [PATCH v3] " Chris Wilson
2016-08-09 15:24     ` Mika Kuoppala
2016-08-07 14:45 ` [PATCH 04/33] drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh Chris Wilson
2016-08-08  9:33   ` Daniel Vetter
2016-08-12  9:56   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 05/33] drm/i915: Reduce amount of duplicate buffer information captured on error Chris Wilson
2016-08-10  7:04   ` Joonas Lahtinen
2016-08-10  7:15     ` Chris Wilson
2016-08-10  8:07       ` Joonas Lahtinen
2016-08-10  8:36         ` Chris Wilson
2016-08-10 10:51           ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 06/33] drm/i915: Stop the machine whilst capturing the GPU crash dump Chris Wilson
2016-08-07 14:45 ` [PATCH 07/33] drm/i915: Store the active context object on all engines upon error Chris Wilson
2016-08-09  9:02   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 08/33] drm/i915: Move setting of request->batch into its single callsite Chris Wilson
2016-08-09 15:53   ` Mika Kuoppala
2016-08-09 16:04     ` Chris Wilson
2016-08-10  7:19   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 09/33] drm/i915: Mark unmappable GGTT entries as PIN_HIGH Chris Wilson
2016-08-08  9:09   ` Joonas Lahtinen
2016-08-09 11:05   ` Tvrtko Ursulin
2016-08-09 11:13     ` Chris Wilson
2016-08-09 11:20       ` Chris Wilson
2016-08-07 14:45 ` [PATCH 10/33] drm/i915: Remove inactive/active list from debugfs Chris Wilson
2016-08-09 10:29   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 11/33] drm/i915: Focus debugfs/i915_gem_pinned to show only display pins Chris Wilson
2016-08-09 10:39   ` Joonas Lahtinen
2016-08-09 10:46     ` Chris Wilson
2016-08-09 11:32       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 12/33] drm/i915: Reduce i915_gem_objects to only show object information Chris Wilson
2016-08-10  7:29   ` Joonas Lahtinen
2016-08-10  7:38     ` Chris Wilson
2016-08-10  8:10       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 13/33] drm/i915: Remove redundant WARN_ON from __i915_add_request() Chris Wilson
2016-08-08  9:03   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 14/33] drm/i915: Create a VMA for an object Chris Wilson
2016-08-08  9:01   ` Joonas Lahtinen
2016-08-08  9:09     ` Chris Wilson
2016-08-10 10:58       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 15/33] drm/i915: Track pinned vma inside guc Chris Wilson
2016-08-11 16:19   ` Dave Gordon
2016-08-11 16:41     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 16/33] drm/i915: Convert fence computations to use vma directly Chris Wilson
2016-08-09 10:27   ` Joonas Lahtinen
2016-08-09 10:33     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 17/33] drm/i915: Use VMA directly for checking tiling parameters Chris Wilson
2016-08-09  6:18   ` Joonas Lahtinen
2016-08-09  8:03     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 18/33] drm/i915: Use VMA as the primary object for context state Chris Wilson
2016-08-10  8:03   ` Joonas Lahtinen
2016-08-10  8:25     ` Chris Wilson
2016-08-10 10:54       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 19/33] drm/i915: Only clflush the context object when binding Chris Wilson
2016-08-10  8:41   ` Joonas Lahtinen
2016-08-10  9:02     ` Chris Wilson
2016-08-10 10:50       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 20/33] drm/i915: Use VMA for ringbuffer tracking Chris Wilson
2016-08-11  9:32   ` Joonas Lahtinen
2016-08-11  9:58     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 21/33] drm/i915: Use VMA for scratch page tracking Chris Wilson
2016-08-08  8:00   ` [PATCH 1/3] " Chris Wilson
2016-08-08  8:00     ` [PATCH 2/3] drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c Chris Wilson
2016-08-08  9:24       ` Matthew Auld
2016-08-08  8:00     ` [PATCH 3/3] drm/i915: Move common seqno reset " Chris Wilson
2016-08-08  9:40       ` Matthew Auld
2016-08-08 10:15         ` Chris Wilson
2016-08-08 15:34           ` Matthew Auld
2016-08-11 10:06   ` [PATCH 21/33] drm/i915: Use VMA for scratch page tracking Joonas Lahtinen
2016-08-11 10:22     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 22/33] drm/i915/overlay: Use VMA as the primary tracker for images Chris Wilson
2016-08-11 10:17   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 23/33] drm/i915: Use VMA as the primary tracker for semaphore page Chris Wilson
2016-08-11 10:42   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 24/33] drm/i915: Use VMA for render state page tracking Chris Wilson
2016-08-11 10:46   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 25/33] drm/i915: Use VMA for wa_ctx tracking Chris Wilson
2016-08-11 10:53   ` Joonas Lahtinen
2016-08-11 11:02     ` Chris Wilson
2016-08-11 12:41       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 26/33] drm/i915: Track pinned VMA Chris Wilson
2016-08-11 12:18   ` Joonas Lahtinen
2016-08-11 12:37     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 27/33] drm/i915: Print the batchbuffer offset next to BBADDR in error state Chris Wilson
2016-08-11 12:24   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 28/33] drm/i915: Move per-request pid from request to ctx Chris Wilson
2016-08-11 12:32   ` Joonas Lahtinen
2016-08-11 12:41     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 29/33] drm/i915: Only record active and pending requests upon a GPU hang Chris Wilson
2016-08-11 12:36   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 30/33] drm/i915: Record the RING_MODE register for post-mortem debugging Chris Wilson
2016-08-08 11:35   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 31/33] drm/i915: Always use the GTT for error capture Chris Wilson
2016-08-07 14:45 ` [PATCH 32/33] drm/i915: Consolidate error object printing Chris Wilson
2016-08-09 11:44   ` Joonas Lahtinen
2016-08-09 11:53     ` Chris Wilson
2016-08-10 10:55       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 33/33] drm/i915: Compress GPU objects in error state Chris Wilson
2016-08-10 10:32   ` Joonas Lahtinen
2016-08-10 10:52     ` Chris Wilson
2016-08-10 11:26       ` Joonas Lahtinen
2016-08-07 15:16 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance Patchwork
2016-08-08  9:46 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev4) Patchwork
2016-08-08 10:34 ` ✗ Fi.CI.BAT: " Patchwork
2016-08-09 14:10 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev5) Patchwork
2016-08-09 14:20 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev6) Patchwork
2016-08-10  6:43 ` Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1470826851.4143.71.camel@linux.intel.com \
    --to=joonas.lahtinen@linux.intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=daniel@ffwll.ch \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox