From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>,
Daniel Vetter <daniel@ffwll.ch>,
dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org,
intel-gfx@lists.freedesktop.org, linux-media@vger.kernel.org
Subject: Re: [Linaro-mm-sig] [PATCH 10/11] dma-buf: Use seqlock to close RCU race in test_signaled_single
Date: Sun, 25 Sep 2016 22:43:08 +0200
Message-ID: <20160925204308.GP20761@phenom.ffwll.local>
In-Reply-To: <20160923140232.GD28107@nuc-i3427.alporthouse.com>
On Fri, Sep 23, 2016 at 03:02:32PM +0100, Chris Wilson wrote:
> On Fri, Sep 23, 2016 at 03:49:26PM +0200, Daniel Vetter wrote:
> > On Mon, Aug 29, 2016 at 08:08:33AM +0100, Chris Wilson wrote:
> > > With the seqlock now extended to cover the lookup of the fence and its
> > > testing, we can perform that testing solely under the seqlock guard and
> > > avoid the effective locking and serialisation of acquiring a reference to
> > > the request. As the fence is RCU protected we know it cannot disappear
> > > as we test it, the same guarantee that made it safe to acquire the
> > > reference previously. The seqlock tests whether the fence was replaced
> > > as we are testing it telling us whether or not we can trust the result
> > > (if not, we just repeat the test until stable).
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > Cc: linux-media@vger.kernel.org
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: linaro-mm-sig@lists.linaro.org
> >
> > Not entirely sure this is safe for non-i915 drivers. We might now call
> > ->signalled on a zombie fence (i.e. refcount == 0, but not yet kfreed).
> > i915 can do that, but other drivers might go boom.
>
> All fences must be under RCU guard, or is that the sticking point? Given
> that, the problem is fence reallocation within the RCU grace period. If
> we can mandate that fence reallocation must be safe for concurrent
> fence->ops->*(), we can use this technique to avoid the serialisation
> barrier otherwise required. In the simple stress test, the difference is
> an order of magnitude, and test_signaled_rcu is often on a path where
> every memory barrier quickly adds up (at least for us).
>
> So is it just that you worry that others using SLAB_DESTROY_BY_RCU won't
> ensure their fence is safe during the reallocation?

Before your patch the RCU-protected part was just the
kref_get_unless_zero. This was done before calling down into any
fence->ops->* functions. Which means the code of these functions was
guaranteed to run on a real fence object, and not a zombie fence in the
process of getting destroyed.
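
To make that ordering concrete, here is a minimal userspace sketch of the
"get unless zero, then call the callback" pattern. It uses C11 atomics in
place of kref and leaves RCU out entirely; every sketch_* name is
hypothetical and none of this is the actual dma-buf code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's struct fence. */
struct sketch_fence;

struct sketch_fence_ops {
	bool (*signaled)(struct sketch_fence *f);
};

struct sketch_fence {
	atomic_uint refcount;			/* 0 means zombie: being torn down */
	const struct sketch_fence_ops *ops;
	bool hw_done;				/* driver-private state read by ->signaled */
};

/* Same idea as kref_get_unless_zero(): take a reference only if one still exists. */
static bool sketch_get_unless_zero(struct sketch_fence *f)
{
	unsigned int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded, so the zero check is repeated. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

static void sketch_put(struct sketch_fence *f)
{
	unsigned int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old > 0);
	(void)old;
}

/*
 * Returns 1 if signaled, 0 if not, -1 if the fence was a zombie and the
 * caller has to look the pointer up again. In the kernel the lookup and
 * the get would sit inside an RCU read-side critical section.
 */
static int sketch_test_signaled(struct sketch_fence *f)
{
	int ret;

	if (!sketch_get_unless_zero(f))
		return -1;	/* ->ops is never reached on a zombie */

	ret = f->ops->signaled(f) ? 1 : 0;
	sketch_put(f);
	return ret;
}

/* Example driver callback (assumed for illustration). */
static bool sketch_hw_signaled(struct sketch_fence *f)
{
	return f->hw_done;
}

static const struct sketch_fence_ops sketch_ops = {
	.signaled = sketch_hw_signaled,
};
```

The point of the sketch is only the ordering: the driver callback runs
strictly after a reference has been obtained, so it never sees a fence
whose refcount has already dropped to zero.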

Afaiui with your patch we might call into fence->ops->* on these zombie
fences. That would be a behaviour change with rather big implications
(since we'd need to audit all existing implementations, and also make sure
all future ones will be ok too). Or I missed something again.

I think we could still do this trick, at least partially, by making sure
we only inspect generic fence state and never call into fence->ops before
we're guaranteed to have a real fence.
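
A rough userspace sketch of that partial trick, assuming a generic
"signaled" flag that the core can read without any driver involvement.
All demo_* names and DEMO_SIGNALED_BIT are hypothetical, and the seqcount
retry that the patch relies on to detect a replaced fence is not modelled
here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_SIGNALED_BIT 0x1u	/* hypothetical generic "signaled" flag */

struct demo_fence {
	atomic_uint flags;	/* generic fence state, owned by the core */
	atomic_uint refcount;	/* 0 means zombie */
	unsigned int ops_calls;	/* counts driver callback invocations (demo only) */
	bool hw_done;		/* driver-private state */
};

/* Stands in for a driver's ->signaled callback. */
static bool demo_ops_signaled(struct demo_fence *f)
{
	f->ops_calls++;
	return f->hw_done;
}

/* Take a reference only if the count is still non-zero. */
static bool demo_get_unless_zero(struct demo_fence *f)
{
	unsigned int old = atomic_load(&f->refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns 1 if signaled, 0 if not, -1 if the caller must retry the lookup. */
static int demo_test_signaled(struct demo_fence *f)
{
	unsigned int old;
	int ret = 0;

	/* Step 1: generic state only; no driver code runs here. */
	if (atomic_load(&f->flags) & DEMO_SIGNALED_BIT)
		return 1;

	/* Step 2: a real fence is required before any driver code may run. */
	if (!demo_get_unless_zero(f))
		return -1;

	if (demo_ops_signaled(f)) {
		atomic_fetch_or(&f->flags, DEMO_SIGNALED_BIT);
		ret = 1;
	}

	old = atomic_fetch_sub(&f->refcount, 1);
	assert(old > 0);
	(void)old;
	return ret;
}
```

In this sketch the fast path only helps once the flag is already set. An
unset flag says nothing until the driver callback has been run, so the
slow path with a full reference is still needed in that case.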

But atm (at least per Christian König) a fence won't eventually get
signalled without calling into ->ops functions, so there's a catch-22.

-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel