All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: Danilo Krummrich <dakr@kernel.org>
Cc: phasta@kernel.org, Sumit Semwal <sumit.semwal@linaro.org>,
	Boris Brezillon <boris.brezillon@collabora.com>,
	Alice Ryhl <aliceryhl@google.com>,
	Daniel Almeida <dwlsalmeida@gmail.com>,
	Gary Guo <gary@garyguo.net>,
	Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
Date: Tue, 9 Jun 2026 10:17:06 +0200	[thread overview]
Message-ID: <db43fb09-5aaa-417a-962b-dfb4bfd548ed@amd.com> (raw)
In-Reply-To: <DJ3WX3B6MYBN.2L8NIJ1ELIYJ8@kernel.org>

On 6/8/26 21:25, Danilo Krummrich wrote:
> On Mon Jun 8, 2026 at 8:47 PM CEST, Christian König wrote:
>> On 6/8/26 20:39, Danilo Krummrich wrote:
>>> On Mon Jun 8, 2026 at 8:32 PM CEST, Christian König wrote:
>>>> On 6/8/26 19:59, Danilo Krummrich wrote:
>>>>> On Mon Jun 8, 2026 at 7:34 PM CEST, Christian König wrote:
>>>>>> That's why we need the RCU grace period to make sure that nobody is
>>>>>> referencing the driver stuff any more.
>>>>>
>>>>> Right, and that's what Philipp tries to address, the requirement to wait for an
>>>>> RCU grace period is perfectly fine if it is only about freeing memory, but it
>>>>> can become painful if the fence private data contains data also needs to be
>>>>> destructed in some way.
>>>>
>>>> Yeah that makes sense.
>>>>
>>>>> IOW, if a driver signals a fence, it is lifecycle-wise reasonable to destruct
>>>>> the private data that is no longer needed (remaining users only deal with struct
>>>>> dma_fence) and having to wait for a full grace period adds sublety and
>>>>> complication that can be avoided with the proposed approach.
>>>>
>>>> Yeah, I've run into that when I tried to make the amdgpu fences independent as well.
>>>>> That said, I'd like to ask the opposite question: What are the concerns with the
>>>>> proposed approach over (pure) RCU?
>>>>
>>>> Well a) locking inversions and b) performance.
>>>>
>>>> For example the reason why we have the dma_fence_is_signaled() and
>>>> dma_fence_is_signaled_locked() variants is because there is a measurable
>>>> difference in some specific use cases for not grabbing the locks.
>>>
>>> I checked for this as well, but couldn't find a case where
>>> dma_fence_is_signaled() is used in a way where it would be performance critical
>>> to avoid the lock in any way.
>>>
>>> Note that the lock is only bypassed when the fence is signaled already (this
>>> would be preserved) and if signaled() returns false, i.e. dma_fence_signal()
>>> will take the lock anyways.
>>>
>>>> I personally find those micro-optimizations rather questionable, but the
>>>> community agreement is that we should have them.
>>>
>>> I agree, it is rather questionable. So, I wouldn't make this the deciding factor
>>> unless someone can present a valid case where it actually matters.
>>>
>>>> So my take would rather be that the dma_fence_is_signaled_locked() variant
>>>> goes away and we consistently call the ops pointers without holding the
>>>> dma_fence lock and the driver implementations can then optionally take it if
>>>> necessary.
>>>
>>> How did you get to this conclusion considering that you run into what I
>>> mentioned above as well and the fact that we seem to agree that the performance
>>> concern is rather questionable?
>>
>> Quite simple, it's the cleaner approach.
> 
> I would maybe agree iff the RCU read side critical section wouldn't be needed
> and we wouldn't need to deal with the consequences of having to defer
> everything.
> 
> And so far it seems to me that there isn't really any other reason that the
> performance concern we both don't buy into.
> 
>> Calling callbacks with locks held is rather questionable even putting the
>> performance issue aside.
> 
> In general, I don't think that more flexibility for drivers is automatically
> always superior.
> 
> Also, before we keep calling it a performance issue, I'd really love to know
> where dma_fence_is_signaled() is called in a case where it returns false and the
> spinlock causes such an overhead that it actually matters.
> 
> (As mentioned above, none of the cases where it returns true would change.)
> 
>> In detail calling the callbacks without holding locks allows all
>> implementations who need it to explicitly take locks in the order they want.
> 
> I don't think this is true in this case.
> 
>   1) The existence of dma_fence_is_signaled_locked() already mandates that all
>      such callbacks must work properly if called with the fence lock held.

For the deadline callback it is mandatory to *not* hold the lock.

The signaled callback can be called with and without holding the lock.

The enable_signaled callback is always called while holding the lock.

> 
>   2) The RCU read side critical section already mandates that driver must not
>      sleep within the callback.
> 
>> If you call it with the lock held you enforce the fence lock the be the
>> outermost lock.
> 
> That's practically already the case, due to dma_fence_is_signaled_locked().

Yeah, but we can fix this and the enable signaling callback.

But the other way around isn't easily possible as far as I can see.

Regards,
Christian.



  reply	other threads:[~2026-06-09  8:17 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08 14:24 [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking Philipp Stanner
2026-06-08 14:43 ` sashiko-bot
2026-06-08 15:01 ` Boris Brezillon
2026-06-08 15:17   ` Philipp Stanner
2026-06-08 15:23     ` Danilo Krummrich
2026-06-08 15:30       ` Boris Brezillon
2026-06-08 15:30       ` Philipp Stanner
2026-06-08 16:16         ` Boris Brezillon
2026-06-09  8:02           ` Christian König
2026-06-09  8:54             ` Philipp Stanner
2026-06-09  8:43           ` Philipp Stanner
2026-06-09  8:47             ` Christian König
2026-06-09  9:00               ` Philipp Stanner
2026-06-08 15:07 ` Tvrtko Ursulin
2026-06-08 15:15   ` Philipp Stanner
2026-06-08 15:35 ` Christian König
2026-06-08 15:41   ` Philipp Stanner
2026-06-08 17:34     ` Christian König
2026-06-08 17:59       ` Danilo Krummrich
2026-06-08 18:32         ` Christian König
2026-06-08 18:39           ` Danilo Krummrich
2026-06-08 18:47             ` Christian König
2026-06-08 19:25               ` Danilo Krummrich
2026-06-09  8:17                 ` Christian König [this message]
2026-06-09  5:52               ` Philipp Stanner
2026-06-09 10:26                 ` Christian König
2026-06-09 10:42                   ` Philipp Stanner
2026-06-09 10:53                     ` Christian König
2026-06-09 11:39                       ` Philipp Stanner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=db43fb09-5aaa-417a-962b-dfb4bfd548ed@amd.com \
    --to=christian.koenig@amd.com \
    --cc=aliceryhl@google.com \
    --cc=boris.brezillon@collabora.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=dwlsalmeida@gmail.com \
    --cc=gary@garyguo.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=phasta@kernel.org \
    --cc=sumit.semwal@linaro.org \
    --cc=tvrtko.ursulin@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.