public inbox for linux-media@vger.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
	dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, Philipp Stanner <phasta@kernel.org>,
	Boris Brezillon <boris.brezillon@collabora.com>,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org
Subject: Re: [PATCH 2/2] dma-fence: Fix potential tracepoint null pointer dereferences
Date: Wed, 15 Apr 2026 10:13:28 +0200	[thread overview]
Message-ID: <b3b75077-878b-4d4e-b18e-4976765c63db@amd.com> (raw)
In-Reply-To: <5fea79b5-ab5a-4a98-95c8-6452b20e83c4@igalia.com>

On 4/15/26 09:58, Tvrtko Ursulin wrote:
> 
> On 14/04/2026 19:30, Christian König wrote:
>> On 4/14/26 17:49, Tvrtko Ursulin wrote:
>>> Trace_dma_fence_signaled, trace_dma_fence_wait_end and
>>> trace_dma_fence_destroy can all currently dereference a null fence->ops
>>> pointer after it has been reset on fence signalling.
>>>
>>> Lets use the safe string getters for most tracepoints to avoid this class
>>> of a problem, while for the signal tracepoint we move it to before ops are
>>> cleared to avoid losing the driver and timeline name information. Apart
>>> from moving it we also need to add a new tracepoint class to bypass the
>>> safe name getters since the signaled bit is already set.
>>>
>>> For dma_fence_init we also need to use the new tracepoint class since the
>>> rcu read lock is not held there, and we can do the same for the enable
>>> signaling since there we are certain the fence cannot be signaled while
>>> we are holding the lock and have even validated the fence->ops.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>> Fixes: 541c8f2468b9 ("dma-buf: detach fence ops on signal v3")
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: Philipp Stanner <phasta@kernel.org>
>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>> Cc: linux-media@vger.kernel.org
>>> Cc: linaro-mm-sig@lists.linaro.org
>>> ---
>>>   drivers/dma-buf/dma-fence.c      |  3 ++-
>>>   include/trace/events/dma_fence.h | 33 ++++++++++++++++++++++++++++----
>>>   2 files changed, 31 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>>> index a2aa82f4eedd..b3bfa6943a8e 100644
>>> --- a/drivers/dma-buf/dma-fence.c
>>> +++ b/drivers/dma-buf/dma-fence.c
>>> @@ -363,6 +363,8 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>>>                         &fence->flags)))
>>>           return;
>>>   +    trace_dma_fence_signaled(fence);
>>> +
>>>       /*
>>>        * When neither a release nor a wait operation is specified set the ops
>>>        * pointer to NULL to allow the fence structure to become independent
>>> @@ -377,7 +379,6 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>>>         fence->timestamp = timestamp;
>>>       set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
>>> -    trace_dma_fence_signaled(fence);
>>
>> I think this part here should be a separate patch.
> 
> I had that in https://lore.kernel.org/dri-devel/20260330133623.17704-1-tvrtko.ursulin@igalia.com/ but the discussion fizzled out before an rb.
> 
>>
>>>         list_for_each_entry_safe(cur, tmp, &cb_list, node) {
>>>           INIT_LIST_HEAD(&cur->node);
>>> diff --git a/include/trace/events/dma_fence.h b/include/trace/events/dma_fence.h
>>> index 3abba45c0601..9e0cb9ce2388 100644
>>> --- a/include/trace/events/dma_fence.h
>>> +++ b/include/trace/events/dma_fence.h
>>> @@ -9,12 +9,37 @@
>>>     struct dma_fence;
>>>   +DECLARE_EVENT_CLASS(dma_fence,
>>> +
>>> +    TP_PROTO(struct dma_fence *fence),
>>> +
>>> +    TP_ARGS(fence),
>>> +
>>> +    TP_STRUCT__entry(
>>> +        __string(driver, dma_fence_driver_name(fence))
>>> +        __string(timeline, dma_fence_timeline_name(fence))
>>> +        __field(unsigned int, context)
>>> +        __field(unsigned int, seqno)
>>> +    ),
>>> +
>>> +    TP_fast_assign(
>>> +        __assign_str(driver);
>>> +        __assign_str(timeline);
>>> +        __entry->context = fence->context;
>>> +        __entry->seqno = fence->seqno;
>>> +    ),
>>> +
>>> +    TP_printk("driver=%s timeline=%s context=%u seqno=%u",
>>> +          __get_str(driver), __get_str(timeline), __entry->context,
>>> +          __entry->seqno)
>>> +);
>>> +
>>
>> Mhm, I'm strongly in favor to just use this approach for all trace points.
>>
>> The minimal extra overhead shouldn't really matter at all.
> 
> Yeah, I am a bit on the fence. It would required a bit of an ugly rcu_read_lock around trace_dma_fence_signal_init

I think as long as we only grab the RCU read side lock when the tracepoint is actually enabled then that shouldn't matter.

I do remember patches flying by which optimized this use case for the whole trace subsystem but didn't took a closer look how to do that now.

> and trace_dma_fence_signaled would lose the driver/timeline info _unless_ name helpers would also be changed to look at fence->ops instead of "is signaled". Those have no memory barriers so not sure I want to think about racyness and how to solve it.

Mhm, that is a bit more problematic.

ops is only set to NULL when neither free nor wait is specified, so checking is signaled is still the right thing to do for drivers which uses those callbacks but still want to have the RCU protection of the returned strings.

Ok, feel free to go ahead with this approach for now but please add a /* TODO: clean that up when most drivers switched to independent fences */.

Thanks,
Christian.

> 
> Regards,
> 
> Tvrtko
> 
>>
>> Regards,
>> Christian.
>>
>>>   /*
>>>    * Safe only for call sites which are guaranteed to not race with fence
>>>    * signaling,holding the fence->lock and having checked for not signaled, or the
>>>    * signaling path itself.
>>>    */
>>> -DECLARE_EVENT_CLASS(dma_fence,
>>> +DECLARE_EVENT_CLASS(dma_fence_ops,
>>>         TP_PROTO(struct dma_fence *fence),
>>>   @@ -46,7 +71,7 @@ DEFINE_EVENT(dma_fence, dma_fence_emit,
>>>       TP_ARGS(fence)
>>>   );
>>>   -DEFINE_EVENT(dma_fence, dma_fence_init,
>>> +DEFINE_EVENT(dma_fence_ops, dma_fence_init,
>>>         TP_PROTO(struct dma_fence *fence),
>>>   @@ -60,14 +85,14 @@ DEFINE_EVENT(dma_fence, dma_fence_destroy,
>>>       TP_ARGS(fence)
>>>   );
>>>   -DEFINE_EVENT(dma_fence, dma_fence_enable_signal,
>>> +DEFINE_EVENT(dma_fence_ops, dma_fence_enable_signal,
>>>         TP_PROTO(struct dma_fence *fence),
>>>         TP_ARGS(fence)
>>>   );
>>>   -DEFINE_EVENT(dma_fence, dma_fence_signaled,
>>> +DEFINE_EVENT(dma_fence_ops, dma_fence_signaled,
>>>         TP_PROTO(struct dma_fence *fence),
>>>   
>>
> 


  reply	other threads:[~2026-04-15  8:13 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-14 15:49 [PATCH 1/2] dma-fence: Silence sparse warning in dma_fence_describe Tvrtko Ursulin
2026-04-14 15:49 ` [PATCH 2/2] dma-fence: Fix potential tracepoint null pointer dereferences Tvrtko Ursulin
2026-04-14 18:30   ` Christian König
2026-04-15  7:58     ` Tvrtko Ursulin
2026-04-15  8:13       ` Christian König [this message]
2026-04-15  8:33         ` Tvrtko Ursulin
2026-04-15  9:04           ` Christian König
2026-04-14 18:28 ` [PATCH 1/2] dma-fence: Silence sparse warning in dma_fence_describe Christian König
  -- strict thread matches above, loose matches on Subject: below --
2026-04-15  8:32 Tvrtko Ursulin
2026-04-15  8:32 ` [PATCH 2/2] dma-fence: Fix potential tracepoint null pointer dereferences Tvrtko Ursulin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3b75077-878b-4d4e-b18e-4976765c63db@amd.com \
    --to=christian.koenig@amd.com \
    --cc=boris.brezillon@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kernel-dev@igalia.com \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-media@vger.kernel.org \
    --cc=phasta@kernel.org \
    --cc=tvrtko.ursulin@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox