From: Petr Pavlu <petr.pavlu@suse.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
zhengyejian1@huawei.com, linux-trace-kernel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] tracing: Simplify and fix "buffered event" synchronization
Date: Fri, 1 Dec 2023 15:17:35 +0100 [thread overview]
Message-ID: <d404113e-4ffe-4e9c-ab45-1b076c1f498c@suse.com> (raw)
In-Reply-To: <20231129095826.1aec6381@gandalf.local.home>
On 11/29/23 15:58, Steven Rostedt wrote:
> On Wed, 29 Nov 2023 10:22:23 +0100
> Petr Pavlu <petr.pavlu@suse.com> wrote:
>
>> Yes, I believe this should address this potential race condition.
>>
>> An alternative would be instead to update
>> trace_event_buffer_lock_reserve() as follows:
>>
>> if (this_cpu_inc_return(trace_buffered_event_cnt) == 1) {
>> smp_rmb();
>
> This is the problem I have with your approach. That smp_rmb() is in the
> highly critical path. On some architectures, this has a significant impact
> on the overhead of this code. This path is called during code execution and
> increases the overhead of the tracing infrastructure.
>
> If I'm given two solutions where one adds a smp_rmb() to the critical path
> and the other just slows down the non-critical path more, I'll take the
> slow down of non-critical path every time.
>
>> if ((entry = __this_cpu_read(trace_buffered_event))) {
>> [...]
>>
>> That saves the synchronize_rcu() call but additionally modifies
>> trace_buffered_event_cnt even if trace_buffered_event is currently NULL.
>>
>> Another alternative is the approach taken by my patch which avoids more
>> RCU work and unnecessary memory modifications.
>>
>> I'd be interested if you could have a look again at what I'm proposing
>> in my patch. I think it simplifies the code while addressing these
>> problems as well. However, if you have reservations about that approach
>> then it is ok, I can fix the found problems individually as discussed.
>
> Fix this without adding any memory barriers to the critical path, then I'll
> take another look.
>
> FYI, this code was designed in the first place to avoid adding memory
> barriers in the critical path.
Thank you for the explanation. I ran the tests you mentioned in the
description of commit 0fc1b09ff1ff ("tracing: Use temp buffer when
filtering events") to understand this aspect a bit more. I confirm that
my proposed patch makes the tracing slower, usually by single digit
percentages. I understand this is not welcome. I also realize that the
slowdown might be even worse in different situations and on other
architectures that I checked (arm64, x86_64).
Ok, keeping the current approach, my plan for v2 is to prepare the
following patches:
* Fix for the missing increment+decrement of trace_buffered_event_cnt
on the current CPU in trace_buffered_event_disable().
Replace smp_call_function_many() with on_each_cpu_mask() in
trace_buffered_event_disable(). The on_each_cpu_mask() function has
also an advantage that it itself disables preemption so doing that can
be then removed from trace_buffered_event_disable().
* Fix the potential race between trace_buffered_event_enable() and
trace_event_buffer_lock_reserve() where the latter might already see
a valid trace_buffered_event pointer but not all initialization yet.
I think this might be actually best to address by using the same
maintenance exclusion as is implemented in
trace_buffered_event_disable(). It would make both maintenance
operations consistent but for the cost of making the enable operation
somewhat slower.
* Fix the WARN_ON_ONCE(!trace_buffered_event_ref) issued in
trace_buffered_event_disable() when trace_buffered_event_enable()
previously fails.
Add a variable/flag tracking whether trace_buffered_event is currently
allocated and use that for driving if a new allocation needs to be
done when trace_buffered_event_enable() is called, or the buffers
should be really freed when trace_buffered_event_disable() is invoked.
Not sure if the mentioned alternative of leaving trace_buffered_event
partially initialized on failure is preferred instead.
* Fix the potential race between trace_buffered_event_disable() and
trace_event_buffer_lock_reserve() where the latter might still grab
a pointer from trace_buffered_event that is being freed.
Replace smp_wmb() with synchronize_rcu() in
trace_buffered_event_disable().
Thanks,
Petr
next prev parent reply other threads:[~2023-12-01 14:17 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-27 15:12 [PATCH 0/2] tracing: Simplify and fix "buffered event" synchronization Petr Pavlu
2023-11-27 15:12 ` [PATCH 1/2] " Petr Pavlu
2023-11-27 17:41 ` Steven Rostedt
2023-11-28 15:05 ` Petr Pavlu
2023-11-28 15:27 ` Steven Rostedt
2023-11-29 9:22 ` Petr Pavlu
2023-11-29 14:58 ` Steven Rostedt
2023-12-01 14:17 ` Petr Pavlu [this message]
2023-12-01 14:46 ` Steven Rostedt
2023-12-05 16:16 ` Petr Pavlu
2023-11-27 15:12 ` [PATCH 2/2] tracing: Disable events in reverse order of their enable operation Petr Pavlu
2023-11-27 17:58 ` Steven Rostedt
2023-11-28 15:06 ` Petr Pavlu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d404113e-4ffe-4e9c-ab45-1b076c1f498c@suse.com \
--to=petr.pavlu@suse.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=rostedt@goodmis.org \
--cc=zhengyejian1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox