From: Steven Rostedt <rostedt@kernel.org>
To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Namhyung Kim <namhyung@kernel.org>,
Takaya Saeki <takayas@google.com>,
Tom Zanussi <zanussi@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ian Rogers <irogers@google.com>,
Douglas Raillard <douglas.raillard@arm.com>,
"Paul E. McKenney" <paulmck@kernel.org>
Subject: [PATCH v2 1/8] tracing: Replace syscall RCU pointer assignment with READ/WRITE_ONCE()
Date: Tue, 23 Sep 2025 09:04:58 -0400 [thread overview]
Message-ID: <20250923130713.594320290@kernel.org> (raw)
In-Reply-To: 20250923130457.901085554@kernel.org
From: Steven Rostedt <rostedt@goodmis.org>
The syscall events are pseudo events that hook to the raw syscalls. The
ftrace_syscall_enter/exit() callback is called by the raw_syscall
enter/exit tracepoints respectively whenever any of the syscall events are
enabled.
The trace_array has an array of syscall "files" that correspond to the
system calls based on their __NR_SYSCALL number. The array is read and if
there's a pointer to a trace_event_file then it is considered enabled and
if it is NULL that syscall event is considered disabled.
Currently it uses an rcu_dereference_sched() to get this pointer and a
rcu_assign_ptr() or RCU_INIT_POINTER() to write to it. This is unnecessary
as the file pointer will not go away outside the synchronization of the
tracepoint logic itself. And this code adds no extra RCU synchronization
that uses this.
Replace these functions with a simple READ_ONCE() and WRITE_ONCE() which
is all they need. This will also allow this code to not depend on
preemption being disabled as system call tracepoints are now allowed to
fault.
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1: https://lore.kernel.org/20250805193234.745705874@kernel.org
- Removed __rcu annotation to the fields that do not need RCU to protect
them.
kernel/trace/trace.h | 4 ++--
kernel/trace/trace_syscalls.c | 14 ++++++--------
2 files changed, 8 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 5f4bed5842f9..85eabb454bee 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -380,8 +380,8 @@ struct trace_array {
#ifdef CONFIG_FTRACE_SYSCALLS
int sys_refcount_enter;
int sys_refcount_exit;
- struct trace_event_file __rcu *enter_syscall_files[NR_syscalls];
- struct trace_event_file __rcu *exit_syscall_files[NR_syscalls];
+ struct trace_event_file *enter_syscall_files[NR_syscalls];
+ struct trace_event_file *exit_syscall_files[NR_syscalls];
#endif
int stop_count;
int clock_id;
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 46aab0ab9350..3a0b65f89130 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -310,8 +310,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
return;
- /* Here we're inside tp handler's rcu_read_lock_sched (__DO_TRACE) */
- trace_file = rcu_dereference_sched(tr->enter_syscall_files[syscall_nr]);
+ trace_file = READ_ONCE(tr->enter_syscall_files[syscall_nr]);
if (!trace_file)
return;
@@ -356,8 +355,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
return;
- /* Here we're inside tp handler's rcu_read_lock_sched (__DO_TRACE()) */
- trace_file = rcu_dereference_sched(tr->exit_syscall_files[syscall_nr]);
+ trace_file = READ_ONCE(tr->exit_syscall_files[syscall_nr]);
if (!trace_file)
return;
@@ -393,7 +391,7 @@ static int reg_event_syscall_enter(struct trace_event_file *file,
if (!tr->sys_refcount_enter)
ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
if (!ret) {
- rcu_assign_pointer(tr->enter_syscall_files[num], file);
+ WRITE_ONCE(tr->enter_syscall_files[num], file);
tr->sys_refcount_enter++;
}
mutex_unlock(&syscall_trace_lock);
@@ -411,7 +409,7 @@ static void unreg_event_syscall_enter(struct trace_event_file *file,
return;
mutex_lock(&syscall_trace_lock);
tr->sys_refcount_enter--;
- RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL);
+ WRITE_ONCE(tr->enter_syscall_files[num], NULL);
if (!tr->sys_refcount_enter)
unregister_trace_sys_enter(ftrace_syscall_enter, tr);
mutex_unlock(&syscall_trace_lock);
@@ -431,7 +429,7 @@ static int reg_event_syscall_exit(struct trace_event_file *file,
if (!tr->sys_refcount_exit)
ret = register_trace_sys_exit(ftrace_syscall_exit, tr);
if (!ret) {
- rcu_assign_pointer(tr->exit_syscall_files[num], file);
+ WRITE_ONCE(tr->exit_syscall_files[num], file);
tr->sys_refcount_exit++;
}
mutex_unlock(&syscall_trace_lock);
@@ -449,7 +447,7 @@ static void unreg_event_syscall_exit(struct trace_event_file *file,
return;
mutex_lock(&syscall_trace_lock);
tr->sys_refcount_exit--;
- RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL);
+ WRITE_ONCE(tr->exit_syscall_files[num], NULL);
if (!tr->sys_refcount_exit)
unregister_trace_sys_exit(ftrace_syscall_exit, tr);
mutex_unlock(&syscall_trace_lock);
--
2.50.1
next prev parent reply other threads:[~2025-09-23 13:05 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-23 13:04 [PATCH v2 0/8] tracing: Show contents of syscall trace event user space fields Steven Rostedt
2025-09-23 13:04 ` Steven Rostedt [this message]
2025-09-23 13:04 ` [PATCH v2 2/8] tracing: Have syscall trace events show "0x" for values greater than 10 Steven Rostedt
2025-09-23 13:05 ` [PATCH v2 3/8] tracing: Have syscall trace events read user space string Steven Rostedt
2025-09-25 7:26 ` Peter Zijlstra
2025-09-25 11:15 ` Steven Rostedt
2025-09-27 14:43 ` Steven Rostedt
2025-09-23 13:05 ` [PATCH v2 4/8] tracing: Have system call events record user array data Steven Rostedt
2025-09-23 13:05 ` [PATCH v2 5/8] tracing: Display some syscall arrays as strings Steven Rostedt
2025-09-23 13:05 ` [PATCH v2 6/8] tracing: Allow syscall trace events to read more than one user parameter Steven Rostedt
2025-09-23 13:05 ` [PATCH v2 7/8] tracing: Add syscall_user_buf_size to limit amount written Steven Rostedt
2025-09-24 9:49 ` kernel test robot
2025-09-23 13:05 ` [PATCH v2 8/8] tracing: Show printable characters in syscall arrays Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250923130713.594320290@kernel.org \
--to=rostedt@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=douglas.raillard@arm.com \
--cc=irogers@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=namhyung@kernel.org \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=takayas@google.com \
--cc=tglx@linutronix.de \
--cc=zanussi@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.