* [for-linus][PATCH 1/4] tracing: samples: Initialize trace_array_printk() with the correct function
[not found] <20250514133831.110736154@goodmis.org>
@ 2025-05-14 13:38 ` Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 2/4] ftrace: Fix preemption acounting for stacktrace trigger command Steven Rostedt
` (2 subsequent siblings)
3 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2025-05-14 13:38 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
stable, Divya Indi
From: Steven Rostedt <rostedt@goodmis.org>
When using trace_array_printk() on a created instance, the correct
function to use to initialize it is:
trace_array_init_printk()
Not
trace_printk_init_buffer()
The former is a proper function to use, the latter is for initializing
trace_printk() and causes the NOTICE banner to be displayed.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Divya Indi <divya.indi@oracle.com>
Link: https://lore.kernel.org/20250509152657.0f6744d9@gandalf.local.home
Fixes: 89ed42495ef4a ("tracing: Sample module to demonstrate kernel access to Ftrace instances.")
Fixes: 38ce2a9e33db6 ("tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
samples/ftrace/sample-trace-array.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/samples/ftrace/sample-trace-array.c b/samples/ftrace/sample-trace-array.c
index dac67c367457..4147616102f9 100644
--- a/samples/ftrace/sample-trace-array.c
+++ b/samples/ftrace/sample-trace-array.c
@@ -112,7 +112,7 @@ static int __init sample_trace_array_init(void)
/*
* If context specific per-cpu buffers havent already been allocated.
*/
- trace_printk_init_buffers();
+ trace_array_init_printk(tr);
simple_tsk = kthread_run(simple_thread, NULL, "sample-instance");
if (IS_ERR(simple_tsk)) {
--
2.47.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [for-linus][PATCH 2/4] ftrace: Fix preemption acounting for stacktrace trigger command
[not found] <20250514133831.110736154@goodmis.org>
2025-05-14 13:38 ` [for-linus][PATCH 1/4] tracing: samples: Initialize trace_array_printk() with the correct function Steven Rostedt
@ 2025-05-14 13:38 ` Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 3/4] ftrace: Fix preemption accounting for stacktrace filter command Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 4/4] ring-buffer: Fix persistent buffer when commit page is the reader page Steven Rostedt
3 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2025-05-14 13:38 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
stable, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
When using the stacktrace trigger command to trace syscalls, the
preemption count was consistently reported as 1 when the system call
event itself had 0 (".").
For example:
root@ubuntu22-vm:/sys/kernel/tracing/events/syscalls/sys_enter_read
$ echo stacktrace > trigger
$ echo 1 > enable
sshd-416 [002] ..... 232.864910: sys_read(fd: a, buf: 556b1f3221d0, count: 8000)
sshd-416 [002] ...1. 232.864913: <stack trace>
=> ftrace_syscall_enter
=> syscall_trace_enter
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
The root cause is that the trace framework disables preemption in __DO_TRACE before
invoking the trigger callback.
Use the tracing_gen_ctx_dec() that will accommodate for the increase of
the preemption count in __DO_TRACE when calling the callback. The result
is the accurate reporting of:
sshd-410 [004] ..... 210.117660: sys_read(fd: 4, buf: 559b725ba130, count: 40000)
sshd-410 [004] ..... 210.117662: <stack trace>
=> ftrace_syscall_enter
=> syscall_trace_enter
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
Cc: stable@vger.kernel.org
Fixes: ce33c845b030c ("tracing: Dump stacktrace trigger to the corresponding instance")
Link: https://lore.kernel.org/20250512094246.1167956-1-dolinux.peng@gmail.com
Signed-off-by: pengdonglin <dolinux.peng@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_trigger.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index b66b6d235d91..6e87ae2a1a66 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -1560,7 +1560,7 @@ stacktrace_trigger(struct event_trigger_data *data,
struct trace_event_file *file = data->private_data;
if (file)
- __trace_stack(file->tr, tracing_gen_ctx(), STACK_SKIP);
+ __trace_stack(file->tr, tracing_gen_ctx_dec(), STACK_SKIP);
else
trace_dump_stack(STACK_SKIP);
}
--
2.47.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [for-linus][PATCH 3/4] ftrace: Fix preemption accounting for stacktrace filter command
[not found] <20250514133831.110736154@goodmis.org>
2025-05-14 13:38 ` [for-linus][PATCH 1/4] tracing: samples: Initialize trace_array_printk() with the correct function Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 2/4] ftrace: Fix preemption acounting for stacktrace trigger command Steven Rostedt
@ 2025-05-14 13:38 ` Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 4/4] ring-buffer: Fix persistent buffer when commit page is the reader page Steven Rostedt
3 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2025-05-14 13:38 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
stable, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
The preemption count of the stacktrace filter command to trace ksys_read
is consistently incorrect:
$ echo ksys_read:stacktrace > set_ftrace_filter
<...>-453 [004] ...1. 38.308956: <stack trace>
=> ksys_read
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
The root cause is that the trace framework disables preemption when
invoking the filter command callback in function_trace_probe_call:
preempt_disable_notrace();
probe_ops->func(ip, parent_ip, probe_opsbe->tr, probe_ops, probe->data);
preempt_enable_notrace();
Use tracing_gen_ctx_dec() to account for the preempt_disable_notrace(),
which will output the correct preemption count:
$ echo ksys_read:stacktrace > set_ftrace_filter
<...>-410 [006] ..... 31.420396: <stack trace>
=> ksys_read
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
Cc: stable@vger.kernel.org
Fixes: 36590c50b2d07 ("tracing: Merge irqflags + preempt counter.")
Link: https://lore.kernel.org/20250512094246.1167956-2-dolinux.peng@gmail.com
Signed-off-by: pengdonglin <dolinux.peng@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_functions.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index 98ccf3f00c51..4e37a0f6aaa3 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -633,11 +633,7 @@ ftrace_traceoff(unsigned long ip, unsigned long parent_ip,
static __always_inline void trace_stack(struct trace_array *tr)
{
- unsigned int trace_ctx;
-
- trace_ctx = tracing_gen_ctx();
-
- __trace_stack(tr, trace_ctx, FTRACE_STACK_SKIP);
+ __trace_stack(tr, tracing_gen_ctx_dec(), FTRACE_STACK_SKIP);
}
static void
--
2.47.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [for-linus][PATCH 4/4] ring-buffer: Fix persistent buffer when commit page is the reader page
[not found] <20250514133831.110736154@goodmis.org>
` (2 preceding siblings ...)
2025-05-14 13:38 ` [for-linus][PATCH 3/4] ftrace: Fix preemption accounting for stacktrace filter command Steven Rostedt
@ 2025-05-14 13:38 ` Steven Rostedt
3 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2025-05-14 13:38 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
stable, Tasos Sahanidis
From: Steven Rostedt <rostedt@goodmis.org>
The ring buffer is made up of sub buffers (sometimes called pages as they
are by default PAGE_SIZE). It has the following "pages":
"tail page" - this is the page that the next write will write to
"head page" - this is the page that the reader will swap the reader page with.
"reader page" - This belongs to the reader, where it will swap the head
page from the ring buffer so that the reader does not
race with the writer.
The writer may end up on the "reader page" if the ring buffer hasn't
written more than one page, where the "tail page" and the "head page" are
the same.
The persistent ring buffer has meta data that points to where these pages
exist so on reboot it can re-create the pointers to the cpu_buffer
descriptor. But when the commit page is on the reader page, the logic is
incorrect.
The check to see if the commit page is on the reader page checked if the
head page was the reader page, which would never happen, as the head page
is always in the ring buffer. The correct check would be to test if the
commit page is on the reader page. If that's the case, then it can exit
out early as the commit page is only on the reader page when there's only
one page of data in the buffer. There's no reason to iterate the ring
buffer pages to find the "commit page" as it is already found.
To trigger this bug:
# echo 1 > /sys/kernel/tracing/instances/boot_mapped/events/syscalls/sys_enter_fchownat/enable
# touch /tmp/x
# chown sshd /tmp/x
# reboot
On boot up, the dmesg will have:
Ring buffer meta [0] is from previous boot!
Ring buffer meta [1] is from previous boot!
Ring buffer meta [2] is from previous boot!
Ring buffer meta [3] is from previous boot!
Ring buffer meta [4] commit page not found
Ring buffer meta [5] is from previous boot!
Ring buffer meta [6] is from previous boot!
Ring buffer meta [7] is from previous boot!
Where the buffer on CPU 4 had a "commit page not found" error and that
buffer is cleared and reset causing the output to be empty and the data lost.
When it works correctly, it has:
# cat /sys/kernel/tracing/instances/boot_mapped/trace_pipe
<...>-1137 [004] ..... 998.205323: sys_enter_fchownat: __syscall_nr=0x104 (260) dfd=0xffffff9c (4294967196) filename=(0xffffc90000a0002c) user=0x3e8 (1000) group=0xffffffff (4294967295) flag=0x0 (0
Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250513115032.3e0b97f7@gandalf.local.home
Fixes: 5f3b6e839f3ce ("ring-buffer: Validate boot range memory events")
Reported-by: Tasos Sahanidis <tasos@tasossah.com>
Tested-by: Tasos Sahanidis <tasos@tasossah.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/ring_buffer.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index c0f877d39a24..3f9bf562beea 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1887,10 +1887,12 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
head_page = cpu_buffer->head_page;
- /* If both the head and commit are on the reader_page then we are done. */
- if (head_page == cpu_buffer->reader_page &&
- head_page == cpu_buffer->commit_page)
+ /* If the commit_buffer is the reader page, update the commit page */
+ if (meta->commit_buffer == (unsigned long)cpu_buffer->reader_page->page) {
+ cpu_buffer->commit_page = cpu_buffer->reader_page;
+ /* Nothing more to do, the only page is the reader page */
goto done;
+ }
/* Iterate until finding the commit page */
for (i = 0; i < meta->nr_subbufs + 1; i++, rb_inc_page(&head_page)) {
--
2.47.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-05-14 13:39 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20250514133831.110736154@goodmis.org>
2025-05-14 13:38 ` [for-linus][PATCH 1/4] tracing: samples: Initialize trace_array_printk() with the correct function Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 2/4] ftrace: Fix preemption acounting for stacktrace trigger command Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 3/4] ftrace: Fix preemption accounting for stacktrace filter command Steven Rostedt
2025-05-14 13:38 ` [for-linus][PATCH 4/4] ring-buffer: Fix persistent buffer when commit page is the reader page Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox