* [PATCH v2 00/11] Add spi-hid transport driver
From: Jingyuan Liang @ 2026-03-24 6:39 UTC (permalink / raw)
To: Jiri Kosina, Benjamin Tissoires, Jonathan Corbet, Mark Brown,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Dmitry Torokhov, Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: linux-input, linux-doc, linux-kernel, linux-spi,
linux-trace-kernel, devicetree, hbarnor, tfiga, Jingyuan Liang,
Jarrett Schultz, Dmitry Antipov, Angela Czubak
This series picks up the spi-hid driver work originally started by
Microsoft. The patch breakdown has been modified and the implementation
has been refactored to address upstream feedback and testing issues. We
are submitting this as a new series while keeping the original sign-off
chain to reflect the history.
Same as the original series, there is a change to HID documentation, some
HID core changes to support a SPI device, the SPI HID transport driver,
and HID over SPI Device Tree binding. We have added the HID over SPI ACPI
support, power management, panel follower, and quirks for Ilitek touch
controllers.
Original authors: Jarrett Schultz <jaschultz@microsoft.com>,
Dmitry Antipov <dmanti@microsoft.com>
Link: https://lore.kernel.org/r/86b63b7b-afda-d7f4-7bfa-175085d5a8ef@gmail.com
Signed-off-by: Jingyuan Liang <jingyliang@chromium.org>
---
Changes in v2:
- Fix style problems and remove unnecessary fields from the DT binding file
- Drop patch 12 as it is vendor specific
- Add a lock to fix input/output concurrency race
- Link to v1: https://lore.kernel.org/r/20260303-send-upstream-v1-0-1515ba218f3d@chromium.org
---
Angela Czubak (2):
HID: spi-hid: add transport driver skeleton for HID over SPI bus
HID: spi_hid: add ACPI support for SPI over HID
Jarrett Schultz (3):
Documentation: Correction in HID output_report callback description.
HID: Add BUS_SPI support and define HID_SPI_DEVICE macro
HID: spi_hid: add device tree support for SPI over HID
Jingyuan Liang (6):
HID: spi-hid: add spi-hid driver HID layer
HID: spi-hid: add HID SPI protocol implementation
HID: spi_hid: add spi_hid traces
dt-bindings: input: Document hid-over-spi DT schema
HID: spi-hid: add power management implementation
HID: spi-hid: add panel follower support
.../devicetree/bindings/input/hid-over-spi.yaml | 126 ++
Documentation/hid/hid-transport.rst | 4 +-
drivers/hid/Kconfig | 2 +
drivers/hid/Makefile | 2 +
drivers/hid/hid-core.c | 3 +
drivers/hid/spi-hid/Kconfig | 45 +
drivers/hid/spi-hid/Makefile | 11 +
drivers/hid/spi-hid/spi-hid-acpi.c | 254 ++++
drivers/hid/spi-hid/spi-hid-core.c | 1417 ++++++++++++++++++++
drivers/hid/spi-hid/spi-hid-core.h | 93 ++
drivers/hid/spi-hid/spi-hid-of.c | 244 ++++
drivers/hid/spi-hid/spi-hid.h | 46 +
include/linux/hid.h | 2 +
include/trace/events/spi_hid.h | 156 +++
14 files changed, 2403 insertions(+), 2 deletions(-)
---
base-commit: 05f7e89ab9731565d8a62e3b5d1ec206485eeb0b
change-id: 20260212-send-upstream-75f6fd9ed92e
Best regards,
--
Jingyuan Liang <jingyliang@chromium.org>
^ permalink raw reply
* Re: [PATCH v6 4/4] selftests/ftrace: Add accept cases for fprobe list syntax
From: Masami Hiramatsu @ 2026-03-24 4:12 UTC (permalink / raw)
To: Seokwoo Chung (Ryan)
Cc: rostedt, corbet, shuah, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <20260205135842.20517-5-seokwoo.chung130@gmail.com>
On Thu, 5 Feb 2026 08:58:42 -0500
"Seokwoo Chung (Ryan)" <seokwoo.chung130@gmail.com> wrote:
> Add fprobe_list.tc to test the comma-separated symbol list syntax
> with :entry/:exit suffixes. Three scenarios are covered:
>
> 1. List with default (entry) behavior and ! exclusion
> 2. List with explicit :entry suffix
> 3. List with :exit suffix for return probes
Could you also add wildcard pattern test?
>
> Each test verifies that the correct functions appear in
> enabled_functions and that excluded (!) symbols are absent.
>
> Note: The existing tests add_remove_fprobe.tc, fprobe_syntax_errors.tc,
> and add_remove_fprobe_repeat.tc check their "requires" line against the
> tracefs README for the old "%return" syntax pattern. Since the README
> now documents ":entry|:exit" instead, these tests report UNSUPPORTED.
> Their "requires" lines need updating in a follow-up patch.
This means you'll break the selftest. please fix those test first.
(This fix must be done before "tracing/fprobe: Support comma-separated
symbols and :entry/:exit" so that we can safely bisect it.)
Thank you,
>
> Signed-off-by: Seokwoo Chung (Ryan) <seokwoo.chung130@gmail.com>
> ---
> .../ftrace/test.d/dynevent/fprobe_list.tc | 92 +++++++++++++++++++
> 1 file changed, 92 insertions(+)
> create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
>
> diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> new file mode 100644
> index 000000000000..45e57c6f487d
> --- /dev/null
> +++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> @@ -0,0 +1,92 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +# description: Fprobe event list syntax and :entry/:exit suffixes
> +# requires: dynamic_events "f[:[<group>/][<event>]] <func-name>[:entry|:exit] [<args>]":README
> +
> +# Setup symbols to test. These are common kernel functions.
> +PLACE=vfs_read
> +PLACE2=vfs_write
> +PLACE3=vfs_open
> +
> +echo 0 > events/enable
> +echo > dynamic_events
> +
> +# Get baseline count of enabled functions (should be 0 if clean, but be safe)
> +if [ -f enabled_functions ]; then
> + ocnt=`cat enabled_functions | wc -l`
> +else
> + ocnt=0
> +fi
> +
> +# Test 1: List default (entry) with exclusion
> +# Target: Trace vfs_read and vfs_open, but EXCLUDE vfs_write
> +echo "f:test/list_entry $PLACE,!$PLACE2,$PLACE3" >> dynamic_events
> +grep -q "test/list_entry" dynamic_events
> +test -d events/test/list_entry
> +
> +echo 1 > events/test/list_entry/enable
> +
> +grep -q "$PLACE" enabled_functions
> +grep -q "$PLACE3" enabled_functions
> +! grep -q "$PLACE2" enabled_functions
> +
> +# Check count (Baseline + 2 new functions)
> +cnt=`cat enabled_functions | wc -l`
> +if [ $cnt -ne $((ocnt + 2)) ]; then
> + exit_fail
> +fi
> +
> +# Cleanup Test 1
> +echo 0 > events/test/list_entry/enable
> +echo "-:test/list_entry" >> dynamic_events
> +! grep -q "test/list_entry" dynamic_events
> +
> +# Count should return to baseline
> +cnt=`cat enabled_functions | wc -l`
> +if [ $cnt -ne $ocnt ]; then
> + exit_fail
> +fi
> +
> +# Test 2: List with explicit :entry suffix
> +# (Should behave exactly like Test 1)
> +echo "f:test/list_entry_exp $PLACE,!$PLACE2,$PLACE3:entry" >> dynamic_events
> +grep -q "test/list_entry_exp" dynamic_events
> +test -d events/test/list_entry_exp
> +
> +echo 1 > events/test/list_entry_exp/enable
> +
> +grep -q "$PLACE" enabled_functions
> +grep -q "$PLACE3" enabled_functions
> +! grep -q "$PLACE2" enabled_functions
> +
> +cnt=`cat enabled_functions | wc -l`
> +if [ $cnt -ne $((ocnt + 2)) ]; then
> + exit_fail
> +fi
> +
> +# Cleanup Test 2
> +echo 0 > events/test/list_entry_exp/enable
> +echo "-:test/list_entry_exp" >> dynamic_events
> +
> +# Test 3: List with :exit suffix
> +echo "f:test/list_exit $PLACE,!$PLACE2,$PLACE3:exit" >> dynamic_events
> +grep -q "test/list_exit" dynamic_events
> +test -d events/test/list_exit
> +
> +echo 1 > events/test/list_exit/enable
> +
> +# Even for return probes, enabled_functions lists the attached symbols
> +grep -q "$PLACE" enabled_functions
> +grep -q "$PLACE3" enabled_functions
> +! grep -q "$PLACE2" enabled_functions
> +
> +cnt=`cat enabled_functions | wc -l`
> +if [ $cnt -ne $((ocnt + 2)) ]; then
> + exit_fail
> +fi
> +
> +# Cleanup Test 3
> +echo 0 > events/test/list_exit/enable
> +echo "-:test/list_exit" >> dynamic_events
> +
> +clear_trace
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v6 3/4] docs: tracing/fprobe: Document list filters and :entry/:exit
From: Masami Hiramatsu @ 2026-03-24 4:07 UTC (permalink / raw)
To: Seokwoo Chung (Ryan)
Cc: rostedt, corbet, shuah, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <20260205135842.20517-4-seokwoo.chung130@gmail.com>
On Thu, 5 Feb 2026 08:58:41 -0500
"Seokwoo Chung (Ryan)" <seokwoo.chung130@gmail.com> wrote:
> Update fprobe event documentation to describe comma-separated symbol lists,
> exclusions, and explicit suffixes.
>
> Signed-off-by: Seokwoo Chung (Ryan) <seokwoo.chung130@gmail.com>
> ---
> Documentation/trace/fprobetrace.rst | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/trace/fprobetrace.rst b/Documentation/trace/fprobetrace.rst
> index b4c2ca3d02c1..bbcfd57f0005 100644
> --- a/Documentation/trace/fprobetrace.rst
> +++ b/Documentation/trace/fprobetrace.rst
> @@ -25,14 +25,18 @@ Synopsis of fprobe-events
> -------------------------
> ::
>
> - f[:[GRP1/][EVENT1]] SYM [FETCHARGS] : Probe on function entry
> - f[MAXACTIVE][:[GRP1/][EVENT1]] SYM%return [FETCHARGS] : Probe on function exit
> + f[:[GRP1/][EVENT1]] SYM[%return] [FETCHARGS] : Single function
We also accept wildcard pattern instead of SYM.
f[:[GRP1/][EVENT1]] (SYM|PATTERN)[%return] [FETCHARGS] : Single target
> + f[:[GRP1/][EVENT1]] SYM[,[!]SYM[,...]][:entry|:exit] [FETCHARGS] :Multiple
Hmm if this is for multiple function, we can not omit event name, so
f:[GRP1/]EVNET1 SYM[,[!]SYM[,...]][:entry|:exit] [FETCHARGS] : Multiple function
> + function
Also, if you put this in the next line, please indent it.
> t[:[GRP2/][EVENT2]] TRACEPOINT [FETCHARGS] : Probe on tracepoint
>
> GRP1 : Group name for fprobe. If omitted, use "fprobes" for it.
> GRP2 : Group name for tprobe. If omitted, use "tracepoints" for it.
> EVENT1 : Event name for fprobe. If omitted, the event name is
> - "SYM__entry" or "SYM__exit".
> + - For a single literal symbol, the event name is
> + "SYM__entry" or "SYM__exit".
> + - For a *list or any wildcard*, an explicit [GRP1/][EVENT1] is
an explicit EVENT1 is required. (group name can be omitted)
> + required; otherwise the parser rejects it.
> EVENT2 : Event name for tprobe. If omitted, the event name is
> the same as "TRACEPOINT", but if the "TRACEPOINT" starts
> with a digit character, "_TRACEPOINT" is used.
> @@ -40,6 +44,13 @@ Synopsis of fprobe-events
> can be probed simultaneously, or 0 for the default value
> as defined in Documentation/trace/fprobe.rst
>
> + SYM : Function name or comma-separated list of symbols.
Doesn't SYM still be a function name? In above synopsis, SYM is an entry of
comma-separated list.
> + - SYM prefixed with "!" are exclusions.
> + - ":entry" suffix means it probes entry of given symbols
> + (default)
> + - ":exit" suffix means it probes exit of given symbols.
> + - "%return" suffix means it probes exit of SYM (single
> + symbol).
And we need PATTERN here.
PATTERN : Function name pattern with wildcards (You can use "*" or "?").
Thank you,
> FETCHARGS : Arguments. Each probe can have up to 128 args.
> ARG : Fetch "ARG" function argument using BTF (only for function
> entry or tracepoint.) (\*1)
> --
> 2.43.0
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v6 2/4] fprobe: Support comma-separated filters in register_fprobe()
From: Masami Hiramatsu @ 2026-03-24 1:59 UTC (permalink / raw)
To: Seokwoo Chung (Ryan)
Cc: rostedt, corbet, shuah, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <20260205135842.20517-3-seokwoo.chung130@gmail.com>
On Thu, 5 Feb 2026 08:58:40 -0500
"Seokwoo Chung (Ryan)" <seokwoo.chung130@gmail.com> wrote:
> register_fprobe() passes its filter and notfilter strings directly to
> glob_match(), which only understands shell-style globs (*, ?, [...]).
> Comma-separated symbol lists such as "vfs_read,vfs_open" never match
> any symbol because no kernel symbol contains a comma.
>
> Add glob_match_comma_list() that splits the filter on commas and
> checks each entry individually with glob_match(). The existing
> single-pattern fast path is preserved (no commas means the loop
> executes exactly once).
>
> This is required by the comma-separated fprobe list syntax introduced
> in the preceding patch; without it, enabling a list-mode fprobe event
> fails with "Could not enable event".
OK, in this case, you should reorder patch this as the first one.
Please make this [1/4] and remove this requirement explanation paragraph.
The patch itself looks good to me.
Thank you,
>
> Signed-off-by: Seokwoo Chung (Ryan) <seokwoo.chung130@gmail.com>
> ---
> kernel/trace/fprobe.c | 30 ++++++++++++++++++++++++++++--
> 1 file changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 1188eefef07c..2acd24b80d04 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -672,12 +672,38 @@ struct filter_match_data {
> struct module **mods;
> };
>
> +/*
> + * Check if @name matches any comma-separated glob pattern in @list.
> + * If @list contains no commas, this is equivalent to glob_match().
> + */
> +static bool glob_match_comma_list(const char *list, const char *name)
> +{
> + const char *cur = list;
> +
> + while (*cur) {
> + const char *sep = strchr(cur, ',');
> + int len = sep ? sep - cur : strlen(cur);
> + char pat[KSYM_NAME_LEN];
> +
> + if (len > 0 && len < KSYM_NAME_LEN) {
> + memcpy(pat, cur, len);
> + pat[len] = '\0';
> + if (glob_match(pat, name))
> + return true;
> + }
> + if (!sep)
> + break;
> + cur = sep + 1;
> + }
> + return false;
> +}
> +
> static int filter_match_callback(void *data, const char *name, unsigned long addr)
> {
> struct filter_match_data *match = data;
>
> - if (!glob_match(match->filter, name) ||
> - (match->notfilter && glob_match(match->notfilter, name)))
> + if (!glob_match_comma_list(match->filter, name) ||
> + (match->notfilter && glob_match_comma_list(match->notfilter, name)))
> return 0;
>
> if (!ftrace_location(addr))
> --
> 2.43.0
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v6 0/4] tracing/fprobe: Support comma-separated symbol lists and :entry/:exit suffixes
From: Masami Hiramatsu @ 2026-03-24 1:51 UTC (permalink / raw)
To: Seokwoo Chung (Ryan)
Cc: rostedt, corbet, shuah, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <20260205135842.20517-1-seokwoo.chung130@gmail.com>
Hi,
Sorry, I completely missed this series. Let me review it.
Thank you,
On Thu, 5 Feb 2026 08:58:38 -0500
"Seokwoo Chung (Ryan)" <seokwoo.chung130@gmail.com> wrote:
> Extend the fprobe event interface to accept comma-separated symbol lists
> with ! exclusion prefix, and :entry/:exit suffixes as an alternative to
> %return. Single-symbol probes retain full backward compatibility with
> %return.
>
> Example usage:
> f:mygroup/myevent vfs_read,!vfs_write,vfs_open:entry
> f:mygroup/myexit vfs_read,vfs_open:exit
>
> Changes since v5:
> - Fix missing closing brace in the empty-token check that caused a
> build error.
> - Remove redundant strchr/strstr checks for tracepoint validation;
> the character validation loop already rejects ',', ':', and '%'.
> - Add trace_probe_log_err() to the tracepoint character validation
> loop so users see what went wrong in tracefs/error_log (reviewer
> feedback from Masami Hiramatsu).
> - Remove unnecessary braces around single-statement if per kernel
> coding style (reviewer feedback).
> - Extract list parsing into parse_fprobe_list() to keep
> parse_fprobe_spec() focused (reviewer feedback).
> - New patch 2/4: add glob_match_comma_list() in kernel/trace/fprobe.c
> so register_fprobe() correctly handles comma-separated filter
> strings. Without this, enabling a list-mode fprobe event failed
> with "Could not enable event" because glob_match() does not
> understand commas.
> - Reorder: documentation patch now comes after all code changes.
> - Updated selftest commit message to note that existing tests
> (add_remove_fprobe.tc, fprobe_syntax_errors.tc,
> add_remove_fprobe_repeat.tc) report UNSUPPORTED because their
> "requires" lines still check for the old %return syntax in README.
> Their requires lines need updating in a follow-up patch.
>
> Tested in QEMU/KVM but I am not too confident if I configured correctly and
> would like to ask for further testing.
>
> Seokwoo Chung (Ryan) (4):
> tracing/fprobe: Support comma-separated symbols and :entry/:exit
> fprobe: Support comma-separated filters in register_fprobe()
> docs: tracing/fprobe: Document list filters and :entry/:exit
> selftests/ftrace: Add accept cases for fprobe list syntax
>
> Documentation/trace/fprobetrace.rst | 17 +-
> kernel/trace/fprobe.c | 30 ++-
> kernel/trace/trace.c | 3 +-
> kernel/trace/trace_fprobe.c | 219 ++++++++++++++----
> .../ftrace/test.d/dynevent/fprobe_list.tc | 92 ++++++++
> 5 files changed, 308 insertions(+), 53 deletions(-)
> create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
>
> --
> 2.43.0
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v6 1/4] tracing/fprobe: Support comma-separated symbols and :entry/:exit
From: Masami Hiramatsu @ 2026-03-24 1:50 UTC (permalink / raw)
To: Seokwoo Chung (Ryan)
Cc: rostedt, corbet, shuah, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <20260205135842.20517-2-seokwoo.chung130@gmail.com>
On Thu, 5 Feb 2026 08:58:39 -0500
"Seokwoo Chung (Ryan)" <seokwoo.chung130@gmail.com> wrote:
> Extend the fprobe event interface to support:
> - Comma-separated symbol lists: "func1,func2,func3"
> - Exclusion prefix: "func1,!func2,func3"
> - Explicit :entry and :exit suffixes (replacing %return for lists)
>
> Single-symbol probes retain backward compatibility with %return.
>
> The list parsing is factored into a dedicated parse_fprobe_list()
> helper that splits comma-separated input into filter (included) and
> nofilter (excluded) strings. Tracepoint validation now reports the
> error position via trace_probe_log_err() so users can see what went
> wrong in tracefs/error_log.
>
> Changes since v5:
> - Fix missing closing brace in the empty-token check that caused a
> build error.
> - Remove redundant strchr/strstr checks for tracepoint validation
> (the character validation loop already rejects ',', ':', and '%').
> - Add trace_probe_log_err() to the tracepoint character validation
> loop per reviewer feedback.
> - Remove unnecessary braces around single-statement if per kernel
> coding style.
> - Extract list parsing into parse_fprobe_list() per reviewer feedback
> to keep parse_fprobe_spec() focused.
Thanks for updating! I have some comments below.
>
> Update tracefs/README to reflect the new syntax.
>
> Signed-off-by: Seokwoo Chung (Ryan) <seokwoo.chung130@gmail.com>
> ---
> kernel/trace/trace.c | 3 +-
> kernel/trace/trace_fprobe.c | 219 ++++++++++++++++++++++++++++--------
> 2 files changed, 174 insertions(+), 48 deletions(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 8bd4ec08fb36..649a6e6021b4 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -5578,7 +5578,8 @@ static const char readme_msg[] =
> "\t r[maxactive][:[<group>/][<event>]] <place> [<args>]\n"
> #endif
> #ifdef CONFIG_FPROBE_EVENTS
> - "\t f[:[<group>/][<event>]] <func-name>[%return] [<args>]\n"
> + "\t f[:[<group>/][<event>]] <func-name>[:entry|:exit] [<args>]\n"
> + "\t (single symbols still accept %return)\n"
> "\t t[:[<group>/][<event>]] <tracepoint> [<args>]\n"
> #endif
> #ifdef CONFIG_HIST_TRIGGERS
> diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> index 262c0556e4af..f8846cd1d020 100644
> --- a/kernel/trace/trace_fprobe.c
> +++ b/kernel/trace/trace_fprobe.c
> @@ -187,11 +187,14 @@ DEFINE_FREE(tuser_put, struct tracepoint_user *,
> */
> struct trace_fprobe {
> struct dyn_event devent;
> + char *filter;
Could you move this filter to the previous line of nofilter?
> struct fprobe fp;
> + bool list_mode;
This list_mode seems an alias of (filter || nofilter). In this case,
we don't need trace_fprobe::list_mode. Please remove this field.
> + char *nofilter;
> const char *symbol;
> + struct trace_probe tp;
tp must be the last field. Please do not move.
> bool tprobe;
> struct tracepoint_user *tuser;
> - struct trace_probe tp;
> };
>
> static bool is_trace_fprobe(struct dyn_event *ev)
> @@ -559,6 +562,8 @@ static void free_trace_fprobe(struct trace_fprobe *tf)
> trace_probe_cleanup(&tf->tp);
> if (tf->tuser)
> tracepoint_user_put(tf->tuser);
> + kfree(tf->filter);
> + kfree(tf->nofilter);
> kfree(tf->symbol);
> kfree(tf);
> }
> @@ -838,7 +843,12 @@ static int __register_trace_fprobe(struct trace_fprobe *tf)
> if (trace_fprobe_is_tracepoint(tf))
> return __regsiter_tracepoint_fprobe(tf);
>
> - /* TODO: handle filter, nofilter or symbol list */
> + /* Registration path:
> + * - list_mode: pass filter/nofilter
> + * - single: pass symbol only (legacy)
> + */
> + if (tf->list_mode)
> + return register_fprobe(&tf->fp, tf->filter, tf->nofilter);
> return register_fprobe(&tf->fp, tf->symbol, NULL);
> }
>
> @@ -1154,60 +1164,131 @@ static struct notifier_block tprobe_event_module_nb = {
> };
> #endif /* CONFIG_MODULES */
>
> -static int parse_symbol_and_return(int argc, const char *argv[],
> - char **symbol, bool *is_return,
> - bool is_tracepoint)
> +static bool has_wildcard(const char *s)
> {
> - char *tmp = strchr(argv[1], '%');
> - int i;
> + return s && (strchr(s, '*') || strchr(s, '?'));
> +}
>
> - if (tmp) {
> - int len = tmp - argv[1];
> +static int parse_fprobe_list(char *b, char **filter, char **nofilter)
> +{
> + char *f __free(kfree) = NULL;
> + char *nf __free(kfree) = NULL;
> + char *tmp = b, *tok;
> + size_t sz;
>
> - if (!is_tracepoint && !strcmp(tmp, "%return")) {
> - *is_return = true;
> - } else {
> - trace_probe_log_err(len, BAD_ADDR_SUFFIX);
> - return -EINVAL;
> - }
> - *symbol = kmemdup_nul(argv[1], len, GFP_KERNEL);
> - } else
> - *symbol = kstrdup(argv[1], GFP_KERNEL);
> - if (!*symbol)
> + sz = strlen(b) + 1;
> +
> + f = kzalloc(sz, GFP_KERNEL);
> + nf = kzalloc(sz, GFP_KERNEL);
> + if (!f || !nf)
> return -ENOMEM;
>
> - if (*is_return)
> - return 0;
> + while ((tok = strsep(&tmp, ",")) != NULL) {
> + char *dst;
> + bool neg = (*tok == '!');
>
> - if (is_tracepoint) {
> - tmp = *symbol;
> - while (*tmp && (isalnum(*tmp) || *tmp == '_'))
> - tmp++;
> - if (*tmp) {
> - /* find a wrong character. */
> - trace_probe_log_err(tmp - *symbol, BAD_TP_NAME);
> - kfree(*symbol);
> - *symbol = NULL;
> + if (*tok == '\0') {
> + trace_probe_log_err(tmp - b - 1, BAD_TP_NAME);
> return -EINVAL;
> }
> +
> + if (neg)
> + tok++;
> + dst = neg ? nf : f;
> + if (dst[0] != '\0')
> + strcat(dst, ",");
> + strcat(dst, tok);
> }
>
> - /* If there is $retval, this should be a return fprobe. */
> - for (i = 2; i < argc; i++) {
> - tmp = strstr(argv[i], "$retval");
> - if (tmp && !isalnum(tmp[7]) && tmp[7] != '_') {
> - if (is_tracepoint) {
> - trace_probe_log_set_index(i);
> - trace_probe_log_err(tmp - argv[i], RETVAL_ON_PROBE);
> - kfree(*symbol);
> - *symbol = NULL;
> + *filter = no_free_ptr(f);
> + *nofilter = no_free_ptr(nf);
> +
> + return 0;
> +}
> +
> +static int parse_fprobe_spec(const char *in, bool is_tracepoint,
> + char **base, bool *is_return, bool *list_mode,
> + char **filter, char **nofilter)
> +{
> + char *work __free(kfree) = NULL;
> + char *b __free(kfree) = NULL;
> + char *f __free(kfree) = NULL;
> + char *nf __free(kfree) = NULL;
> + bool legacy_ret = false;
> + bool list = false;
> + const char *p;
> + int ret = 0;
> +
> + if (!in || !base || !is_return || !list_mode || !filter || !nofilter)
> + return -EINVAL;
> +
> + *base = NULL; *filter = NULL; *nofilter = NULL;
> + *is_return = false; *list_mode = false;
> +
> + if (is_tracepoint) {
> + for (p = in; *p; p++)
> + if (!isalnum(*p) && *p != '_') {
> + trace_probe_log_err(p - in, BAD_TP_NAME);
> + return -EINVAL;
> + }
> + b = kstrdup(in, GFP_KERNEL);
> + if (!b)
> + return -ENOMEM;
> + *base = no_free_ptr(b);
> + return 0;
> + }
> +
> + work = kstrdup(in, GFP_KERNEL);
> + if (!work)
> + return -ENOMEM;
> +
> + p = strstr(work, "%return");
> + if (p && p[7] == '\0') {
> + *is_return = true;
> + legacy_ret = true;
> + *(char *)p = '\0';
> + } else {
> + /*
> + * If "symbol:entry" or "symbol:exit" is given, it is new
> + * style probe.
> + */
> + p = strrchr(work, ':');
> + if (p) {
> + if (!strcmp(p, ":exit")) {
> + *is_return = true;
> + *(char *)p = '\0';
> + } else if (!strcmp(p, ":entry")) {
> + *(char *)p = '\0';
> + } else {
> return -EINVAL;
> }
> - *is_return = true;
> - break;
> }
> }
> - return 0;
> +
> + list = !!strchr(work, ',');
> +
Here is a needless whitespace (tab) above line.
> + if (list && legacy_ret)
> + return -EINVAL;
> +
> + if (legacy_ret)
> + *is_return = true;
> +
> + b = kstrdup(work, GFP_KERNEL);
> + if (!b)
> + return -ENOMEM;
> +
> + if (list) {
> + ret = parse_fprobe_list(b, &f, &nf);
> + if (ret)
> + return ret;
> + *list_mode = true;
> + }
> +
> + *base = no_free_ptr(b);
> + *filter = no_free_ptr(f);
> + *nofilter = no_free_ptr(nf);
> +
> + return ret;
> }
>
> static int trace_fprobe_create_internal(int argc, const char *argv[],
> @@ -1241,6 +1322,8 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
> const char *event = NULL, *group = FPROBE_EVENT_SYSTEM;
> struct module *mod __free(module_put) = NULL;
> const char **new_argv __free(kfree) = NULL;
> + char *parsed_nofilter __free(kfree) = NULL;
> + char *parsed_filter __free(kfree) = NULL;
nit: we should make those as filter/nofilter simply.
> char *symbol __free(kfree) = NULL;
> char *ebuf __free(kfree) = NULL;
> char *gbuf __free(kfree) = NULL;
> @@ -1249,6 +1332,7 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
> char *dbuf __free(kfree) = NULL;
> int i, new_argc = 0, ret = 0;
> bool is_tracepoint = false;
> + bool list_mode = false;
Then, we don't need this list_mode. we can replace it with "(filter || nofilter)".
> bool is_return = false;
>
> if ((argv[0][0] != 'f' && argv[0][0] != 't') || argc < 2)
> @@ -1270,11 +1354,26 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
>
> trace_probe_log_set_index(1);
>
> - /* a symbol(or tracepoint) must be specified */
> - ret = parse_symbol_and_return(argc, argv, &symbol, &is_return, is_tracepoint);
> + /* Parse spec early (single vs list, suffix, base symbol) */
> + ret = parse_fprobe_spec(argv[1], is_tracepoint, &symbol, &is_return,
> + &list_mode, &parsed_filter, &parsed_nofilter);
> if (ret < 0)
> return -EINVAL;
>
> + for (i = 2; i < argc; i++) {
> + char *tmp = strstr(argv[i], "$retval");
> +
> + if (tmp && !isalnum(tmp[7]) && tmp[7] != '_') {
> + if (is_tracepoint) {
> + trace_probe_log_set_index(i);
> + trace_probe_log_err(tmp - argv[i], RETVAL_ON_PROBE);
> + return -EINVAL;
> + }
> + is_return = true;
> + break;
> + }
> + }
Please do this $retval check in parse_fprobe_spec() as parse_symbol_and_return() did.
> +
> trace_probe_log_set_index(0);
> if (event) {
> gbuf = kmalloc(MAX_EVENT_NAME_LEN, GFP_KERNEL);
> @@ -1287,6 +1386,15 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
> }
>
> if (!event) {
> + /*
> + * Event name rules:
> + * - For list/wildcard: require explicit [GROUP/]EVENT
> + * - For single literal: autogenerate symbol__entry/symbol__exit
> + */
> + if (list_mode || has_wildcard(symbol)) {
> + trace_probe_log_err(0, NO_GROUP_NAME);
NO_EVENT_NAME error?
> + return -EINVAL;
> + }
> ebuf = kmalloc(MAX_EVENT_NAME_LEN, GFP_KERNEL);
> if (!ebuf)
> return -ENOMEM;
> @@ -1322,7 +1430,8 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
> NULL, NULL, NULL, sbuf);
> }
> }
> - if (!ctx->funcname)
> +
> + if (!list_mode && !has_wildcard(symbol) && !is_tracepoint)
> ctx->funcname = symbol;
>
> abuf = kmalloc(MAX_BTF_ARGS_LEN, GFP_KERNEL);
> @@ -1356,6 +1465,21 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
> return ret;
> }
>
> + /* carry list parsing result into tf */
> + if (!is_tracepoint) {
> + tf->list_mode = list_mode;
> + if (parsed_filter) {
> + tf->filter = kstrdup(parsed_filter, GFP_KERNEL);
> + if (!tf->filter)
> + return -ENOMEM;
> + }
> + if (parsed_nofilter) {
> + tf->nofilter = kstrdup(parsed_nofilter, GFP_KERNEL);
> + if (!tf->nofilter)
> + return -ENOMEM;
> + }
> + }
Also, it is natural to do this inside alloc_trace_fprobe(). Can you pass
filter and nofilter to alloc_trace_fprobe()?
(and just make those NULL instead of using kstrdup()?)
Thank you,
> +
> /* parse arguments */
> for (i = 0; i < argc; i++) {
> trace_probe_log_set_index(i + 2);
> @@ -1442,8 +1566,9 @@ static int trace_fprobe_show(struct seq_file *m, struct dyn_event *ev)
> seq_printf(m, ":%s/%s", trace_probe_group_name(&tf->tp),
> trace_probe_name(&tf->tp));
>
> - seq_printf(m, " %s%s", trace_fprobe_symbol(tf),
> - trace_fprobe_is_return(tf) ? "%return" : "");
> + seq_printf(m, " %s", trace_fprobe_symbol(tf));
> + if (!trace_fprobe_is_tracepoint(tf) && trace_fprobe_is_return(tf))
> + seq_puts(m, ":exit");
>
> for (i = 0; i < tf->tp.nr_args; i++)
> seq_printf(m, " %s=%s", tf->tp.args[i].name, tf->tp.args[i].comm);
> --
> 2.43.0
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH] tracing: fprobe: fix the length of unused fgraph_data
From: Masami Hiramatsu @ 2026-03-24 0:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: Martin Kaiser, Masami Hiramatsu, Mathieu Desnoyers,
linux-trace-kernel, linux-kernel, stable
In-Reply-To: <20260323104818.0ad25dd5@gandalf.local.home>
On Mon, 23 Mar 2026 10:48:18 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 23 Mar 2026 11:19:36 +0100
> Martin Kaiser <martin@kaiser.cx> wrote:
>
> > If fprobe_entry does not fill the allocated fgraph_data completely, the
> > unused part is zeroed with memset.
> >
> > Fix the length for this memset call. Both reserved_words and used are in
> > units of return stack words, but memset needs the number of bytes.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > Signed-off-by: Martin Kaiser <martin@kaiser.cx>
> > ---
> > kernel/trace/fprobe.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> > index dcadf1d23b8a..6a1192515afd 100644
> > --- a/kernel/trace/fprobe.c
> > +++ b/kernel/trace/fprobe.c
> > @@ -451,7 +451,7 @@ static int fprobe_fgraph_entry(struct ftrace_graph_ent *trace, struct fgraph_ops
> > }
> > }
> > if (used < reserved_words)
> > - memset(fgraph_data + used, 0, reserved_words - used);
> > + memset(fgraph_data + used, 0, (reserved_words - used) * sizeof(long));
>
> So fgraph_data is only used internally between the fprobe_fgraph_entry()
> and fprobe_return() as it only exists on the fgraph shadow stack. I'm not
> even sure if the unused portion needs to be zeroed out.
>
> Thus, this may be correct, but it doesn't look like a true bug that needs a
> stable tag.
Hmm, indeed. Maybe we'd better just remove this memset from for-next.
Thanks,
>
> -- Steve
>
>
> >
> > /* If any exit_handler is set, data must be used. */
> > return used != 0;
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v2 18/19] mm: damon: Use trace_call__##name() at guarded tracepoint call sites
From: SeongJae Park @ 2026-03-24 0:25 UTC (permalink / raw)
To: Vineeth Pillai (Google)
Cc: SeongJae Park, Steven Rostedt, Peter Zijlstra, Andrew Morton,
damon, linux-mm, linux-kernel, linux-trace-kernel
In-Reply-To: <20260323160052.17528-19-vineeth@bitbyteword.org>
On Mon, 23 Mar 2026 12:00:37 -0400 "Vineeth Pillai (Google)" <vineeth@bitbyteword.org> wrote:
> Replace trace_damos_stat_after_apply_interval() with
> trace_call__damos_stat_after_apply_interval() at a site already guarded
> by an early return when !trace_damos_stat_after_apply_interval_enabled(),
> avoiding a redundant static_branch_unlikely() re-evaluation inside the
> tracepoint.
>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Thanks,
SJ
[...]
^ permalink raw reply
* [PATCH] ring-buffer: Show what clock function is used on timestamp errors
From: Steven Rostedt @ 2026-03-24 0:22 UTC (permalink / raw)
To: LKML, Linux Trace Kernel; +Cc: Masami Hiramatsu, Mathieu Desnoyers
From: Steven Rostedt <rostedt@goodmis.org>
The testing for tracing was triggering a timestamp count issue that was
always off by one. This has been happening for some time but has never
been reported by anyone else. It was finally discovered to be an issue
with the "uptime" (jiffies) clock that happened to be traced and the
internal recursion caused the discrepancy. This would have been much
easier to solve if the clock function being used was displayed when the
error was detected.
Add the clock function to the error output.
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/ring_buffer.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 170170bd83bd..a99d1c7d180b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4435,18 +4435,20 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer,
ret = rb_read_data_buffer(bpage, tail, cpu_buffer->cpu, &ts, &delta);
if (ret < 0) {
if (delta < ts) {
- buffer_warn_return("[CPU: %d]ABSOLUTE TIME WENT BACKWARDS: last ts: %lld absolute ts: %lld\n",
- cpu_buffer->cpu, ts, delta);
+ buffer_warn_return("[CPU: %d]ABSOLUTE TIME WENT BACKWARDS: last ts: %lld absolute ts: %lld clock:%pS\n",
+ cpu_buffer->cpu, ts, delta,
+ cpu_buffer->buffer->clock);
goto out;
}
}
if ((full && ts > info->ts) ||
(!full && ts + info->delta != info->ts)) {
- buffer_warn_return("[CPU: %d]TIME DOES NOT MATCH expected:%lld actual:%lld delta:%lld before:%lld after:%lld%s context:%s\n",
+ buffer_warn_return("[CPU: %d]TIME DOES NOT MATCH expected:%lld actual:%lld delta:%lld before:%lld after:%lld%s context:%s\ntrace clock:%pS",
cpu_buffer->cpu,
ts + info->delta, info->ts, info->delta,
info->before, info->after,
- full ? " (full)" : "", show_interrupt_level());
+ full ? " (full)" : "", show_interrupt_level(),
+ cpu_buffer->buffer->clock);
}
out:
atomic_dec(this_cpu_ptr(&checking));
--
2.51.0
^ permalink raw reply related
* [PATCH v12 4/4] ring-buffer: Add persistent ring buffer selftest
From: Masami Hiramatsu (Google) @ 2026-03-24 0:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, Ian Rogers
In-Reply-To: <177431040276.3708637.10277850418341653677.stgit@mhiramat.tok.corp.google.com>
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Add a self-destractive test for the persistent ring buffer. This
will invalidate some sub-buffer pages in the persistent ring buffer
when kernel gets panic, and check whether the number of detected
invalid pages and the total entry_bytes are the same as record
after reboot.
This can ensure the kernel correctly recover partially corrupted
persistent ring buffer when boot.
The test only runs on the persistent ring buffer whose name is
"ptracingtest". And user has to fill it up with events before
kernel panics.
To run the test, enable CONFIG_RING_BUFFER_PERSISTENT_SELFTEST
and you have to setup the kernel cmdline;
reserve_mem=20M:2M:trace trace_instance=ptracingtest^traceoff@trace
panic=1
And run following commands after the 1st boot;
cd /sys/kernel/tracing/instances/ptracingtest
echo 1 > tracing_on
echo 1 > events/enable
sleep 3
echo c > /proc/sysrq-trigger
After panic message, the kernel will reboot and run the verification
on the persistent ring buffer, e.g.
Ring buffer meta [2] invalid buffer page detected
Ring buffer meta [2] is from previous boot! (318 pages discarded)
Ring buffer testing [2] invalid pages: PASSED (318/318)
Ring buffer testing [2] entry_bytes: PASSED (1300476/1300476)
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
Changes in v10:
- Add entry_bytes test.
- Do not compile test code if CONFIG_RING_BUFFER_PERSISTENT_SELFTEST=n.
Changes in v9:
- Test also reader pages.
---
include/linux/ring_buffer.h | 1 +
kernel/trace/Kconfig | 15 +++++++++
kernel/trace/ring_buffer.c | 69 +++++++++++++++++++++++++++++++++++++++++++
kernel/trace/trace.c | 4 ++
4 files changed, 89 insertions(+)
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index d862fa610270..a92a22c44387 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -238,6 +238,7 @@ int ring_buffer_subbuf_size_get(struct trace_buffer *buffer);
enum ring_buffer_flags {
RB_FL_OVERWRITE = 1 << 0,
+ RB_FL_TESTING = 1 << 1,
};
#ifdef CONFIG_RING_BUFFER
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 49de13cae428..2e6f3b7c6a31 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -1202,6 +1202,21 @@ config RING_BUFFER_VALIDATE_TIME_DELTAS
Only say Y if you understand what this does, and you
still want it enabled. Otherwise say N
+config RING_BUFFER_PERSISTENT_SELFTEST
+ bool "Enable persistent ring buffer selftest"
+ depends on RING_BUFFER
+ help
+ Run a selftest on the persistent ring buffer which names
+ "ptracingtest" (and its backup) when panic_on_reboot by
+ invalidating ring buffer pages.
+ Note that user has to enable events on the persistent ring
+ buffer manually to fill up ring buffers before rebooting.
+ Since this invalidates the data on test target ring buffer,
+ "ptracingtest" persistent ring buffer must not be used for
+ actual tracing, but only for testing.
+
+ If unsure, say N
+
config MMIOTRACE_TEST
tristate "Test module for mmiotrace"
depends on MMIOTRACE && m
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 99f59f1c63e0..31e479143ca0 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -63,6 +63,10 @@ struct ring_buffer_cpu_meta {
unsigned long commit_buffer;
__u32 subbuf_size;
__u32 nr_subbufs;
+#ifdef CONFIG_RING_BUFFER_PERSISTENT_SELFTEST
+ __u32 nr_invalid;
+ __u32 entry_bytes;
+#endif
int buffers[];
};
@@ -2106,6 +2110,19 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
cpu_buffer->cpu, discarded);
+
+#ifdef CONFIG_RING_BUFFER_PERSISTENT_SELFTEST
+ if (meta->nr_invalid)
+ pr_info("Ring buffer testing [%d] invalid pages: %s (%d/%d)\n",
+ cpu_buffer->cpu,
+ (discarded == meta->nr_invalid) ? "PASSED" : "FAILED",
+ discarded, meta->nr_invalid);
+ if (meta->entry_bytes)
+ pr_info("Ring buffer testing [%d] entry_bytes: %s (%ld/%ld)\n",
+ cpu_buffer->cpu,
+ (entry_bytes == meta->entry_bytes) ? "PASSED" : "FAILED",
+ (long)entry_bytes, (long)meta->entry_bytes);
+#endif
return;
invalid:
@@ -2508,12 +2525,64 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
kfree(cpu_buffer);
}
+#ifdef CONFIG_RING_BUFFER_PERSISTENT_SELFTEST
+static void rb_test_inject_invalid_pages(struct trace_buffer *buffer)
+{
+ struct ring_buffer_per_cpu *cpu_buffer;
+ struct ring_buffer_cpu_meta *meta;
+ struct buffer_data_page *dpage;
+ u32 entry_bytes = 0;
+ unsigned long ptr;
+ int subbuf_size;
+ int invalid = 0;
+ int cpu;
+ int i;
+
+ if (!(buffer->flags & RB_FL_TESTING))
+ return;
+
+ guard(preempt)();
+ cpu = smp_processor_id();
+
+ cpu_buffer = buffer->buffers[cpu];
+ meta = cpu_buffer->ring_meta;
+ ptr = (unsigned long)rb_subbufs_from_meta(meta);
+ subbuf_size = meta->subbuf_size;
+
+ for (i = 0; i < meta->nr_subbufs; i++) {
+ int idx = meta->buffers[i];
+
+ dpage = (void *)(ptr + idx * subbuf_size);
+ /* Skip unused pages */
+ if (!local_read(&dpage->commit))
+ continue;
+
+ /* Invalidate even pages. */
+ if (!(i & 0x1)) {
+ local_add(subbuf_size + 1, &dpage->commit);
+ invalid++;
+ } else {
+ /* Count total commit bytes. */
+ entry_bytes += local_read(&dpage->commit);
+ }
+ }
+
+ pr_info("Inject invalidated %d pages on CPU%d, total size: %ld\n",
+ invalid, cpu, (long)entry_bytes);
+ meta->nr_invalid = invalid;
+ meta->entry_bytes = entry_bytes;
+}
+#else /* !CONFIG_RING_BUFFER_PERSISTENT_SELFTEST */
+#define rb_test_inject_invalid_pages(buffer) do { } while (0)
+#endif
+
/* Stop recording on a persistent buffer and flush cache if needed. */
static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
{
struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
ring_buffer_record_off(buffer);
+ rb_test_inject_invalid_pages(buffer);
arch_ring_buffer_flush_range(buffer->range_addr_start, buffer->range_addr_end);
return NOTIFY_DONE;
}
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a626211ceb9a..561bcc1c9125 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -9366,6 +9366,8 @@ static void setup_trace_scratch(struct trace_array *tr,
memset(tscratch, 0, size);
}
+#define TRACE_TEST_PTRACING_NAME "ptracingtest"
+
static int
allocate_trace_buffer(struct trace_array *tr, struct array_buffer *buf, unsigned long size)
{
@@ -9378,6 +9380,8 @@ allocate_trace_buffer(struct trace_array *tr, struct array_buffer *buf, unsigned
buf->tr = tr;
if (tr->range_addr_start && tr->range_addr_size) {
+ if (!strcmp(tr->name, TRACE_TEST_PTRACING_NAME))
+ rb_flags |= RB_FL_TESTING;
/* Add scratch buffer to handle 128 modules */
buf->buffer = ring_buffer_alloc_range(size, rb_flags, 0,
tr->range_addr_start,
^ permalink raw reply related
* [PATCH v12 3/4] ring-buffer: Skip invalid sub-buffers when rewinding persistent ring buffer
From: Masami Hiramatsu (Google) @ 2026-03-24 0:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, Ian Rogers
In-Reply-To: <177431040276.3708637.10277850418341653677.stgit@mhiramat.tok.corp.google.com>
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Skip invalid sub-buffers when rewinding the persistent ring buffer
instead of stopping the rewinding the ring buffer. The skipped
buffers are cleared.
To ensure the rewinding stops at the unused page, this also clears
buffer_data_page::time_stamp when tracing resets the buffer. This
allows us to identify unused pages and empty pages.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
Changes in v12:
- Fix build error.
Changes in v11:
- Reset timestamp when the buffer is invalid.
- When rewinding, skip subbuf page if timestamp is wrong and
check timestamp after validating buffer data page.
Changes in v10:
- Newly added.
---
kernel/trace/ring_buffer.c | 76 +++++++++++++++++++++++++-------------------
1 file changed, 43 insertions(+), 33 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index dd4c8cf350df..99f59f1c63e0 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -389,6 +389,7 @@ struct buffer_page {
static void rb_init_page(struct buffer_data_page *bpage)
{
local_set(&bpage->commit, 0);
+ bpage->time_stamp = 0;
}
static __always_inline unsigned int rb_page_commit(struct buffer_page *bpage)
@@ -1907,12 +1908,14 @@ static int rb_read_data_buffer(struct buffer_data_page *dpage, int tail, int cpu
return events;
}
-static int rb_validate_buffer(struct buffer_data_page *dpage, int cpu,
+static int rb_validate_buffer(struct buffer_page *bpage, int cpu,
struct ring_buffer_cpu_meta *meta)
{
+ struct buffer_data_page *dpage = bpage->page;
unsigned long long ts;
unsigned long tail;
u64 delta;
+ int ret = -1;
/*
* When a sub-buffer is recovered from a read, the commit value may
@@ -1921,9 +1924,17 @@ static int rb_validate_buffer(struct buffer_data_page *dpage, int cpu,
* subbuf_size is considered invalid.
*/
tail = local_read(&dpage->commit) & ~RB_MISSED_MASK;
- if (tail > meta->subbuf_size)
- return -1;
- return rb_read_data_buffer(dpage, tail, cpu, &ts, &delta);
+ if (tail <= meta->subbuf_size)
+ ret = rb_read_data_buffer(dpage, tail, cpu, &ts, &delta);
+
+ if (ret < 0) {
+ local_set(&bpage->entries, 0);
+ local_set(&bpage->page->commit, 0);
+ } else {
+ local_set(&bpage->entries, ret);
+ }
+
+ return ret;
}
/* If the meta data has been validated, now validate the events */
@@ -1944,18 +1955,14 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
orig_head = head_page = cpu_buffer->head_page;
/* Do the reader page first */
- ret = rb_validate_buffer(cpu_buffer->reader_page->page, cpu_buffer->cpu, meta);
+ ret = rb_validate_buffer(cpu_buffer->reader_page, cpu_buffer->cpu, meta);
if (ret < 0) {
pr_info("Ring buffer meta [%d] invalid reader page detected\n",
cpu_buffer->cpu);
discarded++;
- /* Instead of discard whole ring buffer, discard only this sub-buffer. */
- local_set(&cpu_buffer->reader_page->entries, 0);
- local_set(&cpu_buffer->reader_page->page->commit, 0);
} else {
entries += ret;
entry_bytes += rb_page_size(cpu_buffer->reader_page);
- local_set(&cpu_buffer->reader_page->entries, ret);
}
ts = head_page->page->time_stamp;
@@ -1974,26 +1981,33 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
if (head_page == cpu_buffer->tail_page)
break;
- /* Ensure the page has older data than head. */
- if (ts < head_page->page->time_stamp)
- break;
-
- ts = head_page->page->time_stamp;
- /* Ensure the page has correct timestamp and some data. */
- if (!ts || rb_page_commit(head_page) == 0)
- break;
-
- /* Stop rewind if the page is invalid. */
- ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu, meta);
- if (ret < 0)
+ /* Rewind until unused page (no timestamp, no commit). */
+ if (!head_page->page->time_stamp && rb_page_commit(head_page) == 0)
break;
- /* Recover the number of entries and update stats. */
- local_set(&head_page->entries, ret);
- if (ret)
- local_inc(&cpu_buffer->pages_touched);
- entries += ret;
- entry_bytes += rb_page_commit(head_page);
+ /*
+ * Skip if the page is invalid, or its timestamp is newer than the
+ * previous valid page.
+ */
+ ret = rb_validate_buffer(head_page, cpu_buffer->cpu, meta);
+ if (ret >= 0 && ts < head_page->page->time_stamp) {
+ local_set(&head_page->entries, 0);
+ local_set(&head_page->page->commit, 0);
+ head_page->page->time_stamp = ts;
+ ret = -1;
+ }
+ if (ret < 0) {
+ if (!discarded)
+ pr_info("Ring buffer meta [%d] invalid buffer page detected\n",
+ cpu_buffer->cpu);
+ discarded++;
+ } else {
+ entries += ret;
+ entry_bytes += rb_page_size(head_page);
+ if (ret > 0)
+ local_inc(&cpu_buffer->pages_touched);
+ ts = head_page->page->time_stamp;
+ }
}
if (i)
pr_info("Ring buffer [%d] rewound %d pages\n", cpu_buffer->cpu, i);
@@ -2063,15 +2077,12 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
if (head_page == cpu_buffer->reader_page)
continue;
- ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu, meta);
+ ret = rb_validate_buffer(head_page, cpu_buffer->cpu, meta);
if (ret < 0) {
if (!discarded)
pr_info("Ring buffer meta [%d] invalid buffer page detected\n",
cpu_buffer->cpu);
discarded++;
- /* Instead of discard whole ring buffer, discard only this sub-buffer. */
- local_set(&head_page->entries, 0);
- local_set(&head_page->page->commit, 0);
} else {
/* If the buffer has content, update pages_touched */
if (ret)
@@ -2079,7 +2090,6 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
entries += ret;
entry_bytes += rb_page_size(head_page);
- local_set(&head_page->entries, ret);
}
if (head_page == cpu_buffer->commit_page)
break;
@@ -2110,7 +2120,7 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
/* Reset all the subbuffers */
for (i = 0; i < meta->nr_subbufs - 1; i++, rb_inc_page(&head_page)) {
local_set(&head_page->entries, 0);
- local_set(&head_page->page->commit, 0);
+ rb_init_page(head_page->page);
}
}
^ permalink raw reply related
* [PATCH v12 2/4] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
From: Masami Hiramatsu (Google) @ 2026-03-24 0:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, Ian Rogers
In-Reply-To: <177431040276.3708637.10277850418341653677.stgit@mhiramat.tok.corp.google.com>
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Skip invalid sub-buffers when validating the persistent ring buffer
instead of discarding the entire ring buffer. Only skipped buffers
are invalidated (cleared).
If the cache data in memory fails to be synchronized during a reboot,
the persistent ring buffer may become partially corrupted, but other
sub-buffers may still contain readable event data. Only discard the
subbuffers that are found to be corrupted.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
Changes in v11:
- Fix a typo.
Changes in v9:
- Add meta->subbuf_size check.
- Fix a typo.
- Handle invalid reader_page case.
Changes in v8:
- Add comment in rb_valudate_buffer()
- Clear the RB_MISSED_* flags in rb_valudate_buffer() instead of
skipping subbuf.
- Remove unused subbuf local variable from rb_cpu_meta_valid().
Changes in v7:
- Combined with Handling RB_MISSED_* flags patch, focus on validation at boot.
- Remove checking subbuffer data when validating metadata, because it should be done
later.
- Do not mark the discarded sub buffer page but just reset it.
Changes in v6:
- Show invalid page detection message once per CPU.
Changes in v5:
- Instead of showing errors for each page, just show the number
of discarded pages at last.
Changes in v3:
- Record missed data event on commit.
---
kernel/trace/ring_buffer.c | 98 ++++++++++++++++++++++++++------------------
1 file changed, 58 insertions(+), 40 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 0131ce4e018c..dd4c8cf350df 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -396,6 +396,12 @@ static __always_inline unsigned int rb_page_commit(struct buffer_page *bpage)
return local_read(&bpage->page->commit);
}
+/* Size is determined by what has been committed */
+static __always_inline unsigned int rb_page_size(struct buffer_page *bpage)
+{
+ return rb_page_commit(bpage) & ~RB_MISSED_MASK;
+}
+
static void free_buffer_page(struct buffer_page *bpage)
{
/* Range pages are not to be freed */
@@ -1791,7 +1797,6 @@ static bool rb_cpu_meta_valid(struct ring_buffer_cpu_meta *meta, int cpu,
unsigned long *subbuf_mask)
{
int subbuf_size = PAGE_SIZE;
- struct buffer_data_page *subbuf;
unsigned long buffers_start;
unsigned long buffers_end;
int i;
@@ -1799,6 +1804,11 @@ static bool rb_cpu_meta_valid(struct ring_buffer_cpu_meta *meta, int cpu,
if (!subbuf_mask)
return false;
+ if (meta->subbuf_size != PAGE_SIZE) {
+ pr_info("Ring buffer boot meta [%d] invalid subbuf_size\n", cpu);
+ return false;
+ }
+
buffers_start = meta->first_buffer;
buffers_end = meta->first_buffer + (subbuf_size * meta->nr_subbufs);
@@ -1815,11 +1825,12 @@ static bool rb_cpu_meta_valid(struct ring_buffer_cpu_meta *meta, int cpu,
return false;
}
- subbuf = rb_subbufs_from_meta(meta);
-
bitmap_clear(subbuf_mask, 0, meta->nr_subbufs);
- /* Is the meta buffers and the subbufs themselves have correct data? */
+ /*
+ * Ensure the meta::buffers array has correct data. The data in each subbufs
+ * are checked later in rb_meta_validate_events().
+ */
for (i = 0; i < meta->nr_subbufs; i++) {
if (meta->buffers[i] < 0 ||
meta->buffers[i] >= meta->nr_subbufs) {
@@ -1827,18 +1838,12 @@ static bool rb_cpu_meta_valid(struct ring_buffer_cpu_meta *meta, int cpu,
return false;
}
- if ((unsigned)local_read(&subbuf->commit) > subbuf_size) {
- pr_info("Ring buffer boot meta [%d] buffer invalid commit\n", cpu);
- return false;
- }
-
if (test_bit(meta->buffers[i], subbuf_mask)) {
pr_info("Ring buffer boot meta [%d] array has duplicates\n", cpu);
return false;
}
set_bit(meta->buffers[i], subbuf_mask);
- subbuf = (void *)subbuf + subbuf_size;
}
return true;
@@ -1902,13 +1907,22 @@ static int rb_read_data_buffer(struct buffer_data_page *dpage, int tail, int cpu
return events;
}
-static int rb_validate_buffer(struct buffer_data_page *dpage, int cpu)
+static int rb_validate_buffer(struct buffer_data_page *dpage, int cpu,
+ struct ring_buffer_cpu_meta *meta)
{
unsigned long long ts;
+ unsigned long tail;
u64 delta;
- int tail;
- tail = local_read(&dpage->commit);
+ /*
+ * When a sub-buffer is recovered from a read, the commit value may
+ * have RB_MISSED_* bits set, as these bits are reset on reuse.
+ * Even after clearing these bits, a commit value greater than the
+ * subbuf_size is considered invalid.
+ */
+ tail = local_read(&dpage->commit) & ~RB_MISSED_MASK;
+ if (tail > meta->subbuf_size)
+ return -1;
return rb_read_data_buffer(dpage, tail, cpu, &ts, &delta);
}
@@ -1919,6 +1933,7 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
struct buffer_page *head_page, *orig_head;
unsigned long entry_bytes = 0;
unsigned long entries = 0;
+ int discarded = 0;
int ret;
u64 ts;
int i;
@@ -1929,14 +1944,19 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
orig_head = head_page = cpu_buffer->head_page;
/* Do the reader page first */
- ret = rb_validate_buffer(cpu_buffer->reader_page->page, cpu_buffer->cpu);
+ ret = rb_validate_buffer(cpu_buffer->reader_page->page, cpu_buffer->cpu, meta);
if (ret < 0) {
- pr_info("Ring buffer reader page is invalid\n");
- goto invalid;
+ pr_info("Ring buffer meta [%d] invalid reader page detected\n",
+ cpu_buffer->cpu);
+ discarded++;
+ /* Instead of discard whole ring buffer, discard only this sub-buffer. */
+ local_set(&cpu_buffer->reader_page->entries, 0);
+ local_set(&cpu_buffer->reader_page->page->commit, 0);
+ } else {
+ entries += ret;
+ entry_bytes += rb_page_size(cpu_buffer->reader_page);
+ local_set(&cpu_buffer->reader_page->entries, ret);
}
- entries += ret;
- entry_bytes += local_read(&cpu_buffer->reader_page->page->commit);
- local_set(&cpu_buffer->reader_page->entries, ret);
ts = head_page->page->time_stamp;
@@ -1964,7 +1984,7 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
break;
/* Stop rewind if the page is invalid. */
- ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu);
+ ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu, meta);
if (ret < 0)
break;
@@ -2043,21 +2063,24 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
if (head_page == cpu_buffer->reader_page)
continue;
- ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu);
+ ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu, meta);
if (ret < 0) {
- pr_info("Ring buffer meta [%d] invalid buffer page\n",
- cpu_buffer->cpu);
- goto invalid;
- }
-
- /* If the buffer has content, update pages_touched */
- if (ret)
- local_inc(&cpu_buffer->pages_touched);
-
- entries += ret;
- entry_bytes += local_read(&head_page->page->commit);
- local_set(&head_page->entries, ret);
+ if (!discarded)
+ pr_info("Ring buffer meta [%d] invalid buffer page detected\n",
+ cpu_buffer->cpu);
+ discarded++;
+ /* Instead of discard whole ring buffer, discard only this sub-buffer. */
+ local_set(&head_page->entries, 0);
+ local_set(&head_page->page->commit, 0);
+ } else {
+ /* If the buffer has content, update pages_touched */
+ if (ret)
+ local_inc(&cpu_buffer->pages_touched);
+ entries += ret;
+ entry_bytes += rb_page_size(head_page);
+ local_set(&head_page->entries, ret);
+ }
if (head_page == cpu_buffer->commit_page)
break;
}
@@ -2071,7 +2094,8 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
local_set(&cpu_buffer->entries, entries);
local_set(&cpu_buffer->entries_bytes, entry_bytes);
- pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu);
+ pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
+ cpu_buffer->cpu, discarded);
return;
invalid:
@@ -3258,12 +3282,6 @@ rb_iter_head_event(struct ring_buffer_iter *iter)
return NULL;
}
-/* Size is determined by what has been committed */
-static __always_inline unsigned rb_page_size(struct buffer_page *bpage)
-{
- return rb_page_commit(bpage) & ~RB_MISSED_MASK;
-}
-
static __always_inline unsigned
rb_commit_index(struct ring_buffer_per_cpu *cpu_buffer)
{
^ permalink raw reply related
* Re: [PATCH v11 0/4] ring-buffer: Making persistent ring buffers robust
From: Masami Hiramatsu @ 2026-03-24 0:00 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Steven Rostedt, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, Ian Rogers
In-Reply-To: <177426904455.3236905.2079719303397330696.stgit@mhiramat.tok.corp.google.com>
Oops. This is v12. Ignore this.
On Mon, 23 Mar 2026 21:30:45 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> Hi,
>
> Here is the 12th version of improvement patches for making persistent
> ring buffers robust to failures.
> The previous version is here:
>
> https://lore.kernel.org/all/177391152793.193994.8986943289250629418.stgit@mhiramat.tok.corp.google.com/
>
>
> In this version, I rebased it on tracing/fixes branch (thus
> the first fix patch has been removed) and fixed
> a build error which came from a copy-paste mistake.
>
> Thank you,
>
> ---
>
> Masami Hiramatsu (Google) (4):
> ring-buffer: Flush and stop persistent ring buffer on panic
> ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
> ring-buffer: Skip invalid sub-buffers when rewinding persistent ring buffer
> ring-buffer: Add persistent ring buffer selftest
>
>
> arch/alpha/include/asm/Kbuild | 1
> arch/arc/include/asm/Kbuild | 1
> arch/arm/include/asm/Kbuild | 1
> arch/arm64/include/asm/ring_buffer.h | 10 +
> arch/csky/include/asm/Kbuild | 1
> arch/hexagon/include/asm/Kbuild | 1
> arch/loongarch/include/asm/Kbuild | 1
> arch/m68k/include/asm/Kbuild | 1
> arch/microblaze/include/asm/Kbuild | 1
> arch/mips/include/asm/Kbuild | 1
> arch/nios2/include/asm/Kbuild | 1
> arch/openrisc/include/asm/Kbuild | 1
> arch/parisc/include/asm/Kbuild | 1
> arch/powerpc/include/asm/Kbuild | 1
> arch/riscv/include/asm/Kbuild | 1
> arch/s390/include/asm/Kbuild | 1
> arch/sh/include/asm/Kbuild | 1
> arch/sparc/include/asm/Kbuild | 1
> arch/um/include/asm/Kbuild | 1
> arch/x86/include/asm/Kbuild | 1
> arch/xtensa/include/asm/Kbuild | 1
> include/asm-generic/ring_buffer.h | 13 ++
> include/linux/ring_buffer.h | 1
> kernel/trace/Kconfig | 15 ++
> kernel/trace/ring_buffer.c | 237 ++++++++++++++++++++++++++--------
> kernel/trace/trace.c | 4 +
> 26 files changed, 241 insertions(+), 59 deletions(-)
> create mode 100644 arch/arm64/include/asm/ring_buffer.h
> create mode 100644 include/asm-generic/ring_buffer.h
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* [PATCH v12 1/4] ring-buffer: Flush and stop persistent ring buffer on panic
From: Masami Hiramatsu (Google) @ 2026-03-24 0:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, Ian Rogers
In-Reply-To: <177431040276.3708637.10277850418341653677.stgit@mhiramat.tok.corp.google.com>
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
On real hardware, panic and machine reboot may not flush hardware cache
to memory. This means the persistent ring buffer, which relies on a
coherent state of memory, may not have its events written to the buffer
and they may be lost. Moreover, there may be inconsistency with the
counters which are used for validation of the integrity of the
persistent ring buffer which may cause all data to be discarded.
To avoid this issue, stop recording of the ring buffer on panic and
flush the cache of the ring buffer's memory.
Fixes: e645535a954a ("tracing: Add option to use memmapped memory for trace boot instance")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
Changes in v11:
- Do nothing by default since flush_cache_vmap() does nothing on x86
but it can cause deadlock on some architectures via on_each_cpu()
because other CPUs will be stoppped when panic notifier is called.
Changes in v9:
- Fix typo of & to &&.
- Fix typo of "Generic"
Changes in v6:
- Introduce asm/ring_buffer.h for arch_ring_buffer_flush_range().
- Use flush_cache_vmap() instead of flush_cache_all().
Changes in v5:
- Use ring_buffer_record_off() instead of ring_buffer_record_disable().
- Use flush_cache_all() to ensure flush all cache.
Changes in v3:
- update patch description.
---
arch/alpha/include/asm/Kbuild | 1 +
arch/arc/include/asm/Kbuild | 1 +
arch/arm/include/asm/Kbuild | 1 +
arch/arm64/include/asm/ring_buffer.h | 10 ++++++++++
arch/csky/include/asm/Kbuild | 1 +
arch/hexagon/include/asm/Kbuild | 1 +
arch/loongarch/include/asm/Kbuild | 1 +
arch/m68k/include/asm/Kbuild | 1 +
arch/microblaze/include/asm/Kbuild | 1 +
arch/mips/include/asm/Kbuild | 1 +
arch/nios2/include/asm/Kbuild | 1 +
arch/openrisc/include/asm/Kbuild | 1 +
arch/parisc/include/asm/Kbuild | 1 +
arch/powerpc/include/asm/Kbuild | 1 +
arch/riscv/include/asm/Kbuild | 1 +
arch/s390/include/asm/Kbuild | 1 +
arch/sh/include/asm/Kbuild | 1 +
arch/sparc/include/asm/Kbuild | 1 +
arch/um/include/asm/Kbuild | 1 +
arch/x86/include/asm/Kbuild | 1 +
arch/xtensa/include/asm/Kbuild | 1 +
include/asm-generic/ring_buffer.h | 13 +++++++++++++
kernel/trace/ring_buffer.c | 22 ++++++++++++++++++++++
23 files changed, 65 insertions(+)
create mode 100644 arch/arm64/include/asm/ring_buffer.h
create mode 100644 include/asm-generic/ring_buffer.h
diff --git a/arch/alpha/include/asm/Kbuild b/arch/alpha/include/asm/Kbuild
index 483965c5a4de..b154b4e3dfa8 100644
--- a/arch/alpha/include/asm/Kbuild
+++ b/arch/alpha/include/asm/Kbuild
@@ -5,4 +5,5 @@ generic-y += agp.h
generic-y += asm-offsets.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
+generic-y += ring_buffer.h
generic-y += text-patching.h
diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild
index 4c69522e0328..483caacc6988 100644
--- a/arch/arc/include/asm/Kbuild
+++ b/arch/arc/include/asm/Kbuild
@@ -5,5 +5,6 @@ generic-y += extable.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
generic-y += parport.h
+generic-y += ring_buffer.h
generic-y += user.h
generic-y += text-patching.h
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 03657ff8fbe3..decad5f2c826 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -3,6 +3,7 @@ generic-y += early_ioremap.h
generic-y += extable.h
generic-y += flat.h
generic-y += parport.h
+generic-y += ring_buffer.h
generated-y += mach-types.h
generated-y += unistd-nr.h
diff --git a/arch/arm64/include/asm/ring_buffer.h b/arch/arm64/include/asm/ring_buffer.h
new file mode 100644
index 000000000000..62316c406888
--- /dev/null
+++ b/arch/arm64/include/asm/ring_buffer.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_ARM64_RING_BUFFER_H
+#define _ASM_ARM64_RING_BUFFER_H
+
+#include <asm/cacheflush.h>
+
+/* Flush D-cache on persistent ring buffer */
+#define arch_ring_buffer_flush_range(start, end) dcache_clean_pop(start, end)
+
+#endif /* _ASM_ARM64_RING_BUFFER_H */
diff --git a/arch/csky/include/asm/Kbuild b/arch/csky/include/asm/Kbuild
index 3a5c7f6e5aac..7dca0c6cdc84 100644
--- a/arch/csky/include/asm/Kbuild
+++ b/arch/csky/include/asm/Kbuild
@@ -9,6 +9,7 @@ generic-y += qrwlock.h
generic-y += qrwlock_types.h
generic-y += qspinlock.h
generic-y += parport.h
+generic-y += ring_buffer.h
generic-y += user.h
generic-y += vmlinux.lds.h
generic-y += text-patching.h
diff --git a/arch/hexagon/include/asm/Kbuild b/arch/hexagon/include/asm/Kbuild
index 1efa1e993d4b..0f887d4238ed 100644
--- a/arch/hexagon/include/asm/Kbuild
+++ b/arch/hexagon/include/asm/Kbuild
@@ -5,4 +5,5 @@ generic-y += extable.h
generic-y += iomap.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
+generic-y += ring_buffer.h
generic-y += text-patching.h
diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
index 9034b583a88a..7e92957baf6a 100644
--- a/arch/loongarch/include/asm/Kbuild
+++ b/arch/loongarch/include/asm/Kbuild
@@ -10,5 +10,6 @@ generic-y += qrwlock.h
generic-y += user.h
generic-y += ioctl.h
generic-y += mmzone.h
+generic-y += ring_buffer.h
generic-y += statfs.h
generic-y += text-patching.h
diff --git a/arch/m68k/include/asm/Kbuild b/arch/m68k/include/asm/Kbuild
index b282e0dd8dc1..62543bf305ff 100644
--- a/arch/m68k/include/asm/Kbuild
+++ b/arch/m68k/include/asm/Kbuild
@@ -3,5 +3,6 @@ generated-y += syscall_table.h
generic-y += extable.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
+generic-y += ring_buffer.h
generic-y += spinlock.h
generic-y += text-patching.h
diff --git a/arch/microblaze/include/asm/Kbuild b/arch/microblaze/include/asm/Kbuild
index 7178f990e8b3..0030309b47ad 100644
--- a/arch/microblaze/include/asm/Kbuild
+++ b/arch/microblaze/include/asm/Kbuild
@@ -5,6 +5,7 @@ generic-y += extable.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
generic-y += parport.h
+generic-y += ring_buffer.h
generic-y += syscalls.h
generic-y += tlb.h
generic-y += user.h
diff --git a/arch/mips/include/asm/Kbuild b/arch/mips/include/asm/Kbuild
index 684569b2ecd6..9771c3d85074 100644
--- a/arch/mips/include/asm/Kbuild
+++ b/arch/mips/include/asm/Kbuild
@@ -12,5 +12,6 @@ generic-y += mcs_spinlock.h
generic-y += parport.h
generic-y += qrwlock.h
generic-y += qspinlock.h
+generic-y += ring_buffer.h
generic-y += user.h
generic-y += text-patching.h
diff --git a/arch/nios2/include/asm/Kbuild b/arch/nios2/include/asm/Kbuild
index 28004301c236..0a2530964413 100644
--- a/arch/nios2/include/asm/Kbuild
+++ b/arch/nios2/include/asm/Kbuild
@@ -5,6 +5,7 @@ generic-y += cmpxchg.h
generic-y += extable.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
+generic-y += ring_buffer.h
generic-y += spinlock.h
generic-y += user.h
generic-y += text-patching.h
diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild
index cef49d60d74c..8aa34621702d 100644
--- a/arch/openrisc/include/asm/Kbuild
+++ b/arch/openrisc/include/asm/Kbuild
@@ -8,4 +8,5 @@ generic-y += spinlock_types.h
generic-y += spinlock.h
generic-y += qrwlock_types.h
generic-y += qrwlock.h
+generic-y += ring_buffer.h
generic-y += user.h
diff --git a/arch/parisc/include/asm/Kbuild b/arch/parisc/include/asm/Kbuild
index 4fb596d94c89..d48d158f7241 100644
--- a/arch/parisc/include/asm/Kbuild
+++ b/arch/parisc/include/asm/Kbuild
@@ -4,4 +4,5 @@ generated-y += syscall_table_64.h
generic-y += agp.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
+generic-y += ring_buffer.h
generic-y += user.h
diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
index 2e23533b67e3..805b5aeebb6f 100644
--- a/arch/powerpc/include/asm/Kbuild
+++ b/arch/powerpc/include/asm/Kbuild
@@ -5,4 +5,5 @@ generated-y += syscall_table_spu.h
generic-y += agp.h
generic-y += mcs_spinlock.h
generic-y += qrwlock.h
+generic-y += ring_buffer.h
generic-y += early_ioremap.h
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index bd5fc9403295..7721b63642f4 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -14,5 +14,6 @@ generic-y += ticket_spinlock.h
generic-y += qrwlock.h
generic-y += qrwlock_types.h
generic-y += qspinlock.h
+generic-y += ring_buffer.h
generic-y += user.h
generic-y += vmlinux.lds.h
diff --git a/arch/s390/include/asm/Kbuild b/arch/s390/include/asm/Kbuild
index 80bad7de7a04..0c1fc47c3ba0 100644
--- a/arch/s390/include/asm/Kbuild
+++ b/arch/s390/include/asm/Kbuild
@@ -7,3 +7,4 @@ generated-y += unistd_nr.h
generic-y += asm-offsets.h
generic-y += mcs_spinlock.h
generic-y += mmzone.h
+generic-y += ring_buffer.h
diff --git a/arch/sh/include/asm/Kbuild b/arch/sh/include/asm/Kbuild
index 4d3f10ed8275..f0403d3ee8ab 100644
--- a/arch/sh/include/asm/Kbuild
+++ b/arch/sh/include/asm/Kbuild
@@ -3,4 +3,5 @@ generated-y += syscall_table.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
generic-y += parport.h
+generic-y += ring_buffer.h
generic-y += text-patching.h
diff --git a/arch/sparc/include/asm/Kbuild b/arch/sparc/include/asm/Kbuild
index 17ee8a273aa6..49c6bb326b75 100644
--- a/arch/sparc/include/asm/Kbuild
+++ b/arch/sparc/include/asm/Kbuild
@@ -4,4 +4,5 @@ generated-y += syscall_table_64.h
generic-y += agp.h
generic-y += kvm_para.h
generic-y += mcs_spinlock.h
+generic-y += ring_buffer.h
generic-y += text-patching.h
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index 1b9b82bbe322..2a1629ba8140 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -17,6 +17,7 @@ generic-y += module.lds.h
generic-y += parport.h
generic-y += percpu.h
generic-y += preempt.h
+generic-y += ring_buffer.h
generic-y += runtime-const.h
generic-y += softirq_stack.h
generic-y += switch_to.h
diff --git a/arch/x86/include/asm/Kbuild b/arch/x86/include/asm/Kbuild
index 4566000e15c4..078fd2c0d69d 100644
--- a/arch/x86/include/asm/Kbuild
+++ b/arch/x86/include/asm/Kbuild
@@ -14,3 +14,4 @@ generic-y += early_ioremap.h
generic-y += fprobe.h
generic-y += mcs_spinlock.h
generic-y += mmzone.h
+generic-y += ring_buffer.h
diff --git a/arch/xtensa/include/asm/Kbuild b/arch/xtensa/include/asm/Kbuild
index 13fe45dea296..e57af619263a 100644
--- a/arch/xtensa/include/asm/Kbuild
+++ b/arch/xtensa/include/asm/Kbuild
@@ -6,5 +6,6 @@ generic-y += mcs_spinlock.h
generic-y += parport.h
generic-y += qrwlock.h
generic-y += qspinlock.h
+generic-y += ring_buffer.h
generic-y += user.h
generic-y += text-patching.h
diff --git a/include/asm-generic/ring_buffer.h b/include/asm-generic/ring_buffer.h
new file mode 100644
index 000000000000..201d2aee1005
--- /dev/null
+++ b/include/asm-generic/ring_buffer.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Generic arch dependent ring_buffer macros.
+ */
+#ifndef __ASM_GENERIC_RING_BUFFER_H__
+#define __ASM_GENERIC_RING_BUFFER_H__
+
+#include <linux/cacheflush.h>
+
+/* Flush cache on ring buffer range if needed. Do nothing by default. */
+#define arch_ring_buffer_flush_range(start, end) do { } while (0)
+
+#endif /* __ASM_GENERIC_RING_BUFFER_H__ */
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 170170bd83bd..0131ce4e018c 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -6,6 +6,7 @@
*/
#include <linux/sched/isolation.h>
#include <linux/trace_recursion.h>
+#include <linux/panic_notifier.h>
#include <linux/trace_events.h>
#include <linux/ring_buffer.h>
#include <linux/trace_clock.h>
@@ -30,6 +31,7 @@
#include <linux/oom.h>
#include <linux/mm.h>
+#include <asm/ring_buffer.h>
#include <asm/local64.h>
#include <asm/local.h>
#include <asm/setup.h>
@@ -589,6 +591,7 @@ struct trace_buffer {
unsigned long range_addr_start;
unsigned long range_addr_end;
+ struct notifier_block flush_nb;
struct ring_buffer_meta *meta;
@@ -2471,6 +2474,16 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
kfree(cpu_buffer);
}
+/* Stop recording on a persistent buffer and flush cache if needed. */
+static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
+{
+ struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
+
+ ring_buffer_record_off(buffer);
+ arch_ring_buffer_flush_range(buffer->range_addr_start, buffer->range_addr_end);
+ return NOTIFY_DONE;
+}
+
static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
int order, unsigned long start,
unsigned long end,
@@ -2590,6 +2603,12 @@ static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
mutex_init(&buffer->mutex);
+ /* Persistent ring buffer needs to flush cache before reboot. */
+ if (start && end) {
+ buffer->flush_nb.notifier_call = rb_flush_buffer_cb;
+ atomic_notifier_chain_register(&panic_notifier_list, &buffer->flush_nb);
+ }
+
return_ptr(buffer);
fail_free_buffers:
@@ -2677,6 +2696,9 @@ ring_buffer_free(struct trace_buffer *buffer)
{
int cpu;
+ if (buffer->range_addr_start && buffer->range_addr_end)
+ atomic_notifier_chain_unregister(&panic_notifier_list, &buffer->flush_nb);
+
cpuhp_state_remove_instance(CPUHP_TRACE_RB_PREPARE, &buffer->node);
irq_work_sync(&buffer->irq_work.work);
^ permalink raw reply related
* [PATCH v12 0/4] ring-buffer: Making persistent ring buffers robust
From: Masami Hiramatsu (Google) @ 2026-03-24 0:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, Ian Rogers
Hi,
Here is the 12th version of improvement patches for making persistent
ring buffers robust to failures.
The previous version is here:
https://lore.kernel.org/all/177391152793.193994.8986943289250629418.stgit@mhiramat.tok.corp.google.com/
In this version, I rebased it on tracing/fixes branch (thus
the first fix patch has been removed) and fixed
a build error which came from a copy-paste mistake.
Thank you,
---
Masami Hiramatsu (Google) (4):
ring-buffer: Flush and stop persistent ring buffer on panic
ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
ring-buffer: Skip invalid sub-buffers when rewinding persistent ring buffer
ring-buffer: Add persistent ring buffer selftest
arch/alpha/include/asm/Kbuild | 1
arch/arc/include/asm/Kbuild | 1
arch/arm/include/asm/Kbuild | 1
arch/arm64/include/asm/ring_buffer.h | 10 +
arch/csky/include/asm/Kbuild | 1
arch/hexagon/include/asm/Kbuild | 1
arch/loongarch/include/asm/Kbuild | 1
arch/m68k/include/asm/Kbuild | 1
arch/microblaze/include/asm/Kbuild | 1
arch/mips/include/asm/Kbuild | 1
arch/nios2/include/asm/Kbuild | 1
arch/openrisc/include/asm/Kbuild | 1
arch/parisc/include/asm/Kbuild | 1
arch/powerpc/include/asm/Kbuild | 1
arch/riscv/include/asm/Kbuild | 1
arch/s390/include/asm/Kbuild | 1
arch/sh/include/asm/Kbuild | 1
arch/sparc/include/asm/Kbuild | 1
arch/um/include/asm/Kbuild | 1
arch/x86/include/asm/Kbuild | 1
arch/xtensa/include/asm/Kbuild | 1
include/asm-generic/ring_buffer.h | 13 ++
include/linux/ring_buffer.h | 1
kernel/trace/Kconfig | 15 ++
kernel/trace/ring_buffer.c | 237 ++++++++++++++++++++++++++--------
kernel/trace/trace.c | 4 +
26 files changed, 241 insertions(+), 59 deletions(-)
create mode 100644 arch/arm64/include/asm/ring_buffer.h
create mode 100644 include/asm-generic/ring_buffer.h
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v2 15/19] btrfs: Use trace_call__##name() at guarded tracepoint call sites
From: David Sterba @ 2026-03-23 17:53 UTC (permalink / raw)
To: Vineeth Pillai (Google)
Cc: Steven Rostedt, Peter Zijlstra, Chris Mason, David Sterba,
linux-btrfs, linux-kernel, linux-trace-kernel
In-Reply-To: <20260323160052.17528-16-vineeth@bitbyteword.org>
On Mon, Mar 23, 2026 at 12:00:34PM -0400, Vineeth Pillai (Google) wrote:
> Replace trace_foo() with the new trace_call__foo() at sites already
> guarded by trace_foo_enabled(), avoiding a redundant
> static_branch_unlikely() re-evaluation inside the tracepoint.
> trace_call__foo() calls the tracepoint callbacks directly without
> utilizing the static branch again.
>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
> Assisted-by: Claude:claude-sonnet-4-6
Acked-by: David Sterba <dsterba@suse.com>
^ permalink raw reply
* Re: [PATCH v2 02/19] kernel: Use trace_call__##name() at guarded tracepoint call sites
From: Tejun Heo @ 2026-03-23 17:42 UTC (permalink / raw)
To: Vineeth Pillai (Google)
Cc: Steven Rostedt, Peter Zijlstra, David Vernet, Andrea Righi,
Changwoo Min, Ingo Molnar, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Ben Segall, Mel Gorman, Valentin Schneider,
Thomas Gleixner, Yury Norov [NVIDIA], Paul E. McKenney,
Rik van Riel, Roman Kisel, Joel Fernandes, Rafael J. Wysocki,
Ulf Hansson, linux-kernel, sched-ext, linux-trace-kernel
In-Reply-To: <20260323160052.17528-3-vineeth@bitbyteword.org>
On Mon, Mar 23, 2026 at 12:00:21PM -0400, Vineeth Pillai (Google) wrote:
> Replace trace_foo() with the new trace_call__foo() at sites already
> guarded by trace_foo_enabled(), avoiding a redundant
> static_branch_unlikely() re-evaluation inside the tracepoint.
> trace_call__foo() calls the tracepoint callbacks directly without
> utilizing the static branch again.
>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
> Assisted-by: Claude:claude-sonnet-4-6
Acked-by: Tejun Heo <tj@kernel.org>
Thanks.
--
tejun
^ permalink raw reply
* Re: [PATCH v2 12/19] i2c: Use trace_call__##name() at guarded tracepoint call sites
From: Wolfram Sang @ 2026-03-23 16:43 UTC (permalink / raw)
To: Vineeth Pillai (Google)
Cc: Steven Rostedt, Peter Zijlstra, linux-i2c, linux-kernel,
linux-trace-kernel
In-Reply-To: <20260323160052.17528-13-vineeth@bitbyteword.org>
On Mon, Mar 23, 2026 at 12:00:31PM -0400, Vineeth Pillai (Google) wrote:
> Replace trace_foo() with the new trace_call__foo() at sites already
> guarded by trace_foo_enabled(), avoiding a redundant
> static_branch_unlikely() re-evaluation inside the tracepoint.
> trace_call__foo() calls the tracepoint callbacks directly without
> utilizing the static branch again.
>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
> Assisted-by: Claude:claude-sonnet-4-6
If the core of this series is accepted:
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
"To:"-Field of the message is empty!
^ permalink raw reply
* Re: [PATCH v2 13/19] spi: Use trace_call__##name() at guarded tracepoint call sites
From: Mark Brown @ 2026-03-23 16:03 UTC (permalink / raw)
To: Vineeth Pillai (Google)
Cc: Steven Rostedt, Peter Zijlstra, Michael Hennerich, Nuno Sá,
David Lechner, linux-spi, linux-kernel, linux-trace-kernel
In-Reply-To: <20260323160052.17528-14-vineeth@bitbyteword.org>
[-- Attachment #1: Type: text/plain, Size: 410 bytes --]
On Mon, Mar 23, 2026 at 12:00:32PM -0400, Vineeth Pillai (Google) wrote:
> Replace trace_foo() with the new trace_call__foo() at sites already
> guarded by trace_foo_enabled(), avoiding a redundant
> static_branch_unlikely() re-evaluation inside the tracepoint.
> trace_call__foo() calls the tracepoint callbacks directly without
> utilizing the static branch again.
Acked-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* [PATCH v2 18/19] mm: damon: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-03-23 16:00 UTC (permalink / raw)
Cc: Vineeth Pillai (Google), Steven Rostedt, Peter Zijlstra,
SeongJae Park, Andrew Morton, damon, linux-mm, linux-kernel,
linux-trace-kernel
In-Reply-To: <20260323160052.17528-1-vineeth@bitbyteword.org>
Replace trace_damos_stat_after_apply_interval() with
trace_call__damos_stat_after_apply_interval() at a site already guarded
by an early return when !trace_damos_stat_after_apply_interval_enabled(),
avoiding a redundant static_branch_unlikely() re-evaluation inside the
tracepoint.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
---
mm/damon/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index adfc52fee9dc2..b1cc4f44f90a2 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -2342,7 +2342,7 @@ static void damos_trace_stat(struct damon_ctx *c, struct damos *s)
break;
sidx++;
}
- trace_damos_stat_after_apply_interval(cidx, sidx, &s->stat);
+ trace_call__damos_stat_after_apply_interval(cidx, sidx, &s->stat);
}
static void kdamond_apply_schemes(struct damon_ctx *c)
--
2.53.0
^ permalink raw reply related
* [PATCH v2 19/19] x86: msr: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-03-23 16:00 UTC (permalink / raw)
Cc: Vineeth Pillai (Google), Steven Rostedt, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Sean Christopherson, linux-kernel,
linux-trace-kernel
In-Reply-To: <20260323160052.17528-1-vineeth@bitbyteword.org>
The do_trace_write_msr(), do_trace_read_msr(), and do_trace_rdpmc()
wrappers exist to break the header dependency from asm/msr.h which
cannot include trace headers directly. Their callers in asm/msr.h
already guard calls with tracepoint_enabled() checks, so the trace_foo()
calls inside these wrappers perform a redundant static_branch_unlikely()
re-evaluation.
Replace trace_write_msr(), trace_read_msr(), and trace_rdpmc() with
their trace_call__##name() variants to call the tracepoint callbacks
directly without the redundant static branch check.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
---
arch/x86/lib/msr.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index dfdd1da89f366..14785fe5e07b5 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -125,21 +125,21 @@ EXPORT_SYMBOL_FOR_KVM(msr_clear_bit);
#ifdef CONFIG_TRACEPOINTS
void do_trace_write_msr(u32 msr, u64 val, int failed)
{
- trace_write_msr(msr, val, failed);
+ trace_call__write_msr(msr, val, failed);
}
EXPORT_SYMBOL(do_trace_write_msr);
EXPORT_TRACEPOINT_SYMBOL(write_msr);
void do_trace_read_msr(u32 msr, u64 val, int failed)
{
- trace_read_msr(msr, val, failed);
+ trace_call__read_msr(msr, val, failed);
}
EXPORT_SYMBOL(do_trace_read_msr);
EXPORT_TRACEPOINT_SYMBOL(read_msr);
void do_trace_rdpmc(u32 msr, u64 val, int failed)
{
- trace_rdpmc(msr, val, failed);
+ trace_call__rdpmc(msr, val, failed);
}
EXPORT_SYMBOL(do_trace_rdpmc);
EXPORT_TRACEPOINT_SYMBOL(rdpmc);
--
2.53.0
^ permalink raw reply related
* [PATCH v2 17/19] kernel: time, trace: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-03-23 16:00 UTC (permalink / raw)
Cc: Vineeth Pillai (Google), Steven Rostedt, Peter Zijlstra,
Anna-Maria Behnsen, Frederic Weisbecker, Ingo Molnar,
Thomas Gleixner, Masami Hiramatsu, Mathieu Desnoyers,
linux-kernel, linux-trace-kernel
In-Reply-To: <20260323160052.17528-1-vineeth@bitbyteword.org>
Replace trace_foo() with trace_call__foo() at sites already guarded by
tracepoint_enabled() or trace_foo_enabled() checks, avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
- tick-sched.c: Multiple trace_tick_stop() calls are guarded by an early
return when tracepoint_enabled(tick_stop) is false.
- trace_benchmark.c: trace_benchmark_event() is guarded by an early return
when !trace_benchmark_event_enabled().
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
---
kernel/time/tick-sched.c | 12 ++++++------
kernel/trace/trace_benchmark.c | 2 +-
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f7907fadd63f2..f8ab472e30858 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -348,32 +348,32 @@ static bool check_tick_dependency(atomic_t *dep)
return !val;
if (val & TICK_DEP_MASK_POSIX_TIMER) {
- trace_tick_stop(0, TICK_DEP_MASK_POSIX_TIMER);
+ trace_call__tick_stop(0, TICK_DEP_MASK_POSIX_TIMER);
return true;
}
if (val & TICK_DEP_MASK_PERF_EVENTS) {
- trace_tick_stop(0, TICK_DEP_MASK_PERF_EVENTS);
+ trace_call__tick_stop(0, TICK_DEP_MASK_PERF_EVENTS);
return true;
}
if (val & TICK_DEP_MASK_SCHED) {
- trace_tick_stop(0, TICK_DEP_MASK_SCHED);
+ trace_call__tick_stop(0, TICK_DEP_MASK_SCHED);
return true;
}
if (val & TICK_DEP_MASK_CLOCK_UNSTABLE) {
- trace_tick_stop(0, TICK_DEP_MASK_CLOCK_UNSTABLE);
+ trace_call__tick_stop(0, TICK_DEP_MASK_CLOCK_UNSTABLE);
return true;
}
if (val & TICK_DEP_MASK_RCU) {
- trace_tick_stop(0, TICK_DEP_MASK_RCU);
+ trace_call__tick_stop(0, TICK_DEP_MASK_RCU);
return true;
}
if (val & TICK_DEP_MASK_RCU_EXP) {
- trace_tick_stop(0, TICK_DEP_MASK_RCU_EXP);
+ trace_call__tick_stop(0, TICK_DEP_MASK_RCU_EXP);
return true;
}
diff --git a/kernel/trace/trace_benchmark.c b/kernel/trace/trace_benchmark.c
index e19c32f2a9381..189d383934fd3 100644
--- a/kernel/trace/trace_benchmark.c
+++ b/kernel/trace/trace_benchmark.c
@@ -51,7 +51,7 @@ static void trace_do_benchmark(void)
local_irq_disable();
start = trace_clock_local();
- trace_benchmark_event(bm_str, bm_last);
+ trace_call__benchmark_event(bm_str, bm_last);
stop = trace_clock_local();
local_irq_enable();
--
2.53.0
^ permalink raw reply related
* [PATCH v2 10/19] drm: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-03-23 16:00 UTC (permalink / raw)
Cc: Vineeth Pillai (Google), Steven Rostedt, Peter Zijlstra,
Alex Deucher, Christian König, David Airlie, Simona Vetter,
Harry Wentland, Leo Li, Rodrigo Siqueira, Matthew Brost,
Danilo Krummrich, Philipp Stanner, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, Sunil Khatri, Liu01 Tong,
Tvrtko Ursulin, Mario Limonciello, Kees Cook, Prike Liang,
Felix Kuehling, André Almeida, Jesse.Zhang, Philip Yang,
Alex Hung, Aurabindo Pillai, Ray Wu, Wayne Lin,
Mario Limonciello (AMD), Timur Kristóf, Ivan Lipski,
Dominik Kaszewski, amd-gfx, dri-devel, linux-kernel,
linux-trace-kernel
In-Reply-To: <20260323160052.17528-1-vineeth@bitbyteword.org>
Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
drivers/gpu/drm/scheduler/sched_entity.c | 4 ++--
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 24e4b4fc91564..99f0e3c9bddcc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1012,7 +1012,7 @@ static void trace_amdgpu_cs_ibs(struct amdgpu_cs_parser *p)
struct amdgpu_job *job = p->jobs[i];
for (j = 0; j < job->num_ibs; ++j)
- trace_amdgpu_cs(p, job, &job->ibs[j]);
+ trace_call__amdgpu_cs(p, job, &job->ibs[j]);
}
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index f2beb980e3c3a..69d5723b98580 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1394,7 +1394,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
if (trace_amdgpu_vm_bo_mapping_enabled()) {
list_for_each_entry(mapping, &bo_va->valids, list)
- trace_amdgpu_vm_bo_mapping(mapping);
+ trace_call__amdgpu_vm_bo_mapping(mapping);
}
error_free:
@@ -2167,7 +2167,7 @@ void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx *ticket)
continue;
}
- trace_amdgpu_vm_bo_cs(mapping);
+ trace_call__amdgpu_vm_bo_cs(mapping);
}
}
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b3d6f2cd8ab6f..2c6c8050af269 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5190,7 +5190,7 @@ static void amdgpu_dm_backlight_set_level(struct amdgpu_display_manager *dm,
}
if (trace_amdgpu_dm_brightness_enabled()) {
- trace_amdgpu_dm_brightness(__builtin_return_address(0),
+ trace_call__amdgpu_dm_brightness(__builtin_return_address(0),
user_brightness,
brightness,
caps->aux_support,
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index fe174a4857be7..93d764d229ca9 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -429,7 +429,7 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity,
if (trace_drm_sched_job_unschedulable_enabled() &&
!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &entity->dependency->flags))
- trace_drm_sched_job_unschedulable(sched_job, entity->dependency);
+ trace_call__drm_sched_job_unschedulable(sched_job, entity->dependency);
if (!dma_fence_add_callback(entity->dependency, &entity->cb,
drm_sched_entity_wakeup))
@@ -586,7 +586,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
unsigned long index;
xa_for_each(&sched_job->dependencies, index, entry)
- trace_drm_sched_job_add_dep(sched_job, entry);
+ trace_call__drm_sched_job_add_dep(sched_job, entry);
}
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
--
2.53.0
^ permalink raw reply related
* [PATCH v2 15/19] btrfs: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-03-23 16:00 UTC (permalink / raw)
Cc: Vineeth Pillai (Google), Steven Rostedt, Peter Zijlstra,
Chris Mason, David Sterba, linux-btrfs, linux-kernel,
linux-trace-kernel
In-Reply-To: <20260323160052.17528-1-vineeth@bitbyteword.org>
Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
---
fs/btrfs/extent_map.c | 4 ++--
fs/btrfs/raid56.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 095a561d733f0..9284c0a81befb 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -1318,7 +1318,7 @@ static void btrfs_extent_map_shrinker_worker(struct work_struct *work)
if (trace_btrfs_extent_map_shrinker_scan_enter_enabled()) {
s64 nr = percpu_counter_sum_positive(&fs_info->evictable_extent_maps);
- trace_btrfs_extent_map_shrinker_scan_enter(fs_info, nr);
+ trace_call__btrfs_extent_map_shrinker_scan_enter(fs_info, nr);
}
while (ctx.scanned < ctx.nr_to_scan && !btrfs_fs_closing(fs_info)) {
@@ -1358,7 +1358,7 @@ static void btrfs_extent_map_shrinker_worker(struct work_struct *work)
if (trace_btrfs_extent_map_shrinker_scan_exit_enabled()) {
s64 nr = percpu_counter_sum_positive(&fs_info->evictable_extent_maps);
- trace_btrfs_extent_map_shrinker_scan_exit(fs_info, nr_dropped, nr);
+ trace_call__btrfs_extent_map_shrinker_scan_exit(fs_info, nr_dropped, nr);
}
atomic64_set(&fs_info->em_shrinker_nr_to_scan, 0);
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index b4511f560e929..0b5b2432aedf6 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1743,7 +1743,7 @@ static void submit_read_wait_bio_list(struct btrfs_raid_bio *rbio,
struct raid56_bio_trace_info trace_info = { 0 };
bio_get_trace_info(rbio, bio, &trace_info);
- trace_raid56_read(rbio, bio, &trace_info);
+ trace_call__raid56_read(rbio, bio, &trace_info);
}
submit_bio(bio);
}
@@ -2420,7 +2420,7 @@ static void submit_write_bios(struct btrfs_raid_bio *rbio,
struct raid56_bio_trace_info trace_info = { 0 };
bio_get_trace_info(rbio, bio, &trace_info);
- trace_raid56_write(rbio, bio, &trace_info);
+ trace_call__raid56_write(rbio, bio, &trace_info);
}
submit_bio(bio);
}
--
2.53.0
^ permalink raw reply related
* [PATCH v2 16/19] net: devlink: Use trace_call__##name() at guarded tracepoint call sites
From: Vineeth Pillai (Google) @ 2026-03-23 16:00 UTC (permalink / raw)
Cc: Vineeth Pillai (Google), Steven Rostedt, Peter Zijlstra,
Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, netdev, linux-kernel,
linux-trace-kernel
In-Reply-To: <20260323160052.17528-1-vineeth@bitbyteword.org>
Replace trace_devlink_trap_report() with trace_call__devlink_trap_report()
at a site already guarded by tracepoint_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__devlink_trap_report() calls the tracepoint callbacks directly
without utilizing the static branch again.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
---
net/devlink/trap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/devlink/trap.c b/net/devlink/trap.c
index 8edb31654a68a..d54276dcd62fb 100644
--- a/net/devlink/trap.c
+++ b/net/devlink/trap.c
@@ -1497,7 +1497,7 @@ void devlink_trap_report(struct devlink *devlink, struct sk_buff *skb,
devlink_trap_report_metadata_set(&metadata, trap_item,
in_devlink_port, fa_cookie);
- trace_devlink_trap_report(devlink, skb, &metadata);
+ trace_call__devlink_trap_report(devlink, skb, &metadata);
}
}
EXPORT_SYMBOL_GPL(devlink_trap_report);
--
2.53.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox