* [RFC PATCH v2 1/4] rtla/osnoise: Add IPI tracking cmdline option
2026-06-17 13:17 [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Valentin Schneider
@ 2026-06-17 13:17 ` Valentin Schneider
2026-06-29 10:51 ` Tomas Glozar
2026-06-17 13:17 ` [RFC PATCH v2 2/4] rtla/osnoise: Record IPI count in osnoise top Valentin Schneider
` (3 subsequent siblings)
4 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2026-06-17 13:17 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Tomas Glozar, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Costa Shulyupin, Crystal Wood, John Kacur, Ivan Pravdin,
Jonathan Corbet
Later commits will add IPI tracking to osnoise top. To avoid breaking
existing scripts, this new feature will be gated behind a new -i option.
Suggested-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
Documentation/tools/rtla/rtla-osnoise-top.rst | 4 ++++
tools/tracing/rtla/src/cli.c | 1 +
tools/tracing/rtla/src/cli_p.h | 3 +++
tools/tracing/rtla/src/common.h | 1 +
4 files changed, 9 insertions(+)
diff --git a/Documentation/tools/rtla/rtla-osnoise-top.rst b/Documentation/tools/rtla/rtla-osnoise-top.rst
index b91c02ac2bbe1..98f77f8971a69 100644
--- a/Documentation/tools/rtla/rtla-osnoise-top.rst
+++ b/Documentation/tools/rtla/rtla-osnoise-top.rst
@@ -28,6 +28,10 @@ OPTIONS
=======
.. include:: common_osnoise_options.txt
+**-i**, **--ipi**
+
+ Track sources of IPIs.
+
.. include:: common_top_options.txt
.. include:: common_options.txt
diff --git a/tools/tracing/rtla/src/cli.c b/tools/tracing/rtla/src/cli.c
index c5279c9875310..eb1e76a6b0dea 100644
--- a/tools/tracing/rtla/src/cli.c
+++ b/tools/tracing/rtla/src/cli.c
@@ -78,6 +78,7 @@ struct common_params *osnoise_top_parse_args(int argc, char **argv)
RTLA_OPT_STOP_TOTAL('S', "stop-total", "total sample"),
OSNOISE_OPT_THRESHOLD,
RTLA_OPT_TRACE_OUTPUT("osnoise", opt_osnoise_trace_output_cb),
+ OSNOISE_OPT_IPI,
OPT_GROUP("Event Configuration:"),
RTLA_OPT_EVENT,
diff --git a/tools/tracing/rtla/src/cli_p.h b/tools/tracing/rtla/src/cli_p.h
index 3c939de9abf02..7d3f982cfabdb 100644
--- a/tools/tracing/rtla/src/cli_p.h
+++ b/tools/tracing/rtla/src/cli_p.h
@@ -305,6 +305,9 @@ static int opt_filter_cb(const struct option *opt, const char *arg, int unset)
"the minimum delta to be considered a noise", \
opt_llong_callback)
+#define OSNOISE_OPT_IPI OPT_BOOLEAN('i', "ipi", &params->common.ipi, \
+ "track sources of IPIs")
+
/*
* Callback functions for command line options for osnoise tools
*/
diff --git a/tools/tracing/rtla/src/common.h b/tools/tracing/rtla/src/common.h
index 04b287a03f6d4..045253230fcf2 100644
--- a/tools/tracing/rtla/src/common.h
+++ b/tools/tracing/rtla/src/common.h
@@ -108,6 +108,7 @@ struct common_params {
bool kernel_workload;
bool user_data;
bool aa_only;
+ bool ipi;
struct actions threshold_actions;
struct actions end_actions;
--
2.54.0
* Re: [RFC PATCH v2 1/4] rtla/osnoise: Add IPI tracking cmdline option
2026-06-17 13:17 ` [RFC PATCH v2 1/4] rtla/osnoise: Add IPI tracking cmdline option Valentin Schneider
@ 2026-06-29 10:51 ` Tomas Glozar
2026-06-30 13:59 ` Valentin Schneider
0 siblings, 1 reply; 17+ messages in thread
From: Tomas Glozar @ 2026-06-29 10:51 UTC (permalink / raw)
To: Valentin Schneider
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
<vschneid@redhat.com> wrote:
>
> Later commits will add IPI tracking to osnoise top. To avoid breaking
> existing scripts, this new feature will be gated behind a new -i option.
>
> Suggested-by: Tomas Glozar <tglozar@redhat.com>
Thanks. Implementing this as a separate option also means we don't
have to worry about the performance impact in the general use case, as
the feature is not enabled by default.
If we decide to enable IPI tracking by default in the future, we can
just change the option to "--no-ipi" without breaking anything, as
libsubcmd generates all options in a pair by default (i.e. it
automatically recognizes --no-ipi when you define --ipi and vice
versa, unless explicitly disabled).
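
The paired-option behaviour described above can be sketched outside rtla as follows. This is not libsubcmd: it is a minimal illustration using getopt_long(3), where both spellings are listed by hand, whereas libsubcmd (per the explanation above) derives the negated form on its own. The names parse_ipi_flag and ipi_enabled are made up for the example.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical stand-in for params->common.ipi (getopt_long needs an int). */
static int ipi_enabled;

/*
 * Parse argv and return the final value of the flag. Both long options
 * store into the same variable, so the last one on the command line wins.
 * default_on models whether the feature is enabled when neither is given.
 */
static int parse_ipi_flag(int argc, char **argv, int default_on)
{
	static const struct option long_opts[] = {
		{ "ipi",    no_argument, &ipi_enabled, 1 },
		{ "no-ipi", no_argument, &ipi_enabled, 0 },
		{ NULL, 0, NULL, 0 },
	};

	ipi_enabled = default_on;
	optind = 0;	/* force a full rescan so the function can be called again */

	/* An empty optstring means there are no short options at all. */
	while (getopt_long(argc, argv, "", long_opts, NULL) != -1)
		;

	return ipi_enabled;
}
```

With default_on set to 0, only "--ipi" changes the outcome. If the default were later flipped to 1, "--no-ipi" would already be there to switch the feature back off, and command lines that pass "--ipi" would keep parsing.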
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
> Documentation/tools/rtla/rtla-osnoise-top.rst | 4 ++++
> tools/tracing/rtla/src/cli.c | 1 +
> tools/tracing/rtla/src/cli_p.h | 3 +++
> tools/tracing/rtla/src/common.h | 1 +
> 4 files changed, 9 insertions(+)
>
> [truncated]
>
> --- a/tools/tracing/rtla/src/cli_p.h
> +++ b/tools/tracing/rtla/src/cli_p.h
> @@ -305,6 +305,9 @@ static int opt_filter_cb(const struct option *opt, const char *arg, int unset)
> "the minimum delta to be considered a noise", \
> opt_llong_callback)
>
> +#define OSNOISE_OPT_IPI OPT_BOOLEAN('i', "ipi", &params->common.ipi, \
> + "track sources of IPIs")
> +
As IPI tracking is not a commonly used functionality, unlike e.g.
"-p/--period", and -i is already a different option for timerlat tools
(-i/--irq), I'd suggest keeping just the long option, --ipi, like I
did for --on-threshold/--on-end (on Arnaldo's suggestion based on his
experience from perf [1]). This will make it clear to the user that the option
means "IPI detection" and not something else beginning with the letter
"i". We can always add a short option later if its use becomes common.
[1] https://lore.kernel.org/linux-trace-kernel/aEmWyPqQw2Ly7Jlu@x1/
> [truncated]
Tomas
* Re: [RFC PATCH v2 1/4] rtla/osnoise: Add IPI tracking cmdline option
2026-06-29 10:51 ` Tomas Glozar
@ 2026-06-30 13:59 ` Valentin Schneider
0 siblings, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2026-06-30 13:59 UTC (permalink / raw)
To: Tomas Glozar
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On 29/06/26 12:51, Tomas Glozar wrote:
> On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
> <vschneid@redhat.com> wrote:
>> @@ -305,6 +305,9 @@ static int opt_filter_cb(const struct option *opt, const char *arg, int unset)
>> "the minimum delta to be considered a noise", \
>> opt_llong_callback)
>>
>> +#define OSNOISE_OPT_IPI OPT_BOOLEAN('i', "ipi", &params->common.ipi, \
>> + "track sources of IPIs")
>> +
>
> As IPI tracking is not a commonly used functionality, unlike e.g.
> "-p/--period", and -i is already a different option for timerlat tools
> (-i/--irq), I'd suggest keeping just the long option, --ipi, like I
> did for --on-threshold/--on-end (on Arnaldo's suggestion based on his
> experience from perf [1]). This will make it clear to the user that the option
> means "IPI detection" and not something else beginning with the letter
> "i". We can always add a short option later if its use becomes common.
>
> [1] https://lore.kernel.org/linux-trace-kernel/aEmWyPqQw2Ly7Jlu@x1/
>
Makes sense to me!
>> [truncated]
>
> Tomas
* [RFC PATCH v2 2/4] rtla/osnoise: Record IPI count in osnoise top
2026-06-17 13:17 [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Valentin Schneider
2026-06-17 13:17 ` [RFC PATCH v2 1/4] rtla/osnoise: Add IPI tracking cmdline option Valentin Schneider
@ 2026-06-17 13:17 ` Valentin Schneider
2026-06-29 12:56 ` Tomas Glozar
2026-06-17 13:17 ` [RFC PATCH v2 3/4] rtla/osnoise: Trace IPI events when recording a trace file Valentin Schneider
` (2 subsequent siblings)
4 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2026-06-17 13:17 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Tomas Glozar,
Costa Shulyupin, Crystal Wood, John Kacur, Ivan Pravdin,
Jonathan Corbet
Leverage the ipi_send_cpu and ipi_send_cpumask trace events to record the
count of IPIs sent to monitored CPUs. These interferences are already
accounted by the IRQ count, but this split gives a better overall picture.
This uses the newly added -i cmdline option.
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
tools/tracing/rtla/src/osnoise_top.c | 124 ++++++++++++++++++++++++++-
1 file changed, 123 insertions(+), 1 deletion(-)
diff --git a/tools/tracing/rtla/src/osnoise_top.c b/tools/tracing/rtla/src/osnoise_top.c
index 512a6299cb018..5b462a3543b97 100644
--- a/tools/tracing/rtla/src/osnoise_top.c
+++ b/tools/tracing/rtla/src/osnoise_top.c
@@ -8,6 +8,7 @@
#include <string.h>
#include <signal.h>
#include <unistd.h>
+#include <errno.h>
#include <stdio.h>
#include <time.h>
@@ -25,6 +26,7 @@ struct osnoise_top_cpu {
unsigned long long irq_count;
unsigned long long softirq_count;
unsigned long long thread_count;
+ unsigned long long ipi_count;
int sum_cycles;
};
@@ -70,6 +72,91 @@ static struct osnoise_top_data *osnoise_alloc_top(void)
return NULL;
}
+static void account_ipi(struct osnoise_tool *tool,
+ unsigned long long src_cpu, unsigned long long dst_cpu)
+{
+ struct osnoise_top_cpu *cpu_data;
+ struct osnoise_top_data *data;
+ unsigned long long inc = 1;
+
+ data = tool->data;
+ cpu_data = &data->cpu_data[dst_cpu];
+
+ update_sum(&cpu_data->ipi_count, &inc);
+}
+
+/*
+ * osnoise_ipi_cpu_handler - this is the handler for single CPU IPI events.
+ */
+static int
+osnoise_ipi_cpu_handler(struct trace_seq *s, struct tep_record *record,
+ struct tep_event *event, void *context)
+{
+ struct osnoise_tool *tool;
+ struct osnoise_params *params;
+ unsigned long long src_cpu, dst_cpu;
+ struct trace_instance *trace = context;
+
+ tool = container_of(trace, struct osnoise_tool, trace);
+ params = to_osnoise_params(tool->params);
+
+ src_cpu = record->cpu;
+ tep_get_field_val(s, event, "cpu", record, &dst_cpu, 1);
+
+ if (CPU_ISSET(dst_cpu, &params->common.monitored_cpus))
+ account_ipi(tool, src_cpu, dst_cpu);
+
+ return 0;
+}
+
+static cpu_set_t cpumask_tmp_cpus;
+
+/*
+ * osnoise_ipi_cpumask_handler - this is the handler for broadcasted IPI events.
+ */
+static int
+osnoise_ipi_cpumask_handler(struct trace_seq *s, struct tep_record *record,
+ struct tep_event *event, void *context)
+{
+ struct trace_instance *trace = context;
+ struct osnoise_tool *tool;
+ struct osnoise_params *params;
+ struct tep_format_field *field;
+ unsigned long long src_cpu;
+ cpu_set_t *event_cpus;
+ int len;
+
+ tool = container_of(trace, struct osnoise_tool, trace);
+ params = to_osnoise_params(tool->params);
+
+ src_cpu = record->cpu;
+
+ field = tep_find_field(event, "cpumask");
+ if (!field)
+ return 0;
+
+ event_cpus = tep_get_field_raw(s, event, "cpumask", record, &len, 1);
+ if (!event_cpus) {
+ err_msg("Failed to get cpumask field\n");
+ return 0;
+ }
+
+ CPU_AND(&cpumask_tmp_cpus, event_cpus, &params->common.monitored_cpus);
+
+ /*
+ * Computing the mask weight is overkill but there is no leaner option
+ * provided by glibc, e.g cpumask_first() or somesuch.
+ */
+ if (CPU_COUNT(&cpumask_tmp_cpus)) {
+ for (int cpu = 0; cpu < nr_cpus; cpu++) {
+ if (CPU_ISSET(cpu, &cpumask_tmp_cpus))
+ account_ipi(tool, src_cpu, cpu);
+ }
+ }
+
+ return 0;
+}
+
/*
* osnoise_top_handler - this is the handler for osnoise tracer events
*/
@@ -164,6 +251,8 @@ static void osnoise_top_header(struct osnoise_tool *top)
goto eol;
trace_seq_printf(s, " IRQ Softirq Thread");
+ if (params->common.ipi)
+ trace_seq_printf(s, " IPI");
eol:
if (pretty)
@@ -218,7 +307,13 @@ static void osnoise_top_print(struct osnoise_tool *tool, int cpu)
trace_seq_printf(s, "%12llu ", cpu_data->irq_count);
trace_seq_printf(s, "%12llu ", cpu_data->softirq_count);
- trace_seq_printf(s, "%12llu\n", cpu_data->thread_count);
+ trace_seq_printf(s, "%12llu", cpu_data->thread_count);
+ if (!params->common.ipi) {
+ trace_seq_printf(s, "\n");
+ return;
+ }
+
+ trace_seq_printf(s, " %12llu\n", cpu_data->ipi_count);
}
/*
@@ -281,6 +376,7 @@ osnoise_top_apply_config(struct osnoise_tool *tool)
struct osnoise_tool *osnoise_init_top(struct common_params *params)
{
struct osnoise_tool *tool;
+ int retval;
tool = osnoise_init_tool("osnoise_top");
if (!tool)
@@ -295,7 +391,33 @@ struct osnoise_tool *osnoise_init_top(struct common_params *params)
tep_register_event_handler(tool->trace.tep, -1, "ftrace", "osnoise",
osnoise_top_handler, NULL);
+ if (!params->ipi)
+ goto out;
+
+ retval = tracefs_event_enable(tool->trace.inst, "ipi", "ipi_send_cpu");
+ if (retval < 0 && !errno) {
+ err_msg("Could not find ipi_send_cpu event\n");
+ goto out_err;
+ }
+
+ retval = tracefs_event_enable(tool->trace.inst, "ipi", "ipi_send_cpumask");
+ if (retval < 0 && !errno) {
+ err_msg("Could not find ipi_send_cpumask event\n");
+ goto out_err;
+ }
+
+ tep_register_event_handler(tool->trace.tep, -1, "ipi", "ipi_send_cpu",
+ osnoise_ipi_cpu_handler, NULL);
+
+ tep_register_event_handler(tool->trace.tep, -1, "ipi", "ipi_send_cpumask",
+ osnoise_ipi_cpumask_handler, NULL);
+
+out:
return tool;
+out_err:
+ osnoise_free_top_tool(tool);
+ osnoise_destroy_tool(tool);
+ return NULL;
}
struct tool_ops osnoise_top_ops = {
--
2.54.0
* Re: [RFC PATCH v2 2/4] rtla/osnoise: Record IPI count in osnoise top
2026-06-17 13:17 ` [RFC PATCH v2 2/4] rtla/osnoise: Record IPI count in osnoise top Valentin Schneider
@ 2026-06-29 12:56 ` Tomas Glozar
2026-06-30 13:59 ` Valentin Schneider
0 siblings, 1 reply; 17+ messages in thread
From: Tomas Glozar @ 2026-06-29 12:56 UTC (permalink / raw)
To: Valentin Schneider
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
<vschneid@redhat.com> wrote:
>
> Leverage the ipi_send_cpu and ipi_send_cpumask trace events to record the
> count of IPIs sent to monitored CPUs. These interferences are already
> accounted by the IRQ count, but this split gives a better overall picture.
>
> This uses the newly added -i cmdline option.
>
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
> tools/tracing/rtla/src/osnoise_top.c | 124 ++++++++++++++++++++++++++-
> 1 file changed, 123 insertions(+), 1 deletion(-)
>
Overall, looks good to me (see small comments below). The reported
numbers make sense:
[tglozar@cs9 rtla]$ sudo ./rtla osnoise top -q -c 0,1 -d 5s --ipi
Operating System Noise
duration: 0 00:00:05 | time is in us
CPU Period          Runtime        Noise   % CPU Aval    Max Noise   Max Single           HW          NMI          IRQ      Softirq       Thread          IPI
  0 #4              4000000        28481     99.28797         8977          248         6756            0         4002           18            1           42
  1 #5              5000000        38025     99.23950         8120          185         8403            0         5260          153          141           49
(It looks good in the terminal, I'm sure Gmail will garble it...)
I'll compare with trace output on the next patch.
> diff --git a/tools/tracing/rtla/src/osnoise_top.c b/tools/tracing/rtla/src/osnoise_top.c
> index 512a6299cb018..5b462a3543b97 100644
> --- a/tools/tracing/rtla/src/osnoise_top.c
> +++ b/tools/tracing/rtla/src/osnoise_top.c
>
> [truncated]
>
> @@ -70,6 +72,91 @@ static struct osnoise_top_data *osnoise_alloc_top(void)
> return NULL;
> }
> +static void account_ipi(struct osnoise_tool *tool,
> + unsigned long long src_cpu, unsigned long long dst_cpu)
> +{
> + struct osnoise_top_cpu *cpu_data;
> + struct osnoise_top_data *data;
> + unsigned long long inc = 1;
> +
> + data = tool->data;
> + cpu_data = &data->cpu_data[dst_cpu];
> +
> + update_sum(&cpu_data->ipi_count, &inc);
> +}
> +
> +/*
> + * osnoise_ipi_cpu_handler - this is the handler for single CPU IPI events.
> + */
> +static int
> +osnoise_ipi_cpu_handler(struct trace_seq *s, struct tep_record *record,
> + struct tep_event *event, void *context)
> +{
> + struct osnoise_tool *tool;
> + struct osnoise_params *params;
> + unsigned long long src_cpu, dst_cpu;
> + struct trace_instance *trace = context;
> +
> + tool = container_of(trace, struct osnoise_tool, trace);
> + params = to_osnoise_params(tool->params);
> +
> + src_cpu = record->cpu;
> + tep_get_field_val(s, event, "cpu", record, &dst_cpu, 1);
> +
> + if (CPU_ISSET(dst_cpu, &params->common.monitored_cpus))
> + account_ipi(tool, src_cpu, dst_cpu);
Do we need to retrieve and pass the src_cpu here? I get it if you plan
on using it in the future, but as far as I understand, you are
specifically tracking the destination CPU, not the source CPU. Same
note applies to osnoise_ipi_cpumask_handler() below.
> +
> + return 0;
> +}
> +
> +static cpu_set_t cpumask_tmp_cpus;
> +
> +/*
> + * osnoise_ipi_cpumask_handler - this is the handler for broadcasted IPI events.
> + */
> +static int
> +osnoise_ipi_cpumask_handler(struct trace_seq *s, struct tep_record *record,
> + struct tep_event *event, void *context)
> +{
> + struct trace_instance *trace = context;
> + struct osnoise_tool *tool;
> + struct osnoise_params *params;
> + struct tep_format_field *field;
> + unsigned long long src_cpu;
> + cpu_set_t *event_cpus;
> + int len;
> +
> + tool = container_of(trace, struct osnoise_tool, trace);
> + params = to_osnoise_params(tool->params);
> +
> + src_cpu = record->cpu;
> +
> + field = tep_find_field(event, "cpumask");
> + if (!field)
> + return 0;
> +
> + event_cpus = tep_get_field_raw(s, event, "cpumask", record, &len, 1);
> + if (!event_cpus) {
> + err_msg("Failed to get cpumask field\n");
> + return 0;
> + }
> +
> + CPU_AND(&cpumask_tmp_cpus, event_cpus, &params->common.monitored_cpus);
> +
> + /*
> + * Computing the mask weight is overkill but there is no leaner option
> + * provided by glibc, e.g cpumask_first() or somesuch.
> + */
> + if (CPU_COUNT(&cpumask_tmp_cpus)) {
> + for (int cpu = 0; cpu < nr_cpus; cpu++) {
> + if (CPU_ISSET(cpu, &cpumask_tmp_cpus))
> + account_ipi(tool, src_cpu, cpu);
> + }
> + }
Technically, the existing code already relies on the glibc cpumask
implementation (cpu_set_t) matching the kernel "cpumask_t" type, as
the "cpumask" field is the latter (per
/sys/kernel/tracing/events/ipi/ipi_send_cpumask/format), not the
former. So I wouldn't worry about the opaqueness of cpu_set_t much.
Not sure how this is handled in other tracing tools that need to use
cpumask, I'd have to look around a bit. It might even make sense to
have a "tools" version of the cpumask functions like cpumask_first(),
I guess, like we already do for e.g. lists and container_of.
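
As a rough idea of what such a "tools" helper could look like, here is a minimal sketch built only on the glibc CPU_*() macros. The name cpu_set_first is hypothetical; nothing by that name is known to exist in tools/ or in glibc, and the semantics (return -1 when nothing is set) are an assumption rather than a copy of the kernel's cpumask_first().

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Hypothetical helper: return the lowest-numbered CPU present in @set,
 * looking only at CPUs [0, nr_cpus), or -1 if none of them is set.
 * Unlike CPU_COUNT(), the scan stops at the first hit.
 */
static int cpu_set_first(const cpu_set_t *set, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus && cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, set))
			return cpu;
	}

	return -1;
}
```

An "is the intersection empty?" test could then be written as cpu_set_first(&mask, nr_cpus) >= 0 instead of computing the full weight of the mask.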
> +
> + return 0;
> +}
> +
> /*
> * osnoise_top_handler - this is the handler for osnoise tracer events
> */
Nit: As this is extra functionality, it'd be more readable to have the
IPI handling after the main top handler, so that someone not familiar
with the source code will see the core logic first. That would also
match IPI being displayed to the right of the other numbers in the top
output.
> @@ -164,6 +251,8 @@ static void osnoise_top_header(struct osnoise_tool *top)
> goto eol;
>
> trace_seq_printf(s, " IRQ Softirq Thread");
> + if (params->common.ipi)
> + trace_seq_printf(s, " IPI");
>
> eol:
> if (pretty)
> @@ -218,7 +307,13 @@ static void osnoise_top_print(struct osnoise_tool *tool, int cpu)
>
> trace_seq_printf(s, "%12llu ", cpu_data->irq_count);
> trace_seq_printf(s, "%12llu ", cpu_data->softirq_count);
> - trace_seq_printf(s, "%12llu\n", cpu_data->thread_count);
> + trace_seq_printf(s, "%12llu", cpu_data->thread_count);
> + if (!params->common.ipi) {
> + trace_seq_printf(s, "\n");
> + return;
> + }
> +
> + trace_seq_printf(s, " %12llu\n", cpu_data->ipi_count);
Maybe at this point it is worth it to print the "\n" in a separate
statement, readability-wise:
trace_seq_printf(s, "%12llu ", cpu_data->irq_count);
trace_seq_printf(s, "%12llu ", cpu_data->softirq_count);
trace_seq_printf(s, "%12llu", cpu_data->thread_count);
if (params->common.ipi)
trace_seq_printf(s, " %12llu", cpu_data->ipi_count);
trace_seq_printf(s, "\n");
It would also make diffs nicer when adding new options.
> [truncated]
Tomas
* Re: [RFC PATCH v2 2/4] rtla/osnoise: Record IPI count in osnoise top
2026-06-29 12:56 ` Tomas Glozar
@ 2026-06-30 13:59 ` Valentin Schneider
0 siblings, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2026-06-30 13:59 UTC (permalink / raw)
To: Tomas Glozar
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On 29/06/26 14:56, Tomas Glozar wrote:
> On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
> <vschneid@redhat.com> wrote:
>> +/*
>> + * osnoise_ipi_cpu_handler - this is the handler for single CPU IPI events.
>> + */
>> +static int
>> +osnoise_ipi_cpu_handler(struct trace_seq *s, struct tep_record *record,
>> + struct tep_event *event, void *context)
>> +{
>> + struct osnoise_tool *tool;
>> + struct osnoise_params *params;
>> + unsigned long long src_cpu, dst_cpu;
>> + struct trace_instance *trace = context;
>> +
>> + tool = container_of(trace, struct osnoise_tool, trace);
>> + params = to_osnoise_params(tool->params);
>> +
>> + src_cpu = record->cpu;
>> + tep_get_field_val(s, event, "cpu", record, &dst_cpu, 1);
>> +
>> + if (CPU_ISSET(dst_cpu, &params->common.monitored_cpus))
>> + account_ipi(tool, src_cpu, dst_cpu);
>
> Do we need to retrieve and pass the src_cpu here? I get it if you plan
> on using it in the future, but as far as I understand, you are
> specifically tracking the destination CPU, not the source CPU. Same
> note applies to osnoise_ipi_cpumask_handler() below.
>
You're right, I fished out the src_cpu to have it available but it's not
being used ATM.
>> +
>> + return 0;
>> +}
>> +
>> +static cpu_set_t cpumask_tmp_cpus;
>> +
>> +/*
>> + * osnoise_ipi_cpumask_handler - this is the handler for broadcasted IPI events.
>> + */
>> +static int
>> +osnoise_ipi_cpumask_handler(struct trace_seq *s, struct tep_record *record,
>> + struct tep_event *event, void *context)
>> +{
>> + struct trace_instance *trace = context;
>> + struct osnoise_tool *tool;
>> + struct osnoise_params *params;
>> + struct tep_format_field *field;
>> + unsigned long long src_cpu;
>> + cpu_set_t *event_cpus;
>> + int len;
>> +
>> + tool = container_of(trace, struct osnoise_tool, trace);
>> + params = to_osnoise_params(tool->params);
>> +
>> + src_cpu = record->cpu;
>> +
>> + field = tep_find_field(event, "cpumask");
>> + if (!field)
>> + return 0;
>> +
>> + event_cpus = tep_get_field_raw(s, event, "cpumask", record, &len, 1);
>> + if (!event_cpus) {
>> + err_msg("Failed to get cpumask field\n");
>> + return 0;
>> + }
>> +
>> + CPU_AND(&cpumask_tmp_cpus, event_cpus, &params->common.monitored_cpus);
>> +
>> + /*
>> + * Computing the mask weight is overkill but there is no leaner option
>> + * provided by glibc, e.g cpumask_first() or somesuch.
>> + */
>> + if (CPU_COUNT(&cpumask_tmp_cpus)) {
>> + for (int cpu = 0; cpu < nr_cpus; cpu++) {
>> + if (CPU_ISSET(cpu, &cpumask_tmp_cpus))
>> + account_ipi(tool, src_cpu, cpu);
>> + }
>> + }
>
> Technically, the existing code already relies on the glibc cpumask
> implementation (cpu_set_t) matching the kernel "cpumask_t" type, as
> the "cpumask" field is the latter (per
> /sys/kernel/tracing/events/ipi/ipi_send_cpumask/format), not the
> former. So I wouldn't worry about the opaqueness of cpu_set_t much.
>
Right, AFAICT that's the "canonical" type for passing cpumasks around
between userspace and kernelspace. e.g. for sched_getaffinity():
manpage:
int sched_getaffinity(pid_t pid, size_t cpusetsize,
cpu_set_t *mask);
kernelside:
SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
unsigned long __user *, user_mask_ptr)
{
cpumask_var_t mask;
sched_getaffinity(pid, mask);
copy_to_user(user_mask_ptr, cpumask_bits(mask), ...)
}
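
For the userspace half of that exchange, a minimal sketch of a caller might look like the following. It only assumes the glibc sched_getaffinity() wrapper and the CPU_*() macros; the helper name allowed_cpu_count is made up for illustration.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Hypothetical helper: return how many CPUs the calling thread is allowed
 * to run on, or -1 if the syscall fails. The kernel fills the cpu_set_t
 * buffer with the raw bits of its own cpumask.
 */
static int allowed_cpu_count(void)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	/* pid 0 means "the calling thread". */
	if (sched_getaffinity(0, sizeof(mask), &mask) < 0)
		return -1;

	return CPU_COUNT(&mask);
}
```

A fixed-size cpu_set_t covers CPU_SETSIZE CPUs; on machines with more CPUs than that the call can fail, and a dynamically sized set (CPU_ALLOC() and the CPU_*_S() macros) would be needed instead.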
> Not sure how this is handled in other tracing tools that need to use
> cpumask, I'd have to look around a bit. It might even make sense to
> have a "tools" version of the cpumask functions like cpumask_first(),
> I guess, like we already do for e.g. lists and container_of.
>
I couldn't find anything in tools/testing/* other than the CPU_*() helpers.
>> +
>> + return 0;
>> +}
>> +
>> /*
>> * osnoise_top_handler - this is the handler for osnoise tracer events
>> */
>
> Nit: As this is extra functionality, it'd be more readable to have the
> IPI handling after the main top handler, so that someone not familiar
> with the source code will see the core logic first. That would also
> match IPI being displayed to the right of the other numbers in the top
> output.
>
Ack.
>> @@ -164,6 +251,8 @@ static void osnoise_top_header(struct osnoise_tool *top)
>> goto eol;
>>
>> trace_seq_printf(s, " IRQ Softirq Thread");
>> + if (params->common.ipi)
>> + trace_seq_printf(s, " IPI");
>>
>> eol:
>> if (pretty)
>> @@ -218,7 +307,13 @@ static void osnoise_top_print(struct osnoise_tool *tool, int cpu)
>>
>> trace_seq_printf(s, "%12llu ", cpu_data->irq_count);
>> trace_seq_printf(s, "%12llu ", cpu_data->softirq_count);
>> - trace_seq_printf(s, "%12llu\n", cpu_data->thread_count);
>> + trace_seq_printf(s, "%12llu", cpu_data->thread_count);
>> + if (!params->common.ipi) {
>> + trace_seq_printf(s, "\n");
>> + return;
>> + }
>> +
>> + trace_seq_printf(s, " %12llu\n", cpu_data->ipi_count);
>
> Maybe at this point it is worth it to print the "\n" in a separate
> statement, readability-wise:
>
> trace_seq_printf(s, "%12llu ", cpu_data->irq_count);
> trace_seq_printf(s, "%12llu ", cpu_data->softirq_count);
> trace_seq_printf(s, "%12llu", cpu_data->thread_count);
> if (params->common.ipi)
> trace_seq_printf(s, " %12llu", cpu_data->ipi_count);
> trace_seq_printf(s, "\n");
>
> It would also make diffs nicer when adding new options.
>
Indeed, will do.
>> [truncated]
>
>
> Tomas
* [RFC PATCH v2 3/4] rtla/osnoise: Trace IPI events when recording a trace file
2026-06-17 13:17 [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Valentin Schneider
2026-06-17 13:17 ` [RFC PATCH v2 1/4] rtla/osnoise: Add IPI tracking cmdline option Valentin Schneider
2026-06-17 13:17 ` [RFC PATCH v2 2/4] rtla/osnoise: Record IPI count in osnoise top Valentin Schneider
@ 2026-06-17 13:17 ` Valentin Schneider
2026-06-30 11:32 ` Tomas Glozar
2026-06-17 13:17 ` [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs Valentin Schneider
2026-06-26 10:26 ` [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Steven Rostedt
4 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2026-06-17 13:17 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Tomas Glozar,
Costa Shulyupin, Crystal Wood, John Kacur, Ivan Pravdin,
Jonathan Corbet
IPIs can now be monitored and accounted by osnoise top. When that is
the case, also record them when saving a trace file.
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
tools/tracing/rtla/src/common.c | 2 +-
tools/tracing/rtla/src/common.h | 2 +-
tools/tracing/rtla/src/osnoise.c | 17 ++++++++++++++++-
3 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/tools/tracing/rtla/src/common.c b/tools/tracing/rtla/src/common.c
index d0a8a6edbf0cb..dd302427557ca 100644
--- a/tools/tracing/rtla/src/common.c
+++ b/tools/tracing/rtla/src/common.c
@@ -204,7 +204,7 @@ int run_tool(struct tool_ops *ops, int argc, char *argv[])
if (params->threshold_actions.present[ACTION_TRACE_OUTPUT] ||
params->end_actions.present[ACTION_TRACE_OUTPUT]) {
- tool->record = osnoise_init_trace_tool(ops->tracer);
+ tool->record = osnoise_init_trace_tool(params, ops->tracer);
if (!tool->record) {
err_msg("Failed to enable the trace instance\n");
goto out_free;
diff --git a/tools/tracing/rtla/src/common.h b/tools/tracing/rtla/src/common.h
index 045253230fcf2..421e06e10f3f1 100644
--- a/tools/tracing/rtla/src/common.h
+++ b/tools/tracing/rtla/src/common.h
@@ -178,7 +178,7 @@ int osnoise_set_workload(struct osnoise_context *context, bool onoff);
void osnoise_destroy_tool(struct osnoise_tool *top);
struct osnoise_tool *osnoise_init_tool(char *tool_name);
-struct osnoise_tool *osnoise_init_trace_tool(const char *tracer);
+struct osnoise_tool *osnoise_init_trace_tool(struct common_params *params, const char *tracer);
bool osnoise_trace_is_off(struct osnoise_tool *tool, struct osnoise_tool *record);
int osnoise_set_stop_us(struct osnoise_context *context, long long stop_us);
int osnoise_set_stop_total_us(struct osnoise_context *context,
diff --git a/tools/tracing/rtla/src/osnoise.c b/tools/tracing/rtla/src/osnoise.c
index 4ff5dad013b10..281f6f57d15af 100644
--- a/tools/tracing/rtla/src/osnoise.c
+++ b/tools/tracing/rtla/src/osnoise.c
@@ -1181,7 +1181,8 @@ struct osnoise_tool *osnoise_init_tool(char *tool_name)
/*
* osnoise_init_trace_tool - init a tracer instance to trace osnoise events
*/
-struct osnoise_tool *osnoise_init_trace_tool(const char *tracer)
+struct osnoise_tool *osnoise_init_trace_tool(struct common_params *params,
+ const char *tracer)
{
struct osnoise_tool *trace;
int retval;
@@ -1196,6 +1197,20 @@ struct osnoise_tool *osnoise_init_trace_tool(const char *tracer)
goto out_err;
}
+ if (params->ipi) {
+ retval = tracefs_event_enable(trace->trace.inst, "ipi", "ipi_send_cpu");
+ if (retval < 0 && !errno) {
+ err_msg("Could not find ipi_send_cpu event\n");
+ goto out_err;
+ }
+
+ retval = tracefs_event_enable(trace->trace.inst, "ipi", "ipi_send_cpumask");
+ if (retval < 0 && !errno) {
+ err_msg("Could not find ipi_send_cpumask event\n");
+ goto out_err;
+ }
+ }
+
retval = enable_tracer_by_name(trace->trace.inst, tracer);
if (retval) {
err_msg("Could not enable %s tracer for tracing\n", tracer);
--
2.54.0
* Re: [RFC PATCH v2 3/4] rtla/osnoise: Trace IPI events when recording a trace file
2026-06-17 13:17 ` [RFC PATCH v2 3/4] rtla/osnoise: Trace IPI events when recording a trace file Valentin Schneider
@ 2026-06-30 11:32 ` Tomas Glozar
0 siblings, 0 replies; 17+ messages in thread
From: Tomas Glozar @ 2026-06-30 11:32 UTC (permalink / raw)
To: Valentin Schneider
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
<vschneid@redhat.com> wrote:
>
> IPIs can now be monitored and accounted by osnoise top. When that is
> the case, also record them when saving a trace file.
>
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
> tools/tracing/rtla/src/common.c | 2 +-
> tools/tracing/rtla/src/common.h | 2 +-
> tools/tracing/rtla/src/osnoise.c | 17 ++++++++++++++++-
> 3 files changed, 18 insertions(+), 3 deletions(-)
>
Looks good, the numbers match between what is seen in (non-truncated)
trace and what is detected in RTLA:
$ rtla osnoise top -q -d 5s --ipi --on-end trace,file=/tmp/ipi_trace.txt | awk '/^ *[0-9]/{ print "CPU: " $1 ", IPI count: " $13 }'
CPU: 0, IPI count: 20
CPU: 1, IPI count: 2
CPU: 2, IPI count: 1
CPU: 3, IPI count: 3
CPU: 4, IPI count: 0
CPU: 5, IPI count: 2
CPU: 6, IPI count: 1
CPU: 7, IPI count: 1
CPU: 8, IPI count: 0
CPU: 9, IPI count: 0
CPU: 10, IPI count: 2
CPU: 11, IPI count: 3
CPU: 12, IPI count: 0
CPU: 13, IPI count: 20
$ grep ipi_send_cpumask /tmp/ipi_trace.txt | wc -l
0
$ for cpu in {0..13}; do n=$(grep -F "ipi_send_cpu: cpu=$cpu " /tmp/ipi_trace.txt | wc -l); echo "CPU: $cpu, IPI count: $n"; done
CPU: 0, IPI count: 20
CPU: 1, IPI count: 2
CPU: 2, IPI count: 1
CPU: 3, IPI count: 3
CPU: 4, IPI count: 0
CPU: 5, IPI count: 2
CPU: 6, IPI count: 1
CPU: 7, IPI count: 1
CPU: 8, IPI count: 0
CPU: 9, IPI count: 0
CPU: 10, IPI count: 2
CPU: 11, IPI count: 3
CPU: 12, IPI count: 0
CPU: 13, IPI count: 20
(This is in a VM, with apparently no ipi_send_cpumask events, so I
didn't test that.)
Tomas
* [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs
2026-06-17 13:17 [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Valentin Schneider
` (2 preceding siblings ...)
2026-06-17 13:17 ` [RFC PATCH v2 3/4] rtla/osnoise: Trace IPI events when recording a trace file Valentin Schneider
@ 2026-06-17 13:17 ` Valentin Schneider
2026-06-30 10:14 ` Tomas Glozar
2026-06-26 10:26 ` [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Steven Rostedt
4 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2026-06-17 13:17 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Tomas Glozar,
Costa Shulyupin, Crystal Wood, John Kacur, Ivan Pravdin,
Jonathan Corbet
Instead of post-processing the events in the tracefs_iterate_raw_events()
callbacks, leverage the kernel event filtering infrastructure to only emit
IPI events if they target CPUs that are being traced, as specified by the
-c cmdline option.
Note that some post-processing is still required for the ipi_send_cpumask
event, as the event being emitted means *some* CPUs targeted by that event
are monitored, but not all of them - userspace has to recompute that
intersection.
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
tools/tracing/rtla/src/osnoise_top.c | 37 +++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 4 deletions(-)
diff --git a/tools/tracing/rtla/src/osnoise_top.c b/tools/tracing/rtla/src/osnoise_top.c
index 5b462a3543b97..8040521710884 100644
--- a/tools/tracing/rtla/src/osnoise_top.c
+++ b/tools/tracing/rtla/src/osnoise_top.c
@@ -93,18 +93,15 @@ osnoise_ipi_cpu_handler(struct trace_seq *s, struct tep_record *record,
struct tep_event *event, void *context)
{
struct osnoise_tool *tool;
- struct osnoise_params *params;
unsigned long long src_cpu, dst_cpu;
struct trace_instance *trace = context;
tool = container_of(trace, struct osnoise_tool, trace);
- params = to_osnoise_params(tool->params);
src_cpu = record->cpu;
tep_get_field_val(s, event, "cpu", record, &dst_cpu, 1);
- if (CPU_ISSET(dst_cpu, &params->common.monitored_cpus))
- account_ipi(tool, src_cpu, dst_cpu);
+ account_ipi(tool, src_cpu, dst_cpu);
return 0;
}
@@ -141,6 +138,11 @@ osnoise_ipi_cpumask_handler(struct trace_seq *s, struct tep_record *record,
return 0;
}
+ /*
+ * Despite already filtering for such an intersection, we need to compute
+ * the intersection here as the @cpumask field may contain non-monitered
+ * CPUs.
+ */
CPU_AND(&cpumask_tmp_cpus, event_cpus, &params->common.monitored_cpus);
/*
@@ -406,6 +408,33 @@ struct osnoise_tool *osnoise_init_top(struct common_params *params)
goto out_err;
}
+ /*
+ * If tracing on a subset of possible CPUs, leverage the kernel filtering
+ * infrastructure to only generate events on traced CPUs.
+ */
+ if (params->cpus) {
+ char filter[MAX_PATH];
+
+ snprintf(filter, ARRAY_SIZE(filter), "cpu & CPUS{%s}\n", params->cpus);
+ retval = tracefs_event_file_write(tool->trace.inst,
+ "ipi", "ipi_send_cpu", "filter",
+ filter);
+ if (retval) {
+ err_msg("Could not set ipi_send_cpu CPU filter\n");
+ goto out_err;
+ }
+
+
+ snprintf(filter, ARRAY_SIZE(filter), "cpumask & CPUS{%s}\n", params->cpus);
+ retval = tracefs_event_file_write(tool->trace.inst,
+ "ipi", "ipi_send_cpumask", "filter",
+ filter);
+ if (retval) {
+ err_msg("Could not set ipi_send_cpumask CPU filter\n");
+ goto out_err;
+ }
+ }
+
tep_register_event_handler(tool->trace.tep, -1, "ipi", "ipi_send_cpu",
osnoise_ipi_cpu_handler, NULL);
--
2.54.0
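As an illustration of the post-processing the commit message says is still required for ipi_send_cpumask, here is a minimal, self-contained sketch of the userspace intersection. It is not the rtla code: count_monitored_targets() and make_set() are hypothetical helpers introduced for this example, and only the glibc CPU_* macros from <sched.h> are real.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: count how many CPUs targeted by one
 * ipi_send_cpumask event are also monitored. The kernel filter only
 * guarantees that the overlap is non-empty, so the exact overlap is
 * recomputed in userspace.
 */
static int count_monitored_targets(const cpu_set_t *event_cpus,
				   const cpu_set_t *monitored_cpus)
{
	cpu_set_t both;

	CPU_AND(&both, event_cpus, monitored_cpus);
	return CPU_COUNT(&both);
}

/* Hypothetical helper: build a cpu_set_t from an array of CPU ids. */
static cpu_set_t make_set(const int *cpus, int n)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	for (int i = 0; i < n; i++) {
		assert(cpus[i] >= 0 && cpus[i] < CPU_SETSIZE);
		CPU_SET(cpus[i], &set);
	}
	return set;
}
```

An event targeting CPUs 0, 2 and 5 while CPUs 2, 3, 5 and 7 are monitored would pass the kernel filter, yet only two of its three targets should be accounted.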
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs
2026-06-17 13:17 ` [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs Valentin Schneider
@ 2026-06-30 10:14 ` Tomas Glozar
2026-06-30 13:59 ` Valentin Schneider
0 siblings, 1 reply; 17+ messages in thread
From: Tomas Glozar @ 2026-06-30 10:14 UTC (permalink / raw)
To: Valentin Schneider
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
<vschneid@redhat.com> wrote:
>
> Instead of post-processing the events in the tracefs_iterate_raw_events()
> callbacks, leverage the kernel event filtering infrastructure to only emit
> IPI events if they target CPUs that are being traced, as specified by the
> -c cmdline option.
>
> Note that some post-processing is still required for the ipi_send_cpumask
> event, as the event being emitted means *some* CPUs targeted by that event
> are monitored, but not all of them - userspace has to recompute that
> intersection.
>
Nit: I'd drop the "Instead of post-processing the events in the
tracefs_iterate_raw_events() callbacks" sentence. I find it a bit
confusing, as "instead of" is quite a strong wording implying
post-processing is removed (at least to my perception), but in the
next paragraph, you contradict it by saying that some post-processing
is still done. Also the commit message is perfectly understandable
without it.
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
> tools/tracing/rtla/src/osnoise_top.c | 37 +++++++++++++++++++++++++---
> 1 file changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/tools/tracing/rtla/src/osnoise_top.c b/tools/tracing/rtla/src/osnoise_top.c
> index 5b462a3543b97..8040521710884 100644
> --- a/tools/tracing/rtla/src/osnoise_top.c
> +++ b/tools/tracing/rtla/src/osnoise_top.c
> @@ -93,18 +93,15 @@ osnoise_ipi_cpu_handler(struct trace_seq *s, struct tep_record *record,
> struct tep_event *event, void *context)
> {
> struct osnoise_tool *tool;
> - struct osnoise_params *params;
> unsigned long long src_cpu, dst_cpu;
> struct trace_instance *trace = context;
>
> tool = container_of(trace, struct osnoise_tool, trace);
> - params = to_osnoise_params(tool->params);
>
> src_cpu = record->cpu;
> tep_get_field_val(s, event, "cpu", record, &dst_cpu, 1);
>
> - if (CPU_ISSET(dst_cpu, &params->common.monitored_cpus))
> - account_ipi(tool, src_cpu, dst_cpu);
> + account_ipi(tool, src_cpu, dst_cpu);
>
> return 0;
> }
> @@ -141,6 +138,11 @@ osnoise_ipi_cpumask_handler(struct trace_seq *s, struct tep_record *record,
> return 0;
> }
>
> + /*
> + * Despite already filtering for such an intersection, we need to compute
> + * the intersection here as the @cpumask field may contain non-monitered
Typo: non-monitered -> non-monitored
> + * CPUs.
> + */
> CPU_AND(&cpumask_tmp_cpus, event_cpus, &params->common.monitored_cpus);
>
> /*
> @@ -406,6 +408,33 @@ struct osnoise_tool *osnoise_init_top(struct common_params *params)
> goto out_err;
> }
>
> + /*
> + * If tracing on a subset of possible CPUs, leverage the kernel filtering
> + * infrastructure to only generate events on traced CPUs.
> + */
> + if (params->cpus) {
> + char filter[MAX_PATH];
> +
> + snprintf(filter, ARRAY_SIZE(filter), "cpu & CPUS{%s}\n", params->cpus);
> + retval = tracefs_event_file_write(tool->trace.inst,
> + "ipi", "ipi_send_cpu", "filter",
> + filter);
> + if (retval) {
retval is the number of bytes written here, so this should be "retval
< 0" like in trace_event_enable_filter() in trace.c. Same below.
> + err_msg("Could not set ipi_send_cpu CPU filter\n");
> + goto out_err;
It would be useful to have --ipi work even on older kernels that don't
yet have your cpumask trace event filter patchset [1], for example, by
printing a debug message that filtering is disabled and setting a flag
instead of erroring out here. Then the code in
osnoise_ipi_cpu_handler() can preserve the CPU_ISSET check if the flag
is set.
As --ipi is optional, we can choose to only support it on newer
kernels, but it would be nice to have it working without the filter,
too.
[1] https://lore.kernel.org/linux-trace-kernel/20230707172155.70873-1-vschneid@redhat.com/T/#u
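The fallback suggested above could look roughly like the sketch below. This is only an illustration under stated assumptions: struct ipi_ctx, its fields, ipi_filter_setup() and handle_ipi_send_cpu() are hypothetical names, not rtla's, and the write result is passed in as a plain int so the example stays self-contained.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical state; the field names are illustrative, not rtla's. */
struct ipi_ctx {
	cpu_set_t monitored_cpus;
	bool kernel_filter;		/* true if the kernel filter was installed */
	unsigned long accounted;	/* IPIs accounted so far */
};

/*
 * Record the outcome of installing the kernel-side filter. write_ret is
 * the value returned by the filter write; a negative value means the
 * kernel rejected it (e.g. no cpumask filter support), in which case
 * the tool keeps running and filters in userspace instead.
 */
static void ipi_filter_setup(struct ipi_ctx *ctx, int write_ret)
{
	ctx->kernel_filter = write_ret >= 0;
}

/* Account an ipi_send_cpu event, filtering in userspace if needed. */
static void handle_ipi_send_cpu(struct ipi_ctx *ctx, int dst_cpu)
{
	assert(dst_cpu >= 0 && dst_cpu < CPU_SETSIZE);

	if (!ctx->kernel_filter && !CPU_ISSET(dst_cpu, &ctx->monitored_cpus))
		return;

	ctx->accounted++;
}
```

With the flag in place, the CPU_ISSET() check only costs anything on kernels where the filter could not be set.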
> + }
> +
> +
> + snprintf(filter, ARRAY_SIZE(filter), "cpumask & CPUS{%s}\n", params->cpus);
> + retval = tracefs_event_file_write(tool->trace.inst,
> + "ipi", "ipi_send_cpumask", "filter",
> + filter);
> + if (retval) {
> + err_msg("Could not set ipi_send_cpumask CPU filter\n");
> + goto out_err;
> + }
Same two comments above apply here.
> + }
> +
> tep_register_event_handler(tool->trace.tep, -1, "ipi", "ipi_send_cpu",
> osnoise_ipi_cpu_handler, NULL);
>
> --
> 2.54.0
>
I was thinking that it might make sense to enable the filters also for
the trace output instance. On the other hand, it would make it
difficult to enable the event without the filter then, as specifying
"-e ipi" or similar only re-enables the event but does not remove the
filter. Maybe the better idea is to implement an option to filter any
event enabled through -e/--event only to the measurement CPU, as a
separate feature.
Tomas
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs
2026-06-30 10:14 ` Tomas Glozar
@ 2026-06-30 13:59 ` Valentin Schneider
2026-07-01 6:45 ` Tomas Glozar
0 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2026-06-30 13:59 UTC (permalink / raw)
To: Tomas Glozar
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On 30/06/26 12:14, Tomas Glozar wrote:
> On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
> <vschneid@redhat.com> wrote:
>> @@ -406,6 +408,33 @@ struct osnoise_tool *osnoise_init_top(struct common_params *params)
>> goto out_err;
>> }
>>
>> + /*
>> + * If tracing on a subset of possible CPUs, leverage the kernel filtering
>> + * infrastructure to only generate events on traced CPUs.
>> + */
>> + if (params->cpus) {
>> + char filter[MAX_PATH];
>> +
>> + snprintf(filter, ARRAY_SIZE(filter), "cpu & CPUS{%s}\n", params->cpus);
>> + retval = tracefs_event_file_write(tool->trace.inst,
>> + "ipi", "ipi_send_cpu", "filter",
>> + filter);
>> + if (retval) {
>
> retval is the number of bytes written here, so this should be "retval
> < 0" like in trace_event_enable_filter() in trace.c. Same below.
>
According to the docstring:
* Return 0 on success, and -1 on error.
but regardless yes that should be a '< 0' check to match existing code.
>> + err_msg("Could not set ipi_send_cpu CPU filter\n");
>> + goto out_err;
>
> It would be useful to have --ipi work even on older kernels that don't
> yet have your cpumask trace event filter patchset [1], for example, by
> printing a debug message that filtering is disabled and setting a flag
> instead of erroring out here. Then the code in
> osnoise_ipi_cpu_handler() can preserve the CPU_ISSET check if the flag
> is set.
>
> As --ipi is optional, we can choose to only support it on newer
> kernels, but it would be nice to have it working without the filter,
> too.
>
> [1] https://lore.kernel.org/linux-trace-kernel/20230707172155.70873-1-vschneid@redhat.com/T/#u
>
Makes sense, will do.
>> + }
>> +
>> +
>> + snprintf(filter, ARRAY_SIZE(filter), "cpumask & CPUS{%s}\n", params->cpus);
>> + retval = tracefs_event_file_write(tool->trace.inst,
>> + "ipi", "ipi_send_cpumask", "filter",
>> + filter);
>> + if (retval) {
>> + err_msg("Could not set ipi_send_cpumask CPU filter\n");
>> + goto out_err;
>> + }
>
> Same two comments above apply here.
>
>> + }
>> +
>> tep_register_event_handler(tool->trace.tep, -1, "ipi", "ipi_send_cpu",
>> osnoise_ipi_cpu_handler, NULL);
>>
>> --
>> 2.54.0
>>
>
> I was thinking that it might make sense to enable the filters also for
> the trace output instance. On the other hand, it would make it
> difficult to enable the event without the filter then, as specifying
> "-e ipi" or similar only re-enables the event but does not remove the
> filter. Maybe the better idea is to implement an option to filter any
> event enabled through -e/--event only to the measurement CPU, as a
> separate feature.
>
I had actually forgotten about applying the filters for the output
instance... I'll look into it.
> Tomas
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs
2026-06-30 13:59 ` Valentin Schneider
@ 2026-07-01 6:45 ` Tomas Glozar
2026-07-01 7:25 ` Valentin Schneider
2026-07-01 13:11 ` Steven Rostedt
0 siblings, 2 replies; 17+ messages in thread
From: Tomas Glozar @ 2026-07-01 6:45 UTC (permalink / raw)
To: Valentin Schneider
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On Tue, 30 Jun 2026 at 16:00, Valentin Schneider
<vschneid@redhat.com> wrote:
>
> On 30/06/26 12:14, Tomas Glozar wrote:
> > On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
> > <vschneid@redhat.com> wrote:
> >> @@ -406,6 +408,33 @@ struct osnoise_tool *osnoise_init_top(struct common_params *params)
> >> goto out_err;
> >> }
> >>
> >> + /*
> >> + * If tracing on a subset of possible CPUs, leverage the kernel filtering
> >> + * infrastructure to only generate events on traced CPUs.
> >> + */
> >> + if (params->cpus) {
> >> + char filter[MAX_PATH];
> >> +
> >> + snprintf(filter, ARRAY_SIZE(filter), "cpu & CPUS{%s}\n", params->cpus);
> >> + retval = tracefs_event_file_write(tool->trace.inst,
> >> + "ipi", "ipi_send_cpu", "filter",
> >> + filter);
> >> + if (retval) {
> >
> > retval is the number of bytes written here, so this should be "retval
> > < 0" like in trace_event_enable_filter() in trace.c. Same below.
> >
>
> According to the docstring:
>
> * Return 0 on success, and -1 on error.
>
> but regardless yes that should be a '< 0' check to match existing code.
>
I double-checked that and you are correct that the docstring says so,
but it's an error in the docstring. According to the manpage, it
returns the number of bytes written (i.e. positive on success, not
zero) [1]:
"RETURN VALUE
...
tracefs_event_file_write() and tracefs_event_file_append() returns
*the number of bytes written to the system/event file* or negative on
error."
The code agrees as well: in tracefs_event_file_write() there's the
wrong docstring (likely copied from another function) [2]:
/*
* tracefs_event_file_write - write to an event file
* ...
* Return 0 on success, and -1 on error.
*/
int tracefs_event_file_write(struct tracefs_instance *instance,
const char *system, const char *event,
const char *file, const char *str)
{
....
ret = tracefs_instance_file_write(instance, path, str);
free(path);
return ret;
}
but the source of the return value is tracefs_instance_file_write(),
where the docstring is correct [3]:
/**
* tracefs_instance_file_write - Write in trace file of specific instance.
* ...
* Returns the number of written bytes, or -1 in case of an error
*/
int tracefs_instance_file_write(struct tracefs_instance *instance,
const char *file, const char *str)
{
return instance_file_write(instance, file, str, O_WRONLY | O_TRUNC);
}
instance_file_write() gets the return value from write_file() [4]
which returns the return value of write() [5].
[1] https://man7.org/linux/man-pages/man3/tracefs_event_get_file.3.html
[2] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-events.c#n686
[3] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-instance.c#n532
[4] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-instance.c#n514
[5] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-instance.c#n496
So the "< 0" is required, the CPU filter doesn't work without it:
[tglozar@fedora rtla]$ uname -a
Linux fedora 7.0.12-101.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jun 11
01:32:26 UTC 2026 x86_64 GNU/Linux
[tglozar@fedora rtla]$ sudo ./rtla osnoise top -q -c 0 --ipi
Could not set ipi_send_cpu CPU filter, return value: 14
Could not init osnoise tool
[tglozar@fedora rtla]$ sudo ./rtla osnoise top -q -d 5s --ipi
Operating System Noise
...
[tglozar@fedora rtla]$
(output with additional debug print)
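To make the failure mode concrete, here is a small sketch of why "if (retval)" misfires on a write()-style return value. It does not call libtracefs: fake_filter_write() is a hypothetical stand-in that, like write(), returns the number of bytes written or -1, and the other helper names are likewise made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a write()-style call: returns the number of
 * bytes "written", or -1 on error.
 */
static int fake_filter_write(const char *str)
{
	if (!str)
		return -1;
	return (int)strlen(str);
}

/* Build the filter string the way the patch does, then "write" it. */
static int set_cpu_filter(const char *cpus)
{
	char filter[64];
	int len;

	len = snprintf(filter, sizeof(filter), "cpu & CPUS{%s}\n", cpus);
	assert(len > 0 && (size_t)len < sizeof(filter));

	return fake_filter_write(filter);
}

/* The check from the patch: any non-zero value is treated as failure. */
static int failed_buggy(int retval)
{
	return retval != 0;
}

/* The corrected check: only negative values are failures. */
static int failed_fixed(int retval)
{
	return retval < 0;
}
```

For "-c 0" the filter string is "cpu & CPUS{0}\n", which is 14 bytes long, consistent with the "return value: 14" in the debug output above: a successful write that the original check reports as an error.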
> >> + err_msg("Could not set ipi_send_cpu CPU filter\n");
> >> + goto out_err;
> >
> > It would be useful to have --ipi work even on older kernels that don't
> > yet have your cpumask trace event filter patchset [1], for example, by
> > printing a debug message that filtering is disabled and setting a flag
> > instead of erroring out here. Then the code in
> > osnoise_ipi_cpu_handler() can preserve the CPU_ISSET check if the flag
> > is set.
> >
> > As --ipi is optional, we can choose to only support it on newer
> > kernels, but it would be nice to have it working without the filter,
> > too.
> >
> > [1] https://lore.kernel.org/linux-trace-kernel/20230707172155.70873-1-vschneid@redhat.com/T/#u
> >
>
> Makes sense, will do.
>
Thanks!
> [truncated]
> >
> > I was thinking that it might make sense to enable the filters also for
> > the trace output instance. On the other hand, it would make it
> > difficult to enable the event without the filter then, as specifying
> > "-e ipi" or similar only re-enables the event but does not remove the
> > filter. Maybe the better idea is to implement an option to filter any
> > event enabled through -e/--event only to the measurement CPU, as a
> > separate feature.
> >
>
> I had actually forgotten about applying the filters for the output
> instance... I'll look into it.
>
Thanks. I gave it some more thought and realized enabling the event
without the filter should not be complicated at all. We can just
remove existing filters in trace_events_enable(), as
trace_events_enable() is called after osnoise_init_trace_tool() in
run_tool(). So that will make an explicit "-e ipi" drop the filter
from "--ipi" on the trace instance and show all IPI events. So you can
disregard my note about filtering -e options, it's not relevant here.
Tomas
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs
2026-07-01 6:45 ` Tomas Glozar
@ 2026-07-01 7:25 ` Valentin Schneider
2026-07-01 13:11 ` Steven Rostedt
1 sibling, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2026-07-01 7:25 UTC (permalink / raw)
To: Tomas Glozar
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On 01/07/26 08:45, Tomas Glozar wrote:
> On Tue, 30 Jun 2026 at 16:00, Valentin Schneider
> <vschneid@redhat.com> wrote:
>>
>> On 30/06/26 12:14, Tomas Glozar wrote:
>> > On Wed, 17 Jun 2026 at 15:18, Valentin Schneider
>> > <vschneid@redhat.com> wrote:
>> >> @@ -406,6 +408,33 @@ struct osnoise_tool *osnoise_init_top(struct common_params *params)
>> >> goto out_err;
>> >> }
>> >>
>> >> + /*
>> >> + * If tracing on a subset of possible CPUs, leverage the kernel filtering
>> >> + * infrastructure to only generate events on traced CPUs.
>> >> + */
>> >> + if (params->cpus) {
>> >> + char filter[MAX_PATH];
>> >> +
>> >> + snprintf(filter, ARRAY_SIZE(filter), "cpu & CPUS{%s}\n", params->cpus);
>> >> + retval = tracefs_event_file_write(tool->trace.inst,
>> >> + "ipi", "ipi_send_cpu", "filter",
>> >> + filter);
>> >> + if (retval) {
>> >
>> > retval is the number of bytes written here, so this should be "retval
>> > < 0" like in trace_event_enable_filter() in trace.c. Same below.
>> >
>>
>> According to the docstring:
>>
>> * Return 0 on success, and -1 on error.
>>
>> but regardless yes that should be a '< 0' check to match existing code.
>>
>
> I double-checked that and you are correct that the docstring says so,
> but it's an error in the docstring. According to the manpage, it
> returns the number of bytes written (i.e. positive on success, not
> zero) [1]:
>
> "RETURN VALUE
> ...
> tracefs_event_file_write() and tracefs_event_file_append() returns
> *the number of bytes written to the system/event file* or negative on
> error."
>
> The code agrees as well: in tracefs_event_file_write() there's the
> wrong docstring (likely copied from another function) [2]:
>
> /*
> * tracefs_event_file_write - write to an event file
> * ...
> * Return 0 on success, and -1 on error.
> */
> int tracefs_event_file_write(struct tracefs_instance *instance,
> const char *system, const char *event,
> const char *file, const char *str)
> {
> ....
> ret = tracefs_instance_file_write(instance, path, str);
> free(path);
> return ret;
> }
>
> but the source of the return value is tracefs_instance_file_write(),
> where the docstring is correct [3]:
>
> /**
> * tracefs_instance_file_write - Write in trace file of specific instance.
> * ...
> * Returns the number of written bytes, or -1 in case of an error
> */
> int tracefs_instance_file_write(struct tracefs_instance *instance,
> const char *file, const char *str)
> {
> return instance_file_write(instance, file, str, O_WRONLY | O_TRUNC);
> }
>
> instance_file_write() gets the return value from write_file() [4]
> which returns the return value of write() [5].
>
> [1] https://man7.org/linux/man-pages/man3/tracefs_event_get_file.3.html
> [2] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-events.c#n686
> [3] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-instance.c#n532
> [4] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-instance.c#n514
> [5] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-instance.c#n496
>
> So the "< 0" is required, the CPU filter doesn't work without it:
>
> [tglozar@fedora rtla]$ uname -a
> Linux fedora 7.0.12-101.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jun 11
> 01:32:26 UTC 2026 x86_64 GNU/Linux
> [tglozar@fedora rtla]$ sudo ./rtla osnoise top -q -c 0 --ipi
> Could not set ipi_send_cpu CPU filter, return value: 14
> Could not init osnoise tool
> [tglozar@fedora rtla]$ sudo ./rtla osnoise top -q -d 5s --ipi
> Operating System Noise
> ...
> [tglozar@fedora rtla]$
>
> (output with additional debug print)
>
Darnit, you're right! I'm surprised I didn't catch this while
testing. Thanks!
>> >> + err_msg("Could not set ipi_send_cpu CPU filter\n");
>> >> + goto out_err;
>> >
>> > It would be useful to have --ipi work even on older kernels that don't
>> > yet have your cpumask trace event filter patchset [1], for example, by
>> > printing a debug message that filtering is disabled and setting a flag
>> > instead of erroring out here. Then the code in
>> > osnoise_ipi_cpu_handler() can preserve the CPU_ISSET check if the flag
>> > is set.
>> >
>> > As --ipi is optional, we can choose to only support it on newer
>> > kernels, but it would be nice to have it working without the filter,
>> > too.
>> >
>> > [1] https://lore.kernel.org/linux-trace-kernel/20230707172155.70873-1-vschneid@redhat.com/T/#u
>> >
>>
>> Makes sense, will do.
>>
>
> Thanks!
>
>> [truncated]
>> >
>> > I was thinking that it might make sense to enable the filters also for
>> > the trace output instance. On the other hand, it would make it
>> > difficult to enable the event without the filter then, as specifying
>> > "-e ipi" or similar only re-enables the event but does not remove the
>> > filter. Maybe the better idea is to implement an option to filter any
>> > event enabled through -e/--event only to the measurement CPU, as a
>> > separate feature.
>> >
>>
>> I had actually forgotten about applying the filters for the output
>> instance... I'll look into it.
>>
>
> Thanks. I gave it some more thought and realized enabling the event
> without the filter should not be complicated at all. We can just
> remove existing filters in trace_events_enable(), as
> trace_events_enable() is called after osnoise_init_trace_tool() in
> run_tool(). So that will make an explicit "-e ipi" drop the filter
> from "--ipi" on the trace instance and show all IPI events. So you can
> disregard my note about filtering -e options, it's not relevant here.
>
That makes sense, that's the most intuitive option.
> Tomas
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs
2026-07-01 6:45 ` Tomas Glozar
2026-07-01 7:25 ` Valentin Schneider
@ 2026-07-01 13:11 ` Steven Rostedt
1 sibling, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2026-07-01 13:11 UTC (permalink / raw)
To: Tomas Glozar
Cc: Valentin Schneider, linux-kernel, linux-trace-kernel,
Masami Hiramatsu, Mathieu Desnoyers, Costa Shulyupin,
Crystal Wood, John Kacur, Ivan Pravdin, Jonathan Corbet
On Wed, 1 Jul 2026 08:45:40 +0200
Tomas Glozar <tglozar@redhat.com> wrote:
> I double-checked that and you are correct that the docstring says so,
> but it's an error in the docstring. According to the manpage, it
> returns the number of bytes written (i.e. positive on success, not
> zero) [1]:
Note, the man page is considered the source of "truth".
>
> "RETURN VALUE
> ...
> tracefs_event_file_write() and tracefs_event_file_append() returns
> *the number of bytes written to the system/event file* or negative on
> error."
>
> The code agrees as well: in tracefs_event_file_write() there's the
> wrong docstring (likely copied from another function) [2]:
>
> /*
> * tracefs_event_file_write - write to an event file
> * ...
> * Return 0 on success, and -1 on error.
Ug, that's a bug and needs to be fixed.
Thanks for catching this. I need to spend some time to catch up on the user
side of tracing. There's a few new bugzillas and patches I need to apply.
-- Steve
> */
> int tracefs_event_file_write(struct tracefs_instance *instance,
> const char *system, const char *event,
> const char *file, const char *str)
> {
> ....
> ret = tracefs_instance_file_write(instance, path, str);
> free(path);
> return ret;
> }
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs
2026-06-17 13:17 [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Valentin Schneider
` (3 preceding siblings ...)
2026-06-17 13:17 ` [RFC PATCH v2 4/4] rtla/osnoise: Leverage IPI event filters when tracing a subset of CPUs Valentin Schneider
@ 2026-06-26 10:26 ` Steven Rostedt
2026-06-26 12:25 ` Valentin Schneider
4 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2026-06-26 10:26 UTC (permalink / raw)
To: Valentin Schneider
Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu,
Mathieu Desnoyers, Tomas Glozar, Costa Shulyupin, Crystal Wood,
John Kacur, Ivan Pravdin, Jonathan Corbet
On Wed, 17 Jun 2026 15:17:55 +0200
Valentin Schneider <vschneid@redhat.com> wrote:
> Hi folks,
>
> So I've seen a few times now reports of latency spikes caused by IPIs, usually
> because of isolation misconfiguration, but only detected at the tail end of
> e.g. a 24h timerlat run.
>
> It's not because those IPIs are rare, but rather that they don't by themselves
> cause a monitored CPU to reach the latency threshold; it's usually a combined
> interference that gets us there.
>
> I'd like to make it easier to detect such misconfigurations and thus IPIs
> hitting supposedly-isolated CPUs. I initially kludged a timerlat option to stop
> tracing as soon as an IPI was sent to a monitored CPU, regardless of the latency
> threshold. It sort of did the trick, but Tomáš convinced me timerlat wasn't
> really the place for that.
>
> So here's IPI tracking added to osnoise. This time around fully in userspace, as
> Tomáš pointed out to me that this will make it a lot easier to deploy to older
> kernels.
>
> Based on top of linux/next at 'next-20260616' to have the latest libsubcmd
> changes.
>
Hi Valentin,
My new job actually makes me very interested in IPI interference, and
this patch set looks *very* interesting. I'm currently finishing up my
orientation and hopefully next week I can start catching up on all my
email.
I'll try to take a deeper look at this in the coming weeks.
-- Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs
2026-06-26 10:26 ` [RFC PATCH v2 0/4] tracing/osnoise: Track IPIs Steven Rostedt
@ 2026-06-26 12:25 ` Valentin Schneider
0 siblings, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2026-06-26 12:25 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu,
Mathieu Desnoyers, Tomas Glozar, Costa Shulyupin, Crystal Wood,
John Kacur, Ivan Pravdin, Jonathan Corbet
On 26/06/26 06:26, Steven Rostedt wrote:
> On Wed, 17 Jun 2026 15:17:55 +0200
> Valentin Schneider <vschneid@redhat.com> wrote:
>
>> Hi folks,
>>
>> So I've seen a few times now reports of latency spikes caused by IPIs, usually
>> because of isolation misconfiguration, but only detected at the tail end of
>> e.g. a 24h timerlat run.
>>
>> It's not because those IPIs are rare, but rather that they don't by themselves
>> cause a monitored CPU to reach the latency threshold; it's usually a combined
>> interference that gets us there.
>>
>> I'd like to make it easier to detect such misconfigurations and thus IPIs
>> hitting supposedly-isolated CPUs. I initially kludged a timerlat option to stop
>> tracing as soon as an IPI was sent to a monitored CPU, regardless of the latency
>> threshold. It sort of did the trick, but Tomáš convinced me timerlat wasn't
>> really the place for that.
>>
>> So here's IPI tracking added to osnoise. This time around fully in userspace, as
>> Tomáš pointed out to me that this will make it a lot easier to deploy to older
>> kernels.
>>
>> Based on top of linux/next at 'next-20260616' to have the latest libsubcmd
>> changes.
>>
>
> Hi Valentin,
>
> My new job actually makes me very interested in IPI interference, and
> this patch set looks *very* interesting. I'm currently finishing up my
> orientation and hopefully next week I can start catching up on all my
> email.
>
Welcome back :-) If IPIs are your thing, you may also have a look at
[1]. I'm working on a v10 following some (surprisingly) useful feedback
from Sashiko.
[1]: https://lore.kernel.org/lkml/20260505082355.1982003-1-vschneid@redhat.com/
> I'll try to take a deeper look at this in the coming weeks.
>
Thanks!
> -- Steve
^ permalink raw reply [flat|nested] 17+ messages in thread