public inbox for linux-perf-users@vger.kernel.org
* [PATCH] perf trace: Introduce --show-cpu option to display cpu id
@ 2026-04-21 20:39 Aaron Tomlin
  2026-04-21 21:02 ` sashiko-bot
From: Aaron Tomlin @ 2026-04-21 20:39 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung
  Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
	james.clark, howardchu95, atomlin, neelx, linux-perf-users,
	linux-kernel

When tracing system-wide workloads or specific events, it is useful to
know which CPU executed each event. Currently, perf trace output omits
CPU information.

Introduce a new "--show-cpu" command-line option. When provided, this
flag extracts the CPU from the perf sample and prints it in a "[000]"
format immediately following the timestamp. This mirrors the behaviour of
other tracing tools like ftrace and perf script. For example:

  # perf trace -e sched:sched_switch --max-events 5 --show-cpu
       0.000 [002] :0/0 sched:sched_switch(prev_comm: "swapper/2", prev_prio: 120, next_comm: "rcu_preempt", next_pid: 16 (rcu_preempt), next_prio: 120)
       0.009 [002] rcu_preempt/16 sched:sched_switch(prev_comm: "rcu_preempt", prev_pid: 16 (rcu_preempt), prev_prio: 120, prev_state: 128, next_comm: "swapper/2", next_prio: 120)
       0.033 [002] :0/0 sched:sched_switch(prev_comm: "swapper/2", prev_prio: 120, next_comm: "kworker/u32:48", next_pid: 35840 (kworker/u32:48-), next_prio: 120)
       0.041 [002] kworker/u32:48/35840 sched:sched_switch(prev_comm: "kworker/u32:48", prev_pid: 35840 (kworker/u32:48-), prev_prio: 120, prev_state: 128, next_comm: "swapper/2", next_prio: 120)
       0.045 [002] :0/0 sched:sched_switch(prev_comm: "swapper/2", prev_prio: 120, next_comm: "kworker/u32:48", next_pid: 35840 (kworker/u32:48-), next_prio: 120)

The feature is implemented strictly as an opt-in toggle to prevent
cluttering the standard output and to preserve backwards compatibility
for scripts parsing the default output format.

Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
 tools/perf/Documentation/perf-trace.txt | 3 +++
 tools/perf/builtin-trace.c              | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index 892c82a9bf40..d0b6c771a1b9 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -199,6 +199,9 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
 --show-on-off-events::
 	Show the --switch-on/off events too.
 
+--show-cpu::
+	Show cpu id.
+
 --max-stack::
         Set the stack depth limit when parsing the callchain, anything
         beyond the specified depth will be ignored. Note that at this point
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index e58c49d047a2..be4104e88285 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -217,6 +217,7 @@ struct trace {
 	bool			kernel_syscallchains;
 	s16			args_alignment;
 	bool			show_tstamp;
+	bool			show_cpu;
 	bool			show_duration;
 	bool			show_zeros;
 	bool			show_arg_names;
@@ -3280,6 +3281,9 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
 	trace__printf_interrupted_entry(trace);
 	trace__fprintf_tstamp(trace, sample->time, trace->output);
 
+	if (trace->show_cpu)
+		fprintf(trace->output, "[%03d] ", sample->cpu);
+
 	if (trace->trace_syscalls && trace->show_duration)
 		fprintf(trace->output, "(         ): ");
 
@@ -5432,6 +5436,7 @@ int cmd_trace(int argc, const char **argv)
 	OPT_CALLBACK('m', "mmap-pages", &trace.opts.mmap_pages, "pages",
 		     "number of mmap data pages", evlist__parse_mmap_pages),
 	OPT_STRING('u', "uid", &trace.uid_str, "user", "user to profile"),
+	OPT_BOOLEAN(0, "show-cpu", &trace.show_cpu, "show cpu id"),
 	OPT_CALLBACK(0, "duration", &trace, "float",
 		     "show only events with duration > N.M ms",
 		     trace__set_duration),
-- 
2.51.0



* Re: [PATCH] perf trace: Introduce --show-cpu option to display cpu id
  2026-04-21 20:39 [PATCH] perf trace: Introduce --show-cpu option to display cpu id Aaron Tomlin
@ 2026-04-21 21:02 ` sashiko-bot
From: sashiko-bot @ 2026-04-21 21:02 UTC (permalink / raw)
  To: Aaron Tomlin; +Cc: linux-perf-users

Sashiko AI review found 2 potential issue(s):
- [Medium] The `--show-cpu` option does not request the PERF_SAMPLE_CPU attribute and prints the u32 sample->cpu with the signed `%03d` specifier, so `[-01]` is printed when CPU data is missing.
- [Medium] The `--show-cpu` feature has no effect on the core events of `perf trace` (system calls and page faults).
--

commit 85a27484af518bd604f94598d0e741862be09625
Author: Aaron Tomlin <atomlin@atomlin.com>

perf trace: Introduce --show-cpu option to display cpu id

This commit introduces a new "--show-cpu" command-line option to perf trace.
When provided, it extracts the CPU from the perf sample and prints it
alongside the timestamp.

> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c

[ ... ]

> @@ -3277,6 +3278,9 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
>  	trace__printf_interrupted_entry(trace);
>  	trace__fprintf_tstamp(trace, sample->time, trace->output);
>  
> +	if (trace->show_cpu)
> +		fprintf(trace->output, "[%03d] ", sample->cpu);
> +

When tracing specific processes using non-tracepoint events,
target__has_cpu() is false and CPU samples might not be recorded. If the CPU
data is missing, evsel__parse_sample() assigns a default value of -1
(0xFFFFFFFF) to the u32 sample->cpu field.

Since %03d is a signed format specifier, will this cause [-01] to be printed
instead of a valid CPU id?
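
The formatting concern can be reproduced outside perf. The following is a
minimal sketch, not code from the tree: both helper names are hypothetical,
and the "[---]" placeholder for a missing CPU is an assumption made for
illustration only. It contrasts the patch's signed conversion with an
unsigned one that checks for the (u32)-1 sentinel first.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the patch: the u32 value is printed
 * through a signed conversion, as "[%03d] " does in the diff above.
 */
static int format_cpu_prefix_signed(char *buf, size_t len, uint32_t cpu)
{
	return snprintf(buf, len, "[%03d] ", (int)cpu);
}

/*
 * Hypothetical alternative: treat (u32)-1 as "no CPU recorded" and print
 * a placeholder, otherwise use the unsigned conversion %03u.
 * The "[---]" placeholder is an assumption, not existing perf behaviour.
 */
static int format_cpu_prefix(char *buf, size_t len, uint32_t cpu)
{
	if (cpu == UINT32_MAX)
		return snprintf(buf, len, "[---] ");
	return snprintf(buf, len, "[%03u] ", cpu);
}
```

On common two's-complement platforms, format_cpu_prefix_signed() called with
UINT32_MAX writes "[-01] ", while format_cpu_prefix() writes "[002] " for
CPU 2 and "[---] " for the sentinel.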

Furthermore, this modifies trace__event_handler(), which processes secondary
events. System calls and page faults, which make up the primary workload of
perf trace, are processed by dedicated handlers like trace__sys_enter(),
trace__sys_exit(), and trace__pgfault().

Because those core handlers format their output prefixes using
trace__fprintf_entry_head(), will --show-cpu silently fail to display CPU
information for the default trace output?
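
One way to avoid the divergence would be a single prefix writer shared by
every output path. The sketch below only models the idea: struct trace_model
and model_prefix() are hypothetical stand-ins, the timestamp format is an
assumption, and nothing here reflects the real layout of struct trace or the
real signature of trace__fprintf_entry_head().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the relevant field of struct trace; not the real layout. */
struct trace_model {
	bool show_cpu;
};

/*
 * Hypothetical shared prefix writer: timestamp first, then the optional
 * CPU id. In the real code, equivalent logic would need to be reachable
 * from both the tracepoint handler and the syscall/page-fault entry head.
 */
static int model_prefix(const struct trace_model *t, char *buf, size_t len,
			double tstamp_ms, uint32_t cpu)
{
	int n = snprintf(buf, len, "%10.3f ", tstamp_ms);

	/* Stop on error or truncation so the second write stays in bounds. */
	if (n < 0 || (size_t)n >= len)
		return n;
	if (t->show_cpu)
		n += snprintf(buf + n, len - (size_t)n, "[%03u] ", cpu);
	return n;
}
```

With a single writer like this, a flag such as show_cpu takes effect for
every event type that uses it, rather than only for the handler that was
patched.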

[ ... ]

> @@ -5422,6 +5426,7 @@ int cmd_trace(int argc, const char **argv)
>  	OPT_CALLBACK('m', "mmap-pages", &trace.opts.mmap_pages, "pages",
>  		     "number of mmap data pages", evlist__parse_mmap_pages),
>  	OPT_STRING('u', "uid", &trace.uid_str, "user", "user to profile"),
> +	OPT_BOOLEAN(0, "show-cpu", &trace.show_cpu, "show cpu id"),

Does trace.opts.sample_cpu need to be set to true when --show-cpu is
provided, to ensure the kernel is instructed to actually record the CPU
samples?

>  	OPT_CALLBACK(0, "duration", &trace, "float",
>  		     "show only events with duration > N.M ms",
>  		     trace__set_duration),

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260421203934.64032-1-atomlin@atomlin.com?part=1

