* [PATCH v2] perf trace: Introduce --show-cpu option to display cpu id
@ 2026-04-23 19:24 Aaron Tomlin
From: Aaron Tomlin @ 2026-04-23 19:24 UTC
To: peterz, mingo, acme, namhyung
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
james.clark, howardchu95, atomlin, neelx, sean, linux-perf-users,
linux-kernel
When tracing system-wide workloads or specific events, it is useful to
know which CPU an event was recorded on. Currently, perf trace omits
CPU information from its output by default.
Introduce a new "--show-cpu" command-line option. When provided, this
flag extracts the CPU from the perf sample and prints it in a "[000]"
format immediately following the timestamp. This mirrors the behaviour of
other tracing tools like ftrace and perf script. For example:
# perf trace -e sched:sched_switch --max-events 5 --show-cpu
0.000 [002] :0/0 sched:sched_switch(prev_comm: "swapper/2", prev_prio: 120, next_comm: "rcu_preempt", next_pid: 16 (rcu_preempt), next_prio: 120)
0.009 [002] rcu_preempt/16 sched:sched_switch(prev_comm: "rcu_preempt", prev_pid: 16 (rcu_preempt), prev_prio: 120, prev_state: 128, next_comm: "swapper/2", next_prio: 120)
0.033 [002] :0/0 sched:sched_switch(prev_comm: "swapper/2", prev_prio: 120, next_comm: "kworker/u32:48", next_pid: 35840 (kworker/u32:48-), next_prio: 120)
0.041 [002] kworker/u32:48/35840 sched:sched_switch(prev_comm: "kworker/u32:48", prev_pid: 35840 (kworker/u32:48-), prev_prio: 120, prev_state: 128, next_comm: "swapper/2", next_prio: 120)
0.045 [002] :0/0 sched:sched_switch(prev_comm: "swapper/2", prev_prio: 120, next_comm: "kworker/u32:48", next_pid: 35840 (kworker/u32:48-), next_prio: 120)
The feature is implemented strictly as an opt-in toggle to prevent
cluttering the standard output and to preserve backwards compatibility
for scripts parsing the default output format.
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
Changes since v1 [1]:
- Fixed a silent failure where core trace events (e.g., system calls and
page faults) ignored the --show-cpu flag. All primary workload events
now display the CPU ID when --show-cpu is given
- Updated all core event handlers (i.e., trace__sys_enter,
trace__sys_exit, trace__pgfault and trace__printf_interrupted_entry)
to extract sample->cpu and pass it down into the entry head
formatter
- Abstracted the CPU formatting logic into a dedicated, documented helper
function trace__fprintf_cpu()
- Added a safety guard to verify the CPU data is actually present
(cpu != (u32)-1) before attempting to print it, preventing dummy
values from polluting the output when sample data is missing
[1]: https://lore.kernel.org/linux-perf-users/20260421203934.64032-1-atomlin@atomlin.com/
---
tools/perf/Documentation/perf-trace.txt | 3 ++
tools/perf/builtin-trace.c | 49 ++++++++++++++++++++++---
2 files changed, 47 insertions(+), 5 deletions(-)
diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index 892c82a9bf40..d0b6c771a1b9 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -199,6 +199,9 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
--show-on-off-events::
Show the --switch-on/off events too.
+--show-cpu::
+ Show cpu id.
+
--max-stack::
Set the stack depth limit when parsing the callchain, anything
beyond the specified depth will be ignored. Note that at this point
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index e58c49d047a2..6314332ad711 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -217,6 +217,7 @@ struct trace {
bool kernel_syscallchains;
s16 args_alignment;
bool show_tstamp;
+ bool show_cpu;
bool show_duration;
bool show_zeros;
bool show_arg_names;
@@ -1893,6 +1894,27 @@ static size_t trace__fprintf_tstamp(struct trace *trace, u64 tstamp, FILE *fp)
return fprintf(fp, " ? ");
}
+/**
+ * trace__fprintf_cpu - Print the CPU ID to a given file stream
+ * @cpu: The CPU ID to print
+ * @fp: The file stream to write to
+ *
+ * Formats and prints the specified CPU ID enclosed in brackets
+ * (e.g., "[003] ") to the provided file pointer. It is used to
+ * align and display the CPU ID consistently within the trace output.
+ *
+ * Return: The number of characters printed.
+ */
+static size_t trace__fprintf_cpu(u32 cpu, FILE *fp)
+{
+ size_t printed = 0;
+
+ if (cpu >= 0)
+ printed += fprintf(fp, "[%03d] ", cpu);
+
+ return printed;
+}
+
static pid_t workload_pid = -1;
static volatile sig_atomic_t done = false;
static volatile sig_atomic_t interrupted = false;
@@ -1923,12 +1945,15 @@ static size_t trace__fprintf_comm_tid(struct trace *trace, struct thread *thread
}
static size_t trace__fprintf_entry_head(struct trace *trace, struct thread *thread,
- u64 duration, bool duration_calculated, u64 tstamp, FILE *fp)
+ u64 duration, bool duration_calculated,
+ u64 tstamp, u32 cpu, FILE *fp)
{
size_t printed = 0;
if (trace->show_tstamp)
printed = trace__fprintf_tstamp(trace, tstamp, fp);
+ if (trace->show_cpu && cpu != (u32)-1)
+ printed += trace__fprintf_cpu(cpu, fp);
if (trace->show_duration)
printed += fprintf_duration(duration, duration_calculated, fp);
return printed + trace__fprintf_comm_tid(trace, thread, fp);
@@ -2707,7 +2732,9 @@ static int trace__printf_interrupted_entry(struct trace *trace)
if (!ttrace->entry_pending)
return 0;
- printed = trace__fprintf_entry_head(trace, trace->current, 0, false, ttrace->entry_time, trace->output);
+ printed = trace__fprintf_entry_head(trace, trace->current, 0, false,
+ ttrace->entry_time, 0,
+ trace->output);
printed += len = fprintf(trace->output, "%s)", ttrace->entry_str);
if (len < trace->args_alignment - 4)
@@ -2835,7 +2862,9 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
if (!(trace->duration_filter || trace->summary_only || trace->failure_only || trace->min_stack)) {
int alignment = 0;
- trace__fprintf_entry_head(trace, thread, 0, false, ttrace->entry_time, trace->output);
+ trace__fprintf_entry_head(trace, thread, 0, false,
+ ttrace->entry_time,
+ sample->cpu, trace->output);
printed = fprintf(trace->output, "%s)", ttrace->entry_str);
if (trace->args_alignment > printed)
alignment = trace->args_alignment - printed;
@@ -2980,7 +3009,9 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
if (trace->summary_only || (ret >= 0 && trace->failure_only))
goto out;
- trace__fprintf_entry_head(trace, thread, duration, duration_calculated, ttrace->entry_time, trace->output);
+ trace__fprintf_entry_head(trace, thread, duration,
+ duration_calculated, ttrace->entry_time,
+ sample->cpu, trace->output);
if (ttrace->entry_pending) {
printed = fprintf(trace->output, "%s", ttrace->entry_str);
@@ -3280,6 +3311,9 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
trace__printf_interrupted_entry(trace);
trace__fprintf_tstamp(trace, sample->time, trace->output);
+ if (trace->show_cpu && sample->cpu != (u32)-1)
+ fprintf(trace->output, "[%03d] ", sample->cpu);
+
if (trace->trace_syscalls && trace->show_duration)
fprintf(trace->output, "( ): ");
@@ -3405,7 +3439,8 @@ static int trace__pgfault(struct trace *trace,
thread__find_symbol(thread, sample->cpumode, sample->ip, &al);
- trace__fprintf_entry_head(trace, thread, 0, true, sample->time, trace->output);
+ trace__fprintf_entry_head(trace, thread, 0, true, sample->time,
+ sample->cpu, trace->output);
fprintf(trace->output, "%sfault [",
evsel->core.attr.config == PERF_COUNT_SW_PAGE_FAULTS_MAJ ?
@@ -5432,6 +5467,7 @@ int cmd_trace(int argc, const char **argv)
OPT_CALLBACK('m', "mmap-pages", &trace.opts.mmap_pages, "pages",
"number of mmap data pages", evlist__parse_mmap_pages),
OPT_STRING('u', "uid", &trace.uid_str, "user", "user to profile"),
+ OPT_BOOLEAN(0, "show-cpu", &trace.show_cpu, "show cpu id"),
OPT_CALLBACK(0, "duration", &trace, "float",
"show only events with duration > N.M ms",
trace__set_duration),
@@ -5566,6 +5602,9 @@ int cmd_trace(int argc, const char **argv)
goto out;
}
+ if (trace.show_cpu)
+ trace.opts.sample_cpu = true;
+
if ((nr_cgroups || trace.cgroup) && !trace.opts.target.system_wide) {
usage_with_options_msg(trace_usage, trace_options,
"cgroup monitoring only available in system-wide mode");
--
2.51.0
* Re: [PATCH v2] perf trace: Introduce --show-cpu option to display cpu id
From: sashiko-bot @ 2026-04-23 22:42 UTC
To: Aaron Tomlin; +Cc: linux-perf-users
Sashiko AI review found 2 potential issue(s):
- [High] Hardcoded CPU 0 misattributes interrupted syscalls.
- [Medium] Tautological check on unsigned integer and format string mismatch.
--
commit 1c192dba8ca61a6516bab340989a37ca856bfece
Author: Aaron Tomlin <atomlin@atomlin.com>
perf trace: Introduce --show-cpu option to display cpu id
This commit adds a new --show-cpu command-line option to perf trace.
When enabled, it extracts the CPU ID from the perf sample and includes it
in the trace output immediately following the timestamp.
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 311d9da9896a41..264e4762ea28d2 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
[ ... ]
> +static size_t trace__fprintf_cpu(u32 cpu, FILE *fp)
> +{
> + size_t printed = 0;
> +
> + if (cpu >= 0)
Since cpu is an unsigned 32-bit integer (u32), is this comparison always
true? This might trigger a compiler warning for tautological comparisons.
> + printed += fprintf(fp, "[%03d] ", cpu);
Given that cpu is unsigned, should this use the %03u format specifier
instead of %03d to avoid a format signedness mismatch?
> +
> + return printed;
> +}
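The two points above can be illustrated with a minimal, standalone sketch.
It is not the patch's code: format_cpu() and CPU_UNKNOWN are hypothetical
names, CPU_UNKNOWN stands in for the "(u32)-1" sentinel, and the helper
writes to a buffer instead of a FILE so the result is easy to check. It
assumes the only meaningful guard on an unsigned CPU ID is the sentinel
comparison, since "cpu >= 0" can never be false.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the "(u32)-1" sentinel used in the patch. */
#define CPU_UNKNOWN UINT32_MAX

/*
 * Format a CPU ID as "[NNN] " into buf.
 *
 * Returns the number of characters stored (excluding the NUL), or 0 when
 * the CPU is unknown. The sentinel test replaces "cpu >= 0", which is
 * always true for an unsigned type.
 */
static size_t format_cpu(char *buf, size_t len, uint32_t cpu)
{
	int n;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	if (cpu == CPU_UNKNOWN)
		return 0;

	/* PRIu32 is the unsigned conversion matching uint32_t. */
	n = snprintf(buf, len, "[%03" PRIu32 "] ", cpu);
	if (n < 0)
		return 0;

	/* snprintf reports the untruncated length; clamp to what fits. */
	return (size_t)n >= len ? len - 1 : (size_t)n;
}
```

With a 16-byte buffer, format_cpu(buf, sizeof(buf), 3) stores "[003] " and
returns 6, while passing CPU_UNKNOWN stores an empty string and returns 0.
Inside tools/perf, where u32 is an unsigned int, plain "%03u" would be the
equivalent specifier.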
[ ... ]
> @@ -2704,7 +2729,9 @@ static int trace__printf_interrupted_entry(struct trace *trace)
> if (!ttrace->entry_pending)
> return 0;
>
> - printed = trace__fprintf_entry_head(trace, trace->current, 0, false, ttrace->entry_time, trace->output);
> + printed = trace__fprintf_entry_head(trace, trace->current, 0, false,
> + ttrace->entry_time, 0,
> + trace->output);
Does passing a hardcoded 0 for the cpu argument cause all interrupted events
to be incorrectly attributed to CPU 0 when the --show-cpu option is enabled?
If the actual entry CPU isn't saved in struct thread_trace, would it be
better to pass (u32)-1 here so the CPU printout is properly omitted,
matching the check inside trace__fprintf_entry_head()?
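One way to picture the alternative is a sketch that remembers the CPU at
syscall entry and falls back to the sentinel otherwise. Everything here is
hypothetical: struct pending_entry is a reduced stand-in rather than the
real struct thread_trace, and the entry_cpu field, the helper names and
CPU_UNKNOWN are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "(u32)-1" sentinel used in the patch. */
#define CPU_UNKNOWN UINT32_MAX

/* Hypothetical, reduced per-thread state; not the real struct thread_trace. */
struct pending_entry {
	bool     pending;     /* a syscall entry was seen but not yet printed */
	uint64_t entry_time;  /* timestamp of that entry */
	uint32_t entry_cpu;   /* CPU recorded at entry (assumed new field) */
};

/* Record the CPU alongside the entry time when the syscall is entered. */
static void pending_entry_enter(struct pending_entry *pe, uint64_t time,
				uint32_t cpu)
{
	assert(pe != NULL);
	pe->pending = true;
	pe->entry_time = time;
	pe->entry_cpu = cpu;
}

/* Clear the state once the entry has been printed or completed. */
static void pending_entry_exit(struct pending_entry *pe)
{
	assert(pe != NULL);
	pe->pending = false;
	pe->entry_cpu = CPU_UNKNOWN;
}

/*
 * CPU to hand to the entry-head formatter for an interrupted entry:
 * the saved CPU if one is pending, otherwise the sentinel so that the
 * "[NNN]" column is omitted rather than shown as CPU 0.
 */
static uint32_t pending_entry_cpu(const struct pending_entry *pe)
{
	assert(pe != NULL);
	return pe->pending ? pe->entry_cpu : CPU_UNKNOWN;
}
```

If no such field is added, passing the sentinel directly at the call site
has the same effect on the output: the CPU column is skipped for
interrupted entries instead of reporting a CPU that was never sampled.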
[ ... ]
> @@ -3277,6 +3308,9 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
> trace__printf_interrupted_entry(trace);
> trace__fprintf_tstamp(trace, sample->time, trace->output);
>
> + if (trace->show_cpu && sample->cpu != (u32)-1)
> + fprintf(trace->output, "[%03d] ", sample->cpu);
Similar to the function above, should this use %03u since sample->cpu is
an unsigned integer?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260423192445.131351-1-atomlin@atomlin.com?part=1