From: Namhyung Kim <namhyung@kernel.org>
To: Steven Rostedt <rostedt@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
bpf@vger.kernel.org, x86@kernel.org,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrii Nakryiko <andrii@kernel.org>,
Indu Bhagat <indu.bhagat@oracle.com>,
"Jose E. Marchesi" <jemarch@gnu.org>,
Beau Belgrave <beaub@linux.microsoft.com>,
Jens Remus <jremus@linux.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Jens Axboe <axboe@kernel.dk>, Florian Weimer <fweimer@redhat.com>,
Sam James <sam@gentoo.org>
Subject: Re: [PATCH v15 8/8] perf tools: Merge deferred user callchains
Date: Tue, 2 Sep 2025 00:07:32 -0700
Message-ID: <aLaXtBFCkzbKFr0B@z2>
In-Reply-To: <20250825180802.725570056@kernel.org>
On Mon, Aug 25, 2025 at 02:06:46PM -0400, Steven Rostedt wrote:
> From: Namhyung Kim <namhyung@kernel.org>
>
> Save samples with deferred callchains in a separate list and deliver
> them after merging the user callchains. If users don't want to merge
> they can set tool->merge_deferred_callchains to false to prevent the
> behavior.
>
> With the previous result, perf script now shows the merged callchains.
>
> $ perf script
> perf 801 [000] 18.031793: 1 cycles:P:
> ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> ...
>
> The old output can be obtained with the --no-merge-callchains option.
> perf report also shows the user callchain entry at the end.
>
> $ perf report --no-children --percent-limit=0 --stdio -q -S __intel_pmu_enable_all.isra.0
> # symbol: __intel_pmu_enable_all.isra.0
> 0.00% perf [kernel.kallsyms]
> |
> ---__intel_pmu_enable_all.isra.0
> perf_ctx_enable
> event_function
> remote_function
> generic_exec_single
> smp_call_function_single
> event_function_call
> perf_event_for_each_child
> _perf_ioctl
> perf_ioctl
> __x64_sys_ioctl
> do_syscall_64
> entry_SYSCALL_64
> __GI___ioctl
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> Changes since v14: https://lore.kernel.org/20250718164324.925232448@kernel.org
>
> - Use both TID and cookie to match the request event to the deferred
> unwind event.
>
> tools/perf/Documentation/perf-script.txt | 5 ++
> tools/perf/builtin-script.c | 5 +-
> tools/perf/util/callchain.c | 24 +++++++++
> tools/perf/util/callchain.h | 3 ++
> tools/perf/util/evlist.c | 1 +
> tools/perf/util/evlist.h | 1 +
> tools/perf/util/session.c | 64 +++++++++++++++++++++++-
> tools/perf/util/tool.c | 1 +
> tools/perf/util/tool.h | 1 +
> 9 files changed, 103 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
> index 28bec7e78bc8..03d112960632 100644
> --- a/tools/perf/Documentation/perf-script.txt
> +++ b/tools/perf/Documentation/perf-script.txt
> @@ -527,6 +527,11 @@ include::itrace.txt[]
> The known limitations include exception handing such as
> setjmp/longjmp will have calls/returns not match.
>
> +--merge-callchains::
> + Enable merging deferred user callchains if available. This is the
> + default behavior. If you want to see separate CALLCHAIN_DEFERRED
> + records for some reason, use --no-merge-callchains explicitly.
> +
> :GMEXAMPLECMD: script
> :GMEXAMPLESUBCMD:
> include::guest-files.txt[]
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index d17e0a3d8567..70e7658a61fb 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -3785,6 +3785,7 @@ int cmd_script(int argc, const char **argv)
> bool header_only = false;
> bool script_started = false;
> bool unsorted_dump = false;
> + bool merge_deferred_callchains = true;
> char *rec_script_path = NULL;
> char *rep_script_path = NULL;
> struct perf_session *session;
> @@ -3938,6 +3939,8 @@ int cmd_script(int argc, const char **argv)
> "Guest code can be found in hypervisor process"),
> OPT_BOOLEAN('\0', "stitch-lbr", &script.stitch_lbr,
> "Enable LBR callgraph stitching approach"),
> + OPT_BOOLEAN('\0', "merge-callchains", &merge_deferred_callchains,
> + "Enable merge deferred user callchains"),
> OPTS_EVSWITCH(&script.evswitch),
> OPT_END()
> };
> @@ -4194,7 +4197,7 @@ int cmd_script(int argc, const char **argv)
> script.tool.throttle = process_throttle_event;
> script.tool.unthrottle = process_throttle_event;
> script.tool.ordering_requires_timestamps = true;
> - script.tool.merge_deferred_callchains = false;
> + script.tool.merge_deferred_callchains = merge_deferred_callchains;
> session = perf_session__new(&data, &script.tool);
> if (IS_ERR(session))
> return PTR_ERR(session);
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index d7b7eef740b9..d2d672f1d6ba 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -1828,3 +1828,27 @@ int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
> }
> return 0;
> }
> +
> +int sample__merge_deferred_callchain(struct perf_sample *sample_orig,
> + struct perf_sample *sample_callchain)
> +{
> + u64 nr_orig = sample_orig->callchain->nr - PERF_DEFERRED_ITEMS;
> + u64 nr_deferred = sample_callchain->callchain->nr;
> + struct ip_callchain *callchain;
> +
> + callchain = calloc(1 + nr_orig + nr_deferred, sizeof(u64));
> + if (callchain == NULL) {
> + sample_orig->deferred_callchain = false;
> + return -ENOMEM;
> + }
> +
> + callchain->nr = nr_orig + nr_deferred;
> + /* copy except for the last PERF_CONTEXT_USER_DEFERRED */
> + memcpy(callchain->ips, sample_orig->callchain->ips, nr_orig * sizeof(u64));
> + /* copy deferred user callchains */
> + memcpy(&callchain->ips[nr_orig], sample_callchain->callchain->ips,
> + nr_deferred * sizeof(u64));
> +
> + sample_orig->callchain = callchain;
> + return 0;
> +}
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index 86ed9e4d04f9..89785125ed25 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -317,4 +317,7 @@ int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
> struct perf_sample *sample, int max_stack,
> bool symbols, callchain_iter_fn cb, void *data);
>
> +int sample__merge_deferred_callchain(struct perf_sample *sample_orig,
> + struct perf_sample *sample_callchain);
> +
> #endif /* __PERF_CALLCHAIN_H */
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 80d8387e6b97..9518b45af393 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -85,6 +85,7 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
> evlist->ctl_fd.pos = -1;
> evlist->nr_br_cntr = -1;
> metricgroup__rblist_init(&evlist->metric_events);
> + INIT_LIST_HEAD(&evlist->deferred_samples);
> }
>
> struct evlist *evlist__new(void)
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 5e71e3dc6042..309ef8d78495 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -92,6 +92,7 @@ struct evlist {
> * of struct metric_expr.
> */
> struct rblist metric_events;
> + struct list_head deferred_samples;
> };
>
> struct evsel_str_handler {
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index a071006350f5..ef1902309395 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1283,6 +1283,57 @@ static int evlist__deliver_sample(struct evlist *evlist, const struct perf_tool
> per_thread);
> }
>
> +struct deferred_event {
> + struct list_head list;
> + union perf_event *event;
> +};
> +
> +static int evlist__deliver_deferred_samples(struct evlist *evlist,
> + const struct perf_tool *tool,
> + union perf_event *event,
> + struct perf_sample *sample,
> + struct machine *machine)
> +{
> + struct deferred_event *de, *tmp;
> + struct evsel *evsel;
> + int ret = 0;
> +
> + if (!tool->merge_deferred_callchains) {
> + evsel = evlist__id2evsel(evlist, sample->id);
> + return tool->callchain_deferred(tool, event, sample,
> + evsel, machine);
> + }
> +
> + list_for_each_entry_safe(de, tmp, &evlist->deferred_samples, list) {
> + struct perf_sample orig_sample;
> +
> + ret = evlist__parse_sample(evlist, de->event, &orig_sample);
> + if (ret < 0) {
> + pr_err("failed to parse original sample\n");
> + break;
> + }
> +
> + if (sample->tid != orig_sample.tid ||
> + event->callchain_deferred.cookie != orig_sample.deferred_cookie)
> + continue;

I think we should handle original samples with a different cookie too.
They come from before LOST records, so we deliver them without merging
the callchain.

> +
> + evsel = evlist__id2evsel(evlist, orig_sample.id);
> + sample__merge_deferred_callchain(&orig_sample, sample);
> + ret = evlist__deliver_sample(evlist, tool, de->event,
> + &orig_sample, evsel, machine);

Something like this:

@@ -1313,12 +1313,13 @@ static int evlist__deliver_deferred_samples(struct evlist *evlist,
 			break;
 		}
-		if (sample->tid != orig_sample.tid ||
-		    event->callchain_deferred.cookie != orig_sample.deferred_cookie)
+		if (sample->tid != orig_sample.tid)
 			continue;
+		if (event->callchain_deferred.cookie == orig_sample.deferred_cookie)
+			sample__merge_deferred_callchain(&orig_sample, sample);
+
 		evsel = evlist__id2evsel(evlist, orig_sample.id);
-		sample__merge_deferred_callchain(&orig_sample, sample);
 		ret = evlist__deliver_sample(evlist, tool, de->event,
 					     &orig_sample, evsel, machine);

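
To spell out the behaviour the change above is aiming for, here is a minimal standalone sketch. It is not perf code: struct pending_sample and deliver_pending() are hypothetical names invented for illustration, and the two flags only stand in for the real merge and delivery calls. It assumes that every queued sample of the thread is delivered when a deferred callchain record for that thread arrives, and that the user callchain is merged only when the cookie matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a queued sample (not struct perf_sample). */
struct pending_sample {
	uint32_t tid;
	uint64_t cookie;
	bool delivered;
	bool merged;
};

/*
 * Handle one deferred callchain record identified by (tid, cookie).
 * Samples of other threads stay queued.  Samples of the same thread are
 * always delivered, but the callchain is merged only on a cookie match.
 * Returns the number of samples delivered by this call.
 */
static size_t deliver_pending(struct pending_sample *pending, size_t n,
			      uint32_t tid, uint64_t cookie)
{
	size_t delivered = 0;

	assert(pending != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct pending_sample *ps = &pending[i];

		/* already handled, or belongs to another thread: skip */
		if (ps->delivered || ps->tid != tid)
			continue;

		/* stands in for sample__merge_deferred_callchain() */
		if (ps->cookie == cookie)
			ps->merged = true;

		/* stands in for evlist__deliver_sample() */
		ps->delivered = true;
		delivered++;
	}
	return delivered;
}
```

With a queue holding (tid 1, cookie 10), (tid 1, cookie 11) and (tid 2, cookie 11), a deferred record for (tid 1, cookie 11) delivers the first two entries, merges only the second, and leaves the third queued.
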
Thanks,
Namhyung
> +
> + if (orig_sample.deferred_callchain)
> + free(orig_sample.callchain);
> +
> + list_del(&de->list);
> + free(de);
> +
> + if (ret)
> + break;
> + }
> + return ret;
> +}
> +
> static int machines__deliver_event(struct machines *machines,
> struct evlist *evlist,
> union perf_event *event,
> @@ -1311,6 +1362,16 @@ static int machines__deliver_event(struct machines *machines,
> return 0;
> }
> dump_sample(evsel, event, sample, perf_env__arch(machine->env));
> + if (sample->deferred_callchain && tool->merge_deferred_callchains) {
> + struct deferred_event *de = malloc(sizeof(*de));
> +
> + if (de == NULL)
> + return -ENOMEM;
> +
> + de->event = event;
> + list_add_tail(&de->list, &evlist->deferred_samples);
> + return 0;
> + }
> return evlist__deliver_sample(evlist, tool, event, sample, evsel, machine);
> case PERF_RECORD_MMAP:
> return tool->mmap(tool, event, sample, machine);
> @@ -1370,7 +1431,8 @@ static int machines__deliver_event(struct machines *machines,
> return tool->aux_output_hw_id(tool, event, sample, machine);
> case PERF_RECORD_CALLCHAIN_DEFERRED:
> dump_deferred_callchain(evsel, event, sample);
> - return tool->callchain_deferred(tool, event, sample, evsel, machine);
> + return evlist__deliver_deferred_samples(evlist, tool, event,
> + sample, machine);
> default:
> ++evlist->stats.nr_unknown_events;
> return -1;
> diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
> index 8bf86af1ca90..9ab9e231b5d5 100644
> --- a/tools/perf/util/tool.c
> +++ b/tools/perf/util/tool.c
> @@ -258,6 +258,7 @@ void perf_tool__init(struct perf_tool *tool, bool ordered_events)
> tool->cgroup_events = false;
> tool->no_warn = false;
> tool->show_feat_hdr = SHOW_FEAT_NO_HEADER;
> + tool->merge_deferred_callchains = true;
>
> tool->sample = process_event_sample_stub;
> tool->mmap = process_event_stub;
> diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
> index 2676d84da80c..7165a487a485 100644
> --- a/tools/perf/util/tool.h
> +++ b/tools/perf/util/tool.h
> @@ -88,6 +88,7 @@ struct perf_tool {
> bool cgroup_events;
> bool no_warn;
> bool dont_split_sample_group;
> + bool merge_deferred_callchains;
> enum show_feature_header show_feat_hdr;
> };
>
> --
> 2.50.1
>
>
Thread overview: 11+ messages
2025-08-25 18:06 [PATCH v15 0/8] perf: Support the deferred unwinding infrastructure Steven Rostedt
2025-08-25 18:06 ` [PATCH v15 1/8] unwind deferred: Add unwind_user_get_cookie() API Steven Rostedt
2025-08-25 18:06 ` [PATCH v15 2/8] perf: Support deferred user callchains Steven Rostedt
2025-08-25 18:06 ` [PATCH v15 3/8] perf: Have the deferred request record the user context cookie Steven Rostedt
2025-08-25 18:06 ` [PATCH v15 4/8] perf: Support deferred user callchains for per CPU events Steven Rostedt
2025-08-25 18:06 ` [PATCH v15 5/8] perf tools: Minimal CALLCHAIN_DEFERRED support Steven Rostedt
2025-08-25 18:06 ` [PATCH v15 6/8] perf record: Enable defer_callchain for user callchains Steven Rostedt
2025-08-25 18:06 ` [PATCH v15 7/8] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED Steven Rostedt
2025-09-02 6:59 ` Namhyung Kim
2025-08-25 18:06 ` [PATCH v15 8/8] perf tools: Merge deferred user callchains Steven Rostedt
2025-09-02 7:07 ` Namhyung Kim [this message]