* [PATCH 1/4] perf record: Split --data-mmap option
@ 2025-12-10 2:33 Namhyung Kim
2025-12-10 2:33 ` [PATCH 2/4] perf report: Enable data-type profiling with -F option too Namhyung Kim
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Namhyung Kim @ 2025-12-10 2:33 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, James Clark
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users
Currently -d/--data option controls both PERF_SAMPLE_ADDR bit and
perf_event_attr.mmap_data flag. Separate them using new --data-mmap
option to support recording only one of them.
For data-type profiling, data MMAP is unnecessary but it wastes a lot
of space in the ring buffer and data file.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/Documentation/perf-record.txt | 8 +++++++-
tools/perf/builtin-record.c | 19 +++++++++++++------
tools/perf/util/evsel.c | 5 +++--
tools/perf/util/record.h | 2 ++
4 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e8b9aadbbfa50574..c402e74172f6a22d 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -344,7 +344,8 @@ OPTIONS
-d::
--data::
- Record the sample virtual addresses. Implies --sample-mem-info.
+ Record the sample virtual addresses. Implies --sample-mem-info and
+ --data-mmap.
--phys-data::
Record the sample physical addresses.
@@ -861,6 +862,11 @@ filtered through the mask provided by -C option.
Prepare BPF filter to be used by regular users. The action should be
either "pin" or "unpin". The filter can be used after it's pinned.
+--data-mmap::
+ Enable recording MMAP events for non-executable mappings. Basically
+ perf only records executable mappings but data mmaping can be useful
+ when you analyze data access with sample addresses. So using -d option
+ would enable this unless you specify --no-data-mmap manually.
include::intel-hybrid.txt[]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 2584d0d8bc820676..cbfbd9bb10634093 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1881,7 +1881,7 @@ static int record__synthesize_workload(struct record *rec, bool tail)
process_synthesized_event,
&rec->session->machines.host,
needs_mmap,
- rec->opts.sample_address);
+ rec->opts.record_data_mmap);
perf_thread_map__put(thread_map);
return err;
}
@@ -2191,7 +2191,7 @@ static int record__synthesize(struct record *rec, bool tail)
err = __machine__synthesize_threads(machine, tool, &opts->target,
rec->evlist->core.threads,
- f, needs_mmap, opts->sample_address,
+ f, needs_mmap, opts->record_data_mmap,
rec->opts.nr_threads_synthesize);
}
@@ -3006,8 +3006,9 @@ int record_opts__parse_callchain(struct record_opts *record,
ret = parse_callchain_record_opt(arg, callchain);
if (!ret) {
/* Enable data address sampling for DWARF unwind. */
- if (callchain->record_mode == CALLCHAIN_DWARF)
- record->sample_address = true;
+ if (callchain->record_mode == CALLCHAIN_DWARF &&
+ !record->record_data_mmap_set)
+ record->record_data_mmap = true;
callchain_debug(callchain);
}
@@ -3686,6 +3687,9 @@ static struct option __record_options[] = {
OPT_CALLBACK(0, "off-cpu-thresh", &record.opts, "ms",
"Dump off-cpu samples if off-cpu time exceeds this threshold (in milliseconds). (Default: 500ms)",
record__parse_off_cpu_thresh),
+ OPT_BOOLEAN_SET(0, "data-mmap", &record.opts.record_data_mmap,
+ &record.opts.record_data_mmap_set,
+ "Record mmap events for non-executable mappings"),
OPT_END()
};
@@ -4249,9 +4253,12 @@ int cmd_record(int argc, const char **argv)
goto out_opts;
}
- /* For backward compatibility, -d implies --mem-info */
- if (rec->opts.sample_address)
+ /* For backward compatibility, -d implies --mem-info and --data-mmap */
+ if (rec->opts.sample_address) {
rec->opts.sample_data_src = true;
+ if (!rec->opts.record_data_mmap_set)
+ rec->opts.record_data_mmap = true;
+ }
/*
* Allow aliases to facilitate the lookup of symbols for address
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 9cd706f6279313c2..ec6552a6f667fec6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1445,10 +1445,11 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
attr->inherit_stat = 1;
}
- if (opts->sample_address) {
+ if (opts->sample_address)
evsel__set_sample_bit(evsel, ADDR);
+
+ if (opts->record_data_mmap)
attr->mmap_data = track;
- }
/*
* We don't allow user space callchains for function trace
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index ea3a6c4657eefb74..93627c9a73387ddd 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -40,6 +40,8 @@ struct record_opts {
bool record_cgroup;
bool record_switch_events;
bool record_switch_events_set;
+ bool record_data_mmap;
+ bool record_data_mmap_set;
bool all_kernel;
bool all_user;
bool kernel_callchains;
--
2.52.0.223.gf5cc29aaa4-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/4] perf report: Enable data-type profiling with -F option too
2025-12-10 2:33 [PATCH 1/4] perf record: Split --data-mmap option Namhyung Kim
@ 2025-12-10 2:33 ` Namhyung Kim
2025-12-10 22:05 ` Ian Rogers
2025-12-10 2:33 ` [PATCH 3/4] perf report: Fix histogram entry collapsing for -F option Namhyung Kim
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2025-12-10 2:33 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, James Clark
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users
It checked -s/--sort options only. As the sort keys can be setup using
the -F/--fields option as well, it should enable data-type profiling
with it too.
The following two commands should have the same output.
$ perf report -s type
$ perf report -F overhead,type
But there's another problem on this. I'll handle it in the next commit.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-report.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index add6b1c2aaf04270..6c2b4f93ec78e579 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1727,7 +1727,8 @@ int cmd_report(int argc, const char **argv)
sort_order = NULL;
}
- if (sort_order && strstr(sort_order, "type")) {
+ if ((sort_order && strstr(sort_order, "type")) ||
+ (field_order && strstr(field_order, "type"))) {
report.data_type = true;
annotate_opts.annotate_src = false;
--
2.52.0.223.gf5cc29aaa4-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/4] perf report: Fix histogram entry collapsing for -F option
2025-12-10 2:33 [PATCH 1/4] perf record: Split --data-mmap option Namhyung Kim
2025-12-10 2:33 ` [PATCH 2/4] perf report: Enable data-type profiling with -F option too Namhyung Kim
@ 2025-12-10 2:33 ` Namhyung Kim
2025-12-10 22:06 ` Ian Rogers
2025-12-10 2:33 ` [PATCH 4/4] perf report: Update sort key state from " Namhyung Kim
2025-12-10 22:05 ` [PATCH 1/4] perf record: Split --data-mmap option Ian Rogers
3 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2025-12-10 2:33 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, James Clark
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users
Users can use -F/--fields option to set output fields and sort keys
together. But it missed to set perf_hpp_list->need_collapse for sort
entries that have se_collapse callbacks. So it ends up with having
duplicated entries separately.
For example, let's run this command first.
$ perf mem record -t load -U -- perf test -w datasym
This will record samples for memory access (load) to struct 'buf' and a
loop condition ('sig_atomic_t') types. So the following two commands
should have identical output.
$ perf report -s type --stdio --percent-limit=1 -q
87.80% perf buf
12.17% perf sig_atomic_t
But using -F option didn't collapse the entries based on types so the
result looked like below:
$ perf report -F overhead,type --stdio --percent-limit=1 -q
23.31% perf buf
22.84% perf buf
21.26% perf buf
20.39% perf buf
12.17% perf sig_atomic_t
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/sort.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index a0a5c1c40af0c318..c51604eaae0a4b90 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -3578,6 +3578,9 @@ static int __sort_dimension__add_output(struct perf_hpp_list *list,
if (__sort_dimension__add_hpp_output(sd, list, level) < 0)
return -1;
+ if (sd->entry->se_collapse)
+ list->need_collapse = 1;
+
sd->taken = 1;
return 0;
}
--
2.52.0.223.gf5cc29aaa4-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/4] perf report: Update sort key state from -F option
2025-12-10 2:33 [PATCH 1/4] perf record: Split --data-mmap option Namhyung Kim
2025-12-10 2:33 ` [PATCH 2/4] perf report: Enable data-type profiling with -F option too Namhyung Kim
2025-12-10 2:33 ` [PATCH 3/4] perf report: Fix histogram entry collapsing for -F option Namhyung Kim
@ 2025-12-10 2:33 ` Namhyung Kim
2025-12-10 22:06 ` Ian Rogers
2025-12-10 22:05 ` [PATCH 1/4] perf record: Split --data-mmap option Ian Rogers
3 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2025-12-10 2:33 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, James Clark
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users
Factor out __sort_dimension__update() so that it can be called from -s
and -F option parsing logics. Otherwise the following command cannot go
into the annotation mode.
$ perf report -F overhead,type,sym
Warning: Annotation is only available for symbolic views, include "sym*" in --sort to use it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/sort.c | 100 ++++++++++++++++++++++-------------------
1 file changed, 54 insertions(+), 46 deletions(-)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index c51604eaae0a4b90..5e27a66e3ccb0bbe 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -3531,6 +3531,56 @@ static int add_dynamic_entry(struct evlist *evlist, const char *tok,
return ret;
}
+static int __sort_dimension__update(struct sort_dimension *sd,
+ struct perf_hpp_list *list)
+{
+ if (sd->entry == &sort_parent && parent_pattern) {
+ int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
+ if (ret) {
+ char err[BUFSIZ];
+
+ regerror(ret, &parent_regex, err, sizeof(err));
+ pr_err("Invalid regex: %s\n%s", parent_pattern, err);
+ return -EINVAL;
+ }
+ list->parent = 1;
+ } else if (sd->entry == &sort_sym) {
+ list->sym = 1;
+ /*
+ * perf diff displays the performance difference amongst
+ * two or more perf.data files. Those files could come
+ * from different binaries. So we should not compare
+ * their ips, but the name of symbol.
+ */
+ if (sort__mode == SORT_MODE__DIFF)
+ sd->entry->se_collapse = sort__sym_sort;
+
+ } else if (sd->entry == &sort_sym_offset) {
+ list->sym = 1;
+ } else if (sd->entry == &sort_dso) {
+ list->dso = 1;
+ } else if (sd->entry == &sort_socket) {
+ list->socket = 1;
+ } else if (sd->entry == &sort_thread) {
+ list->thread = 1;
+ } else if (sd->entry == &sort_comm) {
+ list->comm = 1;
+ } else if (sd->entry == &sort_type_offset) {
+ symbol_conf.annotate_data_member = true;
+ } else if (sd->entry == &sort_sym_from || sd->entry == &sort_sym_to) {
+ list->sym = 1;
+ } else if (sd->entry == &sort_mem_dcacheline && cacheline_size() == 0) {
+ return -EINVAL;
+ } else if (sd->entry == &sort_mem_daddr_sym) {
+ list->sym = 1;
+ }
+
+ if (sd->entry->se_collapse)
+ list->need_collapse = 1;
+
+ return 0;
+}
+
static int __sort_dimension__add(struct sort_dimension *sd,
struct perf_hpp_list *list,
int level)
@@ -3541,8 +3591,8 @@ static int __sort_dimension__add(struct sort_dimension *sd,
if (__sort_dimension__add_hpp_sort(sd, list, level) < 0)
return -1;
- if (sd->entry->se_collapse)
- list->need_collapse = 1;
+ if (__sort_dimension__update(sd, list) < 0)
+ return -1;
sd->taken = 1;
@@ -3578,8 +3628,8 @@ static int __sort_dimension__add_output(struct perf_hpp_list *list,
if (__sort_dimension__add_hpp_output(sd, list, level) < 0)
return -1;
- if (sd->entry->se_collapse)
- list->need_collapse = 1;
+ if (__sort_dimension__update(sd, list) < 0)
+ return -1;
sd->taken = 1;
return 0;
@@ -3644,39 +3694,6 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
sort_dimension_add_dynamic_header(sd, env);
}
- if (sd->entry == &sort_parent && parent_pattern) {
- int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
- if (ret) {
- char err[BUFSIZ];
-
- regerror(ret, &parent_regex, err, sizeof(err));
- pr_err("Invalid regex: %s\n%s", parent_pattern, err);
- return -EINVAL;
- }
- list->parent = 1;
- } else if (sd->entry == &sort_sym) {
- list->sym = 1;
- /*
- * perf diff displays the performance difference amongst
- * two or more perf.data files. Those files could come
- * from different binaries. So we should not compare
- * their ips, but the name of symbol.
- */
- if (sort__mode == SORT_MODE__DIFF)
- sd->entry->se_collapse = sort__sym_sort;
-
- } else if (sd->entry == &sort_dso) {
- list->dso = 1;
- } else if (sd->entry == &sort_socket) {
- list->socket = 1;
- } else if (sd->entry == &sort_thread) {
- list->thread = 1;
- } else if (sd->entry == &sort_comm) {
- list->comm = 1;
- } else if (sd->entry == &sort_type_offset) {
- symbol_conf.annotate_data_member = true;
- }
-
return __sort_dimension__add(sd, list, level);
}
@@ -3695,9 +3712,6 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
strlen(tok)))
return -EINVAL;
- if (sd->entry == &sort_sym_from || sd->entry == &sort_sym_to)
- list->sym = 1;
-
__sort_dimension__add(sd, list, level);
return 0;
}
@@ -3711,12 +3725,6 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
if (sort__mode != SORT_MODE__MEMORY)
return -EINVAL;
- if (sd->entry == &sort_mem_dcacheline && cacheline_size() == 0)
- return -EINVAL;
-
- if (sd->entry == &sort_mem_daddr_sym)
- list->sym = 1;
-
__sort_dimension__add(sd, list, level);
return 0;
}
--
2.52.0.223.gf5cc29aaa4-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/4] perf record: Split --data-mmap option
2025-12-10 2:33 [PATCH 1/4] perf record: Split --data-mmap option Namhyung Kim
` (2 preceding siblings ...)
2025-12-10 2:33 ` [PATCH 4/4] perf report: Update sort key state from " Namhyung Kim
@ 2025-12-10 22:05 ` Ian Rogers
2025-12-17 12:37 ` Arnaldo Carvalho de Melo
3 siblings, 1 reply; 9+ messages in thread
From: Ian Rogers @ 2025-12-10 22:05 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, James Clark, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users
On Tue, Dec 9, 2025 at 6:33 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Currently -d/--data option controls both PERF_SAMPLE_ADDR bit and
> perf_event_attr.mmap_data flag. Separate them using new --data-mmap
> option to support recording only one of them.
>
> For data-type profiling, data MMAP is unnecessary but it wastes a lot
> of space in the ring buffer and data file.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/Documentation/perf-record.txt | 8 +++++++-
> tools/perf/builtin-record.c | 19 +++++++++++++------
> tools/perf/util/evsel.c | 5 +++--
> tools/perf/util/record.h | 2 ++
> 4 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index e8b9aadbbfa50574..c402e74172f6a22d 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -344,7 +344,8 @@ OPTIONS
>
> -d::
> --data::
> - Record the sample virtual addresses. Implies --sample-mem-info.
> + Record the sample virtual addresses. Implies --sample-mem-info and
> + --data-mmap.
>
> --phys-data::
> Record the sample physical addresses.
> @@ -861,6 +862,11 @@ filtered through the mask provided by -C option.
> Prepare BPF filter to be used by regular users. The action should be
> either "pin" or "unpin". The filter can be used after it's pinned.
>
> +--data-mmap::
> + Enable recording MMAP events for non-executable mappings. Basically
> + perf only records executable mappings but data mmaping can be useful
> + when you analyze data access with sample addresses. So using -d option
> + would enable this unless you specify --no-data-mmap manually.
>
> include::intel-hybrid.txt[]
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 2584d0d8bc820676..cbfbd9bb10634093 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1881,7 +1881,7 @@ static int record__synthesize_workload(struct record *rec, bool tail)
> process_synthesized_event,
> &rec->session->machines.host,
> needs_mmap,
> - rec->opts.sample_address);
> + rec->opts.record_data_mmap);
> perf_thread_map__put(thread_map);
> return err;
> }
> @@ -2191,7 +2191,7 @@ static int record__synthesize(struct record *rec, bool tail)
>
> err = __machine__synthesize_threads(machine, tool, &opts->target,
> rec->evlist->core.threads,
> - f, needs_mmap, opts->sample_address,
> + f, needs_mmap, opts->record_data_mmap,
> rec->opts.nr_threads_synthesize);
> }
>
> @@ -3006,8 +3006,9 @@ int record_opts__parse_callchain(struct record_opts *record,
> ret = parse_callchain_record_opt(arg, callchain);
> if (!ret) {
> /* Enable data address sampling for DWARF unwind. */
> - if (callchain->record_mode == CALLCHAIN_DWARF)
> - record->sample_address = true;
> + if (callchain->record_mode == CALLCHAIN_DWARF &&
> + !record->record_data_mmap_set)
> + record->record_data_mmap = true;
> callchain_debug(callchain);
> }
>
> @@ -3686,6 +3687,9 @@ static struct option __record_options[] = {
> OPT_CALLBACK(0, "off-cpu-thresh", &record.opts, "ms",
> "Dump off-cpu samples if off-cpu time exceeds this threshold (in milliseconds). (Default: 500ms)",
> record__parse_off_cpu_thresh),
> + OPT_BOOLEAN_SET(0, "data-mmap", &record.opts.record_data_mmap,
> + &record.opts.record_data_mmap_set,
> + "Record mmap events for non-executable mappings"),
> OPT_END()
> };
>
> @@ -4249,9 +4253,12 @@ int cmd_record(int argc, const char **argv)
> goto out_opts;
> }
>
> - /* For backward compatibility, -d implies --mem-info */
> - if (rec->opts.sample_address)
> + /* For backward compatibility, -d implies --mem-info and --data-mmap */
> + if (rec->opts.sample_address) {
> rec->opts.sample_data_src = true;
> + if (!rec->opts.record_data_mmap_set)
> + rec->opts.record_data_mmap = true;
> + }
>
> /*
> * Allow aliases to facilitate the lookup of symbols for address
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 9cd706f6279313c2..ec6552a6f667fec6 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1445,10 +1445,11 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
> attr->inherit_stat = 1;
> }
>
> - if (opts->sample_address) {
> + if (opts->sample_address)
> evsel__set_sample_bit(evsel, ADDR);
> +
> + if (opts->record_data_mmap)
> attr->mmap_data = track;
> - }
>
> /*
> * We don't allow user space callchains for function trace
> diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
> index ea3a6c4657eefb74..93627c9a73387ddd 100644
> --- a/tools/perf/util/record.h
> +++ b/tools/perf/util/record.h
> @@ -40,6 +40,8 @@ struct record_opts {
> bool record_cgroup;
> bool record_switch_events;
> bool record_switch_events_set;
> + bool record_data_mmap;
> + bool record_data_mmap_set;
> bool all_kernel;
> bool all_user;
> bool kernel_callchains;
> --
> 2.52.0.223.gf5cc29aaa4-goog
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/4] perf report: Enable data-type profiling with -F option too
2025-12-10 2:33 ` [PATCH 2/4] perf report: Enable data-type profiling with -F option too Namhyung Kim
@ 2025-12-10 22:05 ` Ian Rogers
0 siblings, 0 replies; 9+ messages in thread
From: Ian Rogers @ 2025-12-10 22:05 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, James Clark, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users
On Tue, Dec 9, 2025 at 6:33 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> It checked -s/--sort options only. As the sort keys can be setup using
> the -F/--fields option as well, it should enable data-type profiling
> with it too.
>
> The following two commands should have the same output.
>
> $ perf report -s type
>
> $ perf report -F overhead,type
>
> But there's another problem on this. I'll handle it in the next commit.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/builtin-report.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index add6b1c2aaf04270..6c2b4f93ec78e579 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -1727,7 +1727,8 @@ int cmd_report(int argc, const char **argv)
> sort_order = NULL;
> }
>
> - if (sort_order && strstr(sort_order, "type")) {
> + if ((sort_order && strstr(sort_order, "type")) ||
> + (field_order && strstr(field_order, "type"))) {
> report.data_type = true;
> annotate_opts.annotate_src = false;
>
> --
> 2.52.0.223.gf5cc29aaa4-goog
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/4] perf report: Fix histogram entry collapsing for -F option
2025-12-10 2:33 ` [PATCH 3/4] perf report: Fix histogram entry collapsing for -F option Namhyung Kim
@ 2025-12-10 22:06 ` Ian Rogers
0 siblings, 0 replies; 9+ messages in thread
From: Ian Rogers @ 2025-12-10 22:06 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, James Clark, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users
On Tue, Dec 9, 2025 at 6:33 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Users can use -F/--fields option to set output fields and sort keys
> together. But it missed to set perf_hpp_list->need_collapse for sort
> entries that have se_collapse callbacks. So it ends up with having
> duplicated entries separately.
>
> For example, let's run this command first.
>
> $ perf mem record -t load -U -- perf test -w datasym
>
> This will record samples for memory access (load) to struct 'buf' and a
> loop condition ('sig_atomic_t') types. So the following two commands
> should have identical output.
>
> $ perf report -s type --stdio --percent-limit=1 -q
> 87.80% perf buf
> 12.17% perf sig_atomic_t
>
> But using -F option didn't collapse the entries based on types so the
> result looked like below:
>
> $ perf report -F overhead,type --stdio --percent-limit=1 -q
> 23.31% perf buf
> 22.84% perf buf
> 21.26% perf buf
> 20.39% perf buf
> 12.17% perf sig_atomic_t
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/util/sort.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index a0a5c1c40af0c318..c51604eaae0a4b90 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -3578,6 +3578,9 @@ static int __sort_dimension__add_output(struct perf_hpp_list *list,
> if (__sort_dimension__add_hpp_output(sd, list, level) < 0)
> return -1;
>
> + if (sd->entry->se_collapse)
> + list->need_collapse = 1;
> +
> sd->taken = 1;
> return 0;
> }
> --
> 2.52.0.223.gf5cc29aaa4-goog
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 4/4] perf report: Update sort key state from -F option
2025-12-10 2:33 ` [PATCH 4/4] perf report: Update sort key state from " Namhyung Kim
@ 2025-12-10 22:06 ` Ian Rogers
0 siblings, 0 replies; 9+ messages in thread
From: Ian Rogers @ 2025-12-10 22:06 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, James Clark, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users
On Tue, Dec 9, 2025 at 6:33 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Factor out __sort_dimension__update() so that it can be called from -s
> and -F option parsing logics. Otherwise the following command cannot go
> into the annotation mode.
>
> $ perf report -F overhead,type,sym
>
> Warning: Annotation is only available for symbolic views, include "sym*" in --sort to use it.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/util/sort.c | 100 ++++++++++++++++++++++-------------------
> 1 file changed, 54 insertions(+), 46 deletions(-)
>
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index c51604eaae0a4b90..5e27a66e3ccb0bbe 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -3531,6 +3531,56 @@ static int add_dynamic_entry(struct evlist *evlist, const char *tok,
> return ret;
> }
>
> +static int __sort_dimension__update(struct sort_dimension *sd,
> + struct perf_hpp_list *list)
> +{
> + if (sd->entry == &sort_parent && parent_pattern) {
> + int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
> + if (ret) {
> + char err[BUFSIZ];
> +
> + regerror(ret, &parent_regex, err, sizeof(err));
> + pr_err("Invalid regex: %s\n%s", parent_pattern, err);
> + return -EINVAL;
> + }
> + list->parent = 1;
> + } else if (sd->entry == &sort_sym) {
> + list->sym = 1;
> + /*
> + * perf diff displays the performance difference amongst
> + * two or more perf.data files. Those files could come
> + * from different binaries. So we should not compare
> + * their ips, but the name of symbol.
> + */
> + if (sort__mode == SORT_MODE__DIFF)
> + sd->entry->se_collapse = sort__sym_sort;
> +
> + } else if (sd->entry == &sort_sym_offset) {
> + list->sym = 1;
> + } else if (sd->entry == &sort_dso) {
> + list->dso = 1;
> + } else if (sd->entry == &sort_socket) {
> + list->socket = 1;
> + } else if (sd->entry == &sort_thread) {
> + list->thread = 1;
> + } else if (sd->entry == &sort_comm) {
> + list->comm = 1;
> + } else if (sd->entry == &sort_type_offset) {
> + symbol_conf.annotate_data_member = true;
> + } else if (sd->entry == &sort_sym_from || sd->entry == &sort_sym_to) {
> + list->sym = 1;
> + } else if (sd->entry == &sort_mem_dcacheline && cacheline_size() == 0) {
> + return -EINVAL;
> + } else if (sd->entry == &sort_mem_daddr_sym) {
> + list->sym = 1;
> + }
> +
> + if (sd->entry->se_collapse)
> + list->need_collapse = 1;
> +
> + return 0;
> +}
> +
> static int __sort_dimension__add(struct sort_dimension *sd,
> struct perf_hpp_list *list,
> int level)
> @@ -3541,8 +3591,8 @@ static int __sort_dimension__add(struct sort_dimension *sd,
> if (__sort_dimension__add_hpp_sort(sd, list, level) < 0)
> return -1;
>
> - if (sd->entry->se_collapse)
> - list->need_collapse = 1;
> + if (__sort_dimension__update(sd, list) < 0)
> + return -1;
>
> sd->taken = 1;
>
> @@ -3578,8 +3628,8 @@ static int __sort_dimension__add_output(struct perf_hpp_list *list,
> if (__sort_dimension__add_hpp_output(sd, list, level) < 0)
> return -1;
>
> - if (sd->entry->se_collapse)
> - list->need_collapse = 1;
> + if (__sort_dimension__update(sd, list) < 0)
> + return -1;
>
> sd->taken = 1;
> return 0;
> @@ -3644,39 +3694,6 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
> sort_dimension_add_dynamic_header(sd, env);
> }
>
> - if (sd->entry == &sort_parent && parent_pattern) {
> - int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
> - if (ret) {
> - char err[BUFSIZ];
> -
> - regerror(ret, &parent_regex, err, sizeof(err));
> - pr_err("Invalid regex: %s\n%s", parent_pattern, err);
> - return -EINVAL;
> - }
> - list->parent = 1;
> - } else if (sd->entry == &sort_sym) {
> - list->sym = 1;
> - /*
> - * perf diff displays the performance difference amongst
> - * two or more perf.data files. Those files could come
> - * from different binaries. So we should not compare
> - * their ips, but the name of symbol.
> - */
> - if (sort__mode == SORT_MODE__DIFF)
> - sd->entry->se_collapse = sort__sym_sort;
> -
> - } else if (sd->entry == &sort_dso) {
> - list->dso = 1;
> - } else if (sd->entry == &sort_socket) {
> - list->socket = 1;
> - } else if (sd->entry == &sort_thread) {
> - list->thread = 1;
> - } else if (sd->entry == &sort_comm) {
> - list->comm = 1;
> - } else if (sd->entry == &sort_type_offset) {
> - symbol_conf.annotate_data_member = true;
> - }
> -
> return __sort_dimension__add(sd, list, level);
> }
>
> @@ -3695,9 +3712,6 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
> strlen(tok)))
> return -EINVAL;
>
> - if (sd->entry == &sort_sym_from || sd->entry == &sort_sym_to)
> - list->sym = 1;
> -
> __sort_dimension__add(sd, list, level);
> return 0;
> }
> @@ -3711,12 +3725,6 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
> if (sort__mode != SORT_MODE__MEMORY)
> return -EINVAL;
>
> - if (sd->entry == &sort_mem_dcacheline && cacheline_size() == 0)
> - return -EINVAL;
> -
> - if (sd->entry == &sort_mem_daddr_sym)
> - list->sym = 1;
> -
> __sort_dimension__add(sd, list, level);
> return 0;
> }
> --
> 2.52.0.223.gf5cc29aaa4-goog
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/4] perf record: Split --data-mmap option
2025-12-10 22:05 ` [PATCH 1/4] perf record: Split --data-mmap option Ian Rogers
@ 2025-12-17 12:37 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-12-17 12:37 UTC (permalink / raw)
To: Ian Rogers
Cc: Namhyung Kim, James Clark, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users
On Wed, Dec 10, 2025 at 02:05:23PM -0800, Ian Rogers wrote:
> On Tue, Dec 9, 2025 at 6:33 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Currently -d/--data option controls both PERF_SAMPLE_ADDR bit and
> > perf_event_attr.mmap_data flag. Separate them using new --data-mmap
> > option to support recording only one of them.
> >
> > For data-type profiling, data MMAP is unnecessary but it wastes a lot
> > of space in the ring buffer and data file.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>
> Reviewed-by: Ian Rogers <irogers@google.com>
Tested the series and applied to perf-tools-next (6.20),
Thanks,
- Arnaldo
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-12-17 12:37 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-10 2:33 [PATCH 1/4] perf record: Split --data-mmap option Namhyung Kim
2025-12-10 2:33 ` [PATCH 2/4] perf report: Enable data-type profiling with -F option too Namhyung Kim
2025-12-10 22:05 ` Ian Rogers
2025-12-10 2:33 ` [PATCH 3/4] perf report: Fix histogram entry collapsing for -F option Namhyung Kim
2025-12-10 22:06 ` Ian Rogers
2025-12-10 2:33 ` [PATCH 4/4] perf report: Update sort key state from " Namhyung Kim
2025-12-10 22:06 ` Ian Rogers
2025-12-10 22:05 ` [PATCH 1/4] perf record: Split --data-mmap option Ian Rogers
2025-12-17 12:37 ` Arnaldo Carvalho de Melo
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.