* [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1)
@ 2024-07-31 23:54 Namhyung Kim
2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
` (6 more replies)
0 siblings, 7 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:54 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Stephane Eranian, Athira Rajeev
Hello,
As we added data type profiling, 'perf mem report' should support that
as well. This patches just added a couple of convenient options.
$ perf mem report -T -s mem
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 131 of event 'cpu/mem-loads,ldlat=30/P'
# Total weight : 18561
# Sort order : mem,type
#
# Overhead Samples Memory access Data Type
# ........ ............ ....................................... .........
#
14.84% 22 L1 hit (unknown)
12.02% 8 RAM hit (unknown)
7.68% 8 LFB/MAB hit (unknown)
6.29% 12 L1 hit (stack operation)
3.97% 5 LFB/MAB hit struct psi_group_cpu
3.69% 3 L1 hit struct
3.18% 3 LFB/MAB hit (stack operation)
2.89% 5 L3 hit (unknown)
2.58% 3 L1 hit unsigned int
2.31% 2 L1 hit struct psi_group_cpu
2.21% 2 LFB/MAB hit struct cfs_rq
2.19% 2 RAM hit struct sched_entity
2.16% 1 L1 hit struct task_struct
1.85% 3 L1 hit struct pcpu_hot
1.78% 1 RAM hit struct tss_struct
1.72% 1 LFB/MAB hit struct mm_struct
1.62% 2 L1 hit struct psi_group
...
The code is also available at 'perf/mem-type-v1' branch in
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
Namhyung Kim (6):
perf hist: Correct hist_entry->mem_info refcounts
perf mem: Free the allocated sort string
perf mem: Rework command option handling
perf tools: Add mode argument to sort_help()
perf mem: Add -s/--sort option
perf mem: Add -T/--data-type option to report subcommand
tools/perf/builtin-mem.c | 100 ++++++++++++++++++++++-------------
tools/perf/builtin-report.c | 4 +-
tools/perf/util/hist.c | 14 ++---
tools/perf/util/map_symbol.c | 18 +++++++
tools/perf/util/map_symbol.h | 3 ++
tools/perf/util/mem-info.c | 13 +++++
tools/perf/util/mem-info.h | 1 +
tools/perf/util/sort.c | 12 +++--
tools/perf/util/sort.h | 2 +-
9 files changed, 116 insertions(+), 51 deletions(-)
--
2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Stephane Eranian, Athira Rajeev
The mem_info is created by iter_prepare_mem_entry() at the beginning and
destroyed by iter_finish_mem_entry() at the end. So if it's used in a
new hist_entry, it should be cloned. Simplify (hopefully) the logic
by adding some helper functions and not holding the refcount in the
temporary entry.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/hist.c | 14 +++++++-------
tools/perf/util/map_symbol.c | 18 ++++++++++++++++++
tools/perf/util/map_symbol.h | 3 +++
tools/perf/util/mem-info.c | 13 +++++++++++++
tools/perf/util/mem-info.h | 1 +
5 files changed, 42 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index f028f113c4fd..f8ee1cd6929d 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -476,6 +476,12 @@ static int hist_entry__init(struct hist_entry *he,
he->branch_info->to.ms.map = map__get(he->branch_info->to.ms.map);
}
+ if (he->mem_info) {
+ he->mem_info = mem_info__clone(template->mem_info);
+ if (he->mem_info == NULL)
+ goto err_infos;
+ }
+
if (hist_entry__has_callchains(he) && symbol_conf.use_callchain)
callchain_init(he->callchain);
@@ -620,12 +626,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
if (symbol_conf.cumulate_callchain)
he_stat__add_period(he->stat_acc, period);
- /*
- * This mem info was allocated from sample__resolve_mem
- * and will not be used anymore.
- */
- mem_info__zput(entry->mem_info);
-
block_info__delete(entry->block_info);
kvm_info__zput(entry->kvm_info);
@@ -739,7 +739,7 @@ __hists__add_entry(struct hists *hists,
.filtered = symbol__parent_filter(sym_parent) | al->filtered,
.hists = hists,
.branch_info = bi,
- .mem_info = mem_info__get(mi),
+ .mem_info = mi,
.kvm_info = ki,
.block_info = block_info,
.transaction = sample->transaction,
diff --git a/tools/perf/util/map_symbol.c b/tools/perf/util/map_symbol.c
index bef5079f2403..6ad2960bc289 100644
--- a/tools/perf/util/map_symbol.c
+++ b/tools/perf/util/map_symbol.c
@@ -13,3 +13,21 @@ void addr_map_symbol__exit(struct addr_map_symbol *ams)
{
map_symbol__exit(&ams->ms);
}
+
+void map_symbol__copy(struct map_symbol *dst, struct map_symbol *src)
+{
+ dst->maps = maps__get(src->maps);
+ dst->map = map__get(src->map);
+ dst->sym = src->sym;
+}
+
+void addr_map_symbol__copy(struct addr_map_symbol *dst, struct addr_map_symbol *src)
+{
+ map_symbol__copy(&dst->ms, &src->ms);
+
+ dst->addr = src->addr;
+ dst->al_addr = src->al_addr;
+ dst->al_level = src->al_level;
+ dst->phys_addr = src->phys_addr;
+ dst->data_page_size = src->data_page_size;
+}
diff --git a/tools/perf/util/map_symbol.h b/tools/perf/util/map_symbol.h
index 72d5ed938ed6..e370bb32ed47 100644
--- a/tools/perf/util/map_symbol.h
+++ b/tools/perf/util/map_symbol.h
@@ -26,4 +26,7 @@ struct addr_map_symbol {
void map_symbol__exit(struct map_symbol *ms);
void addr_map_symbol__exit(struct addr_map_symbol *ams);
+void map_symbol__copy(struct map_symbol *dst, struct map_symbol *src);
+void addr_map_symbol__copy(struct addr_map_symbol *dst, struct addr_map_symbol *src);
+
#endif // __PERF_MAP_SYMBOL
diff --git a/tools/perf/util/mem-info.c b/tools/perf/util/mem-info.c
index 27d67721a695..d3efa9c139f2 100644
--- a/tools/perf/util/mem-info.c
+++ b/tools/perf/util/mem-info.c
@@ -33,3 +33,16 @@ struct mem_info *mem_info__new(void)
return result;
}
+
+struct mem_info *mem_info__clone(struct mem_info *mi)
+{
+ struct mem_info *result = mem_info__new();
+
+ if (result) {
+ addr_map_symbol__copy(mem_info__iaddr(result), mem_info__iaddr(mi));
+ addr_map_symbol__copy(mem_info__daddr(result), mem_info__daddr(mi));
+ mem_info__data_src(result)->val = mem_info__data_src(mi)->val;
+ }
+
+ return result;
+}
diff --git a/tools/perf/util/mem-info.h b/tools/perf/util/mem-info.h
index 0f68e29f311b..df75e94ed3d0 100644
--- a/tools/perf/util/mem-info.h
+++ b/tools/perf/util/mem-info.h
@@ -15,6 +15,7 @@ DECLARE_RC_STRUCT(mem_info) {
};
struct mem_info *mem_info__new(void);
+struct mem_info *mem_info__clone(struct mem_info *mi);
struct mem_info *mem_info__get(struct mem_info *mi);
void mem_info__put(struct mem_info *mi);
--
2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/6] perf mem: Free the allocated sort string
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
2024-08-01 15:20 ` Arnaldo Carvalho de Melo
2024-07-31 23:55 ` [PATCH 3/6] perf mem: Rework command option handling Namhyung Kim
` (4 subsequent siblings)
6 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Stephane Eranian, Athira Rajeev
The get_sort_order() returns either a new string (from strdup) or NULL
but it never gets freed.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-mem.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 863fcd735dae..93413cfcd585 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -372,6 +372,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
rep_argv[i] = argv[j];
ret = cmd_report(i, rep_argv);
+ free(new_sort_order);
free(rep_argv);
return ret;
}
--
2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/6] perf mem: Rework command option handling
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
2024-07-31 23:55 ` [PATCH 4/6] perf tools: Add mode argument to sort_help() Namhyung Kim
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Stephane Eranian, Athira Rajeev, Aditya Gupta
Split the common option and ones for record or report. Otherwise -U in
the record option cannot be used because it clashes with in the common
(or report) option. Also rename report_events() to __cmd_report() to
follow the convention and to be sync with the record part.
Also set the flag PARSE_OPT_STOP_AT_NON_OPTION for the common option so
that it can show the help message in the subcommand like below:
$ perf mem record -h
Usage: perf mem record [<options>] [<command>]
or: perf mem record [<options>] -- <command> [<options>]
-C, --cpu <cpu> list of cpus to profile
-e, --event <event> event selector. use 'perf mem record -e list' to list available events
-f, --force don't complain, do it
-K, --all-kernel collect only kernel level data
-p, --phys-data Record/Report sample physical addresses
-t, --type <type> memory operations(load,store) Default load,store
-U, --all-user collect only user level data
-v, --verbose be more verbose (show counter open errors, etc)
--data-page-size Record/Report sample data address page size
--ldlat <n> mem-loads latency
Cc: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-mem.c | 79 +++++++++++++++++++++++-----------------
1 file changed, 45 insertions(+), 34 deletions(-)
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 93413cfcd585..819edaf6b1df 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -34,6 +34,8 @@ struct perf_mem {
bool force;
bool phys_addr;
bool data_page_size;
+ bool all_kernel;
+ bool all_user;
int operation;
const char *cpu_list;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -62,33 +64,19 @@ static int parse_record_events(const struct option *opt,
return 0;
}
-static const char * const __usage[] = {
- "perf mem record [<options>] [<command>]",
- "perf mem record [<options>] -- <command> [<options>]",
- NULL
-};
-
-static const char * const *record_mem_usage = __usage;
-
-static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
+static int __cmd_record(int argc, const char **argv, struct perf_mem *mem,
+ const struct option *options)
{
int rec_argc, i = 0, j;
int start, end;
const char **rec_argv;
int ret;
- bool all_user = false, all_kernel = false;
struct perf_mem_event *e;
struct perf_pmu *pmu;
- struct option options[] = {
- OPT_CALLBACK('e', "event", &mem, "event",
- "event selector. use 'perf mem record -e list' to list available events",
- parse_record_events),
- OPT_UINTEGER(0, "ldlat", &perf_mem_events__loads_ldlat, "mem-loads latency"),
- OPT_INCR('v', "verbose", &verbose,
- "be more verbose (show counter open errors, etc)"),
- OPT_BOOLEAN('U', "all-user", &all_user, "collect only user level data"),
- OPT_BOOLEAN('K', "all-kernel", &all_kernel, "collect only kernel level data"),
- OPT_END()
+ const char * const record_usage[] = {
+ "perf mem record [<options>] [<command>]",
+ "perf mem record [<options>] -- <command> [<options>]",
+ NULL
};
pmu = perf_mem_events_find_pmu();
@@ -102,7 +90,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
return -1;
}
- argc = parse_options(argc, argv, options, record_mem_usage,
+ argc = parse_options(argc, argv, options, record_usage,
PARSE_OPT_KEEP_UNKNOWN);
/* Max number of arguments multiplied by number of PMUs that can support them. */
@@ -158,10 +146,10 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
goto out;
end = i;
- if (all_user)
+ if (mem->all_user)
rec_argv[i++] = "--all-user";
- if (all_kernel)
+ if (mem->all_kernel)
rec_argv[i++] = "--all-kernel";
if (mem->cpu_list) {
@@ -319,6 +307,7 @@ static int report_raw_events(struct perf_mem *mem)
perf_session__delete(session);
return ret;
}
+
static char *get_sort_order(struct perf_mem *mem)
{
bool has_extra_options = (mem->phys_addr | mem->data_page_size) ? true : false;
@@ -346,11 +335,19 @@ static char *get_sort_order(struct perf_mem *mem)
return strdup(sort);
}
-static int report_events(int argc, const char **argv, struct perf_mem *mem)
+static int __cmd_report(int argc, const char **argv, struct perf_mem *mem,
+ const struct option *options)
{
const char **rep_argv;
int ret, i = 0, j, rep_argc;
char *new_sort_order;
+ const char * const report_usage[] = {
+ "perf mem report [<options>]",
+ NULL
+ };
+
+ argc = parse_options(argc, argv, options, report_usage,
+ PARSE_OPT_KEEP_UNKNOWN);
if (mem->dump_raw)
return report_raw_events(mem);
@@ -368,7 +365,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
if (new_sort_order)
rep_argv[i++] = new_sort_order;
- for (j = 1; j < argc; j++, i++)
+ for (j = 0; j < argc; j++, i++)
rep_argv[i] = argv[j];
ret = cmd_report(i, rep_argv);
@@ -475,22 +472,36 @@ int cmd_mem(int argc, const char **argv)
OPT_CALLBACK('t', "type", &mem.operation,
"type", "memory operations(load,store) Default load,store",
parse_mem_ops),
+ OPT_STRING('C', "cpu", &mem.cpu_list, "cpu",
+ "list of cpus to profile"),
+ OPT_BOOLEAN('f', "force", &mem.force, "don't complain, do it"),
+ OPT_INCR('v', "verbose", &verbose,
+ "be more verbose (show counter open errors, etc)"),
+ OPT_BOOLEAN('p', "phys-data", &mem.phys_addr, "Record/Report sample physical addresses"),
+ OPT_BOOLEAN(0, "data-page-size", &mem.data_page_size, "Record/Report sample data address page size"),
+ OPT_END()
+ };
+ const struct option record_options[] = {
+ OPT_CALLBACK('e', "event", &mem, "event",
+ "event selector. use 'perf mem record -e list' to list available events",
+ parse_record_events),
+ OPT_UINTEGER(0, "ldlat", &perf_mem_events__loads_ldlat, "mem-loads latency"),
+ OPT_BOOLEAN('U', "all-user", &mem.all_user, "collect only user level data"),
+ OPT_BOOLEAN('K', "all-kernel", &mem.all_kernel, "collect only kernel level data"),
+ OPT_PARENT(mem_options)
+ };
+ const struct option report_options[] = {
OPT_BOOLEAN('D', "dump-raw-samples", &mem.dump_raw,
"dump raw samples in ASCII"),
OPT_BOOLEAN('U', "hide-unresolved", &mem.hide_unresolved,
"Only display entries resolved to a symbol"),
OPT_STRING('i', "input", &input_name, "file",
"input file name"),
- OPT_STRING('C', "cpu", &mem.cpu_list, "cpu",
- "list of cpus to profile"),
OPT_STRING_NOEMPTY('x', "field-separator", &symbol_conf.field_sep,
"separator",
"separator for columns, no spaces will be added"
" between columns '.' is reserved."),
- OPT_BOOLEAN('f', "force", &mem.force, "don't complain, do it"),
- OPT_BOOLEAN('p', "phys-data", &mem.phys_addr, "Record/Report sample physical addresses"),
- OPT_BOOLEAN(0, "data-page-size", &mem.data_page_size, "Record/Report sample data address page size"),
- OPT_END()
+ OPT_PARENT(mem_options)
};
const char *const mem_subcommands[] = { "record", "report", NULL };
const char *mem_usage[] = {
@@ -499,7 +510,7 @@ int cmd_mem(int argc, const char **argv)
};
argc = parse_options_subcommand(argc, argv, mem_options, mem_subcommands,
- mem_usage, PARSE_OPT_KEEP_UNKNOWN);
+ mem_usage, PARSE_OPT_STOP_AT_NON_OPTION);
if (!argc || !(strncmp(argv[0], "rec", 3) || mem.operation))
usage_with_options(mem_usage, mem_options);
@@ -512,9 +523,9 @@ int cmd_mem(int argc, const char **argv)
}
if (strlen(argv[0]) > 2 && strstarts("record", argv[0]))
- return __cmd_record(argc, argv, &mem);
+ return __cmd_record(argc, argv, &mem, record_options);
else if (strlen(argv[0]) > 2 && strstarts("report", argv[0]))
- return report_events(argc, argv, &mem);
+ return __cmd_report(argc, argv, &mem, report_options);
else
usage_with_options(mem_usage, mem_options);
--
2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/6] perf tools: Add mode argument to sort_help()
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
` (2 preceding siblings ...)
2024-07-31 23:55 ` [PATCH 3/6] perf mem: Rework command option handling Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
2024-07-31 23:55 ` [PATCH 5/6] perf mem: Add -s/--sort option Namhyung Kim
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Stephane Eranian, Athira Rajeev
Some sort keys are meaningful only in a specific mode - like branch
stack and memory (data-src). Add the mode to skip unnecessary ones.
This will be used for 'perf mem report' later.
While at it, change the prefix for the -F/--fields option to remove
the duplicate part.
Before:
$ perf report -F
Error: switch `F' requires a value
Usage: perf report [<options>]
-F, --fields <key[,keys...]>
output field(s): overhead period sample overhead overhead_sys
overhead_us overhead_guest_sys overhead_guest_us overhead_children
sample period weight1 weight2 weight3 ins_lat retire_lat
...
After:
$ perf report -F
Error: switch `F' requires a value
Usage: perf report [<options>]
-F, --fields <key[,keys...]>
output field(s): overhead overhead_sys overhead_us
overhead_guest_sys overhead_guest_us overhead_children
sample period weight1 weight2 weight3 ins_lat retire_lat
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-report.c | 4 ++--
tools/perf/util/sort.c | 12 +++++++-----
tools/perf/util/sort.h | 2 +-
3 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6edc0d4ce6fb..930052961c1a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1301,8 +1301,8 @@ int cmd_report(int argc, const char **argv)
.socket_filter = -1,
.skip_empty = true,
};
- char *sort_order_help = sort_help("sort by key(s):");
- char *field_order_help = sort_help("output field(s): overhead period sample ");
+ char *sort_order_help = sort_help("sort by key(s):", SORT_MODE__NORMAL);
+ char *field_order_help = sort_help("output field(s):", SORT_MODE__NORMAL);
const char *disassembler_style = NULL, *objdump_path = NULL, *addr2line_path = NULL;
const struct option options[] = {
OPT_STRING('i', "input", &input_name, "file",
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index ab7c7ff35f9b..c4046d5d1749 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -3960,7 +3960,7 @@ static void add_hpp_sort_string(struct strbuf *sb, struct hpp_dimension *s, int
add_key(sb, s[i].name, llen);
}
-char *sort_help(const char *prefix)
+char *sort_help(const char *prefix, enum sort_mode mode)
{
struct strbuf sb;
char *s;
@@ -3972,10 +3972,12 @@ char *sort_help(const char *prefix)
ARRAY_SIZE(hpp_sort_dimensions), &len);
add_sort_string(&sb, common_sort_dimensions,
ARRAY_SIZE(common_sort_dimensions), &len);
- add_sort_string(&sb, bstack_sort_dimensions,
- ARRAY_SIZE(bstack_sort_dimensions), &len);
- add_sort_string(&sb, memory_sort_dimensions,
- ARRAY_SIZE(memory_sort_dimensions), &len);
+ if (mode == SORT_MODE__NORMAL || mode == SORT_MODE__BRANCH)
+ add_sort_string(&sb, bstack_sort_dimensions,
+ ARRAY_SIZE(bstack_sort_dimensions), &len);
+ if (mode == SORT_MODE__NORMAL || mode == SORT_MODE__MEMORY)
+ add_sort_string(&sb, memory_sort_dimensions,
+ ARRAY_SIZE(memory_sort_dimensions), &len);
s = strbuf_detach(&sb, NULL);
strbuf_release(&sb);
return s;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 0bd0ee3ae76b..6357bc32c5ca 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -130,7 +130,7 @@ void reset_output_field(void);
void sort__setup_elide(FILE *fp);
void perf_hpp__set_elide(int idx, bool elide);
-char *sort_help(const char *prefix);
+char *sort_help(const char *prefix, enum sort_mode mode);
int report_parse_ignore_callees_opt(const struct option *opt, const char *arg, int unset);
--
2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 5/6] perf mem: Add -s/--sort option
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
` (3 preceding siblings ...)
2024-07-31 23:55 ` [PATCH 4/6] perf tools: Add mode argument to sort_help() Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
2024-07-31 23:55 ` [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand Namhyung Kim
2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Stephane Eranian, Athira Rajeev
So that users can set the sort key manually as they want.
$ perf mem report -s
Error: switch `s' requires a value
Usage: perf mem report [<options>]
-s, --sort <key[,key2...]>
sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys
overhead_guest_us overhead_children sample period
weight1 weight2 weight3 ins_lat retire_lat p_stage_cyc
pid comm dso symbol parent cpu socket srcline srcfile
local_weight weight transaction trace symbol_size
dso_size cgroup cgroup_id ipc_null time code_page_size
local_ins_lat ins_lat local_p_stage_cyc p_stage_cyc
addr local_retire_lat retire_lat simd type typeoff
symoff symbol_daddr dso_daddr locked tlb mem snoop
dcacheline symbol_iaddr phys_daddr data_page_size
blocked
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-mem.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 819edaf6b1df..24a4f0084f49 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -19,6 +19,7 @@
#include "util/symbol.h"
#include "util/pmus.h"
#include "util/sample.h"
+#include "util/sort.h"
#include "util/string2.h"
#include "util/util.h"
#include <linux/err.h>
@@ -28,7 +29,8 @@
struct perf_mem {
struct perf_tool tool;
- char const *input_name;
+ const char *input_name;
+ const char *sort_key;
bool hide_unresolved;
bool dump_raw;
bool force;
@@ -313,11 +315,13 @@ static char *get_sort_order(struct perf_mem *mem)
bool has_extra_options = (mem->phys_addr | mem->data_page_size) ? true : false;
char sort[128];
+ if (mem->sort_key)
+ scnprintf(sort, sizeof(sort), "--sort=%s", mem->sort_key);
/*
* there is no weight (cost) associated with stores, so don't print
* the column
*/
- if (!(mem->operation & MEM_OPERATION_LOAD)) {
+ else if (!(mem->operation & MEM_OPERATION_LOAD)) {
strcpy(sort, "--sort=mem,sym,dso,symbol_daddr,"
"dso_daddr,tlb,locked");
} else if (has_extra_options) {
@@ -468,6 +472,7 @@ int cmd_mem(int argc, const char **argv)
*/
.operation = MEM_OPERATION_LOAD | MEM_OPERATION_STORE,
};
+ char *sort_order_help = sort_help("sort by key(s):", SORT_MODE__MEMORY);
const struct option mem_options[] = {
OPT_CALLBACK('t', "type", &mem.operation,
"type", "memory operations(load,store) Default load,store",
@@ -501,6 +506,8 @@ int cmd_mem(int argc, const char **argv)
"separator",
"separator for columns, no spaces will be added"
" between columns '.' is reserved."),
+ OPT_STRING('s', "sort", &mem.sort_key, "key[,key2...]",
+ sort_order_help),
OPT_PARENT(mem_options)
};
const char *const mem_subcommands[] = { "record", "report", NULL };
--
2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
` (4 preceding siblings ...)
2024-07-31 23:55 ` [PATCH 5/6] perf mem: Add -s/--sort option Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Stephane Eranian, Athira Rajeev
This is just a shortcut to have 'type' in the sort key and use more
compact output format like below.
$ perf mem report -T
...
#
# Overhead Samples Memory access Snoop TLB access Data Type
# ........ ............ ....................................... ............ ...................... .........
#
14.84% 22 L1 hit None L1 or L2 hit (unknown)
7.68% 8 LFB/MAB hit None L1 or L2 hit (unknown)
7.17% 3 RAM hit Hit L2 miss (unknown)
6.29% 12 L1 hit None L1 or L2 hit (stack operation)
4.85% 5 RAM hit Hit L1 or L2 hit (unknown)
3.97% 5 LFB/MAB hit None L1 or L2 hit struct psi_group_cpu
3.18% 3 LFB/MAB hit None L1 or L2 hit (stack operation)
2.58% 3 L1 hit None L1 or L2 hit unsigned int
2.36% 2 L1 hit None L1 or L2 hit struct
2.31% 2 L1 hit None L1 or L2 hit struct psi_group_cpu
...
Users also can use their own sort keys and -T option makes sure it has
the 'type' sort key at the end.
$ perf mem report -T -s mem
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-mem.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 24a4f0084f49..efa700d14c98 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -38,6 +38,7 @@ struct perf_mem {
bool data_page_size;
bool all_kernel;
bool all_user;
+ bool data_type;
int operation;
const char *cpu_list;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -317,6 +318,8 @@ static char *get_sort_order(struct perf_mem *mem)
if (mem->sort_key)
scnprintf(sort, sizeof(sort), "--sort=%s", mem->sort_key);
+ else if (mem->data_type)
+ strcpy(sort, "--sort=mem,snoop,tlb,type");
/*
* there is no weight (cost) associated with stores, so don't print
* the column
@@ -336,6 +339,10 @@ static char *get_sort_order(struct perf_mem *mem)
if (mem->data_page_size)
strcat(sort, ",data_page_size");
+ /* make sure it has 'type' sort key even -s option is used */
+ if (mem->data_type && !strstr(sort, "type"))
+ strcat(sort, ",type");
+
return strdup(sort);
}
@@ -508,6 +515,8 @@ int cmd_mem(int argc, const char **argv)
" between columns '.' is reserved."),
OPT_STRING('s', "sort", &mem.sort_key, "key[,key2...]",
sort_order_help),
+ OPT_BOOLEAN('T', "type-profile", &mem.data_type,
+ "Show data-type profile result"),
OPT_PARENT(mem_options)
};
const char *const mem_subcommands[] = { "record", "report", NULL };
--
2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/6] perf mem: Free the allocated sort string
2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
@ 2024-08-01 15:20 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-01 15:20 UTC (permalink / raw)
To: Namhyung Kim, Kan Liang
Cc: Ian Rogers, Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar,
LKML, linux-perf-users, Stephane Eranian, Athira Rajeev
On Wed, Jul 31, 2024 at 04:55:01PM -0700, Namhyung Kim wrote:
> The get_sort_order() returns either a new string (from strdup) or NULL
> but it never gets freed.
Applied and added:
Fixes: 2e7f545096f954a9 ("perf mem: Factor out a function to generate sort order")
- Arnaldo
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/builtin-mem.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
> index 863fcd735dae..93413cfcd585 100644
> --- a/tools/perf/builtin-mem.c
> +++ b/tools/perf/builtin-mem.c
> @@ -372,6 +372,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
> rep_argv[i] = argv[j];
>
> ret = cmd_report(i, rep_argv);
> + free(new_sort_order);
> free(rep_argv);
> return ret;
> }
> --
> 2.46.0.rc1.232.g9752f9e123-goog
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1)
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
` (5 preceding siblings ...)
2024-07-31 23:55 ` [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand Namhyung Kim
@ 2024-08-01 15:23 ` Ian Rogers
2024-08-02 0:24 ` Namhyung Kim
6 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2024-08-01 15:23 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
Stephane Eranian, Athira Rajeev
On Wed, Jul 31, 2024 at 4:55 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> As we added data type profiling, 'perf mem report' should support that
> as well. This patches just added a couple of convenient options.
>
> $ perf mem report -T -s mem
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 131 of event 'cpu/mem-loads,ldlat=30/P'
> # Total weight : 18561
> # Sort order : mem,type
> #
> # Overhead Samples Memory access Data Type
> # ........ ............ ....................................... .........
> #
> 14.84% 22 L1 hit (unknown)
> 12.02% 8 RAM hit (unknown)
> 7.68% 8 LFB/MAB hit (unknown)
> 6.29% 12 L1 hit (stack operation)
> 3.97% 5 LFB/MAB hit struct psi_group_cpu
> 3.69% 3 L1 hit struct
> 3.18% 3 LFB/MAB hit (stack operation)
> 2.89% 5 L3 hit (unknown)
> 2.58% 3 L1 hit unsigned int
> 2.31% 2 L1 hit struct psi_group_cpu
> 2.21% 2 LFB/MAB hit struct cfs_rq
> 2.19% 2 RAM hit struct sched_entity
> 2.16% 1 L1 hit struct task_struct
> 1.85% 3 L1 hit struct pcpu_hot
> 1.78% 1 RAM hit struct tss_struct
> 1.72% 1 LFB/MAB hit struct mm_struct
> 1.62% 2 L1 hit struct psi_group
> ...
>
>
> The code is also available at 'perf/mem-type-v1' branch in
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Namhyung Kim (6):
> perf hist: Correct hist_entry->mem_info refcounts
> perf mem: Free the allocated sort string
> perf mem: Rework command option handling
> perf tools: Add mode argument to sort_help()
> perf mem: Add -s/--sort option
> perf mem: Add -T/--data-type option to report subcommand
>
> tools/perf/builtin-mem.c | 100 ++++++++++++++++++++++-------------
> tools/perf/builtin-report.c | 4 +-
> tools/perf/util/hist.c | 14 ++---
> tools/perf/util/map_symbol.c | 18 +++++++
> tools/perf/util/map_symbol.h | 3 ++
> tools/perf/util/mem-info.c | 13 +++++
> tools/perf/util/mem-info.h | 1 +
> tools/perf/util/sort.c | 12 +++--
> tools/perf/util/sort.h | 2 +-
> 9 files changed, 116 insertions(+), 51 deletions(-)
Lgtm, man page updates?
Thanks,
Ian
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1)
2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
@ 2024-08-02 0:24 ` Namhyung Kim
0 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-08-02 0:24 UTC (permalink / raw)
To: Ian Rogers
Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
Stephane Eranian, Athira Rajeev
On Thu, Aug 1, 2024 at 8:23 AM Ian Rogers <irogers@google.com> wrote:
>
> On Wed, Jul 31, 2024 at 4:55 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hello,
> >
> > As we added data type profiling, 'perf mem report' should support that
> > as well. This patches just added a couple of convenient options.
> >
> > $ perf mem report -T -s mem
> > # To display the perf.data header info, please use --header/--header-only options.
> > #
> > #
> > # Total Lost Samples: 0
> > #
> > # Samples: 131 of event 'cpu/mem-loads,ldlat=30/P'
> > # Total weight : 18561
> > # Sort order : mem,type
> > #
> > # Overhead Samples Memory access Data Type
> > # ........ ............ ....................................... .........
> > #
> > 14.84% 22 L1 hit (unknown)
> > 12.02% 8 RAM hit (unknown)
> > 7.68% 8 LFB/MAB hit (unknown)
> > 6.29% 12 L1 hit (stack operation)
> > 3.97% 5 LFB/MAB hit struct psi_group_cpu
> > 3.69% 3 L1 hit struct
> > 3.18% 3 LFB/MAB hit (stack operation)
> > 2.89% 5 L3 hit (unknown)
> > 2.58% 3 L1 hit unsigned int
> > 2.31% 2 L1 hit struct psi_group_cpu
> > 2.21% 2 LFB/MAB hit struct cfs_rq
> > 2.19% 2 RAM hit struct sched_entity
> > 2.16% 1 L1 hit struct task_struct
> > 1.85% 3 L1 hit struct pcpu_hot
> > 1.78% 1 RAM hit struct tss_struct
> > 1.72% 1 LFB/MAB hit struct mm_struct
> > 1.62% 2 L1 hit struct psi_group
> > ...
> >
> >
> > The code is also available at 'perf/mem-type-v1' branch in
> > git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> >
> > Thanks,
> > Namhyung
> >
> >
> > Namhyung Kim (6):
> > perf hist: Correct hist_entry->mem_info refcounts
> > perf mem: Free the allocated sort string
> > perf mem: Rework command option handling
> > perf tools: Add mode argument to sort_help()
> > perf mem: Add -s/--sort option
> > perf mem: Add -T/--data-type option to report subcommand
> >
> > tools/perf/builtin-mem.c | 100 ++++++++++++++++++++++-------------
> > tools/perf/builtin-report.c | 4 +-
> > tools/perf/util/hist.c | 14 ++---
> > tools/perf/util/map_symbol.c | 18 +++++++
> > tools/perf/util/map_symbol.h | 3 ++
> > tools/perf/util/mem-info.c | 13 +++++
> > tools/perf/util/mem-info.h | 1 +
> > tools/perf/util/sort.c | 12 +++--
> > tools/perf/util/sort.h | 2 +-
> > 9 files changed, 116 insertions(+), 51 deletions(-)
>
> Lgtm, man page updates?
Thanks for your review, I'll update the documentation.
Namhyung
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-08-02 0:25 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
2024-08-01 15:20 ` Arnaldo Carvalho de Melo
2024-07-31 23:55 ` [PATCH 3/6] perf mem: Rework command option handling Namhyung Kim
2024-07-31 23:55 ` [PATCH 4/6] perf tools: Add mode argument to sort_help() Namhyung Kim
2024-07-31 23:55 ` [PATCH 5/6] perf mem: Add -s/--sort option Namhyung Kim
2024-07-31 23:55 ` [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand Namhyung Kim
2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
2024-08-02 0:24 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).