linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1)
@ 2024-07-31 23:54 Namhyung Kim
  2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Stephane Eranian, Athira Rajeev

Hello,

As we added data type profiling, 'perf mem report' should support that
as well.  This patches just added a couple of convenient options.

  $ perf mem report -T -s mem
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 131  of event 'cpu/mem-loads,ldlat=30/P'
  # Total weight : 18561
  # Sort order   : mem,type
  #
  # Overhead       Samples  Memory access                            Data Type
  # ........  ............  .......................................  .........
  #
      14.84%            22  L1 hit                                   (unknown)
      12.02%             8  RAM hit                                  (unknown)
       7.68%             8  LFB/MAB hit                              (unknown)
       6.29%            12  L1 hit                                   (stack operation)
       3.97%             5  LFB/MAB hit                              struct psi_group_cpu
       3.69%             3  L1 hit                                   struct   
       3.18%             3  LFB/MAB hit                              (stack operation)
       2.89%             5  L3 hit                                   (unknown)
       2.58%             3  L1 hit                                   unsigned int
       2.31%             2  L1 hit                                   struct psi_group_cpu
       2.21%             2  LFB/MAB hit                              struct cfs_rq
       2.19%             2  RAM hit                                  struct sched_entity
       2.16%             1  L1 hit                                   struct task_struct
       1.85%             3  L1 hit                                   struct pcpu_hot
       1.78%             1  RAM hit                                  struct tss_struct
       1.72%             1  LFB/MAB hit                              struct mm_struct
       1.62%             2  L1 hit                                   struct psi_group
       ...


The code is also available at 'perf/mem-type-v1' branch in
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (6):
  perf hist: Correct hist_entry->mem_info refcounts
  perf mem: Free the allocated sort string
  perf mem: Rework command option handling
  perf tools: Add mode argument to sort_help()
  perf mem: Add -s/--sort option
  perf mem: Add -T/--data-type option to report subcommand

 tools/perf/builtin-mem.c     | 100 ++++++++++++++++++++++-------------
 tools/perf/builtin-report.c  |   4 +-
 tools/perf/util/hist.c       |  14 ++---
 tools/perf/util/map_symbol.c |  18 +++++++
 tools/perf/util/map_symbol.h |   3 ++
 tools/perf/util/mem-info.c   |  13 +++++
 tools/perf/util/mem-info.h   |   1 +
 tools/perf/util/sort.c       |  12 +++--
 tools/perf/util/sort.h       |   2 +-
 9 files changed, 116 insertions(+), 51 deletions(-)

-- 
2.46.0.rc1.232.g9752f9e123-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts
  2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
  2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Stephane Eranian, Athira Rajeev

The mem_info is created by iter_prepare_mem_entry() at the beginning and
destroyed by iter_finish_mem_entry() at the end.  So if it's used in a
new hist_entry, it should be cloned.  Simplify (hopefully) the logic
by adding some helper functions and not holding the refcount in the
temporary entry.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/hist.c       | 14 +++++++-------
 tools/perf/util/map_symbol.c | 18 ++++++++++++++++++
 tools/perf/util/map_symbol.h |  3 +++
 tools/perf/util/mem-info.c   | 13 +++++++++++++
 tools/perf/util/mem-info.h   |  1 +
 5 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index f028f113c4fd..f8ee1cd6929d 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -476,6 +476,12 @@ static int hist_entry__init(struct hist_entry *he,
 		he->branch_info->to.ms.map = map__get(he->branch_info->to.ms.map);
 	}
 
+	if (he->mem_info) {
+		he->mem_info = mem_info__clone(template->mem_info);
+		if (he->mem_info == NULL)
+			goto err_infos;
+	}
+
 	if (hist_entry__has_callchains(he) && symbol_conf.use_callchain)
 		callchain_init(he->callchain);
 
@@ -620,12 +626,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 			if (symbol_conf.cumulate_callchain)
 				he_stat__add_period(he->stat_acc, period);
 
-			/*
-			 * This mem info was allocated from sample__resolve_mem
-			 * and will not be used anymore.
-			 */
-			mem_info__zput(entry->mem_info);
-
 			block_info__delete(entry->block_info);
 
 			kvm_info__zput(entry->kvm_info);
@@ -739,7 +739,7 @@ __hists__add_entry(struct hists *hists,
 		.filtered = symbol__parent_filter(sym_parent) | al->filtered,
 		.hists	= hists,
 		.branch_info = bi,
-		.mem_info = mem_info__get(mi),
+		.mem_info = mi,
 		.kvm_info = ki,
 		.block_info = block_info,
 		.transaction = sample->transaction,
diff --git a/tools/perf/util/map_symbol.c b/tools/perf/util/map_symbol.c
index bef5079f2403..6ad2960bc289 100644
--- a/tools/perf/util/map_symbol.c
+++ b/tools/perf/util/map_symbol.c
@@ -13,3 +13,21 @@ void addr_map_symbol__exit(struct addr_map_symbol *ams)
 {
 	map_symbol__exit(&ams->ms);
 }
+
+void map_symbol__copy(struct map_symbol *dst, struct map_symbol *src)
+{
+	dst->maps = maps__get(src->maps);
+	dst->map = map__get(src->map);
+	dst->sym = src->sym;
+}
+
+void addr_map_symbol__copy(struct addr_map_symbol *dst, struct addr_map_symbol *src)
+{
+	map_symbol__copy(&dst->ms, &src->ms);
+
+	dst->addr = src->addr;
+	dst->al_addr = src->al_addr;
+	dst->al_level = src->al_level;
+	dst->phys_addr = src->phys_addr;
+	dst->data_page_size = src->data_page_size;
+}
diff --git a/tools/perf/util/map_symbol.h b/tools/perf/util/map_symbol.h
index 72d5ed938ed6..e370bb32ed47 100644
--- a/tools/perf/util/map_symbol.h
+++ b/tools/perf/util/map_symbol.h
@@ -26,4 +26,7 @@ struct addr_map_symbol {
 void map_symbol__exit(struct map_symbol *ms);
 void addr_map_symbol__exit(struct addr_map_symbol *ams);
 
+void map_symbol__copy(struct map_symbol *dst, struct map_symbol *src);
+void addr_map_symbol__copy(struct addr_map_symbol *dst, struct addr_map_symbol *src);
+
 #endif // __PERF_MAP_SYMBOL
diff --git a/tools/perf/util/mem-info.c b/tools/perf/util/mem-info.c
index 27d67721a695..d3efa9c139f2 100644
--- a/tools/perf/util/mem-info.c
+++ b/tools/perf/util/mem-info.c
@@ -33,3 +33,16 @@ struct mem_info *mem_info__new(void)
 
 	return result;
 }
+
+struct mem_info *mem_info__clone(struct mem_info *mi)
+{
+	struct mem_info *result = mem_info__new();
+
+	if (result) {
+		addr_map_symbol__copy(mem_info__iaddr(result), mem_info__iaddr(mi));
+		addr_map_symbol__copy(mem_info__daddr(result), mem_info__daddr(mi));
+		mem_info__data_src(result)->val = mem_info__data_src(mi)->val;
+	}
+
+	return result;
+}
diff --git a/tools/perf/util/mem-info.h b/tools/perf/util/mem-info.h
index 0f68e29f311b..df75e94ed3d0 100644
--- a/tools/perf/util/mem-info.h
+++ b/tools/perf/util/mem-info.h
@@ -15,6 +15,7 @@ DECLARE_RC_STRUCT(mem_info) {
 };
 
 struct mem_info *mem_info__new(void);
+struct mem_info *mem_info__clone(struct mem_info *mi);
 struct mem_info *mem_info__get(struct mem_info *mi);
 void   mem_info__put(struct mem_info *mi);
 
-- 
2.46.0.rc1.232.g9752f9e123-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/6] perf mem: Free the allocated sort string
  2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
  2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
  2024-08-01 15:20   ` Arnaldo Carvalho de Melo
  2024-07-31 23:55 ` [PATCH 3/6] perf mem: Rework command option handling Namhyung Kim
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Stephane Eranian, Athira Rajeev

The get_sort_order() returns either a new string (from strdup) or NULL
but it never gets freed.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-mem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 863fcd735dae..93413cfcd585 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -372,6 +372,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
 		rep_argv[i] = argv[j];
 
 	ret = cmd_report(i, rep_argv);
+	free(new_sort_order);
 	free(rep_argv);
 	return ret;
 }
-- 
2.46.0.rc1.232.g9752f9e123-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/6] perf mem: Rework command option handling
  2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
  2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
  2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
  2024-07-31 23:55 ` [PATCH 4/6] perf tools: Add mode argument to sort_help() Namhyung Kim
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Stephane Eranian, Athira Rajeev, Aditya Gupta

Split the common option and ones for record or report.  Otherwise -U in
the record option cannot be used because it clashes with in the common
(or report) option.  Also rename report_events() to __cmd_report() to
follow the convention and to be sync with the record part.

Also set the flag PARSE_OPT_STOP_AT_NON_OPTION for the common option so
that it can show the help message in the subcommand like below:

  $ perf mem record -h

   Usage: perf mem record [<options>] [<command>]
      or: perf mem record [<options>] -- <command> [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -e, --event <event>   event selector. use 'perf mem record -e list' to list available events
      -f, --force           don't complain, do it
      -K, --all-kernel      collect only kernel level data
      -p, --phys-data       Record/Report sample physical addresses
      -t, --type <type>     memory operations(load,store) Default load,store
      -U, --all-user        collect only user level data
      -v, --verbose         be more verbose (show counter open errors, etc)
          --data-page-size  Record/Report sample data address page size
          --ldlat <n>       mem-loads latency

Cc: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-mem.c | 79 +++++++++++++++++++++++-----------------
 1 file changed, 45 insertions(+), 34 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 93413cfcd585..819edaf6b1df 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -34,6 +34,8 @@ struct perf_mem {
 	bool			force;
 	bool			phys_addr;
 	bool			data_page_size;
+	bool			all_kernel;
+	bool			all_user;
 	int			operation;
 	const char		*cpu_list;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -62,33 +64,19 @@ static int parse_record_events(const struct option *opt,
 	return 0;
 }
 
-static const char * const __usage[] = {
-	"perf mem record [<options>] [<command>]",
-	"perf mem record [<options>] -- <command> [<options>]",
-	NULL
-};
-
-static const char * const *record_mem_usage = __usage;
-
-static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
+static int __cmd_record(int argc, const char **argv, struct perf_mem *mem,
+			const struct option *options)
 {
 	int rec_argc, i = 0, j;
 	int start, end;
 	const char **rec_argv;
 	int ret;
-	bool all_user = false, all_kernel = false;
 	struct perf_mem_event *e;
 	struct perf_pmu *pmu;
-	struct option options[] = {
-	OPT_CALLBACK('e', "event", &mem, "event",
-		     "event selector. use 'perf mem record -e list' to list available events",
-		     parse_record_events),
-	OPT_UINTEGER(0, "ldlat", &perf_mem_events__loads_ldlat, "mem-loads latency"),
-	OPT_INCR('v', "verbose", &verbose,
-		 "be more verbose (show counter open errors, etc)"),
-	OPT_BOOLEAN('U', "all-user", &all_user, "collect only user level data"),
-	OPT_BOOLEAN('K', "all-kernel", &all_kernel, "collect only kernel level data"),
-	OPT_END()
+	const char * const record_usage[] = {
+		"perf mem record [<options>] [<command>]",
+		"perf mem record [<options>] -- <command> [<options>]",
+		NULL
 	};
 
 	pmu = perf_mem_events_find_pmu();
@@ -102,7 +90,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 		return -1;
 	}
 
-	argc = parse_options(argc, argv, options, record_mem_usage,
+	argc = parse_options(argc, argv, options, record_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
 
 	/* Max number of arguments multiplied by number of PMUs that can support them. */
@@ -158,10 +146,10 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 		goto out;
 	end = i;
 
-	if (all_user)
+	if (mem->all_user)
 		rec_argv[i++] = "--all-user";
 
-	if (all_kernel)
+	if (mem->all_kernel)
 		rec_argv[i++] = "--all-kernel";
 
 	if (mem->cpu_list) {
@@ -319,6 +307,7 @@ static int report_raw_events(struct perf_mem *mem)
 	perf_session__delete(session);
 	return ret;
 }
+
 static char *get_sort_order(struct perf_mem *mem)
 {
 	bool has_extra_options = (mem->phys_addr | mem->data_page_size) ? true : false;
@@ -346,11 +335,19 @@ static char *get_sort_order(struct perf_mem *mem)
 	return strdup(sort);
 }
 
-static int report_events(int argc, const char **argv, struct perf_mem *mem)
+static int __cmd_report(int argc, const char **argv, struct perf_mem *mem,
+			const struct option *options)
 {
 	const char **rep_argv;
 	int ret, i = 0, j, rep_argc;
 	char *new_sort_order;
+	const char * const report_usage[] = {
+		"perf mem report [<options>]",
+		NULL
+	};
+
+	argc = parse_options(argc, argv, options, report_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
 
 	if (mem->dump_raw)
 		return report_raw_events(mem);
@@ -368,7 +365,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
 	if (new_sort_order)
 		rep_argv[i++] = new_sort_order;
 
-	for (j = 1; j < argc; j++, i++)
+	for (j = 0; j < argc; j++, i++)
 		rep_argv[i] = argv[j];
 
 	ret = cmd_report(i, rep_argv);
@@ -475,22 +472,36 @@ int cmd_mem(int argc, const char **argv)
 	OPT_CALLBACK('t', "type", &mem.operation,
 		   "type", "memory operations(load,store) Default load,store",
 		    parse_mem_ops),
+	OPT_STRING('C', "cpu", &mem.cpu_list, "cpu",
+		   "list of cpus to profile"),
+	OPT_BOOLEAN('f', "force", &mem.force, "don't complain, do it"),
+	OPT_INCR('v', "verbose", &verbose,
+		 "be more verbose (show counter open errors, etc)"),
+	OPT_BOOLEAN('p', "phys-data", &mem.phys_addr, "Record/Report sample physical addresses"),
+	OPT_BOOLEAN(0, "data-page-size", &mem.data_page_size, "Record/Report sample data address page size"),
+	OPT_END()
+	};
+	const struct option record_options[] = {
+	OPT_CALLBACK('e', "event", &mem, "event",
+		     "event selector. use 'perf mem record -e list' to list available events",
+		     parse_record_events),
+	OPT_UINTEGER(0, "ldlat", &perf_mem_events__loads_ldlat, "mem-loads latency"),
+	OPT_BOOLEAN('U', "all-user", &mem.all_user, "collect only user level data"),
+	OPT_BOOLEAN('K', "all-kernel", &mem.all_kernel, "collect only kernel level data"),
+	OPT_PARENT(mem_options)
+	};
+	const struct option report_options[] = {
 	OPT_BOOLEAN('D', "dump-raw-samples", &mem.dump_raw,
 		    "dump raw samples in ASCII"),
 	OPT_BOOLEAN('U', "hide-unresolved", &mem.hide_unresolved,
 		    "Only display entries resolved to a symbol"),
 	OPT_STRING('i', "input", &input_name, "file",
 		   "input file name"),
-	OPT_STRING('C', "cpu", &mem.cpu_list, "cpu",
-		   "list of cpus to profile"),
 	OPT_STRING_NOEMPTY('x', "field-separator", &symbol_conf.field_sep,
 		   "separator",
 		   "separator for columns, no spaces will be added"
 		   " between columns '.' is reserved."),
-	OPT_BOOLEAN('f', "force", &mem.force, "don't complain, do it"),
-	OPT_BOOLEAN('p', "phys-data", &mem.phys_addr, "Record/Report sample physical addresses"),
-	OPT_BOOLEAN(0, "data-page-size", &mem.data_page_size, "Record/Report sample data address page size"),
-	OPT_END()
+	OPT_PARENT(mem_options)
 	};
 	const char *const mem_subcommands[] = { "record", "report", NULL };
 	const char *mem_usage[] = {
@@ -499,7 +510,7 @@ int cmd_mem(int argc, const char **argv)
 	};
 
 	argc = parse_options_subcommand(argc, argv, mem_options, mem_subcommands,
-					mem_usage, PARSE_OPT_KEEP_UNKNOWN);
+					mem_usage, PARSE_OPT_STOP_AT_NON_OPTION);
 
 	if (!argc || !(strncmp(argv[0], "rec", 3) || mem.operation))
 		usage_with_options(mem_usage, mem_options);
@@ -512,9 +523,9 @@ int cmd_mem(int argc, const char **argv)
 	}
 
 	if (strlen(argv[0]) > 2 && strstarts("record", argv[0]))
-		return __cmd_record(argc, argv, &mem);
+		return __cmd_record(argc, argv, &mem, record_options);
 	else if (strlen(argv[0]) > 2 && strstarts("report", argv[0]))
-		return report_events(argc, argv, &mem);
+		return __cmd_report(argc, argv, &mem, report_options);
 	else
 		usage_with_options(mem_usage, mem_options);
 
-- 
2.46.0.rc1.232.g9752f9e123-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/6] perf tools: Add mode argument to sort_help()
  2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
                   ` (2 preceding siblings ...)
  2024-07-31 23:55 ` [PATCH 3/6] perf mem: Rework command option handling Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
  2024-07-31 23:55 ` [PATCH 5/6] perf mem: Add -s/--sort option Namhyung Kim
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Stephane Eranian, Athira Rajeev

Some sort keys are meaningful only in a specific mode - like branch
stack and memory (data-src).  Add the mode to skip unnecessary ones.
This will be used for 'perf mem report' later.

While at it, change the prefix for the -F/--fields option to remove
the duplicate part.

Before:
  $ perf report -F
   Error: switch `F' requires a value
   Usage: perf report [<options>]

      -F, --fields <key[,keys...]>
  			  output field(s): overhead period sample  overhead overhead_sys
  			  overhead_us overhead_guest_sys overhead_guest_us overhead_children
  			  sample period weight1 weight2 weight3 ins_lat retire_lat
  			  ...
After:
  $ perf report -F
   Error: switch `F' requires a value
   Usage: perf report [<options>]

      -F, --fields <key[,keys...]>
  			  output field(s): overhead overhead_sys overhead_us
  			  overhead_guest_sys overhead_guest_us overhead_children
  			  sample period weight1 weight2 weight3 ins_lat retire_lat
  			  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c |  4 ++--
 tools/perf/util/sort.c      | 12 +++++++-----
 tools/perf/util/sort.h      |  2 +-
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6edc0d4ce6fb..930052961c1a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1301,8 +1301,8 @@ int cmd_report(int argc, const char **argv)
 		.socket_filter		 = -1,
 		.skip_empty		 = true,
 	};
-	char *sort_order_help = sort_help("sort by key(s):");
-	char *field_order_help = sort_help("output field(s): overhead period sample ");
+	char *sort_order_help = sort_help("sort by key(s):", SORT_MODE__NORMAL);
+	char *field_order_help = sort_help("output field(s):", SORT_MODE__NORMAL);
 	const char *disassembler_style = NULL, *objdump_path = NULL, *addr2line_path = NULL;
 	const struct option options[] = {
 	OPT_STRING('i', "input", &input_name, "file",
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index ab7c7ff35f9b..c4046d5d1749 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -3960,7 +3960,7 @@ static void add_hpp_sort_string(struct strbuf *sb, struct hpp_dimension *s, int
 		add_key(sb, s[i].name, llen);
 }
 
-char *sort_help(const char *prefix)
+char *sort_help(const char *prefix, enum sort_mode mode)
 {
 	struct strbuf sb;
 	char *s;
@@ -3972,10 +3972,12 @@ char *sort_help(const char *prefix)
 			    ARRAY_SIZE(hpp_sort_dimensions), &len);
 	add_sort_string(&sb, common_sort_dimensions,
 			    ARRAY_SIZE(common_sort_dimensions), &len);
-	add_sort_string(&sb, bstack_sort_dimensions,
-			    ARRAY_SIZE(bstack_sort_dimensions), &len);
-	add_sort_string(&sb, memory_sort_dimensions,
-			    ARRAY_SIZE(memory_sort_dimensions), &len);
+	if (mode == SORT_MODE__NORMAL || mode == SORT_MODE__BRANCH)
+		add_sort_string(&sb, bstack_sort_dimensions,
+				ARRAY_SIZE(bstack_sort_dimensions), &len);
+	if (mode == SORT_MODE__NORMAL || mode == SORT_MODE__MEMORY)
+		add_sort_string(&sb, memory_sort_dimensions,
+				ARRAY_SIZE(memory_sort_dimensions), &len);
 	s = strbuf_detach(&sb, NULL);
 	strbuf_release(&sb);
 	return s;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 0bd0ee3ae76b..6357bc32c5ca 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -130,7 +130,7 @@ void reset_output_field(void);
 void sort__setup_elide(FILE *fp);
 void perf_hpp__set_elide(int idx, bool elide);
 
-char *sort_help(const char *prefix);
+char *sort_help(const char *prefix, enum sort_mode mode);
 
 int report_parse_ignore_callees_opt(const struct option *opt, const char *arg, int unset);
 
-- 
2.46.0.rc1.232.g9752f9e123-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/6] perf mem: Add -s/--sort option
  2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
                   ` (3 preceding siblings ...)
  2024-07-31 23:55 ` [PATCH 4/6] perf tools: Add mode argument to sort_help() Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
  2024-07-31 23:55 ` [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand Namhyung Kim
  2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
  6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Stephane Eranian, Athira Rajeev

So that users can set the sort key manually as they want.

  $ perf mem report -s
   Error: switch `s' requires a value
   Usage: perf mem report [<options>]

      -s, --sort <key[,key2...]>
                          sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys
  			  overhead_guest_us overhead_children sample period
  			  weight1 weight2 weight3 ins_lat retire_lat p_stage_cyc
  			  pid comm dso symbol parent cpu socket srcline srcfile
  			  local_weight weight transaction trace symbol_size
  			  dso_size cgroup cgroup_id ipc_null time code_page_size
  			  local_ins_lat ins_lat local_p_stage_cyc p_stage_cyc
  			  addr local_retire_lat retire_lat simd type typeoff
  			  symoff symbol_daddr dso_daddr locked tlb mem snoop
  			  dcacheline symbol_iaddr phys_daddr data_page_size
  			  blocked

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-mem.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 819edaf6b1df..24a4f0084f49 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -19,6 +19,7 @@
 #include "util/symbol.h"
 #include "util/pmus.h"
 #include "util/sample.h"
+#include "util/sort.h"
 #include "util/string2.h"
 #include "util/util.h"
 #include <linux/err.h>
@@ -28,7 +29,8 @@
 
 struct perf_mem {
 	struct perf_tool	tool;
-	char const		*input_name;
+	const char		*input_name;
+	const char		*sort_key;
 	bool			hide_unresolved;
 	bool			dump_raw;
 	bool			force;
@@ -313,11 +315,13 @@ static char *get_sort_order(struct perf_mem *mem)
 	bool has_extra_options = (mem->phys_addr | mem->data_page_size) ? true : false;
 	char sort[128];
 
+	if (mem->sort_key)
+		scnprintf(sort, sizeof(sort), "--sort=%s", mem->sort_key);
 	/*
 	 * there is no weight (cost) associated with stores, so don't print
 	 * the column
 	 */
-	if (!(mem->operation & MEM_OPERATION_LOAD)) {
+	else if (!(mem->operation & MEM_OPERATION_LOAD)) {
 		strcpy(sort, "--sort=mem,sym,dso,symbol_daddr,"
 			     "dso_daddr,tlb,locked");
 	} else if (has_extra_options) {
@@ -468,6 +472,7 @@ int cmd_mem(int argc, const char **argv)
 		 */
 		.operation		 = MEM_OPERATION_LOAD | MEM_OPERATION_STORE,
 	};
+	char *sort_order_help = sort_help("sort by key(s):", SORT_MODE__MEMORY);
 	const struct option mem_options[] = {
 	OPT_CALLBACK('t', "type", &mem.operation,
 		   "type", "memory operations(load,store) Default load,store",
@@ -501,6 +506,8 @@ int cmd_mem(int argc, const char **argv)
 		   "separator",
 		   "separator for columns, no spaces will be added"
 		   " between columns '.' is reserved."),
+	OPT_STRING('s', "sort", &mem.sort_key, "key[,key2...]",
+		   sort_order_help),
 	OPT_PARENT(mem_options)
 	};
 	const char *const mem_subcommands[] = { "record", "report", NULL };
-- 
2.46.0.rc1.232.g9752f9e123-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand
  2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
                   ` (4 preceding siblings ...)
  2024-07-31 23:55 ` [PATCH 5/6] perf mem: Add -s/--sort option Namhyung Kim
@ 2024-07-31 23:55 ` Namhyung Kim
  2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
  6 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-07-31 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Stephane Eranian, Athira Rajeev

This is just a shortcut to have 'type' in the sort key and use more
compact output format like below.

  $ perf mem report -T
  ...
  #
  # Overhead       Samples  Memory access                            Snoop         TLB access              Data Type
  # ........  ............  .......................................  ............  ......................  .........
  #
      14.84%            22  L1 hit                                   None          L1 or L2 hit            (unknown)
       7.68%             8  LFB/MAB hit                              None          L1 or L2 hit            (unknown)
       7.17%             3  RAM hit                                  Hit           L2 miss                 (unknown)
       6.29%            12  L1 hit                                   None          L1 or L2 hit            (stack operation)
       4.85%             5  RAM hit                                  Hit           L1 or L2 hit            (unknown)
       3.97%             5  LFB/MAB hit                              None          L1 or L2 hit            struct psi_group_cpu
       3.18%             3  LFB/MAB hit                              None          L1 or L2 hit            (stack operation)
       2.58%             3  L1 hit                                   None          L1 or L2 hit            unsigned int
       2.36%             2  L1 hit                                   None          L1 or L2 hit            struct
       2.31%             2  L1 hit                                   None          L1 or L2 hit            struct psi_group_cpu
  ...

Users also can use their own sort keys and -T option makes sure it has
the 'type' sort key at the end.

  $ perf mem report -T -s mem

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-mem.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 24a4f0084f49..efa700d14c98 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -38,6 +38,7 @@ struct perf_mem {
 	bool			data_page_size;
 	bool			all_kernel;
 	bool			all_user;
+	bool			data_type;
 	int			operation;
 	const char		*cpu_list;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -317,6 +318,8 @@ static char *get_sort_order(struct perf_mem *mem)
 
 	if (mem->sort_key)
 		scnprintf(sort, sizeof(sort), "--sort=%s", mem->sort_key);
+	else if (mem->data_type)
+		strcpy(sort, "--sort=mem,snoop,tlb,type");
 	/*
 	 * there is no weight (cost) associated with stores, so don't print
 	 * the column
@@ -336,6 +339,10 @@ static char *get_sort_order(struct perf_mem *mem)
 	if (mem->data_page_size)
 		strcat(sort, ",data_page_size");
 
+	/* make sure it has 'type' sort key even -s option is used */
+	if (mem->data_type && !strstr(sort, "type"))
+		strcat(sort, ",type");
+
 	return strdup(sort);
 }
 
@@ -508,6 +515,8 @@ int cmd_mem(int argc, const char **argv)
 		   " between columns '.' is reserved."),
 	OPT_STRING('s', "sort", &mem.sort_key, "key[,key2...]",
 		   sort_order_help),
+	OPT_BOOLEAN('T', "type-profile", &mem.data_type,
+		    "Show data-type profile result"),
 	OPT_PARENT(mem_options)
 	};
 	const char *const mem_subcommands[] = { "record", "report", NULL };
-- 
2.46.0.rc1.232.g9752f9e123-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/6] perf mem: Free the allocated sort string
  2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
@ 2024-08-01 15:20   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-01 15:20 UTC (permalink / raw)
  To: Namhyung Kim, Kan Liang
  Cc: Ian Rogers, Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar,
	LKML, linux-perf-users, Stephane Eranian, Athira Rajeev

On Wed, Jul 31, 2024 at 04:55:01PM -0700, Namhyung Kim wrote:
> The get_sort_order() returns either a new string (from strdup) or NULL
> but it never gets freed.

Applied and added:

Fixes: 2e7f545096f954a9 ("perf mem: Factor out a function to generate sort order")

- Arnaldo
 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-mem.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
> index 863fcd735dae..93413cfcd585 100644
> --- a/tools/perf/builtin-mem.c
> +++ b/tools/perf/builtin-mem.c
> @@ -372,6 +372,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
>  		rep_argv[i] = argv[j];
>  
>  	ret = cmd_report(i, rep_argv);
> +	free(new_sort_order);
>  	free(rep_argv);
>  	return ret;
>  }
> -- 
> 2.46.0.rc1.232.g9752f9e123-goog

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1)
  2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
                   ` (5 preceding siblings ...)
  2024-07-31 23:55 ` [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand Namhyung Kim
@ 2024-08-01 15:23 ` Ian Rogers
  2024-08-02  0:24   ` Namhyung Kim
  6 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2024-08-01 15:23 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Stephane Eranian, Athira Rajeev

On Wed, Jul 31, 2024 at 4:55 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> As we added data type profiling, 'perf mem report' should support that
> as well.  This patches just added a couple of convenient options.
>
>   $ perf mem report -T -s mem
>   # To display the perf.data header info, please use --header/--header-only options.
>   #
>   #
>   # Total Lost Samples: 0
>   #
>   # Samples: 131  of event 'cpu/mem-loads,ldlat=30/P'
>   # Total weight : 18561
>   # Sort order   : mem,type
>   #
>   # Overhead       Samples  Memory access                            Data Type
>   # ........  ............  .......................................  .........
>   #
>       14.84%            22  L1 hit                                   (unknown)
>       12.02%             8  RAM hit                                  (unknown)
>        7.68%             8  LFB/MAB hit                              (unknown)
>        6.29%            12  L1 hit                                   (stack operation)
>        3.97%             5  LFB/MAB hit                              struct psi_group_cpu
>        3.69%             3  L1 hit                                   struct
>        3.18%             3  LFB/MAB hit                              (stack operation)
>        2.89%             5  L3 hit                                   (unknown)
>        2.58%             3  L1 hit                                   unsigned int
>        2.31%             2  L1 hit                                   struct psi_group_cpu
>        2.21%             2  LFB/MAB hit                              struct cfs_rq
>        2.19%             2  RAM hit                                  struct sched_entity
>        2.16%             1  L1 hit                                   struct task_struct
>        1.85%             3  L1 hit                                   struct pcpu_hot
>        1.78%             1  RAM hit                                  struct tss_struct
>        1.72%             1  LFB/MAB hit                              struct mm_struct
>        1.62%             2  L1 hit                                   struct psi_group
>        ...
>
>
> The code is also available at 'perf/mem-type-v1' branch in
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Namhyung Kim (6):
>   perf hist: Correct hist_entry->mem_info refcounts
>   perf mem: Free the allocated sort string
>   perf mem: Rework command option handling
>   perf tools: Add mode argument to sort_help()
>   perf mem: Add -s/--sort option
>   perf mem: Add -T/--data-type option to report subcommand
>
>  tools/perf/builtin-mem.c     | 100 ++++++++++++++++++++++-------------
>  tools/perf/builtin-report.c  |   4 +-
>  tools/perf/util/hist.c       |  14 ++---
>  tools/perf/util/map_symbol.c |  18 +++++++
>  tools/perf/util/map_symbol.h |   3 ++
>  tools/perf/util/mem-info.c   |  13 +++++
>  tools/perf/util/mem-info.h   |   1 +
>  tools/perf/util/sort.c       |  12 +++--
>  tools/perf/util/sort.h       |   2 +-
>  9 files changed, 116 insertions(+), 51 deletions(-)

Lgtm, man page updates?

Thanks,
Ian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1)
  2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
@ 2024-08-02  0:24   ` Namhyung Kim
  0 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2024-08-02  0:24 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Kan Liang, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Stephane Eranian, Athira Rajeev

On Thu, Aug 1, 2024 at 8:23 AM Ian Rogers <irogers@google.com> wrote:
>
> On Wed, Jul 31, 2024 at 4:55 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hello,
> >
> > As we added data type profiling, 'perf mem report' should support that
> > as well.  This patches just added a couple of convenient options.
> >
> >   $ perf mem report -T -s mem
> >   # To display the perf.data header info, please use --header/--header-only options.
> >   #
> >   #
> >   # Total Lost Samples: 0
> >   #
> >   # Samples: 131  of event 'cpu/mem-loads,ldlat=30/P'
> >   # Total weight : 18561
> >   # Sort order   : mem,type
> >   #
> >   # Overhead       Samples  Memory access                            Data Type
> >   # ........  ............  .......................................  .........
> >   #
> >       14.84%            22  L1 hit                                   (unknown)
> >       12.02%             8  RAM hit                                  (unknown)
> >        7.68%             8  LFB/MAB hit                              (unknown)
> >        6.29%            12  L1 hit                                   (stack operation)
> >        3.97%             5  LFB/MAB hit                              struct psi_group_cpu
> >        3.69%             3  L1 hit                                   struct
> >        3.18%             3  LFB/MAB hit                              (stack operation)
> >        2.89%             5  L3 hit                                   (unknown)
> >        2.58%             3  L1 hit                                   unsigned int
> >        2.31%             2  L1 hit                                   struct psi_group_cpu
> >        2.21%             2  LFB/MAB hit                              struct cfs_rq
> >        2.19%             2  RAM hit                                  struct sched_entity
> >        2.16%             1  L1 hit                                   struct task_struct
> >        1.85%             3  L1 hit                                   struct pcpu_hot
> >        1.78%             1  RAM hit                                  struct tss_struct
> >        1.72%             1  LFB/MAB hit                              struct mm_struct
> >        1.62%             2  L1 hit                                   struct psi_group
> >        ...
> >
> >
> > The code is also available at 'perf/mem-type-v1' branch in
> > git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> >
> > Thanks,
> > Namhyung
> >
> >
> > Namhyung Kim (6):
> >   perf hist: Correct hist_entry->mem_info refcounts
> >   perf mem: Free the allocated sort string
> >   perf mem: Rework command option handling
> >   perf tools: Add mode argument to sort_help()
> >   perf mem: Add -s/--sort option
> >   perf mem: Add -T/--data-type option to report subcommand
> >
> >  tools/perf/builtin-mem.c     | 100 ++++++++++++++++++++++-------------
> >  tools/perf/builtin-report.c  |   4 +-
> >  tools/perf/util/hist.c       |  14 ++---
> >  tools/perf/util/map_symbol.c |  18 +++++++
> >  tools/perf/util/map_symbol.h |   3 ++
> >  tools/perf/util/mem-info.c   |  13 +++++
> >  tools/perf/util/mem-info.h   |   1 +
> >  tools/perf/util/sort.c       |  12 +++--
> >  tools/perf/util/sort.h       |   2 +-
> >  9 files changed, 116 insertions(+), 51 deletions(-)
>
> Lgtm, man page updates?

Thanks for your review, I'll update the documentation.
Namhyung

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-08-02  0:25 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-31 23:54 [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Namhyung Kim
2024-07-31 23:55 ` [PATCH 1/6] perf hist: Correct hist_entry->mem_info refcounts Namhyung Kim
2024-07-31 23:55 ` [PATCH 2/6] perf mem: Free the allocated sort string Namhyung Kim
2024-08-01 15:20   ` Arnaldo Carvalho de Melo
2024-07-31 23:55 ` [PATCH 3/6] perf mem: Rework command option handling Namhyung Kim
2024-07-31 23:55 ` [PATCH 4/6] perf tools: Add mode argument to sort_help() Namhyung Kim
2024-07-31 23:55 ` [PATCH 5/6] perf mem: Add -s/--sort option Namhyung Kim
2024-07-31 23:55 ` [PATCH 6/6] perf mem: Add -T/--data-type option to report subcommand Namhyung Kim
2024-08-01 15:23 ` [PATCHSET 0/6] perf mem: Basic support for data type profiling (v1) Ian Rogers
2024-08-02  0:24   ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).