* [PATCH v4 00/48] Refactor perf cpumap
@ 2022-01-05  6:13 Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 01/48] libperf: Add comments to perf_cpu_map Ian Rogers
                   ` (48 more replies)
  0 siblings, 49 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Perf cpu map has various functions where a cpumap and index are passed
in order to load the cpu. A problem with this is that the wrong index
may be passed for the cpumap, causing problems like aggregation on the
wrong CPU:
https://lore.kernel.org/lkml/20211204023409.969668-1-irogers@google.com/
This patch set refactors the cpu map API, reducing it and explicitly
passing the cpu (rather than the pair) to functions that need
it. Comments are added at the same time. Changes modify the same
file/function more than once as refactoring and fixes are broken apart
for the sake of bisection.
v2. Incorproates fixes suggested Jiri Olsa, rewrites the evlist CPU
    iterator in part in a way suggested by Riccardo Mancini. The new
    fixes start at patch 23. The final change was suggested by John
    Garry to make the CPUs have their own struct wrapper.
v3. Incorporates fixes suggested by Namhyung Kim.
v4. Rebases and incorporates fixes suggested by Jiri Olsa.
Ian Rogers (48):
  libperf: Add comments to perf_cpu_map.
  perf stat: Add aggr creators that are passed a cpu.
  perf stat: Correct aggregation CPU map
  perf stat: Switch aggregation to use for_each loop
  perf stat: Switch to cpu version of cpu_map__get
  perf cpumap: Switch cpu_map__build_map to cpu function
  perf cpumap: Remove map+index get_socket
  perf cpumap: Remove map+index get_die
  perf cpumap: Remove map+index get_core
  perf cpumap: Remove map+index get_node
  perf cpumap: Add comments to aggr_cpu_id
  perf cpumap: Remove unused cpu_map__socket
  perf cpumap: Simplify equal function name.
  perf cpumap: Rename empty functions.
  perf cpumap: Document cpu__get_node and remove redundant function
  perf cpumap: Remove map from function names that don't use a map.
  perf cpumap: Remove cpu_map__cpu, use libperf function.
  perf cpumap: Refactor cpu_map__build_map
  perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions
  perf cpumap: Move 'has' function to libperf
  perf cpumap: Add some comments to cpu_aggr_map
  perf cpumap: Trim the cpu_aggr_map
  perf stat: Fix memory leak in check_per_pkg
  perf cpumap: Add CPU to aggr_cpu_id
  perf stat-display: Avoid use of core for CPU.
  perf evsel: Derive CPUs and threads in alloc_counts
  libperf: Switch cpu to more accurate cpu_map_idx
  libperf: Use cpu not index for evsel mmap
  perf counts: Switch name cpu to cpu_map_idx
  perf stat: Rename aggr_data cpu to imply it's an index
  perf stat: Use perf_cpu_map__for_each_cpu
  perf script: Use for each cpu to aid readability
  libperf: Allow NULL in perf_cpu_map__idx
  perf evlist: Refactor evlist__for_each_cpu.
  perf evsel: Pass cpu not cpu map index to synthesize
  perf stat: Correct variable name for read counter
  perf evsel: Rename CPU around get_group_fd
  perf evsel: Reduce scope of evsel__ignore_missing_thread
  perf evsel: Rename variable cpu to index
  perf test: Use perf_cpu_map__for_each_cpu
  perf stat: Correct check_per_pkg cpu
  perf stat: Swap variable name cpu to index
  libperf: Sync evsel documentation
  perf bpf: Rename cpu to cpu_map_idx
  perf c2c: Use more intention revealing iterator
  perf script: Fix flipped index and cpu
  perf stat: Correct first_shadow_cpu to return index
  perf cpumap: Give CPUs their own type.
 tools/lib/perf/Documentation/libperf.txt      |  11 +-
 tools/lib/perf/cpumap.c                       | 131 +++--
 tools/lib/perf/evlist.c                       |   4 +-
 tools/lib/perf/evsel.c                        |  92 ++--
 tools/lib/perf/include/internal/cpumap.h      |  18 +-
 tools/lib/perf/include/internal/evlist.h      |   3 +-
 tools/lib/perf/include/internal/evsel.h       |   4 +-
 tools/lib/perf/include/internal/mmap.h        |   5 +-
 tools/lib/perf/include/perf/cpumap.h          |   8 +-
 tools/lib/perf/include/perf/evsel.h           |  10 +-
 tools/lib/perf/libperf.map                    |   1 +
 tools/lib/perf/mmap.c                         |   2 +-
 tools/perf/arch/arm/util/cs-etm.c             |  16 +-
 tools/perf/bench/epoll-ctl.c                  |   2 +-
 tools/perf/bench/epoll-wait.c                 |   2 +-
 tools/perf/bench/futex-hash.c                 |   2 +-
 tools/perf/bench/futex-lock-pi.c              |   2 +-
 tools/perf/bench/futex-requeue.c              |   2 +-
 tools/perf/bench/futex-wake-parallel.c        |   2 +-
 tools/perf/bench/futex-wake.c                 |   2 +-
 tools/perf/builtin-c2c.c                      |  15 +-
 tools/perf/builtin-ftrace.c                   |   2 +-
 tools/perf/builtin-kmem.c                     |   2 +-
 tools/perf/builtin-record.c                   |   2 +-
 tools/perf/builtin-sched.c                    |  71 +--
 tools/perf/builtin-script.c                   |  10 +-
 tools/perf/builtin-stat.c                     | 516 +++++++++---------
 tools/perf/tests/attr.c                       |   6 +-
 tools/perf/tests/bitmap.c                     |   2 +-
 tools/perf/tests/cpumap.c                     |   6 +-
 tools/perf/tests/event_update.c               |   6 +-
 tools/perf/tests/mem2node.c                   |   2 +-
 tools/perf/tests/mmap-basic.c                 |   4 +-
 tools/perf/tests/openat-syscall-all-cpus.c    |  39 +-
 tools/perf/tests/stat.c                       |   3 +-
 tools/perf/tests/topology.c                   |  43 +-
 tools/perf/util/affinity.c                    |   2 +-
 tools/perf/util/auxtrace.c                    |  12 +-
 tools/perf/util/auxtrace.h                    |   5 +-
 tools/perf/util/bpf_counter.c                 |  16 +-
 tools/perf/util/bpf_counter.h                 |   4 +-
 tools/perf/util/counts.c                      |   8 +-
 tools/perf/util/counts.h                      |  14 +-
 tools/perf/util/cpumap.c                      | 253 ++++-----
 tools/perf/util/cpumap.h                      | 116 ++--
 tools/perf/util/cputopo.c                     |   6 +-
 tools/perf/util/env.c                         |  29 +-
 tools/perf/util/env.h                         |   3 +-
 tools/perf/util/evlist.c                      | 148 ++---
 tools/perf/util/evlist.h                      |  50 +-
 tools/perf/util/evsel.c                       | 143 ++---
 tools/perf/util/evsel.h                       |  27 +-
 tools/perf/util/expr.c                        |   2 +-
 tools/perf/util/header.c                      |   6 +-
 tools/perf/util/mmap.c                        |  19 +-
 tools/perf/util/mmap.h                        |   3 +-
 tools/perf/util/perf_api_probe.c              |  15 +-
 tools/perf/util/python.c                      |   4 +-
 tools/perf/util/record.c                      |  11 +-
 .../scripting-engines/trace-event-python.c    |   6 +-
 tools/perf/util/session.c                     |  10 +-
 tools/perf/util/stat-display.c                | 138 ++---
 tools/perf/util/stat-shadow.c                 | 308 +++++------
 tools/perf/util/stat.c                        |  47 +-
 tools/perf/util/stat.h                        |   9 +-
 tools/perf/util/svghelper.c                   |   6 +-
 tools/perf/util/synthetic-events.c            |  12 +-
 tools/perf/util/synthetic-events.h            |   3 +-
 tools/perf/util/util.h                        |   5 +-
 69 files changed, 1333 insertions(+), 1155 deletions(-)
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply	[flat|nested] 84+ messages in thread
* [PATCH v4 01/48] libperf: Add comments to perf_cpu_map.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-10 17:00   ` John Garry
  2022-01-12  5:35   ` kajoljain
  2022-01-05  6:13 ` [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu Ian Rogers
                   ` (47 subsequent siblings)
  48 siblings, 2 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
A particular observed problem is confusing the index with the CPU value,
documentation should hopefully reduce this type of problem.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/include/internal/cpumap.h | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 840d4032587b..4054169c12c5 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -4,9 +4,18 @@
 
 #include <linux/refcount.h>
 
+/**
+ * A sized, reference counted, sorted array of integers representing CPU
+ * numbers. This is commonly used to capture which CPUs a PMU is associated
+ * with. The indices into the cpumap are frequently used as they avoid having
+ * gaps if CPU numbers were used. For events associated with a pid, rather than
+ * a CPU, a single dummy map with an entry of -1 is used.
+ */
 struct perf_cpu_map {
 	refcount_t	refcnt;
+	/** Length of the map array. */
 	int		nr;
+	/** The CPU values. */
 	int		map[];
 };
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 01/48] libperf: Add comments to perf_cpu_map Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-10 17:10   ` John Garry
                     ` (2 more replies)
  2022-01-05  6:13 ` [PATCH v4 03/48] perf stat: Correct aggregation CPU map Ian Rogers
                   ` (46 subsequent siblings)
  48 siblings, 3 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
The cpu_map and index can get confused. Add variants of the cpu_map__get
routines that are passed a cpu. Make the existing cpu_map__get routines
use the new functions with a view to remove them when no longer used.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
 tools/perf/util/cpumap.h |  6 ++-
 2 files changed, 51 insertions(+), 34 deletions(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 87d3eca9b872..49fba2c53822 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
 	return ret ?: value;
 }
 
-struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
-					void *data __maybe_unused)
+struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
 {
-	int cpu;
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
 
-	if (idx > map->nr)
-		return id;
-
-	cpu = map->map[idx];
-
 	id.socket = cpu_map__get_socket_id(cpu);
 	return id;
 }
 
+struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
+				       void *data)
+{
+	if (idx < 0 || idx > map->nr)
+		return cpu_map__empty_aggr_cpu_id();
+
+	return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
+}
+
 static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
 {
 	struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
@@ -200,15 +202,10 @@ int cpu_map__get_die_id(int cpu)
 	return ret ?: value;
 }
 
-struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data)
+struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
 {
-	int cpu, die;
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
-
-	if (idx > map->nr)
-		return id;
-
-	cpu = map->map[idx];
+	struct aggr_cpu_id id;
+	int die;
 
 	die = cpu_map__get_die_id(cpu);
 	/* There is no die_id on legacy system. */
@@ -220,7 +217,7 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
 	 * with the socket ID and then add die to
 	 * make a unique ID.
 	 */
-	id = cpu_map__get_socket(map, idx, data);
+	id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
 	if (cpu_map__aggr_cpu_id_is_empty(id))
 		return id;
 
@@ -228,6 +225,15 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
 	return id;
 }
 
+struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
+				    void *data)
+{
+	if (idx < 0 || idx > map->nr)
+		return cpu_map__empty_aggr_cpu_id();
+
+	return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
+}
+
 int cpu_map__get_core_id(int cpu)
 {
 	int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
@@ -239,20 +245,13 @@ int cpu_map__get_node_id(int cpu)
 	return cpu__get_node(cpu);
 }
 
-struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
+struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
 {
-	int cpu;
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
-
-	if (idx > map->nr)
-		return id;
-
-	cpu = map->map[idx];
-
-	cpu = cpu_map__get_core_id(cpu);
+	struct aggr_cpu_id id;
+	int core = cpu_map__get_core_id(cpu);
 
 	/* cpu_map__get_die returns a struct with socket and die set*/
-	id = cpu_map__get_die(map, idx, data);
+	id = cpu_map__get_die_aggr_by_cpu(cpu, data);
 	if (cpu_map__aggr_cpu_id_is_empty(id))
 		return id;
 
@@ -260,19 +259,33 @@ struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *da
 	 * core_id is relative to socket and die, we need a global id.
 	 * So we combine the result from cpu_map__get_die with the core id
 	 */
-	id.core = cpu;
+	id.core = core;
 	return id;
+
 }
 
-struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data __maybe_unused)
+struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
+{
+	if (idx < 0 || idx > map->nr)
+		return cpu_map__empty_aggr_cpu_id();
+
+	return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
+}
+
+struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
 {
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
 
+	id.node = cpu_map__get_node_id(cpu);
+	return id;
+}
+
+struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
+{
 	if (idx < 0 || idx >= map->nr)
-		return id;
+		return cpu_map__empty_aggr_cpu_id();
 
-	id.node = cpu_map__get_node_id(map->map[idx]);
-	return id;
+	return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
 }
 
 int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index a27eeaf086e8..c62d67704425 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -31,13 +31,17 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
 size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
 size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
 int cpu_map__get_socket_id(int cpu);
+struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__get_die_id(int cpu);
+struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__get_core_id(int cpu);
+struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__get_node_id(int cpu);
-struct aggr_cpu_id  cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
+struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
+struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
 int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
 int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 03/48] perf stat: Correct aggregation CPU map
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 01/48] libperf: Add comments to perf_cpu_map Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-10 18:43   ` John Garry
  2022-01-05  6:13 ` [PATCH v4 04/48] perf stat: Switch aggregation to use for_each loop Ian Rogers
                   ` (45 subsequent siblings)
  48 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Switch the perf_cpu_map in aggr_update_shadow from
the evlist to the counter's cpu map, so the index is appropriate. This
addresses a problem where uncore counts, with a cpumap like:
$ cat /sys/devices/uncore_imc_0/cpumask
0,18
Don't aggregate counts in CPUs based on the index of those values in the
cpumap (0 and 1) but on the actual CPU (0 and 18). Thereby correcting
metric calculations in per-socket mode for counters without a full
cpumask.
On a SkylakeX with a tweaked DRAM_BW_Use metric, to remove unnecessary
scaling, this gives:
Before:
$ /perf stat --per-socket -M DRAM_BW_Use -I 1000
     1.001102293 S0        1              27.01 MiB  uncore_imc/cas_count_write/ #   103.00 DRAM_BW_Use
     1.001102293 S0        1              30.22 MiB  uncore_imc/cas_count_read/
     1.001102293 S0        1      1,001,102,293 ns   duration_time
     1.001102293 S1        1              20.10 MiB  uncore_imc/cas_count_write/ #     0.00 DRAM_BW_Use
     1.001102293 S1        1              32.74 MiB  uncore_imc/cas_count_read/
     1.001102293 S1        0      <not counted> ns   duration_time
     2.003517973 S0        1              83.04 MiB  uncore_imc/cas_count_write/ #   920.00 DRAM_BW_Use
     2.003517973 S0        1             145.95 MiB  uncore_imc/cas_count_read/
     2.003517973 S0        1      1,002,415,680 ns   duration_time
     2.003517973 S1        1             302.45 MiB  uncore_imc/cas_count_write/ #     0.00 DRAM_BW_Use
     2.003517973 S1        1             290.99 MiB  uncore_imc/cas_count_read/
     2.003517973 S1        0      <not counted> ns   duration_time
After:
$ perf stat --per-socket -M DRAM_BW_Use -I 1000
     1.001080840 S0        1              24.96 MiB  uncore_imc/cas_count_write/ #    54.00 DRAM_BW_Use
     1.001080840 S0        1              33.64 MiB  uncore_imc/cas_count_read/
     1.001080840 S0        1      1,001,080,840 ns   duration_time
     1.001080840 S1        1              42.43 MiB  uncore_imc/cas_count_write/ #    84.00 DRAM_BW_Use
     1.001080840 S1        1              47.05 MiB  uncore_imc/cas_count_read/
     1.001080840 S1        0      <not counted> ns   duration_time
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 588601000f3f..b0fa81ffce61 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -526,7 +526,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
 		evlist__for_each_entry(evlist, counter) {
 			val = 0;
 			for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
-				s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
+				s2 = config->aggr_get_id(config, evsel__cpus(counter), cpu);
 				if (!cpu_map__compare_aggr_cpu_id(s2, id))
 					continue;
 				val += perf_counts(counter->counts, cpu, 0)->val;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 04/48] perf stat: Switch aggregation to use for_each loop
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (2 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 03/48] perf stat: Correct aggregation CPU map Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-10 18:50   ` John Garry
  2022-01-05  6:13 ` [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get Ian Rogers
                   ` (44 subsequent siblings)
  48 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Tidy up the use of cpu and index to hopefully make the code less error
prone. Avoid unused warnings with (void) which will be removed in a
later patch.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-display.c | 48 +++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 21 deletions(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index b0fa81ffce61..efab39a759ff 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -330,8 +330,8 @@ static void print_metric_header(struct perf_stat_config *config,
 static int first_shadow_cpu(struct perf_stat_config *config,
 			    struct evsel *evsel, struct aggr_cpu_id id)
 {
-	struct evlist *evlist = evsel->evlist;
-	int i;
+	struct perf_cpu_map *cpus;
+	int cpu, idx;
 
 	if (config->aggr_mode == AGGR_NONE)
 		return id.core;
@@ -339,14 +339,11 @@ static int first_shadow_cpu(struct perf_stat_config *config,
 	if (!config->aggr_get_id)
 		return 0;
 
-	for (i = 0; i < evsel__nr_cpus(evsel); i++) {
-		int cpu2 = evsel__cpus(evsel)->map[i];
-
-		if (cpu_map__compare_aggr_cpu_id(
-					config->aggr_get_id(config, evlist->core.cpus, cpu2),
-					id)) {
-			return cpu2;
-		}
+	cpus = evsel__cpus(evsel);
+	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+		if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
+						 id))
+			return cpu;
 	}
 	return 0;
 }
@@ -516,20 +513,23 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
 static void aggr_update_shadow(struct perf_stat_config *config,
 			       struct evlist *evlist)
 {
-	int cpu, s;
+	int cpu, idx, s;
 	struct aggr_cpu_id s2, id;
 	u64 val;
 	struct evsel *counter;
+	struct perf_cpu_map *cpus;
 
 	for (s = 0; s < config->aggr_map->nr; s++) {
 		id = config->aggr_map->map[s];
 		evlist__for_each_entry(evlist, counter) {
+			cpus = evsel__cpus(counter);
 			val = 0;
-			for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
-				s2 = config->aggr_get_id(config, evsel__cpus(counter), cpu);
+			perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+				(void)cpu;
+				s2 = config->aggr_get_id(config, cpus, idx);
 				if (!cpu_map__compare_aggr_cpu_id(s2, id))
 					continue;
-				val += perf_counts(counter->counts, cpu, 0)->val;
+				val += perf_counts(counter->counts, idx, 0)->val;
 			}
 			perf_stat__update_shadow_stats(counter, val,
 					first_shadow_cpu(config, counter, id),
@@ -634,18 +634,21 @@ static void aggr_cb(struct perf_stat_config *config,
 		    struct evsel *counter, void *data, bool first)
 {
 	struct aggr_data *ad = data;
-	int cpu;
+	int idx, cpu;
+	struct perf_cpu_map *cpus;
 	struct aggr_cpu_id s2;
 
-	for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
+	cpus = evsel__cpus(counter);
+	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
 		struct perf_counts_values *counts;
 
-		s2 = config->aggr_get_id(config, evsel__cpus(counter), cpu);
+		(void)cpu;
+		s2 = config->aggr_get_id(config, cpus, idx);
 		if (!cpu_map__compare_aggr_cpu_id(s2, ad->id))
 			continue;
 		if (first)
 			ad->nr++;
-		counts = perf_counts(counter->counts, cpu, 0);
+		counts = perf_counts(counter->counts, idx, 0);
 		/*
 		 * When any result is bad, make them all to give
 		 * consistent output in interval mode.
@@ -1208,10 +1211,13 @@ static void print_percore_thread(struct perf_stat_config *config,
 {
 	int s;
 	struct aggr_cpu_id s2, id;
+	struct perf_cpu_map *cpus;
 	bool first = true;
+	int idx, cpu;
 
-	for (int i = 0; i < evsel__nr_cpus(counter); i++) {
-		s2 = config->aggr_get_id(config, evsel__cpus(counter), i);
+	cpus = evsel__cpus(counter);
+	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+		s2 = config->aggr_get_id(config, cpus, idx);
 		for (s = 0; s < config->aggr_map->nr; s++) {
 			id = config->aggr_map->map[s];
 			if (cpu_map__compare_aggr_cpu_id(s2, id))
@@ -1220,7 +1226,7 @@ static void print_percore_thread(struct perf_stat_config *config,
 
 		print_counter_aggrdata(config, counter, s,
 				       prefix, false,
-				       &first, i);
+				       &first, cpu);
 	}
 }
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (3 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 04/48] perf stat: Switch aggregation to use for_each loop Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-10 19:04   ` John Garry
  2022-01-05  6:13 ` [PATCH v4 06/48] perf cpumap: Switch cpu_map__build_map to cpu function Ian Rogers
                   ` (43 subsequent siblings)
  48 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Avoid bugs where the wrong index is passed with the cpu_map.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c      | 93 +++++++++++++++++++---------------
 tools/perf/util/stat-display.c | 11 ++--
 tools/perf/util/stat.h         |  3 +-
 3 files changed, 57 insertions(+), 50 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f6ca2b054c5b..9791ae9b1a53 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1323,69 +1323,63 @@ static struct option stat_options[] = {
 };
 
 static struct aggr_cpu_id perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
-				 struct perf_cpu_map *map, int cpu)
+						int cpu)
 {
-	return cpu_map__get_socket(map, cpu, NULL);
+	return cpu_map__get_socket_aggr_by_cpu(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_die(struct perf_stat_config *config __maybe_unused,
-			      struct perf_cpu_map *map, int cpu)
+					     int cpu)
 {
-	return cpu_map__get_die(map, cpu, NULL);
+	return cpu_map__get_die_aggr_by_cpu(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_core(struct perf_stat_config *config __maybe_unused,
-			       struct perf_cpu_map *map, int cpu)
+					      int cpu)
 {
-	return cpu_map__get_core(map, cpu, NULL);
+	return cpu_map__get_core_aggr_by_cpu(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_node(struct perf_stat_config *config __maybe_unused,
-			       struct perf_cpu_map *map, int cpu)
+					      int cpu)
 {
-	return cpu_map__get_node(map, cpu, NULL);
+	return cpu_map__get_node_aggr_by_cpu(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
-			       aggr_get_id_t get_id, struct perf_cpu_map *map, int idx)
+					      aggr_get_id_t get_id, int cpu)
 {
-	int cpu;
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
 
-	if (idx >= map->nr)
-		return id;
-
-	cpu = map->map[idx];
-
 	if (cpu_map__aggr_cpu_id_is_empty(config->cpus_aggr_map->map[cpu]))
-		config->cpus_aggr_map->map[cpu] = get_id(config, map, idx);
+		config->cpus_aggr_map->map[cpu] = get_id(config, cpu);
 
 	id = config->cpus_aggr_map->map[cpu];
 	return id;
 }
 
 static struct aggr_cpu_id perf_stat__get_socket_cached(struct perf_stat_config *config,
-					struct perf_cpu_map *map, int idx)
+						       int cpu)
 {
-	return perf_stat__get_aggr(config, perf_stat__get_socket, map, idx);
+	return perf_stat__get_aggr(config, perf_stat__get_socket, cpu);
 }
 
 static struct aggr_cpu_id perf_stat__get_die_cached(struct perf_stat_config *config,
-					struct perf_cpu_map *map, int idx)
+						    int cpu)
 {
-	return perf_stat__get_aggr(config, perf_stat__get_die, map, idx);
+	return perf_stat__get_aggr(config, perf_stat__get_die, cpu);
 }
 
 static struct aggr_cpu_id perf_stat__get_core_cached(struct perf_stat_config *config,
-				      struct perf_cpu_map *map, int idx)
+						     int cpu)
 {
-	return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
+	return perf_stat__get_aggr(config, perf_stat__get_core, cpu);
 }
 
 static struct aggr_cpu_id perf_stat__get_node_cached(struct perf_stat_config *config,
-				      struct perf_cpu_map *map, int idx)
+						     int cpu)
 {
-	return perf_stat__get_aggr(config, perf_stat__get_node, map, idx);
+	return perf_stat__get_aggr(config, perf_stat__get_node, cpu);
 }
 
 static bool term_percore_set(void)
@@ -1483,8 +1477,9 @@ static void perf_stat__exit_aggr_mode(void)
 	stat_config.cpus_aggr_map = NULL;
 }
 
-static inline int perf_env__get_cpu(struct perf_env *env, struct perf_cpu_map *map, int idx)
+static inline int perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
 {
+	struct perf_env *env = data;
 	int cpu;
 
 	if (idx > map->nr)
@@ -1498,10 +1493,9 @@ static inline int perf_env__get_cpu(struct perf_env *env, struct perf_cpu_map *m
 	return cpu;
 }
 
-static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
-	int cpu = perf_env__get_cpu(env, map, idx);
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
 
 	if (cpu != -1)
@@ -1510,11 +1504,15 @@ static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx, void *data)
+{
+	return perf_env__get_socket_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
+static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
-	int cpu = perf_env__get_cpu(env, map, idx);
 
 	if (cpu != -1) {
 		/*
@@ -1529,11 +1527,15 @@ static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, v
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, void *data)
+{
+	return perf_env__get_die_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
+static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
-	int cpu = perf_env__get_cpu(env, map, idx);
 
 	if (cpu != -1) {
 		/*
@@ -1549,15 +1551,24 @@ static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx,
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_node(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx, void *data)
+{
+	return perf_env__get_core_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
+static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
 {
-	int cpu = perf_env__get_cpu(data, map, idx);
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
 
 	id.node = perf_env__numa_node(data, cpu);
 	return id;
 }
 
+static struct aggr_cpu_id perf_env__get_node(struct perf_cpu_map *map, int idx, void *data)
+{
+	return perf_env__get_node_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
 static int perf_env__build_socket_map(struct perf_env *env, struct perf_cpu_map *cpus,
 				      struct cpu_aggr_map **sockp)
 {
@@ -1583,26 +1594,26 @@ static int perf_env__build_node_map(struct perf_env *env, struct perf_cpu_map *c
 }
 
 static struct aggr_cpu_id perf_stat__get_socket_file(struct perf_stat_config *config __maybe_unused,
-				      struct perf_cpu_map *map, int idx)
+						     int cpu)
 {
-	return perf_env__get_socket(map, idx, &perf_stat.session->header.env);
+	return perf_env__get_socket_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 static struct aggr_cpu_id perf_stat__get_die_file(struct perf_stat_config *config __maybe_unused,
-				   struct perf_cpu_map *map, int idx)
+						  int cpu)
 {
-	return perf_env__get_die(map, idx, &perf_stat.session->header.env);
+	return perf_env__get_die_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 
 static struct aggr_cpu_id perf_stat__get_core_file(struct perf_stat_config *config __maybe_unused,
-				    struct perf_cpu_map *map, int idx)
+						   int cpu)
 {
-	return perf_env__get_core(map, idx, &perf_stat.session->header.env);
+	return perf_env__get_core_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 
 static struct aggr_cpu_id perf_stat__get_node_file(struct perf_stat_config *config __maybe_unused,
-				    struct perf_cpu_map *map, int idx)
+						   int cpu)
 {
-	return perf_env__get_node(map, idx, &perf_stat.session->header.env);
+	return perf_env__get_node_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 
 static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index efab39a759ff..6c40b91d5e32 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -341,8 +341,7 @@ static int first_shadow_cpu(struct perf_stat_config *config,
 
 	cpus = evsel__cpus(evsel);
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
-		if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
-						 id))
+		if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpu), id))
 			return cpu;
 	}
 	return 0;
@@ -525,8 +524,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
 			cpus = evsel__cpus(counter);
 			val = 0;
 			perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
-				(void)cpu;
-				s2 = config->aggr_get_id(config, cpus, idx);
+				s2 = config->aggr_get_id(config, cpu);
 				if (!cpu_map__compare_aggr_cpu_id(s2, id))
 					continue;
 				val += perf_counts(counter->counts, idx, 0)->val;
@@ -642,8 +640,7 @@ static void aggr_cb(struct perf_stat_config *config,
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
 		struct perf_counts_values *counts;
 
-		(void)cpu;
-		s2 = config->aggr_get_id(config, cpus, idx);
+		s2 = config->aggr_get_id(config, cpu);
 		if (!cpu_map__compare_aggr_cpu_id(s2, ad->id))
 			continue;
 		if (first)
@@ -1217,7 +1214,7 @@ static void print_percore_thread(struct perf_stat_config *config,
 
 	cpus = evsel__cpus(counter);
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
-		s2 = config->aggr_get_id(config, cpus, idx);
+		s2 = config->aggr_get_id(config, cpu);
 		for (s = 0; s < config->aggr_map->nr; s++) {
 			id = config->aggr_map->map[s];
 			if (cpu_map__compare_aggr_cpu_id(s2, id))
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 32c8527de347..32cf24186229 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -108,8 +108,7 @@ struct runtime_stat {
 	struct rblist value_list;
 };
 
-typedef struct aggr_cpu_id (*aggr_get_id_t)(struct perf_stat_config *config,
-			     struct perf_cpu_map *m, int cpu);
+typedef struct aggr_cpu_id (*aggr_get_id_t)(struct perf_stat_config *config, int cpu);
 
 struct perf_stat_config {
 	enum aggr_mode		 aggr_mode;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 06/48] perf cpumap: Switch cpu_map__build_map to cpu function
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (4 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 07/48] perf cpumap: Remove map+index get_socket Ian Rogers
                   ` (42 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Avoid error prone cpu_map + idx variant. Remove now unused functions.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c | 28 ++++------------------------
 tools/perf/util/cpumap.c  | 12 ++++++------
 tools/perf/util/cpumap.h  |  2 +-
 3 files changed, 11 insertions(+), 31 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9791ae9b1a53..6b985abaa2d2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1504,11 +1504,6 @@ static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx, void *data)
-{
-	return perf_env__get_socket_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
 static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
@@ -1527,11 +1522,6 @@ static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, void *data)
-{
-	return perf_env__get_die_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
 static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
@@ -1551,11 +1541,6 @@ static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx, void *data)
-{
-	return perf_env__get_core_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
 static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
 {
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
@@ -1564,33 +1549,28 @@ static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_node(struct perf_cpu_map *map, int idx, void *data)
-{
-	return perf_env__get_node_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
 static int perf_env__build_socket_map(struct perf_env *env, struct perf_cpu_map *cpus,
 				      struct cpu_aggr_map **sockp)
 {
-	return cpu_map__build_map(cpus, sockp, perf_env__get_socket, env);
+	return cpu_map__build_map(cpus, sockp, perf_env__get_socket_aggr_by_cpu, env);
 }
 
 static int perf_env__build_die_map(struct perf_env *env, struct perf_cpu_map *cpus,
 				   struct cpu_aggr_map **diep)
 {
-	return cpu_map__build_map(cpus, diep, perf_env__get_die, env);
+	return cpu_map__build_map(cpus, diep, perf_env__get_die_aggr_by_cpu, env);
 }
 
 static int perf_env__build_core_map(struct perf_env *env, struct perf_cpu_map *cpus,
 				    struct cpu_aggr_map **corep)
 {
-	return cpu_map__build_map(cpus, corep, perf_env__get_core, env);
+	return cpu_map__build_map(cpus, corep, perf_env__get_core_aggr_by_cpu, env);
 }
 
 static int perf_env__build_node_map(struct perf_env *env, struct perf_cpu_map *cpus,
 				    struct cpu_aggr_map **nodep)
 {
-	return cpu_map__build_map(cpus, nodep, perf_env__get_node, env);
+	return cpu_map__build_map(cpus, nodep, perf_env__get_node_aggr_by_cpu, env);
 }
 
 static struct aggr_cpu_id perf_stat__get_socket_file(struct perf_stat_config *config __maybe_unused,
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 49fba2c53822..feaf34b25efc 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -163,7 +163,7 @@ static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
 }
 
 int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
-		       struct aggr_cpu_id (*f)(struct perf_cpu_map *map, int cpu, void *data),
+		       struct aggr_cpu_id (*f)(int cpu, void *data),
 		       void *data)
 {
 	int nr = cpus->nr;
@@ -178,7 +178,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
 	c->nr = 0;
 
 	for (cpu = 0; cpu < nr; cpu++) {
-		s1 = f(cpus, cpu, data);
+		s1 = f(cpu, data);
 		for (s2 = 0; s2 < c->nr; s2++) {
 			if (cpu_map__compare_aggr_cpu_id(s1, c->map[s2]))
 				break;
@@ -290,22 +290,22 @@ struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *da
 
 int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
 {
-	return cpu_map__build_map(cpus, sockp, cpu_map__get_socket, NULL);
+	return cpu_map__build_map(cpus, sockp, cpu_map__get_socket_aggr_by_cpu, NULL);
 }
 
 int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep)
 {
-	return cpu_map__build_map(cpus, diep, cpu_map__get_die, NULL);
+	return cpu_map__build_map(cpus, diep, cpu_map__get_die_aggr_by_cpu, NULL);
 }
 
 int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep)
 {
-	return cpu_map__build_map(cpus, corep, cpu_map__get_core, NULL);
+	return cpu_map__build_map(cpus, corep, cpu_map__get_core_aggr_by_cpu, NULL);
 }
 
 int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **numap)
 {
-	return cpu_map__build_map(cpus, numap, cpu_map__get_node, NULL);
+	return cpu_map__build_map(cpus, numap, cpu_map__get_node_aggr_by_cpu, NULL);
 }
 
 /* setup simple routines to easily access node numbers given a cpu number */
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index c62d67704425..9648816c4255 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -63,7 +63,7 @@ int cpu__max_present_cpu(void);
 int cpu__get_node(int cpu);
 
 int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
-		       struct aggr_cpu_id (*f)(struct perf_cpu_map *map, int cpu, void *data),
+		       struct aggr_cpu_id (*f)(int cpu, void *data),
 		       void *data);
 
 int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 07/48] perf cpumap: Remove map+index get_socket
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (5 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 06/48] perf cpumap: Switch cpu_map__build_map to cpu function Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 08/48] perf cpumap: Remove map+index get_die Ian Rogers
                   ` (41 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Migrate final users to appropriate cpu variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/topology.c | 2 +-
 tools/perf/util/cpumap.c    | 9 ---------
 tools/perf/util/cpumap.h    | 1 -
 tools/perf/util/stat.c      | 2 +-
 4 files changed, 2 insertions(+), 12 deletions(-)
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 869986139146..69a64074b897 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -150,7 +150,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that socket ID contains only socket
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_socket(map, i, NULL);
+		id = cpu_map__get_socket_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Socket map - Socket ID doesn't match",
 			session->header.env.cpu[map->map[i]].socket_id == id.socket);
 
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index feaf34b25efc..342a5eaee9d3 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -136,15 +136,6 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
 	return id;
 }
 
-struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
-				       void *data)
-{
-	if (idx < 0 || idx > map->nr)
-		return cpu_map__empty_aggr_cpu_id();
-
-	return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
-}
-
 static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
 {
 	struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 9648816c4255..a53af24301d2 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -32,7 +32,6 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
 size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
 int cpu_map__get_socket_id(int cpu);
 struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__get_die_id(int cpu);
 struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 09ea334586f2..9eca1111fa52 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -328,7 +328,7 @@ static int check_per_pkg(struct evsel *counter,
 	if (!(vals->run && vals->ena))
 		return 0;
 
-	s = cpu_map__get_socket(cpus, cpu, NULL).socket;
+	s = cpu_map__get_socket_id(cpu);
 	if (s < 0)
 		return -1;
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 08/48] perf cpumap: Remove map+index get_die
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (6 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 07/48] perf cpumap: Remove map+index get_socket Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 09/48] perf cpumap: Remove map+index get_core Ian Rogers
                   ` (40 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Migrate final users to appropriate cpu variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/topology.c | 2 +-
 tools/perf/util/cpumap.c    | 9 ---------
 tools/perf/util/cpumap.h    | 1 -
 tools/perf/util/stat.c      | 2 +-
 4 files changed, 2 insertions(+), 12 deletions(-)
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 69a64074b897..ce085b6f379b 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -136,7 +136,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that die ID contains socket and die
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_die(map, i, NULL);
+		id = cpu_map__get_die_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Die map - Socket ID doesn't match",
 			session->header.env.cpu[map->map[i]].socket_id == id.socket);
 
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 342a5eaee9d3..ff91c32da688 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -216,15 +216,6 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
 	return id;
 }
 
-struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
-				    void *data)
-{
-	if (idx < 0 || idx > map->nr)
-		return cpu_map__empty_aggr_cpu_id();
-
-	return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
-}
-
 int cpu_map__get_core_id(int cpu)
 {
 	int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index a53af24301d2..365ed69699e1 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -34,7 +34,6 @@ int cpu_map__get_socket_id(int cpu);
 struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
 int cpu_map__get_die_id(int cpu);
 struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__get_core_id(int cpu);
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 9eca1111fa52..5ed99bcfe91e 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -336,7 +336,7 @@ static int check_per_pkg(struct evsel *counter,
 	 * On multi-die system, die_id > 0. On no-die system, die_id = 0.
 	 * We use hashmap(socket, die) to check the used socket+die pair.
 	 */
-	d = cpu_map__get_die(cpus, cpu, NULL).die;
+	d = cpu_map__get_die_id(cpu);
 	if (d < 0)
 		return -1;
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 09/48] perf cpumap: Remove map+index get_core
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (7 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 08/48] perf cpumap: Remove map+index get_die Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 10/48] perf cpumap: Remove map+index get_node Ian Rogers
                   ` (39 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Migrate final users to appropriate cpu variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/topology.c | 2 +-
 tools/perf/util/cpumap.c    | 8 --------
 tools/perf/util/cpumap.h    | 1 -
 3 files changed, 1 insertion(+), 10 deletions(-)
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index ce085b6f379b..9a671670415a 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -121,7 +121,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that core ID contains socket, die and core
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_core(map, i, NULL);
+		id = cpu_map__get_core_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Core map - Core ID doesn't match",
 			session->header.env.cpu[map->map[i]].core_id == id.core);
 
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index ff91c32da688..e8149bcf8bfa 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -246,14 +246,6 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
 
 }
 
-struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
-{
-	if (idx < 0 || idx > map->nr)
-		return cpu_map__empty_aggr_cpu_id();
-
-	return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
-}
-
 struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
 {
 	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 365ed69699e1..7e1829468bd6 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -36,7 +36,6 @@ int cpu_map__get_die_id(int cpu);
 struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
 int cpu_map__get_core_id(int cpu);
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__get_node_id(int cpu);
 struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 10/48] perf cpumap: Remove map+index get_node
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (8 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 09/48] perf cpumap: Remove map+index get_core Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 11/48] perf cpumap: Add comments to aggr_cpu_id Ian Rogers
                   ` (38 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Migrate final users to appropriate cpu variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/topology.c | 2 +-
 tools/perf/util/cpumap.c    | 8 --------
 tools/perf/util/cpumap.h    | 1 -
 3 files changed, 1 insertion(+), 10 deletions(-)
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 9a671670415a..5992b323c4f5 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -162,7 +162,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that node ID contains only node
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_node(map, i, NULL);
+		id = cpu_map__get_node_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Node map - Node ID doesn't match",
 			cpu__get_node(map->map[i]) == id.node);
 		TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index e8149bcf8bfa..f67b2e7aac13 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -254,14 +254,6 @@ struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unu
 	return id;
 }
 
-struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
-{
-	if (idx < 0 || idx >= map->nr)
-		return cpu_map__empty_aggr_cpu_id();
-
-	return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
-}
-
 int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
 {
 	return cpu_map__build_map(cpus, sockp, cpu_map__get_socket_aggr_by_cpu, NULL);
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 7e1829468bd6..f0121dd4fdcb 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -38,7 +38,6 @@ int cpu_map__get_core_id(int cpu);
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
 int cpu_map__get_node_id(int cpu);
 struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
 int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
 int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
 int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 11/48] perf cpumap: Add comments to aggr_cpu_id
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (9 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 10/48] perf cpumap: Remove map+index get_node Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 12/48] perf cpumap: Remove unused cpu_map__socket Ian Rogers
                   ` (37 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
This code is already tested in topology.c.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.h | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index f0121dd4fdcb..edd93e1db36a 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -7,11 +7,20 @@
 #include <internal/cpumap.h>
 #include <perf/cpumap.h>
 
+/** Identify where counts are aggregated, -1 implies not to aggregate. */
 struct aggr_cpu_id {
+	/** A value in the range 0 to number of threads. */
 	int thread;
+	/** The numa node X as read from /sys/devices/system/node/nodeX. */
 	int node;
+	/**
+	 * The socket number as read from
+	 * /sys/devices/system/cpu/cpuX/topology/physical_package_id.
+	 */
 	int socket;
+	/** The die id as read from /sys/devices/system/cpu/cpuX/topology/die_id. */
 	int die;
+	/** The core id as read from /sys/devices/system/cpu/cpuX/topology/core_id. */
 	int core;
 };
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 12/48] perf cpumap: Remove unused cpu_map__socket
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (10 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 11/48] perf cpumap: Add comments to aggr_cpu_id Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 13/48] perf cpumap: Simplify equal function name Ian Rogers
                   ` (36 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Unused function so remove.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.h | 7 -------
 1 file changed, 7 deletions(-)
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index edd93e1db36a..22e53fd54657 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -53,13 +53,6 @@ int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **cor
 int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **nodep);
 const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */
 
-static inline int cpu_map__socket(struct perf_cpu_map *sock, int s)
-{
-	if (!sock || s > sock->nr || s < 0)
-		return 0;
-	return sock->map[s];
-}
-
 int cpu__setup_cpunode_map(void);
 
 int cpu__max_node(void);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 13/48] perf cpumap: Simplify equal function name.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (11 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 12/48] perf cpumap: Remove unused cpu_map__socket Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 14/48] perf cpumap: Rename empty functions Ian Rogers
                   ` (35 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Rename cpu_map__compare_aggr_cpu_id to aggr_cpu_id__equal, the cpu_map
part of the name is misleading. Equal better describes the function than
compare.
Switch to const pointer rather than value as struct given the number of
variables in aggr_cpu_id.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.c       | 14 +++++++-------
 tools/perf/util/cpumap.h       |  2 +-
 tools/perf/util/stat-display.c | 18 ++++++++++--------
 3 files changed, 18 insertions(+), 16 deletions(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index f67b2e7aac13..8fa00a6221c8 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -171,7 +171,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
 	for (cpu = 0; cpu < nr; cpu++) {
 		s1 = f(cpu, data);
 		for (s2 = 0; s2 < c->nr; s2++) {
-			if (cpu_map__compare_aggr_cpu_id(s1, c->map[s2]))
+			if (aggr_cpu_id__equal(&s1, &c->map[s2]))
 				break;
 		}
 		if (s2 == c->nr) {
@@ -593,13 +593,13 @@ const struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */
 	return online;
 }
 
-bool cpu_map__compare_aggr_cpu_id(struct aggr_cpu_id a, struct aggr_cpu_id b)
+bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b)
 {
-	return a.thread == b.thread &&
-		a.node == b.node &&
-		a.socket == b.socket &&
-		a.die == b.die &&
-		a.core == b.core;
+	return a->thread == b->thread &&
+		a->node == b->node &&
+		a->socket == b->socket &&
+		a->die == b->die &&
+		a->core == b->core;
 }
 
 bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a)
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 22e53fd54657..652b76c69376 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -67,7 +67,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
 int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
 bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
 
-bool cpu_map__compare_aggr_cpu_id(struct aggr_cpu_id a, struct aggr_cpu_id b);
+bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
 bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a);
 struct aggr_cpu_id cpu_map__empty_aggr_cpu_id(void);
 
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 6c40b91d5e32..0241436bb1fb 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -328,20 +328,22 @@ static void print_metric_header(struct perf_stat_config *config,
 }
 
 static int first_shadow_cpu(struct perf_stat_config *config,
-			    struct evsel *evsel, struct aggr_cpu_id id)
+			    struct evsel *evsel, const struct aggr_cpu_id *id)
 {
 	struct perf_cpu_map *cpus;
 	int cpu, idx;
 
 	if (config->aggr_mode == AGGR_NONE)
-		return id.core;
+		return id->core;
 
 	if (!config->aggr_get_id)
 		return 0;
 
 	cpus = evsel__cpus(evsel);
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
-		if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpu), id))
+		struct aggr_cpu_id cpu_id = config->aggr_get_id(config, cpu);
+
+		if (aggr_cpu_id__equal(&cpu_id, id))
 			return cpu;
 	}
 	return 0;
@@ -501,7 +503,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
 	}
 
 	perf_stat__print_shadow_stats(config, counter, uval,
-				first_shadow_cpu(config, counter, id),
+				first_shadow_cpu(config, counter, &id),
 				&out, &config->metric_events, st);
 	if (!config->csv_output && !config->metric_only) {
 		print_noise(config, counter, noise);
@@ -525,12 +527,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
 			val = 0;
 			perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
 				s2 = config->aggr_get_id(config, cpu);
-				if (!cpu_map__compare_aggr_cpu_id(s2, id))
+				if (!aggr_cpu_id__equal(&s2, &id))
 					continue;
 				val += perf_counts(counter->counts, idx, 0)->val;
 			}
 			perf_stat__update_shadow_stats(counter, val,
-					first_shadow_cpu(config, counter, id),
+					first_shadow_cpu(config, counter, &id),
 					&rt_stat);
 		}
 	}
@@ -641,7 +643,7 @@ static void aggr_cb(struct perf_stat_config *config,
 		struct perf_counts_values *counts;
 
 		s2 = config->aggr_get_id(config, cpu);
-		if (!cpu_map__compare_aggr_cpu_id(s2, ad->id))
+		if (!aggr_cpu_id__equal(&s2, &ad->id))
 			continue;
 		if (first)
 			ad->nr++;
@@ -1217,7 +1219,7 @@ static void print_percore_thread(struct perf_stat_config *config,
 		s2 = config->aggr_get_id(config, cpu);
 		for (s = 0; s < config->aggr_map->nr; s++) {
 			id = config->aggr_map->map[s];
-			if (cpu_map__compare_aggr_cpu_id(s2, id))
+			if (aggr_cpu_id__equal(&s2, &id))
 				break;
 		}
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 14/48] perf cpumap: Rename empty functions.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (12 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 13/48] perf cpumap: Simplify equal function name Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 15/48] perf cpumap: Document cpu__get_node and remove redundant function Ian Rogers
                   ` (34 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Remove cpu_map from name as a cpu_map isn't used. Pass a const pointer
rather than by value to avoid unnecessary copying.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c      | 12 ++++++------
 tools/perf/util/cpumap.c       | 24 ++++++++++++------------
 tools/perf/util/cpumap.h       |  4 ++--
 tools/perf/util/stat-display.c | 10 +++++-----
 4 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6b985abaa2d2..d229fb71b577 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1349,9 +1349,9 @@ static struct aggr_cpu_id perf_stat__get_node(struct perf_stat_config *config __
 static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
 					      aggr_get_id_t get_id, int cpu)
 {
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu_map__aggr_cpu_id_is_empty(config->cpus_aggr_map->map[cpu]))
+	if (aggr_cpu_id__is_empty(&config->cpus_aggr_map->map[cpu]))
 		config->cpus_aggr_map->map[cpu] = get_id(config, cpu);
 
 	id = config->cpus_aggr_map->map[cpu];
@@ -1496,7 +1496,7 @@ static inline int perf_env__get_cpu(void *data, struct perf_cpu_map *map, int id
 static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
 	if (cpu != -1)
 		id.socket = env->cpu[cpu].socket_id;
@@ -1507,7 +1507,7 @@ static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
 static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
 	if (cpu != -1) {
 		/*
@@ -1525,7 +1525,7 @@ static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
 static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
 {
 	struct perf_env *env = data;
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
 	if (cpu != -1) {
 		/*
@@ -1543,7 +1543,7 @@ static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
 
 static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
 {
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
 	id.node = perf_env__numa_node(data, cpu);
 	return id;
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 8fa00a6221c8..b3e1304aca0c 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -104,7 +104,7 @@ struct cpu_aggr_map *cpu_aggr_map__empty_new(int nr)
 
 		cpus->nr = nr;
 		for (i = 0; i < nr; i++)
-			cpus->map[i] = cpu_map__empty_aggr_cpu_id();
+			cpus->map[i] = aggr_cpu_id__empty();
 
 		refcount_set(&cpus->refcnt, 1);
 	}
@@ -130,7 +130,7 @@ int cpu_map__get_socket_id(int cpu)
 
 struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
 {
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
 	id.socket = cpu_map__get_socket_id(cpu);
 	return id;
@@ -209,7 +209,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
 	 * make a unique ID.
 	 */
 	id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
-	if (cpu_map__aggr_cpu_id_is_empty(id))
+	if (aggr_cpu_id__is_empty(&id))
 		return id;
 
 	id.die = die;
@@ -234,7 +234,7 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
 
 	/* cpu_map__get_die returns a struct with socket and die set*/
 	id = cpu_map__get_die_aggr_by_cpu(cpu, data);
-	if (cpu_map__aggr_cpu_id_is_empty(id))
+	if (aggr_cpu_id__is_empty(&id))
 		return id;
 
 	/*
@@ -248,7 +248,7 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
 
 struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
 {
-	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
 	id.node = cpu_map__get_node_id(cpu);
 	return id;
@@ -602,16 +602,16 @@ bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b
 		a->core == b->core;
 }
 
-bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a)
+bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a)
 {
-	return a.thread == -1 &&
-		a.node == -1 &&
-		a.socket == -1 &&
-		a.die == -1 &&
-		a.core == -1;
+	return a->thread == -1 &&
+		a->node == -1 &&
+		a->socket == -1 &&
+		a->die == -1 &&
+		a->core == -1;
 }
 
-struct aggr_cpu_id cpu_map__empty_aggr_cpu_id(void)
+struct aggr_cpu_id aggr_cpu_id__empty(void)
 {
 	struct aggr_cpu_id ret = {
 		.thread = -1,
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 652b76c69376..9589b0001a28 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -68,7 +68,7 @@ int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
 bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
 
 bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
-bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a);
-struct aggr_cpu_id cpu_map__empty_aggr_cpu_id(void);
+bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a);
+struct aggr_cpu_id aggr_cpu_id__empty(void);
 
 #endif /* __PERF_CPUMAP_H */
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 0241436bb1fb..870b1db71fbc 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -698,7 +698,7 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
 
 	uval = val * counter->scale;
 	if (cpu != -1) {
-		id = cpu_map__empty_aggr_cpu_id();
+		id = aggr_cpu_id__empty();
 		id.core = cpu;
 	}
 	printout(config, id, nr, counter, uval,
@@ -780,7 +780,7 @@ static struct perf_aggr_thread_value *sort_aggr_thread(
 			continue;
 
 		buf[i].counter = counter;
-		buf[i].id = cpu_map__empty_aggr_cpu_id();
+		buf[i].id = aggr_cpu_id__empty();
 		buf[i].id.thread = thread;
 		buf[i].uval = uval;
 		buf[i].val = val;
@@ -868,7 +868,7 @@ static void print_counter_aggr(struct perf_stat_config *config,
 		fprintf(output, "%s", prefix);
 
 	uval = cd.avg * counter->scale;
-	printout(config, cpu_map__empty_aggr_cpu_id(), 0, counter, uval, prefix, cd.avg_running,
+	printout(config, aggr_cpu_id__empty(), 0, counter, uval, prefix, cd.avg_running,
 		 cd.avg_enabled, cd.avg, &rt_stat);
 	if (!metric_only)
 		fprintf(output, "\n");
@@ -911,7 +911,7 @@ static void print_counter(struct perf_stat_config *config,
 			fprintf(output, "%s", prefix);
 
 		uval = val * counter->scale;
-		id = cpu_map__empty_aggr_cpu_id();
+		id = aggr_cpu_id__empty();
 		id.core = cpu;
 		printout(config, id, 0, counter, uval, prefix,
 			 run, ena, 1.0, &rt_stat);
@@ -938,7 +938,7 @@ static void print_no_aggr_metric(struct perf_stat_config *config,
 		if (prefix)
 			fputs(prefix, config->output);
 		evlist__for_each_entry(evlist, counter) {
-			id = cpu_map__empty_aggr_cpu_id();
+			id = aggr_cpu_id__empty();
 			id.core = cpu;
 			if (first) {
 				aggr_printout(config, counter, id, 0);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 15/48] perf cpumap: Document cpu__get_node and remove redundant function
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (13 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 14/48] perf cpumap: Rename empty functions Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 16/48] perf cpumap: Remove map from function names that don't use a map Ian Rogers
                   ` (33 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
cpu_map__get_node_id isn't used externally and merely delegates to
cpu__get_node.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.c | 11 +++++------
 tools/perf/util/cpumap.h |  5 ++++-
 2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index b3e1304aca0c..1626b0991408 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -16,6 +16,10 @@
 static int max_cpu_num;
 static int max_present_cpu_num;
 static int max_node_num;
+/**
+ * The numa node X as read from /sys/devices/system/node/nodeX indexed by the
+ * CPU number.
+ */
 static int *cpunode_map;
 
 static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
@@ -222,11 +226,6 @@ int cpu_map__get_core_id(int cpu)
 	return ret ?: value;
 }
 
-int cpu_map__get_node_id(int cpu)
-{
-	return cpu__get_node(cpu);
-}
-
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
 {
 	struct aggr_cpu_id id;
@@ -250,7 +249,7 @@ struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unu
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	id.node = cpu_map__get_node_id(cpu);
+	id.node = cpu__get_node(cpu);
 	return id;
 }
 
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 9589b0001a28..f849f01c5860 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -45,7 +45,6 @@ int cpu_map__get_die_id(int cpu);
 struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
 int cpu_map__get_core_id(int cpu);
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
-int cpu_map__get_node_id(int cpu);
 struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
 int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
 int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
@@ -58,6 +57,10 @@ int cpu__setup_cpunode_map(void);
 int cpu__max_node(void);
 int cpu__max_cpu(void);
 int cpu__max_present_cpu(void);
+/**
+ * cpu__get_node - Returns the numa node X as read from
+ * /sys/devices/system/node/nodeX for the given CPU.
+ */
 int cpu__get_node(int cpu);
 
 int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 16/48] perf cpumap: Remove map from function names that don't use a map.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (14 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 15/48] perf cpumap: Document cpu__get_node and remove redundant function Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function Ian Rogers
                   ` (32 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Move to the cpu name and document for consistency.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.c | 12 ++++++------
 tools/perf/util/cpumap.h | 19 ++++++++++++++++---
 tools/perf/util/env.c    |  6 +++---
 tools/perf/util/stat.c   |  4 ++--
 4 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 1626b0991408..e0d7f1da5858 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -126,7 +126,7 @@ static int cpu__get_topology_int(int cpu, const char *name, int *value)
 	return sysfs__read_int(path, value);
 }
 
-int cpu_map__get_socket_id(int cpu)
+int cpu__get_socket_id(int cpu)
 {
 	int value, ret = cpu__get_topology_int(cpu, "physical_package_id", &value);
 	return ret ?: value;
@@ -136,7 +136,7 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	id.socket = cpu_map__get_socket_id(cpu);
+	id.socket = cpu__get_socket_id(cpu);
 	return id;
 }
 
@@ -190,7 +190,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
 	return 0;
 }
 
-int cpu_map__get_die_id(int cpu)
+int cpu__get_die_id(int cpu)
 {
 	int value, ret = cpu__get_topology_int(cpu, "die_id", &value);
 
@@ -202,7 +202,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
 	struct aggr_cpu_id id;
 	int die;
 
-	die = cpu_map__get_die_id(cpu);
+	die = cpu__get_die_id(cpu);
 	/* There is no die_id on legacy system. */
 	if (die == -1)
 		die = 0;
@@ -220,7 +220,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
 	return id;
 }
 
-int cpu_map__get_core_id(int cpu)
+int cpu__get_core_id(int cpu)
 {
 	int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
 	return ret ?: value;
@@ -229,7 +229,7 @@ int cpu_map__get_core_id(int cpu)
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
 {
 	struct aggr_cpu_id id;
-	int core = cpu_map__get_core_id(cpu);
+	int core = cpu__get_core_id(cpu);
 
 	/* cpu_map__get_die returns a struct with socket and die set*/
 	id = cpu_map__get_die_aggr_by_cpu(cpu, data);
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index f849f01c5860..a053bf31a3f0 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -39,11 +39,8 @@ struct perf_cpu_map *cpu_map__new_data(struct perf_record_cpu_map_data *data);
 size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
 size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
 size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
-int cpu_map__get_socket_id(int cpu);
 struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
-int cpu_map__get_die_id(int cpu);
 struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
-int cpu_map__get_core_id(int cpu);
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
 int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
@@ -62,6 +59,22 @@ int cpu__max_present_cpu(void);
  * /sys/devices/system/node/nodeX for the given CPU.
  */
 int cpu__get_node(int cpu);
+/**
+ * cpu__get_socket_id - Returns the socket number as read from
+ * /sys/devices/system/cpu/cpuX/topology/physical_package_id for the given CPU.
+ */
+int cpu__get_socket_id(int cpu);
+/**
+ * cpu__get_die_id - Returns the die id as read from
+ * /sys/devices/system/cpu/cpuX/topology/die_id for the given CPU.
+ */
+int cpu__get_die_id(int cpu);
+/**
+ * cpu__get_core_id - Returns the core id as read from
+ * /sys/devices/system/cpu/cpuX/topology/core_id for the given CPU.
+ */
+int cpu__get_core_id(int cpu);
+
 
 int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
 		       struct aggr_cpu_id (*f)(int cpu, void *data),
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index b9904896eb97..fd12c0dcaefb 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -302,9 +302,9 @@ int perf_env__read_cpu_topology_map(struct perf_env *env)
 		return -ENOMEM;
 
 	for (cpu = 0; cpu < nr_cpus; ++cpu) {
-		env->cpu[cpu].core_id	= cpu_map__get_core_id(cpu);
-		env->cpu[cpu].socket_id	= cpu_map__get_socket_id(cpu);
-		env->cpu[cpu].die_id	= cpu_map__get_die_id(cpu);
+		env->cpu[cpu].core_id	= cpu__get_core_id(cpu);
+		env->cpu[cpu].socket_id	= cpu__get_socket_id(cpu);
+		env->cpu[cpu].die_id	= cpu__get_die_id(cpu);
 	}
 
 	env->nr_cpus_avail = nr_cpus;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 5ed99bcfe91e..5c24aca0968c 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -328,7 +328,7 @@ static int check_per_pkg(struct evsel *counter,
 	if (!(vals->run && vals->ena))
 		return 0;
 
-	s = cpu_map__get_socket_id(cpu);
+	s = cpu__get_socket_id(cpu);
 	if (s < 0)
 		return -1;
 
@@ -336,7 +336,7 @@ static int check_per_pkg(struct evsel *counter,
 	 * On multi-die system, die_id > 0. On no-die system, die_id = 0.
 	 * We use hashmap(socket, die) to check the used socket+die pair.
 	 */
-	d = cpu_map__get_die_id(cpu);
+	d = cpu__get_die_id(cpu);
 	if (d < 0)
 		return -1;
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (15 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 16/48] perf cpumap: Remove map from function names that don't use a map Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-10 19:02   ` Arnaldo Carvalho de Melo
  2022-01-05  6:13 ` [PATCH v4 18/48] perf cpumap: Refactor cpu_map__build_map Ian Rogers
                   ` (31 subsequent siblings)
  48 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Switch the remaining few users of cpu_map__cpu to perf_cpu_map__cpu and
remove the function.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-ftrace.c | 2 +-
 tools/perf/util/cpumap.c    | 9 ++-------
 tools/perf/util/cpumap.h    | 1 -
 3 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 2b54e2ddc80a..f16c39a37a52 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -281,7 +281,7 @@ static int set_tracing_cpumask(struct perf_cpu_map *cpumap)
 	int ret;
 	int last_cpu;
 
-	last_cpu = cpu_map__cpu(cpumap, cpumap->nr - 1);
+	last_cpu = perf_cpu_map__cpu(cpumap, cpumap->nr - 1);
 	mask_size = last_cpu / 4 + 2; /* one more byte for EOS */
 	mask_size += last_cpu / 32; /* ',' is needed for every 32th cpus */
 
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index e0d7f1da5858..32f9fc2dd389 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -485,11 +485,6 @@ bool cpu_map__has(struct perf_cpu_map *cpus, int cpu)
 	return perf_cpu_map__idx(cpus, cpu) != -1;
 }
 
-int cpu_map__cpu(struct perf_cpu_map *cpus, int idx)
-{
-	return cpus->map[idx];
-}
-
 size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
 {
 	int i, cpu, start = -1;
@@ -547,7 +542,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
 	int i, cpu;
 	char *ptr = buf;
 	unsigned char *bitmap;
-	int last_cpu = cpu_map__cpu(map, map->nr - 1);
+	int last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
 
 	if (buf == NULL)
 		return 0;
@@ -559,7 +554,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
 	}
 
 	for (i = 0; i < map->nr; i++) {
-		cpu = cpu_map__cpu(map, i);
+		cpu = perf_cpu_map__cpu(map, i);
 		bitmap[cpu / 8] |= 1 << (cpu % 8);
 	}
 
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index a053bf31a3f0..87545bcd461d 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -80,7 +80,6 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
 		       struct aggr_cpu_id (*f)(int cpu, void *data),
 		       void *data);
 
-int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
 bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
 
 bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 18/48] perf cpumap: Refactor cpu_map__build_map
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (16 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 19/48] perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions Ian Rogers
                   ` (30 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Turn it into a cpu_aggr_map__new. Pass helper functions. Refactor
builtin-stat calls to manually pass function pointers. Try to reduce
some copy-paste code.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c | 187 ++++++++++++++++++++------------------
 tools/perf/util/cpumap.c  |  59 +++++-------
 tools/perf/util/cpumap.h  |  16 ++--
 3 files changed, 130 insertions(+), 132 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d229fb71b577..2b04a948cf37 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1322,6 +1322,17 @@ static struct option stat_options[] = {
 	OPT_END()
 };
 
+static const char *const aggr_mode__string[] = {
+	[AGGR_CORE] = "core",
+	[AGGR_DIE] = "die",
+	[AGGR_GLOBAL] = "global",
+	[AGGR_NODE] = "node",
+	[AGGR_NONE] = "none",
+	[AGGR_SOCKET] = "socket",
+	[AGGR_THREAD] = "thread",
+	[AGGR_UNSET] = "unset",
+};
+
 static struct aggr_cpu_id perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
 						int cpu)
 {
@@ -1394,54 +1405,67 @@ static bool term_percore_set(void)
 	return false;
 }
 
-static int perf_stat_init_aggr_mode(void)
+static aggr_cpu_id_get_t aggr_mode__get_aggr(enum aggr_mode aggr_mode)
 {
-	int nr;
+	switch (aggr_mode) {
+	case AGGR_SOCKET:
+		return cpu_map__get_socket_aggr_by_cpu;
+	case AGGR_DIE:
+		return cpu_map__get_die_aggr_by_cpu;
+	case AGGR_CORE:
+		return cpu_map__get_core_aggr_by_cpu;
+	case AGGR_NODE:
+		return cpu_map__get_node_aggr_by_cpu;
+	case AGGR_NONE:
+		if (term_percore_set())
+			return cpu_map__get_core_aggr_by_cpu;
+
+		return NULL;
+	case AGGR_GLOBAL:
+	case AGGR_THREAD:
+	case AGGR_UNSET:
+	default:
+		return NULL;
+	}
+}
 
-	switch (stat_config.aggr_mode) {
+static aggr_get_id_t aggr_mode__get_id(enum aggr_mode aggr_mode)
+{
+	switch (aggr_mode) {
 	case AGGR_SOCKET:
-		if (cpu_map__build_socket_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build socket map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_socket_cached;
-		break;
+		return perf_stat__get_socket_cached;
 	case AGGR_DIE:
-		if (cpu_map__build_die_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build die map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_die_cached;
-		break;
+		return perf_stat__get_die_cached;
 	case AGGR_CORE:
-		if (cpu_map__build_core_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build core map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_core_cached;
-		break;
+		return perf_stat__get_core_cached;
 	case AGGR_NODE:
-		if (cpu_map__build_node_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build core map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_node_cached;
-		break;
+		return perf_stat__get_node_cached;
 	case AGGR_NONE:
 		if (term_percore_set()) {
-			if (cpu_map__build_core_map(evsel_list->core.cpus,
-						    &stat_config.aggr_map)) {
-				perror("cannot build core map");
-				return -1;
-			}
-			stat_config.aggr_get_id = perf_stat__get_core_cached;
+			return perf_stat__get_core_cached;
 		}
-		break;
+		return NULL;
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
 	case AGGR_UNSET:
 	default:
-		break;
+		return NULL;
+	}
+}
+
+static int perf_stat_init_aggr_mode(void)
+{
+	int nr;
+	aggr_cpu_id_get_t get_id = aggr_mode__get_aggr(stat_config.aggr_mode);
+
+	if (get_id) {
+		stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus,
+							 get_id, /*data=*/NULL);
+		if (!stat_config.aggr_map) {
+			pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
+			return -1;
+		}
+		stat_config.aggr_get_id = aggr_mode__get_id(stat_config.aggr_mode);
 	}
 
 	/*
@@ -1549,30 +1573,6 @@ static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
 	return id;
 }
 
-static int perf_env__build_socket_map(struct perf_env *env, struct perf_cpu_map *cpus,
-				      struct cpu_aggr_map **sockp)
-{
-	return cpu_map__build_map(cpus, sockp, perf_env__get_socket_aggr_by_cpu, env);
-}
-
-static int perf_env__build_die_map(struct perf_env *env, struct perf_cpu_map *cpus,
-				   struct cpu_aggr_map **diep)
-{
-	return cpu_map__build_map(cpus, diep, perf_env__get_die_aggr_by_cpu, env);
-}
-
-static int perf_env__build_core_map(struct perf_env *env, struct perf_cpu_map *cpus,
-				    struct cpu_aggr_map **corep)
-{
-	return cpu_map__build_map(cpus, corep, perf_env__get_core_aggr_by_cpu, env);
-}
-
-static int perf_env__build_node_map(struct perf_env *env, struct perf_cpu_map *cpus,
-				    struct cpu_aggr_map **nodep)
-{
-	return cpu_map__build_map(cpus, nodep, perf_env__get_node_aggr_by_cpu, env);
-}
-
 static struct aggr_cpu_id perf_stat__get_socket_file(struct perf_stat_config *config __maybe_unused,
 						     int cpu)
 {
@@ -1596,47 +1596,60 @@ static struct aggr_cpu_id perf_stat__get_node_file(struct perf_stat_config *conf
 	return perf_env__get_node_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 
-static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
+static aggr_cpu_id_get_t aggr_mode__get_aggr_file(enum aggr_mode aggr_mode)
 {
-	struct perf_env *env = &st->session->header.env;
+	switch (aggr_mode) {
+	case AGGR_SOCKET:
+		return perf_env__get_socket_aggr_by_cpu;
+	case AGGR_DIE:
+		return perf_env__get_die_aggr_by_cpu;
+	case AGGR_CORE:
+		return perf_env__get_core_aggr_by_cpu;
+	case AGGR_NODE:
+		return perf_env__get_node_aggr_by_cpu;
+	case AGGR_NONE:
+	case AGGR_GLOBAL:
+	case AGGR_THREAD:
+	case AGGR_UNSET:
+	default:
+		return NULL;
+	}
+}
 
-	switch (stat_config.aggr_mode) {
+static aggr_get_id_t aggr_mode__get_id_file(enum aggr_mode aggr_mode)
+{
+	switch (aggr_mode) {
 	case AGGR_SOCKET:
-		if (perf_env__build_socket_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build socket map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_socket_file;
-		break;
+		return perf_stat__get_socket_file;
 	case AGGR_DIE:
-		if (perf_env__build_die_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build die map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_die_file;
-		break;
+		return perf_stat__get_die_file;
 	case AGGR_CORE:
-		if (perf_env__build_core_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build core map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_core_file;
-		break;
+		return perf_stat__get_core_file;
 	case AGGR_NODE:
-		if (perf_env__build_node_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
-			perror("cannot build core map");
-			return -1;
-		}
-		stat_config.aggr_get_id = perf_stat__get_node_file;
-		break;
+		return perf_stat__get_node_file;
 	case AGGR_NONE:
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
 	case AGGR_UNSET:
 	default:
-		break;
+		return NULL;
 	}
+}
+
+static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
+{
+	struct perf_env *env = &st->session->header.env;
+	aggr_cpu_id_get_t get_id = aggr_mode__get_aggr_file(stat_config.aggr_mode);
 
+	if (!get_id)
+		return 0;
+
+	stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus, get_id, env);
+	if (!stat_config.aggr_map) {
+		pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
+		return -1;
+	}
+	stat_config.aggr_get_id = aggr_mode__get_id_file(stat_config.aggr_mode);
 	return 0;
 }
 
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 32f9fc2dd389..c8f9b3f15759 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -140,7 +140,7 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
 	return id;
 }
 
-static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
+static int aggr_cpu_id__cmp(const void *a_pointer, const void *b_pointer)
 {
 	struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
 	struct aggr_cpu_id *b = (struct aggr_cpu_id *)b_pointer;
@@ -157,37 +157,40 @@ static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
 		return a->thread - b->thread;
 }
 
-int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
-		       struct aggr_cpu_id (*f)(int cpu, void *data),
-		       void *data)
+struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
+				       aggr_cpu_id_get_t get_id,
+				       void *data)
 {
-	int nr = cpus->nr;
-	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(nr);
-	int cpu, s2;
-	struct aggr_cpu_id s1;
+	int cpu, idx;
+	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);
 
 	if (!c)
-		return -1;
+		return NULL;
 
 	/* Reset size as it may only be partially filled */
 	c->nr = 0;
 
-	for (cpu = 0; cpu < nr; cpu++) {
-		s1 = f(cpu, data);
-		for (s2 = 0; s2 < c->nr; s2++) {
-			if (aggr_cpu_id__equal(&s1, &c->map[s2]))
+	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+		bool duplicate = false;
+		struct aggr_cpu_id cpu_id = get_id(cpu, data);
+
+		for (int j = 0; j < c->nr; j++) {
+			if (aggr_cpu_id__equal(&cpu_id, &c->map[j])) {
+				duplicate = true;
 				break;
+			}
 		}
-		if (s2 == c->nr) {
-			c->map[c->nr] = s1;
+		if (!duplicate) {
+			c->map[c->nr] = cpu_id;
 			c->nr++;
 		}
 	}
+
 	/* ensure we process id in increasing order */
-	qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), cmp_aggr_cpu_id);
+	qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), aggr_cpu_id__cmp);
+
+	return c;
 
-	*res = c;
-	return 0;
 }
 
 int cpu__get_die_id(int cpu)
@@ -253,26 +256,6 @@ struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unu
 	return id;
 }
 
-int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
-{
-	return cpu_map__build_map(cpus, sockp, cpu_map__get_socket_aggr_by_cpu, NULL);
-}
-
-int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep)
-{
-	return cpu_map__build_map(cpus, diep, cpu_map__get_die_aggr_by_cpu, NULL);
-}
-
-int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep)
-{
-	return cpu_map__build_map(cpus, corep, cpu_map__get_core_aggr_by_cpu, NULL);
-}
-
-int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **numap)
-{
-	return cpu_map__build_map(cpus, numap, cpu_map__get_node_aggr_by_cpu, NULL);
-}
-
 /* setup simple routines to easily access node numbers given a cpu number */
 static int get_max_num(char *path, int *max)
 {
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 87545bcd461d..611048e2a592 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -43,10 +43,6 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
 struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
-int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
-int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
-int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
-int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **nodep);
 const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */
 
 int cpu__setup_cpunode_map(void);
@@ -75,10 +71,16 @@ int cpu__get_die_id(int cpu);
  */
 int cpu__get_core_id(int cpu);
 
+typedef struct aggr_cpu_id (*aggr_cpu_id_get_t)(int cpu, void *data);
 
-int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
-		       struct aggr_cpu_id (*f)(int cpu, void *data),
-		       void *data);
+/**
+ * cpu_aggr_map__new - Create a cpu_aggr_map with an aggr_cpu_id for each cpu in
+ * cpus. The aggr_cpu_id is created with 'get_id' that may have a data value
+ * passed to it. The cpu_aggr_map is sorted with duplicate values removed.
+ */
+struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
+				       aggr_cpu_id_get_t get_id,
+				       void *data);
 
 bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 19/48] perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (17 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 18/48] perf cpumap: Refactor cpu_map__build_map Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 20/48] perf cpumap: Move 'has' function to libperf Ian Rogers
                   ` (29 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
The functions don't use a cpu_map so reduce them to being like
constructors of aggr_cpu_id.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c   | 18 +++++++++---------
 tools/perf/tests/topology.c |  8 ++++----
 tools/perf/util/cpumap.c    | 14 +++++++-------
 tools/perf/util/cpumap.h    | 29 +++++++++++++++++++++++++----
 4 files changed, 45 insertions(+), 24 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2b04a948cf37..98f80c217279 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1336,25 +1336,25 @@ static const char *const aggr_mode__string[] = {
 static struct aggr_cpu_id perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
 						int cpu)
 {
-	return cpu_map__get_socket_aggr_by_cpu(cpu, /*data=*/NULL);
+	return aggr_cpu_id__socket(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_die(struct perf_stat_config *config __maybe_unused,
 					     int cpu)
 {
-	return cpu_map__get_die_aggr_by_cpu(cpu, /*data=*/NULL);
+	return aggr_cpu_id__die(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_core(struct perf_stat_config *config __maybe_unused,
 					      int cpu)
 {
-	return cpu_map__get_core_aggr_by_cpu(cpu, /*data=*/NULL);
+	return aggr_cpu_id__core(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_node(struct perf_stat_config *config __maybe_unused,
 					      int cpu)
 {
-	return cpu_map__get_node_aggr_by_cpu(cpu, /*data=*/NULL);
+	return aggr_cpu_id__node(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
@@ -1409,16 +1409,16 @@ static aggr_cpu_id_get_t aggr_mode__get_aggr(enum aggr_mode aggr_mode)
 {
 	switch (aggr_mode) {
 	case AGGR_SOCKET:
-		return cpu_map__get_socket_aggr_by_cpu;
+		return aggr_cpu_id__socket;
 	case AGGR_DIE:
-		return cpu_map__get_die_aggr_by_cpu;
+		return aggr_cpu_id__die;
 	case AGGR_CORE:
-		return cpu_map__get_core_aggr_by_cpu;
+		return aggr_cpu_id__core;
 	case AGGR_NODE:
-		return cpu_map__get_node_aggr_by_cpu;
+		return aggr_cpu_id__node;
 	case AGGR_NONE:
 		if (term_percore_set())
-			return cpu_map__get_core_aggr_by_cpu;
+			return aggr_cpu_id__core;
 
 		return NULL;
 	case AGGR_GLOBAL:
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 5992b323c4f5..0cb7b015b4b9 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -121,7 +121,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that core ID contains socket, die and core
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_core_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+		id = aggr_cpu_id__core(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Core map - Core ID doesn't match",
 			session->header.env.cpu[map->map[i]].core_id == id.core);
 
@@ -136,7 +136,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that die ID contains socket and die
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_die_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+		id = aggr_cpu_id__die(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Die map - Socket ID doesn't match",
 			session->header.env.cpu[map->map[i]].socket_id == id.socket);
 
@@ -150,7 +150,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that socket ID contains only socket
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_socket_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+		id = aggr_cpu_id__socket(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Socket map - Socket ID doesn't match",
 			session->header.env.cpu[map->map[i]].socket_id == id.socket);
 
@@ -162,7 +162,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 	// Test that node ID contains only node
 	for (i = 0; i < map->nr; i++) {
-		id = cpu_map__get_node_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+		id = aggr_cpu_id__node(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Node map - Node ID doesn't match",
 			cpu__get_node(map->map[i]) == id.node);
 		TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index c8f9b3f15759..19e502cc65e7 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -132,7 +132,7 @@ int cpu__get_socket_id(int cpu)
 	return ret ?: value;
 }
 
-struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
+struct aggr_cpu_id aggr_cpu_id__socket(int cpu, void *data __maybe_unused)
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
@@ -200,7 +200,7 @@ int cpu__get_die_id(int cpu)
 	return ret ?: value;
 }
 
-struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
+struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data)
 {
 	struct aggr_cpu_id id;
 	int die;
@@ -215,7 +215,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
 	 * with the socket ID and then add die to
 	 * make a unique ID.
 	 */
-	id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
+	id = aggr_cpu_id__socket(cpu, data);
 	if (aggr_cpu_id__is_empty(&id))
 		return id;
 
@@ -229,13 +229,13 @@ int cpu__get_core_id(int cpu)
 	return ret ?: value;
 }
 
-struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
+struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data)
 {
 	struct aggr_cpu_id id;
 	int core = cpu__get_core_id(cpu);
 
-	/* cpu_map__get_die returns a struct with socket and die set*/
-	id = cpu_map__get_die_aggr_by_cpu(cpu, data);
+	/* aggr_cpu_id__die returns a struct with socket and die set*/
+	id = aggr_cpu_id__die(cpu, data);
 	if (aggr_cpu_id__is_empty(&id))
 		return id;
 
@@ -248,7 +248,7 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
 
 }
 
-struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
+struct aggr_cpu_id aggr_cpu_id__node(int cpu, void *data __maybe_unused)
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 611048e2a592..ecd658293a2d 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -39,10 +39,6 @@ struct perf_cpu_map *cpu_map__new_data(struct perf_record_cpu_map_data *data);
 size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
 size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
 size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
-struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
 const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */
 
 int cpu__setup_cpunode_map(void);
@@ -88,4 +84,29 @@ bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b
 bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a);
 struct aggr_cpu_id aggr_cpu_id__empty(void);
 
+
+/**
+ * aggr_cpu_id__socket - Create an aggr_cpu_id with the socket populated with
+ * the socket for cpu. The function signature is compatible with
+ * aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__socket(int cpu, void *data);
+/**
+ * aggr_cpu_id__die - Create an aggr_cpu_id with the die and socket populated
+ * with the die and socket for cpu. The function signature is compatible with
+ * aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data);
+/**
+ * aggr_cpu_id__core - Create an aggr_cpu_id with the core, die and socket
+ * populated with the core, die and socket for cpu. The function signature is
+ * compatible with aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data);
+/**
+ * aggr_cpu_id__node - Create an aggr_cpu_id with the numa node populated for
+ * cpu. The function signature is compatible with aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__node(int cpu, void *data);
+
 #endif /* __PERF_CPUMAP_H */
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 20/48] perf cpumap: Move 'has' function to libperf
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (18 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 19/48] perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 21/48] perf cpumap: Add some comments to cpu_aggr_map Ian Rogers
                   ` (28 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Make the cpu map argument const for consistency with the rest of the
API. Modify cpu_map__idx accordingly.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/Documentation/libperf.txt |  1 +
 tools/lib/perf/cpumap.c                  |  7 ++++++-
 tools/lib/perf/include/internal/cpumap.h |  2 +-
 tools/lib/perf/include/perf/cpumap.h     |  1 +
 tools/lib/perf/libperf.map               |  1 +
 tools/perf/arch/arm/util/cs-etm.c        | 16 ++++++++--------
 tools/perf/builtin-sched.c               |  6 +++---
 tools/perf/tests/topology.c              |  2 +-
 tools/perf/util/cpumap.c                 |  5 -----
 tools/perf/util/cpumap.h                 |  2 --
 tools/perf/util/cputopo.c                |  2 +-
 11 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt
index 63ae5e0195ce..faef9ba3a540 100644
--- a/tools/lib/perf/Documentation/libperf.txt
+++ b/tools/lib/perf/Documentation/libperf.txt
@@ -48,6 +48,7 @@ SYNOPSIS
   int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
   bool perf_cpu_map__empty(const struct perf_cpu_map *map);
   int perf_cpu_map__max(struct perf_cpu_map *map);
+  bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
 
   #define perf_cpu_map__for_each_cpu(cpu, idx, cpus)
 --
diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index adaad3dddf6e..3c36a06771af 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -268,7 +268,7 @@ bool perf_cpu_map__empty(const struct perf_cpu_map *map)
 	return map ? map->map[0] == -1 : true;
 }
 
-int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu)
+int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu)
 {
 	int low = 0, high = cpus->nr;
 
@@ -288,6 +288,11 @@ int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu)
 	return -1;
 }
 
+bool perf_cpu_map__has(const struct perf_cpu_map *cpus, int cpu)
+{
+	return perf_cpu_map__idx(cpus, cpu) != -1;
+}
+
 int perf_cpu_map__max(struct perf_cpu_map *map)
 {
 	// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 4054169c12c5..71a31ed738c9 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -23,6 +23,6 @@ struct perf_cpu_map {
 #define MAX_NR_CPUS	2048
 #endif
 
-int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu);
+int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu);
 
 #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
index 7c27766ea0bf..3f1c0afa3ccd 100644
--- a/tools/lib/perf/include/perf/cpumap.h
+++ b/tools/lib/perf/include/perf/cpumap.h
@@ -20,6 +20,7 @@ LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
 LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
 LIBPERF_API bool perf_cpu_map__empty(const struct perf_cpu_map *map);
 LIBPERF_API int perf_cpu_map__max(struct perf_cpu_map *map);
+LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
 
 #define perf_cpu_map__for_each_cpu(cpu, idx, cpus)		\
 	for ((idx) = 0, (cpu) = perf_cpu_map__cpu(cpus, idx);	\
diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
index 5979bf92d98f..93696affda2e 100644
--- a/tools/lib/perf/libperf.map
+++ b/tools/lib/perf/libperf.map
@@ -10,6 +10,7 @@ LIBPERF_0.0.1 {
 		perf_cpu_map__cpu;
 		perf_cpu_map__empty;
 		perf_cpu_map__max;
+		perf_cpu_map__has;
 		perf_thread_map__new_dummy;
 		perf_thread_map__set_pid;
 		perf_thread_map__comm;
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 8a3d54a86c9c..129c0272d65b 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -204,8 +204,8 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
 
 	/* Set option of each CPU we have */
 	for (i = 0; i < cpu__max_cpu(); i++) {
-		if (!cpu_map__has(event_cpus, i) ||
-		    !cpu_map__has(online_cpus, i))
+		if (!perf_cpu_map__has(event_cpus, i) ||
+		    !perf_cpu_map__has(online_cpus, i))
 			continue;
 
 		if (option & BIT(ETM_OPT_CTXTID)) {
@@ -523,8 +523,8 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
 	/* cpu map is not empty, we have specific CPUs to work with */
 	if (!perf_cpu_map__empty(event_cpus)) {
 		for (i = 0; i < cpu__max_cpu(); i++) {
-			if (!cpu_map__has(event_cpus, i) ||
-			    !cpu_map__has(online_cpus, i))
+			if (!perf_cpu_map__has(event_cpus, i) ||
+			    !perf_cpu_map__has(online_cpus, i))
 				continue;
 
 			if (cs_etm_is_ete(itr, i))
@@ -537,7 +537,7 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
 	} else {
 		/* get configuration for all CPUs in the system */
 		for (i = 0; i < cpu__max_cpu(); i++) {
-			if (!cpu_map__has(online_cpus, i))
+			if (!perf_cpu_map__has(online_cpus, i))
 				continue;
 
 			if (cs_etm_is_ete(itr, i))
@@ -722,8 +722,8 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
 	} else {
 		/* Make sure all specified CPUs are online */
 		for (i = 0; i < perf_cpu_map__nr(event_cpus); i++) {
-			if (cpu_map__has(event_cpus, i) &&
-			    !cpu_map__has(online_cpus, i))
+			if (perf_cpu_map__has(event_cpus, i) &&
+			    !perf_cpu_map__has(online_cpus, i))
 				return -EINVAL;
 		}
 
@@ -744,7 +744,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
 	offset = CS_ETM_SNAPSHOT + 1;
 
 	for (i = 0; i < cpu__max_cpu() && offset < priv_size; i++)
-		if (cpu_map__has(cpu_map, i))
+		if (perf_cpu_map__has(cpu_map, i))
 			cs_etm_get_metadata(i, &offset, itr, info);
 
 	perf_cpu_map__put(online_cpus);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 4527f632ebe4..9da1da4749c9 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1617,10 +1617,10 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
 		if (curr_thread && thread__has_color(curr_thread))
 			pid_color = COLOR_PIDS;
 
-		if (sched->map.cpus && !cpu_map__has(sched->map.cpus, cpu))
+		if (sched->map.cpus && !perf_cpu_map__has(sched->map.cpus, cpu))
 			continue;
 
-		if (sched->map.color_cpus && cpu_map__has(sched->map.color_cpus, cpu))
+		if (sched->map.color_cpus && perf_cpu_map__has(sched->map.color_cpus, cpu))
 			cpu_color = COLOR_CPUS;
 
 		if (cpu != this_cpu)
@@ -1639,7 +1639,7 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
 			color_fprintf(stdout, color, "   ");
 	}
 
-	if (sched->map.cpus && !cpu_map__has(sched->map.cpus, this_cpu))
+	if (sched->map.cpus && !perf_cpu_map__has(sched->map.cpus, this_cpu))
 		goto out;
 
 	timestamp__scnprintf_usec(timestamp, stimestamp, sizeof(stimestamp));
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 0cb7b015b4b9..cb29ea7ec409 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -112,7 +112,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	TEST_ASSERT_VAL("Session header CPU map not set", session->header.env.cpu);
 
 	for (i = 0; i < session->header.env.nr_cpus_avail; i++) {
-		if (!cpu_map__has(map, i))
+		if (!perf_cpu_map__has(map, i))
 			continue;
 		pr_debug("CPU %d, core %d, socket %d\n", i,
 			 session->header.env.cpu[i].core_id,
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 19e502cc65e7..f1d76a8e92e8 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -463,11 +463,6 @@ int cpu__setup_cpunode_map(void)
 	return 0;
 }
 
-bool cpu_map__has(struct perf_cpu_map *cpus, int cpu)
-{
-	return perf_cpu_map__idx(cpus, cpu) != -1;
-}
-
 size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
 {
 	int i, cpu, start = -1;
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index ecd658293a2d..32b8b5178f01 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -78,8 +78,6 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 				       aggr_cpu_id_get_t get_id,
 				       void *data);
 
-bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
-
 bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
 bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a);
 struct aggr_cpu_id aggr_cpu_id__empty(void);
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index 51b429c86f98..8affb37d90e7 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -218,7 +218,7 @@ struct cpu_topology *cpu_topology__new(void)
 	tp->core_cpus_list = addr;
 
 	for (i = 0; i < nr; i++) {
-		if (!cpu_map__has(map, i))
+		if (!perf_cpu_map__has(map, i))
 			continue;
 
 		ret = build_cpu_topology(tp, i);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 21/48] perf cpumap: Add some comments to cpu_aggr_map
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (19 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 20/48] perf cpumap: Move 'has' function to libperf Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 22/48] perf cpumap: Trim the cpu_aggr_map Ian Rogers
                   ` (27 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Move cpu_aggr_map__empty_new to be with other cpu_aggr_map function.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 32b8b5178f01..25a08d640d81 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -24,16 +24,18 @@ struct aggr_cpu_id {
 	int core;
 };
 
+/** A collection of aggr_cpu_id values, the "built" version is sorted and uniqued. */
 struct cpu_aggr_map {
 	refcount_t refcnt;
+	/** Number of valid entries. */
 	int nr;
+	/** The entries. */
 	struct aggr_cpu_id map[];
 };
 
 struct perf_record_cpu_map_data;
 
 struct perf_cpu_map *perf_cpu_map__empty_new(int nr);
-struct cpu_aggr_map *cpu_aggr_map__empty_new(int nr);
 
 struct perf_cpu_map *cpu_map__new_data(struct perf_record_cpu_map_data *data);
 size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
@@ -67,6 +69,12 @@ int cpu__get_die_id(int cpu);
  */
 int cpu__get_core_id(int cpu);
 
+/**
+ * cpu_aggr_map__empty_new - Create a cpu_aggr_map of size nr with every entry
+ * being empty.
+ */
+struct cpu_aggr_map *cpu_aggr_map__empty_new(int nr);
+
 typedef struct aggr_cpu_id (*aggr_cpu_id_get_t)(int cpu, void *data);
 
 /**
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 22/48] perf cpumap: Trim the cpu_aggr_map
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (20 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 21/48] perf cpumap: Add some comments to cpu_aggr_map Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 23/48] perf stat: Fix memory leak in check_per_pkg Ian Rogers
                   ` (26 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
cpu_aggr_map__new removes duplicates, when this happens shrink the
array.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cpumap.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index f1d76a8e92e8..2779474f39db 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -185,7 +185,15 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 			c->nr++;
 		}
 	}
-
+	/* Trim. */
+	if (c->nr != cpus->nr) {
+		struct cpu_aggr_map *trimmed_c =
+			realloc(c,
+				sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
+
+		if (trimmed_c)
+			c = trimmed_c;
+	}
 	/* ensure we process id in increasing order */
 	qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), aggr_cpu_id__cmp);
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 23/48] perf stat: Fix memory leak in check_per_pkg
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (21 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 22/48] perf cpumap: Trim the cpu_aggr_map Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 24/48] perf cpumap: Add CPU to aggr_cpu_id Ian Rogers
                   ` (25 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
If the key is already present then free the key used for lookup.
Found with:
$ perf stat -M IO_Read_BW /bin/true
==1749112==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 32 byte(s) in 4 object(s) allocated from:
    #0 0x7f6f6fa7d7cf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x55acecd9d7a6 in check_per_pkg util/stat.c:343
    #2 0x55acecd9d9c5 in process_counter_values util/stat.c:365
    #3 0x55acecd9e0ab in process_counter_maps util/stat.c:421
    #4 0x55acecd9e292 in perf_stat_process_counter util/stat.c:443
    #5 0x55aceca8553e in read_counters ./tools/perf/builtin-stat.c:470
    #6 0x55aceca88fe3 in __run_perf_stat ./tools/perf/builtin-stat.c:1023
    #7 0x55aceca89146 in run_perf_stat ./tools/perf/builtin-stat.c:1048
    #8 0x55aceca90858 in cmd_stat ./tools/perf/builtin-stat.c:2555
    #9 0x55acecc05fa5 in run_builtin ./tools/perf/perf.c:313
    #10 0x55acecc064fe in handle_internal_command ./tools/perf/perf.c:365
    #11 0x55acecc068bb in run_argv ./tools/perf/perf.c:409
    #12 0x55acecc070aa in main ./tools/perf/perf.c:539
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 5c24aca0968c..c69b221f5e3e 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -345,9 +345,10 @@ static int check_per_pkg(struct evsel *counter,
 		return -ENOMEM;
 
 	*key = (uint64_t)d << 32 | s;
-	if (hashmap__find(mask, (void *)key, NULL))
+	if (hashmap__find(mask, (void *)key, NULL)) {
 		*skip = true;
-	else
+		free(key);
+	} else
 		ret = hashmap__add(mask, (void *)key, (void *)1);
 
 	return ret;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 24/48] perf cpumap: Add CPU to aggr_cpu_id
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (22 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 23/48] perf stat: Fix memory leak in check_per_pkg Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 25/48] perf stat-display: Avoid use of core for CPU Ian Rogers
                   ` (24 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
With no aggregration, such as 'perf stat -A', the aggr_cpu_id lacks a
way to describe per CPU aggregation and the core is set to the CPU in
places like print_counter_aggrdata in stat-display.c. Setting the core
to the CPU is undesirable as the CPU will exceed valid core values and
lead to confusion. Add a CPU variable to address this.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/topology.c | 19 +++++++++++++++++++
 tools/perf/util/cpumap.c    | 25 +++++++++++++++++++++----
 tools/perf/util/cpumap.h    |  8 ++++++++
 3 files changed, 48 insertions(+), 4 deletions(-)
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index cb29ea7ec409..33e4cb81265c 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -119,6 +119,22 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 			 session->header.env.cpu[i].socket_id);
 	}
 
+	// Test that CPU ID contains socket, die, core and CPU
+	for (i = 0; i < map->nr; i++) {
+		id = aggr_cpu_id__cpu(perf_cpu_map__cpu(map, i), NULL);
+		TEST_ASSERT_VAL("Cpu map - CPU ID doesn't match", map->map[i] == id.cpu);
+
+		TEST_ASSERT_VAL("Cpu map - Core ID doesn't match",
+			session->header.env.cpu[map->map[i]].core_id == id.core);
+		TEST_ASSERT_VAL("Cpu map - Socket ID doesn't match",
+			session->header.env.cpu[map->map[i]].socket_id == id.socket);
+
+		TEST_ASSERT_VAL("Cpu map - Die ID doesn't match",
+			session->header.env.cpu[map->map[i]].die_id == id.die);
+		TEST_ASSERT_VAL("Cpu map - Node ID is set", id.node == -1);
+		TEST_ASSERT_VAL("Cpu map - Thread is set", id.thread == -1);
+	}
+
 	// Test that core ID contains socket, die and core
 	for (i = 0; i < map->nr; i++) {
 		id = aggr_cpu_id__core(perf_cpu_map__cpu(map, i), NULL);
@@ -145,6 +161,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 
 		TEST_ASSERT_VAL("Die map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Die map - Core is set", id.core == -1);
+		TEST_ASSERT_VAL("Die map - CPU is set", id.cpu == -1);
 		TEST_ASSERT_VAL("Die map - Thread is set", id.thread == -1);
 	}
 
@@ -157,6 +174,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 		TEST_ASSERT_VAL("Socket map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Socket map - Die ID is set", id.die == -1);
 		TEST_ASSERT_VAL("Socket map - Core is set", id.core == -1);
+		TEST_ASSERT_VAL("Socket map - CPU is set", id.cpu == -1);
 		TEST_ASSERT_VAL("Socket map - Thread is set", id.thread == -1);
 	}
 
@@ -168,6 +186,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 		TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1);
 		TEST_ASSERT_VAL("Node map - Die ID is set", id.die == -1);
 		TEST_ASSERT_VAL("Node map - Core is set", id.core == -1);
+		TEST_ASSERT_VAL("Node map - CPU is set", id.cpu == -1);
 		TEST_ASSERT_VAL("Node map - Thread is set", id.thread == -1);
 	}
 	perf_session__delete(session);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 2779474f39db..48ce583af0ec 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -242,7 +242,7 @@ struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data)
 	struct aggr_cpu_id id;
 	int core = cpu__get_core_id(cpu);
 
-	/* aggr_cpu_id__die returns a struct with socket and die set*/
+	/* aggr_cpu_id__die returns a struct with socket and die set. */
 	id = aggr_cpu_id__die(cpu, data);
 	if (aggr_cpu_id__is_empty(&id))
 		return id;
@@ -256,6 +256,20 @@ struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data)
 
 }
 
+struct aggr_cpu_id aggr_cpu_id__cpu(int cpu, void *data)
+{
+	struct aggr_cpu_id id;
+
+	/* aggr_cpu_id__core returns a struct with socket, die and core set. */
+	id = aggr_cpu_id__core(cpu, data);
+	if (aggr_cpu_id__is_empty(&id))
+		return id;
+
+	id.cpu = cpu;
+	return id;
+
+}
+
 struct aggr_cpu_id aggr_cpu_id__node(int cpu, void *data __maybe_unused)
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
@@ -579,7 +593,8 @@ bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b
 		a->node == b->node &&
 		a->socket == b->socket &&
 		a->die == b->die &&
-		a->core == b->core;
+		a->core == b->core &&
+		a->cpu == b->cpu;
 }
 
 bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a)
@@ -588,7 +603,8 @@ bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a)
 		a->node == -1 &&
 		a->socket == -1 &&
 		a->die == -1 &&
-		a->core == -1;
+		a->core == -1 &&
+		a->cpu == -1;
 }
 
 struct aggr_cpu_id aggr_cpu_id__empty(void)
@@ -598,7 +614,8 @@ struct aggr_cpu_id aggr_cpu_id__empty(void)
 		.node = -1,
 		.socket = -1,
 		.die = -1,
-		.core = -1
+		.core = -1,
+		.cpu = -1
 	};
 	return ret;
 }
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 25a08d640d81..b98cd1739677 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -22,6 +22,8 @@ struct aggr_cpu_id {
 	int die;
 	/** The core id as read from /sys/devices/system/cpu/cpuX/topology/core_id. */
 	int core;
+	/** CPU aggregation, note there is one CPU for each SMT thread. */
+	int cpu;
 };
 
 /** A collection of aggr_cpu_id values, the "built" version is sorted and uniqued. */
@@ -109,6 +111,12 @@ struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data);
  * compatible with aggr_cpu_id_get_t.
  */
 struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data);
+/**
+ * aggr_cpu_id__core - Create an aggr_cpu_id with the cpu, core, die and socket
+ * populated with the cpu, core, die and socket for cpu. The function signature
+ * is compatible with aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__cpu(int cpu, void *data);
 /**
  * aggr_cpu_id__node - Create an aggr_cpu_id with the numa node populated for
  * cpu. The function signature is compatible with aggr_cpu_id_get_t.
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 25/48] perf stat-display: Avoid use of core for CPU.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (23 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 24/48] perf cpumap: Add CPU to aggr_cpu_id Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 26/48] perf evsel: Derive CPUs and threads in alloc_counts Ian Rogers
                   ` (23 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Correct use of cpumap index in print_no_aggr_metric.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-display.c | 45 +++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 23 deletions(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 870b1db71fbc..f48d1678861c 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -120,11 +120,10 @@ static void aggr_printout(struct perf_stat_config *config,
 				id.die,
 				config->csv_output ? 0 : -3,
 				id.core, config->csv_sep);
-		} else if (id.core > -1) {
+		} else if (id.cpu > -1) {
 			fprintf(config->output, "CPU%*d%s",
 				config->csv_output ? 0 : -7,
-				evsel__cpus(evsel)->map[id.core],
-				config->csv_sep);
+				id.cpu, config->csv_sep);
 		}
 		break;
 	case AGGR_THREAD:
@@ -334,7 +333,7 @@ static int first_shadow_cpu(struct perf_stat_config *config,
 	int cpu, idx;
 
 	if (config->aggr_mode == AGGR_NONE)
-		return id->core;
+		return id->cpu;
 
 	if (!config->aggr_get_id)
 		return 0;
@@ -697,10 +696,9 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
 		fprintf(output, "%s", prefix);
 
 	uval = val * counter->scale;
-	if (cpu != -1) {
-		id = aggr_cpu_id__empty();
-		id.core = cpu;
-	}
+	if (cpu != -1)
+		id = aggr_cpu_id__cpu(cpu, /*data=*/NULL);
+
 	printout(config, id, nr, counter, uval,
 		 prefix, run, ena, 1.0, &rt_stat);
 	if (!metric_only)
@@ -911,8 +909,7 @@ static void print_counter(struct perf_stat_config *config,
 			fprintf(output, "%s", prefix);
 
 		uval = val * counter->scale;
-		id = aggr_cpu_id__empty();
-		id.core = cpu;
+		id = aggr_cpu_id__cpu(cpu, /*data=*/NULL);
 		printout(config, id, 0, counter, uval, prefix,
 			 run, ena, 1.0, &rt_stat);
 
@@ -924,29 +921,31 @@ static void print_no_aggr_metric(struct perf_stat_config *config,
 				 struct evlist *evlist,
 				 char *prefix)
 {
-	int cpu;
-	int nrcpus = 0;
-	struct evsel *counter;
-	u64 ena, run, val;
-	double uval;
-	struct aggr_cpu_id id;
+	int all_idx, cpu;
 
-	nrcpus = evlist->core.cpus->nr;
-	for (cpu = 0; cpu < nrcpus; cpu++) {
+	perf_cpu_map__for_each_cpu(cpu, all_idx, evlist->core.cpus) {
+		struct evsel *counter;
 		bool first = true;
 
 		if (prefix)
 			fputs(prefix, config->output);
 		evlist__for_each_entry(evlist, counter) {
-			id = aggr_cpu_id__empty();
-			id.core = cpu;
+			u64 ena, run, val;
+			double uval;
+			struct aggr_cpu_id id;
+			int counter_idx = perf_cpu_map__idx(evsel__cpus(counter), cpu);
+
+			if (counter_idx < 0)
+				continue;
+
+			id = aggr_cpu_id__cpu(cpu, /*data=*/NULL);
 			if (first) {
 				aggr_printout(config, counter, id, 0);
 				first = false;
 			}
-			val = perf_counts(counter->counts, cpu, 0)->val;
-			ena = perf_counts(counter->counts, cpu, 0)->ena;
-			run = perf_counts(counter->counts, cpu, 0)->run;
+			val = perf_counts(counter->counts, counter_idx, 0)->val;
+			ena = perf_counts(counter->counts, counter_idx, 0)->ena;
+			run = perf_counts(counter->counts, counter_idx, 0)->run;
 
 			uval = val * counter->scale;
 			printout(config, id, 0, counter, uval, prefix,
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 26/48] perf evsel: Derive CPUs and threads in alloc_counts
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (24 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 25/48] perf stat-display: Avoid use of core for CPU Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-09 18:30   ` Jiri Olsa
  2022-01-05  6:13 ` [PATCH v4 27/48] libperf: Switch cpu to more accurate cpu_map_idx Ian Rogers
                   ` (22 subsequent siblings)
  48 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Passing the number of CPUs and threads allows for an evsel's counts to
be mismatched to its cpu map. To avoid this always derive the counts
size from the cpu map. Change openat-syscall-all-cpus to set the cpus
to allow for this to work.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/openat-syscall-all-cpus.c | 10 +---------
 tools/perf/util/counts.c                   |  8 ++++++--
 tools/perf/util/counts.h                   |  2 +-
 tools/perf/util/evsel.c                    |  2 +-
 tools/perf/util/stat.c                     | 13 ++++++-------
 5 files changed, 15 insertions(+), 20 deletions(-)
diff --git a/tools/perf/tests/openat-syscall-all-cpus.c b/tools/perf/tests/openat-syscall-all-cpus.c
index cd3dd463783f..544db0839b3b 100644
--- a/tools/perf/tests/openat-syscall-all-cpus.c
+++ b/tools/perf/tests/openat-syscall-all-cpus.c
@@ -85,15 +85,7 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb
 		CPU_CLR(cpus->map[cpu], &cpu_set);
 	}
 
-	/*
-	 * Here we need to explicitly preallocate the counts, as if
-	 * we use the auto allocation it will allocate just for 1 cpu,
-	 * as we start by cpu 0.
-	 */
-	if (evsel__alloc_counts(evsel, cpus->nr, 1) < 0) {
-		pr_debug("evsel__alloc_counts(ncpus=%d)\n", cpus->nr);
-		goto out_close_fd;
-	}
+	evsel->core.cpus = perf_cpu_map__get(cpus);
 
 	err = 0;
 
diff --git a/tools/perf/util/counts.c b/tools/perf/util/counts.c
index 582f3aeaf5e4..2b81707b9dba 100644
--- a/tools/perf/util/counts.c
+++ b/tools/perf/util/counts.c
@@ -4,6 +4,7 @@
 #include <string.h>
 #include "evsel.h"
 #include "counts.h"
+#include <perf/threadmap.h>
 #include <linux/zalloc.h>
 
 struct perf_counts *perf_counts__new(int ncpus, int nthreads)
@@ -55,9 +56,12 @@ void evsel__reset_counts(struct evsel *evsel)
 	perf_counts__reset(evsel->counts);
 }
 
-int evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads)
+int evsel__alloc_counts(struct evsel *evsel)
 {
-	evsel->counts = perf_counts__new(ncpus, nthreads);
+	struct perf_cpu_map *cpus = evsel__cpus(evsel);
+	int nthreads = perf_thread_map__nr(evsel->core.threads);
+
+	evsel->counts = perf_counts__new(cpus ? cpus->nr : 1, nthreads);
 	return evsel->counts != NULL ? 0 : -ENOMEM;
 }
 
diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
index 7ff36bf6d644..3e275e9c60d1 100644
--- a/tools/perf/util/counts.h
+++ b/tools/perf/util/counts.h
@@ -40,7 +40,7 @@ void perf_counts__delete(struct perf_counts *counts);
 void perf_counts__reset(struct perf_counts *counts);
 
 void evsel__reset_counts(struct evsel *evsel);
-int evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
+int evsel__alloc_counts(struct evsel *evsel);
 void evsel__free_counts(struct evsel *evsel);
 
 #endif /* __PERF_COUNTS_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 656c30b988ce..6c9af21776e6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1578,7 +1578,7 @@ int __evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread, bool scale)
 	if (FD(evsel, cpu, thread) < 0)
 		return -EINVAL;
 
-	if (evsel->counts == NULL && evsel__alloc_counts(evsel, cpu + 1, thread + 1) < 0)
+	if (evsel->counts == NULL && evsel__alloc_counts(evsel) < 0)
 		return -ENOMEM;
 
 	if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) <= 0)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index c69b221f5e3e..995cb5003133 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -152,11 +152,13 @@ static void evsel__free_stat_priv(struct evsel *evsel)
 	zfree(&evsel->stats);
 }
 
-static int evsel__alloc_prev_raw_counts(struct evsel *evsel, int ncpus, int nthreads)
+static int evsel__alloc_prev_raw_counts(struct evsel *evsel)
 {
+	int cpu_map_nr = evsel__nr_cpus(evsel);
+	int nthreads = perf_thread_map__nr(evsel->core.threads);
 	struct perf_counts *counts;
 
-	counts = perf_counts__new(ncpus, nthreads);
+	counts = perf_counts__new(cpu_map_nr, nthreads);
 	if (counts)
 		evsel->prev_raw_counts = counts;
 
@@ -177,12 +179,9 @@ static void evsel__reset_prev_raw_counts(struct evsel *evsel)
 
 static int evsel__alloc_stats(struct evsel *evsel, bool alloc_raw)
 {
-	int ncpus = evsel__nr_cpus(evsel);
-	int nthreads = perf_thread_map__nr(evsel->core.threads);
-
 	if (evsel__alloc_stat_priv(evsel) < 0 ||
-	    evsel__alloc_counts(evsel, ncpus, nthreads) < 0 ||
-	    (alloc_raw && evsel__alloc_prev_raw_counts(evsel, ncpus, nthreads) < 0))
+	    evsel__alloc_counts(evsel) < 0 ||
+	    (alloc_raw && evsel__alloc_prev_raw_counts(evsel) < 0))
 		return -ENOMEM;
 
 	return 0;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 27/48] libperf: Switch cpu to more accurate cpu_map_idx
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (25 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 26/48] perf evsel: Derive CPUs and threads in alloc_counts Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 28/48] libperf: Use cpu not index for evsel mmap Ian Rogers
                   ` (21 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Modify variable names and adopt perf_cpu_map__for_each_cpu in
perf_evsel__open. Renaming is done by looking for consistency in API
usage.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/evsel.c              | 88 +++++++++++++++--------------
 tools/lib/perf/include/perf/evsel.h | 10 ++--
 2 files changed, 50 insertions(+), 48 deletions(-)
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 68f83d2c27c1..8028b5a4da69 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -43,18 +43,22 @@ void perf_evsel__delete(struct perf_evsel *evsel)
 	free(evsel);
 }
 
-#define FD(e, x, y) ((int *) xyarray__entry(e->fd, x, y))
-#define MMAP(e, x, y) (e->mmap ? ((struct perf_mmap *) xyarray__entry(e->mmap, x, y)) : NULL)
+#define FD(_evsel, _cpu_map_idx, _thread)				\
+	((int *)xyarray__entry(_evsel->fd, _cpu_map_idx, _thread))
+#define MMAP(_evsel, _cpu_map_idx, _thread)				\
+	(_evsel->mmap ? ((struct perf_mmap *) xyarray__entry(_evsel->mmap, _cpu_map_idx, _thread)) \
+		      : NULL)
 
 int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
 {
 	evsel->fd = xyarray__new(ncpus, nthreads, sizeof(int));
 
 	if (evsel->fd) {
-		int cpu, thread;
-		for (cpu = 0; cpu < ncpus; cpu++) {
+		int idx, thread;
+
+		for (idx = 0; idx < ncpus; idx++) {
 			for (thread = 0; thread < nthreads; thread++) {
-				int *fd = FD(evsel, cpu, thread);
+				int *fd = FD(evsel, idx, thread);
 
 				if (fd)
 					*fd = -1;
@@ -80,7 +84,7 @@ sys_perf_event_open(struct perf_event_attr *attr,
 	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
 }
 
-static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread, int *group_fd)
+static int get_group_fd(struct perf_evsel *evsel, int cpu_map_idx, int thread, int *group_fd)
 {
 	struct perf_evsel *leader = evsel->leader;
 	int *fd;
@@ -97,7 +101,7 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread, int *grou
 	if (!leader->fd)
 		return -ENOTCONN;
 
-	fd = FD(leader, cpu, thread);
+	fd = FD(leader, cpu_map_idx, thread);
 	if (fd == NULL || *fd == -1)
 		return -EBADF;
 
@@ -109,7 +113,7 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread, int *grou
 int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
 		     struct perf_thread_map *threads)
 {
-	int cpu, thread, err = 0;
+	int cpu, idx, thread, err = 0;
 
 	if (cpus == NULL) {
 		static struct perf_cpu_map *empty_cpu_map;
@@ -139,21 +143,21 @@ int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
 	    perf_evsel__alloc_fd(evsel, cpus->nr, threads->nr) < 0)
 		return -ENOMEM;
 
-	for (cpu = 0; cpu < cpus->nr; cpu++) {
+	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
 		for (thread = 0; thread < threads->nr; thread++) {
 			int fd, group_fd, *evsel_fd;
 
-			evsel_fd = FD(evsel, cpu, thread);
+			evsel_fd = FD(evsel, idx, thread);
 			if (evsel_fd == NULL)
 				return -EINVAL;
 
-			err = get_group_fd(evsel, cpu, thread, &group_fd);
+			err = get_group_fd(evsel, idx, thread, &group_fd);
 			if (err < 0)
 				return err;
 
 			fd = sys_perf_event_open(&evsel->attr,
 						 threads->map[thread].pid,
-						 cpus->map[cpu], group_fd, 0);
+						 cpu, group_fd, 0);
 
 			if (fd < 0)
 				return -errno;
@@ -165,12 +169,12 @@ int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
 	return err;
 }
 
-static void perf_evsel__close_fd_cpu(struct perf_evsel *evsel, int cpu)
+static void perf_evsel__close_fd_cpu(struct perf_evsel *evsel, int cpu_map_idx)
 {
 	int thread;
 
 	for (thread = 0; thread < xyarray__max_y(evsel->fd); ++thread) {
-		int *fd = FD(evsel, cpu, thread);
+		int *fd = FD(evsel, cpu_map_idx, thread);
 
 		if (fd && *fd >= 0) {
 			close(*fd);
@@ -181,10 +185,8 @@ static void perf_evsel__close_fd_cpu(struct perf_evsel *evsel, int cpu)
 
 void perf_evsel__close_fd(struct perf_evsel *evsel)
 {
-	int cpu;
-
-	for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++)
-		perf_evsel__close_fd_cpu(evsel, cpu);
+	for (int idx = 0; idx < xyarray__max_x(evsel->fd); idx++)
+		perf_evsel__close_fd_cpu(evsel, idx);
 }
 
 void perf_evsel__free_fd(struct perf_evsel *evsel)
@@ -202,29 +204,29 @@ void perf_evsel__close(struct perf_evsel *evsel)
 	perf_evsel__free_fd(evsel);
 }
 
-void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu)
+void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu_map_idx)
 {
 	if (evsel->fd == NULL)
 		return;
 
-	perf_evsel__close_fd_cpu(evsel, cpu);
+	perf_evsel__close_fd_cpu(evsel, cpu_map_idx);
 }
 
 void perf_evsel__munmap(struct perf_evsel *evsel)
 {
-	int cpu, thread;
+	int idx, thread;
 
 	if (evsel->fd == NULL || evsel->mmap == NULL)
 		return;
 
-	for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++) {
+	for (idx = 0; idx < xyarray__max_x(evsel->fd); idx++) {
 		for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
-			int *fd = FD(evsel, cpu, thread);
+			int *fd = FD(evsel, idx, thread);
 
 			if (fd == NULL || *fd < 0)
 				continue;
 
-			perf_mmap__munmap(MMAP(evsel, cpu, thread));
+			perf_mmap__munmap(MMAP(evsel, idx, thread));
 		}
 	}
 
@@ -234,7 +236,7 @@ void perf_evsel__munmap(struct perf_evsel *evsel)
 
 int perf_evsel__mmap(struct perf_evsel *evsel, int pages)
 {
-	int ret, cpu, thread;
+	int ret, idx, thread;
 	struct perf_mmap_param mp = {
 		.prot = PROT_READ | PROT_WRITE,
 		.mask = (pages * page_size) - 1,
@@ -246,18 +248,18 @@ int perf_evsel__mmap(struct perf_evsel *evsel, int pages)
 	if (perf_evsel__alloc_mmap(evsel, xyarray__max_x(evsel->fd), xyarray__max_y(evsel->fd)) < 0)
 		return -ENOMEM;
 
-	for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++) {
+	for (idx = 0; idx < xyarray__max_x(evsel->fd); idx++) {
 		for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
-			int *fd = FD(evsel, cpu, thread);
+			int *fd = FD(evsel, idx, thread);
 			struct perf_mmap *map;
 
 			if (fd == NULL || *fd < 0)
 				continue;
 
-			map = MMAP(evsel, cpu, thread);
+			map = MMAP(evsel, idx, thread);
 			perf_mmap__init(map, NULL, false, NULL);
 
-			ret = perf_mmap__mmap(map, &mp, *fd, cpu);
+			ret = perf_mmap__mmap(map, &mp, *fd, idx);
 			if (ret) {
 				perf_evsel__munmap(evsel);
 				return ret;
@@ -268,14 +270,14 @@ int perf_evsel__mmap(struct perf_evsel *evsel, int pages)
 	return 0;
 }
 
-void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu, int thread)
+void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu_map_idx, int thread)
 {
-	int *fd = FD(evsel, cpu, thread);
+	int *fd = FD(evsel, cpu_map_idx, thread);
 
-	if (fd == NULL || *fd < 0 || MMAP(evsel, cpu, thread) == NULL)
+	if (fd == NULL || *fd < 0 || MMAP(evsel, cpu_map_idx, thread) == NULL)
 		return NULL;
 
-	return MMAP(evsel, cpu, thread)->base;
+	return MMAP(evsel, cpu_map_idx, thread)->base;
 }
 
 int perf_evsel__read_size(struct perf_evsel *evsel)
@@ -303,19 +305,19 @@ int perf_evsel__read_size(struct perf_evsel *evsel)
 	return size;
 }
 
-int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
+int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int thread,
 		     struct perf_counts_values *count)
 {
 	size_t size = perf_evsel__read_size(evsel);
-	int *fd = FD(evsel, cpu, thread);
+	int *fd = FD(evsel, cpu_map_idx, thread);
 
 	memset(count, 0, sizeof(*count));
 
 	if (fd == NULL || *fd < 0)
 		return -EINVAL;
 
-	if (MMAP(evsel, cpu, thread) &&
-	    !perf_mmap__read_self(MMAP(evsel, cpu, thread), count))
+	if (MMAP(evsel, cpu_map_idx, thread) &&
+	    !perf_mmap__read_self(MMAP(evsel, cpu_map_idx, thread), count))
 		return 0;
 
 	if (readn(*fd, count->values, size) <= 0)
@@ -326,13 +328,13 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 
 static int perf_evsel__run_ioctl(struct perf_evsel *evsel,
 				 int ioc,  void *arg,
-				 int cpu)
+				 int cpu_map_idx)
 {
 	int thread;
 
 	for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
 		int err;
-		int *fd = FD(evsel, cpu, thread);
+		int *fd = FD(evsel, cpu_map_idx, thread);
 
 		if (fd == NULL || *fd < 0)
 			return -1;
@@ -346,9 +348,9 @@ static int perf_evsel__run_ioctl(struct perf_evsel *evsel,
 	return 0;
 }
 
-int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu)
+int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx)
 {
-	return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, cpu);
+	return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, cpu_map_idx);
 }
 
 int perf_evsel__enable(struct perf_evsel *evsel)
@@ -361,9 +363,9 @@ int perf_evsel__enable(struct perf_evsel *evsel)
 	return err;
 }
 
-int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu)
+int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu_map_idx)
 {
-	return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_DISABLE, NULL, cpu);
+	return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_DISABLE, NULL, cpu_map_idx);
 }
 
 int perf_evsel__disable(struct perf_evsel *evsel)
diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
index f401c7484bec..2a9516b42d15 100644
--- a/tools/lib/perf/include/perf/evsel.h
+++ b/tools/lib/perf/include/perf/evsel.h
@@ -28,16 +28,16 @@ LIBPERF_API void perf_evsel__delete(struct perf_evsel *evsel);
 LIBPERF_API int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
 				 struct perf_thread_map *threads);
 LIBPERF_API void perf_evsel__close(struct perf_evsel *evsel);
-LIBPERF_API void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
+LIBPERF_API void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu_map_idx);
 LIBPERF_API int perf_evsel__mmap(struct perf_evsel *evsel, int pages);
 LIBPERF_API void perf_evsel__munmap(struct perf_evsel *evsel);
-LIBPERF_API void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu, int thread);
-LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
+LIBPERF_API void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu_map_idx, int thread);
+LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int thread,
 				 struct perf_counts_values *count);
 LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
-LIBPERF_API int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu);
+LIBPERF_API int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
 LIBPERF_API int perf_evsel__disable(struct perf_evsel *evsel);
-LIBPERF_API int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu);
+LIBPERF_API int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
 LIBPERF_API struct perf_cpu_map *perf_evsel__cpus(struct perf_evsel *evsel);
 LIBPERF_API struct perf_thread_map *perf_evsel__threads(struct perf_evsel *evsel);
 LIBPERF_API struct perf_event_attr *perf_evsel__attr(struct perf_evsel *evsel);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 28/48] libperf: Use cpu not index for evsel mmap
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (26 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 27/48] libperf: Switch cpu to more accurate cpu_map_idx Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 29/48] perf counts: Switch name cpu to cpu_map_idx Ian Rogers
                   ` (20 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Fix issue where evsel's CPU map index was being used as the mmap cpu.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/evsel.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8028b5a4da69..f1e1665ef4bd 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -252,6 +252,7 @@ int perf_evsel__mmap(struct perf_evsel *evsel, int pages)
 		for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
 			int *fd = FD(evsel, idx, thread);
 			struct perf_mmap *map;
+			int cpu = perf_cpu_map__cpu(evsel->cpus, idx);
 
 			if (fd == NULL || *fd < 0)
 				continue;
@@ -259,7 +260,7 @@ int perf_evsel__mmap(struct perf_evsel *evsel, int pages)
 			map = MMAP(evsel, idx, thread);
 			perf_mmap__init(map, NULL, false, NULL);
 
-			ret = perf_mmap__mmap(map, &mp, *fd, idx);
+			ret = perf_mmap__mmap(map, &mp, *fd, cpu);
 			if (ret) {
 				perf_evsel__munmap(evsel);
 				return ret;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 29/48] perf counts: Switch name cpu to cpu_map_idx
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (27 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 28/48] libperf: Use cpu not index for evsel mmap Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 30/48] perf stat: Rename aggr_data cpu to imply it's an index Ian Rogers
                   ` (19 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Try to reduce confusion in particular when the cpu map doesn't contain
an entry for every CPU.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/counts.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
index 3e275e9c60d1..5de275194f2b 100644
--- a/tools/perf/util/counts.h
+++ b/tools/perf/util/counts.h
@@ -18,21 +18,21 @@ struct perf_counts {
 
 
 static inline struct perf_counts_values*
-perf_counts(struct perf_counts *counts, int cpu, int thread)
+perf_counts(struct perf_counts *counts, int cpu_map_idx, int thread)
 {
-	return xyarray__entry(counts->values, cpu, thread);
+	return xyarray__entry(counts->values, cpu_map_idx, thread);
 }
 
 static inline bool
-perf_counts__is_loaded(struct perf_counts *counts, int cpu, int thread)
+perf_counts__is_loaded(struct perf_counts *counts, int cpu_map_idx, int thread)
 {
-	return *((bool *) xyarray__entry(counts->loaded, cpu, thread));
+	return *((bool *) xyarray__entry(counts->loaded, cpu_map_idx, thread));
 }
 
 static inline void
-perf_counts__set_loaded(struct perf_counts *counts, int cpu, int thread, bool loaded)
+perf_counts__set_loaded(struct perf_counts *counts, int cpu_map_idx, int thread, bool loaded)
 {
-	*((bool *) xyarray__entry(counts->loaded, cpu, thread)) = loaded;
+	*((bool *) xyarray__entry(counts->loaded, cpu_map_idx, thread)) = loaded;
 }
 
 struct perf_counts *perf_counts__new(int ncpus, int nthreads);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 30/48] perf stat: Rename aggr_data cpu to imply it's an index
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (28 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 29/48] perf counts: Switch name cpu to cpu_map_idx Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 31/48] perf stat: Use perf_cpu_map__for_each_cpu Ian Rogers
                   ` (18 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Trying to make cpu maps less error prone.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-display.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index f48d1678861c..7e933a8fee68 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -626,7 +626,7 @@ struct aggr_data {
 	u64 ena, run, val;
 	struct aggr_cpu_id id;
 	int nr;
-	int cpu;
+	int cpu_map_idx;
 };
 
 static void aggr_cb(struct perf_stat_config *config,
@@ -878,9 +878,9 @@ static void counter_cb(struct perf_stat_config *config __maybe_unused,
 {
 	struct aggr_data *ad = data;
 
-	ad->val += perf_counts(counter->counts, ad->cpu, 0)->val;
-	ad->ena += perf_counts(counter->counts, ad->cpu, 0)->ena;
-	ad->run += perf_counts(counter->counts, ad->cpu, 0)->run;
+	ad->val += perf_counts(counter->counts, ad->cpu_map_idx, 0)->val;
+	ad->ena += perf_counts(counter->counts, ad->cpu_map_idx, 0)->ena;
+	ad->run += perf_counts(counter->counts, ad->cpu_map_idx, 0)->run;
 }
 
 /*
@@ -897,7 +897,7 @@ static void print_counter(struct perf_stat_config *config,
 	struct aggr_cpu_id id;
 
 	for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
-		struct aggr_data ad = { .cpu = cpu };
+		struct aggr_data ad = { .cpu_map_idx = cpu };
 
 		if (!collect_data(config, counter, counter_cb, &ad))
 			return;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 31/48] perf stat: Use perf_cpu_map__for_each_cpu
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (29 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 30/48] perf stat: Rename aggr_data cpu to imply it's an index Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 32/48] perf script: Use for each cpu to aid readability Ian Rogers
                   ` (17 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Correct in print_counter where an index was being used as a cpu.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-display.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 7e933a8fee68..0f192360b6c6 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -4,6 +4,7 @@
 #include <linux/string.h>
 #include <linux/time64.h>
 #include <math.h>
+#include <perf/cpumap.h>
 #include "color.h"
 #include "counts.h"
 #include "evlist.h"
@@ -732,7 +733,7 @@ static void print_aggr(struct perf_stat_config *config,
 		evlist__for_each_entry(evlist, counter) {
 			print_counter_aggrdata(config, counter, s,
 					       prefix, metric_only,
-					       &first, -1);
+					       &first, /*cpu=*/-1);
 		}
 		if (metric_only)
 			fputc('\n', output);
@@ -893,11 +894,11 @@ static void print_counter(struct perf_stat_config *config,
 	FILE *output = config->output;
 	u64 ena, run, val;
 	double uval;
-	int cpu;
+	int idx, cpu;
 	struct aggr_cpu_id id;
 
-	for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
-		struct aggr_data ad = { .cpu_map_idx = cpu };
+	perf_cpu_map__for_each_cpu(cpu, idx, evsel__cpus(counter)) {
+		struct aggr_data ad = { .cpu_map_idx = idx };
 
 		if (!collect_data(config, counter, counter_cb, &ad))
 			return;
@@ -1248,7 +1249,7 @@ static void print_percore(struct perf_stat_config *config,
 
 		print_counter_aggrdata(config, counter, s,
 				       prefix, metric_only,
-				       &first, -1);
+				       &first, /*cpu=*/-1);
 	}
 
 	if (metric_only)
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 32/48] perf script: Use for each cpu to aid readability
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (30 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 31/48] perf stat: Use perf_cpu_map__for_each_cpu Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 33/48] libperf: Allow NULL in perf_cpu_map__idx Ian Rogers
                   ` (16 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Use perf_cpu_map__for_each_cpu to help with readability.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-script.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c9b3002ec254..f40319144856 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2115,8 +2115,7 @@ static struct scripting_ops	*scripting_ops;
 static void __process_stat(struct evsel *counter, u64 tstamp)
 {
 	int nthreads = perf_thread_map__nr(counter->core.threads);
-	int ncpus = evsel__nr_cpus(counter);
-	int cpu, thread;
+	int idx, cpu, thread;
 	static int header_printed;
 
 	if (counter->core.system_wide)
@@ -2129,13 +2128,13 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 	}
 
 	for (thread = 0; thread < nthreads; thread++) {
-		for (cpu = 0; cpu < ncpus; cpu++) {
+		perf_cpu_map__for_each_cpu(cpu, idx, evsel__cpus(counter)) {
 			struct perf_counts_values *counts;
 
 			counts = perf_counts(counter->counts, cpu, thread);
 
 			printf("%3d %8d %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %s\n",
-				counter->core.cpus->map[cpu],
+				cpu,
 				perf_thread_map__pid(counter->core.threads, thread),
 				counts->val,
 				counts->ena,
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 33/48] libperf: Allow NULL in perf_cpu_map__idx
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (31 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 32/48] perf script: Use for each cpu to aid readability Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 34/48] perf evlist: Refactor evlist__for_each_cpu Ian Rogers
                   ` (15 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Return -1, not found, if NULL is passed.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/cpumap.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index 3c36a06771af..eacea3ab965a 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -270,8 +270,13 @@ bool perf_cpu_map__empty(const struct perf_cpu_map *map)
 
 int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu)
 {
-	int low = 0, high = cpus->nr;
+	int low, high;
 
+	if (!cpus)
+		return -1;
+
+	low = 0;
+	high = cpus->nr;
 	while (low < high) {
 		int idx = (low + high) / 2,
 		    cpu_at_idx = cpus->map[idx];
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 34/48] perf evlist: Refactor evlist__for_each_cpu.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (32 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 33/48] libperf: Allow NULL in perf_cpu_map__idx Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 35/48] perf evsel: Pass cpu not cpu map index to synthesize Ian Rogers
                   ` (14 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Previously evlist__for_each_cpu needed to iterate over the evlist in an
inner loop and call "skip" routines. Refactor this so that the iteratr
is smarter and the next function can update both the current CPU and
evsel.
By using a cpu map index, fix apparent off-by-1 in __run_perf_stat's
call to perf_evsel__close_cpu.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c | 179 ++++++++++++++++++--------------------
 tools/perf/util/evlist.c  | 146 +++++++++++++++++--------------
 tools/perf/util/evlist.h  |  50 +++++++++--
 tools/perf/util/evsel.h   |   1 -
 4 files changed, 210 insertions(+), 166 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98f80c217279..f31bc34bd4cc 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -405,36 +405,33 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu)
 
 static int read_affinity_counters(struct timespec *rs)
 {
-	struct evsel *counter;
-	struct affinity affinity;
-	int i, ncpus, cpu;
+	struct evlist_cpu_iterator evlist_cpu_itr;
+	struct affinity saved_affinity, *affinity;
 
 	if (all_counters_use_bpf)
 		return 0;
 
-	if (affinity__setup(&affinity) < 0)
+	if (!target__has_cpu(&target) || target__has_per_thread(&target))
+		affinity = NULL;
+	else if (affinity__setup(&saved_affinity) < 0)
 		return -1;
+	else
+		affinity = &saved_affinity;
 
-	ncpus = perf_cpu_map__nr(evsel_list->core.all_cpus);
-	if (!target__has_cpu(&target) || target__has_per_thread(&target))
-		ncpus = 1;
-	evlist__for_each_cpu(evsel_list, i, cpu) {
-		if (i >= ncpus)
-			break;
-		affinity__set(&affinity, cpu);
+	evlist__for_each_cpu(evlist_cpu_itr, evsel_list, affinity) {
+		struct evsel *counter = evlist_cpu_itr.evsel;
 
-		evlist__for_each_entry(evsel_list, counter) {
-			if (evsel__cpu_iter_skip(counter, cpu))
-				continue;
-			if (evsel__is_bpf(counter))
-				continue;
-			if (!counter->err) {
-				counter->err = read_counter_cpu(counter, rs,
-								counter->cpu_iter - 1);
-			}
+		if (evsel__is_bpf(counter))
+			continue;
+
+		if (!counter->err) {
+			counter->err = read_counter_cpu(counter, rs,
+							evlist_cpu_itr.cpu_map_idx);
 		}
 	}
-	affinity__cleanup(&affinity);
+	if (affinity)
+		affinity__cleanup(&saved_affinity);
+
 	return 0;
 }
 
@@ -788,8 +785,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	int status = 0;
 	const bool forks = (argc > 0);
 	bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
+	struct evlist_cpu_iterator evlist_cpu_itr;
 	struct affinity affinity;
-	int i, cpu, err;
+	int err;
 	bool second_pass = false;
 
 	if (forks) {
@@ -813,102 +811,97 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 			all_counters_use_bpf = false;
 	}
 
-	evlist__for_each_cpu (evsel_list, i, cpu) {
+	evlist__for_each_cpu(evlist_cpu_itr, evsel_list, &affinity) {
+		counter = evlist_cpu_itr.evsel;
+
 		/*
 		 * bperf calls evsel__open_per_cpu() in bperf__load(), so
 		 * no need to call it again here.
 		 */
 		if (target.use_bpf)
 			break;
-		affinity__set(&affinity, cpu);
 
-		evlist__for_each_entry(evsel_list, counter) {
-			if (evsel__cpu_iter_skip(counter, cpu))
+		if (counter->reset_group || counter->errored)
+			continue;
+		if (evsel__is_bpf(counter))
+			continue;
+try_again:
+		if (create_perf_stat_counter(counter, &stat_config, &target,
+					     evlist_cpu_itr.cpu_map_idx) < 0) {
+
+			/*
+			 * Weak group failed. We cannot just undo this here
+			 * because earlier CPUs might be in group mode, and the kernel
+			 * doesn't support mixing group and non group reads. Defer
+			 * it to later.
+			 * Don't close here because we're in the wrong affinity.
+			 */
+			if ((errno == EINVAL || errno == EBADF) &&
+				evsel__leader(counter) != counter &&
+				counter->weak_group) {
+				evlist__reset_weak_group(evsel_list, counter, false);
+				assert(counter->reset_group);
+				second_pass = true;
 				continue;
-			if (counter->reset_group || counter->errored)
+			}
+
+			switch (stat_handle_error(counter)) {
+			case COUNTER_FATAL:
+				return -1;
+			case COUNTER_RETRY:
+				goto try_again;
+			case COUNTER_SKIP:
 				continue;
-			if (evsel__is_bpf(counter))
+			default:
+				break;
+			}
+
+		}
+		counter->supported = true;
+	}
+
+	if (second_pass) {
+		/*
+		 * Now redo all the weak group after closing them,
+		 * and also close errored counters.
+		 */
+
+		/* First close errored or weak retry */
+		evlist__for_each_cpu(evlist_cpu_itr, evsel_list, &affinity) {
+			counter = evlist_cpu_itr.evsel;
+
+			if (!counter->reset_group && !counter->errored)
 				continue;
-try_again:
+
+			perf_evsel__close_cpu(&counter->core, evlist_cpu_itr.cpu_map_idx);
+		}
+		/* Now reopen weak */
+		evlist__for_each_cpu(evlist_cpu_itr, evsel_list, &affinity) {
+			counter = evlist_cpu_itr.evsel;
+
+			if (!counter->reset_group && !counter->errored)
+				continue;
+			if (!counter->reset_group)
+				continue;
+try_again_reset:
+			pr_debug2("reopening weak %s\n", evsel__name(counter));
 			if (create_perf_stat_counter(counter, &stat_config, &target,
-						     counter->cpu_iter - 1) < 0) {
-
-				/*
-				 * Weak group failed. We cannot just undo this here
-				 * because earlier CPUs might be in group mode, and the kernel
-				 * doesn't support mixing group and non group reads. Defer
-				 * it to later.
-				 * Don't close here because we're in the wrong affinity.
-				 */
-				if ((errno == EINVAL || errno == EBADF) &&
-				    evsel__leader(counter) != counter &&
-				    counter->weak_group) {
-					evlist__reset_weak_group(evsel_list, counter, false);
-					assert(counter->reset_group);
-					second_pass = true;
-					continue;
-				}
+						     evlist_cpu_itr.cpu_map_idx) < 0) {
 
 				switch (stat_handle_error(counter)) {
 				case COUNTER_FATAL:
 					return -1;
 				case COUNTER_RETRY:
-					goto try_again;
+					goto try_again_reset;
 				case COUNTER_SKIP:
 					continue;
 				default:
 					break;
 				}
-
 			}
 			counter->supported = true;
 		}
 	}
-
-	if (second_pass) {
-		/*
-		 * Now redo all the weak group after closing them,
-		 * and also close errored counters.
-		 */
-
-		evlist__for_each_cpu(evsel_list, i, cpu) {
-			affinity__set(&affinity, cpu);
-			/* First close errored or weak retry */
-			evlist__for_each_entry(evsel_list, counter) {
-				if (!counter->reset_group && !counter->errored)
-					continue;
-				if (evsel__cpu_iter_skip_no_inc(counter, cpu))
-					continue;
-				perf_evsel__close_cpu(&counter->core, counter->cpu_iter);
-			}
-			/* Now reopen weak */
-			evlist__for_each_entry(evsel_list, counter) {
-				if (!counter->reset_group && !counter->errored)
-					continue;
-				if (evsel__cpu_iter_skip(counter, cpu))
-					continue;
-				if (!counter->reset_group)
-					continue;
-try_again_reset:
-				pr_debug2("reopening weak %s\n", evsel__name(counter));
-				if (create_perf_stat_counter(counter, &stat_config, &target,
-							     counter->cpu_iter - 1) < 0) {
-
-					switch (stat_handle_error(counter)) {
-					case COUNTER_FATAL:
-						return -1;
-					case COUNTER_RETRY:
-						goto try_again_reset;
-					case COUNTER_SKIP:
-						continue;
-					default:
-						break;
-					}
-				}
-				counter->supported = true;
-			}
-		}
-	}
 	affinity__cleanup(&affinity);
 
 	evlist__for_each_entry(evsel_list, counter) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5f92319ce258..39d294f6c321 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -342,36 +342,65 @@ static int evlist__nr_threads(struct evlist *evlist, struct evsel *evsel)
 		return perf_thread_map__nr(evlist->core.threads);
 }
 
-void evlist__cpu_iter_start(struct evlist *evlist)
-{
-	struct evsel *pos;
-
-	/*
-	 * Reset the per evsel cpu_iter. This is needed because
-	 * each evsel's cpumap may have a different index space,
-	 * and some operations need the index to modify
-	 * the FD xyarray (e.g. open, close)
-	 */
-	evlist__for_each_entry(evlist, pos)
-		pos->cpu_iter = 0;
-}
+struct evlist_cpu_iterator evlist__cpu_begin(struct evlist *evlist, struct affinity *affinity)
+{
+	struct evlist_cpu_iterator itr = {
+		.container = evlist,
+		.evsel = evlist__first(evlist),
+		.cpu_map_idx = 0,
+		.evlist_cpu_map_idx = 0,
+		.evlist_cpu_map_nr = perf_cpu_map__nr(evlist->core.all_cpus),
+		.cpu = -1,
+		.affinity = affinity,
+	};
 
-bool evsel__cpu_iter_skip_no_inc(struct evsel *ev, int cpu)
-{
-	if (ev->cpu_iter >= ev->core.cpus->nr)
-		return true;
-	if (cpu >= 0 && ev->core.cpus->map[ev->cpu_iter] != cpu)
-		return true;
-	return false;
+	if (itr.affinity) {
+		itr.cpu = perf_cpu_map__cpu(evlist->core.all_cpus, 0);
+		affinity__set(itr.affinity, itr.cpu);
+		itr.cpu_map_idx = perf_cpu_map__idx(itr.evsel->core.cpus, itr.cpu);
+		/*
+		 * If this CPU isn't in the evsel's cpu map then advance through
+		 * the list.
+		 */
+		if (itr.cpu_map_idx == -1)
+			evlist_cpu_iterator__next(&itr);
+	}
+	return itr;
+}
+
+void evlist_cpu_iterator__next(struct evlist_cpu_iterator *evlist_cpu_itr)
+{
+	while (evlist_cpu_itr->evsel != evlist__last(evlist_cpu_itr->container)) {
+		evlist_cpu_itr->evsel = evsel__next(evlist_cpu_itr->evsel);
+		evlist_cpu_itr->cpu_map_idx =
+			perf_cpu_map__idx(evlist_cpu_itr->evsel->core.cpus,
+					  evlist_cpu_itr->cpu);
+		if (evlist_cpu_itr->cpu_map_idx != -1)
+			return;
+	}
+	evlist_cpu_itr->evlist_cpu_map_idx++;
+	if (evlist_cpu_itr->evlist_cpu_map_idx < evlist_cpu_itr->evlist_cpu_map_nr) {
+		evlist_cpu_itr->evsel = evlist__first(evlist_cpu_itr->container);
+		evlist_cpu_itr->cpu =
+			perf_cpu_map__cpu(evlist_cpu_itr->container->core.all_cpus,
+					  evlist_cpu_itr->evlist_cpu_map_idx);
+		if (evlist_cpu_itr->affinity)
+			affinity__set(evlist_cpu_itr->affinity, evlist_cpu_itr->cpu);
+		evlist_cpu_itr->cpu_map_idx =
+			perf_cpu_map__idx(evlist_cpu_itr->evsel->core.cpus,
+					  evlist_cpu_itr->cpu);
+		/*
+		 * If this CPU isn't in the evsel's cpu map then advance through
+		 * the list.
+		 */
+		if (evlist_cpu_itr->cpu_map_idx == -1)
+			evlist_cpu_iterator__next(evlist_cpu_itr);
+	}
 }
 
-bool evsel__cpu_iter_skip(struct evsel *ev, int cpu)
+bool evlist_cpu_iterator__end(const struct evlist_cpu_iterator *evlist_cpu_itr)
 {
-	if (!evsel__cpu_iter_skip_no_inc(ev, cpu)) {
-		ev->cpu_iter++;
-		return false;
-	}
-	return true;
+	return evlist_cpu_itr->evlist_cpu_map_idx >= evlist_cpu_itr->evlist_cpu_map_nr;
 }
 
 static int evsel__strcmp(struct evsel *pos, char *evsel_name)
@@ -400,31 +429,26 @@ static int evlist__is_enabled(struct evlist *evlist)
 static void __evlist__disable(struct evlist *evlist, char *evsel_name)
 {
 	struct evsel *pos;
+	struct evlist_cpu_iterator evlist_cpu_itr;
 	struct affinity affinity;
-	int cpu, i, imm = 0;
 	bool has_imm = false;
 
 	if (affinity__setup(&affinity) < 0)
 		return;
 
 	/* Disable 'immediate' events last */
-	for (imm = 0; imm <= 1; imm++) {
-		evlist__for_each_cpu(evlist, i, cpu) {
-			affinity__set(&affinity, cpu);
-
-			evlist__for_each_entry(evlist, pos) {
-				if (evsel__strcmp(pos, evsel_name))
-					continue;
-				if (evsel__cpu_iter_skip(pos, cpu))
-					continue;
-				if (pos->disabled || !evsel__is_group_leader(pos) || !pos->core.fd)
-					continue;
-				if (pos->immediate)
-					has_imm = true;
-				if (pos->immediate != imm)
-					continue;
-				evsel__disable_cpu(pos, pos->cpu_iter - 1);
-			}
+	for (int imm = 0; imm <= 1; imm++) {
+		evlist__for_each_cpu(evlist_cpu_itr, evlist, &affinity) {
+			pos = evlist_cpu_itr.evsel;
+			if (evsel__strcmp(pos, evsel_name))
+				continue;
+			if (pos->disabled || !evsel__is_group_leader(pos) || !pos->core.fd)
+				continue;
+			if (pos->immediate)
+				has_imm = true;
+			if (pos->immediate != imm)
+				continue;
+			evsel__disable_cpu(pos, evlist_cpu_itr.cpu_map_idx);
 		}
 		if (!has_imm)
 			break;
@@ -462,24 +486,19 @@ void evlist__disable_evsel(struct evlist *evlist, char *evsel_name)
 static void __evlist__enable(struct evlist *evlist, char *evsel_name)
 {
 	struct evsel *pos;
+	struct evlist_cpu_iterator evlist_cpu_itr;
 	struct affinity affinity;
-	int cpu, i;
 
 	if (affinity__setup(&affinity) < 0)
 		return;
 
-	evlist__for_each_cpu(evlist, i, cpu) {
-		affinity__set(&affinity, cpu);
-
-		evlist__for_each_entry(evlist, pos) {
-			if (evsel__strcmp(pos, evsel_name))
-				continue;
-			if (evsel__cpu_iter_skip(pos, cpu))
-				continue;
-			if (!evsel__is_group_leader(pos) || !pos->core.fd)
-				continue;
-			evsel__enable_cpu(pos, pos->cpu_iter - 1);
-		}
+	evlist__for_each_cpu(evlist_cpu_itr, evlist, &affinity) {
+		pos = evlist_cpu_itr.evsel;
+		if (evsel__strcmp(pos, evsel_name))
+			continue;
+		if (!evsel__is_group_leader(pos) || !pos->core.fd)
+			continue;
+		evsel__enable_cpu(pos, evlist_cpu_itr.cpu_map_idx);
 	}
 	affinity__cleanup(&affinity);
 	evlist__for_each_entry(evlist, pos) {
@@ -1264,8 +1283,8 @@ void evlist__set_selected(struct evlist *evlist, struct evsel *evsel)
 void evlist__close(struct evlist *evlist)
 {
 	struct evsel *evsel;
+	struct evlist_cpu_iterator evlist_cpu_itr;
 	struct affinity affinity;
-	int cpu, i;
 
 	/*
 	 * With perf record core.cpus is usually NULL.
@@ -1279,15 +1298,12 @@ void evlist__close(struct evlist *evlist)
 
 	if (affinity__setup(&affinity) < 0)
 		return;
-	evlist__for_each_cpu(evlist, i, cpu) {
-		affinity__set(&affinity, cpu);
 
-		evlist__for_each_entry_reverse(evlist, evsel) {
-			if (evsel__cpu_iter_skip(evsel, cpu))
-			    continue;
-			perf_evsel__close_cpu(&evsel->core, evsel->cpu_iter - 1);
-		}
+	evlist__for_each_cpu(evlist_cpu_itr, evlist, &affinity) {
+		perf_evsel__close_cpu(&evlist_cpu_itr.evsel->core,
+				      evlist_cpu_itr.cpu_map_idx);
 	}
+
 	affinity__cleanup(&affinity);
 	evlist__for_each_entry_reverse(evlist, evsel) {
 		perf_evsel__free_fd(&evsel->core);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 27594900a052..57828ebfcb61 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -327,17 +327,53 @@ void evlist__to_front(struct evlist *evlist, struct evsel *move_evsel);
 #define evlist__for_each_entry_safe(evlist, tmp, evsel) \
 	__evlist__for_each_entry_safe(&(evlist)->core.entries, tmp, evsel)
 
-#define evlist__for_each_cpu(evlist, index, cpu)	\
-	evlist__cpu_iter_start(evlist);			\
-	perf_cpu_map__for_each_cpu (cpu, index, (evlist)->core.all_cpus)
+/** Iterator state for evlist__for_each_cpu */
+struct evlist_cpu_iterator {
+	/** The list being iterated through. */
+	struct evlist *container;
+	/** The current evsel of the iterator. */
+	struct evsel *evsel;
+	/** The CPU map index corresponding to the evsel->core.cpus for the current CPU. */
+	int cpu_map_idx;
+	/**
+	 * The CPU map index corresponding to evlist->core.all_cpus for the
+	 * current CPU.  Distinct from cpu_map_idx as the evsel's cpu map may
+	 * contain fewer entries.
+	 */
+	int evlist_cpu_map_idx;
+	/** The number of CPU map entries in evlist->core.all_cpus. */
+	int evlist_cpu_map_nr;
+	/** The current CPU of the iterator. */
+	int cpu;
+	/** If present, used to set the affinity when switching between CPUs. */
+	struct affinity *affinity;
+};
+
+/**
+ * evlist__for_each_cpu - without affinity, iterate over the evlist. With
+ *                        affinity, iterate over all CPUs and then the evlist
+ *                        for each evsel on that CPU. When switching between
+ *                        CPUs the affinity is set to the CPU to avoid IPIs
+ *                        during syscalls.
+ * @evlist_cpu_itr: the iterator instance.
+ * @evlist: evlist instance to iterate.
+ * @affinity: NULL or used to set the affinity to the current CPU.
+ */
+#define evlist__for_each_cpu(evlist_cpu_itr, evlist, affinity)		\
+	for ((evlist_cpu_itr) = evlist__cpu_begin(evlist, affinity);	\
+	     !evlist_cpu_iterator__end(&evlist_cpu_itr);		\
+	     evlist_cpu_iterator__next(&evlist_cpu_itr))
+
+/** Returns an iterator set to the first CPU/evsel of evlist. */
+struct evlist_cpu_iterator evlist__cpu_begin(struct evlist *evlist, struct affinity *affinity);
+/** Move to next element in iterator, updating CPU, evsel and the affinity. */
+void evlist_cpu_iterator__next(struct evlist_cpu_iterator *evlist_cpu_itr);
+/** Returns true when iterator is at the end of the CPUs and evlist. */
+bool evlist_cpu_iterator__end(const struct evlist_cpu_iterator *evlist_cpu_itr);
 
 struct evsel *evlist__get_tracking_event(struct evlist *evlist);
 void evlist__set_tracking_event(struct evlist *evlist, struct evsel *tracking_evsel);
 
-void evlist__cpu_iter_start(struct evlist *evlist);
-bool evsel__cpu_iter_skip(struct evsel *ev, int cpu);
-bool evsel__cpu_iter_skip_no_inc(struct evsel *ev, int cpu);
-
 struct evsel *evlist__find_evsel_by_str(struct evlist *evlist, const char *str);
 
 struct evsel *evlist__event2evsel(struct evlist *evlist, union perf_event *event);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 99aa3363def7..7cb7c9c77ab0 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -121,7 +121,6 @@ struct evsel {
 	bool			errored;
 	struct hashmap		*per_pkg_mask;
 	int			err;
-	int			cpu_iter;
 	struct {
 		evsel__sb_cb_t	*cb;
 		void		*data;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 35/48] perf evsel: Pass cpu not cpu map index to synthesize
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (33 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 34/48] perf evlist: Refactor evlist__for_each_cpu Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 36/48] perf stat: Correct variable name for read counter Ian Rogers
                   ` (13 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
evsel__write_stat_event was incorrectly passing a cpu map index rather
than a CPU to perf_event__synthesize_stat.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f31bc34bd4cc..ec252579b587 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -327,10 +327,11 @@ static int write_stat_round_event(u64 tm, u64 type)
 
 #define SID(e, x, y) xyarray__entry(e->core.sample_id, x, y)
 
-static int evsel__write_stat_event(struct evsel *counter, u32 cpu, u32 thread,
+static int evsel__write_stat_event(struct evsel *counter, int cpu_map_idx, u32 thread,
 				   struct perf_counts_values *count)
 {
-	struct perf_sample_id *sid = SID(counter, cpu, thread);
+	struct perf_sample_id *sid = SID(counter, cpu_map_idx, thread);
+	int cpu = perf_cpu_map__cpu(evsel__cpus(counter), cpu_map_idx);
 
 	return perf_event__synthesize_stat(NULL, cpu, thread, sid->id, count,
 					   process_synthesized_event, NULL);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 36/48] perf stat: Correct variable name for read counter
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (34 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 35/48] perf evsel: Pass cpu not cpu map index to synthesize Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 37/48] perf evsel: Rename CPU around get_group_fd Ian Rogers
                   ` (12 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Switch from cpu to cpu_map_idx to reduce confusion.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c | 24 ++++++++++++------------
 tools/perf/util/evsel.c   | 30 +++++++++++++++---------------
 tools/perf/util/evsel.h   | 16 ++++++++--------
 3 files changed, 35 insertions(+), 35 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index ec252579b587..68543bc7459f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -337,25 +337,25 @@ static int evsel__write_stat_event(struct evsel *counter, int cpu_map_idx, u32 t
 					   process_synthesized_event, NULL);
 }
 
-static int read_single_counter(struct evsel *counter, int cpu,
+static int read_single_counter(struct evsel *counter, int cpu_map_idx,
 			       int thread, struct timespec *rs)
 {
 	if (counter->tool_event == PERF_TOOL_DURATION_TIME) {
 		u64 val = rs->tv_nsec + rs->tv_sec*1000000000ULL;
 		struct perf_counts_values *count =
-			perf_counts(counter->counts, cpu, thread);
+			perf_counts(counter->counts, cpu_map_idx, thread);
 		count->ena = count->run = val;
 		count->val = val;
 		return 0;
 	}
-	return evsel__read_counter(counter, cpu, thread);
+	return evsel__read_counter(counter, cpu_map_idx, thread);
 }
 
 /*
  * Read out the results of a single counter:
  * do not aggregate counts across CPUs in system-wide mode
  */
-static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu)
+static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu_map_idx)
 {
 	int nthreads = perf_thread_map__nr(evsel_list->core.threads);
 	int thread;
@@ -369,24 +369,24 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu)
 	for (thread = 0; thread < nthreads; thread++) {
 		struct perf_counts_values *count;
 
-		count = perf_counts(counter->counts, cpu, thread);
+		count = perf_counts(counter->counts, cpu_map_idx, thread);
 
 		/*
 		 * The leader's group read loads data into its group members
 		 * (via evsel__read_counter()) and sets their count->loaded.
 		 */
-		if (!perf_counts__is_loaded(counter->counts, cpu, thread) &&
-		    read_single_counter(counter, cpu, thread, rs)) {
+		if (!perf_counts__is_loaded(counter->counts, cpu_map_idx, thread) &&
+		    read_single_counter(counter, cpu_map_idx, thread, rs)) {
 			counter->counts->scaled = -1;
-			perf_counts(counter->counts, cpu, thread)->ena = 0;
-			perf_counts(counter->counts, cpu, thread)->run = 0;
+			perf_counts(counter->counts, cpu_map_idx, thread)->ena = 0;
+			perf_counts(counter->counts, cpu_map_idx, thread)->run = 0;
 			return -1;
 		}
 
-		perf_counts__set_loaded(counter->counts, cpu, thread, false);
+		perf_counts__set_loaded(counter->counts, cpu_map_idx, thread, false);
 
 		if (STAT_RECORD) {
-			if (evsel__write_stat_event(counter, cpu, thread, count)) {
+			if (evsel__write_stat_event(counter, cpu_map_idx, thread, count)) {
 				pr_err("failed to write stat event\n");
 				return -1;
 			}
@@ -396,7 +396,7 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu)
 			fprintf(stat_config.output,
 				"%s: %d: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
 					evsel__name(counter),
-					cpu,
+					perf_cpu_map__cpu(evsel__cpus(counter), cpu_map_idx),
 					count->val, count->ena, count->run);
 		}
 	}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6c9af21776e6..b4a2c5034cfe 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1476,11 +1476,11 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
 	count->run = count->run - tmp.run;
 }
 
-static int evsel__read_one(struct evsel *evsel, int cpu, int thread)
+static int evsel__read_one(struct evsel *evsel, int cpu_map_idx, int thread)
 {
-	struct perf_counts_values *count = perf_counts(evsel->counts, cpu, thread);
+	struct perf_counts_values *count = perf_counts(evsel->counts, cpu_map_idx, thread);
 
-	return perf_evsel__read(&evsel->core, cpu, thread, count);
+	return perf_evsel__read(&evsel->core, cpu_map_idx, thread, count);
 }
 
 static void evsel__set_count(struct evsel *counter, int cpu, int thread, u64 val, u64 ena, u64 run)
@@ -1530,7 +1530,7 @@ static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
 	return 0;
 }
 
-static int evsel__read_group(struct evsel *leader, int cpu, int thread)
+static int evsel__read_group(struct evsel *leader, int cpu_map_idx, int thread)
 {
 	struct perf_stat_evsel *ps = leader->stats;
 	u64 read_format = leader->core.attr.read_format;
@@ -1551,42 +1551,42 @@ static int evsel__read_group(struct evsel *leader, int cpu, int thread)
 		ps->group_data = data;
 	}
 
-	if (FD(leader, cpu, thread) < 0)
+	if (FD(leader, cpu_map_idx, thread) < 0)
 		return -EINVAL;
 
-	if (readn(FD(leader, cpu, thread), data, size) <= 0)
+	if (readn(FD(leader, cpu_map_idx, thread), data, size) <= 0)
 		return -errno;
 
-	return evsel__process_group_data(leader, cpu, thread, data);
+	return evsel__process_group_data(leader, cpu_map_idx, thread, data);
 }
 
-int evsel__read_counter(struct evsel *evsel, int cpu, int thread)
+int evsel__read_counter(struct evsel *evsel, int cpu_map_idx, int thread)
 {
 	u64 read_format = evsel->core.attr.read_format;
 
 	if (read_format & PERF_FORMAT_GROUP)
-		return evsel__read_group(evsel, cpu, thread);
+		return evsel__read_group(evsel, cpu_map_idx, thread);
 
-	return evsel__read_one(evsel, cpu, thread);
+	return evsel__read_one(evsel, cpu_map_idx, thread);
 }
 
-int __evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread, bool scale)
+int __evsel__read_on_cpu(struct evsel *evsel, int cpu_map_idx, int thread, bool scale)
 {
 	struct perf_counts_values count;
 	size_t nv = scale ? 3 : 1;
 
-	if (FD(evsel, cpu, thread) < 0)
+	if (FD(evsel, cpu_map_idx, thread) < 0)
 		return -EINVAL;
 
 	if (evsel->counts == NULL && evsel__alloc_counts(evsel) < 0)
 		return -ENOMEM;
 
-	if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) <= 0)
+	if (readn(FD(evsel, cpu_map_idx, thread), &count, nv * sizeof(u64)) <= 0)
 		return -errno;
 
-	evsel__compute_deltas(evsel, cpu, thread, &count);
+	evsel__compute_deltas(evsel, cpu_map_idx, thread, &count);
 	perf_counts_values__scale(&count, scale, NULL);
-	*perf_counts(evsel->counts, cpu, thread) = count;
+	*perf_counts(evsel->counts, cpu_map_idx, thread) = count;
 	return 0;
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 7cb7c9c77ab0..c3db41282400 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -333,32 +333,32 @@ static inline bool evsel__match2(struct evsel *e1, struct evsel *e2)
 	       (e1->core.attr.config == e2->core.attr.config);
 }
 
-int evsel__read_counter(struct evsel *evsel, int cpu, int thread);
+int evsel__read_counter(struct evsel *evsel, int cpu_map_idx, int thread);
 
-int __evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread, bool scale);
+int __evsel__read_on_cpu(struct evsel *evsel, int cpu_map_idx, int thread, bool scale);
 
 /**
  * evsel__read_on_cpu - Read out the results on a CPU and thread
  *
  * @evsel - event selector to read value
- * @cpu - CPU of interest
+ * @cpu_map_idx - CPU of interest
  * @thread - thread of interest
  */
-static inline int evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread)
+static inline int evsel__read_on_cpu(struct evsel *evsel, int cpu_map_idx, int thread)
 {
-	return __evsel__read_on_cpu(evsel, cpu, thread, false);
+	return __evsel__read_on_cpu(evsel, cpu_map_idx, thread, false);
 }
 
 /**
  * evsel__read_on_cpu_scaled - Read out the results on a CPU and thread, scaled
  *
  * @evsel - event selector to read value
- * @cpu - CPU of interest
+ * @cpu_map_idx - CPU of interest
  * @thread - thread of interest
  */
-static inline int evsel__read_on_cpu_scaled(struct evsel *evsel, int cpu, int thread)
+static inline int evsel__read_on_cpu_scaled(struct evsel *evsel, int cpu_map_idx, int thread)
 {
-	return __evsel__read_on_cpu(evsel, cpu, thread, true);
+	return __evsel__read_on_cpu(evsel, cpu_map_idx, thread, true);
 }
 
 int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 37/48] perf evsel: Rename CPU around get_group_fd
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (35 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 36/48] perf stat: Correct variable name for read counter Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 38/48] perf evsel: Reduce scope of evsel__ignore_missing_thread Ian Rogers
                   ` (11 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
CPU is really a cpu map index, change names to make code more intention
revealing.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/evsel.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b4a2c5034cfe..4b3ba4408af2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1591,27 +1591,27 @@ int __evsel__read_on_cpu(struct evsel *evsel, int cpu_map_idx, int thread, bool
 }
 
 static int evsel__match_other_cpu(struct evsel *evsel, struct evsel *other,
-				  int cpu)
+				  int cpu_map_idx)
 {
-	int cpuid;
+	int cpu;
 
-	cpuid = perf_cpu_map__cpu(evsel->core.cpus, cpu);
-	return perf_cpu_map__idx(other->core.cpus, cpuid);
+	cpu = perf_cpu_map__cpu(evsel->core.cpus, cpu_map_idx);
+	return perf_cpu_map__idx(other->core.cpus, cpu);
 }
 
-static int evsel__hybrid_group_cpu(struct evsel *evsel, int cpu)
+static int evsel__hybrid_group_cpu_map_idx(struct evsel *evsel, int cpu_map_idx)
 {
 	struct evsel *leader = evsel__leader(evsel);
 
 	if ((evsel__is_hybrid(evsel) && !evsel__is_hybrid(leader)) ||
 	    (!evsel__is_hybrid(evsel) && evsel__is_hybrid(leader))) {
-		return evsel__match_other_cpu(evsel, leader, cpu);
+		return evsel__match_other_cpu(evsel, leader, cpu_map_idx);
 	}
 
-	return cpu;
+	return cpu_map_idx;
 }
 
-static int get_group_fd(struct evsel *evsel, int cpu, int thread)
+static int get_group_fd(struct evsel *evsel, int cpu_map_idx, int thread)
 {
 	struct evsel *leader = evsel__leader(evsel);
 	int fd;
@@ -1625,11 +1625,11 @@ static int get_group_fd(struct evsel *evsel, int cpu, int thread)
 	 */
 	BUG_ON(!leader->core.fd);
 
-	cpu = evsel__hybrid_group_cpu(evsel, cpu);
-	if (cpu == -1)
+	cpu_map_idx = evsel__hybrid_group_cpu_map_idx(evsel, cpu_map_idx);
+	if (cpu_map_idx == -1)
 		return -1;
 
-	fd = FD(leader, cpu, thread);
+	fd = FD(leader, cpu_map_idx, thread);
 	BUG_ON(fd == -1);
 
 	return fd;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 38/48] perf evsel: Reduce scope of evsel__ignore_missing_thread
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (36 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 37/48] perf evsel: Rename CPU around get_group_fd Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 39/48] perf evsel: Rename variable cpu to index Ian Rogers
                   ` (10 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Move to being static.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/evsel.c | 8 ++++----
 tools/perf/util/evsel.h | 4 ----
 2 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4b3ba4408af2..8911ebde54d6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1666,10 +1666,10 @@ static int update_fds(struct evsel *evsel,
 	return 0;
 }
 
-bool evsel__ignore_missing_thread(struct evsel *evsel,
-				  int nr_cpus, int cpu,
-				  struct perf_thread_map *threads,
-				  int thread, int err)
+static bool evsel__ignore_missing_thread(struct evsel *evsel,
+					 int nr_cpus, int cpu,
+					 struct perf_thread_map *threads,
+					 int thread, int err)
 {
 	pid_t ignore_pid = perf_thread_map__pid(threads, thread);
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c3db41282400..84e597f6c395 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -301,10 +301,6 @@ bool evsel__detect_missing_features(struct evsel *evsel);
 enum rlimit_action { NO_CHANGE, SET_TO_MAX, INCREASED_MAX };
 bool evsel__increase_rlimit(enum rlimit_action *set_rlimit);
 
-bool evsel__ignore_missing_thread(struct evsel *evsel,
-				  int nr_cpus, int cpu,
-				  struct perf_thread_map *threads,
-				  int thread, int err);
 bool evsel__precise_ip_fallback(struct evsel *evsel);
 
 struct perf_sample;
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 39/48] perf evsel: Rename variable cpu to index
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (37 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 38/48] perf evsel: Reduce scope of evsel__ignore_missing_thread Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 40/48] perf test: Use perf_cpu_map__for_each_cpu Ian Rogers
                   ` (9 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Make naming less error prone.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/evsel.c | 83 +++++++++++++++++++++--------------------
 tools/perf/util/evsel.h |  6 +--
 tools/perf/util/stat.c  |  4 +-
 tools/perf/util/stat.h  |  2 +-
 4 files changed, 48 insertions(+), 47 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8911ebde54d6..2b18b29a534f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1372,9 +1372,9 @@ int evsel__append_addr_filter(struct evsel *evsel, const char *filter)
 }
 
 /* Caller has to clear disabled after going through all CPUs. */
-int evsel__enable_cpu(struct evsel *evsel, int cpu)
+int evsel__enable_cpu(struct evsel *evsel, int cpu_map_idx)
 {
-	return perf_evsel__enable_cpu(&evsel->core, cpu);
+	return perf_evsel__enable_cpu(&evsel->core, cpu_map_idx);
 }
 
 int evsel__enable(struct evsel *evsel)
@@ -1387,9 +1387,9 @@ int evsel__enable(struct evsel *evsel)
 }
 
 /* Caller has to set disabled after going through all CPUs. */
-int evsel__disable_cpu(struct evsel *evsel, int cpu)
+int evsel__disable_cpu(struct evsel *evsel, int cpu_map_idx)
 {
-	return perf_evsel__disable_cpu(&evsel->core, cpu);
+	return perf_evsel__disable_cpu(&evsel->core, cpu_map_idx);
 }
 
 int evsel__disable(struct evsel *evsel)
@@ -1455,7 +1455,7 @@ void evsel__delete(struct evsel *evsel)
 	free(evsel);
 }
 
-void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
+void evsel__compute_deltas(struct evsel *evsel, int cpu_map_idx, int thread,
 			   struct perf_counts_values *count)
 {
 	struct perf_counts_values tmp;
@@ -1463,12 +1463,12 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
 	if (!evsel->prev_raw_counts)
 		return;
 
-	if (cpu == -1) {
+	if (cpu_map_idx == -1) {
 		tmp = evsel->prev_raw_counts->aggr;
 		evsel->prev_raw_counts->aggr = *count;
 	} else {
-		tmp = *perf_counts(evsel->prev_raw_counts, cpu, thread);
-		*perf_counts(evsel->prev_raw_counts, cpu, thread) = *count;
+		tmp = *perf_counts(evsel->prev_raw_counts, cpu_map_idx, thread);
+		*perf_counts(evsel->prev_raw_counts, cpu_map_idx, thread) = *count;
 	}
 
 	count->val = count->val - tmp.val;
@@ -1483,20 +1483,21 @@ static int evsel__read_one(struct evsel *evsel, int cpu_map_idx, int thread)
 	return perf_evsel__read(&evsel->core, cpu_map_idx, thread, count);
 }
 
-static void evsel__set_count(struct evsel *counter, int cpu, int thread, u64 val, u64 ena, u64 run)
+static void evsel__set_count(struct evsel *counter, int cpu_map_idx, int thread,
+			     u64 val, u64 ena, u64 run)
 {
 	struct perf_counts_values *count;
 
-	count = perf_counts(counter->counts, cpu, thread);
+	count = perf_counts(counter->counts, cpu_map_idx, thread);
 
 	count->val    = val;
 	count->ena    = ena;
 	count->run    = run;
 
-	perf_counts__set_loaded(counter->counts, cpu, thread, true);
+	perf_counts__set_loaded(counter->counts, cpu_map_idx, thread, true);
 }
 
-static int evsel__process_group_data(struct evsel *leader, int cpu, int thread, u64 *data)
+static int evsel__process_group_data(struct evsel *leader, int cpu_map_idx, int thread, u64 *data)
 {
 	u64 read_format = leader->core.attr.read_format;
 	struct sample_read_value *v;
@@ -1515,7 +1516,7 @@ static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
 
 	v = (struct sample_read_value *) data;
 
-	evsel__set_count(leader, cpu, thread, v[0].value, ena, run);
+	evsel__set_count(leader, cpu_map_idx, thread, v[0].value, ena, run);
 
 	for (i = 1; i < nr; i++) {
 		struct evsel *counter;
@@ -1524,7 +1525,7 @@ static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
 		if (!counter)
 			return -EINVAL;
 
-		evsel__set_count(counter, cpu, thread, v[i].value, ena, run);
+		evsel__set_count(counter, cpu_map_idx, thread, v[i].value, ena, run);
 	}
 
 	return 0;
@@ -1643,16 +1644,16 @@ static void evsel__remove_fd(struct evsel *pos, int nr_cpus, int nr_threads, int
 }
 
 static int update_fds(struct evsel *evsel,
-		      int nr_cpus, int cpu_idx,
+		      int nr_cpus, int cpu_map_idx,
 		      int nr_threads, int thread_idx)
 {
 	struct evsel *pos;
 
-	if (cpu_idx >= nr_cpus || thread_idx >= nr_threads)
+	if (cpu_map_idx >= nr_cpus || thread_idx >= nr_threads)
 		return -EINVAL;
 
 	evlist__for_each_entry(evsel->evlist, pos) {
-		nr_cpus = pos != evsel ? nr_cpus : cpu_idx;
+		nr_cpus = pos != evsel ? nr_cpus : cpu_map_idx;
 
 		evsel__remove_fd(pos, nr_cpus, nr_threads, thread_idx);
 
@@ -1667,7 +1668,7 @@ static int update_fds(struct evsel *evsel,
 }
 
 static bool evsel__ignore_missing_thread(struct evsel *evsel,
-					 int nr_cpus, int cpu,
+					 int nr_cpus, int cpu_map_idx,
 					 struct perf_thread_map *threads,
 					 int thread, int err)
 {
@@ -1692,7 +1693,7 @@ static bool evsel__ignore_missing_thread(struct evsel *evsel,
 	 * We should remove fd for missing_thread first
 	 * because thread_map__remove() will decrease threads->nr.
 	 */
-	if (update_fds(evsel, nr_cpus, cpu, threads->nr, thread))
+	if (update_fds(evsel, nr_cpus, cpu_map_idx, threads->nr, thread))
 		return false;
 
 	if (thread_map__remove(threads, thread))
@@ -1974,9 +1975,9 @@ bool evsel__increase_rlimit(enum rlimit_action *set_rlimit)
 
 static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads,
-		int start_cpu, int end_cpu)
+		int start_cpu_map_idx, int end_cpu_map_idx)
 {
-	int cpu, thread, nthreads;
+	int idx, thread, nthreads;
 	int pid = -1, err, old_errno;
 	enum rlimit_action set_rlimit = NO_CHANGE;
 
@@ -2003,7 +2004,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 	display_attr(&evsel->core.attr);
 
-	for (cpu = start_cpu; cpu < end_cpu; cpu++) {
+	for (idx = start_cpu_map_idx; idx < end_cpu_map_idx; idx++) {
 
 		for (thread = 0; thread < nthreads; thread++) {
 			int fd, group_fd;
@@ -2014,17 +2015,17 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 			if (!evsel->cgrp && !evsel->core.system_wide)
 				pid = perf_thread_map__pid(threads, thread);
 
-			group_fd = get_group_fd(evsel, cpu, thread);
+			group_fd = get_group_fd(evsel, idx, thread);
 
 			test_attr__ready();
 
 			pr_debug2_peo("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
-				pid, cpus->map[cpu], group_fd, evsel->open_flags);
+				pid, cpus->map[idx], group_fd, evsel->open_flags);
 
-			fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
+			fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[idx],
 						group_fd, evsel->open_flags);
 
-			FD(evsel, cpu, thread) = fd;
+			FD(evsel, idx, thread) = fd;
 
 			if (fd < 0) {
 				err = -errno;
@@ -2034,10 +2035,10 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 				goto try_fallback;
 			}
 
-			bpf_counter__install_pe(evsel, cpu, fd);
+			bpf_counter__install_pe(evsel, idx, fd);
 
 			if (unlikely(test_attr__enabled)) {
-				test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
+				test_attr__open(&evsel->core.attr, pid, cpus->map[idx],
 						fd, group_fd, evsel->open_flags);
 			}
 
@@ -2078,7 +2079,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (evsel__precise_ip_fallback(evsel))
 		goto retry_open;
 
-	if (evsel__ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
+	if (evsel__ignore_missing_thread(evsel, cpus->nr, idx, threads, thread, err)) {
 		/* We just removed 1 thread, so lower the upper nthreads limit. */
 		nthreads--;
 
@@ -2093,7 +2094,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (err == -EMFILE && evsel__increase_rlimit(&set_rlimit))
 		goto retry_open;
 
-	if (err != -EINVAL || cpu > 0 || thread > 0)
+	if (err != -EINVAL || idx > 0 || thread > 0)
 		goto out_close;
 
 	if (evsel__detect_missing_features(evsel))
@@ -2105,12 +2106,12 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	old_errno = errno;
 	do {
 		while (--thread >= 0) {
-			if (FD(evsel, cpu, thread) >= 0)
-				close(FD(evsel, cpu, thread));
-			FD(evsel, cpu, thread) = -1;
+			if (FD(evsel, idx, thread) >= 0)
+				close(FD(evsel, idx, thread));
+			FD(evsel, idx, thread) = -1;
 		}
 		thread = nthreads;
-	} while (--cpu >= 0);
+	} while (--idx >= 0);
 	errno = old_errno;
 	return err;
 }
@@ -2127,13 +2128,13 @@ void evsel__close(struct evsel *evsel)
 	perf_evsel__free_id(&evsel->core);
 }
 
-int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu)
+int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu_map_idx)
 {
-	if (cpu == -1)
+	if (cpu_map_idx == -1)
 		return evsel__open_cpu(evsel, cpus, NULL, 0,
 					cpus ? cpus->nr : 1);
 
-	return evsel__open_cpu(evsel, cpus, NULL, cpu, cpu + 1);
+	return evsel__open_cpu(evsel, cpus, NULL, cpu_map_idx, cpu_map_idx + 1);
 }
 
 int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads)
@@ -2954,15 +2955,15 @@ struct perf_env *evsel__env(struct evsel *evsel)
 
 static int store_evsel_ids(struct evsel *evsel, struct evlist *evlist)
 {
-	int cpu, thread;
+	int cpu_map_idx, thread;
 
-	for (cpu = 0; cpu < xyarray__max_x(evsel->core.fd); cpu++) {
+	for (cpu_map_idx = 0; cpu_map_idx < xyarray__max_x(evsel->core.fd); cpu_map_idx++) {
 		for (thread = 0; thread < xyarray__max_y(evsel->core.fd);
 		     thread++) {
-			int fd = FD(evsel, cpu, thread);
+			int fd = FD(evsel, cpu_map_idx, thread);
 
 			if (perf_evlist__id_add_fd(&evlist->core, &evsel->core,
-						   cpu, thread, fd) < 0)
+						   cpu_map_idx, thread, fd) < 0)
 				return -1;
 		}
 	}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 84e597f6c395..5720ceebffac 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -284,12 +284,12 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr);
 int evsel__set_filter(struct evsel *evsel, const char *filter);
 int evsel__append_tp_filter(struct evsel *evsel, const char *filter);
 int evsel__append_addr_filter(struct evsel *evsel, const char *filter);
-int evsel__enable_cpu(struct evsel *evsel, int cpu);
+int evsel__enable_cpu(struct evsel *evsel, int cpu_map_idx);
 int evsel__enable(struct evsel *evsel);
 int evsel__disable(struct evsel *evsel);
-int evsel__disable_cpu(struct evsel *evsel, int cpu);
+int evsel__disable_cpu(struct evsel *evsel, int cpu_map_idx);
 
-int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu);
+int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu_map_idx);
 int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads);
 int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 995cb5003133..f7f9757eba23 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -531,7 +531,7 @@ size_t perf_event__fprintf_stat_config(union perf_event *event, FILE *fp)
 int create_perf_stat_counter(struct evsel *evsel,
 			     struct perf_stat_config *config,
 			     struct target *target,
-			     int cpu)
+			     int cpu_map_idx)
 {
 	struct perf_event_attr *attr = &evsel->core.attr;
 	struct evsel *leader = evsel__leader(evsel);
@@ -585,7 +585,7 @@ int create_perf_stat_counter(struct evsel *evsel,
 	}
 
 	if (target__has_cpu(target) && !target__has_per_thread(target))
-		return evsel__open_per_cpu(evsel, evsel__cpus(evsel), cpu);
+		return evsel__open_per_cpu(evsel, evsel__cpus(evsel), cpu_map_idx);
 
 	return evsel__open_per_thread(evsel, evsel->core.threads);
 }
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 32cf24186229..5e25d53e891b 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -248,7 +248,7 @@ size_t perf_event__fprintf_stat_config(union perf_event *event, FILE *fp);
 int create_perf_stat_counter(struct evsel *evsel,
 			     struct perf_stat_config *config,
 			     struct target *target,
-			     int cpu);
+			     int cpu_map_idx);
 void evlist__print_counters(struct evlist *evlist, struct perf_stat_config *config,
 			    struct target *_target, struct timespec *ts, int argc, const char **argv);
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 40/48] perf test: Use perf_cpu_map__for_each_cpu
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (38 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 39/48] perf evsel: Rename variable cpu to index Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 41/48] perf stat: Correct check_per_pkg cpu Ian Rogers
                   ` (8 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Clean up variable naming to make cpu and index clearer.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/openat-syscall-all-cpus.c | 28 +++++++++++-----------
 1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/tools/perf/tests/openat-syscall-all-cpus.c b/tools/perf/tests/openat-syscall-all-cpus.c
index 544db0839b3b..ca0a50e92839 100644
--- a/tools/perf/tests/openat-syscall-all-cpus.c
+++ b/tools/perf/tests/openat-syscall-all-cpus.c
@@ -22,7 +22,7 @@
 static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __maybe_unused,
 						  int subtest __maybe_unused)
 {
-	int err = -1, fd, cpu;
+	int err = -1, fd, idx, cpu;
 	struct perf_cpu_map *cpus;
 	struct evsel *evsel;
 	unsigned int nr_openat_calls = 111, i;
@@ -58,23 +58,23 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb
 		goto out_evsel_delete;
 	}
 
-	for (cpu = 0; cpu < cpus->nr; ++cpu) {
-		unsigned int ncalls = nr_openat_calls + cpu;
+	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+		unsigned int ncalls = nr_openat_calls + idx;
 		/*
 		 * XXX eventually lift this restriction in a way that
 		 * keeps perf building on older glibc installations
 		 * without CPU_ALLOC. 1024 cpus in 2010 still seems
 		 * a reasonable upper limit tho :-)
 		 */
-		if (cpus->map[cpu] >= CPU_SETSIZE) {
-			pr_debug("Ignoring CPU %d\n", cpus->map[cpu]);
+		if (cpu >= CPU_SETSIZE) {
+			pr_debug("Ignoring CPU %d\n", cpu);
 			continue;
 		}
 
-		CPU_SET(cpus->map[cpu], &cpu_set);
+		CPU_SET(cpu, &cpu_set);
 		if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0) {
 			pr_debug("sched_setaffinity() failed on CPU %d: %s ",
-				 cpus->map[cpu],
+				 cpu,
 				 str_error_r(errno, sbuf, sizeof(sbuf)));
 			goto out_close_fd;
 		}
@@ -82,29 +82,29 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb
 			fd = openat(0, "/etc/passwd", O_RDONLY);
 			close(fd);
 		}
-		CPU_CLR(cpus->map[cpu], &cpu_set);
+		CPU_CLR(cpu, &cpu_set);
 	}
 
 	evsel->core.cpus = perf_cpu_map__get(cpus);
 
 	err = 0;
 
-	for (cpu = 0; cpu < cpus->nr; ++cpu) {
+	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
 		unsigned int expected;
 
-		if (cpus->map[cpu] >= CPU_SETSIZE)
+		if (cpu >= CPU_SETSIZE)
 			continue;
 
-		if (evsel__read_on_cpu(evsel, cpu, 0) < 0) {
+		if (evsel__read_on_cpu(evsel, idx, 0) < 0) {
 			pr_debug("evsel__read_on_cpu\n");
 			err = -1;
 			break;
 		}
 
-		expected = nr_openat_calls + cpu;
-		if (perf_counts(evsel->counts, cpu, 0)->val != expected) {
+		expected = nr_openat_calls + idx;
+		if (perf_counts(evsel->counts, idx, 0)->val != expected) {
 			pr_debug("evsel__read_on_cpu: expected to intercept %d calls on cpu %d, got %" PRIu64 "\n",
-				 expected, cpus->map[cpu], perf_counts(evsel->counts, cpu, 0)->val);
+				 expected, cpu, perf_counts(evsel->counts, idx, 0)->val);
 			err = -1;
 		}
 	}
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 41/48] perf stat: Correct check_per_pkg cpu
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (39 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 40/48] perf test: Use perf_cpu_map__for_each_cpu Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 42/48] perf stat: Swap variable name cpu to index Ian Rogers
                   ` (7 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Code was incorrectly using the cpu map index as the CPU.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index f7f9757eba23..86ab427e87fc 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -292,11 +292,12 @@ static bool pkg_id_equal(const void *__key1, const void *__key2,
 	return *key1 == *key2;
 }
 
-static int check_per_pkg(struct evsel *counter,
-			 struct perf_counts_values *vals, int cpu, bool *skip)
+static int check_per_pkg(struct evsel *counter, struct perf_counts_values *vals,
+			 int cpu_map_idx, bool *skip)
 {
 	struct hashmap *mask = counter->per_pkg_mask;
 	struct perf_cpu_map *cpus = evsel__cpus(counter);
+	int cpu = perf_cpu_map__cpu(cpus, cpu_map_idx);
 	int s, d, ret = 0;
 	uint64_t *key;
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 42/48] perf stat: Swap variable name cpu to index
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (40 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 41/48] perf stat: Correct check_per_pkg cpu Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 43/48] libperf: Sync evsel documentation Ian Rogers
                   ` (6 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
The use of CPU is error prone, switch to cpu_map_idx.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-shadow.c | 308 +++++++++++++++++-----------------
 tools/perf/util/stat.c        |  16 +-
 tools/perf/util/stat.h        |   4 +-
 3 files changed, 164 insertions(+), 164 deletions(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 5c7308efa768..10af7804e482 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -32,7 +32,7 @@ struct saved_value {
 	struct evsel *evsel;
 	enum stat_type type;
 	int ctx;
-	int cpu;
+	int cpu_map_idx;
 	struct cgroup *cgrp;
 	struct runtime_stat *stat;
 	struct stats stats;
@@ -47,8 +47,8 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
 					     rb_node);
 	const struct saved_value *b = entry;
 
-	if (a->cpu != b->cpu)
-		return a->cpu - b->cpu;
+	if (a->cpu_map_idx != b->cpu_map_idx)
+		return a->cpu_map_idx - b->cpu_map_idx;
 
 	/*
 	 * Previously the rbtree was used to link generic metrics.
@@ -105,7 +105,7 @@ static void saved_value_delete(struct rblist *rblist __maybe_unused,
 }
 
 static struct saved_value *saved_value_lookup(struct evsel *evsel,
-					      int cpu,
+					      int cpu_map_idx,
 					      bool create,
 					      enum stat_type type,
 					      int ctx,
@@ -115,7 +115,7 @@ static struct saved_value *saved_value_lookup(struct evsel *evsel,
 	struct rblist *rblist;
 	struct rb_node *nd;
 	struct saved_value dm = {
-		.cpu = cpu,
+		.cpu_map_idx = cpu_map_idx,
 		.evsel = evsel,
 		.type = type,
 		.ctx = ctx,
@@ -213,10 +213,10 @@ struct runtime_stat_data {
 
 static void update_runtime_stat(struct runtime_stat *st,
 				enum stat_type type,
-				int cpu, u64 count,
+				int cpu_map_idx, u64 count,
 				struct runtime_stat_data *rsd)
 {
-	struct saved_value *v = saved_value_lookup(NULL, cpu, true, type,
+	struct saved_value *v = saved_value_lookup(NULL, cpu_map_idx, true, type,
 						   rsd->ctx, st, rsd->cgrp);
 
 	if (v)
@@ -229,7 +229,7 @@ static void update_runtime_stat(struct runtime_stat *st,
  * instruction rates, etc:
  */
 void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
-				    int cpu, struct runtime_stat *st)
+				    int cpu_map_idx, struct runtime_stat *st)
 {
 	u64 count_ns = count;
 	struct saved_value *v;
@@ -241,88 +241,88 @@ void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
 	count *= counter->scale;
 
 	if (evsel__is_clock(counter))
-		update_runtime_stat(st, STAT_NSECS, cpu, count_ns, &rsd);
+		update_runtime_stat(st, STAT_NSECS, cpu_map_idx, count_ns, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
-		update_runtime_stat(st, STAT_CYCLES, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_CYCLES, cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
-		update_runtime_stat(st, STAT_CYCLES_IN_TX, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_CYCLES_IN_TX, cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TRANSACTION_START))
-		update_runtime_stat(st, STAT_TRANSACTION, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_TRANSACTION, cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, ELISION_START))
-		update_runtime_stat(st, STAT_ELISION, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_ELISION, cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_TOTAL_SLOTS))
 		update_runtime_stat(st, STAT_TOPDOWN_TOTAL_SLOTS,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_ISSUED))
 		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_ISSUED,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_RETIRED))
 		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_RETIRED,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_BUBBLES))
 		update_runtime_stat(st, STAT_TOPDOWN_FETCH_BUBBLES,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
 		update_runtime_stat(st, STAT_TOPDOWN_RECOVERY_BUBBLES,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_RETIRING))
 		update_runtime_stat(st, STAT_TOPDOWN_RETIRING,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_BAD_SPEC))
 		update_runtime_stat(st, STAT_TOPDOWN_BAD_SPEC,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_FE_BOUND))
 		update_runtime_stat(st, STAT_TOPDOWN_FE_BOUND,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_BE_BOUND))
 		update_runtime_stat(st, STAT_TOPDOWN_BE_BOUND,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_HEAVY_OPS))
 		update_runtime_stat(st, STAT_TOPDOWN_HEAVY_OPS,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_BR_MISPREDICT))
 		update_runtime_stat(st, STAT_TOPDOWN_BR_MISPREDICT,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_LAT))
 		update_runtime_stat(st, STAT_TOPDOWN_FETCH_LAT,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_MEM_BOUND))
 		update_runtime_stat(st, STAT_TOPDOWN_MEM_BOUND,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
 		update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
 		update_runtime_stat(st, STAT_STALLED_CYCLES_BACK,
-				    cpu, count, &rsd);
+				    cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
-		update_runtime_stat(st, STAT_BRANCHES, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_BRANCHES, cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
-		update_runtime_stat(st, STAT_CACHEREFS, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_CACHEREFS, cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
-		update_runtime_stat(st, STAT_L1_DCACHE, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_L1_DCACHE, cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
-		update_runtime_stat(st, STAT_L1_ICACHE, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_L1_ICACHE, cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_LL))
-		update_runtime_stat(st, STAT_LL_CACHE, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_LL_CACHE, cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
-		update_runtime_stat(st, STAT_DTLB_CACHE, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_DTLB_CACHE, cpu_map_idx, count, &rsd);
 	else if (evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
-		update_runtime_stat(st, STAT_ITLB_CACHE, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_ITLB_CACHE, cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, SMI_NUM))
-		update_runtime_stat(st, STAT_SMI_NUM, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_SMI_NUM, cpu_map_idx, count, &rsd);
 	else if (perf_stat_evsel__is(counter, APERF))
-		update_runtime_stat(st, STAT_APERF, cpu, count, &rsd);
+		update_runtime_stat(st, STAT_APERF, cpu_map_idx, count, &rsd);
 
 	if (counter->collect_stat) {
-		v = saved_value_lookup(counter, cpu, true, STAT_NONE, 0, st,
+		v = saved_value_lookup(counter, cpu_map_idx, true, STAT_NONE, 0, st,
 				       rsd.cgrp);
 		update_stats(&v->stats, count);
 		if (counter->metric_leader)
 			v->metric_total += count;
 	} else if (counter->metric_leader) {
 		v = saved_value_lookup(counter->metric_leader,
-				       cpu, true, STAT_NONE, 0, st, rsd.cgrp);
+				       cpu_map_idx, true, STAT_NONE, 0, st, rsd.cgrp);
 		v->metric_total += count;
 		v->metric_other++;
 	}
@@ -464,12 +464,12 @@ void perf_stat__collect_metric_expr(struct evlist *evsel_list)
 }
 
 static double runtime_stat_avg(struct runtime_stat *st,
-			       enum stat_type type, int cpu,
+			       enum stat_type type, int cpu_map_idx,
 			       struct runtime_stat_data *rsd)
 {
 	struct saved_value *v;
 
-	v = saved_value_lookup(NULL, cpu, false, type, rsd->ctx, st, rsd->cgrp);
+	v = saved_value_lookup(NULL, cpu_map_idx, false, type, rsd->ctx, st, rsd->cgrp);
 	if (!v)
 		return 0.0;
 
@@ -477,12 +477,12 @@ static double runtime_stat_avg(struct runtime_stat *st,
 }
 
 static double runtime_stat_n(struct runtime_stat *st,
-			     enum stat_type type, int cpu,
+			     enum stat_type type, int cpu_map_idx,
 			     struct runtime_stat_data *rsd)
 {
 	struct saved_value *v;
 
-	v = saved_value_lookup(NULL, cpu, false, type, rsd->ctx, st, rsd->cgrp);
+	v = saved_value_lookup(NULL, cpu_map_idx, false, type, rsd->ctx, st, rsd->cgrp);
 	if (!v)
 		return 0.0;
 
@@ -490,7 +490,7 @@ static double runtime_stat_n(struct runtime_stat *st,
 }
 
 static void print_stalled_cycles_frontend(struct perf_stat_config *config,
-					  int cpu, double avg,
+					  int cpu_map_idx, double avg,
 					  struct perf_stat_output_ctx *out,
 					  struct runtime_stat *st,
 					  struct runtime_stat_data *rsd)
@@ -498,7 +498,7 @@ static void print_stalled_cycles_frontend(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_CYCLES, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -513,7 +513,7 @@ static void print_stalled_cycles_frontend(struct perf_stat_config *config,
 }
 
 static void print_stalled_cycles_backend(struct perf_stat_config *config,
-					 int cpu, double avg,
+					 int cpu_map_idx, double avg,
 					 struct perf_stat_output_ctx *out,
 					 struct runtime_stat *st,
 					 struct runtime_stat_data *rsd)
@@ -521,7 +521,7 @@ static void print_stalled_cycles_backend(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_CYCLES, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -532,7 +532,7 @@ static void print_stalled_cycles_backend(struct perf_stat_config *config,
 }
 
 static void print_branch_misses(struct perf_stat_config *config,
-				int cpu, double avg,
+				int cpu_map_idx, double avg,
 				struct perf_stat_output_ctx *out,
 				struct runtime_stat *st,
 				struct runtime_stat_data *rsd)
@@ -540,7 +540,7 @@ static void print_branch_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_BRANCHES, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_BRANCHES, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -551,7 +551,7 @@ static void print_branch_misses(struct perf_stat_config *config,
 }
 
 static void print_l1_dcache_misses(struct perf_stat_config *config,
-				   int cpu, double avg,
+				   int cpu_map_idx, double avg,
 				   struct perf_stat_output_ctx *out,
 				   struct runtime_stat *st,
 				   struct runtime_stat_data *rsd)
@@ -559,7 +559,7 @@ static void print_l1_dcache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_L1_DCACHE, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_L1_DCACHE, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -570,7 +570,7 @@ static void print_l1_dcache_misses(struct perf_stat_config *config,
 }
 
 static void print_l1_icache_misses(struct perf_stat_config *config,
-				   int cpu, double avg,
+				   int cpu_map_idx, double avg,
 				   struct perf_stat_output_ctx *out,
 				   struct runtime_stat *st,
 				   struct runtime_stat_data *rsd)
@@ -578,7 +578,7 @@ static void print_l1_icache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_L1_ICACHE, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_L1_ICACHE, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -588,7 +588,7 @@ static void print_l1_icache_misses(struct perf_stat_config *config,
 }
 
 static void print_dtlb_cache_misses(struct perf_stat_config *config,
-				    int cpu, double avg,
+				    int cpu_map_idx, double avg,
 				    struct perf_stat_output_ctx *out,
 				    struct runtime_stat *st,
 				    struct runtime_stat_data *rsd)
@@ -596,7 +596,7 @@ static void print_dtlb_cache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_DTLB_CACHE, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_DTLB_CACHE, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -606,7 +606,7 @@ static void print_dtlb_cache_misses(struct perf_stat_config *config,
 }
 
 static void print_itlb_cache_misses(struct perf_stat_config *config,
-				    int cpu, double avg,
+				    int cpu_map_idx, double avg,
 				    struct perf_stat_output_ctx *out,
 				    struct runtime_stat *st,
 				    struct runtime_stat_data *rsd)
@@ -614,7 +614,7 @@ static void print_itlb_cache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_ITLB_CACHE, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_ITLB_CACHE, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -624,7 +624,7 @@ static void print_itlb_cache_misses(struct perf_stat_config *config,
 }
 
 static void print_ll_cache_misses(struct perf_stat_config *config,
-				  int cpu, double avg,
+				  int cpu_map_idx, double avg,
 				  struct perf_stat_output_ctx *out,
 				  struct runtime_stat *st,
 				  struct runtime_stat_data *rsd)
@@ -632,7 +632,7 @@ static void print_ll_cache_misses(struct perf_stat_config *config,
 	double total, ratio = 0.0;
 	const char *color;
 
-	total = runtime_stat_avg(st, STAT_LL_CACHE, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_LL_CACHE, cpu_map_idx, rsd);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -690,61 +690,61 @@ static double sanitize_val(double x)
 	return x;
 }
 
-static double td_total_slots(int cpu, struct runtime_stat *st,
+static double td_total_slots(int cpu_map_idx, struct runtime_stat *st,
 			     struct runtime_stat_data *rsd)
 {
-	return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, cpu, rsd);
+	return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, cpu_map_idx, rsd);
 }
 
-static double td_bad_spec(int cpu, struct runtime_stat *st,
+static double td_bad_spec(int cpu_map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
 	double bad_spec = 0;
 	double total_slots;
 	double total;
 
-	total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, cpu, rsd) -
-		runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, cpu, rsd) +
-		runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, cpu, rsd);
+	total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, cpu_map_idx, rsd) -
+		runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, cpu_map_idx, rsd) +
+		runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, cpu_map_idx, rsd);
 
-	total_slots = td_total_slots(cpu, st, rsd);
+	total_slots = td_total_slots(cpu_map_idx, st, rsd);
 	if (total_slots)
 		bad_spec = total / total_slots;
 	return sanitize_val(bad_spec);
 }
 
-static double td_retiring(int cpu, struct runtime_stat *st,
+static double td_retiring(int cpu_map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
 	double retiring = 0;
-	double total_slots = td_total_slots(cpu, st, rsd);
+	double total_slots = td_total_slots(cpu_map_idx, st, rsd);
 	double ret_slots = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED,
-					    cpu, rsd);
+					    cpu_map_idx, rsd);
 
 	if (total_slots)
 		retiring = ret_slots / total_slots;
 	return retiring;
 }
 
-static double td_fe_bound(int cpu, struct runtime_stat *st,
+static double td_fe_bound(int cpu_map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
 	double fe_bound = 0;
-	double total_slots = td_total_slots(cpu, st, rsd);
+	double total_slots = td_total_slots(cpu_map_idx, st, rsd);
 	double fetch_bub = runtime_stat_avg(st, STAT_TOPDOWN_FETCH_BUBBLES,
-					    cpu, rsd);
+					    cpu_map_idx, rsd);
 
 	if (total_slots)
 		fe_bound = fetch_bub / total_slots;
 	return fe_bound;
 }
 
-static double td_be_bound(int cpu, struct runtime_stat *st,
+static double td_be_bound(int cpu_map_idx, struct runtime_stat *st,
 			  struct runtime_stat_data *rsd)
 {
-	double sum = (td_fe_bound(cpu, st, rsd) +
-		      td_bad_spec(cpu, st, rsd) +
-		      td_retiring(cpu, st, rsd));
+	double sum = (td_fe_bound(cpu_map_idx, st, rsd) +
+		      td_bad_spec(cpu_map_idx, st, rsd) +
+		      td_retiring(cpu_map_idx, st, rsd));
 	if (sum == 0)
 		return 0;
 	return sanitize_val(1.0 - sum);
@@ -755,15 +755,15 @@ static double td_be_bound(int cpu, struct runtime_stat *st,
  * the ratios we need to recreate the sum.
  */
 
-static double td_metric_ratio(int cpu, enum stat_type type,
+static double td_metric_ratio(int cpu_map_idx, enum stat_type type,
 			      struct runtime_stat *stat,
 			      struct runtime_stat_data *rsd)
 {
-	double sum = runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu, rsd) +
-		runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu, rsd) +
-		runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu, rsd) +
-		runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu, rsd);
-	double d = runtime_stat_avg(stat, type, cpu, rsd);
+	double sum = runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu_map_idx, rsd) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu_map_idx, rsd) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu_map_idx, rsd) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu_map_idx, rsd);
+	double d = runtime_stat_avg(stat, type, cpu_map_idx, rsd);
 
 	if (sum)
 		return d / sum;
@@ -775,23 +775,23 @@ static double td_metric_ratio(int cpu, enum stat_type type,
  * We allow two missing.
  */
 
-static bool full_td(int cpu, struct runtime_stat *stat,
+static bool full_td(int cpu_map_idx, struct runtime_stat *stat,
 		    struct runtime_stat_data *rsd)
 {
 	int c = 0;
 
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, cpu_map_idx, rsd) > 0)
 		c++;
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, cpu_map_idx, rsd) > 0)
 		c++;
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, cpu_map_idx, rsd) > 0)
 		c++;
-	if (runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu, rsd) > 0)
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, cpu_map_idx, rsd) > 0)
 		c++;
 	return c >= 2;
 }
 
-static void print_smi_cost(struct perf_stat_config *config, int cpu,
+static void print_smi_cost(struct perf_stat_config *config, int cpu_map_idx,
 			   struct perf_stat_output_ctx *out,
 			   struct runtime_stat *st,
 			   struct runtime_stat_data *rsd)
@@ -799,9 +799,9 @@ static void print_smi_cost(struct perf_stat_config *config, int cpu,
 	double smi_num, aperf, cycles, cost = 0.0;
 	const char *color = NULL;
 
-	smi_num = runtime_stat_avg(st, STAT_SMI_NUM, cpu, rsd);
-	aperf = runtime_stat_avg(st, STAT_APERF, cpu, rsd);
-	cycles = runtime_stat_avg(st, STAT_CYCLES, cpu, rsd);
+	smi_num = runtime_stat_avg(st, STAT_SMI_NUM, cpu_map_idx, rsd);
+	aperf = runtime_stat_avg(st, STAT_APERF, cpu_map_idx, rsd);
+	cycles = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, rsd);
 
 	if ((cycles == 0) || (aperf == 0))
 		return;
@@ -818,7 +818,7 @@ static void print_smi_cost(struct perf_stat_config *config, int cpu,
 static int prepare_metric(struct evsel **metric_events,
 			  struct metric_ref *metric_refs,
 			  struct expr_parse_ctx *pctx,
-			  int cpu,
+			  int cpu_map_idx,
 			  struct runtime_stat *st)
 {
 	double scale;
@@ -836,7 +836,7 @@ static int prepare_metric(struct evsel **metric_events,
 			scale = 1e-9;
 			source_count = 1;
 		} else {
-			v = saved_value_lookup(metric_events[i], cpu, false,
+			v = saved_value_lookup(metric_events[i], cpu_map_idx, false,
 					       STAT_NONE, 0, st,
 					       metric_events[i]->cgrp);
 			if (!v)
@@ -874,7 +874,7 @@ static void generic_metric(struct perf_stat_config *config,
 			   const char *metric_name,
 			   const char *metric_unit,
 			   int runtime,
-			   int cpu,
+			   int cpu_map_idx,
 			   struct perf_stat_output_ctx *out,
 			   struct runtime_stat *st)
 {
@@ -889,7 +889,7 @@ static void generic_metric(struct perf_stat_config *config,
 		return;
 
 	pctx->runtime = runtime;
-	i = prepare_metric(metric_events, metric_refs, pctx, cpu, st);
+	i = prepare_metric(metric_events, metric_refs, pctx, cpu_map_idx, st);
 	if (i < 0) {
 		expr__ctx_free(pctx);
 		return;
@@ -934,7 +934,7 @@ static void generic_metric(struct perf_stat_config *config,
 	expr__ctx_free(pctx);
 }
 
-double test_generic_metric(struct metric_expr *mexp, int cpu, struct runtime_stat *st)
+double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct runtime_stat *st)
 {
 	struct expr_parse_ctx *pctx;
 	double ratio = 0.0;
@@ -943,7 +943,7 @@ double test_generic_metric(struct metric_expr *mexp, int cpu, struct runtime_sta
 	if (!pctx)
 		return NAN;
 
-	if (prepare_metric(mexp->metric_events, mexp->metric_refs, pctx, cpu, st) < 0)
+	if (prepare_metric(mexp->metric_events, mexp->metric_refs, pctx, cpu_map_idx, st) < 0)
 		goto out;
 
 	if (expr__parse(&ratio, pctx, mexp->metric_expr))
@@ -956,7 +956,7 @@ double test_generic_metric(struct metric_expr *mexp, int cpu, struct runtime_sta
 
 void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				   struct evsel *evsel,
-				   double avg, int cpu,
+				   double avg, int cpu_map_idx,
 				   struct perf_stat_output_ctx *out,
 				   struct rblist *metric_events,
 				   struct runtime_stat *st)
@@ -975,7 +975,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 	if (config->iostat_run) {
 		iostat_print_metric(config, evsel, out);
 	} else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
-		total = runtime_stat_avg(st, STAT_CYCLES, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
 
 		if (total) {
 			ratio = avg / total;
@@ -985,11 +985,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			print_metric(config, ctxp, NULL, NULL, "insn per cycle", 0);
 		}
 
-		total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT, cpu_map_idx, &rsd);
 
 		total = max(total, runtime_stat_avg(st,
 						    STAT_STALLED_CYCLES_BACK,
-						    cpu, &rsd));
+						    cpu_map_idx, &rsd));
 
 		if (total && avg) {
 			out->new_line(config, ctxp);
@@ -999,8 +999,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					ratio);
 		}
 	} else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
-		if (runtime_stat_n(st, STAT_BRANCHES, cpu, &rsd) != 0)
-			print_branch_misses(config, cpu, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_BRANCHES, cpu_map_idx, &rsd) != 0)
+			print_branch_misses(config, cpu_map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all branches", 0);
 	} else if (
@@ -1009,8 +1009,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_L1_DCACHE, cpu, &rsd) != 0)
-			print_l1_dcache_misses(config, cpu, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_L1_DCACHE, cpu_map_idx, &rsd) != 0)
+			print_l1_dcache_misses(config, cpu_map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all L1-dcache accesses", 0);
 	} else if (
@@ -1019,8 +1019,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_L1_ICACHE, cpu, &rsd) != 0)
-			print_l1_icache_misses(config, cpu, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_L1_ICACHE, cpu_map_idx, &rsd) != 0)
+			print_l1_icache_misses(config, cpu_map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all L1-icache accesses", 0);
 	} else if (
@@ -1029,8 +1029,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_DTLB_CACHE, cpu, &rsd) != 0)
-			print_dtlb_cache_misses(config, cpu, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_DTLB_CACHE, cpu_map_idx, &rsd) != 0)
+			print_dtlb_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all dTLB cache accesses", 0);
 	} else if (
@@ -1039,8 +1039,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_ITLB_CACHE, cpu, &rsd) != 0)
-			print_itlb_cache_misses(config, cpu, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_ITLB_CACHE, cpu_map_idx, &rsd) != 0)
+			print_itlb_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all iTLB cache accesses", 0);
 	} else if (
@@ -1049,27 +1049,27 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
 
-		if (runtime_stat_n(st, STAT_LL_CACHE, cpu, &rsd) != 0)
-			print_ll_cache_misses(config, cpu, avg, out, st, &rsd);
+		if (runtime_stat_n(st, STAT_LL_CACHE, cpu_map_idx, &rsd) != 0)
+			print_ll_cache_misses(config, cpu_map_idx, avg, out, st, &rsd);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all LL-cache accesses", 0);
 	} else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
-		total = runtime_stat_avg(st, STAT_CACHEREFS, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_CACHEREFS, cpu_map_idx, &rsd);
 
 		if (total)
 			ratio = avg * 100 / total;
 
-		if (runtime_stat_n(st, STAT_CACHEREFS, cpu, &rsd) != 0)
+		if (runtime_stat_n(st, STAT_CACHEREFS, cpu_map_idx, &rsd) != 0)
 			print_metric(config, ctxp, NULL, "%8.3f %%",
 				     "of all cache refs", ratio);
 		else
 			print_metric(config, ctxp, NULL, NULL, "of all cache refs", 0);
 	} else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
-		print_stalled_cycles_frontend(config, cpu, avg, out, st, &rsd);
+		print_stalled_cycles_frontend(config, cpu_map_idx, avg, out, st, &rsd);
 	} else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
-		print_stalled_cycles_backend(config, cpu, avg, out, st, &rsd);
+		print_stalled_cycles_backend(config, cpu_map_idx, avg, out, st, &rsd);
 	} else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
-		total = runtime_stat_avg(st, STAT_NSECS, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_NSECS, cpu_map_idx, &rsd);
 
 		if (total) {
 			ratio = avg / total;
@@ -1078,7 +1078,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			print_metric(config, ctxp, NULL, NULL, "Ghz", 0);
 		}
 	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX)) {
-		total = runtime_stat_avg(st, STAT_CYCLES, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
 
 		if (total)
 			print_metric(config, ctxp, NULL,
@@ -1088,8 +1088,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			print_metric(config, ctxp, NULL, NULL, "transactional cycles",
 				     0);
 	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX_CP)) {
-		total = runtime_stat_avg(st, STAT_CYCLES, cpu, &rsd);
-		total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES, cpu_map_idx, &rsd);
+		total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
 
 		if (total2 < avg)
 			total2 = avg;
@@ -1099,19 +1099,19 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		else
 			print_metric(config, ctxp, NULL, NULL, "aborted cycles", 0);
 	} else if (perf_stat_evsel__is(evsel, TRANSACTION_START)) {
-		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
 
 		if (avg)
 			ratio = total / avg;
 
-		if (runtime_stat_n(st, STAT_CYCLES_IN_TX, cpu, &rsd) != 0)
+		if (runtime_stat_n(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd) != 0)
 			print_metric(config, ctxp, NULL, "%8.0f",
 				     "cycles / transaction", ratio);
 		else
 			print_metric(config, ctxp, NULL, NULL, "cycles / transaction",
 				      0);
 	} else if (perf_stat_evsel__is(evsel, ELISION_START)) {
-		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX, cpu_map_idx, &rsd);
 
 		if (avg)
 			ratio = total / avg;
@@ -1124,28 +1124,28 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		else
 			print_metric(config, ctxp, NULL, NULL, "CPUs utilized", 0);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_BUBBLES)) {
-		double fe_bound = td_fe_bound(cpu, st, &rsd);
+		double fe_bound = td_fe_bound(cpu_map_idx, st, &rsd);
 
 		if (fe_bound > 0.2)
 			color = PERF_COLOR_RED;
 		print_metric(config, ctxp, color, "%8.1f%%", "frontend bound",
 				fe_bound * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_RETIRED)) {
-		double retiring = td_retiring(cpu, st, &rsd);
+		double retiring = td_retiring(cpu_map_idx, st, &rsd);
 
 		if (retiring > 0.7)
 			color = PERF_COLOR_GREEN;
 		print_metric(config, ctxp, color, "%8.1f%%", "retiring",
 				retiring * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RECOVERY_BUBBLES)) {
-		double bad_spec = td_bad_spec(cpu, st, &rsd);
+		double bad_spec = td_bad_spec(cpu_map_idx, st, &rsd);
 
 		if (bad_spec > 0.1)
 			color = PERF_COLOR_RED;
 		print_metric(config, ctxp, color, "%8.1f%%", "bad speculation",
 				bad_spec * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_ISSUED)) {
-		double be_bound = td_be_bound(cpu, st, &rsd);
+		double be_bound = td_be_bound(cpu_map_idx, st, &rsd);
 		const char *name = "backend bound";
 		static int have_recovery_bubbles = -1;
 
@@ -1158,14 +1158,14 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 
 		if (be_bound > 0.2)
 			color = PERF_COLOR_RED;
-		if (td_total_slots(cpu, st, &rsd) > 0)
+		if (td_total_slots(cpu_map_idx, st, &rsd) > 0)
 			print_metric(config, ctxp, color, "%8.1f%%", name,
 					be_bound * 100.);
 		else
 			print_metric(config, ctxp, NULL, NULL, name, 0);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RETIRING) &&
-		   full_td(cpu, st, &rsd)) {
-		double retiring = td_metric_ratio(cpu,
+		   full_td(cpu_map_idx, st, &rsd)) {
+		double retiring = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_RETIRING, st,
 						  &rsd);
 		if (retiring > 0.7)
@@ -1173,8 +1173,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "retiring",
 				retiring * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FE_BOUND) &&
-		   full_td(cpu, st, &rsd)) {
-		double fe_bound = td_metric_ratio(cpu,
+		   full_td(cpu_map_idx, st, &rsd)) {
+		double fe_bound = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_FE_BOUND, st,
 						  &rsd);
 		if (fe_bound > 0.2)
@@ -1182,8 +1182,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "frontend bound",
 				fe_bound * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BE_BOUND) &&
-		   full_td(cpu, st, &rsd)) {
-		double be_bound = td_metric_ratio(cpu,
+		   full_td(cpu_map_idx, st, &rsd)) {
+		double be_bound = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_BE_BOUND, st,
 						  &rsd);
 		if (be_bound > 0.2)
@@ -1191,8 +1191,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "backend bound",
 				be_bound * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BAD_SPEC) &&
-		   full_td(cpu, st, &rsd)) {
-		double bad_spec = td_metric_ratio(cpu,
+		   full_td(cpu_map_idx, st, &rsd)) {
+		double bad_spec = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_BAD_SPEC, st,
 						  &rsd);
 		if (bad_spec > 0.1)
@@ -1200,11 +1200,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "bad speculation",
 				bad_spec * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_HEAVY_OPS) &&
-			full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
-		double retiring = td_metric_ratio(cpu,
+			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double retiring = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_RETIRING, st,
 						  &rsd);
-		double heavy_ops = td_metric_ratio(cpu,
+		double heavy_ops = td_metric_ratio(cpu_map_idx,
 						   STAT_TOPDOWN_HEAVY_OPS, st,
 						   &rsd);
 		double light_ops = retiring - heavy_ops;
@@ -1220,11 +1220,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "light operations",
 				light_ops * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BR_MISPREDICT) &&
-			full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
-		double bad_spec = td_metric_ratio(cpu,
+			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double bad_spec = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_BAD_SPEC, st,
 						  &rsd);
-		double br_mis = td_metric_ratio(cpu,
+		double br_mis = td_metric_ratio(cpu_map_idx,
 						STAT_TOPDOWN_BR_MISPREDICT, st,
 						&rsd);
 		double m_clears = bad_spec - br_mis;
@@ -1240,11 +1240,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "machine clears",
 				m_clears * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_LAT) &&
-			full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
-		double fe_bound = td_metric_ratio(cpu,
+			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double fe_bound = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_FE_BOUND, st,
 						  &rsd);
-		double fetch_lat = td_metric_ratio(cpu,
+		double fetch_lat = td_metric_ratio(cpu_map_idx,
 						   STAT_TOPDOWN_FETCH_LAT, st,
 						   &rsd);
 		double fetch_bw = fe_bound - fetch_lat;
@@ -1260,11 +1260,11 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 		print_metric(config, ctxp, color, "%8.1f%%", "fetch bandwidth",
 				fetch_bw * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_MEM_BOUND) &&
-			full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
-		double be_bound = td_metric_ratio(cpu,
+			full_td(cpu_map_idx, st, &rsd) && (config->topdown_level > 1)) {
+		double be_bound = td_metric_ratio(cpu_map_idx,
 						  STAT_TOPDOWN_BE_BOUND, st,
 						  &rsd);
-		double mem_bound = td_metric_ratio(cpu,
+		double mem_bound = td_metric_ratio(cpu_map_idx,
 						   STAT_TOPDOWN_MEM_BOUND, st,
 						   &rsd);
 		double core_bound = be_bound - mem_bound;
@@ -1281,12 +1281,12 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				core_bound * 100.);
 	} else if (evsel->metric_expr) {
 		generic_metric(config, evsel->metric_expr, evsel->metric_events, NULL,
-				evsel->name, evsel->metric_name, NULL, 1, cpu, out, st);
-	} else if (runtime_stat_n(st, STAT_NSECS, cpu, &rsd) != 0) {
+				evsel->name, evsel->metric_name, NULL, 1, cpu_map_idx, out, st);
+	} else if (runtime_stat_n(st, STAT_NSECS, cpu_map_idx, &rsd) != 0) {
 		char unit = ' ';
 		char unit_buf[10] = "/sec";
 
-		total = runtime_stat_avg(st, STAT_NSECS, cpu, &rsd);
+		total = runtime_stat_avg(st, STAT_NSECS, cpu_map_idx, &rsd);
 		if (total)
 			ratio = convert_unit_double(1000000000.0 * avg / total, &unit);
 
@@ -1294,7 +1294,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
 		print_metric(config, ctxp, NULL, "%8.3f", unit_buf, ratio);
 	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
-		print_smi_cost(config, cpu, out, st, &rsd);
+		print_smi_cost(config, cpu_map_idx, out, st, &rsd);
 	} else {
 		num = 0;
 	}
@@ -1307,7 +1307,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				out->new_line(config, ctxp);
 			generic_metric(config, mexp->metric_expr, mexp->metric_events,
 					mexp->metric_refs, evsel->name, mexp->metric_name,
-					mexp->metric_unit, mexp->runtime, cpu, out, st);
+					mexp->metric_unit, mexp->runtime, cpu_map_idx, out, st);
 		}
 	}
 	if (num == 0)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 86ab427e87fc..7dbd7c4f3c33 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -356,14 +356,14 @@ static int check_per_pkg(struct evsel *counter, struct perf_counts_values *vals,
 
 static int
 process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
-		       int cpu, int thread,
+		       int cpu_map_idx, int thread,
 		       struct perf_counts_values *count)
 {
 	struct perf_counts_values *aggr = &evsel->counts->aggr;
 	static struct perf_counts_values zero;
 	bool skip = false;
 
-	if (check_per_pkg(evsel, count, cpu, &skip)) {
+	if (check_per_pkg(evsel, count, cpu_map_idx, &skip)) {
 		pr_err("failed to read per-pkg counter\n");
 		return -1;
 	}
@@ -379,11 +379,11 @@ process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
 	case AGGR_NODE:
 	case AGGR_NONE:
 		if (!evsel->snapshot)
-			evsel__compute_deltas(evsel, cpu, thread, count);
+			evsel__compute_deltas(evsel, cpu_map_idx, thread, count);
 		perf_counts_values__scale(count, config->scale, NULL);
 		if ((config->aggr_mode == AGGR_NONE) && (!evsel->percore)) {
 			perf_stat__update_shadow_stats(evsel, count->val,
-						       cpu, &rt_stat);
+						       cpu_map_idx, &rt_stat);
 		}
 
 		if (config->aggr_mode == AGGR_THREAD) {
@@ -412,15 +412,15 @@ static int process_counter_maps(struct perf_stat_config *config,
 {
 	int nthreads = perf_thread_map__nr(counter->core.threads);
 	int ncpus = evsel__nr_cpus(counter);
-	int cpu, thread;
+	int idx, thread;
 
 	if (counter->core.system_wide)
 		nthreads = 1;
 
 	for (thread = 0; thread < nthreads; thread++) {
-		for (cpu = 0; cpu < ncpus; cpu++) {
-			if (process_counter_values(config, counter, cpu, thread,
-						   perf_counts(counter->counts, cpu, thread)))
+		for (idx = 0; idx < ncpus; idx++) {
+			if (process_counter_values(config, counter, idx, thread,
+						   perf_counts(counter->counts, idx, thread)))
 				return -1;
 		}
 	}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 5e25d53e891b..691c12fd8976 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -208,7 +208,7 @@ void perf_stat__init_shadow_stats(void);
 void perf_stat__reset_shadow_stats(void);
 void perf_stat__reset_shadow_per_stat(struct runtime_stat *st);
 void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
-				    int cpu, struct runtime_stat *st);
+				    int cpu_map_idx, struct runtime_stat *st);
 struct perf_stat_output_ctx {
 	void *ctx;
 	print_metric_t print_metric;
@@ -253,5 +253,5 @@ void evlist__print_counters(struct evlist *evlist, struct perf_stat_config *conf
 			    struct target *_target, struct timespec *ts, int argc, const char **argv);
 
 struct metric_expr;
-double test_generic_metric(struct metric_expr *mexp, int cpu, struct runtime_stat *st);
+double test_generic_metric(struct metric_expr *mexp, int cpu_map_idx, struct runtime_stat *st);
 #endif
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 43/48] libperf: Sync evsel documentation
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (41 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 42/48] perf stat: Swap variable name cpu to index Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 44/48] perf bpf: Rename cpu to cpu_map_idx Ian Rogers
                   ` (5 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
cpu was renamed cpu_map_idx, for clarity.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/Documentation/libperf.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt
index faef9ba3a540..32c5051c24eb 100644
--- a/tools/lib/perf/Documentation/libperf.txt
+++ b/tools/lib/perf/Documentation/libperf.txt
@@ -136,16 +136,16 @@ SYNOPSIS
   int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
                        struct perf_thread_map *threads);
   void perf_evsel__close(struct perf_evsel *evsel);
-  void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
+  void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu_map_idx);
   int perf_evsel__mmap(struct perf_evsel *evsel, int pages);
   void perf_evsel__munmap(struct perf_evsel *evsel);
-  void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu, int thread);
-  int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
+  void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu_map_idx, int thread);
+  int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int thread,
                        struct perf_counts_values *count);
   int perf_evsel__enable(struct perf_evsel *evsel);
-  int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu);
+  int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
   int perf_evsel__disable(struct perf_evsel *evsel);
-  int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu);
+  int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
   struct perf_cpu_map *perf_evsel__cpus(struct perf_evsel *evsel);
   struct perf_thread_map *perf_evsel__threads(struct perf_evsel *evsel);
   struct perf_event_attr *perf_evsel__attr(struct perf_evsel *evsel);
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 44/48] perf bpf: Rename cpu to cpu_map_idx
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (42 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 43/48] libperf: Sync evsel documentation Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-10 19:10   ` Arnaldo Carvalho de Melo
  2022-01-05  6:13 ` [PATCH v4 45/48] perf c2c: Use more intention revealing iterator Ian Rogers
                   ` (4 subsequent siblings)
  48 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Synchronize the caller in evsel with the called function.
Shorten 3 lines of code in bperf_read by using
perf_cpu_map__for_each_cpu.
This code is frequently using variables named cpu as cpu map indices,
which doesn't matter as all CPUs are in the CPU map. It is strange in
some cases the cpumap is used at all.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/bpf_counter.c | 16 +++++++---------
 tools/perf/util/bpf_counter.h |  4 ++--
 2 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index c17d4a43ce06..ae5bd87ff020 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -265,7 +265,7 @@ static int bpf_program_profiler__read(struct evsel *evsel)
 	return 0;
 }
 
-static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
+static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu_map_idx,
 					    int fd)
 {
 	struct bpf_prog_profiler_bpf *skel;
@@ -277,7 +277,7 @@ static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
 		assert(skel != NULL);
 
 		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
-					  &cpu, &fd, BPF_ANY);
+					  &cpu_map_idx, &fd, BPF_ANY);
 		if (ret)
 			return ret;
 	}
@@ -566,12 +566,12 @@ static int bperf__load(struct evsel *evsel, struct target *target)
 	return err;
 }
 
-static int bperf__install_pe(struct evsel *evsel, int cpu, int fd)
+static int bperf__install_pe(struct evsel *evsel, int cpu_map_idx, int fd)
 {
 	struct bperf_leader_bpf *skel = evsel->leader_skel;
 
 	return bpf_map_update_elem(bpf_map__fd(skel->maps.events),
-				   &cpu, &fd, BPF_ANY);
+				   &cpu_map_idx, &fd, BPF_ANY);
 }
 
 /*
@@ -623,9 +623,7 @@ static int bperf__read(struct evsel *evsel)
 		case BPERF_FILTER_GLOBAL:
 			assert(i == 0);
 
-			num_cpu = all_cpu_map->nr;
-			for (j = 0; j < num_cpu; j++) {
-				cpu = all_cpu_map->map[j];
+			perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
 				perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
 				perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
 				perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
@@ -757,11 +755,11 @@ static inline bool bpf_counter_skip(struct evsel *evsel)
 		evsel->follower_skel == NULL;
 }
 
-int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
+int bpf_counter__install_pe(struct evsel *evsel, int cpu_map_idx, int fd)
 {
 	if (bpf_counter_skip(evsel))
 		return 0;
-	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
+	return evsel->bpf_counter_ops->install_pe(evsel, cpu_map_idx, fd);
 }
 
 int bpf_counter__load(struct evsel *evsel, struct target *target)
diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
index 65ebaa6694fb..4dbf26408b69 100644
--- a/tools/perf/util/bpf_counter.h
+++ b/tools/perf/util/bpf_counter.h
@@ -16,7 +16,7 @@ typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
 typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
 					   struct target *target);
 typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
-					       int cpu,
+					       int cpu_map_idx,
 					       int fd);
 
 struct bpf_counter_ops {
@@ -40,7 +40,7 @@ int bpf_counter__enable(struct evsel *evsel);
 int bpf_counter__disable(struct evsel *evsel);
 int bpf_counter__read(struct evsel *evsel);
 void bpf_counter__destroy(struct evsel *evsel);
-int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
+int bpf_counter__install_pe(struct evsel *evsel, int cpu_map_idx, int fd);
 
 #else /* HAVE_BPF_SKEL */
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 45/48] perf c2c: Use more intention revealing iterator
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (43 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 44/48] perf bpf: Rename cpu to cpu_map_idx Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 46/48] perf script: Fix flipped index and cpu Ian Rogers
                   ` (3 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
Use perf_cpu_map__for_each_cpu in setup_nodes.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-c2c.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index b5c67ef73862..ad1fbeafc93d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2015,7 +2015,7 @@ static int setup_nodes(struct perf_session *session)
 {
 	struct numa_node *n;
 	unsigned long **nodes;
-	int node, cpu;
+	int node, cpu, idx;
 	int *cpu2node;
 
 	if (c2c.node_info > 2)
@@ -2057,13 +2057,13 @@ static int setup_nodes(struct perf_session *session)
 		if (perf_cpu_map__empty(map))
 			continue;
 
-		for (cpu = 0; cpu < map->nr; cpu++) {
-			set_bit(map->map[cpu], set);
+		perf_cpu_map__for_each_cpu(cpu, idx, map) {
+			set_bit(cpu, set);
 
-			if (WARN_ONCE(cpu2node[map->map[cpu]] != -1, "node/cpu topology bug"))
+			if (WARN_ONCE(cpu2node[cpu] != -1, "node/cpu topology bug"))
 				return -EINVAL;
 
-			cpu2node[map->map[cpu]] = node;
+			cpu2node[cpu] = node;
 		}
 	}
 
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 46/48] perf script: Fix flipped index and cpu
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (44 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 45/48] perf c2c: Use more intention revealing iterator Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 47/48] perf stat: Correct first_shadow_cpu to return index Ian Rogers
                   ` (2 subsequent siblings)
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
perf_counts are accessed by the densely packed index.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-script.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index f40319144856..bb43529618b3 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2131,7 +2131,7 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 		perf_cpu_map__for_each_cpu(cpu, idx, evsel__cpus(counter)) {
 			struct perf_counts_values *counts;
 
-			counts = perf_counts(counter->counts, cpu, thread);
+			counts = perf_counts(counter->counts, idx, thread);
 
 			printf("%3d %8d %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %s\n",
 				cpu,
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 47/48] perf stat: Correct first_shadow_cpu to return index
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (45 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 46/48] perf script: Fix flipped index and cpu Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-05  6:13 ` [PATCH v4 48/48] perf cpumap: Give CPUs their own type Ian Rogers
  2022-01-10 20:17 ` [PATCH v4 00/48] Refactor perf cpumap Arnaldo Carvalho de Melo
  48 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
perf_stat__update_shadow_stats and perf_stat__print_shadow_stats use a
cpu map index rather than a CPU, but first_shadow_cpu is returning the
wrong value for this. Change first_shadow_cpu to
first_shadow_cpu_map_idx to make things agree.
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/stat-display.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 0f192360b6c6..ba95379efcfb 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -327,24 +327,23 @@ static void print_metric_header(struct perf_stat_config *config,
 		fprintf(os->fh, "%*s ", config->metric_only_len, unit);
 }
 
-static int first_shadow_cpu(struct perf_stat_config *config,
-			    struct evsel *evsel, const struct aggr_cpu_id *id)
+static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
+				struct evsel *evsel, const struct aggr_cpu_id *id)
 {
-	struct perf_cpu_map *cpus;
+	struct perf_cpu_map *cpus = evsel__cpus(evsel);
 	int cpu, idx;
 
 	if (config->aggr_mode == AGGR_NONE)
-		return id->cpu;
+		return perf_cpu_map__idx(cpus, id->cpu);
 
 	if (!config->aggr_get_id)
 		return 0;
 
-	cpus = evsel__cpus(evsel);
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
 		struct aggr_cpu_id cpu_id = config->aggr_get_id(config, cpu);
 
 		if (aggr_cpu_id__equal(&cpu_id, id))
-			return cpu;
+			return idx;
 	}
 	return 0;
 }
@@ -503,7 +502,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
 	}
 
 	perf_stat__print_shadow_stats(config, counter, uval,
-				first_shadow_cpu(config, counter, &id),
+				first_shadow_cpu_map_idx(config, counter, &id),
 				&out, &config->metric_events, st);
 	if (!config->csv_output && !config->metric_only) {
 		print_noise(config, counter, noise);
@@ -532,7 +531,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
 				val += perf_counts(counter->counts, idx, 0)->val;
 			}
 			perf_stat__update_shadow_stats(counter, val,
-					first_shadow_cpu(config, counter, &id),
+					first_shadow_cpu_map_idx(config, counter, &id),
 					&rt_stat);
 		}
 	}
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* [PATCH v4 48/48] perf cpumap: Give CPUs their own type.
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (46 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 47/48] perf stat: Correct first_shadow_cpu to return index Ian Rogers
@ 2022-01-05  6:13 ` Ian Rogers
  2022-01-09 18:30   ` Jiri Olsa
  2022-01-10 19:50   ` Arnaldo Carvalho de Melo
  2022-01-10 20:17 ` [PATCH v4 00/48] Refactor perf cpumap Arnaldo Carvalho de Melo
  48 siblings, 2 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-05  6:13 UTC (permalink / raw)
  To: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian, Ian Rogers
A common problem is confusing CPU map indices with the CPU, by wrapping
the CPU with a struct then this is avoided. This approach is similar to
atomic_t.
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/cpumap.c                       | 121 ++++++++++--------
 tools/lib/perf/evlist.c                       |   4 +-
 tools/lib/perf/evsel.c                        |   9 +-
 tools/lib/perf/include/internal/cpumap.h      |   9 +-
 tools/lib/perf/include/internal/evlist.h      |   3 +-
 tools/lib/perf/include/internal/evsel.h       |   4 +-
 tools/lib/perf/include/internal/mmap.h        |   5 +-
 tools/lib/perf/include/perf/cpumap.h          |   9 +-
 tools/lib/perf/mmap.c                         |   2 +-
 tools/perf/bench/epoll-ctl.c                  |   2 +-
 tools/perf/bench/epoll-wait.c                 |   2 +-
 tools/perf/bench/futex-hash.c                 |   2 +-
 tools/perf/bench/futex-lock-pi.c              |   2 +-
 tools/perf/bench/futex-requeue.c              |   2 +-
 tools/perf/bench/futex-wake-parallel.c        |   2 +-
 tools/perf/bench/futex-wake.c                 |   2 +-
 tools/perf/builtin-c2c.c                      |  13 +-
 tools/perf/builtin-ftrace.c                   |   2 +-
 tools/perf/builtin-kmem.c                     |   2 +-
 tools/perf/builtin-record.c                   |   2 +-
 tools/perf/builtin-sched.c                    |  65 +++++-----
 tools/perf/builtin-script.c                   |   5 +-
 tools/perf/builtin-stat.c                     |  80 ++++++------
 tools/perf/tests/attr.c                       |   6 +-
 tools/perf/tests/bitmap.c                     |   2 +-
 tools/perf/tests/cpumap.c                     |   6 +-
 tools/perf/tests/event_update.c               |   6 +-
 tools/perf/tests/mem2node.c                   |   2 +-
 tools/perf/tests/mmap-basic.c                 |   4 +-
 tools/perf/tests/openat-syscall-all-cpus.c    |  17 +--
 tools/perf/tests/stat.c                       |   3 +-
 tools/perf/tests/topology.c                   |  30 +++--
 tools/perf/util/affinity.c                    |   2 +-
 tools/perf/util/auxtrace.c                    |  12 +-
 tools/perf/util/auxtrace.h                    |   5 +-
 tools/perf/util/cpumap.c                      |  91 ++++++-------
 tools/perf/util/cpumap.h                      |  26 ++--
 tools/perf/util/cputopo.c                     |   6 +-
 tools/perf/util/env.c                         |  29 +++--
 tools/perf/util/env.h                         |   3 +-
 tools/perf/util/evlist.c                      |   8 +-
 tools/perf/util/evlist.h                      |   2 +-
 tools/perf/util/evsel.c                       |   6 +-
 tools/perf/util/expr.c                        |   2 +-
                       |   6 +-
 tools/perf/util/mmap.c                        |  19 +--
 tools/perf/util/mmap.h                        |   3 +-
 tools/perf/util/perf_api_probe.c              |  15 ++-
 tools/perf/util/python.c                      |   4 +-
 tools/perf/util/record.c                      |  11 +-
 .../scripting-engines/trace-event-python.c    |   6 +-
 tools/perf/util/session.c                     |  10 +-
 tools/perf/util/stat-display.c                |  34 +++--
 tools/perf/util/stat.c                        |   2 +-
 tools/perf/util/stat.h                        |   2 +-
 tools/perf/util/svghelper.c                   |   6 +-
 tools/perf/util/synthetic-events.c            |  12 +-
 tools/perf/util/synthetic-events.h            |   3 +-
 tools/perf/util/util.h                        |   5 +-
 59 files changed, 408 insertions(+), 347 deletions(-)
diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index eacea3ab965a..a3afe9ef74ef 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -10,16 +10,27 @@
 #include <ctype.h>
 #include <limits.h>
 
+static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
+{
+	struct perf_cpu_map *cpus;
+
+	cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
+	if (!cpus)
+		return NULL;
+
+	cpus->nr = 1;
+	refcount_set(&cpus->refcnt, 1);
+	return cpus;
+}
+
 struct perf_cpu_map *perf_cpu_map__dummy_new(void)
 {
-	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(int));
+	struct perf_cpu_map *cpus = perf_cpu_map__alloc(1);
 
-	if (cpus != NULL) {
-		cpus->nr = 1;
-		cpus->map[0] = -1;
-		refcount_set(&cpus->refcnt, 1);
-	}
+	if (!cpus)
+		return NULL;
 
+	cpus->map[0].cpu = -1;
 	return cpus;
 }
 
@@ -54,16 +65,12 @@ static struct perf_cpu_map *cpu_map__default_new(void)
 	if (nr_cpus < 0)
 		return NULL;
 
-	cpus = malloc(sizeof(*cpus) + nr_cpus * sizeof(int));
-	if (cpus != NULL) {
-		int i;
-
-		for (i = 0; i < nr_cpus; ++i)
-			cpus->map[i] = i;
+	cpus = perf_cpu_map__alloc(nr_cpus);
+	if (!cpus)
+		return NULL;
 
-		cpus->nr = nr_cpus;
-		refcount_set(&cpus->refcnt, 1);
-	}
+	for (int i = 0; i < nr_cpus; ++i)
+		cpus->map[i].cpu = i;
 
 	return cpus;
 }
@@ -73,31 +80,32 @@ struct perf_cpu_map *perf_cpu_map__default_new(void)
 	return cpu_map__default_new();
 }
 
-static int cmp_int(const void *a, const void *b)
+
+static int cmp_cpu(const void *a, const void *b)
 {
-	return *(const int *)a - *(const int*)b;
+	const struct perf_cpu *cpu_a = a, *cpu_b = b;
+
+	return cpu_a->cpu - cpu_b->cpu;
 }
 
-static struct perf_cpu_map *cpu_map__trim_new(int nr_cpus, int *tmp_cpus)
+static struct perf_cpu_map *cpu_map__trim_new(int nr_cpus, const struct perf_cpu *tmp_cpus)
 {
-	size_t payload_size = nr_cpus * sizeof(int);
-	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + payload_size);
+	size_t payload_size = nr_cpus * sizeof(struct perf_cpu);
+	struct perf_cpu_map *cpus = perf_cpu_map__alloc(nr_cpus);
 	int i, j;
 
 	if (cpus != NULL) {
 		memcpy(cpus->map, tmp_cpus, payload_size);
-		qsort(cpus->map, nr_cpus, sizeof(int), cmp_int);
+		qsort(cpus->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
 		/* Remove dups */
 		j = 0;
 		for (i = 0; i < nr_cpus; i++) {
-			if (i == 0 || cpus->map[i] != cpus->map[i - 1])
-				cpus->map[j++] = cpus->map[i];
+			if (i == 0 || cpus->map[i].cpu != cpus->map[i - 1].cpu)
+				cpus->map[j++].cpu = cpus->map[i].cpu;
 		}
 		cpus->nr = j;
 		assert(j <= nr_cpus);
-		refcount_set(&cpus->refcnt, 1);
 	}
-
 	return cpus;
 }
 
@@ -105,7 +113,7 @@ struct perf_cpu_map *perf_cpu_map__read(FILE *file)
 {
 	struct perf_cpu_map *cpus = NULL;
 	int nr_cpus = 0;
-	int *tmp_cpus = NULL, *tmp;
+	struct perf_cpu *tmp_cpus = NULL, *tmp;
 	int max_entries = 0;
 	int n, cpu, prev;
 	char sep;
@@ -124,24 +132,24 @@ struct perf_cpu_map *perf_cpu_map__read(FILE *file)
 
 			if (new_max >= max_entries) {
 				max_entries = new_max + MAX_NR_CPUS / 2;
-				tmp = realloc(tmp_cpus, max_entries * sizeof(int));
+				tmp = realloc(tmp_cpus, max_entries * sizeof(struct perf_cpu));
 				if (tmp == NULL)
 					goto out_free_tmp;
 				tmp_cpus = tmp;
 			}
 
 			while (++prev < cpu)
-				tmp_cpus[nr_cpus++] = prev;
+				tmp_cpus[nr_cpus++].cpu = prev;
 		}
 		if (nr_cpus == max_entries) {
 			max_entries += MAX_NR_CPUS;
-			tmp = realloc(tmp_cpus, max_entries * sizeof(int));
+			tmp = realloc(tmp_cpus, max_entries * sizeof(struct perf_cpu));
 			if (tmp == NULL)
 				goto out_free_tmp;
 			tmp_cpus = tmp;
 		}
 
-		tmp_cpus[nr_cpus++] = cpu;
+		tmp_cpus[nr_cpus++].cpu = cpu;
 		if (n == 2 && sep == '-')
 			prev = cpu;
 		else
@@ -179,7 +187,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
 	unsigned long start_cpu, end_cpu = 0;
 	char *p = NULL;
 	int i, nr_cpus = 0;
-	int *tmp_cpus = NULL, *tmp;
+	struct perf_cpu *tmp_cpus = NULL, *tmp;
 	int max_entries = 0;
 
 	if (!cpu_list)
@@ -220,17 +228,17 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
 		for (; start_cpu <= end_cpu; start_cpu++) {
 			/* check for duplicates */
 			for (i = 0; i < nr_cpus; i++)
-				if (tmp_cpus[i] == (int)start_cpu)
+				if (tmp_cpus[i].cpu == (int)start_cpu)
 					goto invalid;
 
 			if (nr_cpus == max_entries) {
 				max_entries += MAX_NR_CPUS;
-				tmp = realloc(tmp_cpus, max_entries * sizeof(int));
+				tmp = realloc(tmp_cpus, max_entries * sizeof(struct perf_cpu));
 				if (tmp == NULL)
 					goto invalid;
 				tmp_cpus = tmp;
 			}
-			tmp_cpus[nr_cpus++] = (int)start_cpu;
+			tmp_cpus[nr_cpus++].cpu = (int)start_cpu;
 		}
 		if (*p)
 			++p;
@@ -250,12 +258,16 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
 	return cpus;
 }
 
-int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
+struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
 {
+	struct perf_cpu result = {
+		.cpu = -1
+	};
+
 	if (cpus && idx < cpus->nr)
-		return cpus->map[idx];
+		result = cpus->map[idx];
 
-	return -1;
+	return result;
 }
 
 int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
@@ -265,10 +277,10 @@ int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
 
 bool perf_cpu_map__empty(const struct perf_cpu_map *map)
 {
-	return map ? map->map[0] == -1 : true;
+	return map ? map->map[0].cpu == -1 : true;
 }
 
-int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu)
+int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
 {
 	int low, high;
 
@@ -278,13 +290,13 @@ int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu)
 	low = 0;
 	high = cpus->nr;
 	while (low < high) {
-		int idx = (low + high) / 2,
-		    cpu_at_idx = cpus->map[idx];
+		int idx = (low + high) / 2;
+		struct perf_cpu cpu_at_idx = cpus->map[idx];
 
-		if (cpu_at_idx == cpu)
+		if (cpu_at_idx.cpu == cpu.cpu)
 			return idx;
 
-		if (cpu_at_idx > cpu)
+		if (cpu_at_idx.cpu > cpu.cpu)
 			high = idx;
 		else
 			low = idx + 1;
@@ -293,15 +305,22 @@ int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu)
 	return -1;
 }
 
-bool perf_cpu_map__has(const struct perf_cpu_map *cpus, int cpu)
+bool perf_cpu_map__has(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
 {
 	return perf_cpu_map__idx(cpus, cpu) != -1;
 }
 
-int perf_cpu_map__max(struct perf_cpu_map *map)
+struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
 {
+	struct perf_cpu result = {
+		.cpu = -1
+	};
+
 	// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
-	return map->nr > 0 ? map->map[map->nr - 1] : -1;
+	if (map->nr > 0)
+		result = map->map[map->nr - 1];
+
+	return result;
 }
 
 /*
@@ -315,7 +334,7 @@ int perf_cpu_map__max(struct perf_cpu_map *map)
 struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 					 struct perf_cpu_map *other)
 {
-	int *tmp_cpus;
+	struct perf_cpu *tmp_cpus;
 	int tmp_len;
 	int i, j, k;
 	struct perf_cpu_map *merged;
@@ -329,19 +348,19 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 	if (!other)
 		return orig;
 	if (orig->nr == other->nr &&
-	    !memcmp(orig->map, other->map, orig->nr * sizeof(int)))
+	    !memcmp(orig->map, other->map, orig->nr * sizeof(struct perf_cpu)))
 		return orig;
 
 	tmp_len = orig->nr + other->nr;
-	tmp_cpus = malloc(tmp_len * sizeof(int));
+	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
 	if (!tmp_cpus)
 		return NULL;
 
 	/* Standard merge algorithm from wikipedia */
 	i = j = k = 0;
 	while (i < orig->nr && j < other->nr) {
-		if (orig->map[i] <= other->map[j]) {
-			if (orig->map[i] == other->map[j])
+		if (orig->map[i].cpu <= other->map[j].cpu) {
+			if (orig->map[i].cpu == other->map[j].cpu)
 				j++;
 			tmp_cpus[k++] = orig->map[i++];
 		} else
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 245acbc53bd3..9a770bfdc804 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -407,7 +407,7 @@ perf_evlist__mmap_cb_get(struct perf_evlist *evlist, bool overwrite, int idx)
 
 static int
 perf_evlist__mmap_cb_mmap(struct perf_mmap *map, struct perf_mmap_param *mp,
-			  int output, int cpu)
+			  int output, struct perf_cpu cpu)
 {
 	return perf_mmap__mmap(map, mp, output, cpu);
 }
@@ -426,7 +426,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	       int idx, struct perf_mmap_param *mp, int cpu_idx,
 	       int thread, int *_output, int *_output_overwrite)
 {
-	int evlist_cpu = perf_cpu_map__cpu(evlist->cpus, cpu_idx);
+	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->cpus, cpu_idx);
 	struct perf_evsel *evsel;
 	int revent;
 
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index f1e1665ef4bd..7ea86a44eae5 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -78,10 +78,10 @@ static int perf_evsel__alloc_mmap(struct perf_evsel *evsel, int ncpus, int nthre
 
 static int
 sys_perf_event_open(struct perf_event_attr *attr,
-		    pid_t pid, int cpu, int group_fd,
+		    pid_t pid, struct perf_cpu cpu, int group_fd,
 		    unsigned long flags)
 {
-	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
+	return syscall(__NR_perf_event_open, attr, pid, cpu.cpu, group_fd, flags);
 }
 
 static int get_group_fd(struct perf_evsel *evsel, int cpu_map_idx, int thread, int *group_fd)
@@ -113,7 +113,8 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu_map_idx, int thread, i
 int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
 		     struct perf_thread_map *threads)
 {
-	int cpu, idx, thread, err = 0;
+	struct perf_cpu cpu;
+	int idx, thread, err = 0;
 
 	if (cpus == NULL) {
 		static struct perf_cpu_map *empty_cpu_map;
@@ -252,7 +253,7 @@ int perf_evsel__mmap(struct perf_evsel *evsel, int pages)
 		for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
 			int *fd = FD(evsel, idx, thread);
 			struct perf_mmap *map;
-			int cpu = perf_cpu_map__cpu(evsel->cpus, idx);
+			struct perf_cpu cpu = perf_cpu_map__cpu(evsel->cpus, idx);
 
 			if (fd == NULL || *fd < 0)
 				continue;
diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 71a31ed738c9..581f9ffb4237 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -4,6 +4,11 @@
 
 #include <linux/refcount.h>
 
+/** A wrapper around a CPU to avoid confusion with the perf_cpu_map's map's indices. */
+struct perf_cpu {
+	int cpu;
+};
+
 /**
  * A sized, reference counted, sorted array of integers representing CPU
  * numbers. This is commonly used to capture which CPUs a PMU is associated
@@ -16,13 +21,13 @@ struct perf_cpu_map {
 	/** Length of the map array. */
 	int		nr;
 	/** The CPU values. */
-	int		map[];
+	struct perf_cpu	map[];
 };
 
 #ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS	2048
 #endif
 
-int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu);
+int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
 
 #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 6f74269a3ad4..4cefade540bd 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -4,6 +4,7 @@
 
 #include <linux/list.h>
 #include <api/fd/array.h>
+#include <internal/cpumap.h>
 #include <internal/evsel.h>
 
 #define PERF_EVLIST__HLIST_BITS 8
@@ -36,7 +37,7 @@ typedef void
 typedef struct perf_mmap*
 (*perf_evlist_mmap__cb_get_t)(struct perf_evlist*, bool, int);
 typedef int
-(*perf_evlist_mmap__cb_mmap_t)(struct perf_mmap*, struct perf_mmap_param*, int, int);
+(*perf_evlist_mmap__cb_mmap_t)(struct perf_mmap*, struct perf_mmap_param*, int, struct perf_cpu);
 
 struct perf_evlist_mmap_ops {
 	perf_evlist_mmap__cb_idx_t	idx;
diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
index 1f3eacbad2e8..cfc9ebd7968e 100644
--- a/tools/lib/perf/include/internal/evsel.h
+++ b/tools/lib/perf/include/internal/evsel.h
@@ -6,8 +6,8 @@
 #include <linux/perf_event.h>
 #include <stdbool.h>
 #include <sys/types.h>
+#include <internal/cpumap.h>
 
-struct perf_cpu_map;
 struct perf_thread_map;
 struct xyarray;
 
@@ -27,7 +27,7 @@ struct perf_sample_id {
 	* queue number.
 	*/
 	int			 idx;
-	int			 cpu;
+	struct perf_cpu		 cpu;
 	pid_t			 tid;
 
 	/* Holds total ID period value for PERF_SAMPLE_READ processing. */
diff --git a/tools/lib/perf/include/internal/mmap.h b/tools/lib/perf/include/internal/mmap.h
index 5e3422f40ed5..5a062af8e9d8 100644
--- a/tools/lib/perf/include/internal/mmap.h
+++ b/tools/lib/perf/include/internal/mmap.h
@@ -6,6 +6,7 @@
 #include <linux/refcount.h>
 #include <linux/types.h>
 #include <stdbool.h>
+#include <internal/cpumap.h>
 
 /* perf sample has 16 bits size limit */
 #define PERF_SAMPLE_MAX_SIZE (1 << 16)
@@ -24,7 +25,7 @@ struct perf_mmap {
 	void			*base;
 	int			 mask;
 	int			 fd;
-	int			 cpu;
+	struct perf_cpu		 cpu;
 	refcount_t		 refcnt;
 	u64			 prev;
 	u64			 start;
@@ -46,7 +47,7 @@ size_t perf_mmap__mmap_len(struct perf_mmap *map);
 void perf_mmap__init(struct perf_mmap *map, struct perf_mmap *prev,
 		     bool overwrite, libperf_unmap_cb_t unmap_cb);
 int perf_mmap__mmap(struct perf_mmap *map, struct perf_mmap_param *mp,
-		    int fd, int cpu);
+		    int fd, struct perf_cpu cpu);
 void perf_mmap__munmap(struct perf_mmap *map);
 void perf_mmap__get(struct perf_mmap *map);
 void perf_mmap__put(struct perf_mmap *map);
diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
index 3f1c0afa3ccd..15b8faafd615 100644
--- a/tools/lib/perf/include/perf/cpumap.h
+++ b/tools/lib/perf/include/perf/cpumap.h
@@ -3,11 +3,10 @@
 #define __LIBPERF_CPUMAP_H
 
 #include <perf/core.h>
+#include <perf/cpumap.h>
 #include <stdio.h>
 #include <stdbool.h>
 
-struct perf_cpu_map;
-
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__dummy_new(void);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__default_new(void);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list);
@@ -16,11 +15,11 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 						     struct perf_cpu_map *other);
 LIBPERF_API void perf_cpu_map__put(struct perf_cpu_map *map);
-LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
+LIBPERF_API struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
 LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
 LIBPERF_API bool perf_cpu_map__empty(const struct perf_cpu_map *map);
-LIBPERF_API int perf_cpu_map__max(struct perf_cpu_map *map);
-LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
+LIBPERF_API struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map);
+LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, struct perf_cpu cpu);
 
 #define perf_cpu_map__for_each_cpu(cpu, idx, cpus)		\
 	for ((idx) = 0, (cpu) = perf_cpu_map__cpu(cpus, idx);	\
diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
index aaa457904008..f7ee07cb5818 100644
--- a/tools/lib/perf/mmap.c
+++ b/tools/lib/perf/mmap.c
@@ -32,7 +32,7 @@ size_t perf_mmap__mmap_len(struct perf_mmap *map)
 }
 
 int perf_mmap__mmap(struct perf_mmap *map, struct perf_mmap_param *mp,
-		    int fd, int cpu)
+		    int fd, struct perf_cpu cpu)
 {
 	map->prev = 0;
 	map->mask = mp->mask;
diff --git a/tools/perf/bench/epoll-ctl.c b/tools/perf/bench/epoll-ctl.c
index ddaca75c3bc0..1a17ec83d3c4 100644
--- a/tools/perf/bench/epoll-ctl.c
+++ b/tools/perf/bench/epoll-ctl.c
@@ -253,7 +253,7 @@ static int do_threads(struct worker *worker, struct perf_cpu_map *cpu)
 
 		if (!noaffinity) {
 			CPU_ZERO(&cpuset);
-			CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+			CPU_SET(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, &cpuset);
 
 			ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset);
 			if (ret)
diff --git a/tools/perf/bench/epoll-wait.c b/tools/perf/bench/epoll-wait.c
index 79d13dbc0a47..0d1dd8879197 100644
--- a/tools/perf/bench/epoll-wait.c
+++ b/tools/perf/bench/epoll-wait.c
@@ -342,7 +342,7 @@ static int do_threads(struct worker *worker, struct perf_cpu_map *cpu)
 
 		if (!noaffinity) {
 			CPU_ZERO(&cpuset);
-			CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+			CPU_SET(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, &cpuset);
 
 			ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset);
 			if (ret)
diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index fcdea3e44937..9627b6ab8670 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -177,7 +177,7 @@ int bench_futex_hash(int argc, const char **argv)
 			goto errmem;
 
 		CPU_ZERO(&cpuset);
-		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+		CPU_SET(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, &cpuset);
 
 		ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset);
 		if (ret)
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index 137890f78e17..a512a320df74 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -136,7 +136,7 @@ static void create_threads(struct worker *w, pthread_attr_t thread_attr,
 			worker[i].futex = &global_futex;
 
 		CPU_ZERO(&cpuset);
-		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+		CPU_SET(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, &cpuset);
 
 		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index f7a5ffebb940..aca47ce8b1e7 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -131,7 +131,7 @@ static void block_threads(pthread_t *w,
 	/* create and block all threads */
 	for (i = 0; i < params.nthreads; i++) {
 		CPU_ZERO(&cpuset);
-		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+		CPU_SET(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, &cpuset);
 
 		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index 0983f40b4b40..888ee6037945 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -152,7 +152,7 @@ static void block_threads(pthread_t *w, pthread_attr_t thread_attr,
 	/* create and block all threads */
 	for (i = 0; i < params.nthreads; i++) {
 		CPU_ZERO(&cpuset);
-		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+		CPU_SET(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, &cpuset);
 
 		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index 2226a475e782..aa82db51c0ab 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -105,7 +105,7 @@ static void block_threads(pthread_t *w,
 	/* create and block all threads */
 	for (i = 0; i < params.nthreads; i++) {
 		CPU_ZERO(&cpuset);
-		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+		CPU_SET(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, &cpuset);
 
 		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index ad1fbeafc93d..77dd4afacca4 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2015,7 +2015,8 @@ static int setup_nodes(struct perf_session *session)
 {
 	struct numa_node *n;
 	unsigned long **nodes;
-	int node, cpu, idx;
+	int node, idx;
+	struct perf_cpu cpu;
 	int *cpu2node;
 
 	if (c2c.node_info > 2)
@@ -2038,8 +2039,8 @@ static int setup_nodes(struct perf_session *session)
 	if (!cpu2node)
 		return -ENOMEM;
 
-	for (cpu = 0; cpu < c2c.cpus_cnt; cpu++)
-		cpu2node[cpu] = -1;
+	for (idx = 0; idx < c2c.cpus_cnt; idx++)
+		cpu2node[idx] = -1;
 
 	c2c.cpu2node = cpu2node;
 
@@ -2058,12 +2059,12 @@ static int setup_nodes(struct perf_session *session)
 			continue;
 
 		perf_cpu_map__for_each_cpu(cpu, idx, map) {
-			set_bit(cpu, set);
+			set_bit(cpu.cpu, set);
 
-			if (WARN_ONCE(cpu2node[cpu] != -1, "node/cpu topology bug"))
+			if (WARN_ONCE(cpu2node[cpu.cpu] != -1, "node/cpu topology bug"))
 				return -EINVAL;
 
-			cpu2node[cpu] = node;
+			cpu2node[cpu.cpu] = node;
 		}
 	}
 
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index f16c39a37a52..71452599f87d 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -281,7 +281,7 @@ static int set_tracing_cpumask(struct perf_cpu_map *cpumap)
 	int ret;
 	int last_cpu;
 
-	last_cpu = perf_cpu_map__cpu(cpumap, cpumap->nr - 1);
+	last_cpu = perf_cpu_map__cpu(cpumap, cpumap->nr - 1).cpu;
 	mask_size = last_cpu / 4 + 2; /* one more byte for EOS */
 	mask_size += last_cpu / 32; /* ',' is needed for every 32th cpus */
 
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index da03a341c63c..99d7ff9a8eff 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -192,7 +192,7 @@ static int evsel__process_alloc_node_event(struct evsel *evsel, struct perf_samp
 	int ret = evsel__process_alloc_event(evsel, sample);
 
 	if (!ret) {
-		int node1 = cpu__get_node(sample->cpu),
+		int node1 = cpu__get_node((struct perf_cpu){.cpu = sample->cpu}),
 		    node2 = evsel__intval(evsel, sample, "node");
 
 		if (node1 != node2)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6ac2160913ea..0a63295d30f0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2796,7 +2796,7 @@ int cmd_record(int argc, const char **argv)
 	symbol__init(NULL);
 
 	if (rec->opts.affinity != PERF_AFFINITY_SYS) {
-		rec->affinity_mask.nbits = cpu__max_cpu();
+		rec->affinity_mask.nbits = cpu__max_cpu().cpu;
 		rec->affinity_mask.bits = bitmap_zalloc(rec->affinity_mask.nbits);
 		if (!rec->affinity_mask.bits) {
 			pr_err("Failed to allocate thread mask for %zd cpus\n", rec->affinity_mask.nbits);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 9da1da4749c9..72d446de9c60 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -167,7 +167,7 @@ struct trace_sched_handler {
 
 struct perf_sched_map {
 	DECLARE_BITMAP(comp_cpus_mask, MAX_CPUS);
-	int			*comp_cpus;
+	struct perf_cpu		*comp_cpus;
 	bool			 comp;
 	struct perf_thread_map *color_pids;
 	const char		*color_pids_str;
@@ -191,7 +191,7 @@ struct perf_sched {
  * Track the current task - that way we can know whether there's any
  * weird events, such as a task being switched away that is not current.
  */
-	int		 max_cpu;
+	struct perf_cpu	 max_cpu;
 	u32		 curr_pid[MAX_CPUS];
 	struct thread	 *curr_thread[MAX_CPUS];
 	char		 next_shortname1;
@@ -1535,28 +1535,31 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
 	int new_shortname;
 	u64 timestamp0, timestamp = sample->time;
 	s64 delta;
-	int i, this_cpu = sample->cpu;
+	int i;
+	struct perf_cpu this_cpu = {
+		.cpu = sample->cpu,
+	};
 	int cpus_nr;
 	bool new_cpu = false;
 	const char *color = PERF_COLOR_NORMAL;
 	char stimestamp[32];
 
-	BUG_ON(this_cpu >= MAX_CPUS || this_cpu < 0);
+	BUG_ON(this_cpu.cpu >= MAX_CPUS || this_cpu.cpu < 0);
 
-	if (this_cpu > sched->max_cpu)
+	if (this_cpu.cpu > sched->max_cpu.cpu)
 		sched->max_cpu = this_cpu;
 
 	if (sched->map.comp) {
 		cpus_nr = bitmap_weight(sched->map.comp_cpus_mask, MAX_CPUS);
-		if (!test_and_set_bit(this_cpu, sched->map.comp_cpus_mask)) {
+		if (!test_and_set_bit(this_cpu.cpu, sched->map.comp_cpus_mask)) {
 			sched->map.comp_cpus[cpus_nr++] = this_cpu;
 			new_cpu = true;
 		}
 	} else
-		cpus_nr = sched->max_cpu;
+		cpus_nr = sched->max_cpu.cpu;
 
-	timestamp0 = sched->cpu_last_switched[this_cpu];
-	sched->cpu_last_switched[this_cpu] = timestamp;
+	timestamp0 = sched->cpu_last_switched[this_cpu.cpu];
+	sched->cpu_last_switched[this_cpu.cpu] = timestamp;
 	if (timestamp0)
 		delta = timestamp - timestamp0;
 	else
@@ -1577,7 +1580,7 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
 		return -1;
 	}
 
-	sched->curr_thread[this_cpu] = thread__get(sched_in);
+	sched->curr_thread[this_cpu.cpu] = thread__get(sched_in);
 
 	printf("  ");
 
@@ -1608,8 +1611,10 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
 	}
 
 	for (i = 0; i < cpus_nr; i++) {
-		int cpu = sched->map.comp ? sched->map.comp_cpus[i] : i;
-		struct thread *curr_thread = sched->curr_thread[cpu];
+		struct perf_cpu cpu = {
+			.cpu = sched->map.comp ? sched->map.comp_cpus[i].cpu : i,
+		};
+		struct thread *curr_thread = sched->curr_thread[cpu.cpu];
 		struct thread_runtime *curr_tr;
 		const char *pid_color = color;
 		const char *cpu_color = color;
@@ -1623,13 +1628,13 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
 		if (sched->map.color_cpus && perf_cpu_map__has(sched->map.color_cpus, cpu))
 			cpu_color = COLOR_CPUS;
 
-		if (cpu != this_cpu)
+		if (cpu.cpu != this_cpu.cpu)
 			color_fprintf(stdout, color, " ");
 		else
 			color_fprintf(stdout, cpu_color, "*");
 
-		if (sched->curr_thread[cpu]) {
-			curr_tr = thread__get_runtime(sched->curr_thread[cpu]);
+		if (sched->curr_thread[cpu.cpu]) {
+			curr_tr = thread__get_runtime(sched->curr_thread[cpu.cpu]);
 			if (curr_tr == NULL) {
 				thread__put(sched_in);
 				return -1;
@@ -1929,7 +1934,7 @@ static char *timehist_get_commstr(struct thread *thread)
 
 static void timehist_header(struct perf_sched *sched)
 {
-	u32 ncpus = sched->max_cpu + 1;
+	u32 ncpus = sched->max_cpu.cpu + 1;
 	u32 i, j;
 
 	printf("%15s %6s ", "time", "cpu");
@@ -2008,7 +2013,7 @@ static void timehist_print_sample(struct perf_sched *sched,
 	struct thread_runtime *tr = thread__priv(thread);
 	const char *next_comm = evsel__strval(evsel, sample, "next_comm");
 	const u32 next_pid = evsel__intval(evsel, sample, "next_pid");
-	u32 max_cpus = sched->max_cpu + 1;
+	u32 max_cpus = sched->max_cpu.cpu + 1;
 	char tstr[64];
 	char nstr[30];
 	u64 wait_time;
@@ -2389,7 +2394,7 @@ static void timehist_print_wakeup_event(struct perf_sched *sched,
 	timestamp__scnprintf_usec(sample->time, tstr, sizeof(tstr));
 	printf("%15s [%04d] ", tstr, sample->cpu);
 	if (sched->show_cpu_visual)
-		printf(" %*s ", sched->max_cpu + 1, "");
+		printf(" %*s ", sched->max_cpu.cpu + 1, "");
 
 	printf(" %-*s ", comm_width, timehist_get_commstr(thread));
 
@@ -2449,13 +2454,13 @@ static void timehist_print_migration_event(struct perf_sched *sched,
 {
 	struct thread *thread;
 	char tstr[64];
-	u32 max_cpus = sched->max_cpu + 1;
+	u32 max_cpus;
 	u32 ocpu, dcpu;
 
 	if (sched->summary_only)
 		return;
 
-	max_cpus = sched->max_cpu + 1;
+	max_cpus = sched->max_cpu.cpu + 1;
 	ocpu = evsel__intval(evsel, sample, "orig_cpu");
 	dcpu = evsel__intval(evsel, sample, "dest_cpu");
 
@@ -2918,7 +2923,7 @@ static void timehist_print_summary(struct perf_sched *sched,
 
 	printf("    Total scheduling time (msec): ");
 	print_sched_time(hist_time, 2);
-	printf(" (x %d)\n", sched->max_cpu);
+	printf(" (x %d)\n", sched->max_cpu.cpu);
 }
 
 typedef int (*sched_handler)(struct perf_tool *tool,
@@ -2935,9 +2940,11 @@ static int perf_timehist__process_sample(struct perf_tool *tool,
 {
 	struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
 	int err = 0;
-	int this_cpu = sample->cpu;
+	struct perf_cpu this_cpu = {
+		.cpu = sample->cpu,
+	};
 
-	if (this_cpu > sched->max_cpu)
+	if (this_cpu.cpu > sched->max_cpu.cpu)
 		sched->max_cpu = this_cpu;
 
 	if (evsel->handler != NULL) {
@@ -3054,10 +3061,10 @@ static int perf_sched__timehist(struct perf_sched *sched)
 		goto out;
 
 	/* pre-allocate struct for per-CPU idle stats */
-	sched->max_cpu = session->header.env.nr_cpus_online;
-	if (sched->max_cpu == 0)
-		sched->max_cpu = 4;
-	if (init_idle_threads(sched->max_cpu))
+	sched->max_cpu.cpu = session->header.env.nr_cpus_online;
+	if (sched->max_cpu.cpu == 0)
+		sched->max_cpu.cpu = 4;
+	if (init_idle_threads(sched->max_cpu.cpu))
 		goto out;
 
 	/* summary_only implies summary option, but don't overwrite summary if set */
@@ -3209,10 +3216,10 @@ static int setup_map_cpus(struct perf_sched *sched)
 {
 	struct perf_cpu_map *map;
 
-	sched->max_cpu  = sysconf(_SC_NPROCESSORS_CONF);
+	sched->max_cpu.cpu  = sysconf(_SC_NPROCESSORS_CONF);
 
 	if (sched->map.comp) {
-		sched->map.comp_cpus = zalloc(sched->max_cpu * sizeof(int));
+		sched->map.comp_cpus = zalloc(sched->max_cpu.cpu * sizeof(int));
 		if (!sched->map.comp_cpus)
 			return -1;
 	}
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index bb43529618b3..ecd4f99a6c14 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2115,7 +2115,8 @@ static struct scripting_ops	*scripting_ops;
 static void __process_stat(struct evsel *counter, u64 tstamp)
 {
 	int nthreads = perf_thread_map__nr(counter->core.threads);
-	int idx, cpu, thread;
+	int idx, thread;
+	struct perf_cpu cpu;
 	static int header_printed;
 
 	if (counter->core.system_wide)
@@ -2134,7 +2135,7 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 			counts = perf_counts(counter->counts, idx, thread);
 
 			printf("%3d %8d %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %15" PRIu64 " %s\n",
-				cpu,
+				cpu.cpu,
 				perf_thread_map__pid(counter->core.threads, thread),
 				counts->val,
 				counts->ena,
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 68543bc7459f..f81c7f595fd9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -234,7 +234,7 @@ static bool cpus_map_matched(struct evsel *a, struct evsel *b)
 		return false;
 
 	for (int i = 0; i < a->core.cpus->nr; i++) {
-		if (a->core.cpus->map[i] != b->core.cpus->map[i])
+		if (a->core.cpus->map[i].cpu != b->core.cpus->map[i].cpu)
 			return false;
 	}
 
@@ -331,7 +331,7 @@ static int evsel__write_stat_event(struct evsel *counter, int cpu_map_idx, u32 t
 				   struct perf_counts_values *count)
 {
 	struct perf_sample_id *sid = SID(counter, cpu_map_idx, thread);
-	int cpu = perf_cpu_map__cpu(evsel__cpus(counter), cpu_map_idx);
+	struct perf_cpu cpu = perf_cpu_map__cpu(evsel__cpus(counter), cpu_map_idx);
 
 	return perf_event__synthesize_stat(NULL, cpu, thread, sid->id, count,
 					   process_synthesized_event, NULL);
@@ -396,7 +396,8 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu_
 			fprintf(stat_config.output,
 				"%s: %d: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
 					evsel__name(counter),
-					perf_cpu_map__cpu(evsel__cpus(counter), cpu_map_idx),
+					perf_cpu_map__cpu(evsel__cpus(counter),
+							  cpu_map_idx).cpu,
 					count->val, count->ena, count->run);
 		}
 	}
@@ -1328,61 +1329,61 @@ static const char *const aggr_mode__string[] = {
 };
 
 static struct aggr_cpu_id perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
-						int cpu)
+						struct perf_cpu cpu)
 {
 	return aggr_cpu_id__socket(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_die(struct perf_stat_config *config __maybe_unused,
-					     int cpu)
+					     struct perf_cpu cpu)
 {
 	return aggr_cpu_id__die(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_core(struct perf_stat_config *config __maybe_unused,
-					      int cpu)
+					      struct perf_cpu cpu)
 {
 	return aggr_cpu_id__core(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_node(struct perf_stat_config *config __maybe_unused,
-					      int cpu)
+					      struct perf_cpu cpu)
 {
 	return aggr_cpu_id__node(cpu, /*data=*/NULL);
 }
 
 static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
-					      aggr_get_id_t get_id, int cpu)
+					      aggr_get_id_t get_id, struct perf_cpu cpu)
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (aggr_cpu_id__is_empty(&config->cpus_aggr_map->map[cpu]))
-		config->cpus_aggr_map->map[cpu] = get_id(config, cpu);
+	if (aggr_cpu_id__is_empty(&config->cpus_aggr_map->map[cpu.cpu]))
+		config->cpus_aggr_map->map[cpu.cpu] = get_id(config, cpu);
 
-	id = config->cpus_aggr_map->map[cpu];
+	id = config->cpus_aggr_map->map[cpu.cpu];
 	return id;
 }
 
 static struct aggr_cpu_id perf_stat__get_socket_cached(struct perf_stat_config *config,
-						       int cpu)
+						       struct perf_cpu cpu)
 {
 	return perf_stat__get_aggr(config, perf_stat__get_socket, cpu);
 }
 
 static struct aggr_cpu_id perf_stat__get_die_cached(struct perf_stat_config *config,
-						    int cpu)
+						    struct perf_cpu cpu)
 {
 	return perf_stat__get_aggr(config, perf_stat__get_die, cpu);
 }
 
 static struct aggr_cpu_id perf_stat__get_core_cached(struct perf_stat_config *config,
-						     int cpu)
+						     struct perf_cpu cpu)
 {
 	return perf_stat__get_aggr(config, perf_stat__get_core, cpu);
 }
 
 static struct aggr_cpu_id perf_stat__get_node_cached(struct perf_stat_config *config,
-						     int cpu)
+						     struct perf_cpu cpu)
 {
 	return perf_stat__get_aggr(config, perf_stat__get_node, cpu);
 }
@@ -1467,7 +1468,7 @@ static int perf_stat_init_aggr_mode(void)
 	 * taking the highest cpu number to be the size of
 	 * the aggregation translate cpumap.
 	 */
-	nr = perf_cpu_map__max(evsel_list->core.cpus);
+	nr = perf_cpu_map__max(evsel_list->core.cpus).cpu;
 	stat_config.cpus_aggr_map = cpu_aggr_map__empty_new(nr + 1);
 	return stat_config.cpus_aggr_map ? 0 : -ENOMEM;
 }
@@ -1495,71 +1496,66 @@ static void perf_stat__exit_aggr_mode(void)
 	stat_config.cpus_aggr_map = NULL;
 }
 
-static inline int perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
+static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
 {
 	struct perf_env *env = data;
-	int cpu;
+	struct perf_cpu cpu = perf_cpu_map__cpu(map, idx);
 
-	if (idx > map->nr)
-		return -1;
-
-	cpu = map->map[idx];
-
-	if (cpu >= env->nr_cpus_avail)
-		return -1;
+	if (cpu.cpu >= env->nr_cpus_avail)
+		cpu.cpu = -1;
 
 	return cpu;
 }
 
-static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
+static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(struct perf_cpu cpu, void *data)
 {
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu != -1)
-		id.socket = env->cpu[cpu].socket_id;
+	if (cpu.cpu != -1)
+		id.socket = env->cpu[cpu.cpu].socket_id;
 
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
+static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(struct perf_cpu cpu, void *data)
 {
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu != -1) {
+	if (cpu.cpu != -1) {
 		/*
 		 * die_id is relative to socket, so start
 		 * with the socket ID and then add die to
 		 * make a unique ID.
 		 */
-		id.socket = env->cpu[cpu].socket_id;
-		id.die = env->cpu[cpu].die_id;
+		id.socket = env->cpu[cpu.cpu].socket_id;
+		id.die = env->cpu[cpu.cpu].die_id;
 	}
 
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
+static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(struct perf_cpu cpu, void *data)
 {
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu != -1) {
+	if (cpu.cpu != -1) {
 		/*
 		 * core_id is relative to socket and die,
 		 * we need a global id. So we set
 		 * socket, die id and core id
 		 */
-		id.socket = env->cpu[cpu].socket_id;
-		id.die = env->cpu[cpu].die_id;
-		id.core = env->cpu[cpu].core_id;
+		id.socket = env->cpu[cpu.cpu].socket_id;
+		id.die = env->cpu[cpu.cpu].die_id;
+		id.core = env->cpu[cpu.cpu].core_id;
 	}
 
 	return id;
 }
 
-static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
+static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(struct perf_cpu cpu, void *data)
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
@@ -1568,24 +1564,24 @@ static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
 }
 
 static struct aggr_cpu_id perf_stat__get_socket_file(struct perf_stat_config *config __maybe_unused,
-						     int cpu)
+						     struct perf_cpu cpu)
 {
 	return perf_env__get_socket_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 static struct aggr_cpu_id perf_stat__get_die_file(struct perf_stat_config *config __maybe_unused,
-						  int cpu)
+						  struct perf_cpu cpu)
 {
 	return perf_env__get_die_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 
 static struct aggr_cpu_id perf_stat__get_core_file(struct perf_stat_config *config __maybe_unused,
-						   int cpu)
+						   struct perf_cpu cpu)
 {
 	return perf_env__get_core_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
 
 static struct aggr_cpu_id perf_stat__get_node_file(struct perf_stat_config *config __maybe_unused,
-						   int cpu)
+						   struct perf_cpu cpu)
 {
 	return perf_env__get_node_aggr_by_cpu(cpu, &perf_stat.session->header.env);
 }
diff --git a/tools/perf/tests/attr.c b/tools/perf/tests/attr.c
index 0f73e300f207..56fba08a3037 100644
--- a/tools/perf/tests/attr.c
+++ b/tools/perf/tests/attr.c
@@ -65,7 +65,7 @@ do {									\
 
 #define WRITE_ASS(field, fmt) __WRITE_ASS(field, fmt, attr->field)
 
-static int store_event(struct perf_event_attr *attr, pid_t pid, int cpu,
+static int store_event(struct perf_event_attr *attr, pid_t pid, struct perf_cpu cpu,
 		       int fd, int group_fd, unsigned long flags)
 {
 	FILE *file;
@@ -93,7 +93,7 @@ static int store_event(struct perf_event_attr *attr, pid_t pid, int cpu,
 	/* syscall arguments */
 	__WRITE_ASS(fd,       "d", fd);
 	__WRITE_ASS(group_fd, "d", group_fd);
-	__WRITE_ASS(cpu,      "d", cpu);
+	__WRITE_ASS(cpu,      "d", cpu.cpu);
 	__WRITE_ASS(pid,      "d", pid);
 	__WRITE_ASS(flags,   "lu", flags);
 
@@ -144,7 +144,7 @@ static int store_event(struct perf_event_attr *attr, pid_t pid, int cpu,
 	return 0;
 }
 
-void test_attr__open(struct perf_event_attr *attr, pid_t pid, int cpu,
+void test_attr__open(struct perf_event_attr *attr, pid_t pid, struct perf_cpu cpu,
 		     int fd, int group_fd, unsigned long flags)
 {
 	int errno_saved = errno;
diff --git a/tools/perf/tests/bitmap.c b/tools/perf/tests/bitmap.c
index 384856347236..0bf399c49849 100644
--- a/tools/perf/tests/bitmap.c
+++ b/tools/perf/tests/bitmap.c
@@ -18,7 +18,7 @@ static unsigned long *get_bitmap(const char *str, int nbits)
 
 	if (map && bm) {
 		for (i = 0; i < map->nr; i++)
-			set_bit(map->map[i], bm);
+			set_bit(map->map[i].cpu, bm);
 	}
 
 	if (map)
diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index 89a155092f85..84e87e31f119 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -38,7 +38,7 @@ static int process_event_mask(struct perf_tool *tool __maybe_unused,
 	TEST_ASSERT_VAL("wrong nr",  map->nr == 20);
 
 	for (i = 0; i < 20; i++) {
-		TEST_ASSERT_VAL("wrong cpu", map->map[i] == i);
+		TEST_ASSERT_VAL("wrong cpu", map->map[i].cpu == i);
 	}
 
 	perf_cpu_map__put(map);
@@ -67,8 +67,8 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
 
 	map = cpu_map__new_data(data);
 	TEST_ASSERT_VAL("wrong nr",  map->nr == 2);
-	TEST_ASSERT_VAL("wrong cpu", map->map[0] == 1);
-	TEST_ASSERT_VAL("wrong cpu", map->map[1] == 256);
+	TEST_ASSERT_VAL("wrong cpu", map->map[0].cpu == 1);
+	TEST_ASSERT_VAL("wrong cpu", map->map[1].cpu == 256);
 	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&map->refcnt) == 1);
 	perf_cpu_map__put(map);
 	return 0;
diff --git a/tools/perf/tests/event_update.c b/tools/perf/tests/event_update.c
index d01532d40acb..16b6d6f47f38 100644
--- a/tools/perf/tests/event_update.c
+++ b/tools/perf/tests/event_update.c
@@ -76,9 +76,9 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
 	TEST_ASSERT_VAL("wrong id", ev->id == 123);
 	TEST_ASSERT_VAL("wrong type", ev->type == PERF_EVENT_UPDATE__CPUS);
 	TEST_ASSERT_VAL("wrong cpus", map->nr == 3);
-	TEST_ASSERT_VAL("wrong cpus", map->map[0] == 1);
-	TEST_ASSERT_VAL("wrong cpus", map->map[1] == 2);
-	TEST_ASSERT_VAL("wrong cpus", map->map[2] == 3);
+	TEST_ASSERT_VAL("wrong cpus", map->map[0].cpu == 1);
+	TEST_ASSERT_VAL("wrong cpus", map->map[1].cpu == 2);
+	TEST_ASSERT_VAL("wrong cpus", map->map[2].cpu == 3);
 	perf_cpu_map__put(map);
 	return 0;
 }
diff --git a/tools/perf/tests/mem2node.c b/tools/perf/tests/mem2node.c
index b17b86391383..f4a4aba33f76 100644
--- a/tools/perf/tests/mem2node.c
+++ b/tools/perf/tests/mem2node.c
@@ -31,7 +31,7 @@ static unsigned long *get_bitmap(const char *str, int nbits)
 
 	if (map && bm) {
 		for (i = 0; i < map->nr; i++) {
-			set_bit(map->map[i], bm);
+			set_bit(map->map[i].cpu, bm);
 		}
 	}
 
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 90b2feda31ac..0ad62914b4d7 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -59,11 +59,11 @@ static int test__basic_mmap(struct test_suite *test __maybe_unused, int subtest
 	}
 
 	CPU_ZERO(&cpu_set);
-	CPU_SET(cpus->map[0], &cpu_set);
+	CPU_SET(cpus->map[0].cpu, &cpu_set);
 	sched_setaffinity(0, sizeof(cpu_set), &cpu_set);
 	if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0) {
 		pr_debug("sched_setaffinity() failed on CPU %d: %s ",
-			 cpus->map[0], str_error_r(errno, sbuf, sizeof(sbuf)));
+			 cpus->map[0].cpu, str_error_r(errno, sbuf, sizeof(sbuf)));
 		goto out_free_cpus;
 	}
 
diff --git a/tools/perf/tests/openat-syscall-all-cpus.c b/tools/perf/tests/openat-syscall-all-cpus.c
index ca0a50e92839..1ab362323d25 100644
--- a/tools/perf/tests/openat-syscall-all-cpus.c
+++ b/tools/perf/tests/openat-syscall-all-cpus.c
@@ -22,7 +22,8 @@
 static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __maybe_unused,
 						  int subtest __maybe_unused)
 {
-	int err = -1, fd, idx, cpu;
+	int err = -1, fd, idx;
+	struct perf_cpu cpu;
 	struct perf_cpu_map *cpus;
 	struct evsel *evsel;
 	unsigned int nr_openat_calls = 111, i;
@@ -66,15 +67,15 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb
 		 * without CPU_ALLOC. 1024 cpus in 2010 still seems
 		 * a reasonable upper limit tho :-)
 		 */
-		if (cpu >= CPU_SETSIZE) {
-			pr_debug("Ignoring CPU %d\n", cpu);
+		if (cpu.cpu >= CPU_SETSIZE) {
+			pr_debug("Ignoring CPU %d\n", cpu.cpu);
 			continue;
 		}
 
-		CPU_SET(cpu, &cpu_set);
+		CPU_SET(cpu.cpu, &cpu_set);
 		if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0) {
 			pr_debug("sched_setaffinity() failed on CPU %d: %s ",
-				 cpu,
+				 cpu.cpu,
 				 str_error_r(errno, sbuf, sizeof(sbuf)));
 			goto out_close_fd;
 		}
@@ -82,7 +83,7 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb
 			fd = openat(0, "/etc/passwd", O_RDONLY);
 			close(fd);
 		}
-		CPU_CLR(cpu, &cpu_set);
+		CPU_CLR(cpu.cpu, &cpu_set);
 	}
 
 	evsel->core.cpus = perf_cpu_map__get(cpus);
@@ -92,7 +93,7 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
 		unsigned int expected;
 
-		if (cpu >= CPU_SETSIZE)
+		if (cpu.cpu >= CPU_SETSIZE)
 			continue;
 
 		if (evsel__read_on_cpu(evsel, idx, 0) < 0) {
@@ -104,7 +105,7 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb
 		expected = nr_openat_calls + idx;
 		if (perf_counts(evsel->counts, idx, 0)->val != expected) {
 			pr_debug("evsel__read_on_cpu: expected to intercept %d calls on cpu %d, got %" PRIu64 "\n",
-				 expected, cpu, perf_counts(evsel->counts, idx, 0)->val);
+				 expected, cpu.cpu, perf_counts(evsel->counts, idx, 0)->val);
 			err = -1;
 		}
 	}
diff --git a/tools/perf/tests/stat.c b/tools/perf/tests/stat.c
index 2eb096b5e6da..500974040fe3 100644
--- a/tools/perf/tests/stat.c
+++ b/tools/perf/tests/stat.c
@@ -87,7 +87,8 @@ static int test__synthesize_stat(struct test_suite *test __maybe_unused, int sub
 	count.run = 300;
 
 	TEST_ASSERT_VAL("failed to synthesize stat_config",
-		!perf_event__synthesize_stat(NULL, 1, 2, 3, &count, process_stat_event, NULL));
+			!perf_event__synthesize_stat(NULL, (struct perf_cpu){.cpu = 1}, 2, 3,
+						     &count, process_stat_event, NULL));
 
 	return 0;
 }
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 33e4cb81265c..c4ef0c7002f1 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -112,7 +112,9 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	TEST_ASSERT_VAL("Session header CPU map not set", session->header.env.cpu);
 
 	for (i = 0; i < session->header.env.nr_cpus_avail; i++) {
-		if (!perf_cpu_map__has(map, i))
+		struct perf_cpu cpu = { .cpu = i };
+
+		if (!perf_cpu_map__has(map, cpu))
 			continue;
 		pr_debug("CPU %d, core %d, socket %d\n", i,
 			 session->header.env.cpu[i].core_id,
@@ -122,15 +124,15 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	// Test that CPU ID contains socket, die, core and CPU
 	for (i = 0; i < map->nr; i++) {
 		id = aggr_cpu_id__cpu(perf_cpu_map__cpu(map, i), NULL);
-		TEST_ASSERT_VAL("Cpu map - CPU ID doesn't match", map->map[i] == id.cpu);
+		TEST_ASSERT_VAL("Cpu map - CPU ID doesn't match", map->map[i].cpu == id.cpu.cpu);
 
 		TEST_ASSERT_VAL("Cpu map - Core ID doesn't match",
-			session->header.env.cpu[map->map[i]].core_id == id.core);
+			session->header.env.cpu[map->map[i].cpu].core_id == id.core);
 		TEST_ASSERT_VAL("Cpu map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i]].socket_id == id.socket);
+			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
 
 		TEST_ASSERT_VAL("Cpu map - Die ID doesn't match",
-			session->header.env.cpu[map->map[i]].die_id == id.die);
+			session->header.env.cpu[map->map[i].cpu].die_id == id.die);
 		TEST_ASSERT_VAL("Cpu map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Cpu map - Thread is set", id.thread == -1);
 	}
@@ -139,13 +141,13 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	for (i = 0; i < map->nr; i++) {
 		id = aggr_cpu_id__core(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Core map - Core ID doesn't match",
-			session->header.env.cpu[map->map[i]].core_id == id.core);
+			session->header.env.cpu[map->map[i].cpu].core_id == id.core);
 
 		TEST_ASSERT_VAL("Core map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i]].socket_id == id.socket);
+			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
 
 		TEST_ASSERT_VAL("Core map - Die ID doesn't match",
-			session->header.env.cpu[map->map[i]].die_id == id.die);
+			session->header.env.cpu[map->map[i].cpu].die_id == id.die);
 		TEST_ASSERT_VAL("Core map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Core map - Thread is set", id.thread == -1);
 	}
@@ -154,14 +156,14 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	for (i = 0; i < map->nr; i++) {
 		id = aggr_cpu_id__die(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Die map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i]].socket_id == id.socket);
+			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
 
 		TEST_ASSERT_VAL("Die map - Die ID doesn't match",
-			session->header.env.cpu[map->map[i]].die_id == id.die);
+			session->header.env.cpu[map->map[i].cpu].die_id == id.die);
 
 		TEST_ASSERT_VAL("Die map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Die map - Core is set", id.core == -1);
-		TEST_ASSERT_VAL("Die map - CPU is set", id.cpu == -1);
+		TEST_ASSERT_VAL("Die map - CPU is set", id.cpu.cpu == -1);
 		TEST_ASSERT_VAL("Die map - Thread is set", id.thread == -1);
 	}
 
@@ -169,12 +171,12 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	for (i = 0; i < map->nr; i++) {
 		id = aggr_cpu_id__socket(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Socket map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i]].socket_id == id.socket);
+			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
 
 		TEST_ASSERT_VAL("Socket map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Socket map - Die ID is set", id.die == -1);
 		TEST_ASSERT_VAL("Socket map - Core is set", id.core == -1);
-		TEST_ASSERT_VAL("Socket map - CPU is set", id.cpu == -1);
+		TEST_ASSERT_VAL("Socket map - CPU is set", id.cpu.cpu == -1);
 		TEST_ASSERT_VAL("Socket map - Thread is set", id.thread == -1);
 	}
 
@@ -186,7 +188,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 		TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1);
 		TEST_ASSERT_VAL("Node map - Die ID is set", id.die == -1);
 		TEST_ASSERT_VAL("Node map - Core is set", id.core == -1);
-		TEST_ASSERT_VAL("Node map - CPU is set", id.cpu == -1);
+		TEST_ASSERT_VAL("Node map - CPU is set", id.cpu.cpu == -1);
 		TEST_ASSERT_VAL("Node map - Thread is set", id.thread == -1);
 	}
 	perf_session__delete(session);
diff --git a/tools/perf/util/affinity.c b/tools/perf/util/affinity.c
index 7b12bd7a3080..f1e30d566db3 100644
--- a/tools/perf/util/affinity.c
+++ b/tools/perf/util/affinity.c
@@ -11,7 +11,7 @@
 
 static int get_cpu_set_size(void)
 {
-	int sz = cpu__max_cpu() + 8 - 1;
+	int sz = cpu__max_cpu().cpu + 8 - 1;
 	/*
 	 * sched_getaffinity doesn't like masks smaller than the kernel.
 	 * Hopefully that's big enough.
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index c679394b898d..5632efc44738 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -123,7 +123,7 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
 	mm->prev = 0;
 	mm->idx = mp->idx;
 	mm->tid = mp->tid;
-	mm->cpu = mp->cpu;
+	mm->cpu = mp->cpu.cpu;
 
 	if (!mp->len) {
 		mm->base = NULL;
@@ -180,7 +180,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
 		else
 			mp->tid = -1;
 	} else {
-		mp->cpu = -1;
+		mp->cpu.cpu = -1;
 		mp->tid = perf_thread_map__pid(evlist->core.threads, idx);
 	}
 }
@@ -292,7 +292,7 @@ static int auxtrace_queues__queue_buffer(struct auxtrace_queues *queues,
 	if (!queue->set) {
 		queue->set = true;
 		queue->tid = buffer->tid;
-		queue->cpu = buffer->cpu;
+		queue->cpu = buffer->cpu.cpu;
 	}
 
 	buffer->buffer_nr = queues->next_buffer_nr++;
@@ -339,11 +339,11 @@ static int auxtrace_queues__split_buffer(struct auxtrace_queues *queues,
 	return 0;
 }
 
-static bool filter_cpu(struct perf_session *session, int cpu)
+static bool filter_cpu(struct perf_session *session, struct perf_cpu cpu)
 {
 	unsigned long *cpu_bitmap = session->itrace_synth_opts->cpu_bitmap;
 
-	return cpu_bitmap && cpu != -1 && !test_bit(cpu, cpu_bitmap);
+	return cpu_bitmap && cpu.cpu != -1 && !test_bit(cpu.cpu, cpu_bitmap);
 }
 
 static int auxtrace_queues__add_buffer(struct auxtrace_queues *queues,
@@ -399,7 +399,7 @@ int auxtrace_queues__add_event(struct auxtrace_queues *queues,
 	struct auxtrace_buffer buffer = {
 		.pid = -1,
 		.tid = event->auxtrace.tid,
-		.cpu = event->auxtrace.cpu,
+		.cpu = { event->auxtrace.cpu },
 		.data_offset = data_offset,
 		.offset = event->auxtrace.offset,
 		.reference = event->auxtrace.reference,
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index bbf0d78c6401..19910b9011f3 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -15,6 +15,7 @@
 #include <linux/list.h>
 #include <linux/perf_event.h>
 #include <linux/types.h>
+#include <internal/cpumap.h>
 #include <asm/bitsperlong.h>
 #include <asm/barrier.h>
 
@@ -240,7 +241,7 @@ struct auxtrace_buffer {
 	size_t			size;
 	pid_t			pid;
 	pid_t			tid;
-	int			cpu;
+	struct perf_cpu		cpu;
 	void			*data;
 	off_t			data_offset;
 	void			*mmap_addr;
@@ -350,7 +351,7 @@ struct auxtrace_mmap_params {
 	int		prot;
 	int		idx;
 	pid_t		tid;
-	int		cpu;
+	struct perf_cpu	cpu;
 };
 
 /**
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 48ce583af0ec..12b2243222b0 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -13,8 +13,8 @@
 #include <linux/ctype.h>
 #include <linux/zalloc.h>
 
-static int max_cpu_num;
-static int max_present_cpu_num;
+static struct perf_cpu max_cpu_num;
+static struct perf_cpu max_present_cpu_num;
 static int max_node_num;
 /**
  * The numa node X as read from /sys/devices/system/node/nodeX indexed by the
@@ -37,9 +37,9 @@ static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
 			 * otherwise it would become 65535.
 			 */
 			if (cpus->cpu[i] == (u16) -1)
-				map->map[i] = -1;
+				map->map[i].cpu = -1;
 			else
-				map->map[i] = (int) cpus->cpu[i];
+				map->map[i].cpu = (int) cpus->cpu[i];
 		}
 	}
 
@@ -58,7 +58,7 @@ static struct perf_cpu_map *cpu_map__from_mask(struct perf_record_record_cpu_map
 		int cpu, i = 0;
 
 		for_each_set_bit(cpu, mask->mask, nbits)
-			map->map[i++] = cpu;
+			map->map[i++].cpu = cpu;
 	}
 	return map;
 
@@ -91,7 +91,7 @@ struct perf_cpu_map *perf_cpu_map__empty_new(int nr)
 
 		cpus->nr = nr;
 		for (i = 0; i < nr; i++)
-			cpus->map[i] = -1;
+			cpus->map[i].cpu = -1;
 
 		refcount_set(&cpus->refcnt, 1);
 	}
@@ -126,13 +126,13 @@ static int cpu__get_topology_int(int cpu, const char *name, int *value)
 	return sysfs__read_int(path, value);
 }
 
-int cpu__get_socket_id(int cpu)
+int cpu__get_socket_id(struct perf_cpu cpu)
 {
-	int value, ret = cpu__get_topology_int(cpu, "physical_package_id", &value);
+	int value, ret = cpu__get_topology_int(cpu.cpu, "physical_package_id", &value);
 	return ret ?: value;
 }
 
-struct aggr_cpu_id aggr_cpu_id__socket(int cpu, void *data __maybe_unused)
+struct aggr_cpu_id aggr_cpu_id__socket(struct perf_cpu cpu, void *data __maybe_unused)
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
@@ -161,7 +161,8 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 				       aggr_cpu_id_get_t get_id,
 				       void *data)
 {
-	int cpu, idx;
+	int idx;
+	struct perf_cpu cpu;
 	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);
 
 	if (!c)
@@ -201,14 +202,14 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 
 }
 
-int cpu__get_die_id(int cpu)
+int cpu__get_die_id(struct perf_cpu cpu)
 {
-	int value, ret = cpu__get_topology_int(cpu, "die_id", &value);
+	int value, ret = cpu__get_topology_int(cpu.cpu, "die_id", &value);
 
 	return ret ?: value;
 }
 
-struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data)
+struct aggr_cpu_id aggr_cpu_id__die(struct perf_cpu cpu, void *data)
 {
 	struct aggr_cpu_id id;
 	int die;
@@ -231,13 +232,13 @@ struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data)
 	return id;
 }
 
-int cpu__get_core_id(int cpu)
+int cpu__get_core_id(struct perf_cpu cpu)
 {
-	int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
+	int value, ret = cpu__get_topology_int(cpu.cpu, "core_id", &value);
 	return ret ?: value;
 }
 
-struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data)
+struct aggr_cpu_id aggr_cpu_id__core(struct perf_cpu cpu, void *data)
 {
 	struct aggr_cpu_id id;
 	int core = cpu__get_core_id(cpu);
@@ -256,7 +257,7 @@ struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data)
 
 }
 
-struct aggr_cpu_id aggr_cpu_id__cpu(int cpu, void *data)
+struct aggr_cpu_id aggr_cpu_id__cpu(struct perf_cpu cpu, void *data)
 {
 	struct aggr_cpu_id id;
 
@@ -270,7 +271,7 @@ struct aggr_cpu_id aggr_cpu_id__cpu(int cpu, void *data)
 
 }
 
-struct aggr_cpu_id aggr_cpu_id__node(int cpu, void *data __maybe_unused)
+struct aggr_cpu_id aggr_cpu_id__node(struct perf_cpu cpu, void *data __maybe_unused)
 {
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
@@ -318,8 +319,8 @@ static void set_max_cpu_num(void)
 	int ret = -1;
 
 	/* set up default */
-	max_cpu_num = 4096;
-	max_present_cpu_num = 4096;
+	max_cpu_num.cpu = 4096;
+	max_present_cpu_num.cpu = 4096;
 
 	mnt = sysfs__mountpoint();
 	if (!mnt)
@@ -332,7 +333,7 @@ static void set_max_cpu_num(void)
 		goto out;
 	}
 
-	ret = get_max_num(path, &max_cpu_num);
+	ret = get_max_num(path, &max_cpu_num.cpu);
 	if (ret)
 		goto out;
 
@@ -343,11 +344,11 @@ static void set_max_cpu_num(void)
 		goto out;
 	}
 
-	ret = get_max_num(path, &max_present_cpu_num);
+	ret = get_max_num(path, &max_present_cpu_num.cpu);
 
 out:
 	if (ret)
-		pr_err("Failed to read max cpus, using default of %d\n", max_cpu_num);
+		pr_err("Failed to read max cpus, using default of %d\n", max_cpu_num.cpu);
 }
 
 /* Determine highest possible node in the system for sparse allocation */
@@ -386,31 +387,31 @@ int cpu__max_node(void)
 	return max_node_num;
 }
 
-int cpu__max_cpu(void)
+struct perf_cpu cpu__max_cpu(void)
 {
-	if (unlikely(!max_cpu_num))
+	if (unlikely(!max_cpu_num.cpu))
 		set_max_cpu_num();
 
 	return max_cpu_num;
 }
 
-int cpu__max_present_cpu(void)
+struct perf_cpu cpu__max_present_cpu(void)
 {
-	if (unlikely(!max_present_cpu_num))
+	if (unlikely(!max_present_cpu_num.cpu))
 		set_max_cpu_num();
 
 	return max_present_cpu_num;
 }
 
 
-int cpu__get_node(int cpu)
+int cpu__get_node(struct perf_cpu cpu)
 {
 	if (unlikely(cpunode_map == NULL)) {
 		pr_debug("cpu_map not initialized\n");
 		return -1;
 	}
 
-	return cpunode_map[cpu];
+	return cpunode_map[cpu.cpu];
 }
 
 static int init_cpunode_map(void)
@@ -420,13 +421,13 @@ static int init_cpunode_map(void)
 	set_max_cpu_num();
 	set_max_node_num();
 
-	cpunode_map = calloc(max_cpu_num, sizeof(int));
+	cpunode_map = calloc(max_cpu_num.cpu, sizeof(int));
 	if (!cpunode_map) {
 		pr_err("%s: calloc failed\n", __func__);
 		return -1;
 	}
 
-	for (i = 0; i < max_cpu_num; i++)
+	for (i = 0; i < max_cpu_num.cpu; i++)
 		cpunode_map[i] = -1;
 
 	return 0;
@@ -487,35 +488,37 @@ int cpu__setup_cpunode_map(void)
 
 size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
 {
-	int i, cpu, start = -1;
+	int i, start = -1;
 	bool first = true;
 	size_t ret = 0;
 
 #define COMMA first ? "" : ","
 
 	for (i = 0; i < map->nr + 1; i++) {
+		struct perf_cpu cpu = { .cpu = INT_MAX };
 		bool last = i == map->nr;
 
-		cpu = last ? INT_MAX : map->map[i];
+		if (!last)
+			cpu = map->map[i];
 
 		if (start == -1) {
 			start = i;
 			if (last) {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d", COMMA,
-						map->map[i]);
+						map->map[i].cpu);
 			}
-		} else if (((i - start) != (cpu - map->map[start])) || last) {
+		} else if (((i - start) != (cpu.cpu - map->map[start].cpu)) || last) {
 			int end = i - 1;
 
 			if (start == end) {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d", COMMA,
-						map->map[start]);
+						map->map[start].cpu);
 			} else {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d-%d", COMMA,
-						map->map[start], map->map[end]);
+						map->map[start].cpu, map->map[end].cpu);
 			}
 			first = false;
 			start = i;
@@ -542,23 +545,23 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
 	int i, cpu;
 	char *ptr = buf;
 	unsigned char *bitmap;
-	int last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
+	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
 
 	if (buf == NULL)
 		return 0;
 
-	bitmap = zalloc(last_cpu / 8 + 1);
+	bitmap = zalloc(last_cpu.cpu / 8 + 1);
 	if (bitmap == NULL) {
 		buf[0] = '\0';
 		return 0;
 	}
 
 	for (i = 0; i < map->nr; i++) {
-		cpu = perf_cpu_map__cpu(map, i);
+		cpu = perf_cpu_map__cpu(map, i).cpu;
 		bitmap[cpu / 8] |= 1 << (cpu % 8);
 	}
 
-	for (cpu = last_cpu / 4 * 4; cpu >= 0; cpu -= 4) {
+	for (cpu = last_cpu.cpu / 4 * 4; cpu >= 0; cpu -= 4) {
 		unsigned char bits = bitmap[cpu / 8];
 
 		if (cpu % 8)
@@ -594,7 +597,7 @@ bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b
 		a->socket == b->socket &&
 		a->die == b->die &&
 		a->core == b->core &&
-		a->cpu == b->cpu;
+		a->cpu.cpu == b->cpu.cpu;
 }
 
 bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a)
@@ -604,7 +607,7 @@ bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a)
 		a->socket == -1 &&
 		a->die == -1 &&
 		a->core == -1 &&
-		a->cpu == -1;
+		a->cpu.cpu == -1;
 }
 
 struct aggr_cpu_id aggr_cpu_id__empty(void)
@@ -615,7 +618,7 @@ struct aggr_cpu_id aggr_cpu_id__empty(void)
 		.socket = -1,
 		.die = -1,
 		.core = -1,
-		.cpu = -1
+		.cpu = (struct perf_cpu){ .cpu = -1 },
 	};
 	return ret;
 }
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index b98cd1739677..afc15027d678 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -23,7 +23,7 @@ struct aggr_cpu_id {
 	/** The core id as read from /sys/devices/system/cpu/cpuX/topology/core_id. */
 	int core;
 	/** CPU aggregation, note there is one CPU for each SMT thread. */
-	int cpu;
+	struct perf_cpu cpu;
 };
 
 /** A collection of aggr_cpu_id values, the "built" version is sorted and uniqued. */
@@ -48,28 +48,28 @@ const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */
 int cpu__setup_cpunode_map(void);
 
 int cpu__max_node(void);
-int cpu__max_cpu(void);
-int cpu__max_present_cpu(void);
+struct perf_cpu cpu__max_cpu(void);
+struct perf_cpu cpu__max_present_cpu(void);
 /**
  * cpu__get_node - Returns the numa node X as read from
  * /sys/devices/system/node/nodeX for the given CPU.
  */
-int cpu__get_node(int cpu);
+int cpu__get_node(struct perf_cpu cpu);
 /**
  * cpu__get_socket_id - Returns the socket number as read from
  * /sys/devices/system/cpu/cpuX/topology/physical_package_id for the given CPU.
  */
-int cpu__get_socket_id(int cpu);
+int cpu__get_socket_id(struct perf_cpu cpu);
 /**
  * cpu__get_die_id - Returns the die id as read from
  * /sys/devices/system/cpu/cpuX/topology/die_id for the given CPU.
  */
-int cpu__get_die_id(int cpu);
+int cpu__get_die_id(struct perf_cpu cpu);
 /**
  * cpu__get_core_id - Returns the core id as read from
  * /sys/devices/system/cpu/cpuX/topology/core_id for the given CPU.
  */
-int cpu__get_core_id(int cpu);
+int cpu__get_core_id(struct perf_cpu cpu);
 
 /**
  * cpu_aggr_map__empty_new - Create a cpu_aggr_map of size nr with every entry
@@ -77,7 +77,7 @@ int cpu__get_core_id(int cpu);
  */
 struct cpu_aggr_map *cpu_aggr_map__empty_new(int nr);
 
-typedef struct aggr_cpu_id (*aggr_cpu_id_get_t)(int cpu, void *data);
+typedef struct aggr_cpu_id (*aggr_cpu_id_get_t)(struct perf_cpu cpu, void *data);
 
 /**
  * cpu_aggr_map__new - Create a cpu_aggr_map with an aggr_cpu_id for each cpu in
@@ -98,29 +98,29 @@ struct aggr_cpu_id aggr_cpu_id__empty(void);
  * the socket for cpu. The function signature is compatible with
  * aggr_cpu_id_get_t.
  */
-struct aggr_cpu_id aggr_cpu_id__socket(int cpu, void *data);
+struct aggr_cpu_id aggr_cpu_id__socket(struct perf_cpu cpu, void *data);
 /**
  * aggr_cpu_id__die - Create an aggr_cpu_id with the die and socket populated
  * with the die and socket for cpu. The function signature is compatible with
  * aggr_cpu_id_get_t.
  */
-struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data);
+struct aggr_cpu_id aggr_cpu_id__die(struct perf_cpu cpu, void *data);
 /**
  * aggr_cpu_id__core - Create an aggr_cpu_id with the core, die and socket
  * populated with the core, die and socket for cpu. The function signature is
  * compatible with aggr_cpu_id_get_t.
  */
-struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data);
+struct aggr_cpu_id aggr_cpu_id__core(struct perf_cpu cpu, void *data);
 /**
  * aggr_cpu_id__core - Create an aggr_cpu_id with the cpu, core, die and socket
  * populated with the cpu, core, die and socket for cpu. The function signature
  * is compatible with aggr_cpu_id_get_t.
  */
-struct aggr_cpu_id aggr_cpu_id__cpu(int cpu, void *data);
+struct aggr_cpu_id aggr_cpu_id__cpu(struct perf_cpu cpu, void *data);
 /**
  * aggr_cpu_id__node - Create an aggr_cpu_id with the numa node populated for
  * cpu. The function signature is compatible with aggr_cpu_id_get_t.
  */
-struct aggr_cpu_id aggr_cpu_id__node(int cpu, void *data);
+struct aggr_cpu_id aggr_cpu_id__node(struct perf_cpu cpu, void *data);
 
 #endif /* __PERF_CPUMAP_H */
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index 8affb37d90e7..84ca106a3246 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -187,7 +187,7 @@ struct cpu_topology *cpu_topology__new(void)
 	struct perf_cpu_map *map;
 	bool has_die = has_die_topology();
 
-	ncpus = cpu__max_present_cpu();
+	ncpus = cpu__max_present_cpu().cpu;
 
 	/* build online CPU map */
 	map = perf_cpu_map__new(NULL);
@@ -218,7 +218,7 @@ struct cpu_topology *cpu_topology__new(void)
 	tp->core_cpus_list = addr;
 
 	for (i = 0; i < nr; i++) {
-		if (!perf_cpu_map__has(map, i))
+		if (!perf_cpu_map__has(map, (struct perf_cpu){ .cpu = i }))
 			continue;
 
 		ret = build_cpu_topology(tp, i);
@@ -333,7 +333,7 @@ struct numa_topology *numa_topology__new(void)
 	tp->nr = nr;
 
 	for (i = 0; i < nr; i++) {
-		if (load_numa_node(&tp->nodes[i], node_map->map[i])) {
+		if (load_numa_node(&tp->nodes[i], node_map->map[i].cpu)) {
 			numa_topology__delete(tp);
 			tp = NULL;
 			break;
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index fd12c0dcaefb..579e44c59914 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -285,13 +285,13 @@ int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
 
 int perf_env__read_cpu_topology_map(struct perf_env *env)
 {
-	int cpu, nr_cpus;
+	int idx, nr_cpus;
 
 	if (env->cpu != NULL)
 		return 0;
 
 	if (env->nr_cpus_avail == 0)
-		env->nr_cpus_avail = cpu__max_present_cpu();
+		env->nr_cpus_avail = cpu__max_present_cpu().cpu;
 
 	nr_cpus = env->nr_cpus_avail;
 	if (nr_cpus == -1)
@@ -301,10 +301,12 @@ int perf_env__read_cpu_topology_map(struct perf_env *env)
 	if (env->cpu == NULL)
 		return -ENOMEM;
 
-	for (cpu = 0; cpu < nr_cpus; ++cpu) {
-		env->cpu[cpu].core_id	= cpu__get_core_id(cpu);
-		env->cpu[cpu].socket_id	= cpu__get_socket_id(cpu);
-		env->cpu[cpu].die_id	= cpu__get_die_id(cpu);
+	for (idx = 0; idx < nr_cpus; ++idx) {
+		struct perf_cpu cpu = { .cpu = idx };
+
+		env->cpu[idx].core_id	= cpu__get_core_id(cpu);
+		env->cpu[idx].socket_id	= cpu__get_socket_id(cpu);
+		env->cpu[idx].die_id	= cpu__get_die_id(cpu);
 	}
 
 	env->nr_cpus_avail = nr_cpus;
@@ -381,7 +383,7 @@ static int perf_env__read_arch(struct perf_env *env)
 static int perf_env__read_nr_cpus_avail(struct perf_env *env)
 {
 	if (env->nr_cpus_avail == 0)
-		env->nr_cpus_avail = cpu__max_present_cpu();
+		env->nr_cpus_avail = cpu__max_present_cpu().cpu;
 
 	return env->nr_cpus_avail ? 0 : -ENOENT;
 }
@@ -487,7 +489,7 @@ const char *perf_env__pmu_mappings(struct perf_env *env)
 	return env->pmu_mappings;
 }
 
-int perf_env__numa_node(struct perf_env *env, int cpu)
+int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu)
 {
 	if (!env->nr_numa_map) {
 		struct numa_node *nn;
@@ -495,7 +497,7 @@ int perf_env__numa_node(struct perf_env *env, int cpu)
 
 		for (i = 0; i < env->nr_numa_nodes; i++) {
 			nn = &env->numa_nodes[i];
-			nr = max(nr, perf_cpu_map__max(nn->map));
+			nr = max(nr, perf_cpu_map__max(nn->map).cpu);
 		}
 
 		nr++;
@@ -514,13 +516,14 @@ int perf_env__numa_node(struct perf_env *env, int cpu)
 		env->nr_numa_map = nr;
 
 		for (i = 0; i < env->nr_numa_nodes; i++) {
-			int tmp, j;
+			struct perf_cpu tmp;
+			int j;
 
 			nn = &env->numa_nodes[i];
-			perf_cpu_map__for_each_cpu(j, tmp, nn->map)
-				env->numa_map[j] = i;
+			perf_cpu_map__for_each_cpu(tmp, j, nn->map)
+				env->numa_map[tmp.cpu] = i;
 		}
 	}
 
-	return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1;
+	return cpu.cpu >= 0 && cpu.cpu < env->nr_numa_map ? env->numa_map[cpu.cpu] : -1;
 }
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 163e5ec503a2..a3541f98e1fc 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -4,6 +4,7 @@
 
 #include <linux/types.h>
 #include <linux/rbtree.h>
+#include "cpumap.h"
 #include "rwsem.h"
 
 struct perf_cpu_map;
@@ -170,5 +171,5 @@ struct bpf_prog_info_node *perf_env__find_bpf_prog_info(struct perf_env *env,
 bool perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
 struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
 
-int perf_env__numa_node(struct perf_env *env, int cpu);
+int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu);
 #endif /* __PERF_ENV_H */
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 39d294f6c321..11eb95b2106b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -350,13 +350,13 @@ struct evlist_cpu_iterator evlist__cpu_begin(struct evlist *evlist, struct affin
 		.cpu_map_idx = 0,
 		.evlist_cpu_map_idx = 0,
 		.evlist_cpu_map_nr = perf_cpu_map__nr(evlist->core.all_cpus),
-		.cpu = -1,
+		.cpu = (struct perf_cpu){ .cpu = -1},
 		.affinity = affinity,
 	};
 
 	if (itr.affinity) {
 		itr.cpu = perf_cpu_map__cpu(evlist->core.all_cpus, 0);
-		affinity__set(itr.affinity, itr.cpu);
+		affinity__set(itr.affinity, itr.cpu.cpu);
 		itr.cpu_map_idx = perf_cpu_map__idx(itr.evsel->core.cpus, itr.cpu);
 		/*
 		 * If this CPU isn't in the evsel's cpu map then advance through
@@ -385,7 +385,7 @@ void evlist_cpu_iterator__next(struct evlist_cpu_iterator *evlist_cpu_itr)
 			perf_cpu_map__cpu(evlist_cpu_itr->container->core.all_cpus,
 					  evlist_cpu_itr->evlist_cpu_map_idx);
 		if (evlist_cpu_itr->affinity)
-			affinity__set(evlist_cpu_itr->affinity, evlist_cpu_itr->cpu);
+			affinity__set(evlist_cpu_itr->affinity, evlist_cpu_itr->cpu.cpu);
 		evlist_cpu_itr->cpu_map_idx =
 			perf_cpu_map__idx(evlist_cpu_itr->evsel->core.cpus,
 					  evlist_cpu_itr->cpu);
@@ -819,7 +819,7 @@ perf_evlist__mmap_cb_get(struct perf_evlist *_evlist, bool overwrite, int idx)
 
 static int
 perf_evlist__mmap_cb_mmap(struct perf_mmap *_map, struct perf_mmap_param *_mp,
-			  int output, int cpu)
+			  int output, struct perf_cpu cpu)
 {
 	struct mmap *map = container_of(_map, struct mmap, core);
 	struct mmap_params *mp = container_of(_mp, struct mmap_params, core);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 57828ebfcb61..64cba56fbc74 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -344,7 +344,7 @@ struct evlist_cpu_iterator {
 	/** The number of CPU map entries in evlist->core.all_cpus. */
 	int evlist_cpu_map_nr;
 	/** The current CPU of the iterator. */
-	int cpu;
+	struct perf_cpu cpu;
 	/** If present, used to set the affinity when switching between CPUs. */
 	struct affinity *affinity;
 };
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2b18b29a534f..7b806cb462bb 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1594,7 +1594,7 @@ int __evsel__read_on_cpu(struct evsel *evsel, int cpu_map_idx, int thread, bool
 static int evsel__match_other_cpu(struct evsel *evsel, struct evsel *other,
 				  int cpu_map_idx)
 {
-	int cpu;
+	struct perf_cpu cpu;
 
 	cpu = perf_cpu_map__cpu(evsel->core.cpus, cpu_map_idx);
 	return perf_cpu_map__idx(other->core.cpus, cpu);
@@ -2020,9 +2020,9 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 			test_attr__ready();
 
 			pr_debug2_peo("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
-				pid, cpus->map[idx], group_fd, evsel->open_flags);
+				pid, cpus->map[idx].cpu, group_fd, evsel->open_flags);
 
-			fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[idx],
+			fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[idx].cpu,
 						group_fd, evsel->open_flags);
 
 			FD(evsel, idx, thread) = fd;
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 666b59baeb70..e808738493e2 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -410,7 +410,7 @@ double expr__get_literal(const char *literal)
 		return smt_on() > 0 ? 1.0 : 0.0;
 
 	if (!strcmp("#num_cpus", literal))
-		return cpu__max_present_cpu();
+		return cpu__max_present_cpu().cpu;
 
 	/*
 	 * Assume that topology strings are consistent, such as CPUs "0-1"
 --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index e3c1a532d059..6da12e522edc 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -472,7 +472,7 @@ static int write_nrcpus(struct feat_fd *ff,
 	u32 nrc, nra;
 	int ret;
 
-	nrc = cpu__max_present_cpu();
+	nrc = cpu__max_present_cpu().cpu;
 
 	nr = sysconf(_SC_NPROCESSORS_ONLN);
 	if (nr < 0)
@@ -1163,7 +1163,7 @@ static int build_caches(struct cpu_cache_level caches[], u32 *cntp)
 	u32 nr, cpu;
 	u16 level;
 
-	nr = cpu__max_cpu();
+	nr = cpu__max_cpu().cpu;
 
 	for (cpu = 0; cpu < nr; cpu++) {
 		for (level = 0; level < MAX_CACHE_LVL; level++) {
@@ -1195,7 +1195,7 @@ static int build_caches(struct cpu_cache_level caches[], u32 *cntp)
 static int write_cache(struct feat_fd *ff,
 		       struct evlist *evlist __maybe_unused)
 {
-	u32 max_caches = cpu__max_cpu() * MAX_CACHE_LVL;
+	u32 max_caches = cpu__max_cpu().cpu * MAX_CACHE_LVL;
 	struct cpu_cache_level caches[max_caches];
 	u32 cnt = 0, i, version = 1;
 	int ret;
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 23ecdba9e670..12261ed8c15b 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -94,7 +94,7 @@ static void perf_mmap__aio_free(struct mmap *map, int idx)
 	}
 }
 
-static int perf_mmap__aio_bind(struct mmap *map, int idx, int cpu, int affinity)
+static int perf_mmap__aio_bind(struct mmap *map, int idx, struct perf_cpu cpu, int affinity)
 {
 	void *data;
 	size_t mmap_len;
@@ -138,7 +138,7 @@ static void perf_mmap__aio_free(struct mmap *map, int idx)
 }
 
 static int perf_mmap__aio_bind(struct mmap *map __maybe_unused, int idx __maybe_unused,
-		int cpu __maybe_unused, int affinity __maybe_unused)
+		struct perf_cpu cpu __maybe_unused, int affinity __maybe_unused)
 {
 	return 0;
 }
@@ -240,7 +240,8 @@ void mmap__munmap(struct mmap *map)
 
 static void build_node_mask(int node, struct mmap_cpu_mask *mask)
 {
-	int c, cpu, nr_cpus;
+	int idx, nr_cpus;
+	struct perf_cpu cpu;
 	const struct perf_cpu_map *cpu_map = NULL;
 
 	cpu_map = cpu_map__online();
@@ -248,16 +249,16 @@ static void build_node_mask(int node, struct mmap_cpu_mask *mask)
 		return;
 
 	nr_cpus = perf_cpu_map__nr(cpu_map);
-	for (c = 0; c < nr_cpus; c++) {
-		cpu = cpu_map->map[c]; /* map c index to online cpu index */
+	for (idx = 0; idx < nr_cpus; idx++) {
+		cpu = cpu_map->map[idx]; /* map c index to online cpu index */
 		if (cpu__get_node(cpu) == node)
-			set_bit(cpu, mask->bits);
+			set_bit(cpu.cpu, mask->bits);
 	}
 }
 
 static int perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *mp)
 {
-	map->affinity_mask.nbits = cpu__max_cpu();
+	map->affinity_mask.nbits = cpu__max_cpu().cpu;
 	map->affinity_mask.bits = bitmap_zalloc(map->affinity_mask.nbits);
 	if (!map->affinity_mask.bits)
 		return -1;
@@ -265,12 +266,12 @@ static int perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *
 	if (mp->affinity == PERF_AFFINITY_NODE && cpu__max_node() > 1)
 		build_node_mask(cpu__get_node(map->core.cpu), &map->affinity_mask);
 	else if (mp->affinity == PERF_AFFINITY_CPU)
-		set_bit(map->core.cpu, map->affinity_mask.bits);
+		set_bit(map->core.cpu.cpu, map->affinity_mask.bits);
 
 	return 0;
 }
 
-int mmap__mmap(struct mmap *map, struct mmap_params *mp, int fd, int cpu)
+int mmap__mmap(struct mmap *map, struct mmap_params *mp, int fd, struct perf_cpu cpu)
 {
 	if (perf_mmap__mmap(&map->core, &mp->core, fd, cpu)) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index 8e259b9610f8..83f6bd4d4082 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -7,6 +7,7 @@
 #include <linux/types.h>
 #include <linux/ring_buffer.h>
 #include <linux/bitops.h>
+#include <perf/cpumap.h>
 #include <stdbool.h>
 #include <pthread.h> // for cpu_set_t
 #ifdef HAVE_AIO_SUPPORT
@@ -52,7 +53,7 @@ struct mmap_params {
 	struct auxtrace_mmap_params auxtrace_mp;
 };
 
-int mmap__mmap(struct mmap *map, struct mmap_params *mp, int fd, int cpu);
+int mmap__mmap(struct mmap *map, struct mmap_params *mp, int fd, struct perf_cpu cpu);
 void mmap__munmap(struct mmap *map);
 
 union perf_event *perf_mmap__read_forward(struct mmap *map);
diff --git a/tools/perf/util/perf_api_probe.c b/tools/perf/util/perf_api_probe.c
index 020411682a3c..734d006d9a8c 100644
--- a/tools/perf/util/perf_api_probe.c
+++ b/tools/perf/util/perf_api_probe.c
@@ -11,7 +11,7 @@
 
 typedef void (*setup_probe_fn_t)(struct evsel *evsel);
 
-static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
+static int perf_do_probe_api(setup_probe_fn_t fn, struct perf_cpu cpu, const char *str)
 {
 	struct evlist *evlist;
 	struct evsel *evsel;
@@ -29,7 +29,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 	evsel = evlist__first(evlist);
 
 	while (1) {
-		fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, -1, flags);
+		fd = sys_perf_event_open(&evsel->core.attr, pid, cpu.cpu, -1, flags);
 		if (fd < 0) {
 			if (pid == -1 && errno == EACCES) {
 				pid = 0;
@@ -43,7 +43,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, -1, flags);
+	fd = sys_perf_event_open(&evsel->core.attr, pid, cpu.cpu, -1, flags);
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -61,7 +61,8 @@ static bool perf_probe_api(setup_probe_fn_t fn)
 {
 	const char *try[] = {"cycles:u", "instructions:u", "cpu-clock:u", NULL};
 	struct perf_cpu_map *cpus;
-	int cpu, ret, i = 0;
+	struct perf_cpu cpu;
+	int ret, i = 0;
 
 	cpus = perf_cpu_map__new(NULL);
 	if (!cpus)
@@ -136,15 +137,17 @@ bool perf_can_record_cpu_wide(void)
 		.exclude_kernel = 1,
 	};
 	struct perf_cpu_map *cpus;
-	int cpu, fd;
+	struct perf_cpu cpu;
+	int fd;
 
 	cpus = perf_cpu_map__new(NULL);
 	if (!cpus)
 		return false;
+
 	cpu = cpus->map[0];
 	perf_cpu_map__put(cpus);
 
-	fd = sys_perf_event_open(&attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&attr, -1, cpu.cpu, -1, 0);
 	if (fd < 0)
 		return false;
 	close(fd);
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 7f782a31bda3..95fb53899bcd 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -1057,7 +1057,7 @@ static struct mmap *get_md(struct evlist *evlist, int cpu)
 	for (i = 0; i < evlist->core.nr_mmaps; i++) {
 		struct mmap *md = &evlist->mmap[i];
 
-		if (md->core.cpu == cpu)
+		if (md->core.cpu.cpu == cpu)
 			return md;
 	}
 
@@ -1443,7 +1443,7 @@ PyMODINIT_FUNC PyInit_perf(void)
  * Dummy, to avoid dragging all the test_attr infrastructure in the python
  * binding.
  */
-void test_attr__open(struct perf_event_attr *attr, pid_t pid, int cpu,
+void test_attr__open(struct perf_event_attr *attr, pid_t pid, struct perf_cpu cpu,
                      int fd, int group_fd, unsigned long flags)
 {
 }
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index bff669b615ee..20461f174991 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -106,7 +106,7 @@ void evlist__config(struct evlist *evlist, struct record_opts *opts, struct call
 	if (opts->group)
 		evlist__set_leader(evlist);
 
-	if (evlist->core.cpus->map[0] < 0)
+	if (evlist->core.cpus->map[0].cpu < 0)
 		opts->no_inherit = true;
 
 	use_comm_exec = perf_can_comm_exec();
@@ -229,7 +229,8 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str)
 {
 	struct evlist *temp_evlist;
 	struct evsel *evsel;
-	int err, fd, cpu;
+	int err, fd;
+	struct perf_cpu cpu = { .cpu = 0 };
 	bool ret = false;
 	pid_t pid = -1;
 
@@ -246,14 +247,16 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str)
 	if (!evlist || perf_cpu_map__empty(evlist->core.cpus)) {
 		struct perf_cpu_map *cpus = perf_cpu_map__new(NULL);
 
-		cpu =  cpus ? cpus->map[0] : 0;
+		if (cpus)
+			cpu =  cpus->map[0];
+
 		perf_cpu_map__put(cpus);
 	} else {
 		cpu = evlist->core.cpus->map[0];
 	}
 
 	while (1) {
-		fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, -1,
+		fd = sys_perf_event_open(&evsel->core.attr, pid, cpu.cpu, -1,
 					 perf_event_open_cloexec_flag());
 		if (fd < 0) {
 			if (pid == -1 && errno == EACCES) {
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 0445bee9290f..bd95d60018a9 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1555,7 +1555,7 @@ static void get_handler_name(char *str, size_t size,
 }
 
 static void
-process_stat(struct evsel *counter, int cpu, int thread, u64 tstamp,
+process_stat(struct evsel *counter, struct perf_cpu cpu, int thread, u64 tstamp,
 	     struct perf_counts_values *count)
 {
 	PyObject *handler, *t;
@@ -1575,7 +1575,7 @@ process_stat(struct evsel *counter, int cpu, int thread, u64 tstamp,
 		return;
 	}
 
-	PyTuple_SetItem(t, n++, _PyLong_FromLong(cpu));
+	PyTuple_SetItem(t, n++, _PyLong_FromLong(cpu.cpu));
 	PyTuple_SetItem(t, n++, _PyLong_FromLong(thread));
 
 	tuple_set_u64(t, n++, tstamp);
@@ -1599,7 +1599,7 @@ static void python_process_stat(struct perf_stat_config *config,
 	int cpu, thread;
 
 	if (config->aggr_mode == AGGR_GLOBAL) {
-		process_stat(counter, -1, -1, tstamp,
+		process_stat(counter, (struct perf_cpu){ .cpu = -1 }, -1, tstamp,
 			     &counter->counts->aggr);
 		return;
 	}
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index e1a273048681..f19348dddd55 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2538,15 +2538,15 @@ int perf_session__cpu_bitmap(struct perf_session *session,
 	}
 
 	for (i = 0; i < map->nr; i++) {
-		int cpu = map->map[i];
+		struct perf_cpu cpu = map->map[i];
 
-		if (cpu >= nr_cpus) {
+		if (cpu.cpu >= nr_cpus) {
 			pr_err("Requested CPU %d too large. "
-			       "Consider raising MAX_NR_CPUS\n", cpu);
+			       "Consider raising MAX_NR_CPUS\n", cpu.cpu);
 			goto out_delete_map;
 		}
 
-		set_bit(cpu, cpu_bitmap);
+		set_bit(cpu.cpu, cpu_bitmap);
 	}
 
 	err = 0;
@@ -2598,7 +2598,7 @@ int perf_event__process_id_index(struct perf_session *session,
 		if (!sid)
 			return -ENOENT;
 		sid->idx = e->idx;
-		sid->cpu = e->cpu;
+		sid->cpu.cpu = e->cpu;
 		sid->tid = e->tid;
 	}
 	return 0;
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index ba95379efcfb..5db83e51ceef 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -121,10 +121,10 @@ static void aggr_printout(struct perf_stat_config *config,
 				id.die,
 				config->csv_output ? 0 : -3,
 				id.core, config->csv_sep);
-		} else if (id.cpu > -1) {
+		} else if (id.cpu.cpu > -1) {
 			fprintf(config->output, "CPU%*d%s",
 				config->csv_output ? 0 : -7,
-				id.cpu, config->csv_sep);
+				id.cpu.cpu, config->csv_sep);
 		}
 		break;
 	case AGGR_THREAD:
@@ -331,7 +331,8 @@ static int first_shadow_cpu_map_idx(struct perf_stat_config *config,
 				struct evsel *evsel, const struct aggr_cpu_id *id)
 {
 	struct perf_cpu_map *cpus = evsel__cpus(evsel);
-	int cpu, idx;
+	struct perf_cpu cpu;
+	int idx;
 
 	if (config->aggr_mode == AGGR_NONE)
 		return perf_cpu_map__idx(cpus, id->cpu);
@@ -513,7 +514,8 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
 static void aggr_update_shadow(struct perf_stat_config *config,
 			       struct evlist *evlist)
 {
-	int cpu, idx, s;
+	int idx, s;
+	struct perf_cpu cpu;
 	struct aggr_cpu_id s2, id;
 	u64 val;
 	struct evsel *counter;
@@ -633,7 +635,8 @@ static void aggr_cb(struct perf_stat_config *config,
 		    struct evsel *counter, void *data, bool first)
 {
 	struct aggr_data *ad = data;
-	int idx, cpu;
+	int idx;
+	struct perf_cpu cpu;
 	struct perf_cpu_map *cpus;
 	struct aggr_cpu_id s2;
 
@@ -666,7 +669,7 @@ static void aggr_cb(struct perf_stat_config *config,
 static void print_counter_aggrdata(struct perf_stat_config *config,
 				   struct evsel *counter, int s,
 				   char *prefix, bool metric_only,
-				   bool *first, int cpu)
+				   bool *first, struct perf_cpu cpu)
 {
 	struct aggr_data ad;
 	FILE *output = config->output;
@@ -696,7 +699,7 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
 		fprintf(output, "%s", prefix);
 
 	uval = val * counter->scale;
-	if (cpu != -1)
+	if (cpu.cpu != -1)
 		id = aggr_cpu_id__cpu(cpu, /*data=*/NULL);
 
 	printout(config, id, nr, counter, uval,
@@ -731,8 +734,8 @@ static void print_aggr(struct perf_stat_config *config,
 		first = true;
 		evlist__for_each_entry(evlist, counter) {
 			print_counter_aggrdata(config, counter, s,
-					       prefix, metric_only,
-					       &first, /*cpu=*/-1);
+					prefix, metric_only,
+					&first, (struct perf_cpu){ .cpu = -1 });
 		}
 		if (metric_only)
 			fputc('\n', output);
@@ -893,7 +896,8 @@ static void print_counter(struct perf_stat_config *config,
 	FILE *output = config->output;
 	u64 ena, run, val;
 	double uval;
-	int idx, cpu;
+	int idx;
+	struct perf_cpu cpu;
 	struct aggr_cpu_id id;
 
 	perf_cpu_map__for_each_cpu(cpu, idx, evsel__cpus(counter)) {
@@ -921,7 +925,8 @@ static void print_no_aggr_metric(struct perf_stat_config *config,
 				 struct evlist *evlist,
 				 char *prefix)
 {
-	int all_idx, cpu;
+	int all_idx;
+	struct perf_cpu cpu;
 
 	perf_cpu_map__for_each_cpu(cpu, all_idx, evlist->core.cpus) {
 		struct evsel *counter;
@@ -1211,7 +1216,8 @@ static void print_percore_thread(struct perf_stat_config *config,
 	struct aggr_cpu_id s2, id;
 	struct perf_cpu_map *cpus;
 	bool first = true;
-	int idx, cpu;
+	int idx;
+	struct perf_cpu cpu;
 
 	cpus = evsel__cpus(counter);
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
@@ -1247,8 +1253,8 @@ static void print_percore(struct perf_stat_config *config,
 			fprintf(output, "%s", prefix);
 
 		print_counter_aggrdata(config, counter, s,
-				       prefix, metric_only,
-				       &first, /*cpu=*/-1);
+				prefix, metric_only,
+				&first, (struct perf_cpu){ .cpu = -1 });
 	}
 
 	if (metric_only)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 7dbd7c4f3c33..ee6f03481215 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -297,7 +297,7 @@ static int check_per_pkg(struct evsel *counter, struct perf_counts_values *vals,
 {
 	struct hashmap *mask = counter->per_pkg_mask;
 	struct perf_cpu_map *cpus = evsel__cpus(counter);
-	int cpu = perf_cpu_map__cpu(cpus, cpu_map_idx);
+	struct perf_cpu cpu = perf_cpu_map__cpu(cpus, cpu_map_idx);
 	int s, d, ret = 0;
 	uint64_t *key;
 
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 691c12fd8976..335d19cc3063 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -108,7 +108,7 @@ struct runtime_stat {
 	struct rblist value_list;
 };
 
-typedef struct aggr_cpu_id (*aggr_get_id_t)(struct perf_stat_config *config, int cpu);
+typedef struct aggr_cpu_id (*aggr_get_id_t)(struct perf_stat_config *config, struct perf_cpu cpu);
 
 struct perf_stat_config {
 	enum aggr_mode		 aggr_mode;
diff --git a/tools/perf/util/svghelper.c b/tools/perf/util/svghelper.c
index 96f941e01681..4c9f211249db 100644
--- a/tools/perf/util/svghelper.c
+++ b/tools/perf/util/svghelper.c
@@ -728,7 +728,7 @@ static int str_to_bitmap(char *s, cpumask_t *b, int nr_cpus)
 	int i;
 	int ret = 0;
 	struct perf_cpu_map *m;
-	int c;
+	struct perf_cpu c;
 
 	m = perf_cpu_map__new(s);
 	if (!m)
@@ -736,12 +736,12 @@ static int str_to_bitmap(char *s, cpumask_t *b, int nr_cpus)
 
 	for (i = 0; i < m->nr; i++) {
 		c = m->map[i];
-		if (c >= nr_cpus) {
+		if (c.cpu >= nr_cpus) {
 			ret = -1;
 			break;
 		}
 
-		set_bit(c, cpumask_bits(b));
+		set_bit(c.cpu, cpumask_bits(b));
 	}
 
 	perf_cpu_map__put(m);
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 198982109f0f..c9ba8050cc2b 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -1191,7 +1191,7 @@ static void synthesize_cpus(struct cpu_map_entries *cpus,
 	cpus->nr = map->nr;
 
 	for (i = 0; i < map->nr; i++)
-		cpus->cpu[i] = map->map[i];
+		cpus->cpu[i] = map->map[i].cpu;
 }
 
 static void synthesize_mask(struct perf_record_record_cpu_map *mask,
@@ -1203,7 +1203,7 @@ static void synthesize_mask(struct perf_record_record_cpu_map *mask,
 	mask->long_size = sizeof(long);
 
 	for (i = 0; i < map->nr; i++)
-		set_bit(map->map[i], mask->mask);
+		set_bit(map->map[i].cpu, mask->mask);
 }
 
 static size_t cpus_size(struct perf_cpu_map *map)
@@ -1219,7 +1219,7 @@ static size_t mask_size(struct perf_cpu_map *map, int *max)
 
 	for (i = 0; i < map->nr; i++) {
 		/* bit position of the cpu is + 1 */
-		int bit = map->map[i] + 1;
+		int bit = map->map[i].cpu + 1;
 
 		if (bit > *max)
 			*max = bit;
@@ -1354,7 +1354,7 @@ int perf_event__synthesize_stat_config(struct perf_tool *tool,
 }
 
 int perf_event__synthesize_stat(struct perf_tool *tool,
-				u32 cpu, u32 thread, u64 id,
+				struct perf_cpu cpu, u32 thread, u64 id,
 				struct perf_counts_values *count,
 				perf_event__handler_t process,
 				struct machine *machine)
@@ -1366,7 +1366,7 @@ int perf_event__synthesize_stat(struct perf_tool *tool,
 	event.header.misc = 0;
 
 	event.id        = id;
-	event.cpu       = cpu;
+	event.cpu       = cpu.cpu;
 	event.thread    = thread;
 	event.val       = count->val;
 	event.ena       = count->ena;
@@ -1763,7 +1763,7 @@ int perf_event__synthesize_id_index(struct perf_tool *tool, perf_event__handler_
 			}
 
 			e->idx = sid->idx;
-			e->cpu = sid->cpu;
+			e->cpu = sid->cpu.cpu;
 			e->tid = sid->tid;
 		}
 	}
diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
index c931433bacbf..78a0450db164 100644
--- a/tools/perf/util/synthetic-events.h
+++ b/tools/perf/util/synthetic-events.h
@@ -6,6 +6,7 @@
 #include <sys/types.h> // pid_t
 #include <linux/compiler.h>
 #include <linux/types.h>
+#include <perf/cpumap.h>
 
 struct auxtrace_record;
 struct dso;
@@ -63,7 +64,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, u64 read_fo
 int perf_event__synthesize_stat_config(struct perf_tool *tool, struct perf_stat_config *config, perf_event__handler_t process, struct machine *machine);
 int perf_event__synthesize_stat_events(struct perf_stat_config *config, struct perf_tool *tool, struct evlist *evlist, perf_event__handler_t process, bool attrs);
 int perf_event__synthesize_stat_round(struct perf_tool *tool, u64 time, u64 type, perf_event__handler_t process, struct machine *machine);
-int perf_event__synthesize_stat(struct perf_tool *tool, u32 cpu, u32 thread, u64 id, struct perf_counts_values *count, perf_event__handler_t process, struct machine *machine);
+int perf_event__synthesize_stat(struct perf_tool *tool, struct perf_cpu cpu, u32 thread, u64 id, struct perf_counts_values *count, perf_event__handler_t process, struct machine *machine);
 int perf_event__synthesize_thread_map2(struct perf_tool *tool, struct perf_thread_map *threads, perf_event__handler_t process, struct machine *machine);
 int perf_event__synthesize_thread_map(struct perf_tool *tool, struct perf_thread_map *threads, perf_event__handler_t process, struct machine *machine, bool needs_mmap, bool mmap_data);
 int perf_event__synthesize_threads(struct perf_tool *tool, perf_event__handler_t process, struct machine *machine, bool needs_mmap, bool mmap_data, unsigned int nr_threads_synthesize);
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 9f0d36ba77f2..9443c29afa52 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -11,6 +11,9 @@
 #include <stddef.h>
 #include <linux/compiler.h>
 #include <sys/types.h>
+#ifndef __cplusplus
+#include <internal/cpumap.h>
+#endif
 
 /* General helper functions */
 void usage(const char *err) __noreturn;
@@ -66,6 +69,6 @@ extern bool test_attr__enabled;
 void test_attr__ready(void);
 void test_attr__init(void);
 struct perf_event_attr;
-void test_attr__open(struct perf_event_attr *attr, pid_t pid, int cpu,
+void test_attr__open(struct perf_event_attr *attr, pid_t pid, struct perf_cpu cpu,
 		     int fd, int group_fd, unsigned long flags);
 #endif /* GIT_COMPAT_UTIL_H */
-- 
2.34.1.448.ga2b2bfdf31-goog
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 48/48] perf cpumap: Give CPUs their own type.
  2022-01-05  6:13 ` [PATCH v4 48/48] perf cpumap: Give CPUs their own type Ian Rogers
@ 2022-01-09 18:30   ` Jiri Olsa
  2022-01-10  5:45     ` Ian Rogers
  2022-01-10 19:50   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 84+ messages in thread
From: Jiri Olsa @ 2022-01-09 18:30 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Namhyung Kim, John Garry, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Tue, Jan 04, 2022 at 10:13:51PM -0800, Ian Rogers wrote:
> A common problem is confusing CPU map indices with the CPU, by wrapping
> the CPU with a struct then this is avoided. This approach is similar to
> atomic_t.
> 
> Suggested-by: John Garry <john.garry@huawei.com>
> Signed-off-by: Ian Rogers <irogers@google.com>
SNIP
>  tools/perf/util/stat.h                        |   2 +-
>  tools/perf/util/svghelper.c                   |   6 +-
>  tools/perf/util/synthetic-events.c            |  12 +-
>  tools/perf/util/synthetic-events.h            |   3 +-
>  tools/perf/util/util.h                        |   5 +-
>  59 files changed, 408 insertions(+), 347 deletions(-)
that's massive ;-) did it find any mis-use of the index/value?
how about the same for threads?
> diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> index 71a31ed738c9..581f9ffb4237 100644
> --- a/tools/lib/perf/include/internal/cpumap.h
> +++ b/tools/lib/perf/include/internal/cpumap.h
> @@ -4,6 +4,11 @@
>  
>  #include <linux/refcount.h>
>  
> +/** A wrapper around a CPU to avoid confusion with the perf_cpu_map's map's indices. */
> +struct perf_cpu {
> +	int cpu;
> +};
should we use 'int val' or 'int v' instead, so we don't have cpu.cpu ?
jirka
SNIP
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 26/48] perf evsel: Derive CPUs and threads in alloc_counts
  2022-01-05  6:13 ` [PATCH v4 26/48] perf evsel: Derive CPUs and threads in alloc_counts Ian Rogers
@ 2022-01-09 18:30   ` Jiri Olsa
  2022-01-10  5:49     ` Ian Rogers
  0 siblings, 1 reply; 84+ messages in thread
From: Jiri Olsa @ 2022-01-09 18:30 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Namhyung Kim, John Garry, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Tue, Jan 04, 2022 at 10:13:29PM -0800, Ian Rogers wrote:
SNIP
> diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
> index 7ff36bf6d644..3e275e9c60d1 100644
> --- a/tools/perf/util/counts.h
> +++ b/tools/perf/util/counts.h
> @@ -40,7 +40,7 @@ void perf_counts__delete(struct perf_counts *counts);
>  void perf_counts__reset(struct perf_counts *counts);
>  
>  void evsel__reset_counts(struct evsel *evsel);
> -int evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
> +int evsel__alloc_counts(struct evsel *evsel);
>  void evsel__free_counts(struct evsel *evsel);
>  
>  #endif /* __PERF_COUNTS_H */
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 656c30b988ce..6c9af21776e6 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1578,7 +1578,7 @@ int __evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread, bool scale)
>  	if (FD(evsel, cpu, thread) < 0)
>  		return -EINVAL;
>  
> -	if (evsel->counts == NULL && evsel__alloc_counts(evsel, cpu + 1, thread + 1) < 0)
> +	if (evsel->counts == NULL && evsel__alloc_counts(evsel) < 0)
ugh, nice.. good that it's used only from test code ;-)
btw I noticed evsel__read_on_cpu_scaled is not used at all
jirka
>  		return -ENOMEM;
>  
>  	if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) <= 0)
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index c69b221f5e3e..995cb5003133 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -152,11 +152,13 @@ static void evsel__free_stat_priv(struct evsel *evsel)
>  	zfree(&evsel->stats);
>  }
>  
> -static int evsel__alloc_prev_raw_counts(struct evsel *evsel, int ncpus, int nthreads)
> +static int evsel__alloc_prev_raw_counts(struct evsel *evsel)
>  {
> +	int cpu_map_nr = evsel__nr_cpus(evsel);
> +	int nthreads = perf_thread_map__nr(evsel->core.threads);
>  	struct perf_counts *counts;
>  
> -	counts = perf_counts__new(ncpus, nthreads);
> +	counts = perf_counts__new(cpu_map_nr, nthreads);
>  	if (counts)
>  		evsel->prev_raw_counts = counts;
>  
> @@ -177,12 +179,9 @@ static void evsel__reset_prev_raw_counts(struct evsel *evsel)
>  
>  static int evsel__alloc_stats(struct evsel *evsel, bool alloc_raw)
>  {
> -	int ncpus = evsel__nr_cpus(evsel);
> -	int nthreads = perf_thread_map__nr(evsel->core.threads);
> -
>  	if (evsel__alloc_stat_priv(evsel) < 0 ||
> -	    evsel__alloc_counts(evsel, ncpus, nthreads) < 0 ||
> -	    (alloc_raw && evsel__alloc_prev_raw_counts(evsel, ncpus, nthreads) < 0))
> +	    evsel__alloc_counts(evsel) < 0 ||
> +	    (alloc_raw && evsel__alloc_prev_raw_counts(evsel) < 0))
>  		return -ENOMEM;
>  
>  	return 0;
> -- 
> 2.34.1.448.ga2b2bfdf31-goog
> 
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 48/48] perf cpumap: Give CPUs their own type.
  2022-01-09 18:30   ` Jiri Olsa
@ 2022-01-10  5:45     ` Ian Rogers
  0 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-10  5:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Namhyung Kim, John Garry, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Sun, Jan 9, 2022 at 10:30 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Jan 04, 2022 at 10:13:51PM -0800, Ian Rogers wrote:
> > A common problem is confusing CPU map indices with the CPU, by wrapping
> > the CPU with a struct then this is avoided. This approach is similar to
> > atomic_t.
> >
> > Suggested-by: John Garry <john.garry@huawei.com>
> > Signed-off-by: Ian Rogers <irogers@google.com>
>
> SNIP
>
> >  tools/perf/util/stat.h                        |   2 +-
> >  tools/perf/util/svghelper.c                   |   6 +-
> >  tools/perf/util/synthetic-events.c            |  12 +-
> >  tools/perf/util/synthetic-events.h            |   3 +-
> >  tools/perf/util/util.h                        |   5 +-
> >  59 files changed, 408 insertions(+), 347 deletions(-)
>
> that's massive ;-) did it find any mis-use of the index/value?
It did. The fixes precede this patch so that we can have the fixes
without the large rename.
> how about the same for threads?
Agreed, but the patch set was already big enough.
> > diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> > index 71a31ed738c9..581f9ffb4237 100644
> > --- a/tools/lib/perf/include/internal/cpumap.h
> > +++ b/tools/lib/perf/include/internal/cpumap.h
> > @@ -4,6 +4,11 @@
> >
> >  #include <linux/refcount.h>
> >
> > +/** A wrapper around a CPU to avoid confusion with the perf_cpu_map's map's indices. */
> > +struct perf_cpu {
> > +     int cpu;
> > +};
>
> should we use 'int val' or 'int v' instead, so we don't have cpu.cpu ?
I can rename it if you have a preference.
Thanks,
Ian
> jirka
>
> SNIP
>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 26/48] perf evsel: Derive CPUs and threads in alloc_counts
  2022-01-09 18:30   ` Jiri Olsa
@ 2022-01-10  5:49     ` Ian Rogers
  0 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-10  5:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Namhyung Kim, John Garry, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Sun, Jan 9, 2022 at 10:30 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Jan 04, 2022 at 10:13:29PM -0800, Ian Rogers wrote:
>
> SNIP
>
> > diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
> > index 7ff36bf6d644..3e275e9c60d1 100644
> > --- a/tools/perf/util/counts.h
> > +++ b/tools/perf/util/counts.h
> > @@ -40,7 +40,7 @@ void perf_counts__delete(struct perf_counts *counts);
> >  void perf_counts__reset(struct perf_counts *counts);
> >
> >  void evsel__reset_counts(struct evsel *evsel);
> > -int evsel__alloc_counts(struct evsel *evsel, int ncpus, int nthreads);
> > +int evsel__alloc_counts(struct evsel *evsel);
> >  void evsel__free_counts(struct evsel *evsel);
> >
> >  #endif /* __PERF_COUNTS_H */
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 656c30b988ce..6c9af21776e6 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -1578,7 +1578,7 @@ int __evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread, bool scale)
> >       if (FD(evsel, cpu, thread) < 0)
> >               return -EINVAL;
> >
> > -     if (evsel->counts == NULL && evsel__alloc_counts(evsel, cpu + 1, thread + 1) < 0)
> > +     if (evsel->counts == NULL && evsel__alloc_counts(evsel) < 0)
>
> ugh, nice.. good that it's used only from test code ;-)
>
> btw I noticed evsel__read_on_cpu_scaled is not used at all
I'd noticed that too, but wanted to stay away from making the patch
set larger.  You end up removing the argument to
__evsel__read_on_cpu, etc. Definitely worth cleaning up, but maybe we
can do it later.
Thanks,
Ian
> jirka
>
> >               return -ENOMEM;
> >
> >       if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) <= 0)
> > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> > index c69b221f5e3e..995cb5003133 100644
> > --- a/tools/perf/util/stat.c
> > +++ b/tools/perf/util/stat.c
> > @@ -152,11 +152,13 @@ static void evsel__free_stat_priv(struct evsel *evsel)
> >       zfree(&evsel->stats);
> >  }
> >
> > -static int evsel__alloc_prev_raw_counts(struct evsel *evsel, int ncpus, int nthreads)
> > +static int evsel__alloc_prev_raw_counts(struct evsel *evsel)
> >  {
> > +     int cpu_map_nr = evsel__nr_cpus(evsel);
> > +     int nthreads = perf_thread_map__nr(evsel->core.threads);
> >       struct perf_counts *counts;
> >
> > -     counts = perf_counts__new(ncpus, nthreads);
> > +     counts = perf_counts__new(cpu_map_nr, nthreads);
> >       if (counts)
> >               evsel->prev_raw_counts = counts;
> >
> > @@ -177,12 +179,9 @@ static void evsel__reset_prev_raw_counts(struct evsel *evsel)
> >
> >  static int evsel__alloc_stats(struct evsel *evsel, bool alloc_raw)
> >  {
> > -     int ncpus = evsel__nr_cpus(evsel);
> > -     int nthreads = perf_thread_map__nr(evsel->core.threads);
> > -
> >       if (evsel__alloc_stat_priv(evsel) < 0 ||
> > -         evsel__alloc_counts(evsel, ncpus, nthreads) < 0 ||
> > -         (alloc_raw && evsel__alloc_prev_raw_counts(evsel, ncpus, nthreads) < 0))
> > +         evsel__alloc_counts(evsel) < 0 ||
> > +         (alloc_raw && evsel__alloc_prev_raw_counts(evsel) < 0))
> >               return -ENOMEM;
> >
> >       return 0;
> > --
> > 2.34.1.448.ga2b2bfdf31-goog
> >
>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 01/48] libperf: Add comments to perf_cpu_map.
  2022-01-05  6:13 ` [PATCH v4 01/48] libperf: Add comments to perf_cpu_map Ian Rogers
@ 2022-01-10 17:00   ` John Garry
  2022-01-12  5:35   ` kajoljain
  1 sibling, 0 replies; 84+ messages in thread
From: John Garry @ 2022-01-10 17:00 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian
On 05/01/2022 06:13, Ian Rogers wrote:
> A particular observed problem is confusing the index with the CPU value,
> documentation should hopefully reduce this type of problem.
> 
> Reviewed-by: James Clark<james.clark@arm.com>
> Signed-off-by: Ian Rogers<irogers@google.com>
Reviewed-by: John Garry <john.garry@huawei.com>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-05  6:13 ` [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu Ian Rogers
@ 2022-01-10 17:10   ` John Garry
  2022-01-10 17:36     ` Ian Rogers
  2022-01-10 19:17   ` John Garry
  2022-01-11 19:33   ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 84+ messages in thread
From: John Garry @ 2022-01-10 17:10 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian
On 05/01/2022 06:13, Ian Rogers wrote:
>   
> +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> +				       void *data)
> +{
> +	if (idx < 0 || idx > map->nr)
> +		return cpu_map__empty_aggr_cpu_id();
> +
> +	return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> +}
> +
This is later deleted in the series. Can the series be reworked so that 
we don't add stuff and then later delete it? One reason for that 
approach is that we don't spend time reviewing something which will be 
deleted, especially in such a big series...
If it really makes sense to do it this way then fine.
Thanks,
John
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-10 17:10   ` John Garry
@ 2022-01-10 17:36     ` Ian Rogers
  2022-01-10 17:51       ` John Garry
  2022-01-10 18:52       ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-10 17:36 UTC (permalink / raw)
  To: John Garry
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Mon, Jan 10, 2022 at 9:10 AM John Garry <john.garry@huawei.com> wrote:
>
> On 05/01/2022 06:13, Ian Rogers wrote:
> >
> > +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > +                                    void *data)
> > +{
> > +     if (idx < 0 || idx > map->nr)
> > +             return cpu_map__empty_aggr_cpu_id();
> > +
> > +     return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> > +}
> > +
>
>
> This is later deleted in the series. Can the series be reworked so that
> we don't add stuff and then later delete it? One reason for that
> approach is that we don't spend time reviewing something which will be
> deleted, especially in such a big series...
Hi John,
I think you are asking to squash:
https://lore.kernel.org/lkml/20220105061351.120843-8-irogers@google.com/
into this change. There are other  similar related changes that may
also be squashed. The changes are trying to introduce a new API and
then add changes to switch over to using it. This is with a view to
making bisection easier, have each change only do 1 thing and so on. I
believe the format of the patches is house style, but it is fine to
squash changes together too. Having sent patches to Arnaldo and having
had them split I'm reluctant to do a v5 with them squashed without him
expressing a preference.
Thanks,
Ian
> If it really makes sense to do it this way then fine.
>
> Thanks,
> John
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-10 17:36     ` Ian Rogers
@ 2022-01-10 17:51       ` John Garry
  2022-01-10 18:52       ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 84+ messages in thread
From: John Garry @ 2022-01-10 17:51 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On 10/01/2022 17:36, Ian Rogers wrote:
> I think you are asking to squash:
> https://lore.kernel.org/lkml/20220105061351.120843-8-irogers@google.com/
> into this change. 
That's the general idea.
 > There are other similar related changes that may
 > also be squashed. The changes are trying to introduce a new API and
 > then add changes to switch over to using it. This is with a view to
 > making bisection easier, have each change only do 1 thing and so on. I
 > believe the format of the patches is house style, but it is fine to
 > squash changes together too. Having sent patches to Arnaldo and having
 > had them split I'm reluctant to do a v5 with them squashed without him
 > expressing a preference.
 >
I don't feel so strongly. But I do think that ability to review should 
take preference to providing simple bisections.
Anyway, I'll stop talking and actually have a look.
Cheers,
John
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 03/48] perf stat: Correct aggregation CPU map
  2022-01-05  6:13 ` [PATCH v4 03/48] perf stat: Correct aggregation CPU map Ian Rogers
@ 2022-01-10 18:43   ` John Garry
  0 siblings, 0 replies; 84+ messages in thread
From: John Garry @ 2022-01-10 18:43 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian
On 05/01/2022 06:13, Ian Rogers wrote:
> Switch the perf_cpu_map in aggr_update_shadow from
> the evlist to the counter's cpu map, so the index is appropriate. This
> addresses a problem where uncore counts, with a cpumap like:
> $ cat /sys/devices/uncore_imc_0/cpumask
> 0,18
> Don't aggregate counts in CPUs based on the index of those values in the
> cpumap (0 and 1) but on the actual CPU (0 and 18). Thereby correcting
> metric calculations in per-socket mode for counters without a full
> cpumask.
> 
> On a SkylakeX with a tweaked DRAM_BW_Use metric, to remove unnecessary
> scaling, this gives:
> 
> Before:
> $ /perf stat --per-socket -M DRAM_BW_Use -I 1000
>       1.001102293 S0        1              27.01 MiB  uncore_imc/cas_count_write/ #   103.00 DRAM_BW_Use
>       1.001102293 S0        1              30.22 MiB  uncore_imc/cas_count_read/
>       1.001102293 S0        1      1,001,102,293 ns   duration_time
>       1.001102293 S1        1              20.10 MiB  uncore_imc/cas_count_write/ #     0.00 DRAM_BW_Use
>       1.001102293 S1        1              32.74 MiB  uncore_imc/cas_count_read/
>       1.001102293 S1        0      <not counted> ns   duration_time
>       2.003517973 S0        1              83.04 MiB  uncore_imc/cas_count_write/ #   920.00 DRAM_BW_Use
>       2.003517973 S0        1             145.95 MiB  uncore_imc/cas_count_read/
>       2.003517973 S0        1      1,002,415,680 ns   duration_time
>       2.003517973 S1        1             302.45 MiB  uncore_imc/cas_count_write/ #     0.00 DRAM_BW_Use
>       2.003517973 S1        1             290.99 MiB  uncore_imc/cas_count_read/
>       2.003517973 S1        0      <not counted> ns   duration_time
> 
> After:
> $ perf stat --per-socket -M DRAM_BW_Use -I 1000
>       1.001080840 S0        1              24.96 MiB  uncore_imc/cas_count_write/ #    54.00 DRAM_BW_Use
>       1.001080840 S0        1              33.64 MiB  uncore_imc/cas_count_read/
>       1.001080840 S0        1      1,001,080,840 ns   duration_time
>       1.001080840 S1        1              42.43 MiB  uncore_imc/cas_count_write/ #    84.00 DRAM_BW_Use
>       1.001080840 S1        1              47.05 MiB  uncore_imc/cas_count_read/
>       1.001080840 S1        0      <not counted> ns   duration_time
> 
> Signed-off-by: Ian Rogers<irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 04/48] perf stat: Switch aggregation to use for_each loop
  2022-01-05  6:13 ` [PATCH v4 04/48] perf stat: Switch aggregation to use for_each loop Ian Rogers
@ 2022-01-10 18:50   ` John Garry
  2022-01-10 22:22     ` Ian Rogers
  0 siblings, 1 reply; 84+ messages in thread
From: John Garry @ 2022-01-10 18:50 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian
On 05/01/2022 06:13, Ian Rogers wrote:
> +	cpus = evsel__cpus(evsel);
> +	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
It seems a common pattern to evaluate cpus and use in 
perf_cpu_map__for_each_cpu() - is it possible to make a macro to accept 
evsel and create cpus, like perf_evsel__for_each_cpu()?
> +		if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
> +						 id))
> +			return cpu;
>   	}
>   	return 0;
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-10 17:36     ` Ian Rogers
  2022-01-10 17:51       ` John Garry
@ 2022-01-10 18:52       ` Arnaldo Carvalho de Melo
  2022-01-10 18:53         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 18:52 UTC (permalink / raw)
  To: Ian Rogers
  Cc: John Garry, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 09:36:49AM -0800, Ian Rogers escreveu:
> On Mon, Jan 10, 2022 at 9:10 AM John Garry <john.garry@huawei.com> wrote:
> >
> > On 05/01/2022 06:13, Ian Rogers wrote:
> > >
> > > +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > > +                                    void *data)
> > > +{
> > > +     if (idx < 0 || idx > map->nr)
> > > +             return cpu_map__empty_aggr_cpu_id();
> > > +
> > > +     return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> > > +}
> > > +
> >
> >
> > This is later deleted in the series. Can the series be reworked so that
> > we don't add stuff and then later delete it? One reason for that
> > approach is that we don't spend time reviewing something which will be
> > deleted, especially in such a big series...
> 
> Hi John,
> 
> I think you are asking to squash:
> https://lore.kernel.org/lkml/20220105061351.120843-8-irogers@google.com/
> into this change. There are other  similar related changes that may
> also be squashed. The changes are trying to introduce a new API and
> then add changes to switch over to using it. This is with a view to
> making bisection easier, have each change only do 1 thing and so on. I
> believe the format of the patches is house style, but it is fine to
> squash changes together too. Having sent patches to Arnaldo and having
> had them split I'm reluctant to do a v5 with them squashed without him
> expressing a preference.
Right, sometimes this is needed, I'm getting the patchkit now to test
build it in my containers and will go patch by patch reviewing.
- Arnaldo
 
> Thanks,
> Ian
> 
> > If it really makes sense to do it this way then fine.
> >
> > Thanks,
> > John
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-10 18:52       ` Arnaldo Carvalho de Melo
@ 2022-01-10 18:53         ` Arnaldo Carvalho de Melo
  2022-01-10 18:56           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 18:53 UTC (permalink / raw)
  To: Ian Rogers
  Cc: John Garry, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 03:52:11PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 10, 2022 at 09:36:49AM -0800, Ian Rogers escreveu:
> > On Mon, Jan 10, 2022 at 9:10 AM John Garry <john.garry@huawei.com> wrote:
> > >
> > > On 05/01/2022 06:13, Ian Rogers wrote:
> > > >
> > > > +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > > > +                                    void *data)
> > > > +{
> > > > +     if (idx < 0 || idx > map->nr)
> > > > +             return cpu_map__empty_aggr_cpu_id();
> > > > +
> > > > +     return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> > > > +}
> > > > +
> > >
> > >
> > > This is later deleted in the series. Can the series be reworked so that
> > > we don't add stuff and then later delete it? One reason for that
> > > approach is that we don't spend time reviewing something which will be
> > > deleted, especially in such a big series...
> > 
> > Hi John,
> > 
> > I think you are asking to squash:
> > https://lore.kernel.org/lkml/20220105061351.120843-8-irogers@google.com/
> > into this change. There are other  similar related changes that may
> > also be squashed. The changes are trying to introduce a new API and
> > then add changes to switch over to using it. This is with a view to
> > making bisection easier, have each change only do 1 thing and so on. I
> > believe the format of the patches is house style, but it is fine to
> > squash changes together too. Having sent patches to Arnaldo and having
> > had them split I'm reluctant to do a v5 with them squashed without him
> > expressing a preference.
> 
> Right, sometimes this is needed, I'm getting the patchkit now to test
> build it in my containers and will go patch by patch reviewing.
Good start:
⬢[acme@toolbox perf]$ b4 am -ctsl --cc-trailers CAP-5=fWT_19OfZTTjvLUcChV4nDwqc5Zq4VE93Gak6OO4NORsA@mail.gmail.com
Looking up https://lore.kernel.org/r/CAP-5%3DfWT_19OfZTTjvLUcChV4nDwqc5Zq4VE93Gak6OO4NORsA%40mail.gmail.com
Grabbing thread from lore.kernel.org/all/CAP-5%3DfWT_19OfZTTjvLUcChV4nDwqc5Zq4VE93Gak6OO4NORsA%40mail.gmail.com/t.mbox.gz
Checking for newer revisions on https://lore.kernel.org/all/
Analyzing 58 messages in the thread
Checking attestation on all messages, may take a moment...
---
  ✓ [PATCH v4 1/48] libperf: Add comments to perf_cpu_map.
    + Reviewed-by: John Garry <john.garry@huawei.com>
    + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    + Link: https://lore.kernel.org/r/20220105061351.120843-2-irogers@google.com
<BIG SNIP>
Cover: ./v4_20220104_irogers_refactor_perf_cpumap.cover
 Link: https://lore.kernel.org/r/20220105061351.120843-1-irogers@google.com
 Base: applies clean to current tree
       git checkout -b v4_20220104_irogers_google_com HEAD
       git am ./v4_20220104_irogers_refactor_perf_cpumap.mbx
:-)
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-10 18:53         ` Arnaldo Carvalho de Melo
@ 2022-01-10 18:56           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 18:56 UTC (permalink / raw)
  To: Ian Rogers
  Cc: John Garry, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 03:53:45PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 10, 2022 at 03:52:11PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Mon, Jan 10, 2022 at 09:36:49AM -0800, Ian Rogers escreveu:
> > > On Mon, Jan 10, 2022 at 9:10 AM John Garry <john.garry@huawei.com> wrote:
> > > >
> > > > On 05/01/2022 06:13, Ian Rogers wrote:
> > > > >
> > > > > +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > > > > +                                    void *data)
> > > > > +{
> > > > > +     if (idx < 0 || idx > map->nr)
> > > > > +             return cpu_map__empty_aggr_cpu_id();
> > > > > +
> > > > > +     return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> > > > > +}
> > > > > +
> > > >
> > > >
> > > > This is later deleted in the series. Can the series be reworked so that
> > > > we don't add stuff and then later delete it? One reason for that
> > > > approach is that we don't spend time reviewing something which will be
> > > > deleted, especially in such a big series...
> > > 
> > > Hi John,
> > > 
> > > I think you are asking to squash:
> > > https://lore.kernel.org/lkml/20220105061351.120843-8-irogers@google.com/
> > > into this change. There are other  similar related changes that may
> > > also be squashed. The changes are trying to introduce a new API and
> > > then add changes to switch over to using it. This is with a view to
> > > making bisection easier, have each change only do 1 thing and so on. I
> > > believe the format of the patches is house style, but it is fine to
> > > squash changes together too. Having sent patches to Arnaldo and having
> > > had them split I'm reluctant to do a v5 with them squashed without him
> > > expressing a preference.
> > 
> > Right, sometimes this is needed, I'm getting the patchkit now to test
> > build it in my containers and will go patch by patch reviewing.
> 
> Good start:
> 
> ⬢[acme@toolbox perf]$ b4 am -ctsl --cc-trailers CAP-5=fWT_19OfZTTjvLUcChV4nDwqc5Zq4VE93Gak6OO4NORsA@mail.gmail.com
> Looking up https://lore.kernel.org/r/CAP-5%3DfWT_19OfZTTjvLUcChV4nDwqc5Zq4VE93Gak6OO4NORsA%40mail.gmail.com
> Grabbing thread from lore.kernel.org/all/CAP-5%3DfWT_19OfZTTjvLUcChV4nDwqc5Zq4VE93Gak6OO4NORsA%40mail.gmail.com/t.mbox.gz
> Checking for newer revisions on https://lore.kernel.org/all/
> Analyzing 58 messages in the thread
> Checking attestation on all messages, may take a moment...
> ---
>   ✓ [PATCH v4 1/48] libperf: Add comments to perf_cpu_map.
>     + Reviewed-by: John Garry <john.garry@huawei.com>
>     + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>     + Link: https://lore.kernel.org/r/20220105061351.120843-2-irogers@google.com
> <BIG SNIP>
> Cover: ./v4_20220104_irogers_refactor_perf_cpumap.cover
>  Link: https://lore.kernel.org/r/20220105061351.120843-1-irogers@google.com
>  Base: applies clean to current tree
>        git checkout -b v4_20220104_irogers_google_com HEAD
>        git am ./v4_20220104_irogers_refactor_perf_cpumap.mbx
> 
oops:
  CC      /tmp/build/perf/util/unwind-libunwind-local.o
util/bpf_counter_cgroup.c: In function ‘bperf_load_program’:
util/bpf_counter_cgroup.c:51:26: error: incompatible types when initializing type ‘int’ using type ‘struct perf_cpu’
   51 |         int total_cpus = cpu__max_cpu();
      |                          ^~~~~~~~~~~~
util/bpf_counter_cgroup.c:127:74: error: invalid operands to binary + (have ‘int’ and ‘struct perf_cpu’)
  127 |                                 __u32 idx = evsel->core.idx * total_cpus +
      |                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      |                                                             |
      |                                                             int
  128 |                                         evlist->core.all_cpus->map[cpu];
      |                                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                                                   |
      |                                                                   struct perf_cpu
util/bpf_counter_cgroup.c: In function ‘bperf_cgrp__sync_counters’:
util/bpf_counter_cgroup.c:215:23: error: incompatible types when assigning to type ‘int’ from type ‘struct perf_cpu’
  215 |                 cpu = evlist->core.all_cpus->map[i];
      |                       ^~~~~~
util/bpf_counter_cgroup.c: In function ‘bperf_cgrp__read’:
util/bpf_counter_cgroup.c:248:26: error: incompatible types when initializing type ‘int’ using type ‘struct perf_cpu’
  248 |         int total_cpus = cpu__max_cpu();
      |                          ^~~~~~~~~~~~
util/bpf_counter_cgroup.c:275:31: error: incompatible types when assigning to type ‘int’ from type ‘struct perf_cpu’
  275 |                         cpu = evlist->core.all_cpus->map[i];
      |                               ^~~~~~
make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:97: /tmp/build/perf/util/bpf_counter_cgroup.o] Error 1
make[4]: *** Waiting for unfinished jobs....
util/bpf_counter.c: In function ‘bperf__load’:
util/bpf_counter.c:543:31: error: incompatible types when assigning to type ‘__u32’ {aka ‘unsigned int’} from type ‘struct perf_cpu’
  543 |                         key = evsel->core.cpus->map[i];
      |                               ^~~~~
util/bpf_counter.c: In function ‘bperf_sync_counters’:
util/bpf_counter.c:587:23: error: incompatible types when assigning to type ‘int’ from type ‘struct perf_cpu’
  587 |                 cpu = all_cpu_map->map[i];
      |                       ^~~~~~~~~~~
util/bpf_counter.c: In function ‘bperf__read’:
util/bpf_counter.c:608:29: error: incompatible types when initializing type ‘__u32’ {aka ‘unsigned int’} using type ‘struct perf_cpu’
  608 |         __u32 num_cpu_bpf = cpu__max_cpu();
      |                             ^~~~~~~~~~~~
In file included from /var/home/acme/git/perf/tools/perf/util/cpumap.h:8,
                 from /var/home/acme/git/perf/tools/perf/util/env.h:7,
                 from util/cgroup.h:8,
                 from util/bpf_counter.c:22:
/var/home/acme/git/perf/tools/lib/perf/include/perf/cpumap.h:25:33: error: incompatible types when assigning to type ‘__u32’ {aka ‘unsigned int’} from type ‘struct perf_cpu’
   25 |         for ((idx) = 0, (cpu) = perf_cpu_map__cpu(cpus, idx);   \
      |                                 ^~~~~~~~~~~~~~~~~
util/bpf_counter.c:626:25: note: in expansion of macro ‘perf_cpu_map__for_each_cpu’
  626 |                         perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~
/var/home/acme/git/perf/tools/lib/perf/include/perf/cpumap.h:26:20: error: comparison of integer expressions of different signedness: ‘__u32’ {aka ‘unsigned int’} and ‘int’ [-Werror=sign-compare]
   26 |              (idx) < perf_cpu_map__nr(cpus);                    \
      |                    ^
util/bpf_counter.c:626:25: note: in expansion of macro ‘perf_cpu_map__for_each_cpu’
  626 |                         perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~
/var/home/acme/git/perf/tools/lib/perf/include/perf/cpumap.h:27:31: error: incompatible types when assigning to type ‘__u32’ {aka ‘unsigned int’} from type ‘struct perf_cpu’
   27 |              (idx)++, (cpu) = perf_cpu_map__cpu(cpus, idx))
      |                               ^~~~~~~~~~~~~~~~~
util/bpf_counter.c:626:25: note: in expansion of macro ‘perf_cpu_map__for_each_cpu’
  626 |                         perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~
util/bpf_counter.c:633:31: error: incompatible types when assigning to type ‘__u32’ {aka ‘unsigned int’} from type ‘struct perf_cpu’
  633 |                         cpu = evsel->core.cpus->map[i];
      |                               ^~~~~
util/bpf_counter.c:611:21: error: unused variable ‘num_cpu’ [-Werror=unused-variable]
  611 |         __u32 i, j, num_cpu;
      |                     ^~~~~~~
cc1: all warnings being treated as errors
make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/bpf_counter.o] Error 1
util/bpf_ftrace.c: In function ‘perf_ftrace__latency_prepare_bpf’:
util/bpf_ftrace.c:66:31: error: implicit declaration of function ‘cpu_map__cpu’; did you mean ‘cpu__max_cpu’? [-Werror=implicit-function-declaration]
   66 |                         cpu = cpu_map__cpu(ftrace->evlist->core.cpus, i);
      |                               ^~~~~~~~~~~~
      |                               cpu__max_cpu
util/bpf_ftrace.c: In function ‘perf_ftrace__latency_read_bpf’:
util/bpf_ftrace.c:125:21: error: incompatible types when initializing type ‘int’ using type ‘struct perf_cpu’
  125 |         int ncpus = cpu__max_cpu();
      |                     ^~~~~~~~~~~~
cc1: all warnings being treated as errors
make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/bpf_ftrace.o] Error 1
  LD      /tmp/build/perf/util/intel-pt-decoder/perf-in.o
  LD      /tmp/build/perf/util/scripting-engines/perf-in.o
make[3]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
make[2]: *** [Makefile.perf:665: /tmp/build/perf/perf-in.o] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:240: sub-make] Error 2
make: *** [Makefile:113: install-bin] Error 2
make: Leaving directory '/var/home/acme/git/perf/tools/perf'
 Performance counter stats for 'make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin':
   279,675,447,200      cycles:u
   348,898,435,627      instructions:u            #    1.25  insn per cycle
       4.978587957 seconds time elapsed
      68.698979000 seconds user
      12.541229000 seconds sys
⬢[acme@toolbox perf]$
This is building with:
$ make BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin
In general for such refactorings its important to run:
make -C tools/perf build-test
But there is no entry there with BUILD_BPF_SKEL=1, will fix that now and
try to fix this in the patchkit as well.
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function.
  2022-01-05  6:13 ` [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function Ian Rogers
@ 2022-01-10 19:02   ` Arnaldo Carvalho de Melo
  2022-01-10 19:04     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 19:02 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 04, 2022 at 10:13:20PM -0800, Ian Rogers escreveu:
> Switch the remaining few users of cpu_map__cpu to perf_cpu_map__cpu and
> remove the function.
> 
> Reviewed-by: James Clark <james.clark@arm.com>
> Signed-off-by: Ian Rogers <irogers@google.com>
util/bpf_ftrace.c: In function ‘perf_ftrace__latency_prepare_bpf’:
util/bpf_ftrace.c:66:31: error: implicit declaration of function ‘cpu_map__cpu’; did you mean ‘cpu_map__has’? [-Werror=implicit-function-declaration]
   66 |                         cpu = cpu_map__cpu(ftrace->evlist->core.cpus, i);
      |                               ^~~~~~~~~~~~
      |                               cpu_map__has
cc1: all warnings being treated as errors
make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/bpf_ftrace.o] Error 1
make[4]: *** Waiting for unfinished jobs....
  LD      /tmp/build/perf/ui/browsers/perf-in.o
  LD      /tmp/build/perf/ui/perf-in.o
  LD      /tmp/build/perf/util/scripting-engines/perf-in.o
make[3]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
make[2]: *** [Makefile.perf:665: /tmp/build/perf/perf-in.o] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:240: sub-make] Error 2
make: *** [Makefile:113: install-bin] Error 2
make: Leaving directory '/var/home/acme/git/perf/tools/perf'
 Performance counter stats for 'make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin':
   163,635,422,196      cycles:u                                                    
   213,602,262,800      instructions:u            #    1.31  insn per cycle         
       3.804738555 seconds time elapsed
      40.622581000 seconds user
       6.589980000 seconds sys
⬢[acme@toolbox perf]$ git log --oneline -1
dfe103e757aeff79 (HEAD) perf cpumap: Remove cpu_map__cpu, use libperf function.
This is the first breakage, trying to fix now.
> ---
>  tools/perf/builtin-ftrace.c | 2 +-
>  tools/perf/util/cpumap.c    | 9 ++-------
>  tools/perf/util/cpumap.h    | 1 -
>  3 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
> index 2b54e2ddc80a..f16c39a37a52 100644
> --- a/tools/perf/builtin-ftrace.c
> +++ b/tools/perf/builtin-ftrace.c
> @@ -281,7 +281,7 @@ static int set_tracing_cpumask(struct perf_cpu_map *cpumap)
>  	int ret;
>  	int last_cpu;
>  
> -	last_cpu = cpu_map__cpu(cpumap, cpumap->nr - 1);
> +	last_cpu = perf_cpu_map__cpu(cpumap, cpumap->nr - 1);
>  	mask_size = last_cpu / 4 + 2; /* one more byte for EOS */
>  	mask_size += last_cpu / 32; /* ',' is needed for every 32th cpus */
>  
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index e0d7f1da5858..32f9fc2dd389 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -485,11 +485,6 @@ bool cpu_map__has(struct perf_cpu_map *cpus, int cpu)
>  	return perf_cpu_map__idx(cpus, cpu) != -1;
>  }
>  
> -int cpu_map__cpu(struct perf_cpu_map *cpus, int idx)
> -{
> -	return cpus->map[idx];
> -}
> -
>  size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
>  {
>  	int i, cpu, start = -1;
> @@ -547,7 +542,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>  	int i, cpu;
>  	char *ptr = buf;
>  	unsigned char *bitmap;
> -	int last_cpu = cpu_map__cpu(map, map->nr - 1);
> +	int last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
>  
>  	if (buf == NULL)
>  		return 0;
> @@ -559,7 +554,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>  	}
>  
>  	for (i = 0; i < map->nr; i++) {
> -		cpu = cpu_map__cpu(map, i);
> +		cpu = perf_cpu_map__cpu(map, i);
>  		bitmap[cpu / 8] |= 1 << (cpu % 8);
>  	}
>  
> diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> index a053bf31a3f0..87545bcd461d 100644
> --- a/tools/perf/util/cpumap.h
> +++ b/tools/perf/util/cpumap.h
> @@ -80,7 +80,6 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
>  		       struct aggr_cpu_id (*f)(int cpu, void *data),
>  		       void *data);
>  
> -int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
>  bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
>  
>  bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
> -- 
> 2.34.1.448.ga2b2bfdf31-goog
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function.
  2022-01-10 19:02   ` Arnaldo Carvalho de Melo
@ 2022-01-10 19:04     ` Arnaldo Carvalho de Melo
  2022-01-10 19:06       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 19:04 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 04:02:49PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Jan 04, 2022 at 10:13:20PM -0800, Ian Rogers escreveu:
> > Switch the remaining few users of cpu_map__cpu to perf_cpu_map__cpu and
> > remove the function.
> > 
> > Reviewed-by: James Clark <james.clark@arm.com>
> > Signed-off-by: Ian Rogers <irogers@google.com>
> 
> util/bpf_ftrace.c: In function ‘perf_ftrace__latency_prepare_bpf’:
> util/bpf_ftrace.c:66:31: error: implicit declaration of function ‘cpu_map__cpu’; did you mean ‘cpu_map__has’? [-Werror=implicit-function-declaration]
>    66 |                         cpu = cpu_map__cpu(ftrace->evlist->core.cpus, i);
>       |                               ^~~~~~~~~~~~
>       |                               cpu_map__has
> cc1: all warnings being treated as errors
> make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/bpf_ftrace.o] Error 1
> make[4]: *** Waiting for unfinished jobs....
>   LD      /tmp/build/perf/ui/browsers/perf-in.o
>   LD      /tmp/build/perf/ui/perf-in.o
>   LD      /tmp/build/perf/util/scripting-engines/perf-in.o
> make[3]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
> make[2]: *** [Makefile.perf:665: /tmp/build/perf/perf-in.o] Error 2
> make[2]: *** Waiting for unfinished jobs....
> make[1]: *** [Makefile.perf:240: sub-make] Error 2
> make: *** [Makefile:113: install-bin] Error 2
> make: Leaving directory '/var/home/acme/git/perf/tools/perf'
> 
>  Performance counter stats for 'make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin':
> 
>    163,635,422,196      cycles:u                                                    
>    213,602,262,800      instructions:u            #    1.31  insn per cycle         
> 
>        3.804738555 seconds time elapsed
> 
>       40.622581000 seconds user
>        6.589980000 seconds sys
> 
> 
> ⬢[acme@toolbox perf]$ git log --oneline -1
> dfe103e757aeff79 (HEAD) perf cpumap: Remove cpu_map__cpu, use libperf function.
> 
> This is the first breakage, trying to fix now.
⬢[acme@toolbox perf]$ find tools/perf/ -name "*.c" | xargs grep -w cpu_map__cpu
tools/perf/util/bpf_ftrace.c:			cpu = cpu_map__cpu(ftrace->evlist->core.cpus, i);
⬢[acme@toolbox perf]$
Only this one remaining, so just moving to perf_cpu_map__cpu() like done
elsewhere in this patch.
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get
  2022-01-05  6:13 ` [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get Ian Rogers
@ 2022-01-10 19:04   ` John Garry
  2022-01-11 18:46     ` Ian Rogers
  0 siblings, 1 reply; 84+ messages in thread
From: John Garry @ 2022-01-10 19:04 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian
On 05/01/2022 06:13, Ian Rogers wrote:
> Avoid bugs where the wrong index is passed with the cpu_map.
nit: Don't you mean possible bugs?
> 
> Reviewed-by: James Clark <james.clark@arm.com>
> Signed-off-by: Ian Rogers <irogers@google.com>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function.
  2022-01-10 19:04     ` Arnaldo Carvalho de Melo
@ 2022-01-10 19:06       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 19:06 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 04:04:08PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 10, 2022 at 04:02:49PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Tue, Jan 04, 2022 at 10:13:20PM -0800, Ian Rogers escreveu:
> > > Switch the remaining few users of cpu_map__cpu to perf_cpu_map__cpu and
> > > remove the function.
> > > 
> > > Reviewed-by: James Clark <james.clark@arm.com>
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > 
> > util/bpf_ftrace.c: In function ‘perf_ftrace__latency_prepare_bpf’:
> > util/bpf_ftrace.c:66:31: error: implicit declaration of function ‘cpu_map__cpu’; did you mean ‘cpu_map__has’? [-Werror=implicit-function-declaration]
> >    66 |                         cpu = cpu_map__cpu(ftrace->evlist->core.cpus, i);
> >       |                               ^~~~~~~~~~~~
> >       |                               cpu_map__has
> > cc1: all warnings being treated as errors
> > make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/bpf_ftrace.o] Error 1
> > make[4]: *** Waiting for unfinished jobs....
> >   LD      /tmp/build/perf/ui/browsers/perf-in.o
> >   LD      /tmp/build/perf/ui/perf-in.o
> >   LD      /tmp/build/perf/util/scripting-engines/perf-in.o
> > make[3]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
> > make[2]: *** [Makefile.perf:665: /tmp/build/perf/perf-in.o] Error 2
> > make[2]: *** Waiting for unfinished jobs....
> > make[1]: *** [Makefile.perf:240: sub-make] Error 2
> > make: *** [Makefile:113: install-bin] Error 2
> > make: Leaving directory '/var/home/acme/git/perf/tools/perf'
> > 
> >  Performance counter stats for 'make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin':
> > 
> >    163,635,422,196      cycles:u                                                    
> >    213,602,262,800      instructions:u            #    1.31  insn per cycle         
> > 
> >        3.804738555 seconds time elapsed
> > 
> >       40.622581000 seconds user
> >        6.589980000 seconds sys
> > 
> > 
> > ⬢[acme@toolbox perf]$ git log --oneline -1
> > dfe103e757aeff79 (HEAD) perf cpumap: Remove cpu_map__cpu, use libperf function.
> > 
> > This is the first breakage, trying to fix now.
> 
> ⬢[acme@toolbox perf]$ find tools/perf/ -name "*.c" | xargs grep -w cpu_map__cpu
> tools/perf/util/bpf_ftrace.c:			cpu = cpu_map__cpu(ftrace->evlist->core.cpus, i);
> ⬢[acme@toolbox perf]$
> 
> Only this one remaining, so just moving to perf_cpu_map__cpu() like done
> elsewhere in this patch.
> 
⬢[acme@toolbox perf]$ git diff
diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
index f00a2de6778ccb97..28dc4c60c7884818 100644
--- a/tools/perf/util/bpf_ftrace.c
+++ b/tools/perf/util/bpf_ftrace.c
@@ -63,7 +63,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
                fd = bpf_map__fd(skel->maps.cpu_filter);
                for (i = 0; i < ncpus; i++) {
-                       cpu = cpu_map__cpu(ftrace->evlist->core.cpus, i);
+                       cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i);
                        bpf_map_update_elem(fd, &cpu, &val, BPF_ANY);
                }
        }
⬢[acme@toolbox perf]$
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 44/48] perf bpf: Rename cpu to cpu_map_idx
  2022-01-05  6:13 ` [PATCH v4 44/48] perf bpf: Rename cpu to cpu_map_idx Ian Rogers
@ 2022-01-10 19:10   ` Arnaldo Carvalho de Melo
  2022-01-10 19:26     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 19:10 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 04, 2022 at 10:13:47PM -0800, Ian Rogers escreveu:
> Synchronize the caller in evsel with the called function.
> Shorten 3 lines of code in bperf_read by using
> perf_cpu_map__for_each_cpu.
> This code is frequently using variables named cpu as cpu map indices,
> which doesn't matter as all CPUs are in the CPU map. It is strange in
> some cases the cpumap is used at all.
  CC      /tmp/build/perf/builtin-stat.o
  INSTALL trace_plugins
  CC      /tmp/build/perf/util/evlist.o
  CC      /tmp/build/perf/util/evsel.o
  CC      /tmp/build/perf/util/header.o
  CC      /tmp/build/perf/util/bpf_counter.o
  CC      /tmp/build/perf/util/bpf_counter_cgroup.o
  CC      /tmp/build/perf/util/bpf_ftrace.o
In file included from util/cpumap.h:8,
                 from util/bpf_counter.c:23:
util/bpf_counter.c: In function ‘bperf__read’:
/var/home/acme/git/perf/tools/lib/perf/include/perf/cpumap.h:27:20: error: comparison of integer expressions of different signedness: ‘__u32’ {aka ‘unsigned int’} and ‘int’ [-Werror=sign-compare]
   27 |              (idx) < perf_cpu_map__nr(cpus);                    \
      |                    ^
util/bpf_counter.c:626:25: note: in expansion of macro ‘perf_cpu_map__for_each_cpu’
  626 |                         perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~
util/bpf_counter.c:611:21: error: unused variable ‘num_cpu’ [-Werror=unused-variable]
  611 |         __u32 i, j, num_cpu;
      |                     ^~~~~~~
cc1: all warnings being treated as errors
make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/bpf_counter.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
make[2]: *** [Makefile.perf:665: /tmp/build/perf/perf-in.o] Error 2
make[1]: *** [Makefile.perf:240: sub-make] Error 2
make: *** [Makefile:113: install-bin] Error 2
make: Leaving directory '/var/home/acme/git/perf/tools/perf'
 Performance counter stats for 'make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin':
    15,589,752,360      cycles:u
    28,227,862,710      instructions:u            #    1.81  insn per cycle
       1.996638375 seconds time elapsed
       3.567457000 seconds user
       0.934840000 seconds sys
⬢[acme@toolbox perf]$ git log --oneline -1
0083ba4c6a931fd0 (HEAD) perf bpf: Rename cpu to cpu_map_idx
⬢[acme@toolbox perf]$
Trying to fix...
 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/bpf_counter.c | 16 +++++++---------
>  tools/perf/util/bpf_counter.h |  4 ++--
>  2 files changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> index c17d4a43ce06..ae5bd87ff020 100644
> --- a/tools/perf/util/bpf_counter.c
> +++ b/tools/perf/util/bpf_counter.c
> @@ -265,7 +265,7 @@ static int bpf_program_profiler__read(struct evsel *evsel)
>  	return 0;
>  }
>  
> -static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu_map_idx,
>  					    int fd)
>  {
>  	struct bpf_prog_profiler_bpf *skel;
> @@ -277,7 +277,7 @@ static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
>  		assert(skel != NULL);
>  
>  		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
> -					  &cpu, &fd, BPF_ANY);
> +					  &cpu_map_idx, &fd, BPF_ANY);
>  		if (ret)
>  			return ret;
>  	}
> @@ -566,12 +566,12 @@ static int bperf__load(struct evsel *evsel, struct target *target)
>  	return err;
>  }
>  
> -static int bperf__install_pe(struct evsel *evsel, int cpu, int fd)
> +static int bperf__install_pe(struct evsel *evsel, int cpu_map_idx, int fd)
>  {
>  	struct bperf_leader_bpf *skel = evsel->leader_skel;
>  
>  	return bpf_map_update_elem(bpf_map__fd(skel->maps.events),
> -				   &cpu, &fd, BPF_ANY);
> +				   &cpu_map_idx, &fd, BPF_ANY);
>  }
>  
>  /*
> @@ -623,9 +623,7 @@ static int bperf__read(struct evsel *evsel)
>  		case BPERF_FILTER_GLOBAL:
>  			assert(i == 0);
>  
> -			num_cpu = all_cpu_map->nr;
> -			for (j = 0; j < num_cpu; j++) {
> -				cpu = all_cpu_map->map[j];
> +			perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
>  				perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
>  				perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
>  				perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
> @@ -757,11 +755,11 @@ static inline bool bpf_counter_skip(struct evsel *evsel)
>  		evsel->follower_skel == NULL;
>  }
>  
> -int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu_map_idx, int fd)
>  {
>  	if (bpf_counter_skip(evsel))
>  		return 0;
> -	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
> +	return evsel->bpf_counter_ops->install_pe(evsel, cpu_map_idx, fd);
>  }
>  
>  int bpf_counter__load(struct evsel *evsel, struct target *target)
> diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
> index 65ebaa6694fb..4dbf26408b69 100644
> --- a/tools/perf/util/bpf_counter.h
> +++ b/tools/perf/util/bpf_counter.h
> @@ -16,7 +16,7 @@ typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
>  typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
>  					   struct target *target);
>  typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
> -					       int cpu,
> +					       int cpu_map_idx,
>  					       int fd);
>  
>  struct bpf_counter_ops {
> @@ -40,7 +40,7 @@ int bpf_counter__enable(struct evsel *evsel);
>  int bpf_counter__disable(struct evsel *evsel);
>  int bpf_counter__read(struct evsel *evsel);
>  void bpf_counter__destroy(struct evsel *evsel);
> -int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu_map_idx, int fd);
>  
>  #else /* HAVE_BPF_SKEL */
>  
> -- 
> 2.34.1.448.ga2b2bfdf31-goog
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-05  6:13 ` [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu Ian Rogers
  2022-01-10 17:10   ` John Garry
@ 2022-01-10 19:17   ` John Garry
  2022-01-11 19:33   ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 84+ messages in thread
From: John Garry @ 2022-01-10 19:17 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian
On 05/01/2022 06:13, Ian Rogers wrote:
> +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> +				       void *data)
> +{
> +	if (idx < 0 || idx > map->nr)
Incidentally should this be >= map->nr?
> +		return cpu_map__empty_aggr_cpu_id();
> +
> +	return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> +}
> +
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 44/48] perf bpf: Rename cpu to cpu_map_idx
  2022-01-10 19:10   ` Arnaldo Carvalho de Melo
@ 2022-01-10 19:26     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 19:26 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 04:10:46PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Jan 04, 2022 at 10:13:47PM -0800, Ian Rogers escreveu:
> > Synchronize the caller in evsel with the called function.
> > Shorten 3 lines of code in bperf_read by using
> > perf_cpu_map__for_each_cpu.
> > This code is frequently using variables named cpu as cpu map indices,
> > which doesn't matter as all CPUs are in the CPU map. It is strange in
> > some cases the cpumap is used at all.
> 
>   CC      /tmp/build/perf/builtin-stat.o
>   INSTALL trace_plugins
>   CC      /tmp/build/perf/util/evlist.o
>   CC      /tmp/build/perf/util/evsel.o
>   CC      /tmp/build/perf/util/header.o
>   CC      /tmp/build/perf/util/bpf_counter.o
>   CC      /tmp/build/perf/util/bpf_counter_cgroup.o
>   CC      /tmp/build/perf/util/bpf_ftrace.o
> In file included from util/cpumap.h:8,
>                  from util/bpf_counter.c:23:
> util/bpf_counter.c: In function ‘bperf__read’:
> /var/home/acme/git/perf/tools/lib/perf/include/perf/cpumap.h:27:20: error: comparison of integer expressions of different signedness: ‘__u32’ {aka ‘unsigned int’} and ‘int’ [-Werror=sign-compare]
>    27 |              (idx) < perf_cpu_map__nr(cpus);                    \
>       |                    ^
> util/bpf_counter.c:626:25: note: in expansion of macro ‘perf_cpu_map__for_each_cpu’
>   626 |                         perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
>       |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~
> util/bpf_counter.c:611:21: error: unused variable ‘num_cpu’ [-Werror=unused-variable]
>   611 |         __u32 i, j, num_cpu;
>       |                     ^~~~~~~
> cc1: all warnings being treated as errors
> make[4]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/bpf_counter.o] Error 1
> make[4]: *** Waiting for unfinished jobs....
> make[3]: *** [/var/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
> make[2]: *** [Makefile.perf:665: /tmp/build/perf/perf-in.o] Error 2
> make[1]: *** [Makefile.perf:240: sub-make] Error 2
> make: *** [Makefile:113: install-bin] Error 2
> make: Leaving directory '/var/home/acme/git/perf/tools/perf'
> 
>  Performance counter stats for 'make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin':
> 
>     15,589,752,360      cycles:u
>     28,227,862,710      instructions:u            #    1.81  insn per cycle
> 
>        1.996638375 seconds time elapsed
> 
>        3.567457000 seconds user
>        0.934840000 seconds sys
> 
> 
> ⬢[acme@toolbox perf]$ git log --oneline -1
> 0083ba4c6a931fd0 (HEAD) perf bpf: Rename cpu to cpu_map_idx
> ⬢[acme@toolbox perf]$
> 
> Trying to fix...
Would be better to have the "Shorten 3 lines of code in bperf_read by
using perf_cpu_map__for_each_cpu." in a separate patch, try to do this
next time.
Below cures the two issues.
- Arnaldo
    Committer notes:
    Found when building with BUILD_BPF_SKEL=1:
    Remove unused 'num_cpu' variable in bperf__read().
    Make 'j' an 'int' as it is used in perf_cpu_map__for_each_cpu() to compare against an 'int'
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index ae5bd87ff02033c2..80d1a3a31052fe55 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -608,7 +608,8 @@ static int bperf__read(struct evsel *evsel)
 	__u32 num_cpu_bpf = cpu__max_cpu();
 	struct bpf_perf_event_value values[num_cpu_bpf];
 	int reading_map_fd, err = 0;
-	__u32 i, j, num_cpu;
+	__u32 i;
+	int j;
 
 	bperf_sync_counters(evsel);
 	reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 48/48] perf cpumap: Give CPUs their own type.
  2022-01-05  6:13 ` [PATCH v4 48/48] perf cpumap: Give CPUs their own type Ian Rogers
  2022-01-09 18:30   ` Jiri Olsa
@ 2022-01-10 19:50   ` Arnaldo Carvalho de Melo
  2022-01-11 19:16     ` Ian Rogers
  1 sibling, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 19:50 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 04, 2022 at 10:13:51PM -0800, Ian Rogers escreveu:
> A common problem is confusing CPU map indices with the CPU, by wrapping
> the CPU with a struct then this is avoided. This approach is similar to
> atomic_t.
This one needed this to build with BUILD_BPF_SKEL=1, please check, I'll
soon push this to tmp.perf/perf_cpu so that you can take a look and test
it.
- Arnaldo
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index 80d1a3a31052fe55..328479df5e16a638 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -540,7 +540,7 @@ static int bperf__load(struct evsel *evsel, struct target *target)
 		    filter_type == BPERF_FILTER_TGID)
 			key = evsel->core.threads->map[i].pid;
 		else if (filter_type == BPERF_FILTER_CPU)
-			key = evsel->core.cpus->map[i];
+			key = evsel->core.cpus->map[i].cpu;
 		else
 			break;
 
@@ -584,7 +584,7 @@ static int bperf_sync_counters(struct evsel *evsel)
 
 	num_cpu = all_cpu_map->nr;
 	for (i = 0; i < num_cpu; i++) {
-		cpu = all_cpu_map->map[i];
+		cpu = all_cpu_map->map[i].cpu;
 		bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu);
 	}
 	return 0;
@@ -605,7 +605,7 @@ static int bperf__disable(struct evsel *evsel)
 static int bperf__read(struct evsel *evsel)
 {
 	struct bperf_follower_bpf *skel = evsel->follower_skel;
-	__u32 num_cpu_bpf = cpu__max_cpu();
+	__u32 num_cpu_bpf = cpu__max_cpu().cpu;
 	struct bpf_perf_event_value values[num_cpu_bpf];
 	int reading_map_fd, err = 0;
 	__u32 i;
@@ -615,6 +615,7 @@ static int bperf__read(struct evsel *evsel)
 	reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
 
 	for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
+		struct perf_cpu entry;
 		__u32 cpu;
 
 		err = bpf_map_lookup_elem(reading_map_fd, &i, values);
@@ -624,14 +625,15 @@ static int bperf__read(struct evsel *evsel)
 		case BPERF_FILTER_GLOBAL:
 			assert(i == 0);
 
-			perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
+			perf_cpu_map__for_each_cpu(entry, j, all_cpu_map) {
+				cpu = entry.cpu;
 				perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
 				perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
 				perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
 			}
 			break;
 		case BPERF_FILTER_CPU:
-			cpu = evsel->core.cpus->map[i];
+			cpu = evsel->core.cpus->map[i].cpu;
 			perf_counts(evsel->counts, i, 0)->val = values[cpu].counter;
 			perf_counts(evsel->counts, i, 0)->ena = values[cpu].enabled;
 			perf_counts(evsel->counts, i, 0)->run = values[cpu].running;
diff --git a/tools/perf/util/bpf_counter_cgroup.c b/tools/perf/util/bpf_counter_cgroup.c
index cbc6c2bca488f6bf..631e34a0b66ff084 100644
--- a/tools/perf/util/bpf_counter_cgroup.c
+++ b/tools/perf/util/bpf_counter_cgroup.c
@@ -48,7 +48,7 @@ static int bperf_load_program(struct evlist *evlist)
 	struct cgroup *cgrp, *leader_cgrp;
 	__u32 i, cpu;
 	__u32 nr_cpus = evlist->core.all_cpus->nr;
-	int total_cpus = cpu__max_cpu();
+	int total_cpus = cpu__max_cpu().cpu;
 	int map_size, map_fd;
 	int prog_fd, err;
 
@@ -125,7 +125,7 @@ static int bperf_load_program(struct evlist *evlist)
 			for (cpu = 0; cpu < nr_cpus; cpu++) {
 				int fd = FD(evsel, cpu);
 				__u32 idx = evsel->core.idx * total_cpus +
-					evlist->core.all_cpus->map[cpu];
+					evlist->core.all_cpus->map[cpu].cpu;
 
 				err = bpf_map_update_elem(map_fd, &idx, &fd,
 							  BPF_ANY);
@@ -212,7 +212,7 @@ static int bperf_cgrp__sync_counters(struct evlist *evlist)
 	int prog_fd = bpf_program__fd(skel->progs.trigger_read);
 
 	for (i = 0; i < nr_cpus; i++) {
-		cpu = evlist->core.all_cpus->map[i];
+		cpu = evlist->core.all_cpus->map[i].cpu;
 		bperf_trigger_reading(prog_fd, cpu);
 	}
 
@@ -245,7 +245,7 @@ static int bperf_cgrp__read(struct evsel *evsel)
 {
 	struct evlist *evlist = evsel->evlist;
 	int i, cpu, nr_cpus = evlist->core.all_cpus->nr;
-	int total_cpus = cpu__max_cpu();
+	int total_cpus = cpu__max_cpu().cpu;
 	struct perf_counts_values *counts;
 	struct bpf_perf_event_value *values;
 	int reading_map_fd, err = 0;
@@ -272,7 +272,7 @@ static int bperf_cgrp__read(struct evsel *evsel)
 		}
 
 		for (i = 0; i < nr_cpus; i++) {
-			cpu = evlist->core.all_cpus->map[i];
+			cpu = evlist->core.all_cpus->map[i].cpu;
 
 			counts = perf_counts(evsel->counts, i, 0);
 			counts->val = values[cpu].counter;
diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
index 28dc4c60c7884818..d756cc66eef32ae8 100644
--- a/tools/perf/util/bpf_ftrace.c
+++ b/tools/perf/util/bpf_ftrace.c
@@ -63,7 +63,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
 		fd = bpf_map__fd(skel->maps.cpu_filter);
 
 		for (i = 0; i < ncpus; i++) {
-			cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i);
+			cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i).cpu;
 			bpf_map_update_elem(fd, &cpu, &val, BPF_ANY);
 		}
 	}
@@ -122,7 +122,7 @@ int perf_ftrace__latency_read_bpf(struct perf_ftrace *ftrace __maybe_unused,
 	int i, fd, err;
 	u32 idx;
 	u64 *hist;
-	int ncpus = cpu__max_cpu();
+	int ncpus = cpu__max_cpu().cpu;
 
 	fd = bpf_map__fd(skel->maps.latency);
 
^ permalink raw reply related	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 00/48] Refactor perf cpumap
  2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
                   ` (47 preceding siblings ...)
  2022-01-05  6:13 ` [PATCH v4 48/48] perf cpumap: Give CPUs their own type Ian Rogers
@ 2022-01-10 20:17 ` Arnaldo Carvalho de Melo
  2022-01-10 20:40   ` Arnaldo Carvalho de Melo
  48 siblings, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 20:17 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 04, 2022 at 10:13:03PM -0800, Ian Rogers escreveu:
> Perf cpu map has various functions where a cpumap and index are passed
> in order to load the cpu. A problem with this is that the wrong index
> may be passed for the cpumap, causing problems like aggregation on the
> wrong CPU:
> https://lore.kernel.org/lkml/20211204023409.969668-1-irogers@google.com/
> 
> This patch set refactors the cpu map API, reducing it and explicitly
> passing the cpu (rather than the pair) to functions that need
> it. Comments are added at the same time. Changes modify the same
> file/function more than once as refactoring and fixes are broken apart
> for the sake of bisection.
> 
> v2. Incorproates fixes suggested Jiri Olsa, rewrites the evlist CPU
>     iterator in part in a way suggested by Riccardo Mancini. The new
>     fixes start at patch 23. The final change was suggested by John
>     Garry to make the CPUs have their own struct wrapper.
> 
> v3. Incorporates fixes suggested by Namhyung Kim.
> 
> v4. Rebases and incorporates fixes suggested by Jiri Olsa.
Did fixes in places where BUILD_BPF_SKEL=1 was failing, now, at the end
of the series:
⬢[acme@toolbox perf]$ git log --oneline -1 ; time make -C tools/perf build-test
462aafa8fcbdc78f (HEAD -> perf/core) perf cpumap: Give CPUs their own type
make: Entering directory '/var/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
                 make_static: cd . && make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1 -j32  DESTDIR=/tmp/tmp.lFrqVcqiRY
              make_with_gtk2: cd . && make GTK2=1 -j32  DESTDIR=/tmp/tmp.KvHmTqUGmD
- /var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP: cd . && make FEATURE_DUMP_COPY=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP  feature-dump
cd . && make FEATURE_DUMP_COPY=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
            make_no_libelf_O: cd . && make NO_LIBELF=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.xz3LYy7WpA DESTDIR=/tmp/tmp.zVjIMpvQMM
      make_no_libbpf_DEBUG_O: cd . && make NO_LIBBPF=1 DEBUG=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.0gxAOSjb1j DESTDIR=/tmp/tmp.kYTk1WjWWz
          make_no_libaudit_O: cd . && make NO_LIBAUDIT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.6hArTG9Dth DESTDIR=/tmp/tmp.hCNdYeDRCr
         make_no_backtrace_O: cd . && make NO_BACKTRACE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.VtoPI58z3t DESTDIR=/tmp/tmp.SYyBBMSHSL
         make_no_libcrypto_O: cd . && make NO_LIBCRYPTO=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.SO1qtxJwvE DESTDIR=/tmp/tmp.AqrPw1oZfx
       make_with_clangllvm_O: cd . && make LIBCLANGLLVM=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.G9KVEaDv2B DESTDIR=/tmp/tmp.6CSqj0wPqJ
       make_with_coresight_O: cd . && make CORESIGHT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.PnVTJNHy1E DESTDIR=/tmp/tmp.k7Arerg778
         make_no_libpython_O: cd . && make NO_LIBPYTHON=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.cKjvFagVL3 DESTDIR=/tmp/tmp.aCaD0hbP3O
               make_no_sdt_O: cd . && make NO_SDT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.EKoWnoZtl4 DESTDIR=/tmp/tmp.1gGYetcptB
       make_libbpf_dynamic_O: cd . && make LIBBPF_DYNAMIC=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.Nked5O10EK DESTDIR=/tmp/tmp.eo6YAEOfZO
              make_minimal_O: cd . && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.Z6dSqYBXfU DESTDIR=/tmp/tmp.Y4iRu634of
 make_install_prefix_slash_O: cd . && make install prefix=/tmp/krava/ FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.BgipDyNu70 DESTDIR=/tmp/tmp.pOwrdDYBqf
     make_util_pmu_bison_o_O: cd . && make util/pmu-bison.o FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.XeO82Ajmv6 DESTDIR=/tmp/tmp.dxQKNNBM6W
              make_install_O: cd . && make install FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.KuCyqzLwnA DESTDIR=/tmp/tmp.KYvqoiOKzW
       make_install_prefix_O: cd . && make install prefix=/tmp/krava FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.rJNNPELdri DESTDIR=/tmp/tmp.g52i9CcmnU
          make_install_bin_O: cd . && make install-bin FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.i118vbQuo0 DESTDIR=/tmp/tmp.90tKcjOWD8
               make_perf_o_O: cd . && make perf.o FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.kEZMTvh3wf DESTDIR=/tmp/tmp.JYvQLdkPsp
             make_no_slang_O: cd . && make NO_SLANG=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.WDNGH0DpdW DESTDIR=/tmp/tmp.Zfa804wClJ
                make_no_ui_O: cd . && make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.P4nTBNqkP4 DESTDIR=/tmp/tmp.iCLqOyrh4o
              make_no_newt_O: cd . && make NO_NEWT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.EdjjHoqODV DESTDIR=/tmp/tmp.jKfYRizWtm
         make_with_libpfm4_O: cd . && make LIBPFM4=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.r3hPAOgfId DESTDIR=/tmp/tmp.GvMmEZUc2B
          make_no_demangle_O: cd . && make NO_DEMANGLE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.wG9f40LzG3 DESTDIR=/tmp/tmp.i76sPjSg1m
         make_no_libunwind_O: cd . && make NO_LIBUNWIND=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.SmFfmmB9KQ DESTDIR=/tmp/tmp.9tjMdCWMxG
            make_no_libbpf_O: cd . && make NO_LIBBPF=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.7jH21oP9n7 DESTDIR=/tmp/tmp.taUqTgY37S
       make_no_syscall_tbl_O: cd . && make NO_SYSCALL_TABLE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.L2v7vi1AQl DESTDIR=/tmp/tmp.wHEA4jEpw9
           make_no_libnuma_O: cd . && make NO_LIBNUMA=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.9N5pQYuRbO DESTDIR=/tmp/tmp.B0Yptb7xP6
           make_util_map_o_O: cd . && make util/map.o FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.B8oe4jnWwB DESTDIR=/tmp/tmp.m0APlsli14
      make_with_babeltrace_O: cd . && make LIBBABELTRACE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.3G1kLTJLJx DESTDIR=/tmp/tmp.ahuDENSMsz
                make_debug_O: cd . && make DEBUG=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.KpsvOj0KHD DESTDIR=/tmp/tmp.f2Yv9Lrorg
           make_no_libperl_O: cd . && make NO_LIBPERL=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.7tIufwAi4O DESTDIR=/tmp/tmp.vcBv7UYjuw
         make_no_libbionic_O: cd . && make NO_LIBBIONIC=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.HCJA7xCDxq DESTDIR=/tmp/tmp.us5CDSPqyO
                 make_pure_O: cd . && make FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.eg2wCpydvg DESTDIR=/tmp/tmp.MmD1xESO2A
                  make_doc_O: cd . && make doc FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.TqkyGrzRut DESTDIR=/tmp/tmp.ap34IVxoM9
           make_no_scripts_O: cd . && make NO_LIBPYTHON=1 NO_LIBPERL=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.pRWHKzpPhv DESTDIR=/tmp/tmp.dUNhAp2rLt
            make_clean_all_O: cd . && make clean all FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.YnzNLLk9ID DESTDIR=/tmp/tmp.7kXr5obkmA
make_no_libdw_dwarf_unwind_O: cd . && make NO_LIBDW_DWARF_UNWIND=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.GzZCxuYLwR DESTDIR=/tmp/tmp.n3YS9a9DcU
                 make_help_O: cd . && make help FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.t7QwWVHsod DESTDIR=/tmp/tmp.NG5uSWCZzi
          make_no_auxtrace_O: cd . && make NO_AUXTRACE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.wiDZkJzs8R DESTDIR=/tmp/tmp.ITtPlX0usX
              make_no_gtk2_O: cd . && make NO_GTK2=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.xd1x5QIckJ DESTDIR=/tmp/tmp.f6KzsHHkZH
                 make_tags_O: cd . && make tags FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.e8oSaBuFUb DESTDIR=/tmp/tmp.4czd6ovlSB
OK
make: Leaving directory '/var/home/acme/git/perf/tools/perf'
real	4m29.388s
user	59m6.060s
sys	12m19.581s
⬢[acme@toolbox perf]$
alias m='perf stat -e cycles:u,instructions:u make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin && git status && perf test python'
So 'perf test python' is done after every build, no problems with that
as well.
I'm now making it go thru the container builds.
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 00/48] Refactor perf cpumap
  2022-01-10 20:17 ` [PATCH v4 00/48] Refactor perf cpumap Arnaldo Carvalho de Melo
@ 2022-01-10 20:40   ` Arnaldo Carvalho de Melo
  2022-01-10 20:42     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 20:40 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 05:17:25PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Jan 04, 2022 at 10:13:03PM -0800, Ian Rogers escreveu:
> > Perf cpu map has various functions where a cpumap and index are passed
> > in order to load the cpu. A problem with this is that the wrong index
> > may be passed for the cpumap, causing problems like aggregation on the
> > wrong CPU:
> > https://lore.kernel.org/lkml/20211204023409.969668-1-irogers@google.com/
> > 
> > This patch set refactors the cpu map API, reducing it and explicitly
> > passing the cpu (rather than the pair) to functions that need
> > it. Comments are added at the same time. Changes modify the same
> > file/function more than once as refactoring and fixes are broken apart
> > for the sake of bisection.
> > 
> > v2. Incorproates fixes suggested Jiri Olsa, rewrites the evlist CPU
> >     iterator in part in a way suggested by Riccardo Mancini. The new
> >     fixes start at patch 23. The final change was suggested by John
> >     Garry to make the CPUs have their own struct wrapper.
> > 
> > v3. Incorporates fixes suggested by Namhyung Kim.
> > 
> > v4. Rebases and incorporates fixes suggested by Jiri Olsa.
> 
> Did fixes in places where BUILD_BPF_SKEL=1 was failing, now, at the end
> of the series:
> 
> ⬢[acme@toolbox perf]$ git log --oneline -1 ; time make -C tools/perf build-test
> 462aafa8fcbdc78f (HEAD -> perf/core) perf cpumap: Give CPUs their own type
> make: Entering directory '/var/home/acme/git/perf/tools/perf'
> - tarpkg: ./tests/perf-targz-src-pkg .
>                  make_static: cd . && make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1 -j32  DESTDIR=/tmp/tmp.lFrqVcqiRY
>               make_with_gtk2: cd . && make GTK2=1 -j32  DESTDIR=/tmp/tmp.KvHmTqUGmD
> - /var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP: cd . && make FEATURE_DUMP_COPY=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP  feature-dump
> cd . && make FEATURE_DUMP_COPY=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
>             make_no_libelf_O: cd . && make NO_LIBELF=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.xz3LYy7WpA DESTDIR=/tmp/tmp.zVjIMpvQMM
>       make_no_libbpf_DEBUG_O: cd . && make NO_LIBBPF=1 DEBUG=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.0gxAOSjb1j DESTDIR=/tmp/tmp.kYTk1WjWWz
>           make_no_libaudit_O: cd . && make NO_LIBAUDIT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.6hArTG9Dth DESTDIR=/tmp/tmp.hCNdYeDRCr
>          make_no_backtrace_O: cd . && make NO_BACKTRACE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.VtoPI58z3t DESTDIR=/tmp/tmp.SYyBBMSHSL
>          make_no_libcrypto_O: cd . && make NO_LIBCRYPTO=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.SO1qtxJwvE DESTDIR=/tmp/tmp.AqrPw1oZfx
>        make_with_clangllvm_O: cd . && make LIBCLANGLLVM=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.G9KVEaDv2B DESTDIR=/tmp/tmp.6CSqj0wPqJ
>        make_with_coresight_O: cd . && make CORESIGHT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.PnVTJNHy1E DESTDIR=/tmp/tmp.k7Arerg778
>          make_no_libpython_O: cd . && make NO_LIBPYTHON=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.cKjvFagVL3 DESTDIR=/tmp/tmp.aCaD0hbP3O
>                make_no_sdt_O: cd . && make NO_SDT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.EKoWnoZtl4 DESTDIR=/tmp/tmp.1gGYetcptB
>        make_libbpf_dynamic_O: cd . && make LIBBPF_DYNAMIC=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.Nked5O10EK DESTDIR=/tmp/tmp.eo6YAEOfZO
>               make_minimal_O: cd . && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.Z6dSqYBXfU DESTDIR=/tmp/tmp.Y4iRu634of
>  make_install_prefix_slash_O: cd . && make install prefix=/tmp/krava/ FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.BgipDyNu70 DESTDIR=/tmp/tmp.pOwrdDYBqf
>      make_util_pmu_bison_o_O: cd . && make util/pmu-bison.o FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.XeO82Ajmv6 DESTDIR=/tmp/tmp.dxQKNNBM6W
>               make_install_O: cd . && make install FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.KuCyqzLwnA DESTDIR=/tmp/tmp.KYvqoiOKzW
>        make_install_prefix_O: cd . && make install prefix=/tmp/krava FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.rJNNPELdri DESTDIR=/tmp/tmp.g52i9CcmnU
>           make_install_bin_O: cd . && make install-bin FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.i118vbQuo0 DESTDIR=/tmp/tmp.90tKcjOWD8
>                make_perf_o_O: cd . && make perf.o FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.kEZMTvh3wf DESTDIR=/tmp/tmp.JYvQLdkPsp
>              make_no_slang_O: cd . && make NO_SLANG=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.WDNGH0DpdW DESTDIR=/tmp/tmp.Zfa804wClJ
>                 make_no_ui_O: cd . && make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.P4nTBNqkP4 DESTDIR=/tmp/tmp.iCLqOyrh4o
>               make_no_newt_O: cd . && make NO_NEWT=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.EdjjHoqODV DESTDIR=/tmp/tmp.jKfYRizWtm
>          make_with_libpfm4_O: cd . && make LIBPFM4=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.r3hPAOgfId DESTDIR=/tmp/tmp.GvMmEZUc2B
>           make_no_demangle_O: cd . && make NO_DEMANGLE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.wG9f40LzG3 DESTDIR=/tmp/tmp.i76sPjSg1m
>          make_no_libunwind_O: cd . && make NO_LIBUNWIND=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.SmFfmmB9KQ DESTDIR=/tmp/tmp.9tjMdCWMxG
>             make_no_libbpf_O: cd . && make NO_LIBBPF=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.7jH21oP9n7 DESTDIR=/tmp/tmp.taUqTgY37S
>        make_no_syscall_tbl_O: cd . && make NO_SYSCALL_TABLE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.L2v7vi1AQl DESTDIR=/tmp/tmp.wHEA4jEpw9
>            make_no_libnuma_O: cd . && make NO_LIBNUMA=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.9N5pQYuRbO DESTDIR=/tmp/tmp.B0Yptb7xP6
>            make_util_map_o_O: cd . && make util/map.o FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.B8oe4jnWwB DESTDIR=/tmp/tmp.m0APlsli14
>       make_with_babeltrace_O: cd . && make LIBBABELTRACE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.3G1kLTJLJx DESTDIR=/tmp/tmp.ahuDENSMsz
>                 make_debug_O: cd . && make DEBUG=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.KpsvOj0KHD DESTDIR=/tmp/tmp.f2Yv9Lrorg
>            make_no_libperl_O: cd . && make NO_LIBPERL=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.7tIufwAi4O DESTDIR=/tmp/tmp.vcBv7UYjuw
>          make_no_libbionic_O: cd . && make NO_LIBBIONIC=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.HCJA7xCDxq DESTDIR=/tmp/tmp.us5CDSPqyO
>                  make_pure_O: cd . && make FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.eg2wCpydvg DESTDIR=/tmp/tmp.MmD1xESO2A
>                   make_doc_O: cd . && make doc FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.TqkyGrzRut DESTDIR=/tmp/tmp.ap34IVxoM9
>            make_no_scripts_O: cd . && make NO_LIBPYTHON=1 NO_LIBPERL=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.pRWHKzpPhv DESTDIR=/tmp/tmp.dUNhAp2rLt
>             make_clean_all_O: cd . && make clean all FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.YnzNLLk9ID DESTDIR=/tmp/tmp.7kXr5obkmA
> make_no_libdw_dwarf_unwind_O: cd . && make NO_LIBDW_DWARF_UNWIND=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.GzZCxuYLwR DESTDIR=/tmp/tmp.n3YS9a9DcU
>                  make_help_O: cd . && make help FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.t7QwWVHsod DESTDIR=/tmp/tmp.NG5uSWCZzi
>           make_no_auxtrace_O: cd . && make NO_AUXTRACE=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.wiDZkJzs8R DESTDIR=/tmp/tmp.ITtPlX0usX
>               make_no_gtk2_O: cd . && make NO_GTK2=1 FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.xd1x5QIckJ DESTDIR=/tmp/tmp.f6KzsHHkZH
>                  make_tags_O: cd . && make tags FEATURES_DUMP=/var/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j32 O=/tmp/tmp.e8oSaBuFUb DESTDIR=/tmp/tmp.4czd6ovlSB
> OK
> make: Leaving directory '/var/home/acme/git/perf/tools/perf'
> 
> real	4m29.388s
> user	59m6.060s
> sys	12m19.581s
> ⬢[acme@toolbox perf]$
> 
> alias m='perf stat -e cycles:u,instructions:u make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin && git status && perf test python'
> 
> So 'perf test python' is done after every build, no problems with that
> as well.
> 
> I'm now making it go thru the container builds.
Not a good start, checking:
[perfbuilder@five ~]$ time dm
   1    50.25 almalinux:8                   : FAIL clang version 12.0.1 (Red Hat 12.0.1-4.module_el8.5.0+1025+93159d6c)
    builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
    static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                  ^
    1 error generated.
   2    54.14 alpine:3.4                    : FAIL clang version 3.8.0 (tags/RELEASE_380/final)
    builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
    static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                  ^
    1 error generated.
   3    39.78 alpine:3.5                    : FAIL clang version 3.8.1 (tags/RELEASE_381/final)
    builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
    static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                  ^
    1 error generated.
   4    82.52 alpine:3.6                    : FAIL clang version 4.0.0 (tags/RELEASE_400/final)
    builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
    static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                  ^
    1 error generated.
   5   113.61 alpine:3.7                    : FAIL gcc version 6.4.0 (Alpine 6.4.0)
    builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
    static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                  ^
    1 error generated.
   6   119.02 alpine:3.8                    : FAIL gcc version 6.4.0 (Alpine 6.4.0)
    builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
    static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                  ^
    1 error generated.
   7    64.55 alpine:3.9                    : FAIL gcc version 8.3.0 (Alpine 8.3.0)
    builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
    static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                  ^
    1 error generated.
   8: alpine:3.10
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 00/48] Refactor perf cpumap
  2022-01-10 20:40   ` Arnaldo Carvalho de Melo
@ 2022-01-10 20:42     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-10 20:42 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Mon, Jan 10, 2022 at 05:40:12PM -0300, Arnaldo Carvalho de Melo escreveu:
> > I'm now making it go thru the container builds.
> 
> Not a good start, checking:
> 
> [perfbuilder@five ~]$ time dm
>    1    50.25 almalinux:8                   : FAIL clang version 12.0.1 (Red Hat 12.0.1-4.module_el8.5.0+1025+93159d6c)
>     builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
>     static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
                                   ^
Interesting, it builds with gcc, not with clang:
+ make PYTHON=python3 ARCH= CROSS_COMPILE= EXTRA_CFLAGS= -C tools/perf O=/tmp/build/perf CC=clang
And it really isn't used anywhere in builtin-stat.c, will remove it
after figuring out when it stops being used.
- Arnaldo
>     1 error generated.
>    2    54.14 alpine:3.4                    : FAIL clang version 3.8.0 (tags/RELEASE_380/final)
>     builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
>     static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
>                                   ^
>     1 error generated.
>    3    39.78 alpine:3.5                    : FAIL clang version 3.8.1 (tags/RELEASE_381/final)
>     builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
>     static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
>                                   ^
>     1 error generated.
>    4    82.52 alpine:3.6                    : FAIL clang version 4.0.0 (tags/RELEASE_400/final)
>     builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
>     static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
>                                   ^
>     1 error generated.
>    5   113.61 alpine:3.7                    : FAIL gcc version 6.4.0 (Alpine 6.4.0)
>     builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
>     static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
>                                   ^
>     1 error generated.
>    6   119.02 alpine:3.8                    : FAIL gcc version 6.4.0 (Alpine 6.4.0)
>     builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
>     static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
>                                   ^
>     1 error generated.
>    7    64.55 alpine:3.9                    : FAIL gcc version 8.3.0 (Alpine 8.3.0)
>     builtin-stat.c:1499:31: error: unused function 'perf_env__get_cpu' [-Werror,-Wunused-function]
>     static inline struct perf_cpu perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
>                                   ^
>     1 error generated.
>    8: alpine:3.10
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 04/48] perf stat: Switch aggregation to use for_each loop
  2022-01-10 18:50   ` John Garry
@ 2022-01-10 22:22     ` Ian Rogers
  0 siblings, 0 replies; 84+ messages in thread
From: Ian Rogers @ 2022-01-10 22:22 UTC (permalink / raw)
  To: John Garry
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Mon, Jan 10, 2022 at 10:51 AM John Garry <john.garry@huawei.com> wrote:
>
> On 05/01/2022 06:13, Ian Rogers wrote:
> > +     cpus = evsel__cpus(evsel);
> > +     perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
>
> It seems a common pattern to evaluate cpus and use in
> perf_cpu_map__for_each_cpu() - is it possible to make a macro to accept
> evsel and create cpus, like perf_evsel__for_each_cpu()?
It's possible, I'm not sure it is saving much and it is common to also
want to iterate the cpumap of the evlist.
Thanks,
Ian
> > +             if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
> > +                                              id))
> > +                     return cpu;
> >       }
> >       return 0;
>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get
  2022-01-10 19:04   ` John Garry
@ 2022-01-11 18:46     ` Ian Rogers
  2022-01-11 18:50       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-11 18:46 UTC (permalink / raw)
  To: John Garry
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Mon, Jan 10, 2022 at 11:04 AM John Garry <john.garry@huawei.com> wrote:
>
> On 05/01/2022 06:13, Ian Rogers wrote:
> > Avoid bugs where the wrong index is passed with the cpu_map.
>
> nit: Don't you mean possible bugs?
You are right. I'll watch myself doing this in the future.
Thanks,
Ian
> >
> > Reviewed-by: James Clark <james.clark@arm.com>
> > Signed-off-by: Ian Rogers <irogers@google.com>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get
  2022-01-11 18:46     ` Ian Rogers
@ 2022-01-11 18:50       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-11 18:50 UTC (permalink / raw)
  To: Ian Rogers, John Garry
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, Kajol Jain, Paul A . Clarke,
	Arnaldo Carvalho de Melo, Riccardo Mancini, Kan Liang,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	linux-perf-users, linux-kernel, Vineet Singh, James Clark,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On January 11, 2022 3:46:41 PM GMT-03:00, Ian Rogers <irogers@google.com> wrote:
>On Mon, Jan 10, 2022 at 11:04 AM John Garry <john.garry@huawei.com> wrote:
>>
>> On 05/01/2022 06:13, Ian Rogers wrote:
>> > Avoid bugs where the wrong index is passed with the cpu_map.
>>
>> nit: Don't you mean possible bugs?
>
>You are right. I'll watch myself doing this in the future.
I'll update the comment,
I'm still testing it and giving more time for review, including to the adjustments I made..
- Arnaldo
>
>Thanks,
>Ian
>
>> >
>> > Reviewed-by: James Clark <james.clark@arm.com>
>> > Signed-off-by: Ian Rogers <irogers@google.com>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 48/48] perf cpumap: Give CPUs their own type.
  2022-01-10 19:50   ` Arnaldo Carvalho de Melo
@ 2022-01-11 19:16     ` Ian Rogers
  2022-01-11 20:01       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-11 19:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Mon, Jan 10, 2022 at 11:50 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Tue, Jan 04, 2022 at 10:13:51PM -0800, Ian Rogers escreveu:
> > A common problem is confusing CPU map indices with the CPU, by wrapping
> > the CPU with a struct then this is avoided. This approach is similar to
> > atomic_t.
>
> This one needed this to build with BUILD_BPF_SKEL=1, please check, I'll
> soon push this to tmp.perf/perf_cpu so that you can take a look and test
> it.
Thanks Arnaldo, I did eye-ball one issue where cpu_map__default_new's
behavior would be impacted (nr would be 1 rather than nr_cpus). The
fix is:
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -18,7 +18,7 @@ static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
        if (!cpus)
                return NULL;
-       cpus->nr = 1;
+       cpus->nr = nr_cpus;
        refcount_set(&cpus->refcnt, 1);
        return cpus;
 }
Could you add that into this patch as well?
Thanks,
Ian
> - Arnaldo
>
> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> index 80d1a3a31052fe55..328479df5e16a638 100644
> --- a/tools/perf/util/bpf_counter.c
> +++ b/tools/perf/util/bpf_counter.c
> @@ -540,7 +540,7 @@ static int bperf__load(struct evsel *evsel, struct target *target)
>                     filter_type == BPERF_FILTER_TGID)
>                         key = evsel->core.threads->map[i].pid;
>                 else if (filter_type == BPERF_FILTER_CPU)
> -                       key = evsel->core.cpus->map[i];
> +                       key = evsel->core.cpus->map[i].cpu;
>                 else
>                         break;
>
> @@ -584,7 +584,7 @@ static int bperf_sync_counters(struct evsel *evsel)
>
>         num_cpu = all_cpu_map->nr;
>         for (i = 0; i < num_cpu; i++) {
> -               cpu = all_cpu_map->map[i];
> +               cpu = all_cpu_map->map[i].cpu;
>                 bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu);
>         }
>         return 0;
> @@ -605,7 +605,7 @@ static int bperf__disable(struct evsel *evsel)
>  static int bperf__read(struct evsel *evsel)
>  {
>         struct bperf_follower_bpf *skel = evsel->follower_skel;
> -       __u32 num_cpu_bpf = cpu__max_cpu();
> +       __u32 num_cpu_bpf = cpu__max_cpu().cpu;
>         struct bpf_perf_event_value values[num_cpu_bpf];
>         int reading_map_fd, err = 0;
>         __u32 i;
> @@ -615,6 +615,7 @@ static int bperf__read(struct evsel *evsel)
>         reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>
>         for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
> +               struct perf_cpu entry;
>                 __u32 cpu;
>
>                 err = bpf_map_lookup_elem(reading_map_fd, &i, values);
> @@ -624,14 +625,15 @@ static int bperf__read(struct evsel *evsel)
>                 case BPERF_FILTER_GLOBAL:
>                         assert(i == 0);
>
> -                       perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
> +                       perf_cpu_map__for_each_cpu(entry, j, all_cpu_map) {
> +                               cpu = entry.cpu;
>                                 perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
>                                 perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
>                                 perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
>                         }
>                         break;
>                 case BPERF_FILTER_CPU:
> -                       cpu = evsel->core.cpus->map[i];
> +                       cpu = evsel->core.cpus->map[i].cpu;
>                         perf_counts(evsel->counts, i, 0)->val = values[cpu].counter;
>                         perf_counts(evsel->counts, i, 0)->ena = values[cpu].enabled;
>                         perf_counts(evsel->counts, i, 0)->run = values[cpu].running;
> diff --git a/tools/perf/util/bpf_counter_cgroup.c b/tools/perf/util/bpf_counter_cgroup.c
> index cbc6c2bca488f6bf..631e34a0b66ff084 100644
> --- a/tools/perf/util/bpf_counter_cgroup.c
> +++ b/tools/perf/util/bpf_counter_cgroup.c
> @@ -48,7 +48,7 @@ static int bperf_load_program(struct evlist *evlist)
>         struct cgroup *cgrp, *leader_cgrp;
>         __u32 i, cpu;
>         __u32 nr_cpus = evlist->core.all_cpus->nr;
> -       int total_cpus = cpu__max_cpu();
> +       int total_cpus = cpu__max_cpu().cpu;
>         int map_size, map_fd;
>         int prog_fd, err;
>
> @@ -125,7 +125,7 @@ static int bperf_load_program(struct evlist *evlist)
>                         for (cpu = 0; cpu < nr_cpus; cpu++) {
>                                 int fd = FD(evsel, cpu);
>                                 __u32 idx = evsel->core.idx * total_cpus +
> -                                       evlist->core.all_cpus->map[cpu];
> +                                       evlist->core.all_cpus->map[cpu].cpu;
>
>                                 err = bpf_map_update_elem(map_fd, &idx, &fd,
>                                                           BPF_ANY);
> @@ -212,7 +212,7 @@ static int bperf_cgrp__sync_counters(struct evlist *evlist)
>         int prog_fd = bpf_program__fd(skel->progs.trigger_read);
>
>         for (i = 0; i < nr_cpus; i++) {
> -               cpu = evlist->core.all_cpus->map[i];
> +               cpu = evlist->core.all_cpus->map[i].cpu;
>                 bperf_trigger_reading(prog_fd, cpu);
>         }
>
> @@ -245,7 +245,7 @@ static int bperf_cgrp__read(struct evsel *evsel)
>  {
>         struct evlist *evlist = evsel->evlist;
>         int i, cpu, nr_cpus = evlist->core.all_cpus->nr;
> -       int total_cpus = cpu__max_cpu();
> +       int total_cpus = cpu__max_cpu().cpu;
>         struct perf_counts_values *counts;
>         struct bpf_perf_event_value *values;
>         int reading_map_fd, err = 0;
> @@ -272,7 +272,7 @@ static int bperf_cgrp__read(struct evsel *evsel)
>                 }
>
>                 for (i = 0; i < nr_cpus; i++) {
> -                       cpu = evlist->core.all_cpus->map[i];
> +                       cpu = evlist->core.all_cpus->map[i].cpu;
>
>                         counts = perf_counts(evsel->counts, i, 0);
>                         counts->val = values[cpu].counter;
> diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
> index 28dc4c60c7884818..d756cc66eef32ae8 100644
> --- a/tools/perf/util/bpf_ftrace.c
> +++ b/tools/perf/util/bpf_ftrace.c
> @@ -63,7 +63,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
>                 fd = bpf_map__fd(skel->maps.cpu_filter);
>
>                 for (i = 0; i < ncpus; i++) {
> -                       cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i);
> +                       cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i).cpu;
>                         bpf_map_update_elem(fd, &cpu, &val, BPF_ANY);
>                 }
>         }
> @@ -122,7 +122,7 @@ int perf_ftrace__latency_read_bpf(struct perf_ftrace *ftrace __maybe_unused,
>         int i, fd, err;
>         u32 idx;
>         u64 *hist;
> -       int ncpus = cpu__max_cpu();
> +       int ncpus = cpu__max_cpu().cpu;
>
>         fd = bpf_map__fd(skel->maps.latency);
>
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-05  6:13 ` [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu Ian Rogers
  2022-01-10 17:10   ` John Garry
  2022-01-10 19:17   ` John Garry
@ 2022-01-11 19:33   ` Arnaldo Carvalho de Melo
  2022-01-11 19:36     ` Ian Rogers
  2022-01-11 19:49     ` Arnaldo Carvalho de Melo
  2 siblings, 2 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-11 19:33 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 04, 2022 at 10:13:05PM -0800, Ian Rogers escreveu:
> The cpu_map and index can get confused. Add variants of the cpu_map__get
> routines that are passed a cpu. Make the existing cpu_map__get routines
> use the new functions with a view to remove them when no longer used.
> 
> Reviewed-by: James Clark <james.clark@arm.com>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
>  tools/perf/util/cpumap.h |  6 ++-
>  2 files changed, 51 insertions(+), 34 deletions(-)
> 
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 87d3eca9b872..49fba2c53822 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
>  	return ret ?: value;
>  }
>  
> -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> -					void *data __maybe_unused)
> +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
>  {
> -	int cpu;
>  	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
>  
> -	if (idx > map->nr)
> -		return id;
> -
> -	cpu = map->map[idx];
> -
>  	id.socket = cpu_map__get_socket_id(cpu);
>  	return id;
>  }
>  
> +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> +				       void *data)
> +{
> +	if (idx < 0 || idx > map->nr)
> +		return cpu_map__empty_aggr_cpu_id();
> +
> +	return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> +}
> +
This 'idx < 0' wasn't in the original code nor is described in the
comment log message, please avoid doing this, this may be harmless or
even a good hardening, but either way would be interesting to have it in
a separate patch. This eases review as in the end this code is just a
refactoring, moving things around but in the end should be equivalent code.
There are a few more, please consider this and if you agree, to speed
things up I can make the changes here, if I think this won't fallout in
changes to subsequent patches touching this area.
- Arnaldo
>  static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
>  {
>  	struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
> @@ -200,15 +202,10 @@ int cpu_map__get_die_id(int cpu)
>  	return ret ?: value;
>  }
>  
> -struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data)
> +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
>  {
> -	int cpu, die;
> -	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> -
> -	if (idx > map->nr)
> -		return id;
> -
> -	cpu = map->map[idx];
> +	struct aggr_cpu_id id;
> +	int die;
>  
>  	die = cpu_map__get_die_id(cpu);
>  	/* There is no die_id on legacy system. */
> @@ -220,7 +217,7 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
>  	 * with the socket ID and then add die to
>  	 * make a unique ID.
>  	 */
> -	id = cpu_map__get_socket(map, idx, data);
> +	id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
>  	if (cpu_map__aggr_cpu_id_is_empty(id))
>  		return id;
>  
> @@ -228,6 +225,15 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
>  	return id;
>  }
>  
> +struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
> +				    void *data)
> +{
> +	if (idx < 0 || idx > map->nr)
> +		return cpu_map__empty_aggr_cpu_id();
Ditto
> +
> +	return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
> +}
> +
>  int cpu_map__get_core_id(int cpu)
>  {
>  	int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
> @@ -239,20 +245,13 @@ int cpu_map__get_node_id(int cpu)
>  	return cpu__get_node(cpu);
>  }
>  
> -struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
>  {
> -	int cpu;
> -	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> -
> -	if (idx > map->nr)
> -		return id;
> -
> -	cpu = map->map[idx];
> -
> -	cpu = cpu_map__get_core_id(cpu);
> +	struct aggr_cpu_id id;
> +	int core = cpu_map__get_core_id(cpu);
>  
>  	/* cpu_map__get_die returns a struct with socket and die set*/
> -	id = cpu_map__get_die(map, idx, data);
> +	id = cpu_map__get_die_aggr_by_cpu(cpu, data);
>  	if (cpu_map__aggr_cpu_id_is_empty(id))
>  		return id;
>  
> @@ -260,19 +259,33 @@ struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *da
>  	 * core_id is relative to socket and die, we need a global id.
>  	 * So we combine the result from cpu_map__get_die with the core id
>  	 */
> -	id.core = cpu;
> +	id.core = core;
>  	return id;
> +
>  }
>  
> -struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data __maybe_unused)
> +struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> +{
> +	if (idx < 0 || idx > map->nr)
> +		return cpu_map__empty_aggr_cpu_id();
Ditto
> +
> +	return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
> +}
> +
> +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
>  {
>  	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
>  
> +	id.node = cpu_map__get_node_id(cpu);
> +	return id;
> +}
> +
> +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
> +{
>  	if (idx < 0 || idx >= map->nr)
> -		return id;
> +		return cpu_map__empty_aggr_cpu_id();
>  
> -	id.node = cpu_map__get_node_id(map->map[idx]);
> -	return id;
> +	return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
>  }
>  
>  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
> diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> index a27eeaf086e8..c62d67704425 100644
> --- a/tools/perf/util/cpumap.h
> +++ b/tools/perf/util/cpumap.h
> @@ -31,13 +31,17 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
>  size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
>  size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
>  int cpu_map__get_socket_id(int cpu);
> +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
>  struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
>  int cpu_map__get_die_id(int cpu);
> +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
>  struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
>  int cpu_map__get_core_id(int cpu);
> +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
>  struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
>  int cpu_map__get_node_id(int cpu);
> -struct aggr_cpu_id  cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
> +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
>  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
>  int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
>  int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
> -- 
> 2.34.1.448.ga2b2bfdf31-goog
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-11 19:33   ` Arnaldo Carvalho de Melo
@ 2022-01-11 19:36     ` Ian Rogers
  2022-01-11 19:56       ` Arnaldo Carvalho de Melo
  2022-01-11 19:49     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 84+ messages in thread
From: Ian Rogers @ 2022-01-11 19:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
On Tue, Jan 11, 2022 at 11:33 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Tue, Jan 04, 2022 at 10:13:05PM -0800, Ian Rogers escreveu:
> > The cpu_map and index can get confused. Add variants of the cpu_map__get
> > routines that are passed a cpu. Make the existing cpu_map__get routines
> > use the new functions with a view to remove them when no longer used.
> >
> > Reviewed-by: James Clark <james.clark@arm.com>
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
> >  tools/perf/util/cpumap.h |  6 ++-
> >  2 files changed, 51 insertions(+), 34 deletions(-)
> >
> > diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> > index 87d3eca9b872..49fba2c53822 100644
> > --- a/tools/perf/util/cpumap.c
> > +++ b/tools/perf/util/cpumap.c
> > @@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
> >       return ret ?: value;
> >  }
> >
> > -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > -                                     void *data __maybe_unused)
> > +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
> >  {
> > -     int cpu;
> >       struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> >
> > -     if (idx > map->nr)
> > -             return id;
> > -
> > -     cpu = map->map[idx];
> > -
> >       id.socket = cpu_map__get_socket_id(cpu);
> >       return id;
> >  }
> >
> > +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > +                                    void *data)
> > +{
> > +     if (idx < 0 || idx > map->nr)
> > +             return cpu_map__empty_aggr_cpu_id();
> > +
> > +     return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> > +}
> > +
>
>
> This 'idx < 0' wasn't in the original code nor is described in the
> comment log message, please avoid doing this, this may be harmless or
> even a good hardening, but either way would be interesting to have it in
> a separate patch. This eases review as in the end this code is just a
> refactoring, moving things around but in the end should be equivalent code.
>
> There are a few more, please consider this and if you agree, to speed
> things up I can make the changes here, if I think this won't fallout in
> changes to subsequent patches touching this area.
>
> - Arnaldo
Fwiw, there's the same issue in cpu_map__get_die that's also in
cpu_map__get_core, but weirdly not copied into cpu_map__get_node. As
these functions are removed later I think doing nothing is best here.
Thanks,
Ian
> >  static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
> >  {
> >       struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
> > @@ -200,15 +202,10 @@ int cpu_map__get_die_id(int cpu)
> >       return ret ?: value;
> >  }
> >
> > -struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data)
> > +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
> >  {
> > -     int cpu, die;
> > -     struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > -
> > -     if (idx > map->nr)
> > -             return id;
> > -
> > -     cpu = map->map[idx];
> > +     struct aggr_cpu_id id;
> > +     int die;
> >
> >       die = cpu_map__get_die_id(cpu);
> >       /* There is no die_id on legacy system. */
> > @@ -220,7 +217,7 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> >        * with the socket ID and then add die to
> >        * make a unique ID.
> >        */
> > -     id = cpu_map__get_socket(map, idx, data);
> > +     id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
> >       if (cpu_map__aggr_cpu_id_is_empty(id))
> >               return id;
> >
> > @@ -228,6 +225,15 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> >       return id;
> >  }
> >
> > +struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
> > +                                 void *data)
> > +{
> > +     if (idx < 0 || idx > map->nr)
> > +             return cpu_map__empty_aggr_cpu_id();
>
> Ditto
>
> > +
> > +     return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
> > +}
> > +
> >  int cpu_map__get_core_id(int cpu)
> >  {
> >       int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
> > @@ -239,20 +245,13 @@ int cpu_map__get_node_id(int cpu)
> >       return cpu__get_node(cpu);
> >  }
> >
> > -struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> > +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
> >  {
> > -     int cpu;
> > -     struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > -
> > -     if (idx > map->nr)
> > -             return id;
> > -
> > -     cpu = map->map[idx];
> > -
> > -     cpu = cpu_map__get_core_id(cpu);
> > +     struct aggr_cpu_id id;
> > +     int core = cpu_map__get_core_id(cpu);
> >
> >       /* cpu_map__get_die returns a struct with socket and die set*/
> > -     id = cpu_map__get_die(map, idx, data);
> > +     id = cpu_map__get_die_aggr_by_cpu(cpu, data);
> >       if (cpu_map__aggr_cpu_id_is_empty(id))
> >               return id;
> >
> > @@ -260,19 +259,33 @@ struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *da
> >        * core_id is relative to socket and die, we need a global id.
> >        * So we combine the result from cpu_map__get_die with the core id
> >        */
> > -     id.core = cpu;
> > +     id.core = core;
> >       return id;
> > +
> >  }
> >
> > -struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data __maybe_unused)
> > +struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> > +{
> > +     if (idx < 0 || idx > map->nr)
> > +             return cpu_map__empty_aggr_cpu_id();
>
> Ditto
>
> > +
> > +     return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
> > +}
> > +
> > +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
> >  {
> >       struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> >
> > +     id.node = cpu_map__get_node_id(cpu);
> > +     return id;
> > +}
> > +
> > +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
> > +{
> >       if (idx < 0 || idx >= map->nr)
> > -             return id;
> > +             return cpu_map__empty_aggr_cpu_id();
> >
> > -     id.node = cpu_map__get_node_id(map->map[idx]);
> > -     return id;
> > +     return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
> >  }
> >
> >  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
> > diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> > index a27eeaf086e8..c62d67704425 100644
> > --- a/tools/perf/util/cpumap.h
> > +++ b/tools/perf/util/cpumap.h
> > @@ -31,13 +31,17 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
> >  size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
> >  size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
> >  int cpu_map__get_socket_id(int cpu);
> > +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
> >  struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__get_die_id(int cpu);
> > +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
> >  struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__get_core_id(int cpu);
> > +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
> >  struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__get_node_id(int cpu);
> > -struct aggr_cpu_id  cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> > +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
> > +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
> >  int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
> >  int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
> > --
> > 2.34.1.448.ga2b2bfdf31-goog
>
> --
>
> - Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-11 19:33   ` Arnaldo Carvalho de Melo
  2022-01-11 19:36     ` Ian Rogers
@ 2022-01-11 19:49     ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-11 19:49 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 11, 2022 at 04:33:01PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Jan 04, 2022 at 10:13:05PM -0800, Ian Rogers escreveu:
> > The cpu_map and index can get confused. Add variants of the cpu_map__get
> > routines that are passed a cpu. Make the existing cpu_map__get routines
> > use the new functions with a view to remove them when no longer used.
> > 
> > Reviewed-by: James Clark <james.clark@arm.com>
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
> >  tools/perf/util/cpumap.h |  6 ++-
> >  2 files changed, 51 insertions(+), 34 deletions(-)
> > 
> > diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> > index 87d3eca9b872..49fba2c53822 100644
> > --- a/tools/perf/util/cpumap.c
> > +++ b/tools/perf/util/cpumap.c
> > @@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
> >  	return ret ?: value;
> >  }
> >  
> > -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > -					void *data __maybe_unused)
> > +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
> >  {
> > -	int cpu;
> >  	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> >  
> > -	if (idx > map->nr)
> > -		return id;
> > -
> > -	cpu = map->map[idx];
> > -
> >  	id.socket = cpu_map__get_socket_id(cpu);
> >  	return id;
> >  }
> >  
> > +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > +				       void *data)
> > +{
> > +	if (idx < 0 || idx > map->nr)
> > +		return cpu_map__empty_aggr_cpu_id();
> > +
> > +	return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> > +}
> > +
> 
> 
> This 'idx < 0' wasn't in the original code nor is described in the
> comment log message, please avoid doing this, this may be harmless or
> even a good hardening, but either way would be interesting to have it in
> a separate patch. This eases review as in the end this code is just a
> refactoring, moving things around but in the end should be equivalent code.
> 
> There are a few more, please consider this and if you agree, to speed
> things up I can make the changes here, if I think this won't fallout in
> changes to subsequent patches touching this area.
Nah, should be harmless, and its too much work already, so lets leave it
as-is, but please consider this in the future.
- Arnaldo
 
> - Arnaldo
> 
> >  static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
> >  {
> >  	struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
> > @@ -200,15 +202,10 @@ int cpu_map__get_die_id(int cpu)
> >  	return ret ?: value;
> >  }
> >  
> > -struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data)
> > +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
> >  {
> > -	int cpu, die;
> > -	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > -
> > -	if (idx > map->nr)
> > -		return id;
> > -
> > -	cpu = map->map[idx];
> > +	struct aggr_cpu_id id;
> > +	int die;
> >  
> >  	die = cpu_map__get_die_id(cpu);
> >  	/* There is no die_id on legacy system. */
> > @@ -220,7 +217,7 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> >  	 * with the socket ID and then add die to
> >  	 * make a unique ID.
> >  	 */
> > -	id = cpu_map__get_socket(map, idx, data);
> > +	id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
> >  	if (cpu_map__aggr_cpu_id_is_empty(id))
> >  		return id;
> >  
> > @@ -228,6 +225,15 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> >  	return id;
> >  }
> >  
> > +struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
> > +				    void *data)
> > +{
> > +	if (idx < 0 || idx > map->nr)
> > +		return cpu_map__empty_aggr_cpu_id();
> 
> Ditto
> 
> > +
> > +	return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
> > +}
> > +
> >  int cpu_map__get_core_id(int cpu)
> >  {
> >  	int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
> > @@ -239,20 +245,13 @@ int cpu_map__get_node_id(int cpu)
> >  	return cpu__get_node(cpu);
> >  }
> >  
> > -struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> > +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
> >  {
> > -	int cpu;
> > -	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > -
> > -	if (idx > map->nr)
> > -		return id;
> > -
> > -	cpu = map->map[idx];
> > -
> > -	cpu = cpu_map__get_core_id(cpu);
> > +	struct aggr_cpu_id id;
> > +	int core = cpu_map__get_core_id(cpu);
> >  
> >  	/* cpu_map__get_die returns a struct with socket and die set*/
> > -	id = cpu_map__get_die(map, idx, data);
> > +	id = cpu_map__get_die_aggr_by_cpu(cpu, data);
> >  	if (cpu_map__aggr_cpu_id_is_empty(id))
> >  		return id;
> >  
> > @@ -260,19 +259,33 @@ struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *da
> >  	 * core_id is relative to socket and die, we need a global id.
> >  	 * So we combine the result from cpu_map__get_die with the core id
> >  	 */
> > -	id.core = cpu;
> > +	id.core = core;
> >  	return id;
> > +
> >  }
> >  
> > -struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data __maybe_unused)
> > +struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> > +{
> > +	if (idx < 0 || idx > map->nr)
> > +		return cpu_map__empty_aggr_cpu_id();
> 
> Ditto
> 
> > +
> > +	return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
> > +}
> > +
> > +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
> >  {
> >  	struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> >  
> > +	id.node = cpu_map__get_node_id(cpu);
> > +	return id;
> > +}
> > +
> > +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
> > +{
> >  	if (idx < 0 || idx >= map->nr)
> > -		return id;
> > +		return cpu_map__empty_aggr_cpu_id();
> >  
> > -	id.node = cpu_map__get_node_id(map->map[idx]);
> > -	return id;
> > +	return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
> >  }
> >  
> >  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
> > diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> > index a27eeaf086e8..c62d67704425 100644
> > --- a/tools/perf/util/cpumap.h
> > +++ b/tools/perf/util/cpumap.h
> > @@ -31,13 +31,17 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
> >  size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
> >  size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
> >  int cpu_map__get_socket_id(int cpu);
> > +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
> >  struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__get_die_id(int cpu);
> > +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
> >  struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__get_core_id(int cpu);
> > +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
> >  struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__get_node_id(int cpu);
> > -struct aggr_cpu_id  cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> > +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
> > +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> >  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
> >  int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
> >  int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
> > -- 
> > 2.34.1.448.ga2b2bfdf31-goog
> 
> -- 
> 
> - Arnaldo
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu.
  2022-01-11 19:36     ` Ian Rogers
@ 2022-01-11 19:56       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-11 19:56 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 11, 2022 at 11:36:02AM -0800, Ian Rogers escreveu:
> On Tue, Jan 11, 2022 at 11:33 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Tue, Jan 04, 2022 at 10:13:05PM -0800, Ian Rogers escreveu:
> > > The cpu_map and index can get confused. Add variants of the cpu_map__get
> > > routines that are passed a cpu. Make the existing cpu_map__get routines
> > > use the new functions with a view to remove them when no longer used.
> > >
> > > Reviewed-by: James Clark <james.clark@arm.com>
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
> > >  tools/perf/util/cpumap.h |  6 ++-
> > >  2 files changed, 51 insertions(+), 34 deletions(-)
> > >
> > > diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> > > index 87d3eca9b872..49fba2c53822 100644
> > > --- a/tools/perf/util/cpumap.c
> > > +++ b/tools/perf/util/cpumap.c
> > > @@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
> > >       return ret ?: value;
> > >  }
> > >
> > > -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > > -                                     void *data __maybe_unused)
> > > +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
> > >  {
> > > -     int cpu;
> > >       struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > >
> > > -     if (idx > map->nr)
> > > -             return id;
> > > -
> > > -     cpu = map->map[idx];
> > > -
> > >       id.socket = cpu_map__get_socket_id(cpu);
> > >       return id;
> > >  }
> > >
> > > +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> > > +                                    void *data)
> > > +{
> > > +     if (idx < 0 || idx > map->nr)
> > > +             return cpu_map__empty_aggr_cpu_id();
> > > +
> > > +     return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> > > +}
> > > +
> >
> >
> > This 'idx < 0' wasn't in the original code nor is described in the
> > comment log message, please avoid doing this, this may be harmless or
> > even a good hardening, but either way would be interesting to have it in
> > a separate patch. This eases review as in the end this code is just a
> > refactoring, moving things around but in the end should be equivalent code.
> >
> > There are a few more, please consider this and if you agree, to speed
> > things up I can make the changes here, if I think this won't fallout in
> > changes to subsequent patches touching this area.
> >
> > - Arnaldo
> 
> Fwiw, there's the same issue in cpu_map__get_die that's also in
> cpu_map__get_core, but weirdly not copied into cpu_map__get_node. As
> these functions are removed later I think doing nothing is best here.
Sure.
 
> Thanks,
> Ian
> 
> > >  static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
> > >  {
> > >       struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
> > > @@ -200,15 +202,10 @@ int cpu_map__get_die_id(int cpu)
> > >       return ret ?: value;
> > >  }
> > >
> > > -struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data)
> > > +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
> > >  {
> > > -     int cpu, die;
> > > -     struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > > -
> > > -     if (idx > map->nr)
> > > -             return id;
> > > -
> > > -     cpu = map->map[idx];
> > > +     struct aggr_cpu_id id;
> > > +     int die;
> > >
> > >       die = cpu_map__get_die_id(cpu);
> > >       /* There is no die_id on legacy system. */
> > > @@ -220,7 +217,7 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> > >        * with the socket ID and then add die to
> > >        * make a unique ID.
> > >        */
> > > -     id = cpu_map__get_socket(map, idx, data);
> > > +     id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
> > >       if (cpu_map__aggr_cpu_id_is_empty(id))
> > >               return id;
> > >
> > > @@ -228,6 +225,15 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> > >       return id;
> > >  }
> > >
> > > +struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
> > > +                                 void *data)
> > > +{
> > > +     if (idx < 0 || idx > map->nr)
> > > +             return cpu_map__empty_aggr_cpu_id();
> >
> > Ditto
> >
> > > +
> > > +     return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
> > > +}
> > > +
> > >  int cpu_map__get_core_id(int cpu)
> > >  {
> > >       int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
> > > @@ -239,20 +245,13 @@ int cpu_map__get_node_id(int cpu)
> > >       return cpu__get_node(cpu);
> > >  }
> > >
> > > -struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> > > +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
> > >  {
> > > -     int cpu;
> > > -     struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > > -
> > > -     if (idx > map->nr)
> > > -             return id;
> > > -
> > > -     cpu = map->map[idx];
> > > -
> > > -     cpu = cpu_map__get_core_id(cpu);
> > > +     struct aggr_cpu_id id;
> > > +     int core = cpu_map__get_core_id(cpu);
> > >
> > >       /* cpu_map__get_die returns a struct with socket and die set*/
> > > -     id = cpu_map__get_die(map, idx, data);
> > > +     id = cpu_map__get_die_aggr_by_cpu(cpu, data);
> > >       if (cpu_map__aggr_cpu_id_is_empty(id))
> > >               return id;
> > >
> > > @@ -260,19 +259,33 @@ struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *da
> > >        * core_id is relative to socket and die, we need a global id.
> > >        * So we combine the result from cpu_map__get_die with the core id
> > >        */
> > > -     id.core = cpu;
> > > +     id.core = core;
> > >       return id;
> > > +
> > >  }
> > >
> > > -struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data __maybe_unused)
> > > +struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> > > +{
> > > +     if (idx < 0 || idx > map->nr)
> > > +             return cpu_map__empty_aggr_cpu_id();
> >
> > Ditto
> >
> > > +
> > > +     return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
> > > +}
> > > +
> > > +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
> > >  {
> > >       struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> > >
> > > +     id.node = cpu_map__get_node_id(cpu);
> > > +     return id;
> > > +}
> > > +
> > > +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
> > > +{
> > >       if (idx < 0 || idx >= map->nr)
> > > -             return id;
> > > +             return cpu_map__empty_aggr_cpu_id();
> > >
> > > -     id.node = cpu_map__get_node_id(map->map[idx]);
> > > -     return id;
> > > +     return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
> > >  }
> > >
> > >  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
> > > diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> > > index a27eeaf086e8..c62d67704425 100644
> > > --- a/tools/perf/util/cpumap.h
> > > +++ b/tools/perf/util/cpumap.h
> > > @@ -31,13 +31,17 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
> > >  size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
> > >  size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
> > >  int cpu_map__get_socket_id(int cpu);
> > > +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
> > >  struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
> > >  int cpu_map__get_die_id(int cpu);
> > > +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
> > >  struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
> > >  int cpu_map__get_core_id(int cpu);
> > > +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
> > >  struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
> > >  int cpu_map__get_node_id(int cpu);
> > > -struct aggr_cpu_id  cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> > > +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
> > > +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> > >  int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
> > >  int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
> > >  int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
> > > --
> > > 2.34.1.448.ga2b2bfdf31-goog
> >
> > --
> >
> > - Arnaldo
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 48/48] perf cpumap: Give CPUs their own type.
  2022-01-11 19:16     ` Ian Rogers
@ 2022-01-11 20:01       ` Arnaldo Carvalho de Melo
  2022-01-11 20:47         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-11 20:01 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 11, 2022 at 11:16:16AM -0800, Ian Rogers escreveu:
> On Mon, Jan 10, 2022 at 11:50 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Tue, Jan 04, 2022 at 10:13:51PM -0800, Ian Rogers escreveu:
> > > A common problem is confusing CPU map indices with the CPU, by wrapping
> > > the CPU with a struct then this is avoided. This approach is similar to
> > > atomic_t.
> >
> > This one needed this to build with BUILD_BPF_SKEL=1, please check, I'll
> > soon push this to tmp.perf/perf_cpu so that you can take a look and test
> > it.
> 
> 
> Thanks Arnaldo, I did eye-ball one issue where cpu_map__default_new's
> behavior would be impacted (nr would be 1 rather than nr_cpus). The
> fix is:
> 
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -18,7 +18,7 @@ static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
>         if (!cpus)
>                 return NULL;
> 
> -       cpus->nr = 1;
> +       cpus->nr = nr_cpus;
>         refcount_set(&cpus->refcnt, 1);
>         return cpus;
>  }
> Could you add that into this patch as well?
Sure.
I´m now running perf-test after each of those csets, so far so good.
 
> Thanks,
> Ian
> 
> 
> > - Arnaldo
> >
> > diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> > index 80d1a3a31052fe55..328479df5e16a638 100644
> > --- a/tools/perf/util/bpf_counter.c
> > +++ b/tools/perf/util/bpf_counter.c
> > @@ -540,7 +540,7 @@ static int bperf__load(struct evsel *evsel, struct target *target)
> >                     filter_type == BPERF_FILTER_TGID)
> >                         key = evsel->core.threads->map[i].pid;
> >                 else if (filter_type == BPERF_FILTER_CPU)
> > -                       key = evsel->core.cpus->map[i];
> > +                       key = evsel->core.cpus->map[i].cpu;
> >                 else
> >                         break;
> >
> > @@ -584,7 +584,7 @@ static int bperf_sync_counters(struct evsel *evsel)
> >
> >         num_cpu = all_cpu_map->nr;
> >         for (i = 0; i < num_cpu; i++) {
> > -               cpu = all_cpu_map->map[i];
> > +               cpu = all_cpu_map->map[i].cpu;
> >                 bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu);
> >         }
> >         return 0;
> > @@ -605,7 +605,7 @@ static int bperf__disable(struct evsel *evsel)
> >  static int bperf__read(struct evsel *evsel)
> >  {
> >         struct bperf_follower_bpf *skel = evsel->follower_skel;
> > -       __u32 num_cpu_bpf = cpu__max_cpu();
> > +       __u32 num_cpu_bpf = cpu__max_cpu().cpu;
> >         struct bpf_perf_event_value values[num_cpu_bpf];
> >         int reading_map_fd, err = 0;
> >         __u32 i;
> > @@ -615,6 +615,7 @@ static int bperf__read(struct evsel *evsel)
> >         reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> >
> >         for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
> > +               struct perf_cpu entry;
> >                 __u32 cpu;
> >
> >                 err = bpf_map_lookup_elem(reading_map_fd, &i, values);
> > @@ -624,14 +625,15 @@ static int bperf__read(struct evsel *evsel)
> >                 case BPERF_FILTER_GLOBAL:
> >                         assert(i == 0);
> >
> > -                       perf_cpu_map__for_each_cpu(cpu, j, all_cpu_map) {
> > +                       perf_cpu_map__for_each_cpu(entry, j, all_cpu_map) {
> > +                               cpu = entry.cpu;
> >                                 perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
> >                                 perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
> >                                 perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
> >                         }
> >                         break;
> >                 case BPERF_FILTER_CPU:
> > -                       cpu = evsel->core.cpus->map[i];
> > +                       cpu = evsel->core.cpus->map[i].cpu;
> >                         perf_counts(evsel->counts, i, 0)->val = values[cpu].counter;
> >                         perf_counts(evsel->counts, i, 0)->ena = values[cpu].enabled;
> >                         perf_counts(evsel->counts, i, 0)->run = values[cpu].running;
> > diff --git a/tools/perf/util/bpf_counter_cgroup.c b/tools/perf/util/bpf_counter_cgroup.c
> > index cbc6c2bca488f6bf..631e34a0b66ff084 100644
> > --- a/tools/perf/util/bpf_counter_cgroup.c
> > +++ b/tools/perf/util/bpf_counter_cgroup.c
> > @@ -48,7 +48,7 @@ static int bperf_load_program(struct evlist *evlist)
> >         struct cgroup *cgrp, *leader_cgrp;
> >         __u32 i, cpu;
> >         __u32 nr_cpus = evlist->core.all_cpus->nr;
> > -       int total_cpus = cpu__max_cpu();
> > +       int total_cpus = cpu__max_cpu().cpu;
> >         int map_size, map_fd;
> >         int prog_fd, err;
> >
> > @@ -125,7 +125,7 @@ static int bperf_load_program(struct evlist *evlist)
> >                         for (cpu = 0; cpu < nr_cpus; cpu++) {
> >                                 int fd = FD(evsel, cpu);
> >                                 __u32 idx = evsel->core.idx * total_cpus +
> > -                                       evlist->core.all_cpus->map[cpu];
> > +                                       evlist->core.all_cpus->map[cpu].cpu;
> >
> >                                 err = bpf_map_update_elem(map_fd, &idx, &fd,
> >                                                           BPF_ANY);
> > @@ -212,7 +212,7 @@ static int bperf_cgrp__sync_counters(struct evlist *evlist)
> >         int prog_fd = bpf_program__fd(skel->progs.trigger_read);
> >
> >         for (i = 0; i < nr_cpus; i++) {
> > -               cpu = evlist->core.all_cpus->map[i];
> > +               cpu = evlist->core.all_cpus->map[i].cpu;
> >                 bperf_trigger_reading(prog_fd, cpu);
> >         }
> >
> > @@ -245,7 +245,7 @@ static int bperf_cgrp__read(struct evsel *evsel)
> >  {
> >         struct evlist *evlist = evsel->evlist;
> >         int i, cpu, nr_cpus = evlist->core.all_cpus->nr;
> > -       int total_cpus = cpu__max_cpu();
> > +       int total_cpus = cpu__max_cpu().cpu;
> >         struct perf_counts_values *counts;
> >         struct bpf_perf_event_value *values;
> >         int reading_map_fd, err = 0;
> > @@ -272,7 +272,7 @@ static int bperf_cgrp__read(struct evsel *evsel)
> >                 }
> >
> >                 for (i = 0; i < nr_cpus; i++) {
> > -                       cpu = evlist->core.all_cpus->map[i];
> > +                       cpu = evlist->core.all_cpus->map[i].cpu;
> >
> >                         counts = perf_counts(evsel->counts, i, 0);
> >                         counts->val = values[cpu].counter;
> > diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
> > index 28dc4c60c7884818..d756cc66eef32ae8 100644
> > --- a/tools/perf/util/bpf_ftrace.c
> > +++ b/tools/perf/util/bpf_ftrace.c
> > @@ -63,7 +63,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
> >                 fd = bpf_map__fd(skel->maps.cpu_filter);
> >
> >                 for (i = 0; i < ncpus; i++) {
> > -                       cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i);
> > +                       cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i).cpu;
> >                         bpf_map_update_elem(fd, &cpu, &val, BPF_ANY);
> >                 }
> >         }
> > @@ -122,7 +122,7 @@ int perf_ftrace__latency_read_bpf(struct perf_ftrace *ftrace __maybe_unused,
> >         int i, fd, err;
> >         u32 idx;
> >         u64 *hist;
> > -       int ncpus = cpu__max_cpu();
> > +       int ncpus = cpu__max_cpu().cpu;
> >
> >         fd = bpf_map__fd(skel->maps.latency);
> >
-- 
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 48/48] perf cpumap: Give CPUs their own type.
  2022-01-11 20:01       ` Arnaldo Carvalho de Melo
@ 2022-01-11 20:47         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 84+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-11 20:47 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry, Kajol Jain,
	Paul A . Clarke, Riccardo Mancini, Kan Liang, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, linux-perf-users,
	linux-kernel, Vineet Singh, James Clark, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Leo Yan, coresight,
	linux-arm-kernel, zhengjun.xing, eranian
Em Tue, Jan 11, 2022 at 05:01:10PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Jan 11, 2022 at 11:16:16AM -0800, Ian Rogers escreveu:
> > On Mon, Jan 10, 2022 at 11:50 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > Em Tue, Jan 04, 2022 at 10:13:51PM -0800, Ian Rogers escreveu:
> > > > A common problem is confusing CPU map indices with the CPU, by wrapping
> > > > the CPU with a struct then this is avoided. This approach is similar to
> > > > atomic_t.
> > >
> > > This one needed this to build with BUILD_BPF_SKEL=1, please check, I'll
> > > soon push this to tmp.perf/perf_cpu so that you can take a look and test
> > > it.
> > Thanks Arnaldo, I did eye-ball one issue where cpu_map__default_new's
> > behavior would be impacted (nr would be 1 rather than nr_cpus). The
> > fix is:
> > 
> > --- a/tools/lib/perf/cpumap.c
> > +++ b/tools/lib/perf/cpumap.c
> > @@ -18,7 +18,7 @@ static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
> >         if (!cpus)
> >                 return NULL;
> > 
> > -       cpus->nr = 1;
> > +       cpus->nr = nr_cpus;
> >         refcount_set(&cpus->refcnt, 1);
> >         return cpus;
> >  }
> 
> > Could you add that into this patch as well?
> 
> Sure.
> 
> I´m now running perf-test after each of those csets, so far so good.
This is what I'm running now after doing a 'git rebase -i
start-of-your-patchkit' with all set to 'edit'.
Will take a while :-)
# cat ~/bin/perf-test-build
#!/bin/bash
while true ; do
	rm -rf /tmp/build/perf ; sudo -u acme mkdir /tmp/build/perf
	sudo -u acme make -k BUILD_BPF_SKEL=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin || exit 1
	cd ..
	perf test || exit 2
	cd -
	sudo -u acme git rebase --continue || exit 0
done
- Arnaldo
^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH v4 01/48] libperf: Add comments to perf_cpu_map.
  2022-01-05  6:13 ` [PATCH v4 01/48] libperf: Add comments to perf_cpu_map Ian Rogers
  2022-01-10 17:00   ` John Garry
@ 2022-01-12  5:35   ` kajoljain
  1 sibling, 0 replies; 84+ messages in thread
From: kajoljain @ 2022-01-12  5:35 UTC (permalink / raw)
  To: Ian Rogers, Andi Kleen, Jiri Olsa, Namhyung Kim, John Garry,
	Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
	Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, linux-perf-users, linux-kernel, Vineet Singh,
	James Clark, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	Leo Yan, coresight, linux-arm-kernel, zhengjun.xing
  Cc: eranian
On 1/5/22 11:43 AM, Ian Rogers wrote:
> A particular observed problem is confusing the index with the CPU value,
> documentation should hopefully reduce this type of problem.
> 
Patch looks good.
Reviewed-by: Kajol Jain<kjain@linux.ibm.com>
> Reviewed-by: James Clark <james.clark@arm.com>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/include/internal/cpumap.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> index 840d4032587b..4054169c12c5 100644
> --- a/tools/lib/perf/include/internal/cpumap.h
> +++ b/tools/lib/perf/include/internal/cpumap.h
> @@ -4,9 +4,18 @@
>  
>  #include <linux/refcount.h>
>  
> +/**
> + * A sized, reference counted, sorted array of integers representing CPU
> + * numbers. This is commonly used to capture which CPUs a PMU is associated
> + * with. The indices into the cpumap are frequently used as they avoid having
> + * gaps if CPU numbers were used. For events associated with a pid, rather than
> + * a CPU, a single dummy map with an entry of -1 is used.
> + */
>  struct perf_cpu_map {
>  	refcount_t	refcnt;
> +	/** Length of the map array. */
>  	int		nr;
> +	/** The CPU values. */
>  	int		map[];
>  };
>  
> 
^ permalink raw reply	[flat|nested] 84+ messages in thread
end of thread, other threads:[~2022-01-12  5:36 UTC | newest]
Thread overview: 84+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-01-05  6:13 [PATCH v4 00/48] Refactor perf cpumap Ian Rogers
2022-01-05  6:13 ` [PATCH v4 01/48] libperf: Add comments to perf_cpu_map Ian Rogers
2022-01-10 17:00   ` John Garry
2022-01-12  5:35   ` kajoljain
2022-01-05  6:13 ` [PATCH v4 02/48] perf stat: Add aggr creators that are passed a cpu Ian Rogers
2022-01-10 17:10   ` John Garry
2022-01-10 17:36     ` Ian Rogers
2022-01-10 17:51       ` John Garry
2022-01-10 18:52       ` Arnaldo Carvalho de Melo
2022-01-10 18:53         ` Arnaldo Carvalho de Melo
2022-01-10 18:56           ` Arnaldo Carvalho de Melo
2022-01-10 19:17   ` John Garry
2022-01-11 19:33   ` Arnaldo Carvalho de Melo
2022-01-11 19:36     ` Ian Rogers
2022-01-11 19:56       ` Arnaldo Carvalho de Melo
2022-01-11 19:49     ` Arnaldo Carvalho de Melo
2022-01-05  6:13 ` [PATCH v4 03/48] perf stat: Correct aggregation CPU map Ian Rogers
2022-01-10 18:43   ` John Garry
2022-01-05  6:13 ` [PATCH v4 04/48] perf stat: Switch aggregation to use for_each loop Ian Rogers
2022-01-10 18:50   ` John Garry
2022-01-10 22:22     ` Ian Rogers
2022-01-05  6:13 ` [PATCH v4 05/48] perf stat: Switch to cpu version of cpu_map__get Ian Rogers
2022-01-10 19:04   ` John Garry
2022-01-11 18:46     ` Ian Rogers
2022-01-11 18:50       ` Arnaldo Carvalho de Melo
2022-01-05  6:13 ` [PATCH v4 06/48] perf cpumap: Switch cpu_map__build_map to cpu function Ian Rogers
2022-01-05  6:13 ` [PATCH v4 07/48] perf cpumap: Remove map+index get_socket Ian Rogers
2022-01-05  6:13 ` [PATCH v4 08/48] perf cpumap: Remove map+index get_die Ian Rogers
2022-01-05  6:13 ` [PATCH v4 09/48] perf cpumap: Remove map+index get_core Ian Rogers
2022-01-05  6:13 ` [PATCH v4 10/48] perf cpumap: Remove map+index get_node Ian Rogers
2022-01-05  6:13 ` [PATCH v4 11/48] perf cpumap: Add comments to aggr_cpu_id Ian Rogers
2022-01-05  6:13 ` [PATCH v4 12/48] perf cpumap: Remove unused cpu_map__socket Ian Rogers
2022-01-05  6:13 ` [PATCH v4 13/48] perf cpumap: Simplify equal function name Ian Rogers
2022-01-05  6:13 ` [PATCH v4 14/48] perf cpumap: Rename empty functions Ian Rogers
2022-01-05  6:13 ` [PATCH v4 15/48] perf cpumap: Document cpu__get_node and remove redundant function Ian Rogers
2022-01-05  6:13 ` [PATCH v4 16/48] perf cpumap: Remove map from function names that don't use a map Ian Rogers
2022-01-05  6:13 ` [PATCH v4 17/48] perf cpumap: Remove cpu_map__cpu, use libperf function Ian Rogers
2022-01-10 19:02   ` Arnaldo Carvalho de Melo
2022-01-10 19:04     ` Arnaldo Carvalho de Melo
2022-01-10 19:06       ` Arnaldo Carvalho de Melo
2022-01-05  6:13 ` [PATCH v4 18/48] perf cpumap: Refactor cpu_map__build_map Ian Rogers
2022-01-05  6:13 ` [PATCH v4 19/48] perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions Ian Rogers
2022-01-05  6:13 ` [PATCH v4 20/48] perf cpumap: Move 'has' function to libperf Ian Rogers
2022-01-05  6:13 ` [PATCH v4 21/48] perf cpumap: Add some comments to cpu_aggr_map Ian Rogers
2022-01-05  6:13 ` [PATCH v4 22/48] perf cpumap: Trim the cpu_aggr_map Ian Rogers
2022-01-05  6:13 ` [PATCH v4 23/48] perf stat: Fix memory leak in check_per_pkg Ian Rogers
2022-01-05  6:13 ` [PATCH v4 24/48] perf cpumap: Add CPU to aggr_cpu_id Ian Rogers
2022-01-05  6:13 ` [PATCH v4 25/48] perf stat-display: Avoid use of core for CPU Ian Rogers
2022-01-05  6:13 ` [PATCH v4 26/48] perf evsel: Derive CPUs and threads in alloc_counts Ian Rogers
2022-01-09 18:30   ` Jiri Olsa
2022-01-10  5:49     ` Ian Rogers
2022-01-05  6:13 ` [PATCH v4 27/48] libperf: Switch cpu to more accurate cpu_map_idx Ian Rogers
2022-01-05  6:13 ` [PATCH v4 28/48] libperf: Use cpu not index for evsel mmap Ian Rogers
2022-01-05  6:13 ` [PATCH v4 29/48] perf counts: Switch name cpu to cpu_map_idx Ian Rogers
2022-01-05  6:13 ` [PATCH v4 30/48] perf stat: Rename aggr_data cpu to imply it's an index Ian Rogers
2022-01-05  6:13 ` [PATCH v4 31/48] perf stat: Use perf_cpu_map__for_each_cpu Ian Rogers
2022-01-05  6:13 ` [PATCH v4 32/48] perf script: Use for each cpu to aid readability Ian Rogers
2022-01-05  6:13 ` [PATCH v4 33/48] libperf: Allow NULL in perf_cpu_map__idx Ian Rogers
2022-01-05  6:13 ` [PATCH v4 34/48] perf evlist: Refactor evlist__for_each_cpu Ian Rogers
2022-01-05  6:13 ` [PATCH v4 35/48] perf evsel: Pass cpu not cpu map index to synthesize Ian Rogers
2022-01-05  6:13 ` [PATCH v4 36/48] perf stat: Correct variable name for read counter Ian Rogers
2022-01-05  6:13 ` [PATCH v4 37/48] perf evsel: Rename CPU around get_group_fd Ian Rogers
2022-01-05  6:13 ` [PATCH v4 38/48] perf evsel: Reduce scope of evsel__ignore_missing_thread Ian Rogers
2022-01-05  6:13 ` [PATCH v4 39/48] perf evsel: Rename variable cpu to index Ian Rogers
2022-01-05  6:13 ` [PATCH v4 40/48] perf test: Use perf_cpu_map__for_each_cpu Ian Rogers
2022-01-05  6:13 ` [PATCH v4 41/48] perf stat: Correct check_per_pkg cpu Ian Rogers
2022-01-05  6:13 ` [PATCH v4 42/48] perf stat: Swap variable name cpu to index Ian Rogers
2022-01-05  6:13 ` [PATCH v4 43/48] libperf: Sync evsel documentation Ian Rogers
2022-01-05  6:13 ` [PATCH v4 44/48] perf bpf: Rename cpu to cpu_map_idx Ian Rogers
2022-01-10 19:10   ` Arnaldo Carvalho de Melo
2022-01-10 19:26     ` Arnaldo Carvalho de Melo
2022-01-05  6:13 ` [PATCH v4 45/48] perf c2c: Use more intention revealing iterator Ian Rogers
2022-01-05  6:13 ` [PATCH v4 46/48] perf script: Fix flipped index and cpu Ian Rogers
2022-01-05  6:13 ` [PATCH v4 47/48] perf stat: Correct first_shadow_cpu to return index Ian Rogers
2022-01-05  6:13 ` [PATCH v4 48/48] perf cpumap: Give CPUs their own type Ian Rogers
2022-01-09 18:30   ` Jiri Olsa
2022-01-10  5:45     ` Ian Rogers
2022-01-10 19:50   ` Arnaldo Carvalho de Melo
2022-01-11 19:16     ` Ian Rogers
2022-01-11 20:01       ` Arnaldo Carvalho de Melo
2022-01-11 20:47         ` Arnaldo Carvalho de Melo
2022-01-10 20:17 ` [PATCH v4 00/48] Refactor perf cpumap Arnaldo Carvalho de Melo
2022-01-10 20:40   ` Arnaldo Carvalho de Melo
2022-01-10 20:42     ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).