* [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid
@ 2025-07-17 21:02 Ian Rogers
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
To: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Ben Gainey, James Clark, Howard Chu, Weilin Wang,
Levi Yun, Dr. David Alan Gilbert, Zhongqiu Han, Blake Jones,
Yicong Yang, Anubhav Shelat, Thomas Richter, Jean-Philippe Romain,
Song Liu, linux-perf-users, linux-kernel
On hybrid systems some PMUs apply to all core types; for metrics this
is particularly true of the msr PMU and its tsc event. Metrics often
only want the counter values for their specific core type. These
patches allow the cpu term in an event to name a PMU to take the
cpumask from. For example:
$ perf stat -e msr/tsc,cpu=cpu_atom/ ...
will aggregate the msr/tsc/ value but only for atom cores. In doing
this, problems were identified in how cpumasks are handled by event
parsing and event setup when cpumasks are specified along with a task
to profile. The event parsing, evlist cpumask propagation and perf
stat code are updated accordingly.
The final result of the patch series is to be able to run:
```
$ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
10.1: Basic parsing test : Ok
10.2: Parsing without PMU name : Ok
10.3: Parsing with PMU name : Ok
Performance counter stats for 'perf test -F 10':
63,704,975 msr/tsc/
47,060,704 msr/tsc,cpu=cpu_core/ (4.62%)
16,640,591 msr/tsc,cpu=cpu_atom/ (2.18%)
```
This has (further) identified a kernel bug for task events where the
enabled time is too large, leading to invalid scaling (hence the
--no-scale in the command line above).
v2: Add additional documentation of the cpu term to `perf list`
(Namhyung) and extend the term to also allow CPU ranges. Add Thomas
Falcon's reviewed-by. It is still open for discussion whether the
cpu term should have >1 variant for PMUs, etc., or whether the
single term is okay. We could refactor later and add a new term,
but that would break existing users; as those are most likely to
be metrics, it is probably not a huge issue.
Ian Rogers (12):
perf parse-events: Warn if a cpu term is unsupported by a CPU
perf stat: Avoid buffer overflow to the aggregation map
perf stat: Don't size aggregation ids from user_requested_cpus
perf parse-events: Allow the cpu term to be a PMU or CPU range
perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask
libperf evsel: Rename own_cpus to pmu_cpus
libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete
perf evsel: Use libperf perf_evsel__exit
perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu
perf parse-events: Minor __add_event refactoring
perf evsel: Add evsel__open_per_cpu_and_thread
perf parse-events: Support user CPUs mixed with threads/processes
tools/lib/perf/evlist.c | 118 ++++++++++++++++--------
tools/lib/perf/evsel.c | 9 +-
tools/lib/perf/include/internal/evsel.h | 3 +-
tools/perf/Documentation/perf-list.txt | 25 +++--
tools/perf/builtin-stat.c | 9 +-
tools/perf/tests/event_update.c | 4 +-
tools/perf/util/evlist.c | 15 +--
tools/perf/util/evsel.c | 55 +++++++++--
tools/perf/util/evsel.h | 5 +
tools/perf/util/expr.c | 2 +-
tools/perf/util/header.c | 4 +-
tools/perf/util/parse-events.c | 112 ++++++++++++++++------
tools/perf/util/pmus.c | 29 +++---
tools/perf/util/pmus.h | 2 +
tools/perf/util/stat.c | 6 +-
tools/perf/util/synthetic-events.c | 4 +-
tools/perf/util/tool_pmu.c | 56 +++++++++--
tools/perf/util/tool_pmu.h | 2 +-
18 files changed, 323 insertions(+), 137 deletions(-)
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 01/12] perf parse-events: Warn if a cpu term is unsupported by a CPU
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
Factor the requested-CPU warning out of evlist and into evsel. At the
end of adding an event, perform the warning check. To avoid repeatedly
testing whether the cpu_list is empty, add a local variable.
```
$ perf stat -e cpu_atom/cycles,cpu=1/ -a true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cpu_atom/cycles/'
Performance counter stats for 'system wide':
<not supported> cpu_atom/cycles/
0.000781511 seconds time elapsed
```
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/evlist.c | 15 +--------------
tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
tools/perf/util/evsel.h | 2 ++
tools/perf/util/parse-events.c | 12 ++++++++----
4 files changed, 35 insertions(+), 18 deletions(-)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 995ad5f654d0..80d8387e6b97 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -2549,20 +2549,7 @@ void evlist__warn_user_requested_cpus(struct evlist *evlist, const char *cpu_lis
return;
evlist__for_each_entry(evlist, pos) {
- struct perf_cpu_map *intersect, *to_test, *online = cpu_map__online();
- const struct perf_pmu *pmu = evsel__find_pmu(pos);
-
- to_test = pmu && pmu->is_core ? pmu->cpus : online;
- intersect = perf_cpu_map__intersect(to_test, user_requested_cpus);
- if (!perf_cpu_map__equal(intersect, user_requested_cpus)) {
- char buf[128];
-
- cpu_map__snprint(to_test, buf, sizeof(buf));
- pr_warning("WARNING: A requested CPU in '%s' is not supported by PMU '%s' (CPUs %s) for event '%s'\n",
- cpu_list, pmu ? pmu->name : "cpu", buf, evsel__name(pos));
- }
- perf_cpu_map__put(intersect);
- perf_cpu_map__put(online);
+ evsel__warn_user_requested_cpus(pos, user_requested_cpus);
}
perf_cpu_map__put(user_requested_cpus);
}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3896a04d90af..d9b6bf78d67b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -4091,3 +4091,27 @@ void evsel__uniquify_counter(struct evsel *counter)
counter->uniquified_name = false;
}
}
+
+void evsel__warn_user_requested_cpus(struct evsel *evsel, struct perf_cpu_map *user_requested_cpus)
+{
+ struct perf_cpu_map *intersect, *online = NULL;
+ const struct perf_pmu *pmu = evsel__find_pmu(evsel);
+
+ if (pmu && pmu->is_core) {
+ intersect = perf_cpu_map__intersect(pmu->cpus, user_requested_cpus);
+ } else {
+ online = cpu_map__online();
+ intersect = perf_cpu_map__intersect(online, user_requested_cpus);
+ }
+ if (!perf_cpu_map__equal(intersect, user_requested_cpus)) {
+ char buf1[128];
+ char buf2[128];
+
+ cpu_map__snprint(user_requested_cpus, buf1, sizeof(buf1));
+ cpu_map__snprint(online ?: pmu->cpus, buf2, sizeof(buf2));
+ pr_warning("WARNING: A requested CPU in '%s' is not supported by PMU '%s' (CPUs %s) for event '%s'\n",
+ buf1, pmu ? pmu->name : "cpu", buf2, evsel__name(evsel));
+ }
+ perf_cpu_map__put(intersect);
+ perf_cpu_map__put(online);
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index b84ee274602d..cefa8e64c0d5 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -574,4 +574,6 @@ void evsel__set_config_if_unset(struct perf_pmu *pmu, struct evsel *evsel,
bool evsel__is_offcpu_event(struct evsel *evsel);
+void evsel__warn_user_requested_cpus(struct evsel *evsel, struct perf_cpu_map *user_requested_cpus);
+
#endif /* __PERF_EVSEL_H */
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index a59ae5ca0f89..3fd6cc0c2794 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -252,6 +252,7 @@ __add_event(struct list_head *list, int *idx,
struct evsel *evsel;
bool is_pmu_core;
struct perf_cpu_map *cpus;
+ bool has_cpu_list = !perf_cpu_map__is_empty(cpu_list);
/*
* Ensure the first_wildcard_match's PMU matches that of the new event
@@ -276,7 +277,7 @@ __add_event(struct list_head *list, int *idx,
if (pmu) {
is_pmu_core = pmu->is_core;
- cpus = perf_cpu_map__get(perf_cpu_map__is_empty(cpu_list) ? pmu->cpus : cpu_list);
+ cpus = perf_cpu_map__get(has_cpu_list ? cpu_list : pmu->cpus);
perf_pmu__warn_invalid_formats(pmu);
if (attr->type == PERF_TYPE_RAW || attr->type >= PERF_TYPE_MAX) {
perf_pmu__warn_invalid_config(pmu, attr->config, name,
@@ -291,10 +292,10 @@ __add_event(struct list_head *list, int *idx,
} else {
is_pmu_core = (attr->type == PERF_TYPE_HARDWARE ||
attr->type == PERF_TYPE_HW_CACHE);
- if (perf_cpu_map__is_empty(cpu_list))
- cpus = is_pmu_core ? perf_cpu_map__new_online_cpus() : NULL;
- else
+ if (has_cpu_list)
cpus = perf_cpu_map__get(cpu_list);
+ else
+ cpus = is_pmu_core ? cpu_map__online() : NULL;
}
if (init_attr)
event_attr_init(attr);
@@ -326,6 +327,9 @@ __add_event(struct list_head *list, int *idx,
if (list)
list_add_tail(&evsel->core.node, list);
+ if (has_cpu_list)
+ evsel__warn_user_requested_cpus(evsel, cpu_list);
+
return evsel;
}
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 02/12] perf stat: Avoid buffer overflow to the aggregation map
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
CPU values may be created and passed to perf_stat__get_aggr (via
config->aggr_get_id), for example from should_skip_zero_counter in
the stat display code. There may be no such aggr_id, for example when
running with a thread. Add a missing bounds check and just create IDs
for these cases.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/builtin-stat.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 77e2248fa7fc..73b4521ab8af 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1365,7 +1365,7 @@ static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
struct aggr_cpu_id id;
/* per-process mode - should use global aggr mode */
- if (cpu.cpu == -1)
+ if (cpu.cpu == -1 || cpu.cpu >= config->cpus_aggr_map->nr)
return get_id(config, cpu);
if (aggr_cpu_id__is_empty(&config->cpus_aggr_map->map[cpu.cpu]))
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 03/12] perf stat: Don't size aggregation ids from user_requested_cpus
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
As evsels may have additional CPU terms, user_requested_cpus may not
reflect all the CPUs requested. Use evlist->all_cpus to size the
array instead, as it reflects all the CPUs potentially needed by the
evlist.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/builtin-stat.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 73b4521ab8af..00fce828cd5e 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1513,11 +1513,8 @@ static int perf_stat_init_aggr_mode(void)
* taking the highest cpu number to be the size of
* the aggregation translate cpumap.
*/
- if (!perf_cpu_map__is_any_cpu_or_is_empty(evsel_list->core.user_requested_cpus))
- nr = perf_cpu_map__max(evsel_list->core.user_requested_cpus).cpu;
- else
- nr = 0;
- stat_config.cpus_aggr_map = cpu_aggr_map__empty_new(nr + 1);
+ nr = perf_cpu_map__max(evsel_list->core.all_cpus).cpu + 1;
+ stat_config.cpus_aggr_map = cpu_aggr_map__empty_new(nr);
return stat_config.cpus_aggr_map ? 0 : -ENOMEM;
}
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 04/12] perf parse-events: Allow the cpu term to be a PMU or CPU range
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
On hybrid systems, events like msr/tsc/ will aggregate counts across
all CPUs. Often metrics only want a value like msr/tsc/ for the cores
on which the metric is being computed. Listing each CPU with terms
cpu=0,cpu=1,... is laborious and would need to be encoded for every
variation of a CPU model.
Allow the cpumask from a PMU to be an argument to the cpu term. For
example, in the following the cpumask of the cstate_pkg PMU selects
the CPUs on which to count the msr/tsc/ event:
```
$ cat /sys/bus/event_source/devices/cstate_pkg/cpumask
0
$ perf stat -A -e 'msr/tsc,cpu=cstate_pkg/' -a sleep 0.1
Performance counter stats for 'system wide':
CPU0 252,621,253 msr/tsc,cpu=cstate_pkg/
0.101184092 seconds time elapsed
```
As the cpu term is now also allowed to be a string, allow it to encode
a range of CPUs (a list can't be supported as ',' is already a special
token).
The "event qualifiers" section of the `perf list` man page is updated
to detail the additional behavior. The man page formatting is tidied
up in this section, as it was incorrectly appearing within the
"parameterized events" section.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/Documentation/perf-list.txt | 25 +++++++++-----
tools/perf/util/parse-events.c | 47 +++++++++++++++++++++-----
2 files changed, 54 insertions(+), 18 deletions(-)
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index ce0735021473..28215306a78a 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -278,26 +278,33 @@ also be supplied. For example:
perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
-EVENT QUALIFIERS:
+EVENT QUALIFIERS
+----------------
It is also possible to add extra qualifiers to an event:
percore:
-Sums up the event counts for all hardware threads in a core, e.g.:
-
-
- perf stat -e cpu/event=0,umask=0x3,percore=1/
+ Sums up the event counts for all hardware threads in a core, e.g.:
+ perf stat -e cpu/event=0,umask=0x3,percore=1/
cpu:
-Specifies the CPU to open the event upon. The value may be repeated to
-specify opening the event on multiple CPUs:
+ Specifies a CPU or a range of CPUs to open the event upon. It may
+ also reference a PMU to copy the CPU mask from. The value may be
+ repeated to specify opening the event on multiple CPUs.
+ Example 1: to open the instructions event on CPUs 0 and 2, the
+ cycles event on CPUs 1 and 2:
+ perf stat -e instructions/cpu=0,cpu=2/,cycles/cpu=1-2/ -a sleep 1
- perf stat -e instructions/cpu=0,cpu=2/,cycles/cpu=1,cpu=2/ -a sleep 1
- perf stat -e data_read/cpu=0/,data_write/cpu=1/ -a sleep 1
+ Example 2: to open the data_read uncore event on CPU 0 and the
+ data_write uncore event on CPU 1:
+ perf stat -e data_read/cpu=0/,data_write/cpu=1/ -a sleep 1
+ Example 3: to open the software msr/tsc/ event only on the CPUs
+ matching those from the cpu_core PMU:
+ perf stat -e msr/tsc,cpu=cpu_core/ -a sleep 1
EVENT GROUPS
------------
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 3fd6cc0c2794..a337e4d22ff2 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -187,10 +187,22 @@ static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head
list_for_each_entry(term, &head_terms->terms, list) {
if (term->type_term == PARSE_EVENTS__TERM_TYPE_CPU) {
- struct perf_cpu_map *cpu = perf_cpu_map__new_int(term->val.num);
-
- perf_cpu_map__merge(&cpus, cpu);
- perf_cpu_map__put(cpu);
+ struct perf_cpu_map *term_cpus;
+
+ if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
+ term_cpus = perf_cpu_map__new_int(term->val.num);
+ } else {
+ struct perf_pmu *pmu = perf_pmus__find(term->val.str);
+
+ if (pmu && perf_cpu_map__is_empty(pmu->cpus))
+ term_cpus = pmu->is_core ? cpu_map__online() : NULL;
+ else if (pmu)
+ term_cpus = perf_cpu_map__get(pmu->cpus);
+ else
+ term_cpus = perf_cpu_map__new(term->val.str);
+ }
+ perf_cpu_map__merge(&cpus, term_cpus);
+ perf_cpu_map__put(term_cpus);
}
}
@@ -1048,15 +1060,32 @@ do { \
return -EINVAL;
}
break;
- case PARSE_EVENTS__TERM_TYPE_CPU:
- CHECK_TYPE_VAL(NUM);
- if (term->val.num >= (u64)cpu__max_present_cpu().cpu) {
+ case PARSE_EVENTS__TERM_TYPE_CPU: {
+ struct perf_cpu_map *map;
+
+ if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
+ if (term->val.num >= (u64)cpu__max_present_cpu().cpu) {
+ parse_events_error__handle(err, term->err_val,
+ strdup("too big"),
+ /*help=*/NULL);
+ return -EINVAL;
+ }
+ break;
+ }
+ assert(term->type_val == PARSE_EVENTS__TERM_TYPE_STR);
+ if (perf_pmus__find(term->val.str) != NULL)
+ break;
+
+ map = perf_cpu_map__new(term->val.str);
+ if (!map) {
parse_events_error__handle(err, term->err_val,
- strdup("too big"),
- NULL);
+ strdup("not a valid PMU or CPU number"),
+ /*help=*/NULL);
return -EINVAL;
}
+ perf_cpu_map__put(map);
break;
+ }
case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
case PARSE_EVENTS__TERM_TYPE_USER:
case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 05/12] perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
For hybrid metrics it is useful to know the number of p-core or e-core
CPUs. If a cpumask is specified for the num_cpus or num_cpus_online
tool events, compute the value relative to the given mask rather than
for the full system.
```
$ sudo /tmp/perf/perf stat -e 'tool/num_cpus/,tool/num_cpus,cpu=cpu_core/,
tool/num_cpus,cpu=cpu_atom/,tool/num_cpus_online/,tool/num_cpus_online,
cpu=cpu_core/,tool/num_cpus_online,cpu=cpu_atom/' true
Performance counter stats for 'true':
28 tool/num_cpus/
16 tool/num_cpus,cpu=cpu_core/
12 tool/num_cpus,cpu=cpu_atom/
28 tool/num_cpus_online/
16 tool/num_cpus_online,cpu=cpu_core/
12 tool/num_cpus_online,cpu=cpu_atom/
0.000767205 seconds time elapsed
0.000938000 seconds user
0.000000000 seconds sys
```
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/expr.c | 2 +-
tools/perf/util/tool_pmu.c | 56 +++++++++++++++++++++++++++++++++-----
tools/perf/util/tool_pmu.h | 2 +-
3 files changed, 51 insertions(+), 9 deletions(-)
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index ca70a14c7cdf..7fda0ff89c16 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -401,7 +401,7 @@ double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx
if (ev != TOOL_PMU__EVENT_NONE) {
u64 count;
- if (tool_pmu__read_event(ev, &count))
+ if (tool_pmu__read_event(ev, /*evsel=*/NULL, &count))
result = count;
else
pr_err("Failure to read '%s'", literal);
diff --git a/tools/perf/util/tool_pmu.c b/tools/perf/util/tool_pmu.c
index 4630b8cc8e52..7aa4f315b0ac 100644
--- a/tools/perf/util/tool_pmu.c
+++ b/tools/perf/util/tool_pmu.c
@@ -332,7 +332,7 @@ static bool has_pmem(void)
return has_pmem;
}
-bool tool_pmu__read_event(enum tool_pmu_event ev, u64 *result)
+bool tool_pmu__read_event(enum tool_pmu_event ev, struct evsel *evsel, u64 *result)
{
const struct cpu_topology *topology;
@@ -347,18 +347,60 @@ bool tool_pmu__read_event(enum tool_pmu_event ev, u64 *result)
return true;
case TOOL_PMU__EVENT_NUM_CPUS:
- *result = cpu__max_present_cpu().cpu;
+ if (!evsel || perf_cpu_map__is_empty(evsel->core.cpus)) {
+ /* No evsel to be specific to. */
+ *result = cpu__max_present_cpu().cpu;
+ } else if (!perf_cpu_map__has_any_cpu(evsel->core.cpus)) {
+ /* Evsel just has specific CPUs. */
+ *result = perf_cpu_map__nr(evsel->core.cpus);
+ } else {
+ /*
+ * "Any CPU" event that can be scheduled on any CPU in
+ * the PMU's cpumask. The PMU cpumask should be saved in
+ * own_cpus. If not present fall back to max.
+ */
+ if (!perf_cpu_map__is_empty(evsel->core.own_cpus))
+ *result = perf_cpu_map__nr(evsel->core.own_cpus);
+ else
+ *result = cpu__max_present_cpu().cpu;
+ }
return true;
case TOOL_PMU__EVENT_NUM_CPUS_ONLINE: {
struct perf_cpu_map *online = cpu_map__online();
- if (online) {
+ if (!online)
+ return false;
+
+ if (!evsel || perf_cpu_map__is_empty(evsel->core.cpus)) {
+ /* No evsel to be specific to. */
*result = perf_cpu_map__nr(online);
- perf_cpu_map__put(online);
- return true;
+ } else if (!perf_cpu_map__has_any_cpu(evsel->core.cpus)) {
+ /* Evsel just has specific CPUs. */
+ struct perf_cpu_map *tmp =
+ perf_cpu_map__intersect(online, evsel->core.cpus);
+
+ *result = perf_cpu_map__nr(tmp);
+ perf_cpu_map__put(tmp);
+ } else {
+ /*
+ * "Any CPU" event that can be scheduled on any CPU in
+ * the PMU's cpumask. The PMU cpumask should be saved in
+ * own_cpus, if not present then just the online cpu
+ * mask.
+ */
+ if (!perf_cpu_map__is_empty(evsel->core.own_cpus)) {
+ struct perf_cpu_map *tmp =
+ perf_cpu_map__intersect(online, evsel->core.own_cpus);
+
+ *result = perf_cpu_map__nr(tmp);
+ perf_cpu_map__put(tmp);
+ } else {
+ *result = perf_cpu_map__nr(online);
+ }
}
- return false;
+ perf_cpu_map__put(online);
+ return true;
}
case TOOL_PMU__EVENT_NUM_DIES:
topology = online_topology();
@@ -417,7 +459,7 @@ int evsel__tool_pmu_read(struct evsel *evsel, int cpu_map_idx, int thread)
old_count = perf_counts(evsel->prev_raw_counts, cpu_map_idx, thread);
val = 0;
if (cpu_map_idx == 0 && thread == 0) {
- if (!tool_pmu__read_event(ev, &val)) {
+ if (!tool_pmu__read_event(ev, evsel, &val)) {
count->lost++;
val = 0;
}
diff --git a/tools/perf/util/tool_pmu.h b/tools/perf/util/tool_pmu.h
index c6ad1dd90a56..d642e7d73910 100644
--- a/tools/perf/util/tool_pmu.h
+++ b/tools/perf/util/tool_pmu.h
@@ -34,7 +34,7 @@ enum tool_pmu_event tool_pmu__str_to_event(const char *str);
bool tool_pmu__skip_event(const char *name);
int tool_pmu__num_skip_events(void);
-bool tool_pmu__read_event(enum tool_pmu_event ev, u64 *result);
+bool tool_pmu__read_event(enum tool_pmu_event ev, struct evsel *evsel, u64 *result);
u64 tool_pmu__cpu_slots_per_cycle(void);
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 06/12] libperf evsel: Rename own_cpus to pmu_cpus
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
own_cpus is generally the cpumask from the PMU. Rename it to pmu_cpus
to make this clearer. This is a variable rename with no other changes.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/lib/perf/evlist.c | 8 ++++----
tools/lib/perf/evsel.c | 2 +-
tools/lib/perf/include/internal/evsel.h | 2 +-
tools/perf/tests/event_update.c | 4 ++--
tools/perf/util/evsel.c | 6 +++---
tools/perf/util/header.c | 4 ++--
tools/perf/util/parse-events.c | 2 +-
tools/perf/util/synthetic-events.c | 4 ++--
tools/perf/util/tool_pmu.c | 12 ++++++------
9 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index b1f4c8176b32..9d9dec21f510 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -46,7 +46,7 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
* are valid by intersecting with those of the PMU.
*/
perf_cpu_map__put(evsel->cpus);
- evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
+ evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->pmu_cpus);
/*
* Empty cpu lists would eventually get opened as "any" so remove
@@ -61,7 +61,7 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
list_for_each_entry_from(next, &evlist->entries, node)
next->idx--;
}
- } else if (!evsel->own_cpus || evlist->has_user_cpus ||
+ } else if (!evsel->pmu_cpus || evlist->has_user_cpus ||
(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
/*
* The PMU didn't specify a default cpu map, this isn't a core
@@ -72,13 +72,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
*/
perf_cpu_map__put(evsel->cpus);
evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
- } else if (evsel->cpus != evsel->own_cpus) {
+ } else if (evsel->cpus != evsel->pmu_cpus) {
/*
* No user requested cpu map but the PMU cpu map doesn't match
* the evsel's. Reset it back to the PMU cpu map.
*/
perf_cpu_map__put(evsel->cpus);
- evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
+ evsel->cpus = perf_cpu_map__get(evsel->pmu_cpus);
}
if (evsel->system_wide) {
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 2a85e0bfee1e..127abe7df63d 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -46,7 +46,7 @@ void perf_evsel__delete(struct perf_evsel *evsel)
assert(evsel->mmap == NULL); /* If not munmap wasn't called. */
assert(evsel->sample_id == NULL); /* If not free_id wasn't called. */
perf_cpu_map__put(evsel->cpus);
- perf_cpu_map__put(evsel->own_cpus);
+ perf_cpu_map__put(evsel->pmu_cpus);
perf_thread_map__put(evsel->threads);
free(evsel);
}
diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
index ea78defa77d0..b97dc8c92882 100644
--- a/tools/lib/perf/include/internal/evsel.h
+++ b/tools/lib/perf/include/internal/evsel.h
@@ -99,7 +99,7 @@ struct perf_evsel {
* cpu map for opening the event on, for example, the first CPU on a
* socket for an uncore event.
*/
- struct perf_cpu_map *own_cpus;
+ struct perf_cpu_map *pmu_cpus;
struct perf_thread_map *threads;
struct xyarray *fd;
struct xyarray *mmap;
diff --git a/tools/perf/tests/event_update.c b/tools/perf/tests/event_update.c
index 9301fde11366..cb9e6de2e033 100644
--- a/tools/perf/tests/event_update.c
+++ b/tools/perf/tests/event_update.c
@@ -109,8 +109,8 @@ static int test__event_update(struct test_suite *test __maybe_unused, int subtes
TEST_ASSERT_VAL("failed to synthesize attr update name",
!perf_event__synthesize_event_update_name(&tmp.tool, evsel, process_event_name));
- perf_cpu_map__put(evsel->core.own_cpus);
- evsel->core.own_cpus = perf_cpu_map__new("1,2,3");
+ perf_cpu_map__put(evsel->core.pmu_cpus);
+ evsel->core.pmu_cpus = perf_cpu_map__new("1,2,3");
TEST_ASSERT_VAL("failed to synthesize attr update cpus",
!perf_event__synthesize_event_update_cpus(&tmp.tool, evsel, process_event_cpus));
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d9b6bf78d67b..ba0c9799928b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -488,7 +488,7 @@ struct evsel *evsel__clone(struct evsel *dest, struct evsel *orig)
return NULL;
evsel->core.cpus = perf_cpu_map__get(orig->core.cpus);
- evsel->core.own_cpus = perf_cpu_map__get(orig->core.own_cpus);
+ evsel->core.pmu_cpus = perf_cpu_map__get(orig->core.pmu_cpus);
evsel->core.threads = perf_thread_map__get(orig->core.threads);
evsel->core.nr_members = orig->core.nr_members;
evsel->core.system_wide = orig->core.system_wide;
@@ -1527,7 +1527,7 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
attr->exclude_user = 1;
}
- if (evsel->core.own_cpus || evsel->unit)
+ if (evsel->core.pmu_cpus || evsel->unit)
evsel->core.attr.read_format |= PERF_FORMAT_ID;
/*
@@ -1680,7 +1680,7 @@ void evsel__exit(struct evsel *evsel)
evsel__free_config_terms(evsel);
cgroup__put(evsel->cgrp);
perf_cpu_map__put(evsel->core.cpus);
- perf_cpu_map__put(evsel->core.own_cpus);
+ perf_cpu_map__put(evsel->core.pmu_cpus);
perf_thread_map__put(evsel->core.threads);
zfree(&evsel->group_name);
zfree(&evsel->name);
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 53d54fbda10d..d941d7aa0f49 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -4507,8 +4507,8 @@ int perf_event__process_event_update(const struct perf_tool *tool __maybe_unused
case PERF_EVENT_UPDATE__CPUS:
map = cpu_map__new_data(&ev->cpus.cpus);
if (map) {
- perf_cpu_map__put(evsel->core.own_cpus);
- evsel->core.own_cpus = map;
+ perf_cpu_map__put(evsel->core.pmu_cpus);
+ evsel->core.pmu_cpus = map;
} else
pr_err("failed to get event_update cpus\n");
default:
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index a337e4d22ff2..d506f9943506 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -320,7 +320,7 @@ __add_event(struct list_head *list, int *idx,
(*idx)++;
evsel->core.cpus = cpus;
- evsel->core.own_cpus = perf_cpu_map__get(cpus);
+ evsel->core.pmu_cpus = perf_cpu_map__get(cpus);
evsel->core.requires_cpu = pmu ? pmu->is_uncore : false;
evsel->core.is_pmu_core = is_pmu_core;
evsel->pmu = pmu;
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 2fc4d0537840..7c00b09e3a93 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2045,7 +2045,7 @@ int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struc
int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel,
perf_event__handler_t process)
{
- struct synthesize_cpu_map_data syn_data = { .map = evsel->core.own_cpus };
+ struct synthesize_cpu_map_data syn_data = { .map = evsel->core.pmu_cpus };
struct perf_record_event_update *ev;
int err;
@@ -2126,7 +2126,7 @@ int perf_event__synthesize_extra_attr(const struct perf_tool *tool, struct evlis
}
}
- if (evsel->core.own_cpus) {
+ if (evsel->core.pmu_cpus) {
err = perf_event__synthesize_event_update_cpus(tool, evsel, process);
if (err < 0) {
pr_err("Couldn't synthesize evsel cpus.\n");
diff --git a/tools/perf/util/tool_pmu.c b/tools/perf/util/tool_pmu.c
index 7aa4f315b0ac..d99e699e646d 100644
--- a/tools/perf/util/tool_pmu.c
+++ b/tools/perf/util/tool_pmu.c
@@ -357,10 +357,10 @@ bool tool_pmu__read_event(enum tool_pmu_event ev, struct evsel *evsel, u64 *resu
/*
* "Any CPU" event that can be scheduled on any CPU in
* the PMU's cpumask. The PMU cpumask should be saved in
- * own_cpus. If not present fall back to max.
+ * pmu_cpus. If not present fall back to max.
*/
- if (!perf_cpu_map__is_empty(evsel->core.own_cpus))
- *result = perf_cpu_map__nr(evsel->core.own_cpus);
+ if (!perf_cpu_map__is_empty(evsel->core.pmu_cpus))
+ *result = perf_cpu_map__nr(evsel->core.pmu_cpus);
else
*result = cpu__max_present_cpu().cpu;
}
@@ -386,12 +386,12 @@ bool tool_pmu__read_event(enum tool_pmu_event ev, struct evsel *evsel, u64 *resu
/*
* "Any CPU" event that can be scheduled on any CPU in
* the PMU's cpumask. The PMU cpumask should be saved in
- * own_cpus, if not present then just the online cpu
+ * pmu_cpus, if not present then just the online cpu
* mask.
*/
- if (!perf_cpu_map__is_empty(evsel->core.own_cpus)) {
+ if (!perf_cpu_map__is_empty(evsel->core.pmu_cpus)) {
struct perf_cpu_map *tmp =
- perf_cpu_map__intersect(online, evsel->core.own_cpus);
+ perf_cpu_map__intersect(online, evsel->core.pmu_cpus);
*result = perf_cpu_map__nr(tmp);
perf_cpu_map__put(tmp);
--
2.50.0.727.gbf7dc18ff4-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 07/12] libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete
2025-07-17 21:02 [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Ian Rogers
` (5 preceding siblings ...)
2025-07-17 21:02 ` [PATCH v2 06/12] libperf evsel: Rename own_cpus to pmu_cpus Ian Rogers
@ 2025-07-17 21:02 ` Ian Rogers
2025-07-17 21:02 ` [PATCH v2 08/12] perf evsel: Use libperf perf_evsel__exit Ian Rogers
` (5 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
To: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Ben Gainey, James Clark, Howard Chu, Weilin Wang,
Levi Yun, Dr. David Alan Gilbert, Zhongqiu Han, Blake Jones,
Yicong Yang, Anubhav Shelat, Thomas Richter, Jean-Philippe Romain,
Song Liu, linux-perf-users, linux-kernel
This allows perf_evsel__exit to be called when the struct
perf_evsel is embedded inside another struct, such as struct evsel in
perf.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/lib/perf/evsel.c | 7 ++++++-
tools/lib/perf/include/internal/evsel.h | 1 +
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 127abe7df63d..13a307fc75ae 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -40,7 +40,7 @@ struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr)
return evsel;
}
-void perf_evsel__delete(struct perf_evsel *evsel)
+void perf_evsel__exit(struct perf_evsel *evsel)
{
assert(evsel->fd == NULL); /* If not fds were not closed. */
assert(evsel->mmap == NULL); /* If not munmap wasn't called. */
@@ -48,6 +48,11 @@ void perf_evsel__delete(struct perf_evsel *evsel)
perf_cpu_map__put(evsel->cpus);
perf_cpu_map__put(evsel->pmu_cpus);
perf_thread_map__put(evsel->threads);
+}
+
+void perf_evsel__delete(struct perf_evsel *evsel)
+{
+ perf_evsel__exit(evsel);
free(evsel);
}
diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
index b97dc8c92882..fefe64ba5e26 100644
--- a/tools/lib/perf/include/internal/evsel.h
+++ b/tools/lib/perf/include/internal/evsel.h
@@ -133,6 +133,7 @@ struct perf_evsel {
void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
int idx);
+void perf_evsel__exit(struct perf_evsel *evsel);
int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads);
void perf_evsel__close_fd(struct perf_evsel *evsel);
void perf_evsel__free_fd(struct perf_evsel *evsel);
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 08/12] perf evsel: Use libperf perf_evsel__exit
2025-07-17 21:02 [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Ian Rogers
` (6 preceding siblings ...)
2025-07-17 21:02 ` [PATCH v2 07/12] libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete Ian Rogers
@ 2025-07-17 21:02 ` Ian Rogers
2025-07-17 21:02 ` [PATCH v2 09/12] perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu Ian Rogers
` (4 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
To: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Ben Gainey, James Clark, Howard Chu, Weilin Wang,
Levi Yun, Dr. David Alan Gilbert, Zhongqiu Han, Blake Jones,
Yicong Yang, Anubhav Shelat, Thomas Richter, Jean-Philippe Romain,
Song Liu, linux-perf-users, linux-kernel
Avoid duplicating the code and make future changes to struct
perf_evsel easier.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/evsel.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ba0c9799928b..af2b26c6456a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1679,9 +1679,7 @@ void evsel__exit(struct evsel *evsel)
perf_evsel__free_id(&evsel->core);
evsel__free_config_terms(evsel);
cgroup__put(evsel->cgrp);
- perf_cpu_map__put(evsel->core.cpus);
- perf_cpu_map__put(evsel->core.pmu_cpus);
- perf_thread_map__put(evsel->core.threads);
+ perf_evsel__exit(&evsel->core);
zfree(&evsel->group_name);
zfree(&evsel->name);
#ifdef HAVE_LIBTRACEEVENT
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 09/12] perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu
2025-07-17 21:02 [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Ian Rogers
` (7 preceding siblings ...)
2025-07-17 21:02 ` [PATCH v2 08/12] perf evsel: Use libperf perf_evsel__exit Ian Rogers
@ 2025-07-17 21:02 ` Ian Rogers
2025-07-17 21:02 ` [PATCH v2 10/12] perf parse-events: Minor __add_event refactoring Ian Rogers
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
To: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Ben Gainey, James Clark, Howard Chu, Weilin Wang,
Levi Yun, Dr. David Alan Gilbert, Zhongqiu Han, Blake Jones,
Yicong Yang, Anubhav Shelat, Thomas Richter, Jean-Philippe Romain,
Song Liu, linux-perf-users, linux-kernel
Allow a PMU to be found from a perf_event_attr, which is useful when
creating evsels.
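For context, on hybrid systems legacy hardware events can carry the real PMU type in the upper bits of attr->config. A minimal sketch of that decoding, assuming the PERF_PMU_TYPE_SHIFT/PERF_HW_EVENT_MASK split from the kernel UAPI (the struct and function names below are illustrative, not the perf code itself):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror include/uapi/linux/perf_event.h: the extended PMU
 * type sits above bit 32 of config, the hardware event id below it. */
#define PMU_TYPE_SHIFT 32
#define HW_EVENT_MASK  0xffffffffULL

struct fake_attr {
	uint32_t type;   /* e.g. PERF_TYPE_HARDWARE */
	uint64_t config; /* PMU type in the high bits on hybrid systems */
};

/* Prefer the extended type when present, else the plain attr type. */
static uint32_t attr_pmu_type(const struct fake_attr *attr)
{
	uint32_t ext = (uint32_t)(attr->config >> PMU_TYPE_SHIFT);

	return ext ? ext : attr->type;
}

/* The low bits are the actual hardware event encoding. */
static uint64_t attr_hw_config(const struct fake_attr *attr)
{
	return attr->config & HW_EVENT_MASK;
}
```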
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/pmus.c | 29 +++++++++++++++++------------
tools/perf/util/pmus.h | 2 ++
2 files changed, 19 insertions(+), 12 deletions(-)
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 409b909cfa02..9137bb9036ed 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -814,24 +814,18 @@ bool perf_pmus__supports_extended_type(void)
return perf_pmus__do_support_extended_type;
}
-struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
+struct perf_pmu *perf_pmus__find_by_attr(const struct perf_event_attr *attr)
{
- struct perf_pmu *pmu = evsel->pmu;
- bool legacy_core_type;
-
- if (pmu)
- return pmu;
+ struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type);
+ u32 type = attr->type;
+ bool legacy_core_type = type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE;
- pmu = perf_pmus__find_by_type(evsel->core.attr.type);
- legacy_core_type =
- evsel->core.attr.type == PERF_TYPE_HARDWARE ||
- evsel->core.attr.type == PERF_TYPE_HW_CACHE;
if (!pmu && legacy_core_type && perf_pmus__supports_extended_type()) {
- u32 type = evsel->core.attr.config >> PERF_PMU_TYPE_SHIFT;
+ type = attr->config >> PERF_PMU_TYPE_SHIFT;
pmu = perf_pmus__find_by_type(type);
}
- if (!pmu && (legacy_core_type || evsel->core.attr.type == PERF_TYPE_RAW)) {
+ if (!pmu && (legacy_core_type || type == PERF_TYPE_RAW)) {
/*
* For legacy events, if there was no extended type info then
* assume the PMU is the first core PMU.
@@ -842,6 +836,17 @@ struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
*/
pmu = perf_pmus__find_core_pmu();
}
+ return pmu;
+}
+
+struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
+{
+ struct perf_pmu *pmu = evsel->pmu;
+
+ if (pmu)
+ return pmu;
+
+ pmu = perf_pmus__find_by_attr(&evsel->core.attr);
((struct evsel *)evsel)->pmu = pmu;
return pmu;
}
diff --git a/tools/perf/util/pmus.h b/tools/perf/util/pmus.h
index 86842ee5f539..7cb36863711a 100644
--- a/tools/perf/util/pmus.h
+++ b/tools/perf/util/pmus.h
@@ -5,6 +5,7 @@
#include <stdbool.h>
#include <stddef.h>
+struct perf_event_attr;
struct perf_pmu;
struct print_callbacks;
@@ -16,6 +17,7 @@ void perf_pmus__destroy(void);
struct perf_pmu *perf_pmus__find(const char *name);
struct perf_pmu *perf_pmus__find_by_type(unsigned int type);
+struct perf_pmu *perf_pmus__find_by_attr(const struct perf_event_attr *attr);
struct perf_pmu *perf_pmus__scan(struct perf_pmu *pmu);
struct perf_pmu *perf_pmus__scan_core(struct perf_pmu *pmu);
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 10/12] perf parse-events: Minor __add_event refactoring
2025-07-17 21:02 [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Ian Rogers
` (8 preceding siblings ...)
2025-07-17 21:02 ` [PATCH v2 09/12] perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu Ian Rogers
@ 2025-07-17 21:02 ` Ian Rogers
2025-07-17 21:02 ` [PATCH v2 11/12] perf evsel: Add evsel__open_per_cpu_and_thread Ian Rogers
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
To: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Ben Gainey, James Clark, Howard Chu, Weilin Wang,
Levi Yun, Dr. David Alan Gilbert, Zhongqiu Han, Blake Jones,
Yicong Yang, Anubhav Shelat, Thomas Richter, Jean-Philippe Romain,
Song Liu, linux-perf-users, linux-kernel
Rename cpu_list to user_cpus. If a PMU isn't given, find it early from
the perf_event_attr. Make pmu_cpus more explicitly a copy of the PMU's
cpumask (except when user_cpus are given). Derive the cpus from
pmu_cpus and user_cpus as appropriate. Handle strdup errors on name
and metric_id.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/parse-events.c | 69 +++++++++++++++++++++++-----------
1 file changed, 48 insertions(+), 21 deletions(-)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d506f9943506..bd2d831d5123 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -259,12 +259,12 @@ __add_event(struct list_head *list, int *idx,
bool init_attr,
const char *name, const char *metric_id, struct perf_pmu *pmu,
struct list_head *config_terms, struct evsel *first_wildcard_match,
- struct perf_cpu_map *cpu_list, u64 alternate_hw_config)
+ struct perf_cpu_map *user_cpus, u64 alternate_hw_config)
{
struct evsel *evsel;
bool is_pmu_core;
- struct perf_cpu_map *cpus;
- bool has_cpu_list = !perf_cpu_map__is_empty(cpu_list);
+ struct perf_cpu_map *cpus, *pmu_cpus;
+ bool has_user_cpus = !perf_cpu_map__is_empty(user_cpus);
/*
* Ensure the first_wildcard_match's PMU matches that of the new event
@@ -288,8 +288,6 @@ __add_event(struct list_head *list, int *idx,
}
if (pmu) {
- is_pmu_core = pmu->is_core;
- cpus = perf_cpu_map__get(has_cpu_list ? cpu_list : pmu->cpus);
perf_pmu__warn_invalid_formats(pmu);
if (attr->type == PERF_TYPE_RAW || attr->type >= PERF_TYPE_MAX) {
perf_pmu__warn_invalid_config(pmu, attr->config, name,
@@ -301,48 +299,77 @@ __add_event(struct list_head *list, int *idx,
perf_pmu__warn_invalid_config(pmu, attr->config3, name,
PERF_PMU_FORMAT_VALUE_CONFIG3, "config3");
}
+ }
+ /*
+ * If a PMU wasn't given, such as for legacy events, find now that
+ * warnings won't be generated.
+ */
+ if (!pmu)
+ pmu = perf_pmus__find_by_attr(attr);
+
+ if (pmu) {
+ is_pmu_core = pmu->is_core;
+ pmu_cpus = perf_cpu_map__get(pmu->cpus);
} else {
is_pmu_core = (attr->type == PERF_TYPE_HARDWARE ||
attr->type == PERF_TYPE_HW_CACHE);
- if (has_cpu_list)
- cpus = perf_cpu_map__get(cpu_list);
- else
- cpus = is_pmu_core ? cpu_map__online() : NULL;
+ pmu_cpus = is_pmu_core ? cpu_map__online() : NULL;
+ }
+
+ if (has_user_cpus) {
+ cpus = perf_cpu_map__get(user_cpus);
+ /* Existing behavior that pmu_cpus matches the given user ones. */
+ perf_cpu_map__put(pmu_cpus);
+ pmu_cpus = perf_cpu_map__get(user_cpus);
+ } else {
+ cpus = perf_cpu_map__get(pmu_cpus);
}
+
if (init_attr)
event_attr_init(attr);
evsel = evsel__new_idx(attr, *idx);
- if (!evsel) {
- perf_cpu_map__put(cpus);
- return NULL;
+ if (!evsel)
+ goto out_err;
+
+ if (name) {
+ evsel->name = strdup(name);
+ if (!evsel->name)
+ goto out_err;
+ }
+
+ if (metric_id) {
+ evsel->metric_id = strdup(metric_id);
+ if (!evsel->metric_id)
+ goto out_err;
}
(*idx)++;
evsel->core.cpus = cpus;
- evsel->core.pmu_cpus = perf_cpu_map__get(cpus);
+ evsel->core.pmu_cpus = pmu_cpus;
evsel->core.requires_cpu = pmu ? pmu->is_uncore : false;
evsel->core.is_pmu_core = is_pmu_core;
evsel->pmu = pmu;
evsel->alternate_hw_config = alternate_hw_config;
evsel->first_wildcard_match = first_wildcard_match;
- if (name)
- evsel->name = strdup(name);
-
- if (metric_id)
- evsel->metric_id = strdup(metric_id);
-
if (config_terms)
list_splice_init(config_terms, &evsel->config_terms);
if (list)
list_add_tail(&evsel->core.node, list);
- if (has_cpu_list)
- evsel__warn_user_requested_cpus(evsel, cpu_list);
+ if (has_user_cpus)
+ evsel__warn_user_requested_cpus(evsel, user_cpus);
return evsel;
+out_err:
+ perf_cpu_map__put(cpus);
+ perf_cpu_map__put(pmu_cpus);
+ zfree(&evsel->name);
+ zfree(&evsel->metric_id);
+ free(evsel);
+ return NULL;
}
struct evsel *parse_events__add_event(int idx, struct perf_event_attr *attr,
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 11/12] perf evsel: Add evsel__open_per_cpu_and_thread
2025-07-17 21:02 [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Ian Rogers
` (9 preceding siblings ...)
2025-07-17 21:02 ` [PATCH v2 10/12] perf parse-events: Minor __add_event refactoring Ian Rogers
@ 2025-07-17 21:02 ` Ian Rogers
2025-07-17 21:02 ` [PATCH v2 12/12] perf parse-events: Support user CPUs mixed with threads/processes Ian Rogers
2025-07-18 18:19 ` [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Namhyung Kim
12 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
To: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Ben Gainey, James Clark, Howard Chu, Weilin Wang,
Levi Yun, Dr. David Alan Gilbert, Zhongqiu Han, Blake Jones,
Yicong Yang, Anubhav Shelat, Thomas Richter, Jean-Philippe Romain,
Song Liu, linux-perf-users, linux-kernel
Add evsel__open_per_cpu_and_thread, combining the operations of
evsel__open_per_cpu and evsel__open_per_thread, so that an event
without the "any" cpumask can be opened on both its cpumask and the
threads it specifies. Reimplement evsel__open_per_cpu and
evsel__open_per_thread in terms of evsel__open_per_cpu_and_thread to
make their behavior clearer.
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/evsel.c | 23 +++++++++++++++++++----
tools/perf/util/evsel.h | 3 +++
2 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index af2b26c6456a..ae11df1e7902 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2761,17 +2761,32 @@ void evsel__close(struct evsel *evsel)
perf_evsel__free_id(&evsel->core);
}
-int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu_map_idx)
+int evsel__open_per_cpu_and_thread(struct evsel *evsel,
+ struct perf_cpu_map *cpus, int cpu_map_idx,
+ struct perf_thread_map *threads)
{
if (cpu_map_idx == -1)
- return evsel__open_cpu(evsel, cpus, NULL, 0, perf_cpu_map__nr(cpus));
+ return evsel__open_cpu(evsel, cpus, threads, 0, perf_cpu_map__nr(cpus));
- return evsel__open_cpu(evsel, cpus, NULL, cpu_map_idx, cpu_map_idx + 1);
+ return evsel__open_cpu(evsel, cpus, threads, cpu_map_idx, cpu_map_idx + 1);
+}
+
+int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu_map_idx)
+{
+ struct perf_thread_map *threads = thread_map__new_by_tid(-1);
+ int ret = evsel__open_per_cpu_and_thread(evsel, cpus, cpu_map_idx, threads);
+
+ perf_thread_map__put(threads);
+ return ret;
}
int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads)
{
- return evsel__open(evsel, NULL, threads);
+ struct perf_cpu_map *cpus = perf_cpu_map__new_any_cpu();
+ int ret = evsel__open_per_cpu_and_thread(evsel, cpus, -1, threads);
+
+ perf_cpu_map__put(cpus);
+ return ret;
}
static int perf_evsel__parse_id_sample(const struct evsel *evsel,
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cefa8e64c0d5..8e79eb6d41b3 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -351,6 +351,9 @@ int evsel__enable(struct evsel *evsel);
int evsel__disable(struct evsel *evsel);
int evsel__disable_cpu(struct evsel *evsel, int cpu_map_idx);
+int evsel__open_per_cpu_and_thread(struct evsel *evsel,
+ struct perf_cpu_map *cpus, int cpu_map_idx,
+ struct perf_thread_map *threads);
int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu_map_idx);
int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads);
int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
--
2.50.0.727.gbf7dc18ff4-goog
* [PATCH v2 12/12] perf parse-events: Support user CPUs mixed with threads/processes
2025-07-17 21:02 [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Ian Rogers
` (10 preceding siblings ...)
2025-07-17 21:02 ` [PATCH v2 11/12] perf evsel: Add evsel__open_per_cpu_and_thread Ian Rogers
@ 2025-07-17 21:02 ` Ian Rogers
2025-07-18 18:19 ` [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Namhyung Kim
12 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2025-07-17 21:02 UTC (permalink / raw)
To: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Ben Gainey, James Clark, Howard Chu, Weilin Wang,
Levi Yun, Dr. David Alan Gilbert, Zhongqiu Han, Blake Jones,
Yicong Yang, Anubhav Shelat, Thomas Richter, Jean-Philippe Romain,
Song Liu, linux-perf-users, linux-kernel
Counting events system-wide with a specified CPU prior to this change
worked:
```
$ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' -a sleep 1
Performance counter stats for 'system wide':
59,393,419,099 msr/tsc/
33,927,965,927 msr/tsc,cpu=cpu_core/
25,465,608,044 msr/tsc,cpu=cpu_atom/
```
However, when counting with a specified process the counts were still system wide:
```
$ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
10.1: Basic parsing test : Ok
10.2: Parsing without PMU name : Ok
10.3: Parsing with PMU name : Ok
Performance counter stats for 'perf test -F 10':
59,233,549 msr/tsc/
59,227,556 msr/tsc,cpu=cpu_core/
59,224,053 msr/tsc,cpu=cpu_atom/
```
Make the handling of CPU maps during event parsing clearer. When an
event is parsed and an evsel created, its cpus should be either the
PMU's cpumask or the user-specified CPUs.
Update perf_evlist__propagate_maps so that it doesn't clobber the user
specified CPUs. To make the behavior clearer, first fix up missing
cpumasks. Next, perform sanity checks and adjustments for the global
evlist CPU requests and for the PMU, including simplifying to the
"any CPU" (-1) value. Finally, remove the event if its cpumask is
empty.
So that events are opened with both a CPU and a thread, change stat's
create_perf_stat_counter to pass both.
With the change things are fixed:
```
$ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
10.1: Basic parsing test : Ok
10.2: Parsing without PMU name : Ok
10.3: Parsing with PMU name : Ok
Performance counter stats for 'perf test -F 10':
63,704,975 msr/tsc/
47,060,704 msr/tsc,cpu=cpu_core/ (4.62%)
16,640,591 msr/tsc,cpu=cpu_atom/ (2.18%)
```
However, note the "--no-scale" option is used. This is necessary as
the running time for the event on the counter isn't the same as the
enabled time because the thread doesn't necessarily run on the CPUs
specified for the counter. All counter values are scaled with:
scaled_value = value * time_enabled / time_running
and so without --no-scale the scaled_value becomes very large. This
problem already exists on hybrid systems for the same reason. Here are
two runs of the same code with an instructions event that counts the
same on both types of core; there is no real multiplexing happening on
the event:
```
$ perf stat -e instructions perf test -F 10
...
Performance counter stats for 'perf test -F 10':
87,896,447 cpu_atom/instructions/ (14.37%)
98,171,964 cpu_core/instructions/ (85.63%)
...
$ perf stat --no-scale -e instructions perf test -F 10
...
Performance counter stats for 'perf test -F 10':
13,069,890 cpu_atom/instructions/ (19.32%)
83,460,274 cpu_core/instructions/ (80.68%)
...
```
The scaling has inflated per-PMU instruction counts and the overall
count by 2x.
To fix this the kernel needs changing for when a task+CPU event (or
just a task event on hybrid) is scheduled out. A possible fix is to
make the state of such events "off" rather than "inactive", so that
time_enabled doesn't accumulate on them.
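The inflation falls directly out of the scaling formula; a toy calculation with made-up times (illustrative values only, not measurements):

```c
#include <assert.h>
#include <stdint.h>

/* scaled_value = value * time_enabled / time_running. When the event
 * only runs while its CPU type is scheduled, time_running is a small
 * fraction of time_enabled and the scaled value is inflated. A zero
 * running time is treated as "no scaling" here. */
static uint64_t scale(uint64_t value, uint64_t enabled, uint64_t running)
{
	return running ? value * enabled / running : value;
}
```

With value = 100 and the counter running for only a quarter of the enabled time, the scaled result is 400, a 4x inflation of what was actually counted.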
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/lib/perf/evlist.c | 118 ++++++++++++++++++++++-----------
tools/perf/util/parse-events.c | 10 ++-
tools/perf/util/stat.c | 6 +-
3 files changed, 86 insertions(+), 48 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 9d9dec21f510..2d2236400220 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -36,49 +36,87 @@ void perf_evlist__init(struct perf_evlist *evlist)
static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
struct perf_evsel *evsel)
{
- if (evsel->system_wide) {
- /* System wide: set the cpu map of the evsel to all online CPUs. */
- perf_cpu_map__put(evsel->cpus);
- evsel->cpus = perf_cpu_map__new_online_cpus();
- } else if (evlist->has_user_cpus && evsel->is_pmu_core) {
- /*
- * User requested CPUs on a core PMU, ensure the requested CPUs
- * are valid by intersecting with those of the PMU.
- */
+ if (perf_cpu_map__is_empty(evsel->cpus)) {
+ if (perf_cpu_map__is_empty(evsel->pmu_cpus)) {
+ /*
+ * Assume the unset PMU cpus were for a system-wide
+ * event, like a software or tracepoint.
+ */
+ evsel->pmu_cpus = perf_cpu_map__new_online_cpus();
+ }
+ if (evlist->has_user_cpus && !evsel->system_wide) {
+ /*
+ * Use the user CPUs unless the evsel is set to be
+ * system wide, such as the dummy event.
+ */
+ evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
+ } else {
+ /*
+ * System wide and other modes, assume the cpu map
+ * should be set to all PMU CPUs.
+ */
+ evsel->cpus = perf_cpu_map__get(evsel->pmu_cpus);
+ }
+ }
+ /*
+ * Avoid "any CPU"(-1) for uncore and PMUs that require a CPU, even if
+ * requested.
+ */
+ if (evsel->requires_cpu && perf_cpu_map__has_any_cpu(evsel->cpus)) {
perf_cpu_map__put(evsel->cpus);
- evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->pmu_cpus);
+ evsel->cpus = perf_cpu_map__get(evsel->pmu_cpus);
+ }
- /*
- * Empty cpu lists would eventually get opened as "any" so remove
- * genuinely empty ones before they're opened in the wrong place.
- */
- if (perf_cpu_map__is_empty(evsel->cpus)) {
- struct perf_evsel *next = perf_evlist__next(evlist, evsel);
-
- perf_evlist__remove(evlist, evsel);
- /* Keep idx contiguous */
- if (next)
- list_for_each_entry_from(next, &evlist->entries, node)
- next->idx--;
+ /*
+ * Globally requested CPUs replace user requested unless the evsel is
+ * set to be system wide.
+ */
+ if (evlist->has_user_cpus && !evsel->system_wide) {
+ assert(!perf_cpu_map__has_any_cpu(evlist->user_requested_cpus));
+ if (!perf_cpu_map__equal(evsel->cpus, evlist->user_requested_cpus)) {
+ perf_cpu_map__put(evsel->cpus);
+ evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
}
- } else if (!evsel->pmu_cpus || evlist->has_user_cpus ||
- (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
- /*
- * The PMU didn't specify a default cpu map, this isn't a core
- * event and the user requested CPUs or the evlist user
- * requested CPUs have the "any CPU" (aka dummy) CPU value. In
- * which case use the user requested CPUs rather than the PMU
- * ones.
- */
+ }
+
+ /* Ensure cpus only references valid PMU CPUs. */
+ if (!perf_cpu_map__has_any_cpu(evsel->cpus) &&
+ !perf_cpu_map__is_subset(evsel->pmu_cpus, evsel->cpus)) {
+ struct perf_cpu_map *tmp = perf_cpu_map__intersect(evsel->pmu_cpus, evsel->cpus);
+
perf_cpu_map__put(evsel->cpus);
- evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
- } else if (evsel->cpus != evsel->pmu_cpus) {
- /*
- * No user requested cpu map but the PMU cpu map doesn't match
- * the evsel's. Reset it back to the PMU cpu map.
- */
+ evsel->cpus = tmp;
+ }
+
+ /*
+ * Was event requested on all the PMU's CPUs but the user requested is
+ * any CPU (-1)? If so switch to using any CPU (-1) to reduce the number
+ * of events.
+ */
+ if (!evsel->system_wide &&
+ perf_cpu_map__equal(evsel->cpus, evsel->pmu_cpus) &&
+ perf_cpu_map__has_any_cpu(evlist->user_requested_cpus)) {
perf_cpu_map__put(evsel->cpus);
- evsel->cpus = perf_cpu_map__get(evsel->pmu_cpus);
+ evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
+ }
+
+ /* Sanity check assert before the evsel is potentially removed. */
+ assert(!evsel->requires_cpu || !perf_cpu_map__has_any_cpu(evsel->cpus));
+
+ /*
+ * Empty cpu lists would eventually get opened as "any" so remove
+ * genuinely empty ones before they're opened in the wrong place.
+ */
+ if (perf_cpu_map__is_empty(evsel->cpus)) {
+ struct perf_evsel *next = perf_evlist__next(evlist, evsel);
+
+ perf_evlist__remove(evlist, evsel);
+ /* Keep idx contiguous */
+ if (next)
+ list_for_each_entry_from(next, &evlist->entries, node)
+ next->idx--;
+
+ return;
}
if (evsel->system_wide) {
@@ -98,6 +136,10 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
evlist->needs_map_propagation = true;
+ /* Clear the all_cpus set which will be merged into during propagation. */
+ perf_cpu_map__put(evlist->all_cpus);
+ evlist->all_cpus = NULL;
+
list_for_each_entry_safe(evsel, n, &evlist->entries, node)
__perf_evlist__propagate_maps(evlist, evsel);
}
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index bd2d831d5123..fe2073c6b549 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -310,20 +310,18 @@ __add_event(struct list_head *list, int *idx,
if (pmu) {
is_pmu_core = pmu->is_core;
pmu_cpus = perf_cpu_map__get(pmu->cpus);
+ if (perf_cpu_map__is_empty(pmu_cpus))
+ pmu_cpus = cpu_map__online();
} else {
is_pmu_core = (attr->type == PERF_TYPE_HARDWARE ||
attr->type == PERF_TYPE_HW_CACHE);
pmu_cpus = is_pmu_core ? cpu_map__online() : NULL;
}
- if (has_user_cpus) {
+ if (has_user_cpus)
cpus = perf_cpu_map__get(user_cpus);
- /* Existing behavior that pmu_cpus matches the given user ones. */
- perf_cpu_map__put(pmu_cpus);
- pmu_cpus = perf_cpu_map__get(user_cpus);
- } else {
+ else
cpus = perf_cpu_map__get(pmu_cpus);
- }
if (init_attr)
event_attr_init(attr);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index b0205e99a4c9..50b1a92d16df 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -769,8 +769,6 @@ int create_perf_stat_counter(struct evsel *evsel,
attr->enable_on_exec = 1;
}
- if (target__has_cpu(target) && !target__has_per_thread(target))
- return evsel__open_per_cpu(evsel, evsel__cpus(evsel), cpu_map_idx);
-
- return evsel__open_per_thread(evsel, evsel->core.threads);
+ return evsel__open_per_cpu_and_thread(evsel, evsel__cpus(evsel), cpu_map_idx,
+ evsel->core.threads);
}
--
2.50.0.727.gbf7dc18ff4-goog
* Re: [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid
2025-07-17 21:02 [PATCH v2 00/12] CPU mask improvements/fixes particularly for hybrid Ian Rogers
` (11 preceding siblings ...)
2025-07-17 21:02 ` [PATCH v2 12/12] perf parse-events: Support user CPUs mixed with threads/processes Ian Rogers
@ 2025-07-18 18:19 ` Namhyung Kim
12 siblings, 0 replies; 14+ messages in thread
From: Namhyung Kim @ 2025-07-18 18:19 UTC (permalink / raw)
To: Ian Rogers
Cc: Thomas Falcon, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Ben Gainey, James Clark,
Howard Chu, Weilin Wang, Levi Yun, Dr. David Alan Gilbert,
Zhongqiu Han, Blake Jones, Yicong Yang, Anubhav Shelat,
Thomas Richter, Jean-Philippe Romain, Song Liu, linux-perf-users,
linux-kernel
On Thu, Jul 17, 2025 at 02:02:21PM -0700, Ian Rogers wrote:
> On hybrid systems some PMUs apply to all core types; in particular,
> metrics often use the msr PMU's tsc event but only want the counter's
> value for their specific core type. These patches allow the cpu term
> in an event to name a PMU whose cpumask the event should take. For
> example:
>
> $ perf stat -e msr/tsc,cpu=cpu_atom/ ...
>
> will aggregate the msr/tsc/ value but only for atom cores. While doing
> this, problems were identified in how cpumasks are handled by event
> parsing and event setup when cpumasks are specified along with a task
> to profile. The event parsing, cpumask evlist propagation and perf
> stat code are updated accordingly.
>
> The final result of the patch series is to be able to run:
> ```
> $ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
> 10.1: Basic parsing test : Ok
> 10.2: Parsing without PMU name : Ok
> 10.3: Parsing with PMU name : Ok
>
> Performance counter stats for 'perf test -F 10':
>
> 63,704,975 msr/tsc/
> 47,060,704 msr/tsc,cpu=cpu_core/ (4.62%)
> 16,640,591 msr/tsc,cpu=cpu_atom/ (2.18%)
> ```
>
> This work (further) identified a kernel bug for task events where the
> enabled time is reported too large, leading to invalid scaling (hence
> the --no-scale in the command line above).
>
> v2: Add additional documentation of the cpu term to `perf list`
> (Namhyung), extend the term to also allow CPU ranges, and add Thomas
> Falcon's reviewed-by. It is still open for discussion whether the
> cpu term should have more than one variant for PMUs, etc., or
> whether the single term is okay. We could refactor later and add a
> new term, but that would break existing users; those users are most
> likely metrics, though, so probably not a huge issue.
I slightly prefer a new term to minimize possible confusion, but it's
not a strong preference and I'm open to other opinions.
On the other hand, the 'cpu' term is new and I don't think it has many
users outside yet, so we can probably change it later.
Thanks,
Namhyung
>
> Ian Rogers (12):
> perf parse-events: Warn if a cpu term is unsupported by a CPU
> perf stat: Avoid buffer overflow to the aggregation map
> perf stat: Don't size aggregation ids from user_requested_cpus
> perf parse-events: Allow the cpu term to be a PMU or CPU range
> perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask
> libperf evsel: Rename own_cpus to pmu_cpus
> libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete
> perf evsel: Use libperf perf_evsel__exit
> perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu
> perf parse-events: Minor __add_event refactoring
> perf evsel: Add evsel__open_per_cpu_and_thread
> perf parse-events: Support user CPUs mixed with threads/processes
>
> tools/lib/perf/evlist.c | 118 ++++++++++++++++--------
> tools/lib/perf/evsel.c | 9 +-
> tools/lib/perf/include/internal/evsel.h | 3 +-
> tools/perf/Documentation/perf-list.txt | 25 +++--
> tools/perf/builtin-stat.c | 9 +-
> tools/perf/tests/event_update.c | 4 +-
> tools/perf/util/evlist.c | 15 +--
> tools/perf/util/evsel.c | 55 +++++++++--
> tools/perf/util/evsel.h | 5 +
> tools/perf/util/expr.c | 2 +-
> tools/perf/util/header.c | 4 +-
> tools/perf/util/parse-events.c | 112 ++++++++++++++++------
> tools/perf/util/pmus.c | 29 +++---
> tools/perf/util/pmus.h | 2 +
> tools/perf/util/stat.c | 6 +-
> tools/perf/util/synthetic-events.c | 4 +-
> tools/perf/util/tool_pmu.c | 56 +++++++++--
> tools/perf/util/tool_pmu.h | 2 +-
> 18 files changed, 323 insertions(+), 137 deletions(-)
>
> --
> 2.50.0.727.gbf7dc18ff4-goog
>