* [PATCH v3 0/3] Add support for cpu event term
@ 2024-09-18 22:01 Ian Rogers
2024-09-18 22:01 ` [PATCH v3 1/3] libperf cpumap: Add ability to create CPU from a single CPU number Ian Rogers
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Ian Rogers @ 2024-09-18 22:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria,
Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users,
linux-kernel
Being able to set the cpu mask per event was discussed in the context
of a sysfs event.cpus file is discussed here:
https://lore.kernel.org/lkml/CAP-5=fXXuWchzUK0n5KTH8kamr=DQoEni+bUoo8f-4j8Y+eMBg@mail.gmail.com/
Ultimately Kan preferred to have multiple PMUs with a cpumask each
rather than an event.cpus file per event. It is still useful to have
the cpu event term and so the sysfs part of the original patch series
is dropped.
v3: Drop sysfs event.cpus file support patch from series. Reference
to using cpu to modify uncore events is dropped from the commit
message. Reference counting issues on the cpumap are addressed.
v2: Add support for multiple cpu terms on an event that are
merged. For example, an event of "l1d-misses/cpu=4,cpu=5/" will
now be opened on both CPU 4 and 5 rather than just CPU 4.
Ian Rogers (3):
libperf cpumap: Add ability to create CPU from a single CPU number
perf parse-events: Set is_pmu_core for legacy hardware events
perf parse-events: Add "cpu" term to set the CPU an event is recorded
on
tools/lib/perf/cpumap.c | 10 +++
tools/lib/perf/include/perf/cpumap.h | 2 +
tools/perf/Documentation/perf-list.txt | 9 +++
tools/perf/tests/event_update.c | 1 +
tools/perf/util/evsel_config.h | 1 +
tools/perf/util/parse-events.c | 108 +++++++++++++++++++------
tools/perf/util/parse-events.h | 3 +-
tools/perf/util/parse-events.l | 1 +
tools/perf/util/pmu.c | 1 +
9 files changed, 110 insertions(+), 26 deletions(-)
--
2.46.0.662.g92d0881bb0-goog
^ permalink raw reply [flat|nested] 8+ messages in thread* [PATCH v3 1/3] libperf cpumap: Add ability to create CPU from a single CPU number 2024-09-18 22:01 [PATCH v3 0/3] Add support for cpu event term Ian Rogers @ 2024-09-18 22:01 ` Ian Rogers 2024-09-18 22:01 ` [PATCH v3 2/3] perf parse-events: Set is_pmu_core for legacy hardware events Ian Rogers 2024-09-18 22:01 ` [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on Ian Rogers 2 siblings, 0 replies; 8+ messages in thread From: Ian Rogers @ 2024-09-18 22:01 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria, Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users, linux-kernel Add perf_cpu_map__new_int so that a CPU map can be created from a single integer. Signed-off-by: Ian Rogers <irogers@google.com> --- tools/lib/perf/cpumap.c | 10 ++++++++++ tools/lib/perf/include/perf/cpumap.h | 2 ++ 2 files changed, 12 insertions(+) diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index cae799ad44e1..2c8e36d0efaa 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -293,6 +293,16 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list) return cpus; } +struct perf_cpu_map *perf_cpu_map__new_int(int cpu) +{ + struct perf_cpu_map *cpus = perf_cpu_map__alloc(1); + + if (cpus) + RC_CHK_ACCESS(cpus)->map[0].cpu = cpu; + + return cpus; +} + static int __perf_cpu_map__nr(const struct perf_cpu_map *cpus) { return RC_CHK_ACCESS(cpus)->nr; diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index 90457d17fb2f..79ed3449d288 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -37,6 +37,8 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_online_cpus(void); * perf_cpu_map__new_online_cpus is returned. */ LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list); +/** perf_cpu_map__new_int - create a map with the one given cpu. */ +LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_int(int cpu); LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file); LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map); LIBPERF_API struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, -- 2.46.0.662.g92d0881bb0-goog ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v3 2/3] perf parse-events: Set is_pmu_core for legacy hardware events 2024-09-18 22:01 [PATCH v3 0/3] Add support for cpu event term Ian Rogers 2024-09-18 22:01 ` [PATCH v3 1/3] libperf cpumap: Add ability to create CPU from a single CPU number Ian Rogers @ 2024-09-18 22:01 ` Ian Rogers 2024-09-18 22:01 ` [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on Ian Rogers 2 siblings, 0 replies; 8+ messages in thread From: Ian Rogers @ 2024-09-18 22:01 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria, Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users, linux-kernel Also set the CPU map to all online CPU maps. This is done so the behavior of legacy hardware and hardware cache events better matches that of sysfs and json events during __perf_evlist__propagate_maps. Fix missing cpumap put in "Synthesize attr update" test. Signed-off-by: Ian Rogers <irogers@google.com> --- tools/perf/tests/event_update.c | 1 + tools/perf/util/parse-events.c | 37 ++++++++++++++++++++------------- 2 files changed, 24 insertions(+), 14 deletions(-) diff --git a/tools/perf/tests/event_update.c b/tools/perf/tests/event_update.c index d6b4ce3ef4ee..9301fde11366 100644 --- a/tools/perf/tests/event_update.c +++ b/tools/perf/tests/event_update.c @@ -109,6 +109,7 @@ static int test__event_update(struct test_suite *test __maybe_unused, int subtes TEST_ASSERT_VAL("failed to synthesize attr update name", !perf_event__synthesize_event_update_name(&tmp.tool, evsel, process_event_name)); + perf_cpu_map__put(evsel->core.own_cpus); evsel->core.own_cpus = perf_cpu_map__new("1,2,3"); TEST_ASSERT_VAL("failed to synthesize attr update cpus", diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 9a8be1e46d67..017d31d51ea4 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -231,21 +231,30 @@ __add_event(struct list_head *list, int *idx, struct perf_cpu_map *cpu_list) { struct evsel *evsel; - struct perf_cpu_map *cpus = perf_cpu_map__is_empty(cpu_list) && pmu ? pmu->cpus : cpu_list; + bool is_pmu_core; + struct perf_cpu_map *cpus; - cpus = perf_cpu_map__get(cpus); - if (pmu) + if (pmu) { + is_pmu_core = pmu->is_core; + cpus = perf_cpu_map__get(perf_cpu_map__is_empty(cpu_list) ? pmu->cpus : cpu_list); perf_pmu__warn_invalid_formats(pmu); - - if (pmu && (attr->type == PERF_TYPE_RAW || attr->type >= PERF_TYPE_MAX)) { - perf_pmu__warn_invalid_config(pmu, attr->config, name, - PERF_PMU_FORMAT_VALUE_CONFIG, "config"); - perf_pmu__warn_invalid_config(pmu, attr->config1, name, - PERF_PMU_FORMAT_VALUE_CONFIG1, "config1"); - perf_pmu__warn_invalid_config(pmu, attr->config2, name, - PERF_PMU_FORMAT_VALUE_CONFIG2, "config2"); - perf_pmu__warn_invalid_config(pmu, attr->config3, name, - PERF_PMU_FORMAT_VALUE_CONFIG3, "config3"); + if (attr->type == PERF_TYPE_RAW || attr->type >= PERF_TYPE_MAX) { + perf_pmu__warn_invalid_config(pmu, attr->config, name, + PERF_PMU_FORMAT_VALUE_CONFIG, "config"); + perf_pmu__warn_invalid_config(pmu, attr->config1, name, + PERF_PMU_FORMAT_VALUE_CONFIG1, "config1"); + perf_pmu__warn_invalid_config(pmu, attr->config2, name, + PERF_PMU_FORMAT_VALUE_CONFIG2, "config2"); + perf_pmu__warn_invalid_config(pmu, attr->config3, name, + PERF_PMU_FORMAT_VALUE_CONFIG3, "config3"); + } + } else { + is_pmu_core = (attr->type == PERF_TYPE_HARDWARE || + attr->type == PERF_TYPE_HW_CACHE); + if (perf_cpu_map__is_empty(cpu_list)) + cpus = is_pmu_core ? perf_cpu_map__new_online_cpus() : NULL; + else + cpus = perf_cpu_map__get(cpu_list); } if (init_attr) event_attr_init(attr); @@ -260,7 +269,7 @@ __add_event(struct list_head *list, int *idx, evsel->core.cpus = cpus; evsel->core.own_cpus = perf_cpu_map__get(cpus); evsel->core.requires_cpu = pmu ? pmu->is_uncore : false; - evsel->core.is_pmu_core = pmu ? pmu->is_core : false; + evsel->core.is_pmu_core = is_pmu_core; evsel->auto_merge_stats = auto_merge_stats; evsel->pmu = pmu; evsel->pmu_name = pmu ? strdup(pmu->name) : NULL; -- 2.46.0.662.g92d0881bb0-goog ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on 2024-09-18 22:01 [PATCH v3 0/3] Add support for cpu event term Ian Rogers 2024-09-18 22:01 ` [PATCH v3 1/3] libperf cpumap: Add ability to create CPU from a single CPU number Ian Rogers 2024-09-18 22:01 ` [PATCH v3 2/3] perf parse-events: Set is_pmu_core for legacy hardware events Ian Rogers @ 2024-09-18 22:01 ` Ian Rogers 2024-09-30 21:42 ` Namhyung Kim 2 siblings, 1 reply; 8+ messages in thread From: Ian Rogers @ 2024-09-18 22:01 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria, Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users, linux-kernel The -C option allows the CPUs for a list of events to be specified but its not possible to set the CPU for a single event. Add a term to allow this. The term isn't a general CPU list due to ',' already being a special character in event parsing instead multiple cpu= terms may be provided and they will be merged/unioned together. An example of mixing different types of events counted on different CPUs: ``` $ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1 Performance counter stats for 'system wide': CPU0 368,647 instructions/cpu=0/ # 0.26 insn per cycle CPU4 <not counted> instructions/cpu=0/ CPU5 <not counted> instructions/cpu=0/ CPU8 <not counted> instructions/cpu=0/ CPU0 <not counted> l1d-misses [cpu] CPU4 203,377 l1d-misses [cpu] CPU5 138,231 l1d-misses [cpu] CPU8 <not counted> l1d-misses [cpu] CPU0 <not counted> cpu/cpu=8/ CPU4 <not counted> cpu/cpu=8/ CPU5 <not counted> cpu/cpu=8/ CPU8 943,861 cpu/cpu=8/ CPU0 1,412,071 cycles CPU4 20,362,900 cycles CPU5 10,172,725 cycles CPU8 2,406,081 cycles 0.102925309 seconds time elapsed ``` Note, the event name of inst_retired.any is missing, reported as cpu/cpu=8/, as there are unmerged uniquify fixes: https://lore.kernel.org/lkml/20240510053705.2462258-3-irogers@google.com/ Signed-off-by: Ian Rogers <irogers@google.com> --- tools/perf/Documentation/perf-list.txt | 9 ++++ tools/perf/util/evsel_config.h | 1 + tools/perf/util/parse-events.c | 71 ++++++++++++++++++++++---- tools/perf/util/parse-events.h | 3 +- tools/perf/util/parse-events.l | 1 + tools/perf/util/pmu.c | 1 + 6 files changed, 74 insertions(+), 12 deletions(-) diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt index dea005410ec0..e0cd9bb8283e 100644 --- a/tools/perf/Documentation/perf-list.txt +++ b/tools/perf/Documentation/perf-list.txt @@ -274,6 +274,15 @@ Sums up the event counts for all hardware threads in a core, e.g.: perf stat -e cpu/event=0,umask=0x3,percore=1/ +cpu: + +Specifies the CPU to open the event upon. The value may be repeated to +specify opening the event on multiple CPUs: + + + perf stat -e instructions/cpu=0,cpu=2/,cycles/cpu=1,cpu=2/ -a sleep 1 + perf stat -e data_read/cpu=0/,data_write/cpu=1/ -a sleep 1 + EVENT GROUPS ------------ diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h index aee6f808b512..9630c4a24721 100644 --- a/tools/perf/util/evsel_config.h +++ b/tools/perf/util/evsel_config.h @@ -47,6 +47,7 @@ struct evsel_config_term { u32 aux_sample_size; u64 cfg_chg; char *str; + int cpu; } val; bool weak; }; diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 017d31d51ea4..63c31cfdd79b 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -7,6 +7,7 @@ #include <errno.h> #include <sys/ioctl.h> #include <sys/param.h> +#include "cpumap.h" #include "term.h" #include "env.h" #include "evlist.h" @@ -178,6 +179,26 @@ static char *get_config_name(const struct parse_events_terms *head_terms) return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_NAME); } +static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head_terms) +{ + struct parse_events_term *term; + struct perf_cpu_map *cpus = NULL; + + if (!head_terms) + return NULL; + + list_for_each_entry(term, &head_terms->terms, list) { + if (term->type_term == PARSE_EVENTS__TERM_TYPE_CPU) { + struct perf_cpu_map *cpu = perf_cpu_map__new_int(term->val.num); + + cpus = perf_cpu_map__merge(cpus, cpu); + perf_cpu_map__put(cpu); + } + } + + return cpus; +} + /** * fix_raw - For each raw term see if there is an event (aka alias) in pmu that * matches the raw's string value. If the string value matches an @@ -469,11 +490,12 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, bool found_supported = false; const char *config_name = get_config_name(parsed_terms); const char *metric_id = get_config_metric_id(parsed_terms); + struct perf_cpu_map *cpus = get_config_cpu(parsed_terms); + int ret = 0; while ((pmu = perf_pmus__scan(pmu)) != NULL) { LIST_HEAD(config_terms); struct perf_event_attr attr; - int ret; if (parse_events__filter_pmu(parse_state, pmu)) continue; @@ -487,7 +509,7 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, parsed_terms, perf_pmu__auto_merge_stats(pmu)); if (ret) - return ret; + goto out_err; continue; } @@ -507,20 +529,27 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, if (parsed_terms) { if (config_attr(&attr, parsed_terms, parse_state->error, - config_term_common)) - return -EINVAL; - - if (get_config_terms(parsed_terms, &config_terms)) - return -ENOMEM; + config_term_common)) { + ret = -EINVAL; + goto out_err; + } + if (get_config_terms(parsed_terms, &config_terms)) { + ret = -ENOMEM; + goto out_err; + } } if (__add_event(list, idx, &attr, /*init_attr*/true, config_name ?: name, metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, - /*cpu_list=*/NULL) == NULL) - return -ENOMEM; + cpus) == NULL) + ret = -ENOMEM; free_config_terms(&config_terms); + if (ret) + goto out_err; } +out_err: + perf_cpu_map__put(cpus); return found_supported ? 0 : -EINVAL; } @@ -835,6 +864,7 @@ static const char *config_term_name(enum parse_events__term_type term_type) [PARSE_EVENTS__TERM_TYPE_RAW] = "raw", [PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache", [PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware", + [PARSE_EVENTS__TERM_TYPE_CPU] = "cpu", }; if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR) return "unknown term"; @@ -864,6 +894,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er case PARSE_EVENTS__TERM_TYPE_METRIC_ID: case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: case PARSE_EVENTS__TERM_TYPE_PERCORE: + case PARSE_EVENTS__TERM_TYPE_CPU: return true; case PARSE_EVENTS__TERM_TYPE_USER: case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ: @@ -1007,6 +1038,15 @@ do { \ return -EINVAL; } break; + case PARSE_EVENTS__TERM_TYPE_CPU: + CHECK_TYPE_VAL(NUM); + if (term->val.num >= (u64)cpu__max_present_cpu().cpu) { + parse_events_error__handle(err, term->err_val, + strdup("too big"), + NULL); + return -EINVAL; + } + break; case PARSE_EVENTS__TERM_TYPE_DRV_CFG: case PARSE_EVENTS__TERM_TYPE_USER: case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: @@ -1133,6 +1173,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, case PARSE_EVENTS__TERM_TYPE_RAW: case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: case PARSE_EVENTS__TERM_TYPE_HARDWARE: + case PARSE_EVENTS__TERM_TYPE_CPU: default: if (err) { parse_events_error__handle(err, term->err_term, @@ -1264,6 +1305,7 @@ do { \ case PARSE_EVENTS__TERM_TYPE_RAW: case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: case PARSE_EVENTS__TERM_TYPE_HARDWARE: + case PARSE_EVENTS__TERM_TYPE_CPU: default: break; } @@ -1317,6 +1359,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head case PARSE_EVENTS__TERM_TYPE_RAW: case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: case PARSE_EVENTS__TERM_TYPE_HARDWARE: + case PARSE_EVENTS__TERM_TYPE_CPU: default: break; } @@ -1371,6 +1414,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, struct perf_event_attr attr; LIST_HEAD(config_terms); const char *name, *metric_id; + struct perf_cpu_map *cpus; int ret; memset(&attr, 0, sizeof(attr)); @@ -1392,9 +1436,11 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, name = get_config_name(head_config); metric_id = get_config_metric_id(head_config); + cpus = get_config_cpu(head_config); ret = __add_event(list, &parse_state->idx, &attr, /*init_attr*/true, name, metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, - /*cpu_list=*/NULL) ? 0 : -ENOMEM; + cpus) ? 0 : -ENOMEM; + perf_cpu_map__put(cpus); free_config_terms(&config_terms); return ret; } @@ -1461,6 +1507,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, LIST_HEAD(config_terms); struct parse_events_terms parsed_terms; bool alias_rewrote_terms = false; + struct perf_cpu_map *term_cpu = NULL; if (verbose > 1) { struct strbuf sb; @@ -1552,10 +1599,12 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, return -EINVAL; } + term_cpu = get_config_cpu(&parsed_terms); evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, get_config_name(&parsed_terms), get_config_metric_id(&parsed_terms), pmu, - &config_terms, auto_merge_stats, /*cpu_list=*/NULL); + &config_terms, auto_merge_stats, term_cpu); + perf_cpu_map__put(term_cpu); if (!evsel) { parse_events_terms__exit(&parsed_terms); return -ENOMEM; diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index 10cc9c433116..2532c81d4f9a 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -79,7 +79,8 @@ enum parse_events__term_type { PARSE_EVENTS__TERM_TYPE_RAW, PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE, PARSE_EVENTS__TERM_TYPE_HARDWARE, -#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_HARDWARE + 1) + PARSE_EVENTS__TERM_TYPE_CPU, +#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_CPU + 1) }; struct parse_events_term { diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l index 5a0bcd7f166a..635d216632d7 100644 --- a/tools/perf/util/parse-events.l +++ b/tools/perf/util/parse-events.l @@ -331,6 +331,7 @@ percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); } aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); } aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); } metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } +cpu { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CPU); } cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); } stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); } stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); } diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index 61bdda01a05a..4f68026a97bb 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -1738,6 +1738,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call "percore", "aux-output", "aux-sample-size=number", + "cpu=number", }; struct perf_pmu_format *format; int ret; -- 2.46.0.662.g92d0881bb0-goog ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on 2024-09-18 22:01 ` [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on Ian Rogers @ 2024-09-30 21:42 ` Namhyung Kim 2024-10-01 2:03 ` Ian Rogers 0 siblings, 1 reply; 8+ messages in thread From: Namhyung Kim @ 2024-09-30 21:42 UTC (permalink / raw) To: Ian Rogers Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria, Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users, linux-kernel On Thu, Sep 19, 2024 at 12:01:33AM +0200, Ian Rogers wrote: > The -C option allows the CPUs for a list of events to be specified but > its not possible to set the CPU for a single event. Add a term to > allow this. The term isn't a general CPU list due to ',' already being > a special character in event parsing instead multiple cpu= terms may > be provided and they will be merged/unioned together. > > An example of mixing different types of events counted on different CPUs: > ``` > $ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1 > > Performance counter stats for 'system wide': > > CPU0 368,647 instructions/cpu=0/ # 0.26 insn per cycle > CPU4 <not counted> instructions/cpu=0/ > CPU5 <not counted> instructions/cpu=0/ > CPU8 <not counted> instructions/cpu=0/ I think this should have an entry for CPU0 only. IOW the cpu parameter in the event specification has a higher priority than -C option. Thanks, Namhyung > CPU0 <not counted> l1d-misses [cpu] > CPU4 203,377 l1d-misses [cpu] > CPU5 138,231 l1d-misses [cpu] > CPU8 <not counted> l1d-misses [cpu] > CPU0 <not counted> cpu/cpu=8/ > CPU4 <not counted> cpu/cpu=8/ > CPU5 <not counted> cpu/cpu=8/ > CPU8 943,861 cpu/cpu=8/ > CPU0 1,412,071 cycles > CPU4 20,362,900 cycles > CPU5 10,172,725 cycles > CPU8 2,406,081 cycles > > 0.102925309 seconds time elapsed > ``` > > Note, the event name of inst_retired.any is missing, reported as > cpu/cpu=8/, as there are unmerged uniquify fixes: > https://lore.kernel.org/lkml/20240510053705.2462258-3-irogers@google.com/ > > Signed-off-by: Ian Rogers <irogers@google.com> > --- > tools/perf/Documentation/perf-list.txt | 9 ++++ > tools/perf/util/evsel_config.h | 1 + > tools/perf/util/parse-events.c | 71 ++++++++++++++++++++++---- > tools/perf/util/parse-events.h | 3 +- > tools/perf/util/parse-events.l | 1 + > tools/perf/util/pmu.c | 1 + > 6 files changed, 74 insertions(+), 12 deletions(-) > > diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt > index dea005410ec0..e0cd9bb8283e 100644 > --- a/tools/perf/Documentation/perf-list.txt > +++ b/tools/perf/Documentation/perf-list.txt > @@ -274,6 +274,15 @@ Sums up the event counts for all hardware threads in a core, e.g.: > > perf stat -e cpu/event=0,umask=0x3,percore=1/ > > +cpu: > + > +Specifies the CPU to open the event upon. The value may be repeated to > +specify opening the event on multiple CPUs: > + > + > + perf stat -e instructions/cpu=0,cpu=2/,cycles/cpu=1,cpu=2/ -a sleep 1 > + perf stat -e data_read/cpu=0/,data_write/cpu=1/ -a sleep 1 > + > > EVENT GROUPS > ------------ > diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h > index aee6f808b512..9630c4a24721 100644 > --- a/tools/perf/util/evsel_config.h > +++ b/tools/perf/util/evsel_config.h > @@ -47,6 +47,7 @@ struct evsel_config_term { > u32 aux_sample_size; > u64 cfg_chg; > char *str; > + int cpu; > } val; > bool weak; > }; > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > index 017d31d51ea4..63c31cfdd79b 100644 > --- a/tools/perf/util/parse-events.c > +++ b/tools/perf/util/parse-events.c > @@ -7,6 +7,7 @@ > #include <errno.h> > #include <sys/ioctl.h> > #include <sys/param.h> > +#include "cpumap.h" > #include "term.h" > #include "env.h" > #include "evlist.h" > @@ -178,6 +179,26 @@ static char *get_config_name(const struct parse_events_terms *head_terms) > return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_NAME); > } > > +static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head_terms) > +{ > + struct parse_events_term *term; > + struct perf_cpu_map *cpus = NULL; > + > + if (!head_terms) > + return NULL; > + > + list_for_each_entry(term, &head_terms->terms, list) { > + if (term->type_term == PARSE_EVENTS__TERM_TYPE_CPU) { > + struct perf_cpu_map *cpu = perf_cpu_map__new_int(term->val.num); > + > + cpus = perf_cpu_map__merge(cpus, cpu); > + perf_cpu_map__put(cpu); > + } > + } > + > + return cpus; > +} > + > /** > * fix_raw - For each raw term see if there is an event (aka alias) in pmu that > * matches the raw's string value. If the string value matches an > @@ -469,11 +490,12 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > bool found_supported = false; > const char *config_name = get_config_name(parsed_terms); > const char *metric_id = get_config_metric_id(parsed_terms); > + struct perf_cpu_map *cpus = get_config_cpu(parsed_terms); > + int ret = 0; > > while ((pmu = perf_pmus__scan(pmu)) != NULL) { > LIST_HEAD(config_terms); > struct perf_event_attr attr; > - int ret; > > if (parse_events__filter_pmu(parse_state, pmu)) > continue; > @@ -487,7 +509,7 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > parsed_terms, > perf_pmu__auto_merge_stats(pmu)); > if (ret) > - return ret; > + goto out_err; > continue; > } > > @@ -507,20 +529,27 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > if (parsed_terms) { > if (config_attr(&attr, parsed_terms, parse_state->error, > - config_term_common)) > - return -EINVAL; > - > - if (get_config_terms(parsed_terms, &config_terms)) > - return -ENOMEM; > + config_term_common)) { > + ret = -EINVAL; > + goto out_err; > + } > + if (get_config_terms(parsed_terms, &config_terms)) { > + ret = -ENOMEM; > + goto out_err; > + } > } > > if (__add_event(list, idx, &attr, /*init_attr*/true, config_name ?: name, > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > - /*cpu_list=*/NULL) == NULL) > - return -ENOMEM; > + cpus) == NULL) > + ret = -ENOMEM; > > free_config_terms(&config_terms); > + if (ret) > + goto out_err; > } > +out_err: > + perf_cpu_map__put(cpus); > return found_supported ? 0 : -EINVAL; > } > > @@ -835,6 +864,7 @@ static const char *config_term_name(enum parse_events__term_type term_type) > [PARSE_EVENTS__TERM_TYPE_RAW] = "raw", > [PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache", > [PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware", > + [PARSE_EVENTS__TERM_TYPE_CPU] = "cpu", > }; > if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR) > return "unknown term"; > @@ -864,6 +894,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er > case PARSE_EVENTS__TERM_TYPE_METRIC_ID: > case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: > case PARSE_EVENTS__TERM_TYPE_PERCORE: > + case PARSE_EVENTS__TERM_TYPE_CPU: > return true; > case PARSE_EVENTS__TERM_TYPE_USER: > case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ: > @@ -1007,6 +1038,15 @@ do { \ > return -EINVAL; > } > break; > + case PARSE_EVENTS__TERM_TYPE_CPU: > + CHECK_TYPE_VAL(NUM); > + if (term->val.num >= (u64)cpu__max_present_cpu().cpu) { > + parse_events_error__handle(err, term->err_val, > + strdup("too big"), > + NULL); > + return -EINVAL; > + } > + break; > case PARSE_EVENTS__TERM_TYPE_DRV_CFG: > case PARSE_EVENTS__TERM_TYPE_USER: > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > @@ -1133,6 +1173,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, > case PARSE_EVENTS__TERM_TYPE_RAW: > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > + case PARSE_EVENTS__TERM_TYPE_CPU: > default: > if (err) { > parse_events_error__handle(err, term->err_term, > @@ -1264,6 +1305,7 @@ do { \ > case PARSE_EVENTS__TERM_TYPE_RAW: > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > + case PARSE_EVENTS__TERM_TYPE_CPU: > default: > break; > } > @@ -1317,6 +1359,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head > case PARSE_EVENTS__TERM_TYPE_RAW: > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > + case PARSE_EVENTS__TERM_TYPE_CPU: > default: > break; > } > @@ -1371,6 +1414,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > struct perf_event_attr attr; > LIST_HEAD(config_terms); > const char *name, *metric_id; > + struct perf_cpu_map *cpus; > int ret; > > memset(&attr, 0, sizeof(attr)); > @@ -1392,9 +1436,11 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > > name = get_config_name(head_config); > metric_id = get_config_metric_id(head_config); > + cpus = get_config_cpu(head_config); > ret = __add_event(list, &parse_state->idx, &attr, /*init_attr*/true, name, > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > - /*cpu_list=*/NULL) ? 0 : -ENOMEM; > + cpus) ? 0 : -ENOMEM; > + perf_cpu_map__put(cpus); > free_config_terms(&config_terms); > return ret; > } > @@ -1461,6 +1507,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > LIST_HEAD(config_terms); > struct parse_events_terms parsed_terms; > bool alias_rewrote_terms = false; > + struct perf_cpu_map *term_cpu = NULL; > > if (verbose > 1) { > struct strbuf sb; > @@ -1552,10 +1599,12 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > return -EINVAL; > } > > + term_cpu = get_config_cpu(&parsed_terms); > evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, > get_config_name(&parsed_terms), > get_config_metric_id(&parsed_terms), pmu, > - &config_terms, auto_merge_stats, /*cpu_list=*/NULL); > + &config_terms, auto_merge_stats, term_cpu); > + perf_cpu_map__put(term_cpu); > if (!evsel) { > parse_events_terms__exit(&parsed_terms); > return -ENOMEM; > diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h > index 10cc9c433116..2532c81d4f9a 100644 > --- a/tools/perf/util/parse-events.h > +++ b/tools/perf/util/parse-events.h > @@ -79,7 +79,8 @@ enum parse_events__term_type { > PARSE_EVENTS__TERM_TYPE_RAW, > PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE, > PARSE_EVENTS__TERM_TYPE_HARDWARE, > -#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_HARDWARE + 1) > + PARSE_EVENTS__TERM_TYPE_CPU, > +#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_CPU + 1) > }; > > struct parse_events_term { > diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l > index 5a0bcd7f166a..635d216632d7 100644 > --- a/tools/perf/util/parse-events.l > +++ b/tools/perf/util/parse-events.l > @@ -331,6 +331,7 @@ percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); } > aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); } > aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); } > metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } > +cpu { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CPU); } > cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); } > stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); } > stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); } > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c > index 61bdda01a05a..4f68026a97bb 100644 > --- a/tools/perf/util/pmu.c > +++ b/tools/perf/util/pmu.c > @@ -1738,6 +1738,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call > "percore", > "aux-output", > "aux-sample-size=number", > + "cpu=number", > }; > struct perf_pmu_format *format; > int ret; > -- > 2.46.0.662.g92d0881bb0-goog > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on 2024-09-30 21:42 ` Namhyung Kim @ 2024-10-01 2:03 ` Ian Rogers 2024-10-01 22:53 ` Namhyung Kim 0 siblings, 1 reply; 8+ messages in thread From: Ian Rogers @ 2024-10-01 2:03 UTC (permalink / raw) To: Namhyung Kim Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria, Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users, linux-kernel On Mon, Sep 30, 2024 at 2:42 PM Namhyung Kim <namhyung@kernel.org> wrote: > > On Thu, Sep 19, 2024 at 12:01:33AM +0200, Ian Rogers wrote: > > The -C option allows the CPUs for a list of events to be specified but > > its not possible to set the CPU for a single event. Add a term to > > allow this. The term isn't a general CPU list due to ',' already being > > a special character in event parsing instead multiple cpu= terms may > > be provided and they will be merged/unioned together. > > > > An example of mixing different types of events counted on different CPUs: > > ``` > > $ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1 > > > > Performance counter stats for 'system wide': > > > > CPU0 368,647 instructions/cpu=0/ # 0.26 insn per cycle > > CPU4 <not counted> instructions/cpu=0/ > > CPU5 <not counted> instructions/cpu=0/ > > CPU8 <not counted> instructions/cpu=0/ > > I think this should have an entry for CPU0 only. IOW the cpu parameter > in the event specification has a higher priority than -C option. I think you are agreeing with what the output shows and what is written in the commit message. See below for cleaning up the stat output. Thanks, Ian > Thanks, > Namhyung > > > > CPU0 <not counted> l1d-misses [cpu] > > CPU4 203,377 l1d-misses [cpu] > > CPU5 138,231 l1d-misses [cpu] > > CPU8 <not counted> l1d-misses [cpu] > > CPU0 <not counted> cpu/cpu=8/ > > CPU4 <not counted> cpu/cpu=8/ > > CPU5 <not counted> cpu/cpu=8/ > > CPU8 943,861 cpu/cpu=8/ > > CPU0 1,412,071 cycles > > CPU4 20,362,900 cycles > > CPU5 10,172,725 cycles > > CPU8 2,406,081 cycles > > > > 0.102925309 seconds time elapsed > > ``` > > > > Note, the event name of inst_retired.any is missing, reported as > > cpu/cpu=8/, as there are unmerged uniquify fixes: > > https://lore.kernel.org/lkml/20240510053705.2462258-3-irogers@google.com/ > > > > Signed-off-by: Ian Rogers <irogers@google.com> > > --- > > tools/perf/Documentation/perf-list.txt | 9 ++++ > > tools/perf/util/evsel_config.h | 1 + > > tools/perf/util/parse-events.c | 71 ++++++++++++++++++++++---- > > tools/perf/util/parse-events.h | 3 +- > > tools/perf/util/parse-events.l | 1 + > > tools/perf/util/pmu.c | 1 + > > 6 files changed, 74 insertions(+), 12 deletions(-) > > > > diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt > > index dea005410ec0..e0cd9bb8283e 100644 > > --- a/tools/perf/Documentation/perf-list.txt > > +++ b/tools/perf/Documentation/perf-list.txt > > @@ -274,6 +274,15 @@ Sums up the event counts for all hardware threads in a core, e.g.: > > > > perf stat -e cpu/event=0,umask=0x3,percore=1/ > > > > +cpu: > > + > > +Specifies the CPU to open the event upon. The value may be repeated to > > +specify opening the event on multiple CPUs: > > + > > + > > + perf stat -e instructions/cpu=0,cpu=2/,cycles/cpu=1,cpu=2/ -a sleep 1 > > + perf stat -e data_read/cpu=0/,data_write/cpu=1/ -a sleep 1 > > + > > > > EVENT GROUPS > > ------------ > > diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h > > index aee6f808b512..9630c4a24721 100644 > > --- a/tools/perf/util/evsel_config.h > > +++ b/tools/perf/util/evsel_config.h > > @@ -47,6 +47,7 @@ struct evsel_config_term { > > u32 aux_sample_size; > > u64 cfg_chg; > > char *str; > > + int cpu; > > } val; > > bool weak; > > }; > > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > > index 017d31d51ea4..63c31cfdd79b 100644 > > --- a/tools/perf/util/parse-events.c > > +++ b/tools/perf/util/parse-events.c > > @@ -7,6 +7,7 @@ > > #include <errno.h> > > #include <sys/ioctl.h> > > #include <sys/param.h> > > +#include "cpumap.h" > > #include "term.h" > > #include "env.h" > > #include "evlist.h" > > @@ -178,6 +179,26 @@ static char *get_config_name(const struct parse_events_terms *head_terms) > > return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_NAME); > > } > > > > +static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head_terms) > > +{ > > + struct parse_events_term *term; > > + struct perf_cpu_map *cpus = NULL; > > + > > + if (!head_terms) > > + return NULL; > > + > > + list_for_each_entry(term, &head_terms->terms, list) { > > + if (term->type_term == PARSE_EVENTS__TERM_TYPE_CPU) { > > + struct perf_cpu_map *cpu = perf_cpu_map__new_int(term->val.num); > > + > > + cpus = perf_cpu_map__merge(cpus, cpu); > > + perf_cpu_map__put(cpu); > > + } > > + } > > + > > + return cpus; > > +} > > + > > /** > > * fix_raw - For each raw term see if there is an event (aka alias) in pmu that > > * matches the raw's string value. If the string value matches an > > @@ -469,11 +490,12 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > bool found_supported = false; > > const char *config_name = get_config_name(parsed_terms); > > const char *metric_id = get_config_metric_id(parsed_terms); > > + struct perf_cpu_map *cpus = get_config_cpu(parsed_terms); > > + int ret = 0; > > > > while ((pmu = perf_pmus__scan(pmu)) != NULL) { > > LIST_HEAD(config_terms); > > struct perf_event_attr attr; > > - int ret; > > > > if (parse_events__filter_pmu(parse_state, pmu)) > > continue; > > @@ -487,7 +509,7 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > parsed_terms, > > perf_pmu__auto_merge_stats(pmu)); > > if (ret) > > - return ret; > > + goto out_err; > > continue; > > } > > > > @@ -507,20 +529,27 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > > > if (parsed_terms) { > > if (config_attr(&attr, parsed_terms, parse_state->error, > > - config_term_common)) > > - return -EINVAL; > > - > > - if (get_config_terms(parsed_terms, &config_terms)) > > - return -ENOMEM; > > + config_term_common)) { > > + ret = -EINVAL; > > + goto out_err; > > + } > > + if (get_config_terms(parsed_terms, &config_terms)) { > > + ret = -ENOMEM; > > + goto out_err; > > + } > > } > > > > if (__add_event(list, idx, &attr, /*init_attr*/true, config_name ?: name, > > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > > - /*cpu_list=*/NULL) == NULL) > > - return -ENOMEM; > > + cpus) == NULL) > > + ret = -ENOMEM; > > > > free_config_terms(&config_terms); > > + if (ret) > > + goto out_err; > > } > > +out_err: > > + perf_cpu_map__put(cpus); > > return found_supported ? 0 : -EINVAL; > > } > > > > @@ -835,6 +864,7 @@ static const char *config_term_name(enum parse_events__term_type term_type) > > [PARSE_EVENTS__TERM_TYPE_RAW] = "raw", > > [PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache", > > [PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware", > > + [PARSE_EVENTS__TERM_TYPE_CPU] = "cpu", > > }; > > if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR) > > return "unknown term"; > > @@ -864,6 +894,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er > > case PARSE_EVENTS__TERM_TYPE_METRIC_ID: > > case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: > > case PARSE_EVENTS__TERM_TYPE_PERCORE: > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > return true; > > case PARSE_EVENTS__TERM_TYPE_USER: > > case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ: > > @@ -1007,6 +1038,15 @@ do { \ > > return -EINVAL; > > } > > break; > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > + CHECK_TYPE_VAL(NUM); > > + if (term->val.num >= (u64)cpu__max_present_cpu().cpu) { > > + parse_events_error__handle(err, term->err_val, > > + strdup("too big"), > > + NULL); > > + return -EINVAL; > > + } > > + break; > > case PARSE_EVENTS__TERM_TYPE_DRV_CFG: > > case PARSE_EVENTS__TERM_TYPE_USER: > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > @@ -1133,6 +1173,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, > > case PARSE_EVENTS__TERM_TYPE_RAW: > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > default: > > if (err) { > > parse_events_error__handle(err, term->err_term, > > @@ -1264,6 +1305,7 @@ do { \ > > case PARSE_EVENTS__TERM_TYPE_RAW: > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > default: > > break; > > } > > @@ -1317,6 +1359,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head > > case PARSE_EVENTS__TERM_TYPE_RAW: > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > default: > > break; > > } > > @@ -1371,6 +1414,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > > struct perf_event_attr attr; > > LIST_HEAD(config_terms); > > const char *name, *metric_id; > > + struct perf_cpu_map *cpus; > > int ret; > > > > memset(&attr, 0, sizeof(attr)); > > @@ -1392,9 +1436,11 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > > > > name = get_config_name(head_config); > > metric_id = get_config_metric_id(head_config); > > + cpus = get_config_cpu(head_config); > > ret = __add_event(list, &parse_state->idx, &attr, /*init_attr*/true, name, > > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > > - /*cpu_list=*/NULL) ? 0 : -ENOMEM; > > + cpus) ? 0 : -ENOMEM; > > + perf_cpu_map__put(cpus); > > free_config_terms(&config_terms); > > return ret; > > } > > @@ -1461,6 +1507,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > > LIST_HEAD(config_terms); > > struct parse_events_terms parsed_terms; > > bool alias_rewrote_terms = false; > > + struct perf_cpu_map *term_cpu = NULL; > > > > if (verbose > 1) { > > struct strbuf sb; > > @@ -1552,10 +1599,12 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > > return -EINVAL; > > } > > > > + term_cpu = get_config_cpu(&parsed_terms); > > evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, > > get_config_name(&parsed_terms), > > get_config_metric_id(&parsed_terms), pmu, > > - &config_terms, auto_merge_stats, /*cpu_list=*/NULL); > > + &config_terms, auto_merge_stats, term_cpu); > > + perf_cpu_map__put(term_cpu); > > if (!evsel) { > > parse_events_terms__exit(&parsed_terms); > > return -ENOMEM; > > diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h > > index 10cc9c433116..2532c81d4f9a 100644 > > --- a/tools/perf/util/parse-events.h > > +++ b/tools/perf/util/parse-events.h > > @@ -79,7 +79,8 @@ enum parse_events__term_type { > > PARSE_EVENTS__TERM_TYPE_RAW, > > PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE, > > PARSE_EVENTS__TERM_TYPE_HARDWARE, > > -#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_HARDWARE + 1) > > + PARSE_EVENTS__TERM_TYPE_CPU, > > +#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_CPU + 1) > > }; > > > > struct parse_events_term { > > diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l > > index 5a0bcd7f166a..635d216632d7 100644 > > --- a/tools/perf/util/parse-events.l > > +++ b/tools/perf/util/parse-events.l > > @@ -331,6 +331,7 @@ percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); } > > aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); } > > aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); } > > metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } > > +cpu { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CPU); } > > cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); } > > stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); } > > stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); } > > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c > > index 61bdda01a05a..4f68026a97bb 100644 > > --- a/tools/perf/util/pmu.c > > +++ b/tools/perf/util/pmu.c > > @@ -1738,6 +1738,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call > > "percore", > > "aux-output", > > "aux-sample-size=number", > > + "cpu=number", > > }; > > struct perf_pmu_format *format; > > int ret; > > -- > > 2.46.0.662.g92d0881bb0-goog > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on 2024-10-01 2:03 ` Ian Rogers @ 2024-10-01 22:53 ` Namhyung Kim 2024-10-01 23:18 ` Ian Rogers 0 siblings, 1 reply; 8+ messages in thread From: Namhyung Kim @ 2024-10-01 22:53 UTC (permalink / raw) To: Ian Rogers Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria, Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users, linux-kernel On Mon, Sep 30, 2024 at 07:03:35PM -0700, Ian Rogers wrote: > On Mon, Sep 30, 2024 at 2:42 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > On Thu, Sep 19, 2024 at 12:01:33AM +0200, Ian Rogers wrote: > > > The -C option allows the CPUs for a list of events to be specified but > > > its not possible to set the CPU for a single event. Add a term to > > > allow this. The term isn't a general CPU list due to ',' already being > > > a special character in event parsing instead multiple cpu= terms may > > > be provided and they will be merged/unioned together. > > > > > > An example of mixing different types of events counted on different CPUs: > > > ``` > > > $ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1 > > > > > > Performance counter stats for 'system wide': > > > > > > CPU0 368,647 instructions/cpu=0/ # 0.26 insn per cycle > > > CPU4 <not counted> instructions/cpu=0/ > > > CPU5 <not counted> instructions/cpu=0/ > > > CPU8 <not counted> instructions/cpu=0/ > > > > I think this should have an entry for CPU0 only. IOW the cpu parameter > > in the event specification has a higher priority than -C option. > > I think you are agreeing with what the output shows and what is > written in the commit message. See below for cleaning up the stat > output. I agree it'd be helpful if we can specify different CPUs for each event. But I want the output omitting the not counted events on other CPUs like CPU0 368,647 instructions/cpu=0/ # 0.26 insn per cycle CPU4 203,377 l1d-misses [cpu] CPU5 138,231 l1d-misses [cpu] CPU8 943,861 cpu/cpu=8/ CPU0 1,412,071 cycles CPU4 20,362,900 cycles CPU5 10,172,725 cycles CPU8 2,406,081 cycles Thanks, Namhyung > > > > > > > CPU0 <not counted> l1d-misses [cpu] > > > CPU4 203,377 l1d-misses [cpu] > > > CPU5 138,231 l1d-misses [cpu] > > > CPU8 <not counted> l1d-misses [cpu] > > > CPU0 <not counted> cpu/cpu=8/ > > > CPU4 <not counted> cpu/cpu=8/ > > > CPU5 <not counted> cpu/cpu=8/ > > > CPU8 943,861 cpu/cpu=8/ > > > CPU0 1,412,071 cycles > > > CPU4 20,362,900 cycles > > > CPU5 10,172,725 cycles > > > CPU8 2,406,081 cycles > > > > > > 0.102925309 seconds time elapsed > > > ``` > > > > > > Note, the event name of inst_retired.any is missing, reported as > > > cpu/cpu=8/, as there are unmerged uniquify fixes: > > > https://lore.kernel.org/lkml/20240510053705.2462258-3-irogers@google.com/ > > > > > > Signed-off-by: Ian Rogers <irogers@google.com> > > > --- > > > tools/perf/Documentation/perf-list.txt | 9 ++++ > > > tools/perf/util/evsel_config.h | 1 + > > > tools/perf/util/parse-events.c | 71 ++++++++++++++++++++++---- > > > tools/perf/util/parse-events.h | 3 +- > > > tools/perf/util/parse-events.l | 1 + > > > tools/perf/util/pmu.c | 1 + > > > 6 files changed, 74 insertions(+), 12 deletions(-) > > > > > > diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt > > > index dea005410ec0..e0cd9bb8283e 100644 > > > --- a/tools/perf/Documentation/perf-list.txt > > > +++ b/tools/perf/Documentation/perf-list.txt > > > @@ -274,6 +274,15 @@ Sums up the event counts for all hardware threads in a core, e.g.: > > > > > > perf stat -e cpu/event=0,umask=0x3,percore=1/ > > > > > > +cpu: > > > + > > > +Specifies the CPU to open the event upon. The value may be repeated to > > > +specify opening the event on multiple CPUs: > > > + > > > + > > > + perf stat -e instructions/cpu=0,cpu=2/,cycles/cpu=1,cpu=2/ -a sleep 1 > > > + perf stat -e data_read/cpu=0/,data_write/cpu=1/ -a sleep 1 > > > + > > > > > > EVENT GROUPS > > > ------------ > > > diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h > > > index aee6f808b512..9630c4a24721 100644 > > > --- a/tools/perf/util/evsel_config.h > > > +++ b/tools/perf/util/evsel_config.h > > > @@ -47,6 +47,7 @@ struct evsel_config_term { > > > u32 aux_sample_size; > > > u64 cfg_chg; > > > char *str; > > > + int cpu; > > > } val; > > > bool weak; > > > }; > > > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > > > index 017d31d51ea4..63c31cfdd79b 100644 > > > --- a/tools/perf/util/parse-events.c > > > +++ b/tools/perf/util/parse-events.c > > > @@ -7,6 +7,7 @@ > > > #include <errno.h> > > > #include <sys/ioctl.h> > > > #include <sys/param.h> > > > +#include "cpumap.h" > > > #include "term.h" > > > #include "env.h" > > > #include "evlist.h" > > > @@ -178,6 +179,26 @@ static char *get_config_name(const struct parse_events_terms *head_terms) > > > return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_NAME); > > > } > > > > > > +static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head_terms) > > > +{ > > > + struct parse_events_term *term; > > > + struct perf_cpu_map *cpus = NULL; > > > + > > > + if (!head_terms) > > > + return NULL; > > > + > > > + list_for_each_entry(term, &head_terms->terms, list) { > > > + if (term->type_term == PARSE_EVENTS__TERM_TYPE_CPU) { > > > + struct perf_cpu_map *cpu = perf_cpu_map__new_int(term->val.num); > > > + > > > + cpus = perf_cpu_map__merge(cpus, cpu); > > > + perf_cpu_map__put(cpu); > > > + } > > > + } > > > + > > > + return cpus; > > > +} > > > + > > > /** > > > * fix_raw - For each raw term see if there is an event (aka alias) in pmu that > > > * matches the raw's string value. If the string value matches an > > > @@ -469,11 +490,12 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > > bool found_supported = false; > > > const char *config_name = get_config_name(parsed_terms); > > > const char *metric_id = get_config_metric_id(parsed_terms); > > > + struct perf_cpu_map *cpus = get_config_cpu(parsed_terms); > > > + int ret = 0; > > > > > > while ((pmu = perf_pmus__scan(pmu)) != NULL) { > > > LIST_HEAD(config_terms); > > > struct perf_event_attr attr; > > > - int ret; > > > > > > if (parse_events__filter_pmu(parse_state, pmu)) > > > continue; > > > @@ -487,7 +509,7 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > > parsed_terms, > > > perf_pmu__auto_merge_stats(pmu)); > > > if (ret) > > > - return ret; > > > + goto out_err; > > > continue; > > > } > > > > > > @@ -507,20 +529,27 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > > > > > if (parsed_terms) { > > > if (config_attr(&attr, parsed_terms, parse_state->error, > > > - config_term_common)) > > > - return -EINVAL; > > > - > > > - if (get_config_terms(parsed_terms, &config_terms)) > > > - return -ENOMEM; > > > + config_term_common)) { > > > + ret = -EINVAL; > > > + goto out_err; > > > + } > > > + if (get_config_terms(parsed_terms, &config_terms)) { > > > + ret = -ENOMEM; > > > + goto out_err; > > > + } > > > } > > > > > > if (__add_event(list, idx, &attr, /*init_attr*/true, config_name ?: name, > > > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > > > - /*cpu_list=*/NULL) == NULL) > > > - return -ENOMEM; > > > + cpus) == NULL) > > > + ret = -ENOMEM; > > > > > > free_config_terms(&config_terms); > > > + if (ret) > > > + goto out_err; > > > } > > > +out_err: > > > + perf_cpu_map__put(cpus); > > > return found_supported ? 0 : -EINVAL; > > > } > > > > > > @@ -835,6 +864,7 @@ static const char *config_term_name(enum parse_events__term_type term_type) > > > [PARSE_EVENTS__TERM_TYPE_RAW] = "raw", > > > [PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache", > > > [PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware", > > > + [PARSE_EVENTS__TERM_TYPE_CPU] = "cpu", > > > }; > > > if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR) > > > return "unknown term"; > > > @@ -864,6 +894,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er > > > case PARSE_EVENTS__TERM_TYPE_METRIC_ID: > > > case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: > > > case PARSE_EVENTS__TERM_TYPE_PERCORE: > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > return true; > > > case PARSE_EVENTS__TERM_TYPE_USER: > > > case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ: > > > @@ -1007,6 +1038,15 @@ do { \ > > > return -EINVAL; > > > } > > > break; > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > + CHECK_TYPE_VAL(NUM); > > > + if (term->val.num >= (u64)cpu__max_present_cpu().cpu) { > > > + parse_events_error__handle(err, term->err_val, > > > + strdup("too big"), > > > + NULL); > > > + return -EINVAL; > > > + } > > > + break; > > > case PARSE_EVENTS__TERM_TYPE_DRV_CFG: > > > case PARSE_EVENTS__TERM_TYPE_USER: > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > @@ -1133,6 +1173,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, > > > case PARSE_EVENTS__TERM_TYPE_RAW: > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > default: > > > if (err) { > > > parse_events_error__handle(err, term->err_term, > > > @@ -1264,6 +1305,7 @@ do { \ > > > case PARSE_EVENTS__TERM_TYPE_RAW: > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > default: > > > break; > > > } > > > @@ -1317,6 +1359,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head > > > case PARSE_EVENTS__TERM_TYPE_RAW: > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > default: > > > break; > > > } > > > @@ -1371,6 +1414,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > > > struct perf_event_attr attr; > > > LIST_HEAD(config_terms); > > > const char *name, *metric_id; > > > + struct perf_cpu_map *cpus; > > > int ret; > > > > > > memset(&attr, 0, sizeof(attr)); > > > @@ -1392,9 +1436,11 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > > > > > > name = get_config_name(head_config); > > > metric_id = get_config_metric_id(head_config); > > > + cpus = get_config_cpu(head_config); > > > ret = __add_event(list, &parse_state->idx, &attr, /*init_attr*/true, name, > > > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > > > - /*cpu_list=*/NULL) ? 0 : -ENOMEM; > > > + cpus) ? 0 : -ENOMEM; > > > + perf_cpu_map__put(cpus); > > > free_config_terms(&config_terms); > > > return ret; > > > } > > > @@ -1461,6 +1507,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > > > LIST_HEAD(config_terms); > > > struct parse_events_terms parsed_terms; > > > bool alias_rewrote_terms = false; > > > + struct perf_cpu_map *term_cpu = NULL; > > > > > > if (verbose > 1) { > > > struct strbuf sb; > > > @@ -1552,10 +1599,12 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > > > return -EINVAL; > > > } > > > > > > + term_cpu = get_config_cpu(&parsed_terms); > > > evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, > > > get_config_name(&parsed_terms), > > > get_config_metric_id(&parsed_terms), pmu, > > > - &config_terms, auto_merge_stats, /*cpu_list=*/NULL); > > > + &config_terms, auto_merge_stats, term_cpu); > > > + perf_cpu_map__put(term_cpu); > > > if (!evsel) { > > > parse_events_terms__exit(&parsed_terms); > > > return -ENOMEM; > > > diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h > > > index 10cc9c433116..2532c81d4f9a 100644 > > > --- a/tools/perf/util/parse-events.h > > > +++ b/tools/perf/util/parse-events.h > > > @@ -79,7 +79,8 @@ enum parse_events__term_type { > > > PARSE_EVENTS__TERM_TYPE_RAW, > > > PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE, > > > PARSE_EVENTS__TERM_TYPE_HARDWARE, > > > -#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_HARDWARE + 1) > > > + PARSE_EVENTS__TERM_TYPE_CPU, > > > +#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_CPU + 1) > > > }; > > > > > > struct parse_events_term { > > > diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l > > > index 5a0bcd7f166a..635d216632d7 100644 > > > --- a/tools/perf/util/parse-events.l > > > +++ b/tools/perf/util/parse-events.l > > > @@ -331,6 +331,7 @@ percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); } > > > aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); } > > > aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); } > > > metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } > > > +cpu { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CPU); } > > > cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); } > > > stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); } > > > stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); } > > > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c > > > index 61bdda01a05a..4f68026a97bb 100644 > > > --- a/tools/perf/util/pmu.c > > > +++ b/tools/perf/util/pmu.c > > > @@ -1738,6 +1738,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call > > > "percore", > > > "aux-output", > > > "aux-sample-size=number", > > > + "cpu=number", > > > }; > > > struct perf_pmu_format *format; > > > int ret; > > > -- > > > 2.46.0.662.g92d0881bb0-goog > > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on 2024-10-01 22:53 ` Namhyung Kim @ 2024-10-01 23:18 ` Ian Rogers 0 siblings, 0 replies; 8+ messages in thread From: Ian Rogers @ 2024-10-01 23:18 UTC (permalink / raw) To: Namhyung Kim Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang, James Clark, Ravi Bangoria, Weilin Wang, Dominique Martinet, Jing Zhang, linux-perf-users, linux-kernel On Tue, Oct 1, 2024 at 3:53 PM Namhyung Kim <namhyung@kernel.org> wrote: > > On Mon, Sep 30, 2024 at 07:03:35PM -0700, Ian Rogers wrote: > > On Mon, Sep 30, 2024 at 2:42 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > > > On Thu, Sep 19, 2024 at 12:01:33AM +0200, Ian Rogers wrote: > > > > The -C option allows the CPUs for a list of events to be specified but > > > > its not possible to set the CPU for a single event. Add a term to > > > > allow this. The term isn't a general CPU list due to ',' already being > > > > a special character in event parsing instead multiple cpu= terms may > > > > be provided and they will be merged/unioned together. > > > > > > > > An example of mixing different types of events counted on different CPUs: > > > > ``` > > > > $ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1 > > > > > > > > Performance counter stats for 'system wide': > > > > > > > > CPU0 368,647 instructions/cpu=0/ # 0.26 insn per cycle > > > > CPU4 <not counted> instructions/cpu=0/ > > > > CPU5 <not counted> instructions/cpu=0/ > > > > CPU8 <not counted> instructions/cpu=0/ > > > > > > I think this should have an entry for CPU0 only. IOW the cpu parameter > > > in the event specification has a higher priority than -C option. > > > > I think you are agreeing with what the output shows and what is > > written in the commit message. See below for cleaning up the stat > > output. > > I agree it'd be helpful if we can specify different CPUs for each event. > But I want the output omitting the not counted events on other CPUs like > > CPU0 368,647 instructions/cpu=0/ # 0.26 insn per cycle > CPU4 203,377 l1d-misses [cpu] > CPU5 138,231 l1d-misses [cpu] > CPU8 943,861 cpu/cpu=8/ > CPU0 1,412,071 cycles > CPU4 20,362,900 cycles > CPU5 10,172,725 cycles > CPU8 2,406,081 cycles Sure that needs a display update and isn't part of supporting a modifier *sigh* Ian > Thanks, > Namhyung > > > > > > > > > > > CPU0 <not counted> l1d-misses [cpu] > > > > CPU4 203,377 l1d-misses [cpu] > > > > CPU5 138,231 l1d-misses [cpu] > > > > CPU8 <not counted> l1d-misses [cpu] > > > > CPU0 <not counted> cpu/cpu=8/ > > > > CPU4 <not counted> cpu/cpu=8/ > > > > CPU5 <not counted> cpu/cpu=8/ > > > > CPU8 943,861 cpu/cpu=8/ > > > > CPU0 1,412,071 cycles > > > > CPU4 20,362,900 cycles > > > > CPU5 10,172,725 cycles > > > > CPU8 2,406,081 cycles > > > > > > > > 0.102925309 seconds time elapsed > > > > ``` > > > > > > > > Note, the event name of inst_retired.any is missing, reported as > > > > cpu/cpu=8/, as there are unmerged uniquify fixes: > > > > https://lore.kernel.org/lkml/20240510053705.2462258-3-irogers@google.com/ > > > > > > > > Signed-off-by: Ian Rogers <irogers@google.com> > > > > --- > > > > tools/perf/Documentation/perf-list.txt | 9 ++++ > > > > tools/perf/util/evsel_config.h | 1 + > > > > tools/perf/util/parse-events.c | 71 ++++++++++++++++++++++---- > > > > tools/perf/util/parse-events.h | 3 +- > > > > tools/perf/util/parse-events.l | 1 + > > > > tools/perf/util/pmu.c | 1 + > > > > 6 files changed, 74 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt > > > > index dea005410ec0..e0cd9bb8283e 100644 > > > > --- a/tools/perf/Documentation/perf-list.txt > > > > +++ b/tools/perf/Documentation/perf-list.txt > > > > @@ -274,6 +274,15 @@ Sums up the event counts for all hardware threads in a core, e.g.: > > > > > > > > perf stat -e cpu/event=0,umask=0x3,percore=1/ > > > > > > > > +cpu: > > > > + > > > > +Specifies the CPU to open the event upon. The value may be repeated to > > > > +specify opening the event on multiple CPUs: > > > > + > > > > + > > > > + perf stat -e instructions/cpu=0,cpu=2/,cycles/cpu=1,cpu=2/ -a sleep 1 > > > > + perf stat -e data_read/cpu=0/,data_write/cpu=1/ -a sleep 1 > > > > + > > > > > > > > EVENT GROUPS > > > > ------------ > > > > diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h > > > > index aee6f808b512..9630c4a24721 100644 > > > > --- a/tools/perf/util/evsel_config.h > > > > +++ b/tools/perf/util/evsel_config.h > > > > @@ -47,6 +47,7 @@ struct evsel_config_term { > > > > u32 aux_sample_size; > > > > u64 cfg_chg; > > > > char *str; > > > > + int cpu; > > > > } val; > > > > bool weak; > > > > }; > > > > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > > > > index 017d31d51ea4..63c31cfdd79b 100644 > > > > --- a/tools/perf/util/parse-events.c > > > > +++ b/tools/perf/util/parse-events.c > > > > @@ -7,6 +7,7 @@ > > > > #include <errno.h> > > > > #include <sys/ioctl.h> > > > > #include <sys/param.h> > > > > +#include "cpumap.h" > > > > #include "term.h" > > > > #include "env.h" > > > > #include "evlist.h" > > > > @@ -178,6 +179,26 @@ static char *get_config_name(const struct parse_events_terms *head_terms) > > > > return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_NAME); > > > > } > > > > > > > > +static struct perf_cpu_map *get_config_cpu(const struct parse_events_terms *head_terms) > > > > +{ > > > > + struct parse_events_term *term; > > > > + struct perf_cpu_map *cpus = NULL; > > > > + > > > > + if (!head_terms) > > > > + return NULL; > > > > + > > > > + list_for_each_entry(term, &head_terms->terms, list) { > > > > + if (term->type_term == PARSE_EVENTS__TERM_TYPE_CPU) { > > > > + struct perf_cpu_map *cpu = perf_cpu_map__new_int(term->val.num); > > > > + > > > > + cpus = perf_cpu_map__merge(cpus, cpu); > > > > + perf_cpu_map__put(cpu); > > > > + } > > > > + } > > > > + > > > > + return cpus; > > > > +} > > > > + > > > > /** > > > > * fix_raw - For each raw term see if there is an event (aka alias) in pmu that > > > > * matches the raw's string value. If the string value matches an > > > > @@ -469,11 +490,12 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > > > bool found_supported = false; > > > > const char *config_name = get_config_name(parsed_terms); > > > > const char *metric_id = get_config_metric_id(parsed_terms); > > > > + struct perf_cpu_map *cpus = get_config_cpu(parsed_terms); > > > > + int ret = 0; > > > > > > > > while ((pmu = perf_pmus__scan(pmu)) != NULL) { > > > > LIST_HEAD(config_terms); > > > > struct perf_event_attr attr; > > > > - int ret; > > > > > > > > if (parse_events__filter_pmu(parse_state, pmu)) > > > > continue; > > > > @@ -487,7 +509,7 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > > > parsed_terms, > > > > perf_pmu__auto_merge_stats(pmu)); > > > > if (ret) > > > > - return ret; > > > > + goto out_err; > > > > continue; > > > > } > > > > > > > > @@ -507,20 +529,27 @@ int parse_events_add_cache(struct list_head *list, int *idx, const char *name, > > > > > > > > if (parsed_terms) { > > > > if (config_attr(&attr, parsed_terms, parse_state->error, > > > > - config_term_common)) > > > > - return -EINVAL; > > > > - > > > > - if (get_config_terms(parsed_terms, &config_terms)) > > > > - return -ENOMEM; > > > > + config_term_common)) { > > > > + ret = -EINVAL; > > > > + goto out_err; > > > > + } > > > > + if (get_config_terms(parsed_terms, &config_terms)) { > > > > + ret = -ENOMEM; > > > > + goto out_err; > > > > + } > > > > } > > > > > > > > if (__add_event(list, idx, &attr, /*init_attr*/true, config_name ?: name, > > > > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > > > > - /*cpu_list=*/NULL) == NULL) > > > > - return -ENOMEM; > > > > + cpus) == NULL) > > > > + ret = -ENOMEM; > > > > > > > > free_config_terms(&config_terms); > > > > + if (ret) > > > > + goto out_err; > > > > } > > > > +out_err: > > > > + perf_cpu_map__put(cpus); > > > > return found_supported ? 0 : -EINVAL; > > > > } > > > > > > > > @@ -835,6 +864,7 @@ static const char *config_term_name(enum parse_events__term_type term_type) > > > > [PARSE_EVENTS__TERM_TYPE_RAW] = "raw", > > > > [PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE] = "legacy-cache", > > > > [PARSE_EVENTS__TERM_TYPE_HARDWARE] = "hardware", > > > > + [PARSE_EVENTS__TERM_TYPE_CPU] = "cpu", > > > > }; > > > > if ((unsigned int)term_type >= __PARSE_EVENTS__TERM_TYPE_NR) > > > > return "unknown term"; > > > > @@ -864,6 +894,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er > > > > case PARSE_EVENTS__TERM_TYPE_METRIC_ID: > > > > case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: > > > > case PARSE_EVENTS__TERM_TYPE_PERCORE: > > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > > return true; > > > > case PARSE_EVENTS__TERM_TYPE_USER: > > > > case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ: > > > > @@ -1007,6 +1038,15 @@ do { \ > > > > return -EINVAL; > > > > } > > > > break; > > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > > + CHECK_TYPE_VAL(NUM); > > > > + if (term->val.num >= (u64)cpu__max_present_cpu().cpu) { > > > > + parse_events_error__handle(err, term->err_val, > > > > + strdup("too big"), > > > > + NULL); > > > > + return -EINVAL; > > > > + } > > > > + break; > > > > case PARSE_EVENTS__TERM_TYPE_DRV_CFG: > > > > case PARSE_EVENTS__TERM_TYPE_USER: > > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > > @@ -1133,6 +1173,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, > > > > case PARSE_EVENTS__TERM_TYPE_RAW: > > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > > default: > > > > if (err) { > > > > parse_events_error__handle(err, term->err_term, > > > > @@ -1264,6 +1305,7 @@ do { \ > > > > case PARSE_EVENTS__TERM_TYPE_RAW: > > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > > default: > > > > break; > > > > } > > > > @@ -1317,6 +1359,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head > > > > case PARSE_EVENTS__TERM_TYPE_RAW: > > > > case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE: > > > > case PARSE_EVENTS__TERM_TYPE_HARDWARE: > > > > + case PARSE_EVENTS__TERM_TYPE_CPU: > > > > default: > > > > break; > > > > } > > > > @@ -1371,6 +1414,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > > > > struct perf_event_attr attr; > > > > LIST_HEAD(config_terms); > > > > const char *name, *metric_id; > > > > + struct perf_cpu_map *cpus; > > > > int ret; > > > > > > > > memset(&attr, 0, sizeof(attr)); > > > > @@ -1392,9 +1436,11 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, > > > > > > > > name = get_config_name(head_config); > > > > metric_id = get_config_metric_id(head_config); > > > > + cpus = get_config_cpu(head_config); > > > > ret = __add_event(list, &parse_state->idx, &attr, /*init_attr*/true, name, > > > > metric_id, pmu, &config_terms, /*auto_merge_stats=*/false, > > > > - /*cpu_list=*/NULL) ? 0 : -ENOMEM; > > > > + cpus) ? 0 : -ENOMEM; > > > > + perf_cpu_map__put(cpus); > > > > free_config_terms(&config_terms); > > > > return ret; > > > > } > > > > @@ -1461,6 +1507,7 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > > > > LIST_HEAD(config_terms); > > > > struct parse_events_terms parsed_terms; > > > > bool alias_rewrote_terms = false; > > > > + struct perf_cpu_map *term_cpu = NULL; > > > > > > > > if (verbose > 1) { > > > > struct strbuf sb; > > > > @@ -1552,10 +1599,12 @@ static int parse_events_add_pmu(struct parse_events_state *parse_state, > > > > return -EINVAL; > > > > } > > > > > > > > + term_cpu = get_config_cpu(&parsed_terms); > > > > evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, > > > > get_config_name(&parsed_terms), > > > > get_config_metric_id(&parsed_terms), pmu, > > > > - &config_terms, auto_merge_stats, /*cpu_list=*/NULL); > > > > + &config_terms, auto_merge_stats, term_cpu); > > > > + perf_cpu_map__put(term_cpu); > > > > if (!evsel) { > > > > parse_events_terms__exit(&parsed_terms); > > > > return -ENOMEM; > > > > diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h > > > > index 10cc9c433116..2532c81d4f9a 100644 > > > > --- a/tools/perf/util/parse-events.h > > > > +++ b/tools/perf/util/parse-events.h > > > > @@ -79,7 +79,8 @@ enum parse_events__term_type { > > > > PARSE_EVENTS__TERM_TYPE_RAW, > > > > PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE, > > > > PARSE_EVENTS__TERM_TYPE_HARDWARE, > > > > -#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_HARDWARE + 1) > > > > + PARSE_EVENTS__TERM_TYPE_CPU, > > > > +#define __PARSE_EVENTS__TERM_TYPE_NR (PARSE_EVENTS__TERM_TYPE_CPU + 1) > > > > }; > > > > > > > > struct parse_events_term { > > > > diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l > > > > index 5a0bcd7f166a..635d216632d7 100644 > > > > --- a/tools/perf/util/parse-events.l > > > > +++ b/tools/perf/util/parse-events.l > > > > @@ -331,6 +331,7 @@ percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); } > > > > aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); } > > > > aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); } > > > > metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } > > > > +cpu { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CPU); } > > > > cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); } > > > > stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); } > > > > stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); } > > > > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c > > > > index 61bdda01a05a..4f68026a97bb 100644 > > > > --- a/tools/perf/util/pmu.c > > > > +++ b/tools/perf/util/pmu.c > > > > @@ -1738,6 +1738,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call > > > > "percore", > > > > "aux-output", > > > > "aux-sample-size=number", > > > > + "cpu=number", > > > > }; > > > > struct perf_pmu_format *format; > > > > int ret; > > > > -- > > > > 2.46.0.662.g92d0881bb0-goog > > > > ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-10-01 23:19 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-09-18 22:01 [PATCH v3 0/3] Add support for cpu event term Ian Rogers 2024-09-18 22:01 ` [PATCH v3 1/3] libperf cpumap: Add ability to create CPU from a single CPU number Ian Rogers 2024-09-18 22:01 ` [PATCH v3 2/3] perf parse-events: Set is_pmu_core for legacy hardware events Ian Rogers 2024-09-18 22:01 ` [PATCH v3 3/3] perf parse-events: Add "cpu" term to set the CPU an event is recorded on Ian Rogers 2024-09-30 21:42 ` Namhyung Kim 2024-10-01 2:03 ` Ian Rogers 2024-10-01 22:53 ` Namhyung Kim 2024-10-01 23:18 ` Ian Rogers
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).