* [PATCH v6 1/2] perf record: Skip don't fail for events that don't open
2025-03-24 16:45 [PATCH v6 0/2] Prefer sysfs/JSON events also when no PMU is provided Ian Rogers
@ 2025-03-24 16:45 ` Ian Rogers
2025-03-24 16:45 ` [PATCH v6 2/2] perf parse-events: Reapply "Prefer sysfs/JSON hardware events over legacy" Ian Rogers
2025-03-27 19:13 ` [PATCH v6 0/2] Prefer sysfs/JSON events also when no PMU is provided Ian Rogers
2 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2025-03-24 16:45 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, James Clark, Ze Gao,
Weilin Wang, Dominique Martinet, Jean-Philippe Romain, Junhao He,
linux-perf-users, linux-kernel, Aditya Bodkhe
Cc: Leo Yan, Atish Patra
Whilst for many tools it is an expected behavior that failure to open
a perf event is a failure, ARM decided to name PMU events the same as
legacy events and then failed to rename such events on a server uncore
SLC PMU. As perf's default behavior when no PMU is specified is to
open the event on all PMUs that advertise/"have" the event, this
yielded failures when trying to make the priority of legacy and
sysfs/json events uniform - something requested by RISC-V and ARM. A
legacy event user on ARM hardware may find their event opened on an
uncore PMU which for perf record will fail. Arnaldo suggested skipping
such events which this patch implements. Rather than have the skipping
conditional on running on ARM, the skipping is done on all
architectures as such a fundamental behavioral difference could lead
to problems with tools built/depending on perf.
An example of perf record failing to open events on x86 is:
```
$ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
Error:
Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
"dmesg | grep -i perf" may provide additional information.
Error:
Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
"dmesg | grep -i perf" may provide additional information.
Error:
Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
The LLC-prefetch-read event is not supported.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
$ perf report --stats
Aggregated stats:
TOTAL events: 17255
MMAP events: 284 ( 1.6%)
COMM events: 1961 (11.4%)
EXIT events: 1 ( 0.0%)
FORK events: 1960 (11.4%)
SAMPLE events: 87 ( 0.5%)
MMAP2 events: 12836 (74.4%)
KSYMBOL events: 83 ( 0.5%)
BPF_EVENT events: 36 ( 0.2%)
FINISHED_ROUND events: 2 ( 0.0%)
ID_INDEX events: 1 ( 0.0%)
THREAD_MAP events: 1 ( 0.0%)
CPU_MAP events: 1 ( 0.0%)
TIME_CONV events: 1 ( 0.0%)
FINISHED_INIT events: 1 ( 0.0%)
cycles stats:
SAMPLE events: 87
```
If all events fail to open then the perf record will fail:
```
$ perf record -e LLC-prefetch-read true
Error:
Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
The LLC-prefetch-read event is not supported.
Error:
Failure to open any events for recording
```
As an evlist may have dummy events that open when all command line
events fail we ignore dummy events when detecting if at least some
events open. This still permits the dummy event on its own to be used
as a permission check:
```
$ perf record -e dummy true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.046 MB perf.data ]
```
but allows failure when a dummy event is implicilty inserted or when
there are insufficient permissions to open it:
```
$ perf record -e LLC-prefetch-read -a true
Error:
Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
The LLC-prefetch-read event is not supported.
Error:
Failure to open any events for recording
```
The issue with legacy events is that on RISC-V they want the driver to
not have mappings from legacy to non-legacy config encodings for each
vendor/model due to size, complexity and difficulty to update. It was
reported that on ARM Apple-M? CPUs the legacy mapping in the driver
was broken and the sysfs/json events should always take precedent,
however, it isn't clear this is still the case. It is the case that
without working around this issue a legacy event like cycles without a
PMU can encode differently than when specified with a PMU - the
non-PMU version favoring legacy encodings, the PMU one avoiding legacy
encodings.
The patch removes events and then adjusts the idx value for each
evsel. This is done so that the dense xyarrays used for file
descriptors, etc. don't contain broken entries. As event opening
happens relatively late in the record process, use of the idx value
before the open will have become corrupted, so it is expected there
are latent bugs hidden behind this change - the change is best
effort. As the only vendor that has broken event names is ARM, this
will principally effect ARM users. They will also experience warning
messages like those above because of the uncore PMU advertising legacy
event names.
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
---
tools/perf/builtin-record.c | 47 ++++++++++++++++++++++++++++++++-----
1 file changed, 41 insertions(+), 6 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index ba20bf7c011d..00f6a1c23254 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -962,7 +962,6 @@ static int record__config_tracking_events(struct record *rec)
*/
if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
perf_pmus__num_core_pmus() > 1) {
-
/*
* User space tasks can migrate between CPUs, so when tracing
* selected CPUs, sideband for all CPUs is still needed.
@@ -1367,6 +1366,7 @@ static int record__open(struct record *rec)
struct perf_session *session = rec->session;
struct record_opts *opts = &rec->opts;
int rc = 0;
+ bool skipped = false;
evlist__for_each_entry(evlist, pos) {
try_again:
@@ -1382,15 +1382,50 @@ static int record__open(struct record *rec)
pos = evlist__reset_weak_group(evlist, pos, true);
goto try_again;
}
- rc = -errno;
evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
- ui__error("%s\n", msg);
- goto out;
+ ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
+ evsel__name(pos), evsel__pmu_name(pos), msg);
+ pos->skippable = true;
+ skipped = true;
+ } else {
+ pos->supported = true;
}
-
- pos->supported = true;
}
+ if (skipped) {
+ struct evsel *tmp;
+ int idx = 0;
+ bool evlist_empty = true;
+
+ /* Remove evsels that failed to open and update indices. */
+ evlist__for_each_entry_safe(evlist, tmp, pos) {
+ if (pos->skippable) {
+ evlist__remove(evlist, pos);
+ continue;
+ }
+
+ /*
+ * Note, dummy events may be command line parsed or
+ * added by the tool. We care about supporting `perf
+ * record -e dummy` which may be used as a permission
+ * check. Dummy events that are added to the command
+ * line and opened along with other events that fail,
+ * will still fail as if the dummy events were tool
+ * added events for the sake of code simplicity.
+ */
+ if (!evsel__is_dummy_event(pos))
+ evlist_empty = false;
+ }
+ evlist__for_each_entry(evlist, pos) {
+ pos->core.idx = idx++;
+ }
+ /* If list is empty then fail. */
+ if (evlist_empty) {
+ ui__error("Failure to open any events for recording.\n");
+ rc = -1;
+ goto out;
+ }
+ }
if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
pr_warning(
"WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
--
2.49.0.395.g12beb8f557-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH v6 2/2] perf parse-events: Reapply "Prefer sysfs/JSON hardware events over legacy"
2025-03-24 16:45 [PATCH v6 0/2] Prefer sysfs/JSON events also when no PMU is provided Ian Rogers
2025-03-24 16:45 ` [PATCH v6 1/2] perf record: Skip don't fail for events that don't open Ian Rogers
@ 2025-03-24 16:45 ` Ian Rogers
2025-03-27 19:13 ` [PATCH v6 0/2] Prefer sysfs/JSON events also when no PMU is provided Ian Rogers
2 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2025-03-24 16:45 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, James Clark, Ze Gao,
Weilin Wang, Dominique Martinet, Jean-Philippe Romain, Junhao He,
linux-perf-users, linux-kernel, Aditya Bodkhe
Cc: Atish Patra, Leo Yan, Beeman Strong, Arnaldo Carvalho de Melo
Originally posted and merged from:
https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com
This reverts commit 4f1b067359ac8364cdb7f9fda41085fa85789d0f although
the patch is now smaller due to related fixes being applied in commit
22a4db3c3603 ("perf evsel: Add alternate_hw_config and use in
evsel__match").
The original commit message was:
It was requested that RISC-V be able to add events to the perf tool so
the PMU driver didn't need to map legacy events to config encodings:
https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/
This change makes the priority of events specified without a PMU the
same as those specified with a PMU, namely sysfs and JSON events are
checked first before using the legacy encoding.
The hw_term is made more generic as a hardware_event that encodes a
pair of string and int value, allowing parse_events_multi_pmu_add to
fall back on a known encoding when the sysfs/JSON adding fails for
core events. As this covers PE_VALUE_SYM_HW, that token is removed and
related code simplified.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Beeman Strong <beeman@rivosinc.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/parse-events.c | 26 +++++++++---
tools/perf/util/parse-events.l | 76 +++++++++++++++++-----------------
tools/perf/util/parse-events.y | 60 ++++++++++++++++++---------
3 files changed, 98 insertions(+), 64 deletions(-)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 5152fd5a6ead..cc9cd8069cab 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1551,8 +1551,8 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state,
struct list_head *list = NULL;
struct perf_pmu *pmu = NULL;
YYLTYPE *loc = loc_;
- int ok = 0;
- const char *config;
+ int ok = 0, core_ok = 0;
+ const char *tmp;
struct parse_events_terms parsed_terms;
*listp = NULL;
@@ -1565,15 +1565,15 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state,
return ret;
}
- config = strdup(event_name);
- if (!config)
+ tmp = strdup(event_name);
+ if (!tmp)
goto out_err;
if (parse_events_term__num(&term,
PARSE_EVENTS__TERM_TYPE_USER,
- config, /*num=*/1, /*novalue=*/true,
+ tmp, /*num=*/1, /*novalue=*/true,
loc, /*loc_val=*/NULL) < 0) {
- zfree(&config);
+ zfree(&tmp);
goto out_err;
}
list_add_tail(&term->list, &parsed_terms.terms);
@@ -1604,6 +1604,8 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state,
pr_debug("%s -> %s/%s/\n", event_name, pmu->name, sb.buf);
strbuf_release(&sb);
ok++;
+ if (pmu->is_core)
+ core_ok++;
}
}
@@ -1620,6 +1622,18 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state,
}
}
+ if (hw_config != PERF_COUNT_HW_MAX && !core_ok) {
+ /*
+ * The event wasn't found on core PMUs but it has a hardware
+ * config version to try.
+ */
+ if (!parse_events_add_numeric(parse_state, list,
+ PERF_TYPE_HARDWARE, hw_config,
+ const_parsed_terms,
+ /*wildcard=*/true))
+ ok++;
+ }
+
out_err:
parse_events_terms__exit(&parsed_terms);
if (ok)
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 7ed86e3e34e3..324b7dc8a0d3 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -117,12 +117,12 @@ do { \
yyless(0); \
} while (0)
-static int sym(yyscan_t scanner, int type, int config)
+static int sym(yyscan_t scanner, int config)
{
YYSTYPE *yylval = parse_events_get_lval(scanner);
- yylval->num = (type << 16) + config;
- return type == PERF_TYPE_HARDWARE ? PE_VALUE_SYM_HW : PE_VALUE_SYM_SW;
+ yylval->num = config;
+ return PE_VALUE_SYM_SW;
}
static int term(yyscan_t scanner, enum parse_events__term_type type)
@@ -133,13 +133,13 @@ static int term(yyscan_t scanner, enum parse_events__term_type type)
return PE_TERM;
}
-static int hw_term(yyscan_t scanner, int config)
+static int hw(yyscan_t scanner, int config)
{
YYSTYPE *yylval = parse_events_get_lval(scanner);
char *text = parse_events_get_text(scanner);
- yylval->hardware_term.str = strdup(text);
- yylval->hardware_term.num = PERF_TYPE_HARDWARE + config;
+ yylval->hardware_event.str = strdup(text);
+ yylval->hardware_event.num = config;
return PE_TERM_HW;
}
@@ -335,16 +335,16 @@ aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); }
aux-action { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_ACTION); }
aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); }
metric-id { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); }
-cpu-cycles|cycles { return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); }
-stalled-cycles-frontend|idle-cycles-frontend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
-stalled-cycles-backend|idle-cycles-backend { return hw_term(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
-instructions { return hw_term(yyscanner, PERF_COUNT_HW_INSTRUCTIONS); }
-cache-references { return hw_term(yyscanner, PERF_COUNT_HW_CACHE_REFERENCES); }
-cache-misses { return hw_term(yyscanner, PERF_COUNT_HW_CACHE_MISSES); }
-branch-instructions|branches { return hw_term(yyscanner, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
-branch-misses { return hw_term(yyscanner, PERF_COUNT_HW_BRANCH_MISSES); }
-bus-cycles { return hw_term(yyscanner, PERF_COUNT_HW_BUS_CYCLES); }
-ref-cycles { return hw_term(yyscanner, PERF_COUNT_HW_REF_CPU_CYCLES); }
+cpu-cycles|cycles { return hw(yyscanner, PERF_COUNT_HW_CPU_CYCLES); }
+stalled-cycles-frontend|idle-cycles-frontend { return hw(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
+stalled-cycles-backend|idle-cycles-backend { return hw(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
+instructions { return hw(yyscanner, PERF_COUNT_HW_INSTRUCTIONS); }
+cache-references { return hw(yyscanner, PERF_COUNT_HW_CACHE_REFERENCES); }
+cache-misses { return hw(yyscanner, PERF_COUNT_HW_CACHE_MISSES); }
+branch-instructions|branches { return hw(yyscanner, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
+branch-misses { return hw(yyscanner, PERF_COUNT_HW_BRANCH_MISSES); }
+bus-cycles { return hw(yyscanner, PERF_COUNT_HW_BUS_CYCLES); }
+ref-cycles { return hw(yyscanner, PERF_COUNT_HW_REF_CPU_CYCLES); }
r{num_raw_hex} { return str(yyscanner, PE_RAW); }
r0x{num_raw_hex} { return str(yyscanner, PE_RAW); }
, { return ','; }
@@ -390,28 +390,28 @@ r0x{num_raw_hex} { return str(yyscanner, PE_RAW); }
<<EOF>> { BEGIN(INITIAL); }
}
-cpu-cycles|cycles { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES); }
-stalled-cycles-frontend|idle-cycles-frontend { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
-stalled-cycles-backend|idle-cycles-backend { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
-instructions { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS); }
-cache-references { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES); }
-cache-misses { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES); }
-branch-instructions|branches { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
-branch-misses { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_MISSES); }
-bus-cycles { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BUS_CYCLES); }
-ref-cycles { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_REF_CPU_CYCLES); }
-cpu-clock { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK); }
-task-clock { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK); }
-page-faults|faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS); }
-minor-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MIN); }
-major-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MAJ); }
-context-switches|cs { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CONTEXT_SWITCHES); }
-cpu-migrations|migrations { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_MIGRATIONS); }
-alignment-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
-emulation-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
-dummy { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
-bpf-output { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
-cgroup-switches { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CGROUP_SWITCHES); }
+cpu-cycles|cycles { return hw(yyscanner, PERF_COUNT_HW_CPU_CYCLES); }
+stalled-cycles-frontend|idle-cycles-frontend { return hw(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
+stalled-cycles-backend|idle-cycles-backend { return hw(yyscanner, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
+instructions { return hw(yyscanner, PERF_COUNT_HW_INSTRUCTIONS); }
+cache-references { return hw(yyscanner, PERF_COUNT_HW_CACHE_REFERENCES); }
+cache-misses { return hw(yyscanner, PERF_COUNT_HW_CACHE_MISSES); }
+branch-instructions|branches { return hw(yyscanner, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
+branch-misses { return hw(yyscanner, PERF_COUNT_HW_BRANCH_MISSES); }
+bus-cycles { return hw(yyscanner, PERF_COUNT_HW_BUS_CYCLES); }
+ref-cycles { return hw(yyscanner, PERF_COUNT_HW_REF_CPU_CYCLES); }
+cpu-clock { return sym(yyscanner, PERF_COUNT_SW_CPU_CLOCK); }
+task-clock { return sym(yyscanner, PERF_COUNT_SW_TASK_CLOCK); }
+page-faults|faults { return sym(yyscanner, PERF_COUNT_SW_PAGE_FAULTS); }
+minor-faults { return sym(yyscanner, PERF_COUNT_SW_PAGE_FAULTS_MIN); }
+major-faults { return sym(yyscanner, PERF_COUNT_SW_PAGE_FAULTS_MAJ); }
+context-switches|cs { return sym(yyscanner, PERF_COUNT_SW_CONTEXT_SWITCHES); }
+cpu-migrations|migrations { return sym(yyscanner, PERF_COUNT_SW_CPU_MIGRATIONS); }
+alignment-faults { return sym(yyscanner, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
+emulation-faults { return sym(yyscanner, PERF_COUNT_SW_EMULATION_FAULTS); }
+dummy { return sym(yyscanner, PERF_COUNT_SW_DUMMY); }
+bpf-output { return sym(yyscanner, PERF_COUNT_SW_BPF_OUTPUT); }
+cgroup-switches { return sym(yyscanner, PERF_COUNT_SW_CGROUP_SWITCHES); }
{lc_type} { return str(yyscanner, PE_LEGACY_CACHE); }
{lc_type}-{lc_op_result} { return str(yyscanner, PE_LEGACY_CACHE); }
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index f888cbb076d6..d2ef1890007e 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -55,7 +55,7 @@ static void free_list_evsel(struct list_head* list_evsel)
%}
%token PE_START_EVENTS PE_START_TERMS
-%token PE_VALUE PE_VALUE_SYM_HW PE_VALUE_SYM_SW PE_TERM
+%token PE_VALUE PE_VALUE_SYM_SW PE_TERM
%token PE_EVENT_NAME
%token PE_RAW PE_NAME
%token PE_MODIFIER_EVENT PE_MODIFIER_BP PE_BP_COLON PE_BP_SLASH
@@ -65,11 +65,9 @@ static void free_list_evsel(struct list_head* list_evsel)
%token PE_DRV_CFG_TERM
%token PE_TERM_HW
%type <num> PE_VALUE
-%type <num> PE_VALUE_SYM_HW
%type <num> PE_VALUE_SYM_SW
%type <mod> PE_MODIFIER_EVENT
%type <term_type> PE_TERM
-%type <num> value_sym
%type <str> PE_RAW
%type <str> PE_NAME
%type <str> PE_LEGACY_CACHE
@@ -85,6 +83,7 @@ static void free_list_evsel(struct list_head* list_evsel)
%type <list_terms> opt_pmu_config
%destructor { parse_events_terms__delete ($$); } <list_terms>
%type <list_evsel> event_pmu
+%type <list_evsel> event_legacy_hardware
%type <list_evsel> event_legacy_symbol
%type <list_evsel> event_legacy_cache
%type <list_evsel> event_legacy_mem
@@ -102,8 +101,8 @@ static void free_list_evsel(struct list_head* list_evsel)
%destructor { free_list_evsel ($$); } <list_evsel>
%type <tracepoint_name> tracepoint_name
%destructor { free ($$.sys); free ($$.event); } <tracepoint_name>
-%type <hardware_term> PE_TERM_HW
-%destructor { free ($$.str); } <hardware_term>
+%type <hardware_event> PE_TERM_HW
+%destructor { free ($$.str); } <hardware_event>
%union
{
@@ -118,10 +117,10 @@ static void free_list_evsel(struct list_head* list_evsel)
char *sys;
char *event;
} tracepoint_name;
- struct hardware_term {
+ struct hardware_event {
char *str;
u64 num;
- } hardware_term;
+ } hardware_event;
}
%%
@@ -264,6 +263,7 @@ PE_EVENT_NAME event_def
event_def
event_def: event_pmu |
+ event_legacy_hardware |
event_legacy_symbol |
event_legacy_cache sep_dc |
event_legacy_mem sep_dc |
@@ -306,24 +306,45 @@ PE_NAME sep_dc
$$ = list;
}
-value_sym:
-PE_VALUE_SYM_HW
+event_legacy_hardware:
+PE_TERM_HW opt_pmu_config
+{
+ /* List of created evsels. */
+ struct list_head *list = NULL;
+ int err = parse_events_multi_pmu_add(_parse_state, $1.str, $1.num, $2, &list, &@1);
+
+ free($1.str);
+ parse_events_terms__delete($2);
+ if (err)
+ PE_ABORT(err);
+
+ $$ = list;
+}
|
-PE_VALUE_SYM_SW
+PE_TERM_HW sep_dc
+{
+ struct list_head *list;
+ int err;
+
+ err = parse_events_multi_pmu_add(_parse_state, $1.str, $1.num, NULL, &list, &@1);
+ free($1.str);
+ if (err)
+ PE_ABORT(err);
+ $$ = list;
+}
event_legacy_symbol:
-value_sym '/' event_config '/'
+PE_VALUE_SYM_SW '/' event_config '/'
{
struct list_head *list;
- int type = $1 >> 16;
- int config = $1 & 255;
int err;
- bool wildcard = (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE);
list = alloc_list();
if (!list)
YYNOMEM;
- err = parse_events_add_numeric(_parse_state, list, type, config, $3, wildcard);
+ err = parse_events_add_numeric(_parse_state, list,
+ /*type=*/PERF_TYPE_SOFTWARE, /*config=*/$1,
+ $3, /*wildcard=*/false);
parse_events_terms__delete($3);
if (err) {
free_list_evsel(list);
@@ -332,18 +353,17 @@ value_sym '/' event_config '/'
$$ = list;
}
|
-value_sym sep_slash_slash_dc
+PE_VALUE_SYM_SW sep_slash_slash_dc
{
struct list_head *list;
- int type = $1 >> 16;
- int config = $1 & 255;
- bool wildcard = (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE);
int err;
list = alloc_list();
if (!list)
YYNOMEM;
- err = parse_events_add_numeric(_parse_state, list, type, config, /*head_config=*/NULL, wildcard);
+ err = parse_events_add_numeric(_parse_state, list,
+ /*type=*/PERF_TYPE_SOFTWARE, /*config=*/$1,
+ /*head_config=*/NULL, /*wildcard=*/false);
if (err)
PE_ABORT(err);
$$ = list;
--
2.49.0.395.g12beb8f557-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v6 0/2] Prefer sysfs/JSON events also when no PMU is provided
2025-03-24 16:45 [PATCH v6 0/2] Prefer sysfs/JSON events also when no PMU is provided Ian Rogers
2025-03-24 16:45 ` [PATCH v6 1/2] perf record: Skip don't fail for events that don't open Ian Rogers
2025-03-24 16:45 ` [PATCH v6 2/2] perf parse-events: Reapply "Prefer sysfs/JSON hardware events over legacy" Ian Rogers
@ 2025-03-27 19:13 ` Ian Rogers
2025-04-29 15:14 ` Arnaldo Carvalho de Melo
2 siblings, 1 reply; 6+ messages in thread
From: Ian Rogers @ 2025-03-27 19:13 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, James Clark, Ze Gao,
Weilin Wang, Dominique Martinet, Jean-Philippe Romain, Junhao He,
linux-perf-users, linux-kernel, Aditya Bodkhe
On Mon, Mar 24, 2025 at 9:46 AM Ian Rogers <irogers@google.com> wrote:
>
> At the RISC-V summit the topic of avoiding event data being in the
> RISC-V PMU kernel driver came up. There is a preference for sysfs/JSON
> events being the priority when no PMU is provided so that legacy
> events maybe supported via json. Originally Mark Rutland also
> expressed at LPC 2023 that doing this would resolve bugs on ARM Apple
> M? processors, but James Clark more recently tested this and believes
> the driver issues there may not have existed or have been resolved. In
> any case, it is inconsistent that with a PMU event names avoid legacy
> encodings, but when wildcarding PMUs (ie without a PMU with the event
> name) the legacy encodings have priority.
>
> The patch doing this work was reverted in a v6.10 release candidate
> as, even though the patch was posted for weeks and had been on
> linux-next for weeks without issue, Linus was in the habit of using
> explicit legacy events with unsupported precision options on his
> Neoverse-N1. This machine has SLC PMU events for bus and CPU cycles
> where ARM decided to call the events bus_cycles and cycles, the latter
> being also a legacy event name. ARM haven't renamed the cycles event
> to a more consistent cpu_cycles and avoided the problem. With these
> changes the problematic event will now be skipped, a large warning
> produced, and perf record will continue for the other PMU events. This
> solution was proposed by Arnaldo.
>
> v6: Rebase of v5 (dropping already merged patches):
> https://lore.kernel.org/lkml/20250109222109.567031-1-irogers@google.com/
> that unusually had an RFC posted for it:
> https://lore.kernel.org/lkml/Z7Z5kv75BMML2A1q@google.com/
> Note, this patch conflicts/contradicts:
> https://lore.kernel.org/lkml/20250312211623.2495798-1-irogers@google.com/
> that I posted so that we could either consistently prioritize
> sysfs/json (these patches) or legacy events (the other
> patches). That lack of event printing and encoding inconsistency
> is most prominent in the encoding of events like "instructions"
> which on hybrid are reported as "cpu_core/instructions/" but
> "instructions" before these patches gets a legacy encoding while
> "cpu_core/instructions/" gets a sysfs/json encoding. These patches
> make "instructions" always get a sysfs/json encoding while the
> alternate patches make it always get a legacy encoding.
So another fun finding. Sysfs and json events are case insensitive:
```
$ perf stat -e 'inst_retired.any,INST_RETIRED.ANY' true
Performance counter stats for 'true':
129,134 cpu_atom/inst_retired.any:u/
<not counted> cpu_core/inst_retired.any:u/
(0.00%)
129,134 cpu_atom/INST_RETIRED.ANY:u/
<not counted> cpu_core/INST_RETIRED.ANY:u/
(0.00%)
0.002193191 seconds time elapsed
0.002354000 seconds user
0.000000000 seconds sys
```
But legacy events match in lex code that is case sensitive. This means
(on x86) the event 'instructions' is currently legacy, but the event
'INSTRUCTIONS' is a sysfs event. The event CYCLES is a parse error as
there is no sysfs/json version. Given legacy events don't follow the
case insensitivity norm this is more evidence we need to reduce their
priority by merging these patches.
Thanks,
Ian
> v5: Follow Namhyung's suggestion and ignore the case where command
> line dummy events fail to open alongside other events that all
> fail to open. Note, the Tested-by tags are left on the series as
> v4 and v5 were changing an error case that doesn't occur in
> testing but was manually tested by myself.
>
> v4: Rework the no events opening change from v3 to make it handle
> multiple dummy events. Sadly an evlist isn't empty if it just
> contains dummy events as the dummy event may be used with "perf
> record -e dummy .." as a way to determine whether permission
> issues exist. Other software events like cpu-clock would suffice
> for this, but the using dummy genie has left the bottle.
>
> Another problem is that we appear to have an excessive number of
> dummy events added, for example, we can likely avoid a dummy event
> and add sideband data to the original event. For auxtrace more
> dummy events may be opened too. Anyway, this has led to the
> approach taken in patch 3 where the number of dummy parsed events
> is computed. If the number of removed/failing-to-open non-dummy
> events matches the number of non-dummy events then we want to
> fail, but only if there are no parsed dummy events or if there was
> one then it must have opened. The math here is hard to read, but
> passes my manual testing.
>
> v3: Make no events opening for perf record a failure as suggested by
> James Clark and Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>. Also,
> rebase.
>
> v2: Rebase and add tested-by tags from James Clark, Leo Yan and Atish
> Patra who have tested on RISC-V and ARM CPUs, including the
> problem case from before.
>
> Ian Rogers (2):
> perf record: Skip don't fail for events that don't open
> perf parse-events: Reapply "Prefer sysfs/JSON hardware events over
> legacy"
>
> tools/perf/builtin-record.c | 47 ++++++++++++++++++---
> tools/perf/util/parse-events.c | 26 +++++++++---
> tools/perf/util/parse-events.l | 76 +++++++++++++++++-----------------
> tools/perf/util/parse-events.y | 60 ++++++++++++++++++---------
> 4 files changed, 139 insertions(+), 70 deletions(-)
>
> --
> 2.49.0.395.g12beb8f557-goog
>
^ permalink raw reply [flat|nested] 6+ messages in thread