* [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python
@ 2025-12-02 17:49 Ian Rogers
2025-12-02 17:49 ` [PATCH v9 01/48] perf python: Correct copying of metric_leader in an evsel Ian Rogers
` (48 more replies)
0 siblings, 49 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:49 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Metrics in the perf tool come in via json. Json doesn't allow
comments, line breaks, etc., making it an inconvenient format in which
to write metrics. Further, when writing a metric it is useful to
detect whether the specified event is supported by the event json for
a model. The metric python code uses Event(s), with fallback events
provided; if no event is found then an exception is thrown, which can
indicate either a mistake or an unsupported model. To avoid confusion
all the metrics and their metricgroups are prefixed with 'lpm_', where
LPM is an abbreviation of Linux Perf Metric. While the extra
characters aren't ideal, the prefix separates these metrics from other
vendor provided metrics.
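For illustration, a minimal sketch of the style the generator scripts
use (the event names and the Event fallback argument are hypothetical
here; real metrics use names from the per-model event json):
  from metric import Event, Metric, MetricGroup, d_ratio
  # Event() falls back through the alternative names given to it; an
  # exception is raised if none of them exist in the loaded json.
  insns = Event("instructions")
  cycles = Event("cpu-cycles", "cycles")
  ipc = Metric("lpm_ipc", "Instructions retired per cycle",
               d_ratio(insns, cycles), "insn/cycle")
  general = MetricGroup("lpm_general", [ipc],
                        description="General purpose metrics")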
* The first 14 patches introduce infrastructure and fixes for the
addition of metrics written in python for Arm64, AMD Zen and Intel
CPUs. The ilist.py and perf python module are fixed to work better
with metrics on hybrid architectures.
* The next 9 patches generate additional metrics for AMD Zen. The RAPL
and idle metrics aren't specific to AMD but are placed here for ease
and convenience. Uncore L3 metrics are added along with the majority
of core metrics.
* The next 20 patches add additional metrics for Intel. The RAPL and
idle metrics aren't specific to Intel but are placed here for ease and
convenience. SMI and TSX metrics are added so they can be dropped
from the per-model json files. There are four uncore sets of metrics
and eleven core metrics. A CheckPmu function is added to metric.py to
simplify detecting the presence of hybrid PMUs in events. Metrics
with experimental events are flagged as experimental in their
description.
* The next 2 patches add additional metrics for Arm64, where the
topdown set decomposes yet further. The metrics primarily use json
events, where the json contains architecture standard events. Not all
events are in the json, such as for the a53 where the events are only
in sysfs. Work around this by adding the sysfs events to the metrics,
but longer term such events should be added to the json.
* The final patch validates that all events provided to an Event
object exist in a json file somewhere. This is to avoid mistakes
like unfortunate typos.
This series has benefitted from the input of Leo Yan
<leo.yan@arm.com>, Sandipan Das <sandidas@amd.com>, Thomas Falcon
<thomas.falcon@intel.com> and Perry Taylor <perry.taylor@intel.com>.
v9. Drop (for now) 4 sets of AMD metrics pending additional follow
up. Add Reviewed-by tags from Sandipan Das (AMD) and Tested-by tags
from Thomas Falcon (Intel).
v8. Combine the previous 4 series for clarity. Rebase on top of the
more recent legacy metric and event changes. Make the python code
more PEP 8 and pylint compliant.
Foundations:
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandidas@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904043208.995243-1-irogers@google.com/
v5. Rebase on top of legacy hardware/cache changes that now generate
events using python:
https://lore.kernel.org/lkml/20250828205930.4007284-1-irogers@google.com/
the v5 series is:
https://lore.kernel.org/lkml/20250829030727.4159703-1-irogers@google.com/
v4. Rebase and small Build/Makefile tweak
https://lore.kernel.org/lkml/20240926173554.404411-1-irogers@google.com/
v3. Some code tidying, make the input directory a command line
argument, but no other functional or output changes.
https://lore.kernel.org/lkml/20240314055051.1960527-1-irogers@google.com/
v2. Fixes two type issues in the python code but no functional or
output changes.
https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
AMD:
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandidas@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904044047.999031-1-irogers@google.com/
v5. Rebase. Add uop cache hit/miss rates patch. Prefix all metric
names with lpm_ (short for Linux Perf Metric) so that python
generated metrics are clearly namespaced.
https://lore.kernel.org/lkml/20250829033138.4166591-1-irogers@google.com/
v4. Rebase.
https://lore.kernel.org/lkml/20240926174101.406874-1-irogers@google.com/
v3. Some minor code cleanup changes.
https://lore.kernel.org/lkml/20240314055839.1975063-1-irogers@google.com/
v2. Drop the cycles breakdown in favor of having it as a common
metric, suggested by Kan Liang <kan.liang@linux.intel.com>.
https://lore.kernel.org/lkml/20240301184737.2660108-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240229001537.4158049-1-irogers@google.com/
Intel:
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandidas@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904044653.1002362-1-irogers@google.com/
v5. Rebase. Fix description for smi metric (Kan). Prefix all metric
names with lpm_ (short for Linux Perf Metric) so that python
generated metrics are clearly namespaced. Kan requested a
namespace in his review:
https://lore.kernel.org/lkml/43548903-b7c8-47c4-b1da-0258293ecbd4@linux.intel.com/
The v5 series is:
https://lore.kernel.org/lkml/20250829041104.4186320-1-irogers@google.com/
v4. Experimental metric descriptions. Add mesh bandwidth metric. Rebase.
https://lore.kernel.org/lkml/20240926175035.408668-1-irogers@google.com/
v3. Swap tsx and CheckPMU patches that were in the wrong order. Some
minor code cleanup changes. Drop reference to merged fix for
umasks/occ_sel in PCU events and for cstate metrics.
https://lore.kernel.org/lkml/20240314055919.1979781-1-irogers@google.com/
v2. Drop the cycles breakdown in favor of having it as a common
metric, spelling and other improvements suggested by Kan Liang
<kan.liang@linux.intel.com>.
https://lore.kernel.org/lkml/20240301185559.2661241-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240229001806.4158429-1-irogers@google.com/
ARM:
v7. Switch a use of cycles to cpu-cycles due to ARM having too many
cycles events.
https://lore.kernel.org/lkml/20250904194139.1540230-1-irogers@google.com/
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandidas@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904045253.1007052-1-irogers@google.com/
v5. Rebase. Address review comments from Leo Yan
<leo.yan@arm.com>. Prefix all metric names with lpm_ (short for
Linux Perf Metric) so that python generated metrics are clearly
namespaced. Use cpu-cycles rather than cycles legacy event for
cycles metrics to avoid confusion with ARM PMUs. Add patch that
checks events to ensure all possible event names are present in at
least one json file.
https://lore.kernel.org/lkml/20250829053235.21994-1-irogers@google.com/
v4. Tweak to build dependencies and rebase.
https://lore.kernel.org/lkml/20240926175709.410022-1-irogers@google.com/
v3. Some minor code cleanup changes.
https://lore.kernel.org/lkml/20240314055801.1973422-1-irogers@google.com/
v2. The cycles metrics are now made common and shared with AMD and
Intel, suggested by Kan Liang <kan.liang@linux.intel.com>. This
assumes these patches come after the AMD and Intel sets.
https://lore.kernel.org/lkml/20240301184942.2660478-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240229001325.4157655-1-irogers@google.com/
Ian Rogers (48):
perf python: Correct copying of metric_leader in an evsel
perf ilist: Be tolerant of reading a metric on the wrong CPU
perf jevents: Allow multiple metricgroups.json files
perf jevents: Update metric constraint support
perf jevents: Add descriptions to metricgroup abstraction
perf jevents: Allow metric groups not to be named
perf jevents: Support parsing negative exponents
perf jevents: Term list fix in event parsing
perf jevents: Add threshold expressions to Metric
perf jevents: Move json encoding to its own functions
perf jevents: Drop duplicate pending metrics
perf jevents: Skip optional metrics in metric group list
perf jevents: Build support for generating metrics from python
perf jevents: Add load event json to verify and allow fallbacks
perf jevents: Add RAPL event metric for AMD zen models
perf jevents: Add idle metric for AMD zen models
perf jevents: Add upc metric for uops per cycle for AMD
perf jevents: Add br metric group for branch statistics on AMD
perf jevents: Add itlb metric group for AMD
perf jevents: Add dtlb metric group for AMD
perf jevents: Add uncore l3 metric group for AMD
perf jevents: Add load store breakdown metrics ldst for AMD
perf jevents: Add context switch metrics for AMD
perf jevents: Add RAPL metrics for all Intel models
perf jevents: Add idle metric for Intel models
perf jevents: Add CheckPmu to see if a PMU is in loaded json events
perf jevents: Add smi metric group for Intel models
perf jevents: Mark metrics with experimental events as experimental
perf jevents: Add tsx metric group for Intel models
perf jevents: Add br metric group for branch statistics on Intel
perf jevents: Add software prefetch (swpf) metric group for Intel
perf jevents: Add ports metric group giving utilization on Intel
perf jevents: Add L2 metrics for Intel
perf jevents: Add load store breakdown metrics ldst for Intel
perf jevents: Add ILP metrics for Intel
perf jevents: Add context switch metrics for Intel
perf jevents: Add FPU metrics for Intel
perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
perf jevents: Add mem_bw metric for Intel
perf jevents: Add local/remote "mem" breakdown metrics for Intel
perf jevents: Add dir breakdown metrics for Intel
perf jevents: Add C-State metrics from the PCU PMU for Intel
perf jevents: Add local/remote miss latency metrics for Intel
perf jevents: Add upi_bw metric for Intel
perf jevents: Add mesh bandwidth saturation metric for Intel
perf jevents: Add collection of topdown like metrics for arm64
perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
perf jevents: Validate that all names given an Event
tools/perf/.gitignore | 5 +
tools/perf/Makefile.perf | 2 +
tools/perf/pmu-events/Build | 51 +-
tools/perf/pmu-events/amd_metrics.py | 491 ++++++++++
tools/perf/pmu-events/arm64_metrics.py | 187 ++++
tools/perf/pmu-events/common_metrics.py | 19 +
tools/perf/pmu-events/intel_metrics.py | 1129 +++++++++++++++++++++++
tools/perf/pmu-events/jevents.py | 7 +-
tools/perf/pmu-events/metric.py | 256 ++++-
tools/perf/pmu-events/metric_test.py | 4 +
tools/perf/python/ilist.py | 8 +-
tools/perf/util/evsel.c | 1 +
tools/perf/util/python.c | 82 +-
13 files changed, 2188 insertions(+), 54 deletions(-)
create mode 100755 tools/perf/pmu-events/amd_metrics.py
create mode 100755 tools/perf/pmu-events/arm64_metrics.py
create mode 100644 tools/perf/pmu-events/common_metrics.py
create mode 100755 tools/perf/pmu-events/intel_metrics.py
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply [flat|nested] 55+ messages in thread
* [PATCH v9 01/48] perf python: Correct copying of metric_leader in an evsel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
@ 2025-12-02 17:49 ` Ian Rogers
2025-12-02 17:49 ` [PATCH v9 02/48] perf ilist: Be tolerant of reading a metric on the wrong CPU Ian Rogers
` (47 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:49 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Ensure the metric_leader is copied and set up correctly. In
compute_metric, determine the correct metric_leader event to match the
requested CPU. This fixes the handling of metrics, particularly on
hybrid machines.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/util/evsel.c | 1 +
tools/perf/util/python.c | 82 +++++++++++++++++++++++++++++-----------
2 files changed, 61 insertions(+), 22 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index aee42666e882..5aae7f791bc2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -538,6 +538,7 @@ struct evsel *evsel__clone(struct evsel *dest, struct evsel *orig)
#endif
evsel->handler = orig->handler;
evsel->core.leader = orig->core.leader;
+ evsel->metric_leader = orig->metric_leader;
evsel->max_events = orig->max_events;
zfree(&evsel->unit);
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index fa5e4270d182..cc1019d29a5d 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -1340,27 +1340,48 @@ static int prepare_metric(const struct metric_expr *mexp,
struct metric_ref *metric_refs = mexp->metric_refs;
for (int i = 0; metric_events[i]; i++) {
- char *n = strdup(evsel__metric_id(metric_events[i]));
+ struct evsel *cur = metric_events[i];
double val, ena, run;
- int source_count = evsel__source_count(metric_events[i]);
- int ret;
+ int ret, source_count = 0;
struct perf_counts_values *old_count, *new_count;
+ char *n = strdup(evsel__metric_id(cur));
if (!n)
return -ENOMEM;
+ /*
+ * If there are multiple uncore PMUs and we're not reading the
+ * leader's stats, determine the stats for the appropriate
+ * uncore PMU.
+ */
+ if (evsel && evsel->metric_leader &&
+ evsel->pmu != evsel->metric_leader->pmu &&
+ cur->pmu == evsel->metric_leader->pmu) {
+ struct evsel *pos;
+
+ evlist__for_each_entry(evsel->evlist, pos) {
+ if (pos->pmu != evsel->pmu)
+ continue;
+ if (pos->metric_leader != cur)
+ continue;
+ cur = pos;
+ source_count = 1;
+ break;
+ }
+ }
+
if (source_count == 0)
- source_count = 1;
+ source_count = evsel__source_count(cur);
- ret = evsel__ensure_counts(metric_events[i]);
+ ret = evsel__ensure_counts(cur);
if (ret)
return ret;
/* Set up pointers to the old and newly read counter values. */
- old_count = perf_counts(metric_events[i]->prev_raw_counts, cpu_idx, thread_idx);
- new_count = perf_counts(metric_events[i]->counts, cpu_idx, thread_idx);
- /* Update the value in metric_events[i]->counts. */
- evsel__read_counter(metric_events[i], cpu_idx, thread_idx);
+ old_count = perf_counts(cur->prev_raw_counts, cpu_idx, thread_idx);
+ new_count = perf_counts(cur->counts, cpu_idx, thread_idx);
+ /* Update the value in cur->counts. */
+ evsel__read_counter(cur, cpu_idx, thread_idx);
val = new_count->val - old_count->val;
ena = new_count->ena - old_count->ena;
@@ -1392,6 +1413,7 @@ static PyObject *pyrf_evlist__compute_metric(struct pyrf_evlist *pevlist,
struct metric_expr *mexp = NULL;
struct expr_parse_ctx *pctx;
double result = 0;
+ struct evsel *metric_evsel = NULL;
if (!PyArg_ParseTuple(args, "sii", &metric, &cpu, &thread))
return NULL;
@@ -1404,6 +1426,7 @@ static PyObject *pyrf_evlist__compute_metric(struct pyrf_evlist *pevlist,
list_for_each(pos, &me->head) {
struct metric_expr *e = container_of(pos, struct metric_expr, nd);
+ struct evsel *pos2;
if (strcmp(e->metric_name, metric))
continue;
@@ -1411,20 +1434,24 @@ static PyObject *pyrf_evlist__compute_metric(struct pyrf_evlist *pevlist,
if (e->metric_events[0] == NULL)
continue;
- cpu_idx = perf_cpu_map__idx(e->metric_events[0]->core.cpus,
- (struct perf_cpu){.cpu = cpu});
- if (cpu_idx < 0)
- continue;
-
- thread_idx = perf_thread_map__idx(e->metric_events[0]->core.threads,
- thread);
- if (thread_idx < 0)
- continue;
-
- mexp = e;
- break;
+ evlist__for_each_entry(&pevlist->evlist, pos2) {
+ if (pos2->metric_leader != e->metric_events[0])
+ continue;
+ cpu_idx = perf_cpu_map__idx(pos2->core.cpus,
+ (struct perf_cpu){.cpu = cpu});
+ if (cpu_idx < 0)
+ continue;
+
+ thread_idx = perf_thread_map__idx(pos2->core.threads, thread);
+ if (thread_idx < 0)
+ continue;
+ metric_evsel = pos2;
+ mexp = e;
+ goto done;
+ }
}
}
+done:
if (!mexp) {
PyErr_Format(PyExc_TypeError, "Unknown metric '%s' for CPU '%d' and thread '%d'",
metric, cpu, thread);
@@ -1435,7 +1462,7 @@ static PyObject *pyrf_evlist__compute_metric(struct pyrf_evlist *pevlist,
if (!pctx)
return PyErr_NoMemory();
- ret = prepare_metric(mexp, mexp->metric_events[0], pctx, cpu_idx, thread_idx);
+ ret = prepare_metric(mexp, metric_evsel, pctx, cpu_idx, thread_idx);
if (ret) {
expr__ctx_free(pctx);
errno = -ret;
@@ -1996,6 +2023,17 @@ static PyObject *pyrf_evlist__from_evlist(struct evlist *evlist)
else if (leader == NULL)
evsel__set_leader(pos, pos);
}
+
+ leader = pos->metric_leader;
+
+ if (pos != leader) {
+ int idx = evlist__pos(evlist, leader);
+
+ if (idx >= 0)
+ pos->metric_leader = evlist__at(&pevlist->evlist, idx);
+ else if (leader == NULL)
+ pos->metric_leader = pos;
+ }
}
metricgroup__copy_metric_events(&pevlist->evlist, /*cgrp=*/NULL,
&pevlist->evlist.metric_events,
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 02/48] perf ilist: Be tolerant of reading a metric on the wrong CPU
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
2025-12-02 17:49 ` [PATCH v9 01/48] perf python: Correct copying of metric_leader in an evsel Ian Rogers
@ 2025-12-02 17:49 ` Ian Rogers
2025-12-02 17:49 ` [PATCH v9 03/48] perf jevents: Allow multiple metricgroups.json files Ian Rogers
` (46 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:49 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Reading a metric on the wrong CPU can happen with metrics on hybrid
machines. Be tolerant of this and don't cause the ilist application to
crash with an exception.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/python/ilist.py | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/perf/python/ilist.py b/tools/perf/python/ilist.py
index eb687ce9d5a6..0d757ddb4795 100755
--- a/tools/perf/python/ilist.py
+++ b/tools/perf/python/ilist.py
@@ -77,8 +77,12 @@ class Metric(TreeValue):
return perf.parse_metrics(self.metric_name, self.metric_pmu)
def value(self, evlist: perf.evlist, evsel: perf.evsel, cpu: int, thread: int) -> float:
- val = evlist.compute_metric(self.metric_name, cpu, thread)
- return 0 if math.isnan(val) else val
+ try:
+ val = evlist.compute_metric(self.metric_name, cpu, thread)
+ return 0 if math.isnan(val) else val
+ except:
+ # Be tolerant of failures to compute metrics on particular CPUs/threads.
+ return 0
@dataclass
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 03/48] perf jevents: Allow multiple metricgroups.json files
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
2025-12-02 17:49 ` [PATCH v9 01/48] perf python: Correct copying of metric_leader in an evsel Ian Rogers
2025-12-02 17:49 ` [PATCH v9 02/48] perf ilist: Be tolerant of reading a metric on the wrong CPU Ian Rogers
@ 2025-12-02 17:49 ` Ian Rogers
2025-12-02 17:49 ` [PATCH v9 04/48] perf jevents: Update metric constraint support Ian Rogers
` (45 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:49 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Allow multiple metricgroups.json files by handling any file ending
with metricgroups.json as a metricgroups file.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/jevents.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 3413ee5d0227..03f5ad262eb5 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -640,7 +640,7 @@ def preprocess_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
if not item.is_file() or not item.name.endswith('.json'):
return
- if item.name == 'metricgroups.json':
+ if item.name.endswith('metricgroups.json'):
metricgroup_descriptions = json.load(open(item.path))
for mgroup in metricgroup_descriptions:
assert len(mgroup) > 1, parents
@@ -693,7 +693,7 @@ def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
# Ignore other directories. If the file name does not have a .json
# extension, ignore it. It could be a readme.txt for instance.
- if not item.is_file() or not item.name.endswith('.json') or item.name == 'metricgroups.json':
+ if not item.is_file() or not item.name.endswith('.json') or item.name.endswith('metricgroups.json'):
return
add_events_table_entries(item, get_topic(item.name))
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 04/48] perf jevents: Update metric constraint support
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (2 preceding siblings ...)
2025-12-02 17:49 ` [PATCH v9 03/48] perf jevents: Allow multiple metricgroups.json files Ian Rogers
@ 2025-12-02 17:49 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 05/48] perf jevents: Add descriptions to metricgroup abstraction Ian Rogers
` (44 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:49 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Previously metric constraints were binary: either no constraint, or
don't group events when the NMI watchdog is present. Update the
support to match the definitions in 'enum metric_event_groups' in
pmu-events.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 92acd89ed97a..8a718dd4b1fe 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -4,8 +4,14 @@ import ast
import decimal
import json
import re
+from enum import Enum
from typing import Dict, List, Optional, Set, Tuple, Union
+class MetricConstraint(Enum):
+ GROUPED_EVENTS = 0
+ NO_GROUP_EVENTS = 1
+ NO_GROUP_EVENTS_NMI = 2
+ NO_GROUP_EVENTS_SMT = 3
class Expression:
"""Abstract base class of elements in a metric expression."""
@@ -423,14 +429,14 @@ class Metric:
groups: Set[str]
expr: Expression
scale_unit: str
- constraint: bool
+ constraint: MetricConstraint
def __init__(self,
name: str,
description: str,
expr: Expression,
scale_unit: str,
- constraint: bool = False):
+ constraint: MetricConstraint = MetricConstraint.GROUPED_EVENTS):
self.name = name
self.description = description
self.expr = expr.Simplify()
@@ -464,8 +470,8 @@ class Metric:
'MetricExpr': self.expr.ToPerfJson(),
'ScaleUnit': self.scale_unit
}
- if self.constraint:
- result['MetricConstraint'] = 'NO_NMI_WATCHDOG'
+ if self.constraint != MetricConstraint.GROUPED_EVENTS:
+ result['MetricConstraint'] = self.constraint.name
return result
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 05/48] perf jevents: Add descriptions to metricgroup abstraction
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (3 preceding siblings ...)
2025-12-02 17:49 ` [PATCH v9 04/48] perf jevents: Update metric constraint support Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 06/48] perf jevents: Allow metric groups not to be named Ian Rogers
` (43 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add a function to recursively generate metric group descriptions.
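As a usage sketch (the group names here are hypothetical), the
recursion collects a name-to-description mapping over nested groups:
  from metric import MetricGroup
  l3 = MetricGroup("lpm_l3", [], description="L3 cache metrics")
  mem = MetricGroup("lpm_mem", [l3], description="Memory subsystem metrics")
  # Yields {'lpm_mem': 'Memory subsystem metrics',
  #         'lpm_l3': 'L3 cache metrics'}
  print(mem.ToMetricGroupDescriptions())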
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 8a718dd4b1fe..1de4fb72c75e 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -475,6 +475,8 @@ class Metric:
return result
+ def ToMetricGroupDescriptions(self, root: bool = True) -> Dict[str, str]:
+ return {}
class _MetricJsonEncoder(json.JSONEncoder):
"""Special handling for Metric objects."""
@@ -493,10 +495,12 @@ class MetricGroup:
which can facilitate arrangements similar to trees.
"""
- def __init__(self, name: str, metric_list: List[Union[Metric,
- 'MetricGroup']]):
+ def __init__(self, name: str,
+ metric_list: List[Union[Metric, 'MetricGroup']],
+ description: Optional[str] = None):
self.name = name
self.metric_list = metric_list
+ self.description = description
for metric in metric_list:
metric.AddToMetricGroup(self)
@@ -516,6 +520,12 @@ class MetricGroup:
def ToPerfJson(self) -> str:
return json.dumps(sorted(self.Flatten()), indent=2, cls=_MetricJsonEncoder)
+ def ToMetricGroupDescriptions(self, root: bool = True) -> Dict[str, str]:
+ result = {self.name: self.description} if self.description else {}
+ for x in self.metric_list:
+ result.update(x.ToMetricGroupDescriptions(False))
+ return result
+
def __str__(self) -> str:
return self.ToPerfJson()
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 06/48] perf jevents: Allow metric groups not to be named
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (4 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 05/48] perf jevents: Add descriptions to metricgroup abstraction Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 07/48] perf jevents: Support parsing negative exponents Ian Rogers
` (42 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
It can be convenient to have unnamed metric groups for the sake of
organizing other metrics and metric groups. An unspecified name
shouldn't contribute to the MetricGroup json value, so don't record
it.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 1de4fb72c75e..847b614d40d5 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -455,7 +455,8 @@ class Metric:
def AddToMetricGroup(self, group):
"""Callback used when being added to a MetricGroup."""
- self.groups.add(group.name)
+ if group.name:
+ self.groups.add(group.name)
def Flatten(self) -> Set['Metric']:
"""Return a leaf metric."""
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 07/48] perf jevents: Support parsing negative exponents
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (5 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 06/48] perf jevents: Allow metric groups not to be named Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 08/48] perf jevents: Term list fix in event parsing Ian Rogers
` (41 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Support negative exponents when parsing from a json metric string by
making the digits after the 'e' optional in the 'Event' insertion fix
up.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 2 +-
tools/perf/pmu-events/metric_test.py | 4 ++++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 847b614d40d5..31eea2f45152 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -573,7 +573,7 @@ def ParsePerfJson(orig: str) -> Expression:
# a double by the Bison parser
py = re.sub(r'0Event\(r"[xX]([0-9a-fA-F]*)"\)', r'Event("0x\1")', py)
# Convert accidentally converted scientific notation constants back
- py = re.sub(r'([0-9]+)Event\(r"(e[0-9]+)"\)', r'\1\2', py)
+ py = re.sub(r'([0-9]+)Event\(r"(e[0-9]*)"\)', r'\1\2', py)
# Convert all the known keywords back from events to just the keyword
keywords = ['if', 'else', 'min', 'max', 'd_ratio', 'source_count', 'has_event', 'strcmp_cpuid_str']
for kw in keywords:
diff --git a/tools/perf/pmu-events/metric_test.py b/tools/perf/pmu-events/metric_test.py
index ee22ff43ddd7..8acfe4652b55 100755
--- a/tools/perf/pmu-events/metric_test.py
+++ b/tools/perf/pmu-events/metric_test.py
@@ -61,6 +61,10 @@ class TestMetricExpressions(unittest.TestCase):
after = before
self.assertEqual(ParsePerfJson(before).ToPerfJson(), after)
+ before = r'a + 3e-12 + b'
+ after = before
+ self.assertEqual(ParsePerfJson(before).ToPerfJson(), after)
+
def test_IfElseTests(self):
# if-else needs rewriting to Select and back.
before = r'Event1 if #smt_on else Event2'
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 08/48] perf jevents: Term list fix in event parsing
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (6 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 07/48] perf jevents: Support parsing negative exponents Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 09/48] perf jevents: Add threshold expressions to Metric Ian Rogers
` (40 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Fix up events that get wrongly broken apart at a comma during parsing.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 31eea2f45152..0f4e67e5cfea 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -568,6 +568,12 @@ def ParsePerfJson(orig: str) -> Expression:
r'Event(r"\1")', py)
# If it started with a # it should have been a literal, rather than an event name
py = re.sub(r'#Event\(r"([^"]*)"\)', r'Literal("#\1")', py)
+ # Fix events wrongly broken at a ','
+ while True:
+ prev_py = py
+ py = re.sub(r'Event\(r"([^"]*)"\),Event\(r"([^"]*)"\)', r'Event(r"\1,\2")', py)
+ if py == prev_py:
+ break
# Convert accidentally converted hex constants ("0Event(r"xDEADBEEF)"") back to a constant,
# but keep it wrapped in Event(), otherwise Python drops the 0x prefix and it gets interpreted as
# a double by the Bison parser
@@ -586,7 +592,6 @@ def ParsePerfJson(orig: str) -> Expression:
parsed = ast.fix_missing_locations(parsed)
return _Constify(eval(compile(parsed, orig, 'eval')))
-
def RewriteMetricsInTermsOfOthers(metrics: List[Tuple[str, str, Expression]]
)-> Dict[Tuple[str, str], Expression]:
"""Shorten metrics by rewriting in terms of others.
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 09/48] perf jevents: Add threshold expressions to Metric
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (7 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 08/48] perf jevents: Term list fix in event parsing Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 10/48] perf jevents: Move json encoding to its own functions Ian Rogers
` (39 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Allow threshold expressions for metrics to be generated.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 0f4e67e5cfea..e81fed2e29b5 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -430,13 +430,15 @@ class Metric:
expr: Expression
scale_unit: str
constraint: MetricConstraint
+ threshold: Optional[Expression]
def __init__(self,
name: str,
description: str,
expr: Expression,
scale_unit: str,
- constraint: MetricConstraint = MetricConstraint.GROUPED_EVENTS):
+ constraint: MetricConstraint = MetricConstraint.GROUPED_EVENTS,
+ threshold: Optional[Expression] = None):
self.name = name
self.description = description
self.expr = expr.Simplify()
@@ -447,6 +449,7 @@ class Metric:
else:
self.scale_unit = f'1{scale_unit}'
self.constraint = constraint
+ self.threshold = threshold
self.groups = set()
def __lt__(self, other):
@@ -473,6 +476,8 @@ class Metric:
}
if self.constraint != MetricConstraint.GROUPED_EVENTS:
result['MetricConstraint'] = self.constraint.name
+ if self.threshold:
+ result['MetricThreshold'] = self.threshold.ToPerfJson()
return result
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 10/48] perf jevents: Move json encoding to its own functions
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (8 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 09/48] perf jevents: Add threshold expressions to Metric Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 11/48] perf jevents: Drop duplicate pending metrics Ian Rogers
` (38 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Have dedicated json encode functions rather than embedding the
encoding in MetricGroup. This provides some uniformity in the Metric
ToXXX routines.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 34 +++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index e81fed2e29b5..b39189182608 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -484,15 +484,6 @@ class Metric:
def ToMetricGroupDescriptions(self, root: bool = True) -> Dict[str, str]:
return {}
-class _MetricJsonEncoder(json.JSONEncoder):
- """Special handling for Metric objects."""
-
- def default(self, o):
- if isinstance(o, Metric):
- return o.ToPerfJson()
- return json.JSONEncoder.default(self, o)
-
-
class MetricGroup:
"""A group of metrics.
@@ -523,8 +514,11 @@ class MetricGroup:
return result
- def ToPerfJson(self) -> str:
- return json.dumps(sorted(self.Flatten()), indent=2, cls=_MetricJsonEncoder)
+ def ToPerfJson(self) -> List[Dict[str, str]]:
+ result = []
+ for x in sorted(self.Flatten()):
+ result.append(x.ToPerfJson())
+ return result
def ToMetricGroupDescriptions(self, root: bool = True) -> Dict[str, str]:
result = {self.name: self.description} if self.description else {}
@@ -533,7 +527,23 @@ class MetricGroup:
return result
def __str__(self) -> str:
- return self.ToPerfJson()
+ return str(self.ToPerfJson())
+
+
+def JsonEncodeMetric(x: MetricGroup):
+ class MetricJsonEncoder(json.JSONEncoder):
+ """Special handling for Metric objects."""
+
+ def default(self, o):
+ if isinstance(o, Metric) or isinstance(o, MetricGroup):
+ return o.ToPerfJson()
+ return json.JSONEncoder.default(self, o)
+
+ return json.dumps(x, indent=2, cls=MetricJsonEncoder)
+
+
+def JsonEncodeMetricGroupDescriptions(x: MetricGroup):
+ return json.dumps(x.ToMetricGroupDescriptions(), indent=2)
class _RewriteIfExpToSelect(ast.NodeTransformer):
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 11/48] perf jevents: Drop duplicate pending metrics
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (9 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 10/48] perf jevents: Move json encoding to its own functions Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 12/48] perf jevents: Skip optional metrics in metric group list Ian Rogers
` (37 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Drop adding a pending metric if an existing one has the same metric
name and PMU. Checking the PMU ensures that metrics differing only by
PMU are kept on hybrid systems.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/jevents.py | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 03f5ad262eb5..3a1bcdcdc685 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -501,7 +501,8 @@ def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
for e in read_json_events(item.path, topic):
if e.name:
_pending_events.append(e)
- if e.metric_name:
+ if e.metric_name and not any(e.metric_name == x.metric_name and
+ e.pmu == x.pmu for x in _pending_metrics):
_pending_metrics.append(e)
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 12/48] perf jevents: Skip optional metrics in metric group list
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (10 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 11/48] perf jevents: Drop duplicate pending metrics Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 13/48] perf jevents: Build support for generating metrics from python Ian Rogers
` (36 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
For metric groups, skip entries in the metric list that are None. This
allows helper functions to optionally return a metric or None.
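A sketch of the intended usage (the helper and event names are
hypothetical, and it assumes the Event validation added later in this
series):
  from typing import Optional
  from metric import Event, Metric, MetricGroup, d_ratio
  def UpcMetric() -> Optional[Metric]:
      # Hypothetical helper: only emit the metric when the model has
      # the required event.
      try:
          uops = Event("uops_retired")
      except Exception:
          return None
      return Metric("lpm_upc", "Uops retired per cycle",
                    d_ratio(uops, Event("cpu-cycles", "cycles")),
                    "uops/cycle")
  # A None entry is now silently skipped by the MetricGroup.
  core = MetricGroup("lpm_core", [UpcMetric()])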
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index b39189182608..dd8fd06940e6 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -493,13 +493,15 @@ class MetricGroup:
"""
def __init__(self, name: str,
- metric_list: List[Union[Metric, 'MetricGroup']],
+ metric_list: List[Union[Optional[Metric], Optional['MetricGroup']]],
description: Optional[str] = None):
self.name = name
- self.metric_list = metric_list
+ self.metric_list = []
self.description = description
for metric in metric_list:
- metric.AddToMetricGroup(self)
+ if metric:
+ self.metric_list.append(metric)
+ metric.AddToMetricGroup(self)
def AddToMetricGroup(self, group):
"""Callback used when a MetricGroup is added into another."""
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 13/48] perf jevents: Build support for generating metrics from python
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (11 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 12/48] perf jevents: Skip optional metrics in metric group list Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 14/48] perf jevents: Add load event json to verify and allow fallbacks Ian Rogers
` (35 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Generate extra-metrics.json and extra-metricgroups.json from the
architecture-specific python scripts. The metrics themselves will be
added in later patches.
If a build takes place in tools/perf/ then extra-metrics.json and
extra-metricgroups.json are generated within the source tree and so
are added to .gitignore. If there is an OUTPUT directory then the
tools/perf/pmu-events/arch files are copied to it so that the
generated extra-metrics.json and extra-metricgroups.json can be
added/generated there.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/.gitignore | 5 +++
tools/perf/Makefile.perf | 2 +
tools/perf/pmu-events/Build | 51 +++++++++++++++++++++++++-
tools/perf/pmu-events/amd_metrics.py | 42 +++++++++++++++++++++
tools/perf/pmu-events/arm64_metrics.py | 43 ++++++++++++++++++++++
tools/perf/pmu-events/intel_metrics.py | 42 +++++++++++++++++++++
6 files changed, 184 insertions(+), 1 deletion(-)
create mode 100755 tools/perf/pmu-events/amd_metrics.py
create mode 100755 tools/perf/pmu-events/arm64_metrics.py
create mode 100755 tools/perf/pmu-events/intel_metrics.py
diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
index b64302a76144..133e343bf44e 100644
--- a/tools/perf/.gitignore
+++ b/tools/perf/.gitignore
@@ -42,6 +42,11 @@ pmu-events/metric_test.log
pmu-events/empty-pmu-events.log
pmu-events/test-empty-pmu-events.c
*.shellcheck_log
+pmu-events/arch/**/extra-metrics.json
+pmu-events/arch/**/extra-metricgroups.json
+tests/shell/*.shellcheck_log
+tests/shell/coresight/*.shellcheck_log
+tests/shell/lib/*.shellcheck_log
feature/
libapi/
libbpf/
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 80decc7ce13c..a6040e09cead 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1277,6 +1277,8 @@ ifeq ($(OUTPUT),)
pmu-events/metric_test.log \
pmu-events/test-empty-pmu-events.c \
pmu-events/empty-pmu-events.log
+ $(Q)find pmu-events/arch -name 'extra-metrics.json' -delete -o \
+ -name 'extra-metricgroups.json' -delete
else # When an OUTPUT directory is present, clean up the copied pmu-events/arch directory.
$(call QUIET_CLEAN, pmu-events) $(RM) -r $(OUTPUT)pmu-events/arch \
$(OUTPUT)pmu-events/pmu-events.c \
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index a46ab7b612df..c9df78ee003c 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -29,6 +29,10 @@ $(PMU_EVENTS_C): $(EMPTY_PMU_EVENTS_C)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)cp $< $@
else
+# Functions to extract the model from a extra-metrics.json or extra-metricgroups.json path.
+model_name = $(shell echo $(1)|sed -e 's@.\+/\(.*\)/extra-metric.*\.json@\1@')
+vendor_name = $(shell echo $(1)|sed -e 's@.\+/\(.*\)/[^/]*/extra-metric.*\.json@\1@')
+
# Copy checked-in json to OUTPUT for generation if it's an out of source build
ifneq ($(OUTPUT),)
$(OUTPUT)pmu-events/arch/%: pmu-events/arch/%
@@ -40,7 +44,52 @@ $(LEGACY_CACHE_JSON): $(LEGACY_CACHE_PY)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)$(PYTHON) $(LEGACY_CACHE_PY) > $@
-GEN_JSON = $(patsubst %,$(OUTPUT)%,$(JSON)) $(LEGACY_CACHE_JSON)
+GEN_METRIC_DEPS := pmu-events/metric.py
+
+# Generate AMD Json
+ZENS = $(shell ls -d pmu-events/arch/x86/amdzen*)
+ZEN_METRICS = $(foreach x,$(ZENS),$(OUTPUT)$(x)/extra-metrics.json)
+ZEN_METRICGROUPS = $(foreach x,$(ZENS),$(OUTPUT)$(x)/extra-metricgroups.json)
+
+$(ZEN_METRICS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+
+$(ZEN_METRICGROUPS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+
+# Generate ARM Json
+ARMS = $(shell ls -d pmu-events/arch/arm64/arm/*)
+ARM_METRICS = $(foreach x,$(ARMS),$(OUTPUT)$(x)/extra-metrics.json)
+ARM_METRICGROUPS = $(foreach x,$(ARMS),$(OUTPUT)$(x)/extra-metricgroups.json)
+
+$(ARM_METRICS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call vendor_name,$@) $(call model_name,$@) arch > $@
+
+$(ARM_METRICGROUPS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call vendor_name,$@) $(call model_name,$@) arch > $@
+
+# Generate Intel Json
+INTELS = $(shell ls -d pmu-events/arch/x86/*|grep -v amdzen|grep -v mapfile.csv)
+INTEL_METRICS = $(foreach x,$(INTELS),$(OUTPUT)$(x)/extra-metrics.json)
+INTEL_METRICGROUPS = $(foreach x,$(INTELS),$(OUTPUT)$(x)/extra-metricgroups.json)
+
+$(INTEL_METRICS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+
+$(INTEL_METRICGROUPS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+
+GEN_JSON = $(patsubst %,$(OUTPUT)%,$(JSON)) \
+ $(LEGACY_CACHE_JSON) \
+ $(ZEN_METRICS) $(ZEN_METRICGROUPS) \
+ $(ARM_METRICS) $(ARM_METRICGROUPS) \
+ $(INTEL_METRICS) $(INTEL_METRICGROUPS)
$(METRIC_TEST_LOG): $(METRIC_TEST_PY) $(METRIC_PY)
$(call rule_mkdir)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
new file mode 100755
index 000000000000..5f44687d8d20
--- /dev/null
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+import argparse
+import os
+from metric import (
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+
+# Global command line arguments.
+_args = None
+
+
+def main() -> None:
+ global _args
+
+ def dir_path(path: str) -> str:
+ """Validate path is a directory for argparse."""
+ if os.path.isdir(path):
+ return path
+ raise argparse.ArgumentTypeError(
+ f'\'{path}\' is not a valid directory')
+
+ parser = argparse.ArgumentParser(description="AMD perf json generator")
+ parser.add_argument(
+ "-metricgroups", help="Generate metricgroups data", action='store_true')
+ parser.add_argument("model", help="e.g. amdzen[123]")
+ parser.add_argument(
+ 'events_path',
+ type=dir_path,
+ help='Root of tree containing architecture directories containing json files'
+ )
+ _args = parser.parse_args()
+
+ all_metrics = MetricGroup("", [])
+
+ if _args.metricgroups:
+ print(JsonEncodeMetricGroupDescriptions(all_metrics))
+ else:
+ print(JsonEncodeMetric(all_metrics))
+
+
+if __name__ == '__main__':
+ main()
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
new file mode 100755
index 000000000000..204b3b08c680
--- /dev/null
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -0,0 +1,43 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+import argparse
+import os
+from metric import (
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+
+# Global command line arguments.
+_args = None
+
+
+def main() -> None:
+ global _args
+
+ def dir_path(path: str) -> str:
+ """Validate path is a directory for argparse."""
+ if os.path.isdir(path):
+ return path
+ raise argparse.ArgumentTypeError(
+ f'\'{path}\' is not a valid directory')
+
+ parser = argparse.ArgumentParser(description="ARM perf json generator")
+ parser.add_argument(
+ "-metricgroups", help="Generate metricgroups data", action='store_true')
+ parser.add_argument("vendor", help="e.g. arm")
+ parser.add_argument("model", help="e.g. neoverse-n1")
+ parser.add_argument(
+ 'events_path',
+ type=dir_path,
+ help='Root of tree containing architecture directories containing json files'
+ )
+ _args = parser.parse_args()
+
+ all_metrics = MetricGroup("", [])
+
+ if _args.metricgroups:
+ print(JsonEncodeMetricGroupDescriptions(all_metrics))
+ else:
+ print(JsonEncodeMetric(all_metrics))
+
+
+if __name__ == '__main__':
+ main()
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
new file mode 100755
index 000000000000..65ada006d05a
--- /dev/null
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+import argparse
+import os
+from metric import (
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+
+# Global command line arguments.
+_args = None
+
+
+def main() -> None:
+ global _args
+
+ def dir_path(path: str) -> str:
+ """Validate path is a directory for argparse."""
+ if os.path.isdir(path):
+ return path
+ raise argparse.ArgumentTypeError(
+ f'\'{path}\' is not a valid directory')
+
+ parser = argparse.ArgumentParser(description="Intel perf json generator")
+ parser.add_argument(
+ "-metricgroups", help="Generate metricgroups data", action='store_true')
+ parser.add_argument("model", help="e.g. skylakex")
+ parser.add_argument(
+ 'events_path',
+ type=dir_path,
+ help='Root of tree containing architecture directories containing json files'
+ )
+ _args = parser.parse_args()
+
+ all_metrics = MetricGroup("", [])
+
+ if _args.metricgroups:
+ print(JsonEncodeMetricGroupDescriptions(all_metrics))
+ else:
+ print(JsonEncodeMetric(all_metrics))
+
+
+if __name__ == '__main__':
+ main()
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 14/48] perf jevents: Add load event json to verify and allow fallbacks
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (12 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 13/48] perf jevents: Build support for generating metrics from python Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 15/48] perf jevents: Add RAPL event metric for AMD zen models Ian Rogers
` (34 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add a LoadEvents function that loads all the event json files in a
directory. In the Event constructor, ensure all events are defined in
the loaded event json except for legacy events like "cycles". If the
initial event name isn't found then legacy_event1 is used, and if that
isn't found legacy_event2 is used. This allows a single Event to have
multiple event names, as models will often rename the same event over
time. If none of the names exist an exception is raised.
So that references to metrics can be added, add the MetricRef
class. A MetricRef isn't validated as an event name and so provides an
escape hatch for metrics to refer to each other.
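A minimal usage sketch of the resulting API (the skylakex event names
are illustrative only, and a MetricRef is assumed to compose like any
other expression):
  from metric import Event, LoadEvents, Metric, MetricRef, d_ratio
  LoadEvents("pmu-events/arch/x86/skylakex/")
  # The first name found in the loaded json wins; "cycles" and
  # "instructions" act as the legacy fallbacks.
  cycles = Event("CPU_CLK_UNHALTED.THREAD", "cycles")
  insns = Event("INST_RETIRED.ANY", "instructions")
  ipc = Metric("lpm_ipc", "Instructions retired per cycle",
               d_ratio(insns, cycles), "insn/cycle")
  # MetricRef bypasses event-name validation so one metric can build
  # on another by name.
  ipc_pct = Metric("lpm_ipc_percent_of_peak",
                   "IPC as a percentage of a nominal 4-wide peak",
                   MetricRef("lpm_ipc") / 4, "100%")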
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/Build | 12 ++--
tools/perf/pmu-events/amd_metrics.py | 7 ++-
tools/perf/pmu-events/arm64_metrics.py | 7 ++-
tools/perf/pmu-events/intel_metrics.py | 7 ++-
tools/perf/pmu-events/metric.py | 83 +++++++++++++++++++++++++-
5 files changed, 101 insertions(+), 15 deletions(-)
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index c9df78ee003c..f7d67d03d055 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -53,11 +53,11 @@ ZEN_METRICGROUPS = $(foreach x,$(ZENS),$(OUTPUT)$(x)/extra-metricgroups.json)
$(ZEN_METRICS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) pmu-events/arch > $@
$(ZEN_METRICGROUPS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) pmu-events/arch > $@
# Generate ARM Json
ARMS = $(shell ls -d pmu-events/arch/arm64/arm/*)
@@ -66,11 +66,11 @@ ARM_METRICGROUPS = $(foreach x,$(ARMS),$(OUTPUT)$(x)/extra-metricgroups.json)
$(ARM_METRICS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call vendor_name,$@) $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call vendor_name,$@) $(call model_name,$@) pmu-events/arch > $@
$(ARM_METRICGROUPS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call vendor_name,$@) $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call vendor_name,$@) $(call model_name,$@) pmu-events/arch > $@
# Generate Intel Json
INTELS = $(shell ls -d pmu-events/arch/x86/*|grep -v amdzen|grep -v mapfile.csv)
@@ -79,11 +79,11 @@ INTEL_METRICGROUPS = $(foreach x,$(INTELS),$(OUTPUT)$(x)/extra-metricgroups.json
$(INTEL_METRICS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) pmu-events/arch > $@
$(INTEL_METRICGROUPS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) pmu-events/arch > $@
GEN_JSON = $(patsubst %,$(OUTPUT)%,$(JSON)) \
$(LEGACY_CACHE_JSON) \
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 5f44687d8d20..bc91d9c120fa 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -2,8 +2,8 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
import os
-from metric import (
- JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
+ MetricGroup)
# Global command line arguments.
_args = None
@@ -30,6 +30,9 @@ def main() -> None:
)
_args = parser.parse_args()
+ directory = f"{_args.events_path}/x86/{_args.model}/"
+ LoadEvents(directory)
+
all_metrics = MetricGroup("", [])
if _args.metricgroups:
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
index 204b3b08c680..ac717ca3513a 100755
--- a/tools/perf/pmu-events/arm64_metrics.py
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -2,8 +2,8 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
import os
-from metric import (
- JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
+ MetricGroup)
# Global command line arguments.
_args = None
@@ -31,6 +31,9 @@ def main() -> None:
)
_args = parser.parse_args()
+ directory = f"{_args.events_path}/arm64/{_args.vendor}/{_args.model}/"
+ LoadEvents(directory)
+
all_metrics = MetricGroup("", [])
if _args.metricgroups:
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 65ada006d05a..b287ef115193 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -2,8 +2,8 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
import os
-from metric import (
- JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
+ MetricGroup)
# Global command line arguments.
_args = None
@@ -30,6 +30,9 @@ def main() -> None:
)
_args = parser.parse_args()
+ directory = f"{_args.events_path}/x86/{_args.model}/"
+ LoadEvents(directory)
+
all_metrics = MetricGroup("", [])
if _args.metricgroups:
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index dd8fd06940e6..e33e163b2815 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -3,10 +3,56 @@
import ast
import decimal
import json
+import os
import re
from enum import Enum
from typing import Dict, List, Optional, Set, Tuple, Union
+all_events = set()
+
+def LoadEvents(directory: str) -> None:
+ """Populate a global set of all known events for the purpose of validating Event names"""
+ global all_events
+ all_events = {
+ "context\\-switches",
+ "cpu\\-cycles",
+ "cycles",
+ "duration_time",
+ "instructions",
+ "l2_itlb_misses",
+ }
+ for file in os.listdir(os.fsencode(directory)):
+ filename = os.fsdecode(file)
+ if filename.endswith(".json"):
+ try:
+ for x in json.load(open(f"{directory}/{filename}")):
+ if "EventName" in x:
+ all_events.add(x["EventName"])
+ elif "ArchStdEvent" in x:
+ all_events.add(x["ArchStdEvent"])
+ except json.decoder.JSONDecodeError:
+ # The generated directory may be the same as the input, which
+ # causes partial json files. Ignore errors.
+ pass
+
+
+def CheckEvent(name: str) -> bool:
+ """Check the event name exists in the set of all loaded events"""
+ global all_events
+ if len(all_events) == 0:
+ # No events loaded so assume any event is good.
+ return True
+
+ if ':' in name:
+ # Remove trailing modifier.
+ name = name[:name.find(':')]
+ elif '/' in name:
+ # Name could begin with a PMU or an event, for now assume it is good.
+ return True
+
+ return name in all_events
+
+
class MetricConstraint(Enum):
GROUPED_EVENTS = 0
NO_GROUP_EVENTS = 1
@@ -317,9 +363,18 @@ def _FixEscapes(s: str) -> str:
class Event(Expression):
"""An event in an expression."""
- def __init__(self, name: str, legacy_name: str = ''):
- self.name = _FixEscapes(name)
- self.legacy_name = _FixEscapes(legacy_name)
+ def __init__(self, *args: str):
+ error = ""
+ for name in args:
+ if CheckEvent(name):
+ self.name = _FixEscapes(name)
+ return
+ if error:
+ error += " or " + name
+ else:
+ error = name
+ global all_events
+ raise Exception(f"No event {error} in:\n{all_events}")
def ToPerfJson(self):
result = re.sub('/', '@', self.name)
@@ -338,6 +393,28 @@ class Event(Expression):
return self
+class MetricRef(Expression):
+ """A metric reference in an expression."""
+
+ def __init__(self, name: str):
+ self.name = _FixEscapes(name)
+
+ def ToPerfJson(self):
+ return self.name
+
+ def ToPython(self):
+ return f'MetricRef(r"{self.name}")'
+
+ def Simplify(self) -> Expression:
+ return self
+
+ def Equals(self, other: Expression) -> bool:
+ return isinstance(other, MetricRef) and self.name == other.name
+
+ def Substitute(self, name: str, expression: Expression) -> Expression:
+ return self
+
+
class Constant(Expression):
"""A constant within the expression tree."""
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 15/48] perf jevents: Add RAPL event metric for AMD zen models
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (13 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 14/48] perf jevents: Add load event json to verify and allow fallbacks Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 16/48] perf jevents: Add idle " Ian Rogers
` (33 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add power consumption metrics, in watts (joules per second), based on
RAPL energy events.
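The scale constant used below, 2.3283064365386962890625e-10, is 2^-32:
the RAPL PMU reports energy counts in units of 2^-32 joules, so
scaling the count and dividing by duration_time (seconds) gives watts.
A purely illustrative calculation:
  count  = 429496729600                           # raw power/energy-pkg/ count
  joules = count * 2.3283064365386962890625e-10   # = 100.0 J
  watts  = joules / 1.0                           # over a 1 second interval, 100 W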
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 31 +++++++++++++++++++++++++---
1 file changed, 28 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index bc91d9c120fa..b6cdeb4f09fe 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -1,13 +1,36 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
+import math
import os
-from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
- MetricGroup)
+from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+ LoadEvents, Metric, MetricGroup, Select)
# Global command line arguments.
_args = None
+interval_sec = Event("duration_time")
+
+
+def Rapl() -> MetricGroup:
+ """Processor socket power consumption estimate.
+
+ Use events from the running average power limit (RAPL) driver.
+ """
+ # Watts = joules/second
+ # Currently only energy-pkg is supported by AMD:
+ # https://lore.kernel.org/lkml/20220105185659.643355-1-eranian@google.com/
+ pkg = Event("power/energy\\-pkg/")
+ cond_pkg = Select(pkg, has_event(pkg), math.nan)
+ scale = 2.3283064365386962890625e-10
+ metrics = [
+ Metric("lpm_cpu_power_pkg", "",
+ d_ratio(cond_pkg * scale, interval_sec), "Watts"),
+ ]
+
+ return MetricGroup("lpm_cpu_power", metrics,
+ description="Processor socket power consumption estimates")
+
def main() -> None:
global _args
@@ -33,7 +56,9 @@ def main() -> None:
directory = f"{_args.events_path}/x86/{_args.model}/"
LoadEvents(directory)
- all_metrics = MetricGroup("", [])
+ all_metrics = MetricGroup("", [
+ Rapl(),
+ ])
if _args.metricgroups:
print(JsonEncodeMetricGroupDescriptions(all_metrics))
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 16/48] perf jevents: Add idle metric for AMD zen models
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (14 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 15/48] perf jevents: Add RAPL event metric for AMD zen models Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 17/48] perf jevents: Add upc metric for uops per cycle for AMD Ian Rogers
` (32 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Using the msr PMU, compute the percentage of wallclock cycles where
the CPUs are in a low power state.
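The expression relies on the mperf MSR only counting, at the TSC
rate, while a CPU is in C0, so its shortfall against tsc approximates
cycles spent halted. Restating the expression added below:
  lpm_idle = max(msr/tsc/ - msr/mperf/, 0) / msr/tsc/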
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index b6cdeb4f09fe..f51a044b8005 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -3,8 +3,9 @@
import argparse
import math
import os
-from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
- LoadEvents, Metric, MetricGroup, Select)
+from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+ JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
+ MetricGroup, Select)
# Global command line arguments.
_args = None
@@ -12,6 +13,16 @@ _args = None
interval_sec = Event("duration_time")
+def Idle() -> Metric:
+ cyc = Event("msr/mperf/")
+ tsc = Event("msr/tsc/")
+ low = max(tsc - cyc, 0)
+ return Metric(
+ "lpm_idle",
+ "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
+ d_ratio(low, tsc), "100%")
+
+
def Rapl() -> MetricGroup:
"""Processor socket power consumption estimate.
@@ -57,6 +68,7 @@ def main() -> None:
LoadEvents(directory)
all_metrics = MetricGroup("", [
+ Idle(),
Rapl(),
])
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 17/48] perf jevents: Add upc metric for uops per cycle for AMD
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (15 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 16/48] perf jevents: Add idle " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-08 9:46 ` Sandipan Das
2025-12-02 17:50 ` [PATCH v9 18/48] perf jevents: Add br metric group for branch statistics on AMD Ian Rogers
` (31 subsequent siblings)
48 siblings, 1 reply; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The metric adjusts for whether or not SMT is on.
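With SMT enabled both hardware threads of a core count the shared
core's cycles, so halving the per-thread cycles roughly approximates
core cycles. Restating the Select expression added below:
  smt_cycles = cycles / 2 if #smt_on else cycles
  lpm_upc    = ex_ret_ops / smt_cycles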
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index f51a044b8005..42e46b33334d 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -3,14 +3,26 @@
import argparse
import math
import os
+from typing import Optional
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, Select)
+ JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
+ Metric, MetricGroup, Select)
# Global command line arguments.
_args = None
-
+_zen_model: int = 1
interval_sec = Event("duration_time")
+ins = Event("instructions")
+cycles = Event("cycles")
+# Number of CPU cycles scaled for SMT.
+smt_cycles = Select(cycles / 2, Literal("#smt_on"), cycles)
+
+
+def AmdUpc() -> Metric:
+ ops = Event("ex_ret_ops", "ex_ret_cops")
+ upc = d_ratio(ops, smt_cycles)
+ return Metric("lpm_upc", "Micro-ops retired per core cycle (higher is better)",
+ upc, "uops/cycle")
def Idle() -> Metric:
@@ -45,6 +57,7 @@ def Rapl() -> MetricGroup:
def main() -> None:
global _args
+ global _zen_model
def dir_path(path: str) -> str:
"""Validate path is a directory for argparse."""
@@ -67,7 +80,10 @@ def main() -> None:
directory = f"{_args.events_path}/x86/{_args.model}/"
LoadEvents(directory)
+ _zen_model = int(_args.model[6:])
+
all_metrics = MetricGroup("", [
+ AmdUpc(),
Idle(),
Rapl(),
])
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 18/48] perf jevents: Add br metric group for branch statistics on AMD
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (16 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 17/48] perf jevents: Add upc metric for uops per cycle for AMD Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-08 12:42 ` Sandipan Das
2025-12-02 17:50 ` [PATCH v9 19/48] perf jevents: Add itlb metric group for AMD Ian Rogers
` (30 subsequent siblings)
48 siblings, 1 reply; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The br metric group for branches comprises metric groups for total,
taken, conditional, fused and far branches using json events. The
conditional branch mispredict event is only available on zen2, so that
part of the breakdown is missing on zen1, zen3 and zen4.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 104 +++++++++++++++++++++++++++
1 file changed, 104 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 42e46b33334d..1880ccf9c6fc 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -18,6 +18,109 @@ cycles = Event("cycles")
smt_cycles = Select(cycles / 2, Literal("#smt_on"), cycles)
+def AmdBr():
+ def Total() -> MetricGroup:
+ br = Event("ex_ret_brn")
+ br_m_all = Event("ex_ret_brn_misp")
+ br_clr = Event("ex_ret_msprd_brnch_instr_dir_msmtch",
+ "ex_ret_brn_resync")
+
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+ misp_r = d_ratio(br_m_all, br)
+ clr_r = d_ratio(br_clr, interval_sec)
+
+ return MetricGroup("lpm_br_total", [
+ Metric("lpm_br_total_retired",
+ "The number of branch instructions retired per second.", br_r,
+ "insn/s"),
+ Metric(
+ "lpm_br_total_mispred",
+ "The number of branch instructions retired, of any type, that were "
+          "not correctly predicted as a percentage of all branch instructions.",
+ misp_r, "100%"),
+ Metric("lpm_br_total_insn_between_branches",
+ "The number of instructions divided by the number of branches.",
+ ins_r, "insn"),
+ Metric("lpm_br_total_insn_fe_resteers",
+ "The number of resync branches per second.", clr_r, "req/s")
+ ])
+
+ def Taken() -> MetricGroup:
+ br = Event("ex_ret_brn_tkn")
+ br_m_tk = Event("ex_ret_brn_tkn_misp")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+ misp_r = d_ratio(br_m_tk, br)
+ return MetricGroup("lpm_br_taken", [
+ Metric("lpm_br_taken_retired",
+ "The number of taken branches that were retired per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_taken_mispred",
+ "The number of retired taken branch instructions that were "
+ "mispredicted as a percentage of all taken branches.", misp_r,
+ "100%"),
+ Metric(
+ "lpm_br_taken_insn_between_branches",
+ "The number of instructions divided by the number of taken branches.",
+ ins_r, "insn"),
+ ])
+
+ def Conditional() -> Optional[MetricGroup]:
+ global _zen_model
+ br = Event("ex_ret_cond")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+
+ metrics = [
+ Metric("lpm_br_cond_retired", "Retired conditional branch instructions.",
+ br_r, "insn/s"),
+ Metric("lpm_br_cond_insn_between_branches",
+ "The number of instructions divided by the number of conditional "
+ "branches.", ins_r, "insn"),
+ ]
+ if _zen_model == 2:
+ br_m_cond = Event("ex_ret_cond_misp")
+ misp_r = d_ratio(br_m_cond, br)
+ metrics += [
+ Metric("lpm_br_cond_mispred",
+ "Retired conditional branch instructions mispredicted as a "
+ "percentage of all conditional branches.", misp_r, "100%"),
+ ]
+
+ return MetricGroup("lpm_br_cond", metrics)
+
+ def Fused() -> MetricGroup:
+ br = Event("ex_ret_fused_instr", "ex_ret_fus_brnch_inst")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+    return MetricGroup("lpm_br_fused", [
+ Metric("lpm_br_fused_retired",
+ "Retired fused branch instructions per second.", br_r, "insn/s"),
+ Metric(
+ "lpm_br_fused_insn_between_branches",
+ "The number of instructions divided by the number of fused "
+ "branches.", ins_r, "insn"),
+ ])
+
+ def Far() -> MetricGroup:
+ br = Event("ex_ret_brn_far")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+ return MetricGroup("lpm_br_far", [
+ Metric("lpm_br_far_retired", "Retired far control transfers per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_far_insn_between_branches",
+ "The number of instructions divided by the number of far branches.",
+ ins_r, "insn"),
+ ])
+
+ return MetricGroup("lpm_br", [Total(), Taken(), Conditional(), Fused(), Far()],
+ description="breakdown of retired branch instructions")
+
+
def AmdUpc() -> Metric:
ops = Event("ex_ret_ops", "ex_ret_cops")
upc = d_ratio(ops, smt_cycles)
@@ -83,6 +186,7 @@ def main() -> None:
_zen_model = int(_args.model[6:])
all_metrics = MetricGroup("", [
+ AmdBr(),
AmdUpc(),
Idle(),
Rapl(),
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 19/48] perf jevents: Add itlb metric group for AMD
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (17 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 18/48] perf jevents: Add br metric group for branch statistics on AMD Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 20/48] perf jevents: Add dtlb " Ian Rogers
` (29 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that give an overview and details of the l1 itlb (zen1,
zen2, zen3) and l2 itlb (all zens).
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 49 ++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 1880ccf9c6fc..7a418990a767 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -121,6 +121,54 @@ def AmdBr():
description="breakdown of retired branch instructions")
+def AmdItlb():
+ global _zen_model
+ l2h = Event("bp_l1_tlb_miss_l2_tlb_hit", "bp_l1_tlb_miss_l2_hit")
+ l2m = Event("l2_itlb_misses")
+ l2r = l2h + l2m
+
+ itlb_l1_mg = None
+ l1m = l2r
+ if _zen_model <= 3:
+ l1r = Event("ic_fw32")
+ l1h = max(l1r - l1m, 0)
+ itlb_l1_mg = MetricGroup("lpm_itlb_l1", [
+ Metric("lpm_itlb_l1_hits",
+             "L1 ITLB hits as a percentage of L1 ITLB accesses.",
+ d_ratio(l1h, l1h + l1m), "100%"),
+ Metric("lpm_itlb_l1_miss",
+             "L1 ITLB misses as a percentage of L1 ITLB accesses.",
+ d_ratio(l1m, l1h + l1m), "100%"),
+ Metric("lpm_itlb_l1_reqs",
+ "The number of 32B fetch windows transferred from IC pipe to DE "
+ "instruction decoder per second.", d_ratio(
+ l1r, interval_sec),
+ "windows/sec"),
+ ])
+
+ return MetricGroup("lpm_itlb", [
+ MetricGroup("lpm_itlb_ov", [
+ Metric("lpm_itlb_ov_insn_bt_l1_miss",
+ "Number of instructions between l1 misses", d_ratio(
+ ins, l1m), "insns"),
+ Metric("lpm_itlb_ov_insn_bt_l2_miss",
+ "Number of instructions between l2 misses", d_ratio(
+ ins, l2m), "insns"),
+ ]),
+ itlb_l1_mg,
+ MetricGroup("lpm_itlb_l2", [
+ Metric("lpm_itlb_l2_hits",
+ "L2 ITLB hits as a percentage of all L2 ITLB accesses.",
+ d_ratio(l2h, l2r), "100%"),
+ Metric("lpm_itlb_l2_miss",
+ "L2 ITLB misses as a percentage of all L2 ITLB accesses.",
+ d_ratio(l2m, l2r), "100%"),
+ Metric("lpm_itlb_l2_reqs", "ITLB accesses per second.",
+ d_ratio(l2r, interval_sec), "accesses/sec"),
+ ]),
+ ], description="Instruction TLB breakdown")
+
+
def AmdUpc() -> Metric:
ops = Event("ex_ret_ops", "ex_ret_cops")
upc = d_ratio(ops, smt_cycles)
@@ -187,6 +235,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
AmdBr(),
+ AmdItlb(),
AmdUpc(),
Idle(),
Rapl(),
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 20/48] perf jevents: Add dtlb metric group for AMD
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (18 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 19/48] perf jevents: Add itlb metric group for AMD Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 21/48] perf jevents: Add uncore l3 " Ian Rogers
` (28 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that give an overview and details of the dtlb (zen1, zen2,
zen3).
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 111 +++++++++++++++++++++++++++
1 file changed, 111 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 7a418990a767..2d1d25cb40b2 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -121,6 +121,116 @@ def AmdBr():
description="breakdown of retired branch instructions")
+def AmdDtlb() -> Optional[MetricGroup]:
+ global _zen_model
+ if _zen_model >= 4:
+ return None
+
+ d_dat = Event("ls_dc_accesses") if _zen_model <= 3 else None
+ d_h4k = Event("ls_l1_d_tlb_miss.tlb_reload_4k_l2_hit")
+ d_hcoal = Event(
+ "ls_l1_d_tlb_miss.tlb_reload_coalesced_page_hit") if _zen_model >= 2 else 0
+ d_h2m = Event("ls_l1_d_tlb_miss.tlb_reload_2m_l2_hit")
+ d_h1g = Event("ls_l1_d_tlb_miss.tlb_reload_1g_l2_hit")
+
+ d_m4k = Event("ls_l1_d_tlb_miss.tlb_reload_4k_l2_miss")
+ d_mcoal = Event(
+ "ls_l1_d_tlb_miss.tlb_reload_coalesced_page_miss") if _zen_model >= 2 else 0
+ d_m2m = Event("ls_l1_d_tlb_miss.tlb_reload_2m_l2_miss")
+ d_m1g = Event("ls_l1_d_tlb_miss.tlb_reload_1g_l2_miss")
+
+ d_w0 = Event("ls_tablewalker.dc_type0") if _zen_model <= 3 else None
+ d_w1 = Event("ls_tablewalker.dc_type1") if _zen_model <= 3 else None
+ walks = d_w0 + d_w1
+ walks_r = d_ratio(walks, interval_sec)
+ ins_w = d_ratio(ins, walks)
+ l1 = d_dat
+ l1_r = d_ratio(l1, interval_sec)
+ l2_hits = d_h4k + d_hcoal + d_h2m + d_h1g
+ l2_miss = d_m4k + d_mcoal + d_m2m + d_m1g
+ l2_r = d_ratio(l2_hits + l2_miss, interval_sec)
+ l1_miss = l2_hits + l2_miss + walks
+ l1_hits = max(l1 - l1_miss, 0)
+ ins_l = d_ratio(ins, l1_miss)
+
+ return MetricGroup("lpm_dtlb", [
+ MetricGroup("lpm_dtlb_ov", [
+ Metric("lpm_dtlb_ov_insn_bt_l1_miss",
+ "DTLB overview: instructions between l1 misses.", ins_l,
+ "insns"),
+ Metric("lpm_dtlb_ov_insn_bt_walks",
+ "DTLB overview: instructions between dtlb page table walks.",
+ ins_w, "insns"),
+ ]),
+ MetricGroup("lpm_dtlb_l1", [
+ Metric("lpm_dtlb_l1_hits",
+ "DTLB L1 hits as percentage of all DTLB L1 accesses.",
+ d_ratio(l1_hits, l1), "100%"),
+ Metric("lpm_dtlb_l1_miss",
+ "DTLB L1 misses as percentage of all DTLB L1 accesses.",
+ d_ratio(l1_miss, l1), "100%"),
+ Metric("lpm_dtlb_l1_reqs", "DTLB L1 accesses per second.", l1_r,
+ "insns/s"),
+ ]),
+ MetricGroup("lpm_dtlb_l2", [
+ Metric("lpm_dtlb_l2_hits",
+ "DTLB L2 hits as percentage of all DTLB L2 accesses.",
+ d_ratio(l2_hits, l2_hits + l2_miss), "100%"),
+ Metric("lpm_dtlb_l2_miss",
+ "DTLB L2 misses as percentage of all DTLB L2 accesses.",
+ d_ratio(l2_miss, l2_hits + l2_miss), "100%"),
+ Metric("lpm_dtlb_l2_reqs", "DTLB L2 accesses per second.", l2_r,
+ "insns/s"),
+ MetricGroup("lpm_dtlb_l2_4kb", [
+ Metric(
+ "lpm_dtlb_l2_4kb_hits",
+ "DTLB L2 4kb page size hits as percentage of all DTLB L2 4kb "
+ "accesses.", d_ratio(d_h4k, d_h4k + d_m4k), "100%"),
+ Metric(
+ "lpm_dtlb_l2_4kb_miss",
+                "DTLB L2 4kb page size misses as percentage of all DTLB L2 4kb "
+ "accesses.", d_ratio(d_m4k, d_h4k + d_m4k), "100%")
+ ]),
+ MetricGroup("lpm_dtlb_l2_coalesced", [
+ Metric(
+ "lpm_dtlb_l2_coal_hits",
+ "DTLB L2 coalesced page (16kb) hits as percentage of all DTLB "
+ "L2 coalesced accesses.", d_ratio(d_hcoal,
+ d_hcoal + d_mcoal), "100%"),
+ Metric(
+ "lpm_dtlb_l2_coal_miss",
+ "DTLB L2 coalesced page (16kb) misses as percentage of all "
+ "DTLB L2 coalesced accesses.",
+ d_ratio(d_mcoal, d_hcoal + d_mcoal), "100%")
+ ]),
+ MetricGroup("lpm_dtlb_l2_2mb", [
+ Metric(
+ "lpm_dtlb_l2_2mb_hits",
+ "DTLB L2 2mb page size hits as percentage of all DTLB L2 2mb "
+ "accesses.", d_ratio(d_h2m, d_h2m + d_m2m), "100%"),
+ Metric(
+ "lpm_dtlb_l2_2mb_miss",
+                "DTLB L2 2mb page size misses as percentage of all DTLB L2 2mb "
+ "accesses.", d_ratio(d_m2m, d_h2m + d_m2m), "100%")
+ ]),
+ MetricGroup("lpm_dtlb_l2_1g", [
+ Metric(
+ "lpm_dtlb_l2_1g_hits",
+ "DTLB L2 1gb page size hits as percentage of all DTLB L2 1gb "
+ "accesses.", d_ratio(d_h1g, d_h1g + d_m1g), "100%"),
+ Metric(
+ "lpm_dtlb_l2_1g_miss",
+ "DTLB L2 1gb page size misses as percentage of all DTLB L2 "
+ "1gb accesses.", d_ratio(d_m1g, d_h1g + d_m1g), "100%")
+ ]),
+ ]),
+ MetricGroup("lpm_dtlb_walks", [
+ Metric("lpm_dtlb_walks_reqs", "DTLB page table walks per second.",
+ walks_r, "walks/s"),
+ ]),
+ ], description="Data TLB metrics")
+
+
def AmdItlb():
global _zen_model
l2h = Event("bp_l1_tlb_miss_l2_tlb_hit", "bp_l1_tlb_miss_l2_hit")
@@ -235,6 +345,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
AmdBr(),
+ AmdDtlb(),
AmdItlb(),
AmdUpc(),
Idle(),
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 21/48] perf jevents: Add uncore l3 metric group for AMD
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (19 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 20/48] perf jevents: Add dtlb " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 22/48] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
` (27 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Metrics use the amd_l3 PMU for access/miss/hit information.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 2d1d25cb40b2..6542c334a82b 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -316,6 +316,24 @@ def Rapl() -> MetricGroup:
description="Processor socket power consumption estimates")
+def UncoreL3():
+ acc = Event("l3_lookup_state.all_coherent_accesses_to_l3",
+ "l3_lookup_state.all_l3_req_typs")
+ miss = Event("l3_lookup_state.l3_miss",
+ "l3_comb_clstr_state.request_miss")
+ acc = max(acc, miss)
+ hits = acc - miss
+
+ return MetricGroup("lpm_l3", [
+ Metric("lpm_l3_accesses", "L3 victim cache accesses",
+ d_ratio(acc, interval_sec), "accesses/sec"),
+ Metric("lpm_l3_hits", "L3 victim cache hit rate",
+ d_ratio(hits, acc), "100%"),
+ Metric("lpm_l3_miss", "L3 victim cache miss rate", d_ratio(miss, acc),
+ "100%"),
+ ], description="L3 cache breakdown per CCX")
+
+
def main() -> None:
global _args
global _zen_model
@@ -350,6 +368,7 @@ def main() -> None:
AmdUpc(),
Idle(),
Rapl(),
+ UncoreL3(),
])
if _args.metricgroups:
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 22/48] perf jevents: Add load store breakdown metrics ldst for AMD
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (20 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 21/48] perf jevents: Add uncore l3 " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-08 9:21 ` Sandipan Das
2025-12-02 17:50 ` [PATCH v9 23/48] perf jevents: Add context switch metrics " Ian Rogers
` (26 subsequent siblings)
48 siblings, 1 reply; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Give a breakdown of the number of load and store instructions
dispatched. Use the counter mask (cmask) to show the number of cycles
taken to retire them.
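The cmask differences work because a counter programmed with cmask=N
only increments on cycles where the event occurs at least N times, so
differencing the counts buckets cycles by dispatch count. A sketch of
what the generated expressions compute (the code also clamps negative
differences to 0):
  cycles_with_exactly_1_load  = ld_dispatch/cmask=1/ - ld_dispatch/cmask=2/
  cycles_with_exactly_2_loads = ld_dispatch/cmask=2/ - ld_dispatch/cmask=3/
  cycles_with_3_or_more_loads = ld_dispatch/cmask=3/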
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 75 ++++++++++++++++++++++++++++
1 file changed, 75 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 6542c334a82b..1611d0e50d03 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -279,6 +279,80 @@ def AmdItlb():
], description="Instruction TLB breakdown")
+def AmdLdSt() -> MetricGroup:
+ ldst_ld = Event("ls_dispatch.ld_dispatch")
+ ldst_st = Event("ls_dispatch.store_dispatch")
+ ldst_ldc1 = Event(f"{ldst_ld}/cmask=1/")
+ ldst_stc1 = Event(f"{ldst_st}/cmask=1/")
+ ldst_ldc2 = Event(f"{ldst_ld}/cmask=2/")
+ ldst_stc2 = Event(f"{ldst_st}/cmask=2/")
+ ldst_ldc3 = Event(f"{ldst_ld}/cmask=3/")
+ ldst_stc3 = Event(f"{ldst_st}/cmask=3/")
+ ldst_cyc = Event("ls_not_halted_cyc")
+
+ ld_rate = d_ratio(ldst_ld, interval_sec)
+ st_rate = d_ratio(ldst_st, interval_sec)
+
+ ld_v1 = max(ldst_ldc1 - ldst_ldc2, 0)
+ ld_v2 = max(ldst_ldc2 - ldst_ldc3, 0)
+ ld_v3 = ldst_ldc3
+
+ st_v1 = max(ldst_stc1 - ldst_stc2, 0)
+ st_v2 = max(ldst_stc2 - ldst_stc3, 0)
+ st_v3 = ldst_stc3
+
+ return MetricGroup("lpm_ldst", [
+ MetricGroup("lpm_ldst_total", [
+ Metric("lpm_ldst_total_ld", "Number of loads dispatched per second.",
+ ld_rate, "insns/sec"),
+ Metric("lpm_ldst_total_st", "Number of stores dispatched per second.",
+ st_rate, "insns/sec"),
+ ]),
+ MetricGroup("lpm_ldst_percent_insn", [
+ Metric("lpm_ldst_percent_insn_ld",
+ "Load instructions as a percentage of all instructions.",
+ d_ratio(ldst_ld, ins), "100%"),
+ Metric("lpm_ldst_percent_insn_st",
+ "Store instructions as a percentage of all instructions.",
+ d_ratio(ldst_st, ins), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_loads_per_cycle", [
+ Metric(
+ "lpm_ldst_ret_loads_per_cycle_1",
+ "Load instructions retiring in 1 cycle as a percentage of all "
+ "unhalted cycles.", d_ratio(ld_v1, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_loads_per_cycle_2",
+ "Load instructions retiring in 2 cycles as a percentage of all "
+ "unhalted cycles.", d_ratio(ld_v2, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_loads_per_cycle_3",
+          "Load instructions retiring in 3 or more cycles as a percentage "
+ "of all unhalted cycles.", d_ratio(ld_v3, ldst_cyc), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_stores_per_cycle", [
+ Metric(
+ "lpm_ldst_ret_stores_per_cycle_1",
+ "Store instructions retiring in 1 cycle as a percentage of all "
+ "unhalted cycles.", d_ratio(st_v1, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_stores_per_cycle_2",
+ "Store instructions retiring in 2 cycles as a percentage of all "
+ "unhalted cycles.", d_ratio(st_v2, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_stores_per_cycle_3",
+          "Store instructions retiring in 3 or more cycles as a percentage "
+ "of all unhalted cycles.", d_ratio(st_v3, ldst_cyc), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_insn_bt", [
+ Metric("lpm_ldst_insn_bt_ld", "Number of instructions between loads.",
+ d_ratio(ins, ldst_ld), "insns"),
+ Metric("lpm_ldst_insn_bt_st", "Number of instructions between stores.",
+ d_ratio(ins, ldst_st), "insns"),
+ ])
+ ], description="Breakdown of load/store instructions")
+
+
def AmdUpc() -> Metric:
ops = Event("ex_ret_ops", "ex_ret_cops")
upc = d_ratio(ops, smt_cycles)
@@ -365,6 +439,7 @@ def main() -> None:
AmdBr(),
AmdDtlb(),
AmdItlb(),
+ AmdLdSt(),
AmdUpc(),
Idle(),
Rapl(),
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 23/48] perf jevents: Add context switch metrics for AMD
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (21 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 22/48] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 24/48] perf jevents: Add RAPL metrics for all Intel models Ian Rogers
` (25 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that relate context switches to the work done between
them: the context switch rate plus instructions, cycles, loads, stores
and taken branches per context switch.
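Once the generated metrics are built into perf the group can be
requested by name, for example (illustrative invocation):
  $ perf stat -M lpm_cs -a -- sleep 1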
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/pmu-events/amd_metrics.py | 33 ++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 1611d0e50d03..780e611fe575 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -121,6 +121,38 @@ def AmdBr():
description="breakdown of retired branch instructions")
+def AmdCtxSw() -> MetricGroup:
+ cs = Event("context\\-switches")
+ metrics = [
+ Metric("lpm_cs_rate", "Context switches per second",
+ d_ratio(cs, interval_sec), "ctxsw/s")
+ ]
+
+ ev = Event("instructions")
+ metrics.append(Metric("lpm_cs_instr", "Instructions per context switch",
+ d_ratio(ev, cs), "instr/cs"))
+
+ ev = Event("cycles")
+ metrics.append(Metric("lpm_cs_cycles", "Cycles per context switch",
+ d_ratio(ev, cs), "cycles/cs"))
+
+ ev = Event("ls_dispatch.ld_dispatch")
+ metrics.append(Metric("lpm_cs_loads", "Loads per context switch",
+ d_ratio(ev, cs), "loads/cs"))
+
+ ev = Event("ls_dispatch.store_dispatch")
+ metrics.append(Metric("lpm_cs_stores", "Stores per context switch",
+ d_ratio(ev, cs), "stores/cs"))
+
+ ev = Event("ex_ret_brn_tkn")
+ metrics.append(Metric("lpm_cs_br_taken", "Branches taken per context switch",
+ d_ratio(ev, cs), "br_taken/cs"))
+
+ return MetricGroup("lpm_cs", metrics,
+ description=("Number of context switches per second, instructions "
+ "retired & core cycles between context switches"))
+
+
def AmdDtlb() -> Optional[MetricGroup]:
global _zen_model
if _zen_model >= 4:
@@ -437,6 +469,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
AmdBr(),
+ AmdCtxSw(),
AmdDtlb(),
AmdItlb(),
AmdLdSt(),
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 24/48] perf jevents: Add RAPL metrics for all Intel models
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (22 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 23/48] perf jevents: Add context switch metrics " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 25/48] perf jevents: Add idle metric for " Ian Rogers
` (24 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add a 'cpu_power' metric group that computes the power consumption
from RAPL events if they are present.
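Each RAPL domain is guarded so that a missing event, for example
energy-gpu on a server part, yields NaN rather than an error. The
pattern, restating the code below, is:
  cond_ev = Select(ev, has_event(ev), math.nan)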
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 44 ++++++++++++++++++++++++--
1 file changed, 41 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index b287ef115193..61778deedfff 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,12 +1,48 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
+import math
import os
-from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
- MetricGroup)
+from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+ LoadEvents, Metric, MetricGroup, Select)
# Global command line arguments.
_args = None
+interval_sec = Event("duration_time")
+
+
+def Rapl() -> MetricGroup:
+ """Processor power consumption estimate.
+
+ Use events from the running average power limit (RAPL) driver.
+ """
+ # Watts = joules/second
+ pkg = Event("power/energy\\-pkg/")
+ cond_pkg = Select(pkg, has_event(pkg), math.nan)
+ cores = Event("power/energy\\-cores/")
+ cond_cores = Select(cores, has_event(cores), math.nan)
+ ram = Event("power/energy\\-ram/")
+ cond_ram = Select(ram, has_event(ram), math.nan)
+ gpu = Event("power/energy\\-gpu/")
+ cond_gpu = Select(gpu, has_event(gpu), math.nan)
+ psys = Event("power/energy\\-psys/")
+ cond_psys = Select(psys, has_event(psys), math.nan)
+ scale = 2.3283064365386962890625e-10
+ metrics = [
+ Metric("lpm_cpu_power_pkg", "",
+ d_ratio(cond_pkg * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_cores", "",
+ d_ratio(cond_cores * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_ram", "",
+ d_ratio(cond_ram * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_gpu", "",
+ d_ratio(cond_gpu * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_psys", "",
+ d_ratio(cond_psys * scale, interval_sec), "Watts"),
+ ]
+
+ return MetricGroup("lpm_cpu_power", metrics,
+ description="Running Average Power Limit (RAPL) power consumption estimates")
def main() -> None:
@@ -33,7 +69,9 @@ def main() -> None:
directory = f"{_args.events_path}/x86/{_args.model}/"
LoadEvents(directory)
- all_metrics = MetricGroup("", [])
+ all_metrics = MetricGroup("", [
+ Rapl(),
+ ])
if _args.metricgroups:
print(JsonEncodeMetricGroupDescriptions(all_metrics))
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 25/48] perf jevents: Add idle metric for Intel models
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (23 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 24/48] perf jevents: Add RAPL metrics for all Intel models Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 26/48] perf jevents: Add CheckPmu to see if a PMU is in loaded json events Ian Rogers
` (23 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Using the msr PMU, compute the percentage of wallclock cycles where
the CPUs are in a low power state.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 61778deedfff..0cb7a38ad238 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -3,14 +3,25 @@
import argparse
import math
import os
-from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
- LoadEvents, Metric, MetricGroup, Select)
+from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+ JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
+ MetricGroup, Select)
# Global command line arguments.
_args = None
interval_sec = Event("duration_time")
+def Idle() -> Metric:
+ cyc = Event("msr/mperf/")
+ tsc = Event("msr/tsc/")
+ low = max(tsc - cyc, 0)
+ return Metric(
+ "lpm_idle",
+ "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
+ d_ratio(low, tsc), "100%")
+
+
def Rapl() -> MetricGroup:
"""Processor power consumption estimate.
@@ -70,6 +81,7 @@ def main() -> None:
LoadEvents(directory)
all_metrics = MetricGroup("", [
+ Idle(),
Rapl(),
])
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 26/48] perf jevents: Add CheckPmu to see if a PMU is in loaded json events
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (24 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 25/48] perf jevents: Add idle metric for " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 27/48] perf jevents: Add smi metric group for Intel models Ian Rogers
` (22 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
CheckPmu can be used to determine if hybrid events are present,
allowing hybrid conditional metrics/events/PMUs to be derived from the
json files rather than from hard coded tables.
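A short example of the intended use, as it appears in later patches
in this series:
  pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
  cycles_in_tx = Event(f'{pmu}/cycles\\-t/')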
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index e33e163b2815..62d1a1e1d458 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -8,10 +8,12 @@ import re
from enum import Enum
from typing import Dict, List, Optional, Set, Tuple, Union
+all_pmus = set()
all_events = set()
def LoadEvents(directory: str) -> None:
"""Populate a global set of all known events for the purpose of validating Event names"""
+ global all_pmus
global all_events
all_events = {
"context\\-switches",
@@ -26,6 +28,8 @@ def LoadEvents(directory: str) -> None:
if filename.endswith(".json"):
try:
for x in json.load(open(f"{directory}/{filename}")):
+ if "Unit" in x:
+ all_pmus.add(x["Unit"])
if "EventName" in x:
all_events.add(x["EventName"])
elif "ArchStdEvent" in x:
@@ -36,6 +40,10 @@ def LoadEvents(directory: str) -> None:
pass
+def CheckPmu(name: str) -> bool:
+ return name in all_pmus
+
+
def CheckEvent(name: str) -> bool:
"""Check the event name exists in the set of all loaded events"""
global all_events
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 27/48] perf jevents: Add smi metric group for Intel models
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (25 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 26/48] perf jevents: Add CheckPmu to see if a PMU is in loaded json events Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 28/48] perf jevents: Mark metrics with experimental events as experimental Ian Rogers
` (21 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Allow the duplicated smi metrics to be dropped from the per model
json files.
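The smi_cycles expression leans on freeze_on_smi: with it set the
core cycles counter is frozen during System Management Mode while the
aperf MSR keeps counting, so their difference approximates cycles
spent handling SMIs. Restating the Select expression below (which also
guards against the aperf event being unavailable):
  smi_cycles = (aperf - cycles) / aperf if smi_num > 0 else 0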
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 0cb7a38ad238..94604b1b07d8 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -3,9 +3,9 @@
import argparse
import math
import os
-from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, Select)
+ MetricGroup, MetricRef, Select)
# Global command line arguments.
_args = None
@@ -56,6 +56,25 @@ def Rapl() -> MetricGroup:
description="Running Average Power Limit (RAPL) power consumption estimates")
+def Smi() -> MetricGroup:
+ pmu = "<cpu_core or cpu_atom>" if CheckPmu("cpu_core") else "cpu"
+ aperf = Event('msr/aperf/')
+ cycles = Event('cycles')
+ smi_num = Event('msr/smi/')
+ smi_cycles = Select(Select((aperf - cycles) / aperf, smi_num > 0, 0),
+ has_event(aperf),
+ 0)
+ return MetricGroup('smi', [
+ Metric('smi_num', 'Number of SMI interrupts.',
+ Select(smi_num, has_event(smi_num), 0), 'SMI#'),
+ # Note, the smi_cycles "Event" is really a reference to the metric.
+ Metric('smi_cycles',
+ 'Percentage of cycles spent in System Management Interrupts. '
+ f'Requires /sys/bus/event_source/devices/{pmu}/freeze_on_smi to be 1.',
+ smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
+ ], description='System Management Interrupt metrics')
+
+
def main() -> None:
global _args
@@ -83,6 +102,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
+ Smi(),
])
if _args.metricgroups:
--
2.52.0.158.g65b55ccf14-goog
* [PATCH v9 28/48] perf jevents: Mark metrics with experimental events as experimental
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (26 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 27/48] perf jevents: Add smi metric group for Intel models Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 29/48] perf jevents: Add tsx metric group for Intel models Ian Rogers
` (20 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
When metrics are made with experimental events it is desirable that
the metric description also carries this information in case of
metric inaccuracies.
Suggested-by: Perry Taylor <perry.taylor@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 44 +++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 62d1a1e1d458..2029b6e28365 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -10,11 +10,13 @@ from typing import Dict, List, Optional, Set, Tuple, Union
all_pmus = set()
all_events = set()
+experimental_events = set()
def LoadEvents(directory: str) -> None:
"""Populate a global set of all known events for the purpose of validating Event names"""
global all_pmus
global all_events
+ global experimental_events
all_events = {
"context\\-switches",
"cpu\\-cycles",
@@ -32,6 +34,8 @@ def LoadEvents(directory: str) -> None:
all_pmus.add(x["Unit"])
if "EventName" in x:
all_events.add(x["EventName"])
+ if "Experimental" in x and x["Experimental"] == "1":
+ experimental_events.add(x["EventName"])
elif "ArchStdEvent" in x:
all_events.add(x["ArchStdEvent"])
except json.decoder.JSONDecodeError:
@@ -61,6 +65,18 @@ def CheckEvent(name: str) -> bool:
return name in all_events
+def IsExperimentalEvent(name: str) -> bool:
+ global experimental_events
+ if ':' in name:
+ # Remove trailing modifier.
+ name = name[:name.find(':')]
+ elif '/' in name:
+ # Name could begin with a PMU or an event, for now assume it is not experimental.
+ return False
+
+ return name in experimental_events
+
+
class MetricConstraint(Enum):
GROUPED_EVENTS = 0
NO_GROUP_EVENTS = 1
@@ -82,6 +98,10 @@ class Expression:
"""Returns a simplified version of self."""
raise NotImplementedError()
+ def HasExperimentalEvents(self) -> bool:
+ """Are experimental events used in the expression?"""
+ raise NotImplementedError()
+
def Equals(self, other) -> bool:
"""Returns true when two expressions are the same."""
raise NotImplementedError()
@@ -249,6 +269,9 @@ class Operator(Expression):
return Operator(self.operator, lhs, rhs)
+ def HasExperimentalEvents(self) -> bool:
+ return self.lhs.HasExperimentalEvents() or self.rhs.HasExperimentalEvents()
+
def Equals(self, other: Expression) -> bool:
if isinstance(other, Operator):
return self.operator == other.operator and self.lhs.Equals(
@@ -297,6 +320,10 @@ class Select(Expression):
return Select(true_val, cond, false_val)
+ def HasExperimentalEvents(self) -> bool:
+ return (self.cond.HasExperimentalEvents() or self.true_val.HasExperimentalEvents() or
+ self.false_val.HasExperimentalEvents())
+
def Equals(self, other: Expression) -> bool:
if isinstance(other, Select):
return self.cond.Equals(other.cond) and self.false_val.Equals(
@@ -345,6 +372,9 @@ class Function(Expression):
return Function(self.fn, lhs, rhs)
+ def HasExperimentalEvents(self) -> bool:
+ return self.lhs.HasExperimentalEvents() or (self.rhs and self.rhs.HasExperimentalEvents())
+
def Equals(self, other: Expression) -> bool:
if isinstance(other, Function):
result = self.fn == other.fn and self.lhs.Equals(other.lhs)
@@ -384,6 +414,9 @@ class Event(Expression):
global all_events
raise Exception(f"No event {error} in:\n{all_events}")
+ def HasExperimentalEvents(self) -> bool:
+ return IsExperimentalEvent(self.name)
+
def ToPerfJson(self):
result = re.sub('/', '@', self.name)
return result
@@ -416,6 +449,9 @@ class MetricRef(Expression):
def Simplify(self) -> Expression:
return self
+ def HasExperimentalEvents(self) -> bool:
+ return False
+
def Equals(self, other: Expression) -> bool:
return isinstance(other, MetricRef) and self.name == other.name
@@ -443,6 +479,9 @@ class Constant(Expression):
def Simplify(self) -> Expression:
return self
+ def HasExperimentalEvents(self) -> bool:
+ return False
+
def Equals(self, other: Expression) -> bool:
return isinstance(other, Constant) and self.value == other.value
@@ -465,6 +504,9 @@ class Literal(Expression):
def Simplify(self) -> Expression:
return self
+ def HasExperimentalEvents(self) -> bool:
+ return False
+
def Equals(self, other: Expression) -> bool:
return isinstance(other, Literal) and self.value == other.value
@@ -527,6 +569,8 @@ class Metric:
self.name = name
self.description = description
self.expr = expr.Simplify()
+ if self.expr.HasExperimentalEvents():
+ self.description += " (metric should be considered experimental as it contains experimental events)."
# Workaround valid_only_metric hiding certain metrics based on unit.
scale_unit = scale_unit.replace('/sec', ' per sec')
if scale_unit[0].isdigit():
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 29/48] perf jevents: Add tsx metric group for Intel models
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (27 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 28/48] perf jevents: Mark metrics with experimental events as experimental Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 30/48] perf jevents: Add br metric group for branch statistics on Intel Ian Rogers
` (19 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Allow the duplicated tsx metrics to be dropped from the per model
json files. Detect when TSX is supported by a model by using the json
events, but use sysfs events at runtime as hypervisors, etc. may
disable TSX.
Use CheckPmu to determine which PMUs have been associated with the
loaded events.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 50 ++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 94604b1b07d8..05f3d94ec5d5 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -3,6 +3,7 @@
import argparse
import math
import os
+from typing import Optional
from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
MetricGroup, MetricRef, Select)
@@ -75,6 +76,54 @@ def Smi() -> MetricGroup:
], description='System Management Interrupt metrics')
+def Tsx() -> Optional[MetricGroup]:
+ pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
+ cycles = Event('cycles')
+ cycles_in_tx = Event(f'{pmu}/cycles\\-t/')
+ cycles_in_tx_cp = Event(f'{pmu}/cycles\\-ct/')
+ try:
+ # Test if the tsx event is present in the json, prefer the
+ # sysfs version so that we can detect its presence at runtime.
+ transaction_start = Event("RTM_RETIRED.START")
+ transaction_start = Event(f'{pmu}/tx\\-start/')
+ except:
+ return None
+
+ elision_start = None
+ try:
+ # Elision start isn't supported by all models, but we'll not
+ # generate the tsx_cycles_per_elision metric in that
+ # case. Again, prefer the sysfs encoding of the event.
+ elision_start = Event("HLE_RETIRED.START")
+ elision_start = Event(f'{pmu}/el\\-start/')
+ except:
+ pass
+
+ return MetricGroup('transaction', [
+ Metric('tsx_transactional_cycles',
+ 'Percentage of cycles within a transaction region.',
+ Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
+ '100%'),
+ Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
+ Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
+ has_event(cycles_in_tx),
+ 0),
+ '100%'),
+ Metric('tsx_cycles_per_transaction',
+ 'Number of cycles within a transaction divided by the number of transactions.',
+ Select(cycles_in_tx / transaction_start,
+ has_event(cycles_in_tx),
+ 0),
+ "cycles / transaction"),
+ Metric('tsx_cycles_per_elision',
+ 'Number of cycles within a transaction divided by the number of elisions.',
+ Select(cycles_in_tx / elision_start,
+ has_event(elision_start),
+ 0),
+ "cycles / elision") if elision_start else None,
+ ], description="Breakdown of transactional memory statistics")
+
+
def main() -> None:
global _args
@@ -103,6 +152,7 @@ def main() -> None:
Idle(),
Rapl(),
Smi(),
+ Tsx(),
])
if _args.metricgroups:
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 30/48] perf jevents: Add br metric group for branch statistics on Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (28 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 29/48] perf jevents: Add tsx metric group for Intel models Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 31/48] perf jevents: Add software prefetch (swpf) metric group for Intel Ian Rogers
` (18 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The br metric group comprises sub-groups for total, taken, conditional,
fused and far branches, built from json events. Conditional taken and
not taken metrics are specific to Icelake and later generations, so the
presence of the event is used to determine whether the metrics should exist.
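The per-generation detection relies on Event() raising when none of its candidate names exists in the model's json; the pattern can be sketched with a plain set standing in for the parsed json events (the set and lookup helper here are only illustrative):
  # Hypothetical stand-in for the events parsed from a model's json.
  known_events = {"BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.COND_NTAKEN"}

  def lookup(*candidates):
      # Like Event(a, b, ...): return the first known name, else raise.
      for name in candidates:
          if name in known_events:
              return name
      raise LookupError(f"no event in {candidates}")

  try:
      cond_nt = lookup("BR_INST_RETIRED.COND_NTAKEN")  # Icelake and later
  except LookupError:
      cond_nt = None  # older models: skip the not taken metrics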
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 138 +++++++++++++++++++++++++
1 file changed, 138 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 05f3d94ec5d5..e1944d821248 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -124,6 +124,143 @@ def Tsx() -> Optional[MetricGroup]:
], description="Breakdown of transactional memory statistics")
+def IntelBr() -> MetricGroup:
+ ins = Event("instructions")
+
+ def Total() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
+ "BR_INST_RETIRED.MISPRED",
+ "BR_MISP_EXEC.ANY")
+ br_clr = None
+ try:
+ br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
+ except:
+ pass
+
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_all, br_all)
+ clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
+
+ return MetricGroup("lpm_br_total", [
+ Metric("lpm_br_total_retired",
+ "The number of branch instructions retired per second.", br_r,
+ "insn/s"),
+ Metric(
+ "lpm_br_total_mispred",
+ "The number of branch instructions retired, of any type, that were "
+ "not correctly predicted as a percentage of all branch instrucions.",
+ misp_r, "100%"),
+ Metric("lpm_br_total_insn_between_branches",
+ "The number of instructions divided by the number of branches.",
+ ins_r, "insn"),
+ Metric("lpm_br_total_insn_fe_resteers",
+ "The number of resync branches per second.", clr_r, "req/s"
+ ) if clr_r else None
+ ])
+
+ def Taken() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_tk = None
+ try:
+ br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
+ "BR_MISP_RETIRED.TAKEN_JCC",
+ "BR_INST_RETIRED.MISPRED_TAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
+ return MetricGroup("lpm_br_taken", [
+ Metric("lpm_br_taken_retired",
+ "The number of taken branches that were retired per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_taken_mispred",
+ "The number of retired taken branch instructions that were "
+ "mispredicted as a percentage of all taken branches.", misp_r,
+ "100%") if misp_r else None,
+ Metric(
+ "lpm_br_taken_insn_between_branches",
+ "The number of instructions divided by the number of taken branches.",
+ ins_r, "insn"),
+ ])
+
+ def Conditional() -> Optional[MetricGroup]:
+ try:
+ br_cond = Event("BR_INST_RETIRED.COND",
+ "BR_INST_RETIRED.CONDITIONAL",
+ "BR_INST_RETIRED.TAKEN_JCC")
+ br_m_cond = Event("BR_MISP_RETIRED.COND",
+ "BR_MISP_RETIRED.CONDITIONAL",
+ "BR_MISP_RETIRED.TAKEN_JCC")
+ except:
+ return None
+
+ br_cond_nt = None
+ br_m_cond_nt = None
+ try:
+ br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
+ br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_cond, interval_sec)
+ ins_r = d_ratio(ins, br_cond)
+ misp_r = d_ratio(br_m_cond, br_cond)
+ taken_metrics = [
+ Metric("lpm_br_cond_retired", "Retired conditional branch instructions.",
+ br_r, "insn/s"),
+ Metric("lpm_br_cond_insn_between_branches",
+ "The number of instructions divided by the number of conditional "
+ "branches.", ins_r, "insn"),
+ Metric("lpm_br_cond_mispred",
+ "Retired conditional branch instructions mispredicted as a "
+ "percentage of all conditional branches.", misp_r, "100%"),
+ ]
+ if not br_m_cond_nt:
+ return MetricGroup("lpm_br_cond", taken_metrics)
+
+ br_r = d_ratio(br_cond_nt, interval_sec)
+ ins_r = d_ratio(ins, br_cond_nt)
+ misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
+
+ not_taken_metrics = [
+ Metric("lpm_br_cond_retired", "Retired conditional not taken branch instructions.",
+ br_r, "insn/s"),
+ Metric("lpm_br_cond_insn_between_branches",
+ "The number of instructions divided by the number of not taken conditional "
+ "branches.", ins_r, "insn"),
+ Metric("lpm_br_cond_mispred",
+ "Retired not taken conditional branch instructions mispredicted as a "
+ "percentage of all not taken conditional branches.", misp_r, "100%"),
+ ]
+ return MetricGroup("lpm_br_cond", [
+ MetricGroup("lpm_br_cond_nt", not_taken_metrics),
+ MetricGroup("lpm_br_cond_tkn", taken_metrics),
+ ])
+
+ def Far() -> Optional[MetricGroup]:
+ try:
+ br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
+ except:
+ return None
+
+ br_r = d_ratio(br_far, interval_sec)
+ ins_r = d_ratio(ins, br_far)
+ return MetricGroup("lpm_br_far", [
+ Metric("lpm_br_far_retired", "Retired far control transfers per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_far_insn_between_branches",
+ "The number of instructions divided by the number of far branches.",
+ ins_r, "insn"),
+ ])
+
+ return MetricGroup("lpm_br", [Total(), Taken(), Conditional(), Far()],
+ description="breakdown of retired branch instructions")
+
+
def main() -> None:
global _args
@@ -153,6 +290,7 @@ def main() -> None:
Rapl(),
Smi(),
Tsx(),
+ IntelBr(),
])
if _args.metricgroups:
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 31/48] perf jevents: Add software prefetch (swpf) metric group for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (29 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 30/48] perf jevents: Add br metric group for branch statistics on Intel Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 32/48] perf jevents: Add ports metric group giving utilization on Intel Ian Rogers
` (17 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that break down software prefetch instruction use.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 66 ++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index e1944d821248..919a058c343a 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -261,6 +261,71 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelSwpf() -> Optional[MetricGroup]:
+ ins = Event("instructions")
+ try:
+ s_ld = Event("MEM_INST_RETIRED.ALL_LOADS",
+ "MEM_UOPS_RETIRED.ALL_LOADS")
+ s_nta = Event("SW_PREFETCH_ACCESS.NTA")
+ s_t0 = Event("SW_PREFETCH_ACCESS.T0")
+ s_t1 = Event("SW_PREFETCH_ACCESS.T1_T2")
+ s_w = Event("SW_PREFETCH_ACCESS.PREFETCHW")
+ except:
+ return None
+
+ all_sw = s_nta + s_t0 + s_t1 + s_w
+ swp_r = d_ratio(all_sw, interval_sec)
+ ins_r = d_ratio(ins, all_sw)
+ ld_r = d_ratio(s_ld, all_sw)
+
+ return MetricGroup("lpm_swpf", [
+ MetricGroup("lpm_swpf_totals", [
+ Metric("lpm_swpf_totals_exec", "Software prefetch instructions per second",
+ swp_r, "swpf/s"),
+ Metric("lpm_swpf_totals_insn_per_pf",
+ "Average number of instructions between software prefetches",
+ ins_r, "insn/swpf"),
+ Metric("lpm_swpf_totals_loads_per_pf",
+ "Average number of loads between software prefetches",
+ ld_r, "loads/swpf"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn", [
+ MetricGroup("lpm_swpf_bkdwn_nta", [
+ Metric("lpm_swpf_bkdwn_nta_per_swpf",
+ "Software prefetch NTA instructions as a percent of all prefetch instructions",
+ d_ratio(s_nta, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_nta_rate",
+ "Software prefetch NTA instructions per second",
+ d_ratio(s_nta, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn_t0", [
+ Metric("lpm_swpf_bkdwn_t0_per_swpf",
+ "Software prefetch T0 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t0, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_t0_rate",
+ "Software prefetch T0 instructions per second",
+ d_ratio(s_t0, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn_t1_t2", [
+ Metric("lpm_swpf_bkdwn_t1_t2_per_swpf",
+ "Software prefetch T1 or T2 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t1, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_t1_t2_rate",
+ "Software prefetch T1 or T2 instructions per second",
+ d_ratio(s_t1, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn_w", [
+ Metric("lpm_swpf_bkdwn_w_per_swpf",
+ "Software prefetch W instructions as a percent of all prefetch instructions",
+ d_ratio(s_w, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_w_rate",
+ "Software prefetch W instructions per second",
+ d_ratio(s_w, interval_sec), "insn/s"),
+ ]),
+ ]),
+ ], description="Software prefetch instruction breakdown")
+
+
def main() -> None:
global _args
@@ -291,6 +356,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelSwpf(),
])
if _args.metricgroups:
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 32/48] perf jevents: Add ports metric group giving utilization on Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (30 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 31/48] perf jevents: Add software prefetch (swpf) metric group for Intel Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 33/48] perf jevents: Add L2 metrics for Intel Ian Rogers
` (16 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The ports metric group contains a metric for each port giving its
utilization as a ratio of cycles. The metrics are created by looking
for UOPS_DISPATCHED.PORT events.
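Each generated metric is the port's dispatch count over the (SMT-scaled) core cycle count; a small sketch with hypothetical counter values:
  # Hypothetical counts over one interval.
  uops_dispatched_port_0 = 3_000_000_000
  core_cycles = 8_000_000_000
  smt_on = True

  # With SMT on both hyperthreads feed the same ports, so halve the
  # cycle count to keep the per-port ratio within 0..1.
  smt_cycles = core_cycles / 2 if smt_on else core_cycles
  port_0_utilization = uops_dispatched_port_0 / smt_cycles
  print(f"{port_0_utilization:.2f}")  # 0.75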
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 35 ++++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 919a058c343a..7fcc0a1c544d 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,12 +1,14 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
+import json
import math
import os
+import re
from typing import Optional
from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, MetricRef, Select)
+ JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
+ Metric, MetricGroup, MetricRef, Select)
# Global command line arguments.
_args = None
@@ -261,6 +263,34 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelPorts() -> Optional[MetricGroup]:
+ pipeline_events = json.load(
+ open(f"{_args.events_path}/x86/{_args.model}/pipeline.json"))
+
+ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
+ "CPU_CLK_UNHALTED.DISTRIBUTED",
+ "cycles")
+ # Number of CPU cycles scaled for SMT.
+ smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
+
+ metrics = []
+ for x in pipeline_events:
+ if "EventName" in x and re.search("^UOPS_DISPATCHED.PORT", x["EventName"]):
+ name = x["EventName"]
+ port = re.search(r"(PORT_[0-9].*)", name).group(0).lower()
+ if name.endswith("_CORE"):
+ cyc = core_cycles
+ else:
+ cyc = smt_cycles
+ metrics.append(Metric(f"lpm_{port}", f"{port} utilization (higher is better)",
+ d_ratio(Event(name), cyc), "100%"))
+ if len(metrics) == 0:
+ return None
+
+ return MetricGroup("lpm_ports", metrics, "functional unit (port) utilization -- "
+ "fraction of cycles each port is utilized (higher is better)")
+
+
def IntelSwpf() -> Optional[MetricGroup]:
ins = Event("instructions")
try:
@@ -356,6 +386,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelPorts(),
IntelSwpf(),
])
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 33/48] perf jevents: Add L2 metrics for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (31 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 32/48] perf jevents: Add ports metric group giving utilization on Intel Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 34/48] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
` (15 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Give a breakdown of various L2 counters as metrics, including totals,
reads, hardware prefetcher, RFO, code and evictions.
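As a sketch of the demand read ratios with hypothetical counts, the miss count is recovered from ALL_DEMAND_DATA_RD when the dedicated miss event is absent:
  # Hypothetical L2_RQSTS demand data read counts.
  demand_rd_hit = 900_000      # L2_RQSTS.DEMAND_DATA_RD_HIT
  demand_rd_all = 1_200_000    # L2_RQSTS.ALL_DEMAND_DATA_RD
  demand_rd_miss = demand_rd_all - demand_rd_hit  # no *_MISS event

  hit_ratio = demand_rd_hit / demand_rd_all    # 0.75
  miss_ratio = demand_rd_miss / demand_rd_all  # 0.25
  assert abs(hit_ratio + miss_ratio - 1.0) < 1e-9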
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 170 +++++++++++++++++++++++++
1 file changed, 170 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 7fcc0a1c544d..d190d97f4aff 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -263,6 +263,175 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelL2() -> Optional[MetricGroup]:
+ try:
+ DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
+ except:
+ return None
+ try:
+ DC_MISS = Event("L2_RQSTS.DEMAND_DATA_RD_MISS")
+ l2_dmnd_miss = DC_MISS
+ l2_dmnd_rd_all = DC_MISS + DC_HIT
+ except:
+ DC_ALL = Event("L2_RQSTS.ALL_DEMAND_DATA_RD")
+ l2_dmnd_miss = DC_ALL - DC_HIT
+ l2_dmnd_rd_all = DC_ALL
+ l2_dmnd_mrate = d_ratio(l2_dmnd_miss, interval_sec)
+ l2_dmnd_rrate = d_ratio(l2_dmnd_rd_all, interval_sec)
+
+ DC_PFH = None
+ DC_PFM = None
+ l2_pf_all = None
+ l2_pf_mrate = None
+ l2_pf_rrate = None
+ try:
+ DC_PFH = Event("L2_RQSTS.PF_HIT")
+ DC_PFM = Event("L2_RQSTS.PF_MISS")
+ l2_pf_all = DC_PFH + DC_PFM
+ l2_pf_mrate = d_ratio(DC_PFM, interval_sec)
+ l2_pf_rrate = d_ratio(l2_pf_all, interval_sec)
+ except:
+ pass
+
+ DC_RFOH = None
+ DC_RFOM = None
+ l2_rfo_all = None
+ l2_rfo_mrate = None
+ l2_rfo_rrate = None
+ try:
+ DC_RFOH = Event("L2_RQSTS.RFO_HIT")
+ DC_RFOM = Event("L2_RQSTS.RFO_MISS")
+ l2_rfo_all = DC_RFOH + DC_RFOM
+ l2_rfo_mrate = d_ratio(DC_RFOM, interval_sec)
+ l2_rfo_rrate = d_ratio(l2_rfo_all, interval_sec)
+ except:
+ pass
+
+ DC_CH = None
+ try:
+ DC_CH = Event("L2_RQSTS.CODE_RD_HIT")
+ except:
+ pass
+ DC_CM = Event("L2_RQSTS.CODE_RD_MISS")
+ DC_IN = Event("L2_LINES_IN.ALL")
+ DC_WB_U = None
+ DC_WB_D = None
+ wbu = None
+ wbd = None
+ try:
+ DC_WB_U = Event("IDI_MISC.WB_UPGRADE")
+ DC_WB_D = Event("IDI_MISC.WB_DOWNGRADE")
+ wbu = d_ratio(DC_WB_U, interval_sec)
+ wbd = d_ratio(DC_WB_D, interval_sec)
+ except:
+ pass
+ DC_OUT_NS = None
+ DC_OUT_S = None
+ l2_lines_out = None
+ l2_out_rate = None
+ wbn = None
+ isd = None
+ try:
+ DC_OUT_NS = Event("L2_LINES_OUT.NON_SILENT",
+ "L2_LINES_OUT.DEMAND_DIRTY",
+ "L2_LINES_IN.S")
+ DC_OUT_S = Event("L2_LINES_OUT.SILENT",
+ "L2_LINES_OUT.DEMAND_CLEAN",
+ "L2_LINES_IN.I")
+ if DC_OUT_S.name == "L2_LINES_OUT.SILENT" and (
+ _args.model.startswith("skylake") or
+ _args.model == "cascadelakex"):
+ DC_OUT_S.name = "L2_LINES_OUT.SILENT/any/"
+ # bring it back to per-CPU
+ l2_s = Select(DC_OUT_S / 2, Literal("#smt_on"), DC_OUT_S)
+ l2_ns = DC_OUT_NS
+ l2_lines_out = l2_s + l2_ns
+ l2_out_rate = d_ratio(l2_lines_out, interval_sec)
+ # DC_WB_U/DC_WB_D are looked up above so they are defined here; if
+ # either is missing the subtraction raises and wbn/isd stay None.
+ nlr = max(l2_ns - DC_WB_U - DC_WB_D, 0)
+ wbn = d_ratio(nlr, interval_sec)
+ isd = d_ratio(l2_s, interval_sec)
+ except:
+ pass
+ DC_OUT_U = None
+ l2_pf_useless = None
+ l2_useless_rate = None
+ try:
+ DC_OUT_U = Event("L2_LINES_OUT.USELESS_HWPF")
+ l2_pf_useless = DC_OUT_U
+ l2_useless_rate = d_ratio(l2_pf_useless, interval_sec)
+ except:
+ pass
+
+ l2_lines_in = DC_IN
+ l2_code_all = (DC_CH + DC_CM) if DC_CH else None
+ l2_code_rate = d_ratio(l2_code_all, interval_sec) if DC_CH else None
+ l2_code_miss_rate = d_ratio(DC_CM, interval_sec)
+ l2_in_rate = d_ratio(l2_lines_in, interval_sec)
+
+ return MetricGroup("lpm_l2", [
+ MetricGroup("lpm_l2_totals", [
+ Metric("lpm_l2_totals_in", "L2 cache total in per second",
+ l2_in_rate, "In/s"),
+ Metric("lpm_l2_totals_out", "L2 cache total out per second",
+ l2_out_rate, "Out/s") if l2_out_rate else None,
+ ]),
+ MetricGroup("lpm_l2_rd", [
+ Metric("lpm_l2_rd_hits", "L2 cache data read hits",
+ d_ratio(DC_HIT, l2_dmnd_rd_all), "100%"),
+ Metric("lpm_l2_rd_hits", "L2 cache data read hits",
+ d_ratio(l2_dmnd_miss, l2_dmnd_rd_all), "100%"),
+ Metric("lpm_l2_rd_requests", "L2 cache data read requests per second",
+ l2_dmnd_rrate, "requests/s"),
+ Metric("lpm_l2_rd_misses", "L2 cache data read misses per second",
+ l2_dmnd_mrate, "misses/s"),
+ ]),
+ MetricGroup("lpm_l2_hwpf", [
+ Metric("lpm_l2_hwpf_hits", "L2 cache hardware prefetcher hits",
+ d_ratio(DC_PFH, l2_pf_all), "100%"),
+ Metric("lpm_l2_hwpf_misses", "L2 cache hardware prefetcher misses",
+ d_ratio(DC_PFM, l2_pf_all), "100%"),
+ Metric("lpm_l2_hwpf_useless", "L2 cache hardware prefetcher useless prefetches per second",
+ l2_useless_rate, "100%") if l2_useless_rate else None,
+ Metric("lpm_l2_hwpf_requests", "L2 cache hardware prefetcher requests per second",
+ l2_pf_rrate, "100%"),
+ Metric("lpm_l2_hwpf_misses", "L2 cache hardware prefetcher misses per second",
+ l2_pf_mrate, "100%"),
+ ]) if DC_PFH else None,
+ MetricGroup("lpm_l2_rfo", [
+ Metric("lpm_l2_rfo_hits", "L2 cache request for ownership (RFO) hits",
+ d_ratio(DC_RFOH, l2_rfo_all), "100%"),
+ Metric("lpm_l2_rfo_misses", "L2 cache request for ownership (RFO) misses",
+ d_ratio(DC_RFOM, l2_rfo_all), "100%"),
+ Metric("lpm_l2_rfo_requests", "L2 cache request for ownership (RFO) requests per second",
+ l2_rfo_rrate, "requests/s"),
+ Metric("lpm_l2_rfo_misses", "L2 cache request for ownership (RFO) misses per second",
+ l2_rfo_mrate, "misses/s"),
+ ]) if DC_RFOH else None,
+ MetricGroup("lpm_l2_code", [
+ Metric("lpm_l2_code_hits", "L2 cache code hits",
+ d_ratio(DC_CH, l2_code_all), "100%") if DC_CH else None,
+ Metric("lpm_l2_code_misses", "L2 cache code misses",
+ d_ratio(DC_CM, l2_code_all), "100%") if DC_CH else None,
+ Metric("lpm_l2_code_requests", "L2 cache code requests per second",
+ l2_code_rate, "requests/s") if DC_CH else None,
+ Metric("lpm_l2_code_misses", "L2 cache code misses per second",
+ l2_code_miss_rate, "misses/s"),
+ ]),
+ MetricGroup("lpm_l2_evict", [
+ MetricGroup("lpm_l2_evict_mef_lines", [
+ Metric("lpm_l2_evict_mef_lines_l3_hot_lru", "L2 evictions M/E/F lines L3 hot LRU per second",
+ wbu, "HotLRU/s") if wbu else None,
+ Metric("lpm_l2_evict_mef_lines_l3_norm_lru", "L2 evictions M/E/F lines L3 normal LRU per second",
+ wbn, "NormLRU/s") if wbn else None,
+ Metric("lpm_l2_evict_mef_lines_dropped", "L2 evictions M/E/F lines dropped per second",
+ wbd, "dropped/s") if wbd else None,
+ Metric("lpm_l2_evict_is_lines_dropped", "L2 evictions I/S lines dropped per second",
+ isd, "dropped/s") if isd else None,
+ ]),
+ ]),
+ ], description="L2 data cache analysis")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(
open(f"{_args.events_path}/x86/{_args.model}/pipeline.json"))
@@ -386,6 +555,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelL2(),
IntelPorts(),
IntelSwpf(),
])
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 34/48] perf jevents: Add load store breakdown metrics ldst for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (32 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 33/48] perf jevents: Add L2 metrics for Intel Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 35/48] perf jevents: Add ILP metrics " Ian Rogers
` (14 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Give a breakdown of the number of load and store instructions. Use the
counter mask (cmask) to show the number of cycles taken to retire the
instructions.
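An event with cmask=N counts cycles in which at least N of these instructions retired, so subtracting adjacent thresholds isolates the cycles retiring exactly N; a sketch with hypothetical counts:
  # Hypothetical cycle counts for MEM_INST_RETIRED.ALL_LOADS with cmask=1..3.
  loads_cmask = {1: 500_000, 2: 300_000, 3: 100_000}
  cycles = 2_000_000

  exactly_1 = max(loads_cmask[1] - loads_cmask[2], 0) / cycles  # 0.1
  exactly_2 = max(loads_cmask[2] - loads_cmask[3], 0) / cycles  # 0.1
  three_plus = loads_cmask[3] / cycles                          # 0.05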
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 87 +++++++++++++++++++++++++-
1 file changed, 86 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index d190d97f4aff..19a284b4c520 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -8,7 +8,7 @@ import re
from typing import Optional
from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
- Metric, MetricGroup, MetricRef, Select)
+ Metric, MetricConstraint, MetricGroup, MetricRef, Select)
# Global command line arguments.
_args = None
@@ -525,6 +525,90 @@ def IntelSwpf() -> Optional[MetricGroup]:
], description="Software prefetch instruction breakdown")
+def IntelLdSt() -> Optional[MetricGroup]:
+ if _args.model in [
+ "bonnell",
+ "nehalemep",
+ "nehalemex",
+ "westmereep-dp",
+ "westmereep-sp",
+ "westmereex",
+ ]:
+ return None
+ LDST_LD = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ LDST_ST = Event("MEM_INST_RETIRED.ALL_STORES",
+ "MEM_UOPS_RETIRED.ALL_STORES")
+ LDST_LDC1 = Event(f"{LDST_LD.name}/cmask=1/")
+ LDST_STC1 = Event(f"{LDST_ST.name}/cmask=1/")
+ LDST_LDC2 = Event(f"{LDST_LD.name}/cmask=2/")
+ LDST_STC2 = Event(f"{LDST_ST.name}/cmask=2/")
+ LDST_LDC3 = Event(f"{LDST_LD.name}/cmask=3/")
+ LDST_STC3 = Event(f"{LDST_ST.name}/cmask=3/")
+ ins = Event("instructions")
+ LDST_CYC = Event("CPU_CLK_UNHALTED.THREAD",
+ "CPU_CLK_UNHALTED.CORE_P",
+ "CPU_CLK_UNHALTED.THREAD_P")
+ LDST_PRE = None
+ try:
+ LDST_PRE = Event("LOAD_HIT_PREFETCH.SWPF", "LOAD_HIT_PRE.SW_PF")
+ except:
+ pass
+ LDST_AT = None
+ try:
+ LDST_AT = Event("MEM_INST_RETIRED.LOCK_LOADS")
+ except:
+ pass
+ cyc = LDST_CYC
+
+ ld_rate = d_ratio(LDST_LD, interval_sec)
+ st_rate = d_ratio(LDST_ST, interval_sec)
+ pf_rate = d_ratio(LDST_PRE, interval_sec) if LDST_PRE else None
+ at_rate = d_ratio(LDST_AT, interval_sec) if LDST_AT else None
+
+ ldst_ret_constraint = MetricConstraint.GROUPED_EVENTS
+ if LDST_LD.name == "MEM_UOPS_RETIRED.ALL_LOADS":
+ ldst_ret_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+
+ return MetricGroup("lpm_ldst", [
+ MetricGroup("lpm_ldst_total", [
+ Metric("lpm_ldst_total_loads", "Load/store instructions total loads",
+ ld_rate, "loads"),
+ Metric("lpm_ldst_total_stores", "Load/store instructions total stores",
+ st_rate, "stores"),
+ ]),
+ MetricGroup("lpm_ldst_prcnt", [
+ Metric("lpm_ldst_prcnt_loads", "Percent of all instructions that are loads",
+ d_ratio(LDST_LD, ins), "100%"),
+ Metric("lpm_ldst_prcnt_stores", "Percent of all instructions that are stores",
+ d_ratio(LDST_ST, ins), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_lds", [
+ Metric("lpm_ldst_ret_lds_1", "Retired loads in 1 cycle",
+ d_ratio(max(LDST_LDC1 - LDST_LDC2, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_lds_2", "Retired loads in 2 cycles",
+ d_ratio(max(LDST_LDC2 - LDST_LDC3, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_lds_3", "Retired loads in 3 or more cycles",
+ d_ratio(LDST_LDC3, cyc), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_sts", [
+ Metric("lpm_ldst_ret_sts_1", "Retired stores in 1 cycle",
+ d_ratio(max(LDST_STC1 - LDST_STC2, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_sts_2", "Retired stores in 2 cycles",
+ d_ratio(max(LDST_STC2 - LDST_STC3, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_sts_3", "Retired stores in 3 more cycles",
+ d_ratio(LDST_STC3, cyc), "100%"),
+ ]),
+ Metric("lpm_ldst_ld_hit_swpf", "Load hit software prefetches per second",
+ pf_rate, "swpf/s") if pf_rate else None,
+ Metric("lpm_ldst_atomic_lds", "Atomic loads per second",
+ at_rate, "loads/s") if at_rate else None,
+ ], description="Breakdown of load/store instructions")
+
+
def main() -> None:
global _args
@@ -556,6 +640,7 @@ def main() -> None:
Tsx(),
IntelBr(),
IntelL2(),
+ IntelLdSt(),
IntelPorts(),
IntelSwpf(),
])
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 35/48] perf jevents: Add ILP metrics for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (33 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 34/48] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 36/48] perf jevents: Add context switch " Ian Rogers
` (13 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Use the counter mask (cmask) to see how many cycles an instruction
takes to retire. Present as a set of ILP metrics.
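The same cmask differencing is applied to INST_RETIRED.ANY_P for thresholds 1 to 5, with the 0-instruction bucket taken as whatever fraction of cycles remains; a sketch with hypothetical counts:
  # Hypothetical cycle counts for INST_RETIRED.ANY_P with cmask=1..5.
  inst_ret_c = {1: 900, 2: 700, 3: 400, 4: 200, 5: 100}
  core_cycles = 1000

  ilp = [max(inst_ret_c[n] - inst_ret_c[n + 1], 0) / core_cycles
         for n in range(1, 5)]
  ilp.append(inst_ret_c[5] / core_cycles)  # 5 or more retired
  ilp0 = 1 - sum(ilp)                      # cycles retiring nothing
  print(ilp0, ilp)  # ~0.1 [0.2, 0.3, 0.2, 0.1, 0.1]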
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 40 ++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 19a284b4c520..bc3c50285916 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -263,6 +263,45 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelIlp() -> MetricGroup:
+ tsc = Event("msr/tsc/")
+ c0 = Event("msr/mperf/")
+ low = tsc - c0
+ inst_ret = Event("INST_RETIRED.ANY_P")
+ inst_ret_c = [Event(f"{inst_ret.name}/cmask={x}/") for x in range(1, 6)]
+ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
+ "CPU_CLK_UNHALTED.DISTRIBUTED",
+ "cycles")
+ ilp = [d_ratio(max(inst_ret_c[x] - inst_ret_c[x + 1], 0), core_cycles)
+ for x in range(0, 4)]
+ ilp.append(d_ratio(inst_ret_c[4], core_cycles))
+ ilp0 = 1
+ for x in ilp:
+ ilp0 -= x
+ return MetricGroup("lpm_ilp", [
+ Metric("lpm_ilp_idle", "Lower power cycles as a percentage of all cycles",
+ d_ratio(low, tsc), "100%"),
+ Metric("lpm_ilp_inst_ret_0",
+ "Instructions retired in 0 cycles as a percentage of all cycles",
+ ilp0, "100%"),
+ Metric("lpm_ilp_inst_ret_1",
+ "Instructions retired in 1 cycles as a percentage of all cycles",
+ ilp[0], "100%"),
+ Metric("lpm_ilp_inst_ret_2",
+ "Instructions retired in 2 cycles as a percentage of all cycles",
+ ilp[1], "100%"),
+ Metric("lpm_ilp_inst_ret_3",
+ "Instructions retired in 3 cycles as a percentage of all cycles",
+ ilp[2], "100%"),
+ Metric("lpm_ilp_inst_ret_4",
+ "Instructions retired in 4 cycles as a percentage of all cycles",
+ ilp[3], "100%"),
+ Metric("lpm_ilp_inst_ret_5",
+ "Instructions retired in 5 or more cycles as a percentage of all cycles",
+ ilp[4], "100%"),
+ ])
+
+
def IntelL2() -> Optional[MetricGroup]:
try:
DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
@@ -639,6 +678,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelIlp(),
IntelL2(),
IntelLdSt(),
IntelPorts(),
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 36/48] perf jevents: Add context switch metrics for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (34 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 35/48] perf jevents: Add ILP metrics " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 37/48] perf jevents: Add FPU " Ian Rogers
` (12 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that break down context switches, giving the rate of context
switches and the number of instructions, loads, stores, taken branches
and L2 misses per context switch.
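Each metric is just an event count divided by the context switch count over the same interval, e.g. with hypothetical counts:
  # Hypothetical counts over a 1 second interval.
  context_switches = 2_000
  instructions = 50_000_000
  interval_sec = 1.0

  cs_rate = context_switches / interval_sec       # 2000.0 ctxsw/s
  instr_per_cs = instructions / context_switches  # 25000.0 instr/cs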
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 58 ++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index bc3c50285916..9cf4bd8ac769 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -263,6 +263,63 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelCtxSw() -> MetricGroup:
+ cs = Event("context\\-switches")
+ metrics = [
+ Metric("lpm_cs_rate", "Context switches per second",
+ d_ratio(cs, interval_sec), "ctxsw/s")
+ ]
+
+ ev = Event("instructions")
+ metrics.append(Metric("lpm_cs_instr", "Instructions per context switch",
+ d_ratio(ev, cs), "instr/cs"))
+
+ ev = Event("cycles")
+ metrics.append(Metric("lpm_cs_cycles", "Cycles per context switch",
+ d_ratio(ev, cs), "cycles/cs"))
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ metrics.append(Metric("lpm_cs_loads", "Loads per context switch",
+ d_ratio(ev, cs), "loads/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_STORES",
+ "MEM_UOPS_RETIRED.ALL_STORES")
+ metrics.append(Metric("lpm_cs_stores", "Stores per context switch",
+ d_ratio(ev, cs), "stores/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("BR_INST_RETIRED.NEAR_TAKEN", "BR_INST_RETIRED.TAKEN_JCC")
+ metrics.append(Metric("lpm_cs_br_taken", "Branches taken per context switch",
+ d_ratio(ev, cs), "br_taken/cs"))
+ except:
+ pass
+
+ try:
+ l2_misses = (Event("L2_RQSTS.DEMAND_DATA_RD_MISS") +
+ Event("L2_RQSTS.RFO_MISS") +
+ Event("L2_RQSTS.CODE_RD_MISS"))
+ try:
+ l2_misses += Event("L2_RQSTS.HWPF_MISS",
+ "L2_RQSTS.L2_PF_MISS", "L2_RQSTS.PF_MISS")
+ except:
+ pass
+
+ metrics.append(Metric("lpm_cs_l2_misses", "L2 misses per context switch",
+ d_ratio(l2_misses, cs), "l2_misses/cs"))
+ except:
+ pass
+
+ return MetricGroup("lpm_cs", metrics,
+ description=("Number of context switches per second, instructions "
+ "retired & core cycles between context switches"))
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -678,6 +735,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelCtxSw(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 37/48] perf jevents: Add FPU metrics for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (35 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 36/48] perf jevents: Add context switch " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 38/48] perf jevents: Add Miss Level Parallelism (MLP) metric " Ian Rogers
` (11 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that break down floating point operations.
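Total FLOPs weight each packed event by its elements per operation (4 and 2 for 128-bit single/double, 8 and 4 for 256-bit, 16 and 8 for 512-bit); a sketch with hypothetical counts:
  # Hypothetical FP_ARITH_INST_RETIRED counts.
  scalar_single, scalar_double = 1_000, 500
  packed_128_single, packed_128_double = 100, 100
  packed_256_single, packed_256_double = 50, 50

  flop = (scalar_single + scalar_double +
          4 * packed_128_single + 2 * packed_128_double +
          8 * packed_256_single + 4 * packed_256_double)
  print(flop)  # 2700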
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 97 ++++++++++++++++++++++++++
1 file changed, 97 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 9cf4bd8ac769..77b8e10194db 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -320,6 +320,102 @@ def IntelCtxSw() -> MetricGroup:
"retired & core cycles between context switches"))
+def IntelFpu() -> Optional[MetricGroup]:
+ cyc = Event("cycles")
+ try:
+ s_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_SINGLE",
+ "SIMD_INST_RETIRED.SCALAR_SINGLE")
+ except:
+ return None
+ d_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_DOUBLE",
+ "SIMD_INST_RETIRED.SCALAR_DOUBLE")
+ s_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
+ "SIMD_INST_RETIRED.PACKED_SINGLE")
+
+ flop = s_64 + d_64 + 4 * s_128
+
+ d_128 = None
+ s_256 = None
+ d_256 = None
+ s_512 = None
+ d_512 = None
+ try:
+ d_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE")
+ flop += 2 * d_128
+ s_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE")
+ flop += 8 * s_256
+ d_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE")
+ flop += 4 * d_256
+ s_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE")
+ flop += 16 * s_512
+ d_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE")
+ flop += 8 * d_512
+ except:
+ pass
+
+ f_assist = Event("ASSISTS.FP", "FP_ASSIST.ANY", "FP_ASSIST.S")
+ if f_assist.name in [
+ "ASSISTS.FP",
+ "FP_ASSIST.S",
+ ]:
+ f_assist.name += "/cmask=1/"
+
+ flop_r = d_ratio(flop, interval_sec)
+ flop_c = d_ratio(flop, cyc)
+ nmi_constraint = MetricConstraint.GROUPED_EVENTS
+ if f_assist.name == "ASSISTS.FP": # Icelake+
+ nmi_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+
+ def FpuMetrics(group: str, fl: Optional[Event], mult: int, desc: str) -> Optional[MetricGroup]:
+ if not fl:
+ return None
+
+ f = fl * mult
+ fl_r = d_ratio(f, interval_sec)
+ r_s = d_ratio(fl, interval_sec)
+ return MetricGroup(group, [
+ Metric(f"{group}_of_total", desc + " floating point operations per second",
+ d_ratio(f, flop), "100%"),
+ Metric(f"{group}_flops", desc + " floating point operations per second",
+ fl_r, "flops/s"),
+ Metric(f"{group}_ops", desc + " operations per second",
+ r_s, "ops/s"),
+ ])
+
+ return MetricGroup("lpm_fpu", [
+ MetricGroup("lpm_fpu_total", [
+ Metric("lpm_fpu_total_flops", "Floating point operations per second",
+ flop_r, "flops/s"),
+ Metric("lpm_fpu_total_flopc", "Floating point operations per cycle",
+ flop_c, "flops/cycle", constraint=nmi_constraint),
+ ]),
+ MetricGroup("lpm_fpu_64", [
+ FpuMetrics("lpm_fpu_64_single", s_64, 1, "64-bit single"),
+ FpuMetrics("lpm_fpu_64_double", d_64, 1, "64-bit double"),
+ ]),
+ MetricGroup("lpm_fpu_128", [
+ FpuMetrics("lpm_fpu_128_single", s_128,
+ 4, "128-bit packed single"),
+ FpuMetrics("lpm_fpu_128_double", d_128,
+ 2, "128-bit packed double"),
+ ]),
+ MetricGroup("lpm_fpu_256", [
+ FpuMetrics("lpm_fpu_256_single", s_256,
+ 8, "128-bit packed single"),
+ FpuMetrics("lpm_fpu_256_double", d_256,
+ 4, "128-bit packed double"),
+ ]),
+ MetricGroup("lpm_fpu_512", [
+ FpuMetrics("lpm_fpu_512_single", s_512,
+ 16, "128-bit packed single"),
+ FpuMetrics("lpm_fpu_512_double", d_512,
+ 8, "128-bit packed double"),
+ ]),
+ Metric("lpm_fpu_assists", "FP assists as a percentage of cycles",
+ d_ratio(f_assist, cyc), "100%"),
+ ])
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -736,6 +832,7 @@ def main() -> None:
Tsx(),
IntelBr(),
IntelCtxSw(),
+ IntelFpu(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 38/48] perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (36 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 37/48] perf jevents: Add FPU " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 39/48] perf jevents: Add mem_bw " Ian Rogers
` (10 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add a metric giving the number of outstanding load misses per cycle.
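The metric is the pending miss occupancy divided by the cycles with at least one outstanding miss (halved when SMT is on, matching the Select in the code); a sketch with hypothetical counts:
  # Hypothetical L1D_PEND_MISS counts.
  pending = 6_000_000          # occupancy summed over cycles
  pending_cycles = 2_000_000   # cycles with at least one outstanding miss
  smt_on = False

  cycles = pending_cycles / 2 if smt_on else pending_cycles
  mlp = pending / cycles
  print(mlp)  # 3.0 outstanding load misses per cycle with a miss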
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 77b8e10194db..dddeae35e4b4 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -624,6 +624,20 @@ def IntelL2() -> Optional[MetricGroup]:
], description="L2 data cache analysis")
+def IntelMlp() -> Optional[Metric]:
+ try:
+ l1d = Event("L1D_PEND_MISS.PENDING")
+ l1dc = Event("L1D_PEND_MISS.PENDING_CYCLES")
+ except:
+ return None
+
+ l1dc = Select(l1dc / 2, Literal("#smt_on"), l1dc)
+ ml = d_ratio(l1d, l1dc)
+ return Metric("lpm_mlp",
+ "Miss level parallelism - number of outstanding load misses per cycle (higher is better)",
+ ml, "load_miss_pending/cycle")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(
open(f"{_args.events_path}/x86/{_args.model}/pipeline.json"))
@@ -836,6 +850,7 @@ def main() -> None:
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMlp(),
IntelPorts(),
IntelSwpf(),
])
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 39/48] perf jevents: Add mem_bw metric for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (37 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 38/48] perf jevents: Add Miss Level Parallelism (MLP) metric " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 40/48] perf jevents: Add local/remote "mem" breakdown metrics " Ian Rogers
` (9 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down memory bandwidth using uncore counters. For many models
this matches the memory_bandwidth_* metrics, but those metrics aren't
made available on all models. Add support for free running counters.
Query the event json when determining what events/counters are
available.
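Each CAS or free running read/write count corresponds to one 64 byte cache line, so bandwidth is the count scaled by 64 bytes over the interval; a sketch with hypothetical counts:
  # Hypothetical uncore counts over a 1 second interval.
  cas_read, cas_write = 40_000_000, 10_000_000
  interval_sec = 1.0

  scale = 64 / 1_000_000  # cache lines to MB
  read_mb_s = cas_read * scale / interval_sec    # 2560.0 MB/s
  write_mb_s = cas_write * scale / interval_sec  # 640.0 MB/s
  total_mb_s = read_mb_s + write_mb_s            # 3200.0 MB/s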
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 62 ++++++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index dddeae35e4b4..f671d6e4fd67 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,67 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreMemBw() -> Optional[MetricGroup]:
+ mem_events = []
+ try:
+ mem_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
+ f"/arch/x86/{args.model}/uncore-memory.json"))
+ except:
+ pass
+
+ ddr_rds = 0
+ ddr_wrs = 0
+ ddr_total = 0
+ for x in mem_events:
+ if "EventName" in x:
+ name = x["EventName"]
+ if re.search("^UNC_MC[0-9]+_RDCAS_COUNT_FREERUN", name):
+ ddr_rds += Event(name)
+ elif re.search("^UNC_MC[0-9]+_WRCAS_COUNT_FREERUN", name):
+ ddr_wrs += Event(name)
+ # elif re.search("^UNC_MC[0-9]+_TOTAL_REQCOUNT_FREERUN", name):
+ # ddr_total += Event(name)
+
+ if ddr_rds == 0:
+ try:
+ ddr_rds = Event("UNC_M_CAS_COUNT.RD")
+ ddr_wrs = Event("UNC_M_CAS_COUNT.WR")
+ except:
+ return None
+
+ ddr_total = ddr_rds + ddr_wrs
+
+ pmm_rds = 0
+ pmm_wrs = 0
+ try:
+ pmm_rds = Event("UNC_M_PMM_RPQ_INSERTS")
+ pmm_wrs = Event("UNC_M_PMM_WPQ_INSERTS")
+ except:
+ pass
+
+ pmm_total = pmm_rds + pmm_wrs
+
+ scale = 64 / 1_000_000
+ return MetricGroup("lpm_mem_bw", [
+ MetricGroup("lpm_mem_bw_ddr", [
+ Metric("lpm_mem_bw_ddr_read", "DDR memory read bandwidth",
+ d_ratio(ddr_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_ddr_write", "DDR memory write bandwidth",
+ d_ratio(ddr_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_ddr_total", "DDR memory write bandwidth",
+ d_ratio(ddr_total, interval_sec), f"{scale}MB/s"),
+ ], description="DDR Memory Bandwidth"),
+ MetricGroup("lpm_mem_bw_pmm", [
+ Metric("lpm_mem_bw_pmm_read", "PMM memory read bandwidth",
+ d_ratio(pmm_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_pmm_write", "PMM memory write bandwidth",
+ d_ratio(pmm_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_pmm_total", "PMM memory write bandwidth",
+ d_ratio(pmm_total, interval_sec), f"{scale}MB/s"),
+ ], description="PMM Memory Bandwidth") if pmm_rds != 0 else None,
+ ], description="Memory Bandwidth")
+
+
def main() -> None:
global _args
@@ -853,6 +914,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMemBw(),
])
if _args.metricgroups:
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 40/48] perf jevents: Add local/remote "mem" breakdown metrics for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (38 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 39/48] perf jevents: Add mem_bw " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 41/48] perf jevents: Add dir " Ian Rogers
` (8 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down local and remote memory bandwidth, reads and writes. The
implementation uses the HA and CHA PMUs present in the server models
broadwellde, broadwellx, cascadelakex, emeraldrapids, haswellx,
icelakex, ivytown, sapphirerapids and skylakex.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 31 ++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index f671d6e4fd67..983e5021f3d3 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,36 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreMem() -> Optional[MetricGroup]:
+ try:
+ loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL",
+ "UNC_H_REQUESTS.READS_LOCAL")
+ rem_rds = Event("UNC_CHA_REQUESTS.READS_REMOTE",
+ "UNC_H_REQUESTS.READS_REMOTE")
+ loc_wrs = Event("UNC_CHA_REQUESTS.WRITES_LOCAL",
+ "UNC_H_REQUESTS.WRITES_LOCAL")
+ rem_wrs = Event("UNC_CHA_REQUESTS.WRITES_REMOTE",
+ "UNC_H_REQUESTS.WRITES_REMOTE")
+ except:
+ return None
+
+ scale = 64 / 1_000_000
+ return MetricGroup("lpm_mem", [
+ MetricGroup("lpm_mem_local", [
+ Metric("lpm_mem_local_read", "Local memory read bandwidth not including directory updates",
+ d_ratio(loc_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_local_write", "Local memory write bandwidth not including directory updates",
+ d_ratio(loc_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ MetricGroup("lpm_mem_remote", [
+ Metric("lpm_mem_remote_read", "Remote memory read bandwidth not including directory updates",
+ d_ratio(rem_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_remote_write", "Remote memory write bandwidth not including directory updates",
+ d_ratio(rem_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ ], description="Memory Bandwidth breakdown local vs. remote (remote requests in). directory updates not included")
+
+
def UncoreMemBw() -> Optional[MetricGroup]:
mem_events = []
try:
@@ -914,6 +944,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMem(),
UncoreMemBw(),
])
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 41/48] perf jevents: Add dir breakdown metrics for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (39 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 40/48] perf jevents: Add local/remote "mem" breakdown metrics " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 42/48] perf jevents: Add C-State metrics from the PCU PMU " Ian Rogers
` (7 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down directory hits, misses and requests. The implementation uses
the M2M and CHA PMUs present in the server models broadwellde, broadwellx,
cascadelakex, emeraldrapids, icelakex, sapphirerapids and skylakex.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 36 ++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 983e5021f3d3..24ceb7f8719b 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,41 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreDir() -> Optional[MetricGroup]:
+ try:
+ m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
+ m2m_hits = Event("UNC_M2M_DIRECTORY_HIT.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_hits.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_HIT.ANY/"
+ m2m_miss = Event("UNC_M2M_DIRECTORY_MISS.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_miss.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_MISS.ANY/"
+ cha_upd = Event("UNC_CHA_DIR_UPDATE.HA")
+ # Turn the umask into an ANY rather than HA filter.
+ cha_upd.name += "/umask=3,name=UNC_CHA_DIR_UPDATE.ANY/"
+ except:
+ return None
+
+ m2m_total = m2m_hits + m2m_miss
+ upd = m2m_upd + cha_upd # in cache lines
+ upd_r = upd / interval_sec
+ look_r = m2m_total / interval_sec
+
+ scale = 64 / 1_000_000 # Cache lines to MB
+ return MetricGroup("lpm_dir", [
+ Metric("lpm_dir_lookup_rate", "",
+ d_ratio(m2m_total, interval_sec), "requests/s"),
+ Metric("lpm_dir_lookup_hits", "",
+ d_ratio(m2m_hits, m2m_total), "100%"),
+ Metric("lpm_dir_lookup_misses", "",
+ d_ratio(m2m_miss, m2m_total), "100%"),
+ Metric("lpm_dir_update_requests", "",
+ d_ratio(m2m_upd + cha_upd, interval_sec), "requests/s"),
+ Metric("lpm_dir_update_bw", "",
+ d_ratio(m2m_upd + cha_upd, interval_sec), f"{scale}MB/s"),
+ ])
+
+
def UncoreMem() -> Optional[MetricGroup]:
try:
loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL",
@@ -944,6 +979,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreDir(),
UncoreMem(),
UncoreMemBw(),
])
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 42/48] perf jevents: Add C-State metrics from the PCU PMU for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (40 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 41/48] perf jevents: Add dir " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 43/48] perf jevents: Add local/remote miss latency metrics " Ian Rogers
` (6 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Use occupancy events fixed in:
https://lore.kernel.org/lkml/20240226201517.3540187-1-irogers@google.com/
Metrics are at the socket level referring to cores, not hyperthreads.
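The per-state occupancy can include fused-off cores, which show up as C6/C7 residency; when the summed occupancy exceeds clockticks times the cores per package, the excess is subtracted from the C6 bucket. A sketch with hypothetical counts:
  # Hypothetical PCU counts for one socket with 8 cores per package.
  pcu_ticks = 1_000
  cores_per_package = 8
  c0, c3, c6 = 4_000, 1_000, 5_000  # occupancy summed over ticks

  max_cycles = pcu_ticks * cores_per_package  # 8_000
  total_cycles = c0 + c3 + c6                 # 10_000
  if total_cycles > max_cycles:
      # Remove fused-off cores which show up in C6/C7.
      c6 = max(c6 - (total_cycles - max_cycles), 0)  # 3_000

  cores_c0 = c0 / pcu_ticks  # 4.0 cores
  cores_c3 = c3 / pcu_ticks  # 1.0 cores
  cores_c6 = c6 / pcu_ticks  # 3.0 cores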
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 30 ++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 24ceb7f8719b..118fe0fc05a3 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,35 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreCState() -> Optional[MetricGroup]:
+ try:
+ pcu_ticks = Event("UNC_P_CLOCKTICKS")
+ c0 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C0")
+ c3 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C3")
+ c6 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C6")
+ except:
+ return None
+
+ num_cores = Literal("#num_cores") / Literal("#num_packages")
+
+ max_cycles = pcu_ticks * num_cores
+ total_cycles = c0 + c3 + c6
+
+ # remove fused-off cores which show up in C6/C7.
+ c6 = Select(max(c6 - (total_cycles - max_cycles), 0),
+ total_cycles > max_cycles,
+ c6)
+
+ return MetricGroup("lpm_cstate", [
+ Metric("lpm_cstate_c0", "C-State cores in C0/C1",
+ d_ratio(c0, pcu_ticks), "cores"),
+ Metric("lpm_cstate_c3", "C-State cores in C3",
+ d_ratio(c3, pcu_ticks), "cores"),
+ Metric("lpm_cstate_c6", "C-State cores in C6/C7",
+ d_ratio(c6, pcu_ticks), "cores"),
+ ])
+
+
def UncoreDir() -> Optional[MetricGroup]:
try:
m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
@@ -979,6 +1008,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreCState(),
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 43/48] perf jevents: Add local/remote miss latency metrics for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (41 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 42/48] perf jevents: Add C-State metrics from the PCU PMU " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 44/48] perf jevents: Add upi_bw metric " Ian Rogers
` (5 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Derive the average miss latency from CBOX/CHA occupancy and inserts, as
described in Intel's uncore performance monitoring reference.
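Occupancy divided by inserts gives the average number of CHA clockticks each miss spends in the TOR; multiplying by the duration of one tick of a single CHA converts that to nanoseconds. A sketch with hypothetical counts over a 1 second interval:
  # Hypothetical CHA counts over a 1 second interval.
  interval_sec = 1.0
  clockticks = 56_000_000_000    # UNC_CHA_CLOCKTICKS summed over all CHAs
  num_chas = 28                  # source_count(): number of CHAs counted
  tor_occupancy = 1_400_000_000  # miss TOR entries summed over ticks
  tor_inserts = 20_000_000       # misses entering the TOR

  ticks_per_cha = clockticks / num_chas          # 2e9 ticks in one CHA
  residency_ticks = tor_occupancy / tor_inserts  # 70 ticks per miss
  latency_ns = residency_ticks * interval_sec * 1e9 / ticks_per_cha
  print(latency_ns)  # 35.0 ns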
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 70 ++++++++++++++++++++++++--
1 file changed, 67 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 118fe0fc05a3..037f9b2ea1b6 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -6,9 +6,10 @@ import math
import os
import re
from typing import Optional
-from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
- Metric, MetricConstraint, MetricGroup, MetricRef, Select)
+from metric import (d_ratio, has_event, max, source_count, CheckPmu, Event,
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+ Literal, LoadEvents, Metric, MetricConstraint, MetricGroup,
+ MetricRef, Select)
# Global command line arguments.
_args = None
@@ -624,6 +625,68 @@ def IntelL2() -> Optional[MetricGroup]:
], description="L2 data cache analysis")
+def IntelMissLat() -> Optional[MetricGroup]:
+ try:
+ ticks = Event("UNC_CHA_CLOCKTICKS", "UNC_C_CLOCKTICKS")
+ data_rd_loc_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.MISS_OPCODE")
+ data_rd_loc_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_INSERTS.MISS_OPCODE")
+ data_rd_rem_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.NID_MISS_OPCODE")
+ data_rd_rem_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_INSERTS.NID_MISS_OPCODE")
+ except:
+ return None
+
+ if (data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE" or
+ data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_OPCODE"):
+ data_rd = 0x182
+ for e in [data_rd_loc_occ, data_rd_loc_ins, data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += f"/filter_opc={hex(data_rd)}/"
+ elif data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS":
+ # Demand Data Read - Full cache-line read requests from core for
+ # lines to be cached in S or E, typically for data
+ demand_data_rd = 0x202
+ # LLC Prefetch Data - Uncore will first look up the line in the
+ # LLC; for a cache hit, the LRU will be updated, on a miss, the
+ # DRd will be initiated
+ llc_prefetch_data = 0x25a
+ local_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_loc,filter_nm,filter_not_nm/")
+ remote_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_rem,filter_nm,filter_not_nm/")
+ for e in [data_rd_loc_occ, data_rd_loc_ins]:
+ e.name += local_filter
+ for e in [data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += remote_filter
+ else:
+ assert data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL", data_rd_loc_occ
+
+ ticks_per_cha = ticks / source_count(data_rd_loc_ins)
+ loc_lat = interval_sec * 1e9 * data_rd_loc_occ / \
+ (ticks_per_cha * data_rd_loc_ins)
+ ticks_per_cha = ticks / source_count(data_rd_rem_ins)
+ rem_lat = interval_sec * 1e9 * data_rd_rem_occ / \
+ (ticks_per_cha * data_rd_rem_ins)
+ return MetricGroup("lpm_miss_lat", [
+ Metric("lpm_miss_lat_loc", "Local to a socket miss latency in nanoseconds",
+ loc_lat, "ns"),
+ Metric("lpm_miss_lat_rem", "Remote to a socket miss latency in nanoseconds",
+ rem_lat, "ns"),
+ ])
+
+
def IntelMlp() -> Optional[Metric]:
try:
l1d = Event("L1D_PEND_MISS.PENDING")
@@ -1005,6 +1068,7 @@ def main() -> None:
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMissLat(),
IntelMlp(),
IntelPorts(),
IntelSwpf(),
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 44/48] perf jevents: Add upi_bw metric for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (42 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 43/48] perf jevents: Add local/remote miss latency metrics " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 45/48] perf jevents: Add mesh bandwidth saturation " Ian Rogers
` (4 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down UPI read and write bandwidth using uncore_upi counters.
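Per the uncore documentation quoted in the code below, a 64 byte cache
line is carried in nine data flits, so the flit counts are scaled by
64/9 to get bytes. A small sketch of that arithmetic with an invented
flit count, for illustration only:

  FLITS_PER_LINE = 9   # data flits per 64 byte line (.ALL_DATA / 9 * 64B)
  LINE_BYTES = 64

  def upi_mb_per_s(data_flits: float, interval_sec: float) -> float:
      return data_flits / FLITS_PER_LINE * LINE_BYTES / 1e6 / interval_sec

  # 900M data flits observed in one second ~= 6400 MB/s of UPI traffic.
  print(upi_mb_per_s(900e6, 1.0))   # -> 6400.0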
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 037f9b2ea1b6..f6bb691dc5bb 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1033,6 +1033,27 @@ def UncoreMemBw() -> Optional[MetricGroup]:
], description="Memory Bandwidth")
+def UncoreUpiBw() -> Optional[MetricGroup]:
+ try:
+ upi_rds = Event("UNC_UPI_RxL_FLITS.ALL_DATA")
+ upi_wrs = Event("UNC_UPI_TxL_FLITS.ALL_DATA")
+ except:
+ return None
+
+ upi_total = upi_rds + upi_wrs
+
+ # From "Uncore Performance Monitoring": When measuring the amount of
+ # bandwidth consumed by transmission of the data (i.e. NOT including
+ # the header), it should be .ALL_DATA / 9 * 64B.
+ scale = (64 / 9) / 1_000_000
+ return MetricGroup("lpm_upi_bw", [
+ Metric("lpm_upi_bw_read", "UPI read bandwidth",
+ d_ratio(upi_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_upi_bw_write", "DDR memory write bandwidth",
+ d_ratio(upi_wrs, interval_sec), f"{scale}MB/s"),
+ ], description="UPI Bandwidth")
+
+
def main() -> None:
global _args
@@ -1076,6 +1097,7 @@ def main() -> None:
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
+ UncoreUpiBw(),
])
if _args.metricgroups:
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 45/48] perf jevents: Add mesh bandwidth saturation metric for Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (43 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 44/48] perf jevents: Add upi_bw metric " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 46/48] perf jevents: Add collection of topdown like metrics for arm64 Ian Rogers
` (3 subsequent siblings)
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Memory bandwidth saturation from CBOX/CHA events present in
broadwellde, broadwellx, cascadelakex, haswellx, icelakex, skylakex
and snowridgex.
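The metric itself is just the fraction of CHA/CBOX cycles with the FAST
distress signal asserted. A toy calculation with invented counts, for
illustration only:

  # 2.4e9 CHA clockticks in the sample, FAST asserted for 360e6 of them
  # -> 15% mesh bandwidth saturation.
  clocks, fast_asserted = 2.4e9, 360e6
  print(f"{fast_asserted / clocks:.1%}")   # 15.0%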
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/intel_metrics.py | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index f6bb691dc5bb..d56bab7337df 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1033,6 +1033,22 @@ def UncoreMemBw() -> Optional[MetricGroup]:
], description="Memory Bandwidth")
+def UncoreMemSat() -> Optional[Metric]:
+ try:
+ clocks = Event("UNC_CHA_CLOCKTICKS", "UNC_C_CLOCKTICKS")
+ sat = Event("UNC_CHA_DISTRESS_ASSERTED.VERT", "UNC_CHA_FAST_ASSERTED.VERT",
+ "UNC_C_FAST_ASSERTED")
+ except:
+ return None
+
+ desc = ("Mesh Bandwidth saturation (% CBOX cycles with FAST signal asserted, "
+ "include QPI bandwidth saturation), lower is better")
+ if "UNC_CHA_" in sat.name:
+ desc = ("Mesh Bandwidth saturation (% CHA cycles with FAST signal asserted, "
+ "include UPI bandwidth saturation), lower is better")
+ return Metric("lpm_mem_sat", desc, d_ratio(sat, clocks), "100%")
+
+
def UncoreUpiBw() -> Optional[MetricGroup]:
try:
upi_rds = Event("UNC_UPI_RxL_FLITS.ALL_DATA")
@@ -1097,6 +1113,7 @@ def main() -> None:
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
+ UncoreMemSat(),
UncoreUpiBw(),
])
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 46/48] perf jevents: Add collection of topdown like metrics for arm64
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (44 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 45/48] perf jevents: Add mesh bandwidth saturation " Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-09 11:31 ` James Clark
2025-12-02 17:50 ` [PATCH v9 47/48] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
` (2 subsequent siblings)
48 siblings, 1 reply; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Metrics are created using legacy, common and recommended events. As
events may be missing on some models, a TryEvent function returns None
when an event is not found. To work around missing JSON events for
cortex-a53, sysfs encodings are used.
Signed-off-by: Ian Rogers <irogers@google.com>
---
An earlier review of this patch by Leo Yan is here:
https://lore.kernel.org/lkml/8168c713-005c-4fd9-a928-66763dab746a@arm.com/
Hopefully all corrections were made.
---
tools/perf/pmu-events/arm64_metrics.py | 145 ++++++++++++++++++++++++-
1 file changed, 142 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
index ac717ca3513a..9678253e2e0e 100755
--- a/tools/perf/pmu-events/arm64_metrics.py
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -2,13 +2,150 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
import os
-from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
- MetricGroup)
+from typing import Optional
+from metric import (d_ratio, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+ LoadEvents, Metric, MetricGroup)
# Global command line arguments.
_args = None
+def Arm64Topdown() -> MetricGroup:
+ """Returns a MetricGroup representing ARM64 topdown like metrics."""
+ def TryEvent(name: str) -> Optional[Event]:
+ # Skip an event if not in the json files.
+ try:
+ return Event(name)
+ except:
+ return None
+ # ARM models like a53 lack JSON for INST_RETIRED but have the
+ # architectural standard event in sysfs. Use the PMU name to identify
+ # the sysfs event.
+ pmu_name = f'armv8_{_args.model.replace("-", "_")}'
+ ins = Event("instructions")
+ ins_ret = Event("INST_RETIRED", f"{pmu_name}/inst_retired/")
+ cycles = Event("cpu\\-cycles")
+ stall_fe = TryEvent("STALL_FRONTEND")
+ stall_be = TryEvent("STALL_BACKEND")
+ br_ret = TryEvent("BR_RETIRED")
+ br_mp_ret = TryEvent("BR_MIS_PRED_RETIRED")
+ dtlb_walk = TryEvent("DTLB_WALK")
+ itlb_walk = TryEvent("ITLB_WALK")
+ l1d_tlb = TryEvent("L1D_TLB")
+ l1i_tlb = TryEvent("L1I_TLB")
+ l1d_refill = Event("L1D_CACHE_REFILL", f"{pmu_name}/l1d_cache_refill/")
+ l2d_refill = Event("L2D_CACHE_REFILL", f"{pmu_name}/l2d_cache_refill/")
+ l1i_refill = Event("L1I_CACHE_REFILL", f"{pmu_name}/l1i_cache_refill/")
+ l1d_access = Event("L1D_CACHE", f"{pmu_name}/l1d_cache/")
+ l2d_access = Event("L2D_CACHE", f"{pmu_name}/l2d_cache/")
+ llc_access = TryEvent("LL_CACHE_RD")
+ l1i_access = Event("L1I_CACHE", f"{pmu_name}/l1i_cache/")
+ llc_miss_rd = TryEvent("LL_CACHE_MISS_RD")
+ ase_spec = TryEvent("ASE_SPEC")
+ ld_spec = TryEvent("LD_SPEC")
+ st_spec = TryEvent("ST_SPEC")
+ vfp_spec = TryEvent("VFP_SPEC")
+ dp_spec = TryEvent("DP_SPEC")
+ br_immed_spec = TryEvent("BR_IMMED_SPEC")
+ br_indirect_spec = TryEvent("BR_INDIRECT_SPEC")
+ br_ret_spec = TryEvent("BR_RETURN_SPEC")
+ crypto_spec = TryEvent("CRYPTO_SPEC")
+ inst_spec = TryEvent("INST_SPEC")
+ return MetricGroup("lpm_topdown", [
+ MetricGroup("lpm_topdown_tl", [
+ Metric("lpm_topdown_tl_ipc", "Instructions per cycle", d_ratio(
+ ins, cycles), "insn/cycle"),
+ Metric("lpm_topdown_tl_stall_fe_rate", "Frontend stalls to all cycles",
+ d_ratio(stall_fe, cycles), "100%") if stall_fe else None,
+ Metric("lpm_topdown_tl_stall_be_rate", "Backend stalls to all cycles",
+ d_ratio(stall_be, cycles), "100%") if stall_be else None,
+ ]),
+ MetricGroup("lpm_topdown_fe_bound", [
+ MetricGroup("lpm_topdown_fe_br", [
+ Metric("lpm_topdown_fe_br_mp_per_insn",
+ "Branch mispredicts per instruction retired",
+ d_ratio(br_mp_ret, ins_ret), "br/insn") if br_mp_ret else None,
+ Metric("lpm_topdown_fe_br_ins_rate",
+ "Branches per instruction retired", d_ratio(
+ br_ret, ins_ret), "100%") if br_ret else None,
+ Metric("lpm_topdown_fe_br_mispredict",
+ "Branch mispredicts per branch instruction",
+ d_ratio(br_mp_ret, br_ret), "100%") if (br_mp_ret and br_ret) else None,
+ ]),
+ MetricGroup("lpm_topdown_fe_itlb", [
+ Metric("lpm_topdown_fe_itlb_walks", "Itlb walks per insn",
+ d_ratio(itlb_walk, ins_ret), "walk/insn"),
+ Metric("lpm_topdown_fe_itlb_walk_rate", "Itlb walks per L1I TLB access",
+ d_ratio(itlb_walk, l1i_tlb) if l1i_tlb else None, "100%"),
+ ]) if itlb_walk else None,
+ MetricGroup("lpm_topdown_fe_icache", [
+ Metric("lpm_topdown_fe_icache_l1i_per_insn",
+ "L1I cache refills per instruction",
+ d_ratio(l1i_refill, ins_ret), "l1i/insn"),
+ Metric("lpm_topdown_fe_icache_l1i_miss_rate",
+ "L1I cache refills per L1I cache access",
+ d_ratio(l1i_refill, l1i_access), "100%"),
+ ]),
+ ]),
+ MetricGroup("lpm_topdown_be_bound", [
+ MetricGroup("lpm_topdown_be_dtlb", [
+ Metric("lpm_topdown_be_dtlb_walks", "Dtlb walks per instruction",
+ d_ratio(dtlb_walk, ins_ret), "walk/insn"),
+ Metric("lpm_topdown_be_dtlb_walk_rate", "Dtlb walks per L1D TLB access",
+ d_ratio(dtlb_walk, l1d_tlb) if l1d_tlb else None, "100%"),
+ ]) if dtlb_walk else None,
+ MetricGroup("lpm_topdown_be_mix", [
+ Metric("lpm_topdown_be_mix_ld", "Percentage of load instructions",
+ d_ratio(ld_spec, inst_spec), "100%") if ld_spec else None,
+ Metric("lpm_topdown_be_mix_st", "Percentage of store instructions",
+ d_ratio(st_spec, inst_spec), "100%") if st_spec else None,
+ Metric("lpm_topdown_be_mix_simd", "Percentage of SIMD instructions",
+ d_ratio(ase_spec, inst_spec), "100%") if ase_spec else None,
+ Metric("lpm_topdown_be_mix_fp",
+ "Percentage of floating point instructions",
+ d_ratio(vfp_spec, inst_spec), "100%") if vfp_spec else None,
+ Metric("lpm_topdown_be_mix_dp",
+ "Percentage of data processing instructions",
+ d_ratio(dp_spec, inst_spec), "100%") if dp_spec else None,
+ Metric("lpm_topdown_be_mix_crypto",
+ "Percentage of data processing instructions",
+ d_ratio(crypto_spec, inst_spec), "100%") if crypto_spec else None,
+ Metric(
+ "lpm_topdown_be_mix_br", "Percentage of branch instructions",
+ d_ratio(br_immed_spec + br_indirect_spec + br_ret_spec,
+ inst_spec), "100%") if br_immed_spec and br_indirect_spec and br_ret_spec else None,
+ ], description="Breakdown of instructions by type. Counts include both useful and wasted speculative instructions"
+ ) if inst_spec else None,
+ MetricGroup("lpm_topdown_be_dcache", [
+ MetricGroup("lpm_topdown_be_dcache_l1", [
+ Metric("lpm_topdown_be_dcache_l1_per_insn",
+ "L1D cache refills per instruction",
+ d_ratio(l1d_refill, ins_ret), "refills/insn"),
+ Metric("lpm_topdown_be_dcache_l1_miss_rate",
+ "L1D cache refills per L1D cache access",
+ d_ratio(l1d_refill, l1d_access), "100%")
+ ]),
+ MetricGroup("lpm_topdown_be_dcache_l2", [
+ Metric("lpm_topdown_be_dcache_l2_per_insn",
+ "L2D cache refills per instruction",
+ d_ratio(l2d_refill, ins_ret), "refills/insn"),
+ Metric("lpm_topdown_be_dcache_l2_miss_rate",
+ "L2D cache refills per L2D cache access",
+ d_ratio(l2d_refill, l2d_access), "100%")
+ ]),
+ MetricGroup("lpm_topdown_be_dcache_llc", [
+ Metric("lpm_topdown_be_dcache_llc_per_insn",
+ "Last level cache misses per instruction",
+ d_ratio(llc_miss_rd, ins_ret), "miss/insn"),
+ Metric("lpm_topdown_be_dcache_llc_miss_rate",
+ "Last level cache misses per last level cache access",
+ d_ratio(llc_miss_rd, llc_access), "100%")
+ ]) if llc_miss_rd and llc_access else None,
+ ]),
+ ]),
+ ])
+
+
def main() -> None:
global _args
@@ -34,7 +171,9 @@ def main() -> None:
directory = f"{_args.events_path}/arm64/{_args.vendor}/{_args.model}/"
LoadEvents(directory)
- all_metrics = MetricGroup("", [])
+ all_metrics = MetricGroup("", [
+ Arm64Topdown(),
+ ])
if _args.metricgroups:
print(JsonEncodeMetricGroupDescriptions(all_metrics))
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 47/48] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (45 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 46/48] perf jevents: Add collection of topdown like metrics for arm64 Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 48/48] perf jevents: Validate that all names given an Event Ian Rogers
2025-12-03 17:59 ` [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Namhyung Kim
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down cycles into user, kernel and guest. Add a common_metrics.py
file for such metrics.
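The three modifier-qualified cpu-cycles events partition all cycles, so
the reported ratios sum to one. A sketch with made-up counts, for
illustration only (once generated, the group can be requested with
something like 'perf stat -M lpm_cycles'):

  # kernel/host, guest and user cycle counts (invented numbers).
  cyc_k, cyc_g, cyc_u = 3.0e9, 0.5e9, 6.5e9
  cyc = cyc_k + cyc_g + cyc_u
  shares = {"kernel": cyc_k / cyc, "user": cyc_u / cyc, "guest": cyc_g / cyc}
  assert abs(sum(shares.values()) - 1.0) < 1e-9
  print(shares)   # {'kernel': 0.3, 'user': 0.65, 'guest': 0.05}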
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/Build | 2 +-
tools/perf/pmu-events/amd_metrics.py | 2 ++
tools/perf/pmu-events/arm64_metrics.py | 2 ++
tools/perf/pmu-events/common_metrics.py | 19 +++++++++++++++++++
tools/perf/pmu-events/intel_metrics.py | 2 ++
5 files changed, 26 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/pmu-events/common_metrics.py
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index f7d67d03d055..a3d7a04f0abf 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -44,7 +44,7 @@ $(LEGACY_CACHE_JSON): $(LEGACY_CACHE_PY)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)$(PYTHON) $(LEGACY_CACHE_PY) > $@
-GEN_METRIC_DEPS := pmu-events/metric.py
+GEN_METRIC_DEPS := pmu-events/metric.py pmu-events/common_metrics.py
# Generate AMD Json
ZENS = $(shell ls -d pmu-events/arch/x86/amdzen*)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 780e611fe575..feb3b1fc5152 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -4,6 +4,7 @@ import argparse
import math
import os
from typing import Optional
+from common_metrics import Cycles
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
Metric, MetricGroup, Select)
@@ -474,6 +475,7 @@ def main() -> None:
AmdItlb(),
AmdLdSt(),
AmdUpc(),
+ Cycles(),
Idle(),
Rapl(),
UncoreL3(),
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
index 9678253e2e0e..ac518e7f1120 100755
--- a/tools/perf/pmu-events/arm64_metrics.py
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -3,6 +3,7 @@
import argparse
import os
from typing import Optional
+from common_metrics import Cycles
from metric import (d_ratio, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
LoadEvents, Metric, MetricGroup)
@@ -173,6 +174,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
Arm64Topdown(),
+ Cycles(),
])
if _args.metricgroups:
diff --git a/tools/perf/pmu-events/common_metrics.py b/tools/perf/pmu-events/common_metrics.py
new file mode 100644
index 000000000000..fcdfb9d3e648
--- /dev/null
+++ b/tools/perf/pmu-events/common_metrics.py
@@ -0,0 +1,19 @@
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+from metric import (d_ratio, Event, Metric, MetricGroup)
+
+
+def Cycles() -> MetricGroup:
+ cyc_k = Event("cpu\\-cycles:kHh") # exclude user and guest
+ cyc_g = Event("cpu\\-cycles:G") # exclude host
+ cyc_u = Event("cpu\\-cycles:uH") # exclude kernel, hypervisor and guest
+ cyc = cyc_k + cyc_g + cyc_u
+
+ return MetricGroup("lpm_cycles", [
+ Metric("lpm_cycles_total", "Total number of cycles", cyc, "cycles"),
+ Metric("lpm_cycles_user", "User cycles as a percentage of all cycles",
+ d_ratio(cyc_u, cyc), "100%"),
+ Metric("lpm_cycles_kernel", "Kernel cycles as a percentage of all cycles",
+ d_ratio(cyc_k, cyc), "100%"),
+ Metric("lpm_cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
+ d_ratio(cyc_g, cyc), "100%"),
+ ], description="cycles breakdown per privilege level (users, kernel, guest)")
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index d56bab7337df..52035433b505 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -6,6 +6,7 @@ import math
import os
import re
from typing import Optional
+from common_metrics import Cycles
from metric import (d_ratio, has_event, max, source_count, CheckPmu, Event,
JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
Literal, LoadEvents, Metric, MetricConstraint, MetricGroup,
@@ -1095,6 +1096,7 @@ def main() -> None:
LoadEvents(directory)
all_metrics = MetricGroup("", [
+ Cycles(),
Idle(),
Rapl(),
Smi(),
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v9 48/48] perf jevents: Validate that all names given an Event
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (46 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 47/48] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
@ 2025-12-02 17:50 ` Ian Rogers
2025-12-03 17:59 ` [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Namhyung Kim
48 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-02 17:50 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Validate that the names exist in a json file found by searching from
one directory above the model's json directory. This avoids creating
broken fallback encodings.
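A miniature sketch of the check, with invented names, for illustration
only: a name passed to Event() must appear in some json file unless it
targets a known PMU prefix, so typos are rejected at generation time.

  known = {"ls_dispatch.ld_dispatch", "ex_ret_brn"}  # stand-in for the json scan

  def check(name: str) -> None:
      base = name.split(":")[0].split("/")[0]   # strip trailing modifiers
      if base.startswith(("amd", "arm", "cpu", "msr", "power")):
          return                                # PMU/sysfs style names are skipped
      if base not in known:
          raise Exception(f"Is {base} a named json event?")

  check("ex_ret_brn")          # ok
  check("cpu\\-cycles:k")      # ok, skipped as a cpu prefixed name
  try:
      check("ex_ret_brnn")     # typo
  except Exception as e:
      print(e)                 # Is ex_ret_brnn a named json event?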
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
---
tools/perf/pmu-events/metric.py | 36 +++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 2029b6e28365..585454828c2f 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -11,12 +11,14 @@ from typing import Dict, List, Optional, Set, Tuple, Union
all_pmus = set()
all_events = set()
experimental_events = set()
+all_events_all_models = set()
def LoadEvents(directory: str) -> None:
"""Populate a global set of all known events for the purpose of validating Event names"""
global all_pmus
global all_events
global experimental_events
+ global all_events_all_models
all_events = {
"context\\-switches",
"cpu\\-cycles",
@@ -42,6 +44,20 @@ def LoadEvents(directory: str) -> None:
# The generated directory may be the same as the input, which
# causes partial json files. Ignore errors.
pass
+ all_events_all_models = all_events.copy()
+ for root, dirs, files in os.walk(directory + ".."):
+ for filename in files:
+ if filename.endswith(".json"):
+ try:
+ for x in json.load(open(f"{root}/{filename}")):
+ if "EventName" in x:
+ all_events_all_models.add(x["EventName"])
+ elif "ArchStdEvent" in x:
+ all_events_all_models.add(x["ArchStdEvent"])
+ except json.decoder.JSONDecodeError:
+ # The generated directory may be the same as the input, which
+ # causes partial json files. Ignore errors.
+ pass
def CheckPmu(name: str) -> bool:
@@ -64,6 +80,25 @@ def CheckEvent(name: str) -> bool:
return name in all_events
+def CheckEveryEvent(*names: str) -> None:
+ """Check all the events exist in at least one json file"""
+ global all_events_all_models
+ if len(all_events_all_models) == 0:
+ assert len(names) == 1, f"Cannot determine valid events in {names}"
+ # No events loaded so assume any event is good.
+ return
+
+ for name in names:
+ # Remove trailing modifier.
+ if ':' in name:
+ name = name[:name.find(':')]
+ elif '/' in name:
+ name = name[:name.find('/')]
+ if any([name.startswith(x) for x in ['amd', 'arm', 'cpu', 'msr', 'power']]):
+ continue
+ if name not in all_events_all_models:
+ raise Exception(f"Is {name} a named json event?")
+
def IsExperimentalEvent(name: str) -> bool:
global experimental_events
@@ -403,6 +438,7 @@ class Event(Expression):
def __init__(self, *args: str):
error = ""
+ CheckEveryEvent(*args)
for name in args:
if CheckEvent(name):
self.name = _FixEscapes(name)
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 55+ messages in thread
* Re: [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
` (47 preceding siblings ...)
2025-12-02 17:50 ` [PATCH v9 48/48] perf jevents: Validate that all names given an Event Ian Rogers
@ 2025-12-03 17:59 ` Namhyung Kim
48 siblings, 0 replies; 55+ messages in thread
From: Namhyung Kim @ 2025-12-03 17:59 UTC (permalink / raw)
To: Ian Rogers
Cc: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ingo Molnar,
James Clark, Jing Zhang, Jiri Olsa, John Garry, Leo Yan,
Perry Taylor, Peter Zijlstra, Samantha Alt, Sandipan Das,
Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
On Tue, Dec 02, 2025 at 09:49:55AM -0800, Ian Rogers wrote:
> Metrics in the perf tool come in via json. Json doesn't allow
> comments, line breaks, etc. making it an inconvenient way to write
> metrics. Further, it is useful to detect when writing a metric that
> the event specified is supported within the event json for a
> model. From the metric python code Event(s) are used, with fallback
> events provided, if no event is found then an exception is thrown and
> that can either indicate a failure or an unsupported model. To avoid
> confusion all the metrics and their metricgroups are prefixed with
> 'lpm_', where LPM is an abbreviation of Linux Perf Metric. While extra
> characters aren't ideal, this separates the metrics from other vendor
> provided metrics.
>
> * The first 14 patches introduce infrastructure and fixes for the
> addition of metrics written in python for Arm64, AMD Zen and Intel
> CPUs. The ilist.py and perf python module are fixed to work better
> with metrics on hybrid architectures.
I've applied the first 12 patches to perf-tools-next, thanks!
Namhyung
>
> * The next 9 patches generate additional metrics for AMD zen. Rapl
> and Idle metrics aren't specific to AMD but are placed here for ease
> and convenience. Uncore L3 metrics are added along with the majority
> of core metrics.
>
> * The next 20 patches add additional metrics for Intel. Rapl and Idle
> metrics aren't specific to Intel but are placed here for ease and
> convenience. Smi and tsx metrics are added so they can be dropped
> from the per model json files. There are four uncore sets of metrics
> and eleven core metrics. Add a CheckPmu function to metric to
> simplify detecting the presence of hybrid PMUs in events. Metrics
> with experimental events are flagged as experimental in their
> description.
>
> * The next 2 patches add additional metrics for Arm64, where the
> topdown set decomposes yet further. The metrcs primarily use json
> events, where the json contains architecture standard events. Not
> all events are in the json, such as for a53 where the events are in
> sysfs. Workaround this by adding the sysfs events to the metrics but
> longer-term such events should be added to the json.
>
> * The final patch validates that all events provided to an Event
> object exist in a json file somewhere. This is to avoid mistakes
> like unfortunate typos.
>
> This series has benefitted from the input of Leo Yan
> <leo.yan@arm.com>, Sandipan Das <sandidas@amd.com>, Thomas Falcon
> <thomas.falcon@intel.com> and Perry Taylor <perry.taylor@intel.com>.
>
> v9. Drop (for now) 4 AMD sets of metrics for additional follow up. Add
> reviewed-by tags from Sandipan Das (AMD) and tested-by tags from
> Thomas Falcon (Intel).
>
> v8. Combine the previous 4 series for clarity. Rebase on top of the
> more recent legacy metric and event changes. Make the python more
> pep8 and pylint compliant.
>
> Foundations:
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904043208.995243-1-irogers@google.com/
>
> v5. Rebase on top of legacy hardware/cache changes that now generate
> events using python:
> https://lore.kernel.org/lkml/20250828205930.4007284-1-irogers@google.com/
> the v5 series is:
> https://lore.kernel.org/lkml/20250829030727.4159703-1-irogers@google.com/
>
> v4. Rebase and small Build/Makefile tweak
> https://lore.kernel.org/lkml/20240926173554.404411-1-irogers@google.com/
>
> v3. Some code tidying, make the input directory a command line
> argument, but no other functional or output changes.
> https://lore.kernel.org/lkml/20240314055051.1960527-1-irogers@google.com/
>
> v2. Fixes two type issues in the python code but no functional or
> output changes.
> https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
>
> AMD:
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904044047.999031-1-irogers@google.com/
>
> v5. Rebase. Add uop cache hit/miss rates patch. Prefix all metric
> names with lpm_ (short for Linux Perf Metric) so that python
> generated metrics are clearly namespaced.
> https://lore.kernel.org/lkml/20250829033138.4166591-1-irogers@google.com/
>
> v4. Rebase.
> https://lore.kernel.org/lkml/20240926174101.406874-1-irogers@google.com/
>
> v3. Some minor code cleanup changes.
> https://lore.kernel.org/lkml/20240314055839.1975063-1-irogers@google.com/
>
> v2. Drop the cycles breakdown in favor of having it as a common
> metric, suggested by Kan Liang <kan.liang@linux.intel.com>.
> https://lore.kernel.org/lkml/20240301184737.2660108-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240229001537.4158049-1-irogers@google.com/
>
> Intel:
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904044653.1002362-1-irogers@google.com/
>
> v5. Rebase. Fix description for smi metric (Kan). Prefix all metric
> names with lpm_ (short for Linux Perf Metric) so that python
> generated metrics are clearly namespaced. Kan requested a
> namespace in his review:
> https://lore.kernel.org/lkml/43548903-b7c8-47c4-b1da-0258293ecbd4@linux.intel.com/
> The v5 series is:
> https://lore.kernel.org/lkml/20250829041104.4186320-1-irogers@google.com/
>
> v4. Experimental metric descriptions. Add mesh bandwidth metric. Rebase.
> https://lore.kernel.org/lkml/20240926175035.408668-1-irogers@google.com/
>
> v3. Swap tsx and CheckPMU patches that were in the wrong order. Some
> minor code cleanup changes. Drop reference to merged fix for
> umasks/occ_sel in PCU events and for cstate metrics.
> https://lore.kernel.org/lkml/20240314055919.1979781-1-irogers@google.com/
>
> v2. Drop the cycles breakdown in favor of having it as a common
> metric, spelling and other improvements suggested by Kan Liang
> <kan.liang@linux.intel.com>.
> https://lore.kernel.org/lkml/20240301185559.2661241-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240229001806.4158429-1-irogers@google.com/
>
> ARM:
> v7. Switch a use of cycles to cpu-cycles due to ARM having too many
> cycles events.
> https://lore.kernel.org/lkml/20250904194139.1540230-1-irogers@google.com/
>
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904045253.1007052-1-irogers@google.com/
>
> v5. Rebase. Address review comments from Leo Yan
> <leo.yan@arm.com>. Prefix all metric names with lpm_ (short for
> Linux Perf Metric) so that python generated metrics are clearly
> namespaced. Use cpu-cycles rather than cycles legacy event for
> cycles metrics to avoid confusion with ARM PMUs. Add patch that
> checks events to ensure all possible event names are present in at
> least one json file.
> https://lore.kernel.org/lkml/20250829053235.21994-1-irogers@google.com/
>
> v4. Tweak to build dependencies and rebase.
> https://lore.kernel.org/lkml/20240926175709.410022-1-irogers@google.com/
>
> v3. Some minor code cleanup changes.
> https://lore.kernel.org/lkml/20240314055801.1973422-1-irogers@google.com/
>
> v2. The cycles metrics are now made common and shared with AMD and
> Intel, suggested by Kan Liang <kan.liang@linux.intel.com>. This
> assumes these patches come after the AMD and Intel sets.
> https://lore.kernel.org/lkml/20240301184942.2660478-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240229001325.4157655-1-irogers@google.com/
>
> Ian Rogers (48):
> perf python: Correct copying of metric_leader in an evsel
> perf ilist: Be tolerant of reading a metric on the wrong CPU
> perf jevents: Allow multiple metricgroups.json files
> perf jevents: Update metric constraint support
> perf jevents: Add descriptions to metricgroup abstraction
> perf jevents: Allow metric groups not to be named
> perf jevents: Support parsing negative exponents
> perf jevents: Term list fix in event parsing
> perf jevents: Add threshold expressions to Metric
> perf jevents: Move json encoding to its own functions
> perf jevents: Drop duplicate pending metrics
> perf jevents: Skip optional metrics in metric group list
> perf jevents: Build support for generating metrics from python
> perf jevents: Add load event json to verify and allow fallbacks
> perf jevents: Add RAPL event metric for AMD zen models
> perf jevents: Add idle metric for AMD zen models
> perf jevents: Add upc metric for uops per cycle for AMD
> perf jevents: Add br metric group for branch statistics on AMD
> perf jevents: Add itlb metric group for AMD
> perf jevents: Add dtlb metric group for AMD
> perf jevents: Add uncore l3 metric group for AMD
> perf jevents: Add load store breakdown metrics ldst for AMD
> perf jevents: Add context switch metrics for AMD
> perf jevents: Add RAPL metrics for all Intel models
> perf jevents: Add idle metric for Intel models
> perf jevents: Add CheckPmu to see if a PMU is in loaded json events
> perf jevents: Add smi metric group for Intel models
> perf jevents: Mark metrics with experimental events as experimental
> perf jevents: Add tsx metric group for Intel models
> perf jevents: Add br metric group for branch statistics on Intel
> perf jevents: Add software prefetch (swpf) metric group for Intel
> perf jevents: Add ports metric group giving utilization on Intel
> perf jevents: Add L2 metrics for Intel
> perf jevents: Add load store breakdown metrics ldst for Intel
> perf jevents: Add ILP metrics for Intel
> perf jevents: Add context switch metrics for Intel
> perf jevents: Add FPU metrics for Intel
> perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
> perf jevents: Add mem_bw metric for Intel
> perf jevents: Add local/remote "mem" breakdown metrics for Intel
> perf jevents: Add dir breakdown metrics for Intel
> perf jevents: Add C-State metrics from the PCU PMU for Intel
> perf jevents: Add local/remote miss latency metrics for Intel
> perf jevents: Add upi_bw metric for Intel
> perf jevents: Add mesh bandwidth saturation metric for Intel
> perf jevents: Add collection of topdown like metrics for arm64
> perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
> perf jevents: Validate that all names given an Event
>
> tools/perf/.gitignore | 5 +
> tools/perf/Makefile.perf | 2 +
> tools/perf/pmu-events/Build | 51 +-
> tools/perf/pmu-events/amd_metrics.py | 491 ++++++++++
> tools/perf/pmu-events/arm64_metrics.py | 187 ++++
> tools/perf/pmu-events/common_metrics.py | 19 +
> tools/perf/pmu-events/intel_metrics.py | 1129 +++++++++++++++++++++++
> tools/perf/pmu-events/jevents.py | 7 +-
> tools/perf/pmu-events/metric.py | 256 ++++-
> tools/perf/pmu-events/metric_test.py | 4 +
> tools/perf/python/ilist.py | 8 +-
> tools/perf/util/evsel.c | 1 +
> tools/perf/util/python.c | 82 +-
> 13 files changed, 2188 insertions(+), 54 deletions(-)
> create mode 100755 tools/perf/pmu-events/amd_metrics.py
> create mode 100755 tools/perf/pmu-events/arm64_metrics.py
> create mode 100644 tools/perf/pmu-events/common_metrics.py
> create mode 100755 tools/perf/pmu-events/intel_metrics.py
>
> --
> 2.52.0.158.g65b55ccf14-goog
>
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v9 22/48] perf jevents: Add load store breakdown metrics ldst for AMD
2025-12-02 17:50 ` [PATCH v9 22/48] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
@ 2025-12-08 9:21 ` Sandipan Das
0 siblings, 0 replies; 55+ messages in thread
From: Sandipan Das @ 2025-12-08 9:21 UTC (permalink / raw)
To: Ian Rogers, Adrian Hunter, Alexander Shishkin,
Arnaldo Carvalho de Melo, Benjamin Gray, Caleb Biggers,
Edward Baker, Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa,
John Garry, Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra,
Samantha Alt, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
On 12/2/2025 11:20 PM, Ian Rogers wrote:
> Give breakdown of number of instructions. Use the counter mask (cmask)
> to show the number of cycles taken to retire the instructions.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Sandipan Das <sandipan.das@amd.com>
> ---
> tools/perf/pmu-events/amd_metrics.py | 75 ++++++++++++++++++++++++++++
> 1 file changed, 75 insertions(+)
>
> diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
> index 6542c334a82b..1611d0e50d03 100755
> --- a/tools/perf/pmu-events/amd_metrics.py
> +++ b/tools/perf/pmu-events/amd_metrics.py
> @@ -279,6 +279,80 @@ def AmdItlb():
> ], description="Instruction TLB breakdown")
>
>
> +def AmdLdSt() -> MetricGroup:
> + ldst_ld = Event("ls_dispatch.ld_dispatch")
> + ldst_st = Event("ls_dispatch.store_dispatch")
> + ldst_ldc1 = Event(f"{ldst_ld}/cmask=1/")
> + ldst_stc1 = Event(f"{ldst_st}/cmask=1/")
> + ldst_ldc2 = Event(f"{ldst_ld}/cmask=2/")
> + ldst_stc2 = Event(f"{ldst_st}/cmask=2/")
> + ldst_ldc3 = Event(f"{ldst_ld}/cmask=3/")
> + ldst_stc3 = Event(f"{ldst_st}/cmask=3/")
> + ldst_cyc = Event("ls_not_halted_cyc")
> +
> + ld_rate = d_ratio(ldst_ld, interval_sec)
> + st_rate = d_ratio(ldst_st, interval_sec)
> +
> + ld_v1 = max(ldst_ldc1 - ldst_ldc2, 0)
> + ld_v2 = max(ldst_ldc2 - ldst_ldc3, 0)
> + ld_v3 = ldst_ldc3
> +
> + st_v1 = max(ldst_stc1 - ldst_stc2, 0)
> + st_v2 = max(ldst_stc2 - ldst_stc3, 0)
> + st_v3 = ldst_stc3
> +
> + return MetricGroup("lpm_ldst", [
> + MetricGroup("lpm_ldst_total", [
> + Metric("lpm_ldst_total_ld", "Number of loads dispatched per second.",
> + ld_rate, "insns/sec"),
> + Metric("lpm_ldst_total_st", "Number of stores dispatched per second.",
> + st_rate, "insns/sec"),
> + ]),
> + MetricGroup("lpm_ldst_percent_insn", [
> + Metric("lpm_ldst_percent_insn_ld",
> + "Load instructions as a percentage of all instructions.",
> + d_ratio(ldst_ld, ins), "100%"),
> + Metric("lpm_ldst_percent_insn_st",
> + "Store instructions as a percentage of all instructions.",
> + d_ratio(ldst_st, ins), "100%"),
> + ]),
> + MetricGroup("lpm_ldst_ret_loads_per_cycle", [
> + Metric(
> + "lpm_ldst_ret_loads_per_cycle_1",
> + "Load instructions retiring in 1 cycle as a percentage of all "
> + "unhalted cycles.", d_ratio(ld_v1, ldst_cyc), "100%"),
> + Metric(
> + "lpm_ldst_ret_loads_per_cycle_2",
> + "Load instructions retiring in 2 cycles as a percentage of all "
> + "unhalted cycles.", d_ratio(ld_v2, ldst_cyc), "100%"),
> + Metric(
> + "lpm_ldst_ret_loads_per_cycle_3",
> + "Load instructions retiring in 3 or more cycles as a percentage"
> + "of all unhalted cycles.", d_ratio(ld_v3, ldst_cyc), "100%"),
> + ]),
> + MetricGroup("lpm_ldst_ret_stores_per_cycle", [
> + Metric(
> + "lpm_ldst_ret_stores_per_cycle_1",
> + "Store instructions retiring in 1 cycle as a percentage of all "
> + "unhalted cycles.", d_ratio(st_v1, ldst_cyc), "100%"),
> + Metric(
> + "lpm_ldst_ret_stores_per_cycle_2",
> + "Store instructions retiring in 2 cycles as a percentage of all "
> + "unhalted cycles.", d_ratio(st_v2, ldst_cyc), "100%"),
> + Metric(
> + "lpm_ldst_ret_stores_per_cycle_3",
> + "Store instructions retiring in 3 or more cycles as a percentage"
> + "of all unhalted cycles.", d_ratio(st_v3, ldst_cyc), "100%"),
> + ]),
A subset of dispatched loads and stores do not retire. So PMCx029, which is used by
ldst_ld and ldst_st above, does not provide the number of retired loads and stores.
There is currently no event which counts retired loads and stores :(
> + MetricGroup("lpm_ldst_insn_bt", [
> + Metric("lpm_ldst_insn_bt_ld", "Number of instructions between loads.",
> + d_ratio(ins, ldst_ld), "insns"),
> + Metric("lpm_ldst_insn_bt_st", "Number of instructions between stores.",
> + d_ratio(ins, ldst_st), "insns"),
> + ])
> + ], description="Breakdown of load/store instructions")
> +
> +
> def AmdUpc() -> Metric:
> ops = Event("ex_ret_ops", "ex_ret_cops")
> upc = d_ratio(ops, smt_cycles)
> @@ -365,6 +439,7 @@ def main() -> None:
> AmdBr(),
> AmdDtlb(),
> AmdItlb(),
> + AmdLdSt(),
> AmdUpc(),
> Idle(),
> Rapl(),
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v9 17/48] perf jevents: Add upc metric for uops per cycle for AMD
2025-12-02 17:50 ` [PATCH v9 17/48] perf jevents: Add upc metric for uops per cycle for AMD Ian Rogers
@ 2025-12-08 9:46 ` Sandipan Das
0 siblings, 0 replies; 55+ messages in thread
From: Sandipan Das @ 2025-12-08 9:46 UTC (permalink / raw)
To: Ian Rogers, Adrian Hunter, Alexander Shishkin,
Arnaldo Carvalho de Melo, Benjamin Gray, Caleb Biggers,
Edward Baker, Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa,
John Garry, Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra,
Samantha Alt, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
On 12/2/2025 11:20 PM, Ian Rogers wrote:
> The metric adjusts for whether or not SMT is on.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Sandipan Das <sandipan.das@amd.com>
> ---
> tools/perf/pmu-events/amd_metrics.py | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
> index f51a044b8005..42e46b33334d 100755
> --- a/tools/perf/pmu-events/amd_metrics.py
> +++ b/tools/perf/pmu-events/amd_metrics.py
> @@ -3,14 +3,26 @@
> import argparse
> import math
> import os
> +from typing import Optional
> from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
> - JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
> - MetricGroup, Select)
> + JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
> + Metric, MetricGroup, Select)
>
> # Global command line arguments.
> _args = None
> -
> +_zen_model: int = 1
> interval_sec = Event("duration_time")
> +ins = Event("instructions")
> +cycles = Event("cycles")
> +# Number of CPU cycles scaled for SMT.
> +smt_cycles = Select(cycles / 2, Literal("#smt_on"), cycles)
The CPU architects said that while there are some fairness guarantees, the distribution
of resources amongst sibling threads is not always equal. There is also no easy way to
determine this.
> +
> +
> +def AmdUpc() -> Metric:
> + ops = Event("ex_ret_ops", "ex_ret_cops")
> + upc = d_ratio(ops, smt_cycles)
> + return Metric("lpm_upc", "Micro-ops retired per core cycle (higher is better)",
> + upc, "uops/cycle")
Zen 3 onwards, PMCx0C1, which is used by ex_ret_ops and ex_ret_cops, counts retired
macro-ops.
>
>
> def Idle() -> Metric:
> @@ -45,6 +57,7 @@ def Rapl() -> MetricGroup:
>
> def main() -> None:
> global _args
> + global _zen_model
>
> def dir_path(path: str) -> str:
> """Validate path is a directory for argparse."""
> @@ -67,7 +80,10 @@ def main() -> None:
> directory = f"{_args.events_path}/x86/{_args.model}/"
> LoadEvents(directory)
>
> + _zen_model = int(_args.model[6:])
> +
> all_metrics = MetricGroup("", [
> + AmdUpc(),
> Idle(),
> Rapl(),
> ])
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v9 18/48] perf jevents: Add br metric group for branch statistics on AMD
2025-12-02 17:50 ` [PATCH v9 18/48] perf jevents: Add br metric group for branch statistics on AMD Ian Rogers
@ 2025-12-08 12:42 ` Sandipan Das
0 siblings, 0 replies; 55+ messages in thread
From: Sandipan Das @ 2025-12-08 12:42 UTC (permalink / raw)
To: Ian Rogers, Adrian Hunter, Alexander Shishkin,
Arnaldo Carvalho de Melo, Benjamin Gray, Caleb Biggers,
Edward Baker, Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa,
John Garry, Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra,
Samantha Alt, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
On 12/2/2025 11:20 PM, Ian Rogers wrote:
> The br metric group for branches itself comprises metric groups for
> total, taken, conditional, fused and far metric groups using json
> events. The lack of conditional events on anything but zen2 means this
> category is lacking on zen1, zen3 and zen4.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Sandipan Das <sandipan.das@amd.com>
> ---
> tools/perf/pmu-events/amd_metrics.py | 104 +++++++++++++++++++++++++++
> 1 file changed, 104 insertions(+)
>
> diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
> index 42e46b33334d..1880ccf9c6fc 100755
> --- a/tools/perf/pmu-events/amd_metrics.py
> +++ b/tools/perf/pmu-events/amd_metrics.py
> @@ -18,6 +18,109 @@ cycles = Event("cycles")
> smt_cycles = Select(cycles / 2, Literal("#smt_on"), cycles)
>
>
> +def AmdBr():
> + def Total() -> MetricGroup:
> + br = Event("ex_ret_brn")
> + br_m_all = Event("ex_ret_brn_misp")
> + br_clr = Event("ex_ret_msprd_brnch_instr_dir_msmtch",
> + "ex_ret_brn_resync")
PMCx1C7, used by ex_ret_msprd_brnch_instr_dir_msmtch, and PMCx0C7, used by ex_ret_brn_resync,
are not equivalent. I have been told that the now deprecated PMCx0C7 was meant to count
pipeline restarts for reasons other than branch misprediction. Is the intention here to count
such restarts or just the ones caused by branch misprediction or both?
> +
> + br_r = d_ratio(br, interval_sec)
> + ins_r = d_ratio(ins, br)
> + misp_r = d_ratio(br_m_all, br)
> + clr_r = d_ratio(br_clr, interval_sec)
> +
> + return MetricGroup("lpm_br_total", [
> + Metric("lpm_br_total_retired",
> + "The number of branch instructions retired per second.", br_r,
> + "insn/s"),
> + Metric(
> + "lpm_br_total_mispred",
> + "The number of branch instructions retired, of any type, that were "
> + "not correctly predicted as a percentage of all branch instrucions.",
> + misp_r, "100%"),
> + Metric("lpm_br_total_insn_between_branches",
> + "The number of instructions divided by the number of branches.",
> + ins_r, "insn"),
> + Metric("lpm_br_total_insn_fe_resteers",
> + "The number of resync branches per second.", clr_r, "req/s")
> + ])
> +
> + def Taken() -> MetricGroup:
> + br = Event("ex_ret_brn_tkn")
> + br_m_tk = Event("ex_ret_brn_tkn_misp")
> + br_r = d_ratio(br, interval_sec)
> + ins_r = d_ratio(ins, br)
> + misp_r = d_ratio(br_m_tk, br)
> + return MetricGroup("lpm_br_taken", [
> + Metric("lpm_br_taken_retired",
> + "The number of taken branches that were retired per second.",
> + br_r, "insn/s"),
> + Metric(
> + "lpm_br_taken_mispred",
> + "The number of retired taken branch instructions that were "
> + "mispredicted as a percentage of all taken branches.", misp_r,
> + "100%"),
> + Metric(
> + "lpm_br_taken_insn_between_branches",
> + "The number of instructions divided by the number of taken branches.",
> + ins_r, "insn"),
> + ])
> +
> + def Conditional() -> Optional[MetricGroup]:
> + global _zen_model
> + br = Event("ex_ret_cond")
> + br_r = d_ratio(br, interval_sec)
> + ins_r = d_ratio(ins, br)
> +
> + metrics = [
> + Metric("lpm_br_cond_retired", "Retired conditional branch instructions.",
> + br_r, "insn/s"),
> + Metric("lpm_br_cond_insn_between_branches",
> + "The number of instructions divided by the number of conditional "
> + "branches.", ins_r, "insn"),
> + ]
> + if _zen_model == 2:
> + br_m_cond = Event("ex_ret_cond_misp")
> + misp_r = d_ratio(br_m_cond, br)
> + metrics += [
> + Metric("lpm_br_cond_mispred",
> + "Retired conditional branch instructions mispredicted as a "
> + "percentage of all conditional branches.", misp_r, "100%"),
> + ]
> +
> + return MetricGroup("lpm_br_cond", metrics)
> +
> + def Fused() -> MetricGroup:
> + br = Event("ex_ret_fused_instr", "ex_ret_fus_brnch_inst")
> + br_r = d_ratio(br, interval_sec)
> + ins_r = d_ratio(ins, br)
> + return MetricGroup("lpm_br_cond", [
> + Metric("lpm_br_fused_retired",
> + "Retired fused branch instructions per second.", br_r, "insn/s"),
> + Metric(
> + "lpm_br_fused_insn_between_branches",
> + "The number of instructions divided by the number of fused "
> + "branches.", ins_r, "insn"),
> + ])
> +
> + def Far() -> MetricGroup:
> + br = Event("ex_ret_brn_far")
> + br_r = d_ratio(br, interval_sec)
> + ins_r = d_ratio(ins, br)
> + return MetricGroup("lpm_br_far", [
> + Metric("lpm_br_far_retired", "Retired far control transfers per second.",
> + br_r, "insn/s"),
> + Metric(
> + "lpm_br_far_insn_between_branches",
> + "The number of instructions divided by the number of far branches.",
> + ins_r, "insn"),
> + ])
> +
> + return MetricGroup("lpm_br", [Total(), Taken(), Conditional(), Fused(), Far()],
> + description="breakdown of retired branch instructions")
> +
> +
> def AmdUpc() -> Metric:
> ops = Event("ex_ret_ops", "ex_ret_cops")
> upc = d_ratio(ops, smt_cycles)
> @@ -83,6 +186,7 @@ def main() -> None:
> _zen_model = int(_args.model[6:])
>
> all_metrics = MetricGroup("", [
> + AmdBr(),
> AmdUpc(),
> Idle(),
> Rapl(),
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v9 46/48] perf jevents: Add collection of topdown like metrics for arm64
2025-12-02 17:50 ` [PATCH v9 46/48] perf jevents: Add collection of topdown like metrics for arm64 Ian Rogers
@ 2025-12-09 11:31 ` James Clark
2025-12-09 21:23 ` Ian Rogers
0 siblings, 1 reply; 55+ messages in thread
From: James Clark @ 2025-12-09 11:31 UTC (permalink / raw)
To: Ian Rogers, Andi Kleen, Liang, Kan
Cc: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ingo Molnar,
Jing Zhang, Jiri Olsa, John Garry, Leo Yan, Namhyung Kim,
Perry Taylor, Peter Zijlstra, Samantha Alt, Sandipan Das,
Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
On 02/12/2025 5:50 pm, Ian Rogers wrote:
> Metrics are created using legacy, common and recommended events. As
> events may be missing a TryEvent function will give None if an event
> is missing. To workaround missing JSON events for cortex-a53, sysfs
> encodings are used.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> An earlier review of this patch by Leo Yan is here:
> https://lore.kernel.org/lkml/8168c713-005c-4fd9-a928-66763dab746a@arm.com/
> Hopefully all corrections were made.
> ---
> tools/perf/pmu-events/arm64_metrics.py | 145 ++++++++++++++++++++++++-
> 1 file changed, 142 insertions(+), 3 deletions(-)
>
[...]
> + MetricGroup("lpm_topdown_be_bound", [
> + MetricGroup("lpm_topdown_be_dtlb", [
> + Metric("lpm_topdown_be_dtlb_walks", "Dtlb walks per instruction",
> + d_ratio(dtlb_walk, ins_ret), "walk/insn"),
> + Metric("lpm_topdown_be_dtlb_walk_rate", "Dtlb walks per L1D TLB access",
> + d_ratio(dtlb_walk, l1d_tlb) if l1d_tlb else None, "100%"),
> + ]) if dtlb_walk else None,
> + MetricGroup("lpm_topdown_be_mix", [
> + Metric("lpm_topdown_be_mix_ld", "Percentage of load instructions",
> + d_ratio(ld_spec, inst_spec), "100%") if ld_spec else None,
> + Metric("lpm_topdown_be_mix_st", "Percentage of store instructions",
> + d_ratio(st_spec, inst_spec), "100%") if st_spec else None,
> + Metric("lpm_topdown_be_mix_simd", "Percentage of SIMD instructions",
> + d_ratio(ase_spec, inst_spec), "100%") if ase_spec else None,
> + Metric("lpm_topdown_be_mix_fp",
> + "Percentage of floating point instructions",
> + d_ratio(vfp_spec, inst_spec), "100%") if vfp_spec else None,
> + Metric("lpm_topdown_be_mix_dp",
> + "Percentage of data processing instructions",
> + d_ratio(dp_spec, inst_spec), "100%") if dp_spec else None,
> + Metric("lpm_topdown_be_mix_crypto",
> + "Percentage of data processing instructions",
> + d_ratio(crypto_spec, inst_spec), "100%") if crypto_spec else None,
> + Metric(
> + "lpm_topdown_be_mix_br", "Percentage of branch instructions",
> + d_ratio(br_immed_spec + br_indirect_spec + br_ret_spec,
> + inst_spec), "100%") if br_immed_spec and br_indirect_spec and br_ret_spec else None,
Hi Ian,
I've been trying to engage with the team that's publishing the metrics
in Arm [1] to see if there was any chance of getting some unity between
these new metrics and their existing json ones. The feedback from them
was that the decision to only publish metrics for certain cores is
deliberate and there is no plan to change anything. The metrics there
are well tested, known to be working, and usually contain workarounds
for specific issues. They don't want to do "Arm wide" common metrics for
existing cores as they believe it has more potential to mislead people
than help.
I'm commenting on this "lpm_topdown_be_mix_br" as one example, that the
equivalent Arm metric "branch_percentage" excludes br_ret_spec because
br_indirect_spec also counts returns. Or on neoverse-n3 it's
"PC_WRITE_SPEC / INST_SPEC".
I see that you've prefixed all the metrics so the names won't clash from
Kan's feedback [2]. But it makes me wonder if at some point some kind of
alias list could be implemented to override the generated metrics with
hand written json ones. But by that point why not just use the same
names? The Arm metric team's feedback was that there isn't really an
industry standard for naming, and that differences between architectures
would make it almost impossible to standardise anyway in their opinion.
But here we're adding duplicate metrics with different names, where the
new ones are known to have issues. It's not a great user experience IMO,
but at the same time missing old cores from the Arm metrics isn't a
great user experience either. I actually don't have a solution, other
than to say I tried to get them to consider more unified naming.
I also have to say that I do still agree with Andi's old feedback [3]
that the existing json was good enough, and maybe this isn't the right
direction, although it's not very useful feedback at this point. I
thought I had replied to that thread long ago, but must not have pressed
send, sorry about that.
[1]:
https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/tree/main/data
[2]:
https://lore.kernel.org/lkml/43548903-b7c8-47c4-b1da-0258293ecbd4@linux.intel.com
[3]: https://lore.kernel.org/lkml/ZeJJyCmXO9GxpDiF@tassilo/
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v9 46/48] perf jevents: Add collection of topdown like metrics for arm64
2025-12-09 11:31 ` James Clark
@ 2025-12-09 21:23 ` Ian Rogers
0 siblings, 0 replies; 55+ messages in thread
From: Ian Rogers @ 2025-12-09 21:23 UTC (permalink / raw)
To: James Clark
Cc: Andi Kleen, Liang, Kan, Adrian Hunter, Alexander Shishkin,
Arnaldo Carvalho de Melo, Benjamin Gray, Caleb Biggers,
Edward Baker, Ingo Molnar, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users, Stephane Eranian
On Tue, Dec 9, 2025 at 3:31 AM James Clark <james.clark@linaro.org> wrote:
>
> On 02/12/2025 5:50 pm, Ian Rogers wrote:
> > Metrics are created using legacy, common and recommended events. As
> > events may be missing a TryEvent function will give None if an event
> > is missing. To workaround missing JSON events for cortex-a53, sysfs
> > encodings are used.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> > An earlier review of this patch by Leo Yan is here:
> > https://lore.kernel.org/lkml/8168c713-005c-4fd9-a928-66763dab746a@arm.com/
> > Hopefully all corrections were made.
> > ---
> > tools/perf/pmu-events/arm64_metrics.py | 145 ++++++++++++++++++++++++-
> > 1 file changed, 142 insertions(+), 3 deletions(-)
> >
> [...]
> > + MetricGroup("lpm_topdown_be_bound", [
> > + MetricGroup("lpm_topdown_be_dtlb", [
> > + Metric("lpm_topdown_be_dtlb_walks", "Dtlb walks per instruction",
> > + d_ratio(dtlb_walk, ins_ret), "walk/insn"),
> > + Metric("lpm_topdown_be_dtlb_walk_rate", "Dtlb walks per L1D TLB access",
> > + d_ratio(dtlb_walk, l1d_tlb) if l1d_tlb else None, "100%"),
> > + ]) if dtlb_walk else None,
> > + MetricGroup("lpm_topdown_be_mix", [
> > + Metric("lpm_topdown_be_mix_ld", "Percentage of load instructions",
> > + d_ratio(ld_spec, inst_spec), "100%") if ld_spec else None,
> > + Metric("lpm_topdown_be_mix_st", "Percentage of store instructions",
> > + d_ratio(st_spec, inst_spec), "100%") if st_spec else None,
> > + Metric("lpm_topdown_be_mix_simd", "Percentage of SIMD instructions",
> > + d_ratio(ase_spec, inst_spec), "100%") if ase_spec else None,
> > + Metric("lpm_topdown_be_mix_fp",
> > + "Percentage of floating point instructions",
> > + d_ratio(vfp_spec, inst_spec), "100%") if vfp_spec else None,
> > + Metric("lpm_topdown_be_mix_dp",
> > + "Percentage of data processing instructions",
> > + d_ratio(dp_spec, inst_spec), "100%") if dp_spec else None,
> > + Metric("lpm_topdown_be_mix_crypto",
> > +                         "Percentage of crypto instructions",
> > + d_ratio(crypto_spec, inst_spec), "100%") if crypto_spec else None,
> > + Metric(
> > + "lpm_topdown_be_mix_br", "Percentage of branch instructions",
> > + d_ratio(br_immed_spec + br_indirect_spec + br_ret_spec,
> > + inst_spec), "100%") if br_immed_spec and br_indirect_spec and br_ret_spec else None,
>
> Hi Ian,
>
> I've been trying to engage with the team in Arm that publishes the
> metrics [1] to see if there was any chance of getting some unity
> between these new metrics and their existing json ones. The feedback
> from them was that the decision to only publish metrics for certain
> cores is deliberate and there is no plan to change anything. The
> metrics there are well tested, known to be working, and usually
> contain workarounds for specific issues. They don't want to do "Arm
> wide" common metrics for existing cores as they believe they have more
> potential to mislead people than to help.
So this is sad, but I'll drop the patch from the series so as not to
delay things, and keep carrying it in Google's tree. Just looking in
tools/perf/pmu-events/arch/arm64/arm there are 20 ARM models, of which
only the neoverse models (5 of the 20) have metrics. Could ARM's metric
people step up to fill the void? Models like the cortex-a76 are still
actively sold in the Raspberry Pi 5 and yet lack metrics.
I think at some point there has to be a rule of "don't let perfect be
the enemy of good." There's no implication that ARM must maintain these
metrics, or that they be perfect, just as there's no implication that
ARM must maintain the legacy metrics like
"stalled_cycles_per_instruction":
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/common/common/metrics.json?h=perf-tools-next#n49
I'm guessing the cycles breakdown:
https://lore.kernel.org/lkml/20251202175043.623597-48-irogers@google.com/
is okay, and I'll keep that for ARM.
> I'm commenting on "lpm_topdown_be_mix_br" as one example: the
> equivalent Arm metric "branch_percentage" excludes br_ret_spec because
> br_indirect_spec also counts returns. Or on neoverse-n3 it's simply
> "PC_WRITE_SPEC / INST_SPEC".
This is the value of upstreaming metrics like this: they get bug
fixed. That's what has happened with the AMD and Intel metrics, and I'm
happy we can deliver more metrics to users on those CPUs.
> I see that, following Kan's feedback [2], you've prefixed all the
> metrics so the names won't clash. But it makes me wonder whether at
> some point some kind of alias list could be implemented to override the
> generated metrics with hand written json ones. But by that point why
> not just use the same names? The Arm metric team's feedback was that
> there isn't really an industry standard for naming, and that in their
> opinion the differences between architectures would make it almost
> impossible to standardise anyway.
So naming is always a challenge; one solution here is the ilist
application. When doing the legacy event reorganization I remember you
arguing that legacy events should be the norm and not the exception,
but it is precisely because of all the model quirks that that doesn't
work. Working through the quirks with cross platform metrics is value
to users, not something misleading. If a metric does mislead then
that's a bug; let's fix it. Presenting users with no data isn't a fix,
nor is it particularly helpful.
> But here we're adding duplicate metrics with different names, where the
> new ones are known to have issues. It's not a great user experience IMO,
> but at the same time missing old cores from the Arm metrics isn't a
> great user experience either. I actually don't have a solution, other
> than to say I tried to get them to consider more unified naming.
So the lpm_ metrics are on top of whatever a vendor wants to add.
There is often more than one way to compute a metric, such as memory
controller counters vs the L3 cache; on Intel an lpm_ metric may use
uncore counters while a tma_ metric uses the cache. I don't know if
sticking "ARM doesn't support this" in all the ARM lpm_ metric
descriptions would mitigate your metric creators' concerns; it is
something already implied by Linux's licensing. We do highlight that
metrics containing experimental events, such as on Intel, should be
considered similarly experimental.
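As a rough sketch of what I mean (the event names, metric names, the 64
byte line size and the uncore_cas_rd/llc_miss/instructions Event
lookups below are all assumptions for illustration, not metrics from
this series):

    # Uncore view: estimate DRAM read traffic from memory controller
    # read CAS counts, one 64 byte line per CAS (assumed lookups).
    Metric("lpm_mem_rd_bytes_per_insn",
           "Estimated DRAM read bytes per instruction (memory controller view)",
           d_ratio(uncore_cas_rd * 64, instructions), "bytes/insn"),
    # Core/cache view: last level cache misses as a proxy for the same traffic.
    Metric("cache_mem_rd_bytes_per_insn",
           "Estimated DRAM read bytes per instruction (LLC miss view)",
           d_ratio(llc_miss * 64, instructions), "bytes/insn"),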
> I also have to say that I do still agree with Andi's old feedback [3]
> that the existing json was good enough, and maybe this isn't the right
> direction, although it's not very useful feedback at this point. I
> thought I had replied to that thread long ago, but must not have pressed
> send, sorry about that.
So hand writing long metrics in json is horrid; having been there I
wouldn't want to be doing more of it. No comments, no line breaks, a
huge potential for typos, peculiar rules on where commas are allowed
(so removing a line breaks parsing), etc. This is why we have
make_legacy_cache.py
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/make_legacy_cache.py?h=perf-tools-next
writing 1216 legacy cache event descriptions (7266 lines of json) vs
129 lines of python. I'm going to be on team python all day long. In
terms of the Linux build, I don't think there's a reasonable
alternative language.
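As a toy illustration of the difference (the metric, the Event lookups
and the exact json fields in the comment are approximate and only for
illustration):

    # In python the expression can be split across lines and commented:
    Metric("lpm_ipc", "Instructions retired per cycle",
           d_ratio(instructions,  # assumed Event("instructions") lookup
                   cycles),       # assumed Event("cycles") lookup
           "insn/cycle"),
    # The generated json flattens this into a single quoted expression,
    # roughly:
    #   {"MetricName": "lpm_ipc", "MetricExpr": "instructions / cycles", ...}
    # with no comments, no line breaks and commas that must balance by hand.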
Thanks,
Ian
> [1]:
> https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/tree/main/data
> [2]:
> https://lore.kernel.org/lkml/43548903-b7c8-47c4-b1da-0258293ecbd4@linux.intel.com
> [3]: https://lore.kernel.org/lkml/ZeJJyCmXO9GxpDiF@tassilo/
>
^ permalink raw reply [flat|nested] 55+ messages in thread
end of thread, other threads: [~2025-12-09 21:23 UTC | newest]
Thread overview: 55+ messages
2025-12-02 17:49 [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Ian Rogers
2025-12-02 17:49 ` [PATCH v9 01/48] perf python: Correct copying of metric_leader in an evsel Ian Rogers
2025-12-02 17:49 ` [PATCH v9 02/48] perf ilist: Be tolerant of reading a metric on the wrong CPU Ian Rogers
2025-12-02 17:49 ` [PATCH v9 03/48] perf jevents: Allow multiple metricgroups.json files Ian Rogers
2025-12-02 17:49 ` [PATCH v9 04/48] perf jevents: Update metric constraint support Ian Rogers
2025-12-02 17:50 ` [PATCH v9 05/48] perf jevents: Add descriptions to metricgroup abstraction Ian Rogers
2025-12-02 17:50 ` [PATCH v9 06/48] perf jevents: Allow metric groups not to be named Ian Rogers
2025-12-02 17:50 ` [PATCH v9 07/48] perf jevents: Support parsing negative exponents Ian Rogers
2025-12-02 17:50 ` [PATCH v9 08/48] perf jevents: Term list fix in event parsing Ian Rogers
2025-12-02 17:50 ` [PATCH v9 09/48] perf jevents: Add threshold expressions to Metric Ian Rogers
2025-12-02 17:50 ` [PATCH v9 10/48] perf jevents: Move json encoding to its own functions Ian Rogers
2025-12-02 17:50 ` [PATCH v9 11/48] perf jevents: Drop duplicate pending metrics Ian Rogers
2025-12-02 17:50 ` [PATCH v9 12/48] perf jevents: Skip optional metrics in metric group list Ian Rogers
2025-12-02 17:50 ` [PATCH v9 13/48] perf jevents: Build support for generating metrics from python Ian Rogers
2025-12-02 17:50 ` [PATCH v9 14/48] perf jevents: Add load event json to verify and allow fallbacks Ian Rogers
2025-12-02 17:50 ` [PATCH v9 15/48] perf jevents: Add RAPL event metric for AMD zen models Ian Rogers
2025-12-02 17:50 ` [PATCH v9 16/48] perf jevents: Add idle " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 17/48] perf jevents: Add upc metric for uops per cycle for AMD Ian Rogers
2025-12-08 9:46 ` Sandipan Das
2025-12-02 17:50 ` [PATCH v9 18/48] perf jevents: Add br metric group for branch statistics on AMD Ian Rogers
2025-12-08 12:42 ` Sandipan Das
2025-12-02 17:50 ` [PATCH v9 19/48] perf jevents: Add itlb metric group for AMD Ian Rogers
2025-12-02 17:50 ` [PATCH v9 20/48] perf jevents: Add dtlb " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 21/48] perf jevents: Add uncore l3 " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 22/48] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
2025-12-08 9:21 ` Sandipan Das
2025-12-02 17:50 ` [PATCH v9 23/48] perf jevents: Add context switch metrics " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 24/48] perf jevents: Add RAPL metrics for all Intel models Ian Rogers
2025-12-02 17:50 ` [PATCH v9 25/48] perf jevents: Add idle metric for " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 26/48] perf jevents: Add CheckPmu to see if a PMU is in loaded json events Ian Rogers
2025-12-02 17:50 ` [PATCH v9 27/48] perf jevents: Add smi metric group for Intel models Ian Rogers
2025-12-02 17:50 ` [PATCH v9 28/48] perf jevents: Mark metrics with experimental events as experimental Ian Rogers
2025-12-02 17:50 ` [PATCH v9 29/48] perf jevents: Add tsx metric group for Intel models Ian Rogers
2025-12-02 17:50 ` [PATCH v9 30/48] perf jevents: Add br metric group for branch statistics on Intel Ian Rogers
2025-12-02 17:50 ` [PATCH v9 31/48] perf jevents: Add software prefetch (swpf) metric group for Intel Ian Rogers
2025-12-02 17:50 ` [PATCH v9 32/48] perf jevents: Add ports metric group giving utilization on Intel Ian Rogers
2025-12-02 17:50 ` [PATCH v9 33/48] perf jevents: Add L2 metrics for Intel Ian Rogers
2025-12-02 17:50 ` [PATCH v9 34/48] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 35/48] perf jevents: Add ILP metrics " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 36/48] perf jevents: Add context switch " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 37/48] perf jevents: Add FPU " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 38/48] perf jevents: Add Miss Level Parallelism (MLP) metric " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 39/48] perf jevents: Add mem_bw " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 40/48] perf jevents: Add local/remote "mem" breakdown metrics " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 41/48] perf jevents: Add dir " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 42/48] perf jevents: Add C-State metrics from the PCU PMU " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 43/48] perf jevents: Add local/remote miss latency metrics " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 44/48] perf jevents: Add upi_bw metric " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 45/48] perf jevents: Add mesh bandwidth saturation " Ian Rogers
2025-12-02 17:50 ` [PATCH v9 46/48] perf jevents: Add collection of topdown like metrics for arm64 Ian Rogers
2025-12-09 11:31 ` James Clark
2025-12-09 21:23 ` Ian Rogers
2025-12-02 17:50 ` [PATCH v9 47/48] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
2025-12-02 17:50 ` [PATCH v9 48/48] perf jevents: Validate that all names given an Event Ian Rogers
2025-12-03 17:59 ` [PATCH v9 00/48] AMD, ARM, Intel metric generation with Python Namhyung Kim