* [PATCH v10 00/35] AMD and Intel metric generation with Python
@ 2026-01-08 19:10 Ian Rogers
2026-01-08 19:10 ` [PATCH v10 01/35] perf jevents: Build support for generating metrics from python Ian Rogers
` (36 more replies)
0 siblings, 37 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Metrics in the perf tool come in via JSON. JSON doesn't allow
comments, line breaks, etc., making it an inconvenient way to write
metrics. Further, it is useful when writing a metric to detect that
the specified event is supported within the event json for a
model. The metric python code uses Event(s), with fallback events
provided; if no event is found then an exception is thrown, which can
indicate either a failure or an unsupported model. To avoid confusion
all the metrics and their metricgroups are prefixed with 'lpm_', where
LPM is an abbreviation of Linux Perf Metric. While the extra
characters aren't ideal, this separates the metrics from other vendor
provided metrics.
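As a rough illustration of the fallback behaviour described above
(a simplified, stand-alone sketch; the real Event class lives in
tools/perf/pmu-events/metric.py and the event names here are
hypothetical placeholders):

```python
# Simplified sketch of the Event fallback lookup: try each candidate
# name in turn against the set of events loaded from the model's json.
_known_events = {"ex_ret_instr", "de_dis_uop_queue_empty_di0"}

def resolve_event(*candidates: str) -> str:
    """Return the first candidate found in the loaded event json.

    Raising when nothing matches lets the caller treat the metric as
    either a typo (a failure) or unsupported on this model.
    """
    for name in candidates:
        if name in _known_events:
            return name
    raise LookupError("No event " + " or ".join(candidates))
```

A metric can then be written as Event("new_name", "old_name") and
resolves to whichever name the model's event json actually defines.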
* The first 2 patches introduce infrastructure for the addition of
metrics written in python for Arm64, AMD Zen and Intel CPUs.
* The next 9 patches generate additional metrics for AMD Zen. The
RAPL and idle metrics aren't specific to AMD but are placed here for
convenience. Uncore L3 metrics are added along with the majority
of core metrics.
* The next 22 patches add additional metrics for Intel. The RAPL and
idle metrics aren't specific to Intel but are placed here for
convenience. SMI and TSX metrics are added so they can be dropped
from the per model json files. There are four uncore sets of metrics
and eleven core metrics. A CheckPmu function is added to metric.py
to simplify detecting the presence of hybrid PMUs in events. Metrics
with experimental events are flagged as experimental in their
description.
* The next patch adds a cycles metric based on perf event modifiers
for AMD, Intel and Arm64.
* The final patch validates that all events provided to an Event
object exist in a json file somewhere. This is to avoid mistakes
like unfortunate typos.
This series has benefitted from the input of Leo Yan
<leo.yan@arm.com>, Sandipan Das <sandipan.das@amd.com>, Thomas Falcon
<thomas.falcon@intel.com> and Perry Taylor <perry.taylor@intel.com>.
v10. Drop already merged non-vendor patches (Namhyung). Drop "Add
collection of topdown like metrics for arm64" as requested by
James Clark. Update AMD metrics for changes to AMD Zen6 event
names from the series:
https://lore.kernel.org/lkml/cover.1767858676.git.sandipan.das@amd.com/
v9. Drop (for now) 4 AMD sets of metrics for additional follow up. Add
reviewed-by tags from Sandipan Das (AMD) and tested-by tags from
Thomas Falcon (Intel).
https://lore.kernel.org/lkml/20251202175043.623597-1-irogers@google.com/
v8. Combine the previous 4 series for clarity. Rebase on top of the
more recent legacy metric and event changes. Make the python more
pep8 and pylint compliant.
https://lore.kernel.org/lkml/20251113032040.1994090-1-irogers@google.com/
Foundations:
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandipan.das@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904043208.995243-1-irogers@google.com/
v5. Rebase on top of legacy hardware/cache changes that now generate
events using python:
https://lore.kernel.org/lkml/20250828205930.4007284-1-irogers@google.com/
the v5 series is:
https://lore.kernel.org/lkml/20250829030727.4159703-1-irogers@google.com/
v4. Rebase and small Build/Makefile tweak
https://lore.kernel.org/lkml/20240926173554.404411-1-irogers@google.com/
v3. Some code tidying, make the input directory a command line
argument, but no other functional or output changes.
https://lore.kernel.org/lkml/20240314055051.1960527-1-irogers@google.com/
v2. Fixes two type issues in the python code but no functional or
output changes.
https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
AMD:
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandipan.das@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904044047.999031-1-irogers@google.com/
v5. Rebase. Add uop cache hit/miss rates patch. Prefix all metric
names with lpm_ (short for Linux Perf Metric) so that python
generated metrics are clearly namespaced.
https://lore.kernel.org/lkml/20250829033138.4166591-1-irogers@google.com/
v4. Rebase.
https://lore.kernel.org/lkml/20240926174101.406874-1-irogers@google.com/
v3. Some minor code cleanup changes.
https://lore.kernel.org/lkml/20240314055839.1975063-1-irogers@google.com/
v2. Drop the cycles breakdown in favor of having it as a common
metric, suggested by Kan Liang <kan.liang@linux.intel.com>.
https://lore.kernel.org/lkml/20240301184737.2660108-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240229001537.4158049-1-irogers@google.com/
Intel:
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandipan.das@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904044653.1002362-1-irogers@google.com/
v5. Rebase. Fix description for smi metric (Kan). Prefix all metric
names with lpm_ (short for Linux Perf Metric) so that python
generated metrics are clearly namespaced. Kan requested a
namespace in his review:
https://lore.kernel.org/lkml/43548903-b7c8-47c4-b1da-0258293ecbd4@linux.intel.com/
The v5 series is:
https://lore.kernel.org/lkml/20250829041104.4186320-1-irogers@google.com/
v4. Experimental metric descriptions. Add mesh bandwidth metric. Rebase.
https://lore.kernel.org/lkml/20240926175035.408668-1-irogers@google.com/
v3. Swap tsx and CheckPMU patches that were in the wrong order. Some
minor code cleanup changes. Drop reference to merged fix for
umasks/occ_sel in PCU events and for cstate metrics.
https://lore.kernel.org/lkml/20240314055919.1979781-1-irogers@google.com/
v2. Drop the cycles breakdown in favor of having it as a common
metric, spelling and other improvements suggested by Kan Liang
<kan.liang@linux.intel.com>.
https://lore.kernel.org/lkml/20240301185559.2661241-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240229001806.4158429-1-irogers@google.com/
ARM:
v7. Switch a use of cycles to cpu-cycles due to ARM having too many
cycles events.
https://lore.kernel.org/lkml/20250904194139.1540230-1-irogers@google.com/
v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
Das <sandipan.das@amd.com>) which didn't alter the generated json.
https://lore.kernel.org/lkml/20250904045253.1007052-1-irogers@google.com/
v5. Rebase. Address review comments from Leo Yan
<leo.yan@arm.com>. Prefix all metric names with lpm_ (short for
Linux Perf Metric) so that python generated metrics are clearly
namespaced. Use cpu-cycles rather than cycles legacy event for
cycles metrics to avoid confusion with ARM PMUs. Add patch that
checks events to ensure all possible event names are present in at
least one json file.
https://lore.kernel.org/lkml/20250829053235.21994-1-irogers@google.com/
v4. Tweak to build dependencies and rebase.
https://lore.kernel.org/lkml/20240926175709.410022-1-irogers@google.com/
v3. Some minor code cleanup changes.
https://lore.kernel.org/lkml/20240314055801.1973422-1-irogers@google.com/
v2. The cycles metrics are now made common and shared with AMD and
Intel, suggested by Kan Liang <kan.liang@linux.intel.com>. This
assumes these patches come after the AMD and Intel sets.
https://lore.kernel.org/lkml/20240301184942.2660478-1-irogers@google.com/
v1. https://lore.kernel.org/lkml/20240229001325.4157655-1-irogers@google.com/
Ian Rogers (35):
perf jevents: Build support for generating metrics from python
perf jevents: Add load event json to verify and allow fallbacks
perf jevents: Add RAPL event metric for AMD zen models
perf jevents: Add idle metric for AMD zen models
perf jevents: Add upc metric for uops per cycle for AMD
perf jevents: Add br metric group for branch statistics on AMD
perf jevents: Add itlb metric group for AMD
perf jevents: Add dtlb metric group for AMD
perf jevents: Add uncore l3 metric group for AMD
perf jevents: Add load store breakdown metrics ldst for AMD
perf jevents: Add context switch metrics for AMD
perf jevents: Add RAPL metrics for all Intel models
perf jevents: Add idle metric for Intel models
perf jevents: Add CheckPmu to see if a PMU is in loaded json events
perf jevents: Add smi metric group for Intel models
perf jevents: Mark metrics with experimental events as experimental
perf jevents: Add tsx metric group for Intel models
perf jevents: Add br metric group for branch statistics on Intel
perf jevents: Add software prefetch (swpf) metric group for Intel
perf jevents: Add ports metric group giving utilization on Intel
perf jevents: Add L2 metrics for Intel
perf jevents: Add load store breakdown metrics ldst for Intel
perf jevents: Add ILP metrics for Intel
perf jevents: Add context switch metrics for Intel
perf jevents: Add FPU metrics for Intel
perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
perf jevents: Add mem_bw metric for Intel
perf jevents: Add local/remote "mem" breakdown metrics for Intel
perf jevents: Add dir breakdown metrics for Intel
perf jevents: Add C-State metrics from the PCU PMU for Intel
perf jevents: Add local/remote miss latency metrics for Intel
perf jevents: Add upi_bw metric for Intel
perf jevents: Add mesh bandwidth saturation metric for Intel
perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
perf jevents: Validate that all names given an Event
tools/perf/.gitignore | 5 +
tools/perf/Makefile.perf | 2 +
tools/perf/pmu-events/Build | 51 +-
tools/perf/pmu-events/amd_metrics.py | 492 ++++++++++
tools/perf/pmu-events/arm64_metrics.py | 49 +
tools/perf/pmu-events/common_metrics.py | 19 +
tools/perf/pmu-events/intel_metrics.py | 1129 +++++++++++++++++++++++
tools/perf/pmu-events/metric.py | 171 +++-
8 files changed, 1914 insertions(+), 4 deletions(-)
create mode 100755 tools/perf/pmu-events/amd_metrics.py
create mode 100755 tools/perf/pmu-events/arm64_metrics.py
create mode 100644 tools/perf/pmu-events/common_metrics.py
create mode 100755 tools/perf/pmu-events/intel_metrics.py
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v10 01/35] perf jevents: Build support for generating metrics from python
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 02/35] perf jevents: Add load event json to verify and allow fallbacks Ian Rogers
` (35 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Generate extra-metrics.json and extra-metricgroups.json from python
architecture specific scripts. The metrics themselves will be added in
later patches.
If a build takes place in tools/perf/ then extra-metrics.json and
extra-metricgroups.json are generated in that directory and so added
to .gitignore. If there is an OUTPUT directory then the
tools/perf/pmu-events/arch files are copied to it so the generated
extra-metrics.json and extra-metricgroups.json can be added/generated
there.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/.gitignore | 5 +++
tools/perf/Makefile.perf | 2 +
tools/perf/pmu-events/Build | 51 +++++++++++++++++++++++++-
tools/perf/pmu-events/amd_metrics.py | 42 +++++++++++++++++++++
tools/perf/pmu-events/arm64_metrics.py | 43 ++++++++++++++++++++++
tools/perf/pmu-events/intel_metrics.py | 42 +++++++++++++++++++++
6 files changed, 184 insertions(+), 1 deletion(-)
create mode 100755 tools/perf/pmu-events/amd_metrics.py
create mode 100755 tools/perf/pmu-events/arm64_metrics.py
create mode 100755 tools/perf/pmu-events/intel_metrics.py
diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
index b64302a76144..133e343bf44e 100644
--- a/tools/perf/.gitignore
+++ b/tools/perf/.gitignore
@@ -42,6 +42,11 @@ pmu-events/metric_test.log
pmu-events/empty-pmu-events.log
pmu-events/test-empty-pmu-events.c
*.shellcheck_log
+pmu-events/arch/**/extra-metrics.json
+pmu-events/arch/**/extra-metricgroups.json
+tests/shell/*.shellcheck_log
+tests/shell/coresight/*.shellcheck_log
+tests/shell/lib/*.shellcheck_log
feature/
libapi/
libbpf/
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index b3f481a626af..3714288fc2f8 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1277,6 +1277,8 @@ ifeq ($(OUTPUT),)
pmu-events/metric_test.log \
pmu-events/test-empty-pmu-events.c \
pmu-events/empty-pmu-events.log
+ $(Q)find pmu-events/arch -name 'extra-metrics.json' -delete -o \
+ -name 'extra-metricgroups.json' -delete
else # When an OUTPUT directory is present, clean up the copied pmu-events/arch directory.
$(call QUIET_CLEAN, pmu-events) $(RM) -r $(OUTPUT)pmu-events/arch \
$(OUTPUT)pmu-events/pmu-events.c \
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index a46ab7b612df..c9df78ee003c 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -29,6 +29,10 @@ $(PMU_EVENTS_C): $(EMPTY_PMU_EVENTS_C)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)cp $< $@
else
+# Functions to extract the model from a extra-metrics.json or extra-metricgroups.json path.
+model_name = $(shell echo $(1)|sed -e 's@.\+/\(.*\)/extra-metric.*\.json@\1@')
+vendor_name = $(shell echo $(1)|sed -e 's@.\+/\(.*\)/[^/]*/extra-metric.*\.json@\1@')
+
# Copy checked-in json to OUTPUT for generation if it's an out of source build
ifneq ($(OUTPUT),)
$(OUTPUT)pmu-events/arch/%: pmu-events/arch/%
@@ -40,7 +44,52 @@ $(LEGACY_CACHE_JSON): $(LEGACY_CACHE_PY)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)$(PYTHON) $(LEGACY_CACHE_PY) > $@
-GEN_JSON = $(patsubst %,$(OUTPUT)%,$(JSON)) $(LEGACY_CACHE_JSON)
+GEN_METRIC_DEPS := pmu-events/metric.py
+
+# Generate AMD Json
+ZENS = $(shell ls -d pmu-events/arch/x86/amdzen*)
+ZEN_METRICS = $(foreach x,$(ZENS),$(OUTPUT)$(x)/extra-metrics.json)
+ZEN_METRICGROUPS = $(foreach x,$(ZENS),$(OUTPUT)$(x)/extra-metricgroups.json)
+
+$(ZEN_METRICS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+
+$(ZEN_METRICGROUPS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+
+# Generate ARM Json
+ARMS = $(shell ls -d pmu-events/arch/arm64/arm/*)
+ARM_METRICS = $(foreach x,$(ARMS),$(OUTPUT)$(x)/extra-metrics.json)
+ARM_METRICGROUPS = $(foreach x,$(ARMS),$(OUTPUT)$(x)/extra-metricgroups.json)
+
+$(ARM_METRICS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call vendor_name,$@) $(call model_name,$@) arch > $@
+
+$(ARM_METRICGROUPS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call vendor_name,$@) $(call model_name,$@) arch > $@
+
+# Generate Intel Json
+INTELS = $(shell ls -d pmu-events/arch/x86/*|grep -v amdzen|grep -v mapfile.csv)
+INTEL_METRICS = $(foreach x,$(INTELS),$(OUTPUT)$(x)/extra-metrics.json)
+INTEL_METRICGROUPS = $(foreach x,$(INTELS),$(OUTPUT)$(x)/extra-metricgroups.json)
+
+$(INTEL_METRICS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+
+$(INTEL_METRICGROUPS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+
+GEN_JSON = $(patsubst %,$(OUTPUT)%,$(JSON)) \
+ $(LEGACY_CACHE_JSON) \
+ $(ZEN_METRICS) $(ZEN_METRICGROUPS) \
+ $(ARM_METRICS) $(ARM_METRICGROUPS) \
+ $(INTEL_METRICS) $(INTEL_METRICGROUPS)
$(METRIC_TEST_LOG): $(METRIC_TEST_PY) $(METRIC_PY)
$(call rule_mkdir)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
new file mode 100755
index 000000000000..5f44687d8d20
--- /dev/null
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+import argparse
+import os
+from metric import (
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+
+# Global command line arguments.
+_args = None
+
+
+def main() -> None:
+ global _args
+
+ def dir_path(path: str) -> str:
+ """Validate path is a directory for argparse."""
+ if os.path.isdir(path):
+ return path
+ raise argparse.ArgumentTypeError(
+ f'\'{path}\' is not a valid directory')
+
+ parser = argparse.ArgumentParser(description="AMD perf json generator")
+ parser.add_argument(
+ "-metricgroups", help="Generate metricgroups data", action='store_true')
+ parser.add_argument("model", help="e.g. amdzen[123]")
+ parser.add_argument(
+ 'events_path',
+ type=dir_path,
+ help='Root of tree containing architecture directories containing json files'
+ )
+ _args = parser.parse_args()
+
+ all_metrics = MetricGroup("", [])
+
+ if _args.metricgroups:
+ print(JsonEncodeMetricGroupDescriptions(all_metrics))
+ else:
+ print(JsonEncodeMetric(all_metrics))
+
+
+if __name__ == '__main__':
+ main()
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
new file mode 100755
index 000000000000..204b3b08c680
--- /dev/null
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -0,0 +1,43 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+import argparse
+import os
+from metric import (
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+
+# Global command line arguments.
+_args = None
+
+
+def main() -> None:
+ global _args
+
+ def dir_path(path: str) -> str:
+ """Validate path is a directory for argparse."""
+ if os.path.isdir(path):
+ return path
+ raise argparse.ArgumentTypeError(
+ f'\'{path}\' is not a valid directory')
+
+ parser = argparse.ArgumentParser(description="ARM perf json generator")
+ parser.add_argument(
+ "-metricgroups", help="Generate metricgroups data", action='store_true')
+ parser.add_argument("vendor", help="e.g. arm")
+ parser.add_argument("model", help="e.g. neoverse-n1")
+ parser.add_argument(
+ 'events_path',
+ type=dir_path,
+ help='Root of tree containing architecture directories containing json files'
+ )
+ _args = parser.parse_args()
+
+ all_metrics = MetricGroup("", [])
+
+ if _args.metricgroups:
+ print(JsonEncodeMetricGroupDescriptions(all_metrics))
+ else:
+ print(JsonEncodeMetric(all_metrics))
+
+
+if __name__ == '__main__':
+ main()
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
new file mode 100755
index 000000000000..65ada006d05a
--- /dev/null
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+import argparse
+import os
+from metric import (
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+
+# Global command line arguments.
+_args = None
+
+
+def main() -> None:
+ global _args
+
+ def dir_path(path: str) -> str:
+ """Validate path is a directory for argparse."""
+ if os.path.isdir(path):
+ return path
+ raise argparse.ArgumentTypeError(
+ f'\'{path}\' is not a valid directory')
+
+ parser = argparse.ArgumentParser(description="Intel perf json generator")
+ parser.add_argument(
+ "-metricgroups", help="Generate metricgroups data", action='store_true')
+ parser.add_argument("model", help="e.g. skylakex")
+ parser.add_argument(
+ 'events_path',
+ type=dir_path,
+ help='Root of tree containing architecture directories containing json files'
+ )
+ _args = parser.parse_args()
+
+ all_metrics = MetricGroup("", [])
+
+ if _args.metricgroups:
+ print(JsonEncodeMetricGroupDescriptions(all_metrics))
+ else:
+ print(JsonEncodeMetric(all_metrics))
+
+
+if __name__ == '__main__':
+ main()
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 02/35] perf jevents: Add load event json to verify and allow fallbacks
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
2026-01-08 19:10 ` [PATCH v10 01/35] perf jevents: Build support for generating metrics from python Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 03/35] perf jevents: Add RAPL event metric for AMD zen models Ian Rogers
` (34 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add a LoadEvents function that loads all event json files in a
directory. In the Event constructor ensure all events are defined in
the event json except for legacy events like "cycles". If the initial
event isn't found then legacy_event1 is used, and if that isn't found
legacy_event2 is used. This allows a single Event to have multiple
event names as models will often rename the same event over time. If
the event doesn't exist an exception is raised.
So that references to metrics can be added, add the MetricRef
class. This doesn't validate as an event name and so provides an
escape hatch for metrics to refer to each other.
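The loading step described above can be sketched as follows (a
simplified stand-alone version; the real LoadEvents in metric.py also
seeds a set of legacy event names such as "cycles" and "instructions"):

```python
import json
import os

def load_event_names(directory: str) -> set:
    """Collect EventName/ArchStdEvent values from all json files in a directory."""
    names = set()
    for filename in os.listdir(directory):
        if not filename.endswith(".json"):
            continue
        try:
            with open(os.path.join(directory, filename)) as f:
                for entry in json.load(f):
                    if "EventName" in entry:
                        names.add(entry["EventName"])
                    elif "ArchStdEvent" in entry:
                        names.add(entry["ArchStdEvent"])
        except json.decoder.JSONDecodeError:
            # Generated output may live alongside the input json; skip
            # files that are incomplete or still being written.
            pass
    return names
```

The Event constructor then only needs a set membership test per
candidate name to validate a metric against the model's event json.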
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/Build | 12 ++--
tools/perf/pmu-events/amd_metrics.py | 7 ++-
tools/perf/pmu-events/arm64_metrics.py | 7 ++-
tools/perf/pmu-events/intel_metrics.py | 7 ++-
tools/perf/pmu-events/metric.py | 83 +++++++++++++++++++++++++-
5 files changed, 101 insertions(+), 15 deletions(-)
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index c9df78ee003c..f7d67d03d055 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -53,11 +53,11 @@ ZEN_METRICGROUPS = $(foreach x,$(ZENS),$(OUTPUT)$(x)/extra-metricgroups.json)
$(ZEN_METRICS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) pmu-events/arch > $@
$(ZEN_METRICGROUPS): pmu-events/amd_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) pmu-events/arch > $@
# Generate ARM Json
ARMS = $(shell ls -d pmu-events/arch/arm64/arm/*)
@@ -66,11 +66,11 @@ ARM_METRICGROUPS = $(foreach x,$(ARMS),$(OUTPUT)$(x)/extra-metricgroups.json)
$(ARM_METRICS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call vendor_name,$@) $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call vendor_name,$@) $(call model_name,$@) pmu-events/arch > $@
$(ARM_METRICGROUPS): pmu-events/arm64_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call vendor_name,$@) $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call vendor_name,$@) $(call model_name,$@) pmu-events/arch > $@
# Generate Intel Json
INTELS = $(shell ls -d pmu-events/arch/x86/*|grep -v amdzen|grep -v mapfile.csv)
@@ -79,11 +79,11 @@ INTEL_METRICGROUPS = $(foreach x,$(INTELS),$(OUTPUT)$(x)/extra-metricgroups.json
$(INTEL_METRICS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< $(call model_name,$@) pmu-events/arch > $@
$(INTEL_METRICGROUPS): pmu-events/intel_metrics.py $(GEN_METRIC_DEPS)
$(call rule_mkdir)
- $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) arch > $@
+ $(Q)$(call echo-cmd,gen)$(PYTHON) $< -metricgroups $(call model_name,$@) pmu-events/arch > $@
GEN_JSON = $(patsubst %,$(OUTPUT)%,$(JSON)) \
$(LEGACY_CACHE_JSON) \
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 5f44687d8d20..bc91d9c120fa 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -2,8 +2,8 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
import os
-from metric import (
- JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
+ MetricGroup)
# Global command line arguments.
_args = None
@@ -30,6 +30,9 @@ def main() -> None:
)
_args = parser.parse_args()
+ directory = f"{_args.events_path}/x86/{_args.model}/"
+ LoadEvents(directory)
+
all_metrics = MetricGroup("", [])
if _args.metricgroups:
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
index 204b3b08c680..ac717ca3513a 100755
--- a/tools/perf/pmu-events/arm64_metrics.py
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -2,8 +2,8 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
import os
-from metric import (
- JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
+ MetricGroup)
# Global command line arguments.
_args = None
@@ -31,6 +31,9 @@ def main() -> None:
)
_args = parser.parse_args()
+ directory = f"{_args.events_path}/arm64/{_args.vendor}/{_args.model}/"
+ LoadEvents(directory)
+
all_metrics = MetricGroup("", [])
if _args.metricgroups:
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 65ada006d05a..b287ef115193 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -2,8 +2,8 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
import os
-from metric import (
- JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, MetricGroup)
+from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
+ MetricGroup)
# Global command line arguments.
_args = None
@@ -30,6 +30,9 @@ def main() -> None:
)
_args = parser.parse_args()
+ directory = f"{_args.events_path}/x86/{_args.model}/"
+ LoadEvents(directory)
+
all_metrics = MetricGroup("", [])
if _args.metricgroups:
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index dd8fd06940e6..e33e163b2815 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -3,10 +3,56 @@
import ast
import decimal
import json
+import os
import re
from enum import Enum
from typing import Dict, List, Optional, Set, Tuple, Union
+all_events = set()
+
+def LoadEvents(directory: str) -> None:
+ """Populate a global set of all known events for the purpose of validating Event names"""
+ global all_events
+ all_events = {
+ "context\\-switches",
+ "cpu\\-cycles",
+ "cycles",
+ "duration_time",
+ "instructions",
+ "l2_itlb_misses",
+ }
+ for file in os.listdir(os.fsencode(directory)):
+ filename = os.fsdecode(file)
+ if filename.endswith(".json"):
+ try:
+ for x in json.load(open(f"{directory}/{filename}")):
+ if "EventName" in x:
+ all_events.add(x["EventName"])
+ elif "ArchStdEvent" in x:
+ all_events.add(x["ArchStdEvent"])
+ except json.decoder.JSONDecodeError:
+ # The generated directory may be the same as the input, which
+ # causes partial json files. Ignore errors.
+ pass
+
+
+def CheckEvent(name: str) -> bool:
+ """Check the event name exists in the set of all loaded events"""
+ global all_events
+ if len(all_events) == 0:
+ # No events loaded so assume any event is good.
+ return True
+
+ if ':' in name:
+ # Remove trailing modifier.
+ name = name[:name.find(':')]
+ elif '/' in name:
+ # Name could begin with a PMU or an event, for now assume it is good.
+ return True
+
+ return name in all_events
+
+
class MetricConstraint(Enum):
GROUPED_EVENTS = 0
NO_GROUP_EVENTS = 1
@@ -317,9 +363,18 @@ def _FixEscapes(s: str) -> str:
class Event(Expression):
"""An event in an expression."""
- def __init__(self, name: str, legacy_name: str = ''):
- self.name = _FixEscapes(name)
- self.legacy_name = _FixEscapes(legacy_name)
+ def __init__(self, *args: str):
+ error = ""
+ for name in args:
+ if CheckEvent(name):
+ self.name = _FixEscapes(name)
+ return
+ if error:
+ error += " or " + name
+ else:
+ error = name
+ global all_events
+ raise Exception(f"No event {error} in:\n{all_events}")
def ToPerfJson(self):
result = re.sub('/', '@', self.name)
@@ -338,6 +393,28 @@ class Event(Expression):
return self
+class MetricRef(Expression):
+ """A metric reference in an expression."""
+
+ def __init__(self, name: str):
+ self.name = _FixEscapes(name)
+
+ def ToPerfJson(self):
+ return self.name
+
+ def ToPython(self):
+ return f'MetricRef(r"{self.name}")'
+
+ def Simplify(self) -> Expression:
+ return self
+
+ def Equals(self, other: Expression) -> bool:
+ return isinstance(other, MetricRef) and self.name == other.name
+
+ def Substitute(self, name: str, expression: Expression) -> Expression:
+ return self
+
+
class Constant(Expression):
"""A constant within the expression tree."""
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 03/35] perf jevents: Add RAPL event metric for AMD zen models
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
2026-01-08 19:10 ` [PATCH v10 01/35] perf jevents: Build support for generating metrics from python Ian Rogers
2026-01-08 19:10 ` [PATCH v10 02/35] perf jevents: Add load event json to verify and allow fallbacks Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 04/35] perf jevents: Add idle " Ian Rogers
` (33 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add power per second metrics based on RAPL.
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 31 +++++++++++++++++++++++++---
1 file changed, 28 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index bc91d9c120fa..b6cdeb4f09fe 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -1,13 +1,36 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
+import math
import os
-from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
- MetricGroup)
+from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+ LoadEvents, Metric, MetricGroup, Select)
# Global command line arguments.
_args = None
+interval_sec = Event("duration_time")
+
+
+def Rapl() -> MetricGroup:
+ """Processor socket power consumption estimate.
+
+ Use events from the running average power limit (RAPL) driver.
+ """
+ # Watts = joules/second
+ # Currently only energy-pkg is supported by AMD:
+ # https://lore.kernel.org/lkml/20220105185659.643355-1-eranian@google.com/
+ pkg = Event("power/energy\\-pkg/")
+ cond_pkg = Select(pkg, has_event(pkg), math.nan)
+ scale = 2.3283064365386962890625e-10
+ metrics = [
+ Metric("lpm_cpu_power_pkg", "",
+ d_ratio(cond_pkg * scale, interval_sec), "Watts"),
+ ]
+
+ return MetricGroup("lpm_cpu_power", metrics,
+ description="Processor socket power consumption estimates")
+
def main() -> None:
global _args
@@ -33,7 +56,9 @@ def main() -> None:
directory = f"{_args.events_path}/x86/{_args.model}/"
LoadEvents(directory)
- all_metrics = MetricGroup("", [])
+ all_metrics = MetricGroup("", [
+ Rapl(),
+ ])
if _args.metricgroups:
print(JsonEncodeMetricGroupDescriptions(all_metrics))
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 04/35] perf jevents: Add idle metric for AMD zen models
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (2 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 03/35] perf jevents: Add RAPL event metric for AMD zen models Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 05/35] perf jevents: Add upc metric for uops per cycle for AMD Ian Rogers
` (32 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Use the msr PMU to compute the percentage of wallclock cycles where the
CPUs are in a low power state.
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index b6cdeb4f09fe..f51a044b8005 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -3,8 +3,9 @@
import argparse
import math
import os
-from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
- LoadEvents, Metric, MetricGroup, Select)
+from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+ JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
+ MetricGroup, Select)
# Global command line arguments.
_args = None
@@ -12,6 +13,16 @@ _args = None
interval_sec = Event("duration_time")
+def Idle() -> Metric:
+ cyc = Event("msr/mperf/")
+ tsc = Event("msr/tsc/")
+ low = max(tsc - cyc, 0)
+ return Metric(
+ "lpm_idle",
+ "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
+ d_ratio(low, tsc), "100%")
+
+
def Rapl() -> MetricGroup:
"""Processor socket power consumption estimate.
@@ -57,6 +68,7 @@ def main() -> None:
LoadEvents(directory)
all_metrics = MetricGroup("", [
+ Idle(),
Rapl(),
])
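The lpm_idle metric relies on msr/mperf/ advancing only during non-halted
(C0) cycles while msr/tsc/ counts wallclock cycles. A rough standalone
sketch of the arithmetic (idle_fraction is a made-up helper name, not part
of the patch):

```python
def idle_fraction(mperf: int, tsc: int) -> float:
    """Fraction of wallclock cycles spent in a low power state.

    mperf only advances in C0, so tsc - mperf approximates cycles in
    C1 or deeper; clamp at 0 in case the counters race.
    """
    low = max(tsc - mperf, 0)
    return low / tsc if tsc else 0.0
```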
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 05/35] perf jevents: Add upc metric for uops per cycle for AMD
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (3 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 04/35] perf jevents: Add idle " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 06/35] perf jevents: Add br metric group for branch statistics on AMD Ian Rogers
` (31 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The metric adjusts for whether or not SMT is on.
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index f51a044b8005..42e46b33334d 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -3,14 +3,26 @@
import argparse
import math
import os
+from typing import Optional
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, Select)
+ JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
+ Metric, MetricGroup, Select)
# Global command line arguments.
_args = None
-
+_zen_model: int = 1
interval_sec = Event("duration_time")
+ins = Event("instructions")
+cycles = Event("cycles")
+# Number of CPU cycles scaled for SMT.
+smt_cycles = Select(cycles / 2, Literal("#smt_on"), cycles)
+
+
+def AmdUpc() -> Metric:
+ ops = Event("ex_ret_ops", "ex_ret_cops")
+ upc = d_ratio(ops, smt_cycles)
+ return Metric("lpm_upc", "Micro-ops retired per core cycle (higher is better)",
+ upc, "uops/cycle")
def Idle() -> Metric:
@@ -45,6 +57,7 @@ def Rapl() -> MetricGroup:
def main() -> None:
global _args
+ global _zen_model
def dir_path(path: str) -> str:
"""Validate path is a directory for argparse."""
@@ -67,7 +80,10 @@ def main() -> None:
directory = f"{_args.events_path}/x86/{_args.model}/"
LoadEvents(directory)
+ _zen_model = int(_args.model[6:])
+
all_metrics = MetricGroup("", [
+ AmdUpc(),
Idle(),
Rapl(),
])
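When SMT is on, two hardware threads share a core, so per-thread core
cycles are approximated as cycles / 2. A sketch of the adjustment used by
the upc metric (the function name here is illustrative, not from the
patch):

```python
def uops_per_core_cycle(ops: int, cycles: int, smt_on: bool) -> float:
    # Mirrors Select(cycles / 2, Literal("#smt_on"), cycles): with SMT
    # on, each thread is credited half of the core's cycles.
    smt_cycles = cycles / 2 if smt_on else cycles
    return ops / smt_cycles
```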
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 06/35] perf jevents: Add br metric group for branch statistics on AMD
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (4 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 05/35] perf jevents: Add upc metric for uops per cycle for AMD Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 07/35] perf jevents: Add itlb metric group for AMD Ian Rogers
` (30 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The br metric group for branches comprises metric groups for total,
taken, conditional, fused and far branches, built from json events. The
lack of conditional events on anything but zen2 means this category is
incomplete on zen1, zen3 and zen4.
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 105 +++++++++++++++++++++++++++
1 file changed, 105 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 42e46b33334d..38948f63cb52 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -18,6 +18,110 @@ cycles = Event("cycles")
smt_cycles = Select(cycles / 2, Literal("#smt_on"), cycles)
+def AmdBr():
+ def Total() -> MetricGroup:
+ br = Event("ex_ret_brn")
+ br_m_all = Event("ex_ret_brn_misp")
+ br_clr = Event("ex_ret_brn_cond_misp",
+ "ex_ret_msprd_brnch_instr_dir_msmtch",
+ "ex_ret_brn_resync")
+
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+ misp_r = d_ratio(br_m_all, br)
+ clr_r = d_ratio(br_clr, interval_sec)
+
+ return MetricGroup("lpm_br_total", [
+ Metric("lpm_br_total_retired",
+ "The number of branch instructions retired per second.", br_r,
+ "insn/s"),
+ Metric(
+ "lpm_br_total_mispred",
+ "The number of branch instructions retired, of any type, that were "
+ "not correctly predicted as a percentage of all branch instructions.",
+ misp_r, "100%"),
+ Metric("lpm_br_total_insn_between_branches",
+ "The number of instructions divided by the number of branches.",
+ ins_r, "insn"),
+ Metric("lpm_br_total_insn_fe_resteers",
+ "The number of resync branches per second.", clr_r, "req/s")
+ ])
+
+ def Taken() -> MetricGroup:
+ br = Event("ex_ret_brn_tkn")
+ br_m_tk = Event("ex_ret_brn_tkn_misp")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+ misp_r = d_ratio(br_m_tk, br)
+ return MetricGroup("lpm_br_taken", [
+ Metric("lpm_br_taken_retired",
+ "The number of taken branches that were retired per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_taken_mispred",
+ "The number of retired taken branch instructions that were "
+ "mispredicted as a percentage of all taken branches.", misp_r,
+ "100%"),
+ Metric(
+ "lpm_br_taken_insn_between_branches",
+ "The number of instructions divided by the number of taken branches.",
+ ins_r, "insn"),
+ ])
+
+ def Conditional() -> Optional[MetricGroup]:
+ global _zen_model
+ br = Event("ex_ret_brn_cond", "ex_ret_cond")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+
+ metrics = [
+ Metric("lpm_br_cond_retired", "Retired conditional branch instructions.",
+ br_r, "insn/s"),
+ Metric("lpm_br_cond_insn_between_branches",
+ "The number of instructions divided by the number of conditional "
+ "branches.", ins_r, "insn"),
+ ]
+ if _zen_model == 2:
+ br_m_cond = Event("ex_ret_cond_misp")
+ misp_r = d_ratio(br_m_cond, br)
+ metrics += [
+ Metric("lpm_br_cond_mispred",
+ "Retired conditional branch instructions mispredicted as a "
+ "percentage of all conditional branches.", misp_r, "100%"),
+ ]
+
+ return MetricGroup("lpm_br_cond", metrics)
+
+ def Fused() -> MetricGroup:
+ br = Event("ex_ret_fused_instr", "ex_ret_fus_brnch_inst")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+ return MetricGroup("lpm_br_fused", [
+ Metric("lpm_br_fused_retired",
+ "Retired fused branch instructions per second.", br_r, "insn/s"),
+ Metric(
+ "lpm_br_fused_insn_between_branches",
+ "The number of instructions divided by the number of fused "
+ "branches.", ins_r, "insn"),
+ ])
+
+ def Far() -> MetricGroup:
+ br = Event("ex_ret_brn_far")
+ br_r = d_ratio(br, interval_sec)
+ ins_r = d_ratio(ins, br)
+ return MetricGroup("lpm_br_far", [
+ Metric("lpm_br_far_retired", "Retired far control transfers per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_far_insn_between_branches",
+ "The number of instructions divided by the number of far branches.",
+ ins_r, "insn"),
+ ])
+
+ return MetricGroup("lpm_br", [Total(), Taken(), Conditional(), Fused(), Far()],
+ description="breakdown of retired branch instructions")
+
+
def AmdUpc() -> Metric:
ops = Event("ex_ret_ops", "ex_ret_cops")
upc = d_ratio(ops, smt_cycles)
@@ -83,6 +187,7 @@ def main() -> None:
_zen_model = int(_args.model[6:])
all_metrics = MetricGroup("", [
+ AmdBr(),
AmdUpc(),
Idle(),
Rapl(),
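Several events above are given with fallbacks, e.g.
Event("ex_ret_brn_cond", "ex_ret_cond"), because event names differ
between zen generations. A sketch of the assumed resolution order
(resolve_event is a hypothetical helper, not the metric.py API):

```python
def resolve_event(names: list[str], supported: set[str]) -> str:
    # Try each candidate in order; the first name present in the
    # model's event json wins, otherwise raise (unsupported model).
    for name in names:
        if name in supported:
            return name
    raise LookupError(f"no supported event among {names}")
```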
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 07/35] perf jevents: Add itlb metric group for AMD
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (5 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 06/35] perf jevents: Add br metric group for branch statistics on AMD Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 08/35] perf jevents: Add dtlb " Ian Rogers
` (29 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that give an overview and details of the l1 itlb (zen1,
zen2, zen3) and l2 itlb (all zens).
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 49 ++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 38948f63cb52..8fb0b55074a2 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -122,6 +122,54 @@ def AmdBr():
description="breakdown of retired branch instructions")
+def AmdItlb():
+ global _zen_model
+ l2h = Event("bp_l1_tlb_miss_l2_tlb_hit", "bp_l1_tlb_miss_l2_hit")
+ l2m = Event("l2_itlb_misses")
+ l2r = l2h + l2m
+
+ itlb_l1_mg = None
+ l1m = l2r
+ if _zen_model <= 3:
+ l1r = Event("ic_fw32")
+ l1h = max(l1r - l1m, 0)
+ itlb_l1_mg = MetricGroup("lpm_itlb_l1", [
+ Metric("lpm_itlb_l1_hits",
+ "L1 ITLB hits as a percentage of L1 ITLB accesses.",
+ d_ratio(l1h, l1h + l1m), "100%"),
+ Metric("lpm_itlb_l1_miss",
+ "L1 ITLB misses as a percentage of L1 ITLB accesses.",
+ d_ratio(l1m, l1h + l1m), "100%"),
+ Metric("lpm_itlb_l1_reqs",
+ "The number of 32B fetch windows transferred from IC pipe to DE "
+ "instruction decoder per second.", d_ratio(
+ l1r, interval_sec),
+ "windows/sec"),
+ ])
+
+ return MetricGroup("lpm_itlb", [
+ MetricGroup("lpm_itlb_ov", [
+ Metric("lpm_itlb_ov_insn_bt_l1_miss",
+ "Number of instructions between l1 misses", d_ratio(
+ ins, l1m), "insns"),
+ Metric("lpm_itlb_ov_insn_bt_l2_miss",
+ "Number of instructions between l2 misses", d_ratio(
+ ins, l2m), "insns"),
+ ]),
+ itlb_l1_mg,
+ MetricGroup("lpm_itlb_l2", [
+ Metric("lpm_itlb_l2_hits",
+ "L2 ITLB hits as a percentage of all L2 ITLB accesses.",
+ d_ratio(l2h, l2r), "100%"),
+ Metric("lpm_itlb_l2_miss",
+ "L2 ITLB misses as a percentage of all L2 ITLB accesses.",
+ d_ratio(l2m, l2r), "100%"),
+ Metric("lpm_itlb_l2_reqs", "ITLB accesses per second.",
+ d_ratio(l2r, interval_sec), "accesses/sec"),
+ ]),
+ ], description="Instruction TLB breakdown")
+
+
def AmdUpc() -> Metric:
ops = Event("ex_ret_ops", "ex_ret_cops")
upc = d_ratio(ops, smt_cycles)
@@ -188,6 +236,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
AmdBr(),
+ AmdItlb(),
AmdUpc(),
Idle(),
Rapl(),
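On zen1-zen3 the L1 ITLB hit count is not a direct event: it is inferred
from ic_fw32 fetch windows minus the L1 misses, which in turn equal total
L2 ITLB accesses. A sketch of that inference (the helper name is made up
here):

```python
def l1_itlb_hit_rate(fetch_windows: int, l2_accesses: int) -> float:
    # Every L1 ITLB miss becomes an L2 ITLB access, so l2_accesses
    # stands in for L1 misses; clamp hits at 0 if the counters race.
    l1_miss = l2_accesses
    l1_hit = max(fetch_windows - l1_miss, 0)
    total = l1_hit + l1_miss
    return l1_hit / total if total else 0.0
```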
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 08/35] perf jevents: Add dtlb metric group for AMD
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (6 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 07/35] perf jevents: Add itlb metric group for AMD Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 09/35] perf jevents: Add uncore l3 " Ian Rogers
` (28 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that give an overview and details of the dtlb (zen1, zen2,
zen3).
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 111 +++++++++++++++++++++++++++
1 file changed, 111 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 8fb0b55074a2..a4ff88de08b5 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -122,6 +122,116 @@ def AmdBr():
description="breakdown of retired branch instructions")
+def AmdDtlb() -> Optional[MetricGroup]:
+ global _zen_model
+ if _zen_model >= 4:
+ return None
+
+ d_dat = Event("ls_dc_accesses") if _zen_model <= 3 else None
+ d_h4k = Event("ls_l1_d_tlb_miss.tlb_reload_4k_l2_hit")
+ d_hcoal = Event(
+ "ls_l1_d_tlb_miss.tlb_reload_coalesced_page_hit") if _zen_model >= 2 else 0
+ d_h2m = Event("ls_l1_d_tlb_miss.tlb_reload_2m_l2_hit")
+ d_h1g = Event("ls_l1_d_tlb_miss.tlb_reload_1g_l2_hit")
+
+ d_m4k = Event("ls_l1_d_tlb_miss.tlb_reload_4k_l2_miss")
+ d_mcoal = Event(
+ "ls_l1_d_tlb_miss.tlb_reload_coalesced_page_miss") if _zen_model >= 2 else 0
+ d_m2m = Event("ls_l1_d_tlb_miss.tlb_reload_2m_l2_miss")
+ d_m1g = Event("ls_l1_d_tlb_miss.tlb_reload_1g_l2_miss")
+
+ d_w0 = Event("ls_tablewalker.dc_type0") if _zen_model <= 3 else None
+ d_w1 = Event("ls_tablewalker.dc_type1") if _zen_model <= 3 else None
+ walks = d_w0 + d_w1
+ walks_r = d_ratio(walks, interval_sec)
+ ins_w = d_ratio(ins, walks)
+ l1 = d_dat
+ l1_r = d_ratio(l1, interval_sec)
+ l2_hits = d_h4k + d_hcoal + d_h2m + d_h1g
+ l2_miss = d_m4k + d_mcoal + d_m2m + d_m1g
+ l2_r = d_ratio(l2_hits + l2_miss, interval_sec)
+ l1_miss = l2_hits + l2_miss + walks
+ l1_hits = max(l1 - l1_miss, 0)
+ ins_l = d_ratio(ins, l1_miss)
+
+ return MetricGroup("lpm_dtlb", [
+ MetricGroup("lpm_dtlb_ov", [
+ Metric("lpm_dtlb_ov_insn_bt_l1_miss",
+ "DTLB overview: instructions between l1 misses.", ins_l,
+ "insns"),
+ Metric("lpm_dtlb_ov_insn_bt_walks",
+ "DTLB overview: instructions between dtlb page table walks.",
+ ins_w, "insns"),
+ ]),
+ MetricGroup("lpm_dtlb_l1", [
+ Metric("lpm_dtlb_l1_hits",
+ "DTLB L1 hits as percentage of all DTLB L1 accesses.",
+ d_ratio(l1_hits, l1), "100%"),
+ Metric("lpm_dtlb_l1_miss",
+ "DTLB L1 misses as percentage of all DTLB L1 accesses.",
+ d_ratio(l1_miss, l1), "100%"),
+ Metric("lpm_dtlb_l1_reqs", "DTLB L1 accesses per second.", l1_r,
+ "insns/s"),
+ ]),
+ MetricGroup("lpm_dtlb_l2", [
+ Metric("lpm_dtlb_l2_hits",
+ "DTLB L2 hits as percentage of all DTLB L2 accesses.",
+ d_ratio(l2_hits, l2_hits + l2_miss), "100%"),
+ Metric("lpm_dtlb_l2_miss",
+ "DTLB L2 misses as percentage of all DTLB L2 accesses.",
+ d_ratio(l2_miss, l2_hits + l2_miss), "100%"),
+ Metric("lpm_dtlb_l2_reqs", "DTLB L2 accesses per second.", l2_r,
+ "insns/s"),
+ MetricGroup("lpm_dtlb_l2_4kb", [
+ Metric(
+ "lpm_dtlb_l2_4kb_hits",
+ "DTLB L2 4kb page size hits as percentage of all DTLB L2 4kb "
+ "accesses.", d_ratio(d_h4k, d_h4k + d_m4k), "100%"),
+ Metric(
+ "lpm_dtlb_l2_4kb_miss",
+ "DTLB L2 4kb page size misses as percentage of all DTLB L2 4kb "
+ "accesses.", d_ratio(d_m4k, d_h4k + d_m4k), "100%")
+ ]),
+ MetricGroup("lpm_dtlb_l2_coalesced", [
+ Metric(
+ "lpm_dtlb_l2_coal_hits",
+ "DTLB L2 coalesced page (16kb) hits as percentage of all DTLB "
+ "L2 coalesced accesses.", d_ratio(d_hcoal,
+ d_hcoal + d_mcoal), "100%"),
+ Metric(
+ "lpm_dtlb_l2_coal_miss",
+ "DTLB L2 coalesced page (16kb) misses as percentage of all "
+ "DTLB L2 coalesced accesses.",
+ d_ratio(d_mcoal, d_hcoal + d_mcoal), "100%")
+ ]),
+ MetricGroup("lpm_dtlb_l2_2mb", [
+ Metric(
+ "lpm_dtlb_l2_2mb_hits",
+ "DTLB L2 2mb page size hits as percentage of all DTLB L2 2mb "
+ "accesses.", d_ratio(d_h2m, d_h2m + d_m2m), "100%"),
+ Metric(
+ "lpm_dtlb_l2_2mb_miss",
+ "DTLB L2 2mb page size misses as percentage of all DTLB L2 2mb "
+ "accesses.", d_ratio(d_m2m, d_h2m + d_m2m), "100%")
+ ]),
+ MetricGroup("lpm_dtlb_l2_1g", [
+ Metric(
+ "lpm_dtlb_l2_1g_hits",
+ "DTLB L2 1gb page size hits as percentage of all DTLB L2 1gb "
+ "accesses.", d_ratio(d_h1g, d_h1g + d_m1g), "100%"),
+ Metric(
+ "lpm_dtlb_l2_1g_miss",
+ "DTLB L2 1gb page size misses as percentage of all DTLB L2 "
+ "1gb accesses.", d_ratio(d_m1g, d_h1g + d_m1g), "100%")
+ ]),
+ ]),
+ MetricGroup("lpm_dtlb_walks", [
+ Metric("lpm_dtlb_walks_reqs", "DTLB page table walks per second.",
+ walks_r, "walks/s"),
+ ]),
+ ], description="Data TLB metrics")
+
+
def AmdItlb():
global _zen_model
l2h = Event("bp_l1_tlb_miss_l2_tlb_hit", "bp_l1_tlb_miss_l2_hit")
@@ -236,6 +346,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
AmdBr(),
+ AmdDtlb(),
AmdItlb(),
AmdUpc(),
Idle(),
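The coalesced-page reload events only exist on zen2 and later, so the
patch substitutes the integer 0 for them on zen1, keeping the hit/miss
sums well formed. A sketch of that pattern (names are illustrative):

```python
def l2_dtlb_hits(h4k: int, h2m: int, h1g: int, hcoal: int = 0) -> int:
    # hcoal defaults to 0 on models lacking coalesced-page events,
    # mirroring 'Event(...) if _zen_model >= 2 else 0' in the patch.
    return h4k + hcoal + h2m + h1g
```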
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 09/35] perf jevents: Add uncore l3 metric group for AMD
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (7 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 08/35] perf jevents: Add dtlb " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 10/35] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
` (27 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Metrics use the amd_l3 PMU for access/miss/hit information.
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index a4ff88de08b5..d71997177239 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -317,6 +317,24 @@ def Rapl() -> MetricGroup:
description="Processor socket power consumption estimates")
+def UncoreL3():
+ acc = Event("l3_lookup_state.all_coherent_accesses_to_l3",
+ "l3_lookup_state.all_l3_req_typs")
+ miss = Event("l3_lookup_state.l3_miss",
+ "l3_comb_clstr_state.request_miss")
+ acc = max(acc, miss)
+ hits = acc - miss
+
+ return MetricGroup("lpm_l3", [
+ Metric("lpm_l3_accesses", "L3 victim cache accesses",
+ d_ratio(acc, interval_sec), "accesses/sec"),
+ Metric("lpm_l3_hits", "L3 victim cache hit rate",
+ d_ratio(hits, acc), "100%"),
+ Metric("lpm_l3_miss", "L3 victim cache miss rate", d_ratio(miss, acc),
+ "100%"),
+ ], description="L3 cache breakdown per CCX")
+
+
def main() -> None:
global _args
global _zen_model
@@ -351,6 +369,7 @@ def main() -> None:
AmdUpc(),
Idle(),
Rapl(),
+ UncoreL3(),
])
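Because the access and miss counts come from separate events, a sample
can momentarily report more misses than accesses; the max() above clamps
accesses so the derived hit count never goes negative. A standalone
sketch of that clamp:

```python
def l3_hit_count(accesses: int, misses: int) -> int:
    # Clamp accesses to at least misses before subtracting so a racy
    # sample cannot produce a negative hit count.
    return max(accesses, misses) - misses
```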
if _args.metricgroups:
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 10/35] perf jevents: Add load store breakdown metrics ldst for AMD
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (8 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 09/35] perf jevents: Add uncore l3 " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 11/35] perf jevents: Add context switch metrics " Ian Rogers
` (26 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Give a breakdown of the number of load and store instructions. Use the
counter mask (cmask) to show the number of cycles taken to retire them.
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 75 ++++++++++++++++++++++++++++
1 file changed, 75 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index d71997177239..b3de74babe40 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -280,6 +280,80 @@ def AmdItlb():
], description="Instruction TLB breakdown")
+def AmdLdSt() -> MetricGroup:
+ ldst_ld = Event("ls_dispatch.pure_ld", "ls_dispatch.ld_dispatch")
+ ldst_st = Event("ls_dispatch.pure_st", "ls_dispatch.store_dispatch")
+ ldst_ldc1 = Event(f"{ldst_ld}/cmask=1/")
+ ldst_stc1 = Event(f"{ldst_st}/cmask=1/")
+ ldst_ldc2 = Event(f"{ldst_ld}/cmask=2/")
+ ldst_stc2 = Event(f"{ldst_st}/cmask=2/")
+ ldst_ldc3 = Event(f"{ldst_ld}/cmask=3/")
+ ldst_stc3 = Event(f"{ldst_st}/cmask=3/")
+ ldst_cyc = Event("ls_not_halted_cyc")
+
+ ld_rate = d_ratio(ldst_ld, interval_sec)
+ st_rate = d_ratio(ldst_st, interval_sec)
+
+ ld_v1 = max(ldst_ldc1 - ldst_ldc2, 0)
+ ld_v2 = max(ldst_ldc2 - ldst_ldc3, 0)
+ ld_v3 = ldst_ldc3
+
+ st_v1 = max(ldst_stc1 - ldst_stc2, 0)
+ st_v2 = max(ldst_stc2 - ldst_stc3, 0)
+ st_v3 = ldst_stc3
+
+ return MetricGroup("lpm_ldst", [
+ MetricGroup("lpm_ldst_total", [
+ Metric("lpm_ldst_total_ld", "Number of loads dispatched per second.",
+ ld_rate, "insns/sec"),
+ Metric("lpm_ldst_total_st", "Number of stores dispatched per second.",
+ st_rate, "insns/sec"),
+ ]),
+ MetricGroup("lpm_ldst_percent_insn", [
+ Metric("lpm_ldst_percent_insn_ld",
+ "Load instructions as a percentage of all instructions.",
+ d_ratio(ldst_ld, ins), "100%"),
+ Metric("lpm_ldst_percent_insn_st",
+ "Store instructions as a percentage of all instructions.",
+ d_ratio(ldst_st, ins), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_loads_per_cycle", [
+ Metric(
+ "lpm_ldst_ret_loads_per_cycle_1",
+ "Load instructions retiring in 1 cycle as a percentage of all "
+ "unhalted cycles.", d_ratio(ld_v1, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_loads_per_cycle_2",
+ "Load instructions retiring in 2 cycles as a percentage of all "
+ "unhalted cycles.", d_ratio(ld_v2, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_loads_per_cycle_3",
+ "Load instructions retiring in 3 or more cycles as a percentage "
+ "of all unhalted cycles.", d_ratio(ld_v3, ldst_cyc), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_stores_per_cycle", [
+ Metric(
+ "lpm_ldst_ret_stores_per_cycle_1",
+ "Store instructions retiring in 1 cycle as a percentage of all "
+ "unhalted cycles.", d_ratio(st_v1, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_stores_per_cycle_2",
+ "Store instructions retiring in 2 cycles as a percentage of all "
+ "unhalted cycles.", d_ratio(st_v2, ldst_cyc), "100%"),
+ Metric(
+ "lpm_ldst_ret_stores_per_cycle_3",
+ "Store instructions retiring in 3 or more cycles as a percentage "
+ "of all unhalted cycles.", d_ratio(st_v3, ldst_cyc), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_insn_bt", [
+ Metric("lpm_ldst_insn_bt_ld", "Number of instructions between loads.",
+ d_ratio(ins, ldst_ld), "insns"),
+ Metric("lpm_ldst_insn_bt_st", "Number of instructions between stores.",
+ d_ratio(ins, ldst_st), "insns"),
+ ])
+ ], description="Breakdown of load/store instructions")
+
+
def AmdUpc() -> Metric:
ops = Event("ex_ret_ops", "ex_ret_cops")
upc = d_ratio(ops, smt_cycles)
@@ -366,6 +440,7 @@ def main() -> None:
AmdBr(),
AmdDtlb(),
AmdItlb(),
+ AmdLdSt(),
AmdUpc(),
Idle(),
Rapl(),
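A counter with cmask=N increments on cycles where at least N events
occur, so subtracting adjacent cmask counts isolates cycles with exactly
N events, with the top bucket meaning "N or more". A sketch of the
bucketing behind the per-cycle load/store metrics (hypothetical helper
name):

```python
def per_cycle_buckets(c1: int, c2: int, c3: int) -> tuple[int, int, int]:
    # c1/c2/c3 are cmask=1/2/3 counts; differences give cycles with
    # exactly 1 or exactly 2 dispatches, c3 covers 3 or more.
    return max(c1 - c2, 0), max(c2 - c3, 0), c3
```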
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 11/35] perf jevents: Add context switch metrics for AMD
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (9 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 10/35] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 12/35] perf jevents: Add RAPL metrics for all Intel models Ian Rogers
` (25 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The metrics break down context switches by the kinds of instructions
retired between them.
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/amd_metrics.py | 33 ++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index b3de74babe40..83e77ccc059e 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -122,6 +122,38 @@ def AmdBr():
description="breakdown of retired branch instructions")
+def AmdCtxSw() -> MetricGroup:
+ cs = Event("context\\-switches")
+ metrics = [
+ Metric("lpm_cs_rate", "Context switches per second",
+ d_ratio(cs, interval_sec), "ctxsw/s")
+ ]
+
+ ev = Event("instructions")
+ metrics.append(Metric("lpm_cs_instr", "Instructions per context switch",
+ d_ratio(ev, cs), "instr/cs"))
+
+ ev = Event("cycles")
+ metrics.append(Metric("lpm_cs_cycles", "Cycles per context switch",
+ d_ratio(ev, cs), "cycles/cs"))
+
+ ev = Event("ls_dispatch.pure_ld", "ls_dispatch.ld_dispatch")
+ metrics.append(Metric("lpm_cs_loads", "Loads per context switch",
+ d_ratio(ev, cs), "loads/cs"))
+
+ ev = Event("ls_dispatch.pure_st", "ls_dispatch.store_dispatch")
+ metrics.append(Metric("lpm_cs_stores", "Stores per context switch",
+ d_ratio(ev, cs), "stores/cs"))
+
+ ev = Event("ex_ret_brn_tkn")
+ metrics.append(Metric("lpm_cs_br_taken", "Branches taken per context switch",
+ d_ratio(ev, cs), "br_taken/cs"))
+
+ return MetricGroup("lpm_cs", metrics,
+ description=("Number of context switches per second, instructions "
+ "retired & core cycles between context switches"))
+
+
def AmdDtlb() -> Optional[MetricGroup]:
global _zen_model
if _zen_model >= 4:
@@ -438,6 +470,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
AmdBr(),
+ AmdCtxSw(),
AmdDtlb(),
AmdItlb(),
AmdLdSt(),
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 12/35] perf jevents: Add RAPL metrics for all Intel models
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (10 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 11/35] perf jevents: Add context switch metrics " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 13/35] perf jevents: Add idle metric for " Ian Rogers
` (24 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add a 'cpu_power' metric group that computes the power consumption
from RAPL events if they are present.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 44 ++++++++++++++++++++++++--
1 file changed, 41 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index b287ef115193..61778deedfff 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,12 +1,48 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
+import math
import os
-from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
- MetricGroup)
+from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+ LoadEvents, Metric, MetricGroup, Select)
# Global command line arguments.
_args = None
+interval_sec = Event("duration_time")
+
+
+def Rapl() -> MetricGroup:
+ """Processor power consumption estimate.
+
+ Use events from the running average power limit (RAPL) driver.
+ """
+ # Watts = joules/second
+ pkg = Event("power/energy\\-pkg/")
+ cond_pkg = Select(pkg, has_event(pkg), math.nan)
+ cores = Event("power/energy\\-cores/")
+ cond_cores = Select(cores, has_event(cores), math.nan)
+ ram = Event("power/energy\\-ram/")
+ cond_ram = Select(ram, has_event(ram), math.nan)
+ gpu = Event("power/energy\\-gpu/")
+ cond_gpu = Select(gpu, has_event(gpu), math.nan)
+ psys = Event("power/energy\\-psys/")
+ cond_psys = Select(psys, has_event(psys), math.nan)
+ scale = 2.3283064365386962890625e-10
+ metrics = [
+ Metric("lpm_cpu_power_pkg", "",
+ d_ratio(cond_pkg * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_cores", "",
+ d_ratio(cond_cores * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_ram", "",
+ d_ratio(cond_ram * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_gpu", "",
+ d_ratio(cond_gpu * scale, interval_sec), "Watts"),
+ Metric("lpm_cpu_power_psys", "",
+ d_ratio(cond_psys * scale, interval_sec), "Watts"),
+ ]
+
+ return MetricGroup("lpm_cpu_power", metrics,
+ description="Running Average Power Limit (RAPL) power consumption estimates")
def main() -> None:
@@ -33,7 +69,9 @@ def main() -> None:
directory = f"{_args.events_path}/x86/{_args.model}/"
LoadEvents(directory)
- all_metrics = MetricGroup("", [])
+ all_metrics = MetricGroup("", [
+ Rapl(),
+ ])
if _args.metricgroups:
print(JsonEncodeMetricGroupDescriptions(all_metrics))
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 13/35] perf jevents: Add idle metric for Intel models
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (11 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 12/35] perf jevents: Add RAPL metrics for all Intel models Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 14/35] perf jevents: Add CheckPmu to see if a PMU is in loaded json events Ian Rogers
` (23 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Use the msr PMU to compute the percentage of wallclock cycles where the
CPUs are in a low power state.
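The metric the patch generates is max(tsc - mperf, 0) / tsc: MPERF counts
only while the CPU is in C0, so the TSC cycles it misses were spent in C1
or a deeper sleep state. A standalone sketch with made-up counter values:

```python
def idle_percent(mperf: int, tsc: int) -> float:
    """Fraction of wallclock (TSC) cycles where the CPU was in a low
    power state, i.e. where MPERF was not counting (C1 or deeper)."""
    low = max(tsc - mperf, 0)  # clamp in case of counter skew
    return low / tsc

# Hypothetical counters: CPU unhalted for 60 of 100 TSC cycles.
assert idle_percent(60, 100) == 0.4
# MPERF can transiently read above TSC; the clamp keeps the result sane.
assert idle_percent(105, 100) == 0.0
```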
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 61778deedfff..0cb7a38ad238 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -3,14 +3,25 @@
import argparse
import math
import os
-from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
- LoadEvents, Metric, MetricGroup, Select)
+from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+ JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
+ MetricGroup, Select)
# Global command line arguments.
_args = None
interval_sec = Event("duration_time")
+def Idle() -> Metric:
+ cyc = Event("msr/mperf/")
+ tsc = Event("msr/tsc/")
+ low = max(tsc - cyc, 0)
+ return Metric(
+ "lpm_idle",
+ "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
+ d_ratio(low, tsc), "100%")
+
+
def Rapl() -> MetricGroup:
"""Processor power consumption estimate.
@@ -70,6 +81,7 @@ def main() -> None:
LoadEvents(directory)
all_metrics = MetricGroup("", [
+ Idle(),
Rapl(),
])
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 14/35] perf jevents: Add CheckPmu to see if a PMU is in loaded json events
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (12 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 13/35] perf jevents: Add idle metric for " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 15/35] perf jevents: Add smi metric group for Intel models Ian Rogers
` (22 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
CheckPmu can be used to determine whether hybrid events are present,
allowing hybrid-conditional metrics/events/PMUs to be premised on the
json files rather than hard coded tables.
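As the diff below shows, CheckPmu is just a membership test over the "Unit"
fields collected while the json events are loaded. A self-contained model of
that logic (the event data here is invented for illustration):

```python
# Minimal model of the CheckPmu logic: while loading json event files,
# every "Unit" field is collected into a set; CheckPmu is then a
# membership test on that set.
all_pmus = set()

def load_events(parsed_json_events: list) -> None:
    for x in parsed_json_events:
        if "Unit" in x:
            all_pmus.add(x["Unit"])

def check_pmu(name: str) -> bool:
    return name in all_pmus

# Hypothetical hybrid model event list:
load_events([{"EventName": "CYCLES_P", "Unit": "cpu_core"},
             {"EventName": "CYCLES_A", "Unit": "cpu_atom"}])
assert check_pmu("cpu_core") and not check_pmu("uncore_imc")
```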
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/metric.py | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index e33e163b2815..62d1a1e1d458 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -8,10 +8,12 @@ import re
from enum import Enum
from typing import Dict, List, Optional, Set, Tuple, Union
+all_pmus = set()
all_events = set()
def LoadEvents(directory: str) -> None:
"""Populate a global set of all known events for the purpose of validating Event names"""
+ global all_pmus
global all_events
all_events = {
"context\\-switches",
@@ -26,6 +28,8 @@ def LoadEvents(directory: str) -> None:
if filename.endswith(".json"):
try:
for x in json.load(open(f"{directory}/{filename}")):
+ if "Unit" in x:
+ all_pmus.add(x["Unit"])
if "EventName" in x:
all_events.add(x["EventName"])
elif "ArchStdEvent" in x:
@@ -36,6 +40,10 @@ def LoadEvents(directory: str) -> None:
pass
+def CheckPmu(name: str) -> bool:
+ return name in all_pmus
+
+
def CheckEvent(name: str) -> bool:
"""Check the event name exists in the set of all loaded events"""
global all_events
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 15/35] perf jevents: Add smi metric group for Intel models
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (13 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 14/35] perf jevents: Add CheckPmu to see if a PMU is in loaded json events Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 16/35] perf jevents: Mark metrics with experimental events as experimental Ian Rogers
` (21 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Allow duplicated metrics to be dropped from the json files.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 0cb7a38ad238..94604b1b07d8 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -3,9 +3,9 @@
import argparse
import math
import os
-from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, Select)
+ MetricGroup, MetricRef, Select)
# Global command line arguments.
_args = None
@@ -56,6 +56,25 @@ def Rapl() -> MetricGroup:
description="Running Average Power Limit (RAPL) power consumption estimates")
+def Smi() -> MetricGroup:
+ pmu = "<cpu_core or cpu_atom>" if CheckPmu("cpu_core") else "cpu"
+ aperf = Event('msr/aperf/')
+ cycles = Event('cycles')
+ smi_num = Event('msr/smi/')
+ smi_cycles = Select(Select((aperf - cycles) / aperf, smi_num > 0, 0),
+ has_event(aperf),
+ 0)
+ return MetricGroup('smi', [
+ Metric('smi_num', 'Number of SMI interrupts.',
+ Select(smi_num, has_event(smi_num), 0), 'SMI#'),
+ # Note, the smi_cycles "Event" is really a reference to the metric.
+ Metric('smi_cycles',
+ 'Percentage of cycles spent in System Management Interrupts. '
+ f'Requires /sys/bus/event_source/devices/{pmu}/freeze_on_smi to be 1.',
+ smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
+ ], description='System Management Interrupt metrics')
+
+
def main() -> None:
global _args
@@ -83,6 +102,7 @@ def main() -> None:
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
+ Smi(),
])
if _args.metricgroups:
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 16/35] perf jevents: Mark metrics with experimental events as experimental
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (14 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 15/35] perf jevents: Add smi metric group for Intel models Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 17/35] perf jevents: Add tsx metric group for Intel models Ian Rogers
` (20 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
When metrics are made with experimental events, it is desirable that the
metric description also carry this information in case of metric
inaccuracies.
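A simplified sketch of the approach: recursively test whether any event in a
metric's expression tree is experimental and, if so, append a note to the
description. The tree representation here is illustrative; the real patch
adds a HasExperimentalEvents method to each Expression subclass:

```python
# Simplified sketch: leaves are event-name strings, operator nodes are
# tuples of children. The names below are illustrative, not a real API.
EXPERIMENTAL = {"UOPS_FOO.BAR"}

def has_experimental(expr) -> bool:
    if isinstance(expr, str):          # leaf: an event name
        return expr in EXPERIMENTAL
    return any(has_experimental(e) for e in expr)  # operator node

def describe(description: str, expr) -> str:
    if has_experimental(expr):
        description += (" (metric should be considered experimental"
                        " as it contains experimental events).")
    return description

assert "experimental" in describe("uop mix", ("UOPS_FOO.BAR", "cycles"))
assert describe("ipc", ("instructions", "cycles")) == "ipc"
```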
Suggested-by: Perry Taylor <perry.taylor@intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/metric.py | 44 +++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 62d1a1e1d458..2029b6e28365 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -10,11 +10,13 @@ from typing import Dict, List, Optional, Set, Tuple, Union
all_pmus = set()
all_events = set()
+experimental_events = set()
def LoadEvents(directory: str) -> None:
"""Populate a global set of all known events for the purpose of validating Event names"""
global all_pmus
global all_events
+ global experimental_events
all_events = {
"context\\-switches",
"cpu\\-cycles",
@@ -32,6 +34,8 @@ def LoadEvents(directory: str) -> None:
all_pmus.add(x["Unit"])
if "EventName" in x:
all_events.add(x["EventName"])
+ if "Experimental" in x and x["Experimental"] == "1":
+ experimental_events.add(x["EventName"])
elif "ArchStdEvent" in x:
all_events.add(x["ArchStdEvent"])
except json.decoder.JSONDecodeError:
@@ -61,6 +65,18 @@ def CheckEvent(name: str) -> bool:
return name in all_events
+def IsExperimentalEvent(name: str) -> bool:
+ global experimental_events
+ if ':' in name:
+ # Remove trailing modifier.
+ name = name[:name.find(':')]
+ elif '/' in name:
+ # Name could begin with a PMU or an event, for now assume it is not experimental.
+ return False
+
+ return name in experimental_events
+
+
class MetricConstraint(Enum):
GROUPED_EVENTS = 0
NO_GROUP_EVENTS = 1
@@ -82,6 +98,10 @@ class Expression:
"""Returns a simplified version of self."""
raise NotImplementedError()
+ def HasExperimentalEvents(self) -> bool:
+ """Are experimental events used in the expression?"""
+ raise NotImplementedError()
+
def Equals(self, other) -> bool:
"""Returns true when two expressions are the same."""
raise NotImplementedError()
@@ -249,6 +269,9 @@ class Operator(Expression):
return Operator(self.operator, lhs, rhs)
+ def HasExperimentalEvents(self) -> bool:
+ return self.lhs.HasExperimentalEvents() or self.rhs.HasExperimentalEvents()
+
def Equals(self, other: Expression) -> bool:
if isinstance(other, Operator):
return self.operator == other.operator and self.lhs.Equals(
@@ -297,6 +320,10 @@ class Select(Expression):
return Select(true_val, cond, false_val)
+ def HasExperimentalEvents(self) -> bool:
+ return (self.cond.HasExperimentalEvents() or self.true_val.HasExperimentalEvents() or
+ self.false_val.HasExperimentalEvents())
+
def Equals(self, other: Expression) -> bool:
if isinstance(other, Select):
return self.cond.Equals(other.cond) and self.false_val.Equals(
@@ -345,6 +372,9 @@ class Function(Expression):
return Function(self.fn, lhs, rhs)
+ def HasExperimentalEvents(self) -> bool:
+ return self.lhs.HasExperimentalEvents() or (self.rhs and self.rhs.HasExperimentalEvents())
+
def Equals(self, other: Expression) -> bool:
if isinstance(other, Function):
result = self.fn == other.fn and self.lhs.Equals(other.lhs)
@@ -384,6 +414,9 @@ class Event(Expression):
global all_events
raise Exception(f"No event {error} in:\n{all_events}")
+ def HasExperimentalEvents(self) -> bool:
+ return IsExperimentalEvent(self.name)
+
def ToPerfJson(self):
result = re.sub('/', '@', self.name)
return result
@@ -416,6 +449,9 @@ class MetricRef(Expression):
def Simplify(self) -> Expression:
return self
+ def HasExperimentalEvents(self) -> bool:
+ return False
+
def Equals(self, other: Expression) -> bool:
return isinstance(other, MetricRef) and self.name == other.name
@@ -443,6 +479,9 @@ class Constant(Expression):
def Simplify(self) -> Expression:
return self
+ def HasExperimentalEvents(self) -> bool:
+ return False
+
def Equals(self, other: Expression) -> bool:
return isinstance(other, Constant) and self.value == other.value
@@ -465,6 +504,9 @@ class Literal(Expression):
def Simplify(self) -> Expression:
return self
+ def HasExperimentalEvents(self) -> bool:
+ return False
+
def Equals(self, other: Expression) -> bool:
return isinstance(other, Literal) and self.value == other.value
@@ -527,6 +569,8 @@ class Metric:
self.name = name
self.description = description
self.expr = expr.Simplify()
+ if self.expr.HasExperimentalEvents():
+ self.description += " (metric should be considered experimental as it contains experimental events)."
# Workraound valid_only_metric hiding certain metrics based on unit.
scale_unit = scale_unit.replace('/sec', ' per sec')
if scale_unit[0].isdigit():
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 17/35] perf jevents: Add tsx metric group for Intel models
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (15 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 16/35] perf jevents: Mark metrics with experimental events as experimental Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 18/35] perf jevents: Add br metric group for branch statistics on Intel Ian Rogers
` (19 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Allow duplicated metrics to be dropped from the json files. Detect when
TSX is supported by a model using the json events, but use sysfs events
at runtime as hypervisors, etc. may disable TSX.
Add CheckPmu to metric to determine which PMUs have been associated
with the loaded events.
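The probing idiom in the patch relies on Event() raising an exception when a
name is absent from the loaded json, and on a second assignment letting the
sysfs spelling win when both resolve. A self-contained sketch with a
stand-in event table (the known-event set is invented for illustration):

```python
# Stand-in for metric.py's Event(): raises when the name isn't in the
# loaded event set, so `try: Event(...)` doubles as a support probe.
KNOWN = {"RTM_RETIRED.START", "cpu/tx-start/"}

def event(name: str) -> str:
    if name not in KNOWN:
        raise Exception(f"No event {name}")
    return name

def tsx_start_event():
    try:
        # Probe the json name first, then prefer the sysfs spelling so
        # TSX being disabled (e.g. by a hypervisor) is caught at runtime.
        start = event("RTM_RETIRED.START")
        start = event("cpu/tx-start/")
        return start
    except Exception:
        return None    # model doesn't support TSX: emit no metrics

assert tsx_start_event() == "cpu/tx-start/"
```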
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 50 ++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 94604b1b07d8..05f3d94ec5d5 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -3,6 +3,7 @@
import argparse
import math
import os
+from typing import Optional
from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
MetricGroup, MetricRef, Select)
@@ -75,6 +76,54 @@ def Smi() -> MetricGroup:
], description='System Management Interrupt metrics')
+def Tsx() -> Optional[MetricGroup]:
+ pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
+ cycles = Event('cycles')
+ cycles_in_tx = Event(f'{pmu}/cycles\\-t/')
+ cycles_in_tx_cp = Event(f'{pmu}/cycles\\-ct/')
+ try:
+ # Test if the tsx event is present in the json, prefer the
+ # sysfs version so that we can detect its presence at runtime.
+ transaction_start = Event("RTM_RETIRED.START")
+ transaction_start = Event(f'{pmu}/tx\\-start/')
+ except:
+ return None
+
+ elision_start = None
+ try:
+ # Elision start isn't supported by all models, but we'll not
+ # generate the tsx_cycles_per_elision metric in that
+ # case. Again, prefer the sysfs encoding of the event.
+ elision_start = Event("HLE_RETIRED.START")
+ elision_start = Event(f'{pmu}/el\\-start/')
+ except:
+ pass
+
+ return MetricGroup('transaction', [
+ Metric('tsx_transactional_cycles',
+ 'Percentage of cycles within a transaction region.',
+ Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
+ '100%'),
+ Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
+ Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
+ has_event(cycles_in_tx),
+ 0),
+ '100%'),
+ Metric('tsx_cycles_per_transaction',
+ 'Number of cycles within a transaction divided by the number of transactions.',
+ Select(cycles_in_tx / transaction_start,
+ has_event(cycles_in_tx),
+ 0),
+ "cycles / transaction"),
+ Metric('tsx_cycles_per_elision',
+ 'Number of cycles within a transaction divided by the number of elisions.',
+ Select(cycles_in_tx / elision_start,
+ has_event(elision_start),
+ 0),
+ "cycles / elision") if elision_start else None,
+ ], description="Breakdown of transactional memory statistics")
+
+
def main() -> None:
global _args
@@ -103,6 +152,7 @@ def main() -> None:
Idle(),
Rapl(),
Smi(),
+ Tsx(),
])
if _args.metricgroups:
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 18/35] perf jevents: Add br metric group for branch statistics on Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (16 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 17/35] perf jevents: Add tsx metric group for Intel models Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 19/35] perf jevents: Add software prefetch (swpf) metric group for Intel Ian Rogers
` (18 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The br metric group for branches comprises metric groups for total,
taken, conditional, fused and far branches, built using json
events. Conditional taken and not-taken metrics are specific to
Icelake and later generations, so the presence of the events is used
to determine whether the metrics should exist.
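The presence test works the same way as the other probes: constructing an
Event() for the Icelake-only COND_NTAKEN events either succeeds or raises,
and the taken/not-taken split is emitted only on success. A stand-in sketch
(the known-event set below models a hypothetical pre-Icelake part):

```python
# Sketch of presence-based metric generation with a stand-in event().
KNOWN = {"BR_INST_RETIRED.COND", "BR_MISP_RETIRED.COND"}  # pre-Icelake

def event(name: str) -> str:
    if name not in KNOWN:
        raise Exception(f"No event {name}")
    return name

def conditional_groups() -> list:
    groups = ["lpm_br_cond_tkn"]
    try:
        # Icelake and later only: probe the not-taken events.
        event("BR_INST_RETIRED.COND_NTAKEN")
        event("BR_MISP_RETIRED.COND_NTAKEN")
        groups.append("lpm_br_cond_nt")
    except Exception:
        pass  # older model: emit only the taken metrics
    return groups

assert conditional_groups() == ["lpm_br_cond_tkn"]
```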
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 138 +++++++++++++++++++++++++
1 file changed, 138 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 05f3d94ec5d5..e1944d821248 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -124,6 +124,143 @@ def Tsx() -> Optional[MetricGroup]:
], description="Breakdown of transactional memory statistics")
+def IntelBr():
+ ins = Event("instructions")
+
+ def Total() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
+ "BR_INST_RETIRED.MISPRED",
+ "BR_MISP_EXEC.ANY")
+ br_clr = None
+ try:
+ br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
+ except:
+ pass
+
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_all, br_all)
+ clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
+
+ return MetricGroup("lpm_br_total", [
+ Metric("lpm_br_total_retired",
+ "The number of branch instructions retired per second.", br_r,
+ "insn/s"),
+ Metric(
+ "lpm_br_total_mispred",
+ "The number of branch instructions retired, of any type, that were "
+ "not correctly predicted as a percentage of all branch instructions.",
+ misp_r, "100%"),
+ Metric("lpm_br_total_insn_between_branches",
+ "The number of instructions divided by the number of branches.",
+ ins_r, "insn"),
+ Metric("lpm_br_total_insn_fe_resteers",
+ "The number of resync branches per second.", clr_r, "req/s"
+ ) if clr_r else None
+ ])
+
+ def Taken() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_tk = None
+ try:
+ br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
+ "BR_MISP_RETIRED.TAKEN_JCC",
+ "BR_INST_RETIRED.MISPRED_TAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
+ return MetricGroup("lpm_br_taken", [
+ Metric("lpm_br_taken_retired",
+ "The number of taken branches that were retired per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_taken_mispred",
+ "The number of retired taken branch instructions that were "
+ "mispredicted as a percentage of all taken branches.", misp_r,
+ "100%") if misp_r else None,
+ Metric(
+ "lpm_br_taken_insn_between_branches",
+ "The number of instructions divided by the number of taken branches.",
+ ins_r, "insn"),
+ ])
+
+ def Conditional() -> Optional[MetricGroup]:
+ try:
+ br_cond = Event("BR_INST_RETIRED.COND",
+ "BR_INST_RETIRED.CONDITIONAL",
+ "BR_INST_RETIRED.TAKEN_JCC")
+ br_m_cond = Event("BR_MISP_RETIRED.COND",
+ "BR_MISP_RETIRED.CONDITIONAL",
+ "BR_MISP_RETIRED.TAKEN_JCC")
+ except:
+ return None
+
+ br_cond_nt = None
+ br_m_cond_nt = None
+ try:
+ br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
+ br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_cond, interval_sec)
+ ins_r = d_ratio(ins, br_cond)
+ misp_r = d_ratio(br_m_cond, br_cond)
+ taken_metrics = [
+ Metric("lpm_br_cond_retired", "Retired conditional branch instructions.",
+ br_r, "insn/s"),
+ Metric("lpm_br_cond_insn_between_branches",
+ "The number of instructions divided by the number of conditional "
+ "branches.", ins_r, "insn"),
+ Metric("lpm_br_cond_mispred",
+ "Retired conditional branch instructions mispredicted as a "
+ "percentage of all conditional branches.", misp_r, "100%"),
+ ]
+ if not br_m_cond_nt:
+ return MetricGroup("lpm_br_cond", taken_metrics)
+
+ br_r = d_ratio(br_cond_nt, interval_sec)
+ ins_r = d_ratio(ins, br_cond_nt)
+ misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
+
+ not_taken_metrics = [
+ Metric("lpm_br_cond_retired", "Retired conditional not taken branch instructions.",
+ br_r, "insn/s"),
+ Metric("lpm_br_cond_insn_between_branches",
+ "The number of instructions divided by the number of not taken conditional "
+ "branches.", ins_r, "insn"),
+ Metric("lpm_br_cond_mispred",
+ "Retired not taken conditional branch instructions mispredicted as a "
+ "percentage of all not taken conditional branches.", misp_r, "100%"),
+ ]
+ return MetricGroup("lpm_br_cond", [
+ MetricGroup("lpm_br_cond_nt", not_taken_metrics),
+ MetricGroup("lpm_br_cond_tkn", taken_metrics),
+ ])
+
+ def Far() -> Optional[MetricGroup]:
+ try:
+ br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
+ except:
+ return None
+
+ br_r = d_ratio(br_far, interval_sec)
+ ins_r = d_ratio(ins, br_far)
+ return MetricGroup("lpm_br_far", [
+ Metric("lpm_br_far_retired", "Retired far control transfers per second.",
+ br_r, "insn/s"),
+ Metric(
+ "lpm_br_far_insn_between_branches",
+ "The number of instructions divided by the number of far branches.",
+ ins_r, "insn"),
+ ])
+
+ return MetricGroup("lpm_br", [Total(), Taken(), Conditional(), Far()],
+ description="breakdown of retired branch instructions")
+
+
def main() -> None:
global _args
@@ -153,6 +290,7 @@ def main() -> None:
Rapl(),
Smi(),
Tsx(),
+ IntelBr(),
])
if _args.metricgroups:
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 19/35] perf jevents: Add software prefetch (swpf) metric group for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (17 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 18/35] perf jevents: Add br metric group for branch statistics on Intel Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 20/35] perf jevents: Add ports metric group giving utilization on Intel Ian Rogers
` (17 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that break down software prefetch instruction use.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 66 ++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index e1944d821248..919a058c343a 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -261,6 +261,71 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelSwpf() -> Optional[MetricGroup]:
+ ins = Event("instructions")
+ try:
+ s_ld = Event("MEM_INST_RETIRED.ALL_LOADS",
+ "MEM_UOPS_RETIRED.ALL_LOADS")
+ s_nta = Event("SW_PREFETCH_ACCESS.NTA")
+ s_t0 = Event("SW_PREFETCH_ACCESS.T0")
+ s_t1 = Event("SW_PREFETCH_ACCESS.T1_T2")
+ s_w = Event("SW_PREFETCH_ACCESS.PREFETCHW")
+ except:
+ return None
+
+ all_sw = s_nta + s_t0 + s_t1 + s_w
+ swp_r = d_ratio(all_sw, interval_sec)
+ ins_r = d_ratio(ins, all_sw)
+ ld_r = d_ratio(s_ld, all_sw)
+
+ return MetricGroup("lpm_swpf", [
+ MetricGroup("lpm_swpf_totals", [
+ Metric("lpm_swpf_totals_exec", "Software prefetch instructions per second",
+ swp_r, "swpf/s"),
+ Metric("lpm_swpf_totals_insn_per_pf",
+ "Average number of instructions between software prefetches",
+ ins_r, "insn/swpf"),
+ Metric("lpm_swpf_totals_loads_per_pf",
+ "Average number of loads between software prefetches",
+ ld_r, "loads/swpf"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn", [
+ MetricGroup("lpm_swpf_bkdwn_nta", [
+ Metric("lpm_swpf_bkdwn_nta_per_swpf",
+ "Software prefetch NTA instructions as a percent of all prefetch instructions",
+ d_ratio(s_nta, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_nta_rate",
+ "Software prefetch NTA instructions per second",
+ d_ratio(s_nta, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn_t0", [
+ Metric("lpm_swpf_bkdwn_t0_per_swpf",
+ "Software prefetch T0 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t0, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_t0_rate",
+ "Software prefetch T0 instructions per second",
+ d_ratio(s_t0, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn_t1_t2", [
+ Metric("lpm_swpf_bkdwn_t1_t2_per_swpf",
+ "Software prefetch T1 or T2 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t1, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_t1_t2_rate",
+ "Software prefetch T1 or T2 instructions per second",
+ d_ratio(s_t1, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("lpm_swpf_bkdwn_w", [
+ Metric("lpm_swpf_bkdwn_w_per_swpf",
+ "Software prefetch W instructions as a percent of all prefetch instructions",
+ d_ratio(s_w, all_sw), "100%"),
+ Metric("lpm_swpf_bkdwn_w_rate",
+ "Software prefetch W instructions per second",
+ d_ratio(s_w, interval_sec), "insn/s"),
+ ]),
+ ]),
+ ], description="Software prefetch instruction breakdown")
+
+
def main() -> None:
global _args
@@ -291,6 +356,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelSwpf(),
])
if _args.metricgroups:
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 20/35] perf jevents: Add ports metric group giving utilization on Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (18 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 19/35] perf jevents: Add software prefetch (swpf) metric group for Intel Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 21/35] perf jevents: Add L2 metrics for Intel Ian Rogers
` (16 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
The ports metric group contains a metric for each port giving its
utilization as a ratio of cycles. The metrics are created by looking
for UOPS_DISPATCHED.PORT events.
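A sketch of the discovery loop: scan the model's pipeline.json entries for
UOPS_DISPATCHED.PORT events and derive a lowercase metric name per port.
The event list here is a made-up fragment, and the SMT-aware cycle
denominator from the patch is elided:

```python
import re

# Made-up pipeline.json fragment, not a real model file.
pipeline_events = [
    {"EventName": "UOPS_DISPATCHED.PORT_0"},
    {"EventName": "UOPS_DISPATCHED.PORT_2_3"},
    {"EventName": "INST_RETIRED.ANY"},   # ignored: not a port event
]

metrics = []
for x in pipeline_events:
    name = x.get("EventName", "")
    if re.search("^UOPS_DISPATCHED.PORT", name):
        port = re.search(r"(PORT_[0-9].*)", name).group(0).lower()
        # The patch divides the event by core or SMT-scaled cycles
        # here; this sketch only records the derived metric name.
        metrics.append(f"lpm_{port}")

assert metrics == ["lpm_port_0", "lpm_port_2_3"]
```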
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 35 ++++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 919a058c343a..7fcc0a1c544d 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,12 +1,14 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
import argparse
+import json
import math
import os
+import re
from typing import Optional
from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, MetricRef, Select)
+ JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
+ Metric, MetricGroup, MetricRef, Select)
# Global command line arguments.
_args = None
@@ -261,6 +263,34 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelPorts() -> Optional[MetricGroup]:
+ pipeline_events = json.load(
+ open(f"{_args.events_path}/x86/{_args.model}/pipeline.json"))
+
+ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
+ "CPU_CLK_UNHALTED.DISTRIBUTED",
+ "cycles")
+ # Number of CPU cycles scaled for SMT.
+ smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
+
+ metrics = []
+ for x in pipeline_events:
+ if "EventName" in x and re.search(r"^UOPS_DISPATCHED\.PORT", x["EventName"]):
+ name = x["EventName"]
+ port = re.search(r"(PORT_[0-9].*)", name).group(0).lower()
+ if name.endswith("_CORE"):
+ cyc = core_cycles
+ else:
+ cyc = smt_cycles
+ metrics.append(Metric(f"lpm_{port}", f"{port} utilization (higher is better)",
+ d_ratio(Event(name), cyc), "100%"))
+ if len(metrics) == 0:
+ return None
+
+ return MetricGroup("lpm_ports", metrics, "functional unit (port) utilization -- "
+ "fraction of cycles each port is utilized (higher is better)")
+
+
def IntelSwpf() -> Optional[MetricGroup]:
ins = Event("instructions")
try:
@@ -356,6 +386,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelPorts(),
IntelSwpf(),
])
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 21/35] perf jevents: Add L2 metrics for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (19 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 20/35] perf jevents: Add ports metric group giving utilization on Intel Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 22/35] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
` (15 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Give a breakdown of various L2 counters as metrics, including totals,
reads, hardware prefetcher, RFO, code and evictions.
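The demand-read ratios below cope with two event layouts: newer models
expose HIT and MISS counts directly, older ones expose HIT plus an ALL
count. A plain-Python sketch of that fallback arithmetic (function name
is illustrative):

```python
from typing import Optional, Tuple

def l2_read_ratios(hits: int, misses: Optional[int],
                   all_reads: Optional[int]) -> Tuple[float, float]:
    """Hit/miss ratios for L2 demand data reads from whichever counters
    a model exposes: HIT+MISS directly, or HIT plus an ALL count."""
    if misses is None:
        misses = all_reads - hits  # older models: derive misses from ALL
    total = hits + misses
    return (hits / total, misses / total) if total else (0.0, 0.0)
```

Either layout yields the same ratios for the same underlying counts.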
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 170 +++++++++++++++++++++++++
1 file changed, 170 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 7fcc0a1c544d..d190d97f4aff 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -263,6 +263,175 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelL2() -> Optional[MetricGroup]:
+ try:
+ DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
+ except:
+ return None
+ try:
+ DC_MISS = Event("L2_RQSTS.DEMAND_DATA_RD_MISS")
+ l2_dmnd_miss = DC_MISS
+ l2_dmnd_rd_all = DC_MISS + DC_HIT
+ except:
+ DC_ALL = Event("L2_RQSTS.ALL_DEMAND_DATA_RD")
+ l2_dmnd_miss = DC_ALL - DC_HIT
+ l2_dmnd_rd_all = DC_ALL
+ l2_dmnd_mrate = d_ratio(l2_dmnd_miss, interval_sec)
+ l2_dmnd_rrate = d_ratio(l2_dmnd_rd_all, interval_sec)
+
+ DC_PFH = None
+ DC_PFM = None
+ l2_pf_all = None
+ l2_pf_mrate = None
+ l2_pf_rrate = None
+ try:
+ DC_PFH = Event("L2_RQSTS.PF_HIT")
+ DC_PFM = Event("L2_RQSTS.PF_MISS")
+ l2_pf_all = DC_PFH + DC_PFM
+ l2_pf_mrate = d_ratio(DC_PFM, interval_sec)
+ l2_pf_rrate = d_ratio(l2_pf_all, interval_sec)
+ except:
+ pass
+
+ DC_RFOH = None
+ DC_RFOM = None
+ l2_rfo_all = None
+ l2_rfo_mrate = None
+ l2_rfo_rrate = None
+ try:
+ DC_RFOH = Event("L2_RQSTS.RFO_HIT")
+ DC_RFOM = Event("L2_RQSTS.RFO_MISS")
+ l2_rfo_all = DC_RFOH + DC_RFOM
+ l2_rfo_mrate = d_ratio(DC_RFOM, interval_sec)
+ l2_rfo_rrate = d_ratio(l2_rfo_all, interval_sec)
+ except:
+ pass
+
+ DC_CH = None
+ try:
+ DC_CH = Event("L2_RQSTS.CODE_RD_HIT")
+ except:
+ pass
+ DC_CM = Event("L2_RQSTS.CODE_RD_MISS")
+ DC_IN = Event("L2_LINES_IN.ALL")
+ DC_WB_U = None
+ DC_WB_D = None
+ wbu = None
+ wbd = None
+ try:
+ DC_WB_U = Event("IDI_MISC.WB_UPGRADE")
+ DC_WB_D = Event("IDI_MISC.WB_DOWNGRADE")
+ wbu = d_ratio(DC_WB_U, interval_sec)
+ wbd = d_ratio(DC_WB_D, interval_sec)
+ except:
+ pass
+ DC_OUT_NS = None
+ DC_OUT_S = None
+ l2_lines_out = None
+ l2_out_rate = None
+ wbn = None
+ isd = None
+ try:
+ DC_OUT_NS = Event("L2_LINES_OUT.NON_SILENT",
+ "L2_LINES_OUT.DEMAND_DIRTY",
+ "L2_LINES_IN.S")
+ DC_OUT_S = Event("L2_LINES_OUT.SILENT",
+ "L2_LINES_OUT.DEMAND_CLEAN",
+ "L2_LINES_IN.I")
+ if DC_OUT_S.name == "L2_LINES_OUT.SILENT" and (
+ _args.model.startswith("skylake") or
+ _args.model == "cascadelakex"):
+ DC_OUT_S.name = "L2_LINES_OUT.SILENT/any/"
+ # bring it back to per-CPU when SMT is on
+ l2_s = Select(DC_OUT_S / 2, Literal("#smt_on"), DC_OUT_S)
+ l2_ns = DC_OUT_NS
+ l2_lines_out = l2_s + l2_ns
+ l2_out_rate = d_ratio(l2_lines_out, interval_sec)
+ nlr = max(l2_ns - DC_WB_U - DC_WB_D, 0) if DC_WB_U else l2_ns
+ wbn = d_ratio(nlr, interval_sec)
+ isd = d_ratio(l2_s, interval_sec)
+ except:
+ pass
+ DC_OUT_U = None
+ l2_pf_useless = None
+ l2_useless_rate = None
+ try:
+ DC_OUT_U = Event("L2_LINES_OUT.USELESS_HWPF")
+ l2_pf_useless = DC_OUT_U
+ l2_useless_rate = d_ratio(l2_pf_useless, interval_sec)
+ except:
+ pass
+
+ l2_lines_in = DC_IN
+ l2_code_all = (DC_CH + DC_CM) if DC_CH else None
+ l2_code_rate = d_ratio(l2_code_all, interval_sec) if DC_CH else None
+ l2_code_miss_rate = d_ratio(DC_CM, interval_sec)
+ l2_in_rate = d_ratio(l2_lines_in, interval_sec)
+
+ return MetricGroup("lpm_l2", [
+ MetricGroup("lpm_l2_totals", [
+ Metric("lpm_l2_totals_in", "L2 cache total in per second",
+ l2_in_rate, "In/s"),
+ Metric("lpm_l2_totals_out", "L2 cache total out per second",
+ l2_out_rate, "Out/s") if l2_out_rate else None,
+ ]),
+ MetricGroup("lpm_l2_rd", [
+ Metric("lpm_l2_rd_hits", "L2 cache data read hits",
+ d_ratio(DC_HIT, l2_dmnd_rd_all), "100%"),
+ Metric("lpm_l2_rd_misses", "L2 cache data read misses",
+ d_ratio(l2_dmnd_miss, l2_dmnd_rd_all), "100%"),
+ Metric("lpm_l2_rd_requests", "L2 cache data read requests per second",
+ l2_dmnd_rrate, "requests/s"),
+ Metric("lpm_l2_rd_miss_rate", "L2 cache data read misses per second",
+ l2_dmnd_mrate, "misses/s"),
+ ]),
+ MetricGroup("lpm_l2_hwpf", [
+ Metric("lpm_l2_hwpf_hits", "L2 cache hardware prefetcher hits",
+ d_ratio(DC_PFH, l2_pf_all), "100%"),
+ Metric("lpm_l2_hwpf_misses", "L2 cache hardware prefetcher misses",
+ d_ratio(DC_PFM, l2_pf_all), "100%"),
+ Metric("lpm_l2_hwpf_useless", "L2 cache hardware prefetcher useless prefetches per second",
+ l2_useless_rate, "pf/s") if l2_useless_rate else None,
+ Metric("lpm_l2_hwpf_requests", "L2 cache hardware prefetcher requests per second",
+ l2_pf_rrate, "requests/s"),
+ Metric("lpm_l2_hwpf_miss_rate", "L2 cache hardware prefetcher misses per second",
+ l2_pf_mrate, "misses/s"),
+ ]) if DC_PFH else None,
+ MetricGroup("lpm_l2_rfo", [
+ Metric("lpm_l2_rfo_hits", "L2 cache request for ownership (RFO) hits",
+ d_ratio(DC_RFOH, l2_rfo_all), "100%"),
+ Metric("lpm_l2_rfo_misses", "L2 cache request for ownership (RFO) misses",
+ d_ratio(DC_RFOM, l2_rfo_all), "100%"),
+ Metric("lpm_l2_rfo_requests", "L2 cache request for ownership (RFO) requests per second",
+ l2_rfo_rrate, "requests/s"),
+ Metric("lpm_l2_rfo_misses", "L2 cache request for ownership (RFO) misses per second",
+ l2_rfo_mrate, "misses/s"),
+ ]) if DC_RFOH else None,
+ MetricGroup("lpm_l2_code", [
+ Metric("lpm_l2_code_hits", "L2 cache code hits",
+ d_ratio(DC_CH, l2_code_all), "100%") if DC_CH else None,
+ Metric("lpm_l2_code_misses", "L2 cache code misses",
+ d_ratio(DC_CM, l2_code_all), "100%") if DC_CH else None,
+ Metric("lpm_l2_code_requests", "L2 cache code requests per second",
+ l2_code_rate, "requests/s") if DC_CH else None,
+ Metric("lpm_l2_code_miss_rate", "L2 cache code misses per second",
+ l2_code_miss_rate, "misses/s"),
+ ]),
+ MetricGroup("lpm_l2_evict", [
+ MetricGroup("lpm_l2_evict_mef_lines", [
+ Metric("lpm_l2_evict_mef_lines_l3_hot_lru", "L2 evictions M/E/F lines L3 hot LRU per second",
+ wbu, "HotLRU/s") if wbu else None,
+ Metric("lpm_l2_evict_mef_lines_l3_norm_lru", "L2 evictions M/E/F lines L3 normal LRU per second",
+ wbn, "NormLRU/s") if wbn else None,
+ Metric("lpm_l2_evict_mef_lines_dropped", "L2 evictions M/E/F lines dropped per second",
+ wbd, "dropped/s") if wbd else None,
+ Metric("lpm_l2_evict_is_lines_dropped", "L2 evictions I/S lines dropped per second",
+ isd, "dropped/s") if isd else None,
+ ]),
+ ]),
+ ], description="L2 data cache analysis")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(
open(f"{_args.events_path}/x86/{_args.model}/pipeline.json"))
@@ -386,6 +555,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelL2(),
IntelPorts(),
IntelSwpf(),
])
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 22/35] perf jevents: Add load store breakdown metrics ldst for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (20 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 21/35] perf jevents: Add L2 metrics for Intel Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 23/35] perf jevents: Add ILP metrics " Ian Rogers
` (14 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Give a breakdown of load and store instructions. Use the counter mask
(cmask) to show the number of cycles taken to retire them.
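The retirement buckets below are built by differencing cmask counts:
/cmask=N/ counts cycles with at least N retirements, so "exactly N" is
cmask=N minus cmask=N+1, clamped at zero. A standalone sketch of that
arithmetic (helper name is illustrative):

```python
from typing import List

def exact_retire_buckets(cmask_counts: List[int]) -> List[int]:
    """Turn 'at least N' cmask counts into 'exactly N' buckets.

    cmask_counts[i] is the count with /cmask=i+1/, i.e. cycles where at
    least i+1 loads (or stores) retired; the final bucket is 'or more'.
    """
    exact = [max(c - n, 0) for c, n in zip(cmask_counts, cmask_counts[1:])]
    exact.append(cmask_counts[-1])  # last bucket stays 'N or more'
    return exact
```

For example, counts of [100, 40, 10] for cmask=1..3 split into 60 cycles
retiring exactly one, 30 exactly two, and 10 with three or more.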
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 87 +++++++++++++++++++++++++-
1 file changed, 86 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index d190d97f4aff..19a284b4c520 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -8,7 +8,7 @@ import re
from typing import Optional
from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
- Metric, MetricGroup, MetricRef, Select)
+ Metric, MetricConstraint, MetricGroup, MetricRef, Select)
# Global command line arguments.
_args = None
@@ -525,6 +525,90 @@ def IntelSwpf() -> Optional[MetricGroup]:
], description="Software prefetch instruction breakdown")
+def IntelLdSt() -> Optional[MetricGroup]:
+ if _args.model in [
+ "bonnell",
+ "nehalemep",
+ "nehalemex",
+ "westmereep-dp",
+ "westmereep-sp",
+ "westmereex",
+ ]:
+ return None
+ LDST_LD = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ LDST_ST = Event("MEM_INST_RETIRED.ALL_STORES",
+ "MEM_UOPS_RETIRED.ALL_STORES")
+ LDST_LDC1 = Event(f"{LDST_LD.name}/cmask=1/")
+ LDST_STC1 = Event(f"{LDST_ST.name}/cmask=1/")
+ LDST_LDC2 = Event(f"{LDST_LD.name}/cmask=2/")
+ LDST_STC2 = Event(f"{LDST_ST.name}/cmask=2/")
+ LDST_LDC3 = Event(f"{LDST_LD.name}/cmask=3/")
+ LDST_STC3 = Event(f"{LDST_ST.name}/cmask=3/")
+ ins = Event("instructions")
+ LDST_CYC = Event("CPU_CLK_UNHALTED.THREAD",
+ "CPU_CLK_UNHALTED.CORE_P",
+ "CPU_CLK_UNHALTED.THREAD_P")
+ LDST_PRE = None
+ try:
+ LDST_PRE = Event("LOAD_HIT_PREFETCH.SWPF", "LOAD_HIT_PRE.SW_PF")
+ except:
+ pass
+ LDST_AT = None
+ try:
+ LDST_AT = Event("MEM_INST_RETIRED.LOCK_LOADS")
+ except:
+ pass
+ cyc = LDST_CYC
+
+ ld_rate = d_ratio(LDST_LD, interval_sec)
+ st_rate = d_ratio(LDST_ST, interval_sec)
+ pf_rate = d_ratio(LDST_PRE, interval_sec) if LDST_PRE else None
+ at_rate = d_ratio(LDST_AT, interval_sec) if LDST_AT else None
+
+ ldst_ret_constraint = MetricConstraint.GROUPED_EVENTS
+ if LDST_LD.name == "MEM_UOPS_RETIRED.ALL_LOADS":
+ ldst_ret_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+
+ return MetricGroup("lpm_ldst", [
+ MetricGroup("lpm_ldst_total", [
+ Metric("lpm_ldst_total_loads", "Load/store instructions total loads",
+ ld_rate, "loads"),
+ Metric("lpm_ldst_total_stores", "Load/store instructions total stores",
+ st_rate, "stores"),
+ ]),
+ MetricGroup("lpm_ldst_prcnt", [
+ Metric("lpm_ldst_prcnt_loads", "Percent of all instructions that are loads",
+ d_ratio(LDST_LD, ins), "100%"),
+ Metric("lpm_ldst_prcnt_stores", "Percent of all instructions that are stores",
+ d_ratio(LDST_ST, ins), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_lds", [
+ Metric("lpm_ldst_ret_lds_1", "Retired loads in 1 cycle",
+ d_ratio(max(LDST_LDC1 - LDST_LDC2, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_lds_2", "Retired loads in 2 cycles",
+ d_ratio(max(LDST_LDC2 - LDST_LDC3, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_lds_3", "Retired loads in 3 or more cycles",
+ d_ratio(LDST_LDC3, cyc), "100%"),
+ ]),
+ MetricGroup("lpm_ldst_ret_sts", [
+ Metric("lpm_ldst_ret_sts_1", "Retired stores in 1 cycle",
+ d_ratio(max(LDST_STC1 - LDST_STC2, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_sts_2", "Retired stores in 2 cycles",
+ d_ratio(max(LDST_STC2 - LDST_STC3, 0), cyc), "100%",
+ constraint=ldst_ret_constraint),
+ Metric("lpm_ldst_ret_sts_3", "Retired stores in 3 or more cycles",
+ d_ratio(LDST_STC3, cyc), "100%"),
+ ]),
+ Metric("lpm_ldst_ld_hit_swpf", "Load hit software prefetches per second",
+ pf_rate, "swpf/s") if pf_rate else None,
+ Metric("lpm_ldst_atomic_lds", "Atomic loads per second",
+ at_rate, "loads/s") if at_rate else None,
+ ], description="Breakdown of load/store instructions")
+
+
def main() -> None:
global _args
@@ -556,6 +640,7 @@ def main() -> None:
Tsx(),
IntelBr(),
IntelL2(),
+ IntelLdSt(),
IntelPorts(),
IntelSwpf(),
])
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 23/35] perf jevents: Add ILP metrics for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (21 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 22/35] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 24/35] perf jevents: Add context switch " Ian Rogers
` (13 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Use the counter mask (cmask) to see how many cycles an instruction
takes to retire. Present as a set of ILP metrics.
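The ILP metrics turn the cmask counts into a per-cycle retirement
histogram, with the "retired 0" bucket derived as one minus the sum of
the others. A plain-Python sketch of that computation (function name is
illustrative):

```python
import math
from typing import List

def ilp_distribution(inst_ret_cmask: List[int], cycles: int) -> List[float]:
    """Per-cycle retirement histogram from INST_RETIRED.ANY_P/cmask=N/.

    inst_ret_cmask[i] counts cycles retiring at least i+1 instructions;
    returns fractions for exactly 0, 1, ..., and a final 'or more' bucket.
    """
    ilp = [max(c - n, 0) / cycles
           for c, n in zip(inst_ret_cmask, inst_ret_cmask[1:])]
    ilp.append(inst_ret_cmask[-1] / cycles)  # top 'or more' bucket
    return [1.0 - sum(ilp)] + ilp            # cycles retiring nothing
```

The buckets always sum to one, matching the "percentage of all cycles"
units on the metrics.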
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 40 ++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 19a284b4c520..bc3c50285916 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -263,6 +263,45 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelIlp() -> MetricGroup:
+ tsc = Event("msr/tsc/")
+ c0 = Event("msr/mperf/")
+ low = tsc - c0
+ inst_ret = Event("INST_RETIRED.ANY_P")
+ inst_ret_c = [Event(f"{inst_ret.name}/cmask={x}/") for x in range(1, 6)]
+ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
+ "CPU_CLK_UNHALTED.DISTRIBUTED",
+ "cycles")
+ ilp = [d_ratio(max(inst_ret_c[x] - inst_ret_c[x + 1], 0), core_cycles)
+ for x in range(0, 4)]
+ ilp.append(d_ratio(inst_ret_c[4], core_cycles))
+ ilp0 = 1
+ for x in ilp:
+ ilp0 -= x
+ return MetricGroup("lpm_ilp", [
+ Metric("lpm_ilp_idle", "Lower power cycles as a percentage of all cycles",
+ d_ratio(low, tsc), "100%"),
+ Metric("lpm_ilp_inst_ret_0",
+ "Instructions retired in 0 cycles as a percentage of all cycles",
+ ilp0, "100%"),
+ Metric("lpm_ilp_inst_ret_1",
+ "Instructions retired in 1 cycle as a percentage of all cycles",
+ ilp[0], "100%"),
+ Metric("lpm_ilp_inst_ret_2",
+ "Instructions retired in 2 cycles as a percentage of all cycles",
+ ilp[1], "100%"),
+ Metric("lpm_ilp_inst_ret_3",
+ "Instructions retired in 3 cycles as a percentage of all cycles",
+ ilp[2], "100%"),
+ Metric("lpm_ilp_inst_ret_4",
+ "Instructions retired in 4 cycles as a percentage of all cycles",
+ ilp[3], "100%"),
+ Metric("lpm_ilp_inst_ret_5",
+ "Instructions retired in 5 or more cycles as a percentage of all cycles",
+ ilp[4], "100%"),
+ ])
+
+
def IntelL2() -> Optional[MetricGroup]:
try:
DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
@@ -639,6 +678,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelIlp(),
IntelL2(),
IntelLdSt(),
IntelPorts(),
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 24/35] perf jevents: Add context switch metrics for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (22 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 23/35] perf jevents: Add ILP metrics " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 25/35] perf jevents: Add FPU " Ian Rogers
` (12 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Add metrics that break down context switches: the rate per second, plus
instructions, cycles, loads, stores, taken branches and L2 misses per
context switch.
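Each metric in the group is a guarded ratio of an event count to the
context-switch count. A minimal sketch of that scaling, assuming plain
integer counts (names here are illustrative):

```python
from typing import Dict

def per_context_switch(counts: Dict[str, int], cs: int) -> Dict[str, float]:
    """Scale raw event totals to per-context-switch values, guarding the
    division the way d_ratio does when no switches occurred."""
    return {name: (val / cs if cs else 0.0) for name, val in counts.items()}
```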
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 58 ++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index bc3c50285916..9cf4bd8ac769 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -263,6 +263,63 @@ def IntelBr():
description="breakdown of retired branch instructions")
+def IntelCtxSw() -> MetricGroup:
+ cs = Event("context\\-switches")
+ metrics = [
+ Metric("lpm_cs_rate", "Context switches per second",
+ d_ratio(cs, interval_sec), "ctxsw/s")
+ ]
+
+ ev = Event("instructions")
+ metrics.append(Metric("lpm_cs_instr", "Instructions per context switch",
+ d_ratio(ev, cs), "instr/cs"))
+
+ ev = Event("cycles")
+ metrics.append(Metric("lpm_cs_cycles", "Cycles per context switch",
+ d_ratio(ev, cs), "cycles/cs"))
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ metrics.append(Metric("lpm_cs_loads", "Loads per context switch",
+ d_ratio(ev, cs), "loads/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_STORES",
+ "MEM_UOPS_RETIRED.ALL_STORES")
+ metrics.append(Metric("lpm_cs_stores", "Stores per context switch",
+ d_ratio(ev, cs), "stores/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("BR_INST_RETIRED.NEAR_TAKEN", "BR_INST_RETIRED.TAKEN_JCC")
+ metrics.append(Metric("lpm_cs_br_taken", "Branches taken per context switch",
+ d_ratio(ev, cs), "br_taken/cs"))
+ except:
+ pass
+
+ try:
+ l2_misses = (Event("L2_RQSTS.DEMAND_DATA_RD_MISS") +
+ Event("L2_RQSTS.RFO_MISS") +
+ Event("L2_RQSTS.CODE_RD_MISS"))
+ try:
+ l2_misses += Event("L2_RQSTS.HWPF_MISS",
+ "L2_RQSTS.L2_PF_MISS", "L2_RQSTS.PF_MISS")
+ except:
+ pass
+
+ metrics.append(Metric("lpm_cs_l2_misses", "L2 misses per context switch",
+ d_ratio(l2_misses, cs), "l2_misses/cs"))
+ except:
+ pass
+
+ return MetricGroup("lpm_cs", metrics,
+ description=("Number of context switches per second, instructions "
+ "retired & core cycles between context switches"))
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -678,6 +735,7 @@ def main() -> None:
Smi(),
Tsx(),
IntelBr(),
+ IntelCtxSw(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 25/35] perf jevents: Add FPU metrics for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (23 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 24/35] perf jevents: Add context switch " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 26/35] perf jevents: Add Miss Level Parallelism (MLP) metric " Ian Rogers
` (11 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Metrics giving a breakdown of floating point operations.
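The total FLOP count below weights each FP_ARITH_INST_RETIRED umask by
the number of operations a retired instruction of that width performs
(1 for scalar, 4 for 128-bit packed single, and so on). A standalone
sketch of that weighted sum:

```python
from typing import Dict

# Operations per retired instruction for each FP_ARITH_INST_RETIRED
# umask, matching the multipliers used when accumulating 'flop'.
FLOP_WEIGHTS = {
    "SCALAR_SINGLE": 1, "SCALAR_DOUBLE": 1,
    "128B_PACKED_SINGLE": 4, "128B_PACKED_DOUBLE": 2,
    "256B_PACKED_SINGLE": 8, "256B_PACKED_DOUBLE": 4,
    "512B_PACKED_SINGLE": 16, "512B_PACKED_DOUBLE": 8,
}

def total_flops(counts: Dict[str, int]) -> int:
    """Weighted sum of FP retirement counts -> floating point operations."""
    return sum(FLOP_WEIGHTS[k] * v for k, v in counts.items())
```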
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 97 ++++++++++++++++++++++++++
1 file changed, 97 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 9cf4bd8ac769..77b8e10194db 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -320,6 +320,102 @@ def IntelCtxSw() -> MetricGroup:
"retired & core cycles between context switches"))
+def IntelFpu() -> Optional[MetricGroup]:
+ cyc = Event("cycles")
+ try:
+ s_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_SINGLE",
+ "SIMD_INST_RETIRED.SCALAR_SINGLE")
+ except:
+ return None
+ d_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_DOUBLE",
+ "SIMD_INST_RETIRED.SCALAR_DOUBLE")
+ s_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
+ "SIMD_INST_RETIRED.PACKED_SINGLE")
+
+ flop = s_64 + d_64 + 4 * s_128
+
+ d_128 = None
+ s_256 = None
+ d_256 = None
+ s_512 = None
+ d_512 = None
+ try:
+ d_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE")
+ flop += 2 * d_128
+ s_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE")
+ flop += 8 * s_256
+ d_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE")
+ flop += 4 * d_256
+ s_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE")
+ flop += 16 * s_512
+ d_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE")
+ flop += 8 * d_512
+ except:
+ pass
+
+ f_assist = Event("ASSISTS.FP", "FP_ASSIST.ANY", "FP_ASSIST.S")
+ if f_assist.name in [
+ "ASSISTS.FP",
+ "FP_ASSIST.S",
+ ]:
+ f_assist.name += "/cmask=1/"
+
+ flop_r = d_ratio(flop, interval_sec)
+ flop_c = d_ratio(flop, cyc)
+ nmi_constraint = MetricConstraint.GROUPED_EVENTS
+ if f_assist.name == "ASSISTS.FP": # Icelake+
+ nmi_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+
+ def FpuMetrics(group: str, fl: Optional[Event], mult: int, desc: str) -> Optional[MetricGroup]:
+ if not fl:
+ return None
+
+ f = fl * mult
+ fl_r = d_ratio(f, interval_sec)
+ r_s = d_ratio(fl, interval_sec)
+ return MetricGroup(group, [
+ Metric(f"{group}_of_total", desc + " floating point operations as a percentage of all floating point operations",
+ d_ratio(f, flop), "100%"),
+ Metric(f"{group}_flops", desc + " floating point operations per second",
+ fl_r, "flops/s"),
+ Metric(f"{group}_ops", desc + " operations per second",
+ r_s, "ops/s"),
+ ])
+
+ return MetricGroup("lpm_fpu", [
+ MetricGroup("lpm_fpu_total", [
+ Metric("lpm_fpu_total_flops", "Floating point operations per second",
+ flop_r, "flops/s"),
+ Metric("lpm_fpu_total_flopc", "Floating point operations per cycle",
+ flop_c, "flops/cycle", constraint=nmi_constraint),
+ ]),
+ MetricGroup("lpm_fpu_64", [
+ FpuMetrics("lpm_fpu_64_single", s_64, 1, "64-bit single"),
+ FpuMetrics("lpm_fpu_64_double", d_64, 1, "64-bit double"),
+ ]),
+ MetricGroup("lpm_fpu_128", [
+ FpuMetrics("lpm_fpu_128_single", s_128,
+ 4, "128-bit packed single"),
+ FpuMetrics("lpm_fpu_128_double", d_128,
+ 2, "128-bit packed double"),
+ ]),
+ MetricGroup("lpm_fpu_256", [
+ FpuMetrics("lpm_fpu_256_single", s_256,
+ 8, "256-bit packed single"),
+ FpuMetrics("lpm_fpu_256_double", d_256,
+ 4, "256-bit packed double"),
+ ]),
+ MetricGroup("lpm_fpu_512", [
+ FpuMetrics("lpm_fpu_512_single", s_512,
+ 16, "512-bit packed single"),
+ FpuMetrics("lpm_fpu_512_double", d_512,
+ 8, "512-bit packed double"),
+ ]),
+ Metric("lpm_fpu_assists", "FP assists as a percentage of cycles",
+ d_ratio(f_assist, cyc), "100%"),
+ ])
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -736,6 +832,7 @@ def main() -> None:
Tsx(),
IntelBr(),
IntelCtxSw(),
+ IntelFpu(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 26/35] perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (24 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 25/35] perf jevents: Add FPU " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 27/35] perf jevents: Add mem_bw " Ian Rogers
` (10 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Number of outstanding load misses per cycle.
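The metric divides pending L1D misses by the cycles with at least one
miss pending, halving the cycle count when SMT is on (the
Select(#smt_on) in the code). A plain-Python sketch (function name is
illustrative):

```python
def mlp(pending: int, pending_cycles: int, smt_on: bool) -> float:
    """Outstanding L1D load misses per miss-pending cycle; cycles are
    scaled back to per-CPU when SMT is on, as the metric does."""
    cycles = pending_cycles / 2 if smt_on else pending_cycles
    return pending / cycles if cycles else 0.0
```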
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 77b8e10194db..dddeae35e4b4 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -624,6 +624,20 @@ def IntelL2() -> Optional[MetricGroup]:
], description="L2 data cache analysis")
+def IntelMlp() -> Optional[Metric]:
+ try:
+ l1d = Event("L1D_PEND_MISS.PENDING")
+ l1dc = Event("L1D_PEND_MISS.PENDING_CYCLES")
+ except:
+ return None
+
+ l1dc = Select(l1dc / 2, Literal("#smt_on"), l1dc)
+ ml = d_ratio(l1d, l1dc)
+ return Metric("lpm_mlp",
+ "Miss level parallelism - number of outstanding load misses per cycle (higher is better)",
+ ml, "load_miss_pending/cycle")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(
open(f"{_args.events_path}/x86/{_args.model}/pipeline.json"))
@@ -836,6 +850,7 @@ def main() -> None:
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMlp(),
IntelPorts(),
IntelSwpf(),
])
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 27/35] perf jevents: Add mem_bw metric for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (25 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 26/35] perf jevents: Add Miss Level Parallelism (MLP) metric " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 28/35] perf jevents: Add local/remote "mem" breakdown metrics " Ian Rogers
` (9 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down memory bandwidth using uncore counters. For many models
this matches the memory_bandwidth_* metrics, but these metrics aren't
made available on all models. Add support for free running counters.
Query the event json when determining which events/counters are
available.
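Each CAS (or free-running RD/WR) count represents one 64-byte cache
line transferred, which is where the 64 / 1_000_000 scale below comes
from. A standalone sketch of the bandwidth arithmetic (function name is
illustrative):

```python
from typing import Tuple

CACHE_LINE_BYTES = 64

def ddr_bw_mb_s(cas_reads: int, cas_writes: int,
                interval_s: float) -> Tuple[float, float, float]:
    """Convert CAS counts to (read, write, total) MB/s: each CAS
    transfers one 64-byte line, i.e. the 64 / 1_000_000 metric scale."""
    scale = CACHE_LINE_BYTES / 1_000_000
    rd = cas_reads * scale / interval_s
    wr = cas_writes * scale / interval_s
    return rd, wr, rd + wr
```

So a million read CAS operations in one second is 64 MB/s of read
bandwidth.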
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 62 ++++++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index dddeae35e4b4..f671d6e4fd67 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,67 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreMemBw() -> Optional[MetricGroup]:
+ mem_events = []
+ try:
+ mem_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
+ f"/arch/x86/{_args.model}/uncore-memory.json"))
+ except:
+ pass
+
+ ddr_rds = 0
+ ddr_wrs = 0
+ ddr_total = 0
+ for x in mem_events:
+ if "EventName" in x:
+ name = x["EventName"]
+ if re.search("^UNC_MC[0-9]+_RDCAS_COUNT_FREERUN", name):
+ ddr_rds += Event(name)
+ elif re.search("^UNC_MC[0-9]+_WRCAS_COUNT_FREERUN", name):
+ ddr_wrs += Event(name)
+ # elif re.search("^UNC_MC[0-9]+_TOTAL_REQCOUNT_FREERUN", name):
+ # ddr_total += Event(name)
+
+ if ddr_rds == 0:
+ try:
+ ddr_rds = Event("UNC_M_CAS_COUNT.RD")
+ ddr_wrs = Event("UNC_M_CAS_COUNT.WR")
+ except:
+ return None
+
+ ddr_total = ddr_rds + ddr_wrs
+
+ pmm_rds = 0
+ pmm_wrs = 0
+ try:
+ pmm_rds = Event("UNC_M_PMM_RPQ_INSERTS")
+ pmm_wrs = Event("UNC_M_PMM_WPQ_INSERTS")
+ except:
+ pass
+
+ pmm_total = pmm_rds + pmm_wrs
+
+ scale = 64 / 1_000_000
+ return MetricGroup("lpm_mem_bw", [
+ MetricGroup("lpm_mem_bw_ddr", [
+ Metric("lpm_mem_bw_ddr_read", "DDR memory read bandwidth",
+ d_ratio(ddr_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_ddr_write", "DDR memory write bandwidth",
+ d_ratio(ddr_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_ddr_total", "DDR memory total bandwidth",
+ d_ratio(ddr_total, interval_sec), f"{scale}MB/s"),
+ ], description="DDR Memory Bandwidth"),
+ MetricGroup("lpm_mem_bw_pmm", [
+ Metric("lpm_mem_bw_pmm_read", "PMM memory read bandwidth",
+ d_ratio(pmm_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_pmm_write", "PMM memory write bandwidth",
+ d_ratio(pmm_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_bw_pmm_total", "PMM memory total bandwidth",
+ d_ratio(pmm_total, interval_sec), f"{scale}MB/s"),
+ ], description="PMM Memory Bandwidth") if pmm_rds != 0 else None,
+ ], description="Memory Bandwidth")
+
+
def main() -> None:
global _args
@@ -853,6 +914,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMemBw(),
])
if _args.metricgroups:
--
2.52.0.457.g6b5491de43-goog
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v10 28/35] perf jevents: Add local/remote "mem" breakdown metrics for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (26 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 27/35] perf jevents: Add mem_bw " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:10 ` [PATCH v10 29/35] perf jevents: Add dir " Ian Rogers
` (8 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down local and remote memory bandwidth into reads and writes. The
implementation uses the HA and CHA PMUs present in server models
broadwellde, broadwellx, cascadelakex, emeraldrapids, haswellx,
icelakex, ivytown, sapphirerapids and skylakex.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 31 ++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index f671d6e4fd67..983e5021f3d3 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,36 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreMem() -> Optional[MetricGroup]:
+ try:
+ loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL",
+ "UNC_H_REQUESTS.READS_LOCAL")
+ rem_rds = Event("UNC_CHA_REQUESTS.READS_REMOTE",
+ "UNC_H_REQUESTS.READS_REMOTE")
+ loc_wrs = Event("UNC_CHA_REQUESTS.WRITES_LOCAL",
+ "UNC_H_REQUESTS.WRITES_LOCAL")
+ rem_wrs = Event("UNC_CHA_REQUESTS.WRITES_REMOTE",
+ "UNC_H_REQUESTS.WRITES_REMOTE")
+ except:
+ return None
+
+ scale = 64 / 1_000_000
+ return MetricGroup("lpm_mem", [
+ MetricGroup("lpm_mem_local", [
+ Metric("lpm_mem_local_read", "Local memory read bandwidth not including directory updates",
+ d_ratio(loc_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_local_write", "Local memory write bandwidth not including directory updates",
+ d_ratio(loc_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ MetricGroup("lpm_mem_remote", [
+ Metric("lpm_mem_remote_read", "Remote memory read bandwidth not including directory updates",
+ d_ratio(rem_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_mem_remote_write", "Remote memory write bandwidth not including directory updates",
+ d_ratio(rem_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ ], description="Memory bandwidth breakdown, local vs. remote (incoming remote requests); directory updates not included")
+
+
def UncoreMemBw() -> Optional[MetricGroup]:
mem_events = []
try:
@@ -914,6 +944,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMem(),
UncoreMemBw(),
])
--
2.52.0.457.g6b5491de43-goog
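The `Event("UNC_CHA_REQUESTS.READS_LOCAL", "UNC_H_REQUESTS.READS_LOCAL")` calls above try each candidate name in turn and raise if none is known for the model, which the `except: return None` turns into "metric unsupported here". A rough standalone sketch of that fallback pattern (the names `resolve_event` and `known_events` are illustrative, not the series' API):

```python
def resolve_event(known_events: set, *candidates: str) -> str:
    """Return the first candidate event name present in known_events.

    Raises ValueError when no candidate matches, mirroring how the
    metric code treats a missing event as 'unsupported on this model'.
    """
    for name in candidates:
        if name in known_events:
            return name
    raise ValueError(f"No known event among {candidates}")


known = {"UNC_H_REQUESTS.READS_LOCAL"}
# The CHA name is preferred; the HA name is the fallback on older models.
name = resolve_event(known, "UNC_CHA_REQUESTS.READS_LOCAL",
                     "UNC_H_REQUESTS.READS_LOCAL")
```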
* [PATCH v10 29/35] perf jevents: Add dir breakdown metrics for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (27 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 28/35] perf jevents: Add local/remote "mem" breakdown metrics " Ian Rogers
@ 2026-01-08 19:10 ` Ian Rogers
2026-01-08 19:11 ` [PATCH v10 30/35] perf jevents: Add C-State metrics from the PCU PMU " Ian Rogers
` (7 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:10 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down directory hits, misses and update requests. The
implementation uses the M2M and CHA PMUs present in server models
broadwellde, broadwellx, cascadelakex, emeraldrapids, icelakex,
sapphirerapids and skylakex.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 36 ++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 983e5021f3d3..24ceb7f8719b 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,41 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreDir() -> Optional[MetricGroup]:
+ try:
+ m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
+ m2m_hits = Event("UNC_M2M_DIRECTORY_HIT.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_hits.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_HIT.ANY/"
+ m2m_miss = Event("UNC_M2M_DIRECTORY_MISS.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_miss.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_MISS.ANY/"
+ cha_upd = Event("UNC_CHA_DIR_UPDATE.HA")
+ # Turn the umask into an ANY rather than HA filter.
+ cha_upd.name += "/umask=3,name=UNC_CHA_DIR_UPDATE.ANY/"
+ except:
+ return None
+
+ m2m_total = m2m_hits + m2m_miss
+ upd = m2m_upd + cha_upd # in cache lines
+
+ scale = 64 / 1_000_000 # Cache lines to MB
+ return MetricGroup("lpm_dir", [
+ Metric("lpm_dir_lookup_rate", "Directory lookup requests per second",
+ d_ratio(m2m_total, interval_sec), "requests/s"),
+ Metric("lpm_dir_lookup_hits", "Percentage of directory lookups that hit",
+ d_ratio(m2m_hits, m2m_total), "100%"),
+ Metric("lpm_dir_lookup_misses", "Percentage of directory lookups that miss",
+ d_ratio(m2m_miss, m2m_total), "100%"),
+ Metric("lpm_dir_update_requests", "Directory update requests per second",
+ d_ratio(upd, interval_sec), "requests/s"),
+ Metric("lpm_dir_update_bw", "Directory update bandwidth",
+ d_ratio(upd, interval_sec), f"{scale}MB/s"),
+ ])
+
+
def UncoreMem() -> Optional[MetricGroup]:
try:
loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL",
@@ -944,6 +979,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreDir(),
UncoreMem(),
UncoreMemBw(),
])
--
2.52.0.457.g6b5491de43-goog
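The umask widening above works by appending a perf event term string such as `/umask=0xFF,name=UNC_M2M_DIRECTORY_HIT.ANY/` to the resolved event name, overriding the json event's umask and its reported name. A small sketch of that string manipulation (the function name `widen_umask` is illustrative):

```python
def widen_umask(event_name: str, umask: int, new_name: str) -> str:
    """Append perf term syntax overriding an event's umask and reported
    name, e.g. turning a DIRTY_I-filtered event into an ANY filter."""
    return f"{event_name}/umask={hex(umask)},name={new_name}/"


e = widen_umask("UNC_M2M_DIRECTORY_HIT.DIRTY_I", 0xFF,
                "UNC_M2M_DIRECTORY_HIT.ANY")
# e is "UNC_M2M_DIRECTORY_HIT.DIRTY_I/umask=0xff,name=UNC_M2M_DIRECTORY_HIT.ANY/"
```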
* [PATCH v10 30/35] perf jevents: Add C-State metrics from the PCU PMU for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (28 preceding siblings ...)
2026-01-08 19:10 ` [PATCH v10 29/35] perf jevents: Add dir " Ian Rogers
@ 2026-01-08 19:11 ` Ian Rogers
2026-01-08 19:11 ` [PATCH v10 31/35] perf jevents: Add local/remote miss latency metrics " Ian Rogers
` (6 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:11 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Use occupancy events fixed in:
https://lore.kernel.org/lkml/20240226201517.3540187-1-irogers@google.com/
Metrics are at the socket level referring to cores, not hyperthreads.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 30 ++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 24ceb7f8719b..118fe0fc05a3 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -815,6 +815,35 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description="Breakdown of load/store instructions")
+def UncoreCState() -> Optional[MetricGroup]:
+ try:
+ pcu_ticks = Event("UNC_P_CLOCKTICKS")
+ c0 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C0")
+ c3 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C3")
+ c6 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C6")
+ except:
+ return None
+
+ num_cores = Literal("#num_cores") / Literal("#num_packages")
+
+ max_cycles = pcu_ticks * num_cores
+ total_cycles = c0 + c3 + c6
+
+ # Remove fused-off cores, which show up in C6/C7.
+ c6 = Select(max(c6 - (total_cycles - max_cycles), 0),
+ total_cycles > max_cycles,
+ c6)
+
+ return MetricGroup("lpm_cstate", [
+ Metric("lpm_cstate_c0", "C-State cores in C0/C1",
+ d_ratio(c0, pcu_ticks), "cores"),
+ Metric("lpm_cstate_c3", "C-State cores in C3",
+ d_ratio(c3, pcu_ticks), "cores"),
+ Metric("lpm_cstate_c6", "C-State cores in C6/C7",
+ d_ratio(c6, pcu_ticks), "cores"),
+ ])
+
+
def UncoreDir() -> Optional[MetricGroup]:
try:
m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
@@ -979,6 +1008,7 @@ def main() -> None:
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreCState(),
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
--
2.52.0.457.g6b5491de43-goog
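The `Select`/`max` expression above can be read as: when the summed C-state occupancy exceeds the theoretically possible cycles (PCU ticks times cores per package), attribute the excess to fused-off cores reporting in C6/C7 and subtract it, clamping at zero. A plain-Python sketch of that correction (the function name `corrected_c6` is illustrative):

```python
def corrected_c6(c0: float, c3: float, c6: float, max_cycles: float) -> float:
    """Subtract cycles attributed to fused-off cores (which report as
    C6/C7) whenever total occupancy exceeds the possible maximum."""
    total_cycles = c0 + c3 + c6
    if total_cycles > max_cycles:
        # The overshoot can only come from fused-off cores in C6/C7.
        return max(c6 - (total_cycles - max_cycles), 0)
    return c6
```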
* [PATCH v10 31/35] perf jevents: Add local/remote miss latency metrics for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (29 preceding siblings ...)
2026-01-08 19:11 ` [PATCH v10 30/35] perf jevents: Add C-State metrics from the PCU PMU " Ian Rogers
@ 2026-01-08 19:11 ` Ian Rogers
2026-01-08 19:11 ` [PATCH v10 32/35] perf jevents: Add upi_bw metric " Ian Rogers
` (5 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:11 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Derive the average miss latency from CBOX/CHA occupancy and inserts
events, as described in Intel's uncore performance monitoring
reference.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 70 ++++++++++++++++++++++++--
1 file changed, 67 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 118fe0fc05a3..037f9b2ea1b6 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -6,9 +6,10 @@ import math
import os
import re
from typing import Optional
-from metric import (d_ratio, has_event, max, CheckPmu, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
- Metric, MetricConstraint, MetricGroup, MetricRef, Select)
+from metric import (d_ratio, has_event, max, source_count, CheckPmu, Event,
+ JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+ Literal, LoadEvents, Metric, MetricConstraint, MetricGroup,
+ MetricRef, Select)
# Global command line arguments.
_args = None
@@ -624,6 +625,68 @@ def IntelL2() -> Optional[MetricGroup]:
], description="L2 data cache analysis")
+def IntelMissLat() -> Optional[MetricGroup]:
+ try:
+ ticks = Event("UNC_CHA_CLOCKTICKS", "UNC_C_CLOCKTICKS")
+ data_rd_loc_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.MISS_OPCODE")
+ data_rd_loc_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_INSERTS.MISS_OPCODE")
+ data_rd_rem_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.NID_MISS_OPCODE")
+ data_rd_rem_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_INSERTS.NID_MISS_OPCODE")
+ except:
+ return None
+
+ if (data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE" or
+ data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_OPCODE"):
+ data_rd = 0x182
+ for e in [data_rd_loc_occ, data_rd_loc_ins, data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += f"/filter_opc={hex(data_rd)}/"
+ elif data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS":
+ # Demand Data Read - Full cache-line read requests from core for
+ # lines to be cached in S or E, typically for data
+ demand_data_rd = 0x202
+ # LLC Prefetch Data - Uncore will first look up the line in the
+ # LLC; for a cache hit, the LRU will be updated, on a miss, the
+ # DRd will be initiated
+ llc_prefetch_data = 0x25a
+ local_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_loc,filter_nm,filter_not_nm/")
+ remote_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_rem,filter_nm,filter_not_nm/")
+ for e in [data_rd_loc_occ, data_rd_loc_ins]:
+ e.name += local_filter
+ for e in [data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += remote_filter
+ else:
+ assert data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL", data_rd_loc_occ
+
+ ticks_per_cha = ticks / source_count(data_rd_loc_ins)
+ loc_lat = interval_sec * 1e9 * data_rd_loc_occ / \
+ (ticks_per_cha * data_rd_loc_ins)
+ ticks_per_cha = ticks / source_count(data_rd_rem_ins)
+ rem_lat = interval_sec * 1e9 * data_rd_rem_occ / \
+ (ticks_per_cha * data_rd_rem_ins)
+ return MetricGroup("lpm_miss_lat", [
+ Metric("lpm_miss_lat_loc", "Local to a socket miss latency in nanoseconds",
+ loc_lat, "ns"),
+ Metric("lpm_miss_lat_rem", "Remote to a socket miss latency in nanoseconds",
+ rem_lat, "ns"),
+ ])
+
+
def IntelMlp() -> Optional[Metric]:
try:
l1d = Event("L1D_PEND_MISS.PENDING")
@@ -1005,6 +1068,7 @@ def main() -> None:
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMissLat(),
IntelMlp(),
IntelPorts(),
IntelSwpf(),
--
2.52.0.457.g6b5491de43-goog
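The latency expression above is an application of Little's law: average TOR occupancy divided by the insert (arrival) count gives residency in CHA cycles, and dividing the socket's summed clockticks by the number of CHAs yields a single CHA's clock for converting cycles to nanoseconds. A standalone sketch with made-up numbers (all names are illustrative):

```python
def miss_latency_ns(occupancy: float, inserts: float, ticks: float,
                    num_cha: int, interval_sec: float) -> float:
    """Average TOR residency of a miss in nanoseconds.

    occupancy and inserts are summed over all CHAs; ticks are divided
    by the CHA count so the clock matches a single CHA's rate.
    """
    ticks_per_cha = ticks / num_cha
    # occupancy / inserts = average residency in CHA cycles;
    # interval_sec * 1e9 / ticks_per_cha converts cycles to ns.
    return interval_sec * 1e9 * occupancy / (ticks_per_cha * inserts)
```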
* [PATCH v10 32/35] perf jevents: Add upi_bw metric for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (30 preceding siblings ...)
2026-01-08 19:11 ` [PATCH v10 31/35] perf jevents: Add local/remote miss latency metrics " Ian Rogers
@ 2026-01-08 19:11 ` Ian Rogers
2026-01-08 19:11 ` [PATCH v10 33/35] perf jevents: Add mesh bandwidth saturation " Ian Rogers
` (4 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:11 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down UPI read and write bandwidth using uncore_upi counters.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 037f9b2ea1b6..f6bb691dc5bb 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1033,6 +1033,27 @@ def UncoreMemBw() -> Optional[MetricGroup]:
], description="Memory Bandwidth")
+def UncoreUpiBw() -> Optional[MetricGroup]:
+ try:
+ upi_rds = Event("UNC_UPI_RxL_FLITS.ALL_DATA")
+ upi_wrs = Event("UNC_UPI_TxL_FLITS.ALL_DATA")
+ except:
+ return None
+
+ upi_total = upi_rds + upi_wrs
+
+ # From "Uncore Performance Monitoring": When measuring the amount of
+ # bandwidth consumed by transmission of the data (i.e. NOT including
+ # the header), it should be .ALL_DATA / 9 * 64B.
+ scale = (64 / 9) / 1_000_000
+ return MetricGroup("lpm_upi_bw", [
+ Metric("lpm_upi_bw_read", "UPI read bandwidth",
+ d_ratio(upi_rds, interval_sec), f"{scale}MB/s"),
+ Metric("lpm_upi_bw_write", "UPI write bandwidth",
+ d_ratio(upi_wrs, interval_sec), f"{scale}MB/s"),
+ ], description="UPI Bandwidth")
+
+
def main() -> None:
global _args
@@ -1076,6 +1097,7 @@ def main() -> None:
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
+ UncoreUpiBw(),
])
if _args.metricgroups:
--
2.52.0.457.g6b5491de43-goog
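The `(64 / 9)` factor comes from the rule quoted above from the uncore guide: nine ALL_DATA flits carry one 64-byte cache line of payload. A sketch of the conversion (helper name illustrative):

```python
def upi_flits_to_mb_per_s(data_flits: float, interval_sec: float) -> float:
    """Convert UNC_UPI_*_FLITS.ALL_DATA counts to data MB/s: nine
    ALL_DATA flits correspond to 64 bytes of transmitted payload."""
    scale = (64 / 9) / 1_000_000  # flits -> MB of data
    return (data_flits / interval_sec) * scale
```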
* [PATCH v10 33/35] perf jevents: Add mesh bandwidth saturation metric for Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (31 preceding siblings ...)
2026-01-08 19:11 ` [PATCH v10 32/35] perf jevents: Add upi_bw metric " Ian Rogers
@ 2026-01-08 19:11 ` Ian Rogers
2026-01-08 19:11 ` [PATCH v10 34/35] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
` (3 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:11 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Memory bandwidth saturation from CBOX/CHA events present in
broadwellde, broadwellx, cascadelakex, haswellx, icelakex, skylakex
and snowridgex.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/intel_metrics.py | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index f6bb691dc5bb..d56bab7337df 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1033,6 +1033,22 @@ def UncoreMemBw() -> Optional[MetricGroup]:
], description="Memory Bandwidth")
+def UncoreMemSat() -> Optional[Metric]:
+ try:
+ clocks = Event("UNC_CHA_CLOCKTICKS", "UNC_C_CLOCKTICKS")
+ sat = Event("UNC_CHA_DISTRESS_ASSERTED.VERT", "UNC_CHA_FAST_ASSERTED.VERT",
+ "UNC_C_FAST_ASSERTED")
+ except:
+ return None
+
+ desc = ("Mesh Bandwidth saturation (% CBOX cycles with FAST signal asserted, "
+ "includes QPI bandwidth saturation), lower is better")
+ if "UNC_CHA_" in sat.name:
+ desc = ("Mesh Bandwidth saturation (% CHA cycles with FAST signal asserted, "
+ "includes UPI bandwidth saturation), lower is better")
+ return Metric("lpm_mem_sat", desc, d_ratio(sat, clocks), "100%")
+
+
def UncoreUpiBw() -> Optional[MetricGroup]:
try:
upi_rds = Event("UNC_UPI_RxL_FLITS.ALL_DATA")
@@ -1097,6 +1113,7 @@ def main() -> None:
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
+ UncoreMemSat(),
UncoreUpiBw(),
])
--
2.52.0.457.g6b5491de43-goog
* [PATCH v10 34/35] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (32 preceding siblings ...)
2026-01-08 19:11 ` [PATCH v10 33/35] perf jevents: Add mesh bandwidth saturation " Ian Rogers
@ 2026-01-08 19:11 ` Ian Rogers
2026-01-08 19:11 ` [PATCH v10 35/35] perf jevents: Validate that all names given an Event Ian Rogers
` (2 subsequent siblings)
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:11 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Break down cycles into user, kernel and guest. Add a common_metrics.py
file for such metrics.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/Build | 2 +-
tools/perf/pmu-events/amd_metrics.py | 2 ++
tools/perf/pmu-events/arm64_metrics.py | 5 ++++-
tools/perf/pmu-events/common_metrics.py | 19 +++++++++++++++++++
tools/perf/pmu-events/intel_metrics.py | 2 ++
5 files changed, 28 insertions(+), 2 deletions(-)
create mode 100644 tools/perf/pmu-events/common_metrics.py
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index f7d67d03d055..a3d7a04f0abf 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -44,7 +44,7 @@ $(LEGACY_CACHE_JSON): $(LEGACY_CACHE_PY)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)$(PYTHON) $(LEGACY_CACHE_PY) > $@
-GEN_METRIC_DEPS := pmu-events/metric.py
+GEN_METRIC_DEPS := pmu-events/metric.py pmu-events/common_metrics.py
# Generate AMD Json
ZENS = $(shell ls -d pmu-events/arch/x86/amdzen*)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 83e77ccc059e..e2defaffde3e 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -4,6 +4,7 @@ import argparse
import math
import os
from typing import Optional
+from common_metrics import Cycles
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
Metric, MetricGroup, Select)
@@ -475,6 +476,7 @@ def main() -> None:
AmdItlb(),
AmdLdSt(),
AmdUpc(),
+ Cycles(),
Idle(),
Rapl(),
UncoreL3(),
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
index ac717ca3513a..4ecda96d11fa 100755
--- a/tools/perf/pmu-events/arm64_metrics.py
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -4,6 +4,7 @@ import argparse
import os
from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
MetricGroup)
+from common_metrics import Cycles
# Global command line arguments.
_args = None
@@ -34,7 +35,9 @@ def main() -> None:
directory = f"{_args.events_path}/arm64/{_args.vendor}/{_args.model}/"
LoadEvents(directory)
- all_metrics = MetricGroup("", [])
+ all_metrics = MetricGroup("", [
+ Cycles(),
+ ])
if _args.metricgroups:
print(JsonEncodeMetricGroupDescriptions(all_metrics))
diff --git a/tools/perf/pmu-events/common_metrics.py b/tools/perf/pmu-events/common_metrics.py
new file mode 100644
index 000000000000..fcdfb9d3e648
--- /dev/null
+++ b/tools/perf/pmu-events/common_metrics.py
@@ -0,0 +1,19 @@
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+from metric import (d_ratio, Event, Metric, MetricGroup)
+
+
+def Cycles() -> MetricGroup:
+ cyc_k = Event("cpu\\-cycles:kHh") # exclude user and guest
+ cyc_g = Event("cpu\\-cycles:G") # exclude host
+ cyc_u = Event("cpu\\-cycles:uH") # exclude kernel, hypervisor and guest
+ cyc = cyc_k + cyc_g + cyc_u
+
+ return MetricGroup("lpm_cycles", [
+ Metric("lpm_cycles_total", "Total number of cycles", cyc, "cycles"),
+ Metric("lpm_cycles_user", "User cycles as a percentage of all cycles",
+ d_ratio(cyc_u, cyc), "100%"),
+ Metric("lpm_cycles_kernel", "Kernel cycles as a percentage of all cycles",
+ d_ratio(cyc_k, cyc), "100%"),
+ Metric("lpm_cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
+ d_ratio(cyc_g, cyc), "100%"),
+ ], description="cycles breakdown per privilege level (user, kernel, guest)")
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index d56bab7337df..52035433b505 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -6,6 +6,7 @@ import math
import os
import re
from typing import Optional
+from common_metrics import Cycles
from metric import (d_ratio, has_event, max, source_count, CheckPmu, Event,
JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
Literal, LoadEvents, Metric, MetricConstraint, MetricGroup,
@@ -1095,6 +1096,7 @@ def main() -> None:
LoadEvents(directory)
all_metrics = MetricGroup("", [
+ Cycles(),
Idle(),
Rapl(),
Smi(),
--
2.52.0.457.g6b5491de43-goog
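The modifier-based breakdown above partitions cycles with mutually exclusive modifiers (kernel+host, guest, user+host), so the three counts sum to the total and each `d_ratio` is a fraction of it. A plain sketch of the resulting ratios (names illustrative):

```python
def cycles_breakdown(cyc_k: float, cyc_g: float, cyc_u: float) -> dict:
    """Fractions of total cycles per privilege level; mirrors
    d_ratio(x, cyc) with cyc = cyc_k + cyc_g + cyc_u."""
    cyc = cyc_k + cyc_g + cyc_u
    if cyc == 0:
        # d_ratio also guards against division by zero.
        return {"kernel": 0.0, "guest": 0.0, "user": 0.0}
    return {"kernel": cyc_k / cyc, "guest": cyc_g / cyc, "user": cyc_u / cyc}
```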
* [PATCH v10 35/35] perf jevents: Validate that all names given an Event
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (33 preceding siblings ...)
2026-01-08 19:11 ` [PATCH v10 34/35] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
@ 2026-01-08 19:11 ` Ian Rogers
2026-01-20 5:23 ` [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
2026-01-27 17:07 ` Arnaldo Carvalho de Melo
36 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-08 19:11 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
Validate that the names exist in a json file found in the directory one
level above the model's json directory. This avoids broken
fallback encodings being created.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/metric.py | 36 +++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index 2029b6e28365..585454828c2f 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -11,12 +11,14 @@ from typing import Dict, List, Optional, Set, Tuple, Union
all_pmus = set()
all_events = set()
experimental_events = set()
+all_events_all_models = set()
def LoadEvents(directory: str) -> None:
"""Populate a global set of all known events for the purpose of validating Event names"""
global all_pmus
global all_events
global experimental_events
+ global all_events_all_models
all_events = {
"context\\-switches",
"cpu\\-cycles",
@@ -42,6 +44,20 @@ def LoadEvents(directory: str) -> None:
# The generated directory may be the same as the input, which
# causes partial json files. Ignore errors.
pass
+ all_events_all_models = all_events.copy()
+ for root, dirs, files in os.walk(directory + ".."):
+ for filename in files:
+ if filename.endswith(".json"):
+ try:
+ for x in json.load(open(f"{root}/{filename}")):
+ if "EventName" in x:
+ all_events_all_models.add(x["EventName"])
+ elif "ArchStdEvent" in x:
+ all_events_all_models.add(x["ArchStdEvent"])
+ except json.decoder.JSONDecodeError:
+ # The generated directory may be the same as the input, which
+ # causes partial json files. Ignore errors.
+ pass
def CheckPmu(name: str) -> bool:
@@ -64,6 +80,25 @@ def CheckEvent(name: str) -> bool:
return name in all_events
+def CheckEveryEvent(*names: str) -> None:
+ """Check all the events exist in at least one json file"""
+ global all_events_all_models
+ if len(all_events_all_models) == 0:
+ assert len(names) == 1, f"Cannot determine valid events in {names}"
+ # No events loaded so assume any event is good.
+ return
+
+ for name in names:
+ # Remove trailing modifier.
+ if ':' in name:
+ name = name[:name.find(':')]
+ elif '/' in name:
+ name = name[:name.find('/')]
+ if any([name.startswith(x) for x in ['amd', 'arm', 'cpu', 'msr', 'power']]):
+ continue
+ if name not in all_events_all_models:
+ raise Exception(f"Is {name} a named json event?")
+
def IsExperimentalEvent(name: str) -> bool:
global experimental_events
@@ -403,6 +438,7 @@ class Event(Expression):
def __init__(self, *args: str):
error = ""
+ CheckEveryEvent(*args)
for name in args:
if CheckEvent(name):
self.name = _FixEscapes(name)
--
2.52.0.457.g6b5491de43-goog
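The validation above strips any modifier (`:u`) or term (`/.../`) suffix from each candidate name, skips names with a PMU-like prefix, and requires the rest to appear in some model's json. A self-contained sketch of that check using an in-memory event set instead of walking json files (names illustrative):

```python
def check_names(names, all_known_events):
    """Raise if any plain event name is unknown. Modifier (':u') and
    term ('/.../' ) suffixes are stripped first; names starting with a
    PMU-like prefix are assumed not to be json-named events and skipped."""
    for name in names:
        # Remove trailing modifier or term list before the lookup.
        if ':' in name:
            name = name[:name.find(':')]
        elif '/' in name:
            name = name[:name.find('/')]
        if any(name.startswith(p) for p in ('amd', 'arm', 'cpu', 'msr', 'power')):
            continue
        if name not in all_known_events:
            raise ValueError(f"Is {name} a named json event?")


known = {"UNC_UPI_RxL_FLITS.ALL_DATA"}
# Passes: first name is known, second is skipped via the 'cpu' prefix.
check_names(["UNC_UPI_RxL_FLITS.ALL_DATA", "cpu\\-cycles:k"], known)
```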
* Re: [PATCH v10 00/35] AMD and Intel metric generation with Python
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (34 preceding siblings ...)
2026-01-08 19:11 ` [PATCH v10 35/35] perf jevents: Validate that all names given an Event Ian Rogers
@ 2026-01-20 5:23 ` Ian Rogers
2026-01-23 17:12 ` Ian Rogers
2026-01-27 17:07 ` Arnaldo Carvalho de Melo
36 siblings, 1 reply; 40+ messages in thread
From: Ian Rogers @ 2026-01-20 5:23 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
On Thu, Jan 8, 2026 at 11:11 AM Ian Rogers <irogers@google.com> wrote:
>
> Metrics in the perf tool come in via json. Json doesn't allow
> comments, line breaks, etc. making it an inconvenient way to write
> metrics. Further, it is useful to detect when writing a metric that
> the event specified is supported within the event json for a
> model. From the metric python code Event(s) are used, with fallback
> events provided, if no event is found then an exception is thrown and
> that can either indicate a failure or an unsupported model. To avoid
> confusion all the metrics and their metricgroups are prefixed with
> 'lpm_', where LPM is an abbreviation of Linux Perf Metric. While extra
> characters aren't ideal, this separates the metrics from other vendor
> provided metrics.
>
> * The first 2 patches introduce infrastructure for the addition of
> metrics written in python for Arm64, AMD Zen and Intel CPUs.
>
> * The next 9 patches generate additional metrics for AMD zen. Rapl
> and Idle metrics aren't specific to AMD but are placed here for ease
> and convenience. Uncore L3 metrics are added along with the majority
> of core metrics.
>
> * The next 22 patches add additional metrics for Intel. Rapl and Idle
> metrics aren't specific to Intel but are placed here for ease and
> convenience. Smi and tsx metrics are added so they can be dropped
> from the per model json files. There are four uncore sets of metrics
> and eleven core metrics. Add a CheckPmu function to metric to
> simplify detecting the presence of hybrid PMUs in events. Metrics
> with experimental events are flagged as experimental in their
> description.
>
> * The next patch adds a cycles metrics based on perf event modifiers
> for AMD, Intel and Arm64.
>
> * The final patch validates that all events provided to an Event
> object exist in a json file somewhere. This is to avoid mistakes
> like unfortunate typos.
>
> This series has benefitted from the input of Leo Yan
> <leo.yan@arm.com>, Sandipan Das <sandidas@amd.com>, Thomas Falcon
> <thomas.falcon@intel.com> and Perry Taylor <perry.taylor@intel.com>.
>
> v10. Drop already merged non-vendor patches (Namhyung). Drop "Add
> collection of topdown like metrics for arm64" as requested by
> James Clark. Update AMD metrics for changes to AMD Zen6 event
> names from the series:
> https://lore.kernel.org/lkml/cover.1767858676.git.sandipan.das@amd.com/
>
> v9. Drop (for now) 4 AMD sets of metrics for additional follow up. Add
> reviewed-by tags from Sandipan Das (AMD) and tested-by tags from
> Thomas Falcon (Intel).
> https://lore.kernel.org/lkml/20251202175043.623597-1-irogers@google.com/
>
> v8. Combine the previous 4 series for clarity. Rebase on top of the
> more recent legacy metric and event changes. Make the python more
> pep8 and pylint compliant.
> https://lore.kernel.org/lkml/20251113032040.1994090-1-irogers@google.com/
>
> Foundations:
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904043208.995243-1-irogers@google.com/
>
> v5. Rebase on top of legacy hardware/cache changes that now generate
> events using python:
> https://lore.kernel.org/lkml/20250828205930.4007284-1-irogers@google.com/
> the v5 series is:
> https://lore.kernel.org/lkml/20250829030727.4159703-1-irogers@google.com/
>
> v4. Rebase and small Build/Makefile tweak
> https://lore.kernel.org/lkml/20240926173554.404411-1-irogers@google.com/
>
> v3. Some code tidying, make the input directory a command line
> argument, but no other functional or output changes.
> https://lore.kernel.org/lkml/20240314055051.1960527-1-irogers@google.com/
>
> v2. Fixes two type issues in the python code but no functional or
> output changes.
> https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
>
> AMD:
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904044047.999031-1-irogers@google.com/
>
> v5. Rebase. Add uop cache hit/miss rates patch. Prefix all metric
> names with lpm_ (short for Linux Perf Metric) so that python
> generated metrics are clearly namespaced.
> https://lore.kernel.org/lkml/20250829033138.4166591-1-irogers@google.com/
>
> v4. Rebase.
> https://lore.kernel.org/lkml/20240926174101.406874-1-irogers@google.com/
>
> v3. Some minor code cleanup changes.
> https://lore.kernel.org/lkml/20240314055839.1975063-1-irogers@google.com/
>
> v2. Drop the cycles breakdown in favor of having it as a common
> metric, suggested by Kan Liang <kan.liang@linux.intel.com>.
> https://lore.kernel.org/lkml/20240301184737.2660108-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240229001537.4158049-1-irogers@google.com/
>
> Intel:
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904044653.1002362-1-irogers@google.com/
>
> v5. Rebase. Fix description for smi metric (Kan). Prefix all metric
> names with lpm_ (short for Linux Perf Metric) so that python
> generated metrics are clearly namespaced. Kan requested a
> namespace in his review:
> https://lore.kernel.org/lkml/43548903-b7c8-47c4-b1da-0258293ecbd4@linux.intel.com/
> The v5 series is:
> https://lore.kernel.org/lkml/20250829041104.4186320-1-irogers@google.com/
>
> v4. Experimental metric descriptions. Add mesh bandwidth metric. Rebase.
> https://lore.kernel.org/lkml/20240926175035.408668-1-irogers@google.com/
>
> v3. Swap tsx and CheckPMU patches that were in the wrong order. Some
> minor code cleanup changes. Drop reference to merged fix for
> umasks/occ_sel in PCU events and for cstate metrics.
> https://lore.kernel.org/lkml/20240314055919.1979781-1-irogers@google.com/
>
> v2. Drop the cycles breakdown in favor of having it as a common
> metric, spelling and other improvements suggested by Kan Liang
> <kan.liang@linux.intel.com>.
> https://lore.kernel.org/lkml/20240301185559.2661241-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240229001806.4158429-1-irogers@google.com/
>
> ARM:
>
> v7. Switch a use of cycles to cpu-cycles due to ARM having too many
> cycles events.
> https://lore.kernel.org/lkml/20250904194139.1540230-1-irogers@google.com/
>
> v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> Das <sandidas@amd.com>) which didn't alter the generated json.
> https://lore.kernel.org/lkml/20250904045253.1007052-1-irogers@google.com/
>
> v5. Rebase. Address review comments from Leo Yan
> <leo.yan@arm.com>. Prefix all metric names with lpm_ (short for
> Linux Perf Metric) so that python generated metrics are clearly
> namespaced. Use cpu-cycles rather than cycles legacy event for
> cycles metrics to avoid confusion with ARM PMUs. Add patch that
> checks events to ensure all possible event names are present in at
> least one json file.
> https://lore.kernel.org/lkml/20250829053235.21994-1-irogers@google.com/
>
> v4. Tweak to build dependencies and rebase.
> https://lore.kernel.org/lkml/20240926175709.410022-1-irogers@google.com/
>
> v3. Some minor code cleanup changes.
> https://lore.kernel.org/lkml/20240314055801.1973422-1-irogers@google.com/
>
> v2. The cycles metrics are now made common and shared with AMD and
> Intel, suggested by Kan Liang <kan.liang@linux.intel.com>. This
> assumes these patches come after the AMD and Intel sets.
> https://lore.kernel.org/lkml/20240301184942.2660478-1-irogers@google.com/
>
> v1. https://lore.kernel.org/lkml/20240229001325.4157655-1-irogers@google.com/
>
> Ian Rogers (35):
> perf jevents: Build support for generating metrics from python
> perf jevents: Add load event json to verify and allow fallbacks
> perf jevents: Add RAPL event metric for AMD zen models
> perf jevents: Add idle metric for AMD zen models
> perf jevents: Add upc metric for uops per cycle for AMD
> perf jevents: Add br metric group for branch statistics on AMD
> perf jevents: Add itlb metric group for AMD
> perf jevents: Add dtlb metric group for AMD
> perf jevents: Add uncore l3 metric group for AMD
> perf jevents: Add load store breakdown metrics ldst for AMD
> perf jevents: Add context switch metrics for AMD
> perf jevents: Add RAPL metrics for all Intel models
> perf jevents: Add idle metric for Intel models
> perf jevents: Add CheckPmu to see if a PMU is in loaded json events
> perf jevents: Add smi metric group for Intel models
> perf jevents: Mark metrics with experimental events as experimental
> perf jevents: Add tsx metric group for Intel models
> perf jevents: Add br metric group for branch statistics on Intel
> perf jevents: Add software prefetch (swpf) metric group for Intel
> perf jevents: Add ports metric group giving utilization on Intel
> perf jevents: Add L2 metrics for Intel
> perf jevents: Add load store breakdown metrics ldst for Intel
> perf jevents: Add ILP metrics for Intel
> perf jevents: Add context switch metrics for Intel
> perf jevents: Add FPU metrics for Intel
> perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
> perf jevents: Add mem_bw metric for Intel
> perf jevents: Add local/remote "mem" breakdown metrics for Intel
> perf jevents: Add dir breakdown metrics for Intel
> perf jevents: Add C-State metrics from the PCU PMU for Intel
> perf jevents: Add local/remote miss latency metrics for Intel
> perf jevents: Add upi_bw metric for Intel
> perf jevents: Add mesh bandwidth saturation metric for Intel
> perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
> perf jevents: Validate that all names given an Event
Hi Arnaldo and Namhyung,

this patch series has carved out everything anybody has objected to
and every patch has tags. I'd appreciate it landing. The v1 series was
originally mailed in February 2024 :-)

Thanks,
Ian
> tools/perf/.gitignore | 5 +
> tools/perf/Makefile.perf | 2 +
> tools/perf/pmu-events/Build | 51 +-
> tools/perf/pmu-events/amd_metrics.py | 492 ++++++++++
> tools/perf/pmu-events/arm64_metrics.py | 49 +
> tools/perf/pmu-events/common_metrics.py | 19 +
> tools/perf/pmu-events/intel_metrics.py | 1129 +++++++++++++++++++++++
> tools/perf/pmu-events/metric.py | 171 +++-
> 8 files changed, 1914 insertions(+), 4 deletions(-)
> create mode 100755 tools/perf/pmu-events/amd_metrics.py
> create mode 100755 tools/perf/pmu-events/arm64_metrics.py
> create mode 100644 tools/perf/pmu-events/common_metrics.py
> create mode 100755 tools/perf/pmu-events/intel_metrics.py
>
> --
> 2.52.0.457.g6b5491de43-goog
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v10 00/35] AMD and Intel metric generation with Python
2026-01-20 5:23 ` [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
@ 2026-01-23 17:12 ` Ian Rogers
0 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-23 17:12 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Benjamin Gray, Caleb Biggers, Edward Baker, Ian Rogers,
Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa, John Garry,
Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra, Samantha Alt,
Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang, linux-kernel,
linux-perf-users
On Mon, Jan 19, 2026 at 9:23 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Jan 8, 2026 at 11:11 AM Ian Rogers <irogers@google.com> wrote:
> >
> > Metrics in the perf tool come in via json. Json doesn't allow
> > comments, line breaks, etc. making it an inconvenient way to write
> > metrics. Further, it is useful to detect when writing a metric that
> > the event specified is supported within the event json for a
> > model. From the metric python code, Event(s) are used with fallback
> > events provided. If no event is found, an exception is thrown; this can
> > either indicate a failure or an unsupported model. To avoid
> > confusion all the metrics and their metricgroups are prefixed with
> > 'lpm_', where LPM is an abbreviation of Linux Perf Metric. While extra
> > characters aren't ideal, this separates the metrics from other vendor
> > provided metrics.
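[Editor's sketch] The Event-with-fallbacks flow described above can be sketched roughly as follows. The names here are illustrative stand-ins, not the actual API; the real helpers live in tools/perf/pmu-events/metric.py:

```python
# Illustrative sketch of the fallback-event pattern described in the
# cover letter. Function and exception names are hypothetical.

class UnsupportedModel(Exception):
    """Raised when a model's event json contains none of the candidates."""

def resolve_event(known_events, *candidates):
    """Return the first candidate event name present for this model."""
    for name in candidates:
        if name in known_events:
            return name
    # Either a typo in the metric or a model that lacks the event.
    raise UnsupportedModel(f"no event among {candidates}")

def lpm_name(base):
    """Namespace a generated metric with the 'lpm_' prefix."""
    return "lpm_" + base
```

Failing with an exception when no candidate matches is what lets the generator distinguish a genuine mistake from a model that simply does not support the metric.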
> >
> > * The first 2 patches introduce infrastructure for the addition of
> > metrics written in python for Arm64, AMD Zen and Intel CPUs.
> >
> > * The next 9 patches generate additional metrics for AMD zen. Rapl
> > and Idle metrics aren't specific to AMD but are placed here for ease
> > and convenience. Uncore L3 metrics are added along with the majority
> > of core metrics.
> >
> > * The next 22 patches add additional metrics for Intel. Rapl and Idle
> > metrics aren't specific to Intel but are placed here for ease and
> > convenience. Smi and tsx metrics are added so they can be dropped
> > from the per model json files. There are four uncore sets of metrics
> > and eleven core metrics. Add a CheckPmu function to metric to
> > simplify detecting the presence of hybrid PMUs in events. Metrics
> > with experimental events are flagged as experimental in their
> > description.
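[Editor's sketch] The CheckPmu idea mentioned above might look something like this; the helper name and the json field access are assumptions for illustration, not the series' actual code:

```python
# Hypothetical sketch: report whether any loaded json event belongs to a
# given PMU, e.g. to detect hybrid cpu_core/cpu_atom systems. Core events
# without an explicit "Unit" field are assumed to belong to "cpu".
def check_pmu(events, pmu_name):
    return any(e.get("Unit", "cpu") == pmu_name for e in events)
```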
> >
> > * The next patch adds a cycles metric based on perf event modifiers
> > for AMD, Intel and Arm64.
> >
> > * The final patch validates that all events provided to an Event
> > object exist in a json file somewhere. This is to avoid mistakes
> > like unfortunate typos.
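[Editor's sketch] The kind of name validation the final patch performs can be sketched as below; the helper is hypothetical, while the real check walks the json event files under tools/perf/pmu-events:

```python
# Hypothetical sketch: every event name handed to Event() must occur in at
# least one json file, so a typo fails at generation time rather than
# silently producing a broken metric.
import json

def unknown_event_names(used_names, json_texts):
    """Return the subset of used_names found in no json event list."""
    known = set()
    for text in json_texts:
        for entry in json.loads(text):
            name = entry.get("EventName")
            if name:
                known.add(name)
    return {n for n in used_names if n not in known}
```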
> >
> > This series has benefitted from the input of Leo Yan
> > <leo.yan@arm.com>, Sandipan Das <sandidas@amd.com>, Thomas Falcon
> > <thomas.falcon@intel.com> and Perry Taylor <perry.taylor@intel.com>.
> >
> > [... v10-v1 changelog and patch list snipped; identical to the quote above ...]
>
> Hi Arnaldo and Namhyung,
>
> this patch series has carved out everything anybody has objected to
> and every patch has tags. I'd appreciate it landing. The v1 series was
> originally mailed in February 2024 :-)
Ping.

Thanks,
Ian
> Thanks,
> Ian
>
> > [... diffstat snipped; identical to the quote above ...]
* Re: [PATCH v10 00/35] AMD and Intel metric generation with Python
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
` (35 preceding siblings ...)
2026-01-20 5:23 ` [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
@ 2026-01-27 17:07 ` Arnaldo Carvalho de Melo
2026-01-27 18:09 ` Ian Rogers
36 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-01-27 17:07 UTC (permalink / raw)
To: Ian Rogers
Cc: Adrian Hunter, Alexander Shishkin, Benjamin Gray, Caleb Biggers,
Edward Baker, Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa,
John Garry, Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra,
Samantha Alt, Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang,
linux-kernel, linux-perf-users
On Thu, Jan 08, 2026 at 11:10:30AM -0800, Ian Rogers wrote:
> Metrics in the perf tool come in via json. Json doesn't allow
> comments, line breaks, etc. making it an inconvenient way to write
> metrics. Further, it is useful to detect when writing a metric that
> the event specified is supported within the event json for a
> model. From the metric python code, Event(s) are used with fallback
> events provided. If no event is found, an exception is thrown; this can
> either indicate a failure or an unsupported model. To avoid
> confusion all the metrics and their metricgroups are prefixed with
> 'lpm_', where LPM is an abbreviation of Linux Perf Metric. While extra
> characters aren't ideal, this separates the metrics from other vendor
> provided metrics.
>
> * The first 2 patches introduce infrastructure for the addition of
> metrics written in python for Arm64, AMD Zen and Intel CPUs.
Tried this one now:
Cover: ./v10_20260108_irogers_amd_and_intel_metric_generation_with_python.cover
Link: https://lore.kernel.org/r/20260108191105.695131-1-irogers@google.com
Base: not specified
git am ./v10_20260108_irogers_amd_and_intel_metric_generation_with_python.mbx
⬢ [acme@toolbx perf-tools-next]$ git am ./v10_20260108_irogers_amd_and_intel_metric_generation_with_python.mbx
Applying: perf jevents: Build support for generating metrics from python
error: patch failed: tools/perf/pmu-events/Build:29
error: tools/perf/pmu-events/Build: patch does not apply
Patch failed at 0001 perf jevents: Build support for generating metrics from python
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
⬢ [acme@toolbx perf-tools-next]$ git am --abort
⬢ [acme@toolbx perf-tools-next]$
Can you please take a look?
- Arnaldo
> [... remainder of the cover letter snipped ...]
* Re: [PATCH v10 00/35] AMD and Intel metric generation with Python
2026-01-27 17:07 ` Arnaldo Carvalho de Melo
@ 2026-01-27 18:09 ` Ian Rogers
0 siblings, 0 replies; 40+ messages in thread
From: Ian Rogers @ 2026-01-27 18:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Adrian Hunter, Alexander Shishkin, Benjamin Gray, Caleb Biggers,
Edward Baker, Ingo Molnar, James Clark, Jing Zhang, Jiri Olsa,
John Garry, Leo Yan, Namhyung Kim, Perry Taylor, Peter Zijlstra,
Samantha Alt, Sandipan Das, Thomas Falcon, Weilin Wang, Xu Yang,
linux-kernel, linux-perf-users
On Tue, Jan 27, 2026 at 9:07 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Jan 08, 2026 at 11:10:30AM -0800, Ian Rogers wrote:
> > Metrics in the perf tool come in via json. Json doesn't allow
> > comments, line breaks, etc. making it an inconvenient way to write
> > metrics. Further, it is useful to detect when writing a metric that
> > the event specified is supported within the event json for a
> > model. From the metric python code, Event(s) are used with fallback
> > events provided. If no event is found, an exception is thrown; this can
> > either indicate a failure or an unsupported model. To avoid
> > confusion all the metrics and their metricgroups are prefixed with
> > 'lpm_', where LPM is an abbreviation of Linux Perf Metric. While extra
> > characters aren't ideal, this separates the metrics from other vendor
> > provided metrics.
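For illustration, the fallback behaviour described above can be sketched as follows. The helper, exception, and event names here are hypothetical stand-ins for the idea, not the series' actual metric.py API:

```python
# Sketch of Event fallback resolution: pick the first candidate event
# name present in a model's event json, raising if none are found.

class UnsupportedModel(Exception):
    """Raised when none of the candidate event names exist for a model."""


def make_event(known_events, *candidates):
    """Return the first candidate name present in the model's event json."""
    for name in candidates:
        if name in known_events:
            return name
    raise UnsupportedModel(f"none of {candidates} found")


# Event names loaded from a model's json (illustrative values).
model_events = {"ls_dispatch.ld_dispatch", "instructions"}

# The first candidate is missing, so resolution falls back to the second.
ev = make_event(model_events, "mem_uops_retired.all_loads",
                "ls_dispatch.ld_dispatch")
```

The exception either fails the build (a typo in the metric) or marks the metric as unsupported for that model, as described above.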
> >
> > * The first 2 patches introduce infrastructure for the addition of
> > metrics written in python for Arm64, AMD Zen and Intel CPUs.
>
> Tried this one now:
>
> Cover: ./v10_20260108_irogers_amd_and_intel_metric_generation_with_python.cover
> Link: https://lore.kernel.org/r/20260108191105.695131-1-irogers@google.com
> Base: not specified
> git am ./v10_20260108_irogers_amd_and_intel_metric_generation_with_python.mbx
> ⬢ [acme@toolbx perf-tools-next]$ git am ./v10_20260108_irogers_amd_and_intel_metric_generation_with_python.mbx
> Applying: perf jevents: Build support for generating metrics from python
> error: patch failed: tools/perf/pmu-events/Build:29
> error: tools/perf/pmu-events/Build: patch does not apply
> Patch failed at 0001 perf jevents: Build support for generating metrics from python
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> hint: When you have resolved this problem, run "git am --continue".
> hint: If you prefer to skip this patch, run "git am --skip" instead.
> hint: To restore the original branch and stop patching, run "git am --abort".
> hint: Disable this message with "git config set advice.mergeConflict false"
> ⬢ [acme@toolbx perf-tools-next]$ git am --abort
> ⬢ [acme@toolbx perf-tools-next]$
>
> Can you please take a look?
I'll rebase and resend. My local rebases hadn't been hitting issues,
largely because the changes are in newly added files, and I'd wanted
to minimize mailing list noise.
Thanks,
Ian
> - Arnaldo
>
> > * The next 9 patches generate additional metrics for AMD Zen. RAPL
> > and idle metrics aren't specific to AMD but are placed here for
> > convenience. Uncore L3 metrics are added along with the majority
> > of core metrics.
> >
> > * The next 22 patches add additional metrics for Intel. RAPL and idle
> > metrics aren't specific to Intel but are placed here for
> > convenience. SMI and TSX metrics are added so they can be dropped
> > from the per-model json files. There are four uncore sets of metrics
> > and eleven core metrics. A CheckPmu function is added to metric to
> > simplify detecting the presence of hybrid PMUs in events. Metrics
> > with experimental events are flagged as experimental in their
> > description.
> >
> > * The next patch adds cycles metrics based on perf event modifiers
> > for AMD, Intel and Arm64.
> >
> > * The final patch validates that all events provided to an Event
> > object exist in a json file somewhere. This is to avoid mistakes
> > like unfortunate typos.
> >
> > This series has benefitted from the input of Leo Yan
> > <leo.yan@arm.com>, Sandipan Das <sandidas@amd.com>, Thomas Falcon
> > <thomas.falcon@intel.com> and Perry Taylor <perry.taylor@intel.com>.
> >
> > v10. Drop already merged non-vendor patches (Namhyung). Drop "Add
> > collection of topdown like metrics for arm64" as requested by
> > James Clark. Update AMD metrics for changes to AMD Zen6 event
> > names from the series:
> > https://lore.kernel.org/lkml/cover.1767858676.git.sandipan.das@amd.com/
> >
> > v9. Drop (for now) 4 AMD sets of metrics for additional follow up. Add
> > reviewed-by tags from Sandipan Das (AMD) and tested-by tags from
> > Thomas Falcon (Intel).
> > https://lore.kernel.org/lkml/20251202175043.623597-1-irogers@google.com/
> >
> > v8. Combine the previous 4 series for clarity. Rebase on top of the
> > more recent legacy metric and event changes. Make the python more
> > pep8 and pylint compliant.
> > https://lore.kernel.org/lkml/20251113032040.1994090-1-irogers@google.com/
> >
> > Foundations:
> > v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> > Das <sandidas@amd.com>) which didn't alter the generated json.
> > https://lore.kernel.org/lkml/20250904043208.995243-1-irogers@google.com/
> >
> > v5. Rebase on top of legacy hardware/cache changes that now generate
> > events using python:
> > https://lore.kernel.org/lkml/20250828205930.4007284-1-irogers@google.com/
> > the v5 series is:
> > https://lore.kernel.org/lkml/20250829030727.4159703-1-irogers@google.com/
> >
> > v4. Rebase and small Build/Makefile tweak
> > https://lore.kernel.org/lkml/20240926173554.404411-1-irogers@google.com/
> >
> > v3. Some code tidying, make the input directory a command line
> > argument, but no other functional or output changes.
> > https://lore.kernel.org/lkml/20240314055051.1960527-1-irogers@google.com/
> >
> > v2. Fixes two type issues in the python code but no functional or
> > output changes.
> > https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
> >
> > v1. https://lore.kernel.org/lkml/20240302005950.2847058-1-irogers@google.com/
> >
> > AMD:
> > v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> > Das <sandidas@amd.com>) which didn't alter the generated json.
> > https://lore.kernel.org/lkml/20250904044047.999031-1-irogers@google.com/
> >
> > v5. Rebase. Add uop cache hit/miss rates patch. Prefix all metric
> > names with lpm_ (short for Linux Perf Metric) so that python
> > generated metrics are clearly namespaced.
> > https://lore.kernel.org/lkml/20250829033138.4166591-1-irogers@google.com/
> >
> > v4. Rebase.
> > https://lore.kernel.org/lkml/20240926174101.406874-1-irogers@google.com/
> >
> > v3. Some minor code cleanup changes.
> > https://lore.kernel.org/lkml/20240314055839.1975063-1-irogers@google.com/
> >
> > v2. Drop the cycles breakdown in favor of having it as a common
> > metric, suggested by Kan Liang <kan.liang@linux.intel.com>.
> > https://lore.kernel.org/lkml/20240301184737.2660108-1-irogers@google.com/
> >
> > v1. https://lore.kernel.org/lkml/20240229001537.4158049-1-irogers@google.com/
> >
> > Intel:
> > v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> > Das <sandidas@amd.com>) which didn't alter the generated json.
> > https://lore.kernel.org/lkml/20250904044653.1002362-1-irogers@google.com/
> >
> > v5. Rebase. Fix description for smi metric (Kan). Prefix all metric
> > names with lpm_ (short for Linux Perf Metric) so that python
> > generated metrics are clearly namespaced. Kan requested a
> > namespace in his review:
> > https://lore.kernel.org/lkml/43548903-b7c8-47c4-b1da-0258293ecbd4@linux.intel.com/
> > The v5 series is:
> > https://lore.kernel.org/lkml/20250829041104.4186320-1-irogers@google.com/
> >
> > v4. Experimental metric descriptions. Add mesh bandwidth metric. Rebase.
> > https://lore.kernel.org/lkml/20240926175035.408668-1-irogers@google.com/
> >
> > v3. Swap tsx and CheckPMU patches that were in the wrong order. Some
> > minor code cleanup changes. Drop reference to merged fix for
> > umasks/occ_sel in PCU events and for cstate metrics.
> > https://lore.kernel.org/lkml/20240314055919.1979781-1-irogers@google.com/
> >
> > v2. Drop the cycles breakdown in favor of having it as a common
> > metric, spelling and other improvements suggested by Kan Liang
> > <kan.liang@linux.intel.com>.
> > https://lore.kernel.org/lkml/20240301185559.2661241-1-irogers@google.com/
> >
> > v1. https://lore.kernel.org/lkml/20240229001806.4158429-1-irogers@google.com/
> >
> > ARM:
> >
> > v7. Switch a use of cycles to cpu-cycles due to ARM having too many
> > cycles events.
> > https://lore.kernel.org/lkml/20250904194139.1540230-1-irogers@google.com/
> >
> > v6. Fix issue with '\-' escape not being '\\-' (reported-by Sandipan
> > Das <sandidas@amd.com>) which didn't alter the generated json.
> > https://lore.kernel.org/lkml/20250904045253.1007052-1-irogers@google.com/
> >
> > v5. Rebase. Address review comments from Leo Yan
> > <leo.yan@arm.com>. Prefix all metric names with lpm_ (short for
> > Linux Perf Metric) so that python generated metrics are clearly
> > namespaced. Use cpu-cycles rather than cycles legacy event for
> > cycles metrics to avoid confusion with ARM PMUs. Add patch that
> > checks events to ensure all possible event names are present in at
> > least one json file.
> > https://lore.kernel.org/lkml/20250829053235.21994-1-irogers@google.com/
> >
> > v4. Tweak to build dependencies and rebase.
> > https://lore.kernel.org/lkml/20240926175709.410022-1-irogers@google.com/
> >
> > v3. Some minor code cleanup changes.
> > https://lore.kernel.org/lkml/20240314055801.1973422-1-irogers@google.com/
> >
> > v2. The cycles metrics are now made common and shared with AMD and
> > Intel, suggested by Kan Liang <kan.liang@linux.intel.com>. This
> > assumes these patches come after the AMD and Intel sets.
> > https://lore.kernel.org/lkml/20240301184942.2660478-1-irogers@google.com/
> >
> > v1. https://lore.kernel.org/lkml/20240229001325.4157655-1-irogers@google.com/
> >
> > Ian Rogers (35):
> > perf jevents: Build support for generating metrics from python
> > perf jevents: Add load event json to verify and allow fallbacks
> > perf jevents: Add RAPL event metric for AMD zen models
> > perf jevents: Add idle metric for AMD zen models
> > perf jevents: Add upc metric for uops per cycle for AMD
> > perf jevents: Add br metric group for branch statistics on AMD
> > perf jevents: Add itlb metric group for AMD
> > perf jevents: Add dtlb metric group for AMD
> > perf jevents: Add uncore l3 metric group for AMD
> > perf jevents: Add load store breakdown metrics ldst for AMD
> > perf jevents: Add context switch metrics for AMD
> > perf jevents: Add RAPL metrics for all Intel models
> > perf jevents: Add idle metric for Intel models
> > perf jevents: Add CheckPmu to see if a PMU is in loaded json events
> > perf jevents: Add smi metric group for Intel models
> > perf jevents: Mark metrics with experimental events as experimental
> > perf jevents: Add tsx metric group for Intel models
> > perf jevents: Add br metric group for branch statistics on Intel
> > perf jevents: Add software prefetch (swpf) metric group for Intel
> > perf jevents: Add ports metric group giving utilization on Intel
> > perf jevents: Add L2 metrics for Intel
> > perf jevents: Add load store breakdown metrics ldst for Intel
> > perf jevents: Add ILP metrics for Intel
> > perf jevents: Add context switch metrics for Intel
> > perf jevents: Add FPU metrics for Intel
> > perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
> > perf jevents: Add mem_bw metric for Intel
> > perf jevents: Add local/remote "mem" breakdown metrics for Intel
> > perf jevents: Add dir breakdown metrics for Intel
> > perf jevents: Add C-State metrics from the PCU PMU for Intel
> > perf jevents: Add local/remote miss latency metrics for Intel
> > perf jevents: Add upi_bw metric for Intel
> > perf jevents: Add mesh bandwidth saturation metric for Intel
> > perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
> > perf jevents: Validate that all names given an Event
> >
> > tools/perf/.gitignore | 5 +
> > tools/perf/Makefile.perf | 2 +
> > tools/perf/pmu-events/Build | 51 +-
> > tools/perf/pmu-events/amd_metrics.py | 492 ++++++++++
> > tools/perf/pmu-events/arm64_metrics.py | 49 +
> > tools/perf/pmu-events/common_metrics.py | 19 +
> > tools/perf/pmu-events/intel_metrics.py | 1129 +++++++++++++++++++++++
> > tools/perf/pmu-events/metric.py | 171 +++-
> > 8 files changed, 1914 insertions(+), 4 deletions(-)
> > create mode 100755 tools/perf/pmu-events/amd_metrics.py
> > create mode 100755 tools/perf/pmu-events/arm64_metrics.py
> > create mode 100644 tools/perf/pmu-events/common_metrics.py
> > create mode 100755 tools/perf/pmu-events/intel_metrics.py
> >
> > --
> > 2.52.0.457.g6b5491de43-goog
end of thread, other threads:[~2026-01-27 18:10 UTC | newest]
Thread overview: 40+ messages
-- links below jump to the message on this page --
2026-01-08 19:10 [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
2026-01-08 19:10 ` [PATCH v10 01/35] perf jevents: Build support for generating metrics from python Ian Rogers
2026-01-08 19:10 ` [PATCH v10 02/35] perf jevents: Add load event json to verify and allow fallbacks Ian Rogers
2026-01-08 19:10 ` [PATCH v10 03/35] perf jevents: Add RAPL event metric for AMD zen models Ian Rogers
2026-01-08 19:10 ` [PATCH v10 04/35] perf jevents: Add idle " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 05/35] perf jevents: Add upc metric for uops per cycle for AMD Ian Rogers
2026-01-08 19:10 ` [PATCH v10 06/35] perf jevents: Add br metric group for branch statistics on AMD Ian Rogers
2026-01-08 19:10 ` [PATCH v10 07/35] perf jevents: Add itlb metric group for AMD Ian Rogers
2026-01-08 19:10 ` [PATCH v10 08/35] perf jevents: Add dtlb " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 09/35] perf jevents: Add uncore l3 " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 10/35] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 11/35] perf jevents: Add context switch metrics " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 12/35] perf jevents: Add RAPL metrics for all Intel models Ian Rogers
2026-01-08 19:10 ` [PATCH v10 13/35] perf jevents: Add idle metric for " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 14/35] perf jevents: Add CheckPmu to see if a PMU is in loaded json events Ian Rogers
2026-01-08 19:10 ` [PATCH v10 15/35] perf jevents: Add smi metric group for Intel models Ian Rogers
2026-01-08 19:10 ` [PATCH v10 16/35] perf jevents: Mark metrics with experimental events as experimental Ian Rogers
2026-01-08 19:10 ` [PATCH v10 17/35] perf jevents: Add tsx metric group for Intel models Ian Rogers
2026-01-08 19:10 ` [PATCH v10 18/35] perf jevents: Add br metric group for branch statistics on Intel Ian Rogers
2026-01-08 19:10 ` [PATCH v10 19/35] perf jevents: Add software prefetch (swpf) metric group for Intel Ian Rogers
2026-01-08 19:10 ` [PATCH v10 20/35] perf jevents: Add ports metric group giving utilization on Intel Ian Rogers
2026-01-08 19:10 ` [PATCH v10 21/35] perf jevents: Add L2 metrics for Intel Ian Rogers
2026-01-08 19:10 ` [PATCH v10 22/35] perf jevents: Add load store breakdown metrics ldst " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 23/35] perf jevents: Add ILP metrics " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 24/35] perf jevents: Add context switch " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 25/35] perf jevents: Add FPU " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 26/35] perf jevents: Add Miss Level Parallelism (MLP) metric " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 27/35] perf jevents: Add mem_bw " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 28/35] perf jevents: Add local/remote "mem" breakdown metrics " Ian Rogers
2026-01-08 19:10 ` [PATCH v10 29/35] perf jevents: Add dir " Ian Rogers
2026-01-08 19:11 ` [PATCH v10 30/35] perf jevents: Add C-State metrics from the PCU PMU " Ian Rogers
2026-01-08 19:11 ` [PATCH v10 31/35] perf jevents: Add local/remote miss latency metrics " Ian Rogers
2026-01-08 19:11 ` [PATCH v10 32/35] perf jevents: Add upi_bw metric " Ian Rogers
2026-01-08 19:11 ` [PATCH v10 33/35] perf jevents: Add mesh bandwidth saturation " Ian Rogers
2026-01-08 19:11 ` [PATCH v10 34/35] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
2026-01-08 19:11 ` [PATCH v10 35/35] perf jevents: Validate that all names given an Event Ian Rogers
2026-01-20 5:23 ` [PATCH v10 00/35] AMD and Intel metric generation with Python Ian Rogers
2026-01-23 17:12 ` Ian Rogers
2026-01-27 17:07 ` Arnaldo Carvalho de Melo
2026-01-27 18:09 ` Ian Rogers