[PATCH v4 0/2] Python generated Arm64 metrics

linux-perf-users.vger.kernel.org archive mirror

* [PATCH v4 0/2] Python generated Arm64 metrics
@ 2024-09-26 17:57 Ian Rogers
  2024-09-26 17:57 ` [PATCH v4 1/2] perf jevents: Add collection of topdown like metrics for arm64 Ian Rogers
  2024-09-26 17:57 ` [PATCH v4 2/2] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
  0 siblings, 2 replies; 5+ messages in thread
From: Ian Rogers @ 2024-09-26 17:57 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, John Garry, linux-kernel,
	linux-perf-users, Jing Zhang, Thomas Richter, James Clark,
	Leo Yan

Generate two sets of additional metrics for Arm64, where the topdown
set decomposes yet further. The metrics primarily use JSON events,
where the JSON contains architecture standard events. Not all events
are in the JSON; for a53, for example, the events are only in sysfs.
Work around this by adding the sysfs events to the metrics, but
longer-term such events should be added to the JSON.
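
As a rough illustration of the sysfs workaround (a sketch only, not the
code in the patches): the fallback event string is derived from the
model name, so a model such as cortex-a53 can refer to its PMU's sysfs
event when no JSON event exists. The helper names sysfs_event and
pick_event and the example set of JSON names are hypothetical; only the
armv8_<model> naming is taken from patch 1.

```python
def sysfs_event(model: str, event: str) -> str:
    """Builds a sysfs-style event string, e.g. armv8_cortex_a53/inst_retired/."""
    # Patch 1 derives the PMU name from the model directory name.
    pmu_name = f'armv8_{model.replace("-", "_")}'
    return f"{pmu_name}/{event.lower()}/"


def pick_event(json_events: set, model: str, name: str) -> str:
    """Prefers the JSON event name, falling back to the sysfs encoding."""
    return name if name in json_events else sysfs_event(model, name)


if __name__ == "__main__":
    # Hypothetical model whose JSON lacks INST_RETIRED.
    print(pick_event({"STALL_FRONTEND"}, "cortex-a53", "INST_RETIRED"))
```

With the JSON entry missing, the call above resolves to the sysfs form
armv8_cortex_a53/inst_retired/; once the event is added to the JSON the
plain name would be used instead.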

The patches should be applied on top of:
https://lore.kernel.org/lkml/20240926175035.408668-1-irogers@google.com/

v4. Tweak to build dependencies and rebase.
v3. Some minor code cleanup changes.
v2. The cycles metrics are now made common and shared with AMD and
    Intel, suggested by Kan Liang <kan.liang@linux.intel.com>. This
    assumes these patches come after the AMD and Intel sets.

Ian Rogers (2):
  perf jevents: Add collection of topdown like metrics for arm64
  perf jevents: Add cycles breakdown metric for arm64/AMD/Intel

 tools/perf/pmu-events/Build             |   2 +-
 tools/perf/pmu-events/amd_metrics.py    |   3 +
 tools/perf/pmu-events/arm64_metrics.py  | 149 +++++++++++++++++++++++-
 tools/perf/pmu-events/common_metrics.py |  18 +++
 tools/perf/pmu-events/intel_metrics.py  |   2 +
 5 files changed, 169 insertions(+), 5 deletions(-)
 create mode 100644 tools/perf/pmu-events/common_metrics.py

-- 
2.46.1.824.gd892dcdcdd-goog



* [PATCH v4 1/2] perf jevents: Add collection of topdown like metrics for arm64
  2024-09-26 17:57 [PATCH v4 0/2] Python generated Arm64 metrics Ian Rogers
@ 2024-09-26 17:57 ` Ian Rogers
  2024-09-26 19:56   ` Leo Yan
  2024-09-26 17:57 ` [PATCH v4 2/2] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
  1 sibling, 1 reply; 5+ messages in thread
From: Ian Rogers @ 2024-09-26 17:57 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, John Garry, linux-kernel,
	linux-perf-users, Jing Zhang, Thomas Richter, James Clark,
	Leo Yan

Metrics are created using legacy, common and recommended events. As
events may be missing, a TryEvent function returns None if an event
is missing. To work around missing JSON events for cortex-a53, sysfs
encodings are used.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/pmu-events/arm64_metrics.py | 147 ++++++++++++++++++++++++-
 1 file changed, 143 insertions(+), 4 deletions(-)

diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
index c9aa2d827a82..bfac570600d9 100755
--- a/tools/perf/pmu-events/arm64_metrics.py
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -1,14 +1,151 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
-from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
-                    MetricGroup)
+from metric import (d_ratio, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+                    LoadEvents, Metric, MetricGroup)
 import argparse
 import json
 import os
+from typing import Optional
 
 # Global command line arguments.
 _args = None
 
+def Arm64Topdown() -> MetricGroup:
+  """Returns a MetricGroup representing ARM64 topdown like metrics."""
+  def TryEvent(name: str) -> Optional[Event]:
+    # Skip an event if not in the json files.
+    try:
+      return Event(name)
+    except:
+      return None
+  # ARM models like a53 lack JSON for INST_RETIRED but have the
+  # architectural standard event in sysfs. Use the PMU name to identify
+  # the sysfs event.
+  pmu_name = f'armv8_{_args.model.replace("-", "_")}'
+  ins = Event("instructions")
+  ins_ret = Event("INST_RETIRED", f"{pmu_name}/inst_retired/")
+  cycles = Event("cycles")
+  stall_fe = TryEvent("STALL_FRONTEND")
+  stall_be = TryEvent("STALL_BACKEND")
+  br_ret = TryEvent("BR_RETIRED")
+  br_mp_ret = TryEvent("BR_MIS_PRED_RETIRED")
+  dtlb_walk = TryEvent("DTLB_WALK")
+  itlb_walk = TryEvent("ITLB_WALK")
+  l1d_tlb = TryEvent("L1D_TLB")
+  l1i_tlb = TryEvent("L1I_TLB")
+  l1d_refill = Event("L1D_CACHE_REFILL", f"{pmu_name}/l1d_cache_refill/")
+  l2d_refill = Event("L2D_CACHE_REFILL", f"{pmu_name}/l2d_cache_refill/")
+  l1i_refill = Event("L1I_CACHE_REFILL", f"{pmu_name}/l1i_cache_refill/")
+  l1d_access = Event("L1D_CACHE", f"{pmu_name}/l1d_cache/")
+  l2d_access = Event("L2D_CACHE", f"{pmu_name}/l2d_cache/")
+  llc_access = TryEvent("LL_CACHE_RD")
+  l1i_access = Event("L1I_CACHE", f"{pmu_name}/l1i_cache/")
+  llc_miss_rd = TryEvent("LL_CACHE_MISS_RD")
+  ase_spec = TryEvent("ASE_SPEC")
+  ld_spec = TryEvent("LD_SPEC")
+  st_spec = TryEvent("ST_SPEC")
+  vfp_spec = TryEvent("VFP_SPEC")
+  dp_spec = TryEvent("DP_SPEC")
+  br_immed_spec = TryEvent("BR_IMMED_SPEC")
+  br_indirect_spec = TryEvent("BR_INDIRECT_SPEC")
+  br_ret_spec = TryEvent("BR_RETURN_SPEC")
+  crypto_spec = TryEvent("CRYPTO_SPEC")
+  inst_spec = TryEvent("INST_SPEC")
+
+  return MetricGroup("topdown", [
+      MetricGroup("topdown_tl", [
+          Metric("topdown_tl_ipc", "Instructions per cycle", d_ratio(
+              ins, cycles), "insn/cycle"),
+          Metric("topdown_tl_stall_fe_rate", "Frontend stalls to all cycles",
+                 d_ratio(stall_fe, cycles), "100%") if stall_fe else None,
+          Metric("topdown_tl_stall_be_rate", "Backend stalls to all cycles",
+                 d_ratio(stall_be, cycles), "100%") if stall_be else None,
+      ]),
+      MetricGroup("topdown_fe_bound", [
+          MetricGroup("topdown_fe_br", [
+              Metric("topdown_fe_br_mp_per_insn",
+                     "Branch mispredicts per instruction retired",
+                     d_ratio(br_mp_ret, ins_ret), "br/insn") if br_mp_ret else None,
+              Metric("topdown_fe_br_ins_rate",
+                     "Branches per instruction retired", d_ratio(
+                         br_ret, ins_ret), "100%") if br_ret else None,
+              Metric("topdown_fe_br_mispredict",
+                     "Branch mispredicts per branch instruction",
+                     d_ratio(br_mp_ret, br_ret), "100%") if br_mp_ret else None,
+          ]),
+          MetricGroup("topdown_fe_itlb", [
+              Metric("topdown_fe_itlb_walks", "Itlb walks per insn",
+                     d_ratio(itlb_walk, ins_ret), "walk/insn"),
+              Metric("topdown_fe_itlb_walk_rate", "Itlb walks per l1i access",
+                     d_ratio(itlb_walk, l1i_tlb), "100%"),
+          ]) if itlb_walk else None,
+          MetricGroup("topdown_fe_icache", [
+              Metric("topdown_fe_icache_l1i_per_insn",
+                     "L1I cache refills per instruction",
+                     d_ratio(l1i_refill, ins_ret), "l1i/insn"),
+              Metric("topdown_fe_icache_l1i_miss_rate",
+                     "L1I cache refills per L1I cache access",
+                     d_ratio(l1i_refill, l1i_access), "100%"),
+          ]),
+      ]),
+      MetricGroup("topdown_be_bound", [
+          MetricGroup("topdown_be_dtlb", [
+              Metric("topdown_be_dtlb_walks", "Dtlb walks per instruction",
+                     d_ratio(dtlb_walk, ins_ret), "walk/insn"),
+              Metric("topdown_be_dtlb_walk_rate", "Dtlb walks per l1d access",
+                     d_ratio(dtlb_walk, l1d_tlb), "100%"),
+          ]) if dtlb_walk else None,
+          MetricGroup("topdown_be_mix", [
+              Metric("topdown_be_mix_ld", "Percentage of load instructions",
+                     d_ratio(ld_spec, inst_spec), "100%") if ld_spec else None,
+              Metric("topdown_be_mix_st", "Percentage of store instructions",
+                     d_ratio(st_spec, inst_spec), "100%") if st_spec else None,
+              Metric("topdown_be_mix_simd", "Percentage of SIMD instructions",
+                     d_ratio(ase_spec, inst_spec), "100%") if ase_spec else None,
+              Metric("topdown_be_mix_fp",
+                     "Percentage of floating point instructions",
+                     d_ratio(vfp_spec, inst_spec), "100%") if vfp_spec else None,
+              Metric("topdown_be_mix_dp",
+                     "Percentage of data processing instructions",
+                     d_ratio(dp_spec, inst_spec), "100%") if dp_spec else None,
+              Metric("topdown_be_mix_crypto",
+                     "Percentage of crypto instructions",
+                     d_ratio(crypto_spec, inst_spec), "100%") if crypto_spec else None,
+              Metric(
+                  "topdown_be_mix_br", "Percentage of branch instructions",
+                  d_ratio(br_immed_spec + br_indirect_spec + br_ret_spec,
+                          inst_spec), "100%") if br_immed_spec and br_indirect_spec and br_ret_spec else None,
+          ]) if inst_spec else None,
+          MetricGroup("topdown_be_dcache", [
+              MetricGroup("topdown_be_dcache_l1", [
+                  Metric("topdown_be_dcache_l1_per_insn",
+                         "L1D cache refills per instruction",
+                         d_ratio(l1d_refill, ins_ret), "refills/insn"),
+                  Metric("topdown_be_dcache_l1_miss_rate",
+                         "L1D cache refills per L1D cache access",
+                         d_ratio(l1d_refill, l1d_access), "100%")
+              ]),
+              MetricGroup("topdown_be_dcache_l2", [
+                  Metric("topdown_be_dcache_l2_per_insn",
+                         "L2D cache refills per instruction",
+                         d_ratio(l2d_refill, ins_ret), "refills/insn"),
+                  Metric("topdown_be_dcache_l2_miss_rate",
+                         "L2D cache refills per L2D cache access",
+                         d_ratio(l2d_refill, l2d_access), "100%")
+              ]),
+              MetricGroup("topdown_be_dcache_llc", [
+                  Metric("topdown_be_dcache_llc_per_insn",
+                         "Last level cache misses per instruction",
+                         d_ratio(llc_miss_rd, ins_ret), "miss/insn"),
+                  Metric("topdown_be_dcache_llc_miss_rate",
+                         "Last level cache misses per L2D cache access",
+                         d_ratio(llc_miss_rd, llc_access), "100%")
+              ]) if llc_miss_rd and llc_access else None,
+          ]),
+      ]),
+  ])
+
+
 def main() -> None:
   global _args
 
@@ -29,11 +166,13 @@ def main() -> None:
   )
   _args = parser.parse_args()
 
-  all_metrics = MetricGroup("",[])
-
   directory = f"{_args.events_path}/arm64/{_args.vendor}/{_args.model}/"
   LoadEvents(directory)
 
+  all_metrics = MetricGroup("",[
+      Arm64Topdown(),
+  ])
+
   if _args.metricgroups:
     print(JsonEncodeMetricGroupDescriptions(all_metrics))
   else:
-- 
2.46.1.824.gd892dcdcdd-goog



* [PATCH v4 2/2] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
  2024-09-26 17:57 [PATCH v4 0/2] Python generated Arm64 metrics Ian Rogers
  2024-09-26 17:57 ` [PATCH v4 1/2] perf jevents: Add collection of topdown like metrics for arm64 Ian Rogers
@ 2024-09-26 17:57 ` Ian Rogers
  2024-09-26 20:12   ` Leo Yan
  1 sibling, 1 reply; 5+ messages in thread
From: Ian Rogers @ 2024-09-26 17:57 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, John Garry, linux-kernel,
	linux-perf-users, Jing Zhang, Thomas Richter, James Clark,
	Leo Yan

Break down cycles into user, kernel and guest. Add a common_metrics.py
file for such metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/pmu-events/Build             |  2 +-
 tools/perf/pmu-events/amd_metrics.py    |  3 +++
 tools/perf/pmu-events/arm64_metrics.py  |  2 ++
 tools/perf/pmu-events/common_metrics.py | 18 ++++++++++++++++++
 tools/perf/pmu-events/intel_metrics.py  |  2 ++
 5 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/pmu-events/common_metrics.py

diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index f3bc6c093360..91b6837e32c9 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -37,7 +37,7 @@ $(OUTPUT)pmu-events/arch/%: pmu-events/arch/%
 	$(call rule_mkdir)
 	$(Q)$(call echo-cmd,gen)cp $< $@
 
-GEN_METRIC_DEPS := pmu-events/metric.py
+GEN_METRIC_DEPS := pmu-events/metric.py pmu-events/common_metrics.py
 
 # Generate AMD Json
 ZENS = $(shell ls -d pmu-events/arch/x86/amdzen*)
diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index 422b119553ff..ccc8ebf13e08 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -4,6 +4,7 @@ from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
                     JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
                     Metric, MetricGroup, Select)
 import argparse
+from common_metrics import Cycles
 import json
 import math
 import os
@@ -571,6 +572,7 @@ def AmdUpc() -> Metric:
   return Metric("upc", "Micro-ops retired per core cycle (higher is better)",
                 upc, "uops/cycle")
 
+
 def Idle() -> Metric:
   cyc = Event("msr/mperf/")
   tsc = Event("msr/tsc/")
@@ -652,6 +654,7 @@ def main() -> None:
       AmdHwpf(),
       AmdSwpf(),
       AmdUpc(),
+      Cycles(),
       Idle(),
       Rapl(),
       UncoreL3(),
diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
index bfac570600d9..5285a22ff0c8 100755
--- a/tools/perf/pmu-events/arm64_metrics.py
+++ b/tools/perf/pmu-events/arm64_metrics.py
@@ -3,6 +3,7 @@
 from metric import (d_ratio, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
                     LoadEvents, Metric, MetricGroup)
 import argparse
+from common_metrics import Cycles
 import json
 import os
 from typing import Optional
@@ -171,6 +172,7 @@ def main() -> None:
 
   all_metrics = MetricGroup("",[
       Arm64Topdown(),
+      Cycles(),
   ])
 
   if _args.metricgroups:
diff --git a/tools/perf/pmu-events/common_metrics.py b/tools/perf/pmu-events/common_metrics.py
new file mode 100644
index 000000000000..74c58f9ab020
--- /dev/null
+++ b/tools/perf/pmu-events/common_metrics.py
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+from metric import (d_ratio, Event, Metric, MetricGroup)
+
+def Cycles() -> MetricGroup:
+  cyc_k = Event("cycles:kHh")
+  cyc_g = Event("cycles:G")
+  cyc_u = Event("cycles:uH")
+  cyc = cyc_k + cyc_g + cyc_u
+
+  return MetricGroup("cycles", [
+      Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
+      Metric("cycles_user", "User cycles as a percentage of all cycles",
+             d_ratio(cyc_u, cyc), "100%"),
+      Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
+             d_ratio(cyc_k, cyc), "100%"),
+      Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
+             d_ratio(cyc_g, cyc), "100%"),
+  ], description = "cycles breakdown per privilege level (user, kernel, guest)")
diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index a3a317d13841..4b7668e25e54 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -5,6 +5,7 @@ from metric import (d_ratio, has_event, max, source_count, CheckPmu, Event,
                     Literal, LoadEvents, Metric, MetricConstraint, MetricGroup,
                     MetricRef, Select)
 import argparse
+from common_metrics import Cycles
 import json
 import math
 import os
@@ -1050,6 +1051,7 @@ def main() -> None:
   LoadEvents(directory)
 
   all_metrics = MetricGroup("", [
+      Cycles(),
       Idle(),
       Rapl(),
       Smi(),
-- 
2.46.1.824.gd892dcdcdd-goog



* Re: [PATCH v4 1/2] perf jevents: Add collection of topdown like metrics for arm64
  2024-09-26 17:57 ` [PATCH v4 1/2] perf jevents: Add collection of topdown like metrics for arm64 Ian Rogers
@ 2024-09-26 19:56   ` Leo Yan
  0 siblings, 0 replies; 5+ messages in thread
From: Leo Yan @ 2024-09-26 19:56 UTC
  To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Adrian Hunter, Kan Liang, John Garry, linux-kernel,
	linux-perf-users, Jing Zhang, Thomas Richter, James Clark,
	Leo Yan

On 9/26/2024 6:57 PM, Ian Rogers wrote:
> 
> 
> Metrics are created using legacy, common and recommended events. As
> events may be missing a TryEvent function will give None if an event
> is missing. To workaround missing JSON events for cortex-a53, sysfs
> encodings are used.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/pmu-events/arm64_metrics.py | 147 ++++++++++++++++++++++++-
>  1 file changed, 143 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
> index c9aa2d827a82..bfac570600d9 100755
> --- a/tools/perf/pmu-events/arm64_metrics.py
> +++ b/tools/perf/pmu-events/arm64_metrics.py
> @@ -1,14 +1,151 @@
>  #!/usr/bin/env python3
>  # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> -from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
> -                    MetricGroup)
> +from metric import (d_ratio, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
> +                    LoadEvents, Metric, MetricGroup)
>  import argparse
>  import json
>  import os
> +from typing import Optional
> 
>  # Global command line arguments.
>  _args = None
> 
> +def Arm64Topdown() -> MetricGroup:
> +  """Returns a MetricGroup representing ARM64 topdown like metrics."""
> +  def TryEvent(name: str) -> Optional[Event]:
> +    # Skip an event if not in the json files.
> +    try:
> +      return Event(name)
> +    except:
> +      return None
> +  # ARM models like a53 lack JSON for INST_RETIRED but have the
> +  # architetural standard event in sysfs. Use the PMU name to identify
> +  # the sysfs event.
>
> +  pmu_name = f'armv8_{_args.model.replace("-", "_")}'
> +  ins = Event("instructions")
> +  ins_ret = Event("INST_RETIRED", f"{pmu_name}/inst_retired/")
> +  cycles = Event("cycles")
> +  stall_fe = TryEvent("STALL_FRONTEND")
> +  stall_be = TryEvent("STALL_BACKEND")
> +  br_ret = TryEvent("BR_RETIRED")
> +  br_mp_ret = TryEvent("BR_MIS_PRED_RETIRED")
> +  dtlb_walk = TryEvent("DTLB_WALK")
> +  itlb_walk = TryEvent("ITLB_WALK")
> +  l1d_tlb = TryEvent("L1D_TLB")
> +  l1i_tlb = TryEvent("L1I_TLB")
> +  l1d_refill = Event("L1D_CACHE_REFILL", f"{pmu_name}/l1d_cache_refill/")
> +  l2d_refill = Event("L2D_CACHE_REFILL", f"{pmu_name}/l2d_cache_refill/")
> +  l1i_refill = Event("L1I_CACHE_REFILL", f"{pmu_name}/l1i_cache_refill/")
> +  l1d_access = Event("L1D_CACHE", f"{pmu_name}/l1d_cache/")
> +  l2d_access = Event("L2D_CACHE", f"{pmu_name}/l2d_cache/")
> +  llc_access = TryEvent("LL_CACHE_RD")
> +  l1i_access = Event("L1I_CACHE", f"{pmu_name}/l1i_cache/")
> +  llc_miss_rd = TryEvent("LL_CACHE_MISS_RD")
> +  ase_spec = TryEvent("ASE_SPEC")
> +  ld_spec = TryEvent("LD_SPEC")
> +  st_spec = TryEvent("ST_SPEC")
> +  vfp_spec = TryEvent("VFP_SPEC")
> +  dp_spec = TryEvent("DP_SPEC")
> +  br_immed_spec = TryEvent("BR_IMMED_SPEC")
> +  br_indirect_spec = TryEvent("BR_INDIRECT_SPEC")
> +  br_ret_spec = TryEvent("BR_RETURN_SPEC")
> +  crypto_spec = TryEvent("CRYPTO_SPEC")
> +  inst_spec = TryEvent("INST_SPEC")
> +
> +  return MetricGroup("topdown", [
> +      MetricGroup("topdown_tl", [

Is "tl" short for "top level"?

> +          Metric("topdown_tl_ipc", "Instructions per cycle", d_ratio(
> +              ins, cycles), "insn/cycle"),
> +          Metric("topdown_tl_stall_fe_rate", "Frontend stalls to all cycles",
> +                 d_ratio(stall_fe, cycles), "100%") if stall_fe else None,
> +          Metric("topdown_tl_stall_be_rate", "Backend stalls to all cycles",
> +                 d_ratio(stall_be, cycles), "100%") if stall_be else None,
> +      ]),
> +      MetricGroup("topdown_fe_bound", [
> +          MetricGroup("topdown_fe_br", [
> +              Metric("topdown_fe_br_mp_per_insn",
> +                     "Branch mispredicts per instruction retired",
> +                     d_ratio(br_mp_ret, ins_ret), "br/insn") if br_mp_ret else None,
> +              Metric("topdown_fe_br_ins_rate",
> +                     "Branches per instruction retired", d_ratio(
> +                         br_ret, ins_ret), "100%") if br_ret else None,
> +              Metric("topdown_fe_br_mispredict",
> +                     "Branch mispredicts per branch instruction",
> +                     d_ratio(br_mp_ret, br_ret), "100%") if br_mp_ret else None,

For the condition check, should this not be:

  if (br_mp_ret and br_ret) else None
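
For illustration only (not part of the patch): when a metric depends on
several optional events, the check could be factored into a small
helper so that no operand is forgotten. The name build_if_present is
hypothetical, and plain strings stand in for Event objects here.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def build_if_present(build: Callable[[], T], *events: object) -> Optional[T]:
    """Returns build() only when every event is available (not None)."""
    if any(e is None for e in events):
        return None
    return build()


if __name__ == "__main__":
    # Stand-ins for Event objects; None models an event missing from the JSON.
    br_mp_ret, br_ret = "BR_MIS_PRED_RETIRED", None
    metric = build_if_present(lambda: f"{br_mp_ret} / {br_ret}",
                              br_mp_ret, br_ret)
    print(metric)  # None, because br_ret is missing
```

The lambda defers construction, so the ratio is never formed with a
missing operand.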

> +          ]),
> +          MetricGroup("topdown_fe_itlb", [
> +              Metric("topdown_fe_itlb_walks", "Itlb walks per insn",
> +                     d_ratio(itlb_walk, ins_ret), "walk/insn"),
> +              Metric("topdown_fe_itlb_walk_rate", "Itlb walks per l1i access",

s/l1i/L1I TLB

> +                     d_ratio(itlb_walk, l1i_tlb), "100%"),

Add a check: if l1i_tlb else None?

> +          ]) if itlb_walk else None,
> +          MetricGroup("topdown_fe_icache", [
> +              Metric("topdown_fe_icache_l1i_per_insn",
> +                     "L1I cache refills per instruction",
> +                     d_ratio(l1i_refill, ins_ret), "l1i/insn"),
> +              Metric("topdown_fe_icache_l1i_miss_rate",
> +                     "L1I cache refills per L1I cache access",
> +                     d_ratio(l1i_refill, l1i_access), "100%"),
> +          ]),
> +      ]),
> +      MetricGroup("topdown_be_bound", [
> +          MetricGroup("topdown_be_dtlb", [
> +              Metric("topdown_be_dtlb_walks", "Dtlb walks per instruction",
> +                     d_ratio(dtlb_walk, ins_ret), "walk/insn"),
> +              Metric("topdown_be_dtlb_walk_rate", "Dtlb walks per l1d access",

s/l1d/L1D TLB ?

> +                     d_ratio(dtlb_walk, l1d_tlb), "100%"),

Add a check here too: if l1d_tlb else None

> +          ]) if dtlb_walk else None,
> +          MetricGroup("topdown_be_mix", [
> +              Metric("topdown_be_mix_ld", "Percentage of load instructions",

Should we explicitly say "... speculatively executed instructions"?

> +                     d_ratio(ld_spec, inst_spec), "100%") if ld_spec else None,
> +              Metric("topdown_be_mix_st", "Percentage of store instructions",
> +                     d_ratio(st_spec, inst_spec), "100%") if st_spec else None,
> +              Metric("topdown_be_mix_simd", "Percentage of SIMD instructions",
> +                     d_ratio(ase_spec, inst_spec), "100%") if ase_spec else None,
> +              Metric("topdown_be_mix_fp",
> +                     "Percentage of floating point instructions",
> +                     d_ratio(vfp_spec, inst_spec), "100%") if vfp_spec else None,
> +              Metric("topdown_be_mix_dp",
> +                     "Percentage of data processing instructions",
> +                     d_ratio(dp_spec, inst_spec), "100%") if dp_spec else None,
> +              Metric("topdown_be_mix_crypto",
> +                     "Percentage of data processing instructions",
> +                     d_ratio(crypto_spec, inst_spec), "100%") if crypto_spec else None,
> +              Metric(
> +                  "topdown_be_mix_br", "Percentage of branch instructions",
> +                  d_ratio(br_immed_spec + br_indirect_spec + br_ret_spec,
> +                          inst_spec), "100%") if br_immed_spec and br_indirect_spec and br_ret_spec else None,
> +          ]) if inst_spec else None,
> +          MetricGroup("topdown_be_dcache", [
> +              MetricGroup("topdown_be_dcache_l1", [
> +                  Metric("topdown_be_dcache_l1_per_insn",
> +                         "L1D cache refills per instruction",
> +                         d_ratio(l1d_refill, ins_ret), "refills/insn"),
> +                  Metric("topdown_be_dcache_l1_miss_rate",
> +                         "L1D cache refills per L1D cache access",
> +                         d_ratio(l1d_refill, l1d_access), "100%")
> +              ]),
> +              MetricGroup("topdown_be_dcache_l2", [
> +                  Metric("topdown_be_dcache_l2_per_insn",
> +                         "L2D cache refills per instruction",
> +                         d_ratio(l2d_refill, ins_ret), "refills/insn"),
> +                  Metric("topdown_be_dcache_l2_miss_rate",
> +                         "L2D cache refills per L2D cache access",
> +                         d_ratio(l2d_refill, l2d_access), "100%")
> +              ]),
> +              MetricGroup("topdown_be_dcache_llc", [
> +                  Metric("topdown_be_dcache_llc_per_insn",
> +                         "Last level cache misses per instruction",
> +                         d_ratio(llc_miss_rd, ins_ret), "miss/insn"),
> +                  Metric("topdown_be_dcache_llc_miss_rate",
> +                         "Last level cache misses per L2D cache access",

Typo: s/L2D/last level/

> +                         d_ratio(llc_miss_rd, llc_access), "100%")
> +              ]) if llc_miss_rd and llc_access else None,
> +          ]),
> +      ]),
> +  ])
> +
> +
>  def main() -> None:
>    global _args
> 
> @@ -29,11 +166,13 @@ def main() -> None:
>    )
>    _args = parser.parse_args()
> 
> -  all_metrics = MetricGroup("",[])
> -
>    directory = f"{_args.events_path}/arm64/{_args.vendor}/{_args.model}/"
>    LoadEvents(directory)
> 
> +  all_metrics = MetricGroup("",[
> +      Arm64Topdown(),
> +  ])
> +
>    if _args.metricgroups:
>      print(JsonEncodeMetricGroupDescriptions(all_metrics))
>    else:
> --
> 2.46.1.824.gd892dcdcdd-goog
> 
> 


* Re: [PATCH v4 2/2] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
  2024-09-26 17:57 ` [PATCH v4 2/2] perf jevents: Add cycles breakdown metric for arm64/AMD/Intel Ian Rogers
@ 2024-09-26 20:12   ` Leo Yan
  0 siblings, 0 replies; 5+ messages in thread
From: Leo Yan @ 2024-09-26 20:12 UTC
  To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Adrian Hunter, Kan Liang, John Garry, linux-kernel,
	linux-perf-users, Jing Zhang, Thomas Richter, James Clark,
	Leo Yan

On 9/26/2024 6:57 PM, Ian Rogers wrote:
> 
> Breakdown cycles to user, kernel and guest. Add a common_metrics.py
> file for such metrics.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/pmu-events/Build             |  2 +-
>  tools/perf/pmu-events/amd_metrics.py    |  3 +++
>  tools/perf/pmu-events/arm64_metrics.py  |  2 ++
>  tools/perf/pmu-events/common_metrics.py | 18 ++++++++++++++++++
>  tools/perf/pmu-events/intel_metrics.py  |  2 ++
>  5 files changed, 26 insertions(+), 1 deletion(-)
>  create mode 100644 tools/perf/pmu-events/common_metrics.py
> 
> diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
> index f3bc6c093360..91b6837e32c9 100644
> --- a/tools/perf/pmu-events/Build
> +++ b/tools/perf/pmu-events/Build
> @@ -37,7 +37,7 @@ $(OUTPUT)pmu-events/arch/%: pmu-events/arch/%
>         $(call rule_mkdir)
>         $(Q)$(call echo-cmd,gen)cp $< $@
> 
> -GEN_METRIC_DEPS := pmu-events/metric.py
> +GEN_METRIC_DEPS := pmu-events/metric.py pmu-events/common_metrics.py
> 
>  # Generate AMD Json
>  ZENS = $(shell ls -d pmu-events/arch/x86/amdzen*)
> diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
> index 422b119553ff..ccc8ebf13e08 100755
> --- a/tools/perf/pmu-events/amd_metrics.py
> +++ b/tools/perf/pmu-events/amd_metrics.py
> @@ -4,6 +4,7 @@ from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
>                      JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
>                      Metric, MetricGroup, Select)
>  import argparse
> +from common_metrics import Cycles
>  import json
>  import math
>  import os
> @@ -571,6 +572,7 @@ def AmdUpc() -> Metric:
>    return Metric("upc", "Micro-ops retired per core cycle (higher is better)",
>                  upc, "uops/cycle")
> 
> +
>  def Idle() -> Metric:
>    cyc = Event("msr/mperf/")
>    tsc = Event("msr/tsc/")
> @@ -652,6 +654,7 @@ def main() -> None:
>        AmdHwpf(),
>        AmdSwpf(),
>        AmdUpc(),
> +      Cycles(),
>        Idle(),
>        Rapl(),
>        UncoreL3(),
> diff --git a/tools/perf/pmu-events/arm64_metrics.py b/tools/perf/pmu-events/arm64_metrics.py
> index bfac570600d9..5285a22ff0c8 100755
> --- a/tools/perf/pmu-events/arm64_metrics.py
> +++ b/tools/perf/pmu-events/arm64_metrics.py
> @@ -3,6 +3,7 @@
>  from metric import (d_ratio, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
>                      LoadEvents, Metric, MetricGroup)
>  import argparse
> +from common_metrics import Cycles
>  import json
>  import os
>  from typing import Optional
> @@ -171,6 +172,7 @@ def main() -> None:
> 
>    all_metrics = MetricGroup("",[
>        Arm64Topdown(),
> +      Cycles(),
>    ])
> 
>    if _args.metricgroups:
> diff --git a/tools/perf/pmu-events/common_metrics.py b/tools/perf/pmu-events/common_metrics.py
> new file mode 100644
> index 000000000000..74c58f9ab020
> --- /dev/null
> +++ b/tools/perf/pmu-events/common_metrics.py
> @@ -0,0 +1,18 @@
> +# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> +from metric import (d_ratio, Event, Metric, MetricGroup)
> +
> +def Cycles() -> MetricGroup:
> +  cyc_k = Event("cycles:kHh")

I am confused about whether these modifiers combine as an OR or an AND
operation.

It seems to me that the 'k' and 'H' modifiers combine as an AND, to
count host and kernel. But 'h' combines as an OR with the other
modifiers, as we want to count both the host kernel and the hypervisor.
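
One way to picture this is the sketch below. It is a simplified model
based on my understanding of how perf turns modifiers into the
exclude_* bits of perf_event_attr, so it is an assumption to verify
rather than perf's implementation: within the u/k/h group the listed
levels are unioned (OR), within the G/H group likewise, and the two
groups then apply together (AND). The function decode_modifiers is
hypothetical and ignores all other modifiers and any default guest
handling.

```python
def decode_modifiers(mods: str) -> dict:
    """Simplified model of how u/k/h and G/H modifiers map to exclude flags."""
    priv = set(mods) & set("ukh")
    virt = set(mods) & set("GH")
    return {
        # Once any of u/k/h is given, the levels not listed are excluded.
        "exclude_user": bool(priv) and "u" not in priv,
        "exclude_kernel": bool(priv) and "k" not in priv,
        "exclude_hv": bool(priv) and "h" not in priv,
        # Likewise for G/H: naming one excludes the other.
        "exclude_guest": bool(virt) and "G" not in virt,
        "exclude_host": bool(virt) and "H" not in virt,
    }


if __name__ == "__main__":
    for mods in ("kHh", "G", "uH"):
        print(mods, decode_modifiers(mods))
```

Under this model "cycles:kHh" counts kernel or hypervisor cycles, and
only on the host, which matches the reading above.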

Sorry if this duplicates a question that has already been discussed.

The patch itself looks good to me.

Thanks for working on this.
Leo

> +  cyc_g = Event("cycles:G")
> +  cyc_u = Event("cycles:uH")
> +  cyc = cyc_k + cyc_g + cyc_u
> +
> +  return MetricGroup("cycles", [
> +      Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
> +      Metric("cycles_user", "User cycles as a percentage of all cycles",
> +             d_ratio(cyc_u, cyc), "100%"),
> +      Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
> +             d_ratio(cyc_k, cyc), "100%"),
> +      Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
> +             d_ratio(cyc_g, cyc), "100%"),
> +  ], description = "cycles breakdown per privilege level (users, kernel, guest)")
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index a3a317d13841..4b7668e25e54 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -5,6 +5,7 @@ from metric import (d_ratio, has_event, max, source_count, CheckPmu, Event,
>                      Literal, LoadEvents, Metric, MetricConstraint, MetricGroup,
>                      MetricRef, Select)
>  import argparse
> +from common_metrics import Cycles
>  import json
>  import math
>  import os
> @@ -1050,6 +1051,7 @@ def main() -> None:
>    LoadEvents(directory)
> 
>    all_metrics = MetricGroup("", [
> +      Cycles(),
>        Idle(),
>        Rapl(),
>        Smi(),
> --
> 2.46.1.824.gd892dcdcdd-goog
> 
> 

