* [PATCH v2] perf vendor events arm64: Fix for AmpereOne metrics
From: Ilkka Koskinen @ 2023-09-20 6:18 UTC
To: James Clark, John Garry, Ian Rogers, Arnaldo Carvalho de Melo,
Will Deacon, Ilkka Koskinen, D Scott Phillips
Cc: Mike Leach, Leo Yan, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Namhyung Kim, Adrian Hunter,
Dave Kleikamp, linux-arm-kernel, linux-perf-users, linux-kernel
This patch addresses review comments that were given for
705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
but didn't make it into the original patch [1][2].
Changes include: a fix for the backend_memory formula, use of standard metrics
where possible, use of #slots, renaming metrics to avoid spaces in their names,
and cleanup.
[1] https://lore.kernel.org/linux-perf-users/e9bdacb-a231-36af-6a2e-6918ee7effa@os.amperecomputing.com/
[2] https://lore.kernel.org/linux-perf-users/20230826192352.3043220-1-ilkka@os.amperecomputing.com/
Fixes: 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
Fixed the scaling issues on some of the metrics in v1.
I'll be offline for a couple of weeks but Scott can address any review
comments meanwhile.
Cheers, Ilkka
.../arch/arm64/ampere/ampereone/metrics.json | 418 +++++++++---------
1 file changed, 220 insertions(+), 198 deletions(-)
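One of the changes listed above is renaming metrics so that their names contain no spaces. As a rough illustration only (not part of the patch), a check along these lines could flag names that are not plain identifiers. The helper name `suspicious_metric_names`, the sample data, and the "letters, digits and underscores only" rule are assumptions made for this sketch; perf itself may accept a wider set of names.

```python
import json
import re

# Assumption for this sketch: a "clean" metric name is made only of
# letters, digits and underscores. perf itself may accept more than this.
NAME_RE = re.compile(r"^[A-Za-z0-9_]+$")


def suspicious_metric_names(metrics_json_text):
    """Return MetricName values that contain spaces or other separators."""
    entries = json.loads(metrics_json_text)
    return [
        entry["MetricName"]
        for entry in entries
        # Entries that only carry "ArchStdEvent" have no MetricName to check.
        if "MetricName" in entry and not NAME_RE.match(entry["MetricName"])
    ]


# Hypothetical sample mixing one old-style name, two names from this patch,
# and one ArchStdEvent-only entry.
SAMPLE = json.dumps([
    {"MetricName": "branch_miss_pred_rate", "MetricExpr": "BR_MIS_PRED / BR_PRED"},
    {"MetricName": "Bus utilization", "MetricExpr": "BUS_ACCESS / BUS_CYCLES"},
    {"MetricName": "Miss_Ratio;l1d_cache_read_miss",
     "MetricExpr": "L1D_CACHE_LMISS_RD / L1D_CACHE_RD"},
    {"ArchStdEvent": "backend_bound"},
])

if __name__ == "__main__":
    for name in suspicious_metric_names(SAMPLE):
        print("check name:", name)
```

Under this assumed rule, the old-style name with a space is flagged, and so is the `Miss_Ratio;l1d_cache_read_miss` name that appears in the diff below, which may be worth a second look.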
diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
index 1e7e8901a445..e2848a9d4848 100644
--- a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
@@ -1,362 +1,384 @@
[
{
+ "MetricName": "branch_miss_pred_rate",
"MetricExpr": "BR_MIS_PRED / BR_PRED",
"BriefDescription": "Branch predictor misprediction rate. May not count branches that are never resolved because they are in the misprediction shadow of an earlier branch",
- "MetricGroup": "Branch Prediction",
- "MetricName": "Misprediction"
+ "MetricGroup": "branch",
+ "ScaleUnit": "100%"
},
{
- "MetricExpr": "BR_MIS_PRED_RETIRED / BR_RETIRED",
- "BriefDescription": "Branch predictor misprediction rate",
- "MetricGroup": "Branch Prediction",
- "MetricName": "Misprediction (retired)"
- },
- {
- "MetricExpr": "BUS_ACCESS / ( BUS_CYCLES * 1)",
+ "MetricName": "bus_utilization",
+ "MetricExpr": "((BUS_ACCESS / (BUS_CYCLES * 1)) * 100)",
"BriefDescription": "Core-to-uncore bus utilization",
"MetricGroup": "Bus",
- "MetricName": "Bus utilization"
+ "ScaleUnit": "1percent of bus cycles"
},
{
- "MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
- "BriefDescription": "L1D cache miss rate",
- "MetricGroup": "Cache",
- "MetricName": "L1D cache miss"
+ "MetricName": "l1d_cache_miss_ratio",
+ "MetricExpr": "(L1D_CACHE_REFILL / L1D_CACHE)",
+ "BriefDescription": "This metric measures the ratio of level 1 data cache accesses missed to the total number of level 1 data cache accesses. This gives an indication of the effectiveness of the level 1 data cache.",
+ "MetricGroup": "Miss_Ratio;L1D_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
+ },
+ {
+ "MetricName": "l1i_cache_miss_ratio",
+ "MetricExpr": "(L1I_CACHE_REFILL / L1I_CACHE)",
+ "BriefDescription": "This metric measures the ratio of level 1 instruction cache accesses missed to the total number of level 1 instruction cache accesses. This gives an indication of the effectiveness of the level 1 instruction cache.",
+ "MetricGroup": "Miss_Ratio;L1I_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
},
{
+ "MetricName": "Miss_Ratio;l1d_cache_read_miss",
"MetricExpr": "L1D_CACHE_LMISS_RD / L1D_CACHE_RD",
"BriefDescription": "L1D cache read miss rate",
"MetricGroup": "Cache",
- "MetricName": "L1D cache read miss"
+ "ScaleUnit": "1per cache read access"
},
{
- "MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
- "BriefDescription": "L1I cache miss rate",
- "MetricGroup": "Cache",
- "MetricName": "L1I cache miss"
- },
- {
- "MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
- "BriefDescription": "L2 cache miss rate",
- "MetricGroup": "Cache",
- "MetricName": "L2 cache miss"
+ "MetricName": "l2_cache_miss_ratio",
+ "MetricExpr": "(L2D_CACHE_REFILL / L2D_CACHE)",
+ "BriefDescription": "This metric measures the ratio of level 2 cache accesses missed to the total number of level 2 cache accesses. This gives an indication of the effectiveness of the level 2 cache, which is a unified cache that stores both data and instruction. Note that cache accesses in this cache are either data memory access or instruction fetch as this is a unified cache.",
+ "MetricGroup": "Miss_Ratio;L2_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
},
{
+ "MetricName": "l1i_cache_read_miss_rate",
"MetricExpr": "L1I_CACHE_LMISS / L1I_CACHE",
"BriefDescription": "L1I cache read miss rate",
"MetricGroup": "Cache",
- "MetricName": "L1I cache read miss"
+ "ScaleUnit": "1per cache access"
},
{
+ "MetricName": "l2d_cache_read_miss_rate",
"MetricExpr": "L2D_CACHE_LMISS_RD / L2D_CACHE_RD",
"BriefDescription": "L2 cache read miss rate",
"MetricGroup": "Cache",
- "MetricName": "L2 cache read miss"
+ "ScaleUnit": "1per cache read access"
},
{
- "MetricExpr": "(L1D_CACHE_LMISS_RD * 1000) / INST_RETIRED",
+ "MetricName": "l1d_cache_miss_mpki",
+ "MetricExpr": "(L1D_CACHE_LMISS_RD * 1e3) / INST_RETIRED",
"BriefDescription": "Misses per thousand instructions (data)",
"MetricGroup": "Cache",
- "MetricName": "MPKI data"
+ "ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "(L1I_CACHE_LMISS * 1000) / INST_RETIRED",
+ "MetricName": "l1i_cache_miss_mpki",
+ "MetricExpr": "(L1I_CACHE_LMISS * 1e3) / INST_RETIRED",
"BriefDescription": "Misses per thousand instructions (instruction)",
"MetricGroup": "Cache",
- "MetricName": "MPKI instruction"
+ "ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "ASE_SPEC / OP_SPEC",
- "BriefDescription": "Proportion of advanced SIMD data processing operations (excluding DP_SPEC/LD_SPEC) operations",
- "MetricGroup": "Instruction",
- "MetricName": "ASE mix"
+ "MetricName": "simd_percentage",
+ "MetricExpr": "((ASE_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures advanced SIMD operations as a percentage of total operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "CRYPTO_SPEC / OP_SPEC",
- "BriefDescription": "Proportion of crypto data processing operations",
- "MetricGroup": "Instruction",
- "MetricName": "Crypto mix"
+ "MetricName": "crypto_percentage",
+ "MetricExpr": "((CRYPTO_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures crypto operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "VFP_SPEC / (duration_time *1000000000)",
+ "MetricName": "gflops",
+ "MetricExpr": "VFP_SPEC / (duration_time * 1e9)",
"BriefDescription": "Giga-floating point operations per second",
- "MetricGroup": "Instruction",
- "MetricName": "GFLOPS_ISSUED"
+ "MetricGroup": "InstructionMix"
},
{
- "MetricExpr": "DP_SPEC / OP_SPEC",
- "BriefDescription": "Proportion of integer data processing operations",
- "MetricGroup": "Instruction",
- "MetricName": "Integer mix"
+ "MetricName": "integer_dp_percentage",
+ "MetricExpr": "((DP_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures scalar integer operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "INST_RETIRED / CPU_CYCLES",
- "BriefDescription": "Instructions per cycle",
- "MetricGroup": "Instruction",
- "MetricName": "IPC"
+ "MetricName": "ipc",
+ "MetricExpr": "(INST_RETIRED / CPU_CYCLES)",
+ "BriefDescription": "This metric measures the number of instructions retired per cycle.",
+ "MetricGroup": "General",
+ "ScaleUnit": "1per cycle"
},
{
- "MetricExpr": "LD_SPEC / OP_SPEC",
- "BriefDescription": "Proportion of load operations",
- "MetricGroup": "Instruction",
- "MetricName": "Load mix"
+ "MetricName": "load_percentage",
+ "MetricExpr": "((LD_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures load operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "LDST_SPEC/ OP_SPEC",
- "BriefDescription": "Proportion of load & store operations",
- "MetricGroup": "Instruction",
- "MetricName": "Load-store mix"
+ "MetricName": "load_store_spec_rate",
+ "MetricExpr": "((LDST_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "The rate of load or store instructions speculatively executed to overall instructions speculatively executed",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "INST_RETIRED / (duration_time * 1000000)",
+ "MetricName": "retired_mips",
+ "MetricExpr": "INST_RETIRED / (duration_time * 1e6)",
"BriefDescription": "Millions of instructions per second",
- "MetricGroup": "Instruction",
- "MetricName": "MIPS_RETIRED"
+ "MetricGroup": "InstructionMix"
},
{
- "MetricExpr": "INST_SPEC / (duration_time * 1000000)",
+ "MetricName": "spec_utilization_mips",
+ "MetricExpr": "INST_SPEC / (duration_time * 1e6)",
"BriefDescription": "Millions of instructions per second",
- "MetricGroup": "Instruction",
- "MetricName": "MIPS_UTILIZATION"
- },
- {
- "MetricExpr": "PC_WRITE_SPEC / OP_SPEC",
- "BriefDescription": "Proportion of software change of PC operations",
- "MetricGroup": "Instruction",
- "MetricName": "PC write mix"
+ "MetricGroup": "PEutilization"
},
{
- "MetricExpr": "ST_SPEC / OP_SPEC",
- "BriefDescription": "Proportion of store operations",
- "MetricGroup": "Instruction",
- "MetricName": "Store mix"
+ "MetricName": "pc_write_spec_rate",
+ "MetricExpr": "((PC_WRITE_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "The rate of software change of the PC speculatively executed to overall instructions speculatively executed",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "VFP_SPEC / OP_SPEC",
- "BriefDescription": "Proportion of FP operations",
- "MetricGroup": "Instruction",
- "MetricName": "VFP mix"
+ "MetricName": "store_percentage",
+ "MetricExpr": "((ST_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures store operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "1 - (OP_RETIRED/ (CPU_CYCLES * 4))",
- "BriefDescription": "Proportion of slots lost",
- "MetricGroup": "Speculation / TDA",
- "MetricName": "CPU lost"
+ "MetricName": "scalar_fp_percentage",
+ "MetricExpr": "((VFP_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures scalar floating point operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "OP_RETIRED/ (CPU_CYCLES * 4)",
- "BriefDescription": "Proportion of slots retiring",
- "MetricGroup": "Speculation / TDA",
- "MetricName": "CPU utilization"
+ "MetricName": "retired_rate",
+ "MetricExpr": "OP_RETIRED / OP_SPEC",
+ "BriefDescription": "Of all the micro-operations issued, what percentage are retired(committed)",
+ "MetricGroup": "General",
+ "ScaleUnit": "100%"
},
{
- "MetricExpr": "OP_RETIRED - OP_SPEC",
- "BriefDescription": "Operations lost due to misspeculation",
- "MetricGroup": "Speculation / TDA",
- "MetricName": "Operations lost"
+ "MetricName": "wasted",
+ "MetricExpr": "1 - (OP_RETIRED / (CPU_CYCLES * #slots))",
+ "BriefDescription": "Of all the micro-operations issued, what proportion are lost",
+ "MetricGroup": "General",
+ "ScaleUnit": "100%"
},
{
- "MetricExpr": "1 - (OP_RETIRED / OP_SPEC)",
- "BriefDescription": "Proportion of operations lost",
- "MetricGroup": "Speculation / TDA",
- "MetricName": "Operations lost (ratio)"
+ "MetricName": "wasted_rate",
+ "MetricExpr": "1 - OP_RETIRED / OP_SPEC",
+ "BriefDescription": "Of all the micro-operations issued, what percentage are not retired(committed)",
+ "MetricGroup": "General",
+ "ScaleUnit": "100%"
},
{
- "MetricExpr": "OP_RETIRED / OP_SPEC",
- "BriefDescription": "Proportion of operations retired",
- "MetricGroup": "Speculation / TDA",
- "MetricName": "Operations retired"
- },
- {
- "MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
+ "MetricName": "stall_backend_cache_rate",
+ "MetricExpr": "((STALL_BACKEND_CACHE / CPU_CYCLES) * 100)",
"BriefDescription": "Proportion of cycles stalled and no operations issued to backend and cache miss",
"MetricGroup": "Stall",
- "MetricName": "Stall backend cache cycles"
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "STALL_BACKEND_RESOURCE / CPU_CYCLES",
+ "MetricName": "stall_backend_resource_rate",
+ "MetricExpr": "((STALL_BACKEND_RESOURCE / CPU_CYCLES) * 100)",
"BriefDescription": "Proportion of cycles stalled and no operations issued to backend and resource full",
"MetricGroup": "Stall",
- "MetricName": "Stall backend resource cycles"
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
+ "MetricName": "stall_backend_tlb_rate",
+ "MetricExpr": "((STALL_BACKEND_TLB / CPU_CYCLES) * 100)",
"BriefDescription": "Proportion of cycles stalled and no operations issued to backend and TLB miss",
"MetricGroup": "Stall",
- "MetricName": "Stall backend tlb cycles"
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
+ "MetricName": "stall_frontend_cache_rate",
+ "MetricExpr": "((STALL_FRONTEND_CACHE / CPU_CYCLES) * 100)",
"BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and cache miss",
"MetricGroup": "Stall",
- "MetricName": "Stall frontend cache cycles"
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "STALL_FRONTEND_TLB / CPU_CYCLES",
+ "MetricName": "stall_frontend_tlb_rate",
+ "MetricExpr": "((STALL_FRONTEND_TLB / CPU_CYCLES) * 100)",
"BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and TLB miss",
"MetricGroup": "Stall",
- "MetricName": "Stall frontend tlb cycles"
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "DTLB_WALK / L1D_TLB",
- "BriefDescription": "D-side walk per d-side translation request",
- "MetricGroup": "TLB",
- "MetricName": "DTLB walks"
+ "MetricName": "dtlb_walk_ratio",
+ "MetricExpr": "(DTLB_WALK / L1D_TLB)",
+ "BriefDescription": "This metric measures the ratio of data TLB Walks to the total number of data TLB accesses. This gives an indication of the effectiveness of the data TLB accesses.",
+ "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
+ "ScaleUnit": "1per TLB access"
},
{
- "MetricExpr": "ITLB_WALK / L1I_TLB",
- "BriefDescription": "I-side walk per i-side translation request",
- "MetricGroup": "TLB",
- "MetricName": "ITLB walks"
+ "MetricName": "itlb_walk_ratio",
+ "MetricExpr": "(ITLB_WALK / L1I_TLB)",
+ "BriefDescription": "This metric measures the ratio of instruction TLB Walks to the total number of instruction TLB accesses. This gives an indication of the effectiveness of the instruction TLB accesses.",
+ "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
+ "ScaleUnit": "1per TLB access"
},
{
- "MetricExpr": "STALL_SLOT_BACKEND / (CPU_CYCLES * 4)",
- "BriefDescription": "Fraction of slots backend bound",
- "MetricGroup": "TopDownL1",
- "MetricName": "backend"
+ "ArchStdEvent": "backend_bound"
},
{
- "MetricExpr": "1 - (retiring + lost + backend)",
- "BriefDescription": "Fraction of slots frontend bound",
- "MetricGroup": "TopDownL1",
- "MetricName": "frontend"
+ "ArchStdEvent": "frontend_bound",
+ "MetricExpr": "100 - (retired_fraction + slots_lost_misspeculation_fraction + backend_bound)"
},
{
- "MetricExpr": "((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * 4))",
+ "MetricName": "slots_lost_misspeculation_fraction",
+ "MetricExpr": "100 * ((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * #slots))",
"BriefDescription": "Fraction of slots lost due to misspeculation",
- "MetricGroup": "TopDownL1",
- "MetricName": "lost"
+ "MetricGroup": "Default;TopdownL1",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "(OP_RETIRED / (CPU_CYCLES * 4))",
+ "MetricName": "retired_fraction",
+ "MetricExpr": "100 * (OP_RETIRED / (CPU_CYCLES * #slots))",
"BriefDescription": "Fraction of slots retiring, useful work",
- "MetricGroup": "TopDownL1",
- "MetricName": "retiring"
+ "MetricGroup": "Default;TopdownL1",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "backend - backend_memory",
+ "MetricName": "backend_core",
+ "MetricExpr": "(backend_bound / 100) - backend_memory",
"BriefDescription": "Fraction of slots the CPU was stalled due to backend non-memory subsystem issues",
- "MetricGroup": "TopDownL2",
- "MetricName": "backend_core"
+ "MetricGroup": "TopdownL2",
+ "ScaleUnit": "100%"
},
{
- "MetricExpr": "(STALL_BACKEND_TLB + STALL_BACKEND_CACHE + STALL_BACKEND_MEM) / CPU_CYCLES ",
+ "MetricName": "backend_memory",
+ "MetricExpr": "(STALL_BACKEND_TLB + STALL_BACKEND_CACHE) / CPU_CYCLES",
"BriefDescription": "Fraction of slots the CPU was stalled due to backend memory subsystem issues (cache/tlb miss)",
- "MetricGroup": "TopDownL2",
- "MetricName": "backend_memory"
+ "MetricGroup": "TopdownL2",
+ "ScaleUnit": "100%"
},
{
- "MetricExpr": " (BR_MIS_PRED_RETIRED / GPC_FLUSH) * lost",
+ "MetricName": "branch_mispredict",
+ "MetricExpr": "(BR_MIS_PRED_RETIRED / GPC_FLUSH) * slots_lost_misspeculation_fraction",
"BriefDescription": "Fraction of slots lost due to branch misprediciton",
- "MetricGroup": "TopDownL2",
- "MetricName": "branch_mispredict"
+ "MetricGroup": "TopdownL2",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "frontend - frontend_latency",
+ "MetricName": "frontend_bandwidth",
+ "MetricExpr": "frontend_bound - frontend_latency",
"BriefDescription": "Fraction of slots the CPU did not dispatch at full bandwidth - able to dispatch partial slots only (1, 2, or 3 uops)",
- "MetricGroup": "TopDownL2",
- "MetricName": "frontend_bandwidth"
+ "MetricGroup": "TopdownL2",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "(STALL_FRONTEND - ((STALL_SLOT_FRONTEND - (frontend * CPU_CYCLES * 4)) / 4)) / CPU_CYCLES",
+ "MetricName": "frontend_latency",
+ "MetricExpr": "((STALL_FRONTEND - ((STALL_SLOT_FRONTEND - ((frontend_bound / 100) * CPU_CYCLES * #slots)) / #slots)) / CPU_CYCLES) * 100",
"BriefDescription": "Fraction of slots the CPU was stalled due to frontend latency issues (cache/tlb miss); nothing to dispatch",
- "MetricGroup": "TopDownL2",
- "MetricName": "frontend_latency"
+ "MetricGroup": "TopdownL2",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "lost - branch_mispredict",
+ "MetricName": "other_miss_pred",
+ "MetricExpr": "slots_lost_misspeculation_fraction - branch_mispredict",
"BriefDescription": "Fraction of slots lost due to other/non-branch misprediction misspeculation",
- "MetricGroup": "TopDownL2",
- "MetricName": "other_clears"
+ "MetricGroup": "TopdownL2",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "(IXU_NUM_UOPS_ISSUED + FSU_ISSUED) / (CPU_CYCLES * 6)",
+ "MetricName": "pipe_utilization",
+ "MetricExpr": "100 * ((IXU_NUM_UOPS_ISSUED + FSU_ISSUED) / (CPU_CYCLES * 6))",
"BriefDescription": "Fraction of execute slots utilized",
- "MetricGroup": "TopDownL2",
- "MetricName": "pipe_utilization"
+ "MetricGroup": "TopdownL2",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "STALL_BACKEND_MEM / CPU_CYCLES",
+ "MetricName": "d_cache_l2_miss_rate",
+ "MetricExpr": "((STALL_BACKEND_MEM / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled due to data L2 cache miss",
- "MetricGroup": "TopDownL3",
- "MetricName": "d_cache_l2_miss"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
+ "MetricName": "d_cache_miss_rate",
+ "MetricExpr": "((STALL_BACKEND_CACHE / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled due to data cache miss",
- "MetricGroup": "TopDownL3",
- "MetricName": "d_cache_miss"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
+ "MetricName": "d_tlb_miss_rate",
+ "MetricExpr": "((STALL_BACKEND_TLB / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled due to data TLB miss",
- "MetricGroup": "TopDownL3",
- "MetricName": "d_tlb_miss"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "FSU_ISSUED / (CPU_CYCLES * 2)",
+ "MetricName": "fsu_pipe_utilization",
+ "MetricExpr": "((FSU_ISSUED / (CPU_CYCLES * 2)) * 100)",
"BriefDescription": "Fraction of FSU execute slots utilized",
- "MetricGroup": "TopDownL3",
- "MetricName": "fsu_pipe_utilization"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
+ "MetricName": "i_cache_miss_rate",
+ "MetricExpr": "((STALL_FRONTEND_CACHE / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled due to instruction cache miss",
- "MetricGroup": "TopDownL3",
- "MetricName": "i_cache_miss"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": " STALL_FRONTEND_TLB / CPU_CYCLES ",
+ "MetricName": "i_tlb_miss_rate",
+ "MetricExpr": "((STALL_FRONTEND_TLB / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled due to instruction TLB miss",
- "MetricGroup": "TopDownL3",
- "MetricName": "i_tlb_miss"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "IXU_NUM_UOPS_ISSUED / (CPU_CYCLES / 4)",
+ "MetricName": "ixu_pipe_utilization",
+ "MetricExpr": "((IXU_NUM_UOPS_ISSUED / (CPU_CYCLES * #slots)) * 100)",
"BriefDescription": "Fraction of IXU execute slots utilized",
- "MetricGroup": "TopDownL3",
- "MetricName": "ixu_pipe_utilization"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "IDR_STALL_FLUSH / CPU_CYCLES",
+ "MetricName": "stall_recovery_rate",
+ "MetricExpr": "((IDR_STALL_FLUSH / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled due to flush recovery",
- "MetricGroup": "TopDownL3",
- "MetricName": "recovery"
- },
- {
- "MetricExpr": "STALL_BACKEND_RESOURCE / CPU_CYCLES",
- "BriefDescription": "Fraction of cycles the CPU was stalled due to core resource shortage",
- "MetricGroup": "TopDownL3",
- "MetricName": "resource"
+ "MetricGroup": "TopdownL3",
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "IDR_STALL_FSU_SCHED / CPU_CYCLES ",
+ "MetricName": "stall_fsu_sched_rate",
+ "MetricExpr": "((IDR_STALL_FSU_SCHED / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled and FSU was full",
- "MetricGroup": "TopDownL4",
- "MetricName": "stall_fsu_sched"
+ "MetricGroup": "TopdownL4",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "IDR_STALL_IXU_SCHED / CPU_CYCLES ",
+ "MetricName": "stall_ixu_sched_rate",
+ "MetricExpr": "((IDR_STALL_IXU_SCHED / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled and IXU was full",
- "MetricGroup": "TopDownL4",
- "MetricName": "stall_ixu_sched"
+ "MetricGroup": "TopdownL4",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "IDR_STALL_LOB_ID / CPU_CYCLES ",
+ "MetricName": "stall_lob_id_rate",
+ "MetricExpr": "((IDR_STALL_LOB_ID / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled and LOB was full",
- "MetricGroup": "TopDownL4",
- "MetricName": "stall_lob_id"
+ "MetricGroup": "TopdownL4",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "IDR_STALL_ROB_ID / CPU_CYCLES",
+ "MetricName": "stall_rob_id_rate",
+ "MetricExpr": "((IDR_STALL_ROB_ID / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled and ROB was full",
- "MetricGroup": "TopDownL4",
- "MetricName": "stall_rob_id"
+ "MetricGroup": "TopdownL4",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "IDR_STALL_SOB_ID / CPU_CYCLES ",
+ "MetricName": "stall_sob_id_rate",
+ "MetricExpr": "((IDR_STALL_SOB_ID / CPU_CYCLES) * 100)",
"BriefDescription": "Fraction of cycles the CPU was stalled and SOB was full",
- "MetricGroup": "TopDownL4",
- "MetricName": "stall_sob_id"
+ "MetricGroup": "TopdownL4",
+ "ScaleUnit": "1percent of cycles"
}
]
--
2.40.1
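To make the top-down arithmetic in the patch easier to follow, here is a small sketch that mirrors the expressions for retired_fraction, slots_lost_misspeculation_fraction, frontend_bound, backend_memory and backend_core. It is an illustration under stated assumptions, not how perf evaluates metrics: the function names are invented for this example, `slots` stands in for the `#slots` literal (the value 4 is the constant the old formulas hard-coded), and backend_bound is taken as an input in percent, since the patch divides it by 100 in backend_core.

```python
def topdown_l1(op_retired, op_spec, cpu_cycles, slots, backend_bound_pct):
    """Mirror the level 1 expressions from the patch; all results in percent."""
    total_slots = cpu_cycles * slots
    # 100 * (OP_RETIRED / (CPU_CYCLES * #slots))
    retired = 100.0 * op_retired / total_slots
    # 100 * ((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * #slots))
    lost = 100.0 * (op_spec - op_retired) / total_slots
    # 100 - (retired_fraction + slots_lost_misspeculation_fraction + backend_bound)
    frontend = 100.0 - (retired + lost + backend_bound_pct)
    return {
        "retired_fraction": retired,
        "slots_lost_misspeculation_fraction": lost,
        "backend_bound": backend_bound_pct,
        "frontend_bound": frontend,
    }


def backend_split(stall_backend_tlb, stall_backend_cache, cpu_cycles, backend_bound_pct):
    """Mirror backend_memory and backend_core; both are fractions (ScaleUnit "100%")."""
    # (STALL_BACKEND_TLB + STALL_BACKEND_CACHE) / CPU_CYCLES
    backend_memory = (stall_backend_tlb + stall_backend_cache) / cpu_cycles
    # (backend_bound / 100) - backend_memory
    backend_core = (backend_bound_pct / 100.0) - backend_memory
    return backend_memory, backend_core


if __name__ == "__main__":
    # Made-up counter values, chosen only to keep the arithmetic readable.
    l1 = topdown_l1(op_retired=2000, op_spec=2500, cpu_cycles=1000,
                    slots=4, backend_bound_pct=25.0)
    for name, value in l1.items():
        print(f"{name}: {value:.1f} percent of slots")
    memory, core = backend_split(50, 100, 1000, 25.0)
    print(f"backend_memory: {memory:.2f}, backend_core: {core:.2f}")
```

The four level 1 values sum to 100 by construction, because frontend_bound is defined as the remainder. backend_core divides backend_bound by 100 because backend_bound is in percent while backend_memory is a plain fraction.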
* Re: [PATCH v2] perf vendor events arm64: Fix for AmpereOne metrics
From: Ian Rogers @ 2023-09-20 16:40 UTC
To: Ilkka Koskinen, D Scott Phillips
Cc: James Clark, John Garry, Arnaldo Carvalho de Melo, Will Deacon,
Mike Leach, Leo Yan, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Namhyung Kim, Adrian Hunter,
Dave Kleikamp, linux-arm-kernel, linux-perf-users, linux-kernel
On Tue, Sep 19, 2023 at 11:19 PM Ilkka Koskinen
<ilkka@os.amperecomputing.com> wrote:
>
> This patch addresses review comments that were given for
> 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
> but didn't make it into the original patch [1][2].
>
> Changes include: a fix for the backend_memory formula, use of standard metrics
> where possible, use of #slots, renaming metrics to avoid spaces in their names,
> and cleanup.
>
> [1] https://lore.kernel.org/linux-perf-users/e9bdacb-a231-36af-6a2e-6918ee7effa@os.amperecomputing.com/
> [2] https://lore.kernel.org/linux-perf-users/20230826192352.3043220-1-ilkka@os.amperecomputing.com/
>
> Fixes: 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Ian Rogers <irogers@google.com>
> ---
> Fixed the scaling issues on some of the metrics in v1.
>
> I'll be offline for a couple of weeks but Scott can address any review
> comments meanwhile.
>
> Cheers, Ilkka
>
> .../arch/arm64/ampere/ampereone/metrics.json | 418 +++++++++---------
> 1 file changed, 220 insertions(+), 198 deletions(-)
>
> diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
> index 1e7e8901a445..e2848a9d4848 100644
> --- a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
> +++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
> @@ -1,362 +1,384 @@
> [
> {
> + "MetricName": "branch_miss_pred_rate",
> "MetricExpr": "BR_MIS_PRED / BR_PRED",
> "BriefDescription": "Branch predictor misprediction rate. May not count branches that are never resolved because they are in the misprediction shadow of an earlier branch",
> - "MetricGroup": "Branch Prediction",
> - "MetricName": "Misprediction"
> + "MetricGroup": "branch",
> + "ScaleUnit": "100%"
> },
> {
> - "MetricExpr": "BR_MIS_PRED_RETIRED / BR_RETIRED",
> - "BriefDescription": "Branch predictor misprediction rate",
> - "MetricGroup": "Branch Prediction",
> - "MetricName": "Misprediction (retired)"
> - },
> - {
> - "MetricExpr": "BUS_ACCESS / ( BUS_CYCLES * 1)",
> + "MetricName": "bus_utilization",
> + "MetricExpr": "((BUS_ACCESS / (BUS_CYCLES * 1)) * 100)",
> "BriefDescription": "Core-to-uncore bus utilization",
> "MetricGroup": "Bus",
> - "MetricName": "Bus utilization"
> + "ScaleUnit": "1percent of bus cycles"
nits: this could be "100percent of bus cycles", and then you wouldn't need the
"* 100" you added above, but this doesn't matter as the two are equivalent.
There are a few acronyms in the metric descriptions (IXU, LOB, ROB, SOB);
perhaps they could be added to:
https://perf.wiki.kernel.org/index.php/Glossary
Thanks,
Ian
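The equivalence mentioned in the review comment can be checked with a few lines of arithmetic. The sketch below assumes that the leading number in a ScaleUnit string is a multiplier applied to the expression result and that the rest of the string is the unit label. The helpers `split_scale_unit` and `displayed` are hypothetical names for this illustration and are not perf functions.

```python
import re

# Assumption: a ScaleUnit string is "<number><unit text>", for example
# "100%" or "1percent of bus cycles".
_SCALE_UNIT_RE = re.compile(r"\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)(.*)")


def split_scale_unit(scale_unit):
    """Split a ScaleUnit string into (multiplier, unit label)."""
    match = _SCALE_UNIT_RE.match(scale_unit)
    if match is None:
        raise ValueError(f"no leading number in {scale_unit!r}")
    return float(match.group(1)), match.group(2)


def displayed(expr_value, scale_unit):
    """Apply the ScaleUnit multiplier to an already evaluated MetricExpr."""
    scale, unit = split_scale_unit(scale_unit)
    return expr_value * scale, unit


if __name__ == "__main__":
    bus_access, bus_cycles = 250, 1000  # made-up counter values

    # As posted: the expression multiplies by 100 and the scale is 1.
    as_posted = displayed((bus_access / (bus_cycles * 1)) * 100,
                          "1percent of bus cycles")
    # As suggested in the review: plain ratio and a scale of 100.
    as_suggested = displayed(bus_access / (bus_cycles * 1),
                             "100percent of bus cycles")
    print(as_posted, as_suggested)
```

Both forms give the same value and the same unit label, which matches the comment that the choice does not matter.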
> },
> {
> - "MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
> - "BriefDescription": "L1D cache miss rate",
> - "MetricGroup": "Cache",
> - "MetricName": "L1D cache miss"
> + "MetricName": "l1d_cache_miss_ratio",
> + "MetricExpr": "(L1D_CACHE_REFILL / L1D_CACHE)",
> + "BriefDescription": "This metric measures the ratio of level 1 data cache accesses missed to the total number of level 1 data cache accesses. This gives an indication of the effectiveness of the level 1 data cache.",
> + "MetricGroup": "Miss_Ratio;L1D_Cache_Effectiveness",
> + "ScaleUnit": "1per cache access"
> + },
> + {
> + "MetricName": "l1i_cache_miss_ratio",
> + "MetricExpr": "(L1I_CACHE_REFILL / L1I_CACHE)",
> + "BriefDescription": "This metric measures the ratio of level 1 instruction cache accesses missed to the total number of level 1 instruction cache accesses. This gives an indication of the effectiveness of the level 1 instruction cache.",
> + "MetricGroup": "Miss_Ratio;L1I_Cache_Effectiveness",
> + "ScaleUnit": "1per cache access"
> },
> {
> + "MetricName": "Miss_Ratio;l1d_cache_read_miss",
> "MetricExpr": "L1D_CACHE_LMISS_RD / L1D_CACHE_RD",
> "BriefDescription": "L1D cache read miss rate",
> "MetricGroup": "Cache",
> - "MetricName": "L1D cache read miss"
> + "ScaleUnit": "1per cache read access"
> },
> {
> - "MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
> - "BriefDescription": "L1I cache miss rate",
> - "MetricGroup": "Cache",
> - "MetricName": "L1I cache miss"
> - },
> - {
> - "MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
> - "BriefDescription": "L2 cache miss rate",
> - "MetricGroup": "Cache",
> - "MetricName": "L2 cache miss"
> + "MetricName": "l2_cache_miss_ratio",
> + "MetricExpr": "(L2D_CACHE_REFILL / L2D_CACHE)",
> + "BriefDescription": "This metric measures the ratio of level 2 cache accesses missed to the total number of level 2 cache accesses. This gives an indication of the effectiveness of the level 2 cache, which is a unified cache that stores both data and instruction. Note that cache accesses in this cache are either data memory access or instruction fetch as this is a unified cache.",
> + "MetricGroup": "Miss_Ratio;L2_Cache_Effectiveness",
> + "ScaleUnit": "1per cache access"
> },
> {
> + "MetricName": "l1i_cache_read_miss_rate",
> "MetricExpr": "L1I_CACHE_LMISS / L1I_CACHE",
> "BriefDescription": "L1I cache read miss rate",
> "MetricGroup": "Cache",
> - "MetricName": "L1I cache read miss"
> + "ScaleUnit": "1per cache access"
> },
> {
> + "MetricName": "l2d_cache_read_miss_rate",
> "MetricExpr": "L2D_CACHE_LMISS_RD / L2D_CACHE_RD",
> "BriefDescription": "L2 cache read miss rate",
> "MetricGroup": "Cache",
> - "MetricName": "L2 cache read miss"
> + "ScaleUnit": "1per cache read access"
> },
> {
> - "MetricExpr": "(L1D_CACHE_LMISS_RD * 1000) / INST_RETIRED",
> + "MetricName": "l1d_cache_miss_mpki",
> + "MetricExpr": "(L1D_CACHE_LMISS_RD * 1e3) / INST_RETIRED",
> "BriefDescription": "Misses per thousand instructions (data)",
> "MetricGroup": "Cache",
> - "MetricName": "MPKI data"
> + "ScaleUnit": "1MPKI"
> },
> {
> - "MetricExpr": "(L1I_CACHE_LMISS * 1000) / INST_RETIRED",
> + "MetricName": "l1i_cache_miss_mpki",
> + "MetricExpr": "(L1I_CACHE_LMISS * 1e3) / INST_RETIRED",
> "BriefDescription": "Misses per thousand instructions (instruction)",
> "MetricGroup": "Cache",
> - "MetricName": "MPKI instruction"
> + "ScaleUnit": "1MPKI"
> },
> {
> - "MetricExpr": "ASE_SPEC / OP_SPEC",
> - "BriefDescription": "Proportion of advanced SIMD data processing operations (excluding DP_SPEC/LD_SPEC) operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "ASE mix"
> + "MetricName": "simd_percentage",
> + "MetricExpr": "((ASE_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "This metric measures advanced SIMD operations as a percentage of total operations speculatively executed.",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "CRYPTO_SPEC / OP_SPEC",
> - "BriefDescription": "Proportion of crypto data processing operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "Crypto mix"
> + "MetricName": "crypto_percentage",
> + "MetricExpr": "((CRYPTO_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "This metric measures crypto operations as a percentage of operations speculatively executed.",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "VFP_SPEC / (duration_time *1000000000)",
> + "MetricName": "gflops",
> + "MetricExpr": "VFP_SPEC / (duration_time * 1e9)",
> "BriefDescription": "Giga-floating point operations per second",
> - "MetricGroup": "Instruction",
> - "MetricName": "GFLOPS_ISSUED"
> + "MetricGroup": "InstructionMix"
> },
> {
> - "MetricExpr": "DP_SPEC / OP_SPEC",
> - "BriefDescription": "Proportion of integer data processing operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "Integer mix"
> + "MetricName": "integer_dp_percentage",
> + "MetricExpr": "((DP_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "This metric measures scalar integer operations as a percentage of operations speculatively executed.",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "INST_RETIRED / CPU_CYCLES",
> - "BriefDescription": "Instructions per cycle",
> - "MetricGroup": "Instruction",
> - "MetricName": "IPC"
> + "MetricName": "ipc",
> + "MetricExpr": "(INST_RETIRED / CPU_CYCLES)",
> + "BriefDescription": "This metric measures the number of instructions retired per cycle.",
> + "MetricGroup": "General",
> + "ScaleUnit": "1per cycle"
> },
> {
> - "MetricExpr": "LD_SPEC / OP_SPEC",
> - "BriefDescription": "Proportion of load operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "Load mix"
> + "MetricName": "load_percentage",
> + "MetricExpr": "((LD_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "This metric measures load operations as a percentage of operations speculatively executed.",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "LDST_SPEC/ OP_SPEC",
> - "BriefDescription": "Proportion of load & store operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "Load-store mix"
> + "MetricName": "load_store_spec_rate",
> + "MetricExpr": "((LDST_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "The rate of load or store instructions speculatively executed to overall instructions speculatively executed",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "INST_RETIRED / (duration_time * 1000000)",
> + "MetricName": "retired_mips",
> + "MetricExpr": "INST_RETIRED / (duration_time * 1e6)",
> "BriefDescription": "Millions of instructions per second",
> - "MetricGroup": "Instruction",
> - "MetricName": "MIPS_RETIRED"
> + "MetricGroup": "InstructionMix"
> },
> {
> - "MetricExpr": "INST_SPEC / (duration_time * 1000000)",
> + "MetricName": "spec_utilization_mips",
> + "MetricExpr": "INST_SPEC / (duration_time * 1e6)",
> "BriefDescription": "Millions of instructions per second",
> - "MetricGroup": "Instruction",
> - "MetricName": "MIPS_UTILIZATION"
> - },
> - {
> - "MetricExpr": "PC_WRITE_SPEC / OP_SPEC",
> - "BriefDescription": "Proportion of software change of PC operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "PC write mix"
> + "MetricGroup": "PEutilization"
> },
> {
> - "MetricExpr": "ST_SPEC / OP_SPEC",
> - "BriefDescription": "Proportion of store operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "Store mix"
> + "MetricName": "pc_write_spec_rate",
> + "MetricExpr": "((PC_WRITE_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "The rate of software change of the PC speculatively executed to overall instructions speculatively executed",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "VFP_SPEC / OP_SPEC",
> - "BriefDescription": "Proportion of FP operations",
> - "MetricGroup": "Instruction",
> - "MetricName": "VFP mix"
> + "MetricName": "store_percentage",
> + "MetricExpr": "((ST_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "This metric measures store operations as a percentage of operations speculatively executed.",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "1 - (OP_RETIRED/ (CPU_CYCLES * 4))",
> - "BriefDescription": "Proportion of slots lost",
> - "MetricGroup": "Speculation / TDA",
> - "MetricName": "CPU lost"
> + "MetricName": "scalar_fp_percentage",
> + "MetricExpr": "((VFP_SPEC / INST_SPEC) * 100)",
> + "BriefDescription": "This metric measures scalar floating point operations as a percentage of operations speculatively executed.",
> + "MetricGroup": "Operation_Mix",
> + "ScaleUnit": "1percent of operations"
> },
> {
> - "MetricExpr": "OP_RETIRED/ (CPU_CYCLES * 4)",
> - "BriefDescription": "Proportion of slots retiring",
> - "MetricGroup": "Speculation / TDA",
> - "MetricName": "CPU utilization"
> + "MetricName": "retired_rate",
> + "MetricExpr": "OP_RETIRED / OP_SPEC",
> + "BriefDescription": "Of all the micro-operations issued, what percentage are retired(committed)",
> + "MetricGroup": "General",
> + "ScaleUnit": "100%"
> },
> {
> - "MetricExpr": "OP_RETIRED - OP_SPEC",
> - "BriefDescription": "Operations lost due to misspeculation",
> - "MetricGroup": "Speculation / TDA",
> - "MetricName": "Operations lost"
> + "MetricName": "wasted",
> + "MetricExpr": "1 - (OP_RETIRED / (CPU_CYCLES * #slots))",
> + "BriefDescription": "Of all the micro-operations issued, what proportion are lost",
> + "MetricGroup": "General",
> + "ScaleUnit": "100%"
> },
> {
> - "MetricExpr": "1 - (OP_RETIRED / OP_SPEC)",
> - "BriefDescription": "Proportion of operations lost",
> - "MetricGroup": "Speculation / TDA",
> - "MetricName": "Operations lost (ratio)"
> + "MetricName": "wasted_rate",
> + "MetricExpr": "1 - OP_RETIRED / OP_SPEC",
> + "BriefDescription": "Of all the micro-operations issued, what percentage are not retired(committed)",
> + "MetricGroup": "General",
> + "ScaleUnit": "100%"
> },
> {
> - "MetricExpr": "OP_RETIRED / OP_SPEC",
> - "BriefDescription": "Proportion of operations retired",
> - "MetricGroup": "Speculation / TDA",
> - "MetricName": "Operations retired"
> - },
> - {
> - "MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
> + "MetricName": "stall_backend_cache_rate",
> + "MetricExpr": "((STALL_BACKEND_CACHE / CPU_CYCLES) * 100)",
> "BriefDescription": "Proportion of cycles stalled and no operations issued to backend and cache miss",
> "MetricGroup": "Stall",
> - "MetricName": "Stall backend cache cycles"
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "STALL_BACKEND_RESOURCE / CPU_CYCLES",
> + "MetricName": "stall_backend_resource_rate",
> + "MetricExpr": "((STALL_BACKEND_RESOURCE / CPU_CYCLES) * 100)",
> "BriefDescription": "Proportion of cycles stalled and no operations issued to backend and resource full",
> "MetricGroup": "Stall",
> - "MetricName": "Stall backend resource cycles"
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
> + "MetricName": "stall_backend_tlb_rate",
> + "MetricExpr": "((STALL_BACKEND_TLB / CPU_CYCLES) * 100)",
> "BriefDescription": "Proportion of cycles stalled and no operations issued to backend and TLB miss",
> "MetricGroup": "Stall",
> - "MetricName": "Stall backend tlb cycles"
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
> + "MetricName": "stall_frontend_cache_rate",
> + "MetricExpr": "((STALL_FRONTEND_CACHE / CPU_CYCLES) * 100)",
> "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and cache miss",
> "MetricGroup": "Stall",
> - "MetricName": "Stall frontend cache cycles"
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "STALL_FRONTEND_TLB / CPU_CYCLES",
> + "MetricName": "stall_frontend_tlb_rate",
> + "MetricExpr": "((STALL_FRONTEND_TLB / CPU_CYCLES) * 100)",
> "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and TLB miss",
> "MetricGroup": "Stall",
> - "MetricName": "Stall frontend tlb cycles"
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "DTLB_WALK / L1D_TLB",
> - "BriefDescription": "D-side walk per d-side translation request",
> - "MetricGroup": "TLB",
> - "MetricName": "DTLB walks"
> + "MetricName": "dtlb_walk_ratio",
> + "MetricExpr": "(DTLB_WALK / L1D_TLB)",
> + "BriefDescription": "This metric measures the ratio of data TLB Walks to the total number of data TLB accesses. This gives an indication of the effectiveness of the data TLB accesses.",
> + "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
> + "ScaleUnit": "1per TLB access"
> },
> {
> - "MetricExpr": "ITLB_WALK / L1I_TLB",
> - "BriefDescription": "I-side walk per i-side translation request",
> - "MetricGroup": "TLB",
> - "MetricName": "ITLB walks"
> + "MetricName": "itlb_walk_ratio",
> + "MetricExpr": "(ITLB_WALK / L1I_TLB)",
> + "BriefDescription": "This metric measures the ratio of instruction TLB Walks to the total number of instruction TLB accesses. This gives an indication of the effectiveness of the instruction TLB accesses.",
> + "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
> + "ScaleUnit": "1per TLB access"
> },
> {
> - "MetricExpr": "STALL_SLOT_BACKEND / (CPU_CYCLES * 4)",
> - "BriefDescription": "Fraction of slots backend bound",
> - "MetricGroup": "TopDownL1",
> - "MetricName": "backend"
> + "ArchStdEvent": "backend_bound"
> },
> {
> - "MetricExpr": "1 - (retiring + lost + backend)",
> - "BriefDescription": "Fraction of slots frontend bound",
> - "MetricGroup": "TopDownL1",
> - "MetricName": "frontend"
> + "ArchStdEvent": "frontend_bound",
> + "MetricExpr": "100 - (retired_fraction + slots_lost_misspeculation_fraction + backend_bound)"
> },
> {
> - "MetricExpr": "((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * 4))",
> + "MetricName": "slots_lost_misspeculation_fraction",
> + "MetricExpr": "100 * ((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * #slots))",
> "BriefDescription": "Fraction of slots lost due to misspeculation",
> - "MetricGroup": "TopDownL1",
> - "MetricName": "lost"
> + "MetricGroup": "Default;TopdownL1",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "(OP_RETIRED / (CPU_CYCLES * 4))",
> + "MetricName": "retired_fraction",
> + "MetricExpr": "100 * (OP_RETIRED / (CPU_CYCLES * #slots))",
> "BriefDescription": "Fraction of slots retiring, useful work",
> - "MetricGroup": "TopDownL1",
> - "MetricName": "retiring"
> + "MetricGroup": "Default;TopdownL1",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "backend - backend_memory",
> + "MetricName": "backend_core",
> + "MetricExpr": "(backend_bound / 100) - backend_memory",
> "BriefDescription": "Fraction of slots the CPU was stalled due to backend non-memory subsystem issues",
> - "MetricGroup": "TopDownL2",
> - "MetricName": "backend_core"
> + "MetricGroup": "TopdownL2",
> + "ScaleUnit": "100%"
> },
> {
> - "MetricExpr": "(STALL_BACKEND_TLB + STALL_BACKEND_CACHE + STALL_BACKEND_MEM) / CPU_CYCLES ",
> + "MetricName": "backend_memory",
> + "MetricExpr": "(STALL_BACKEND_TLB + STALL_BACKEND_CACHE) / CPU_CYCLES",
> "BriefDescription": "Fraction of slots the CPU was stalled due to backend memory subsystem issues (cache/tlb miss)",
> - "MetricGroup": "TopDownL2",
> - "MetricName": "backend_memory"
> + "MetricGroup": "TopdownL2",
> + "ScaleUnit": "100%"
> },
> {
> - "MetricExpr": " (BR_MIS_PRED_RETIRED / GPC_FLUSH) * lost",
> + "MetricName": "branch_mispredict",
> + "MetricExpr": "(BR_MIS_PRED_RETIRED / GPC_FLUSH) * slots_lost_misspeculation_fraction",
> "BriefDescription": "Fraction of slots lost due to branch misprediciton",
> - "MetricGroup": "TopDownL2",
> - "MetricName": "branch_mispredict"
> + "MetricGroup": "TopdownL2",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "frontend - frontend_latency",
> + "MetricName": "frontend_bandwidth",
> + "MetricExpr": "frontend_bound - frontend_latency",
> "BriefDescription": "Fraction of slots the CPU did not dispatch at full bandwidth - able to dispatch partial slots only (1, 2, or 3 uops)",
> - "MetricGroup": "TopDownL2",
> - "MetricName": "frontend_bandwidth"
> + "MetricGroup": "TopdownL2",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "(STALL_FRONTEND - ((STALL_SLOT_FRONTEND - (frontend * CPU_CYCLES * 4)) / 4)) / CPU_CYCLES",
> + "MetricName": "frontend_latency",
> + "MetricExpr": "((STALL_FRONTEND - ((STALL_SLOT_FRONTEND - ((frontend_bound / 100) * CPU_CYCLES * #slots)) / #slots)) / CPU_CYCLES) * 100",
> "BriefDescription": "Fraction of slots the CPU was stalled due to frontend latency issues (cache/tlb miss); nothing to dispatch",
> - "MetricGroup": "TopDownL2",
> - "MetricName": "frontend_latency"
> + "MetricGroup": "TopdownL2",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "lost - branch_mispredict",
> + "MetricName": "other_miss_pred",
> + "MetricExpr": "slots_lost_misspeculation_fraction - branch_mispredict",
> "BriefDescription": "Fraction of slots lost due to other/non-branch misprediction misspeculation",
> - "MetricGroup": "TopDownL2",
> - "MetricName": "other_clears"
> + "MetricGroup": "TopdownL2",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "(IXU_NUM_UOPS_ISSUED + FSU_ISSUED) / (CPU_CYCLES * 6)",
> + "MetricName": "pipe_utilization",
> + "MetricExpr": "100 * ((IXU_NUM_UOPS_ISSUED + FSU_ISSUED) / (CPU_CYCLES * 6))",
> "BriefDescription": "Fraction of execute slots utilized",
> - "MetricGroup": "TopDownL2",
> - "MetricName": "pipe_utilization"
> + "MetricGroup": "TopdownL2",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "STALL_BACKEND_MEM / CPU_CYCLES",
> + "MetricName": "d_cache_l2_miss_rate",
> + "MetricExpr": "((STALL_BACKEND_MEM / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled due to data L2 cache miss",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "d_cache_l2_miss"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
> + "MetricName": "d_cache_miss_rate",
> + "MetricExpr": "((STALL_BACKEND_CACHE / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled due to data cache miss",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "d_cache_miss"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
> + "MetricName": "d_tlb_miss_rate",
> + "MetricExpr": "((STALL_BACKEND_TLB / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled due to data TLB miss",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "d_tlb_miss"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "FSU_ISSUED / (CPU_CYCLES * 2)",
> + "MetricName": "fsu_pipe_utilization",
> + "MetricExpr": "((FSU_ISSUED / (CPU_CYCLES * 2)) * 100)",
> "BriefDescription": "Fraction of FSU execute slots utilized",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "fsu_pipe_utilization"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
> + "MetricName": "i_cache_miss_rate",
> + "MetricExpr": "((STALL_FRONTEND_CACHE / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled due to instruction cache miss",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "i_cache_miss"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": " STALL_FRONTEND_TLB / CPU_CYCLES ",
> + "MetricName": "i_tlb_miss_rate",
> + "MetricExpr": "((STALL_FRONTEND_TLB / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled due to instruction TLB miss",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "i_tlb_miss"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "IXU_NUM_UOPS_ISSUED / (CPU_CYCLES / 4)",
> + "MetricName": "ixu_pipe_utilization",
> + "MetricExpr": "((IXU_NUM_UOPS_ISSUED / (CPU_CYCLES * #slots)) * 100)",
> "BriefDescription": "Fraction of IXU execute slots utilized",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "ixu_pipe_utilization"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "IDR_STALL_FLUSH / CPU_CYCLES",
> + "MetricName": "stall_recovery_rate",
> + "MetricExpr": "((IDR_STALL_FLUSH / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled due to flush recovery",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "recovery"
> - },
> - {
> - "MetricExpr": "STALL_BACKEND_RESOURCE / CPU_CYCLES",
> - "BriefDescription": "Fraction of cycles the CPU was stalled due to core resource shortage",
> - "MetricGroup": "TopDownL3",
> - "MetricName": "resource"
> + "MetricGroup": "TopdownL3",
> + "ScaleUnit": "1percent of slots"
> },
> {
> - "MetricExpr": "IDR_STALL_FSU_SCHED / CPU_CYCLES ",
> + "MetricName": "stall_fsu_sched_rate",
> + "MetricExpr": "((IDR_STALL_FSU_SCHED / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled and FSU was full",
> - "MetricGroup": "TopDownL4",
> - "MetricName": "stall_fsu_sched"
> + "MetricGroup": "TopdownL4",
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "IDR_STALL_IXU_SCHED / CPU_CYCLES ",
> + "MetricName": "stall_ixu_sched_rate",
> + "MetricExpr": "((IDR_STALL_IXU_SCHED / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled and IXU was full",
> - "MetricGroup": "TopDownL4",
> - "MetricName": "stall_ixu_sched"
> + "MetricGroup": "TopdownL4",
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "IDR_STALL_LOB_ID / CPU_CYCLES ",
> + "MetricName": "stall_lob_id_rate",
> + "MetricExpr": "((IDR_STALL_LOB_ID / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled and LOB was full",
> - "MetricGroup": "TopDownL4",
> - "MetricName": "stall_lob_id"
> + "MetricGroup": "TopdownL4",
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "IDR_STALL_ROB_ID / CPU_CYCLES",
> + "MetricName": "stall_rob_id_rate",
> + "MetricExpr": "((IDR_STALL_ROB_ID / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled and ROB was full",
> - "MetricGroup": "TopDownL4",
> - "MetricName": "stall_rob_id"
> + "MetricGroup": "TopdownL4",
> + "ScaleUnit": "1percent of cycles"
> },
> {
> - "MetricExpr": "IDR_STALL_SOB_ID / CPU_CYCLES ",
> + "MetricName": "stall_sob_id_rate",
> + "MetricExpr": "((IDR_STALL_SOB_ID / CPU_CYCLES) * 100)",
> "BriefDescription": "Fraction of cycles the CPU was stalled and SOB was full",
> - "MetricGroup": "TopDownL4",
> - "MetricName": "stall_sob_id"
> + "MetricGroup": "TopdownL4",
> + "ScaleUnit": "1percent of cycles"
> }
> ]
> --
> 2.40.1
>
* Re: [PATCH v2] perf vendor events arm64: Fix for AmpereOne metrics
2023-09-20 16:40 ` Ian Rogers
@ 2023-09-26 22:50 ` Namhyung Kim
0 siblings, 0 replies; 3+ messages in thread
From: Namhyung Kim @ 2023-09-26 22:50 UTC (permalink / raw)
To: Ian Rogers
Cc: Ilkka Koskinen, D Scott Phillips, James Clark, John Garry,
Arnaldo Carvalho de Melo, Will Deacon, Mike Leach, Leo Yan,
Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Dave Kleikamp, linux-arm-kernel,
linux-perf-users, linux-kernel
On Wed, Sep 20, 2023 at 9:40 AM Ian Rogers <irogers@google.com> wrote:
>
> On Tue, Sep 19, 2023 at 11:19 PM Ilkka Koskinen
> <ilkka@os.amperecomputing.com> wrote:
> >
> > This patch addresses review comments that were given for
> > 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
> > but didn't make it to the original patch [1][2]
> >
> > Changes include: A fix for backend_memory formula, use of standard metrics
> > when possible, using #slots, renaming metrics to avoid spaces in the names,
> > and cleanup.
> >
> > [1] https://lore.kernel.org/linux-perf-users/e9bdacb-a231-36af-6a2e-6918ee7effa@os.amperecomputing.com/
> > [2] https://lore.kernel.org/linux-perf-users/20230826192352.3043220-1-ilkka@os.amperecomputing.com/
> >
> > Fixes: 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
> > Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
>
> Reviewed-by: Ian Rogers <irogers@google.com>
Applied to perf-tools-next, thanks!
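
For readers following the thread, the top-down level 1 formulas in the quoted patch can be sketched in Python as below. This is only an illustration, not code from the patch or from perf. The function name topdown_l1 and the sample counter values are hypothetical. The value slots=4 is taken from the hard-coded 4 in the removed lines, and the backend_bound expression is an assumption derived from the removed "backend" metric scaled to percent; the patch itself pulls backend_bound in via ArchStdEvent and does not show its formula.

```python
def topdown_l1(op_retired, op_spec, stall_slot_backend, cpu_cycles, slots):
    """Return the four top-down level 1 values as percentages of slots."""
    total_slots = cpu_cycles * slots

    # retired_fraction: 100 * (OP_RETIRED / (CPU_CYCLES * #slots))
    retired = 100 * op_retired / total_slots

    # slots_lost_misspeculation_fraction:
    # 100 * ((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * #slots))
    lost = 100 * (op_spec - op_retired) / total_slots

    # backend_bound: ASSUMED formula, based on the removed "backend"
    # metric (STALL_SLOT_BACKEND / (CPU_CYCLES * 4)) scaled to percent.
    backend = 100 * stall_slot_backend / total_slots

    # frontend_bound: 100 - (retired + lost + backend), as in the patch.
    frontend = 100 - (retired + lost + backend)

    return {
        "retired_fraction": retired,
        "slots_lost_misspeculation_fraction": lost,
        "backend_bound": backend,
        "frontend_bound": frontend,
    }


# Hypothetical counter values, chosen only to keep the arithmetic simple.
result = topdown_l1(
    op_retired=2000,
    op_spec=2400,
    stall_slot_backend=800,
    cpu_cycles=1000,
    slots=4,
)
print(result)
```

Because frontend_bound is defined as the remainder, the four values always sum to 100 percent of slots, which is why the patch scales every term it depends on by 100.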