* [PATCH v3 0/2] perf stat: Fix uncore metric scaling across aggregation modes
From: Chun-Tse Shao @ 2026-05-21 20:15 UTC
  To: peterz, mingo, acme, namhyung
  Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
	james.clark, sandipan.das, leo.yan, thomas.falcon, yang.lee,
	linux-perf-users, linux-kernel, Chun-Tse Shao

This series fixes a scaling issue for metrics (like lpm_miss_lat) across
different runtime aggregation modes.

Uncore metrics currently use `source_count` to scale events. However,
`source_count` returns the total uncore unit count regardless of the
selected aggregation mode. When metrics are evaluated in an aggregation
mode other than `--per-socket`, the aggregated uncore event counts are
therefore divided by the total uncore unit count rather than by the
number of uncore units in the aggregation group, which produces wrong
metric results.

To fix this, we:
1. Introduce the aggr_nr() keyword to the metric parser, which
dynamically resolves to the active units in the current aggregation
group (`gr->nr`).

2. Update the python metrics to use `aggr_nr` instead of `source_count`,
ensuring correct scaling across all runtime aggregation boundaries.
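
As a toy illustration of the scaling problem and the fix described above
(this is not the perf implementation; the group names, counts, and helper
functions are hypothetical), consider a metric that needs a per-unit
average of an uncore count. The divisor has to match the set of units
whose counts were actually summed:

```python
# Hypothetical counts from eight uncore units, split into two aggregation groups.
counts = {
    "G0": [1000, 1000, 1000, 1000],
    "G1": [1000, 1000, 1000, 1000],
}


def total_unit_count(counts):
    """Stand-in for a divisor that ignores the aggregation mode."""
    return sum(len(units) for units in counts.values())


def group_unit_count(counts, group):
    """Stand-in for a divisor resolved per aggregation group."""
    return len(counts[group])


def per_unit_average(aggregated, divisor):
    """Average count per uncore unit for a given divisor."""
    return aggregated / divisor


# Only the units in group G0 contribute to this aggregated count.
aggregated_g0 = sum(counts["G0"])

mismatched = per_unit_average(aggregated_g0, total_unit_count(counts))
matched = per_unit_average(aggregated_g0, group_unit_count(counts, "G0"))

print(mismatched)  # 500.0, half the true per-unit value
print(matched)     # 1000.0
```

With a divisor that follows the aggregation group, the per-unit value
stays the same no matter how the units are grouped.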

Before the fix (incorrectly low latency in global mode):
  $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns  lpm_miss_lat_rem" : "122.8", "ns  lpm_miss_lat_loc" : "114.5"}
  $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "ns  lpm_miss_lat_rem" : "232.1", "ns  lpm_miss_lat_loc" : "278.2"}
  {"socket" : "S1", "ns  lpm_miss_lat_rem" : "233.9", "ns  lpm_miss_lat_loc" : "257.5"}

After the fix (correctly scaled latency in all aggregation modes):
  $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns  lpm_miss_lat_rem" : "231.7", "ns  lpm_miss_lat_loc" : "245.0"}
  $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "ns  lpm_miss_lat_rem" : "238.3", "ns  lpm_miss_lat_loc" : "249.4"}
  {"socket" : "S1", "ns  lpm_miss_lat_rem" : "259.1", "ns  lpm_miss_lat_loc" : "253.1"}

v3:
  Fixed based on Sashiko review:
  - Removed the unnecessary, copied `redefined-builtin` pylint-disable
    comment from `aggr_nr` definition inside `metric.py`.

v2: lore.kernel.org/20260521035941.3860145-1-ctshao@google.com
  Fixed based on Sashiko review:
  - Fixed `aggr_nr` setting when an uncore event fails to run
    (counts.run == 0) to explicitly set it to 0 instead of defaulting to
    1.
  - Accumulated `aggr_nr` when multiple unmerged PMU events are
    associated with the same metric ID to prevent incorrect scaling
    across active sockets.
  - Removed unused `List` import from `typing` in `intel_metrics.py`.
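
  The two `aggr_nr` rules in the v2 notes can be modeled roughly as
  follows. This is an illustrative Python sketch only, not the C code in
  the patch; the `EventCount` type, its fields, and `accumulate_aggr_nr`
  are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class EventCount:
    metric_id: str  # metric ID the event is associated with
    run: int        # 0 models an event that failed to run
    nr: int         # active units the event covers in this group


def accumulate_aggr_nr(events):
    """Sum the unit counts per metric ID; events that did not run add 0."""
    totals = {}
    for ev in events:
        contribution = ev.nr if ev.run != 0 else 0
        totals[ev.metric_id] = totals.get(ev.metric_id, 0) + contribution
    return totals


events = [
    EventCount("clockticks", run=100, nr=4),  # first unmerged PMU event
    EventCount("clockticks", run=100, nr=4),  # second event, same metric ID
    EventCount("inserts", run=0, nr=4),       # failed to run
]

print(accumulate_aggr_nr(events))  # {'clockticks': 8, 'inserts': 0}
```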

v1: lore.kernel.org/20260520180032.3045144-1-ctshao@google.com

Chun-Tse Shao (2):
  perf stat: Add aggr_nr metric parser support
  perf stat: Use aggr_nr scaling for Intel uncore miss latency metrics

 tools/perf/pmu-events/intel_metrics.py |  6 +++---
 tools/perf/pmu-events/metric.py        |  9 +++++++--
 tools/perf/util/expr.c                 | 26 ++++++++++++++++++++++----
 tools/perf/util/expr.h                 |  6 +++++-
 tools/perf/util/expr.l                 |  1 +
 tools/perf/util/expr.y                 | 24 +++++++++++++++++-------
 tools/perf/util/stat-shadow.c          |  6 +++++-
 7 files changed, 60 insertions(+), 18 deletions(-)

--
2.54.0.746.g67dd491aae-goog


Thread overview: 8+ messages
2026-05-21 20:15 [PATCH v3 0/2] perf stat: Fix uncore metric scaling across aggregation modes Chun-Tse Shao
2026-05-21 20:15 ` [PATCH v3 1/2] perf stat: Add aggr_nr metric parser support Chun-Tse Shao
2026-05-21 20:15 ` [PATCH v3 2/2] perf stat: Use aggr_nr scaling for Intel uncore miss latency metrics Chun-Tse Shao
2026-05-21 21:08   ` sashiko-bot
2026-05-21 22:01     ` Chun-Tse Shao
2026-05-27 19:10 ` [PATCH v3 0/2] perf stat: Fix uncore metric scaling across aggregation modes Chen, Zide
2026-05-28 19:17 ` Namhyung Kim
2026-06-04 13:49   ` Arnaldo Carvalho de Melo
