Linux Perf Users
* [PATCH 0/2] perf stat: Fix uncore metric scaling across aggregation modes
@ 2026-05-20 17:58 Chun-Tse Shao
  2026-05-20 17:58 ` [PATCH 1/2] perf stat: Add aggr_nr metric parser support Chun-Tse Shao
  2026-05-20 17:58 ` [PATCH 2/2] perf stat: Use aggr_nr scaling for Intel uncore miss latency metrics Chun-Tse Shao
  0 siblings, 2 replies; 6+ messages in thread
From: Chun-Tse Shao @ 2026-05-20 17:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Chun-Tse Shao, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, james.clark,
	sandipan.das, leo.yan, thomas.falcon, yang.lee, linux-perf-users

This series fixes a scaling issue for metrics (like lpm_miss_lat) across
different runtime aggregation modes.

Uncore metrics currently use `source_count` to scale events. However,
`source_count` returns the total uncore unit count regardless of the
selected aggregation mode. When metrics are evaluated in an aggregation
mode other than `--per-socket`, the aggregated uncore event counts are
therefore divided by the total uncore count rather than by the number of
uncore units belonging to the aggregation group, which produces wrong
metric results.

To fix this, we:
1. Introduce the `aggr_nr()` keyword in the metric parser, which
resolves at runtime to the number of active units in the current
aggregation group (`aggr->nr`).

2. Update the Python metrics to use `aggr_nr` instead of `source_count`,
so that scaling is correct in every runtime aggregation mode.
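
The effect of the divisor can be illustrated with a toy model. This is
only a sketch and not perf code: the topology (2 sockets, 4 units per
socket), the per-unit count, and the names `FIXED_COUNT`, `aggregate`
and `per_unit` are all hypothetical. `FIXED_COUNT` stands in for a
mode-independent count, and its value is assumed to match the
per-socket group size so that the per-socket case comes out right, as
described above.

```python
# Toy model only: names and numbers are hypothetical, not taken from perf.
SOCKETS = 2
UNITS_PER_SOCKET = 4
TICKS_PER_UNIT = 1000  # every unit reports the same count in this toy

# Mode-independent divisor standing in for source_count (assumed value).
FIXED_COUNT = UNITS_PER_SOCKET


def aggregate(group_units):
    """Sum the toy count over the units in one aggregation group."""
    return group_units * TICKS_PER_UNIT


def per_unit(total, divisor):
    """Average an aggregated count over the given number of units."""
    return total / divisor


# Number of units contributing to one group in each aggregation mode.
groups = {
    "per-socket": UNITS_PER_SOCKET,
    "global": SOCKETS * UNITS_PER_SOCKET,
}

results = {}
for mode, nr in groups.items():
    total = aggregate(nr)
    results[mode] = {
        "fixed": per_unit(total, FIXED_COUNT),  # divisor ignores the mode
        "aggr_nr": per_unit(total, nr),         # divisor follows the group
    }
    print(mode, results[mode])
```

In this toy, the fixed divisor recovers the true per-unit count only in
the per-socket group and doubles it in the global group, while the
group-sized divisor recovers it in both. A metric that divides by such
a per-unit value would come out too low in global mode, which would be
consistent with the direction of the error reported for that mode.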

Before the fix (incorrect low latency in global mode):
  $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns  lpm_miss_lat_rem" : "122.8", "ns  lpm_miss_lat_loc" : "114.5"}
  $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "ns  lpm_miss_lat_rem" : "232.1", "ns  lpm_miss_lat_loc" : "278.2"}
  {"socket" : "S1", "ns  lpm_miss_lat_rem" : "233.9", "ns  lpm_miss_lat_loc" : "257.5"}

After the fix (correct scaled latency in all aggregation modes):
  $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns  lpm_miss_lat_rem" : "230.0", "ns  lpm_miss_lat_loc" : "220.8"}
  $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "ns  lpm_miss_lat_rem" : "232.5", "ns  lpm_miss_lat_loc" : "269.9"}
  {"socket" : "S1", "ns  lpm_miss_lat_rem" : "225.7", "ns  lpm_miss_lat_loc" : "262.3"}
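
As a rough cross-check of the numbers above, the global value can be
compared with the unweighted mean of the two per-socket values. This is
only a sketch: the JSON lines are copied from the runs quoted above, the
helper name `global_to_socket_ratio` is made up for illustration, and
because the runs are separate measurements and a plain mean ignores
per-socket weighting, the ratios are indicative rather than exact.

```python
import json
from statistics import mean

# Output lines copied verbatim from the runs quoted above.
BEFORE_GLOBAL = '{"ns  lpm_miss_lat_rem" : "122.8", "ns  lpm_miss_lat_loc" : "114.5"}'
BEFORE_SOCKETS = [
    '{"socket" : "S0", "ns  lpm_miss_lat_rem" : "232.1", "ns  lpm_miss_lat_loc" : "278.2"}',
    '{"socket" : "S1", "ns  lpm_miss_lat_rem" : "233.9", "ns  lpm_miss_lat_loc" : "257.5"}',
]
AFTER_GLOBAL = '{"ns  lpm_miss_lat_rem" : "230.0", "ns  lpm_miss_lat_loc" : "220.8"}'
AFTER_SOCKETS = [
    '{"socket" : "S0", "ns  lpm_miss_lat_rem" : "232.5", "ns  lpm_miss_lat_loc" : "269.9"}',
    '{"socket" : "S1", "ns  lpm_miss_lat_rem" : "225.7", "ns  lpm_miss_lat_loc" : "262.3"}',
]

KEYS = ["ns  lpm_miss_lat_rem", "ns  lpm_miss_lat_loc"]


def global_to_socket_ratio(global_line, socket_lines, key):
    """Return the global value divided by the mean per-socket value."""
    global_value = float(json.loads(global_line)[key])
    socket_mean = mean(float(json.loads(line)[key]) for line in socket_lines)
    return global_value / socket_mean


ratios = {}
for label, g, s in [("before", BEFORE_GLOBAL, BEFORE_SOCKETS),
                    ("after", AFTER_GLOBAL, AFTER_SOCKETS)]:
    for key in KEYS:
        ratios[(label, key)] = global_to_socket_ratio(g, s, key)
        print(label, key.strip(), round(ratios[(label, key)], 2))
```

Before the fix, the global values are roughly half of the per-socket
mean on this two-socket system; after the fix, they are much closer to
it.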

Chun-Tse Shao (2):
  perf stat: Add aggr_nr metric parser support
  perf stat: Use aggr_nr scaling for Intel uncore miss latency metrics

 tools/perf/pmu-events/intel_metrics.py |  8 ++++----
 tools/perf/pmu-events/metric.py        | 11 +++++++++--
 tools/perf/util/expr.c                 | 27 ++++++++++++++++++++++----
 tools/perf/util/expr.h                 |  4 ++++
 tools/perf/util/expr.l                 |  1 +
 tools/perf/util/expr.y                 | 24 ++++++++++++++++-------
 tools/perf/util/stat-shadow.c          |  4 +++-
 7 files changed, 61 insertions(+), 18 deletions(-)

--
2.54.0.669.g59709faab0-goog



Thread overview: 6+ messages
2026-05-20 17:58 [PATCH 0/2] perf stat: Fix uncore metric scaling across aggregation modes Chun-Tse Shao
2026-05-20 17:58 ` [PATCH 1/2] perf stat: Add aggr_nr metric parser support Chun-Tse Shao
2026-05-20 18:28   ` sashiko-bot
2026-05-20 17:58 ` [PATCH 2/2] perf stat: Use aggr_nr scaling for Intel uncore miss latency metrics Chun-Tse Shao
2026-05-20 19:24   ` sashiko-bot
  -- strict thread matches above, loose matches on Subject: below --
2026-05-20 17:47 [PATCH 0/2] perf stat: Fix uncore metric scaling across aggregation modes Chun-Tse Shao
2026-05-20 17:47 ` [PATCH 1/2] perf stat: Add aggr_nr metric parser support Chun-Tse Shao
