From: Ian Rogers <irogers@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
"Dr. David Alan Gilbert" <linux@treblig.org>,
Yang Li <yang.lee@linux.alibaba.com>,
James Clark <james.clark@linaro.org>,
Thomas Falcon <thomas.falcon@intel.com>,
Thomas Richter <tmricht@linux.ibm.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
Andi Kleen <ak@linux.intel.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>
Subject: [PATCH v3 0/9] perf stat fixes and improvements
Date: Wed, 5 Nov 2025 23:12:31 -0800
Message-ID: <20251106071241.141234-1-irogers@google.com>
A collection of fixes aiming to stabilize measurements/metrics such as
memory bandwidth and make them more reasonable.
Tool events are changed from getting a PMU cpu map of all online CPUs
to either CPU 0 or all online CPUs. This avoids iterating over useless
CPUs for events, in particular `duration_time`. Fix a bug where
duration_time didn't correctly use the previous raw counts and would
skip values in interval mode.
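
To illustrate the interval-mode fix, here is a minimal sketch (not the
perf code itself) of reporting a time event as a delta against the
previously stored raw value. The names `time_counter` and
`time_counter__read_delta` are hypothetical and exist only for this
example.

```c
#include <stdint.h>

/* Hypothetical per-event state: the raw value seen at the previous read. */
struct time_counter {
	uint64_t prev_raw_ns;
};

/*
 * Return the time elapsed since the previous read and remember the new raw
 * value, so that consecutive intervals neither overlap nor skip time.
 */
static uint64_t time_counter__read_delta(struct time_counter *tc,
					 uint64_t now_raw_ns)
{
	uint64_t delta = now_raw_ns - tc->prev_raw_ns;

	tc->prev_raw_ns = now_raw_ns;
	return delta;
}
```

If the stored previous value were not updated or not consulted on each
read, an interval would either be counted twice or dropped, which is the
kind of skipped value described above.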
Change how json metrics handle tool events. Use the counter value
rather than using shared state with perf stat. A later patch changes
it so that tool events are read last, so that when reading, say, memory
bandwidth counters you don't divide by an earlier read time and exceed
the theoretical maximum memory bandwidth.
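
The effect of read ordering can be shown with a small worked example.
This is only a sketch: the helper `bandwidth_bytes_per_sec` and the
numbers used below are made up for illustration and do not come from
the patches.

```c
#include <stdint.h>

/* Bytes per second from a byte count and an elapsed time in nanoseconds. */
static double bandwidth_bytes_per_sec(uint64_t bytes, uint64_t elapsed_ns)
{
	return (double)bytes * 1e9 / (double)elapsed_ns;
}
```

Assume a link that can move at most 100e9 bytes per second. If the
time is sampled at 0.9 s but the byte counters are read later, at
1.0 s, and show 95e9 bytes, dividing by 0.9 s gives roughly 105.6e9
bytes per second, above the assumed maximum. Sampling the time after
the counters (1.0 s) gives 95e9 bytes per second, which stays within
it.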
Do some clean up around the shared state in stat-shadow that's no
longer used. It can be fully removed when the legacy json metrics
patch series lands:
https://lore.kernel.org/lkml/20251024175857.808401-1-irogers@google.com/
Change how affinities work with evlist__for_each_cpu. Move the
affinity code into the iterator to simplify setting it up. Detect when
affinities will and won't be profitable. For example, a tool event and
a regular perf event (or read group) may face less delay from a single
IPI for the event read than from a call to sched_setaffinity. Add a
--no-affinity flag to perf stat to allow affinities to be disabled.
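
One way to picture the profitability check is the sketch below. It is
an assumption built from the example above, not the logic in the
patches: the struct `sketch_event`, the function
`affinity_is_profitable`, and the "more than one IPI" threshold are all
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of an event for this sketch. */
struct sketch_event {
	bool is_tool_event;   /* computed in user space, no IPI needed */
	bool is_group_member; /* read through its group leader */
};

/*
 * Assumed heuristic: changing affinity only pays off when more than one
 * cross-CPU read (IPI) would otherwise be needed on a CPU.
 */
static bool affinity_is_profitable(const struct sketch_event *evs, size_t nr)
{
	size_t ipi_reads = 0;

	for (size_t i = 0; i < nr; i++) {
		/* Tool events and group members add no separate IPI. */
		if (evs[i].is_tool_event || evs[i].is_group_member)
			continue;
		ipi_reads++;
	}
	return ipi_reads > 1;
}
```

Under this assumed rule, a tool event plus one regular event yields a
single IPI, so the sched_setaffinity call is skipped. Passing
--no-affinity to perf stat corresponds to never taking the affinity
path at all.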
v3: Add affinity clean ups and read tool events last.
v2: Fixed an aggregation index issue:
https://lore.kernel.org/lkml/20251104234148.3103176-2-irogers@google.com/
v1:
https://lore.kernel.org/lkml/20251104053449.1208800-1-irogers@google.com/
Ian Rogers (9):
  libperf cpumap: Reduce allocations and sorting in intersect
  perf pmu: perf_cpu_map__new_int to avoid parsing a string
  perf tool_pmu: Use old_count when computing count values for time
    events
  perf stat-shadow: Read tool events directly
  perf stat: Reduce scope of ru_stats
  perf tool_pmu: More accurately set the cpus for tool events
  perf evlist: Reduce affinity use and move into iterator, fix no
    affinity
  perf stat: Read tool events last
  perf stat: Add no-affinity flag
 tools/lib/perf/cpumap.c                |  29 ++--
 tools/perf/Documentation/perf-stat.txt |   4 +
 tools/perf/builtin-stat.c              | 189 +++++++++++++++----------
 tools/perf/util/config.c               |   1 -
 tools/perf/util/drm_pmu.c              |   2 +-
 tools/perf/util/evlist.c               | 156 ++++++++++++--------
 tools/perf/util/evlist.h               |  27 +++-
 tools/perf/util/hwmon_pmu.c            |   2 +-
 tools/perf/util/parse-events.c         |   9 +-
 tools/perf/util/pmu.c                  |  12 ++
 tools/perf/util/pmu.h                  |   1 +
 tools/perf/util/stat-shadow.c          | 149 +++++++++----------
 tools/perf/util/stat.h                 |  16 ---
 tools/perf/util/tool_pmu.c             |  78 ++++++----
 tools/perf/util/tool_pmu.h             |   1 +
 15 files changed, 395 insertions(+), 281 deletions(-)
--
2.51.2.1041.gc1ab5b90ca-goog