From: Ian Rogers <irogers@google.com>
To: Suzuki K Poulose <suzuki.poulose@arm.com>,
Mike Leach <mike.leach@linaro.org>,
James Clark <james.clark@linaro.org>,
John Garry <john.g.garry@oracle.com>,
Will Deacon <will@kernel.org>, Leo Yan <leo.yan@linux.dev>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Charlie Jenkins <charlie@rivosinc.com>,
Thomas Falcon <thomas.falcon@intel.com>,
Yicong Yang <yangyicong@hisilicon.com>,
Thomas Richter <tmricht@linux.ibm.com>,
Athira Rajeev <atrajeev@linux.ibm.com>,
Howard Chu <howardchu95@gmail.com>, Song Liu <song@kernel.org>,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
Levi Yun <yeoreum.yun@arm.com>,
Zhongqiu Han <quic_zhonhan@quicinc.com>,
Blake Jones <blakejones@google.com>,
Anubhav Shelat <ashelat@redhat.com>,
Chun-Tse Shao <ctshao@google.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Jean-Philippe Romain <jean-philippe.romain@foss.st.com>,
Gautam Menghani <gautam@linux.ibm.com>,
Dmitry Vyukov <dvyukov@google.com>,
Yang Li <yang.lee@linux.alibaba.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Andi Kleen <ak@linux.intel.com>,
Weilin Wang <weilin.wang@intel.com>
Subject: [RFC PATCH v1 00/15] Addition of session API to python module
Date: Tue, 28 Oct 2025 22:33:58 -0700
Message-ID: <20251029053413.355154-1-irogers@google.com>
The perf script command uses a session with process_events to call
through to the python process_events function. The event is turned
into a python dictionary, whether the entries are used or not, adding
overhead. To avoid the overhead, add a session API abstraction and
pass callbacks that can be used to perform the existing perf script
functions. The implementation is incomplete in this RFC.
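As a rough illustration of where the overhead comes from, here is a
toy sketch in plain Python. Every name in it (Sample, to_dict,
run_dict_style, run_callback_style) is hypothetical and is not the API
added by this series; it only contrasts eagerly building a dictionary
per event with handing the callback an object whose fields are read on
demand.

```python
# Toy model only: none of these names come from perf's python module.
class Sample:
    """Stand-in for a decoded sample; fields are plain attributes."""
    __slots__ = ("evsel", "phys_addr", "ip", "pid")

    def __init__(self, evsel, phys_addr, ip, pid):
        self.evsel = evsel
        self.phys_addr = phys_addr
        self.ip = ip
        self.pid = pid


def to_dict(sample):
    """Eager conversion: every field is copied, used or not."""
    return {name: getattr(sample, name) for name in Sample.__slots__}


def run_dict_style(samples, process_event):
    """Build a dictionary for each event before calling the handler."""
    for sample in samples:
        process_event(to_dict(sample))


def run_callback_style(samples, on_sample):
    """Pass the sample object through; the handler reads what it needs."""
    for sample in samples:
        on_sample(sample)


samples = [Sample("mem-loads", 0x100 * i, 0x400000 + i, 1234)
           for i in range(1000)]

dict_addrs = []
run_dict_style(samples, lambda d: dict_addrs.append(d["phys_addr"]))

cb_addrs = []
run_callback_style(samples, lambda s: cb_addrs.append(s.phys_addr))

# Both styles see the same data; only the per-event work differs.
assert dict_addrs == cb_addrs
```

In the dictionary style all four fields are copied per event even
though the handler only reads phys_addr; the callback style skips
that copy.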
In this series the mem-phys-addr.py command is ported from perf script
to use the session API. The performance before and after is:
Before:
```
$ perf mem record -a sleep 1
$ time perf script tools/perf/scripts/python/mem-phys-addr.py
Event: cpu_core/mem-loads-aux/
Memory type count percentage
--------------------------------------- ---------- ----------
0-fff : Reserved 3217 100.0
real 0m3.754s
user 0m0.023s
sys 0m0.018s
```
After:
```
$ PYTHONPATH=/tmp/perf/python time python3 tools/perf/python/mem-phys-addr.py
Event: evsel(cpu_core/mem-loads-aux/)
Memory type count percentage
--------------------------------------- ---------- ----------
0-fff : Reserved 3217 100.0
real 0m0.106s
user 0m0.021s
sys 0m0.020s
```
So a roughly 35x speedup, but it may be that some of that is one-time
start-up overhead of libpython, which wouldn't be as significant for
larger perf.data files.
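For reference, the arithmetic behind the figure can be reproduced from
the two `time` outputs above. This is only a calculation on the
numbers already quoted; the variable names are mine.

```python
# Wall-clock ("real") times copied from the two runs above, in seconds.
before_real_s = 3.754
after_real_s = 0.106

# CPU time (user + sys) reported for each run, in seconds.
before_cpu_s = 0.023 + 0.018
after_cpu_s = 0.021 + 0.020

speedup = before_real_s / after_real_s
print(f"wall-clock speedup: {speedup:.1f}x")
print(f"user+sys before: {before_cpu_s:.3f}s, after: {after_cpu_s:.3f}s")
```

This prints a wall-clock speedup of 35.4x, and a user+sys total of
0.041s for both runs.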
Before porting all the script commands and adding things like
callchain support to the python module, I wanted to get feedback. One
thing that particularly simplifies the series is adding reference
counts to evsel and evlist to avoid copying/cloning evsels created by
the session API when loading a perf.data file.
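To show why reference counts remove the need for copies, here is a toy
model in Python. The real change is in C in the evsel/evlist patches
and may look quite different; the RefCounted class and its get()/put()
methods are hypothetical names used only for this sketch.

```python
# Toy model only: not the C implementation from the patches.
class RefCounted:
    """Object shared by several owners and freed on the last put()."""

    def __init__(self, name, on_free):
        self.name = name
        self._refcnt = 1          # the creator holds the first reference
        self._on_free = on_free

    def get(self):
        """Take another reference instead of cloning the object."""
        assert self._refcnt > 0, "use after free"
        self._refcnt += 1
        return self

    def put(self):
        """Drop a reference; release the object when none remain."""
        assert self._refcnt > 0, "unbalanced put"
        self._refcnt -= 1
        if self._refcnt == 0:
            self._on_free(self.name)


freed = []

# The session creates the evsel when loading perf.data.
evsel = RefCounted("cpu_core/mem-loads-aux/", freed.append)

# The python wrapper shares it by taking a reference, with no copy.
wrapper_ref = evsel.get()

# The session goes away first; the wrapper's reference keeps it alive.
evsel.put()
assert freed == []

# When the wrapper is released, the object is finally freed.
wrapper_ref.put()
assert freed == ["cpu_core/mem-loads-aux/"]
```

The point is that the wrapper and the session hold the same object, so
its lifetime no longer depends on which of them is destroyed first.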
The approach of moving away from libpython and scripts was most
recently discussed as a topic in:
https://lore.kernel.org/lkml/CAP-5=fWDqE8SYfOLZkg_0=4Ayx6E7O+h7uUp4NDeCFkiN4b7-w@mail.gmail.com/
When creating the python wrapper some housekeeping was done around
includes and perf_data's encapsulation.
The perf script callbacks differ from those in perf_tool. For example,
the perf_tool stat callback is for a stat event, while the scripting
ops combine things and have a stat callback associated with
stat_round. Should the session API match the tool or the script API?
The former feels better for the long term, while the latter could
simplify porting perf scripts.
Ian Rogers (15):
perf arch arm: Sort includes and add missed explicit dependencies
perf arch x86: Sort includes and add missed explicit dependencies
perf tests: Sort includes and add missed explicit dependencies
perf script: Sort includes and add missed explicit dependencies
perf util: Sort includes and add missed explicit dependencies
perf python: Add add missed explicit dependencies
perf evsel/evlist: Avoid unnecessary #includes
perf maps: Move getting debug_file to verbose path
perf data: Clean up use_stdio and structures
perf python: Add wrapper for perf_data file abstraction
perf python: Add python session abstraction wrapping perf's session
perf evlist: Add reference count
perf evsel: Add reference count
perf python: Add access to evsel and phys_addr in event
perf mem-phys-addr.py: Port to standalone application from perf script
tools/perf/arch/arm/util/cs-etm.c | 22 +-
tools/perf/arch/x86/tests/hybrid.c | 2 +-
tools/perf/arch/x86/tests/topdown.c | 2 +-
tools/perf/arch/x86/util/intel-bts.c | 14 +-
tools/perf/arch/x86/util/intel-pt.c | 31 +-
tools/perf/arch/x86/util/iostat.c | 2 +-
tools/perf/bench/evlist-open-close.c | 18 +-
tools/perf/builtin-ftrace.c | 8 +-
tools/perf/builtin-inject.c | 7 +-
tools/perf/builtin-kvm.c | 4 +-
tools/perf/builtin-lock.c | 2 +-
tools/perf/builtin-record.c | 14 +-
tools/perf/builtin-script.c | 109 ++--
tools/perf/builtin-stat.c | 8 +-
tools/perf/builtin-top.c | 52 +-
tools/perf/builtin-trace.c | 38 +-
tools/perf/python/mem-phys-addr.py | 117 ++++
tools/perf/tests/backward-ring-buffer.c | 18 +-
tools/perf/tests/code-reading.c | 4 +-
tools/perf/tests/event-times.c | 4 +-
tools/perf/tests/event_update.c | 2 +-
tools/perf/tests/evsel-roundtrip-name.c | 8 +-
tools/perf/tests/evsel-tp-sched.c | 4 +-
tools/perf/tests/expand-cgroup.c | 8 +-
tools/perf/tests/hists_cumulate.c | 2 +-
tools/perf/tests/hists_filter.c | 2 +-
tools/perf/tests/hists_link.c | 2 +-
tools/perf/tests/hists_output.c | 2 +-
tools/perf/tests/hwmon_pmu.c | 14 +-
tools/perf/tests/keep-tracking.c | 2 +-
tools/perf/tests/mmap-basic.c | 31 +-
tools/perf/tests/openat-syscall-all-cpus.c | 6 +-
tools/perf/tests/openat-syscall-tp-fields.c | 18 +-
tools/perf/tests/openat-syscall.c | 6 +-
tools/perf/tests/parse-events.c | 4 +-
tools/perf/tests/parse-metric.c | 4 +-
tools/perf/tests/parse-no-sample-id-all.c | 2 +-
tools/perf/tests/perf-record.c | 18 +-
tools/perf/tests/perf-time-to-tsc.c | 2 +-
tools/perf/tests/pfm.c | 4 +-
tools/perf/tests/pmu-events.c | 6 +-
tools/perf/tests/pmu.c | 2 +-
tools/perf/tests/sw-clock.c | 14 +-
tools/perf/tests/switch-tracking.c | 2 +-
tools/perf/tests/task-exit.c | 14 +-
tools/perf/tests/tool_pmu.c | 2 +-
tools/perf/tests/topology.c | 5 +-
tools/perf/util/bpf_counter_cgroup.c | 2 +-
tools/perf/util/bpf_off_cpu.c | 28 +-
tools/perf/util/bpf_trace_augment.c | 7 +-
tools/perf/util/cgroup.c | 6 +-
tools/perf/util/data-convert-bt.c | 2 +-
tools/perf/util/data.c | 81 ++-
tools/perf/util/data.h | 52 +-
tools/perf/util/evlist.c | 100 ++--
tools/perf/util/evlist.h | 23 +-
tools/perf/util/evsel.c | 103 ++--
tools/perf/util/evsel.h | 30 +-
tools/perf/util/expr.c | 2 +-
tools/perf/util/header.c | 12 +-
tools/perf/util/map.h | 6 +-
tools/perf/util/maps.c | 9 +-
tools/perf/util/metricgroup.c | 6 +-
tools/perf/util/parse-events.c | 4 +-
tools/perf/util/parse-events.y | 2 +-
tools/perf/util/perf_api_probe.c | 19 +-
tools/perf/util/pfm.c | 2 +-
tools/perf/util/print-events.c | 2 +-
tools/perf/util/print_insn.h | 5 +-
tools/perf/util/python.c | 584 +++++++++++++++-----
tools/perf/util/record.c | 2 +-
tools/perf/util/s390-sample-raw.c | 15 +-
tools/perf/util/session.c | 4 +-
tools/perf/util/sideband_evlist.c | 16 +-
tools/perf/util/stat-shadow.c | 1 +
tools/perf/util/stat.c | 15 +-
76 files changed, 1152 insertions(+), 650 deletions(-)
create mode 100644 tools/perf/python/mem-phys-addr.py
--
2.51.1.851.g4ebd6896fd-goog
Thread overview: 22+ messages
2025-10-29 5:33 Ian Rogers [this message]
2025-10-29 5:33 ` [RFC PATCH v1 01/15] perf arch arm: Sort includes and add missed explicit dependencies Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 02/15] perf arch x86: " Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 03/15] perf tests: " Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 04/15] perf script: " Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 05/15] perf util: " Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 06/15] perf python: Add " Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 07/15] perf evsel/evlist: Avoid unnecessary #includes Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 08/15] perf maps: Move getting debug_file to verbose path Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 09/15] perf data: Clean up use_stdio and structures Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 10/15] perf python: Add wrapper for perf_data file abstraction Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 11/15] perf python: Add python session abstraction wrapping perf's session Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 12/15] perf evlist: Add reference count Ian Rogers
2025-10-29 16:22 ` Arnaldo Carvalho de Melo
2025-10-29 16:25 ` Arnaldo Carvalho de Melo
2025-10-29 16:56 ` Ian Rogers
2025-10-29 18:33 ` Arnaldo Carvalho de Melo
2025-10-29 21:12 ` Ian Rogers
2025-10-30 13:09 ` Arnaldo Carvalho de Melo
2025-10-29 5:34 ` [RFC PATCH v1 13/15] perf evsel: " Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 14/15] perf python: Add access to evsel and phys_addr in event Ian Rogers
2025-10-29 5:34 ` [RFC PATCH v1 15/15] perf mem-phys-addr.py: Port to standalone application from perf script Ian Rogers