public inbox for linux-kernel@vger.kernel.org
From: Ian Rogers <irogers@google.com>
To: Suzuki K Poulose <suzuki.poulose@arm.com>,
	Mike Leach <mike.leach@linaro.org>,
	 James Clark <james.clark@linaro.org>,
	John Garry <john.g.garry@oracle.com>,
	 Will Deacon <will@kernel.org>, Leo Yan <leo.yan@linux.dev>,
	 Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	 Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	 Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,  Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	 Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Charlie Jenkins <charlie@rivosinc.com>,
	 Thomas Falcon <thomas.falcon@intel.com>,
	Yicong Yang <yangyicong@hisilicon.com>,
	 Thomas Richter <tmricht@linux.ibm.com>,
	Athira Rajeev <atrajeev@linux.ibm.com>,
	 Howard Chu <howardchu95@gmail.com>, Song Liu <song@kernel.org>,
	 Dapeng Mi <dapeng1.mi@linux.intel.com>,
	Levi Yun <yeoreum.yun@arm.com>,
	 Zhongqiu Han <quic_zhonhan@quicinc.com>,
	Blake Jones <blakejones@google.com>,
	 Anubhav Shelat <ashelat@redhat.com>,
	Chun-Tse Shao <ctshao@google.com>,
	 Christophe Leroy <christophe.leroy@csgroup.eu>,
	 Jean-Philippe Romain <jean-philippe.romain@foss.st.com>,
	Gautam Menghani <gautam@linux.ibm.com>,
	 Dmitry Vyukov <dvyukov@google.com>,
	Yang Li <yang.lee@linux.alibaba.com>,
	 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	 Andi Kleen <ak@linux.intel.com>,
	Weilin Wang <weilin.wang@intel.com>
Subject: [RFC PATCH v1 00/15] Addition of session API to python module
Date: Tue, 28 Oct 2025 22:33:58 -0700
Message-ID: <20251029053413.355154-1-irogers@google.com>

The perf script command uses a session with process_events to call
through to the python process_events function. The event is turned
into a python dictionary, whether the entries are used or not, adding
overhead. To avoid the overhead, add a session API abstraction and
pass callbacks that can be used to perform the existing perf script
functions. The implementation is incomplete in this RFC.
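
As a rough illustration of where the overhead comes from, the sketch
below contrasts converting every event to a dictionary with handing the
event object straight to a callback that reads only the field it needs.
All names here (RawSample, to_dict, run_eager, run_callback) are
hypothetical stand-ins for illustration, not the perf python module's
API.

```python
# Hypothetical stand-ins; none of these names come from the perf module.
class RawSample:
    """A decoded sample whose fields are read on demand."""
    __slots__ = ("phys_addr", "ip", "pid", "cpu")

    def __init__(self, phys_addr, ip, pid, cpu):
        self.phys_addr = phys_addr
        self.ip = ip
        self.pid = pid
        self.cpu = cpu


def to_dict(sample):
    """Eager conversion: every field is copied whether it is used or not."""
    return {name: getattr(sample, name) for name in RawSample.__slots__}


def run_eager(samples, handler):
    """Scripting-style path: build a dictionary for each event first."""
    for sample in samples:
        handler(to_dict(sample))


def run_callback(samples, callback):
    """Callback-style path: pass the event object through untouched."""
    for sample in samples:
        callback(sample)


samples = [RawSample(0x100 * i, 0, 1, 0) for i in range(4)]

eager_addrs = []
run_eager(samples, lambda d: eager_addrs.append(d["phys_addr"]))

lazy_addrs = []
run_callback(samples, lambda s: lazy_addrs.append(s.phys_addr))

# Both paths see the same data; only the per-event work differs.
print(eager_addrs == lazy_addrs)
```

The callback path does no per-event allocation beyond what the handler
itself asks for, which is the saving the session API is aiming at.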

In this series the mem-phys-addr.py command is ported from perf script
to use the session API. The performance before and after is:

Before:
```
$ perf mem record -a sleep 1
$ time perf script tools/perf/scripts/python/mem-phys-addr.py
Event: cpu_core/mem-loads-aux/
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m3.754s
user    0m0.023s
sys     0m0.018s
```

After:
```
$ PYTHONPATH=/tmp/perf/python time python3 tools/perf/python/mem-phys-addr.py
Event: evsel(cpu_core/mem-loads-aux/)
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m0.106s
user    0m0.021s
sys     0m0.020s
```

So a roughly 35x speedup, but it may be that some of that is one-time
start-up overhead of libpython, which wouldn't be present for larger
perf.data files.
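
The arithmetic behind that figure, using only the timings quoted above,
can be checked with a few lines. The variable names are illustrative;
the numbers are copied from the two runs.

```python
# Timings copied from the "Before" and "After" runs, in seconds.
before = {"real": 3.754, "user": 0.023, "sys": 0.018}
after = {"real": 0.106, "user": 0.021, "sys": 0.020}

# Wall-clock speedup of the session API port over perf script.
speedup = before["real"] / after["real"]

# CPU time (user + sys) consumed by each run.
before_cpu = before["user"] + before["sys"]
after_cpu = after["user"] + after["sys"]

print(f"wall-clock speedup: {speedup:.1f}x")
print(f"cpu time before: {before_cpu:.3f}s, after: {after_cpu:.3f}s")
```

The reported user plus sys time is the same in both runs, so the gain in
these numbers shows up in wall-clock time rather than in CPU time, which
fits the caveat about one-time start-up cost.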

Before porting all the script commands and adding things like
callchain support to the python module, I wanted to get feedback. One
thing that particularly simplifies the series is adding reference
counts to evsel and evlist to avoid copying/cloning evsels created by
the session API when loading a perf.data file.
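
A minimal sketch of the get/put pattern that reference counting
enables, written in Python for illustration only. The class and method
names are hypothetical and do not mirror the C helpers in the series;
the point is that a wrapper can hold its own reference instead of
cloning the object.

```python
class RefCounted:
    """Toy object with an explicit reference count (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1      # the creator holds the first reference
        self.freed = False

    def get(self):
        """Take an additional reference and return the same object."""
        assert not self.freed, "get() on a released object"
        self.refcnt += 1
        return self

    def put(self):
        """Drop one reference; release the object when none remain."""
        assert self.refcnt > 0, "unbalanced put()"
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True


# The session creates the evsel; a Python wrapper shares it via get().
evsel = RefCounted("cpu_core/mem-loads-aux/")
wrapper_ref = evsel.get()

# The session goes away first; the wrapper's reference keeps it alive.
evsel.put()
alive_after_session_put = not evsel.freed

# Once the wrapper is done, the last put() releases the object.
wrapper_ref.put()
print(alive_after_session_put, evsel.freed)
```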

The approach of moving away from libpython and scripts was most
recently discussed as a topic in:
https://lore.kernel.org/lkml/CAP-5=fWDqE8SYfOLZkg_0=4Ayx6E7O+h7uUp4NDeCFkiN4b7-w@mail.gmail.com/

When creating the python wrapper some housekeeping was done around
includes and perf_data's encapsulation.

The perf script callbacks differ from those in perf_tool. For example,
the stat callback in perf_tool is for a stat event, while the scripting
ops combine things and have a stat callback associated with
stat_round. Should the session API match the tool or the script API?
The former feels better for the long term, while the latter could
simplify porting perf scripts.

Ian Rogers (15):
  perf arch arm: Sort includes and add missed explicit dependencies
  perf arch x86: Sort includes and add missed explicit dependencies
  perf tests: Sort includes and add missed explicit dependencies
  perf script: Sort includes and add missed explicit dependencies
  perf util: Sort includes and add missed explicit dependencies
  perf python: Add add missed explicit dependencies
  perf evsel/evlist: Avoid unnecessary #includes
  perf maps: Move getting debug_file to verbose path
  perf data: Clean up use_stdio and structures
  perf python: Add wrapper for perf_data file abstraction
  perf python: Add python session abstraction wrapping perf's session
  perf evlist: Add reference count
  perf evsel: Add reference count
  perf python: Add access to evsel and phys_addr in event
  perf mem-phys-addr.py: Port to standalone application from perf script

 tools/perf/arch/arm/util/cs-etm.c           |  22 +-
 tools/perf/arch/x86/tests/hybrid.c          |   2 +-
 tools/perf/arch/x86/tests/topdown.c         |   2 +-
 tools/perf/arch/x86/util/intel-bts.c        |  14 +-
 tools/perf/arch/x86/util/intel-pt.c         |  31 +-
 tools/perf/arch/x86/util/iostat.c           |   2 +-
 tools/perf/bench/evlist-open-close.c        |  18 +-
 tools/perf/builtin-ftrace.c                 |   8 +-
 tools/perf/builtin-inject.c                 |   7 +-
 tools/perf/builtin-kvm.c                    |   4 +-
 tools/perf/builtin-lock.c                   |   2 +-
 tools/perf/builtin-record.c                 |  14 +-
 tools/perf/builtin-script.c                 | 109 ++--
 tools/perf/builtin-stat.c                   |   8 +-
 tools/perf/builtin-top.c                    |  52 +-
 tools/perf/builtin-trace.c                  |  38 +-
 tools/perf/python/mem-phys-addr.py          | 117 ++++
 tools/perf/tests/backward-ring-buffer.c     |  18 +-
 tools/perf/tests/code-reading.c             |   4 +-
 tools/perf/tests/event-times.c              |   4 +-
 tools/perf/tests/event_update.c             |   2 +-
 tools/perf/tests/evsel-roundtrip-name.c     |   8 +-
 tools/perf/tests/evsel-tp-sched.c           |   4 +-
 tools/perf/tests/expand-cgroup.c            |   8 +-
 tools/perf/tests/hists_cumulate.c           |   2 +-
 tools/perf/tests/hists_filter.c             |   2 +-
 tools/perf/tests/hists_link.c               |   2 +-
 tools/perf/tests/hists_output.c             |   2 +-
 tools/perf/tests/hwmon_pmu.c                |  14 +-
 tools/perf/tests/keep-tracking.c            |   2 +-
 tools/perf/tests/mmap-basic.c               |  31 +-
 tools/perf/tests/openat-syscall-all-cpus.c  |   6 +-
 tools/perf/tests/openat-syscall-tp-fields.c |  18 +-
 tools/perf/tests/openat-syscall.c           |   6 +-
 tools/perf/tests/parse-events.c             |   4 +-
 tools/perf/tests/parse-metric.c             |   4 +-
 tools/perf/tests/parse-no-sample-id-all.c   |   2 +-
 tools/perf/tests/perf-record.c              |  18 +-
 tools/perf/tests/perf-time-to-tsc.c         |   2 +-
 tools/perf/tests/pfm.c                      |   4 +-
 tools/perf/tests/pmu-events.c               |   6 +-
 tools/perf/tests/pmu.c                      |   2 +-
 tools/perf/tests/sw-clock.c                 |  14 +-
 tools/perf/tests/switch-tracking.c          |   2 +-
 tools/perf/tests/task-exit.c                |  14 +-
 tools/perf/tests/tool_pmu.c                 |   2 +-
 tools/perf/tests/topology.c                 |   5 +-
 tools/perf/util/bpf_counter_cgroup.c        |   2 +-
 tools/perf/util/bpf_off_cpu.c               |  28 +-
 tools/perf/util/bpf_trace_augment.c         |   7 +-
 tools/perf/util/cgroup.c                    |   6 +-
 tools/perf/util/data-convert-bt.c           |   2 +-
 tools/perf/util/data.c                      |  81 ++-
 tools/perf/util/data.h                      |  52 +-
 tools/perf/util/evlist.c                    | 100 ++--
 tools/perf/util/evlist.h                    |  23 +-
 tools/perf/util/evsel.c                     | 103 ++--
 tools/perf/util/evsel.h                     |  30 +-
 tools/perf/util/expr.c                      |   2 +-
 tools/perf/util/header.c                    |  12 +-
 tools/perf/util/map.h                       |   6 +-
 tools/perf/util/maps.c                      |   9 +-
 tools/perf/util/metricgroup.c               |   6 +-
 tools/perf/util/parse-events.c              |   4 +-
 tools/perf/util/parse-events.y              |   2 +-
 tools/perf/util/perf_api_probe.c            |  19 +-
 tools/perf/util/pfm.c                       |   2 +-
 tools/perf/util/print-events.c              |   2 +-
 tools/perf/util/print_insn.h                |   5 +-
 tools/perf/util/python.c                    | 584 +++++++++++++++-----
 tools/perf/util/record.c                    |   2 +-
 tools/perf/util/s390-sample-raw.c           |  15 +-
 tools/perf/util/session.c                   |   4 +-
 tools/perf/util/sideband_evlist.c           |  16 +-
 tools/perf/util/stat-shadow.c               |   1 +
 tools/perf/util/stat.c                      |  15 +-
 76 files changed, 1152 insertions(+), 650 deletions(-)
 create mode 100644 tools/perf/python/mem-phys-addr.py

-- 
2.51.1.851.g4ebd6896fd-goog



