linux-perf-users.vger.kernel.org archive mirror
* [RFC PATCH v1 00/15] Addition of session API to python module
@ 2025-10-29  5:33 Ian Rogers
From: Ian Rogers @ 2025-10-29  5:33 UTC
  To: Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Leo Yan, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Jiri Olsa, Ian Rogers, Adrian Hunter, Greg Kroah-Hartman,
	Charlie Jenkins, Thomas Falcon, Yicong Yang, Thomas Richter,
	Athira Rajeev, Howard Chu, Song Liu, Dapeng Mi, Levi Yun,
	Zhongqiu Han, Blake Jones, Anubhav Shelat, Chun-Tse Shao,
	Christophe Leroy, Jean-Philippe Romain, Gautam Menghani,
	Dmitry Vyukov, Yang Li, linux-kernel, linux-perf-users,
	Andi Kleen, Weilin Wang

The perf script command uses a session with process_events to call
through to the python process_events function. The event is turned
into a python dictionary, whether the entries are used or not, adding
overhead. To avoid the overhead, add a session API abstraction and
pass callbacks that can be used to perform the existing perf script
functions. The implementation is incomplete in this RFC.
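
The overhead described above, converting every event into a dictionary
even when only one entry is read, can be illustrated with a minimal,
self-contained sketch. None of this is the perf python module API:
RawSample, to_dict, count_eager and count_direct are hypothetical names
used only to contrast the two callback styles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawSample:
    """Hypothetical stand-in for a decoded sample; not a perf type."""
    pid: int
    tid: int
    cpu: int
    time: int
    ip: int
    addr: int
    phys_addr: int


def to_dict(sample):
    """Eager conversion: every field is copied, whether used or not."""
    return {
        "pid": sample.pid,
        "tid": sample.tid,
        "cpu": sample.cpu,
        "time": sample.time,
        "ip": sample.ip,
        "addr": sample.addr,
        "phys_addr": sample.phys_addr,
    }


def count_eager(samples):
    """Script style: the handler receives a full dictionary per event."""
    counts = {}
    for sample in samples:
        event = to_dict(sample)  # cost is paid for all seven fields
        page = event["phys_addr"] >> 12
        counts[page] = counts.get(page, 0) + 1
    return counts


def count_direct(samples):
    """Callback style: the handler reads only the attribute it needs."""
    counts = {}
    for sample in samples:
        page = sample.phys_addr >> 12
        counts[page] = counts.get(page, 0) + 1
    return counts


samples = [RawSample(1, 1, 0, t, 0x1000, 0x2000, 0x800 + t) for t in range(4)]
# Both styles produce the same result; only the per-event work differs.
assert count_eager(samples) == count_direct(samples)
```

Both functions compute the same per-page counts; the second simply skips
the per-event dictionary construction.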

In this series the mem-phys-addr.py command is ported from perf script
to using the session API. The performance before and after is:

Before:
```
$ perf mem record -a sleep 1
$ time perf script tools/perf/scripts/python/mem-phys-addr.py
Event: cpu_core/mem-loads-aux/
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m3.754s
user    0m0.023s
sys     0m0.018s
```

After:
```
$ PYTHONPATH=/tmp/perf/python time python3 tools/perf/python/mem-phys-addr.py
Event: evsel(cpu_core/mem-loads-aux/)
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m0.106s
user    0m0.021s
sys     0m0.020s
```

So roughly a 35x speedup, but it may be that some of that is one-time
start-up overhead of libpython, which would be less significant for
larger perf.data files.
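
For reference, the ratio follows directly from the two "real" times
reported above. The variable names in this small check are illustrative
only.

```python
# Wall-clock ("real") times taken from the two runs shown above.
before_real_s = 3.754  # perf script with the libpython-based script
after_real_s = 0.106   # standalone python3 using the session API

speedup = before_real_s / after_real_s
print(f"{speedup:.1f}x")  # prints "35.4x"
```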

Before porting all the script commands and adding things like
callchain support to the python module, I wanted to get feedback. One
thing that particularly simplifies the series is adding reference
counts to evsel and evlist to avoid copying/cloning evsels created by
the session API when loading a perf.data file.
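
As a rough model of why reference counts remove the need to copy, the
sketch below shows get/put semantics in plain Python. It is not the perf
implementation: the RefCounted class and its get/put methods are
hypothetical and only mimic the shape of a C-style reference count, where
the session's evlist and a Python wrapper can share one evsel.

```python
class RefCounted:
    """Hypothetical model of a get/put reference count; not perf code."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1      # the creator holds the first reference
        self.freed = False

    def get(self):
        """Take an additional reference and return the same object."""
        assert not self.freed, "use after free"
        self.refcnt += 1
        return self

    def put(self):
        """Drop a reference; release the object when the last one goes."""
        assert self.refcnt > 0, "unbalanced put"
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True


# The session's evlist creates the evsel and holds the first reference.
evsel = RefCounted("cpu_core/mem-loads-aux/")
# A wrapper shares the same object instead of cloning it.
wrapper_ref = evsel.get()
assert wrapper_ref is evsel

evsel.put()                    # the session goes away first
assert not wrapper_ref.freed   # still alive: the wrapper holds a reference
wrapper_ref.put()              # last reference dropped
assert wrapper_ref.freed
```

With shared ownership, whichever holder drops the last reference
releases the object, so neither side needs its own copy.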

The approach of moving away from libpython and scripts was most
recently discussed as a topic in:
https://lore.kernel.org/lkml/CAP-5=fWDqE8SYfOLZkg_0=4Ayx6E7O+h7uUp4NDeCFkiN4b7-w@mail.gmail.com/

When creating the python wrapper, some housekeeping was done around
includes and perf_data's encapsulation.

The perf script callbacks differ from those in perf_tool. For example,
the stat callback in perf_tool is for a stat event, while the scripting
ops combine things and have a stat callback associated with
stat_round. Should the session API match the tool or the script API?
The former feels better for the long term, while the latter could
simplify porting perf scripts.

Ian Rogers (15):
  perf arch arm: Sort includes and add missed explicit dependencies
  perf arch x86: Sort includes and add missed explicit dependencies
  perf tests: Sort includes and add missed explicit dependencies
  perf script: Sort includes and add missed explicit dependencies
  perf util: Sort includes and add missed explicit dependencies
  perf python: Add add missed explicit dependencies
  perf evsel/evlist: Avoid unnecessary #includes
  perf maps: Move getting debug_file to verbose path
  perf data: Clean up use_stdio and structures
  perf python: Add wrapper for perf_data file abstraction
  perf python: Add python session abstraction wrapping perf's session
  perf evlist: Add reference count
  perf evsel: Add reference count
  perf python: Add access to evsel and phys_addr in event
  perf mem-phys-addr.py: Port to standalone application from perf script

 tools/perf/arch/arm/util/cs-etm.c           |  22 +-
 tools/perf/arch/x86/tests/hybrid.c          |   2 +-
 tools/perf/arch/x86/tests/topdown.c         |   2 +-
 tools/perf/arch/x86/util/intel-bts.c        |  14 +-
 tools/perf/arch/x86/util/intel-pt.c         |  31 +-
 tools/perf/arch/x86/util/iostat.c           |   2 +-
 tools/perf/bench/evlist-open-close.c        |  18 +-
 tools/perf/builtin-ftrace.c                 |   8 +-
 tools/perf/builtin-inject.c                 |   7 +-
 tools/perf/builtin-kvm.c                    |   4 +-
 tools/perf/builtin-lock.c                   |   2 +-
 tools/perf/builtin-record.c                 |  14 +-
 tools/perf/builtin-script.c                 | 109 ++--
 tools/perf/builtin-stat.c                   |   8 +-
 tools/perf/builtin-top.c                    |  52 +-
 tools/perf/builtin-trace.c                  |  38 +-
 tools/perf/python/mem-phys-addr.py          | 117 ++++
 tools/perf/tests/backward-ring-buffer.c     |  18 +-
 tools/perf/tests/code-reading.c             |   4 +-
 tools/perf/tests/event-times.c              |   4 +-
 tools/perf/tests/event_update.c             |   2 +-
 tools/perf/tests/evsel-roundtrip-name.c     |   8 +-
 tools/perf/tests/evsel-tp-sched.c           |   4 +-
 tools/perf/tests/expand-cgroup.c            |   8 +-
 tools/perf/tests/hists_cumulate.c           |   2 +-
 tools/perf/tests/hists_filter.c             |   2 +-
 tools/perf/tests/hists_link.c               |   2 +-
 tools/perf/tests/hists_output.c             |   2 +-
 tools/perf/tests/hwmon_pmu.c                |  14 +-
 tools/perf/tests/keep-tracking.c            |   2 +-
 tools/perf/tests/mmap-basic.c               |  31 +-
 tools/perf/tests/openat-syscall-all-cpus.c  |   6 +-
 tools/perf/tests/openat-syscall-tp-fields.c |  18 +-
 tools/perf/tests/openat-syscall.c           |   6 +-
 tools/perf/tests/parse-events.c             |   4 +-
 tools/perf/tests/parse-metric.c             |   4 +-
 tools/perf/tests/parse-no-sample-id-all.c   |   2 +-
 tools/perf/tests/perf-record.c              |  18 +-
 tools/perf/tests/perf-time-to-tsc.c         |   2 +-
 tools/perf/tests/pfm.c                      |   4 +-
 tools/perf/tests/pmu-events.c               |   6 +-
 tools/perf/tests/pmu.c                      |   2 +-
 tools/perf/tests/sw-clock.c                 |  14 +-
 tools/perf/tests/switch-tracking.c          |   2 +-
 tools/perf/tests/task-exit.c                |  14 +-
 tools/perf/tests/tool_pmu.c                 |   2 +-
 tools/perf/tests/topology.c                 |   5 +-
 tools/perf/util/bpf_counter_cgroup.c        |   2 +-
 tools/perf/util/bpf_off_cpu.c               |  28 +-
 tools/perf/util/bpf_trace_augment.c         |   7 +-
 tools/perf/util/cgroup.c                    |   6 +-
 tools/perf/util/data-convert-bt.c           |   2 +-
 tools/perf/util/data.c                      |  81 ++-
 tools/perf/util/data.h                      |  52 +-
 tools/perf/util/evlist.c                    | 100 ++--
 tools/perf/util/evlist.h                    |  23 +-
 tools/perf/util/evsel.c                     | 103 ++--
 tools/perf/util/evsel.h                     |  30 +-
 tools/perf/util/expr.c                      |   2 +-
 tools/perf/util/header.c                    |  12 +-
 tools/perf/util/map.h                       |   6 +-
 tools/perf/util/maps.c                      |   9 +-
 tools/perf/util/metricgroup.c               |   6 +-
 tools/perf/util/parse-events.c              |   4 +-
 tools/perf/util/parse-events.y              |   2 +-
 tools/perf/util/perf_api_probe.c            |  19 +-
 tools/perf/util/pfm.c                       |   2 +-
 tools/perf/util/print-events.c              |   2 +-
 tools/perf/util/print_insn.h                |   5 +-
 tools/perf/util/python.c                    | 584 +++++++++++++++-----
 tools/perf/util/record.c                    |   2 +-
 tools/perf/util/s390-sample-raw.c           |  15 +-
 tools/perf/util/session.c                   |   4 +-
 tools/perf/util/sideband_evlist.c           |  16 +-
 tools/perf/util/stat-shadow.c               |   1 +
 tools/perf/util/stat.c                      |  15 +-
 76 files changed, 1152 insertions(+), 650 deletions(-)
 create mode 100644 tools/perf/python/mem-phys-addr.py

-- 
2.51.1.851.g4ebd6896fd-goog


Thread overview: 22+ messages
2025-10-29  5:33 [RFC PATCH v1 00/15] Addition of session API to python module Ian Rogers
2025-10-29  5:33 ` [RFC PATCH v1 01/15] perf arch arm: Sort includes and add missed explicit dependencies Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 02/15] perf arch x86: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 03/15] perf tests: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 04/15] perf script: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 05/15] perf util: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 06/15] perf python: Add " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 07/15] perf evsel/evlist: Avoid unnecessary #includes Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 08/15] perf maps: Move getting debug_file to verbose path Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 09/15] perf data: Clean up use_stdio and structures Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 10/15] perf python: Add wrapper for perf_data file abstraction Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 11/15] perf python: Add python session abstraction wrapping perf's session Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 12/15] perf evlist: Add reference count Ian Rogers
2025-10-29 16:22   ` Arnaldo Carvalho de Melo
2025-10-29 16:25     ` Arnaldo Carvalho de Melo
2025-10-29 16:56     ` Ian Rogers
2025-10-29 18:33       ` Arnaldo Carvalho de Melo
2025-10-29 21:12         ` Ian Rogers
2025-10-30 13:09           ` Arnaldo Carvalho de Melo
2025-10-29  5:34 ` [RFC PATCH v1 13/15] perf evsel: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 14/15] perf python: Add access to evsel and phys_addr in event Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 15/15] perf mem-phys-addr.py: Port to standalone application from perf script Ian Rogers
