* [RFC PATCH v1 00/15] Addition of session API to python module
@ 2025-10-29  5:33 Ian Rogers
From: Ian Rogers @ 2025-10-29  5:33 UTC (permalink / raw)
  To: Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Leo Yan, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Jiri Olsa, Ian Rogers, Adrian Hunter, Greg Kroah-Hartman,
	Charlie Jenkins, Thomas Falcon, Yicong Yang, Thomas Richter,
	Athira Rajeev, Howard Chu, Song Liu, Dapeng Mi, Levi Yun,
	Zhongqiu Han, Blake Jones, Anubhav Shelat, Chun-Tse Shao,
	Christophe Leroy, Jean-Philippe Romain, Gautam Menghani,
	Dmitry Vyukov, Yang Li, linux-kernel, linux-perf-users,
	Andi Kleen, Weilin Wang

The perf script command uses a session with process_events to call
through to the python process_events function. The event is turned
into a python dictionary, whether the entries are used or not, adding
overhead. To avoid the overhead, add a session API abstraction and
pass callbacks that can be used to perform the existing perf script
functions. The implementation is incomplete in this RFC.
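
As a rough illustration of where the overhead comes from, the sketch below
contrasts the two dispatch styles in plain Python. It does not use the perf
module: RawSample, to_dict, process_events_eager and process_events_lazy are
hypothetical stand-ins invented for this example, and the real session API in
this series may look different.

```python
from dataclasses import dataclass


@dataclass
class RawSample:
    """Stand-in for a decoded sample (hypothetical, not perf's event type)."""
    ip: int
    pid: int
    phys_addr: int


def to_dict(sample):
    """Eager conversion: every field is copied whether it is used or not."""
    return {"ip": sample.ip, "pid": sample.pid, "phys_addr": sample.phys_addr}


def process_events_eager(samples, handler):
    """Models the perf script path: build a dict per event, then call the handler."""
    for s in samples:
        handler(to_dict(s))


def process_events_lazy(samples, callback):
    """Models the callback path: pass the event object through untouched."""
    for s in samples:
        callback(s)


samples = [RawSample(ip=0x1000 + i, pid=42, phys_addr=0x800 + i) for i in range(4)]

# Both handlers only care about phys_addr; the eager path still copies ip and pid.
eager_addrs = []
process_events_eager(samples, lambda d: eager_addrs.append(d["phys_addr"]))

lazy_addrs = []
process_events_lazy(samples, lambda s: lazy_addrs.append(s.phys_addr))
```

Both paths collect the same addresses; the difference is only in how much
per-event work happens before the handler runs.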

In this series the mem-phys-addr.py command is ported from perf script
to the session API. The performance before and after is:

Before:
```
$ perf mem record -a sleep 1
$ time perf script tools/perf/scripts/python/mem-phys-addr.py
Event: cpu_core/mem-loads-aux/
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m3.754s
user    0m0.023s
sys     0m0.018s
```

After:
```
$ PYTHONPATH=/tmp/perf/python time python3 tools/perf/python/mem-phys-addr.py
Event: evsel(cpu_core/mem-loads-aux/)
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m0.106s
user    0m0.021s
sys     0m0.020s
```

So a roughly 35x speedup, but it may be that some of that is one-time
start-up overhead of libpython, which would be proportionally smaller
for larger perf.data files.
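
For reference, the ratio can be recomputed from the timings quoted above. The
variable names are only for this example; the numbers are the "real", "user"
and "sys" values from the two runs.

```python
# Wall-clock ("real") times from the two runs, in seconds.
before_real_s = 3.754  # perf script with mem-phys-addr.py
after_real_s = 0.106   # standalone python3 mem-phys-addr.py

speedup = before_real_s / after_real_s
print(f"wall-clock speedup: {speedup:.1f}x")

# CPU time (user + sys) reported for each run, in seconds.
before_cpu_s = 0.023 + 0.018
after_cpu_s = 0.021 + 0.020
print(f"cpu time before: {before_cpu_s:.3f}s, after: {after_cpu_s:.3f}s")
```

The wall-clock ratio comes out at about 35.4, while the reported user+sys
totals are 0.041s in both runs, so the gap shows up in elapsed time rather
than in the CPU time attributed to the measured process.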

Before porting all the script commands and adding things like
callchain support to the python module, I wanted to get feedback. One
thing that particularly simplifies the series is adding reference
counts to evsel and evlist to avoid copying/cloning evsels created by
the session API when loading a perf.data file.
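
To show why a reference count removes the need for copying, here is a toy
get/put model in Python. It is not the evsel or evlist code from this series:
the RefCounted class and its get/put methods are hypothetical names chosen for
the illustration.

```python
class RefCounted:
    """Toy get/put reference count (hypothetical; not perf's evsel code)."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1      # the creator holds the first reference
        self.freed = False

    def get(self):
        """Take an extra reference and return the same object (no clone)."""
        assert not self.freed, "get on a freed object"
        self.refcnt += 1
        return self

    def put(self):
        """Drop one reference; the last put releases the object."""
        assert self.refcnt > 0, "put without a matching reference"
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True  # a C implementation would free the memory here


evsel = RefCounted("cpu_core/mem-loads-aux/")  # created while loading perf.data
wrapper_ref = evsel.get()                      # a wrapper shares it instead of copying
evsel.put()                                    # the loader drops its reference
alive_after_first_put = not evsel.freed        # still alive: the wrapper holds one
wrapper_ref.put()                              # the last reference goes away
```

The object stays valid for as long as any holder keeps a reference, which is
the property that lets a wrapper point at the original rather than a clone.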

The approach of moving away from libpython and scripts was most
recently discussed as a topic in:
https://lore.kernel.org/lkml/CAP-5=fWDqE8SYfOLZkg_0=4Ayx6E7O+h7uUp4NDeCFkiN4b7-w@mail.gmail.com/

When creating the python wrapper some housekeeping was done around
includes and perf_data's encapsulation.

The perf script callbacks differ from those in perf_tool. For example,
the stat callback in perf_tool is for a stat event, while the scripting
ops combine things and have a stat callback associated with
stat_round. Should the session API match the tool or the script API?
The former feels better for the long term, while the latter could
simplify porting perf scripts.

Ian Rogers (15):
  perf arch arm: Sort includes and add missed explicit dependencies
  perf arch x86: Sort includes and add missed explicit dependencies
  perf tests: Sort includes and add missed explicit dependencies
  perf script: Sort includes and add missed explicit dependencies
  perf util: Sort includes and add missed explicit dependencies
  perf python: Add add missed explicit dependencies
  perf evsel/evlist: Avoid unnecessary #includes
  perf maps: Move getting debug_file to verbose path
  perf data: Clean up use_stdio and structures
  perf python: Add wrapper for perf_data file abstraction
  perf python: Add python session abstraction wrapping perf's session
  perf evlist: Add reference count
  perf evsel: Add reference count
  perf python: Add access to evsel and phys_addr in event
  perf mem-phys-addr.py: Port to standalone application from perf script

 tools/perf/arch/arm/util/cs-etm.c           |  22 +-
 tools/perf/arch/x86/tests/hybrid.c          |   2 +-
 tools/perf/arch/x86/tests/topdown.c         |   2 +-
 tools/perf/arch/x86/util/intel-bts.c        |  14 +-
 tools/perf/arch/x86/util/intel-pt.c         |  31 +-
 tools/perf/arch/x86/util/iostat.c           |   2 +-
 tools/perf/bench/evlist-open-close.c        |  18 +-
 tools/perf/builtin-ftrace.c                 |   8 +-
 tools/perf/builtin-inject.c                 |   7 +-
 tools/perf/builtin-kvm.c                    |   4 +-
 tools/perf/builtin-lock.c                   |   2 +-
 tools/perf/builtin-record.c                 |  14 +-
 tools/perf/builtin-script.c                 | 109 ++--
 tools/perf/builtin-stat.c                   |   8 +-
 tools/perf/builtin-top.c                    |  52 +-
 tools/perf/builtin-trace.c                  |  38 +-
 tools/perf/python/mem-phys-addr.py          | 117 ++++
 tools/perf/tests/backward-ring-buffer.c     |  18 +-
 tools/perf/tests/code-reading.c             |   4 +-
 tools/perf/tests/event-times.c              |   4 +-
 tools/perf/tests/event_update.c             |   2 +-
 tools/perf/tests/evsel-roundtrip-name.c     |   8 +-
 tools/perf/tests/evsel-tp-sched.c           |   4 +-
 tools/perf/tests/expand-cgroup.c            |   8 +-
 tools/perf/tests/hists_cumulate.c           |   2 +-
 tools/perf/tests/hists_filter.c             |   2 +-
 tools/perf/tests/hists_link.c               |   2 +-
 tools/perf/tests/hists_output.c             |   2 +-
 tools/perf/tests/hwmon_pmu.c                |  14 +-
 tools/perf/tests/keep-tracking.c            |   2 +-
 tools/perf/tests/mmap-basic.c               |  31 +-
 tools/perf/tests/openat-syscall-all-cpus.c  |   6 +-
 tools/perf/tests/openat-syscall-tp-fields.c |  18 +-
 tools/perf/tests/openat-syscall.c           |   6 +-
 tools/perf/tests/parse-events.c             |   4 +-
 tools/perf/tests/parse-metric.c             |   4 +-
 tools/perf/tests/parse-no-sample-id-all.c   |   2 +-
 tools/perf/tests/perf-record.c              |  18 +-
 tools/perf/tests/perf-time-to-tsc.c         |   2 +-
 tools/perf/tests/pfm.c                      |   4 +-
 tools/perf/tests/pmu-events.c               |   6 +-
 tools/perf/tests/pmu.c                      |   2 +-
 tools/perf/tests/sw-clock.c                 |  14 +-
 tools/perf/tests/switch-tracking.c          |   2 +-
 tools/perf/tests/task-exit.c                |  14 +-
 tools/perf/tests/tool_pmu.c                 |   2 +-
 tools/perf/tests/topology.c                 |   5 +-
 tools/perf/util/bpf_counter_cgroup.c        |   2 +-
 tools/perf/util/bpf_off_cpu.c               |  28 +-
 tools/perf/util/bpf_trace_augment.c         |   7 +-
 tools/perf/util/cgroup.c                    |   6 +-
 tools/perf/util/data-convert-bt.c           |   2 +-
 tools/perf/util/data.c                      |  81 ++-
 tools/perf/util/data.h                      |  52 +-
 tools/perf/util/evlist.c                    | 100 ++--
 tools/perf/util/evlist.h                    |  23 +-
 tools/perf/util/evsel.c                     | 103 ++--
 tools/perf/util/evsel.h                     |  30 +-
 tools/perf/util/expr.c                      |   2 +-
 tools/perf/util/header.c                    |  12 +-
 tools/perf/util/map.h                       |   6 +-
 tools/perf/util/maps.c                      |   9 +-
 tools/perf/util/metricgroup.c               |   6 +-
 tools/perf/util/parse-events.c              |   4 +-
 tools/perf/util/parse-events.y              |   2 +-
 tools/perf/util/perf_api_probe.c            |  19 +-
 tools/perf/util/pfm.c                       |   2 +-
 tools/perf/util/print-events.c              |   2 +-
 tools/perf/util/print_insn.h                |   5 +-
 tools/perf/util/python.c                    | 584 +++++++++++++++-----
 tools/perf/util/record.c                    |   2 +-
 tools/perf/util/s390-sample-raw.c           |  15 +-
 tools/perf/util/session.c                   |   4 +-
 tools/perf/util/sideband_evlist.c           |  16 +-
 tools/perf/util/stat-shadow.c               |   1 +
 tools/perf/util/stat.c                      |  15 +-
 76 files changed, 1152 insertions(+), 650 deletions(-)
 create mode 100644 tools/perf/python/mem-phys-addr.py

-- 
2.51.1.851.g4ebd6896fd-goog


Thread overview: 22+ messages (newest: 2025-10-30 13:09 UTC)
2025-10-29  5:33 [RFC PATCH v1 00/15] Addition of session API to python module Ian Rogers
2025-10-29  5:33 ` [RFC PATCH v1 01/15] perf arch arm: Sort includes and add missed explicit dependencies Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 02/15] perf arch x86: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 03/15] perf tests: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 04/15] perf script: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 05/15] perf util: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 06/15] perf python: Add " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 07/15] perf evsel/evlist: Avoid unnecessary #includes Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 08/15] perf maps: Move getting debug_file to verbose path Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 09/15] perf data: Clean up use_stdio and structures Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 10/15] perf python: Add wrapper for perf_data file abstraction Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 11/15] perf python: Add python session abstraction wrapping perf's session Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 12/15] perf evlist: Add reference count Ian Rogers
2025-10-29 16:22   ` Arnaldo Carvalho de Melo
2025-10-29 16:25     ` Arnaldo Carvalho de Melo
2025-10-29 16:56     ` Ian Rogers
2025-10-29 18:33       ` Arnaldo Carvalho de Melo
2025-10-29 21:12         ` Ian Rogers
2025-10-30 13:09           ` Arnaldo Carvalho de Melo
2025-10-29  5:34 ` [RFC PATCH v1 13/15] perf evsel: " Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 14/15] perf python: Add access to evsel and phys_addr in event Ian Rogers
2025-10-29  5:34 ` [RFC PATCH v1 15/15] perf mem-phys-addr.py: Port to standalone application from perf script Ian Rogers
