linux-perf-users.vger.kernel.org archive mirror
* Perf support in CPython
@ 2023-11-20 23:50 Pablo Galindo Salgado
From: Pablo Galindo Salgado @ 2023-11-20 23:50 UTC (permalink / raw)
  To: linux-perf-users

Hi,

I am Pablo Galindo Salgado from the CPython core development team. In
the latest release of CPython (Python 3.12) I added support for
including Python function calls in the output of perf, using perf's
jit-interface support (the /tmp/perf-%d.map file). The support works
by compiling assembly trampolines at runtime that just call the Python
bytecode evaluation loop, and by assigning function names to the
trampolines through entries written to /tmp/perf-%d.map. This has
worked really well when CPython is compiled with frame pointers, but
CPython suffers a 5% to 10% slowdown when frame pointers are used, and
perf currently cannot unwind through the jitted trampolines without
them.
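
For readers unfamiliar with the perf map interface described above,
here is a minimal sketch of what an entry looks like. It assumes the
format documented for perf's jit-interface (one line per symbol: start
address and size in hex without a 0x prefix, then the symbol name). The
helper names, the example addresses and the "py::name:file" symbol
string are illustrative only; CPython's real implementation is written
in C and is not reproduced here.

```python
import os


def format_perf_map_entry(start: int, size: int, name: str) -> str:
    """Format one perf map line: "<start-hex> <size-hex> <symbol name>"."""
    if "\n" in name:
        # Each entry must fit on one line, so reject embedded newlines.
        raise ValueError("symbol name must not contain a newline")
    return f"{start:x} {size:x} {name}\n"


def append_perf_map_entry(start, size, name, pid=None, directory="/tmp"):
    """Append an entry to <directory>/perf-<pid>.map and return the path.

    perf looks for the file under /tmp; the directory parameter is a
    hypothetical convenience so the sketch can be tried elsewhere.
    """
    pid = os.getpid() if pid is None else pid
    path = os.path.join(directory, f"perf-{pid}.map")
    with open(path, "a", encoding="utf-8") as f:
        f.write(format_perf_map_entry(start, size, name))
    return path
```

A trampoline placed at a made-up address 0x7f00aa001000 with a size of
0x40 bytes would be registered with
append_perf_map_entry(0x7f00aa001000, 0x40, "py::foo:/tmp/x.py").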

I am looking to add support for the jitdump specification so that perf
can unwind through these trampolines when using DWARF unwinding mode.
I have a draft patch (https://github.com/python/cpython/pull/112254)
that implements the specification by writing JIT_CODE_LOAD and
JIT_CODE_UNWINDING_INFO records (currently with empty EH Frame entries,
to force perf to fall back to FP-based unwinding). This works well on
my current distribution (Arch Linux with perf version 6.5), but it
fails when tested on other distributions.
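
As background on the records mentioned above, the following is a rough
sketch of how a jitdump file header and a JIT_CODE_LOAD record can be
packed. The field layout follows my reading of perf's jitdump
specification and should be checked against it. The function names,
the little-endian byte order and the x86-64 machine value are
assumptions made for the example, and this is not the code from the
pull request. JIT_CODE_UNWINDING_INFO records are deliberately left
out.

```python
import struct

JITDUMP_MAGIC = 0x4A695444   # "JiTD"
JITDUMP_VERSION = 1
JIT_CODE_LOAD = 0            # record id for a newly loaded function
EM_X86_64 = 62               # ELF e_machine value (assumed target)

# "<" = little-endian, no implicit padding (assumption: x86-64 host).
FILE_HEADER = struct.Struct("<IIIIIIQQ")   # magic, version, total_size,
                                           # elf_mach, pad1, pid,
                                           # timestamp, flags
RECORD_PREFIX = struct.Struct("<IIQ")      # id, total_size, timestamp
CODE_LOAD_BODY = struct.Struct("<IIQQQQ")  # pid, tid, vma, code_addr,
                                           # code_size, code_index


def pack_file_header(pid, timestamp, elf_mach=EM_X86_64, flags=0):
    """Return the fixed-size header that starts a jitdump file."""
    return FILE_HEADER.pack(JITDUMP_MAGIC, JITDUMP_VERSION,
                            FILE_HEADER.size, elf_mach, 0,
                            pid, timestamp, flags)


def pack_code_load(pid, tid, code_addr, code, code_index, name, timestamp):
    """Return one JIT_CODE_LOAD record for the machine code in `code`."""
    name_bytes = name.encode("utf-8") + b"\0"   # NUL-terminated name
    body = CODE_LOAD_BODY.pack(pid, tid, code_addr, code_addr,
                               len(code), code_index)
    # total_size covers the prefix, the body, the name and the code bytes.
    total_size = (RECORD_PREFIX.size + len(body)
                  + len(name_bytes) + len(code))
    prefix = RECORD_PREFIX.pack(JIT_CODE_LOAD, total_size, timestamp)
    return prefix + body + name_bytes + code
```

In a real agent the timestamps would need to come from the same clock
that perf record is told to use, and the records would be appended to
the dump file in order; both details are outside this sketch.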

* One problem I have struggled to debug is that, even when CPython is
compiled with frame pointers and does not emit JIT_CODE_UNWINDING_INFO
(so FP-based unwinding is used), in some cases (such as the latest
Ubuntu versions) the shared objects and addresses generated by
perf inject -j look sane and the addresses match, but perf report and
perf script do not include the symbol names in their output. I do not
understand why this is happening, and my debugging so far has not
given me any thread to pull, so I wanted to reach out to this mailing
list for assistance.

* The other problem is adding eh_frame data for our trampolines, since
deferring to FP unwinding looks unreliable even though our trampolines
are trivial. This is a separate problem, but my attempts so far to add
EH Frames have failed, and I don't know how to debug them properly, as
it looks like perf doesn't expect some of the values we are writing. I
would love it if someone with perf and DWARF expertise could help
steer the CPython team in the right direction.

Thanks a lot in advance,
Pablo Galindo Salgado


