linux-perf-users.vger.kernel.org archive mirror
* Re: Perf support for interpreted and Just-In-Time translated languages
From: Carl Love @ 2014-12-05 20:18 UTC
  To: linux-perf-users

> On 12/02/2014 08:36 PM, Brendan Gregg wrote:
>> G'Day Will,
>>
>> On Tue, Dec 2, 2014 at 1:08 PM, William Cohen <wcohen@redhat.com> wrote:
>>> perf makes use of the debug information provided by the compilers to
>>> map the addresses observed in the instruction pointer and on the stack
>>> back to source code.  This works very well for traditional compiled
>>> programs written in C and C++.  However, the assumption that the
>>> instruction address maps back to something the user wrote is not true
>>> for code written in interpreted languages such as Python, Perl, and
>>> Ruby or for Just-In-Time (JIT) runtime environments commonly used for
>>> Java.  The addresses would either map back to the interpreter runtime
>>> or dynamically generated code.  It would be really nice if perf was
>>> enhanced to provide data about where in the interpreted and JIT'ed
>>> code the processor was spending time.

I wholeheartedly agree. The ability to profile Java JITed code is a very big
deal for some perf users. I think perf should provide its own solution for
profiling Java JITed code that is well designed and well documented, instead of
directing users to something out-of-tree and out of perf's sphere of control.

>>
>> perf supports the /tmp/perf-PID.map files for JIT translations. It's
>> up to the runtimes to create these files.
>>
>> I was enhancing the Java perf-map-agent today
>> (https://github.com/jrudolph/perf-map-agent), and using it with perf.

Thanks for the pointer. I didn't know about this tool before. It's cool that
it has the ability to attach to a running JVM and create a /tmp/perf-<pid>.map
file -- i.e., can capture profile data without having to start the JVM with the
-agentpath or -agentlib option. But the downside is (as the documentation says) ...
  "Over time the JVM will JIT compile more methods and the perf-<pid>.map file
  will become stale. You need to rerun perf-java to generate a new and current map."
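
For readers unfamiliar with the map files mentioned above, here is a minimal
sketch of the format described in tools/perf/Documentation/jit-interface.txt:
one line per symbol, "START SIZE symbolname", with START and SIZE as hex
numbers without a 0x prefix and the symbol name taking the rest of the line.
The helper names, the PID, and the sample addresses below are hypothetical and
only illustrate the layout; this is not the code used by perf or perf-map-agent.

```python
import os
import tempfile


def format_map_entry(start, size, name):
    """Render one map line: hex START, hex SIZE (no 0x), then the symbol name."""
    return f"{start:x} {size:x} {name}\n"


def parse_map_line(line):
    """Split a map line into (start, size, name); the name may contain spaces."""
    start, size, name = line.rstrip("\n").split(" ", 2)
    return int(start, 16), int(size, 16), name


def append_entries(path, entries):
    """Append entries and flush so a reader does not see buffering lag."""
    with open(path, "a") as f:
        for start, size, name in entries:
            f.write(format_map_entry(start, size, name))
        f.flush()


def resolve(entries, addr):
    """Return the symbol whose [start, start + size) range contains addr."""
    for start, size, name in entries:
        if start <= addr < start + size:
            return name
    return None


def demo():
    # Hypothetical PID and JIT'ed methods, written to a scratch directory
    # rather than /tmp/perf-<pid>.map so the sketch has no side effects.
    path = os.path.join(tempfile.mkdtemp(), "perf-12345.map")
    append_entries(path, [
        (0x7F00A000, 0x40, "Ljava/lang/String;::hashCode"),
        (0x7F00A040, 0x100, "LFoo;::bar (I)V"),
    ])
    with open(path) as f:
        return [parse_map_line(line) for line in f]


if __name__ == "__main__":
    entries = demo()
    print(resolve(entries, 0x7F00A050))
```

Because the file is only a snapshot of the symbols known when it was written,
any method compiled afterwards is missing until the file is regenerated, which
is the staleness problem quoted above.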

>> perf doesn't seem to handle map files that grow (and overwrite
>> symbols) very well, so I had to create an extra step that cleaned up
>> the map file. I should write up the Java instructions somewhere.

Yes, oprofile has to handle that as well. It keeps track of how long
each symbol resides at the overwritten address, and then chooses the
one that was resident the longest to attribute samples to. It's of course not
perfect, but it's probably reasonable to do so.  The oprofile user manual
explains this (http://oprofile.sourceforge.net/doc/overlapping-symbols.html).
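
The "resident the longest" heuristic described above can be sketched roughly as
follows. This is only an illustration of the idea, not oprofile's actual
implementation: the Residency record, its field names, and the timestamps are
assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Residency:
    """One period during which a JIT'ed symbol occupied an address range."""
    name: str
    start: int
    size: int
    loaded_at: float    # hypothetical timestamp when the code was emitted
    unloaded_at: float  # hypothetical timestamp when it was overwritten

    def covers(self, addr):
        return self.start <= addr < self.start + self.size

    @property
    def lifetime(self):
        return self.unloaded_at - self.loaded_at


def attribute_sample(history, addr):
    """Pick the symbol that covered addr for the longest time, or None."""
    candidates = [r for r in history if r.covers(addr)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.lifetime).name


if __name__ == "__main__":
    history = [
        Residency("A.run", 0x1000, 0x100, loaded_at=0.0, unloaded_at=2.0),
        Residency("B.step", 0x1000, 0x80, loaded_at=2.0, unloaded_at=10.0),
    ]
    print(attribute_sample(history, 0x1010))
```

Note that the choice ignores when the sample was taken, so a sample that really
belonged to the short-lived symbol is misattributed; that is the imperfection
acknowledged above.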

>>
>> I did do a writeup for Node.js, whose v8 engine supports the perf map
>> files. See: http://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html
>>
>> Also see tools/perf/Documentation/jit-interface.txt
>>
>>> OProfile provides the ability to map samples from Java Runtime
>>> Environment (JRE) JIT code using a shared library agent loaded when
>>> the program starts executing.  The shared library uses the JVMTI or
>>> JVMPI interface to note the method that each region of JIT'ed code
>>> maps to.  This is later used to map the instruction pointer back to
>>> the appropriate Java method.  There is some information on how this is
>>> implemented at http://oprofile.sourceforge.net/doc/devel/index.html.
>>
>> Yes, that's exactly what perf-map-agent does (JVMTI). I only just

Similar, but not exactly. OProfile's Java agent library is passed to the JVM
on startup and is continuously used throughout the JVM's run time. It would be
ideal to have both this functionality and the attach functionality of perf-map-agent.

> OProfile provides two implementations of VM-specific libs -- one for pre-1.5 Java
> (using JVMPI interface) and another for 1.5 and later Java (using JVMTI interface).
> I know there are some other VM-specific agent libs that have been written (for mono
> and LLVM), but don't know how much they are used -- they were not contributed to
> oprofile.
>> created the pull request, but if you try perf-map-agent, you'll want
>> to use the fflush fix to avoid buffering lag
>> (https://github.com/jrudolph/perf-map-agent/pull/8).

There are a couple of other issues with the current techniques used by perf for profiling
JITed code (unless I'm missing something):
  - When are the /tmp/perf-<pid>.map files deleted?
  - How does this work for the offline analysis scenario (i.e., using 'perf archive')?
    Would the /tmp/perf-<pid>.map files have to be copied over to the host system where
    the analysis is being done?

                     Carl Love

* Perf support for interpreted and Just-In-Time translated languages
From: William Cohen @ 2014-12-02 21:08 UTC
  To: linux-perf-users

perf makes use of the debug information provided by the compilers to
map the addresses observed in the instruction pointer and on the stack
back to source code.  This works very well for traditional compiled
programs written in C and C++.  However, the assumption that the
instruction address maps back to something the user wrote is not true
for code written in interpreted languages such as Python, Perl, and
Ruby or for Just-In-Time (JIT) runtime environments commonly used for
Java.  The addresses would either map back to the interpreter runtime
or dynamically generated code.  It would be really nice if perf was
enhanced to provide data about where in the interpreted and JIT'ed
code the processor was spending time.

OProfile provides the ability to map samples from Java Runtime
Environment (JRE) JIT code using a shared library agent loaded when
the program starts executing.  The shared library uses the JVMTI or
JVMPI interface to note the method that each region of JIT'ed code
maps to.  This is later used to map the instruction pointer back to
the appropriate Java method.  There is some information on how this is
implemented at http://oprofile.sourceforge.net/doc/devel/index.html.

For traditional interpreters, the samples perf collects get mapped to the
internals of the interpreter.  Rather than getting samples that map
back to the developer's Ruby code, developers get samples that map
back to the internals of the Ruby interpreter, which they have little
control over or understanding of.  What would be desired is for each
memory map region or process to have something that indicates what
kind of information perf should record for a sample in that region or
process.  By default this would fall back on the traditional IP
sampling, but allow some user-space memory locations to be read for a
line number and dcookie for the file that the code came from instead.

-Will

