From: Brendan Gregg <brendan.d.gregg@gmail.com>
To: Carl Love <cel@us.ibm.com>
Cc: "linux-perf-use." <linux-perf-users@vger.kernel.org>
Subject: Re: Perf support for interpreted and Just-In-Time translated languages
Date: Fri, 5 Dec 2014 13:27:36 -0800
Message-ID: <CAE40pdcXt_gaCXD7n3UED4Y2_gdpH+A3S627TS-72xkbGDoNsA@mail.gmail.com>
In-Reply-To: <1417810736.5098.11.camel@oc0276584878.ibm.com>

G'Day Carl,

On Fri, Dec 5, 2014 at 12:18 PM, Carl Love <cel@us.ibm.com> wrote:
>
> > On 12/02/2014 08:36 PM, Brendan Gregg wrote:
> >> G'Day Will,
> >>
> >> On Tue, Dec 2, 2014 at 1:08 PM, William Cohen <wcohen@redhat.com> wrote:
> >>> perf makes use of the debug information provided by the compilers to
> >>> map the addresses observed in the instruction pointer and on the stack
> >>> back to source code.  This works very well for traditional compiled
> >>> programs written in c and c++.  However, the assumption that the
> >>> instruction address maps back to something the user wrote is not true
> >>> for code written in interpreted languages such as Python, Perl, and
> >>> Ruby, or for the Just-In-Time (JIT) runtime environments commonly used
> >>> for Java.  The addresses would map back either to the interpreter runtime
> >>> or to dynamically generated code.  It would be really nice if perf were
> >>> enhanced to provide data about where in the interpreted and JIT'ed
> >>> code the processor was spending time.
>
> I wholeheartedly agree. The ability to profile Java JITed code is a very big
> deal for some perf users. I think perf should provide its own well-designed,
> well-documented solution for profiling Java JITed code, instead of directing
> users to something out-of-tree and outside perf's sphere of control.

I posted a hotspot patch yesterday:
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2014-December/016477.html

Along with perf-map-agent for symbol translation, this lets perf
profile Java (noting the caveats in that email).

... There are a lot of exotic things we can do with perf, but I don't
think CPU stack profiling is one of them. I think if you have perf &
java, it should just work.
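
(For context on the map files discussed below: per
tools/perf/Documentation/jit-interface.txt, /tmp/perf-<pid>.map is plain
text, one "START SIZE symbolname" line per JITed method, with START and
SIZE in hex. A minimal Python sketch of the lookup perf effectively does --
the file path and symbol names in the usage comment are made up:)

def load_map(path):
    # each line: "START SIZE symbolname", START and SIZE in hex
    symbols = []
    with open(path) as f:
        for line in f:
            start, size, name = line.rstrip("\n").split(" ", 2)
            symbols.append((int(start, 16), int(size, 16), name))
    return symbols

def resolve(symbols, ip):
    # return the symbol whose [start, start + size) range covers ip
    for start, size, name in symbols:
        if start <= ip < start + size:
            return name
    return "[unknown]"

# hypothetical usage:
#   syms = load_map("/tmp/perf-1234.map")
#   print(resolve(syms, 0x7f9a4c01b010))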

> >>
> >> perf supports the /tmp/perf-PID.map files for JIT translations. It's
> >> up to the runtimes to create these files.
> >>
> >> I was enhancing the Java perf-map-agent today
> >> (https://github.com/jrudolph/perf-map-agent), and using it with perf.
>
> Thanks for the pointer. I didn't know about this tool before. It's cool that
> it can attach to a running JVM and create a /tmp/perf-<pid>.map file -- i.e.,
> you can capture profile data without having to start the JVM with the
> -agentpath or -agentlib option. But the downside is (as the documentation says) ...
>   "Over time the JVM will JIT compile more methods and the perf-<pid>.map file
>   will become stale. You need to rerun perf-java to generate a new and current map."

FWIW, there's a lot of churn in the first few minutes of java running
hot, as methods get compiled, but I've seen it settle down after 5
minutes for my workload. Still, it's something I'm keeping an eye on.
I can, at least, generate a map before and after profiling, and look
for changes.
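
(A rough sketch of that before/after check in Python -- my actual tooling
is ad hoc, and the snapshot file names here are hypothetical:)

# compare two /tmp/perf-<pid>.map snapshots taken before and after profiling
def read_map(path):
    syms = {}
    with open(path) as f:
        for line in f:
            start, size, name = line.rstrip("\n").split(" ", 2)
            syms[name] = (start, size)   # same-named symbols collapse to the last entry
    return syms

before = read_map("perf-1234.map.before")   # hypothetical snapshot names
after  = read_map("perf-1234.map.after")

moved = [n for n in before if n in after and before[n] != after[n]]
gone  = [n for n in before if n not in after]
new   = [n for n in after if n not in before]
print("moved: %d, gone: %d, new: %d" % (len(moved), len(gone), len(new)))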

>
>
> >> perf doesn't seem to handle map files that grow (and overwrite
> >> symbols) very well, so I had to create an extra step that cleaned up
> >> the map file. I should write up the Java instructions somewhere.
>
> Yes, oprofile has to handle that as well. It keeps track of how long
> each symbol resides at the overwritten address, and then chooses the
> one that was resident the longest to attribute samples to. It's of course not
> perfect, but it's a reasonable approach.  The oprofile user manual
> explains this (http://oprofile.sourceforge.net/doc/overlapping-symbols.html).

Hm, that is a bit odd. My dumb solution would have been to detect
symbols that have changed during profiling, and flag them in the
profile so the end-user would know to be dubious. The percentage is
pretty small, but YMMV.

If we were to look at timing, why not have JVMTI emit timestamped
method symbols, and then correlate them with perf's timestamped samples?
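
(Sketching that idea in Python -- nothing emits such timestamped symbol
events today, so the input format here is invented; the point is just
matching each sample against the symbol that owned the address at the
sample's time:)

# events:  (load_time, start, size, name) from a hypothetical timestamped map
# samples: (time, ip) from perf's timestamped samples
def attribute(events, samples):
    out = []
    for t, ip in samples:
        best = None
        for load_time, start, size, name in events:
            # the symbol must be loaded before the sample and cover the address;
            # if several qualify, the most recently loaded one wins
            if load_time <= t and start <= ip < start + size:
                if best is None or load_time > best[0]:
                    best = (load_time, name)
        out.append(best[1] if best else "[unknown]")
    return out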

>
> >>
> >> I did do a writeup for Node.js, whose v8 engine supports the perf map
> >> files. See: http://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html
> >>
> >> Also see tools/perf/Documentation/jit-interface.txt
> >>
> >>> OProfile provides the ability to map samples from Java Runtime
> >>> Environment (JRE) JIT code using a shared library agent loaded when
> >>> the program starts executing.  The shared library uses the JVMTI or
> >>> JVMPI interface to note the method that each region of JIT'ed code
> >>> maps to.  This is later used to map the instruction pointer back to
> >>> the appropriate Java method.  There is some information on how this is
> >>> implemented at http://oprofile.sourceforge.net/doc/devel/index.html.
> >>
> >> Yes, that's exactly what perf-map-agent does (JVMTI). I only just
>
> Similar, but not exactly. OProfile's Java agent library is passed to the JVM
> on startup and is continuously used throughout the JVM's run time. It would be
> ideal to have both this functionality and the attach functionality of perf-map-agent.

Actually, that is what perf-map-agent did do when I wrote this. :) It
was just changed, so that it now emits the map file on demand.

A motivating factor to change this was that the map file grew in such
a way that it confused perf_events, which then didn't translate symbols
properly. I haven't debugged it, but I suspect perf_events expects a
sorted map file, which this wasn't. I wrote a perl tool to tidy up the
map file, which then made perf_events work correctly. The other solution,
which is what perf-map-agent now does, is just to dump the whole map
file on demand, rather than growing it over time.
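
(The perl is not worth sharing, but the idea fits in a few lines of
Python -- keep only the newest entry for each start address and emit the
file sorted; a sketch, not a drop-in replacement:)

# tidy a /tmp/perf-<pid>.map that grew over time: later lines win, output sorted
import sys

last = {}
with open(sys.argv[1]) as f:
    for line in f:
        start, size, name = line.rstrip("\n").split(" ", 2)
        last[int(start, 16)] = (size, name)   # later entries overwrite earlier ones

for start in sorted(last):
    size, name = last[start]
    print("%x %s %s" % (start, size, name))

Run it over the stale map and redirect the output to a fresh copy before
running perf script or perf report.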

>
> > OProfile provides two implementations of VM-specific libs -- one for pre-1.5 Java
> > (using JVMPI interface) and another for 1.5 and later Java (using JVMTI interface).
> > I know there are some other VM-specific agent libs that have been written (for mono
> > and LLVM), but don't know how much they are used -- they were not contributed to
> > oprofile.
> >> created the pull request, but if you try perf-map-agent, you'll want
> >> to use the fflush fix to avoid buffering lag
> >> (https://github.com/jrudolph/perf-map-agent/pull/8).
>
> There are a couple other issues with the current techniques used by perf for profiling
> JITed code (unless I'm missing something):
>   - When are the /tmp/perf-<pid>.map files deleted?

That's up to the runtime agent. Currently never, so your /tmp slowly fills!

>   - How does this work for the offline analysis scenario (i.e., using 'perf archive')?
>     Would the /tmp/perf-<pid>.map files have to be copied over to the host system where
>     the analysis is being done?

Yes. I keep copies of the /tmp/perf-<pid>.map files along with the perf.data.
It might be worth adding an option to perf to change the base path for these
maps, so that I don't have to keep putting them in /tmp.
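
(What I do today amounts to this; a minimal sketch in Python -- it just
copies the maps next to perf.data, and you still have to restore them to
/tmp on the analysis host yourself:)

# gather JIT map files alongside perf.data (assumed to be in the current directory)
import glob, shutil

for m in glob.glob("/tmp/perf-*.map"):
    shutil.copy(m, ".")   # restore these to /tmp before 'perf report' on the analysis host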

Brendan

