Re: Perf support for interpreted and Just-In-Time translated languages

From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Carl Love <cel@us.ibm.com>, Pekka Enberg <penberg@kernel.org>,
	"linux-perf-use." <linux-perf-users@vger.kernel.org>
Subject: Re: Perf support for interpreted and Just-In-Time translated languages
Date: Tue, 9 Dec 2014 17:34:19 -0300
Message-ID: <20141209203419.GI4189@kernel.org>
In-Reply-To: <CAE40pdcXt_gaCXD7n3UED4Y2_gdpH+A3S627TS-72xkbGDoNsA@mail.gmail.com>

On Fri, Dec 05, 2014 at 01:27:36PM -0800, Brendan Gregg wrote:
> G'Day Carl,
> 
> On Fri, Dec 5, 2014 at 12:18 PM, Carl Love <cel@us.ibm.com> wrote:
> >
> > > On 12/02/2014 08:36 PM, Brendan Gregg wrote:
> > >> G'Day Will,
> > >>
> > >> On Tue, Dec 2, 2014 at 1:08 PM, William Cohen <wcohen@redhat.com> wrote:
> > >>> perf makes use of the debug information provided by the compilers to
> > >>> map the addresses observed in the instruction pointer and on the stack
> > >>> back to source code.  This works very well for traditional compiled
> > >>> programs written in C and C++.  However, the assumption that the
> > >>> instruction address maps back to something the user wrote is not true
> > >>> for code written in interpreted languages such as Python, Perl, and
> > >>> Ruby or for Just-In-Time (JIT) runtime environments commonly used for
> > >>> Java.  The addresses would either map back to the interpreter runtime
> > >>> or dynamically generated code.  It would be really nice if perf was
> > >>> enhanced to provide data about where in the interpreted and JIT'ed
> > >>> code the processor was spending time.
> >
> > I wholeheartedly agree. The ability to profile Java JITed code is a very big
> > deal for some perf users. I think perf should provide its own solution for
> > profiling Java JITed code that is well designed and well documented, instead of
> > directing users to something out-of-tree and out of perf's sphere of control.
> 
> I posted a hotspot patch yesterday:
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2014-December/016477.html
> 
> Along with perf-map-agent for symbol translation, this lets perf
> profile Java (noting the caveats in that email).
> 
> ... There are a lot of exotic things we can do with perf, but I don't
> think CPU stack profiling is one of them. I think if you have perf &
> java, it should just work.
> 
> > >>
> > >> perf supports the /tmp/perf-PID.map files for JIT translations. It's
> > >> up to the runtimes to create these files.
> > >>
> > >> I was enhancing the Java perf-map-agent today
> > >> (https://github.com/jrudolph/perf-map-agent), and using it with perf.
> >
> > Thanks for the pointer. I didn't know about this tool before. It's cool that
> > it has the ability to attach to a running JVM and create a /tmp/perf-<pid>.map
> > file -- i.e., can capture profile data without having to start the JVM with the
> > -agentpath or -agentlib option. But the downside is (as the documentation says) ...
> >   "Over time the JVM will JIT compile more methods and the perf-<pid>.map file
> >   will become stale. You need to rerun perf-java to generate a new and current map."
> 
> FWIW, there's a lot of churn in the first few minutes of java running
> hot, as methods get compiled, but I've seen it settle down after 5
> minutes for my workload. Still, it's something I'm keeping an eye on.
> I can, at least, generate a map before and after profiling, and look
> for changes.

Hmm, I wonder if we could attach a 'perf probe' (uprobe) to some JVM
method that is known to invalidate JITted code -> symtab mappings, and
use it as a PERF_RECORD_MMAP equivalent. I.e. we would know that the new
map overlaps the previous one and that the symtab is a new one for that
address range, just like we do for executable mmaps coming from the
kernel (PERF_RECORD_MMAP).
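
A minimal sketch of the overlap handling described above, assuming each
invalidation event delivers a (start, size, name) triple. The names JitSymtab
and on_code_event are hypothetical and this is not perf's actual data
structure; it only illustrates how a new range could supersede whatever it
overlaps, as a fresh executable mmap does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sym:
    start: int
    size: int
    name: str

    @property
    def end(self) -> int:
        return self.start + self.size


class JitSymtab:
    """Hypothetical address-range table, for illustration only."""

    def __init__(self) -> None:
        self.syms: List[Sym] = []

    def on_code_event(self, start: int, size: int, name: str) -> None:
        new = Sym(start, size, name)
        # Drop every symbol the new range overlaps: the new mapping
        # replaces whatever was previously known for those addresses.
        self.syms = [s for s in self.syms
                     if s.end <= new.start or s.start >= new.end]
        self.syms.append(new)

    def resolve(self, addr: int) -> Optional[str]:
        for s in self.syms:
            if s.start <= addr < s.end:
                return s.name
        return None


tab = JitSymtab()
tab.on_code_event(0x1000, 0x100, "Foo.bar")
tab.on_code_event(0x1100, 0x80, "Foo.baz")
tab.on_code_event(0x10c0, 0x80, "Qux.run")  # overlaps both earlier ranges
```

After the third event, addresses that belonged only to the evicted symbols no
longer resolve, which is the conservative choice: an unknown symbol rather
than a stale one.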
 
> > >> perf doesn't seem to handle map files that grow (and overwrite
> > >> symbols) very well, so I had to create an extra step that cleaned up
> > >> the map file. I should write up the Java instructions somewhere.
> >
> > Yes, oprofile has to handle that as well. It keeps track of how long
> > each symbol resides at the overwritten address, and then chooses the
> > one that was resident the longest to attribute samples to. It's of course not
> > perfect, but it's probably reasonable to do so.  The oprofile user manual
> > explains this (http://oprofile.sourceforge.net/doc/overlapping-symbols.html).
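
The longest-resident heuristic described above can be sketched roughly as
follows. This is a simplified illustration, not oprofile's code: it assumes
symbols overwrite each other at exactly the same start address, and the event
tuples and the function name longest_resident are made up for the example.

```python
from collections import defaultdict


def longest_resident(events, end_time):
    """events: iterable of (timestamp, start_addr, name) symbol loads.

    Returns {start_addr: name} choosing, per address, the symbol that
    was resident for the longest total time before end_time.
    """
    residency = defaultdict(float)  # (addr, name) -> total time resident
    current = {}                    # addr -> (name, loaded_at)
    for ts, addr, name in sorted(events):
        if addr in current:
            old_name, since = current[addr]
            residency[(addr, old_name)] += ts - since
        current[addr] = (name, ts)
    # Symbols still loaded at the end are resident until end_time.
    for addr, (name, since) in current.items():
        residency[(addr, name)] += end_time - since

    winners = {}
    for (addr, name), duration in residency.items():
        if addr not in winners or duration > winners[addr][1]:
            winners[addr] = (name, duration)
    return {addr: name for addr, (name, _) in winners.items()}


events = [
    (0.0, 0x2000, "A.f"),
    (1.0, 0x2000, "B.g"),  # overwrites A.f after 1 time unit
    (9.0, 0x2000, "C.h"),  # overwrites B.g after 8 time units
]
result = longest_resident(events, end_time=10.0)
```

Here B.g wins the address even though samples taken while A.f or C.h was
loaded would be misattributed to it, which is the imperfection noted above.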
 
> Hm, that is a bit odd. My dumb solution would have been to detect
> symbols that have changed during profiling, and flag them in the
> profile so the end-user would know to be dubious. The percentage is
> pretty small, but YMMV.
 
> If we were to look at timing, why not have JVMTI emit timestamped
> method symbols, and then correlate to perf's timestamped samples.
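
A minimal sketch of that correlation, assuming the agent could emit
(timestamp, start, size, name) records on the same clock perf uses for its
samples. That shared clock is an assumption, and TimedSymtab is a hypothetical
name, not an existing perf or JVMTI interface.

```python
import bisect


class TimedSymtab:
    """Resolve (addr, sample time) against timestamped symbol loads."""

    def __init__(self):
        self.loads = []  # kept sorted: (timestamp, start, size, name)

    def add(self, ts, start, size, name):
        bisect.insort(self.loads, (ts, start, size, name))

    def resolve(self, addr, sample_ts):
        # Index just past the last load with timestamp <= sample_ts.
        idx = bisect.bisect_right(self.loads, (sample_ts, float("inf")))
        # Walk backwards so the most recent covering load wins.
        for ts, start, size, name in reversed(self.loads[:idx]):
            if start <= addr < start + size:
                return name
        return None


symtab = TimedSymtab()
symtab.add(1.0, 0x3000, 0x40, "Old.m")
symtab.add(5.0, 0x3000, 0x40, "New.m")  # same range recompiled later
```

A sample at 0x3010 taken at time 3.0 resolves to the older method, while the
same address sampled at time 6.0 resolves to the newer one.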

I think we just need to intercept mmap reuses somehow... I wonder whether
there is already a DTrace probe for this (or whatever it may be called in
DTrace land).
 
> > >> I did do a writeup for Node.js, whose v8 engine supports the perf map
> > >> files. See: http://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html
> > >>
> > >> Also see tools/perf/Documentation/jit-interface.txt
> > >>
> > >>> OProfile provides the ability to map samples from Java Runtime
> > >>> Environment (JRE) JIT code using a shared library agent loaded when
> > >>> the program starts executing.  The shared library uses the JVMTI or
> > >>> JVMPI interface to note the method that each region of JIT'ed code
> > >>> maps to.  This is later used to map the instruction pointer back to
> > >>> the appropriate Java method.  There is some information on how this is
> > >>> implemented at http://oprofile.sourceforge.net/doc/devel/index.html.
> > >>
> > >> Yes, that's exactly what perf-map-agent does (JVMTI). I only just
> >
> > Similar, but not exactly. OProfile's Java agent library is passed to the JVM
> > on startup and is continuously used throughout the JVM's run time. It would be
> > ideal to have both this functionality and the attach functionality of perf-map-agent.
> 
> Actually, that is what perf-map-agent did do when I wrote this. :) It
> was just changed, so that it now emits the map file on demand.
> 
> A motivating factor to change this was that the map file grew in such
> a way that it confused perf_events, which didn't translate properly. I
> haven't debugged it, but I suspect perf_events expects a sorted map
> file, which this wasn't. I wrote a perl tool to tidy up the map file,

It shouldn't, as it goes on reading and adding it to a rbtree, which
sorts the symbols so that later we can lookup by addr.
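
To illustrate why input order should not matter, here is a rough sketch that
reads map lines in any order and then looks up by address. It uses a sorted
list with bisect as a stand-in for the rbtree, so it is not perf's
implementation. It assumes the "START SIZE symbolname" line format with
hexadecimal START and SIZE; the function names are made up for the example.

```python
import bisect


def parse_perf_map(text):
    """Parse 'START SIZE name' lines (hex START and SIZE) in any order."""
    syms = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, size = int(parts[0], 16), int(parts[1], 16)
        syms.append((start, size, parts[2]))
    syms.sort(key=lambda s: s[0])  # sort by start address
    return syms


def lookup(syms, addr):
    """Return the symbol name covering addr, or None."""
    starts = [s[0] for s in syms]
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        start, size, name = syms[i]
        if addr < start + size:
            return name
    return None


# Deliberately unsorted input.
map_text = "7f00002000 40 Foo.b\n7f00001000 80 Foo.a\n"
syms = parse_perf_map(map_text)
```

Note that this sketch does nothing about duplicate or overlapping entries,
which is the case a growing map file would actually produce.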

> which made perf_events then work correctly. The other solution, which
> is what perf-map-agent now does, is just to dump the whole map file on
> demand, rather than growing it over time.

> > > OProfile provides two implementations of VM-specific libs -- one for pre-1.5 Java
> > > (using JVMPI interface) and another for 1.5 and later Java (using JVMTI interface).
> > > I know there are some other VM-specific agent libs that have been written (for mono
> > > and LLVM), but don't know how much they are used -- they were not contributed to
> > > oprofile.
> > >> created the pull request, but if you try perf-map-agent, you'll want
> > >> to use the fflush fix to avoid buffering lag
> > >> (https://github.com/jrudolph/perf-map-agent/pull/8).
> >
> > There are a couple other issues with the current techniques used by perf for profiling
> > JITed code (unless I'm missing something):
> >   - When are the /tmp/perf-<pid>.map files deleted?
> 
> That's up to the runtime agent. Currently never, so your /tmp slowly fills!
> 
> >   - How does this work for the offline analysis scenario (i.e., using 'perf archive')?
> >     Would the /tmp/perf-<pid>.map files have to be copied over to the host system where
> >     the analysis is being done?
> 
> Yes. I keep copies of the perf.map along with the perf.data. It might
> be worth having an option to perf to change the base path for these
> maps, so that I didn't have to keep putting them in /tmp.

Right, this was not really designed; it was just a proof of concept for
JATO's needs, right Pekka?

- Arnaldo

