* [PATCH v7 0/4] perf: add support for profiling jitted code
@ 2015-10-01 6:45 Stephane Eranian
2015-10-01 6:45 ` [PATCH v7 1/5] perf tools: add Java demangling support Stephane Eranian
` (6 more replies)
0 siblings, 7 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-10-01 6:45 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, ak, jolsa, namhyung, cel, dsahern,
adrian.hunter, johnmccutchan, brendan.d.gregg
This patch series extends perf record/report/annotate to enable
profiling of jitted (just-in-time compiled) code. The current
perf tool provides very limited support for profiling jitted
code for some runtime environments. But the support is experimental
and cannot be used in complex environments. It relies on files
in /tmp, for instance. It does not support annotate mode or
rejitted code.
This patch series adds a better way of profiling jitted code
with the following advantages:
- support any jitted code environment (some with modifications)
- support Java runtime with JVMTI interface with no modifications
- provides a portable JVMTI agent library
- known to support V8 runtime
- known to support DART runtime
- supports code rejitting and code movements
- no files in /tmp
- meta-data file is unique to each run
- no changes to perf report/annotate
- support per-thread and system-wide profiling
- support monitoring of multiple simultaneous Jit runtimes
- source level view in perf annotate
The support is based on cooperation with the runtime. For Java runtimes,
supporting the JVMTI interface, there is no change necessary. For other
runtimes, modifications are necessary to emit the meta-data to support
symbolization, annotation, and source line correlation of the samples.
Those modifications are relatively straightforward; some have been
implemented in V8 and DART.
The jit environment emits a binary dump file which contains the jitted
code (in raw format) and meta-data describing the mapping of functions.
The binary format is documented in the jitdump.h header file. It is
adapted from the OProfile jitdump format.
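To make the idea of a dump record concrete, here is a minimal sketch of how a runtime might serialize one "code load" entry: a fixed prefix, the function name, then the raw code bytes. The struct and the names demo_prefix and demo_emit_code_load are hypothetical and deliberately simplified; the authoritative layout is the one in jitdump.h, which this sketch does not reproduce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record prefix; NOT the layout from jitdump.h. */
struct demo_prefix {
	uint32_t id;         /* record type, here always "code load" */
	uint32_t total_size; /* prefix + name + code, in bytes */
	uint64_t timestamp;  /* taken from the same clock as the perf samples */
	uint64_t code_addr;  /* address where the jitted code was placed */
	uint64_t code_size;  /* size of the raw code that follows the name */
};

#define DEMO_CODE_LOAD 0u

/*
 * Serialize one record into out: prefix, NUL-terminated function name,
 * then the raw machine code. Returns the number of bytes written, or 0
 * if out is too small.
 */
static size_t demo_emit_code_load(uint8_t *out, size_t out_len,
				  uint64_t timestamp, const char *name,
				  uint64_t addr, const void *code,
				  size_t code_len)
{
	struct demo_prefix p;
	size_t name_len, total;

	assert(out && name && code);

	name_len = strlen(name) + 1;
	total = sizeof(p) + name_len + code_len;
	if (total > out_len || total > UINT32_MAX)
		return 0;

	p.id = DEMO_CODE_LOAD;
	p.total_size = (uint32_t)total;
	p.timestamp = timestamp;
	p.code_addr = addr;
	p.code_size = code_len;

	memcpy(out, &p, sizeof(p));
	memcpy(out + sizeof(p), name, name_len);
	memcpy(out + sizeof(p) + name_len, code, code_len);
	return total;
}
```

Carrying total_size in every record lets a reader skip record types it does not understand, which is one plausible reason to keep a common prefix.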
To enable synchronization of the runtime MMAPs with those recorded by
the kernel on behalf of the perf tool, the runtime needs to timestamp
any record in the dump file using the same time source. This is possible
since Linux 4.1, where the kernel supports a per-event timestamp clock source.
In the case of the JVMTI agent, the clock used is CLOCK_MONOTONIC, thus
perf record is invoked with -k mono such that it matches the agent.
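As an illustration of the runtime side of this, the sketch below reads CLOCK_MONOTONIC and converts it to nanoseconds, which is the kind of value a runtime could stamp on each dump record. The helper name demo_monotonic_ns is hypothetical, and the agent in this series may obtain its timestamps differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Return the current CLOCK_MONOTONIC time in nanoseconds, or 0 on failure. */
static uint64_t demo_monotonic_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0;

	/* CLOCK_MONOTONIC counts up from an unspecified start point. */
	assert(ts.tv_sec >= 0);

	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

Because perf record -k mono asks the kernel to timestamp samples with the same clock, values produced this way can be compared directly with sample times.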
The current support only works when the runtime is monitored from
start to finish: perf record -k mono java -agentpath:libjvmti.so my_class.
Once the run is completed, the jitdump file needs to be injected into
the perf.data file. This is accomplished by using the perf inject command.
This will also generate an ELF image for each jitted function. The
injected MMAP records will point to these ELF images. The reasoning
behind using ELF images is that it makes processing for perf report
and annotate automatic and transparent. It also makes it easier to
package and analyze on a remote machine. Binutils tools can decode
the ELF images easily.
The reporting is unchanged, simply invoke perf report or perf annotate
on the modified perf.data file. The jitted code will appear symbolized
and the assembly view will display the instruction level profile and
source level profile.
As an added bonus, the series includes support for demangling function
signatures from OpenJDK.
Furthermore, we believe there is a way to skip the perf inject phase
and have perf report/annotate directly inject the MMAP records
on the fly during processing of the perf.data file. Perf report would
also generate the ELF files if necessary. Such an optimization would
make using this extension seamless in system-wide mode and larger
environments. This will be added in a later update as well.
In V2, we have switched to Pawel Moll and David Ahern's posix
clock kernel module instead. We have dropped the patch which
modified the arguments to map_init() because the change was
not used. We are not printing the return type of Java methods
anymore and have made the Java demangler a separate module.
We also rebased to 3.19.0+ from tip.git.
In V3, we switched to Pawel Moll's CLOCK_MONOTONIC perf
clock patches. These patches switch perf_events from sched_clock
to CLOCK_MONOTONIC, a clock source which is available to users.
In V4, we rebased to 4.0-rc5. We also simplified the process by
getting rid of the requirement to pass the jitdump file name to
perf inject. Now, perf inject automatically detects if jitdumps
were generated and it merges the relevant meta-data. This is
accomplished by having the jit runtime mmap the jitdump file
for the purpose of creating a MMAP record in the perf.data file.
That MMAP contains all the info to locate the jitdump file and
generate the ELF images for jitted functions.
In V5, we rebased to acme's perf/core branch (instead of tip.git).
We fixed some bswap issues, switched to using scnprintf() and fixed
formatting issues. Also made sure all the files were included in the
patches. We also fixed one error message in the JVMTI agent.
In V6, we switched back to using tip.git to leverage PeterZ's clockid
patch for perf_events in 4.0.0-rc6. Clock source can now be specified
per event and they are connected with the MONOTONIC Posix clock. We
leverage this extension to timestamp samples in the jit runtime and
correlate them with perf samples. Notice the -k mono option in perf
record example below.
In V7, we rebased to 4.3.0-rc3 using tip.git (at commit 0dc7757).
We fixed several issues in the agent. We also added source line
information in the jitdump file from the JVMTI agent. This is
still experimental and probably has some issues. The source
line info is encoded in DWARF2 format in each ELF image. The
code to do this is leveraged from OProfile with some fixes
and cleanups.
To use the new feature:
- need to run a 4.1 or later kernel
- compile perf
- cd tools/perf/jvmti; make; install wherever is appropriate
Example using openJDK:
$ perf record -k mono java -agentpath:libjvmti.so my_class
$ perf inject -i perf.data --jit -o perf.data.jitted
$ perf report -i perf.data.jitted
Thanks to all the contributors and testers. Special thanks
to PeterZ for adding the clock source to perf_events and solving
the problem of common timesource for user and kernel level samples.
Thanks to the OProfile authors for the DWARF2 source line code
generation.
Enjoy,
Stephane Eranian (5):
perf tools: add Java demangling support
perf tools: pass session to mmap processing code
perf inject: add jitdump mmap injection support
perf tools: add JVMTI agent library
perf/jit: add source line info support
tools/build/Makefile.feature | 2 +
tools/build/feature/Makefile | 4 +
tools/perf/Documentation/perf-inject.txt | 7 +
tools/perf/builtin-inject.c | 97 ++++-
tools/perf/builtin-script.c | 14 +-
tools/perf/config/Makefile | 11 +
tools/perf/jvmti/Makefile | 70 ++++
tools/perf/jvmti/jvmti_agent.c | 464 +++++++++++++++++++++
tools/perf/jvmti/jvmti_agent.h | 36 ++
tools/perf/jvmti/libjvmti.c | 304 ++++++++++++++
tools/perf/util/Build | 4 +
tools/perf/util/demangle-java.c | 199 +++++++++
tools/perf/util/demangle-java.h | 10 +
tools/perf/util/event.c | 6 +-
tools/perf/util/event.h | 6 +-
tools/perf/util/genelf.c | 448 +++++++++++++++++++++
tools/perf/util/genelf.h | 60 +++
tools/perf/util/genelf_debug.c | 610 ++++++++++++++++++++++++++++
tools/perf/util/jitdump.c | 666 +++++++++++++++++++++++++++++++
tools/perf/util/jitdump.h | 116 ++++++
tools/perf/util/session.c | 43 +-
tools/perf/util/symbol-elf.c | 3 +
tools/perf/util/tool.h | 11 +-
23 files changed, 3152 insertions(+), 39 deletions(-)
create mode 100644 tools/perf/jvmti/Makefile
create mode 100644 tools/perf/jvmti/jvmti_agent.c
create mode 100644 tools/perf/jvmti/jvmti_agent.h
create mode 100644 tools/perf/jvmti/libjvmti.c
create mode 100644 tools/perf/util/demangle-java.c
create mode 100644 tools/perf/util/demangle-java.h
create mode 100644 tools/perf/util/genelf.c
create mode 100644 tools/perf/util/genelf.h
create mode 100644 tools/perf/util/genelf_debug.c
create mode 100644 tools/perf/util/jitdump.c
create mode 100644 tools/perf/util/jitdump.h
--
1.9.1
* [PATCH v7 1/5] perf tools: add Java demangling support
2015-10-01 6:45 [PATCH v7 0/4] perf: add support for profiling jitted code Stephane Eranian
@ 2015-10-01 6:45 ` Stephane Eranian
2015-10-01 6:45 ` [PATCH v7 2/5] perf tools: pass session to mmap processing code Stephane Eranian
` (5 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-10-01 6:45 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, ak, jolsa, namhyung, cel, dsahern,
adrian.hunter, johnmccutchan, brendan.d.gregg
Add Java function descriptor demangling support,
something bfd cannot do.
Use the JAVA_DEMANGLE_NORET flag to avoid decoding the
return type of functions.
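For readers unfamiliar with JVM descriptors, the sketch below shows the encoding the demangler has to undo. It handles only the "(args)ret" part of a signature, maps primitive type codes with the same pairs as the base_types[] table in this patch, and counts the arguments. The helper names demo_java_base_type and demo_java_count_args are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Map a JVM primitive type code to a readable name, using the same
 * pairs as the base_types[] table in the patch. Returns NULL for
 * characters that are not primitive type codes.
 */
static const char *demo_java_base_type(char c)
{
	switch (c) {
	case 'B': return "byte";
	case 'C': return "char";
	case 'D': return "double";
	case 'F': return "float";
	case 'I': return "int";
	case 'J': return "long";
	case 'S': return "short";
	case 'Z': return "bool";
	default:  return NULL;
	}
}

/*
 * Count the arguments in a descriptor such as "(I[JLjava/lang/String;)V".
 * Returns -1 if the descriptor is malformed.
 */
static int demo_java_count_args(const char *desc)
{
	int n = 0;
	const char *p;

	assert(desc);
	if (*desc != '(')
		return -1;

	for (p = desc + 1; *p && *p != ')'; p++) {
		while (*p == '[')	/* array dimensions prefix the element type */
			p++;
		if (*p == 'L') {	/* a class type runs up to the next ';' */
			p = strchr(p, ';');
			if (!p)
				return -1;
		} else if (!demo_java_base_type(*p)) {
			return -1;
		}
		n++;
	}
	return *p == ')' ? n : -1;
}
```

For example, "(I[JLjava/lang/String;)V" describes a method taking an int, a long[] and a java.lang.String and returning void.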
Signed-off-by: Stephane Eranian <eranian@google.com>
---
tools/perf/util/Build | 1 +
tools/perf/util/demangle-java.c | 199 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/demangle-java.h | 10 ++
tools/perf/util/symbol-elf.c | 3 +
4 files changed, 213 insertions(+)
create mode 100644 tools/perf/util/demangle-java.c
create mode 100644 tools/perf/util/demangle-java.h
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 4bc7a9a..495a27a 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -107,6 +107,7 @@ libperf-y += scripting-engines/
libperf-$(CONFIG_PERF_REGS) += perf_regs.o
libperf-$(CONFIG_ZLIB) += zlib.o
libperf-$(CONFIG_LZMA) += lzma.o
+libperf-y += demangle-java.o
CFLAGS_config.o += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
CFLAGS_exec_cmd.o += -DPERF_EXEC_PATH="BUILD_STR($(perfexecdir_SQ))" -DPREFIX="BUILD_STR($(prefix_SQ))"
diff --git a/tools/perf/util/demangle-java.c b/tools/perf/util/demangle-java.c
new file mode 100644
index 0000000..19b7c06
--- /dev/null
+++ b/tools/perf/util/demangle-java.c
@@ -0,0 +1,199 @@
+#include <sys/types.h>
+#include <stdio.h>
+#include <string.h>
+#include "util.h"
+#include "debug.h"
+#include "symbol.h"
+
+#include "demangle-java.h"
+
+enum {
+ MODE_PREFIX=0,
+ MODE_CLASS=1,
+ MODE_FUNC=2,
+ MODE_TYPE=3,
+ MODE_CTYPE=3, /* class arg */
+};
+
+#define BASE_ENT(c, n) [c-'A']=n
+static const char *base_types['Z'-'A' + 1]={
+ BASE_ENT('B', "byte" ),
+ BASE_ENT('C', "char" ),
+ BASE_ENT('D', "double" ),
+ BASE_ENT('F', "float" ),
+ BASE_ENT('I', "int" ),
+ BASE_ENT('J', "long" ),
+ BASE_ENT('S', "short" ),
+ BASE_ENT('Z', "bool" ),
+};
+
+/*
+ * demangle Java symbol between str and end positions and stores
+ * up to maxlen characters into buf. The parser starts in mode.
+ *
+ * Use MODE_PREFIX to process entire prototype till end position
+ * Use MODE_TYPE to process return type if str starts on return type char
+ *
+ * Return:
+ * success: buf
+ * error : NULL
+ */
+static char *
+__demangle_java_sym(const char *str, const char *end, char *buf, int maxlen, int mode)
+{
+ int rlen = 0;
+ int array = 0;
+ int narg = 0;
+ const char *q;
+
+ if (!end)
+ end = str + strlen(str);
+
+ for (q = str; q != end; q++) {
+
+ if (rlen == (maxlen - 1))
+ break;
+
+ switch (*q) {
+ case 'L':
+ if (mode == MODE_PREFIX || mode == MODE_CTYPE) {
+ if (mode == MODE_CTYPE) {
+ if (narg)
+ rlen += scnprintf(buf + rlen, maxlen - rlen, ", ");
+ narg++;
+ }
+ rlen += scnprintf(buf + rlen, maxlen - rlen, "class ");
+ if (mode == MODE_PREFIX)
+ mode = MODE_CLASS;
+ } else
+ buf[rlen++] = *q;
+ break;
+ case 'B':
+ case 'C':
+ case 'D':
+ case 'F':
+ case 'I':
+ case 'J':
+ case 'S':
+ case 'Z':
+ if (mode == MODE_TYPE) {
+ if (narg)
+ rlen += scnprintf(buf + rlen, maxlen - rlen, ", ");
+ rlen += scnprintf(buf+rlen, maxlen - rlen, "%s", base_types[*q - 'A']);
+ while(array--)
+ rlen += scnprintf(buf + rlen, maxlen - rlen, "[]");
+ array = 0;
+ narg++;
+ } else
+ buf[rlen++] = *q;
+ break;
+ case 'V':
+ if (mode == MODE_TYPE) {
+ rlen += scnprintf(buf + rlen, maxlen - rlen, "void");
+ while(array--)
+ rlen += scnprintf(buf + rlen, maxlen - rlen, "[]");
+ array = 0;
+ } else
+ buf[rlen++] = *q;
+ break;
+ case '[':
+ if (mode != MODE_TYPE)
+ goto error;
+ array++;
+ break;
+ case '(':
+ if (mode != MODE_FUNC)
+ goto error;
+ buf[rlen++] = *q;
+ mode = MODE_TYPE;
+ break;
+ case ')':
+ if (mode != MODE_TYPE)
+ goto error;
+ buf[rlen++] = *q;
+ narg = 0;
+ break;
+ case ';':
+ if (mode != MODE_CLASS && mode != MODE_CTYPE)
+ goto error;
+ /* safe because at least one other char to process */
+ if (isalpha(*(q+1)))
+ rlen += scnprintf(buf + rlen, maxlen - rlen, ".");
+ if (mode == MODE_CLASS)
+ mode = MODE_FUNC;
+ else if (mode == MODE_CTYPE)
+ mode = MODE_TYPE;
+ break;
+ case '/':
+ if (mode != MODE_CLASS && mode != MODE_CTYPE)
+ goto error;
+ rlen += scnprintf(buf + rlen, maxlen - rlen, ".");
+ break;
+ default :
+ buf[rlen++] = *q;
+ }
+ }
+ buf[rlen] = '\0';
+ return buf;
+error:
+ return NULL;
+}
+
+/*
+ * Demangle Java function signature (openJDK, not GCJ)
+ * input:
+ * str: string to parse. String is not modified
+ * flags: combination of JAVA_DEMANGLE_* flags to modify demangling
+ * return:
+ * if input can be demangled, then a newly allocated string is returned.
+ * if input cannot be demangled, then NULL is returned
+ *
+ * Note: caller is responsible for freeing demangled string
+ */
+char *
+java_demangle_sym(const char *str, int flags)
+{
+ char *buf, *ptr;
+ char *p;
+ size_t len, l1 = 0;
+
+ if (!str)
+ return NULL;
+
+ /* find start of return type */
+ p = strrchr(str, ')');
+ if (!p)
+ return NULL;
+
+ /*
+ * expansion factor estimated to 3x
+ */
+ len = strlen(str) * 3 + 1;
+ buf = malloc(len);
+ if (!buf)
+ return NULL;
+
+ buf[0] = '\0';
+ if (!(flags & JAVA_DEMANGLE_NORET)) {
+ /*
+ * get return type first
+ */
+ ptr = __demangle_java_sym(p + 1, NULL, buf, len, MODE_TYPE);
+ if (!ptr)
+ goto error;
+
+ /* add space between return type and function prototype */
+ l1 = strlen(buf);
+ buf[l1++] = ' ';
+ }
+
+ /* process function up to return type */
+ ptr = __demangle_java_sym(str, p + 1, buf + l1, len - l1, MODE_PREFIX);
+ if (!ptr)
+ goto error;
+
+ return buf;
+error:
+ free(buf);
+ return NULL;
+}
diff --git a/tools/perf/util/demangle-java.h b/tools/perf/util/demangle-java.h
new file mode 100644
index 0000000..a981c1f
--- /dev/null
+++ b/tools/perf/util/demangle-java.h
@@ -0,0 +1,10 @@
+#ifndef __PERF_DEMANGLE_JAVA
+#define __PERF_DEMANGLE_JAVA 1
+/*
+ * demangle function flags
+ */
+#define JAVA_DEMANGLE_NORET 0x1 /* do not process return type */
+
+char * java_demangle_sym(const char *str, int flags);
+
+#endif /* __PERF_DEMANGLE_JAVA */
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 475d88d..3fce39f 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -6,6 +6,7 @@
#include <inttypes.h>
#include "symbol.h"
+#include "demangle-java.h"
#include "machine.h"
#include "vdso.h"
#include <symbol/kallsyms.h>
@@ -1070,6 +1071,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
demangle_flags = DMGL_PARAMS | DMGL_ANSI;
demangled = bfd_demangle(NULL, elf_name, demangle_flags);
+ if (demangled == NULL)
+ demangled = java_demangle_sym(elf_name, JAVA_DEMANGLE_NORET);
if (demangled != NULL)
elf_name = demangled;
}
--
1.9.1
* [PATCH v7 2/5] perf tools: pass session to mmap processing code
2015-10-01 6:45 [PATCH v7 0/4] perf: add support for profiling jitted code Stephane Eranian
2015-10-01 6:45 ` [PATCH v7 1/5] perf tools: add Java demangling support Stephane Eranian
@ 2015-10-01 6:45 ` Stephane Eranian
2015-10-01 6:45 ` [PATCH v7 3/5] perf inject: add jitdump mmap injection support Stephane Eranian
` (4 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-10-01 6:45 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, ak, jolsa, namhyung, cel, dsahern,
adrian.hunter, johnmccutchan, brendan.d.gregg
This patch passes the perf_session to the mmap processing callbacks.
This is needed in a later patch to get to the event list and sample_type.
A consequence of the patch is that the number of arguments to
machines__deliver_event() is reduced.
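The shape of the change can be modelled in a few lines. In the sketch below, the mmap handler takes the session as an extra parameter (as event_op4 does in the patch), and the delivery function needs only the session because the tool is reachable through it. All demo_* names are hypothetical stand-ins and do not correspond to the real perf structures.

```c
#include <assert.h>
#include <stddef.h>

struct demo_tool;

/* Hypothetical stand-in for struct perf_session: it owns the tool pointer. */
struct demo_session {
	struct demo_tool *tool;
	int nr_mmaps;	/* session state a handler may want to reach */
};

/* Like event_op4 in the patch: the handler receives the session as well. */
typedef int (*demo_mmap_op)(struct demo_tool *tool, const char *filename,
			    struct demo_session *session);

struct demo_tool {
	demo_mmap_op mmap;
};

/* Default handler: ignores the event, like process_event_mmap_stub(). */
static int demo_mmap_stub(struct demo_tool *tool, const char *filename,
			  struct demo_session *session)
{
	(void)tool; (void)filename; (void)session;
	return 0;
}

/* A handler that uses the session it is now given. */
static int demo_count_mmap(struct demo_tool *tool, const char *filename,
			   struct demo_session *session)
{
	(void)tool; (void)filename;
	session->nr_mmaps++;
	return 0;
}

/* Install the stub when the tool did not provide an mmap handler. */
static void demo_fill_defaults(struct demo_tool *tool)
{
	if (tool->mmap == NULL)
		tool->mmap = demo_mmap_stub;
}

/* Delivery takes only the session; the tool is looked up from it. */
static int demo_deliver_mmap(struct demo_session *session, const char *filename)
{
	struct demo_tool *tool;

	assert(session && session->tool && session->tool->mmap);
	tool = session->tool;
	return tool->mmap(tool, filename, session);
}
```

A handler no longer needs a container_of() detour to find the session, which is the simplification visible in the builtin-script.c hunks of this patch.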
Signed-off-by: Stephane Eranian <eranian@google.com>
---
tools/perf/builtin-inject.c | 14 ++++++++------
tools/perf/builtin-script.c | 14 ++++++--------
tools/perf/util/event.c | 6 ++++--
tools/perf/util/event.h | 6 ++++--
tools/perf/util/session.c | 43 +++++++++++++++++++++++++------------------
tools/perf/util/tool.h | 11 ++++++++---
6 files changed, 55 insertions(+), 39 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 0a945d2..db5393e 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -224,11 +224,12 @@ static int perf_event__repipe_sample(struct perf_tool *tool,
static int perf_event__repipe_mmap(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine)
+ struct machine *machine,
+ struct perf_session *session)
{
int err;
- err = perf_event__process_mmap(tool, event, sample, machine);
+ err = perf_event__process_mmap(tool, event, sample, machine, session);
perf_event__repipe(tool, event, sample, machine);
return err;
@@ -237,11 +238,12 @@ static int perf_event__repipe_mmap(struct perf_tool *tool,
static int perf_event__repipe_mmap2(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine)
+ struct machine *machine,
+ struct perf_session *session)
{
int err;
- err = perf_event__process_mmap2(tool, event, sample, machine);
+ err = perf_event__process_mmap2(tool, event, sample, machine, session);
perf_event__repipe(tool, event, sample, machine);
return err;
@@ -669,8 +671,8 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
struct perf_inject inject = {
.tool = {
.sample = perf_event__repipe_sample,
- .mmap = perf_event__repipe,
- .mmap2 = perf_event__repipe,
+ .mmap = perf_event__repipe_mmap,
+ .mmap2 = perf_event__repipe_mmap,
.comm = perf_event__repipe,
.fork = perf_event__repipe,
.exit = perf_event__repipe,
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 8ce1c6b..93d09f2 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -790,14 +790,13 @@ static int process_exit_event(struct perf_tool *tool,
static int process_mmap_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine)
+ struct machine *machine,
+ struct perf_session *session)
{
struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
- if (perf_event__process_mmap(tool, event, sample, machine) < 0)
+ if (perf_event__process_mmap(tool, event, sample, machine, session) < 0)
return -1;
thread = machine__findnew_thread(machine, event->mmap.pid, event->mmap.tid);
@@ -821,14 +820,13 @@ static int process_mmap_event(struct perf_tool *tool,
static int process_mmap2_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine)
+ struct machine *machine,
+ struct perf_session *session)
{
struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
- if (perf_event__process_mmap2(tool, event, sample, machine) < 0)
+ if (perf_event__process_mmap2(tool, event, sample, machine, session) < 0)
return -1;
thread = machine__findnew_thread(machine, event->mmap2.pid, event->mmap2.tid);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index b1bb348..555f876 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -786,7 +786,8 @@ size_t perf_event__fprintf_mmap2(union perf_event *event, FILE *fp)
int perf_event__process_mmap(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine)
+ struct machine *machine,
+ struct perf_session *session __maybe_unused)
{
return machine__process_mmap_event(machine, event, sample);
}
@@ -794,7 +795,8 @@ int perf_event__process_mmap(struct perf_tool *tool __maybe_unused,
int perf_event__process_mmap2(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine)
+ struct machine *machine,
+ struct perf_session *session __maybe_unused)
{
return machine__process_mmap2_event(machine, event, sample);
}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index a0dbcbd..a33d50d 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -434,11 +434,13 @@ int perf_event__process_switch(struct perf_tool *tool,
int perf_event__process_mmap(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine);
+ struct machine *machine,
+ struct perf_session *session);
int perf_event__process_mmap2(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
- struct machine *machine);
+ struct machine *machine,
+ struct perf_session *session);
int perf_event__process_fork(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 84a02eae..fcbf1a0 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -21,7 +21,6 @@
static int perf_session__deliver_event(struct perf_session *session,
union perf_event *event,
struct perf_sample *sample,
- struct perf_tool *tool,
u64 file_offset);
static int perf_session__open(struct perf_session *session)
@@ -108,7 +107,7 @@ static int ordered_events__deliver_event(struct ordered_events *oe,
}
return perf_session__deliver_event(session, event->event, &sample,
- session->tool, event->file_offset);
+ event->file_offset);
}
struct perf_session *perf_session__new(struct perf_data_file *file,
@@ -214,7 +213,6 @@ static int process_event_sample_stub(struct perf_tool *tool __maybe_unused,
dump_printf(": unhandled!\n");
return 0;
}
-
static int process_event_stub(struct perf_tool *tool __maybe_unused,
union perf_event *event __maybe_unused,
struct perf_sample *sample __maybe_unused,
@@ -224,6 +222,17 @@ static int process_event_stub(struct perf_tool *tool __maybe_unused,
return 0;
}
+
+static int process_event_mmap_stub(struct perf_tool *tool __maybe_unused,
+ union perf_event *event __maybe_unused,
+ struct perf_sample *sample __maybe_unused,
+ struct machine *machine __maybe_unused,
+ struct perf_session *session __maybe_unused)
+{
+ dump_printf(": unhandled!\n");
+ return 0;
+}
+
static int process_build_id_stub(struct perf_tool *tool __maybe_unused,
union perf_event *event __maybe_unused,
struct perf_session *session __maybe_unused)
@@ -301,9 +310,9 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
if (tool->sample == NULL)
tool->sample = process_event_sample_stub;
if (tool->mmap == NULL)
- tool->mmap = process_event_stub;
+ tool->mmap = process_event_mmap_stub;
if (tool->mmap2 == NULL)
- tool->mmap2 = process_event_stub;
+ tool->mmap2 = process_event_mmap_stub;
if (tool->comm == NULL)
tool->comm = process_event_stub;
if (tool->fork == NULL)
@@ -1047,12 +1056,14 @@ static int
&sample->read.one, machine);
}
-static int machines__deliver_event(struct machines *machines,
- struct perf_evlist *evlist,
+static int machines__deliver_event(struct perf_session *session,
union perf_event *event,
struct perf_sample *sample,
- struct perf_tool *tool, u64 file_offset)
+ u64 file_offset)
{
+ struct machines *machines = &session->machines;
+ struct perf_evlist *evlist = session->evlist;
+ struct perf_tool *tool = session->tool;
struct perf_evsel *evsel;
struct machine *machine;
@@ -1075,11 +1086,11 @@ static int machines__deliver_event(struct machines *machines,
}
return perf_evlist__deliver_sample(evlist, tool, event, sample, evsel, machine);
case PERF_RECORD_MMAP:
- return tool->mmap(tool, event, sample, machine);
+ return tool->mmap(tool, event, sample, machine, session);
case PERF_RECORD_MMAP2:
if (event->header.misc & PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT)
++evlist->stats.nr_proc_map_timeout;
- return tool->mmap2(tool, event, sample, machine);
+ return tool->mmap2(tool, event, sample, machine, session);
case PERF_RECORD_COMM:
return tool->comm(tool, event, sample, machine);
case PERF_RECORD_FORK:
@@ -1119,19 +1130,17 @@ static int machines__deliver_event(struct machines *machines,
static int perf_session__deliver_event(struct perf_session *session,
union perf_event *event,
struct perf_sample *sample,
- struct perf_tool *tool,
u64 file_offset)
{
int ret;
- ret = auxtrace__process_event(session, event, sample, tool);
+ ret = auxtrace__process_event(session, event, sample, session->tool);
if (ret < 0)
return ret;
if (ret > 0)
return 0;
- return machines__deliver_event(&session->machines, session->evlist,
- event, sample, tool, file_offset);
+ return machines__deliver_event(session, event, sample, file_offset);
}
static s64 perf_session__process_user_event(struct perf_session *session,
@@ -1189,14 +1198,13 @@ int perf_session__deliver_synth_event(struct perf_session *session,
struct perf_sample *sample)
{
struct perf_evlist *evlist = session->evlist;
- struct perf_tool *tool = session->tool;
events_stats__inc(&evlist->stats, event->header.type);
if (event->header.type >= PERF_RECORD_USER_TYPE_START)
return perf_session__process_user_event(session, event, 0);
- return machines__deliver_event(&session->machines, evlist, event, sample, tool, 0);
+ return machines__deliver_event(session, event, sample, 0);
}
static void event_swap(union perf_event *event, bool sample_id_all)
@@ -1295,8 +1303,7 @@ static s64 perf_session__process_event(struct perf_session *session,
return ret;
}
- return perf_session__deliver_event(session, event, &sample, tool,
- file_offset);
+ return perf_session__deliver_event(session, event, &sample, file_offset);
}
void perf_event_header__bswap(struct perf_event_header *hdr)
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index cab8cc2..b4f94fe 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -34,12 +34,15 @@ typedef int (*event_oe)(struct perf_tool *tool, union perf_event *event,
typedef s64 (*event_op3)(struct perf_tool *tool, union perf_event *event,
struct perf_session *session);
+
+typedef int (*event_op4)(struct perf_tool *tool, union perf_event *event,
+ struct perf_sample *sample, struct machine *machine,
+ struct perf_session *session);
+
struct perf_tool {
event_sample sample,
read;
- event_op mmap,
- mmap2,
- comm,
+ event_op comm,
fork,
exit,
lost,
@@ -57,6 +60,8 @@ struct perf_tool {
auxtrace_info,
auxtrace_error;
event_op3 auxtrace;
+ event_op4 mmap,
+ mmap2;
bool ordered_events;
bool ordering_requires_timestamps;
};
--
1.9.1
* [PATCH v7 3/5] perf inject: add jitdump mmap injection support
2015-10-01 6:45 [PATCH v7 0/4] perf: add support for profiling jitted code Stephane Eranian
2015-10-01 6:45 ` [PATCH v7 1/5] perf tools: add Java demangling support Stephane Eranian
2015-10-01 6:45 ` [PATCH v7 2/5] perf tools: pass session to mmap processing code Stephane Eranian
@ 2015-10-01 6:45 ` Stephane Eranian
2015-10-09 13:17 ` Adrian Hunter
2015-10-01 6:45 ` [PATCH v7 4/5] perf tools: add JVMTI agent library Stephane Eranian
` (3 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Stephane Eranian @ 2015-10-01 6:45 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, ak, jolsa, namhyung, cel, dsahern,
adrian.hunter, johnmccutchan, brendan.d.gregg
This patch adds a --jit option to perf inject.
This option injects MMAP records into the perf.data
file to cover the jitted code mmaps. It also emits
ELF images for each function in the jitdump file.
Those images are created where the jitdump file is.
The MMAP records point to that location as well.
Typical flow:
$ perf record -k mono -- java -agentpath:libpjvmti.so java_class
$ perf inject --jit -i perf.data -o perf.data.jitted
$ perf report -i perf.data.jitted
Note that jitdump.h support is not limited to Java; it works with
any jitted environment modified to emit the jitdump file format,
including those where code can be jitted multiple times and moved
around.
The jitdump.h format is adapted from the OProfile project.
The genelf.c (ELF binary generation) depends on MD5 hash
encoding for the buildid. To enable this, libssl-dev must
be installed. If not, then genelf.c defaults to using
urandom to generate the buildid, which is not ideal.
The Makefile auto-detects the presence of libssl-dev.
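The urandom fallback mentioned above could look roughly like the following sketch, which fills a build-id buffer with random bytes. The helper name demo_random_buildid is hypothetical, and the code in genelf.c may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Fill id[0..len) with bytes read from /dev/urandom.
 * Returns 0 on success, -1 if the device cannot be opened or read.
 */
static int demo_random_buildid(unsigned char *id, size_t len)
{
	FILE *fp;
	size_t n;

	assert(id && len > 0);

	fp = fopen("/dev/urandom", "rb");
	if (!fp)
		return -1;

	n = fread(id, 1, len, fp);
	fclose(fp);

	return n == len ? 0 : -1;
}
```

A random id changes on every run even when the jitted code is identical, whereas a hash of the code would stay stable; this is presumably why the fallback is described as not ideal.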
This version mmaps the jitdump file to create a marker
MMAP record in the perf.data file. The marker is used to detect
jitdump and cause perf inject to inject the jitted mmaps and
generate ELF images for jitted functions.
Signed-off-by: Stephane Eranian <eranian@google.com>
---
tools/build/Makefile.feature | 2 +
tools/build/feature/Makefile | 4 +
tools/perf/Documentation/perf-inject.txt | 7 +
tools/perf/builtin-inject.c | 83 ++++
tools/perf/config/Makefile | 11 +
tools/perf/util/Build | 2 +
tools/perf/util/genelf.c | 441 ++++++++++++++++++++
tools/perf/util/genelf.h | 56 +++
tools/perf/util/jitdump.c | 664 +++++++++++++++++++++++++++++++
tools/perf/util/jitdump.h | 116 ++++++
10 files changed, 1386 insertions(+)
create mode 100644 tools/perf/util/genelf.c
create mode 100644 tools/perf/util/genelf.h
create mode 100644 tools/perf/util/jitdump.c
create mode 100644 tools/perf/util/jitdump.h
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 72817e4..a245539 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -46,6 +46,7 @@ FEATURE_TESTS ?= \
libpython \
libpython-version \
libslang \
+ libcrypto \
libunwind \
pthread-attr-setaffinity-np \
stackprotector-all \
@@ -67,6 +68,7 @@ FEATURE_DISPLAY ?= \
libperl \
libpython \
libslang \
+ libcrypto \
libunwind \
libdw-dwarf-unwind \
zlib \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index e43a297..fa3b6c0 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -24,6 +24,7 @@ FILES= \
test-libpython.bin \
test-libpython-version.bin \
test-libslang.bin \
+ test-libcrypto.bin \
test-libunwind.bin \
test-libunwind-debug-frame.bin \
test-pthread-attr-setaffinity-np.bin \
@@ -104,6 +105,9 @@ endif
test-libslang.bin:
$(BUILD) -I/usr/include/slang -lslang
+test-libcrypto.bin:
+ $(BUILD) -lcrypto
+
test-gtk2.bin:
$(BUILD) $(shell $(PKG_CONFIG) --libs --cflags gtk+-2.0 2>/dev/null)
diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
index 0b1cede..87b2588 100644
--- a/tools/perf/Documentation/perf-inject.txt
+++ b/tools/perf/Documentation/perf-inject.txt
@@ -53,6 +53,13 @@ include::itrace.txt[]
--strip::
Use with --itrace to strip out non-synthesized events.
+-j::
+--jit::
+ Process jitdump files by injecting the mmap records corresponding to jitted
+ functions. This option also generates the ELF images for each jitted function
+ found in the jitdump files captured in the input perf.data file. Use this option
+ if you are monitoring an environment using JIT runtimes, such as Java, DART or V8.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-archive[1]
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index db5393e..8120ab7 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -17,6 +17,7 @@
#include "util/build-id.h"
#include "util/data.h"
#include "util/auxtrace.h"
+#include "util/jit.h"
#include "util/parse-options.h"
@@ -29,6 +30,7 @@ struct perf_inject {
bool sched_stat;
bool have_auxtrace;
bool strip;
+ bool jit_mode;
const char *input_name;
struct perf_data_file output;
u64 bytes_written;
@@ -235,6 +237,26 @@ static int perf_event__repipe_mmap(struct perf_tool *tool,
return err;
}
+static int perf_event__jit_repipe_mmap(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine,
+ struct perf_session *session)
+{
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
+ u64 n = 0;
+
+ /*
+ * if jit marker, then inject jit mmaps and generate ELF images
+ */
+ if (!jit_process(session, &inject->output, machine,
+ event->mmap.filename, sample->pid, &n)) {
+ inject->bytes_written += n;
+ return 0;
+ }
+ return perf_event__repipe(tool, event, sample, machine);
+}
+
static int perf_event__repipe_mmap2(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -249,6 +271,26 @@ static int perf_event__repipe_mmap2(struct perf_tool *tool,
return err;
}
+static int perf_event__jit_repipe_mmap2(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine,
+ struct perf_session *session)
+{
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
+ u64 n = 0;
+
+ /*
+ * if jit marker, then inject jit mmaps and generate ELF images
+ */
+ if (!jit_process(session, &inject->output, machine,
+ event->mmap2.filename, sample->pid, &n)) {
+ inject->bytes_written += n;
+ return 0;
+ }
+ return perf_event__repipe(tool, event, sample, machine);
+}
+
static int perf_event__repipe_fork(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -666,6 +708,21 @@ static int __cmd_inject(struct perf_inject *inject)
return ret;
}
+static int
+jit_validate_events(struct perf_session *session)
+{
+ struct perf_evsel *evsel;
+
+ /*
+ * check that all events use CLOCK_MONOTONIC
+ */
+ evlist__for_each(session->evlist, evsel) {
+ if (evsel->attr.use_clockid == 0 || evsel->attr.clockid != CLOCK_MONOTONIC)
+ return -1;
+ }
+ return 0;
+}
+
int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
{
struct perf_inject inject = {
@@ -714,6 +771,7 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('s', "sched-stat", &inject.sched_stat,
"Merge sched-stat and sched-switch for getting events "
"where and how long tasks slept"),
+ OPT_BOOLEAN('j', "jit", &inject.jit_mode, "merge jitdump files into perf.data file"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show build ids, etc)"),
OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name, "file",
@@ -756,6 +814,31 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
if (inject.session == NULL)
return -1;
+ if (inject.build_ids) {
+ /*
+ * make sure the mmap records are processed in timestamp
+ * order, which matters especially for jitted code
+ * mmaps. We cannot generate the buildid hit list and
+ * inject the jit mmaps at the same time for now.
+ */
+ inject.tool.ordered_events = true;
+ inject.tool.ordering_requires_timestamps = true;
+ }
+
+ if (inject.jit_mode) {
+ /*
+ * validate event is using the correct clockid
+ */
+ if (jit_validate_events(inject.session)) {
+ fprintf(stderr, "error, jitted code must be sampled with perf record -k 1\n");
+ return -1;
+ }
+ inject.tool.mmap2 = perf_event__jit_repipe_mmap2;
+ inject.tool.mmap = perf_event__jit_repipe_mmap;
+ inject.tool.ordered_events = true;
+ inject.tool.ordering_requires_timestamps = true;
+ }
+
ret = symbol__init(&inject.session->header.env);
if (ret < 0)
goto out_delete;
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index ab09ada..2858424 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -380,6 +380,17 @@ ifndef NO_LIBAUDIT
endif
endif
+ifndef NO_LIBCRYPTO
+ ifneq ($(feature-libcrypto), 1)
+ msg := $(warning No libcrypto.h found, disables jitted code injection, please install libssl-devel or libssl-dev);
+ NO_LIBCRYPTO := 1
+ else
+ CFLAGS += -DHAVE_LIBCRYPTO_SUPPORT
+ EXTLIBS += -lcrypto
+ $(call detected,CONFIG_CRYPTO)
+ endif
+endif
+
ifdef NO_NEWT
NO_SLANG=1
endif
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 495a27a..9fd4906 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -108,6 +108,8 @@ libperf-$(CONFIG_PERF_REGS) += perf_regs.o
libperf-$(CONFIG_ZLIB) += zlib.o
libperf-$(CONFIG_LZMA) += lzma.o
libperf-y += demangle-java.o
+libperf-y += jitdump.o
+libperf-y += genelf.o
CFLAGS_config.o += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
CFLAGS_exec_cmd.o += -DPERF_EXEC_PATH="BUILD_STR($(perfexecdir_SQ))" -DPREFIX="BUILD_STR($(prefix_SQ))"
diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
new file mode 100644
index 0000000..3832cd0
--- /dev/null
+++ b/tools/perf/util/genelf.c
@@ -0,0 +1,441 @@
+/*
+ * genelf.c
+ * Copyright (C) 2014, Google, Inc
+ *
+ * Contributed by:
+ * Stephane Eranian <eranian@gmail.com>
+ *
+ * Released under the GPL v2. (and only v2, not any later version)
+ */
+
+#include <sys/types.h>
+#include <stdio.h>
+#include <getopt.h>
+#include <stddef.h>
+#include <libelf.h>
+#include <string.h>
+#include <stdlib.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <err.h>
+#include <dwarf.h>
+
+#include "perf.h"
+#include "genelf.h"
+#include "../util/jitdump.h"
+
+#define JVMTI
+
+#define BUILD_ID_URANDOM /* different uuid for each run */
+
+#ifdef HAVE_LIBCRYPTO_SUPPORT
+
+#define BUILD_ID_MD5
+#undef BUILD_ID_SHA /* does not seem to work well when linked with Java */
+#undef BUILD_ID_URANDOM /* different uuid for each run */
+
+#ifdef BUILD_ID_SHA
+#include <openssl/sha.h>
+#endif
+
+#ifdef BUILD_ID_MD5
+#include <openssl/md5.h>
+#endif
+#endif
+
+
+typedef struct {
+ unsigned int namesz; /* Size of entry's owner string */
+ unsigned int descsz; /* Size of the note descriptor */
+ unsigned int type; /* Interpretation of the descriptor */
+ char name[0]; /* Start of the name+desc data */
+} Elf_Note;
+
+struct options {
+ char *output;
+ int fd;
+};
+
+static char shd_string_table[] = {
+ 0,
+ '.', 't', 'e', 'x', 't', 0, /* 1 */
+ '.', 's', 'h', 's', 't', 'r', 't', 'a', 'b', 0, /* 7 */
+ '.', 's', 'y', 'm', 't', 'a', 'b', 0, /* 17 */
+ '.', 's', 't', 'r', 't', 'a', 'b', 0, /* 25 */
+ '.', 'n', 'o', 't', 'e', '.', 'g', 'n', 'u', '.', 'b', 'u', 'i', 'l', 'd', '-', 'i', 'd', 0, /* 33 */
+ '.', 'd', 'e', 'b', 'u', 'g', '_', 'l', 'i', 'n', 'e', 0, /* 52 */
+ '.', 'd', 'e', 'b', 'u', 'g', '_', 'i', 'n', 'f', 'o', 0, /* 64 */
+ '.', 'd', 'e', 'b', 'u', 'g', '_', 'a', 'b', 'b', 'r', 'e', 'v', 0, /* 76 */
+};
+
+static struct buildid_note {
+ Elf_Note desc; /* descsz: size of build-id, must be multiple of 4 */
+ char name[4]; /* GNU\0 */
+ char build_id[20];
+} bnote;
+
+static Elf_Sym symtab[]={
+ /* symbol 0 MUST be the undefined symbol */
+ { .st_name = 0, /* index in sym_string table */
+ .st_info = ELF_ST_TYPE(STT_NOTYPE),
+ .st_shndx = 0, /* for now */
+ .st_value = 0x0,
+ .st_other = ELF_ST_VIS(STV_DEFAULT),
+ .st_size = 0,
+ },
+ { .st_name = 1, /* index in sym_string table */
+ .st_info = ELF_ST_BIND(STB_LOCAL) | ELF_ST_TYPE(STT_FUNC),
+ .st_shndx = 1,
+ .st_value = 0, /* for now */
+ .st_other = ELF_ST_VIS(STV_DEFAULT),
+ .st_size = 0, /* for now */
+ }
+};
+
+#ifdef BUILD_ID_URANDOM
+static void
+gen_build_id(struct buildid_note *note,
+ unsigned long load_addr __maybe_unused,
+ const void *code __maybe_unused,
+ size_t csize __maybe_unused)
+{
+ int fd;
+ size_t sz = sizeof(note->build_id);
+ ssize_t sret;
+
+ fd = open("/dev/urandom", O_RDONLY);
+ if (fd == -1)
+ err(1, "cannot access /dev/urandom for build-id");
+
+ sret = read(fd, note->build_id, sz);
+
+ close(fd);
+
+ if (sret != (ssize_t)sz)
+ memset(note->build_id, 0, sz);
+}
+#endif
+
+#ifdef BUILD_ID_SHA
+static void
+gen_build_id(struct buildid_note *note,
+ unsigned long load_addr __maybe_unused,
+ const void *code,
+ size_t csize)
+{
+ if (sizeof(note->build_id) < SHA_DIGEST_LENGTH)
+ errx(1, "build_id too small for SHA1");
+
+ SHA1(code, csize, (unsigned char *)note->build_id);
+}
+#endif
+
+#ifdef BUILD_ID_MD5
+static void
+gen_build_id(struct buildid_note *note, unsigned long load_addr, const void *code, size_t csize)
+{
+ MD5_CTX context;
+
+ if (sizeof(note->build_id) < 16)
+ errx(1, "build_id too small for MD5");
+
+ MD5_Init(&context);
+ MD5_Update(&context, &load_addr, sizeof(load_addr));
+ MD5_Update(&context, code, csize);
+ MD5_Final((unsigned char *)note->build_id, &context);
+}
+#endif
+
+/*
+ * fd: file descriptor open for writing for the output file
+ * load_addr: code load address (could be zero, just used for buildid)
+ * sym: function name (for native code - used as the symbol)
+ * code: the native code
+ * csize: the code size in bytes
+ */
+int
+jit_write_elf(int fd, uint64_t load_addr, const char *sym,
+ const void *code, int csize)
+{
+ Elf *e;
+ Elf_Data *d;
+ Elf_Scn *scn;
+ Elf_Ehdr *ehdr;
+ Elf_Shdr *shdr;
+ char *strsym = NULL;
+ int symlen;
+ int retval = -1;
+
+ if (elf_version(EV_CURRENT) == EV_NONE) {
+ warnx("ELF initialization failed");
+ return -1;
+ }
+
+ e = elf_begin(fd, ELF_C_WRITE, NULL);
+ if (!e) {
+ warnx("elf_begin failed");
+ goto error;
+ }
+
+ /*
+ * setup ELF header
+ */
+ ehdr = elf_newehdr(e);
+ if (!ehdr) {
+ warnx("cannot get ehdr");
+ goto error;
+ }
+
+ ehdr->e_ident[EI_DATA] = GEN_ELF_ENDIAN;
+ ehdr->e_ident[EI_CLASS] = GEN_ELF_CLASS;
+ ehdr->e_machine = GEN_ELF_ARCH;
+ ehdr->e_type = ET_DYN;
+ ehdr->e_entry = 0x0;
+ ehdr->e_version = EV_CURRENT;
+ ehdr->e_shstrndx= 2; /* shdr index for section name */
+
+ /*
+ * setup text section
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 16;
+ d->d_off = 0LL;
+ d->d_buf = (void *)code;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = csize;
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 1;
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_addr = 0; /* must be zero or == sh_offset -> dynamic object */
+ shdr->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup section headers string table
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 1;
+ d->d_off = 0LL;
+ d->d_buf = shd_string_table;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = sizeof(shd_string_table);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 7; /* offset of '.shstrtab' in shd_string_table */
+ shdr->sh_type = SHT_STRTAB;
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup symtab section
+ */
+ symtab[1].st_size = csize;
+
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 8;
+ d->d_off = 0LL;
+ d->d_buf = symtab;
+ d->d_type = ELF_T_SYM;
+ d->d_size = sizeof(symtab);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 17; /* offset of '.symtab' in shd_string_table */
+ shdr->sh_type = SHT_SYMTAB;
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = sizeof(Elf_Sym);
+ shdr->sh_link = 4; /* index of .strtab section */
+
+ /*
+ * setup symbols string table
+ * 2 = 1 for 0 in 1st entry, 1 for the 0 at end of symbol for 2nd entry
+ */
+ symlen = 2 + strlen(sym);
+ strsym = calloc(1, symlen);
+ if (!strsym) {
+ warnx("cannot allocate strsym");
+ goto error;
+ }
+ strcpy(strsym + 1, sym);
+
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 1;
+ d->d_off = 0LL;
+ d->d_buf = strsym;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = symlen;
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 25; /* offset in shd_string_table */
+ shdr->sh_type = SHT_STRTAB;
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup build-id section
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ /*
+ * build-id generation
+ */
+ gen_build_id(&bnote, load_addr, code, csize);
+ bnote.desc.namesz = sizeof(bnote.name); /* must include 0 termination */
+ bnote.desc.descsz = sizeof(bnote.build_id);
+ bnote.desc.type = NT_GNU_BUILD_ID;
+ strcpy(bnote.name, "GNU");
+
+ d->d_align = 4;
+ d->d_off = 0LL;
+ d->d_buf = &bnote;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = sizeof(bnote);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 33; /* offset in shd_string_table */
+ shdr->sh_type = SHT_NOTE;
+ shdr->sh_addr = 0x0;
+ shdr->sh_flags = SHF_ALLOC;
+ shdr->sh_size = sizeof(bnote);
+ shdr->sh_entsize = 0;
+
+ if (elf_update(e, ELF_C_WRITE) < 0) {
+ warnx("elf_update 4 failed");
+ goto error;
+ }
+
+ retval = 0;
+error:
+ (void)elf_end(e);
+
+ free(strsym);
+
+
+ return retval;
+}
+
+#ifndef JVMTI
+
+static unsigned char x86_code[] = {
+ 0xBB, 0x2A, 0x00, 0x00, 0x00, /* movl $42, %ebx */
+ 0xB8, 0x01, 0x00, 0x00, 0x00, /* movl $1, %eax */
+ 0xCD, 0x80 /* int $0x80 */
+};
+
+static struct options options;
+
+int main(int argc, char **argv)
+{
+ int c, fd, ret;
+
+ while ((c = getopt(argc, argv, "o:h")) != -1) {
+ switch (c) {
+ case 'o':
+ options.output = optarg;
+ break;
+ case 'h':
+ printf("Usage: genelf -o output_file [-h]\n");
+ return 0;
+ default:
+ errx(1, "unknown option");
+ }
+ }
+
+ fd = open(options.output, O_CREAT|O_TRUNC|O_RDWR, 0666);
+ if (fd == -1)
+ err(1, "cannot create file %s", options.output);
+
+ ret = jit_write_elf(fd, 0, "main", x86_code, sizeof(x86_code));
+ close(fd);
+
+ if (ret != 0)
+ unlink(options.output);
+
+ return ret;
+}
+#endif
diff --git a/tools/perf/util/genelf.h b/tools/perf/util/genelf.h
new file mode 100644
index 0000000..79b89e2
--- /dev/null
+++ b/tools/perf/util/genelf.h
@@ -0,0 +1,56 @@
+#ifndef __GENELF_H__
+#define __GENELF_H__
+
+/* genelf.c */
+extern int jit_write_elf(int fd, uint64_t code_addr, const char *sym,
+ const void *code, int csize);
+
+#if defined(__arm__)
+#define GEN_ELF_ARCH EM_ARM
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS32
+#elif defined(__x86_64__)
+#define GEN_ELF_ARCH EM_X86_64
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS64
+#elif defined(__i386__)
+#define GEN_ELF_ARCH EM_386
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS32
+#elif defined(__ppcle__)
+#define GEN_ELF_ARCH EM_PPC
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS64
+#elif defined(__powerpc__)
+#define GEN_ELF_ARCH EM_PPC64
+#define GEN_ELF_ENDIAN ELFDATA2MSB
+#define GEN_ELF_CLASS ELFCLASS64
+#elif defined(__powerpcle__)
+#define GEN_ELF_ARCH EM_PPC64
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS64
+#else
+#error "unsupported architecture"
+#endif
+
+#if GEN_ELF_CLASS == ELFCLASS64
+#define elf_newehdr elf64_newehdr
+#define elf_getshdr elf64_getshdr
+#define Elf_Ehdr Elf64_Ehdr
+#define Elf_Shdr Elf64_Shdr
+#define Elf_Sym Elf64_Sym
+#define ELF_ST_TYPE(a) ELF64_ST_TYPE(a)
+#define ELF_ST_BIND(a) ELF64_ST_BIND(a)
+#define ELF_ST_VIS(a) ELF64_ST_VISIBILITY(a)
+#else
+#define elf_newehdr elf32_newehdr
+#define elf_getshdr elf32_getshdr
+#define Elf_Ehdr Elf32_Ehdr
+#define Elf_Shdr Elf32_Shdr
+#define Elf_Sym Elf32_Sym
+#define ELF_ST_TYPE(a) ELF32_ST_TYPE(a)
+#define ELF_ST_BIND(a) ELF32_ST_BIND(a)
+#define ELF_ST_VIS(a) ELF32_ST_VISIBILITY(a)
+#endif
+
+#endif
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
new file mode 100644
index 0000000..e910130
--- /dev/null
+++ b/tools/perf/util/jitdump.c
@@ -0,0 +1,664 @@
+#include <sys/types.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <byteswap.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+
+#include "util.h"
+#include "event.h"
+#include "debug.h"
+#include "evlist.h"
+#include "symbol.h"
+#include "strlist.h"
+#include <elf.h>
+
+#include "session.h"
+#include "jit.h"
+#include "jitdump.h"
+#include "genelf.h"
+#include "../builtin.h"
+
+struct jit_buf_desc {
+ struct perf_data_file *output;
+ struct perf_session *session;
+ struct machine *machine;
+ union jr_entry *entry;
+ void *buf;
+ uint64_t sample_type;
+ size_t bufsize;
+ FILE *in;
+ bool needs_bswap; /* handles cross-endianness */
+ void *debug_data;
+ size_t nr_debug_entries;
+ uint32_t code_load_count;
+ u64 bytes_written;
+ struct rb_root code_root;
+ char dir[PATH_MAX];
+};
+
+struct debug_line_info {
+ unsigned long vma;
+ unsigned int lineno;
+ /* The filename format is unspecified, absolute path, relative etc. */
+ char const filename[0];
+};
+
+struct jit_tool {
+ struct perf_tool tool;
+ struct perf_data_file output;
+ struct perf_data_file input;
+ u64 bytes_written;
+};
+
+#define hmax(a, b) ((a) > (b) ? (a) : (b))
+#define get_jit_tool(t) (container_of(t, struct jit_tool, tool))
+
+static int
+jit_emit_elf(char *filename,
+ const char *sym,
+ uint64_t code_addr,
+ const void *code,
+ int csize)
+{
+ int ret, fd;
+
+ if (verbose > 0)
+ fprintf(stderr, "write ELF image %s\n", filename);
+
+ fd = open(filename, O_CREAT|O_TRUNC|O_WRONLY, 0644);
+ if (fd == -1) {
+ pr_warning("cannot create jit ELF %s: %s\n", filename, strerror(errno));
+ return -1;
+ }
+
+ ret = jit_write_elf(fd, code_addr, sym, (const void *)code, csize);
+
+ close(fd);
+
+ if (ret)
+ unlink(filename);
+
+ return ret;
+}
+
+static void
+jit_close(struct jit_buf_desc *jd)
+{
+ if (!(jd && jd->in))
+ return;
+ funlockfile(jd->in);
+ fclose(jd->in);
+ jd->in = NULL;
+}
+
+static int
+jit_open(struct jit_buf_desc *jd, const char *name)
+{
+ struct jitheader header;
+ struct jr_prefix *prefix;
+ ssize_t bs, bsz = 0;
+ void *n, *buf = NULL;
+ int ret, retval = -1;
+
+ jd->in = fopen(name, "r");
+ if (!jd->in)
+ return -1;
+
+ bsz = hmax(sizeof(header), sizeof(*prefix));
+
+ buf = malloc(bsz);
+ if (!buf)
+ goto error;
+
+ /*
+ * protect from writer modifying the file while we are reading it
+ */
+ flockfile(jd->in);
+
+ ret = fread(buf, sizeof(header), 1, jd->in);
+ if (ret != 1)
+ goto error;
+
+ memcpy(&header, buf, sizeof(header));
+
+ if (header.magic != JITHEADER_MAGIC) {
+ if (header.magic != JITHEADER_MAGIC_SW)
+ goto error;
+ jd->needs_bswap = true;
+ }
+
+ if (jd->needs_bswap) {
+ header.version = bswap_32(header.version);
+ header.total_size = bswap_32(header.total_size);
+ header.pid = bswap_32(header.pid);
+ header.elf_mach = bswap_32(header.elf_mach);
+ header.timestamp = bswap_64(header.timestamp);
+ }
+
+ if (verbose > 2)
+ pr_debug("version=%u\nhdr.size=%u\nts=0x%llx\npid=%d\nelf_mach=%d\n",
+ header.version,
+ header.total_size,
+ (unsigned long long)header.timestamp,
+ header.pid,
+ header.elf_mach);
+
+ bs = header.total_size - sizeof(header);
+
+ if (bs > bsz) {
+ n = realloc(buf, bs);
+ if (!n)
+ goto error;
+ bsz = bs;
+ buf = n;
+ /* read the extra header bytes we do not know about */
+ ret = fread(buf, bs, 1, jd->in);
+ if (ret != 1)
+ goto error;
+ }
+ /*
+ * keep dirname for generating files and mmap records
+ */
+ strcpy(jd->dir, name);
+ dirname(jd->dir);
+
+ return 0;
+error:
+ funlockfile(jd->in);
+ fclose(jd->in);
+ return retval;
+}
+
+static union jr_entry *
+jit_get_next_entry(struct jit_buf_desc *jd)
+{
+ struct jr_prefix *prefix;
+ union jr_entry *jr;
+ void *addr;
+ size_t bs, size;
+ int id, ret;
+
+ if (!(jd && jd->in))
+ return NULL;
+
+ if (jd->buf == NULL) {
+ size_t sz = getpagesize();
+ if (sz < sizeof(*prefix))
+ sz = sizeof(*prefix);
+
+ jd->buf = malloc(sz);
+ if (jd->buf == NULL)
+ return NULL;
+
+ jd->bufsize = sz;
+ }
+
+ prefix = jd->buf;
+
+ /*
+ * file is still locked at this point
+ */
+ ret = fread(prefix, sizeof(*prefix), 1, jd->in);
+ if (ret != 1)
+ return NULL;
+
+ if (jd->needs_bswap) {
+ prefix->id = bswap_32(prefix->id);
+ prefix->total_size = bswap_32(prefix->total_size);
+ prefix->timestamp = bswap_64(prefix->timestamp);
+ }
+ id = prefix->id;
+ size = prefix->total_size;
+
+ bs = (size_t)size;
+ if (bs < sizeof(*prefix))
+ return NULL;
+
+ if (id >= JIT_CODE_MAX) {
+ pr_warning("next_entry: unknown prefix %d, skipping\n", id);
+ return NULL;
+ }
+ if (bs > jd->bufsize) {
+ void *n;
+ n = realloc(jd->buf, bs);
+ if (!n)
+ return NULL;
+ jd->buf = n;
+ jd->bufsize = bs;
+ }
+
+ addr = ((void *)jd->buf) + sizeof(*prefix);
+
+ ret = fread(addr, bs - sizeof(*prefix), 1, jd->in);
+ if (ret != 1)
+ return NULL;
+
+ jr = (union jr_entry *)jd->buf;
+
+ switch(id) {
+ case JIT_CODE_DEBUG_INFO:
+ if (jd->needs_bswap) {
+ uint64_t n;
+ jr->info.code_addr = bswap_64(jr->info.code_addr);
+ jr->info.nr_entry = bswap_64(jr->info.nr_entry);
+ for (n = 0 ; n < jr->info.nr_entry; n++) {
+ jr->info.entries[n].addr = bswap_64(jr->info.entries[n].addr);
+ jr->info.entries[n].lineno = bswap_32(jr->info.entries[n].lineno);
+ jr->info.entries[n].discrim = bswap_32(jr->info.entries[n].discrim);
+ }
+ }
+ break;
+ case JIT_CODE_CLOSE:
+ break;
+ case JIT_CODE_LOAD:
+ if (jd->needs_bswap) {
+ jr->load.pid = bswap_32(jr->load.pid);
+ jr->load.tid = bswap_32(jr->load.tid);
+ jr->load.vma = bswap_64(jr->load.vma);
+ jr->load.code_addr = bswap_64(jr->load.code_addr);
+ jr->load.code_size = bswap_64(jr->load.code_size);
+ jr->load.code_index= bswap_64(jr->load.code_index);
+ }
+ jd->code_load_count++;
+ break;
+ case JIT_CODE_MOVE:
+ if (jd->needs_bswap) {
+ jr->move.pid = bswap_32(jr->move.pid);
+ jr->move.tid = bswap_32(jr->move.tid);
+ jr->move.vma = bswap_64(jr->move.vma);
+ jr->move.old_code_addr = bswap_64(jr->move.old_code_addr);
+ jr->move.new_code_addr = bswap_64(jr->move.new_code_addr);
+ jr->move.code_size = bswap_64(jr->move.code_size);
+ jr->move.code_index = bswap_64(jr->move.code_index);
+ }
+ break;
+ case JIT_CODE_MAX:
+ default:
+ return NULL;
+ }
+ return jr;
+}
+
+static int
+jit_inject_event(struct jit_buf_desc *jd, union perf_event *event)
+{
+ ssize_t size;
+
+ size = perf_data_file__write(jd->output, event, event->header.size);
+ if (size < 0)
+ return -1;
+
+ jd->bytes_written += size;
+ return 0;
+}
+
+static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
+{
+ struct perf_sample sample;
+ union perf_event *event;
+ struct perf_tool *tool = jd->session->tool;
+ uint64_t code, addr;
+ char *filename;
+ struct stat st;
+ size_t size;
+ u16 idr_size;
+ const char *sym;
+ uint32_t count;
+ int ret, csize;
+ pid_t pid, tid;
+ struct {
+ u32 pid, tid;
+ u64 time;
+ } *id;
+
+ pid = jr->load.pid;
+ tid = jr->load.tid;
+ csize = jr->load.code_size;
+ addr = jr->load.code_addr;
+ sym = (void *)((unsigned long)jr + sizeof(jr->load));
+ code = (unsigned long)jr + jr->load.p.total_size - csize;
+ count = jr->load.code_index;
+ idr_size = jd->machine->id_hdr_size;
+ /*
+ * reserve idr_size extra bytes for the sample_id_all trailer
+ */
+ event = calloc(1, sizeof(*event) + idr_size);
+ if (!event)
+ return -1;
+
+ filename = event->mmap2.filename;
+ size = snprintf(filename, PATH_MAX, "%s/jitted-%d-%u.so",
+ jd->dir,
+ pid,
+ count);
+
+ size++; /* for \0 */
+
+ size = PERF_ALIGN(size, sizeof(u64));
+
+ ret = jit_emit_elf(filename, sym, addr, (const void *)code, csize);
+
+ if (jd->debug_data && jd->nr_debug_entries) {
+ free(jd->debug_data);
+ jd->debug_data = NULL;
+ jd->nr_debug_entries = 0;
+ }
+
+ if (ret) {
+ free(event);
+ return -1;
+ }
+ if (stat(filename, &st))
+ memset(&st, 0, sizeof(st));
+
+ event->mmap2.header.type = PERF_RECORD_MMAP2;
+ event->mmap2.header.misc = PERF_RECORD_MISC_USER;
+ event->mmap2.header.size = (sizeof(event->mmap2) -
+ (sizeof(event->mmap2.filename) - size) + idr_size);
+
+ event->mmap2.pgoff = 0;
+ event->mmap2.start = addr;
+ event->mmap2.len = csize;
+ event->mmap2.pid = pid;
+ event->mmap2.tid = tid;
+ event->mmap2.ino = st.st_ino;
+ event->mmap2.maj = major(st.st_dev);
+ event->mmap2.min = minor(st.st_dev);
+ event->mmap2.prot = st.st_mode;
+ event->mmap2.flags = MAP_SHARED;
+ event->mmap2.ino_generation = 1;
+
+ id = (void *)((unsigned long)event + event->mmap2.header.size - idr_size);
+ if (jd->sample_type & PERF_SAMPLE_TID) {
+ id->pid = pid;
+ id->tid = tid;
+ }
+ if (jd->sample_type & PERF_SAMPLE_TIME)
+ id->time = jr->load.p.timestamp;
+
+ /*
+ * create pseudo sample to induce dso hit increment
+ * use first address as sample address
+ */
+ memset(&sample, 0, sizeof(sample));
+ sample.pid = pid;
+ sample.tid = tid;
+ sample.time = id->time;
+ sample.ip = addr;
+
+ ret = perf_event__process_mmap2(tool, event, &sample, jd->machine, jd->session);
+ if (ret)
+ return ret;
+
+ ret = jit_inject_event(jd, event);
+ /*
+ * mark dso as used to generate buildid in the header
+ */
+ if (!ret)
+ build_id__mark_dso_hit(tool, event, &sample, NULL, jd->machine);
+
+ return ret;
+}
+
+static int jit_repipe_code_move(struct jit_buf_desc *jd, union jr_entry *jr)
+{
+ struct perf_sample sample;
+ union perf_event *event;
+ struct perf_tool *tool = jd->session->tool;
+ char *filename;
+ size_t size;
+ struct stat st;
+ u16 idr_size;
+ int ret;
+ pid_t pid, tid;
+ struct {
+ u32 pid, tid;
+ u64 time;
+ } *id;
+
+ pid = jr->move.pid;
+ tid = jr->move.tid;
+ idr_size = jd->machine->id_hdr_size;
+
+ /*
+ * reserve idr_size extra bytes for the sample_id_all trailer
+ */
+ event = calloc(1, sizeof(*event) + idr_size);
+ if (!event)
+ return -1;
+
+ filename = event->mmap2.filename;
+ size = snprintf(filename, PATH_MAX, "%s/jitted-%d-%" PRIu64 ".so",
+ jd->dir,
+ pid,
+ jr->move.code_index);
+
+ size++; /* for \0 */
+
+ if (stat(filename, &st))
+ memset(&st, 0, sizeof(st));
+
+ size = PERF_ALIGN(size, sizeof(u64));
+
+ event->mmap2.header.type = PERF_RECORD_MMAP2;
+ event->mmap2.header.misc = PERF_RECORD_MISC_USER;
+ event->mmap2.header.size = (sizeof(event->mmap2) -
+ (sizeof(event->mmap2.filename) - size) + idr_size);
+ event->mmap2.pgoff = 0;
+ event->mmap2.start = jr->move.new_code_addr;
+ event->mmap2.len = jr->move.code_size;
+ event->mmap2.pid = pid;
+ event->mmap2.tid = tid;
+ event->mmap2.ino = st.st_ino;
+ event->mmap2.maj = major(st.st_dev);
+ event->mmap2.min = minor(st.st_dev);
+ event->mmap2.prot = st.st_mode;
+ event->mmap2.flags = MAP_SHARED;
+ event->mmap2.ino_generation = 1;
+
+ id = (void *)((unsigned long)event + event->mmap2.header.size - idr_size);
+ if (jd->sample_type & PERF_SAMPLE_TID) {
+ id->pid = pid;
+ id->tid = tid;
+ }
+ if (jd->sample_type & PERF_SAMPLE_TIME)
+ id->time = jr->move.p.timestamp;
+
+ /*
+ * create pseudo sample to induce dso hit increment
+ * use first address as sample address
+ */
+ memset(&sample, 0, sizeof(sample));
+ sample.pid = pid;
+ sample.tid = tid;
+ sample.time = id->time;
+ sample.ip = jr->move.new_code_addr;
+
+ ret = perf_event__process_mmap2(tool, event, &sample, jd->machine, jd->session);
+ if (ret)
+ return ret;
+
+ ret = jit_inject_event(jd, event);
+ if (!ret)
+ build_id__mark_dso_hit(tool, event, &sample, NULL, jd->machine);
+
+ return ret;
+}
+
+static int jit_repipe_debug_info(struct jit_buf_desc *jd, union jr_entry *jr)
+{
+ void *data;
+ size_t sz;
+
+ if (!(jd && jr))
+ return -1;
+
+ sz = jr->prefix.total_size - sizeof(jr->info);
+ data = malloc(sz);
+ if (!data)
+ return -1;
+
+ memcpy(data, &jr->info.entries, sz);
+
+ jd->debug_data = data;
+
+ /*
+ * we must use nr_entry instead of size here because
+ * we cannot distinguish actual entry from padding otherwise
+ */
+ jd->nr_debug_entries = jr->info.nr_entry;
+
+ return 0;
+}
+
+static int
+jit_process_dump(struct jit_buf_desc *jd)
+{
+ union jr_entry *jr;
+ int ret = 0;
+
+ while ((jr = jit_get_next_entry(jd))) {
+ switch(jr->prefix.id) {
+ case JIT_CODE_LOAD:
+ ret = jit_repipe_code_load(jd, jr);
+ break;
+ case JIT_CODE_MOVE:
+ ret = jit_repipe_code_move(jd, jr);
+ break;
+ case JIT_CODE_DEBUG_INFO:
+ ret = jit_repipe_debug_info(jd, jr);
+ break;
+ default:
+ ret = 0;
+ continue;
+ }
+ }
+ return ret;
+}
+
+static int
+jit_inject(struct jit_buf_desc *jd, char *path)
+{
+ int ret;
+
+ if (verbose > 0)
+ fprintf(stderr, "injecting: %s\n", path);
+
+ ret = jit_open(jd, path);
+ if (ret)
+ return -1;
+
+ ret = jit_process_dump(jd);
+
+ jit_close(jd);
+
+ if (verbose > 0)
+ fprintf(stderr, "injected: %s (%d)\n", path, ret);
+
+ return 0;
+}
+
+/*
+ * File name must match the pattern .../jit-XXXX.dump
+ * where XXXX is the PID of the process which did the mmap()
+ * as captured in the RECORD_MMAP record
+ */
+static int
+jit_detect(char *mmap_name, pid_t pid)
+{
+ char *p;
+ char *end = NULL;
+ pid_t pid2;
+
+ if (verbose > 2)
+ fprintf(stderr, "jit marker trying : %s\n", mmap_name);
+ /*
+ * get file name
+ */
+ p = strrchr(mmap_name, '/');
+ if (!p)
+ return -1;
+
+ /*
+ * match prefix
+ */
+ if (strncmp(p, "/jit-", 5))
+ return -1;
+
+ /*
+ * skip prefix
+ */
+ p += 5;
+
+ /*
+ * must be followed by a pid
+ */
+ if (!isdigit(*p))
+ return -1;
+
+ pid2 = (int)strtol(p, &end, 10);
+ if (!end)
+ return -1;
+
+ /*
+ * pid does not match mmap pid
+ * pid==0 in system-wide mode (synthesized)
+ */
+ if (pid && pid2 != pid)
+ return -1;
+ /*
+ * validate suffix
+ */
+ if (strcmp(end, ".dump"))
+ return -1;
+
+ if (verbose > 0)
+ fprintf(stderr, "jit marker found: %s\n", mmap_name);
+
+ return 0;
+}
+
+int
+jit_process(struct perf_session *session,
+ struct perf_data_file *output,
+ struct machine *machine,
+ char *filename,
+ pid_t pid,
+ u64 *nbytes)
+{
+ struct perf_evsel *first;
+ struct jit_buf_desc jd;
+ int ret;
+
+ /*
+ * first, detect marker mmap (i.e., the jitdump mmap)
+ */
+ if (jit_detect(filename, pid))
+ return -1;
+
+ memset(&jd, 0, sizeof(jd));
+
+ jd.session = session;
+ jd.output = output;
+ jd.machine = machine;
+
+ /*
+ * track sample_type to compute id_all layout
+ * perf sets the same sample type to all events as of now
+ */
+ first = perf_evlist__first(session->evlist);
+ jd.sample_type = first->attr.sample_type;
+
+ *nbytes = 0;
+
+ ret = jit_inject(&jd, filename);
+ if (!ret)
+ *nbytes = jd.bytes_written;
+
+ return ret;
+}
diff --git a/tools/perf/util/jitdump.h b/tools/perf/util/jitdump.h
new file mode 100644
index 0000000..ab8ddd2
--- /dev/null
+++ b/tools/perf/util/jitdump.h
@@ -0,0 +1,116 @@
+/*
+ * jitdump.h: jitted code info encapsulation file format
+ *
+ * Adapted from OProfile GPLv2 support jitdump.h:
+ * Copyright 2007 OProfile authors
+ * Jens Wilke
+ * Daniel Hansel
+ * Copyright IBM Corporation 2007
+ */
+#ifndef JITDUMP_H
+#define JITDUMP_H
+
+#include <sys/time.h>
+#include <time.h>
+#include <stdint.h>
+
+/* JiTD */
+#define JITHEADER_MAGIC 0x4A695444
+#define JITHEADER_MAGIC_SW 0x4454694A
+
+#define PADDING_8ALIGNED(x) ((((x) + 7) & 7) ^ 7)
+
+#define JITHEADER_VERSION 1
+
+struct jitheader {
+ uint32_t magic; /* characters "JiTD" */
+ uint32_t version; /* header version */
+ uint32_t total_size; /* total size of header */
+ uint32_t elf_mach; /* elf mach target */
+ uint32_t pad1; /* reserved */
+ uint32_t pid; /* JIT process id */
+ uint64_t timestamp; /* timestamp */
+};
+
+enum jit_record_type {
+ JIT_CODE_LOAD = 0,
+ JIT_CODE_MOVE = 1,
+ JIT_CODE_DEBUG_INFO = 2,
+ JIT_CODE_CLOSE = 3,
+
+ JIT_CODE_MAX,
+};
+
+/* record prefix (mandatory in each record) */
+struct jr_prefix {
+ uint32_t id;
+ uint32_t total_size;
+ uint64_t timestamp;
+};
+
+struct jr_code_load {
+ struct jr_prefix p;
+
+ uint32_t pid;
+ uint32_t tid;
+ uint64_t vma;
+ uint64_t code_addr;
+ uint64_t code_size;
+ uint64_t code_index;
+};
+
+struct jr_code_close {
+ struct jr_prefix p;
+};
+
+struct jr_code_move {
+ struct jr_prefix p;
+
+ uint32_t pid;
+ uint32_t tid;
+ uint64_t vma;
+ uint64_t old_code_addr;
+ uint64_t new_code_addr;
+ uint64_t code_size;
+ uint64_t code_index;
+};
+
+struct debug_entry {
+ uint64_t addr;
+ int lineno; /* source line number starting at 1 */
+ int discrim; /* column discriminator, 0 is default */
+ const char name[0]; /* null terminated filename, \xff\0 if same as previous entry */
+};
+
+struct jr_code_debug_info {
+ struct jr_prefix p;
+
+ uint64_t code_addr;
+ uint64_t nr_entry;
+ struct debug_entry entries[0];
+};
+
+union jr_entry {
+ struct jr_code_debug_info info;
+ struct jr_code_close close;
+ struct jr_code_load load;
+ struct jr_code_move move;
+ struct jr_prefix prefix;
+};
+
+static inline struct debug_entry *
+debug_entry_next(struct debug_entry *ent)
+{
+ void *a = ent + 1;
+ size_t l = strlen(ent->name) + 1;
+ return a + l;
+}
+
+static inline char *
+debug_entry_file(struct debug_entry *ent)
+{
+ void *a = ent + 1;
+ return a;
+}
+
+#endif /* !JITDUMP_H */
--
1.9.1
* [PATCH v7 4/5] perf tools: add JVMTI agent library
2015-10-01 6:45 [PATCH v7 0/4] perf: add support for profiling jitted code Stephane Eranian
` (2 preceding siblings ...)
2015-10-01 6:45 ` [PATCH v7 3/5] perf inject: add jitdump mmap injection support Stephane Eranian
@ 2015-10-01 6:45 ` Stephane Eranian
2015-10-01 6:45 ` [PATCH v7 5/5] perf/jit: add source line info support Stephane Eranian
` (2 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-10-01 6:45 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, ak, jolsa, namhyung, cel, dsahern,
adrian.hunter, johnmccutchan, brendan.d.gregg
This is a standalone JVMTI library to help profile Java jitted
code with perf record/perf report. The library is not installed
or compiled automatically by the perf Makefile. It is not used
directly by perf. It is arch agnostic and has been tested on
X86 and ARM. It needs to be used with a Java runtime, such
as OpenJDK, as follows:
$ java -agentpath:libjvmti.so .......
When used this way, java will generate a jitdump binary file in
$HOME/.debug/jit/java-jit-*/jit-<pid>.dump
This binary dump file contains information to help symbolize and
annotate jitted code.
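As a rough illustration of the layout declared in jitdump.h (a file
header followed by records that each start with a prefix carrying the
record id and total size), the sketch below walks an in-memory jitdump
image and counts the records of one type. The helper name
jitdump_count_records() is hypothetical and not part of this series, and
the sketch only handles images written in the reader's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JITHEADER_MAGIC    0x4A695444u /* "JiTD" */
#define JITHEADER_MAGIC_SW 0x4454694Au /* same magic, other endianness */

/* Mirrors struct jitheader from jitdump.h in this series. */
struct jitheader {
	uint32_t magic;
	uint32_t version;
	uint32_t total_size;
	uint32_t elf_mach;
	uint32_t pad1;
	uint32_t pid;
	uint64_t timestamp;
};

/* Mirrors struct jr_prefix: every record starts with it. */
struct jr_prefix {
	uint32_t id;
	uint32_t total_size;
	uint64_t timestamp;
};

/*
 * Hypothetical helper: count the records with the given id in an
 * in-memory jitdump image. Returns -1 on a malformed image or on an
 * image written with the other endianness (not handled in this sketch).
 */
static int jitdump_count_records(const void *buf, size_t len, uint32_t id)
{
	const unsigned char *p = buf;
	struct jitheader hdr;
	struct jr_prefix pfx;
	size_t off;
	int count = 0;

	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, p, sizeof(hdr));
	if (hdr.magic == JITHEADER_MAGIC_SW)
		return -1;	/* would need byte swapping */
	if (hdr.magic != JITHEADER_MAGIC)
		return -1;
	if (hdr.total_size < sizeof(hdr) || hdr.total_size > len)
		return -1;

	/* records follow the (possibly padded) header back to back */
	for (off = hdr.total_size; off < len; off += pfx.total_size) {
		if (len - off < sizeof(pfx))
			return -1;
		memcpy(&pfx, p + off, sizeof(pfx));
		if (pfx.total_size < sizeof(pfx) || pfx.total_size > len - off)
			return -1;
		if (pfx.id == id)
			count++;
	}
	return count;
}
```

Because each prefix carries total_size, a reader can skip record types
it does not understand without knowing their payload layout.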
The jitdump information must be injected into the perf.data file
using:
$ perf inject --jit -i perf.data -o perf.data.jitted
This injects the MMAP records to cover the jitted code and also generates
one ELF image for each jitted function. The ELF images are created in the
same subdir as the jitdump file. The MMAP records point there too.
Then, to visualize the function or asm profile, simply use the regular
perf commands:
$ perf report -i perf.data.jitted
or
$ perf annotate -i perf.data.jitted
JVMTI agent code adapted from OProfile's opagent code.
This version of the JVMTI agent uses CLOCK_MONOTONIC
as the time source to timestamp jit samples. To correlate
with perf_events samples, it needs to run on kernel 4.0.0-rc5
or later with the following commit from Peter Zijlstra:
34f4392 perf: Add per event clockid support
With this patch, recording jitted code is done as follows:
$ perf record -k mono -- java -agentpath:libjvmti.so .......
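To illustrate why the jit records and the perf samples must share a time
source, here is a minimal sketch, not the logic used by perf inject: when
an address range is rejitted, a sample is attributed to the most recent
load whose timestamp is not later than the sample time. struct jit_load
and jit_find_load() are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory summary of one JIT_CODE_LOAD record. */
struct jit_load {
	uint64_t timestamp;	/* CLOCK_MONOTONIC, in nanoseconds */
	uint64_t code_addr;
	uint64_t code_size;
	uint64_t code_index;
};

/*
 * Return the most recent load at or before sample_time whose address
 * range contains ip, or NULL if no such load exists.
 */
static const struct jit_load *
jit_find_load(const struct jit_load *loads, size_t nr,
	      uint64_t sample_time, uint64_t ip)
{
	const struct jit_load *best = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		const struct jit_load *l = &loads[i];

		if (l->timestamp > sample_time)
			continue;	/* code was emitted after the sample */
		if (ip < l->code_addr || ip - l->code_addr >= l->code_size)
			continue;	/* ip outside this function */
		if (!best || l->timestamp >= best->timestamp)
			best = l;
	}
	return best;
}
```

If the two timestamps came from different clocks, the comparison against
sample_time would be meaningless, which is why perf record is run with
-k mono here.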
Signed-off-by: Stephane Eranian <eranian@google.com>
---
tools/perf/jvmti/Makefile | 70 +++++++
tools/perf/jvmti/jvmti_agent.c | 464 +++++++++++++++++++++++++++++++++++++++++
tools/perf/jvmti/jvmti_agent.h | 29 +++
tools/perf/jvmti/libjvmti.c | 208 ++++++++++++++++++
4 files changed, 771 insertions(+)
create mode 100644 tools/perf/jvmti/Makefile
create mode 100644 tools/perf/jvmti/jvmti_agent.c
create mode 100644 tools/perf/jvmti/jvmti_agent.h
create mode 100644 tools/perf/jvmti/libjvmti.c
diff --git a/tools/perf/jvmti/Makefile b/tools/perf/jvmti/Makefile
new file mode 100644
index 0000000..22efd13
--- /dev/null
+++ b/tools/perf/jvmti/Makefile
@@ -0,0 +1,70 @@
+ARCH=$(shell uname -m)
+
+ifeq ($(ARCH), x86_64)
+JARCH=amd64
+endif
+ifeq ($(ARCH), armv7l)
+JARCH=armhf
+endif
+ifeq ($(ARCH), armv6l)
+JARCH=armhf
+endif
+ifeq ($(ARCH), ppc64)
+JARCH=powerpc
+endif
+ifeq ($(ARCH), ppc64le)
+JARCH=powerpc
+endif
+
+DESTDIR=/usr/local
+
+VERSION=1
+REVISION=0
+AGE=0
+
+LN=ln -sf
+RM=rm
+
+SLIBJVMTI=libjvmti.so.$(VERSION).$(REVISION).$(AGE)
+VLIBJVMTI=libjvmti.so.$(VERSION)
+SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBJVMTI)
+SOLIBEXT=so
+
+JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | cut -d ' ' -f 3)
+# -lrt required in 32-bit mode for clock_gettime()
+LIBS=-lelf -lrt
+INCDIR=-I $(JDIR)/include -I $(JDIR)/include/linux
+
+TARGETS=$(SLIBJVMTI)
+
+SRCS=libjvmti.c jvmti_agent.c
+OBJS=$(SRCS:.c=.o)
+SOBJS=$(OBJS:.o=.lo)
+OPT=-O2 -g -Werror -Wall
+
+CFLAGS=$(INCDIR) $(OPT)
+
+all: $(TARGETS)
+
+.c.o:
+ $(CC) $(CFLAGS) -c $*.c
+.c.lo:
+ $(CC) -fPIC -DPIC $(CFLAGS) -c $*.c -o $*.lo
+
+$(OBJS) $(SOBJS): Makefile jvmti_agent.h ../util/jitdump.h
+
+$(SLIBJVMTI): $(SOBJS)
+ $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LIBS)
+ $(LN) $@ libjvmti.$(SOLIBEXT)
+
+clean:
+ $(RM) -f *.o *.so.* *.so *.lo
+
+install:
+ -mkdir -p $(DESTDIR)/lib
+ install -m 755 $(SLIBJVMTI) $(DESTDIR)/lib/
+ (cd $(DESTDIR)/lib; $(LN) $(SLIBJVMTI) $(VLIBJVMTI))
+ (cd $(DESTDIR)/lib; $(LN) $(SLIBJVMTI) libjvmti.$(SOLIBEXT))
+ ldconfig
+
+.SUFFIXES: .c .S .o .lo
diff --git a/tools/perf/jvmti/jvmti_agent.c b/tools/perf/jvmti/jvmti_agent.c
new file mode 100644
index 0000000..7d96581
--- /dev/null
+++ b/tools/perf/jvmti/jvmti_agent.c
@@ -0,0 +1,464 @@
+/*
+ * jvmti_agent.c: JVMTI agent interface
+ *
+ * Adapted from the Oprofile code in opagent.c:
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Copyright 2007 OProfile authors
+ * Jens Wilke
+ * Daniel Hansel
+ * Copyright IBM Corporation 2007
+ */
+#include <sys/types.h>
+#include <sys/stat.h> /* for mkdir() */
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <time.h>
+#include <sys/mman.h>
+#include <syscall.h> /* for gettid() */
+#include <err.h>
+
+#include "jvmti_agent.h"
+#include "../util/jitdump.h"
+
+#define JIT_LANG "java"
+
+static char jit_path[PATH_MAX];
+static void *marker_addr;
+
+/*
+ * padding buffer
+ */
+static const char pad_bytes[7];
+
+static inline pid_t gettid(void)
+{
+ return (pid_t)syscall(__NR_gettid);
+}
+
+static int get_e_machine(struct jitheader *hdr)
+{
+ ssize_t sret;
+ char id[16];
+ int fd, ret = -1;
+ int m = -1;
+ struct {
+ uint16_t e_type;
+ uint16_t e_machine;
+ } info;
+
+ fd = open("/proc/self/exe", O_RDONLY);
+ if (fd == -1)
+ return -1;
+
+ sret = read(fd, id, sizeof(id));
+ if (sret != sizeof(id))
+ goto error;
+
+ /* check ELF signature */
+ if (id[0] != 0x7f || id[1] != 'E' || id[2] != 'L' || id[3] != 'F')
+ goto error;
+
+ sret = read(fd, &info, sizeof(info));
+ if (sret != sizeof(info))
+ goto error;
+
+ m = info.e_machine;
+ if (m < 0)
+ m = 0; /* ELF EM_NONE */
+
+ hdr->elf_mach = m;
+ ret = 0;
+error:
+ close(fd);
+ return ret;
+}
+
+#define NSEC_PER_SEC 1000000000
+static int perf_clk_id = CLOCK_MONOTONIC;
+
+static inline uint64_t
+timespec_to_ns(const struct timespec *ts)
+{
+ return ((uint64_t) ts->tv_sec * NSEC_PER_SEC) + ts->tv_nsec;
+}
+
+static inline uint64_t
+perf_get_timestamp(void)
+{
+ struct timespec ts;
+ int ret;
+
+ ret = clock_gettime(perf_clk_id, &ts);
+ if (ret)
+ return 0;
+
+ return timespec_to_ns(&ts);
+}
+
+static int
+debug_cache_init(void)
+{
+ char str[32];
+ char *base, *p;
+ struct tm tm;
+ time_t t;
+ int ret;
+
+ time(&t);
+ localtime_r(&t, &tm);
+
+ base = getenv("JITDUMPDIR");
+ if (!base)
+ base = getenv("HOME");
+ if (!base)
+ base = ".";
+
+ strftime(str, sizeof(str), JIT_LANG"-jit-%Y%m%d", &tm);
+
+ snprintf(jit_path, PATH_MAX - 1, "%s/.debug/", base);
+
+ ret = mkdir(jit_path, 0755);
+ if (ret == -1) {
+ if (errno != EEXIST) {
+ warn("jvmti: cannot create jit cache dir %s", jit_path);
+ return -1;
+ }
+ }
+
+ snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit", base);
+ ret = mkdir(jit_path, 0755);
+ if (ret == -1) {
+ if (errno != EEXIST) {
+ warn("cannot create jit cache dir %s", jit_path);
+ return -1;
+ }
+ }
+
+ snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit/%s.XXXXXXXX", base, str);
+
+ p = mkdtemp(jit_path);
+ if (p != jit_path) {
+ warn("cannot create jit cache dir %s", jit_path);
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+perf_open_marker_file(int fd)
+{
+ long pgsz;
+
+ pgsz = sysconf(_SC_PAGESIZE);
+ if (pgsz == -1)
+ return -1;
+
+ /*
+ * we mmap the jitdump to create an MMAP RECORD in perf.data file.
+ * The mmap is captured either live (perf record running when we mmap)
+ * or in deferred mode, via /proc/PID/maps
+ * the MMAP record is used as a marker of a jitdump file for more meta
+ * data info about the jitted code. Perf report/annotate detect this
+ * special filename and process the jitdump file.
+ *
+ * mapping must be PROT_EXEC to ensure it is captured by perf record
+ * even when not using -d option
+ */
+ marker_addr = mmap(NULL, pgsz, PROT_READ|PROT_EXEC, MAP_PRIVATE, fd, 0);
+ return (marker_addr == MAP_FAILED) ? -1 : 0;
+}
+
+static void
+perf_close_marker_file(void)
+{
+ long pgsz;
+
+ if (!marker_addr)
+ return;
+
+ pgsz = sysconf(_SC_PAGESIZE);
+ if (pgsz == -1)
+ return;
+
+ munmap(marker_addr, pgsz);
+}
+
+void *jvmti_open(void)
+{
+ int pad_cnt;
+ char dump_path[PATH_MAX];
+ struct jitheader header;
+ int fd;
+ FILE *fp;
+
+ /*
+ * check if clockid is supported
+ */
+ if (!perf_get_timestamp())
+ warnx("jvmti: kernel does not support %d clock id", perf_clk_id);
+
+ memset(&header, 0, sizeof(header));
+
+ debug_cache_init();
+
+ /*
+ * jitdump file name
+ */
+ snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
+
+ fd = open(dump_path, O_CREAT|O_TRUNC|O_RDWR, 0666);
+ if (fd == -1)
+ return NULL;
+
+ /*
+ * create perf.data marker for the jitdump file
+ */
+ if (perf_open_marker_file(fd)) {
+ warnx("jvmti: failed to create marker file");
+ return NULL;
+ }
+
+ fp = fdopen(fd, "w+");
+ if (!fp) {
+ warn("jvmti: cannot create %s", dump_path);
+ close(fd);
+ return NULL; /* fp is NULL here: do not fclose() it */
+ }
+
+ warnx("jvmti: jitdump in %s", dump_path);
+
+ if (get_e_machine(&header)) {
+ warn("jvmti: get_e_machine failed");
+ goto error;
+ }
+
+ header.magic = JITHEADER_MAGIC;
+ header.version = JITHEADER_VERSION;
+ header.total_size = sizeof(header);
+ header.pid = getpid();
+
+ /* calculate amount of padding '\0' */
+ pad_cnt = PADDING_8ALIGNED(header.total_size);
+ header.total_size += pad_cnt;
+
+ header.timestamp = perf_get_timestamp();
+
+ if (!fwrite(&header, sizeof(header), 1, fp)) {
+ warn("jvmti: cannot write dumpfile header");
+ goto error;
+ }
+
+ /* write padding '\0' if necessary */
+ if (pad_cnt && !fwrite(pad_bytes, pad_cnt, 1, fp)) {
+ warn("jvmti: cannot write dumpfile header padding");
+ goto error;
+ }
+
+ return fp;
+error:
+ fclose(fp);
+ return NULL;
+}
+
+int
+jvmti_close(void *agent)
+{
+ struct jr_code_close rec;
+ FILE *fp = agent;
+
+ if (!fp) {
+ warnx("jvmti: invalid fd in close_agent");
+ return -1;
+ }
+
+ rec.p.id = JIT_CODE_CLOSE;
+ rec.p.total_size = sizeof(rec);
+
+ rec.p.timestamp = perf_get_timestamp();
+
+ if (!fwrite(&rec, sizeof(rec), 1, fp))
+ return -1;
+
+ fclose(fp);
+
+ fp = NULL;
+
+ perf_close_marker_file();
+
+ return 0;
+}
+
+int
+jvmti_write_code(void *agent, char const *sym,
+ uint64_t vma, void const *code, unsigned int const size)
+{
+ static int code_generation = 1;
+ struct jr_code_load rec;
+ size_t sym_len;
+ size_t padding_count;
+ FILE *fp = agent;
+ int ret = -1;
+
+ /* don't care about 0 length function, no samples */
+ if (size == 0)
+ return 0;
+
+ if (!fp) {
+ warnx("jvmti: invalid fd in write_native_code");
+ return -1;
+ }
+
+ sym_len = strlen(sym) + 1;
+
+ rec.p.id = JIT_CODE_LOAD;
+ rec.p.total_size = sizeof(rec) + sym_len;
+ padding_count = PADDING_8ALIGNED(rec.p.total_size);
+ rec.p.total_size += padding_count;
+ rec.p.timestamp = perf_get_timestamp();
+
+ rec.code_size = size;
+ rec.vma = vma;
+ rec.code_addr = vma;
+ rec.pid = getpid();
+ rec.tid = gettid();
+
+ if (code)
+ rec.p.total_size += size;
+
+ /*
+ * If JVM is multi-threaded, multiple concurrent calls to agent
+ * may be possible, so protect file writes
+ */
+ flockfile(fp);
+
+ /*
+ * get code index inside lock to avoid race condition
+ */
+ rec.code_index = code_generation++;
+
+ ret = fwrite_unlocked(&rec, sizeof(rec), 1, fp);
+ fwrite_unlocked(sym, sym_len, 1, fp);
+ if (code)
+ fwrite_unlocked(code, size, 1, fp);
+
+ if (padding_count)
+ fwrite_unlocked(pad_bytes, padding_count, 1, fp);
+
+ funlockfile(fp);
+
+ ret = 0;
+
+ return ret;
+}
+
+int
+jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
+ jvmtiAddrLocationMap const *map,
+ jvmtiLineNumberEntry *li, jint num)
+{
+ static const char *prev_str = "\xff";
+ struct jr_code_debug_info rec;
+ size_t sret, len, size, flen;
+ size_t padding_count;
+ FILE *fp = agent;
+ int i;
+
+ /*
+ * no entry to write
+ */
+ if (!num)
+ return 0;
+
+ if (!fp) {
+ warnx("jvmti: invalid fd in write_debug_info");
+ return -1;
+ }
+
+ flen = strlen(file) + 1;
+
+ rec.p.id = JIT_CODE_DEBUG_INFO;
+ size = sizeof(rec);
+ rec.p.timestamp = perf_get_timestamp();
+ rec.code_addr = (uint64_t)(uintptr_t)code;
+ rec.nr_entry = num;
+
+ /*
+ * on disk source line info layout:
+ * uint64_t : addr
+ * int : line number
+ * file[] : source file name
+ * padding : pad to multiple of 8 bytes
+ */
+ size += num * (sizeof(uint64_t) + sizeof(int));
+ size += flen + (num - 1) * 2;
+ /*
+ * pad to 8 bytes
+ */
+ padding_count = PADDING_8ALIGNED(size);
+
+ rec.p.total_size = size + padding_count;
+
+ /*
+ * If JVM is multi-threaded, multiple concurrent calls to agent
+ * may be possible, so protect file writes
+ */
+ flockfile(fp);
+
+ sret = fwrite_unlocked(&rec, sizeof(rec), 1, fp);
+ if (sret != 1)
+ goto error;
+
+ for (i = 0; i < num; i++) {
+ uint64_t addr;
+
+ addr = (uint64_t)map[i].start_address;
+ len = sizeof(addr);
+ sret = fwrite_unlocked(&addr, len, 1, fp);
+ if (sret != 1)
+ goto error;
+
+ len = sizeof(int);
+ sret = fwrite_unlocked(&li[i].line_number, len, 1, fp);
+ if (sret != 1)
+ goto error;
+
+ if (i == 0) {
+ sret = fwrite_unlocked(file, flen, 1, fp);
+ } else {
+ sret = fwrite_unlocked(prev_str, 2, 1, fp);
+ }
+ if (sret != 1)
+ goto error;
+
+ }
+ if (padding_count)
+ sret = fwrite_unlocked(pad_bytes, padding_count, 1, fp);
+ if (sret != 1)
+ goto error;
+
+ funlockfile(fp);
+ return 0;
+error:
+ funlockfile(fp);
+ return -1;
+}
diff --git a/tools/perf/jvmti/jvmti_agent.h b/tools/perf/jvmti/jvmti_agent.h
new file mode 100644
index 0000000..8251a1c
--- /dev/null
+++ b/tools/perf/jvmti/jvmti_agent.h
@@ -0,0 +1,29 @@
+#ifndef __JVMTI_AGENT_H__
+#define __JVMTI_AGENT_H__
+
+#include <sys/types.h>
+#include <stdint.h>
+#include <jvmti.h>
+
+#define __unused __attribute__((unused))
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+void *jvmti_open(void);
+int jvmti_close(void *agent);
+int jvmti_write_code(void *agent, char const *symbol_name,
+ uint64_t vma, void const *code,
+ const unsigned int code_size);
+int jvmti_write_debug_info(void *agent,
+ uint64_t code,
+ const char *file,
+ jvmtiAddrLocationMap const *map,
+ jvmtiLineNumberEntry *tab, jint nr);
+
+#if defined(__cplusplus)
+}
+
+#endif
+#endif /* __JVMTI_AGENT_H__ */
diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c
new file mode 100644
index 0000000..745f20c
--- /dev/null
+++ b/tools/perf/jvmti/libjvmti.c
@@ -0,0 +1,208 @@
+#include <sys/types.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <err.h>
+#include <jvmti.h>
+#include <limits.h>
+
+#include "jvmti_agent.h"
+
+static int has_line_numbers;
+void *jvmti_agent;
+
+static void JNICALL
+compiled_method_load_cb(jvmtiEnv *jvmti,
+ jmethodID method,
+ jint code_size,
+ void const *code_addr,
+ jint map_length,
+ jvmtiAddrLocationMap const *map,
+ void const *compile_info __unused)
+{
+ jvmtiLineNumberEntry *tab = NULL;
+ jclass decl_class;
+ char *class_sign = NULL;
+ char *func_name = NULL;
+ char *func_sign = NULL;
+ char *file_name = NULL;
+ char fn[PATH_MAX];
+ uint64_t addr = (uint64_t)(uintptr_t)code_addr;
+ jvmtiError ret;
+ jint nr_lines = 0;
+ size_t len;
+
+ ret = (*jvmti)->GetMethodDeclaringClass(jvmti, method,
+ &decl_class);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: cannot get declaring class");
+ return;
+ }
+
+ if (has_line_numbers && map && map_length) {
+
+ ret = (*jvmti)->GetLineNumberTable(jvmti, method, &nr_lines, &tab);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: cannot get line table for method");
+ } else {
+ ret = (*jvmti)->GetSourceFileName(jvmti, decl_class, &file_name);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: cannot get source filename ret=%d", ret);
+ nr_lines = 0;
+ }
+ }
+ }
+
+ ret = (*jvmti)->GetClassSignature(jvmti, decl_class,
+ &class_sign, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: getclassignature failed");
+ goto error;
+ }
+
+ ret = (*jvmti)->GetMethodName(jvmti, method, &func_name,
+ &func_sign, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: failed getmethodname");
+ goto error;
+ }
+
+ /*
+ * Assume path name is class hierarchy, this is a common practice with Java programs
+ */
+ if (*class_sign == 'L') {
+ int j, i = 0;
+ char *p = strrchr(class_sign, '/');
+ if (p) {
+ /* drop the 'L' prefix and copy up to the final '/' */
+ for (i = 0; i < (p - class_sign); i++)
+ fn[i] = class_sign[i+1];
+ }
+ /*
+ * append file name, we use loops and not string ops to avoid modifying
+ * class_sign which is used later for the symbol name
+ */
+ for (j = 0; i < (PATH_MAX - 1) && j < strlen(file_name); j++, i++)
+ fn[i] = file_name[j];
+ fn[i] = '\0';
+ } else {
+ /* fallback case */
+ strcpy(fn, file_name);
+ }
+ /*
+ * write source line info record if we have it
+ */
+ if (jvmti_write_debug_info(jvmti_agent, addr, fn, map, tab, nr_lines))
+ warnx("jvmti: write_debug_info() failed");
+
+ len = strlen(func_name) + strlen(class_sign) + strlen(func_sign) + 2;
+ {
+ char str[len];
+ snprintf(str, len, "%s%s%s", class_sign, func_name, func_sign);
+ if (jvmti_write_code(jvmti_agent, str, addr, code_addr, code_size))
+ warnx("jvmti: write_code() failed");
+ }
+error:
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)func_name);
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)func_sign);
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)class_sign);
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)tab);
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)file_name);
+}
+
+static void JNICALL
+code_generated_cb(jvmtiEnv *jvmti,
+ char const *name,
+ void const *code_addr,
+ jint code_size)
+{
+ uint64_t addr = (uint64_t)(unsigned long)code_addr;
+ int ret;
+
+ ret = jvmti_write_code(jvmti_agent, name, addr, code_addr, code_size);
+ if (ret)
+ warnx("jvmti: write_code() failed for code_generated");
+}
+
+JNIEXPORT jint JNICALL
+Agent_OnLoad(JavaVM *jvm, char *options, void *reserved __unused)
+{
+ jvmtiEventCallbacks cb;
+ jvmtiCapabilities caps1;
+ jvmtiJlocationFormat format;
+ jvmtiEnv *jvmti = NULL;
+ jint ret;
+
+ jvmti_agent = jvmti_open();
+ if (!jvmti_agent) {
+ warnx("jvmti: open_agent failed");
+ return -1;
+ }
+
+ /*
+ * Request a JVMTI interface version 1 environment
+ */
+ ret = (*jvm)->GetEnv(jvm, (void *)&jvmti, JVMTI_VERSION_1);
+ if (ret != JNI_OK) {
+ warnx("jvmti: jvmti version 1 not supported");
+ return -1;
+ }
+
+ /*
+ * acquire method_load capability, we require it
+ * request line numbers (optional)
+ */
+ memset(&caps1, 0, sizeof(caps1));
+ caps1.can_generate_compiled_method_load_events = 1;
+
+ ret = (*jvmti)->AddCapabilities(jvmti, &caps1);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: acquire compiled_method capability failed");
+ return -1;
+ }
+ ret = (*jvmti)->GetJLocationFormat(jvmti, &format);
+ if (ret == JVMTI_ERROR_NONE && format == JVMTI_JLOCATION_JVMBCI) {
+ memset(&caps1, 0, sizeof(caps1));
+ caps1.can_get_line_numbers = 1;
+ caps1.can_get_source_file_name = 1;
+ ret = (*jvmti)->AddCapabilities(jvmti, &caps1);
+ if (ret == JVMTI_ERROR_NONE)
+ has_line_numbers = 1;
+ }
+
+ memset(&cb, 0, sizeof(cb));
+
+ cb.CompiledMethodLoad = compiled_method_load_cb;
+ cb.DynamicCodeGenerated = code_generated_cb;
+
+ ret = (*jvmti)->SetEventCallbacks(jvmti, &cb, sizeof(cb));
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: cannot set event callbacks");
+ return -1;
+ }
+
+ ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
+ JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: setnotification failed for method_load");
+ return -1;
+ }
+
+ ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
+ JVMTI_EVENT_DYNAMIC_CODE_GENERATED, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: setnotification failed on code_generated");
+ return -1;
+ }
+ return 0;
+}
+
+JNIEXPORT void JNICALL
+Agent_OnUnload(JavaVM *jvm __unused)
+{
+ int ret;
+
+ ret = jvmti_close(jvmti_agent);
+ if (ret)
+ errx(1, "Error: op_close_agent()");
+}
--
1.9.1
* [PATCH v7 5/5] perf/jit: add source line info support
2015-10-01 6:45 [PATCH v7 0/4] perf: add support for profiling jitted code Stephane Eranian
` (3 preceding siblings ...)
2015-10-01 6:45 ` [PATCH v7 4/5] perf tools: add JVMTI agent library Stephane Eranian
@ 2015-10-01 6:45 ` Stephane Eranian
2015-10-01 9:17 ` [PATCH v7 0/4] perf: add support for profiling jitted code Peter Zijlstra
2015-10-01 22:45 ` Brendan Gregg
6 siblings, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-10-01 6:45 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, ak, jolsa, namhyung, cel, dsahern,
adrian.hunter, johnmccutchan, brendan.d.gregg
This patch adds source line information support to perf for jitted code.
The source line info must be emitted by the runtime, such as JVMTI.
perf inject extracts the source line info from the jitdump file and
adds the corresponding .debug_line section in the ELF image generated
for each jitted function. The source line info enables matching any address
in the profile with a source file and line number. The improvement is
visible in perf annotate with the source code displayed alongside
the assembly code.
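As an illustration of the address-to-line matching described above, the
sketch below scans the packed debug entries of one JIT_CODE_DEBUG_INFO
record (addr, line number, discriminator, then a NUL-terminated file
name, as written by jvmti_write_debug_info() in this patch) and returns
the line of the closest entry at or below a given address. The helper
debug_lookup_line() is hypothetical, assumes a 32-bit int, and bypasses
the DWARF path that perf annotate actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of one on-disk entry: addr, line number, discriminator. */
#define DEBUG_ENTRY_FIXED (sizeof(uint64_t) + 2 * sizeof(int32_t))

/*
 * Hypothetical helper: scan nr packed entries in buf and return the line
 * number of the entry with the highest addr <= pc, or -1 if there is no
 * such entry or the buffer is malformed. On success *file points into
 * buf at the matching entry's file name.
 */
static int debug_lookup_line(const void *buf, size_t len, uint64_t nr,
			     uint64_t pc, const char **file)
{
	const unsigned char *p = buf, *end = p + len;
	uint64_t best_addr = 0;
	int line = -1;

	while (nr--) {
		const unsigned char *nul;
		uint64_t addr;
		int32_t lineno;

		/* need the fixed part plus at least the terminating NUL */
		if ((size_t)(end - p) <= DEBUG_ENTRY_FIXED)
			return -1;
		memcpy(&addr, p, sizeof(addr));
		memcpy(&lineno, p + sizeof(addr), sizeof(lineno));
		p += DEBUG_ENTRY_FIXED;

		nul = memchr(p, '\0', (size_t)(end - p));
		if (!nul)
			return -1;	/* unterminated file name */

		if (addr <= pc && (line < 0 || addr >= best_addr)) {
			best_addr = addr;
			line = lineno;
			*file = (const char *)p;
		}
		p = nul + 1;	/* next entry starts right after the name */
	}
	return line;
}
```

The entries are variable length because the file name is stored inline,
so they have to be walked sequentially rather than indexed.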
The dwarf code leverages the support from OProfile which is also released
under GPLv2. Copyright 2007 OProfile authors.
Signed-off-by: Stephane Eranian <eranian@google.com>
---
tools/perf/jvmti/jvmti_agent.c | 32 +--
tools/perf/jvmti/jvmti_agent.h | 11 +-
tools/perf/jvmti/libjvmti.c | 122 ++++++++-
tools/perf/util/Build | 1 +
tools/perf/util/genelf.c | 15 +-
tools/perf/util/genelf.h | 6 +-
tools/perf/util/genelf_debug.c | 610 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/jitdump.c | 8 +-
8 files changed, 766 insertions(+), 39 deletions(-)
create mode 100644 tools/perf/util/genelf_debug.c
diff --git a/tools/perf/jvmti/jvmti_agent.c b/tools/perf/jvmti/jvmti_agent.c
index 7d96581..ca064be 100644
--- a/tools/perf/jvmti/jvmti_agent.c
+++ b/tools/perf/jvmti/jvmti_agent.c
@@ -373,20 +373,20 @@ jvmti_write_code(void *agent, char const *sym,
int
jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
- jvmtiAddrLocationMap const *map,
- jvmtiLineNumberEntry *li, jint num)
+ jvmti_line_info_t *li, int nr_lines)
{
- static const char *prev_str = "\xff";
struct jr_code_debug_info rec;
size_t sret, len, size, flen;
size_t padding_count;
+ uint64_t addr;
+ const char *fn = file;
FILE *fp = agent;
int i;
/*
* no entry to write
*/
- if (!num)
+ if (!nr_lines)
return 0;
if (!fp) {
@@ -400,17 +400,18 @@ jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
size = sizeof(rec);
rec.p.timestamp = perf_get_timestamp();
rec.code_addr = (uint64_t)(uintptr_t)code;
- rec.nr_entry = num;
+ rec.nr_entry = nr_lines;
/*
* on disk source line info layout:
* uint64_t : addr
* int : line number
+ * int : column discriminator
* file[] : source file name
* padding : pad to multiple of 8 bytes
*/
- size += num * (sizeof(uint64_t) + sizeof(int));
- size += flen + (num - 1) * 2;
+ size += nr_lines * sizeof(struct debug_entry);
+ size += flen * nr_lines;
/*
* pad to 8 bytes
*/
@@ -428,28 +429,27 @@ jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
if (sret != 1)
goto error;
- for (i = 0; i < num; i++) {
- uint64_t addr;
+ for (i = 0; i < nr_lines; i++) {
- addr = (uint64_t)map[i].start_address;
+ addr = (uint64_t)li[i].pc;
len = sizeof(addr);
sret = fwrite_unlocked(&addr, len, 1, fp);
if (sret != 1)
goto error;
- len = sizeof(int);
+ len = sizeof(li[0].line_number);
sret = fwrite_unlocked(&li[i].line_number, len, 1, fp);
if (sret != 1)
goto error;
- if (i == 0) {
- sret = fwrite_unlocked(file, flen, 1, fp);
- } else {
- sret = fwrite_unlocked(prev_str, 2, 1, fp);
- }
+ len = sizeof(li[0].discrim);
+ sret = fwrite_unlocked(&li[i].discrim, len, 1, fp);
if (sret != 1)
goto error;
+ sret = fwrite_unlocked(fn, flen, 1, fp);
+ if (sret != 1)
+ goto error;
}
if (padding_count)
sret = fwrite_unlocked(pad_bytes, padding_count, 1, fp);
diff --git a/tools/perf/jvmti/jvmti_agent.h b/tools/perf/jvmti/jvmti_agent.h
index 8251a1c..bedf5d0 100644
--- a/tools/perf/jvmti/jvmti_agent.h
+++ b/tools/perf/jvmti/jvmti_agent.h
@@ -11,16 +11,23 @@
extern "C" {
#endif
+typedef struct {
+ unsigned long pc;
+ int line_number;
+ int discrim; /* discriminator -- 0 for now */
+} jvmti_line_info_t;
+
void *jvmti_open(void);
int jvmti_close(void *agent);
int jvmti_write_code(void *agent, char const *symbol_name,
uint64_t vma, void const *code,
const unsigned int code_size);
+
int jvmti_write_debug_info(void *agent,
uint64_t code,
const char *file,
- jvmtiAddrLocationMap const *map,
- jvmtiLineNumberEntry *tab, jint nr);
+ jvmti_line_info_t *li,
+ int nr_lines);
#if defined(__cplusplus)
}
diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c
index 745f20c..6ee98b0 100644
--- a/tools/perf/jvmti/libjvmti.c
+++ b/tools/perf/jvmti/libjvmti.c
@@ -4,6 +4,7 @@
#include <stdlib.h>
#include <err.h>
#include <jvmti.h>
+#include <jvmticmlr.h>
#include <limits.h>
#include "jvmti_agent.h"
@@ -11,6 +12,100 @@
static int has_line_numbers;
void *jvmti_agent;
+static jvmtiError
+do_get_line_numbers(jvmtiEnv *jvmti, void *pc, jmethodID m, jint bci,
+ jvmti_line_info_t *tab, jint *nr)
+{
+ jint i, lines = 0;
+ jint nr_lines = 0;
+ jvmtiLineNumberEntry *loc_tab = NULL;
+ jvmtiError ret;
+
+ ret = (*jvmti)->GetLineNumberTable(jvmti, m, &nr_lines, &loc_tab);
+ if (ret != JVMTI_ERROR_NONE)
+ return ret;
+
+ for (i = 0; i < nr_lines; i++) {
+ if (loc_tab[i].start_location < bci) {
+ tab[lines].pc = (unsigned long)pc;
+ tab[lines].line_number = loc_tab[i].line_number;
+ tab[lines].discrim = 0; /* not yet used */
+ lines++;
+ } else {
+ break;
+ }
+ }
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)loc_tab);
+ *nr = lines;
+ return JVMTI_ERROR_NONE;
+}
+
+static jvmtiError
+get_line_numbers(jvmtiEnv *jvmti, const void *compile_info, jvmti_line_info_t **tab, int *nr_lines)
+{
+ const jvmtiCompiledMethodLoadRecordHeader *hdr;
+ jvmtiCompiledMethodLoadInlineRecord *rec;
+ jvmtiLineNumberEntry *lne = NULL;
+ PCStackInfo *c;
+ jint nr, ret;
+ int nr_total = 0;
+ int i, lines_total = 0;
+
+ if (!(tab && nr_lines))
+ return JVMTI_ERROR_NULL_POINTER;
+
+ /*
+ * Phase 1 -- get the number of lines necessary
+ */
+ for (hdr = compile_info; hdr != NULL; hdr = hdr->next) {
+ if (hdr->kind == JVMTI_CMLR_INLINE_INFO) {
+ rec = (jvmtiCompiledMethodLoadInlineRecord *)hdr;
+ for (i = 0; i < rec->numpcs; i++) {
+ c = rec->pcinfo + i;
+ nr = 0;
+ /*
+ * unfortunately, need a tab to get the number of lines!
+ */
+ ret = (*jvmti)->GetLineNumberTable(jvmti, c->methods[0], &nr, &lne);
+ if (ret == JVMTI_ERROR_NONE) {
+ /* free what was allocated for nothing */
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)lne);
+ nr_total += (int)nr;
+ }
+ }
+ }
+ }
+
+ if (nr_total == 0)
+ return JVMTI_ERROR_NOT_FOUND;
+
+ /*
+ * Phase 2 -- allocate big enough line table
+ */
+ *tab = malloc(nr_total * sizeof(**tab));
+ if (!*tab)
+ return JVMTI_ERROR_OUT_OF_MEMORY;
+
+ for (hdr = compile_info; hdr != NULL; hdr = hdr->next) {
+ if (hdr->kind == JVMTI_CMLR_INLINE_INFO) {
+ rec = (jvmtiCompiledMethodLoadInlineRecord *)hdr;
+ for (i = 0; i < rec->numpcs; i++) {
+ c = rec->pcinfo + i;
+ nr = 0;
+ ret = do_get_line_numbers(jvmti, c->pc,
+ c->methods[0],
+ c->bcis[0],
+ *tab + lines_total,
+ &nr);
+ if (ret == JVMTI_ERROR_NONE)
+ lines_total += nr;
+ }
+ }
+ }
+ *nr_lines = lines_total;
+ return JVMTI_ERROR_NONE;
+}
+
static void JNICALL
compiled_method_load_cb(jvmtiEnv *jvmti,
jmethodID method,
@@ -18,9 +113,9 @@ compiled_method_load_cb(jvmtiEnv *jvmti,
void const *code_addr,
jint map_length,
jvmtiAddrLocationMap const *map,
- void const *compile_info __unused)
+ const void *compile_info)
{
- jvmtiLineNumberEntry *tab = NULL;
+ jvmti_line_info_t *line_tab = NULL;
jclass decl_class;
char *class_sign = NULL;
char *func_name = NULL;
@@ -29,7 +124,7 @@ compiled_method_load_cb(jvmtiEnv *jvmti,
char fn[PATH_MAX];
uint64_t addr = (uint64_t)(uintptr_t)code_addr;
jvmtiError ret;
- jint nr_lines = 0;
+ int nr_lines = 0; /* in line_tab[] */
size_t len;
ret = (*jvmti)->GetMethodDeclaringClass(jvmti, method,
@@ -40,19 +135,19 @@ compiled_method_load_cb(jvmtiEnv *jvmti,
}
if (has_line_numbers && map && map_length) {
-
- ret = (*jvmti)->GetLineNumberTable(jvmti, method, &nr_lines, &tab);
+ ret = get_line_numbers(jvmti, compile_info, &line_tab, &nr_lines);
if (ret != JVMTI_ERROR_NONE) {
warnx("jvmti: cannot get line table for method");
- } else {
- ret = (*jvmti)->GetSourceFileName(jvmti, decl_class, &file_name);
- if (ret != JVMTI_ERROR_NONE) {
- warnx("jvmti: cannot get source filename ret=%d", ret);
- nr_lines = 0;
- }
+ nr_lines = 0;
}
}
+ ret = (*jvmti)->GetSourceFileName(jvmti, decl_class, &file_name);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: cannot get source filename ret=%d", ret);
+ goto error;
+ }
+
ret = (*jvmti)->GetClassSignature(jvmti, decl_class,
&class_sign, NULL);
if (ret != JVMTI_ERROR_NONE) {
@@ -92,13 +187,14 @@ compiled_method_load_cb(jvmtiEnv *jvmti,
/*
* write source line info record if we have it
*/
- if (jvmti_write_debug_info(jvmti_agent, addr, fn, map, tab, nr_lines))
+ if (jvmti_write_debug_info(jvmti_agent, addr, fn, line_tab, nr_lines))
warnx("jvmti: write_debug_info() failed");
len = strlen(func_name) + strlen(class_sign) + strlen(func_sign) + 2;
{
char str[len];
snprintf(str, len, "%s%s%s", class_sign, func_name, func_sign);
+
if (jvmti_write_code(jvmti_agent, str, addr, code_addr, code_size))
warnx("jvmti: write_code() failed");
}
@@ -106,8 +202,8 @@ compiled_method_load_cb(jvmtiEnv *jvmti,
(*jvmti)->Deallocate(jvmti, (unsigned char *)func_name);
(*jvmti)->Deallocate(jvmti, (unsigned char *)func_sign);
(*jvmti)->Deallocate(jvmti, (unsigned char *)class_sign);
- (*jvmti)->Deallocate(jvmti, (unsigned char *)tab);
(*jvmti)->Deallocate(jvmti, (unsigned char *)file_name);
+ free(line_tab);
}
static void JNICALL
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 9fd4906..b6f8c96 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -110,6 +110,7 @@ libperf-$(CONFIG_LZMA) += lzma.o
libperf-y += demangle-java.o
libperf-y += jitdump.o
libperf-y += genelf.o
+libperf-y += genelf_debug.o
CFLAGS_config.o += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
CFLAGS_exec_cmd.o += -DPERF_EXEC_PATH="BUILD_STR($(perfexecdir_SQ))" -DPREFIX="BUILD_STR($(prefix_SQ))"
diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
index 3832cd0..6ce8cf3 100644
--- a/tools/perf/util/genelf.c
+++ b/tools/perf/util/genelf.c
@@ -156,7 +156,8 @@ gen_build_id(struct buildid_note *note, unsigned long load_addr, const void *cod
*/
int
jit_write_elf(int fd, uint64_t load_addr, const char *sym,
- const void *code, int csize)
+ const void *code, int csize,
+ void *debug, int nr_debug_entries)
{
Elf *e;
Elf_Data *d;
@@ -384,9 +385,15 @@ jit_write_elf(int fd, uint64_t load_addr, const char *sym,
shdr->sh_size = sizeof(bnote);
shdr->sh_entsize = 0;
- if (elf_update(e, ELF_C_WRITE) < 0) {
- warnx("elf_update 4 failed");
- goto error;
+ if (debug && nr_debug_entries) {
+ retval = jit_add_debug_info(e, load_addr, debug, nr_debug_entries);
+ if (retval)
+ goto error;
+ } else {
+ if (elf_update(e, ELF_C_WRITE) < 0) {
+ warnx("elf_update 4 failed");
+ goto error;
+ }
}
retval = 0;
diff --git a/tools/perf/util/genelf.h b/tools/perf/util/genelf.h
index 79b89e2..957e3de 100644
--- a/tools/perf/util/genelf.h
+++ b/tools/perf/util/genelf.h
@@ -3,7 +3,11 @@
/* genelf.c */
extern int jit_write_elf(int fd, uint64_t code_addr, const char *sym,
- const void *code, int csize);
+ const void *code, int csize,
+ void *debug, int nr_debug_entries);
+/* genelf_debug.c */
+extern int jit_add_debug_info(Elf *e, uint64_t code_addr,
+ void *debug, int nr_debug_entries);
#if defined(__arm__)
#define GEN_ELF_ARCH EM_ARM
diff --git a/tools/perf/util/genelf_debug.c b/tools/perf/util/genelf_debug.c
new file mode 100644
index 0000000..1ff5a46
--- /dev/null
+++ b/tools/perf/util/genelf_debug.c
@@ -0,0 +1,610 @@
+/*
+ * genelf_debug.c
+ * Copyright (C) 2015, Google, Inc
+ *
+ * Contributed by:
+ * Stephane Eranian <eranian@google.com>
+ *
+ * Released under the GPL v2.
+ *
+ * based on GPLv2 source code from Oprofile
+ * @remark Copyright 2007 OProfile authors
+ * @author Philippe Elie
+ */
+#include <sys/types.h>
+#include <stdio.h>
+#include <getopt.h>
+#include <stddef.h>
+#include <libelf.h>
+#include <string.h>
+#include <stdlib.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <err.h>
+#include <dwarf.h>
+
+#include "perf.h"
+#include "genelf.h"
+#include "../util/jitdump.h"
+
+#define BUFFER_EXT_DFL_SIZE (4 * 1024)
+
+typedef uint32_t uword;
+typedef uint16_t uhalf;
+typedef int32_t sword;
+typedef int16_t shalf;
+typedef uint8_t ubyte;
+typedef int8_t sbyte;
+
+struct buffer_ext {
+ size_t cur_pos;
+ size_t max_sz;
+ void *data;
+};
+
+static void
+buffer_ext_dump(struct buffer_ext *be, const char *msg)
+{
+ size_t i;
+ warnx("DUMP for %s", msg);
+ for (i = 0 ; i < be->cur_pos; i++)
+ warnx("%4zu 0x%02x", i, (((char *)be->data)[i]) & 0xff);
+}
+
+static inline int
+buffer_ext_add(struct buffer_ext *be, void *addr, size_t sz)
+{
+ void *tmp;
+ size_t be_sz = be->max_sz;
+
+retry:
+ if ((be->cur_pos + sz) < be_sz) {
+ memcpy(be->data + be->cur_pos, addr, sz);
+ be->cur_pos += sz;
+ return 0;
+ }
+
+ if (!be_sz)
+ be_sz = BUFFER_EXT_DFL_SIZE;
+ else
+ be_sz <<= 1;
+
+ tmp = realloc(be->data, be_sz);
+ if (!tmp)
+ return -1;
+
+ be->data = tmp;
+ be->max_sz = be_sz;
+
+ goto retry;
+}
+
+static void
+buffer_ext_init(struct buffer_ext *be)
+{
+ be->data = NULL;
+ be->cur_pos = 0;
+ be->max_sz = 0;
+}
+
+static inline size_t
+buffer_ext_size(struct buffer_ext *be)
+{
+ return be->cur_pos;
+}
+
+static inline void *
+buffer_ext_addr(struct buffer_ext *be)
+{
+ return be->data;
+}
+
+struct debug_line_header {
+ // Not counting this field
+ uword total_length;
+ // version number (2 currently)
+ uhalf version;
+ // relative offset from next field to
+ // program statement
+ uword prolog_length;
+ ubyte minimum_instruction_length;
+ ubyte default_is_stmt;
+ // line_base - see DWARF 2 specs
+ sbyte line_base;
+ // line_range - see DWARF 2 specs
+ ubyte line_range;
+ // number of opcode + 1
+ ubyte opcode_base;
+ /* follow the array of opcode args nr: ubytes [nr_opcode_base] */
+ /* follow the search directories index, zero terminated string
+ * terminated by an empty string.
+ */
+ /* follow an array of { filename, LEB128, LEB128, LEB128 }, first is
+ * the directory index entry, 0 means current directory, then mtime
+ * and filesize, last entry is followed by an empty string.
+ */
+ /* follow the first program statement */
+} __attribute__((packed));
+
+/* The DWARF 2 spec describes only one compilation unit header format, while
+ * binutils can handle two flavours of DWARF 2, 32-bit and 64-bit. This is not
+ * related to the target arch: an ELF 32 can hold more than 4 GB of debug
+ * information. For now we handle only the 32-bit DWARF 2 comp unit. It'll only
+ * become a problem if we generate more than 4GB of debug information.
+ */
+struct compilation_unit_header {
+ uword total_length;
+ uhalf version;
+ uword debug_abbrev_offset;
+ ubyte pointer_size;
+} __attribute__((packed));
+
+#define DW_LNS_num_opcode (DW_LNS_set_isa + 1)
+
+/* field filled at run time are marked with -1 */
+static struct debug_line_header const default_debug_line_header = {
+ .total_length = -1,
+ .version = 2,
+ .prolog_length = -1,
+ .minimum_instruction_length = 1, /* could be better when min instruction size != 1 */
+ .default_is_stmt = 1, /* we don't take care about basic block */
+ .line_base = -5, /* sensible value for line base ... */
+ .line_range = -14, /* ... and line range are guessed statically */
+ .opcode_base = DW_LNS_num_opcode
+};
+
+static ubyte standard_opcode_length[] =
+{
+ 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1
+};
+#if 0
+{
+ [DW_LNS_advance_pc] = 1,
+ [DW_LNS_advance_line] = 1,
+ [DW_LNS_set_file] = 1,
+ [DW_LNS_set_column] = 1,
+ [DW_LNS_fixed_advance_pc] = 1,
+ [DW_LNS_set_isa] = 1,
+};
+#endif
+
+/* field filled at run time are marked with -1 */
+static struct compilation_unit_header default_comp_unit_header = {
+ .total_length = -1,
+ .version = 2,
+ .debug_abbrev_offset = 0, /* we reuse the same abbrev entries for all comp unit */
+ .pointer_size = sizeof(void *)
+};
+
+static void emit_uword(struct buffer_ext *be, uword data)
+{
+ buffer_ext_add(be, &data, sizeof(uword));
+}
+
+static void emit_string(struct buffer_ext *be, const char *s)
+{
+ buffer_ext_add(be, (void *)s, strlen(s) + 1);
+}
+
+static void emit_unsigned_LEB128(struct buffer_ext *be,
+ unsigned long data)
+{
+ do {
+ ubyte cur = data & 0x7F;
+ data >>= 7;
+ if (data)
+ cur |= 0x80;
+ buffer_ext_add(be, &cur, 1);
+ } while (data);
+}
+
+static void emit_signed_LEB128(struct buffer_ext *be, long data)
+{
+ int more = 1;
+ int negative = data < 0;
+ int size = sizeof(long) * CHAR_BIT;
+ while (more) {
+ ubyte cur = data & 0x7F;
+ data >>= 7;
+ if (negative)
+ data |= -(1L << (size - 7));
+ if ((data == 0 && !(cur & 0x40)) ||
+ (data == -1l && (cur & 0x40)))
+ more = 0;
+ else
+ cur |= 0x80;
+ buffer_ext_add(be, &cur, 1);
+ }
+}
+
+static void emit_extended_opcode(struct buffer_ext *be, ubyte opcode,
+ void *data, size_t data_len)
+{
+ buffer_ext_add(be, (char *)"", 1);
+
+ emit_unsigned_LEB128(be, data_len + 1);
+
+ buffer_ext_add(be, &opcode, 1);
+ buffer_ext_add(be, data, data_len);
+}
+
+static void emit_opcode(struct buffer_ext *be, ubyte opcode)
+{
+ buffer_ext_add(be, &opcode, 1);
+}
+
+static void emit_opcode_signed(struct buffer_ext *be,
+ ubyte opcode, long data)
+{
+ buffer_ext_add(be, &opcode, 1);
+ emit_signed_LEB128(be, data);
+}
+
+static void emit_opcode_unsigned(struct buffer_ext *be, ubyte opcode,
+ unsigned long data)
+{
+ buffer_ext_add(be, &opcode, 1);
+ emit_unsigned_LEB128(be, data);
+}
+
+static void emit_advance_pc(struct buffer_ext *be, unsigned long delta_pc)
+{
+ emit_opcode_unsigned(be, DW_LNS_advance_pc, delta_pc);
+}
+
+static void emit_advance_lineno(struct buffer_ext *be, long delta_lineno)
+{
+ emit_opcode_signed(be, DW_LNS_advance_line, delta_lineno);
+}
+
+static void emit_lne_end_of_sequence(struct buffer_ext *be)
+{
+ emit_extended_opcode(be, DW_LNE_end_sequence, NULL, 0);
+}
+
+static void emit_set_file(struct buffer_ext *be, unsigned long index)
+{
+ emit_opcode_unsigned(be, DW_LNS_set_file, index);
+}
+
+static void emit_lne_define_filename(struct buffer_ext *be,
+ const char *filename)
+{
+ buffer_ext_add(be, (void *)"", 1);
+
+ /* LNE field, strlen(filename) + zero termination, 3 bytes for: the dir entry, timestamp, filesize */
+ emit_unsigned_LEB128(be, strlen(filename) + 5);
+ emit_opcode(be, DW_LNE_define_file);
+ emit_string(be, filename);
+ /* directory index 0=do not know */
+ emit_unsigned_LEB128(be, 0);
+ /* last modification date on file 0=do not know */
+ emit_unsigned_LEB128(be, 0);
+ /* filesize 0=do not know */
+ emit_unsigned_LEB128(be, 0);
+}
+
+static void emit_lne_set_address(struct buffer_ext *be,
+ void *address)
+{
+ emit_extended_opcode(be, DW_LNE_set_address, &address, sizeof(unsigned long));
+}
+
+static ubyte get_special_opcode(struct debug_entry *ent,
+ unsigned int last_line,
+ unsigned long last_vma)
+{
+ unsigned int temp;
+ unsigned long delta_addr;
+
+ /*
+ * delta from line_base
+ */
+ temp = (ent->lineno - last_line) - default_debug_line_header.line_base;
+
+ if (temp >= default_debug_line_header.line_range)
+ return 0;
+
+ /*
+ * delta of addresses
+ */
+ delta_addr = (ent->addr - last_vma) / default_debug_line_header.minimum_instruction_length;
+
+ /* This is not sufficient to ensure opcode will be in [0-256] but
+ * sufficient to ensure when summing with the delta lineno we will
+ * not overflow the unsigned long opcode */
+
+ if (delta_addr <= 256 / default_debug_line_header.line_range) {
+ unsigned long opcode = temp +
+ (delta_addr * default_debug_line_header.line_range) +
+ default_debug_line_header.opcode_base;
+
+ return opcode <= 255 ? opcode : 0;
+ }
+ return 0;
+}
+
+static void emit_lineno_info(struct buffer_ext *be,
+ struct debug_entry *ent, size_t nr_entry,
+ unsigned long code_addr)
+{
+ size_t i;
+
+ /*
+ * Machine state at start of a statement program
+ * address = 0
+ * file = 1
+ * line = 1
+ * column = 0
+ * is_stmt = default_is_stmt as given in the debug_line_header
+ * basic block = 0
+ * end sequence = 0
+ */
+
+ /* start state of the state machine we take care of */
+ unsigned long last_vma = code_addr;
+ char const *cur_filename = NULL;
+ unsigned long cur_file_idx = 0;
+ int last_line = 1;
+
+ emit_lne_set_address(be, (void *)code_addr);
+
+ for (i = 0; i < nr_entry; i++, ent = debug_entry_next(ent)) {
+ int need_copy = 0;
+ ubyte special_opcode;
+
+ /*
+ * check if filename changed, if so add it
+ */
+ if (!cur_filename || strcmp(cur_filename, ent->name)) {
+ emit_lne_define_filename(be, ent->name);
+ cur_filename = ent->name;
+ emit_set_file(be, ++cur_file_idx);
+ need_copy = 1;
+ }
+
+ special_opcode = get_special_opcode(ent, last_line, last_vma);
+ if (special_opcode != 0) {
+ last_line = ent->lineno;
+ last_vma = ent->addr;
+ emit_opcode(be, special_opcode);
+ } else {
+ /*
+ * lines differ, emit line delta
+ */
+ if (last_line != ent->lineno) {
+ emit_advance_lineno(be, ent->lineno - last_line);
+ last_line = ent->lineno;
+ need_copy = 1;
+ }
+ /*
+ * addresses differ, emit address delta
+ */
+ if (last_vma != ent->addr) {
+ emit_advance_pc(be, ent->addr - last_vma);
+ last_vma = ent->addr;
+ need_copy = 1;
+ }
+ /*
+ * add new row to matrix
+ */
+ if (need_copy)
+ emit_opcode(be, DW_LNS_copy);
+ }
+ }
+}
+
+static void add_debug_line(struct buffer_ext *be,
+ struct debug_entry *ent, size_t nr_entry,
+ unsigned long code_addr)
+{
+ struct debug_line_header * dbg_header;
+ size_t old_size;
+
+ old_size = buffer_ext_size(be);
+
+ buffer_ext_add(be, (void *)&default_debug_line_header,
+ sizeof(default_debug_line_header));
+
+ buffer_ext_add(be, &standard_opcode_length, sizeof(standard_opcode_length));
+
+ // empty directory entry
+ buffer_ext_add(be, (void *)"", 1);
+
+ // empty filename directory
+ buffer_ext_add(be, (void *)"", 1);
+
+ dbg_header = buffer_ext_addr(be) + old_size;
+ dbg_header->prolog_length = (buffer_ext_size(be) - old_size) -
+ offsetof(struct debug_line_header, minimum_instruction_length);
+
+ emit_lineno_info(be, ent, nr_entry, code_addr);
+
+ emit_lne_end_of_sequence(be);
+
+ dbg_header = buffer_ext_addr(be) + old_size;
+ dbg_header->total_length = (buffer_ext_size(be) - old_size) -
+ offsetof(struct debug_line_header, version);
+}
+
+static void
+add_debug_abbrev(struct buffer_ext *be)
+{
+ emit_unsigned_LEB128(be, 1);
+ emit_unsigned_LEB128(be, DW_TAG_compile_unit);
+ emit_unsigned_LEB128(be, DW_CHILDREN_yes);
+ emit_unsigned_LEB128(be, DW_AT_stmt_list);
+ emit_unsigned_LEB128(be, DW_FORM_data4);
+ emit_unsigned_LEB128(be, 0);
+ emit_unsigned_LEB128(be, 0);
+ emit_unsigned_LEB128(be, 0);
+}
+
+static void
+add_compilation_unit(struct buffer_ext *be,
+ size_t offset_debug_line)
+{
+ struct compilation_unit_header *comp_unit_header;
+ size_t old_size = buffer_ext_size(be);
+
+ buffer_ext_add(be, &default_comp_unit_header,
+ sizeof(default_comp_unit_header));
+
+ emit_unsigned_LEB128(be, 1);
+ emit_uword(be, offset_debug_line);
+
+ comp_unit_header = buffer_ext_addr(be) + old_size;
+ comp_unit_header->total_length = (buffer_ext_size(be) - old_size) -
+ offsetof(struct compilation_unit_header, version);
+}
+
+static int
+jit_process_debug_info(uint64_t code_addr,
+ void *debug, int nr_debug_entries,
+ struct buffer_ext *dl,
+ struct buffer_ext *da,
+ struct buffer_ext *di)
+{
+ struct debug_entry *ent = debug;
+ int i;
+
+ for (i = 0; i < nr_debug_entries; i++) {
+ ent->addr = ent->addr - code_addr;
+ ent = debug_entry_next(ent);
+ }
+ add_compilation_unit(di, buffer_ext_size(dl));
+ add_debug_line(dl, debug, nr_debug_entries, 0);
+ add_debug_abbrev(da);
+ if (0) buffer_ext_dump(da, "abbrev");
+
+ return 0;
+}
+
+int
+jit_add_debug_info(Elf *e, uint64_t code_addr, void *debug, int nr_debug_entries)
+{
+ Elf_Data *d;
+ Elf_Scn *scn;
+ Elf_Shdr *shdr;
+ struct buffer_ext dl, di, da;
+ int ret;
+
+ buffer_ext_init(&dl);
+ buffer_ext_init(&di);
+ buffer_ext_init(&da);
+
+ ret = jit_process_debug_info(code_addr, debug, nr_debug_entries, &dl, &da, &di);
+ if (ret)
+ return -1;
+ /*
+ * setup .debug_line section
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ return -1;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ return -1;
+ }
+
+ d->d_align = 1;
+ d->d_off = 0LL;
+ d->d_buf = buffer_ext_addr(&dl);
+ d->d_type = ELF_T_BYTE;
+ d->d_size = buffer_ext_size(&dl);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ return -1;
+ }
+
+ shdr->sh_name = 52; /* .debug_line */
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_addr = 0; /* must be zero or == sh_offset -> dynamic object */
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup .debug_info section
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ return -1;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ return -1;
+ }
+
+ d->d_align = 1;
+ d->d_off = 0LL;
+ d->d_buf = buffer_ext_addr(&di);
+ d->d_type = ELF_T_BYTE;
+ d->d_size = buffer_ext_size(&di);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ return -1;
+ }
+
+ shdr->sh_name = 64; /* .debug_info */
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_addr = 0; /* must be zero or == sh_offset -> dynamic object */
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup .debug_abbrev section
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ return -1;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ return -1;
+ }
+
+ d->d_align = 1;
+ d->d_off = 0LL;
+ d->d_buf = buffer_ext_addr(&da);
+ d->d_type = ELF_T_BYTE;
+ d->d_size = buffer_ext_size(&da);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ return -1;
+ }
+
+ shdr->sh_name = 76; /* .debug_abbrev */
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_addr = 0; /* must be zero or == sh_offset -> dynamic object */
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = 0;
+
+ /*
+ * now we update the ELF image with all the sections
+ */
+ if (elf_update(e, ELF_C_WRITE) < 0) {
+ warnx("elf_update debug failed");
+ return -1;
+ }
+ return 0;
+}
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index e910130..42b967b 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -63,7 +63,9 @@ jit_emit_elf(char *filename,
const char *sym,
uint64_t code_addr,
const void *code,
- int csize)
+ int csize,
+ void *debug,
+ int nr_debug_entries)
{
int ret, fd;
@@ -76,7 +78,7 @@ jit_emit_elf(char *filename,
return -1;
}
- ret = jit_write_elf(fd, code_addr, sym, (const void *)code, csize);
+ ret = jit_write_elf(fd, code_addr, sym, (const void *)code, csize, debug, nr_debug_entries);
close(fd);
@@ -341,7 +343,7 @@ static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
size = PERF_ALIGN(size, sizeof(u64));
- ret = jit_emit_elf(filename, sym, addr, (const void *)code, csize);
+ ret = jit_emit_elf(filename, sym, addr, (const void *)code, csize, jd->debug_data, jd->nr_debug_entries);
if (jd->debug_data && jd->nr_debug_entries) {
free(jd->debug_data);
--
1.9.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
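For readers following genelf_debug.c above, the two encodings that the
.debug_line generator leans on can be sketched in isolation: unsigned LEB128
and the DWARF special opcode. This is a minimal sketch, not the patch's code.
The names uleb128_encode, special_opcode and special_opcode_decode are
hypothetical, and the constants (line_base = -5, line_range = 14,
opcode_base = 13) are assumed for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative header constants (assumed, not taken from the patch). */
enum { LINE_BASE = -5, LINE_RANGE = 14, OPCODE_BASE = 13 };

/*
 * Encode value as unsigned LEB128: 7 payload bits per byte, low bits first,
 * high bit set on every byte except the last. out must have room for 10 bytes.
 * Returns the number of bytes written.
 */
static size_t uleb128_encode(uint64_t value, uint8_t *out)
{
	size_t n = 0;

	assert(out != NULL);
	do {
		uint8_t byte = value & 0x7f;

		value >>= 7;
		if (value)
			byte |= 0x80;	/* more bytes follow */
		out[n++] = byte;
	} while (value);
	return n;
}

/*
 * Fold a (line advance, address advance) pair into one special opcode.
 * Returns 0 when the pair does not fit, so the caller has to fall back
 * to the standard opcodes.
 */
static uint8_t special_opcode(long line_delta, unsigned long addr_delta)
{
	long adj = line_delta - LINE_BASE;
	unsigned long opcode;

	if (adj < 0 || adj >= LINE_RANGE)
		return 0;
	if (addr_delta > (unsigned long)((255 - OPCODE_BASE) / LINE_RANGE))
		return 0;
	opcode = (unsigned long)adj + addr_delta * LINE_RANGE + OPCODE_BASE;
	return opcode <= 255 ? (uint8_t)opcode : 0;
}

/* Inverse mapping, as a consumer of the line program would apply it. */
static void special_opcode_decode(uint8_t opcode, long *line_delta,
				  unsigned long *addr_delta)
{
	unsigned int adj;

	assert(opcode >= OPCODE_BASE);
	adj = (unsigned int)opcode - OPCODE_BASE;
	*addr_delta = adj / LINE_RANGE;
	*line_delta = LINE_BASE + (long)(adj % LINE_RANGE);
}
```

With these constants, a line advance of 2 combined with an address advance
of 3 encodes as the single byte 62 (7 + 3 * 14 + 13). Pairs that do not fit
fall back to DW_LNS_advance_line and DW_LNS_advance_pc followed by
DW_LNS_copy, which is what emit_lineno_info() does.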
* Re: [PATCH v7 0/4] perf: add support for profiling jitted code
2015-10-01 6:45 [PATCH v7 0/4] perf: add support for profiling jitted code Stephane Eranian
` (4 preceding siblings ...)
2015-10-01 6:45 ` [PATCH v7 5/5] perf/jit: add source line info support Stephane Eranian
@ 2015-10-01 9:17 ` Peter Zijlstra
2015-10-01 9:39 ` Ingo Molnar
2015-10-04 19:52 ` Stephane Eranian
2015-10-01 22:45 ` Brendan Gregg
6 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2015-10-01 9:17 UTC (permalink / raw)
To: Stephane Eranian
Cc: linux-kernel, acme, mingo, ak, jolsa, namhyung, cel, dsahern,
adrian.hunter, johnmccutchan, brendan.d.gregg
On Thu, Oct 01, 2015 at 08:45:44AM +0200, Stephane Eranian wrote:
> This patch series extends perf record/report/annotate to enable
> profiling of jitted (just-in-time compiled) code. The current
> perf tool provides very limited support for profiling jitted
> code for some runtime environments. But the support is experimental
> and cannot be used in complex environments. It relies on files
> in /tmp, for instance. It does not support annotate mode or
> rejitted code.
>
> This patch series adds a better way of profiling jitted code
> with the following advantages:
> - support any jitted code environment (some with modifications)
> - support Java runtime with JVMTI interface with no modifications
> - provides a portable JVMTI agent library
> - known to support V8 runtime
> - known to support DART runtime
> - supports code rejitting and code movements
> - no files in /tmp
> - meta-data file is unique to each run
> - no changes to perf report/annotate
> - support per-thread and system-wide profiling
> - support monitoring of multiple simultaneous Jit runtimes
> - source level view in perf annotate
>
> The support is based on cooperation with the runtime. For Java runtimes,
> supporting the JVMTI interface, there is no change necessary. For other
> runtimes, modifications are necessary to emit the meta-data to support
> symbolization, annotation, source lines correlation of the samples.
> Those modifications are relatively straightforward, some have been
> implemented in V8 and DART.
Do V8 and DART come with these bits or will that be a future
contribution?
> This will also generate an ELF image for each jitted function. The
> injected MMAP records will point to these ELF images. The reasoning
> behind using ELF images is that it makes processing for perf report
> and annotate automatic and transparent. It also makes it easier to
> package and analyze on a remote machine. Binutils tools can decode
> the ELF images easily.
The generation of ELF files is really nice!
All in all this looks really nice.
Thanks for doing this.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 0/4] perf: add support for profiling jitted code
2015-10-01 9:17 ` [PATCH v7 0/4] perf: add support for profiling jitted code Peter Zijlstra
@ 2015-10-01 9:39 ` Ingo Molnar
2015-10-01 16:13 ` Arnaldo Carvalho de Melo
2015-10-04 19:52 ` Stephane Eranian
1 sibling, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2015-10-01 9:39 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Stephane Eranian, linux-kernel, acme, mingo, ak, jolsa, namhyung,
cel, dsahern, adrian.hunter, johnmccutchan, brendan.d.gregg
* Peter Zijlstra <peterz@infradead.org> wrote:
> > This will also generate an ELF image for each jitted function. The injected
> > MMAP records will point to these ELF images. The reasoning behind using ELF
> > images is that it makes processing for perf report and annotate automatic and
> > transparent. It also makes it easier to package and analyze on a remote
> > machine. Binutils tools can decode the ELF images easily.
>
> The generation of ELF files is really nice!
>
> All in all this looks really nice.
>
> Thanks for doing this.
Seconded, nice stuff!
Ingo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 0/4] perf: add support for profiling jitted code
2015-10-01 9:39 ` Ingo Molnar
@ 2015-10-01 16:13 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-10-01 16:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Stephane Eranian, linux-kernel, mingo, ak, jolsa,
namhyung, cel, dsahern, adrian.hunter, johnmccutchan,
brendan.d.gregg
Em Thu, Oct 01, 2015 at 11:39:14AM +0200, Ingo Molnar escreveu:
> * Peter Zijlstra <peterz@infradead.org> wrote:
> > > This will also generate an ELF image for each jitted function. The injected
> > > MMAP records will point to these ELF images. The reasoning behind using ELF
> > > images is that it makes processing for perf report and annotate automatic and
> > > transparent. It also makes it easier to package and analyze on a remote
> > > machine. Binutils tools can decode the ELF images easily.
> > The generation of ELF files is really nice!
> > All in all this looks really nice.
> > Thanks for doing this.
> Seconded, nice stuff!
Thirded! ;-)
Gave it a quick look. The ELF generation thing could later be librarized
somehow for other uses. The passing of perf_session for sample_type can
maybe use evsel->attr.sample_type instead, no?
Will try it after lunch :-)
- Arnaldo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 0/4] perf: add support for profiling jitted code
2015-10-01 6:45 [PATCH v7 0/4] perf: add support for profiling jitted code Stephane Eranian
` (5 preceding siblings ...)
2015-10-01 9:17 ` [PATCH v7 0/4] perf: add support for profiling jitted code Peter Zijlstra
@ 2015-10-01 22:45 ` Brendan Gregg
2015-10-04 20:05 ` Stephane Eranian
6 siblings, 1 reply; 14+ messages in thread
From: Brendan Gregg @ 2015-10-01 22:45 UTC (permalink / raw)
To: Stephane Eranian
Cc: LKML, acme, Peter Zijlstra, Ingo Molnar, Andi Kleen, Jiri Olsa,
Namhyung Kim, Rose Belcher, David Ahern, Adrian Hunter,
John Mccutchan
G'Day,
On Wed, Sep 30, 2015 at 11:45 PM, Stephane Eranian <eranian@google.com> wrote:
>
> This patch series extends perf record/report/annotate to enable
> profiling of jitted (just-in-time compiled) code. The current
> perf tool provides very limited support for profiling jitted
> code for some runtime environments. But the support is experimental
> and cannot be used in complex environments. It relies on files
> in /tmp, for instance. It does not support annotate mode or
> rejitted code.
>
> This patch series adds a better way of profiling jitted code
> with the following advantages:
> - support any jitted code environment (some with modifications)
> - support Java runtime with JVMTI interface with no modifications
> - provides a portable JVMTI agent library
> - known to support V8 runtime
> - known to support DART runtime
> - supports code rejitting and code movements
> - no files in /tmp
> - meta-data file is unique to each run
> - no changes to perf report/annotate
> - support per-thread and system-wide profiling
> - support monitoring of multiple simultaneous Jit runtimes
> - source level view in perf annotate
>
> The support is based on cooperation with the runtime. For Java runtimes,
> supporting the JVMTI interface, there is no change necessary. For other
> runtimes, modifications are necessary to emit the meta-data to support
> symbolization, annotation, source lines correlation of the samples.
> Those modifications are relatively straightforward, some have been
> implemented in V8 and DART.
>
> The jit environment emits a binary dump file which contains the jitted
> code (in raw format) and meta-data describing the mapping of functions.
> The binary format is documented in the jitdump.h header file. It is
> adapted from the OProfile jitdump format.
While this is impressive work, I don't think I'd use it very much, and
I wouldn't encourage others to either. Is it right that this approach
needs to be turned on from runtime start, and will constantly emit
timestamped JIT records? I'd use that as a backup for existing
techniques for perf_events and jitted runtimes.
Right now we (Netflix) can profile Java in production using
perf_events, using an on-demand symbol dump: the perf-map-agent
JVMTI[1]. This agent used to be loaded on startup, like this patchset,
which cost overhead: CPU, filesystem I/O, storage. The problem was
that we didn't know which of the tens of thousands of instances we'd
want to profile, so loading an always-on JIT symbol dumper on all
instances would cost resources. With on-demand, we only dump symbols
on the few instances needed. On-demand has caveats, of course, and
symbols can change during the profile. But so far that's not been a
problem.
So this seems like a lot of changes for what won't be our primary way
of using perf_events on Java or Node.js, which we currently can
profile with perf_events effectively. So long as we're aware of that.
Brendan
[1] http://techblog.netflix.com/2015/07/java-in-flames.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 0/4] perf: add support for profiling jitted code
2015-10-01 9:17 ` [PATCH v7 0/4] perf: add support for profiling jitted code Peter Zijlstra
2015-10-01 9:39 ` Ingo Molnar
@ 2015-10-04 19:52 ` Stephane Eranian
1 sibling, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-10-04 19:52 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, Arnaldo Carvalho de Melo, mingo@elte.hu, ak@linux.intel.com,
Jiri Olsa, Namhyung Kim, Rose Belcher, David Ahern, Adrian Hunter,
John Mccutchan, Brendan Gregg
Hi Peter,
On Thu, Oct 1, 2015 at 2:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Oct 01, 2015 at 08:45:44AM +0200, Stephane Eranian wrote:
>> This patch series extends perf record/report/annotate to enable
>> profiling of jitted (just-in-time compiled) code. The current
>> perf tool provides very limited support for profiling jitted
>> code for some runtime environments. But the support is experimental
>> and cannot be used in complex environments. It relies on files
>> in /tmp, for instance. It does not support annotate mode or
>> rejitted code.
>>
>> This patch series adds a better way of profiling jitted code
>> with the following advantages:
>> - support any jitted code environment (some with modifications)
>> - support Java runtime with JVMTI interface with no modifications
>> - provides a portable JVMTI agent library
>> - known to support V8 runtime
>> - known to support DART runtime
>> - supports code rejitting and code movements
>> - no files in /tmp
>> - meta-data file is unique to each run
>> - no changes to perf report/annotate
>> - support per-thread and system-wide profiling
>> - support monitoring of multiple simultaneous Jit runtimes
>> - source level view in perf annotate
>>
>> The support is based on cooperation with the runtime. For Java runtimes,
>> supporting the JVMTI interface, there is no change necessary. For other
>> runtimes, modifications are necessary to emit the meta-data to support
>> symbolization, annotation, source lines correlation of the samples.
>> Those modifications are relatively straightforward, some have been
>> implemented in V8 and DART.
>
> Do V8 and DART come with these bits or will that be a future
> contribution?
>
I think both had this working some time ago. I have not had a chance to
verify if this new version still works. I did not change the format of
the jitdump file, so if they use the same clock source for their timestamp
as perf record, it should still work. All of this hinges on the fact that
V8 & DART developers run with 4.1 to get your clock changes.
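To make the clock-source point concrete: the timestamps a runtime writes into
its jitdump records only line up with the perf samples if both sides read the
same clock. A minimal sketch of the runtime side, assuming both agree on
CLOCK_MONOTONIC (the helper name jit_timestamp_ns is hypothetical, and the
choice of clock is an assumption of this example):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: read the given clock and return nanoseconds.
 * Returns 0 if the clock cannot be read.
 */
static uint64_t jit_timestamp_ns(clockid_t clk)
{
	struct timespec ts;

	if (clock_gettime(clk, &ts) != 0)
		return 0;
	assert(ts.tv_sec >= 0);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

A runtime would call jit_timestamp_ns(CLOCK_MONOTONIC) when it emits each
record; if perf record samples with a different clock, the records cannot be
ordered against the samples.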
>> This will also generate an ELF image for each jitted function. The
>> injected MMAP records will point to these ELF images. The reasoning
>> behind using ELF images is that it makes processing for perf report
>> and annotate automatic and transparent. It also makes it easier to
>> package and analyze on a remote machine. Binutils tools can decode
>> the ELF images easily.
>
> The generation of ELF files is really nice!
>
> All in all this looks really nice.
>
Thanks!
> Thanks for doing this.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 0/4] perf: add support for profiling jitted code
2015-10-01 22:45 ` Brendan Gregg
@ 2015-10-04 20:05 ` Stephane Eranian
2015-10-09 20:51 ` Brendan Gregg
0 siblings, 1 reply; 14+ messages in thread
From: Stephane Eranian @ 2015-10-04 20:05 UTC (permalink / raw)
To: Brendan Gregg
Cc: LKML, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
Andi Kleen, Jiri Olsa, Namhyung Kim, Rose Belcher, David Ahern,
Adrian Hunter, John Mccutchan
Brendan,
On Thu, Oct 1, 2015 at 3:45 PM, Brendan Gregg <brendan.d.gregg@gmail.com> wrote:
> G'Day,
>
> On Wed, Sep 30, 2015 at 11:45 PM, Stephane Eranian <eranian@google.com> wrote:
>>
>> This patch series extends perf record/report/annotate to enable
>> profiling of jitted (just-in-time compiled) code. The current
>> perf tool provides very limited support for profiling jitted
>> code for some runtime environments. But the support is experimental
>> and cannot be used in complex environments. It relies on files
>> in /tmp, for instance. It does not support annotate mode or
>> rejitted code.
>>
>> This patch series adds a better way of profiling jitted code
>> with the following advantages:
>> - support any jitted code environment (some with modifications)
>> - support Java runtime with JVMTI interface with no modifications
>> - provides a portable JVMTI agent library
>> - known to support V8 runtime
>> - known to support DART runtime
>> - supports code rejitting and code movements
>> - no files in /tmp
>> - meta-data file is unique to each run
>> - no changes to perf report/annotate
>> - support per-thread and system-wide profiling
>> - support monitoring of multiple simultaneous Jit runtimes
>> - source level view in perf annotate
>>
>> The support is based on cooperation with the runtime. For Java runtimes,
>> supporting the JVMTI interface, there is no change necessary. For other
>> runtimes, modifications are necessary to emit the meta-data to support
>> symbolization, annotation, source lines correlation of the samples.
>> Those modifications are relatively straightforward, some have been
>> implemented in V8 and DART.
>>
>> The jit environment emits a binary dump file which contains the jitted
>> code (in raw format) and meta-data describing the mapping of functions.
>> The binary format is documented in the jitdump.h header file. It is
>> adapted from the OProfile jitdump format.
>
> While this is impressive work, I don't think I'd use it very much, and
> I wouldn't encourage others to either. Is it right that this approach
> needs to be turned on from runtime start, and will constantly emit
> timestamped JIT records? I'd use that as a backup for existing
> techniques for perf_events and jitted runtimes.
>
This boils down to when the JIT runtime emits the jitdump data.
In the case of OpenJDK, and given how I wrote the agent, it needs to
run from start to finish. In order for perf to have full visibility, it needs
to get info about all the code that has been jitted so far. This could be
done after start if there were a protocol to handle this in the
runtime, such as a signal. The runtime would just need to dump its current
state, including all the code. The rest of the support would work. In other
words, if there were a way to signal to the runtime that it is being monitored
and that it needs to dump its state, then everything would work. To avoid
generating more dumps, the runtime would also have to be informed that
monitoring has stopped.

> Right now we (Netflix) can profile Java in production using
> perf_events, using an on-demand symbol dump: the perf-map-agent
> JVMTI[1]. This agent used to be loaded on startup, like this patchset,
> which cost overhead: CPU, filesystem I/O, storage. The problem was
> that we didn't know which of the tens of thousands of instances we'd
> want to profile, so loading an always-on JIT symbol dumper on all
> instances would cost resources. With on-demand, we only dump symbols
> on the few instances needed. On-demand has caveats, of course, and
> symbols can change during the profile. But so far that's not been a
> problem.
>
I am guessing that for what you are doing, getting the jitted code symbol
names is sufficient. This patch series is not really needed for that; you
already have the /tmp/perf-map approach, which mostly serves user program
developers. This series addresses the needs of the other category, namely
the JIT compiler and runtime developers who need to study the quality of the
generated code. For that, they need to see the profile of the jitted native
code. This is the major value-add of this series. I would also add the
support for re-jitting and code movements.

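For reference, the /tmp/perf-map approach relies on a plain-text file,
/tmp/perf-<pid>.map, where each line gives a start address and a size in hex
(without 0x) followed by the symbol name. A minimal sketch of formatting one
entry, with hypothetical function names:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Build the per-process map path, /tmp/perf-<pid>.map, into buf. */
static int perf_map_path(char *buf, size_t len, pid_t pid)
{
	return snprintf(buf, len, "/tmp/perf-%d.map", (int)pid);
}

/* Format one entry: "<start hex> <size hex> <symbol>\n". */
static int perf_map_format(char *buf, size_t len, uint64_t start,
			   uint64_t size, const char *sym)
{
	return snprintf(buf, len, "%" PRIx64 " %" PRIx64 " %s\n",
			start, size, sym);
}
```

Each entry carries only an address range and a name: no code bytes and no
timestamp, which is why this format cannot support annotate or rejitted code.
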
> So this seems like a lot of changes for what won't be our primary way
> of using perf_events on Java or Node.js, which we currently can
> profile with perf_events effectively. So long as we're aware of that.
>
The changes to perf are small. There are no changes to perf record, report,
or annotate, just the injection code with the jitdump reader and ELF creation.

> Brendan
>
> [1] http://techblog.netflix.com/2015/07/java-in-flames.html
* Re: [PATCH v7 3/5] perf inject: add jitdump mmap injection support
2015-10-01 6:45 ` [PATCH v7 3/5] perf inject: add jitdump mmap injection support Stephane Eranian
@ 2015-10-09 13:17 ` Adrian Hunter
0 siblings, 0 replies; 14+ messages in thread
From: Adrian Hunter @ 2015-10-09 13:17 UTC (permalink / raw)
To: Stephane Eranian, linux-kernel
Cc: acme, peterz, mingo, ak, jolsa, namhyung, cel, dsahern,
johnmccutchan, brendan.d.gregg
On 01/10/15 09:45, Stephane Eranian wrote:
> This patch adds a --jit option to perf inject.
>
> This options injects MMAP records into the perf.data
> file to cover the jitted code mmaps. It also emits
> ELF images for each function in the jidump file.
> Those images are created where the jitdump file is.
> The MMAP records point to that location as well.
>
> Typical flow:
> $ perf record -k mono -- java -agentpath:libpjvmti.so java_class
> $ perf inject --jit -i perf.data -o perf.data.jitted
> $ perf report -i perf.data.jitted
>
> Note that jitdump.h support is not limited to Java, it works with
> any jitted environment modified to emit the jitdump file format,
> include those where code can be jitted multiple times and moved
> around.
>
> The jitdump.h format is adapted from the Oprofile project.
>
> The genelf.c (ELF binary generation) depends on MD5 hash
> encoding for the buildid. To enable this, libssl-dev must
> be installed. If not, then genelf.c defaults to using
> urandom to generate the buildid, which is not ideal.
> The Makefile auto-detects the presence on libssl-dev.
>
> This version mmaps the jitdump file to create a marker
> MMAP record in the perf.data file. The marker is used to detect
> jitdump and cause perf inject to inject the jitted mmaps and
> generate ELF images for jitted functions.
>
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
<SNIP>
> +static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
> +{
> + struct perf_sample sample;
> + union perf_event *event;
> + struct perf_tool *tool = jd->session->tool;
> + uint64_t code, addr;
> + char *filename;
> + struct stat st;
> + size_t size;
> + u16 idr_size;
> + const char *sym;
> + uint32_t count;
> + int ret, csize;
> + pid_t pid, tid;
> + struct {
> + u32 pid, tid;
> + u64 time;
> + } *id;
> +
> + pid = jr->load.pid;
> + tid = jr->load.tid;
> + csize = jr->load.code_size;
> + addr = jr->load.code_addr;
> + sym = (void *)((unsigned long)jr + sizeof(jr->load));
> + code = (unsigned long)jr + jr->load.p.total_size - csize;
> + count = jr->load.code_index;
> + idr_size = jd->machine->id_hdr_size;
> + /*
> + * +16 to account for sample_id_all (hack)
> + */
> + event = calloc(1, sizeof(*event) + idr_size);
> + if (!event)
> + return -1;
> +
> + filename = event->mmap2.filename;
> + size = snprintf(filename, PATH_MAX, "%s/jitted-%d-%u.so",
> + jd->dir,
> + pid,
> + count);
> +
> + size++; /* for \0 */
> +
> + size = PERF_ALIGN(size, sizeof(u64));
> +
> + ret = jit_emit_elf(filename, sym, addr, (const void *)code, csize);
> +
> + if (jd->debug_data && jd->nr_debug_entries) {
> + free(jd->debug_data);
> + jd->debug_data = NULL;
> + jd->nr_debug_entries = 0;
> + }
> +
> + if (ret) {
> + free(event);
> + return -1;
> + }
> + if (stat(filename, &st))
> + memset(&st, 0, sizeof(stat));
> +
> + event->mmap2.header.type = PERF_RECORD_MMAP2;
> + event->mmap2.header.misc = PERF_RECORD_MISC_USER;
> + event->mmap2.header.size = (sizeof(event->mmap2) -
> + (sizeof(event->mmap2.filename) - size) + idr_size);
> +
> + event->mmap2.pgoff = 0;

I was testing this with Intel PT but getting decoding errors.
I think the problem is here, as described below, but I will look at it more
next week.

event->mmap2.pgoff = 0 says that the code running at 'addr' can be found at
offset 0 in the ELF file, which is not correct. The MMAP event has to
say where in the file the code is.

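To make the concern concrete: a sampled address is translated into a file
offset as ip - start + pgoff, and the code bytes are read from there. The
sketch below is not perf's code; the struct and function names are
hypothetical and only illustrate the arithmetic.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the fields an MMAP2 record carries. */
struct mmap_range {
	uint64_t start;	/* address where the code runs */
	uint64_t len;	/* length of the mapping */
	uint64_t pgoff;	/* offset of the mapped bytes within the file */
};

/*
 * Translate a sampled address into an offset in the backing file.
 * Returns 0 on success, -1 if ip lies outside the mapping.
 */
static int ip_to_file_offset(const struct mmap_range *m, uint64_t ip,
			     uint64_t *off)
{
	if (ip < m->start || ip - m->start >= m->len)
		return -1;
	*off = ip - m->start + m->pgoff;
	return 0;
}
```

With pgoff = 0, an address 0x10 bytes into the function maps to file offset
0x10, which falls inside the ELF header rather than the code. Setting pgoff to
the file offset of the code in the generated image would make the translation
land on the real instructions.
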
> + event->mmap2.start = addr;
> + event->mmap2.len = csize;
> + event->mmap2.pid = pid;
> + event->mmap2.tid = tid;
> + event->mmap2.ino = st.st_ino;
> + event->mmap2.maj = major(st.st_dev);
> + event->mmap2.min = minor(st.st_dev);
> + event->mmap2.prot = st.st_mode;
> + event->mmap2.flags = MAP_SHARED;
> + event->mmap2.ino_generation = 1;
> +
> + id = (void *)((unsigned long)event + event->mmap.header.size - idr_size);
> + if (jd->sample_type & PERF_SAMPLE_TID) {
> + id->pid = pid;
> + id->tid = tid;
> + }
> + if (jd->sample_type & PERF_SAMPLE_TIME)
> + id->time = jr->load.p.timestamp;
> +
> + /*
> + * create pseudo sample to induce dso hit increment
> + * use first address as sample address
> + */
> + memset(&sample, 0, sizeof(sample));
> + sample.pid = pid;
> + sample.tid = tid;
> + sample.time = id->time;
> + sample.ip = addr;
> +
> + ret = perf_event__process_mmap2(tool, event, &sample, jd->machine, jd->session);
> + if (ret)
> + return ret;
> +
> + ret = jit_inject_event(jd, event);
> + /*
> + * mark dso as use to generate buildid in the header
> + */
> + if (!ret)
> + build_id__mark_dso_hit(tool, event, &sample, NULL, jd->machine);
> +
> + return ret;
* Re: [PATCH v7 0/4] perf: add support for profiling jitted code
2015-10-04 20:05 ` Stephane Eranian
@ 2015-10-09 20:51 ` Brendan Gregg
0 siblings, 0 replies; 14+ messages in thread
From: Brendan Gregg @ 2015-10-09 20:51 UTC (permalink / raw)
To: Stephane Eranian
Cc: LKML, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
Andi Kleen, Jiri Olsa, Namhyung Kim, Rose Belcher, David Ahern,
Adrian Hunter, John Mccutchan
On Sun, Oct 4, 2015 at 1:05 PM, Stephane Eranian <eranian@google.com> wrote:
>
> Brendan,
>
> On Thu, Oct 1, 2015 at 3:45 PM, Brendan Gregg <brendan.d.gregg@gmail.com> wrote:
> > G'Day,
> >
> > On Wed, Sep 30, 2015 at 11:45 PM, Stephane Eranian <eranian@google.com> wrote:
> >>
> >> This patch series extends perf record/report/annotate to enable
> >> profiling of jitted (just-in-time compiled) code. The current
> >> perf tool provides very limited support for profiling jitted
> >> code for some runtime environments. But the support is experimental
> >> and cannot be used in complex environments. It relies on files
> >> in /tmp, for instance. It does not support annotate mode or
> >> rejitted code.
> >>
> >> This patch series adds a better way of profiling jitted code
> >> with the following advantages:
> >> - support any jitted code environment (some with modifications)
> >> - support Java runtime with JVMTI interface with no modifications
> >> - provides a portable JVMTI agent library
> >> - known to support V8 runtime
> >> - known to support DART runtime
> >> - supports code rejitting and code movements
> >> - no files in /tmp
> >> - meta-data file is unique to each run
> >> - no changes to perf report/annotate
> >> - support per-thread and system-wide profiling
> >> - support monitoring of multiple simultaneous Jit runtimes
> >> - source level view in perf annotate
> >>
> >> The support is based on cooperation with the runtime. For Java runtimes,
> >> supporting the JVMTI interface, there is no change necessary. For other
> >> runtimes, modifications are necessary to emit the meta-data to support
> >> symbolization, annotation, source lines correlation of the samples.
> >> Those modifications are relatively straighforward, some have been
> >> implemented in V8 and DART.
> >>
> >> The jit environment emits a binary dump file which contains the jitted
> >> code (in raw format) and meta-data describing the mapping of functions.
> >> The binary format is documented in the jitdump.h header file. It is
> >> adapted from the OProfile jitdump format.
> >
> > While this is impressive work, I don't think I'd use it very much, and
> > I wouldn't encourage others too either. Is it right that this approach
> > needs to be turned on from runtime start, and will constantly emit
> > timestamped JIT records? I'd use that as a backup for existing
> > techniques for perf_events and jitted runtimes.
> >
> This boils down that when does the JIT runtime emit the jitdump data?
> In the case of openJDK and given how I wrote the agent, it needs to
> run from start to finish. In order for perf to have full visibility, it needs
> to get info about all the code that has been jitted so far. This could be
> done after start if there was somehow a protocol to handle this in the
> runtime, like a signal. It would just need to dump the current status, including
> all the code. The rest of the support would work. In other words, if there was
> a way to signal to the runtime that it is being monitored and that it needs to
> dump its state, then everything would work. To avoid generating more dumps,
> the runtime would also have to be informed that monitoring has stopped.

OK, so it's possible that an on-demand approach could work, if the
agent can be modified. And since perf-map-agent made this transition
(https://github.com/jrudolph/perf-map-agent), examples of how to do
this are probably in its git history. :) The outcome could well be
something like:

  perf record -F 99 -a -g -- perf_java_agent --sleep 10

where a perf_java_agent tool attaches to all Java processes, dumps the
initial symbols, logs timestamped symbols during the run (--sleep 10), and
then detaches at the end. That would be ideal, and it just means changing
the Java agent, not the perf implementation.

Brendan