* [PATCH v2 0/4] perf inject improvements
From: Ian Rogers @ 2024-09-09 20:37 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Colin Ian King, Casey Chen,
Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel
Fix the existing build ID injection by adding sample IDs onto the
synthesized events. This correctly orders the events and addresses
issues such as a profiled executable being replaced during its
execution.
Add a new --mmap2-buildid-all option that rewrites all mmap events as
mmap2 events containing build IDs. This removes the need for build_id
events.
Add a new -B option that, like --mmap2-buildid-all, synthesizes mmap2
events with build IDs. With -B the behavior is lazy: an event is only
synthesized when a sample references the particular map. With
system-wide profiling, which synthesizes mmap events for all running
processes, the perf.data file savings can be greater than 50%.
Reduce the memory footprint of perf inject by avoiding creating
symbols in the callchain; the symbols aren't used during perf inject
and needlessly require loading DSOs.
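For reference, a rough usage sketch of the three injection styles
described above (the recorded workload and output paths are
illustrative, not taken from the patches):
```
# Record something system-wide with call chains.
perf record -g -a -o perf.data -- sleep 1

# Existing behaviour: inject build_id events for sampled DSOs.
perf inject -b -i perf.data -o perf.b.data

# New: rewrite every mmap/mmap2 event as an mmap2 event carrying a build ID.
perf inject --mmap2-buildid-all -i perf.data -o perf.all.data

# New: lazily emit build-ID mmap2 events only for maps referenced by samples.
perf inject -B -i perf.data -o perf.lazy.data

ls -l perf.data perf.b.data perf.all.data perf.lazy.data
```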
v2: Rename dso__inject* functions to tool__inject*, addressing feedback
from Arnaldo and a suggestion from Namhyung that the name should
reflect the first argument's type. Rebase, in particular over the
perf inject pipe mode fixes. Add Namhyung's Acked-by.
Ian Rogers (4):
perf inject: Fix build ID injection
perf inject: Add new mmap2-buildid-all option
perf inject: Lazy build-id mmap2 event insertion
perf callchain: Allow symbols to be optional when resolving a
callchain
tools/perf/builtin-inject.c | 304 +++++++++++++++++++++++-----
tools/perf/tests/shell/pipe_test.sh | 2 +
tools/perf/util/build-id.c | 6 +-
tools/perf/util/callchain.c | 8 +-
tools/perf/util/callchain.h | 2 +-
tools/perf/util/machine.c | 92 +++++----
tools/perf/util/machine.h | 33 ++-
tools/perf/util/map.c | 1 +
tools/perf/util/map.h | 11 +
tools/perf/util/synthetic-events.c | 101 +++++++--
tools/perf/util/synthetic-events.h | 21 +-
11 files changed, 468 insertions(+), 113 deletions(-)
--
2.46.0.598.g6f2099f65c-goog
* [PATCH v2 1/4] perf inject: Fix build ID injection
From: Ian Rogers @ 2024-09-09 20:37 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Colin Ian King, Casey Chen,
Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel
Build ID injection wasn't inserting a sample ID, and events were being
aligned to 64 bytes rather than 8 (NAME_ALIGN rather than
sizeof(u64)). Without a sample ID the events are unordered, so two
different build_id events for the same path, as happens when a file is
replaced, can't be differentiated.
Add sample ID insertion for the build_id events alongside some
refactoring. The refactoring better aligns the function arguments with
the different use cases, such as synthesizing build_id events without
needing a dso. The misc bits are passed explicitly because with
callchains the maps/dsos may span user and kernel space, so using
sample->cpumode isn't good enough.
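As a rough sanity check of the result (a sketch; the exact strings in
the raw dump can differ between perf versions), the synthesized
build_id events can be inspected via the raw trace dump and the
build-id list:
```
perf record -o perf.data -- true
perf inject -b -i perf.data -o perf.b.data
# Count build_id records in the raw event dump (dump format is version dependent).
perf report -D -i perf.b.data | grep -c BUILD_ID
# List the build IDs recorded in the injected file.
perf buildid-list -i perf.b.data
```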
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/builtin-inject.c | 170 ++++++++++++++++++++++-------
tools/perf/util/build-id.c | 6 +-
tools/perf/util/synthetic-events.c | 44 ++++++--
tools/perf/util/synthetic-events.h | 10 +-
4 files changed, 175 insertions(+), 55 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 0ccf80fe8399..24470c57527d 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -130,6 +130,7 @@ struct perf_inject {
struct perf_file_section secs[HEADER_FEAT_BITS];
struct guest_session guest_session;
struct strlist *known_build_ids;
+ const struct evsel *mmap_evsel;
};
struct event_entry {
@@ -138,8 +139,13 @@ struct event_entry {
union perf_event event[];
};
-static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
- struct machine *machine, u8 cpumode, u32 flags);
+static int tool__inject_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ const struct evsel *evsel,
+ __u16 misc,
+ const char *filename,
+ struct dso *dso, u32 flags);
static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
{
@@ -422,6 +428,28 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
return dso;
}
+/*
+ * The evsel used for the sample ID for mmap events. Typically stashed when
+ * processing mmap events. If not stashed, search the evlist for the first mmap
+ * gathering event.
+ */
+static const struct evsel *inject__mmap_evsel(struct perf_inject *inject)
+{
+ struct evsel *pos;
+
+ if (inject->mmap_evsel)
+ return inject->mmap_evsel;
+
+ evlist__for_each_entry(inject->session->evlist, pos) {
+ if (pos->core.attr.mmap) {
+ inject->mmap_evsel = pos;
+ return pos;
+ }
+ }
+ pr_err("No mmap events found\n");
+ return NULL;
+}
+
static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -469,12 +497,28 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
}
if (dso && !dso__hit(dso)) {
- dso__set_hit(dso);
- dso__inject_build_id(dso, tool, machine, sample->cpumode, flags);
+ struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
+
+ if (evsel) {
+ dso__set_hit(dso);
+ tool__inject_build_id(tool, sample, machine, evsel,
+ /*misc=*/sample->cpumode,
+ filename, dso, flags);
+ }
}
} else {
+ int err;
+
+ /*
+ * Remember the evsel for lazy build id generation. It is used
+ * for the sample id header type.
+ */
+ if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
+ !inject->mmap_evsel)
+ inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
+
/* Create the thread, map, etc. Not done for the unordered inject all case. */
- int err = perf_event_process(tool, event, sample, machine);
+ err = perf_event_process(tool, event, sample, machine);
if (err) {
dso__put(dso);
@@ -667,16 +711,20 @@ static bool perf_inject__lookup_known_build_id(struct perf_inject *inject,
return false;
}
-static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
- struct machine *machine, u8 cpumode, u32 flags)
+static int tool__inject_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ const struct evsel *evsel,
+ __u16 misc,
+ const char *filename,
+ struct dso *dso, u32 flags)
{
- struct perf_inject *inject = container_of(tool, struct perf_inject,
- tool);
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
int err;
- if (is_anon_memory(dso__long_name(dso)) || flags & MAP_HUGETLB)
+ if (is_anon_memory(filename) || flags & MAP_HUGETLB)
return 0;
- if (is_no_dso_memory(dso__long_name(dso)))
+ if (is_no_dso_memory(filename))
return 0;
if (inject->known_build_ids != NULL &&
@@ -684,24 +732,65 @@ static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
return 1;
if (dso__read_build_id(dso) < 0) {
- pr_debug("no build_id found for %s\n", dso__long_name(dso));
+ pr_debug("no build_id found for %s\n", filename);
return -1;
}
- err = perf_event__synthesize_build_id(tool, dso, cpumode,
- perf_event__repipe, machine);
+ err = perf_event__synthesize_build_id(tool, sample, machine,
+ perf_event__repipe,
+ evsel, misc, dso__bid(dso),
+ filename);
if (err) {
- pr_err("Can't synthesize build_id event for %s\n", dso__long_name(dso));
+ pr_err("Can't synthesize build_id event for %s\n", filename);
return -1;
}
return 0;
}
+static int mark_dso_hit(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ const struct evsel *mmap_evsel,
+ struct map *map, bool sample_in_dso)
+{
+ struct dso *dso;
+ u16 misc = sample->cpumode;
+
+ if (!map)
+ return 0;
+
+ if (!sample_in_dso) {
+ u16 guest_mask = PERF_RECORD_MISC_GUEST_KERNEL |
+ PERF_RECORD_MISC_GUEST_USER;
+
+ if ((misc & guest_mask) != 0) {
+ misc &= PERF_RECORD_MISC_HYPERVISOR;
+ misc |= __map__is_kernel(map)
+ ? PERF_RECORD_MISC_GUEST_KERNEL
+ : PERF_RECORD_MISC_GUEST_USER;
+ } else {
+ misc &= PERF_RECORD_MISC_HYPERVISOR;
+ misc |= __map__is_kernel(map)
+ ? PERF_RECORD_MISC_KERNEL
+ : PERF_RECORD_MISC_USER;
+ }
+ }
+ dso = map__dso(map);
+ if (dso && !dso__hit(dso)) {
+ dso__set_hit(dso);
+ tool__inject_build_id(tool, sample, machine,
+ mmap_evsel, misc, dso__long_name(dso), dso,
+ map__flags(map));
+ }
+ return 0;
+}
+
struct mark_dso_hit_args {
const struct perf_tool *tool;
+ struct perf_sample *sample;
struct machine *machine;
- u8 cpumode;
+ const struct evsel *mmap_evsel;
};
static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
@@ -709,16 +798,8 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
struct mark_dso_hit_args *args = data;
struct map *map = node->ms.map;
- if (map) {
- struct dso *dso = map__dso(map);
-
- if (dso && !dso__hit(dso)) {
- dso__set_hit(dso);
- dso__inject_build_id(dso, args->tool, args->machine,
- args->cpumode, map__flags(map));
- }
- }
- return 0;
+ return mark_dso_hit(args->tool, args->sample, args->machine,
+ args->mmap_evsel, map, /*sample_in_dso=*/false);
}
int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *event,
@@ -728,10 +809,16 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
{
struct addr_location al;
struct thread *thread;
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
struct mark_dso_hit_args args = {
.tool = tool,
+ /*
+ * Use the parsed sample data of the sample event, which will
+ * have a later timestamp than the mmap event.
+ */
+ .sample = sample,
.machine = machine,
- .cpumode = sample->cpumode,
+ .mmap_evsel = inject__mmap_evsel(inject),
};
addr_location__init(&al);
@@ -743,13 +830,8 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
}
if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
- struct dso *dso = map__dso(al.map);
-
- if (!dso__hit(dso)) {
- dso__set_hit(dso);
- dso__inject_build_id(dso, tool, machine,
- sample->cpumode, map__flags(al.map));
- }
+ mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
+ /*sample_in_dso=*/true);
}
sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
@@ -1159,17 +1241,27 @@ static int process_build_id(const struct perf_tool *tool,
static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_t machine_pid)
{
struct machine *machine = perf_session__findnew_machine(inject->session, machine_pid);
- u8 cpumode = dso__is_in_kernel_space(dso) ?
- PERF_RECORD_MISC_GUEST_KERNEL :
- PERF_RECORD_MISC_GUEST_USER;
+ struct perf_sample synth_sample = {
+ .pid = -1,
+ .tid = -1,
+ .time = -1,
+ .stream_id = -1,
+ .cpu = -1,
+ .period = 1,
+ .cpumode = dso__is_in_kernel_space(dso)
+ ? PERF_RECORD_MISC_GUEST_KERNEL
+ : PERF_RECORD_MISC_GUEST_USER,
+ };
if (!machine)
return -ENOMEM;
dso__set_hit(dso);
- return perf_event__synthesize_build_id(&inject->tool, dso, cpumode,
- process_build_id, machine);
+ return perf_event__synthesize_build_id(&inject->tool, &synth_sample, machine,
+ process_build_id, inject__mmap_evsel(inject),
+ /*misc=*/synth_sample.cpumode,
+ dso__bid(dso), dso__long_name(dso));
}
static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 451d145fa4ed..8982f68e7230 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -277,8 +277,8 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
struct perf_record_header_build_id b;
size_t len;
- len = name_len + 1;
- len = PERF_ALIGN(len, NAME_ALIGN);
+ len = sizeof(b) + name_len + 1;
+ len = PERF_ALIGN(len, sizeof(u64));
memset(&b, 0, sizeof(b));
memcpy(&b.data, bid->data, bid->size);
@@ -286,7 +286,7 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
misc |= PERF_RECORD_MISC_BUILD_ID_SIZE;
b.pid = pid;
b.header.misc = misc;
- b.header.size = sizeof(b) + len;
+ b.header.size = len;
err = do_write(fd, &b, sizeof(b));
if (err < 0)
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 0a7f93ae76fb..6bb62e4e2d5d 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2225,28 +2225,48 @@ int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, st
}
#endif
-int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc,
- perf_event__handler_t process, struct machine *machine)
+int perf_event__synthesize_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ perf_event__handler_t process,
+ const struct evsel *evsel,
+ __u16 misc,
+ const struct build_id *bid,
+ const char *filename)
{
union perf_event ev;
size_t len;
- if (!dso__hit(pos))
- return 0;
+ len = sizeof(ev.build_id) + strlen(filename) + 1;
+ len = PERF_ALIGN(len, sizeof(u64));
- memset(&ev, 0, sizeof(ev));
+ memset(&ev, 0, len);
- len = dso__long_name_len(pos) + 1;
- len = PERF_ALIGN(len, NAME_ALIGN);
- ev.build_id.size = min(dso__bid(pos)->size, sizeof(dso__bid(pos)->data));
- memcpy(&ev.build_id.build_id, dso__bid(pos)->data, ev.build_id.size);
+ ev.build_id.size = min(bid->size, sizeof(ev.build_id.build_id));
+ memcpy(ev.build_id.build_id, bid->data, ev.build_id.size);
ev.build_id.header.type = PERF_RECORD_HEADER_BUILD_ID;
ev.build_id.header.misc = misc | PERF_RECORD_MISC_BUILD_ID_SIZE;
ev.build_id.pid = machine->pid;
- ev.build_id.header.size = sizeof(ev.build_id) + len;
- memcpy(&ev.build_id.filename, dso__long_name(pos), dso__long_name_len(pos));
+ ev.build_id.header.size = len;
+ strcpy(ev.build_id.filename, filename);
+
+ if (evsel) {
+ void *array = &ev;
+ int ret;
- return process(tool, &ev, NULL, machine);
+ array += ev.header.size;
+ ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
+ if (ret < 0)
+ return ret;
+
+ if (ret & 7) {
+ pr_err("Bad id sample size %d\n", ret);
+ return -EINVAL;
+ }
+
+ ev.header.size += ret;
+ }
+ return process(tool, &ev, sample, machine);
}
int perf_event__synthesize_stat_events(struct perf_stat_config *config, const struct perf_tool *tool,
diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
index 31df7653677f..795bf3e18396 100644
--- a/tools/perf/util/synthetic-events.h
+++ b/tools/perf/util/synthetic-events.h
@@ -9,6 +9,7 @@
#include <perf/cpumap.h>
struct auxtrace_record;
+struct build_id;
struct dso;
struct evlist;
struct evsel;
@@ -45,7 +46,14 @@ typedef int (*perf_event__handler_t)(const struct perf_tool *tool, union perf_ev
int perf_event__synthesize_attrs(const struct perf_tool *tool, struct evlist *evlist, perf_event__handler_t process);
int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_attr *attr, u32 ids, u64 *id, perf_event__handler_t process);
-int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc, perf_event__handler_t process, struct machine *machine);
+int perf_event__synthesize_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ perf_event__handler_t process,
+ const struct evsel *evsel,
+ __u16 misc,
+ const struct build_id *bid,
+ const char *filename);
int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
--
2.46.0.598.g6f2099f65c-goog
* [PATCH v2 2/4] perf inject: Add new mmap2-buildid-all option
From: Ian Rogers @ 2024-09-09 20:37 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Colin Ian King, Casey Chen,
Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel
Add an option that allows all mmap and mmap2 events to be rewritten as
mmap2 events with build IDs. This is similar to the existing
-b/--build-ids and --buildid-all options, except that instead of adding
a build_id event, the existing mmap/mmap2 event is used as a template
and a new mmap2 event is synthesized from it. As mmap2 events are
typical anyway, this avoids inserting build_id events.
Add test coverage to the pipe test.
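A rough way to observe the rewrite (a sketch; raw dump strings may vary
between perf versions):
```
perf record -o perf.data -- true
perf inject --mmap2-buildid-all -i perf.data -o perf.mmap2.data
# Mappings should now appear as MMAP2 records carrying build IDs...
perf report -D -i perf.mmap2.data | grep -c MMAP2
# ...and no separate BUILD_ID records should be needed.
perf report -D -i perf.mmap2.data | grep -c BUILD_ID
```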
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/builtin-inject.c | 88 ++++++++++++++++++++++++++++-
tools/perf/tests/shell/pipe_test.sh | 1 +
tools/perf/util/synthetic-events.c | 57 +++++++++++++++++++
tools/perf/util/synthetic-events.h | 11 ++++
4 files changed, 154 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 24470c57527d..5a27fa46e93d 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -107,6 +107,7 @@ enum build_id_rewrite_style {
BID_RWS__NONE = 0,
BID_RWS__INJECT_HEADER_LAZY,
BID_RWS__INJECT_HEADER_ALL,
+ BID_RWS__MMAP2_BUILDID_ALL,
};
struct perf_inject {
@@ -146,6 +147,16 @@ static int tool__inject_build_id(const struct perf_tool *tool,
__u16 misc,
const char *filename,
struct dso *dso, u32 flags);
+static int tool__inject_mmap2_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ const struct evsel *evsel,
+ __u16 misc,
+ __u32 pid, __u32 tid,
+ __u64 start, __u64 len, __u64 pgoff,
+ struct dso *dso,
+ __u32 prot, __u32 flags,
+ const char *filename);
static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
{
@@ -161,6 +172,7 @@ static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
static int perf_event__repipe_synth(const struct perf_tool *tool,
union perf_event *event)
+
{
struct perf_inject *inject = container_of(tool, struct perf_inject,
tool);
@@ -454,7 +466,9 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine,
- __u32 pid, __u32 tid, __u32 flags,
+ __u32 pid, __u32 tid,
+ __u64 start, __u64 len, __u64 pgoff,
+ __u32 flags, __u32 prot,
const char *filename,
const struct dso_id *dso_id,
int (*perf_event_process)(const struct perf_tool *tool,
@@ -525,6 +539,26 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
return err;
}
}
+ if ((inject->build_id_style == BID_RWS__MMAP2_BUILDID_ALL) &&
+ !(event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID)) {
+ struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
+
+ if (evsel && !dso_sought) {
+ dso = findnew_dso(pid, tid, filename, dso_id, machine);
+ dso_sought = true;
+ }
+ if (evsel && dso &&
+ !tool__inject_mmap2_build_id(tool, sample, machine, evsel,
+ sample->cpumode | PERF_RECORD_MISC_MMAP_BUILD_ID,
+ pid, tid, start, len, pgoff,
+ dso,
+ prot, flags,
+ filename)) {
+ /* Injected mmap2 so no need to repipe. */
+ dso__put(dso);
+ return 0;
+ }
+ }
dso__put(dso);
return perf_event__repipe(tool, event, sample, machine);
}
@@ -536,7 +570,9 @@ static int perf_event__repipe_mmap(const struct perf_tool *tool,
{
return perf_event__repipe_common_mmap(
tool, event, sample, machine,
- event->mmap.pid, event->mmap.tid, /*flags=*/0,
+ event->mmap.pid, event->mmap.tid,
+ event->mmap.start, event->mmap.len, event->mmap.pgoff,
+ /*flags=*/0, PROT_EXEC,
event->mmap.filename, /*dso_id=*/NULL,
perf_event__process_mmap);
}
@@ -559,7 +595,9 @@ static int perf_event__repipe_mmap2(const struct perf_tool *tool,
return perf_event__repipe_common_mmap(
tool, event, sample, machine,
- event->mmap2.pid, event->mmap2.tid, event->mmap2.flags,
+ event->mmap2.pid, event->mmap2.tid,
+ event->mmap2.start, event->mmap2.len, event->mmap2.pgoff,
+ event->mmap2.flags, event->mmap2.prot,
event->mmap2.filename, dso_id,
perf_event__process_mmap2);
}
@@ -748,6 +786,45 @@ static int tool__inject_build_id(const struct perf_tool *tool,
return 0;
}
+static int tool__inject_mmap2_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ const struct evsel *evsel,
+ __u16 misc,
+ __u32 pid, __u32 tid,
+ __u64 start, __u64 len, __u64 pgoff,
+ struct dso *dso,
+ __u32 prot, __u32 flags,
+ const char *filename)
+{
+ int err;
+
+ /* Return to repipe anonymous maps. */
+ if (is_anon_memory(filename) || flags & MAP_HUGETLB)
+ return 1;
+ if (is_no_dso_memory(filename))
+ return 1;
+
+ if (dso__read_build_id(dso)) {
+ pr_debug("no build_id found for %s\n", filename);
+ return -1;
+ }
+
+ err = perf_event__synthesize_mmap2_build_id(tool, sample, machine,
+ perf_event__repipe,
+ evsel,
+ misc, pid, tid,
+ start, len, pgoff,
+ dso__bid(dso),
+ prot, flags,
+ filename);
+ if (err) {
+ pr_err("Can't synthesize build_id event for %s\n", filename);
+ return -1;
+ }
+ return 0;
+}
+
static int mark_dso_hit(const struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine,
@@ -2261,12 +2338,15 @@ int cmd_inject(int argc, const char **argv)
const char *known_build_ids = NULL;
bool build_ids;
bool build_id_all;
+ bool mmap2_build_id_all;
struct option options[] = {
OPT_BOOLEAN('b', "build-ids", &build_ids,
"Inject build-ids into the output stream"),
OPT_BOOLEAN(0, "buildid-all", &build_id_all,
"Inject build-ids of all DSOs into the output stream"),
+ OPT_BOOLEAN(0, "mmap2-buildid-all", &mmap2_build_id_all,
+ "Rewrite all mmap events as mmap2 events with build IDs"),
OPT_STRING(0, "known-build-ids", &known_build_ids,
"buildid path [,buildid path...]",
"build-ids to use for given paths"),
@@ -2363,6 +2443,8 @@ int cmd_inject(int argc, const char **argv)
return -1;
}
}
+ if (mmap2_build_id_all)
+ inject.build_id_style = BID_RWS__MMAP2_BUILDID_ALL;
if (build_ids)
inject.build_id_style = BID_RWS__INJECT_HEADER_LAZY;
if (build_id_all)
diff --git a/tools/perf/tests/shell/pipe_test.sh b/tools/perf/tests/shell/pipe_test.sh
index a3c94b4182c2..250574cd68b6 100755
--- a/tools/perf/tests/shell/pipe_test.sh
+++ b/tools/perf/tests/shell/pipe_test.sh
@@ -118,6 +118,7 @@ test_inject_bids() {
test_record_report
test_inject_bids -b
test_inject_bids --buildid-all
+test_inject_bids --mmap2-buildid-all
cleanup
exit $err
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 6bb62e4e2d5d..a58444c4aed1 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2266,6 +2266,63 @@ int perf_event__synthesize_build_id(const struct perf_tool *tool,
ev.header.size += ret;
}
+
+ return process(tool, &ev, sample, machine);
+}
+
+int perf_event__synthesize_mmap2_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ perf_event__handler_t process,
+ const struct evsel *evsel,
+ __u16 misc,
+ __u32 pid, __u32 tid,
+ __u64 start, __u64 len, __u64 pgoff,
+ const struct build_id *bid,
+ __u32 prot, __u32 flags,
+ const char *filename)
+{
+ union perf_event ev;
+ size_t ev_len;
+ void *array;
+ int ret;
+
+ ev_len = sizeof(ev.mmap2) - sizeof(ev.mmap2.filename) + strlen(filename) + 1;
+ ev_len = PERF_ALIGN(ev_len, sizeof(u64));
+
+ memset(&ev, 0, ev_len);
+
+ ev.mmap2.header.type = PERF_RECORD_MMAP2;
+ ev.mmap2.header.misc = misc | PERF_RECORD_MISC_MMAP_BUILD_ID;
+ ev.mmap2.header.size = ev_len;
+
+ ev.mmap2.pid = pid;
+ ev.mmap2.tid = tid;
+ ev.mmap2.start = start;
+ ev.mmap2.len = len;
+ ev.mmap2.pgoff = pgoff;
+
+ ev.mmap2.build_id_size = min(bid->size, sizeof(ev.mmap2.build_id));
+ memcpy(ev.mmap2.build_id, bid->data, ev.mmap2.build_id_size);
+
+ ev.mmap2.prot = prot;
+ ev.mmap2.flags = flags;
+
+ memcpy(ev.mmap2.filename, filename, min(strlen(filename), sizeof(ev.mmap.filename)));
+
+ array = &ev;
+ array += ev.header.size;
+ ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
+ if (ret < 0)
+ return ret;
+
+ if (ret & 7) {
+ pr_err("Bad id sample size %d\n", ret);
+ return -EINVAL;
+ }
+
+ ev.header.size += ret;
+
return process(tool, &ev, sample, machine);
}
diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
index 795bf3e18396..b9c936b5cfeb 100644
--- a/tools/perf/util/synthetic-events.h
+++ b/tools/perf/util/synthetic-events.h
@@ -54,6 +54,17 @@ int perf_event__synthesize_build_id(const struct perf_tool *tool,
__u16 misc,
const struct build_id *bid,
const char *filename);
+int perf_event__synthesize_mmap2_build_id(const struct perf_tool *tool,
+ struct perf_sample *sample,
+ struct machine *machine,
+ perf_event__handler_t process,
+ const struct evsel *evsel,
+ __u16 misc,
+ __u32 pid, __u32 tid,
+ __u64 start, __u64 len, __u64 pgoff,
+ const struct build_id *bid,
+ __u32 prot, __u32 flags,
+ const char *filename);
int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
--
2.46.0.598.g6f2099f65c-goog
* [PATCH v2 3/4] perf inject: Lazy build-id mmap2 event insertion
From: Ian Rogers @ 2024-09-09 20:37 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Colin Ian King, Casey Chen,
Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel
Add a -B option that lazily inserts mmap2 events, thereby dropping all
mmap events whose maps are never referenced by samples. This is similar
to the behavior of -b, where build_id events are only inserted when a
dso is accessed in a sample.
File size savings can be significant in system-wide mode; consider:
```
$ perf record -g -a -o perf.data sleep 1
$ perf inject -B -i perf.data -o perf.new.data
$ ls -al perf.data perf.new.data
5147049 perf.data
2248493 perf.new.data
```
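Despite the smaller file, the lazily inserted mmap2 events should still
carry the build IDs needed by later tooling; a rough check (output will
vary with the system):
```
perf buildid-list -i perf.new.data | head
perf report -i perf.new.data --stdio | head
```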
Add test coverage of the new option to the pipe test.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/builtin-inject.c | 62 +++++++++++++++++++++++------
tools/perf/tests/shell/pipe_test.sh | 1 +
tools/perf/util/map.c | 1 +
tools/perf/util/map.h | 11 +++++
4 files changed, 63 insertions(+), 12 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 5a27fa46e93d..9eb72ff48d88 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -108,6 +108,7 @@ enum build_id_rewrite_style {
BID_RWS__INJECT_HEADER_LAZY,
BID_RWS__INJECT_HEADER_ALL,
BID_RWS__MMAP2_BUILDID_ALL,
+ BID_RWS__MMAP2_BUILDID_LAZY,
};
struct perf_inject {
@@ -527,7 +528,8 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
* Remember the evsel for lazy build id generation. It is used
* for the sample id header type.
*/
- if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
+ if ((inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+ inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) &&
!inject->mmap_evsel)
inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
@@ -560,6 +562,9 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
}
}
dso__put(dso);
+ if (inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY)
+ return 0;
+
return perf_event__repipe(tool, event, sample, machine);
}
@@ -825,7 +830,8 @@ static int tool__inject_mmap2_build_id(const struct perf_tool *tool,
return 0;
}
-static int mark_dso_hit(const struct perf_tool *tool,
+static int mark_dso_hit(const struct perf_inject *inject,
+ const struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine,
const struct evsel *mmap_evsel,
@@ -854,16 +860,39 @@ static int mark_dso_hit(const struct perf_tool *tool,
}
}
dso = map__dso(map);
- if (dso && !dso__hit(dso)) {
- dso__set_hit(dso);
- tool__inject_build_id(tool, sample, machine,
- mmap_evsel, misc, dso__long_name(dso), dso,
- map__flags(map));
+ if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
+ if (dso && !dso__hit(dso)) {
+ dso__set_hit(dso);
+ tool__inject_build_id(tool, sample, machine,
+ mmap_evsel, misc, dso__long_name(dso), dso,
+ map__flags(map));
+ }
+ } else if (inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) {
+ if (!map__hit(map)) {
+ const struct build_id null_bid = { .size = 0 };
+ const struct build_id *bid = dso ? dso__bid(dso) : &null_bid;
+ const char *filename = dso ? dso__long_name(dso) : "";
+
+ map__set_hit(map);
+ perf_event__synthesize_mmap2_build_id(tool, sample, machine,
+ perf_event__repipe,
+ mmap_evsel,
+ misc,
+ sample->pid, sample->tid,
+ map__start(map),
+ map__end(map) - map__start(map),
+ map__pgoff(map),
+ bid,
+ map__prot(map),
+ map__flags(map),
+ filename);
+ }
}
return 0;
}
struct mark_dso_hit_args {
+ const struct perf_inject *inject;
const struct perf_tool *tool;
struct perf_sample *sample;
struct machine *machine;
@@ -875,7 +904,7 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
struct mark_dso_hit_args *args = data;
struct map *map = node->ms.map;
- return mark_dso_hit(args->tool, args->sample, args->machine,
+ return mark_dso_hit(args->inject, args->tool, args->sample, args->machine,
args->mmap_evsel, map, /*sample_in_dso=*/false);
}
@@ -888,6 +917,7 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
struct thread *thread;
struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
struct mark_dso_hit_args args = {
+ .inject = inject,
.tool = tool,
/*
* Use the parsed sample data of the sample event, which will
@@ -907,7 +937,7 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
}
if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
- mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
+ mark_dso_hit(inject, tool, sample, machine, args.mmap_evsel, al.map,
/*sample_in_dso=*/true);
}
@@ -2155,7 +2185,8 @@ static int __cmd_inject(struct perf_inject *inject)
#endif
}
- if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
+ if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+ inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) {
inject->tool.sample = perf_event__inject_buildid;
} else if (inject->sched_stat) {
struct evsel *evsel;
@@ -2338,6 +2369,7 @@ int cmd_inject(int argc, const char **argv)
const char *known_build_ids = NULL;
bool build_ids;
bool build_id_all;
+ bool mmap2_build_ids;
bool mmap2_build_id_all;
struct option options[] = {
@@ -2345,6 +2377,8 @@ int cmd_inject(int argc, const char **argv)
"Inject build-ids into the output stream"),
OPT_BOOLEAN(0, "buildid-all", &build_id_all,
"Inject build-ids of all DSOs into the output stream"),
+ OPT_BOOLEAN('B', "mmap2-buildids", &mmap2_build_ids,
+ "Drop unused mmap events, make others mmap2 with build IDs"),
OPT_BOOLEAN(0, "mmap2-buildid-all", &mmap2_build_id_all,
"Rewrite all mmap events as mmap2 events with build IDs"),
OPT_STRING(0, "known-build-ids", &known_build_ids,
@@ -2443,6 +2477,8 @@ int cmd_inject(int argc, const char **argv)
return -1;
}
}
+ if (mmap2_build_ids)
+ inject.build_id_style = BID_RWS__MMAP2_BUILDID_LAZY;
if (mmap2_build_id_all)
inject.build_id_style = BID_RWS__MMAP2_BUILDID_ALL;
if (build_ids)
@@ -2453,7 +2489,8 @@ int cmd_inject(int argc, const char **argv)
data.path = inject.input_name;
ordered_events = inject.jit_mode || inject.sched_stat ||
- (inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY);
+ inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+ inject.build_id_style == BID_RWS__MMAP2_BUILDID_LAZY;
perf_tool__init(&inject.tool, ordered_events);
inject.tool.sample = perf_event__repipe_sample;
inject.tool.read = perf_event__repipe_sample;
@@ -2532,7 +2569,8 @@ int cmd_inject(int argc, const char **argv)
}
}
- if (inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
+ if (inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+ inject.build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) {
/*
* to make sure the mmap records are ordered correctly
* and so that the correct especially due to jitted code
diff --git a/tools/perf/tests/shell/pipe_test.sh b/tools/perf/tests/shell/pipe_test.sh
index 250574cd68b6..d4c8005ce9b9 100755
--- a/tools/perf/tests/shell/pipe_test.sh
+++ b/tools/perf/tests/shell/pipe_test.sh
@@ -116,6 +116,7 @@ test_inject_bids() {
}
test_record_report
+test_inject_bids -B
test_inject_bids -b
test_inject_bids --buildid-all
test_inject_bids --mmap2-buildid-all
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index e781c8d56a9a..d729438b7d65 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -116,6 +116,7 @@ static void map__init(struct map *map, u64 start, u64 end, u64 pgoff,
map__set_mapping_type(map, MAPPING_TYPE__DSO);
assert(map__erange_warned(map) == false);
assert(map__priv(map) == false);
+ assert(map__hit(map) == false);
}
struct map *map__new(struct machine *machine, u64 start, u64 len,
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 6c43f31a9fe0..4262f5a143be 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -35,6 +35,7 @@ DECLARE_RC_STRUCT(map) {
enum mapping_type mapping_type:8;
bool erange_warned;
bool priv;
+ bool hit;
};
struct kmap;
@@ -83,6 +84,11 @@ static inline bool map__priv(const struct map *map)
return RC_CHK_ACCESS(map)->priv;
}
+static inline bool map__hit(const struct map *map)
+{
+ return RC_CHK_ACCESS(map)->hit;
+}
+
static inline refcount_t *map__refcnt(struct map *map)
{
return &RC_CHK_ACCESS(map)->refcnt;
@@ -287,6 +293,11 @@ static inline void map__set_priv(struct map *map)
RC_CHK_ACCESS(map)->priv = true;
}
+static inline void map__set_hit(struct map *map)
+{
+ RC_CHK_ACCESS(map)->hit = true;
+}
+
static inline void map__set_erange_warned(struct map *map)
{
RC_CHK_ACCESS(map)->erange_warned = true;
--
2.46.0.598.g6f2099f65c-goog
* [PATCH v2 4/4] perf callchain: Allow symbols to be optional when resolving a callchain
From: Ian Rogers @ 2024-09-09 20:37 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Colin Ian King, Casey Chen,
Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel
In uses like perf inject it is not necessary to gather the symbol for
each callchain location; only the map for the sample IP is wanted so
that build IDs and the like can be injected. Make gathering the symbol
in the callchain_cursor optional.
For a perf inject -B command this lowers the peak RSS from 54.1MB to
29.6MB by avoiding loading symbols.
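A rough sketch for measuring peak RSS of the inject step with GNU time
(not necessarily the methodology behind the numbers above; paths are
illustrative):
```
perf record -g -a -o perf.data -- sleep 1
# GNU time's verbose report goes to stderr, hence the redirect.
/usr/bin/time -v perf inject -B -i perf.data -o perf.new.data 2>&1 | \
  grep -i "maximum resident"
```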
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/builtin-inject.c | 2 +-
tools/perf/util/callchain.c | 8 ++--
tools/perf/util/callchain.h | 2 +-
tools/perf/util/machine.c | 92 +++++++++++++++++++++----------------
tools/perf/util/machine.h | 33 ++++++++++---
5 files changed, 85 insertions(+), 52 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 9eb72ff48d88..d6989195a061 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -942,7 +942,7 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
}
sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
- mark_dso_hit_callback, &args);
+ /*symbols=*/false, mark_dso_hit_callback, &args);
thread__put(thread);
repipe:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 0d608e875fe9..0c7564747a14 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1800,7 +1800,7 @@ s64 callchain_avg_cycles(struct callchain_node *cnode)
int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
struct perf_sample *sample, int max_stack,
- callchain_iter_fn cb, void *data)
+ bool symbols, callchain_iter_fn cb, void *data)
{
struct callchain_cursor *cursor = get_tls_callchain_cursor();
int ret;
@@ -1809,9 +1809,9 @@ int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
return -ENOMEM;
/* Fill in the callchain. */
- ret = thread__resolve_callchain(thread, cursor, evsel, sample,
- /*parent=*/NULL, /*root_al=*/NULL,
- max_stack);
+ ret = __thread__resolve_callchain(thread, cursor, evsel, sample,
+ /*parent=*/NULL, /*root_al=*/NULL,
+ max_stack, symbols);
if (ret)
return ret;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 76891f8e2373..86ed9e4d04f9 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -315,6 +315,6 @@ typedef int (*callchain_iter_fn)(struct callchain_cursor_node *node, void *data)
int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
struct perf_sample *sample, int max_stack,
- callchain_iter_fn cb, void *data);
+ bool symbols, callchain_iter_fn cb, void *data);
#endif /* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5783b96fb988..fad227b625d1 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2060,7 +2060,8 @@ static int add_callchain_ip(struct thread *thread,
bool branch,
struct branch_flags *flags,
struct iterations *iter,
- u64 branch_from)
+ u64 branch_from,
+ bool symbols)
{
struct map_symbol ms = {};
struct addr_location al;
@@ -2099,7 +2100,8 @@ static int add_callchain_ip(struct thread *thread,
}
goto out;
}
- thread__find_symbol(thread, *cpumode, ip, &al);
+ if (symbols)
+ thread__find_symbol(thread, *cpumode, ip, &al);
}
if (al.sym != NULL) {
@@ -2228,7 +2230,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
struct symbol **parent,
struct addr_location *root_al,
u64 branch_from,
- bool callee, int end)
+ bool callee, int end,
+ bool symbols)
{
struct ip_callchain *chain = sample->callchain;
u8 cpumode = PERF_RECORD_MISC_USER;
@@ -2238,7 +2241,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
for (i = 0; i < end + 1; i++) {
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, chain->ips[i],
- false, NULL, NULL, branch_from);
+ false, NULL, NULL, branch_from,
+ symbols);
if (err)
return err;
}
@@ -2248,7 +2252,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
for (i = end; i >= 0; i--) {
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, chain->ips[i],
- false, NULL, NULL, branch_from);
+ false, NULL, NULL, branch_from,
+ symbols);
if (err)
return err;
}
@@ -2291,7 +2296,8 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
struct symbol **parent,
struct addr_location *root_al,
u64 *branch_from,
- bool callee)
+ bool callee,
+ bool symbols)
{
struct branch_stack *lbr_stack = sample->branch_stack;
struct branch_entry *entries = perf_sample__branch_entries(sample);
@@ -2324,7 +2330,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, ip,
true, flags, NULL,
- *branch_from);
+ *branch_from, symbols);
if (err)
return err;
@@ -2349,7 +2355,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, ip,
true, flags, NULL,
- *branch_from);
+ *branch_from, symbols);
if (err)
return err;
save_lbr_cursor_node(thread, cursor, i);
@@ -2364,7 +2370,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, ip,
true, flags, NULL,
- *branch_from);
+ *branch_from, symbols);
if (err)
return err;
save_lbr_cursor_node(thread, cursor, i);
@@ -2378,7 +2384,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, ip,
true, flags, NULL,
- *branch_from);
+ *branch_from, symbols);
if (err)
return err;
}
@@ -2545,7 +2551,8 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
struct symbol **parent,
struct addr_location *root_al,
int max_stack,
- unsigned int max_lbr)
+ unsigned int max_lbr,
+ bool symbols)
{
bool callee = (callchain_param.order == ORDER_CALLEE);
struct ip_callchain *chain = sample->callchain;
@@ -2587,12 +2594,12 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
/* Add kernel ip */
err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
parent, root_al, branch_from,
- true, i);
+ true, i, symbols);
if (err)
goto error;
err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
- root_al, &branch_from, true);
+ root_al, &branch_from, true, symbols);
if (err)
goto error;
@@ -2609,14 +2616,14 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
goto error;
}
err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
- root_al, &branch_from, false);
+ root_al, &branch_from, false, symbols);
if (err)
goto error;
/* Add kernel ip */
err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
parent, root_al, branch_from,
- false, i);
+ false, i, symbols);
if (err)
goto error;
}
@@ -2630,7 +2637,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
struct callchain_cursor *cursor,
struct symbol **parent,
struct addr_location *root_al,
- u8 *cpumode, int ent)
+ u8 *cpumode, int ent, bool symbols)
{
int err = 0;
@@ -2640,7 +2647,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
if (ip >= PERF_CONTEXT_MAX) {
err = add_callchain_ip(thread, cursor, parent,
root_al, cpumode, ip,
- false, NULL, NULL, 0);
+ false, NULL, NULL, 0, symbols);
break;
}
}
@@ -2662,7 +2669,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
struct perf_sample *sample,
struct symbol **parent,
struct addr_location *root_al,
- int max_stack)
+ int max_stack,
+ bool symbols)
{
struct branch_stack *branch = sample->branch_stack;
struct branch_entry *entries = perf_sample__branch_entries(sample);
@@ -2682,7 +2690,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
err = resolve_lbr_callchain_sample(thread, cursor, sample, parent,
root_al, max_stack,
- !env ? 0 : env->max_branches);
+ !env ? 0 : env->max_branches,
+ symbols);
if (err)
return (err < 0) ? err : 0;
}
@@ -2747,13 +2756,14 @@ static int thread__resolve_callchain_sample(struct thread *thread,
root_al,
NULL, be[i].to,
true, &be[i].flags,
- NULL, be[i].from);
+ NULL, be[i].from, symbols);
- if (!err)
+ if (!err) {
err = add_callchain_ip(thread, cursor, parent, root_al,
NULL, be[i].from,
true, &be[i].flags,
- &iter[i], 0);
+ &iter[i], 0, symbols);
+ }
if (err == -EINVAL)
break;
if (err)
@@ -2769,7 +2779,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
check_calls:
if (chain && callchain_param.order != ORDER_CALLEE) {
err = find_prev_cpumode(chain, thread, cursor, parent, root_al,
- &cpumode, chain->nr - first_call);
+ &cpumode, chain->nr - first_call, symbols);
if (err)
return (err < 0) ? err : 0;
}
@@ -2791,7 +2801,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
++nr_entries;
else if (callchain_param.order != ORDER_CALLEE) {
err = find_prev_cpumode(chain, thread, cursor, parent,
- root_al, &cpumode, j);
+ root_al, &cpumode, j, symbols);
if (err)
return (err < 0) ? err : 0;
continue;
@@ -2818,8 +2828,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
if (leaf_frame_caller && leaf_frame_caller != ip) {
err = add_callchain_ip(thread, cursor, parent,
- root_al, &cpumode, leaf_frame_caller,
- false, NULL, NULL, 0);
+ root_al, &cpumode, leaf_frame_caller,
+ false, NULL, NULL, 0, symbols);
if (err)
return (err < 0) ? err : 0;
}
@@ -2827,7 +2837,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, ip,
- false, NULL, NULL, 0);
+ false, NULL, NULL, 0, symbols);
if (err)
return (err < 0) ? err : 0;
@@ -2907,7 +2917,7 @@ static int thread__resolve_callchain_unwind(struct thread *thread,
struct callchain_cursor *cursor,
struct evsel *evsel,
struct perf_sample *sample,
- int max_stack)
+ int max_stack, bool symbols)
{
/* Can we do dwarf post unwind? */
if (!((evsel->core.attr.sample_type & PERF_SAMPLE_REGS_USER) &&
@@ -2919,17 +2929,21 @@ static int thread__resolve_callchain_unwind(struct thread *thread,
(!sample->user_stack.size))
return 0;
+ if (!symbols)
+ pr_debug("Not resolving symbols with an unwinder isn't currently supported\n");
+
return unwind__get_entries(unwind_entry, cursor,
thread, sample, max_stack, false);
}
-int thread__resolve_callchain(struct thread *thread,
- struct callchain_cursor *cursor,
- struct evsel *evsel,
- struct perf_sample *sample,
- struct symbol **parent,
- struct addr_location *root_al,
- int max_stack)
+int __thread__resolve_callchain(struct thread *thread,
+ struct callchain_cursor *cursor,
+ struct evsel *evsel,
+ struct perf_sample *sample,
+ struct symbol **parent,
+ struct addr_location *root_al,
+ int max_stack,
+ bool symbols)
{
int ret = 0;
@@ -2942,22 +2956,22 @@ int thread__resolve_callchain(struct thread *thread,
ret = thread__resolve_callchain_sample(thread, cursor,
evsel, sample,
parent, root_al,
- max_stack);
+ max_stack, symbols);
if (ret)
return ret;
ret = thread__resolve_callchain_unwind(thread, cursor,
evsel, sample,
- max_stack);
+ max_stack, symbols);
} else {
ret = thread__resolve_callchain_unwind(thread, cursor,
evsel, sample,
- max_stack);
+ max_stack, symbols);
if (ret)
return ret;
ret = thread__resolve_callchain_sample(thread, cursor,
evsel, sample,
parent, root_al,
- max_stack);
+ max_stack, symbols);
}
return ret;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index a687876e3453..2e5a4cb342d8 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -178,13 +178,32 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
struct callchain_cursor;
-int thread__resolve_callchain(struct thread *thread,
- struct callchain_cursor *cursor,
- struct evsel *evsel,
- struct perf_sample *sample,
- struct symbol **parent,
- struct addr_location *root_al,
- int max_stack);
+int __thread__resolve_callchain(struct thread *thread,
+ struct callchain_cursor *cursor,
+ struct evsel *evsel,
+ struct perf_sample *sample,
+ struct symbol **parent,
+ struct addr_location *root_al,
+ int max_stack,
+ bool symbols);
+
+static inline int thread__resolve_callchain(struct thread *thread,
+ struct callchain_cursor *cursor,
+ struct evsel *evsel,
+ struct perf_sample *sample,
+ struct symbol **parent,
+ struct addr_location *root_al,
+ int max_stack)
+{
+ return __thread__resolve_callchain(thread,
+ cursor,
+ evsel,
+ sample,
+ parent,
+ root_al,
+ max_stack,
+ /*symbols=*/true);
+}
/*
* Default guest kernel is defined by parameter --guestkallsyms
--
2.46.0.598.g6f2099f65c-goog
* Re: [PATCH v2 0/4] perf inject improvements
From: Arnaldo Carvalho de Melo @ 2024-09-10 14:55 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Colin Ian King, Casey Chen, Anne Macedo, Sun Haiyong,
linux-perf-users, linux-kernel
On Mon, Sep 09, 2024 at 01:37:36PM -0700, Ian Rogers wrote:
> Fix the existing build id injection by adding sample IDs on to the
> synthesized events. This correctly orders the events and addresses
> issues such as a profiled executable being replaced during its
> execution.
>
> Add a new --mmap2-buildid-all option that rewrites all mmap events as
> mmap2 events containing build IDs. This removes the need for build_id
> events.
Thanks, applied to perf-tools-next,
- Arnaldo