linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 00/13] perf inject improvements
@ 2024-08-17  6:44 Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 01/13] perf synthetic-events: Avoid unnecessary memset Ian Rogers
                   ` (13 more replies)
  0 siblings, 14 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Fix the existing build id injection by adding sample IDs on to the
synthesized events. This correctly orders the events and addresses
issues such as a profiled executable being replaced during its
execution.

Add a new --mmap2-buildid-all option that rewrites all mmap events as
mmap2 events containing build IDs. This removes the need for build_id
events.

Add a new -B option that like --mmap2-buildid-all synthesizes mmap2
with build id events. With -B the behavior is to do it lazily, so only
when a sample references the particular map. With system wide
profiling that synthesizes mmap events for all running processes the
perf.data file savings can be greater than 50%.

Reduce the memory footprint of perf inject by avoiding creating
symbols in the callchain, the symbols aren't used during perf inject
and necessitate the loading of dsos.

Ian Rogers (13):
  perf synthetic-events: Avoid unnecessary memset
  perf map: API clean up
  perf jit: Constify filename argument
  perf dso: Constify dso_id
  perf evsel: Constify evsel__id_hdr_size argument
  perf test: Expand pipe/inject test
  perf inject: Combine build_ids and build_id_all into enum
  perf inject: Combine different mmap and mmap2 functions
  perf inject: Combine mmap and mmap2 handling
  perf inject: Fix build ID injection
  perf inject: Add new mmap2-buildid-all option
  perf inject: Lazy build-id mmap2 event insertion
  perf callchain: Allow symbols to be optional when resolving a
    callchain

 tools/perf/builtin-inject.c         | 532 ++++++++++++++++++----------
 tools/perf/builtin-top.c            |   2 +-
 tools/perf/tests/shell/pipe_test.sh | 103 ++++--
 tools/perf/tests/vmlinux-kallsyms.c |   4 +-
 tools/perf/util/build-id.c          |   6 +-
 tools/perf/util/callchain.c         |   8 +-
 tools/perf/util/callchain.h         |   2 +-
 tools/perf/util/dso.c               |   4 +-
 tools/perf/util/dso.h               |   4 +-
 tools/perf/util/dsos.c              |  12 +-
 tools/perf/util/dsos.h              |   2 +-
 tools/perf/util/evsel.c             |   2 +-
 tools/perf/util/evsel.h             |   2 +-
 tools/perf/util/jit.h               |   3 +-
 tools/perf/util/jitdump.c           |   6 +-
 tools/perf/util/machine.c           |  95 ++---
 tools/perf/util/machine.h           |  36 +-
 tools/perf/util/map.c               |  25 +-
 tools/perf/util/map.h               |  22 +-
 tools/perf/util/synthetic-events.c  | 103 +++++-
 tools/perf/util/synthetic-events.h  |  21 +-
 21 files changed, 682 insertions(+), 312 deletions(-)

-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 01/13] perf synthetic-events: Avoid unnecessary memset
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 02/13] perf map: API clean up Ian Rogers
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Make sure the memset of a synthesized event only zeros the necessary
tracing data part of the event, as a full event can be over 4kb in
size.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/synthetic-events.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 7f884d70de81..0a7f93ae76fb 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2200,7 +2200,7 @@ int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, st
 	if (!tdata)
 		return -1;
 
-	memset(&ev, 0, sizeof(ev));
+	memset(&ev, 0, sizeof(ev.tracing_data));
 
 	ev.tracing_data.header.type = PERF_RECORD_HEADER_TRACING_DATA;
 	size = tdata->size;
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 02/13] perf map: API clean up
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 01/13] perf synthetic-events: Avoid unnecessary memset Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 03/13] perf jit: Constify filename argument Ian Rogers
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

map__init is only used internally so make it static. Assume memory is
zero initialized, which will better support adding fields to struct
map in the future and was already the case for map__new2. To reduce
complexity, change set_priv and set_erange_warned to not take a value
to assign as they always assign true.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-top.c            |  2 +-
 tools/perf/tests/vmlinux-kallsyms.c |  4 ++--
 tools/perf/util/map.c               | 24 ++++++++++++------------
 tools/perf/util/map.h               | 11 ++++-------
 4 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 881b861c35ee..724a79386321 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -191,7 +191,7 @@ static void ui__warn_map_erange(struct map *map, struct symbol *sym, u64 ip)
 	if (use_browser <= 0)
 		sleep(5);
 
-	map__set_erange_warned(map, true);
+	map__set_erange_warned(map);
 }
 
 static void perf_top__record_precise_ip(struct perf_top *top,
diff --git a/tools/perf/tests/vmlinux-kallsyms.c b/tools/perf/tests/vmlinux-kallsyms.c
index cd3b480d20bd..74cdbd2ce9d0 100644
--- a/tools/perf/tests/vmlinux-kallsyms.c
+++ b/tools/perf/tests/vmlinux-kallsyms.c
@@ -131,7 +131,7 @@ static int test__vmlinux_matches_kallsyms_cb1(struct map *map, void *data)
 					(dso__kernel(dso) ? dso__short_name(dso) : dso__name(dso)));
 
 	if (pair) {
-		map__set_priv(pair, 1);
+		map__set_priv(pair);
 		map__put(pair);
 	} else {
 		if (!args->header_printed) {
@@ -166,7 +166,7 @@ static int test__vmlinux_matches_kallsyms_cb2(struct map *map, void *data)
 			pr_info(":\nWARN: *%" PRIx64 "-%" PRIx64 " %" PRIx64,
 				map__start(pair), map__end(pair), map__pgoff(pair));
 		pr_info(" %s\n", dso__name(dso));
-		map__set_priv(pair, 1);
+		map__set_priv(pair);
 	}
 	map__put(pair);
 	return 0;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index e1d14936a60d..e781c8d56a9a 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -102,16 +102,20 @@ static inline bool replace_android_lib(const char *filename, char *newfilename)
 	return false;
 }
 
-void map__init(struct map *map, u64 start, u64 end, u64 pgoff, struct dso *dso)
+static void map__init(struct map *map, u64 start, u64 end, u64 pgoff,
+		      struct dso *dso, u32 prot, u32 flags)
 {
 	map__set_start(map, start);
 	map__set_end(map, end);
 	map__set_pgoff(map, pgoff);
-	map__set_reloc(map, 0);
+	assert(map__reloc(map) == 0);
 	map__set_dso(map, dso__get(dso));
-	map__set_mapping_type(map, MAPPING_TYPE__DSO);
-	map__set_erange_warned(map, false);
 	refcount_set(map__refcnt(map), 1);
+	RC_CHK_ACCESS(map)->prot = prot;
+	RC_CHK_ACCESS(map)->flags = flags;
+	map__set_mapping_type(map, MAPPING_TYPE__DSO);
+	assert(map__erange_warned(map) == false);
+	assert(map__priv(map) == false);
 }
 
 struct map *map__new(struct machine *machine, u64 start, u64 len,
@@ -124,7 +128,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 	struct nsinfo *nsi = NULL;
 	struct nsinfo *nnsi;
 
-	map = malloc(sizeof(*map));
+	map = zalloc(sizeof(*map));
 	if (ADD_RC_CHK(result, map)) {
 		char newfilename[PATH_MAX];
 		struct dso *dso, *header_bid_dso;
@@ -134,8 +138,6 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 		anon = is_anon_memory(filename) || flags & MAP_HUGETLB;
 		vdso = is_vdso_map(filename);
 		no_dso = is_no_dso_memory(filename);
-		map->prot = prot;
-		map->flags = flags;
 		nsi = nsinfo__get(thread__nsinfo(thread));
 
 		if ((anon || no_dso) && nsi && (prot & PROT_EXEC)) {
@@ -169,7 +171,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 			goto out_delete;
 
 		assert(!dso__kernel(dso));
-		map__init(result, start, start + len, pgoff, dso);
+		map__init(result, start, start + len, pgoff, dso, prot, flags);
 
 		if (anon || no_dso) {
 			map->mapping_type = MAPPING_TYPE__IDENTITY;
@@ -223,10 +225,8 @@ struct map *map__new2(u64 start, struct dso *dso)
 
 	map = calloc(1, sizeof(*map) + (dso__kernel(dso) ? sizeof(struct kmap) : 0));
 	if (ADD_RC_CHK(result, map)) {
-		/*
-		 * ->end will be filled after we load all the symbols
-		 */
-		map__init(result, start, 0, 0, dso);
+		/* ->end will be filled after we load all the symbols. */
+		map__init(result, start, /*end=*/0, /*pgoff=*/0, dso, /*prot=*/0, /*flags=*/0);
 	}
 
 	return result;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 65e2609fa1b1..6c43f31a9fe0 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -166,9 +166,6 @@ struct thread;
 #define map__for_each_symbol_by_name(map, sym_name, pos, idx)	\
 	__map__for_each_symbol_by_name(map, sym_name, (pos), idx)
 
-void map__init(struct map *map,
-	       u64 start, u64 end, u64 pgoff, struct dso *dso);
-
 struct dso_id;
 struct build_id;
 
@@ -285,14 +282,14 @@ static inline void map__set_reloc(struct map *map, u64 reloc)
 	RC_CHK_ACCESS(map)->reloc = reloc;
 }
 
-static inline void map__set_priv(struct map *map, int priv)
+static inline void map__set_priv(struct map *map)
 {
-	RC_CHK_ACCESS(map)->priv = priv;
+	RC_CHK_ACCESS(map)->priv = true;
 }
 
-static inline void map__set_erange_warned(struct map *map, bool erange_warned)
+static inline void map__set_erange_warned(struct map *map)
 {
-	RC_CHK_ACCESS(map)->erange_warned = erange_warned;
+	RC_CHK_ACCESS(map)->erange_warned = true;
 }
 
 static inline void map__set_dso(struct map *map, struct dso *dso)
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 03/13] perf jit: Constify filename argument
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 01/13] perf synthetic-events: Avoid unnecessary memset Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 02/13] perf map: API clean up Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 04/13] perf dso: Constify dso_id Ian Rogers
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Make it clearer the argument is just being used as a string.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/jit.h     | 3 ++-
 tools/perf/util/jitdump.c | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/jit.h b/tools/perf/util/jit.h
index fb810e1b2de7..f4037203e9ec 100644
--- a/tools/perf/util/jit.h
+++ b/tools/perf/util/jit.h
@@ -5,7 +5,8 @@
 #include <data.h>
 
 int jit_process(struct perf_session *session, struct perf_data *output,
-		struct machine *machine, char *filename, pid_t pid, pid_t tid, u64 *nbytes);
+		struct machine *machine, const char *filename,
+		pid_t pid, pid_t tid, u64 *nbytes);
 
 int jit_inject_record(const char *filename);
 
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 5ce13653512b..346513e5e9b7 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -710,7 +710,7 @@ jit_process_dump(struct jit_buf_desc *jd)
 }
 
 static int
-jit_inject(struct jit_buf_desc *jd, char *path)
+jit_inject(struct jit_buf_desc *jd, const char *path)
 {
 	int ret;
 
@@ -737,7 +737,7 @@ jit_inject(struct jit_buf_desc *jd, char *path)
  * as captured in the RECORD_MMAP record
  */
 static int
-jit_detect(char *mmap_name, pid_t pid, struct nsinfo *nsi)
+jit_detect(const char *mmap_name, pid_t pid, struct nsinfo *nsi)
  {
 	char *p;
 	char *end = NULL;
@@ -821,7 +821,7 @@ int
 jit_process(struct perf_session *session,
 	    struct perf_data *output,
 	    struct machine *machine,
-	    char *filename,
+	    const char *filename,
 	    pid_t pid,
 	    pid_t tid,
 	    u64 *nbytes)
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 04/13] perf dso: Constify dso_id
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (2 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 03/13] perf jit: Constify filename argument Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 05/13] perf evsel: Constify evsel__id_hdr_size argument Ian Rogers
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

The passed dso_id is copied and so is never an out argument. Remove
its mutability.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c |  2 +-
 tools/perf/util/dso.c       |  4 ++--
 tools/perf/util/dso.h       |  4 ++--
 tools/perf/util/dsos.c      | 12 ++++++------
 tools/perf/util/dsos.h      |  2 +-
 tools/perf/util/machine.c   |  3 ++-
 tools/perf/util/machine.h   |  3 ++-
 7 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a35bde3f3c09..8440ddfbf4fe 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -411,7 +411,7 @@ static int perf_event__jit_repipe_mmap(const struct perf_tool *tool,
 #endif
 
 static struct dso *findnew_dso(int pid, int tid, const char *filename,
-			       struct dso_id *id, struct machine *machine)
+			       const struct dso_id *id, struct machine *machine)
 {
 	struct thread *thread;
 	struct nsinfo *nsi = NULL;
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 67414944f245..5c6e85fdae0d 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1327,7 +1327,7 @@ bool dso_id__empty(const struct dso_id *id)
 	return !id->maj && !id->min && !id->ino && !id->ino_generation;
 }
 
-void __dso__inject_id(struct dso *dso, struct dso_id *id)
+void __dso__inject_id(struct dso *dso, const struct dso_id *id)
 {
 	struct dsos *dsos = dso__dsos(dso);
 	struct dso_id *dso_id = dso__id(dso);
@@ -1417,7 +1417,7 @@ void dso__set_sorted_by_name(struct dso *dso)
 	RC_CHK_ACCESS(dso)->sorted_by_name = true;
 }
 
-struct dso *dso__new_id(const char *name, struct dso_id *id)
+struct dso *dso__new_id(const char *name, const struct dso_id *id)
 {
 	RC_STRUCT(dso) *dso = zalloc(sizeof(*dso) + strlen(name) + 1);
 	struct dso *res;
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ed0068251c65..bb8e8f444054 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -640,14 +640,14 @@ static inline void dso__set_text_offset(struct dso *dso, u64 val)
 int dso_id__cmp(const struct dso_id *a, const struct dso_id *b);
 bool dso_id__empty(const struct dso_id *id);
 
-struct dso *dso__new_id(const char *name, struct dso_id *id);
+struct dso *dso__new_id(const char *name, const struct dso_id *id);
 struct dso *dso__new(const char *name);
 void dso__delete(struct dso *dso);
 
 int dso__cmp_id(struct dso *a, struct dso *b);
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated);
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated);
-void __dso__inject_id(struct dso *dso, struct dso_id *id);
+void __dso__inject_id(struct dso *dso, const struct dso_id *id);
 
 int dso__name_len(const struct dso *dso);
 
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index d4acdb37f046..e0998e2a7c4e 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -155,7 +155,7 @@ static int dsos__cmp_key_long_name_id(const void *vkey, const void *vdso)
  */
 static struct dso *__dsos__find_by_longname_id(struct dsos *dsos,
 					       const char *name,
-					       struct dso_id *id,
+					       const struct dso_id *id,
 					       bool write_locked)
 {
 	struct dsos__key key = {
@@ -244,7 +244,7 @@ int dsos__add(struct dsos *dsos, struct dso *dso)
 
 struct dsos__find_id_cb_args {
 	const char *name;
-	struct dso_id *id;
+	const struct dso_id *id;
 	struct dso *res;
 };
 
@@ -260,7 +260,7 @@ static int dsos__find_id_cb(struct dso *dso, void *data)
 
 }
 
-static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct dso_id *id,
+static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, const struct dso_id *id,
 				   bool cmp_short, bool write_locked)
 {
 	struct dso *res;
@@ -321,7 +321,7 @@ static void dso__set_basename(struct dso *dso)
 	dso__set_short_name(dso, base, true);
 }
 
-static struct dso *__dsos__addnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
+static struct dso *__dsos__addnew_id(struct dsos *dsos, const char *name, const struct dso_id *id)
 {
 	struct dso *dso = dso__new_id(name, id);
 
@@ -337,7 +337,7 @@ static struct dso *__dsos__addnew_id(struct dsos *dsos, const char *name, struct
 	return dso;
 }
 
-static struct dso *__dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
+static struct dso *__dsos__findnew_id(struct dsos *dsos, const char *name, const struct dso_id *id)
 {
 	struct dso *dso = __dsos__find_id(dsos, name, id, false, /*write_locked=*/true);
 
@@ -347,7 +347,7 @@ static struct dso *__dsos__findnew_id(struct dsos *dsos, const char *name, struc
 	return dso ? dso : __dsos__addnew_id(dsos, name, id);
 }
 
-struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
+struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, const struct dso_id *id)
 {
 	struct dso *dso;
 	down_write(&dsos->lock);
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index 6c13b65648bc..a26774950866 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -32,7 +32,7 @@ int __dsos__add(struct dsos *dsos, struct dso *dso);
 int dsos__add(struct dsos *dsos, struct dso *dso);
 struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
 
-struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id);
+struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, const struct dso_id *id);
  
 bool dsos__read_build_ids(struct dsos *dsos, bool with_hits);
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index c08774d06d14..cd79a830abae 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -3128,7 +3128,8 @@ u8 machine__addr_cpumode(struct machine *machine, u8 cpumode, u64 addr)
 	return addr_cpumode;
 }
 
-struct dso *machine__findnew_dso_id(struct machine *machine, const char *filename, struct dso_id *id)
+struct dso *machine__findnew_dso_id(struct machine *machine, const char *filename,
+				    const struct dso_id *id)
 {
 	return dsos__findnew_id(&machine->dsos, filename, id);
 }
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 82a47bac8023..a687876e3453 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -207,7 +207,8 @@ int machine__nr_cpus_avail(struct machine *machine);
 
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
 
-struct dso *machine__findnew_dso_id(struct machine *machine, const char *filename, struct dso_id *id);
+struct dso *machine__findnew_dso_id(struct machine *machine, const char *filename,
+				    const struct dso_id *id);
 struct dso *machine__findnew_dso(struct machine *machine, const char *filename);
 
 size_t machine__fprintf(struct machine *machine, FILE *fp);
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 05/13] perf evsel: Constify evsel__id_hdr_size argument
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (3 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 04/13] perf dso: Constify dso_id Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 06/13] perf test: Expand pipe/inject test Ian Rogers
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Allows evsel__id_hdr_size to be used when the evsel is const.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/evsel.c | 2 +-
 tools/perf/util/evsel.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 38a74d6dde49..49cc71511c0c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3048,7 +3048,7 @@ int evsel__parse_sample_timestamp(struct evsel *evsel, union perf_event *event,
 	return 0;
 }
 
-u16 evsel__id_hdr_size(struct evsel *evsel)
+u16 evsel__id_hdr_size(const struct evsel *evsel)
 {
 	u64 sample_type = evsel->core.attr.sample_type;
 	u16 size = 0;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 4316992f6a69..15acf293e12a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -442,7 +442,7 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 int evsel__parse_sample_timestamp(struct evsel *evsel, union perf_event *event,
 				  u64 *timestamp);
 
-u16 evsel__id_hdr_size(struct evsel *evsel);
+u16 evsel__id_hdr_size(const struct evsel *evsel);
 
 static inline struct evsel *evsel__next(struct evsel *evsel)
 {
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 06/13] perf test: Expand pipe/inject test
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (4 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 05/13] perf evsel: Constify evsel__id_hdr_size argument Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 07/13] perf inject: Combine build_ids and build_id_all into enum Ian Rogers
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Test recording of call-graphs and injecting --build-all. Add/expand
trap handler.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/shell/pipe_test.sh | 101 ++++++++++++++++++++++------
 1 file changed, 79 insertions(+), 22 deletions(-)

diff --git a/tools/perf/tests/shell/pipe_test.sh b/tools/perf/tests/shell/pipe_test.sh
index a78d35d2cff0..ad10012fdc29 100755
--- a/tools/perf/tests/shell/pipe_test.sh
+++ b/tools/perf/tests/shell/pipe_test.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
 # perf pipe recording and injection test
 # SPDX-License-Identifier: GPL-2.0
 
@@ -12,30 +12,87 @@ skip_test_missing_symbol ${sym}
 
 data=$(mktemp /tmp/perf.data.XXXXXX)
 prog="perf test -w noploop"
-task="perf"
+err=0
 
-if ! perf record -e task-clock:u -o - ${prog} | perf report -i - --task | grep ${task}; then
-	echo "cannot find the test file in the perf report"
-	exit 1
-fi
+set -e
 
-if ! perf record -e task-clock:u -o - ${prog} | perf inject -b | perf report -i - | grep ${sym}; then
-	echo "cannot find noploop function in pipe #1"
-	exit 1
-fi
+cleanup() {
+  rm -rf "${data}"
+  rm -rf "${data}".old
 
-perf record -e task-clock:u -o - ${prog} | perf inject -b -o ${data}
-if ! perf report -i ${data} | grep ${sym}; then
-	echo "cannot find noploop function in pipe #2"
-	exit 1
-fi
+  trap - EXIT TERM INT
+}
 
-perf record -e task-clock:u -o ${data} ${prog}
-if ! perf inject -b -i ${data} | perf report -i - | grep ${sym}; then
-	echo "cannot find noploop function in pipe #3"
-	exit 1
-fi
+trap_cleanup() {
+  echo "Unexpected signal in ${FUNCNAME[1]}"
+  cleanup
+  exit 1
+}
+trap trap_cleanup EXIT TERM INT
 
+test_record_report() {
+  echo
+  echo "Record+report pipe test"
+
+  task="perf"
+  if ! perf record -e task-clock:u -o - ${prog} | perf report -i - --task | grep -q ${task}
+  then
+    echo "Record+report pipe test [Failed - cannot find the test file in the perf report #1]"
+    err=1
+    return
+  fi
+
+  if ! perf record -g -e task-clock:u -o - ${prog} | perf report -i - --task | grep -q ${task}
+  then
+    echo "Record+report pipe test [Failed - cannot find the test file in the perf report #2]"
+    err=1
+    return
+  fi
+
+  echo "Record+report pipe test [Success]"
+}
+
+test_inject_bids() {
+  inject_opt=$1
+
+  echo
+  echo "Inject ${inject_opt} build-ids test"
+
+  if ! perf record -e task-clock:u -o - ${prog} | perf inject ${inject_opt}| perf report -i - | grep -q ${sym}
+  then
+    echo "Inject build-ids test [Failed - cannot find noploop function in pipe #1]"
+    err=1
+    return
+  fi
+
+  if ! perf record -g -e task-clock:u -o - ${prog} | perf inject ${inject_opt} | perf report -i - | grep -q ${sym}
+  then
+    echo "Inject ${inject_opt} build-ids test [Failed - cannot find noploop function in pipe #2]"
+    err=1
+    return
+  fi
+
+  perf record -e task-clock:u -o - ${prog} | perf inject ${inject_opt} -o ${data}
+  if ! perf report -i ${data} | grep -q ${sym}; then
+    echo "Inject ${inject_opt} build-ids test [Failed - cannot find noploop function in pipe #3]"
+    err=1
+    return
+  fi
+
+  perf record -e task-clock:u -o ${data} ${prog}
+  if ! perf inject ${inject_opt} -i ${data} | perf report -i - | grep -q ${sym}; then
+    echo "Inject ${inject_opt} build-ids test [Failed - cannot find noploop function in pipe #4]"
+    err=1
+    return
+  fi
+
+  echo "Inject ${inject_opt} build-ids test [Success]"
+}
+
+test_record_report
+test_inject_bids -b
+test_inject_bids --buildid-all
+
+cleanup
+exit $err
 
-rm -f ${data} ${data}.old
-exit 0
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 07/13] perf inject: Combine build_ids and build_id_all into enum
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (5 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 06/13] perf test: Expand pipe/inject test Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 08/13] perf inject: Combine different mmap and mmap2 functions Ian Rogers
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

It is clearer to have a single enum that determines how build ids are
injected, it also allows for future extension.

Set the header build ID feature whether lazy or all are generated,
previously only the lazy case would set it.

Allow parsing of known build IDs for either the lazy or all cases.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c | 51 ++++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 8440ddfbf4fe..865d16ceead2 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -103,11 +103,16 @@ struct guest_session {
 	struct guest_event		ev;
 };
 
+enum build_id_rewrite_style {
+	BID_RWS__NONE = 0,
+	BID_RWS__INJECT_HEADER_LAZY,
+	BID_RWS__INJECT_HEADER_ALL,
+};
+
 struct perf_inject {
 	struct perf_tool	tool;
 	struct perf_session	*session;
-	bool			build_ids;
-	bool			build_id_all;
+	enum build_id_rewrite_style build_id_style;
 	bool			sched_stat;
 	bool			have_auxtrace;
 	bool			strip;
@@ -2015,8 +2020,8 @@ static int __cmd_inject(struct perf_inject *inject)
 
 	signal(SIGINT, sig_handler);
 
-	if (inject->build_ids || inject->sched_stat ||
-	    inject->itrace_synth_opts.set || inject->build_id_all) {
+	if (inject->build_id_style != BID_RWS__NONE || inject->sched_stat ||
+	    inject->itrace_synth_opts.set) {
 		inject->tool.mmap	  = perf_event__repipe_mmap;
 		inject->tool.mmap2	  = perf_event__repipe_mmap2;
 		inject->tool.fork	  = perf_event__repipe_fork;
@@ -2027,10 +2032,10 @@ static int __cmd_inject(struct perf_inject *inject)
 
 	output_data_offset = perf_session__data_offset(session->evlist);
 
-	if (inject->build_id_all) {
+	if (inject->build_id_style == BID_RWS__INJECT_HEADER_ALL) {
 		inject->tool.mmap	  = perf_event__repipe_buildid_mmap;
 		inject->tool.mmap2	  = perf_event__repipe_buildid_mmap2;
-	} else if (inject->build_ids) {
+	} else if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
 		inject->tool.sample = perf_event__inject_buildid;
 	} else if (inject->sched_stat) {
 		struct evsel *evsel;
@@ -2148,9 +2153,9 @@ static int __cmd_inject(struct perf_inject *inject)
 			.inject = inject,
 		};
 
-		if (inject->build_ids)
-			perf_header__set_feat(&session->header,
-					      HEADER_BUILD_ID);
+		if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+		    inject->build_id_style == BID_RWS__INJECT_HEADER_ALL)
+			perf_header__set_feat(&session->header, HEADER_BUILD_ID);
 		/*
 		 * Keep all buildids when there is unprocessed AUX data because
 		 * it is not known which ones the AUX trace hits.
@@ -2211,11 +2216,13 @@ int cmd_inject(int argc, const char **argv)
 	int ret;
 	bool repipe = true;
 	const char *known_build_ids = NULL;
+	bool build_ids;
+	bool build_id_all;
 
 	struct option options[] = {
-		OPT_BOOLEAN('b', "build-ids", &inject.build_ids,
+		OPT_BOOLEAN('b', "build-ids", &build_ids,
 			    "Inject build-ids into the output stream"),
-		OPT_BOOLEAN(0, "buildid-all", &inject.build_id_all,
+		OPT_BOOLEAN(0, "buildid-all", &build_id_all,
 			    "Inject build-ids of all DSOs into the output stream"),
 		OPT_STRING(0, "known-build-ids", &known_build_ids,
 			   "buildid path [,buildid path...]",
@@ -2313,6 +2320,10 @@ int cmd_inject(int argc, const char **argv)
 			return -1;
 		}
 	}
+	if (build_ids)
+		inject.build_id_style = BID_RWS__INJECT_HEADER_LAZY;
+	if (build_id_all)
+		inject.build_id_style = BID_RWS__INJECT_HEADER_ALL;
 
 	data.path = inject.input_name;
 	if (!strcmp(inject.input_name, "-") || inject.output.is_pipe) {
@@ -2326,7 +2337,7 @@ int cmd_inject(int argc, const char **argv)
 			repipe = false;
 	}
 	ordered_events = inject.jit_mode || inject.sched_stat ||
-		(inject.build_ids && !inject.build_id_all);
+		(inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY);
 	perf_tool__init(&inject.tool, ordered_events);
 	inject.tool.sample		= perf_event__repipe_sample;
 	inject.tool.read		= perf_event__repipe_sample;
@@ -2398,7 +2409,7 @@ int cmd_inject(int argc, const char **argv)
 			goto out_delete;
 	}
 
-	if (inject.build_ids && !inject.build_id_all) {
+	if (inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
 		/*
 		 * to make sure the mmap records are ordered correctly
 		 * and so that the correct especially due to jitted code
@@ -2406,14 +2417,14 @@ int cmd_inject(int argc, const char **argv)
 		 * inject the jit mmaps at the same time for now.
 		 */
 		inject.tool.ordering_requires_timestamps = true;
-		if (known_build_ids != NULL) {
-			inject.known_build_ids =
-				perf_inject__parse_known_build_ids(known_build_ids);
+	}
+	if (inject.build_id_style != BID_RWS__NONE && known_build_ids != NULL) {
+		inject.known_build_ids =
+			perf_inject__parse_known_build_ids(known_build_ids);
 
-			if (inject.known_build_ids == NULL) {
-				pr_err("Couldn't parse known build ids.\n");
-				goto out_delete;
-			}
+		if (inject.known_build_ids == NULL) {
+			pr_err("Couldn't parse known build ids.\n");
+			goto out_delete;
 		}
 	}
 
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 08/13] perf inject: Combine different mmap and mmap2 functions
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (6 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 07/13] perf inject: Combine build_ids and build_id_all into enum Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 09/13] perf inject: Combine mmap and mmap2 handling Ian Rogers
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

There are repipe, build ID and JIT dump variants of the mmap and mmap2
repipe functions. The organization doesn't allow JIT dump to work with
build ID injection and the structure is less than clear. Combine the
function and enable the different behaviors based on ifs.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c | 199 ++++++++++++++----------------------
 1 file changed, 76 insertions(+), 123 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 865d16ceead2..d99868953ff2 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -377,44 +377,6 @@ static int perf_event__repipe_sample(const struct perf_tool *tool,
 	return perf_event__repipe_synth(tool, event);
 }
 
-static int perf_event__repipe_mmap(const struct perf_tool *tool,
-				   union perf_event *event,
-				   struct perf_sample *sample,
-				   struct machine *machine)
-{
-	int err;
-
-	err = perf_event__process_mmap(tool, event, sample, machine);
-	perf_event__repipe(tool, event, sample, machine);
-
-	return err;
-}
-
-#ifdef HAVE_JITDUMP
-static int perf_event__jit_repipe_mmap(const struct perf_tool *tool,
-				       union perf_event *event,
-				       struct perf_sample *sample,
-				       struct machine *machine)
-{
-	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
-	u64 n = 0;
-	int ret;
-
-	/*
-	 * if jit marker, then inject jit mmaps and generate ELF images
-	 */
-	ret = jit_process(inject->session, &inject->output, machine,
-			  event->mmap.filename, event->mmap.pid, event->mmap.tid, &n);
-	if (ret < 0)
-		return ret;
-	if (ret) {
-		inject->bytes_written += n;
-		return 0;
-	}
-	return perf_event__repipe_mmap(tool, event, sample, machine);
-}
-#endif
-
 static struct dso *findnew_dso(int pid, int tid, const char *filename,
 			       const struct dso_id *id, struct machine *machine)
 {
@@ -460,114 +422,108 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
 	return dso;
 }
 
-static int perf_event__repipe_buildid_mmap(const struct perf_tool *tool,
-					   union perf_event *event,
-					   struct perf_sample *sample,
-					   struct machine *machine)
-{
-	struct dso *dso;
-
-	dso = findnew_dso(event->mmap.pid, event->mmap.tid,
-			  event->mmap.filename, NULL, machine);
-
-	if (dso && !dso__hit(dso)) {
-		dso__set_hit(dso);
-		dso__inject_build_id(dso, tool, machine, sample->cpumode, 0);
-	}
-	dso__put(dso);
-
-	return perf_event__repipe(tool, event, sample, machine);
-}
-
-static int perf_event__repipe_mmap2(const struct perf_tool *tool,
+static int perf_event__repipe_mmap(const struct perf_tool *tool,
 				   union perf_event *event,
 				   struct perf_sample *sample,
 				   struct machine *machine)
 {
-	int err;
+	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
 
-	err = perf_event__process_mmap2(tool, event, sample, machine);
-	perf_event__repipe(tool, event, sample, machine);
+#ifdef HAVE_JITDUMP
+	if (inject->jit_mode) {
+		u64 n = 0;
+		int ret;
 
-	if (event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID) {
-		struct dso *dso;
+		/* If jit marker, then inject jit mmaps and generate ELF images. */
+		ret = jit_process(inject->session, &inject->output, machine,
+				  event->mmap.filename, event->mmap.pid, event->mmap.tid, &n);
+		if (ret < 0)
+			return ret;
+		if (ret) {
+			inject->bytes_written += n;
+			return 0;
+		}
+	}
+#endif
+	if (inject->build_id_style == BID_RWS__INJECT_HEADER_ALL) {
+		struct dso *dso = findnew_dso(event->mmap.pid, event->mmap.tid,
+					      event->mmap.filename, NULL, machine);
 
-		dso = findnew_dso(event->mmap2.pid, event->mmap2.tid,
-				  event->mmap2.filename, NULL, machine);
-		if (dso) {
-			/* mark it not to inject build-id */
+		if (dso && !dso__hit(dso)) {
 			dso__set_hit(dso);
+			dso__inject_build_id(dso, tool, machine, sample->cpumode, 0);
 		}
 		dso__put(dso);
-	}
+	} else {
+		/* Create the thread, map, etc. Not done for the unordered inject all case. */
+		int err = perf_event__process_mmap(tool, event, sample, machine);
 
-	return err;
+		if (err)
+			return err;
+	}
+	return perf_event__repipe(tool, event, sample, machine);
 }
 
-#ifdef HAVE_JITDUMP
-static int perf_event__jit_repipe_mmap2(const struct perf_tool *tool,
-					union perf_event *event,
-					struct perf_sample *sample,
-					struct machine *machine)
+static int perf_event__repipe_mmap2(const struct perf_tool *tool,
+				   union perf_event *event,
+				   struct perf_sample *sample,
+				   struct machine *machine)
 {
 	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
-	u64 n = 0;
-	int ret;
+	struct dso *dso = NULL;
 
-	/*
-	 * if jit marker, then inject jit mmaps and generate ELF images
-	 */
-	ret = jit_process(inject->session, &inject->output, machine,
-			  event->mmap2.filename, event->mmap2.pid, event->mmap2.tid, &n);
-	if (ret < 0)
-		return ret;
-	if (ret) {
-		inject->bytes_written += n;
-		return 0;
+#ifdef HAVE_JITDUMP
+	if (inject->jit_mode) {
+		u64 n = 0;
+		int ret;
+
+		/* If jit marker, then inject jit mmaps and generate ELF images. */
+		ret = jit_process(inject->session, &inject->output, machine,
+				event->mmap2.filename, event->mmap2.pid, event->mmap2.tid, &n);
+		if (ret < 0)
+			return ret;
+		if (ret) {
+			inject->bytes_written += n;
+			return 0;
+		}
 	}
-	return perf_event__repipe_mmap2(tool, event, sample, machine);
-}
 #endif
-
-static int perf_event__repipe_buildid_mmap2(const struct perf_tool *tool,
-					    union perf_event *event,
-					    struct perf_sample *sample,
-					    struct machine *machine)
-{
-	struct dso_id dso_id = {
-		.maj = event->mmap2.maj,
-		.min = event->mmap2.min,
-		.ino = event->mmap2.ino,
-		.ino_generation = event->mmap2.ino_generation,
-	};
-	struct dso *dso;
-
 	if (event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID) {
-		/* cannot use dso_id since it'd have invalid info */
 		dso = findnew_dso(event->mmap2.pid, event->mmap2.tid,
 				  event->mmap2.filename, NULL, machine);
 		if (dso) {
 			/* mark it not to inject build-id */
 			dso__set_hit(dso);
 		}
-		dso__put(dso);
-		perf_event__repipe(tool, event, sample, machine);
-		return 0;
 	}
+	if (inject->build_id_style == BID_RWS__INJECT_HEADER_ALL) {
+		if (!(event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID)) {
+			struct dso_id dso_id = {
+				.maj = event->mmap2.maj,
+				.min = event->mmap2.min,
+				.ino = event->mmap2.ino,
+				.ino_generation = event->mmap2.ino_generation,
+			};
+
+			dso = findnew_dso(event->mmap2.pid, event->mmap2.tid,
+					  event->mmap2.filename, &dso_id, machine);
+		}
+		if (dso && !dso__hit(dso)) {
+			dso__set_hit(dso);
+			dso__inject_build_id(dso, tool, machine, sample->cpumode,
+					event->mmap2.flags);
+		}
+	} else {
+		/* Create the thread, map, etc. Not done for the unordered inject all case. */
+		int err = perf_event__process_mmap(tool, event, sample, machine);
 
-	dso = findnew_dso(event->mmap2.pid, event->mmap2.tid,
-			  event->mmap2.filename, &dso_id, machine);
-
-	if (dso && !dso__hit(dso)) {
-		dso__set_hit(dso);
-		dso__inject_build_id(dso, tool, machine, sample->cpumode,
-				     event->mmap2.flags);
+		if (err) {
+			dso__put(dso);
+			return err;
+		}
 	}
 	dso__put(dso);
-
-	perf_event__repipe(tool, event, sample, machine);
-
-	return 0;
+	return perf_event__repipe(tool, event, sample, machine);
 }
 
 static int perf_event__repipe_fork(const struct perf_tool *tool,
@@ -2032,10 +1988,7 @@ static int __cmd_inject(struct perf_inject *inject)
 
 	output_data_offset = perf_session__data_offset(session->evlist);
 
-	if (inject->build_id_style == BID_RWS__INJECT_HEADER_ALL) {
-		inject->tool.mmap	  = perf_event__repipe_buildid_mmap;
-		inject->tool.mmap2	  = perf_event__repipe_buildid_mmap2;
-	} else if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
+	if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
 		inject->tool.sample = perf_event__inject_buildid;
 	} else if (inject->sched_stat) {
 		struct evsel *evsel;
@@ -2430,8 +2383,8 @@ int cmd_inject(int argc, const char **argv)
 
 #ifdef HAVE_JITDUMP
 	if (inject.jit_mode) {
-		inject.tool.mmap2	   = perf_event__jit_repipe_mmap2;
-		inject.tool.mmap	   = perf_event__jit_repipe_mmap;
+		inject.tool.mmap2	   = perf_event__repipe_mmap2;
+		inject.tool.mmap	   = perf_event__repipe_mmap;
 		inject.tool.ordering_requires_timestamps = true;
 		/*
 		 * JIT MMAP injection injects all MMAP events in one go, so it
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 09/13] perf inject: Combine mmap and mmap2 handling
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (7 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 08/13] perf inject: Combine different mmap and mmap2 functions Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 10/13] perf inject: Fix build ID injection Ian Rogers
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

The handling of mmap and mmap2 events is near identical. Add a common
helper function and call that by the two event handling functions.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c | 118 +++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 62 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index d99868953ff2..a7c859db2e15 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -422,55 +422,21 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
 	return dso;
 }
 
-static int perf_event__repipe_mmap(const struct perf_tool *tool,
-				   union perf_event *event,
-				   struct perf_sample *sample,
-				   struct machine *machine)
-{
-	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
-
-#ifdef HAVE_JITDUMP
-	if (inject->jit_mode) {
-		u64 n = 0;
-		int ret;
-
-		/* If jit marker, then inject jit mmaps and generate ELF images. */
-		ret = jit_process(inject->session, &inject->output, machine,
-				  event->mmap.filename, event->mmap.pid, event->mmap.tid, &n);
-		if (ret < 0)
-			return ret;
-		if (ret) {
-			inject->bytes_written += n;
-			return 0;
-		}
-	}
-#endif
-	if (inject->build_id_style == BID_RWS__INJECT_HEADER_ALL) {
-		struct dso *dso = findnew_dso(event->mmap.pid, event->mmap.tid,
-					      event->mmap.filename, NULL, machine);
-
-		if (dso && !dso__hit(dso)) {
-			dso__set_hit(dso);
-			dso__inject_build_id(dso, tool, machine, sample->cpumode, 0);
-		}
-		dso__put(dso);
-	} else {
-		/* Create the thread, map, etc. Not done for the unordered inject all case. */
-		int err = perf_event__process_mmap(tool, event, sample, machine);
-
-		if (err)
-			return err;
-	}
-	return perf_event__repipe(tool, event, sample, machine);
-}
-
-static int perf_event__repipe_mmap2(const struct perf_tool *tool,
-				   union perf_event *event,
-				   struct perf_sample *sample,
-				   struct machine *machine)
+static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
+					  union perf_event *event,
+					  struct perf_sample *sample,
+					  struct machine *machine,
+					  __u32 pid, __u32 tid, __u32 flags,
+					  const char *filename,
+					  const struct dso_id *dso_id,
+					  int (*perf_event_process)(const struct perf_tool *tool,
+								    union perf_event *event,
+								    struct perf_sample *sample,
+								    struct machine *machine))
 {
 	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
 	struct dso *dso = NULL;
+	bool dso_sought = false;
 
 #ifdef HAVE_JITDUMP
 	if (inject->jit_mode) {
@@ -479,7 +445,7 @@ static int perf_event__repipe_mmap2(const struct perf_tool *tool,
 
 		/* If jit marker, then inject jit mmaps and generate ELF images. */
 		ret = jit_process(inject->session, &inject->output, machine,
-				event->mmap2.filename, event->mmap2.pid, event->mmap2.tid, &n);
+				  filename, pid, tid, &n);
 		if (ret < 0)
 			return ret;
 		if (ret) {
@@ -489,33 +455,26 @@ static int perf_event__repipe_mmap2(const struct perf_tool *tool,
 	}
 #endif
 	if (event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID) {
-		dso = findnew_dso(event->mmap2.pid, event->mmap2.tid,
-				  event->mmap2.filename, NULL, machine);
+		dso = findnew_dso(pid, tid, filename, dso_id, machine);
+		dso_sought = true;
 		if (dso) {
 			/* mark it not to inject build-id */
 			dso__set_hit(dso);
 		}
 	}
 	if (inject->build_id_style == BID_RWS__INJECT_HEADER_ALL) {
-		if (!(event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID)) {
-			struct dso_id dso_id = {
-				.maj = event->mmap2.maj,
-				.min = event->mmap2.min,
-				.ino = event->mmap2.ino,
-				.ino_generation = event->mmap2.ino_generation,
-			};
-
-			dso = findnew_dso(event->mmap2.pid, event->mmap2.tid,
-					  event->mmap2.filename, &dso_id, machine);
+		if (!dso_sought) {
+			dso = findnew_dso(pid, tid, filename, dso_id, machine);
+			dso_sought = true;
 		}
+
 		if (dso && !dso__hit(dso)) {
 			dso__set_hit(dso);
-			dso__inject_build_id(dso, tool, machine, sample->cpumode,
-					event->mmap2.flags);
+			dso__inject_build_id(dso, tool, machine, sample->cpumode, flags);
 		}
 	} else {
 		/* Create the thread, map, etc. Not done for the unordered inject all case. */
-		int err = perf_event__process_mmap(tool, event, sample, machine);
+		int err = perf_event_process(tool, event, sample, machine);
 
 		if (err) {
 			dso__put(dso);
@@ -526,6 +485,41 @@ static int perf_event__repipe_mmap2(const struct perf_tool *tool,
 	return perf_event__repipe(tool, event, sample, machine);
 }
 
+static int perf_event__repipe_mmap(const struct perf_tool *tool,
+				union perf_event *event,
+				struct perf_sample *sample,
+				struct machine *machine)
+{
+	return perf_event__repipe_common_mmap(
+		tool, event, sample, machine,
+		event->mmap.pid, event->mmap.tid, /*flags=*/0,
+		event->mmap.filename, /*dso_id=*/NULL,
+		perf_event__process_mmap);
+}
+
+static int perf_event__repipe_mmap2(const struct perf_tool *tool,
+				union perf_event *event,
+				struct perf_sample *sample,
+				struct machine *machine)
+{
+	struct dso_id id;
+	struct dso_id *dso_id = NULL;
+
+	if (!(event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID)) {
+		id.maj = event->mmap2.maj;
+		id.min = event->mmap2.min;
+		id.ino = event->mmap2.ino;
+		id.ino_generation = event->mmap2.ino_generation;
+		dso_id = &id;
+	}
+
+	return perf_event__repipe_common_mmap(
+		tool, event, sample, machine,
+		event->mmap2.pid, event->mmap2.tid, event->mmap2.flags,
+		event->mmap2.filename, dso_id,
+		perf_event__process_mmap2);
+}
+
 static int perf_event__repipe_fork(const struct perf_tool *tool,
 				   union perf_event *event,
 				   struct perf_sample *sample,
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 10/13] perf inject: Fix build ID injection
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (8 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 09/13] perf inject: Combine mmap and mmap2 handling Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-19 18:01   ` Arnaldo Carvalho de Melo
  2024-08-17  6:44 ` [PATCH v1 11/13] perf inject: Add new mmap2-buildid-all option Ian Rogers
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Build ID injection wasn't inserting a sample ID and aligning events to
64 bytes rather than 8. No sample ID means events are unordered and
two different build_id events for the same path, as happens when a
file is replaced, can't be differentiated.

Add in sample ID insertion for the build_id events alongside some
refactoring. The refactoring better aligns the function arguments for
different use cases, such as synthesizing build_id events without
needing to have a dso. The misc bits are explicitly passed as with
callchains the maps/dsos may span user and kernel land, so using
sample->cpumode isn't good enough.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c        | 170 ++++++++++++++++++++++-------
 tools/perf/util/build-id.c         |   6 +-
 tools/perf/util/synthetic-events.c |  44 ++++++--
 tools/perf/util/synthetic-events.h |  10 +-
 4 files changed, 175 insertions(+), 55 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a7c859db2e15..84a4bdb5cb0a 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -131,6 +131,7 @@ struct perf_inject {
 	struct perf_file_section secs[HEADER_FEAT_BITS];
 	struct guest_session	guest_session;
 	struct strlist		*known_build_ids;
+	const struct evsel	*mmap_evsel;
 };
 
 struct event_entry {
@@ -139,8 +140,13 @@ struct event_entry {
 	union perf_event event[];
 };
 
-static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
-				struct machine *machine, u8 cpumode, u32 flags);
+static int dso__inject_build_id(const struct perf_tool *tool,
+				struct perf_sample *sample,
+				struct machine *machine,
+				const struct evsel *evsel,
+				__u16 misc,
+				const char *filename,
+				struct dso *dso, u32 flags);
 
 static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
 {
@@ -422,6 +428,28 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
 	return dso;
 }
 
+/*
+ * The evsel used for the sample ID for mmap events. Typically stashed when
+ * processing mmap events. If not stashed, search the evlist for the first mmap
+ * gathering event.
+ */
+static const struct evsel *inject__mmap_evsel(struct perf_inject *inject)
+{
+	struct evsel *pos;
+
+	if (inject->mmap_evsel)
+		return inject->mmap_evsel;
+
+	evlist__for_each_entry(inject->session->evlist, pos) {
+		if (pos->core.attr.mmap) {
+			inject->mmap_evsel = pos;
+			return pos;
+		}
+	}
+	pr_err("No mmap events found\n");
+	return NULL;
+}
+
 static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
 					  union perf_event *event,
 					  struct perf_sample *sample,
@@ -469,12 +497,28 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
 		}
 
 		if (dso && !dso__hit(dso)) {
-			dso__set_hit(dso);
-			dso__inject_build_id(dso, tool, machine, sample->cpumode, flags);
+			struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
+
+			if (evsel) {
+				dso__set_hit(dso);
+				dso__inject_build_id(tool, sample, machine, evsel,
+						     /*misc=*/sample->cpumode,
+						     filename, dso, flags);
+			}
 		}
 	} else {
+		int err;
+
+		/*
+		 * Remember the evsel for lazy build id generation. It is used
+		 * for the sample id header type.
+		 */
+		if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
+		    !inject->mmap_evsel)
+			inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
+
 		/* Create the thread, map, etc. Not done for the unordered inject all case. */
-		int err = perf_event_process(tool, event, sample, machine);
+		err = perf_event_process(tool, event, sample, machine);
 
 		if (err) {
 			dso__put(dso);
@@ -667,16 +711,20 @@ static bool perf_inject__lookup_known_build_id(struct perf_inject *inject,
 	return false;
 }
 
-static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
-				struct machine *machine, u8 cpumode, u32 flags)
+static int dso__inject_build_id(const struct perf_tool *tool,
+				struct perf_sample *sample,
+				struct machine *machine,
+				const struct evsel *evsel,
+				__u16 misc,
+				const char *filename,
+				struct dso *dso, u32 flags)
 {
-	struct perf_inject *inject = container_of(tool, struct perf_inject,
-						  tool);
+	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
 	int err;
 
-	if (is_anon_memory(dso__long_name(dso)) || flags & MAP_HUGETLB)
+	if (is_anon_memory(filename) || flags & MAP_HUGETLB)
 		return 0;
-	if (is_no_dso_memory(dso__long_name(dso)))
+	if (is_no_dso_memory(filename))
 		return 0;
 
 	if (inject->known_build_ids != NULL &&
@@ -684,24 +732,65 @@ static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
 		return 1;
 
 	if (dso__read_build_id(dso) < 0) {
-		pr_debug("no build_id found for %s\n", dso__long_name(dso));
+		pr_debug("no build_id found for %s\n", filename);
 		return -1;
 	}
 
-	err = perf_event__synthesize_build_id(tool, dso, cpumode,
-					      perf_event__repipe, machine);
+	err = perf_event__synthesize_build_id(tool, sample, machine,
+					      perf_event__repipe,
+					      evsel, misc, dso__bid(dso),
+					      filename);
 	if (err) {
-		pr_err("Can't synthesize build_id event for %s\n", dso__long_name(dso));
+		pr_err("Can't synthesize build_id event for %s\n", filename);
 		return -1;
 	}
 
 	return 0;
 }
 
+static int mark_dso_hit(const struct perf_tool *tool,
+			struct perf_sample *sample,
+			struct machine *machine,
+			const struct evsel *mmap_evsel,
+			struct map *map, bool sample_in_dso)
+{
+	struct dso *dso;
+	u16 misc = sample->cpumode;
+
+	if (!map)
+		return 0;
+
+	if (!sample_in_dso) {
+		u16 guest_mask = PERF_RECORD_MISC_GUEST_KERNEL |
+			PERF_RECORD_MISC_GUEST_USER;
+
+		if ((misc & guest_mask) != 0) {
+			misc &= PERF_RECORD_MISC_HYPERVISOR;
+			misc |= __map__is_kernel(map)
+				? PERF_RECORD_MISC_GUEST_KERNEL
+				: PERF_RECORD_MISC_GUEST_USER;
+		} else {
+			misc &= PERF_RECORD_MISC_HYPERVISOR;
+			misc |= __map__is_kernel(map)
+				? PERF_RECORD_MISC_KERNEL
+				: PERF_RECORD_MISC_USER;
+		}
+	}
+	dso = map__dso(map);
+	if (dso && !dso__hit(dso)) {
+		dso__set_hit(dso);
+		dso__inject_build_id(tool, sample, machine,
+				mmap_evsel, misc, dso__long_name(dso), dso,
+				map__flags(map));
+	}
+	return 0;
+}
+
 struct mark_dso_hit_args {
 	const struct perf_tool *tool;
+	struct perf_sample *sample;
 	struct machine *machine;
-	u8 cpumode;
+	const struct evsel *mmap_evsel;
 };
 
 static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
@@ -709,16 +798,8 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
 	struct mark_dso_hit_args *args = data;
 	struct map *map = node->ms.map;
 
-	if (map) {
-		struct dso *dso = map__dso(map);
-
-		if (dso && !dso__hit(dso)) {
-			dso__set_hit(dso);
-			dso__inject_build_id(dso, args->tool, args->machine,
-					     args->cpumode, map__flags(map));
-		}
-	}
-	return 0;
+	return mark_dso_hit(args->tool, args->sample, args->machine,
+			    args->mmap_evsel, map, /*sample_in_dso=*/false);
 }
 
 int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *event,
@@ -728,10 +809,16 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
 {
 	struct addr_location al;
 	struct thread *thread;
+	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
 	struct mark_dso_hit_args args = {
 		.tool = tool,
+		/*
+		 * Use the parsed sample data of the sample event, which will
+		 * have a later timestamp than the mmap event.
+		 */
+		.sample = sample,
 		.machine = machine,
-		.cpumode = sample->cpumode,
+		.mmap_evsel = inject__mmap_evsel(inject),
 	};
 
 	addr_location__init(&al);
@@ -743,13 +830,8 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
 	}
 
 	if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
-		struct dso *dso = map__dso(al.map);
-
-		if (!dso__hit(dso)) {
-			dso__set_hit(dso);
-			dso__inject_build_id(dso, tool, machine,
-					     sample->cpumode, map__flags(al.map));
-		}
+		mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
+			     /*sample_in_dso=*/true);
 	}
 
 	sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
@@ -1159,17 +1241,27 @@ static int process_build_id(const struct perf_tool *tool,
 static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_t machine_pid)
 {
 	struct machine *machine = perf_session__findnew_machine(inject->session, machine_pid);
-	u8 cpumode = dso__is_in_kernel_space(dso) ?
-			PERF_RECORD_MISC_GUEST_KERNEL :
-			PERF_RECORD_MISC_GUEST_USER;
+	struct perf_sample synth_sample = {
+		.pid	   = -1,
+		.tid	   = -1,
+		.time	   = -1,
+		.stream_id = -1,
+		.cpu	   = -1,
+		.period	   = 1,
+		.cpumode   = dso__is_in_kernel_space(dso)
+		? PERF_RECORD_MISC_GUEST_KERNEL
+		: PERF_RECORD_MISC_GUEST_USER,
+	};
 
 	if (!machine)
 		return -ENOMEM;
 
 	dso__set_hit(dso);
 
-	return perf_event__synthesize_build_id(&inject->tool, dso, cpumode,
-					       process_build_id, machine);
+	return perf_event__synthesize_build_id(&inject->tool, &synth_sample, machine,
+					       process_build_id, inject__mmap_evsel(inject),
+					       /*misc=*/synth_sample.cpumode,
+					       dso__bid(dso), dso__long_name(dso));
 }
 
 static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 451d145fa4ed..8982f68e7230 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -277,8 +277,8 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
 	struct perf_record_header_build_id b;
 	size_t len;
 
-	len = name_len + 1;
-	len = PERF_ALIGN(len, NAME_ALIGN);
+	len = sizeof(b) + name_len + 1;
+	len = PERF_ALIGN(len, sizeof(u64));
 
 	memset(&b, 0, sizeof(b));
 	memcpy(&b.data, bid->data, bid->size);
@@ -286,7 +286,7 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
 	misc |= PERF_RECORD_MISC_BUILD_ID_SIZE;
 	b.pid = pid;
 	b.header.misc = misc;
-	b.header.size = sizeof(b) + len;
+	b.header.size = len;
 
 	err = do_write(fd, &b, sizeof(b));
 	if (err < 0)
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 0a7f93ae76fb..6bb62e4e2d5d 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2225,28 +2225,48 @@ int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, st
 }
 #endif
 
-int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc,
-				    perf_event__handler_t process, struct machine *machine)
+int perf_event__synthesize_build_id(const struct perf_tool *tool,
+				    struct perf_sample *sample,
+				    struct machine *machine,
+				    perf_event__handler_t process,
+				    const struct evsel *evsel,
+				    __u16 misc,
+				    const struct build_id *bid,
+				    const char *filename)
 {
 	union perf_event ev;
 	size_t len;
 
-	if (!dso__hit(pos))
-		return 0;
+	len = sizeof(ev.build_id) + strlen(filename) + 1;
+	len = PERF_ALIGN(len, sizeof(u64));
 
-	memset(&ev, 0, sizeof(ev));
+	memset(&ev, 0, len);
 
-	len = dso__long_name_len(pos) + 1;
-	len = PERF_ALIGN(len, NAME_ALIGN);
-	ev.build_id.size = min(dso__bid(pos)->size, sizeof(dso__bid(pos)->data));
-	memcpy(&ev.build_id.build_id, dso__bid(pos)->data, ev.build_id.size);
+	ev.build_id.size = min(bid->size, sizeof(ev.build_id.build_id));
+	memcpy(ev.build_id.build_id, bid->data, ev.build_id.size);
 	ev.build_id.header.type = PERF_RECORD_HEADER_BUILD_ID;
 	ev.build_id.header.misc = misc | PERF_RECORD_MISC_BUILD_ID_SIZE;
 	ev.build_id.pid = machine->pid;
-	ev.build_id.header.size = sizeof(ev.build_id) + len;
-	memcpy(&ev.build_id.filename, dso__long_name(pos), dso__long_name_len(pos));
+	ev.build_id.header.size = len;
+	strcpy(ev.build_id.filename, filename);
+
+	if (evsel) {
+		void *array = &ev;
+		int ret;
 
-	return process(tool, &ev, NULL, machine);
+		array += ev.header.size;
+		ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
+		if (ret < 0)
+			return ret;
+
+		if (ret & 7) {
+			pr_err("Bad id sample size %d\n", ret);
+			return -EINVAL;
+		}
+
+		ev.header.size += ret;
+	}
+	return process(tool, &ev, sample, machine);
 }
 
 int perf_event__synthesize_stat_events(struct perf_stat_config *config, const struct perf_tool *tool,
diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
index 31df7653677f..795bf3e18396 100644
--- a/tools/perf/util/synthetic-events.h
+++ b/tools/perf/util/synthetic-events.h
@@ -9,6 +9,7 @@
 #include <perf/cpumap.h>
 
 struct auxtrace_record;
+struct build_id;
 struct dso;
 struct evlist;
 struct evsel;
@@ -45,7 +46,14 @@ typedef int (*perf_event__handler_t)(const struct perf_tool *tool, union perf_ev
 
 int perf_event__synthesize_attrs(const struct perf_tool *tool, struct evlist *evlist, perf_event__handler_t process);
 int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_attr *attr, u32 ids, u64 *id, perf_event__handler_t process);
-int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc, perf_event__handler_t process, struct machine *machine);
+int perf_event__synthesize_build_id(const struct perf_tool *tool,
+				    struct perf_sample *sample,
+				    struct machine *machine,
+				    perf_event__handler_t process,
+				    const struct evsel *evsel,
+				    __u16 misc,
+				    const struct build_id *bid,
+				    const char *filename);
 int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
 int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
 int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 11/13] perf inject: Add new mmap2-buildid-all option
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (9 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 10/13] perf inject: Fix build ID injection Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 12/13] perf inject: Lazy build-id mmap2 event insertion Ian Rogers
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Add an option that allows all mmap or mmap2 events to be rewritten as
mmap2 events with build IDs. This is similar to the existing
-b/--build-ids and --buildid-all options except instead of adding a
build_id event an existing mmap/mmap2 event is used as a template and
a new mmap2 event synthesized from it. As mmap2 events are typical
this avoids the insertion of build_id events.

Add test coverage to the pipe test.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c         | 88 ++++++++++++++++++++++++++++-
 tools/perf/tests/shell/pipe_test.sh |  1 +
 tools/perf/util/synthetic-events.c  | 57 +++++++++++++++++++
 tools/perf/util/synthetic-events.h  | 11 ++++
 4 files changed, 154 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 84a4bdb5cb0a..86425cade30e 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -107,6 +107,7 @@ enum build_id_rewrite_style {
 	BID_RWS__NONE = 0,
 	BID_RWS__INJECT_HEADER_LAZY,
 	BID_RWS__INJECT_HEADER_ALL,
+	BID_RWS__MMAP2_BUILDID_ALL,
 };
 
 struct perf_inject {
@@ -147,6 +148,16 @@ static int dso__inject_build_id(const struct perf_tool *tool,
 				__u16 misc,
 				const char *filename,
 				struct dso *dso, u32 flags);
+static int dso__inject_mmap2_build_id(const struct perf_tool *tool,
+				      struct perf_sample *sample,
+				      struct machine *machine,
+				      const struct evsel *evsel,
+				      __u16 misc,
+				      __u32 pid, __u32 tid,
+				      __u64 start, __u64 len, __u64 pgoff,
+				      struct dso *dso,
+				      __u32 prot, __u32 flags,
+				      const char *filename);
 
 static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
 {
@@ -162,6 +173,7 @@ static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
 
 static int perf_event__repipe_synth(const struct perf_tool *tool,
 				    union perf_event *event)
+
 {
 	struct perf_inject *inject = container_of(tool, struct perf_inject,
 						  tool);
@@ -454,7 +466,9 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
 					  union perf_event *event,
 					  struct perf_sample *sample,
 					  struct machine *machine,
-					  __u32 pid, __u32 tid, __u32 flags,
+					  __u32 pid, __u32 tid,
+					  __u64 start, __u64 len, __u64 pgoff,
+					  __u32 flags, __u32 prot,
 					  const char *filename,
 					  const struct dso_id *dso_id,
 					  int (*perf_event_process)(const struct perf_tool *tool,
@@ -525,6 +539,26 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
 			return err;
 		}
 	}
+	if ((inject->build_id_style == BID_RWS__MMAP2_BUILDID_ALL) &&
+	    !(event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID)) {
+		struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
+
+		if (evsel && !dso_sought) {
+			dso = findnew_dso(pid, tid, filename, dso_id, machine);
+			dso_sought = true;
+		}
+		if (evsel && dso &&
+		    !dso__inject_mmap2_build_id(tool, sample, machine, evsel,
+						sample->cpumode | PERF_RECORD_MISC_MMAP_BUILD_ID,
+						pid, tid, start, len, pgoff,
+						dso,
+						prot, flags,
+						filename)) {
+			/* Injected mmap2 so no need to repipe. */
+			dso__put(dso);
+			return 0;
+		}
+	}
 	dso__put(dso);
 	return perf_event__repipe(tool, event, sample, machine);
 }
@@ -536,7 +570,9 @@ static int perf_event__repipe_mmap(const struct perf_tool *tool,
 {
 	return perf_event__repipe_common_mmap(
 		tool, event, sample, machine,
-		event->mmap.pid, event->mmap.tid, /*flags=*/0,
+		event->mmap.pid, event->mmap.tid,
+		event->mmap.start, event->mmap.len, event->mmap.pgoff,
+		/*flags=*/0, PROT_EXEC,
 		event->mmap.filename, /*dso_id=*/NULL,
 		perf_event__process_mmap);
 }
@@ -559,7 +595,9 @@ static int perf_event__repipe_mmap2(const struct perf_tool *tool,
 
 	return perf_event__repipe_common_mmap(
 		tool, event, sample, machine,
-		event->mmap2.pid, event->mmap2.tid, event->mmap2.flags,
+		event->mmap2.pid, event->mmap2.tid,
+		event->mmap2.start, event->mmap2.len, event->mmap2.pgoff,
+		event->mmap2.flags, event->mmap2.prot,
 		event->mmap2.filename, dso_id,
 		perf_event__process_mmap2);
 }
@@ -748,6 +786,45 @@ static int dso__inject_build_id(const struct perf_tool *tool,
 	return 0;
 }
 
+static int dso__inject_mmap2_build_id(const struct perf_tool *tool,
+				      struct perf_sample *sample,
+				      struct machine *machine,
+				      const struct evsel *evsel,
+				      __u16 misc,
+				      __u32 pid, __u32 tid,
+				      __u64 start, __u64 len, __u64 pgoff,
+				      struct dso *dso,
+				      __u32 prot, __u32 flags,
+				      const char *filename)
+{
+	int err;
+
+	/* Return to repipe anonymous maps. */
+	if (is_anon_memory(filename) || flags & MAP_HUGETLB)
+		return 1;
+	if (is_no_dso_memory(filename))
+		return 1;
+
+	if (dso__read_build_id(dso)) {
+		pr_debug("no build_id found for %s\n", filename);
+		return -1;
+	}
+
+	err = perf_event__synthesize_mmap2_build_id(tool, sample, machine,
+						    perf_event__repipe,
+						    evsel,
+						    misc, pid, tid,
+						    start, len, pgoff,
+						    dso__bid(dso),
+						    prot, flags,
+						    filename);
+	if (err) {
+		pr_err("Can't synthesize build_id event for %s\n", filename);
+		return -1;
+	}
+	return 0;
+}
+
 static int mark_dso_hit(const struct perf_tool *tool,
 			struct perf_sample *sample,
 			struct machine *machine,
@@ -2257,12 +2334,15 @@ int cmd_inject(int argc, const char **argv)
 	const char *known_build_ids = NULL;
 	bool build_ids;
 	bool build_id_all;
+	bool mmap2_build_id_all;
 
 	struct option options[] = {
 		OPT_BOOLEAN('b', "build-ids", &build_ids,
 			    "Inject build-ids into the output stream"),
 		OPT_BOOLEAN(0, "buildid-all", &build_id_all,
 			    "Inject build-ids of all DSOs into the output stream"),
+		OPT_BOOLEAN(0, "mmap2-buildid-all", &mmap2_build_id_all,
+			    "Rewrite all mmap events as mmap2 events with build IDs"),
 		OPT_STRING(0, "known-build-ids", &known_build_ids,
 			   "buildid path [,buildid path...]",
 			   "build-ids to use for given paths"),
@@ -2359,6 +2439,8 @@ int cmd_inject(int argc, const char **argv)
 			return -1;
 		}
 	}
+	if (mmap2_build_id_all)
+		inject.build_id_style = BID_RWS__MMAP2_BUILDID_ALL;
 	if (build_ids)
 		inject.build_id_style = BID_RWS__INJECT_HEADER_LAZY;
 	if (build_id_all)
diff --git a/tools/perf/tests/shell/pipe_test.sh b/tools/perf/tests/shell/pipe_test.sh
index ad10012fdc29..1cacd6f15bcb 100755
--- a/tools/perf/tests/shell/pipe_test.sh
+++ b/tools/perf/tests/shell/pipe_test.sh
@@ -92,6 +92,7 @@ test_inject_bids() {
 test_record_report
 test_inject_bids -b
 test_inject_bids --buildid-all
+test_inject_bids --mmap2-buildid-all
 
 cleanup
 exit $err
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 6bb62e4e2d5d..a58444c4aed1 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2266,6 +2266,63 @@ int perf_event__synthesize_build_id(const struct perf_tool *tool,
 
 		ev.header.size += ret;
 	}
+
+	return process(tool, &ev, sample, machine);
+}
+
+int perf_event__synthesize_mmap2_build_id(const struct perf_tool *tool,
+					  struct perf_sample *sample,
+					  struct machine *machine,
+					  perf_event__handler_t process,
+					  const struct evsel *evsel,
+					  __u16 misc,
+					  __u32 pid, __u32 tid,
+					  __u64 start, __u64 len, __u64 pgoff,
+					  const struct build_id *bid,
+					  __u32 prot, __u32 flags,
+					  const char *filename)
+{
+	union perf_event ev;
+	size_t ev_len;
+	void *array;
+	int ret;
+
+	ev_len = sizeof(ev.mmap2) - sizeof(ev.mmap2.filename) + strlen(filename) + 1;
+	ev_len = PERF_ALIGN(ev_len, sizeof(u64));
+
+	memset(&ev, 0, ev_len);
+
+	ev.mmap2.header.type = PERF_RECORD_MMAP2;
+	ev.mmap2.header.misc = misc | PERF_RECORD_MISC_MMAP_BUILD_ID;
+	ev.mmap2.header.size = ev_len;
+
+	ev.mmap2.pid = pid;
+	ev.mmap2.tid = tid;
+	ev.mmap2.start = start;
+	ev.mmap2.len = len;
+	ev.mmap2.pgoff = pgoff;
+
+	ev.mmap2.build_id_size = min(bid->size, sizeof(ev.mmap2.build_id));
+	memcpy(ev.mmap2.build_id, bid->data, ev.mmap2.build_id_size);
+
+	ev.mmap2.prot = prot;
+	ev.mmap2.flags = flags;
+
+	memcpy(ev.mmap2.filename, filename, min(strlen(filename), sizeof(ev.mmap.filename)));
+
+	array = &ev;
+	array += ev.header.size;
+	ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
+	if (ret < 0)
+		return ret;
+
+	if (ret & 7) {
+		pr_err("Bad id sample size %d\n", ret);
+		return -EINVAL;
+	}
+
+	ev.header.size += ret;
+
 	return process(tool, &ev, sample, machine);
 }
 
diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
index 795bf3e18396..b9c936b5cfeb 100644
--- a/tools/perf/util/synthetic-events.h
+++ b/tools/perf/util/synthetic-events.h
@@ -54,6 +54,17 @@ int perf_event__synthesize_build_id(const struct perf_tool *tool,
 				    __u16 misc,
 				    const struct build_id *bid,
 				    const char *filename);
+int perf_event__synthesize_mmap2_build_id(const struct perf_tool *tool,
+					  struct perf_sample *sample,
+					  struct machine *machine,
+					  perf_event__handler_t process,
+					  const struct evsel *evsel,
+					  __u16 misc,
+					  __u32 pid, __u32 tid,
+					  __u64 start, __u64 len, __u64 pgoff,
+					  const struct build_id *bid,
+					  __u32 prot, __u32 flags,
+					  const char *filename);
 int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
 int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
 int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 12/13] perf inject: Lazy build-id mmap2 event insertion
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (10 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 11/13] perf inject: Add new mmap2-buildid-all option Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-08-17  6:44 ` [PATCH v1 13/13] perf callchain: Allow symbols to be optional when resolving a callchain Ian Rogers
  2024-09-02 18:27 ` [PATCH v1 00/13] perf inject improvements Namhyung Kim
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

Add -B option that lazily inserts mmap2 events thereby dropping all
mmap events without samples. This is similar to the behavior of -b
where only build_id events are inserted when a dso is accessed in a
sample.

File size savings can be significant in system-wide mode, consider:
```
$ perf record -g -a -o perf.data sleep 1
$ perf inject -B -i perf.data -o perf.new.data
$ ls -al perf.data perf.new.data
         5147049 perf.data
         2248493 perf.new.data
```

Give test coverage of the new option in pipe test.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c         | 62 +++++++++++++++++++++++------
 tools/perf/tests/shell/pipe_test.sh |  1 +
 tools/perf/util/map.c               |  1 +
 tools/perf/util/map.h               | 11 +++++
 4 files changed, 63 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 86425cade30e..2ff246f56a44 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -108,6 +108,7 @@ enum build_id_rewrite_style {
 	BID_RWS__INJECT_HEADER_LAZY,
 	BID_RWS__INJECT_HEADER_ALL,
 	BID_RWS__MMAP2_BUILDID_ALL,
+	BID_RWS__MMAP2_BUILDID_LAZY,
 };
 
 struct perf_inject {
@@ -527,7 +528,8 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
 		 * Remember the evsel for lazy build id generation. It is used
 		 * for the sample id header type.
 		 */
-		if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
+		if ((inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+		     inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) &&
 		    !inject->mmap_evsel)
 			inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
 
@@ -560,6 +562,9 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
 		}
 	}
 	dso__put(dso);
+	if (inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY)
+		return 0;
+
 	return perf_event__repipe(tool, event, sample, machine);
 }
 
@@ -825,7 +830,8 @@ static int dso__inject_mmap2_build_id(const struct perf_tool *tool,
 	return 0;
 }
 
-static int mark_dso_hit(const struct perf_tool *tool,
+static int mark_dso_hit(const struct perf_inject *inject,
+			const struct perf_tool *tool,
 			struct perf_sample *sample,
 			struct machine *machine,
 			const struct evsel *mmap_evsel,
@@ -854,16 +860,39 @@ static int mark_dso_hit(const struct perf_tool *tool,
 		}
 	}
 	dso = map__dso(map);
-	if (dso && !dso__hit(dso)) {
-		dso__set_hit(dso);
-		dso__inject_build_id(tool, sample, machine,
-				mmap_evsel, misc, dso__long_name(dso), dso,
-				map__flags(map));
+	if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
+		if (dso && !dso__hit(dso)) {
+			dso__set_hit(dso);
+			dso__inject_build_id(tool, sample, machine,
+					     mmap_evsel, misc, dso__long_name(dso), dso,
+					     map__flags(map));
+		}
+	} else if (inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) {
+		if (!map__hit(map)) {
+			const struct build_id null_bid = { .size = 0 };
+			const struct build_id *bid = dso ? dso__bid(dso) : &null_bid;
+			const char *filename = dso ? dso__long_name(dso) : "";
+
+			map__set_hit(map);
+			perf_event__synthesize_mmap2_build_id(tool, sample, machine,
+								perf_event__repipe,
+								mmap_evsel,
+								misc,
+								sample->pid, sample->tid,
+								map__start(map),
+								map__end(map) - map__start(map),
+								map__pgoff(map),
+								bid,
+								map__prot(map),
+								map__flags(map),
+								filename);
+		}
 	}
 	return 0;
 }
 
 struct mark_dso_hit_args {
+	const struct perf_inject *inject;
 	const struct perf_tool *tool;
 	struct perf_sample *sample;
 	struct machine *machine;
@@ -875,7 +904,7 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
 	struct mark_dso_hit_args *args = data;
 	struct map *map = node->ms.map;
 
-	return mark_dso_hit(args->tool, args->sample, args->machine,
+	return mark_dso_hit(args->inject, args->tool, args->sample, args->machine,
 			    args->mmap_evsel, map, /*sample_in_dso=*/false);
 }
 
@@ -888,6 +917,7 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
 	struct thread *thread;
 	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
 	struct mark_dso_hit_args args = {
+		.inject = inject,
 		.tool = tool,
 		/*
 		 * Use the parsed sample data of the sample event, which will
@@ -907,7 +937,7 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
 	}
 
 	if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
-		mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
+		mark_dso_hit(inject, tool, sample, machine, args.mmap_evsel, al.map,
 			     /*sample_in_dso=*/true);
 	}
 
@@ -2151,7 +2181,8 @@ static int __cmd_inject(struct perf_inject *inject)
 
 	output_data_offset = perf_session__data_offset(session->evlist);
 
-	if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
+	if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+	    inject->build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) {
 		inject->tool.sample = perf_event__inject_buildid;
 	} else if (inject->sched_stat) {
 		struct evsel *evsel;
@@ -2334,6 +2365,7 @@ int cmd_inject(int argc, const char **argv)
 	const char *known_build_ids = NULL;
 	bool build_ids;
 	bool build_id_all;
+	bool mmap2_build_ids;
 	bool mmap2_build_id_all;
 
 	struct option options[] = {
@@ -2341,6 +2373,8 @@ int cmd_inject(int argc, const char **argv)
 			    "Inject build-ids into the output stream"),
 		OPT_BOOLEAN(0, "buildid-all", &build_id_all,
 			    "Inject build-ids of all DSOs into the output stream"),
+		OPT_BOOLEAN('B', "mmap2-buildids", &mmap2_build_ids,
+			    "Drop unused mmap events, make others mmap2 with build IDs"),
 		OPT_BOOLEAN(0, "mmap2-buildid-all", &mmap2_build_id_all,
 			    "Rewrite all mmap events as mmap2 events with build IDs"),
 		OPT_STRING(0, "known-build-ids", &known_build_ids,
@@ -2439,6 +2473,8 @@ int cmd_inject(int argc, const char **argv)
 			return -1;
 		}
 	}
+	if (mmap2_build_ids)
+		inject.build_id_style = BID_RWS__MMAP2_BUILDID_LAZY;
 	if (mmap2_build_id_all)
 		inject.build_id_style = BID_RWS__MMAP2_BUILDID_ALL;
 	if (build_ids)
@@ -2458,7 +2494,8 @@ int cmd_inject(int argc, const char **argv)
 			repipe = false;
 	}
 	ordered_events = inject.jit_mode || inject.sched_stat ||
-		(inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY);
+		inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+		inject.build_id_style == BID_RWS__MMAP2_BUILDID_LAZY;
 	perf_tool__init(&inject.tool, ordered_events);
 	inject.tool.sample		= perf_event__repipe_sample;
 	inject.tool.read		= perf_event__repipe_sample;
@@ -2530,7 +2567,8 @@ int cmd_inject(int argc, const char **argv)
 			goto out_delete;
 	}
 
-	if (inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
+	if (inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY ||
+	    inject.build_id_style == BID_RWS__MMAP2_BUILDID_LAZY) {
 		/*
 		 * to make sure the mmap records are ordered correctly
 		 * and so that the correct especially due to jitted code
diff --git a/tools/perf/tests/shell/pipe_test.sh b/tools/perf/tests/shell/pipe_test.sh
index 1cacd6f15bcb..9b1fcbfeb837 100755
--- a/tools/perf/tests/shell/pipe_test.sh
+++ b/tools/perf/tests/shell/pipe_test.sh
@@ -90,6 +90,7 @@ test_inject_bids() {
 }
 
 test_record_report
+test_inject_bids -B
 test_inject_bids -b
 test_inject_bids --buildid-all
 test_inject_bids --mmap2-buildid-all
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index e781c8d56a9a..d729438b7d65 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -116,6 +116,7 @@ static void map__init(struct map *map, u64 start, u64 end, u64 pgoff,
 	map__set_mapping_type(map, MAPPING_TYPE__DSO);
 	assert(map__erange_warned(map) == false);
 	assert(map__priv(map) == false);
+	assert(map__hit(map) == false);
 }
 
 struct map *map__new(struct machine *machine, u64 start, u64 len,
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 6c43f31a9fe0..4262f5a143be 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -35,6 +35,7 @@ DECLARE_RC_STRUCT(map) {
 	enum mapping_type	mapping_type:8;
 	bool			erange_warned;
 	bool			priv;
+	bool			hit;
 };
 
 struct kmap;
@@ -83,6 +84,11 @@ static inline bool map__priv(const struct map *map)
 	return RC_CHK_ACCESS(map)->priv;
 }
 
+static inline bool map__hit(const struct map *map)
+{
+	return RC_CHK_ACCESS(map)->hit;
+}
+
 static inline refcount_t *map__refcnt(struct map *map)
 {
 	return &RC_CHK_ACCESS(map)->refcnt;
@@ -287,6 +293,11 @@ static inline void map__set_priv(struct map *map)
 	RC_CHK_ACCESS(map)->priv = true;
 }
 
+static inline void map__set_hit(struct map *map)
+{
+	RC_CHK_ACCESS(map)->hit = true;
+}
+
 static inline void map__set_erange_warned(struct map *map)
 {
 	RC_CHK_ACCESS(map)->erange_warned = true;
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 13/13] perf callchain: Allow symbols to be optional when resolving a callchain
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (11 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 12/13] perf inject: Lazy build-id mmap2 event insertion Ian Rogers
@ 2024-08-17  6:44 ` Ian Rogers
  2024-09-02 18:27 ` [PATCH v1 00/13] perf inject improvements Namhyung Kim
  13 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2024-08-17  6:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Masahiro Yamada,
	Arnd Bergmann, Jann Horn, Colin Ian King, Casey Chen,
	Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
	Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet,
	Anne Macedo, Sun Haiyong, linux-perf-users, linux-kernel

In uses like perf inject it is not necessary to gather the symbol for
each call chain location, the map for the sample IP is wanted so that
build IDs and the like can be injected. Make gathering the symbol in
the callchain_cursor optional.

For a perf inject -B command this lowers the peak RSS from 54.1MB to
29.6MB by avoiding loading symbols.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c |  2 +-
 tools/perf/util/callchain.c |  8 ++--
 tools/perf/util/callchain.h |  2 +-
 tools/perf/util/machine.c   | 92 +++++++++++++++++++++----------------
 tools/perf/util/machine.h   | 33 ++++++++++---
 5 files changed, 85 insertions(+), 52 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 2ff246f56a44..8ad7cc5f0c63 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -942,7 +942,7 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
 	}
 
 	sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
-					mark_dso_hit_callback, &args);
+					/*symbols=*/false, mark_dso_hit_callback, &args);
 
 	thread__put(thread);
 repipe:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 0d608e875fe9..0c7564747a14 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1800,7 +1800,7 @@ s64 callchain_avg_cycles(struct callchain_node *cnode)
 
 int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 				    struct perf_sample *sample, int max_stack,
-				    callchain_iter_fn cb, void *data)
+				    bool symbols, callchain_iter_fn cb, void *data)
 {
 	struct callchain_cursor *cursor = get_tls_callchain_cursor();
 	int ret;
@@ -1809,9 +1809,9 @@ int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 		return -ENOMEM;
 
 	/* Fill in the callchain. */
-	ret = thread__resolve_callchain(thread, cursor, evsel, sample,
-					/*parent=*/NULL, /*root_al=*/NULL,
-					max_stack);
+	ret = __thread__resolve_callchain(thread, cursor, evsel, sample,
+					  /*parent=*/NULL, /*root_al=*/NULL,
+					  max_stack, symbols);
 	if (ret)
 		return ret;
 
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 76891f8e2373..86ed9e4d04f9 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -315,6 +315,6 @@ typedef int (*callchain_iter_fn)(struct callchain_cursor_node *node, void *data)
 
 int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 				    struct perf_sample *sample, int max_stack,
-				    callchain_iter_fn cb, void *data);
+				    bool symbols, callchain_iter_fn cb, void *data);
 
 #endif	/* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index cd79a830abae..6cffee6f9891 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2059,7 +2059,8 @@ static int add_callchain_ip(struct thread *thread,
 			    bool branch,
 			    struct branch_flags *flags,
 			    struct iterations *iter,
-			    u64 branch_from)
+			    u64 branch_from,
+			    bool symbols)
 {
 	struct map_symbol ms = {};
 	struct addr_location al;
@@ -2098,7 +2099,8 @@ static int add_callchain_ip(struct thread *thread,
 			}
 			goto out;
 		}
-		thread__find_symbol(thread, *cpumode, ip, &al);
+		if (symbols)
+			thread__find_symbol(thread, *cpumode, ip, &al);
 	}
 
 	if (al.sym != NULL) {
@@ -2227,7 +2229,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
 				       struct symbol **parent,
 				       struct addr_location *root_al,
 				       u64 branch_from,
-				       bool callee, int end)
+				       bool callee, int end,
+				       bool symbols)
 {
 	struct ip_callchain *chain = sample->callchain;
 	u8 cpumode = PERF_RECORD_MISC_USER;
@@ -2237,7 +2240,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
 		for (i = 0; i < end + 1; i++) {
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, &cpumode, chain->ips[i],
-					       false, NULL, NULL, branch_from);
+					       false, NULL, NULL, branch_from,
+					       symbols);
 			if (err)
 				return err;
 		}
@@ -2247,7 +2251,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
 	for (i = end; i >= 0; i--) {
 		err = add_callchain_ip(thread, cursor, parent,
 				       root_al, &cpumode, chain->ips[i],
-				       false, NULL, NULL, branch_from);
+				       false, NULL, NULL, branch_from,
+				       symbols);
 		if (err)
 			return err;
 	}
@@ -2290,7 +2295,8 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 				    struct symbol **parent,
 				    struct addr_location *root_al,
 				    u64 *branch_from,
-				    bool callee)
+				    bool callee,
+				    bool symbols)
 {
 	struct branch_stack *lbr_stack = sample->branch_stack;
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
@@ -2323,7 +2329,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 		err = add_callchain_ip(thread, cursor, parent,
 				       root_al, &cpumode, ip,
 				       true, flags, NULL,
-				       *branch_from);
+				       *branch_from, symbols);
 		if (err)
 			return err;
 
@@ -2348,7 +2354,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, &cpumode, ip,
 					       true, flags, NULL,
-					       *branch_from);
+					       *branch_from, symbols);
 			if (err)
 				return err;
 			save_lbr_cursor_node(thread, cursor, i);
@@ -2363,7 +2369,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 		err = add_callchain_ip(thread, cursor, parent,
 				       root_al, &cpumode, ip,
 				       true, flags, NULL,
-				       *branch_from);
+				       *branch_from, symbols);
 		if (err)
 			return err;
 		save_lbr_cursor_node(thread, cursor, i);
@@ -2377,7 +2383,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 		err = add_callchain_ip(thread, cursor, parent,
 				root_al, &cpumode, ip,
 				true, flags, NULL,
-				*branch_from);
+				*branch_from, symbols);
 		if (err)
 			return err;
 	}
@@ -2544,7 +2550,8 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 					struct symbol **parent,
 					struct addr_location *root_al,
 					int max_stack,
-					unsigned int max_lbr)
+					unsigned int max_lbr,
+					bool symbols)
 {
 	bool callee = (callchain_param.order == ORDER_CALLEE);
 	struct ip_callchain *chain = sample->callchain;
@@ -2586,12 +2593,12 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 		/* Add kernel ip */
 		err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
 						  parent, root_al, branch_from,
-						  true, i);
+						  true, i, symbols);
 		if (err)
 			goto error;
 
 		err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
-					       root_al, &branch_from, true);
+					       root_al, &branch_from, true, symbols);
 		if (err)
 			goto error;
 
@@ -2608,14 +2615,14 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 				goto error;
 		}
 		err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
-					       root_al, &branch_from, false);
+					       root_al, &branch_from, false, symbols);
 		if (err)
 			goto error;
 
 		/* Add kernel ip */
 		err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
 						  parent, root_al, branch_from,
-						  false, i);
+						  false, i, symbols);
 		if (err)
 			goto error;
 	}
@@ -2629,7 +2636,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
 			     struct callchain_cursor *cursor,
 			     struct symbol **parent,
 			     struct addr_location *root_al,
-			     u8 *cpumode, int ent)
+			     u8 *cpumode, int ent, bool symbols)
 {
 	int err = 0;
 
@@ -2639,7 +2646,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
 		if (ip >= PERF_CONTEXT_MAX) {
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, cpumode, ip,
-					       false, NULL, NULL, 0);
+					       false, NULL, NULL, 0, symbols);
 			break;
 		}
 	}
@@ -2661,7 +2668,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 					    struct perf_sample *sample,
 					    struct symbol **parent,
 					    struct addr_location *root_al,
-					    int max_stack)
+					    int max_stack,
+					    bool symbols)
 {
 	struct branch_stack *branch = sample->branch_stack;
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
@@ -2681,7 +2689,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 
 		err = resolve_lbr_callchain_sample(thread, cursor, sample, parent,
 						   root_al, max_stack,
-						   !env ? 0 : env->max_branches);
+						   !env ? 0 : env->max_branches,
+						   symbols);
 		if (err)
 			return (err < 0) ? err : 0;
 	}
@@ -2746,13 +2755,14 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 					       root_al,
 					       NULL, be[i].to,
 					       true, &be[i].flags,
-					       NULL, be[i].from);
+					       NULL, be[i].from, symbols);
 
-			if (!err)
+			if (!err) {
 				err = add_callchain_ip(thread, cursor, parent, root_al,
 						       NULL, be[i].from,
 						       true, &be[i].flags,
-						       &iter[i], 0);
+						       &iter[i], 0, symbols);
+			}
 			if (err == -EINVAL)
 				break;
 			if (err)
@@ -2768,7 +2778,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 check_calls:
 	if (chain && callchain_param.order != ORDER_CALLEE) {
 		err = find_prev_cpumode(chain, thread, cursor, parent, root_al,
-					&cpumode, chain->nr - first_call);
+					&cpumode, chain->nr - first_call, symbols);
 		if (err)
 			return (err < 0) ? err : 0;
 	}
@@ -2790,7 +2800,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
                        ++nr_entries;
 		else if (callchain_param.order != ORDER_CALLEE) {
 			err = find_prev_cpumode(chain, thread, cursor, parent,
-						root_al, &cpumode, j);
+						root_al, &cpumode, j, symbols);
 			if (err)
 				return (err < 0) ? err : 0;
 			continue;
@@ -2817,8 +2827,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 			if (leaf_frame_caller && leaf_frame_caller != ip) {
 
 				err = add_callchain_ip(thread, cursor, parent,
-					       root_al, &cpumode, leaf_frame_caller,
-					       false, NULL, NULL, 0);
+						root_al, &cpumode, leaf_frame_caller,
+						false, NULL, NULL, 0, symbols);
 				if (err)
 					return (err < 0) ? err : 0;
 			}
@@ -2826,7 +2836,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 
 		err = add_callchain_ip(thread, cursor, parent,
 				       root_al, &cpumode, ip,
-				       false, NULL, NULL, 0);
+				       false, NULL, NULL, 0, symbols);
 
 		if (err)
 			return (err < 0) ? err : 0;
@@ -2906,7 +2916,7 @@ static int thread__resolve_callchain_unwind(struct thread *thread,
 					    struct callchain_cursor *cursor,
 					    struct evsel *evsel,
 					    struct perf_sample *sample,
-					    int max_stack)
+					    int max_stack, bool symbols)
 {
 	/* Can we do dwarf post unwind? */
 	if (!((evsel->core.attr.sample_type & PERF_SAMPLE_REGS_USER) &&
@@ -2918,17 +2928,21 @@ static int thread__resolve_callchain_unwind(struct thread *thread,
 	    (!sample->user_stack.size))
 		return 0;
 
+	if (!symbols)
+		pr_debug("Not resolving symbols with an unwinder isn't currently supported\n");
+
 	return unwind__get_entries(unwind_entry, cursor,
 				   thread, sample, max_stack, false);
 }
 
-int thread__resolve_callchain(struct thread *thread,
-			      struct callchain_cursor *cursor,
-			      struct evsel *evsel,
-			      struct perf_sample *sample,
-			      struct symbol **parent,
-			      struct addr_location *root_al,
-			      int max_stack)
+int __thread__resolve_callchain(struct thread *thread,
+				struct callchain_cursor *cursor,
+				struct evsel *evsel,
+				struct perf_sample *sample,
+				struct symbol **parent,
+				struct addr_location *root_al,
+				int max_stack,
+				bool symbols)
 {
 	int ret = 0;
 
@@ -2941,22 +2955,22 @@ int thread__resolve_callchain(struct thread *thread,
 		ret = thread__resolve_callchain_sample(thread, cursor,
 						       evsel, sample,
 						       parent, root_al,
-						       max_stack);
+						       max_stack, symbols);
 		if (ret)
 			return ret;
 		ret = thread__resolve_callchain_unwind(thread, cursor,
 						       evsel, sample,
-						       max_stack);
+						       max_stack, symbols);
 	} else {
 		ret = thread__resolve_callchain_unwind(thread, cursor,
 						       evsel, sample,
-						       max_stack);
+						       max_stack, symbols);
 		if (ret)
 			return ret;
 		ret = thread__resolve_callchain_sample(thread, cursor,
 						       evsel, sample,
 						       parent, root_al,
-						       max_stack);
+						       max_stack, symbols);
 	}
 
 	return ret;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index a687876e3453..2e5a4cb342d8 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -178,13 +178,32 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 
 struct callchain_cursor;
 
-int thread__resolve_callchain(struct thread *thread,
-			      struct callchain_cursor *cursor,
-			      struct evsel *evsel,
-			      struct perf_sample *sample,
-			      struct symbol **parent,
-			      struct addr_location *root_al,
-			      int max_stack);
+int __thread__resolve_callchain(struct thread *thread,
+				struct callchain_cursor *cursor,
+				struct evsel *evsel,
+				struct perf_sample *sample,
+				struct symbol **parent,
+				struct addr_location *root_al,
+				int max_stack,
+				bool symbols);
+
+static inline int thread__resolve_callchain(struct thread *thread,
+					    struct callchain_cursor *cursor,
+					    struct evsel *evsel,
+					    struct perf_sample *sample,
+					    struct symbol **parent,
+					    struct addr_location *root_al,
+					    int max_stack)
+{
+	return __thread__resolve_callchain(thread,
+					   cursor,
+					   evsel,
+					   sample,
+					   parent,
+					   root_al,
+					   max_stack,
+					   /*symbols=*/true);
+}
 
 /*
  * Default guest kernel is defined by parameter --guestkallsyms
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 10/13] perf inject: Fix build ID injection
  2024-08-17  6:44 ` [PATCH v1 10/13] perf inject: Fix build ID injection Ian Rogers
@ 2024-08-19 18:01   ` Arnaldo Carvalho de Melo
  2024-08-19 19:54     ` Ian Rogers
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-19 18:01 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
	Masahiro Yamada, Arnd Bergmann, Jann Horn, Colin Ian King,
	Casey Chen, Athira Rajeev, Chaitanya S Prakash, James Clark,
	Ze Gao, Yang Jihong, Yunseong Kim, Weilin Wang,
	Dominique Martinet, Anne Macedo, Sun Haiyong, linux-perf-users,
	linux-kernel

On Fri, Aug 16, 2024 at 11:44:39PM -0700, Ian Rogers wrote:
> Build ID injection wasn't inserting a sample ID and aligning events to
> 64 bytes rather than 8. No sample ID means events are unordered and
> two different build_id events for the same path, as happens when a
> file is replaced, can't be differentiated.
> 
> Add in sample ID insertion for the build_id events alongside some
> refactoring. The refactoring better aligns the function arguments for
> different use cases, such as synthesizing build_id events without
> needing to have a dso. The misc bits are explicitly passed as with
> callchains the maps/dsos may span user and kernel land, so using
> sample->cpumode isn't good enough.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/builtin-inject.c        | 170 ++++++++++++++++++++++-------
>  tools/perf/util/build-id.c         |   6 +-
>  tools/perf/util/synthetic-events.c |  44 ++++++--
>  tools/perf/util/synthetic-events.h |  10 +-
>  4 files changed, 175 insertions(+), 55 deletions(-)
> 
> diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> index a7c859db2e15..84a4bdb5cb0a 100644
> --- a/tools/perf/builtin-inject.c
> +++ b/tools/perf/builtin-inject.c
> @@ -131,6 +131,7 @@ struct perf_inject {
>  	struct perf_file_section secs[HEADER_FEAT_BITS];
>  	struct guest_session	guest_session;
>  	struct strlist		*known_build_ids;
> +	const struct evsel	*mmap_evsel;
>  };
>  
>  struct event_entry {
> @@ -139,8 +140,13 @@ struct event_entry {
>  	union perf_event event[];
>  };
>  
> -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> -				struct machine *machine, u8 cpumode, u32 flags);
> +static int dso__inject_build_id(const struct perf_tool *tool,
> +				struct perf_sample *sample,
> +				struct machine *machine,
> +				const struct evsel *evsel,
> +				__u16 misc,
> +				const char *filename,
> +				struct dso *dso, u32 flags);

So in the end the dso was needed, the name of the function remains
dso__something(), so first arg would be a 'struct dso *'

I processed the patches up to 9/13, so that they can get tested now.

- Arnaldo

>  
>  static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
>  {
> @@ -422,6 +428,28 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
>  	return dso;
>  }
>  
> +/*
> + * The evsel used for the sample ID for mmap events. Typically stashed when
> + * processing mmap events. If not stashed, search the evlist for the first mmap
> + * gathering event.
> + */
> +static const struct evsel *inject__mmap_evsel(struct perf_inject *inject)
> +{
> +	struct evsel *pos;
> +
> +	if (inject->mmap_evsel)
> +		return inject->mmap_evsel;
> +
> +	evlist__for_each_entry(inject->session->evlist, pos) {
> +		if (pos->core.attr.mmap) {
> +			inject->mmap_evsel = pos;
> +			return pos;
> +		}
> +	}
> +	pr_err("No mmap events found\n");
> +	return NULL;
> +}
> +
>  static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
>  					  union perf_event *event,
>  					  struct perf_sample *sample,
> @@ -469,12 +497,28 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
>  		}
>  
>  		if (dso && !dso__hit(dso)) {
> -			dso__set_hit(dso);
> -			dso__inject_build_id(dso, tool, machine, sample->cpumode, flags);
> +			struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
> +
> +			if (evsel) {
> +				dso__set_hit(dso);
> +				dso__inject_build_id(tool, sample, machine, evsel,
> +						     /*misc=*/sample->cpumode,
> +						     filename, dso, flags);
> +			}
>  		}
>  	} else {
> +		int err;
> +
> +		/*
> +		 * Remember the evsel for lazy build id generation. It is used
> +		 * for the sample id header type.
> +		 */
> +		if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
> +		    !inject->mmap_evsel)
> +			inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
> +
>  		/* Create the thread, map, etc. Not done for the unordered inject all case. */
> -		int err = perf_event_process(tool, event, sample, machine);
> +		err = perf_event_process(tool, event, sample, machine);
>  
>  		if (err) {
>  			dso__put(dso);
> @@ -667,16 +711,20 @@ static bool perf_inject__lookup_known_build_id(struct perf_inject *inject,
>  	return false;
>  }
>  
> -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> -				struct machine *machine, u8 cpumode, u32 flags)
> +static int dso__inject_build_id(const struct perf_tool *tool,
> +				struct perf_sample *sample,
> +				struct machine *machine,
> +				const struct evsel *evsel,
> +				__u16 misc,
> +				const char *filename,
> +				struct dso *dso, u32 flags)
>  {
> -	struct perf_inject *inject = container_of(tool, struct perf_inject,
> -						  tool);
> +	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
>  	int err;
>  
> -	if (is_anon_memory(dso__long_name(dso)) || flags & MAP_HUGETLB)
> +	if (is_anon_memory(filename) || flags & MAP_HUGETLB)
>  		return 0;
> -	if (is_no_dso_memory(dso__long_name(dso)))
> +	if (is_no_dso_memory(filename))
>  		return 0;
>  
>  	if (inject->known_build_ids != NULL &&
> @@ -684,24 +732,65 @@ static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
>  		return 1;
>  
>  	if (dso__read_build_id(dso) < 0) {
> -		pr_debug("no build_id found for %s\n", dso__long_name(dso));
> +		pr_debug("no build_id found for %s\n", filename);
>  		return -1;
>  	}
>  
> -	err = perf_event__synthesize_build_id(tool, dso, cpumode,
> -					      perf_event__repipe, machine);
> +	err = perf_event__synthesize_build_id(tool, sample, machine,
> +					      perf_event__repipe,
> +					      evsel, misc, dso__bid(dso),
> +					      filename);
>  	if (err) {
> -		pr_err("Can't synthesize build_id event for %s\n", dso__long_name(dso));
> +		pr_err("Can't synthesize build_id event for %s\n", filename);
>  		return -1;
>  	}
>  
>  	return 0;
>  }
>  
> +static int mark_dso_hit(const struct perf_tool *tool,
> +			struct perf_sample *sample,
> +			struct machine *machine,
> +			const struct evsel *mmap_evsel,
> +			struct map *map, bool sample_in_dso)
> +{
> +	struct dso *dso;
> +	u16 misc = sample->cpumode;
> +
> +	if (!map)
> +		return 0;
> +
> +	if (!sample_in_dso) {
> +		u16 guest_mask = PERF_RECORD_MISC_GUEST_KERNEL |
> +			PERF_RECORD_MISC_GUEST_USER;
> +
> +		if ((misc & guest_mask) != 0) {
> +			misc &= PERF_RECORD_MISC_HYPERVISOR;
> +			misc |= __map__is_kernel(map)
> +				? PERF_RECORD_MISC_GUEST_KERNEL
> +				: PERF_RECORD_MISC_GUEST_USER;
> +		} else {
> +			misc &= PERF_RECORD_MISC_HYPERVISOR;
> +			misc |= __map__is_kernel(map)
> +				? PERF_RECORD_MISC_KERNEL
> +				: PERF_RECORD_MISC_USER;
> +		}
> +	}
> +	dso = map__dso(map);
> +	if (dso && !dso__hit(dso)) {
> +		dso__set_hit(dso);
> +		dso__inject_build_id(tool, sample, machine,
> +				mmap_evsel, misc, dso__long_name(dso), dso,
> +				map__flags(map));
> +	}
> +	return 0;
> +}
> +
>  struct mark_dso_hit_args {
>  	const struct perf_tool *tool;
> +	struct perf_sample *sample;
>  	struct machine *machine;
> -	u8 cpumode;
> +	const struct evsel *mmap_evsel;
>  };
>  
>  static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
> @@ -709,16 +798,8 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
>  	struct mark_dso_hit_args *args = data;
>  	struct map *map = node->ms.map;
>  
> -	if (map) {
> -		struct dso *dso = map__dso(map);
> -
> -		if (dso && !dso__hit(dso)) {
> -			dso__set_hit(dso);
> -			dso__inject_build_id(dso, args->tool, args->machine,
> -					     args->cpumode, map__flags(map));
> -		}
> -	}
> -	return 0;
> +	return mark_dso_hit(args->tool, args->sample, args->machine,
> +			    args->mmap_evsel, map, /*sample_in_dso=*/false);
>  }
>  
>  int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *event,
> @@ -728,10 +809,16 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
>  {
>  	struct addr_location al;
>  	struct thread *thread;
> +	struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
>  	struct mark_dso_hit_args args = {
>  		.tool = tool,
> +		/*
> +		 * Use the parsed sample data of the sample event, which will
> +		 * have a later timestamp than the mmap event.
> +		 */
> +		.sample = sample,
>  		.machine = machine,
> -		.cpumode = sample->cpumode,
> +		.mmap_evsel = inject__mmap_evsel(inject),
>  	};
>  
>  	addr_location__init(&al);
> @@ -743,13 +830,8 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
>  	}
>  
>  	if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
> -		struct dso *dso = map__dso(al.map);
> -
> -		if (!dso__hit(dso)) {
> -			dso__set_hit(dso);
> -			dso__inject_build_id(dso, tool, machine,
> -					     sample->cpumode, map__flags(al.map));
> -		}
> +		mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
> +			     /*sample_in_dso=*/true);
>  	}
>  
>  	sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
> @@ -1159,17 +1241,27 @@ static int process_build_id(const struct perf_tool *tool,
>  static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_t machine_pid)
>  {
>  	struct machine *machine = perf_session__findnew_machine(inject->session, machine_pid);
> -	u8 cpumode = dso__is_in_kernel_space(dso) ?
> -			PERF_RECORD_MISC_GUEST_KERNEL :
> -			PERF_RECORD_MISC_GUEST_USER;
> +	struct perf_sample synth_sample = {
> +		.pid	   = -1,
> +		.tid	   = -1,
> +		.time	   = -1,
> +		.stream_id = -1,
> +		.cpu	   = -1,
> +		.period	   = 1,
> +		.cpumode   = dso__is_in_kernel_space(dso)
> +		? PERF_RECORD_MISC_GUEST_KERNEL
> +		: PERF_RECORD_MISC_GUEST_USER,
> +	};
>  
>  	if (!machine)
>  		return -ENOMEM;
>  
>  	dso__set_hit(dso);
>  
> -	return perf_event__synthesize_build_id(&inject->tool, dso, cpumode,
> -					       process_build_id, machine);
> +	return perf_event__synthesize_build_id(&inject->tool, &synth_sample, machine,
> +					       process_build_id, inject__mmap_evsel(inject),
> +					       /*misc=*/synth_sample.cpumode,
> +					       dso__bid(dso), dso__long_name(dso));
>  }
>  
>  static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
> diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
> index 451d145fa4ed..8982f68e7230 100644
> --- a/tools/perf/util/build-id.c
> +++ b/tools/perf/util/build-id.c
> @@ -277,8 +277,8 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
>  	struct perf_record_header_build_id b;
>  	size_t len;
>  
> -	len = name_len + 1;
> -	len = PERF_ALIGN(len, NAME_ALIGN);
> +	len = sizeof(b) + name_len + 1;
> +	len = PERF_ALIGN(len, sizeof(u64));
>  
>  	memset(&b, 0, sizeof(b));
>  	memcpy(&b.data, bid->data, bid->size);
> @@ -286,7 +286,7 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
>  	misc |= PERF_RECORD_MISC_BUILD_ID_SIZE;
>  	b.pid = pid;
>  	b.header.misc = misc;
> -	b.header.size = sizeof(b) + len;
> +	b.header.size = len;
>  
>  	err = do_write(fd, &b, sizeof(b));
>  	if (err < 0)
> diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> index 0a7f93ae76fb..6bb62e4e2d5d 100644
> --- a/tools/perf/util/synthetic-events.c
> +++ b/tools/perf/util/synthetic-events.c
> @@ -2225,28 +2225,48 @@ int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, st
>  }
>  #endif
>  
> -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc,
> -				    perf_event__handler_t process, struct machine *machine)
> +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> +				    struct perf_sample *sample,
> +				    struct machine *machine,
> +				    perf_event__handler_t process,
> +				    const struct evsel *evsel,
> +				    __u16 misc,
> +				    const struct build_id *bid,
> +				    const char *filename)
>  {
>  	union perf_event ev;
>  	size_t len;
>  
> -	if (!dso__hit(pos))
> -		return 0;
> +	len = sizeof(ev.build_id) + strlen(filename) + 1;
> +	len = PERF_ALIGN(len, sizeof(u64));
>  
> -	memset(&ev, 0, sizeof(ev));
> +	memset(&ev, 0, len);
>  
> -	len = dso__long_name_len(pos) + 1;
> -	len = PERF_ALIGN(len, NAME_ALIGN);
> -	ev.build_id.size = min(dso__bid(pos)->size, sizeof(dso__bid(pos)->data));
> -	memcpy(&ev.build_id.build_id, dso__bid(pos)->data, ev.build_id.size);
> +	ev.build_id.size = min(bid->size, sizeof(ev.build_id.build_id));
> +	memcpy(ev.build_id.build_id, bid->data, ev.build_id.size);
>  	ev.build_id.header.type = PERF_RECORD_HEADER_BUILD_ID;
>  	ev.build_id.header.misc = misc | PERF_RECORD_MISC_BUILD_ID_SIZE;
>  	ev.build_id.pid = machine->pid;
> -	ev.build_id.header.size = sizeof(ev.build_id) + len;
> -	memcpy(&ev.build_id.filename, dso__long_name(pos), dso__long_name_len(pos));
> +	ev.build_id.header.size = len;
> +	strcpy(ev.build_id.filename, filename);
> +
> +	if (evsel) {
> +		void *array = &ev;
> +		int ret;
>  
> -	return process(tool, &ev, NULL, machine);
> +		array += ev.header.size;
> +		ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
> +		if (ret < 0)
> +			return ret;
> +
> +		if (ret & 7) {
> +			pr_err("Bad id sample size %d\n", ret);
> +			return -EINVAL;
> +		}
> +
> +		ev.header.size += ret;
> +	}
> +	return process(tool, &ev, sample, machine);
>  }
>  
>  int perf_event__synthesize_stat_events(struct perf_stat_config *config, const struct perf_tool *tool,
> diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
> index 31df7653677f..795bf3e18396 100644
> --- a/tools/perf/util/synthetic-events.h
> +++ b/tools/perf/util/synthetic-events.h
> @@ -9,6 +9,7 @@
>  #include <perf/cpumap.h>
>  
>  struct auxtrace_record;
> +struct build_id;
>  struct dso;
>  struct evlist;
>  struct evsel;
> @@ -45,7 +46,14 @@ typedef int (*perf_event__handler_t)(const struct perf_tool *tool, union perf_ev
>  
>  int perf_event__synthesize_attrs(const struct perf_tool *tool, struct evlist *evlist, perf_event__handler_t process);
>  int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_attr *attr, u32 ids, u64 *id, perf_event__handler_t process);
> -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc, perf_event__handler_t process, struct machine *machine);
> +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> +				    struct perf_sample *sample,
> +				    struct machine *machine,
> +				    perf_event__handler_t process,
> +				    const struct evsel *evsel,
> +				    __u16 misc,
> +				    const struct build_id *bid,
> +				    const char *filename);
>  int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
>  int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
>  int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
> -- 
> 2.46.0.184.g6999bdac58-goog

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 10/13] perf inject: Fix build ID injection
  2024-08-19 18:01   ` Arnaldo Carvalho de Melo
@ 2024-08-19 19:54     ` Ian Rogers
  2024-08-28 15:15       ` Ian Rogers
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2024-08-19 19:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
	Masahiro Yamada, Arnd Bergmann, Jann Horn, Colin Ian King,
	Casey Chen, Athira Rajeev, Chaitanya S Prakash, James Clark,
	Ze Gao, Yang Jihong, Yunseong Kim, Weilin Wang,
	Dominique Martinet, Anne Macedo, Sun Haiyong, linux-perf-users,
	linux-kernel

On Mon, Aug 19, 2024 at 11:01 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Fri, Aug 16, 2024 at 11:44:39PM -0700, Ian Rogers wrote:
> > Build ID injection wasn't inserting a sample ID and aligning events to
> > 64 bytes rather than 8. No sample ID means events are unordered and
> > two different build_id events for the same path, as happens when a
> > file is replaced, can't be differentiated.
> >
> > Add in sample ID insertion for the build_id events alongside some
> > refactoring. The refactoring better aligns the function arguments for
> > different use cases, such as synthesizing build_id events without
> > needing to have a dso. The misc bits are explicitly passed as with
> > callchains the maps/dsos may span user and kernel land, so using
> > sample->cpumode isn't good enough.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/builtin-inject.c        | 170 ++++++++++++++++++++++-------
> >  tools/perf/util/build-id.c         |   6 +-
> >  tools/perf/util/synthetic-events.c |  44 ++++++--
> >  tools/perf/util/synthetic-events.h |  10 +-
> >  4 files changed, 175 insertions(+), 55 deletions(-)
> >
> > diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> > index a7c859db2e15..84a4bdb5cb0a 100644
> > --- a/tools/perf/builtin-inject.c
> > +++ b/tools/perf/builtin-inject.c
> > @@ -131,6 +131,7 @@ struct perf_inject {
> >       struct perf_file_section secs[HEADER_FEAT_BITS];
> >       struct guest_session    guest_session;
> >       struct strlist          *known_build_ids;
> > +     const struct evsel      *mmap_evsel;
> >  };
> >
> >  struct event_entry {
> > @@ -139,8 +140,13 @@ struct event_entry {
> >       union perf_event event[];
> >  };
> >
> > -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > -                             struct machine *machine, u8 cpumode, u32 flags);
> > +static int dso__inject_build_id(const struct perf_tool *tool,
> > +                             struct perf_sample *sample,
> > +                             struct machine *machine,
> > +                             const struct evsel *evsel,
> > +                             __u16 misc,
> > +                             const char *filename,
> > +                             struct dso *dso, u32 flags);
>
> So in the end the dso was needed, the name of the function remains
> dso__something(), so first arg would be a 'struct dso *'
>
> I processed the patches up to 9/13, so that they can get tested now.

Maybe we should rename the function? We can get the build ID from
mmap2 events now, not just stored away in the dso by build_id events.
Reordering the arguments isn't a problem, I was just aiming for
consistency between the caller, perf_event__synthesize_build_id and
eventually the call to process that all take the arguments in this
order.

Thanks,
Ian

> - Arnaldo
>
> >
> >  static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
> >  {
> > @@ -422,6 +428,28 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
> >       return dso;
> >  }
> >
> > +/*
> > + * The evsel used for the sample ID for mmap events. Typically stashed when
> > + * processing mmap events. If not stashed, search the evlist for the first mmap
> > + * gathering event.
> > + */
> > +static const struct evsel *inject__mmap_evsel(struct perf_inject *inject)
> > +{
> > +     struct evsel *pos;
> > +
> > +     if (inject->mmap_evsel)
> > +             return inject->mmap_evsel;
> > +
> > +     evlist__for_each_entry(inject->session->evlist, pos) {
> > +             if (pos->core.attr.mmap) {
> > +                     inject->mmap_evsel = pos;
> > +                     return pos;
> > +             }
> > +     }
> > +     pr_err("No mmap events found\n");
> > +     return NULL;
> > +}
> > +
> >  static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
> >                                         union perf_event *event,
> >                                         struct perf_sample *sample,
> > @@ -469,12 +497,28 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
> >               }
> >
> >               if (dso && !dso__hit(dso)) {
> > -                     dso__set_hit(dso);
> > -                     dso__inject_build_id(dso, tool, machine, sample->cpumode, flags);
> > +                     struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
> > +
> > +                     if (evsel) {
> > +                             dso__set_hit(dso);
> > +                             dso__inject_build_id(tool, sample, machine, evsel,
> > +                                                  /*misc=*/sample->cpumode,
> > +                                                  filename, dso, flags);
> > +                     }
> >               }
> >       } else {
> > +             int err;
> > +
> > +             /*
> > +              * Remember the evsel for lazy build id generation. It is used
> > +              * for the sample id header type.
> > +              */
> > +             if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
> > +                 !inject->mmap_evsel)
> > +                     inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
> > +
> >               /* Create the thread, map, etc. Not done for the unordered inject all case. */
> > -             int err = perf_event_process(tool, event, sample, machine);
> > +             err = perf_event_process(tool, event, sample, machine);
> >
> >               if (err) {
> >                       dso__put(dso);
> > @@ -667,16 +711,20 @@ static bool perf_inject__lookup_known_build_id(struct perf_inject *inject,
> >       return false;
> >  }
> >
> > -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > -                             struct machine *machine, u8 cpumode, u32 flags)
> > +static int dso__inject_build_id(const struct perf_tool *tool,
> > +                             struct perf_sample *sample,
> > +                             struct machine *machine,
> > +                             const struct evsel *evsel,
> > +                             __u16 misc,
> > +                             const char *filename,
> > +                             struct dso *dso, u32 flags)
> >  {
> > -     struct perf_inject *inject = container_of(tool, struct perf_inject,
> > -                                               tool);
> > +     struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
> >       int err;
> >
> > -     if (is_anon_memory(dso__long_name(dso)) || flags & MAP_HUGETLB)
> > +     if (is_anon_memory(filename) || flags & MAP_HUGETLB)
> >               return 0;
> > -     if (is_no_dso_memory(dso__long_name(dso)))
> > +     if (is_no_dso_memory(filename))
> >               return 0;
> >
> >       if (inject->known_build_ids != NULL &&
> > @@ -684,24 +732,65 @@ static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> >               return 1;
> >
> >       if (dso__read_build_id(dso) < 0) {
> > -             pr_debug("no build_id found for %s\n", dso__long_name(dso));
> > +             pr_debug("no build_id found for %s\n", filename);
> >               return -1;
> >       }
> >
> > -     err = perf_event__synthesize_build_id(tool, dso, cpumode,
> > -                                           perf_event__repipe, machine);
> > +     err = perf_event__synthesize_build_id(tool, sample, machine,
> > +                                           perf_event__repipe,
> > +                                           evsel, misc, dso__bid(dso),
> > +                                           filename);
> >       if (err) {
> > -             pr_err("Can't synthesize build_id event for %s\n", dso__long_name(dso));
> > +             pr_err("Can't synthesize build_id event for %s\n", filename);
> >               return -1;
> >       }
> >
> >       return 0;
> >  }
> >
> > +static int mark_dso_hit(const struct perf_tool *tool,
> > +                     struct perf_sample *sample,
> > +                     struct machine *machine,
> > +                     const struct evsel *mmap_evsel,
> > +                     struct map *map, bool sample_in_dso)
> > +{
> > +     struct dso *dso;
> > +     u16 misc = sample->cpumode;
> > +
> > +     if (!map)
> > +             return 0;
> > +
> > +     if (!sample_in_dso) {
> > +             u16 guest_mask = PERF_RECORD_MISC_GUEST_KERNEL |
> > +                     PERF_RECORD_MISC_GUEST_USER;
> > +
> > +             if ((misc & guest_mask) != 0) {
> > +                     misc &= PERF_RECORD_MISC_HYPERVISOR;
> > +                     misc |= __map__is_kernel(map)
> > +                             ? PERF_RECORD_MISC_GUEST_KERNEL
> > +                             : PERF_RECORD_MISC_GUEST_USER;
> > +             } else {
> > +                     misc &= PERF_RECORD_MISC_HYPERVISOR;
> > +                     misc |= __map__is_kernel(map)
> > +                             ? PERF_RECORD_MISC_KERNEL
> > +                             : PERF_RECORD_MISC_USER;
> > +             }
> > +     }
> > +     dso = map__dso(map);
> > +     if (dso && !dso__hit(dso)) {
> > +             dso__set_hit(dso);
> > +             dso__inject_build_id(tool, sample, machine,
> > +                             mmap_evsel, misc, dso__long_name(dso), dso,
> > +                             map__flags(map));
> > +     }
> > +     return 0;
> > +}
> > +
> >  struct mark_dso_hit_args {
> >       const struct perf_tool *tool;
> > +     struct perf_sample *sample;
> >       struct machine *machine;
> > -     u8 cpumode;
> > +     const struct evsel *mmap_evsel;
> >  };
> >
> >  static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
> > @@ -709,16 +798,8 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
> >       struct mark_dso_hit_args *args = data;
> >       struct map *map = node->ms.map;
> >
> > -     if (map) {
> > -             struct dso *dso = map__dso(map);
> > -
> > -             if (dso && !dso__hit(dso)) {
> > -                     dso__set_hit(dso);
> > -                     dso__inject_build_id(dso, args->tool, args->machine,
> > -                                          args->cpumode, map__flags(map));
> > -             }
> > -     }
> > -     return 0;
> > +     return mark_dso_hit(args->tool, args->sample, args->machine,
> > +                         args->mmap_evsel, map, /*sample_in_dso=*/false);
> >  }
> >
> >  int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *event,
> > @@ -728,10 +809,16 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
> >  {
> >       struct addr_location al;
> >       struct thread *thread;
> > +     struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
> >       struct mark_dso_hit_args args = {
> >               .tool = tool,
> > +             /*
> > +              * Use the parsed sample data of the sample event, which will
> > +              * have a later timestamp than the mmap event.
> > +              */
> > +             .sample = sample,
> >               .machine = machine,
> > -             .cpumode = sample->cpumode,
> > +             .mmap_evsel = inject__mmap_evsel(inject),
> >       };
> >
> >       addr_location__init(&al);
> > @@ -743,13 +830,8 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
> >       }
> >
> >       if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
> > -             struct dso *dso = map__dso(al.map);
> > -
> > -             if (!dso__hit(dso)) {
> > -                     dso__set_hit(dso);
> > -                     dso__inject_build_id(dso, tool, machine,
> > -                                          sample->cpumode, map__flags(al.map));
> > -             }
> > +             mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
> > +                          /*sample_in_dso=*/true);
> >       }
> >
> >       sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
> > @@ -1159,17 +1241,27 @@ static int process_build_id(const struct perf_tool *tool,
> >  static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_t machine_pid)
> >  {
> >       struct machine *machine = perf_session__findnew_machine(inject->session, machine_pid);
> > -     u8 cpumode = dso__is_in_kernel_space(dso) ?
> > -                     PERF_RECORD_MISC_GUEST_KERNEL :
> > -                     PERF_RECORD_MISC_GUEST_USER;
> > +     struct perf_sample synth_sample = {
> > +             .pid       = -1,
> > +             .tid       = -1,
> > +             .time      = -1,
> > +             .stream_id = -1,
> > +             .cpu       = -1,
> > +             .period    = 1,
> > +             .cpumode   = dso__is_in_kernel_space(dso)
> > +             ? PERF_RECORD_MISC_GUEST_KERNEL
> > +             : PERF_RECORD_MISC_GUEST_USER,
> > +     };
> >
> >       if (!machine)
> >               return -ENOMEM;
> >
> >       dso__set_hit(dso);
> >
> > -     return perf_event__synthesize_build_id(&inject->tool, dso, cpumode,
> > -                                            process_build_id, machine);
> > +     return perf_event__synthesize_build_id(&inject->tool, &synth_sample, machine,
> > +                                            process_build_id, inject__mmap_evsel(inject),
> > +                                            /*misc=*/synth_sample.cpumode,
> > +                                            dso__bid(dso), dso__long_name(dso));
> >  }
> >
> >  static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
> > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
> > index 451d145fa4ed..8982f68e7230 100644
> > --- a/tools/perf/util/build-id.c
> > +++ b/tools/perf/util/build-id.c
> > @@ -277,8 +277,8 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
> >       struct perf_record_header_build_id b;
> >       size_t len;
> >
> > -     len = name_len + 1;
> > -     len = PERF_ALIGN(len, NAME_ALIGN);
> > +     len = sizeof(b) + name_len + 1;
> > +     len = PERF_ALIGN(len, sizeof(u64));
> >
> >       memset(&b, 0, sizeof(b));
> >       memcpy(&b.data, bid->data, bid->size);
> > @@ -286,7 +286,7 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
> >       misc |= PERF_RECORD_MISC_BUILD_ID_SIZE;
> >       b.pid = pid;
> >       b.header.misc = misc;
> > -     b.header.size = sizeof(b) + len;
> > +     b.header.size = len;
> >
> >       err = do_write(fd, &b, sizeof(b));
> >       if (err < 0)
> > diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> > index 0a7f93ae76fb..6bb62e4e2d5d 100644
> > --- a/tools/perf/util/synthetic-events.c
> > +++ b/tools/perf/util/synthetic-events.c
> > @@ -2225,28 +2225,48 @@ int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, st
> >  }
> >  #endif
> >
> > -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc,
> > -                                 perf_event__handler_t process, struct machine *machine)
> > +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> > +                                 struct perf_sample *sample,
> > +                                 struct machine *machine,
> > +                                 perf_event__handler_t process,
> > +                                 const struct evsel *evsel,
> > +                                 __u16 misc,
> > +                                 const struct build_id *bid,
> > +                                 const char *filename)
> >  {
> >       union perf_event ev;
> >       size_t len;
> >
> > -     if (!dso__hit(pos))
> > -             return 0;
> > +     len = sizeof(ev.build_id) + strlen(filename) + 1;
> > +     len = PERF_ALIGN(len, sizeof(u64));
> >
> > -     memset(&ev, 0, sizeof(ev));
> > +     memset(&ev, 0, len);
> >
> > -     len = dso__long_name_len(pos) + 1;
> > -     len = PERF_ALIGN(len, NAME_ALIGN);
> > -     ev.build_id.size = min(dso__bid(pos)->size, sizeof(dso__bid(pos)->data));
> > -     memcpy(&ev.build_id.build_id, dso__bid(pos)->data, ev.build_id.size);
> > +     ev.build_id.size = min(bid->size, sizeof(ev.build_id.build_id));
> > +     memcpy(ev.build_id.build_id, bid->data, ev.build_id.size);
> >       ev.build_id.header.type = PERF_RECORD_HEADER_BUILD_ID;
> >       ev.build_id.header.misc = misc | PERF_RECORD_MISC_BUILD_ID_SIZE;
> >       ev.build_id.pid = machine->pid;
> > -     ev.build_id.header.size = sizeof(ev.build_id) + len;
> > -     memcpy(&ev.build_id.filename, dso__long_name(pos), dso__long_name_len(pos));
> > +     ev.build_id.header.size = len;
> > +     strcpy(ev.build_id.filename, filename);
> > +
> > +     if (evsel) {
> > +             void *array = &ev;
> > +             int ret;
> >
> > -     return process(tool, &ev, NULL, machine);
> > +             array += ev.header.size;
> > +             ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
> > +             if (ret < 0)
> > +                     return ret;
> > +
> > +             if (ret & 7) {
> > +                     pr_err("Bad id sample size %d\n", ret);
> > +                     return -EINVAL;
> > +             }
> > +
> > +             ev.header.size += ret;
> > +     }
> > +     return process(tool, &ev, sample, machine);
> >  }
> >
> >  int perf_event__synthesize_stat_events(struct perf_stat_config *config, const struct perf_tool *tool,
> > diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
> > index 31df7653677f..795bf3e18396 100644
> > --- a/tools/perf/util/synthetic-events.h
> > +++ b/tools/perf/util/synthetic-events.h
> > @@ -9,6 +9,7 @@
> >  #include <perf/cpumap.h>
> >
> >  struct auxtrace_record;
> > +struct build_id;
> >  struct dso;
> >  struct evlist;
> >  struct evsel;
> > @@ -45,7 +46,14 @@ typedef int (*perf_event__handler_t)(const struct perf_tool *tool, union perf_ev
> >
> >  int perf_event__synthesize_attrs(const struct perf_tool *tool, struct evlist *evlist, perf_event__handler_t process);
> >  int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_attr *attr, u32 ids, u64 *id, perf_event__handler_t process);
> > -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc, perf_event__handler_t process, struct machine *machine);
> > +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> > +                                 struct perf_sample *sample,
> > +                                 struct machine *machine,
> > +                                 perf_event__handler_t process,
> > +                                 const struct evsel *evsel,
> > +                                 __u16 misc,
> > +                                 const struct build_id *bid,
> > +                                 const char *filename);
> >  int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
> >  int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
> >  int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
> > --
> > 2.46.0.184.g6999bdac58-goog

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 10/13] perf inject: Fix build ID injection
  2024-08-19 19:54     ` Ian Rogers
@ 2024-08-28 15:15       ` Ian Rogers
  2024-09-02 18:26         ` Namhyung Kim
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2024-08-28 15:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
	Masahiro Yamada, Arnd Bergmann, Jann Horn, Colin Ian King,
	Casey Chen, Athira Rajeev, Chaitanya S Prakash, James Clark,
	Ze Gao, Yang Jihong, Yunseong Kim, Weilin Wang,
	Dominique Martinet, Anne Macedo, Sun Haiyong, linux-perf-users,
	linux-kernel

On Mon, Aug 19, 2024 at 12:54 PM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Aug 19, 2024 at 11:01 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Fri, Aug 16, 2024 at 11:44:39PM -0700, Ian Rogers wrote:
> > > Build ID injection wasn't inserting a sample ID and aligning events to
> > > 64 bytes rather than 8. No sample ID means events are unordered and
> > > two different build_id events for the same path, as happens when a
> > > file is replaced, can't be differentiated.
> > >
> > > Add in sample ID insertion for the build_id events alongside some
> > > refactoring. The refactoring better aligns the function arguments for
> > > different use cases, such as synthesizing build_id events without
> > > needing to have a dso. The misc bits are explicitly passed as with
> > > callchains the maps/dsos may span user and kernel land, so using
> > > sample->cpumode isn't good enough.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/builtin-inject.c        | 170 ++++++++++++++++++++++-------
> > >  tools/perf/util/build-id.c         |   6 +-
> > >  tools/perf/util/synthetic-events.c |  44 ++++++--
> > >  tools/perf/util/synthetic-events.h |  10 +-
> > >  4 files changed, 175 insertions(+), 55 deletions(-)
> > >
> > > diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> > > index a7c859db2e15..84a4bdb5cb0a 100644
> > > --- a/tools/perf/builtin-inject.c
> > > +++ b/tools/perf/builtin-inject.c
> > > @@ -131,6 +131,7 @@ struct perf_inject {
> > >       struct perf_file_section secs[HEADER_FEAT_BITS];
> > >       struct guest_session    guest_session;
> > >       struct strlist          *known_build_ids;
> > > +     const struct evsel      *mmap_evsel;
> > >  };
> > >
> > >  struct event_entry {
> > > @@ -139,8 +140,13 @@ struct event_entry {
> > >       union perf_event event[];
> > >  };
> > >
> > > -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > > -                             struct machine *machine, u8 cpumode, u32 flags);
> > > +static int dso__inject_build_id(const struct perf_tool *tool,
> > > +                             struct perf_sample *sample,
> > > +                             struct machine *machine,
> > > +                             const struct evsel *evsel,
> > > +                             __u16 misc,
> > > +                             const char *filename,
> > > +                             struct dso *dso, u32 flags);
> >
> > So in the end the dso was needed, the name of the function remains
> > dso__something(), so first arg would be a 'struct dso *'
> >
> > I processed the patches up to 9/13, so that they can get tested now.
>
> Maybe we should rename the function? We can get the build ID from
> mmap2 events now, not just stored away in the dso by build_id events.
> Reordering the arguments isn't a problem, I was just aiming for
> consistency between the caller, perf_event__synthesize_build_id and
> eventually the call to process that all take the arguments in this
> order.

Arnaldo, did you have any more thoughts on this?

Thanks,
Ian

>
> > - Arnaldo
> >
> > >
> > >  static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
> > >  {
> > > @@ -422,6 +428,28 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
> > >       return dso;
> > >  }
> > >
> > > +/*
> > > + * The evsel used for the sample ID for mmap events. Typically stashed when
> > > + * processing mmap events. If not stashed, search the evlist for the first mmap
> > > + * gathering event.
> > > + */
> > > +static const struct evsel *inject__mmap_evsel(struct perf_inject *inject)
> > > +{
> > > +     struct evsel *pos;
> > > +
> > > +     if (inject->mmap_evsel)
> > > +             return inject->mmap_evsel;
> > > +
> > > +     evlist__for_each_entry(inject->session->evlist, pos) {
> > > +             if (pos->core.attr.mmap) {
> > > +                     inject->mmap_evsel = pos;
> > > +                     return pos;
> > > +             }
> > > +     }
> > > +     pr_err("No mmap events found\n");
> > > +     return NULL;
> > > +}
> > > +
> > >  static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
> > >                                         union perf_event *event,
> > >                                         struct perf_sample *sample,
> > > @@ -469,12 +497,28 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
> > >               }
> > >
> > >               if (dso && !dso__hit(dso)) {
> > > -                     dso__set_hit(dso);
> > > -                     dso__inject_build_id(dso, tool, machine, sample->cpumode, flags);
> > > +                     struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
> > > +
> > > +                     if (evsel) {
> > > +                             dso__set_hit(dso);
> > > +                             dso__inject_build_id(tool, sample, machine, evsel,
> > > +                                                  /*misc=*/sample->cpumode,
> > > +                                                  filename, dso, flags);
> > > +                     }
> > >               }
> > >       } else {
> > > +             int err;
> > > +
> > > +             /*
> > > +              * Remember the evsel for lazy build id generation. It is used
> > > +              * for the sample id header type.
> > > +              */
> > > +             if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
> > > +                 !inject->mmap_evsel)
> > > +                     inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
> > > +
> > >               /* Create the thread, map, etc. Not done for the unordered inject all case. */
> > > -             int err = perf_event_process(tool, event, sample, machine);
> > > +             err = perf_event_process(tool, event, sample, machine);
> > >
> > >               if (err) {
> > >                       dso__put(dso);
> > > @@ -667,16 +711,20 @@ static bool perf_inject__lookup_known_build_id(struct perf_inject *inject,
> > >       return false;
> > >  }
> > >
> > > -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > > -                             struct machine *machine, u8 cpumode, u32 flags)
> > > +static int dso__inject_build_id(const struct perf_tool *tool,
> > > +                             struct perf_sample *sample,
> > > +                             struct machine *machine,
> > > +                             const struct evsel *evsel,
> > > +                             __u16 misc,
> > > +                             const char *filename,
> > > +                             struct dso *dso, u32 flags)
> > >  {
> > > -     struct perf_inject *inject = container_of(tool, struct perf_inject,
> > > -                                               tool);
> > > +     struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
> > >       int err;
> > >
> > > -     if (is_anon_memory(dso__long_name(dso)) || flags & MAP_HUGETLB)
> > > +     if (is_anon_memory(filename) || flags & MAP_HUGETLB)
> > >               return 0;
> > > -     if (is_no_dso_memory(dso__long_name(dso)))
> > > +     if (is_no_dso_memory(filename))
> > >               return 0;
> > >
> > >       if (inject->known_build_ids != NULL &&
> > > @@ -684,24 +732,65 @@ static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > >               return 1;
> > >
> > >       if (dso__read_build_id(dso) < 0) {
> > > -             pr_debug("no build_id found for %s\n", dso__long_name(dso));
> > > +             pr_debug("no build_id found for %s\n", filename);
> > >               return -1;
> > >       }
> > >
> > > -     err = perf_event__synthesize_build_id(tool, dso, cpumode,
> > > -                                           perf_event__repipe, machine);
> > > +     err = perf_event__synthesize_build_id(tool, sample, machine,
> > > +                                           perf_event__repipe,
> > > +                                           evsel, misc, dso__bid(dso),
> > > +                                           filename);
> > >       if (err) {
> > > -             pr_err("Can't synthesize build_id event for %s\n", dso__long_name(dso));
> > > +             pr_err("Can't synthesize build_id event for %s\n", filename);
> > >               return -1;
> > >       }
> > >
> > >       return 0;
> > >  }
> > >
> > > +static int mark_dso_hit(const struct perf_tool *tool,
> > > +                     struct perf_sample *sample,
> > > +                     struct machine *machine,
> > > +                     const struct evsel *mmap_evsel,
> > > +                     struct map *map, bool sample_in_dso)
> > > +{
> > > +     struct dso *dso;
> > > +     u16 misc = sample->cpumode;
> > > +
> > > +     if (!map)
> > > +             return 0;
> > > +
> > > +     if (!sample_in_dso) {
> > > +             u16 guest_mask = PERF_RECORD_MISC_GUEST_KERNEL |
> > > +                     PERF_RECORD_MISC_GUEST_USER;
> > > +
> > > +             if ((misc & guest_mask) != 0) {
> > > +                     misc &= PERF_RECORD_MISC_HYPERVISOR;
> > > +                     misc |= __map__is_kernel(map)
> > > +                             ? PERF_RECORD_MISC_GUEST_KERNEL
> > > +                             : PERF_RECORD_MISC_GUEST_USER;
> > > +             } else {
> > > +                     misc &= PERF_RECORD_MISC_HYPERVISOR;
> > > +                     misc |= __map__is_kernel(map)
> > > +                             ? PERF_RECORD_MISC_KERNEL
> > > +                             : PERF_RECORD_MISC_USER;
> > > +             }
> > > +     }
> > > +     dso = map__dso(map);
> > > +     if (dso && !dso__hit(dso)) {
> > > +             dso__set_hit(dso);
> > > +             dso__inject_build_id(tool, sample, machine,
> > > +                             mmap_evsel, misc, dso__long_name(dso), dso,
> > > +                             map__flags(map));
> > > +     }
> > > +     return 0;
> > > +}
> > > +
> > >  struct mark_dso_hit_args {
> > >       const struct perf_tool *tool;
> > > +     struct perf_sample *sample;
> > >       struct machine *machine;
> > > -     u8 cpumode;
> > > +     const struct evsel *mmap_evsel;
> > >  };
> > >
> > >  static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
> > > @@ -709,16 +798,8 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
> > >       struct mark_dso_hit_args *args = data;
> > >       struct map *map = node->ms.map;
> > >
> > > -     if (map) {
> > > -             struct dso *dso = map__dso(map);
> > > -
> > > -             if (dso && !dso__hit(dso)) {
> > > -                     dso__set_hit(dso);
> > > -                     dso__inject_build_id(dso, args->tool, args->machine,
> > > -                                          args->cpumode, map__flags(map));
> > > -             }
> > > -     }
> > > -     return 0;
> > > +     return mark_dso_hit(args->tool, args->sample, args->machine,
> > > +                         args->mmap_evsel, map, /*sample_in_dso=*/false);
> > >  }
> > >
> > >  int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *event,
> > > @@ -728,10 +809,16 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
> > >  {
> > >       struct addr_location al;
> > >       struct thread *thread;
> > > +     struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
> > >       struct mark_dso_hit_args args = {
> > >               .tool = tool,
> > > +             /*
> > > +              * Use the parsed sample data of the sample event, which will
> > > +              * have a later timestamp than the mmap event.
> > > +              */
> > > +             .sample = sample,
> > >               .machine = machine,
> > > -             .cpumode = sample->cpumode,
> > > +             .mmap_evsel = inject__mmap_evsel(inject),
> > >       };
> > >
> > >       addr_location__init(&al);
> > > @@ -743,13 +830,8 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
> > >       }
> > >
> > >       if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
> > > -             struct dso *dso = map__dso(al.map);
> > > -
> > > -             if (!dso__hit(dso)) {
> > > -                     dso__set_hit(dso);
> > > -                     dso__inject_build_id(dso, tool, machine,
> > > -                                          sample->cpumode, map__flags(al.map));
> > > -             }
> > > +             mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
> > > +                          /*sample_in_dso=*/true);
> > >       }
> > >
> > >       sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
> > > @@ -1159,17 +1241,27 @@ static int process_build_id(const struct perf_tool *tool,
> > >  static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_t machine_pid)
> > >  {
> > >       struct machine *machine = perf_session__findnew_machine(inject->session, machine_pid);
> > > -     u8 cpumode = dso__is_in_kernel_space(dso) ?
> > > -                     PERF_RECORD_MISC_GUEST_KERNEL :
> > > -                     PERF_RECORD_MISC_GUEST_USER;
> > > +     struct perf_sample synth_sample = {
> > > +             .pid       = -1,
> > > +             .tid       = -1,
> > > +             .time      = -1,
> > > +             .stream_id = -1,
> > > +             .cpu       = -1,
> > > +             .period    = 1,
> > > +             .cpumode   = dso__is_in_kernel_space(dso)
> > > +             ? PERF_RECORD_MISC_GUEST_KERNEL
> > > +             : PERF_RECORD_MISC_GUEST_USER,
> > > +     };
> > >
> > >       if (!machine)
> > >               return -ENOMEM;
> > >
> > >       dso__set_hit(dso);
> > >
> > > -     return perf_event__synthesize_build_id(&inject->tool, dso, cpumode,
> > > -                                            process_build_id, machine);
> > > +     return perf_event__synthesize_build_id(&inject->tool, &synth_sample, machine,
> > > +                                            process_build_id, inject__mmap_evsel(inject),
> > > +                                            /*misc=*/synth_sample.cpumode,
> > > +                                            dso__bid(dso), dso__long_name(dso));
> > >  }
> > >
> > >  static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
> > > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
> > > index 451d145fa4ed..8982f68e7230 100644
> > > --- a/tools/perf/util/build-id.c
> > > +++ b/tools/perf/util/build-id.c
> > > @@ -277,8 +277,8 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
> > >       struct perf_record_header_build_id b;
> > >       size_t len;
> > >
> > > -     len = name_len + 1;
> > > -     len = PERF_ALIGN(len, NAME_ALIGN);
> > > +     len = sizeof(b) + name_len + 1;
> > > +     len = PERF_ALIGN(len, sizeof(u64));
> > >
> > >       memset(&b, 0, sizeof(b));
> > >       memcpy(&b.data, bid->data, bid->size);
> > > @@ -286,7 +286,7 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
> > >       misc |= PERF_RECORD_MISC_BUILD_ID_SIZE;
> > >       b.pid = pid;
> > >       b.header.misc = misc;
> > > -     b.header.size = sizeof(b) + len;
> > > +     b.header.size = len;
> > >
> > >       err = do_write(fd, &b, sizeof(b));
> > >       if (err < 0)
> > > diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> > > index 0a7f93ae76fb..6bb62e4e2d5d 100644
> > > --- a/tools/perf/util/synthetic-events.c
> > > +++ b/tools/perf/util/synthetic-events.c
> > > @@ -2225,28 +2225,48 @@ int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, st
> > >  }
> > >  #endif
> > >
> > > -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc,
> > > -                                 perf_event__handler_t process, struct machine *machine)
> > > +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> > > +                                 struct perf_sample *sample,
> > > +                                 struct machine *machine,
> > > +                                 perf_event__handler_t process,
> > > +                                 const struct evsel *evsel,
> > > +                                 __u16 misc,
> > > +                                 const struct build_id *bid,
> > > +                                 const char *filename)
> > >  {
> > >       union perf_event ev;
> > >       size_t len;
> > >
> > > -     if (!dso__hit(pos))
> > > -             return 0;
> > > +     len = sizeof(ev.build_id) + strlen(filename) + 1;
> > > +     len = PERF_ALIGN(len, sizeof(u64));
> > >
> > > -     memset(&ev, 0, sizeof(ev));
> > > +     memset(&ev, 0, len);
> > >
> > > -     len = dso__long_name_len(pos) + 1;
> > > -     len = PERF_ALIGN(len, NAME_ALIGN);
> > > -     ev.build_id.size = min(dso__bid(pos)->size, sizeof(dso__bid(pos)->data));
> > > -     memcpy(&ev.build_id.build_id, dso__bid(pos)->data, ev.build_id.size);
> > > +     ev.build_id.size = min(bid->size, sizeof(ev.build_id.build_id));
> > > +     memcpy(ev.build_id.build_id, bid->data, ev.build_id.size);
> > >       ev.build_id.header.type = PERF_RECORD_HEADER_BUILD_ID;
> > >       ev.build_id.header.misc = misc | PERF_RECORD_MISC_BUILD_ID_SIZE;
> > >       ev.build_id.pid = machine->pid;
> > > -     ev.build_id.header.size = sizeof(ev.build_id) + len;
> > > -     memcpy(&ev.build_id.filename, dso__long_name(pos), dso__long_name_len(pos));
> > > +     ev.build_id.header.size = len;
> > > +     strcpy(ev.build_id.filename, filename);
> > > +
> > > +     if (evsel) {
> > > +             void *array = &ev;
> > > +             int ret;
> > >
> > > -     return process(tool, &ev, NULL, machine);
> > > +             array += ev.header.size;
> > > +             ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
> > > +             if (ret < 0)
> > > +                     return ret;
> > > +
> > > +             if (ret & 7) {
> > > +                     pr_err("Bad id sample size %d\n", ret);
> > > +                     return -EINVAL;
> > > +             }
> > > +
> > > +             ev.header.size += ret;
> > > +     }
> > > +     return process(tool, &ev, sample, machine);
> > >  }
> > >
> > >  int perf_event__synthesize_stat_events(struct perf_stat_config *config, const struct perf_tool *tool,
> > > diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
> > > index 31df7653677f..795bf3e18396 100644
> > > --- a/tools/perf/util/synthetic-events.h
> > > +++ b/tools/perf/util/synthetic-events.h
> > > @@ -9,6 +9,7 @@
> > >  #include <perf/cpumap.h>
> > >
> > >  struct auxtrace_record;
> > > +struct build_id;
> > >  struct dso;
> > >  struct evlist;
> > >  struct evsel;
> > > @@ -45,7 +46,14 @@ typedef int (*perf_event__handler_t)(const struct perf_tool *tool, union perf_ev
> > >
> > >  int perf_event__synthesize_attrs(const struct perf_tool *tool, struct evlist *evlist, perf_event__handler_t process);
> > >  int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_attr *attr, u32 ids, u64 *id, perf_event__handler_t process);
> > > -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc, perf_event__handler_t process, struct machine *machine);
> > > +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> > > +                                 struct perf_sample *sample,
> > > +                                 struct machine *machine,
> > > +                                 perf_event__handler_t process,
> > > +                                 const struct evsel *evsel,
> > > +                                 __u16 misc,
> > > +                                 const struct build_id *bid,
> > > +                                 const char *filename);
> > >  int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
> > >  int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
> > >  int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
> > > --
> > > 2.46.0.184.g6999bdac58-goog

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 10/13] perf inject: Fix build ID injection
  2024-08-28 15:15       ` Ian Rogers
@ 2024-09-02 18:26         ` Namhyung Kim
  2024-09-03 18:34           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Namhyung Kim @ 2024-09-02 18:26 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Masahiro Yamada, Arnd Bergmann, Jann Horn,
	Colin Ian King, Casey Chen, Athira Rajeev, Chaitanya S Prakash,
	James Clark, Ze Gao, Yang Jihong, Yunseong Kim, Weilin Wang,
	Dominique Martinet, Anne Macedo, Sun Haiyong, linux-perf-users,
	linux-kernel

Hi Ian,

On Wed, Aug 28, 2024 at 08:15:10AM -0700, Ian Rogers wrote:
> On Mon, Aug 19, 2024 at 12:54 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Mon, Aug 19, 2024 at 11:01 AM Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > >
> > > On Fri, Aug 16, 2024 at 11:44:39PM -0700, Ian Rogers wrote:
> > > > Build ID injection wasn't inserting a sample ID and aligning events to
> > > > 64 bytes rather than 8. No sample ID means events are unordered and
> > > > two different build_id events for the same path, as happens when a
> > > > file is replaced, can't be differentiated.
> > > >
> > > > Add in sample ID insertion for the build_id events alongside some
> > > > refactoring. The refactoring better aligns the function arguments for
> > > > different use cases, such as synthesizing build_id events without
> > > > needing to have a dso. The misc bits are explicitly passed as with
> > > > callchains the maps/dsos may span user and kernel land, so using
> > > > sample->cpumode isn't good enough.
> > > >
> > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > ---
> > > >  tools/perf/builtin-inject.c        | 170 ++++++++++++++++++++++-------
> > > >  tools/perf/util/build-id.c         |   6 +-
> > > >  tools/perf/util/synthetic-events.c |  44 ++++++--
> > > >  tools/perf/util/synthetic-events.h |  10 +-
> > > >  4 files changed, 175 insertions(+), 55 deletions(-)
> > > >
> > > > diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> > > > index a7c859db2e15..84a4bdb5cb0a 100644
> > > > --- a/tools/perf/builtin-inject.c
> > > > +++ b/tools/perf/builtin-inject.c
> > > > @@ -131,6 +131,7 @@ struct perf_inject {
> > > >       struct perf_file_section secs[HEADER_FEAT_BITS];
> > > >       struct guest_session    guest_session;
> > > >       struct strlist          *known_build_ids;
> > > > +     const struct evsel      *mmap_evsel;
> > > >  };
> > > >
> > > >  struct event_entry {
> > > > @@ -139,8 +140,13 @@ struct event_entry {
> > > >       union perf_event event[];
> > > >  };
> > > >
> > > > -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > > > -                             struct machine *machine, u8 cpumode, u32 flags);
> > > > +static int dso__inject_build_id(const struct perf_tool *tool,
> > > > +                             struct perf_sample *sample,
> > > > +                             struct machine *machine,
> > > > +                             const struct evsel *evsel,
> > > > +                             __u16 misc,
> > > > +                             const char *filename,
> > > > +                             struct dso *dso, u32 flags);
> > >
> > > So in the end the dso was needed, the name of the function remains
> > > dso__something(), so first arg would be a 'struct dso *'
> > >
> > > I processed the patches up to 9/13, so that they can get tested now.
> >
> > Maybe we should rename the function? We can get the build ID from
> > mmap2 events now, not just stored away in the dso by build_id events.
> > Reordering the arguments isn't a problem, I was just aiming for
> > consistency between the caller, perf_event__synthesize_build_id and
> > eventually the call to process that all take the arguments in this
> > order.
> 
> Arnaldo, did you have any more thoughts on this?

I'm ok to rename it to tool__inject_build_id().

Thanks,
Namhyung


> > >
> > > >
> > > >  static int output_bytes(struct perf_inject *inject, void *buf, size_t sz)
> > > >  {
> > > > @@ -422,6 +428,28 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
> > > >       return dso;
> > > >  }
> > > >
> > > > +/*
> > > > + * The evsel used for the sample ID for mmap events. Typically stashed when
> > > > + * processing mmap events. If not stashed, search the evlist for the first mmap
> > > > + * gathering event.
> > > > + */
> > > > +static const struct evsel *inject__mmap_evsel(struct perf_inject *inject)
> > > > +{
> > > > +     struct evsel *pos;
> > > > +
> > > > +     if (inject->mmap_evsel)
> > > > +             return inject->mmap_evsel;
> > > > +
> > > > +     evlist__for_each_entry(inject->session->evlist, pos) {
> > > > +             if (pos->core.attr.mmap) {
> > > > +                     inject->mmap_evsel = pos;
> > > > +                     return pos;
> > > > +             }
> > > > +     }
> > > > +     pr_err("No mmap events found\n");
> > > > +     return NULL;
> > > > +}
> > > > +
> > > >  static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
> > > >                                         union perf_event *event,
> > > >                                         struct perf_sample *sample,
> > > > @@ -469,12 +497,28 @@ static int perf_event__repipe_common_mmap(const struct perf_tool *tool,
> > > >               }
> > > >
> > > >               if (dso && !dso__hit(dso)) {
> > > > -                     dso__set_hit(dso);
> > > > -                     dso__inject_build_id(dso, tool, machine, sample->cpumode, flags);
> > > > +                     struct evsel *evsel = evlist__event2evsel(inject->session->evlist, event);
> > > > +
> > > > +                     if (evsel) {
> > > > +                             dso__set_hit(dso);
> > > > +                             dso__inject_build_id(tool, sample, machine, evsel,
> > > > +                                                  /*misc=*/sample->cpumode,
> > > > +                                                  filename, dso, flags);
> > > > +                     }
> > > >               }
> > > >       } else {
> > > > +             int err;
> > > > +
> > > > +             /*
> > > > +              * Remember the evsel for lazy build id generation. It is used
> > > > +              * for the sample id header type.
> > > > +              */
> > > > +             if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY &&
> > > > +                 !inject->mmap_evsel)
> > > > +                     inject->mmap_evsel = evlist__event2evsel(inject->session->evlist, event);
> > > > +
> > > >               /* Create the thread, map, etc. Not done for the unordered inject all case. */
> > > > -             int err = perf_event_process(tool, event, sample, machine);
> > > > +             err = perf_event_process(tool, event, sample, machine);
> > > >
> > > >               if (err) {
> > > >                       dso__put(dso);
> > > > @@ -667,16 +711,20 @@ static bool perf_inject__lookup_known_build_id(struct perf_inject *inject,
> > > >       return false;
> > > >  }
> > > >
> > > > -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > > > -                             struct machine *machine, u8 cpumode, u32 flags)
> > > > +static int dso__inject_build_id(const struct perf_tool *tool,
> > > > +                             struct perf_sample *sample,
> > > > +                             struct machine *machine,
> > > > +                             const struct evsel *evsel,
> > > > +                             __u16 misc,
> > > > +                             const char *filename,
> > > > +                             struct dso *dso, u32 flags)
> > > >  {
> > > > -     struct perf_inject *inject = container_of(tool, struct perf_inject,
> > > > -                                               tool);
> > > > +     struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
> > > >       int err;
> > > >
> > > > -     if (is_anon_memory(dso__long_name(dso)) || flags & MAP_HUGETLB)
> > > > +     if (is_anon_memory(filename) || flags & MAP_HUGETLB)
> > > >               return 0;
> > > > -     if (is_no_dso_memory(dso__long_name(dso)))
> > > > +     if (is_no_dso_memory(filename))
> > > >               return 0;
> > > >
> > > >       if (inject->known_build_ids != NULL &&
> > > > @@ -684,24 +732,65 @@ static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > > >               return 1;
> > > >
> > > >       if (dso__read_build_id(dso) < 0) {
> > > > -             pr_debug("no build_id found for %s\n", dso__long_name(dso));
> > > > +             pr_debug("no build_id found for %s\n", filename);
> > > >               return -1;
> > > >       }
> > > >
> > > > -     err = perf_event__synthesize_build_id(tool, dso, cpumode,
> > > > -                                           perf_event__repipe, machine);
> > > > +     err = perf_event__synthesize_build_id(tool, sample, machine,
> > > > +                                           perf_event__repipe,
> > > > +                                           evsel, misc, dso__bid(dso),
> > > > +                                           filename);
> > > >       if (err) {
> > > > -             pr_err("Can't synthesize build_id event for %s\n", dso__long_name(dso));
> > > > +             pr_err("Can't synthesize build_id event for %s\n", filename);
> > > >               return -1;
> > > >       }
> > > >
> > > >       return 0;
> > > >  }
> > > >
> > > > +static int mark_dso_hit(const struct perf_tool *tool,
> > > > +                     struct perf_sample *sample,
> > > > +                     struct machine *machine,
> > > > +                     const struct evsel *mmap_evsel,
> > > > +                     struct map *map, bool sample_in_dso)
> > > > +{
> > > > +     struct dso *dso;
> > > > +     u16 misc = sample->cpumode;
> > > > +
> > > > +     if (!map)
> > > > +             return 0;
> > > > +
> > > > +     if (!sample_in_dso) {
> > > > +             u16 guest_mask = PERF_RECORD_MISC_GUEST_KERNEL |
> > > > +                     PERF_RECORD_MISC_GUEST_USER;
> > > > +
> > > > +             if ((misc & guest_mask) != 0) {
> > > > +                     misc &= PERF_RECORD_MISC_HYPERVISOR;
> > > > +                     misc |= __map__is_kernel(map)
> > > > +                             ? PERF_RECORD_MISC_GUEST_KERNEL
> > > > +                             : PERF_RECORD_MISC_GUEST_USER;
> > > > +             } else {
> > > > +                     misc &= PERF_RECORD_MISC_HYPERVISOR;
> > > > +                     misc |= __map__is_kernel(map)
> > > > +                             ? PERF_RECORD_MISC_KERNEL
> > > > +                             : PERF_RECORD_MISC_USER;
> > > > +             }
> > > > +     }
> > > > +     dso = map__dso(map);
> > > > +     if (dso && !dso__hit(dso)) {
> > > > +             dso__set_hit(dso);
> > > > +             dso__inject_build_id(tool, sample, machine,
> > > > +                             mmap_evsel, misc, dso__long_name(dso), dso,
> > > > +                             map__flags(map));
> > > > +     }
> > > > +     return 0;
> > > > +}
> > > > +
> > > >  struct mark_dso_hit_args {
> > > >       const struct perf_tool *tool;
> > > > +     struct perf_sample *sample;
> > > >       struct machine *machine;
> > > > -     u8 cpumode;
> > > > +     const struct evsel *mmap_evsel;
> > > >  };
> > > >
> > > >  static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
> > > > @@ -709,16 +798,8 @@ static int mark_dso_hit_callback(struct callchain_cursor_node *node, void *data)
> > > >       struct mark_dso_hit_args *args = data;
> > > >       struct map *map = node->ms.map;
> > > >
> > > > -     if (map) {
> > > > -             struct dso *dso = map__dso(map);
> > > > -
> > > > -             if (dso && !dso__hit(dso)) {
> > > > -                     dso__set_hit(dso);
> > > > -                     dso__inject_build_id(dso, args->tool, args->machine,
> > > > -                                          args->cpumode, map__flags(map));
> > > > -             }
> > > > -     }
> > > > -     return 0;
> > > > +     return mark_dso_hit(args->tool, args->sample, args->machine,
> > > > +                         args->mmap_evsel, map, /*sample_in_dso=*/false);
> > > >  }
> > > >
> > > >  int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *event,
> > > > @@ -728,10 +809,16 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
> > > >  {
> > > >       struct addr_location al;
> > > >       struct thread *thread;
> > > > +     struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
> > > >       struct mark_dso_hit_args args = {
> > > >               .tool = tool,
> > > > +             /*
> > > > +              * Use the parsed sample data of the sample event, which will
> > > > +              * have a later timestamp than the mmap event.
> > > > +              */
> > > > +             .sample = sample,
> > > >               .machine = machine,
> > > > -             .cpumode = sample->cpumode,
> > > > +             .mmap_evsel = inject__mmap_evsel(inject),
> > > >       };
> > > >
> > > >       addr_location__init(&al);
> > > > @@ -743,13 +830,8 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
> > > >       }
> > > >
> > > >       if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
> > > > -             struct dso *dso = map__dso(al.map);
> > > > -
> > > > -             if (!dso__hit(dso)) {
> > > > -                     dso__set_hit(dso);
> > > > -                     dso__inject_build_id(dso, tool, machine,
> > > > -                                          sample->cpumode, map__flags(al.map));
> > > > -             }
> > > > +             mark_dso_hit(tool, sample, machine, args.mmap_evsel, al.map,
> > > > +                          /*sample_in_dso=*/true);
> > > >       }
> > > >
> > > >       sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
> > > > @@ -1159,17 +1241,27 @@ static int process_build_id(const struct perf_tool *tool,
> > > >  static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_t machine_pid)
> > > >  {
> > > >       struct machine *machine = perf_session__findnew_machine(inject->session, machine_pid);
> > > > -     u8 cpumode = dso__is_in_kernel_space(dso) ?
> > > > -                     PERF_RECORD_MISC_GUEST_KERNEL :
> > > > -                     PERF_RECORD_MISC_GUEST_USER;
> > > > +     struct perf_sample synth_sample = {
> > > > +             .pid       = -1,
> > > > +             .tid       = -1,
> > > > +             .time      = -1,
> > > > +             .stream_id = -1,
> > > > +             .cpu       = -1,
> > > > +             .period    = 1,
> > > > +             .cpumode   = dso__is_in_kernel_space(dso)
> > > > +             ? PERF_RECORD_MISC_GUEST_KERNEL
> > > > +             : PERF_RECORD_MISC_GUEST_USER,
> > > > +     };
> > > >
> > > >       if (!machine)
> > > >               return -ENOMEM;
> > > >
> > > >       dso__set_hit(dso);
> > > >
> > > > -     return perf_event__synthesize_build_id(&inject->tool, dso, cpumode,
> > > > -                                            process_build_id, machine);
> > > > +     return perf_event__synthesize_build_id(&inject->tool, &synth_sample, machine,
> > > > +                                            process_build_id, inject__mmap_evsel(inject),
> > > > +                                            /*misc=*/synth_sample.cpumode,
> > > > +                                            dso__bid(dso), dso__long_name(dso));
> > > >  }
> > > >
> > > >  static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
> > > > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
> > > > index 451d145fa4ed..8982f68e7230 100644
> > > > --- a/tools/perf/util/build-id.c
> > > > +++ b/tools/perf/util/build-id.c
> > > > @@ -277,8 +277,8 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
> > > >       struct perf_record_header_build_id b;
> > > >       size_t len;
> > > >
> > > > -     len = name_len + 1;
> > > > -     len = PERF_ALIGN(len, NAME_ALIGN);
> > > > +     len = sizeof(b) + name_len + 1;
> > > > +     len = PERF_ALIGN(len, sizeof(u64));
> > > >
> > > >       memset(&b, 0, sizeof(b));
> > > >       memcpy(&b.data, bid->data, bid->size);
> > > > @@ -286,7 +286,7 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
> > > >       misc |= PERF_RECORD_MISC_BUILD_ID_SIZE;
> > > >       b.pid = pid;
> > > >       b.header.misc = misc;
> > > > -     b.header.size = sizeof(b) + len;
> > > > +     b.header.size = len;
> > > >
> > > >       err = do_write(fd, &b, sizeof(b));
> > > >       if (err < 0)
> > > > diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> > > > index 0a7f93ae76fb..6bb62e4e2d5d 100644
> > > > --- a/tools/perf/util/synthetic-events.c
> > > > +++ b/tools/perf/util/synthetic-events.c
> > > > @@ -2225,28 +2225,48 @@ int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, st
> > > >  }
> > > >  #endif
> > > >
> > > > -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc,
> > > > -                                 perf_event__handler_t process, struct machine *machine)
> > > > +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> > > > +                                 struct perf_sample *sample,
> > > > +                                 struct machine *machine,
> > > > +                                 perf_event__handler_t process,
> > > > +                                 const struct evsel *evsel,
> > > > +                                 __u16 misc,
> > > > +                                 const struct build_id *bid,
> > > > +                                 const char *filename)
> > > >  {
> > > >       union perf_event ev;
> > > >       size_t len;
> > > >
> > > > -     if (!dso__hit(pos))
> > > > -             return 0;
> > > > +     len = sizeof(ev.build_id) + strlen(filename) + 1;
> > > > +     len = PERF_ALIGN(len, sizeof(u64));
> > > >
> > > > -     memset(&ev, 0, sizeof(ev));
> > > > +     memset(&ev, 0, len);
> > > >
> > > > -     len = dso__long_name_len(pos) + 1;
> > > > -     len = PERF_ALIGN(len, NAME_ALIGN);
> > > > -     ev.build_id.size = min(dso__bid(pos)->size, sizeof(dso__bid(pos)->data));
> > > > -     memcpy(&ev.build_id.build_id, dso__bid(pos)->data, ev.build_id.size);
> > > > +     ev.build_id.size = min(bid->size, sizeof(ev.build_id.build_id));
> > > > +     memcpy(ev.build_id.build_id, bid->data, ev.build_id.size);
> > > >       ev.build_id.header.type = PERF_RECORD_HEADER_BUILD_ID;
> > > >       ev.build_id.header.misc = misc | PERF_RECORD_MISC_BUILD_ID_SIZE;
> > > >       ev.build_id.pid = machine->pid;
> > > > -     ev.build_id.header.size = sizeof(ev.build_id) + len;
> > > > -     memcpy(&ev.build_id.filename, dso__long_name(pos), dso__long_name_len(pos));
> > > > +     ev.build_id.header.size = len;
> > > > +     strcpy(ev.build_id.filename, filename);
> > > > +
> > > > +     if (evsel) {
> > > > +             void *array = &ev;
> > > > +             int ret;
> > > >
> > > > -     return process(tool, &ev, NULL, machine);
> > > > +             array += ev.header.size;
> > > > +             ret = perf_event__synthesize_id_sample(array, evsel->core.attr.sample_type, sample);
> > > > +             if (ret < 0)
> > > > +                     return ret;
> > > > +
> > > > +             if (ret & 7) {
> > > > +                     pr_err("Bad id sample size %d\n", ret);
> > > > +                     return -EINVAL;
> > > > +             }
> > > > +
> > > > +             ev.header.size += ret;
> > > > +     }
> > > > +     return process(tool, &ev, sample, machine);
> > > >  }
> > > >
> > > >  int perf_event__synthesize_stat_events(struct perf_stat_config *config, const struct perf_tool *tool,
> > > > diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
> > > > index 31df7653677f..795bf3e18396 100644
> > > > --- a/tools/perf/util/synthetic-events.h
> > > > +++ b/tools/perf/util/synthetic-events.h
> > > > @@ -9,6 +9,7 @@
> > > >  #include <perf/cpumap.h>
> > > >
> > > >  struct auxtrace_record;
> > > > +struct build_id;
> > > >  struct dso;
> > > >  struct evlist;
> > > >  struct evsel;
> > > > @@ -45,7 +46,14 @@ typedef int (*perf_event__handler_t)(const struct perf_tool *tool, union perf_ev
> > > >
> > > >  int perf_event__synthesize_attrs(const struct perf_tool *tool, struct evlist *evlist, perf_event__handler_t process);
> > > >  int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_attr *attr, u32 ids, u64 *id, perf_event__handler_t process);
> > > > -int perf_event__synthesize_build_id(const struct perf_tool *tool, struct dso *pos, u16 misc, perf_event__handler_t process, struct machine *machine);
> > > > +int perf_event__synthesize_build_id(const struct perf_tool *tool,
> > > > +                                 struct perf_sample *sample,
> > > > +                                 struct machine *machine,
> > > > +                                 perf_event__handler_t process,
> > > > +                                 const struct evsel *evsel,
> > > > +                                 __u16 misc,
> > > > +                                 const struct build_id *bid,
> > > > +                                 const char *filename);
> > > >  int perf_event__synthesize_cpu_map(const struct perf_tool *tool, const struct perf_cpu_map *cpus, perf_event__handler_t process, struct machine *machine);
> > > >  int perf_event__synthesize_event_update_cpus(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
> > > >  int perf_event__synthesize_event_update_name(const struct perf_tool *tool, struct evsel *evsel, perf_event__handler_t process);
> > > > --
> > > > 2.46.0.184.g6999bdac58-goog

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 00/13] perf inject improvements
  2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
                   ` (12 preceding siblings ...)
  2024-08-17  6:44 ` [PATCH v1 13/13] perf callchain: Allow symbols to be optional when resolving a callchain Ian Rogers
@ 2024-09-02 18:27 ` Namhyung Kim
  13 siblings, 0 replies; 20+ messages in thread
From: Namhyung Kim @ 2024-09-02 18:27 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Masahiro Yamada, Arnd Bergmann, Jann Horn,
	Colin Ian King, Casey Chen, Athira Rajeev, Chaitanya S Prakash,
	James Clark, Ze Gao, Yang Jihong, Yunseong Kim, Weilin Wang,
	Dominique Martinet, Anne Macedo, Sun Haiyong, linux-perf-users,
	linux-kernel

On Fri, Aug 16, 2024 at 11:44:29PM -0700, Ian Rogers wrote:
> Fix the existing build id injection by adding sample IDs on to the
> synthesized events. This correctly orders the events and addresses
> issues such as a profiled executable being replaced during its
> execution.
> 
> Add a new --mmap2-buildid-all option that rewrites all mmap events as
> mmap2 events containing build IDs. This removes the need for build_id
> events.
> 
> Add a new -B option that like --mmap2-buildid-all synthesizes mmap2
> with build id events. With -B the behavior is to do it lazily, so only
> when a sample references the particular map. With system wide
> profiling that synthesizes mmap events for all running processes the
> perf.data file savings can be greater than 50%.
> 
> Reduce the memory footprint of perf inject by avoiding creating
> symbols in the callchain, the symbols aren't used during perf inject
> and necessitate the loading of dsos.
> 
> Ian Rogers (13):
>   perf synthetic-events: Avoid unnecessary memset
>   perf map: API clean up
>   perf jit: Constify filename argument
>   perf dso: Constify dso_id
>   perf evsel: Constify evsel__id_hdr_size argument
>   perf test: Expand pipe/inject test
>   perf inject: Combine build_ids and build_id_all into enum
>   perf inject: Combine different mmap and mmap2 functions
>   perf inject: Combine mmap and mmap2 handling
>   perf inject: Fix build ID injection
>   perf inject: Add new mmap2-buildid-all option
>   perf inject: Lazy build-id mmap2 event insertion
>   perf callchain: Allow symbols to be optional when resolving a
>     callchain

For the remaining bits,

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung

> 
>  tools/perf/builtin-inject.c         | 532 ++++++++++++++++++----------
>  tools/perf/builtin-top.c            |   2 +-
>  tools/perf/tests/shell/pipe_test.sh | 103 ++++--
>  tools/perf/tests/vmlinux-kallsyms.c |   4 +-
>  tools/perf/util/build-id.c          |   6 +-
>  tools/perf/util/callchain.c         |   8 +-
>  tools/perf/util/callchain.h         |   2 +-
>  tools/perf/util/dso.c               |   4 +-
>  tools/perf/util/dso.h               |   4 +-
>  tools/perf/util/dsos.c              |  12 +-
>  tools/perf/util/dsos.h              |   2 +-
>  tools/perf/util/evsel.c             |   2 +-
>  tools/perf/util/evsel.h             |   2 +-
>  tools/perf/util/jit.h               |   3 +-
>  tools/perf/util/jitdump.c           |   6 +-
>  tools/perf/util/machine.c           |  95 ++---
>  tools/perf/util/machine.h           |  36 +-
>  tools/perf/util/map.c               |  25 +-
>  tools/perf/util/map.h               |  22 +-
>  tools/perf/util/synthetic-events.c  | 103 +++++-
>  tools/perf/util/synthetic-events.h  |  21 +-
>  21 files changed, 682 insertions(+), 312 deletions(-)
> 
> -- 
> 2.46.0.184.g6999bdac58-goog
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 10/13] perf inject: Fix build ID injection
  2024-09-02 18:26         ` Namhyung Kim
@ 2024-09-03 18:34           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-03 18:34 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ian Rogers, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
	Masahiro Yamada, Arnd Bergmann, Jann Horn, Colin Ian King,
	Casey Chen, Athira Rajeev, Chaitanya S Prakash, James Clark,
	Ze Gao, Yang Jihong, Yunseong Kim, Weilin Wang,
	Dominique Martinet, Anne Macedo, Sun Haiyong, linux-perf-users,
	linux-kernel

On Mon, Sep 02, 2024 at 11:26:09AM -0700, Namhyung Kim wrote:
> Hi Ian,
> 
> On Wed, Aug 28, 2024 at 08:15:10AM -0700, Ian Rogers wrote:
> > On Mon, Aug 19, 2024 at 12:54 PM Ian Rogers <irogers@google.com> wrote:
> > >
> > > On Mon, Aug 19, 2024 at 11:01 AM Arnaldo Carvalho de Melo
> > > <acme@kernel.org> wrote:
> > > >
> > > > On Fri, Aug 16, 2024 at 11:44:39PM -0700, Ian Rogers wrote:
> > > > > Build ID injection wasn't inserting a sample ID and aligning events to
> > > > > 64 bytes rather than 8. No sample ID means events are unordered and
> > > > > two different build_id events for the same path, as happens when a
> > > > > file is replaced, can't be differentiated.
> > > > >
> > > > > Add in sample ID insertion for the build_id events alongside some
> > > > > refactoring. The refactoring better aligns the function arguments for
> > > > > different use cases, such as synthesizing build_id events without
> > > > > needing to have a dso. The misc bits are explicitly passed as with
> > > > > callchains the maps/dsos may span user and kernel land, so using
> > > > > sample->cpumode isn't good enough.
> > > > >
> > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > > ---
> > > > >  tools/perf/builtin-inject.c        | 170 ++++++++++++++++++++++-------
> > > > >  tools/perf/util/build-id.c         |   6 +-
> > > > >  tools/perf/util/synthetic-events.c |  44 ++++++--
> > > > >  tools/perf/util/synthetic-events.h |  10 +-
> > > > >  4 files changed, 175 insertions(+), 55 deletions(-)
> > > > >
> > > > > diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> > > > > index a7c859db2e15..84a4bdb5cb0a 100644
> > > > > --- a/tools/perf/builtin-inject.c
> > > > > +++ b/tools/perf/builtin-inject.c
> > > > > @@ -131,6 +131,7 @@ struct perf_inject {
> > > > >       struct perf_file_section secs[HEADER_FEAT_BITS];
> > > > >       struct guest_session    guest_session;
> > > > >       struct strlist          *known_build_ids;
> > > > > +     const struct evsel      *mmap_evsel;
> > > > >  };
> > > > >
> > > > >  struct event_entry {
> > > > > @@ -139,8 +140,13 @@ struct event_entry {
> > > > >       union perf_event event[];
> > > > >  };
> > > > >
> > > > > -static int dso__inject_build_id(struct dso *dso, const struct perf_tool *tool,
> > > > > -                             struct machine *machine, u8 cpumode, u32 flags);
> > > > > +static int dso__inject_build_id(const struct perf_tool *tool,
> > > > > +                             struct perf_sample *sample,
> > > > > +                             struct machine *machine,
> > > > > +                             const struct evsel *evsel,
> > > > > +                             __u16 misc,
> > > > > +                             const char *filename,
> > > > > +                             struct dso *dso, u32 flags);
> > > >
> > > > So in the end the dso was needed, the name of the function remains
> > > > dso__something(), so first arg would be a 'struct dso *'
> > > >
> > > > I processed the patches up to 9/13, so that they can get tested now.
> > >
> > > Maybe we should rename the function? We can get the build ID from
> > > mmap2 events now, not just stored away in the dso by build_id events.
> > > Reordering the arguments isn't a problem, I was just aiming for
> > > consistency between the caller, perf_event__synthesize_build_id and
> > > eventually the call to process that all take the arguments in this
> > > order.
> > 
> > Arnaldo, did you have any more thoughts on this?
> 
> I'm ok to rename it to tool__inject_build_id().

I thought I had this processed, but I stopped at this patch, yeah,
please rebase on top of perf-tools-next with the suggested rename.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-09-03 18:34 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-17  6:44 [PATCH v1 00/13] perf inject improvements Ian Rogers
2024-08-17  6:44 ` [PATCH v1 01/13] perf synthetic-events: Avoid unnecessary memset Ian Rogers
2024-08-17  6:44 ` [PATCH v1 02/13] perf map: API clean up Ian Rogers
2024-08-17  6:44 ` [PATCH v1 03/13] perf jit: Constify filename argument Ian Rogers
2024-08-17  6:44 ` [PATCH v1 04/13] perf dso: Constify dso_id Ian Rogers
2024-08-17  6:44 ` [PATCH v1 05/13] perf evsel: Constify evsel__id_hdr_size argument Ian Rogers
2024-08-17  6:44 ` [PATCH v1 06/13] perf test: Expand pipe/inject test Ian Rogers
2024-08-17  6:44 ` [PATCH v1 07/13] perf inject: Combine build_ids and build_id_all into enum Ian Rogers
2024-08-17  6:44 ` [PATCH v1 08/13] perf inject: Combine different mmap and mmap2 functions Ian Rogers
2024-08-17  6:44 ` [PATCH v1 09/13] perf inject: Combine mmap and mmap2 handling Ian Rogers
2024-08-17  6:44 ` [PATCH v1 10/13] perf inject: Fix build ID injection Ian Rogers
2024-08-19 18:01   ` Arnaldo Carvalho de Melo
2024-08-19 19:54     ` Ian Rogers
2024-08-28 15:15       ` Ian Rogers
2024-09-02 18:26         ` Namhyung Kim
2024-09-03 18:34           ` Arnaldo Carvalho de Melo
2024-08-17  6:44 ` [PATCH v1 11/13] perf inject: Add new mmap2-buildid-all option Ian Rogers
2024-08-17  6:44 ` [PATCH v1 12/13] perf inject: Lazy build-id mmap2 event insertion Ian Rogers
2024-08-17  6:44 ` [PATCH v1 13/13] perf callchain: Allow symbols to be optional when resolving a callchain Ian Rogers
2024-09-02 18:27 ` [PATCH v1 00/13] perf inject improvements Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).