linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] perf: generate events for BPF metadata
@ 2025-05-21 22:27 Blake Jones
  2025-05-21 22:27 ` [PATCH 1/3] perf: add support for printing BTF character arrays as strings Blake Jones
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Blake Jones @ 2025-05-21 22:27 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Ian Rogers,
	Adrian Hunter, Kan Liang
  Cc: Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users,
	Blake Jones

Commit ffa915f46193 ("Merge branch 'bpf_metadata'"), from September 2020,
added support to the kernel, libbpf, and bpftool to treat read-only BPF
variables that have names starting with 'bpf_metadata_' specially. This
patch series updates perf to handle these variables similarly, allowing a
perf.data file to capture relevant information about BPF programs on the
system being profiled.

When it encounters a BPF program, it reads the program's maps to find an
'.rodata' map with 'bpf_metadata_' variables. If it finds one, it extracts
their values as strings, and creates a new PERF_RECORD_BPF_METADATA
synthetic event using that data. It does this both for BPF programs that
were loaded when a 'perf record' starts, as well as for programs that are
loaded while the profile is running. For the latter case, it stores the
metadata for the duration of the profile, and then dumps it at the end of
the profile, where it's in a better context to do so.

The PERF_RECORD_BPF_METADATA event holds an array of key-value pairs, where
the key is the variable name (minus the "bpf_metadata_" prefix) and the
value is the variable's value, formatted as a string. The event also
includes the BPF program's name, so it can be correlated with
PERF_RECORD_KSYMBOL events.

Blake Jones (3):
  perf: add support for printing BTF character arrays as strings
  perf: collect BPF metadata from existing BPF programs
  perf: collect BPF metadata from new programs, and display the new
    event

 tools/lib/bpf/btf.h                          |   3 +-
 tools/lib/bpf/btf_dump.c                     |  51 ++-
 tools/lib/perf/include/perf/event.h          |  18 +
 tools/perf/builtin-inject.c                  |   1 +
 tools/perf/builtin-record.c                  |   8 +
 tools/perf/builtin-script.c                  |  15 +-
 tools/perf/tests/shell/test_bpf_metadata.sh  |  67 ++++
 tools/perf/util/bpf-event.c                  | 356 +++++++++++++++++++
 tools/perf/util/bpf-event.h                  |  14 +
 tools/perf/util/bpf_skel/sample_filter.bpf.c |   4 +
 tools/perf/util/env.c                        |  19 +-
 tools/perf/util/env.h                        |   4 +
 tools/perf/util/event.c                      |  21 ++
 tools/perf/util/event.h                      |   1 +
 tools/perf/util/header.c                     |   1 +
 tools/perf/util/session.c                    |   4 +
 tools/perf/util/synthetic-events.h           |   2 +
 tools/perf/util/tool.c                       |  14 +
 tools/perf/util/tool.h                       |   3 +-
 19 files changed, 600 insertions(+), 6 deletions(-)
 create mode 100644 tools/perf/tests/shell/test_bpf_metadata.sh

-- 
2.49.0.1143.g0be31eac6b-goog


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-05-21 22:27 [PATCH 0/3] perf: generate events for BPF metadata Blake Jones
@ 2025-05-21 22:27 ` Blake Jones
  2025-05-22 17:42   ` Arnaldo Carvalho de Melo
  2025-05-21 22:27 ` [PATCH 2/3] perf: collect BPF metadata from existing BPF programs Blake Jones
  2025-05-21 22:27 ` [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event Blake Jones
  2 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-05-21 22:27 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Ian Rogers,
	Adrian Hunter, Kan Liang
  Cc: Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users,
	Blake Jones

The BTF dumper code currently displays arrays of characters as just that -
arrays, with each character formatted individually. Sometimes this is what
makes sense, but it's nice to be able to treat that array as a string.

This change adds a special case to the btf_dump functionality to allow
arrays of single-byte integer values to be printed as character strings.
Characters for which isprint() returns false are printed as hex-escaped
values. This is enabled when the new ".print_strings" is set to 1 in the
btf_dump_type_data_opts structure.

As an example, here's what it looks like to dump the string "hello" using
a few different field values for btf_dump_type_data_opts (.compact = 1):

- .print_strings = 0, .skip_names = 0:  (char[6])['h','e','l','l','o',]
- .print_strings = 0, .skip_names = 1:  ['h','e','l','l','o',]
- .print_strings = 1, .skip_names = 0:  (char[6])"hello"
- .print_strings = 1, .skip_names = 1:  "hello"

Here's the string "h\xff", dumped with .compact = 1 and .skip_names = 1:

- .print_strings = 0:  ['h',-1,]
- .print_strings = 1:  "h\xff"

Signed-off-by: Blake Jones <blakejones@google.com>
---
 tools/lib/bpf/btf.h      |  3 ++-
 tools/lib/bpf/btf_dump.c | 51 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 4392451d634b..be8e8e26d245 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -326,9 +326,10 @@ struct btf_dump_type_data_opts {
 	bool compact;		/* no newlines/indentation */
 	bool skip_names;	/* skip member/type names */
 	bool emit_zeroes;	/* show 0-valued fields */
+	bool print_strings;	/* print char arrays as strings */
 	size_t :0;
 };
-#define btf_dump_type_data_opts__last_field emit_zeroes
+#define btf_dump_type_data_opts__last_field print_strings
 
 LIBBPF_API int
 btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c
index 460c3e57fadb..a07dd5accdd8 100644
--- a/tools/lib/bpf/btf_dump.c
+++ b/tools/lib/bpf/btf_dump.c
@@ -75,6 +75,7 @@ struct btf_dump_data {
 	bool is_array_member;
 	bool is_array_terminated;
 	bool is_array_char;
+	bool print_strings;
 };
 
 struct btf_dump {
@@ -2028,6 +2029,50 @@ static int btf_dump_var_data(struct btf_dump *d,
 	return btf_dump_dump_type_data(d, NULL, t, type_id, data, 0, 0);
 }
 
+static int btf_dump_string_data(struct btf_dump *d,
+				const struct btf_type *t,
+				__u32 id,
+				const void *data)
+{
+	const struct btf_array *array = btf_array(t);
+	__u32 i;
+
+	if (!btf_is_int(skip_mods_and_typedefs(d->btf, array->type, NULL)) ||
+	    btf__resolve_size(d->btf, array->type) != 1 ||
+	    !d->typed_dump->print_strings) {
+		pr_warn("unexpected %s() call for array type %u\n",
+			__func__, array->type);
+		return -EINVAL;
+	}
+
+	btf_dump_data_pfx(d);
+	btf_dump_printf(d, "\"");
+
+	for (i = 0; i < array->nelems; i++, data++) {
+		char c;
+
+		if (data >= d->typed_dump->data_end)
+			return -E2BIG;
+
+		c = *(char *)data;
+		if (c == '\0') {
+			/* When printing character arrays as strings, NUL bytes
+			 * are always treated as string terminators; they are
+			 * never printed.
+			 */
+			break;
+		}
+		if (isprint(c))
+			btf_dump_printf(d, "%c", c);
+		else
+			btf_dump_printf(d, "\\x%02x", *(__u8 *)data);
+	}
+
+	btf_dump_printf(d, "\"");
+
+	return 0;
+}
+
 static int btf_dump_array_data(struct btf_dump *d,
 			       const struct btf_type *t,
 			       __u32 id,
@@ -2055,8 +2100,11 @@ static int btf_dump_array_data(struct btf_dump *d,
 		 * char arrays, so if size is 1 and element is
 		 * printable as a char, we'll do that.
 		 */
-		if (elem_size == 1)
+		if (elem_size == 1) {
+			if (d->typed_dump->print_strings)
+				return btf_dump_string_data(d, t, id, data);
 			d->typed_dump->is_array_char = true;
+		}
 	}
 
 	/* note that we increment depth before calling btf_dump_print() below;
@@ -2544,6 +2592,7 @@ int btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
 	d->typed_dump->compact = OPTS_GET(opts, compact, false);
 	d->typed_dump->skip_names = OPTS_GET(opts, skip_names, false);
 	d->typed_dump->emit_zeroes = OPTS_GET(opts, emit_zeroes, false);
+	d->typed_dump->print_strings = OPTS_GET(opts, print_strings, false);
 
 	ret = btf_dump_dump_type_data(d, NULL, t, id, data, 0, 0);
 
-- 
2.49.0.1143.g0be31eac6b-goog


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-05-21 22:27 [PATCH 0/3] perf: generate events for BPF metadata Blake Jones
  2025-05-21 22:27 ` [PATCH 1/3] perf: add support for printing BTF character arrays as strings Blake Jones
@ 2025-05-21 22:27 ` Blake Jones
  2025-05-29 17:47   ` Ian Rogers
  2025-06-03 20:15   ` Namhyung Kim
  2025-05-21 22:27 ` [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event Blake Jones
  2 siblings, 2 replies; 28+ messages in thread
From: Blake Jones @ 2025-05-21 22:27 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Ian Rogers,
	Adrian Hunter, Kan Liang
  Cc: Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users,
	Blake Jones

Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract
their values as strings, and create a new PERF_RECORD_BPF_METADATA
synthetic event using that data. The code gets invoked from the existing
routine perf_event__synthesize_one_bpf_prog().

Signed-off-by: Blake Jones <blakejones@google.com>
---
 tools/lib/perf/include/perf/event.h |  18 ++
 tools/perf/util/bpf-event.c         | 310 ++++++++++++++++++++++++++++
 tools/perf/util/bpf-event.h         |  13 ++
 3 files changed, 341 insertions(+)

diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
index 09b7c643ddac..6608f1e3701b 100644
--- a/tools/lib/perf/include/perf/event.h
+++ b/tools/lib/perf/include/perf/event.h
@@ -467,6 +467,22 @@ struct perf_record_compressed2 {
 	char			 data[];
 };
 
+#define BPF_METADATA_KEY_LEN   64
+#define BPF_METADATA_VALUE_LEN 256
+#define BPF_PROG_NAME_LEN      KSYM_NAME_LEN
+
+struct perf_record_bpf_metadata_entry {
+	char key[BPF_METADATA_KEY_LEN];
+	char value[BPF_METADATA_VALUE_LEN];
+};
+
+struct perf_record_bpf_metadata {
+	struct perf_event_header	      header;
+	char				      prog_name[BPF_PROG_NAME_LEN];
+	__u64				      nr_entries;
+	struct perf_record_bpf_metadata_entry entries[];
+};
+
 enum perf_user_event_type { /* above any possible kernel type */
 	PERF_RECORD_USER_TYPE_START		= 64,
 	PERF_RECORD_HEADER_ATTR			= 64,
@@ -489,6 +505,7 @@ enum perf_user_event_type { /* above any possible kernel type */
 	PERF_RECORD_COMPRESSED			= 81,
 	PERF_RECORD_FINISHED_INIT		= 82,
 	PERF_RECORD_COMPRESSED2			= 83,
+	PERF_RECORD_BPF_METADATA		= 84,
 	PERF_RECORD_HEADER_MAX
 };
 
@@ -530,6 +547,7 @@ union perf_event {
 	struct perf_record_header_feature	feat;
 	struct perf_record_compressed		pack;
 	struct perf_record_compressed2		pack2;
+	struct perf_record_bpf_metadata		bpf_metadata;
 };
 
 #endif /* __LIBPERF_EVENT_H */
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index c81444059ad0..36d5676f025e 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -1,13 +1,20 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <errno.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
 #include <stdlib.h>
+#include <string.h>
 #include <bpf/bpf.h>
 #include <bpf/btf.h>
 #include <bpf/libbpf.h>
+#include <linux/bpf.h>
 #include <linux/btf.h>
 #include <linux/err.h>
+#include <linux/perf_event.h>
 #include <linux/string.h>
 #include <internal/lib.h>
+#include <perf/event.h>
 #include <symbol/kallsyms.h>
 #include "bpf-event.h"
 #include "bpf-utils.h"
@@ -151,6 +158,298 @@ static int synthesize_bpf_prog_name(char *buf, int size,
 	return name_len;
 }
 
+#define BPF_METADATA_PREFIX "bpf_metadata_"
+#define BPF_METADATA_PREFIX_LEN (sizeof(BPF_METADATA_PREFIX) - 1)
+
+static bool name_has_bpf_metadata_prefix(const char **s)
+{
+	if (strncmp(*s, BPF_METADATA_PREFIX, BPF_METADATA_PREFIX_LEN) != 0)
+		return false;
+	*s += BPF_METADATA_PREFIX_LEN;
+	return true;
+}
+
+struct bpf_metadata_map {
+	struct btf *btf;
+	const struct btf_type *datasec;
+	void *rodata;
+	size_t rodata_size;
+	unsigned int num_vars;
+};
+
+static int bpf_metadata_read_map_data(__u32 map_id, struct bpf_metadata_map *map)
+{
+	int map_fd;
+	struct bpf_map_info map_info;
+	__u32 map_info_len;
+	int key;
+	struct btf *btf;
+	const struct btf_type *datasec;
+	struct btf_var_secinfo *vsi;
+	unsigned int vlen, vars;
+	void *rodata;
+
+	map_fd = bpf_map_get_fd_by_id(map_id);
+	if (map_fd < 0)
+		return -1;
+
+	memset(&map_info, 0, sizeof(map_info));
+	map_info_len = sizeof(map_info);
+	if (bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len) < 0)
+		goto out_close;
+
+	/* If it's not an .rodata map, don't bother. */
+	if (map_info.type != BPF_MAP_TYPE_ARRAY ||
+	    map_info.key_size != sizeof(int) ||
+	    map_info.max_entries != 1 ||
+	    !map_info.btf_value_type_id ||
+	    !strstr(map_info.name, ".rodata")) {
+		goto out_close;
+	}
+
+	btf = btf__load_from_kernel_by_id(map_info.btf_id);
+	if (!btf)
+		goto out_close;
+	datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
+	if (!btf_is_datasec(datasec))
+		goto out_free_btf;
+
+	/* If there aren't any variables with the "bpf_metadata_" prefix,
+	 * don't bother.
+	 */
+	vlen = btf_vlen(datasec);
+	vsi = btf_var_secinfos(datasec);
+	vars = 0;
+	for (unsigned int i = 0; i < vlen; i++, vsi++) {
+		const struct btf_type *t_var = btf__type_by_id(btf, vsi->type);
+		const char *name = btf__name_by_offset(btf, t_var->name_off);
+
+		if (name_has_bpf_metadata_prefix(&name))
+			vars++;
+	}
+	if (vars == 0)
+		goto out_free_btf;
+
+	rodata = calloc(1, map_info.value_size);
+	if (!rodata)
+		goto out_free_btf;
+	key = 0;
+	if (bpf_map_lookup_elem(map_fd, &key, rodata)) {
+		free(rodata);
+		goto out_free_btf;
+	}
+	close(map_fd);
+
+	map->btf = btf;
+	map->datasec = datasec;
+	map->rodata = rodata;
+	map->rodata_size = map_info.value_size;
+	map->num_vars = vars;
+	return 0;
+
+out_free_btf:
+	btf__free(btf);
+out_close:
+	close(map_fd);
+	return -1;
+}
+
+struct format_btf_ctx {
+	char *buf;
+	size_t buf_size;
+	size_t buf_idx;
+};
+
+static void format_btf_cb(void *arg, const char *fmt, va_list ap)
+{
+	int n;
+	struct format_btf_ctx *ctx = (struct format_btf_ctx *)arg;
+
+	n = vsnprintf(ctx->buf + ctx->buf_idx, ctx->buf_size - ctx->buf_idx,
+		      fmt, ap);
+	ctx->buf_idx += n;
+	if (ctx->buf_idx >= ctx->buf_size)
+		ctx->buf_idx = ctx->buf_size;
+}
+
+static void format_btf_variable(struct btf *btf, char *buf, size_t buf_size,
+				const struct btf_type *t, const void *btf_data)
+{
+	struct format_btf_ctx ctx = {
+		.buf = buf,
+		.buf_idx = 0,
+		.buf_size = buf_size,
+	};
+	const struct btf_dump_type_data_opts opts = {
+		.sz = sizeof(struct btf_dump_type_data_opts),
+		.skip_names = 1,
+		.compact = 1,
+		.print_strings = 1,
+	};
+	struct btf_dump *d;
+	size_t btf_size;
+
+	d = btf_dump__new(btf, format_btf_cb, &ctx, NULL);
+	btf_size = btf__resolve_size(btf, t->type);
+	btf_dump__dump_type_data(d, t->type, btf_data, btf_size, &opts);
+	btf_dump__free(d);
+}
+
+static void bpf_metadata_fill_event(struct bpf_metadata_map *map,
+				    struct perf_record_bpf_metadata *bpf_metadata_event)
+{
+	struct btf_var_secinfo *vsi;
+	unsigned int i, vlen;
+
+	memset(bpf_metadata_event->prog_name, 0, BPF_PROG_NAME_LEN);
+	vlen = btf_vlen(map->datasec);
+	vsi = btf_var_secinfos(map->datasec);
+
+	for (i = 0; i < vlen; i++, vsi++) {
+		const struct btf_type *t_var = btf__type_by_id(map->btf,
+							       vsi->type);
+		const char *name = btf__name_by_offset(map->btf,
+						       t_var->name_off);
+		const __u64 nr_entries = bpf_metadata_event->nr_entries;
+		struct perf_record_bpf_metadata_entry *entry;
+
+		if (!name_has_bpf_metadata_prefix(&name))
+			continue;
+
+		if (nr_entries >= (__u64)map->num_vars)
+			break;
+
+		entry = &bpf_metadata_event->entries[nr_entries];
+		memset(entry, 0, sizeof(*entry));
+		snprintf(entry->key, BPF_METADATA_KEY_LEN, "%s", name);
+		format_btf_variable(map->btf, entry->value,
+				    BPF_METADATA_VALUE_LEN, t_var,
+				    map->rodata + vsi->offset);
+		bpf_metadata_event->nr_entries++;
+	}
+}
+
+static void bpf_metadata_free_map_data(struct bpf_metadata_map *map)
+{
+	btf__free(map->btf);
+	free(map->rodata);
+}
+
+void bpf_metadata_free(struct bpf_metadata *metadata)
+{
+	if (metadata == NULL)
+		return;
+	for (__u32 index = 0; index < metadata->nr_prog_names; index++)
+		free(metadata->prog_names[index]);
+	if (metadata->prog_names != NULL)
+		free(metadata->prog_names);
+	if (metadata->event != NULL)
+		free(metadata->event);
+	free(metadata);
+}
+
+static struct bpf_metadata *bpf_metadata_alloc(__u32 nr_prog_tags,
+					       __u32 nr_variables)
+{
+	struct bpf_metadata *metadata;
+
+	metadata = calloc(1, sizeof(struct bpf_metadata));
+	if (!metadata)
+		return NULL;
+
+	metadata->prog_names = calloc(nr_prog_tags, sizeof(char *));
+	if (!metadata->prog_names) {
+		bpf_metadata_free(metadata);
+		return NULL;
+	}
+	for (__u32 prog_index = 0; prog_index < nr_prog_tags; prog_index++) {
+		metadata->prog_names[prog_index] = calloc(BPF_PROG_NAME_LEN,
+							  sizeof(char));
+		if (!metadata->prog_names[prog_index]) {
+			bpf_metadata_free(metadata);
+			return NULL;
+		}
+		metadata->nr_prog_names++;
+	}
+
+	metadata->event_size = sizeof(metadata->event->bpf_metadata) +
+	    nr_variables * sizeof(metadata->event->bpf_metadata.entries[0]);
+	metadata->event = calloc(1, metadata->event_size);
+	if (!metadata->event) {
+		bpf_metadata_free(metadata);
+		return NULL;
+	}
+
+	return metadata;
+}
+
+static struct bpf_metadata *bpf_metadata_create(struct bpf_prog_info *info)
+{
+	struct bpf_metadata *metadata;
+	const __u32 *map_ids = (__u32 *)(uintptr_t)info->map_ids;
+
+	for (__u32 map_index = 0; map_index < info->nr_map_ids; map_index++) {
+		struct perf_record_bpf_metadata *bpf_metadata_event;
+		struct bpf_metadata_map map;
+
+		if (bpf_metadata_read_map_data(map_ids[map_index], &map) != 0)
+			continue;
+
+		metadata = bpf_metadata_alloc(info->nr_prog_tags, map.num_vars);
+		if (!metadata)
+			continue;
+
+		bpf_metadata_event = &metadata->event->bpf_metadata;
+		*bpf_metadata_event = (struct perf_record_bpf_metadata) {
+			.header = {
+				.type = PERF_RECORD_BPF_METADATA,
+				.size = metadata->event_size,
+			},
+			.nr_entries = 0,
+		};
+		bpf_metadata_fill_event(&map, bpf_metadata_event);
+
+		for (__u32 index = 0; index < info->nr_prog_tags; index++) {
+			synthesize_bpf_prog_name(metadata->prog_names[index],
+						 BPF_PROG_NAME_LEN, info,
+						 map.btf, index);
+		}
+
+		bpf_metadata_free_map_data(&map);
+
+		return metadata;
+	}
+
+	return NULL;
+}
+
+static int synthesize_perf_record_bpf_metadata(const struct bpf_metadata *metadata,
+					       const struct perf_tool *tool,
+					       perf_event__handler_t process,
+					       struct machine *machine)
+{
+	union perf_event *event;
+	int err = 0;
+
+	event = calloc(1, metadata->event_size + machine->id_hdr_size);
+	if (!event)
+		return -1;
+	memcpy(event, metadata->event, metadata->event_size);
+	memset((void *)event + event->header.size, 0, machine->id_hdr_size);
+	event->header.size += machine->id_hdr_size;
+	for (__u32 index = 0; index < metadata->nr_prog_names; index++) {
+		memcpy(event->bpf_metadata.prog_name,
+		       metadata->prog_names[index], BPF_PROG_NAME_LEN);
+		err = perf_tool__process_synth_event(tool, event, machine,
+						     process);
+		if (err != 0)
+			break;
+	}
+
+	free(event);
+	return err;
+}
+
 /*
  * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
  * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
@@ -173,6 +472,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 	const struct perf_tool *tool = session->tool;
 	struct bpf_prog_info_node *info_node;
 	struct perf_bpil *info_linear;
+	struct bpf_metadata *metadata;
 	struct bpf_prog_info *info;
 	struct btf *btf = NULL;
 	struct perf_env *env;
@@ -193,6 +493,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 	arrays |= 1UL << PERF_BPIL_JITED_INSNS;
 	arrays |= 1UL << PERF_BPIL_LINE_INFO;
 	arrays |= 1UL << PERF_BPIL_JITED_LINE_INFO;
+	arrays |= 1UL << PERF_BPIL_MAP_IDS;
 
 	info_linear = get_bpf_prog_info_linear(fd, arrays);
 	if (IS_ERR_OR_NULL(info_linear)) {
@@ -301,6 +602,15 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 		 */
 		err = perf_tool__process_synth_event(tool, event,
 						     machine, process);
+
+		/* Synthesize PERF_RECORD_BPF_METADATA */
+		metadata = bpf_metadata_create(info);
+		if (metadata != NULL) {
+			err = synthesize_perf_record_bpf_metadata(metadata,
+								  tool, process,
+								  machine);
+			bpf_metadata_free(metadata);
+		}
 	}
 
 out:
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
index e2f0420905f5..007f0b4d21cb 100644
--- a/tools/perf/util/bpf-event.h
+++ b/tools/perf/util/bpf-event.h
@@ -17,6 +17,13 @@ struct record_opts;
 struct evlist;
 struct target;
 
+struct bpf_metadata {
+	union perf_event *event;
+	size_t		 event_size;
+	char		 **prog_names;
+	__u64		 nr_prog_names;
+};
+
 struct bpf_prog_info_node {
 	struct perf_bpil		*info_linear;
 	struct rb_node			rb_node;
@@ -36,6 +43,7 @@ int evlist__add_bpf_sb_event(struct evlist *evlist, struct perf_env *env);
 void __bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
 				      struct perf_env *env,
 				      FILE *fp);
+void bpf_metadata_free(struct bpf_metadata *metadata);
 #else
 static inline int machine__process_bpf(struct machine *machine __maybe_unused,
 				       union perf_event *event __maybe_unused,
@@ -55,6 +63,11 @@ static inline void __bpf_event__print_bpf_prog_info(struct bpf_prog_info *info _
 						    FILE *fp __maybe_unused)
 {
 
+}
+
+static inline void bpf_metadata_free(struct bpf_metadata *metadata)
+{
+
 }
 #endif // HAVE_LIBBPF_SUPPORT
 #endif
-- 
2.49.0.1143.g0be31eac6b-goog


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event
  2025-05-21 22:27 [PATCH 0/3] perf: generate events for BPF metadata Blake Jones
  2025-05-21 22:27 ` [PATCH 1/3] perf: add support for printing BTF character arrays as strings Blake Jones
  2025-05-21 22:27 ` [PATCH 2/3] perf: collect BPF metadata from existing BPF programs Blake Jones
@ 2025-05-21 22:27 ` Blake Jones
  2025-05-29 18:12   ` Ian Rogers
  2 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-05-21 22:27 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Ian Rogers,
	Adrian Hunter, Kan Liang
  Cc: Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users,
	Blake Jones

This collects metadata for any BPF programs that were loaded during a
"perf record" run, and emits it at the end of the run. It also adds support
for displaying the new PERF_RECORD_BPF_METADATA type.

Here's some example "perf script -D" output for the new event type. The
": unhandled!" message is from tool.c, analogous to other behavior there.
I've elided some rows with all NUL characters for brevity, and I wrapped
one of the >75-column lines to fit in the commit guidelines.

0x50fc8@perf.data [0x260]: event: 84
.
. ... raw event: size 608 bytes
.  0000:  54 00 00 00 00 00 60 02 62 70 66 5f 70 72 6f 67  T.....`.bpf_prog
.  0010:  5f 31 65 30 61 32 65 33 36 36 65 35 36 66 31 61  _1e0a2e366e56f1a
.  0020:  32 5f 70 65 72 66 5f 73 61 6d 70 6c 65 5f 66 69  2_perf_sample_fi
.  0030:  6c 74 65 72 00 00 00 00 00 00 00 00 00 00 00 00  lter............
.  0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[...]
.  0110:  74 65 73 74 5f 76 61 6c 75 65 00 00 00 00 00 00  test_value......
.  0120:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[...]
.  0150:  34 32 00 00 00 00 00 00 00 00 00 00 00 00 00 00  42..............
.  0160:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[...]

0 0x50fc8 [0x260]: PERF_RECORD_BPF_METADATA \
      prog bpf_prog_1e0a2e366e56f1a2_perf_sample_filter
  entry 0:           test_value = 42
: unhandled!

Signed-off-by: Blake Jones <blakejones@google.com>
---
 tools/perf/builtin-inject.c                  |  1 +
 tools/perf/builtin-record.c                  |  8 +++
 tools/perf/builtin-script.c                  | 15 ++++-
 tools/perf/tests/shell/test_bpf_metadata.sh  | 67 ++++++++++++++++++++
 tools/perf/util/bpf-event.c                  | 46 ++++++++++++++
 tools/perf/util/bpf-event.h                  |  1 +
 tools/perf/util/bpf_skel/sample_filter.bpf.c |  4 ++
 tools/perf/util/env.c                        | 19 +++++-
 tools/perf/util/env.h                        |  4 ++
 tools/perf/util/event.c                      | 21 ++++++
 tools/perf/util/event.h                      |  1 +
 tools/perf/util/header.c                     |  1 +
 tools/perf/util/session.c                    |  4 ++
 tools/perf/util/synthetic-events.h           |  2 +
 tools/perf/util/tool.c                       | 14 ++++
 tools/perf/util/tool.h                       |  3 +-
 16 files changed, 207 insertions(+), 4 deletions(-)
 create mode 100644 tools/perf/tests/shell/test_bpf_metadata.sh

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 11e49cafa3af..b15eac0716f7 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -2530,6 +2530,7 @@ int cmd_inject(int argc, const char **argv)
 	inject.tool.finished_init	= perf_event__repipe_op2_synth;
 	inject.tool.compressed		= perf_event__repipe_op4_synth;
 	inject.tool.auxtrace		= perf_event__repipe_auxtrace;
+	inject.tool.bpf_metadata	= perf_event__repipe_op2_synth;
 	inject.tool.dont_split_sample_group = true;
 	inject.session = __perf_session__new(&data, &inject.tool,
 					     /*trace_event_repipe=*/inject.output.is_pipe);
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 136c0172799a..067e203f57c2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2161,6 +2161,12 @@ static int record__synthesize(struct record *rec, bool tail)
 	return err;
 }
 
+static void record__synthesize_final_bpf_metadata(struct record *rec)
+{
+	perf_event__synthesize_final_bpf_metadata(rec->session,
+						  process_synthesized_event);
+}
+
 static int record__process_signal_event(union perf_event *event __maybe_unused, void *data)
 {
 	struct record *rec = data;
@@ -2806,6 +2812,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	trigger_off(&auxtrace_snapshot_trigger);
 	trigger_off(&switch_output_trigger);
 
+	record__synthesize_final_bpf_metadata(rec);
+
 	if (opts->auxtrace_snapshot_on_exit)
 		record__auxtrace_snapshot_exit(rec);
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 6c3bf74dd78c..4001e621b6cb 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -38,6 +38,7 @@
 #include "print_insn.h"
 #include "archinsn.h"
 #include <linux/bitmap.h>
+#include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/stringify.h>
 #include <linux/time64.h>
@@ -50,6 +51,7 @@
 #include <errno.h>
 #include <inttypes.h>
 #include <signal.h>
+#include <stdio.h>
 #include <sys/param.h>
 #include <sys/types.h>
 #include <sys/stat.h>
@@ -2755,6 +2757,14 @@ process_bpf_events(const struct perf_tool *tool __maybe_unused,
 			   sample->tid);
 }
 
+static int
+process_bpf_metadata_event(struct perf_session *session __maybe_unused,
+			   union perf_event *event)
+{
+	perf_event__fprintf(event, NULL, stdout);
+	return 0;
+}
+
 static int process_text_poke_events(const struct perf_tool *tool,
 				    union perf_event *event,
 				    struct perf_sample *sample,
@@ -2877,8 +2887,9 @@ static int __cmd_script(struct perf_script *script)
 		script->tool.finished_round = process_finished_round_event;
 	}
 	if (script->show_bpf_events) {
-		script->tool.ksymbol = process_bpf_events;
-		script->tool.bpf     = process_bpf_events;
+		script->tool.ksymbol	  = process_bpf_events;
+		script->tool.bpf	  = process_bpf_events;
+		script->tool.bpf_metadata = process_bpf_metadata_event;
 	}
 	if (script->show_text_poke_events) {
 		script->tool.ksymbol   = process_bpf_events;
diff --git a/tools/perf/tests/shell/test_bpf_metadata.sh b/tools/perf/tests/shell/test_bpf_metadata.sh
new file mode 100644
index 000000000000..ede31d5a3c36
--- /dev/null
+++ b/tools/perf/tests/shell/test_bpf_metadata.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# BPF metadata collection test.
+
+set -e
+
+err=0
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX)
+
+cleanup() {
+	rm -f "${perfdata}"
+	rm -f "${perfdata}".old
+	trap - EXIT TERM INT
+}
+
+trap_cleanup() {
+	cleanup
+	exit 1
+}
+trap trap_cleanup EXIT TERM INT
+
+test_bpf_metadata() {
+	echo "Checking BPF metadata collection"
+
+	# This is a basic invocation of perf record
+	# that invokes the perf_sample_filter BPF program.
+	if ! perf record -e task-clock --filter 'ip > 0' \
+			 -o "${perfdata}" sleep 1 2> /dev/null
+	then
+		echo "Basic BPF metadata test [Failed record]"
+		err=1
+		return
+	fi
+
+	# The perf_sample_filter BPF program has the following variable in it:
+	#
+	#   volatile const int bpf_metadata_test_value SEC(".rodata") = 42;
+	#
+	# This invocation looks for a PERF_RECORD_BPF_METADATA event,
+	# and checks that its content includes something for the above variable.
+	if ! perf script --show-bpf-events -i "${perfdata}" | awk '
+		/PERF_RECORD_BPF_METADATA.*perf_sample_filter/ {
+			header = 1;
+		}
+		/^ *entry/ {
+			if (header) { header = 0; entry = 1; }
+		}
+		$0 !~ /^ *entry/ {
+			entry = 0;
+		}
+		/test_value/ {
+			if (entry) print $NF;
+		}
+	' | grep 42 > /dev/null
+	then
+		echo "Basic BPF metadata test [Failed invalid output]"
+		err=1
+		return
+	fi
+	echo "Basic BPF metadata test [Success]"
+}
+
+test_bpf_metadata
+
+cleanup
+exit $err
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 36d5676f025e..019832ec4802 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -450,6 +450,49 @@ static int synthesize_perf_record_bpf_metadata(const struct bpf_metadata *metada
 	return err;
 }
 
+struct bpf_metadata_final_ctx {
+	const struct perf_tool *tool;
+	perf_event__handler_t process;
+	struct machine *machine;
+};
+
+static void synthesize_final_bpf_metadata_cb(struct bpf_prog_info_node *node,
+					     void *data)
+{
+	struct bpf_metadata_final_ctx *ctx = (struct bpf_metadata_final_ctx *)data;
+	struct bpf_metadata *metadata = node->metadata;
+	int err;
+
+	if (metadata == NULL)
+		return;
+	err = synthesize_perf_record_bpf_metadata(metadata, ctx->tool,
+						  ctx->process, ctx->machine);
+	if (err != 0) {
+		const char *prog_name = metadata->prog_names[0];
+
+		if (prog_name != NULL)
+			pr_warning("Couldn't synthesize final BPF metadata for %s.\n", prog_name);
+		else
+			pr_warning("Couldn't synthesize final BPF metadata.\n");
+	}
+	bpf_metadata_free(metadata);
+	node->metadata = NULL;
+}
+
+void perf_event__synthesize_final_bpf_metadata(struct perf_session *session,
+					       perf_event__handler_t process)
+{
+	struct perf_env *env = &session->header.env;
+	struct bpf_metadata_final_ctx ctx = {
+		.tool = session->tool,
+		.process = process,
+		.machine = &session->machines.host,
+	};
+
+	perf_env__iterate_bpf_prog_info(env, synthesize_final_bpf_metadata_cb,
+					&ctx);
+}
+
 /*
  * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
  * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
@@ -590,6 +633,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 		}
 
 		info_node->info_linear = info_linear;
+		info_node->metadata = NULL;
 		if (!perf_env__insert_bpf_prog_info(env, info_node)) {
 			free(info_linear);
 			free(info_node);
@@ -781,6 +825,7 @@ static void perf_env__add_bpf_info(struct perf_env *env, u32 id)
 	arrays |= 1UL << PERF_BPIL_JITED_INSNS;
 	arrays |= 1UL << PERF_BPIL_LINE_INFO;
 	arrays |= 1UL << PERF_BPIL_JITED_LINE_INFO;
+	arrays |= 1UL << PERF_BPIL_MAP_IDS;
 
 	info_linear = get_bpf_prog_info_linear(fd, arrays);
 	if (IS_ERR_OR_NULL(info_linear)) {
@@ -793,6 +838,7 @@ static void perf_env__add_bpf_info(struct perf_env *env, u32 id)
 	info_node = malloc(sizeof(struct bpf_prog_info_node));
 	if (info_node) {
 		info_node->info_linear = info_linear;
+		info_node->metadata = bpf_metadata_create(&info_linear->info);
 		if (!perf_env__insert_bpf_prog_info(env, info_node)) {
 			free(info_linear);
 			free(info_node);
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
index 007f0b4d21cb..49a42c15bcc6 100644
--- a/tools/perf/util/bpf-event.h
+++ b/tools/perf/util/bpf-event.h
@@ -26,6 +26,7 @@ struct bpf_metadata {
 
 struct bpf_prog_info_node {
 	struct perf_bpil		*info_linear;
+	struct bpf_metadata		*metadata;
 	struct rb_node			rb_node;
 };
 
diff --git a/tools/perf/util/bpf_skel/sample_filter.bpf.c b/tools/perf/util/bpf_skel/sample_filter.bpf.c
index b195e6efeb8b..b0265f000325 100644
--- a/tools/perf/util/bpf_skel/sample_filter.bpf.c
+++ b/tools/perf/util/bpf_skel/sample_filter.bpf.c
@@ -43,6 +43,10 @@ struct lost_count {
 	__uint(max_entries, 1);
 } dropped SEC(".maps");
 
+// This is used by tests/shell/record_bpf_metadata.sh
+// to verify that BPF metadata generation works.
+const int bpf_metadata_test_value SEC(".rodata") = 42;
+
 volatile const int use_idx_hash;
 
 void *bpf_cast_to_kern_ctx(void *) __ksym;
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 36411749e007..05a4f2657d72 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -3,8 +3,10 @@
 #include "debug.h"
 #include "env.h"
 #include "util/header.h"
-#include "linux/compiler.h"
+#include "util/rwsem.h"
+#include <linux/compiler.h>
 #include <linux/ctype.h>
+#include <linux/rbtree.h>
 #include <linux/string.h>
 #include <linux/zalloc.h>
 #include "cgroup.h"
@@ -89,6 +91,20 @@ struct bpf_prog_info_node *perf_env__find_bpf_prog_info(struct perf_env *env,
 	return node;
 }
 
+void perf_env__iterate_bpf_prog_info(struct perf_env *env,
+				     void (*cb)(struct bpf_prog_info_node *node,
+						void *data),
+				     void *data)
+{
+	struct rb_node *first;
+
+	down_read(&env->bpf_progs.lock);
+	first = rb_first(&env->bpf_progs.infos);
+	for (struct rb_node *node = first; node != NULL; node = rb_next(node))
+		(*cb)(rb_entry(node, struct bpf_prog_info_node, rb_node), data);
+	up_read(&env->bpf_progs.lock);
+}
+
 bool perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node)
 {
 	bool ret;
@@ -174,6 +190,7 @@ static void perf_env__purge_bpf(struct perf_env *env)
 		next = rb_next(&node->rb_node);
 		rb_erase(&node->rb_node, root);
 		zfree(&node->info_linear);
+		bpf_metadata_free(node->metadata);
 		free(node);
 	}
 
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index d90e343cf1fa..6819cb9b99ff 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -180,6 +180,10 @@ bool perf_env__insert_bpf_prog_info(struct perf_env *env,
 				    struct bpf_prog_info_node *info_node);
 struct bpf_prog_info_node *perf_env__find_bpf_prog_info(struct perf_env *env,
 							__u32 prog_id);
+void perf_env__iterate_bpf_prog_info(struct perf_env *env,
+				     void (*cb)(struct bpf_prog_info_node *node,
+						void *data),
+				     void *data);
 bool perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
 bool __perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
 struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 80c9ea682413..e81c2d87d76a 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -1,9 +1,12 @@
 #include <errno.h>
 #include <fcntl.h>
 #include <inttypes.h>
+#include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <perf/cpumap.h>
+#include <perf/event.h>
+#include <stdio.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>
@@ -78,6 +81,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_COMPRESSED]		= "COMPRESSED",
 	[PERF_RECORD_FINISHED_INIT]		= "FINISHED_INIT",
 	[PERF_RECORD_COMPRESSED2]		= "COMPRESSED2",
+	[PERF_RECORD_BPF_METADATA]		= "BPF_METADATA",
 };
 
 const char *perf_event__name(unsigned int id)
@@ -504,6 +508,20 @@ size_t perf_event__fprintf_bpf(union perf_event *event, FILE *fp)
 		       event->bpf.type, event->bpf.flags, event->bpf.id);
 }
 
+size_t perf_event__fprintf_bpf_metadata(union perf_event *event, FILE *fp)
+{
+	struct perf_record_bpf_metadata *metadata = &event->bpf_metadata;
+	size_t ret;
+
+	ret = fprintf(fp, " prog %s\n", metadata->prog_name);
+	for (__u32 i = 0; i < metadata->nr_entries; i++) {
+		ret += fprintf(fp, "  entry %d: %20s = %s\n", i,
+			       metadata->entries[i].key,
+			       metadata->entries[i].value);
+	}
+	return ret;
+}
+
 static int text_poke_printer(enum binary_printer_ops op, unsigned int val,
 			     void *extra, FILE *fp)
 {
@@ -601,6 +619,9 @@ size_t perf_event__fprintf(union perf_event *event, struct machine *machine, FIL
 	case PERF_RECORD_AUX_OUTPUT_HW_ID:
 		ret += perf_event__fprintf_aux_output_hw_id(event, fp);
 		break;
+	case PERF_RECORD_BPF_METADATA:
+		ret += perf_event__fprintf_bpf_metadata(event, fp);
+		break;
 	default:
 		ret += fprintf(fp, "\n");
 	}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 664bf39567ce..67ad4a2014bc 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -370,6 +370,7 @@ size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_cgroup(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_bpf(union perf_event *event, FILE *fp);
+size_t perf_event__fprintf_bpf_metadata(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_text_poke(union perf_event *event, struct machine *machine,FILE *fp);
 size_t perf_event__fprintf(union perf_event *event, struct machine *machine, FILE *fp);
 
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index e3cdc3b7b4ab..7c477e2a93b3 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3161,6 +3161,7 @@ static int process_bpf_prog_info(struct feat_fd *ff, void *data __maybe_unused)
 		/* after reading from file, translate offset to address */
 		bpil_offs_to_addr(info_linear);
 		info_node->info_linear = info_linear;
+		info_node->metadata = NULL;
 		if (!__perf_env__insert_bpf_prog_info(env, info_node)) {
 			free(info_linear);
 			free(info_node);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index a320672c264e..38075059086c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -12,6 +12,7 @@
 #include <sys/types.h>
 #include <sys/mman.h>
 #include <perf/cpumap.h>
+#include <perf/event.h>
 
 #include "map_symbol.h"
 #include "branch.h"
@@ -1491,6 +1492,9 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	case PERF_RECORD_FINISHED_INIT:
 		err = tool->finished_init(session, event);
 		break;
+	case PERF_RECORD_BPF_METADATA:
+		err = tool->bpf_metadata(session, event);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
index b9c936b5cfeb..ee29615d68e5 100644
--- a/tools/perf/util/synthetic-events.h
+++ b/tools/perf/util/synthetic-events.h
@@ -92,6 +92,8 @@ int perf_event__synthesize_threads(const struct perf_tool *tool, perf_event__han
 int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, struct evlist *evlist, perf_event__handler_t process);
 int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc, const struct perf_tool *tool, perf_event__handler_t process, struct machine *machine);
 pid_t perf_event__synthesize_comm(const struct perf_tool *tool, union perf_event *event, pid_t pid, perf_event__handler_t process, struct machine *machine);
+void perf_event__synthesize_final_bpf_metadata(struct perf_session *session,
+					       perf_event__handler_t process);
 
 int perf_tool__process_synth_event(const struct perf_tool *tool, union perf_event *event, struct machine *machine, perf_event__handler_t process);
 
diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
index 37bd8ac63b01..204ec03071bc 100644
--- a/tools/perf/util/tool.c
+++ b/tools/perf/util/tool.c
@@ -1,12 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "data.h"
 #include "debug.h"
+#include "event.h"
 #include "header.h"
 #include "session.h"
 #include "stat.h"
 #include "tool.h"
 #include "tsc.h"
+#include <linux/compiler.h>
 #include <sys/mman.h>
+#include <stddef.h>
 #include <unistd.h>
 
 #ifdef HAVE_ZSTD_SUPPORT
@@ -237,6 +240,16 @@ static int perf_session__process_compressed_event_stub(struct perf_session *sess
 	return 0;
 }
 
+static int perf_event__process_bpf_metadata_stub(struct perf_session *perf_session __maybe_unused,
+						 union perf_event *event)
+{
+	if (dump_trace)
+		perf_event__fprintf_bpf_metadata(event, stdout);
+
+	dump_printf(": unhandled!\n");
+	return 0;
+}
+
 void perf_tool__init(struct perf_tool *tool, bool ordered_events)
 {
 	tool->ordered_events = ordered_events;
@@ -293,6 +306,7 @@ void perf_tool__init(struct perf_tool *tool, bool ordered_events)
 	tool->compressed = perf_session__process_compressed_event_stub;
 #endif
 	tool->finished_init = process_event_op2_stub;
+	tool->bpf_metadata = perf_event__process_bpf_metadata_stub;
 }
 
 bool perf_tool__compressed_is_stub(const struct perf_tool *tool)
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index db1c7642b0d1..18b76ff0f26a 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -77,7 +77,8 @@ struct perf_tool {
 			stat,
 			stat_round,
 			feature,
-			finished_init;
+			finished_init,
+			bpf_metadata;
 	event_op4	compressed;
 	event_op3	auxtrace;
 	bool		ordered_events;
-- 
2.49.0.1143.g0be31eac6b-goog


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-05-21 22:27 ` [PATCH 1/3] perf: add support for printing BTF character arrays as strings Blake Jones
@ 2025-05-22 17:42   ` Arnaldo Carvalho de Melo
  2025-05-22 18:19     ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-05-22 17:42 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Ian Rogers, Adrian Hunter, Kan Liang,
	Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users

On Wed, May 21, 2025 at 03:27:23PM -0700, Blake Jones wrote:
> The BTF dumper code currently displays arrays of characters as just that -
> arrays, with each character formatted individually. Sometimes this is what
> makes sense, but it's nice to be able to treat that array as a string.
> 
> This change adds a special case to the btf_dump functionality to allow
> arrays of single-byte integer values to be printed as character strings.
> Characters for which isprint() returns false are printed as hex-escaped
> values. This is enabled when the new ".print_strings" is set to 1 in the
> btf_dump_type_data_opts structure.
> 
> As an example, here's what it looks like to dump the string "hello" using
> a few different field values for btf_dump_type_data_opts (.compact = 1):
> 
> - .print_strings = 0, .skip_names = 0:  (char[6])['h','e','l','l','o',]
> - .print_strings = 0, .skip_names = 1:  ['h','e','l','l','o',]
> - .print_strings = 1, .skip_names = 0:  (char[6])"hello"
> - .print_strings = 1, .skip_names = 1:  "hello"
> 
> Here's the string "h\xff", dumped with .compact = 1 and .skip_names = 1:
> 
> - .print_strings = 0:  ['h',-1,]
> - .print_strings = 1:  "h\xff"

I'll test this but unsure if this part should go thru the perf tool
tree, perhaps should go, together with some test case, via the libbpf
tree?

- Arnaldo
 
> Signed-off-by: Blake Jones <blakejones@google.com>
> ---
>  tools/lib/bpf/btf.h      |  3 ++-
>  tools/lib/bpf/btf_dump.c | 51 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 4392451d634b..be8e8e26d245 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -326,9 +326,10 @@ struct btf_dump_type_data_opts {
>  	bool compact;		/* no newlines/indentation */
>  	bool skip_names;	/* skip member/type names */
>  	bool emit_zeroes;	/* show 0-valued fields */
> +	bool print_strings;	/* print char arrays as strings */
>  	size_t :0;
>  };
> -#define btf_dump_type_data_opts__last_field emit_zeroes
> +#define btf_dump_type_data_opts__last_field print_strings
>  
>  LIBBPF_API int
>  btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
> diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c
> index 460c3e57fadb..a07dd5accdd8 100644
> --- a/tools/lib/bpf/btf_dump.c
> +++ b/tools/lib/bpf/btf_dump.c
> @@ -75,6 +75,7 @@ struct btf_dump_data {
>  	bool is_array_member;
>  	bool is_array_terminated;
>  	bool is_array_char;
> +	bool print_strings;
>  };
>  
>  struct btf_dump {
> @@ -2028,6 +2029,50 @@ static int btf_dump_var_data(struct btf_dump *d,
>  	return btf_dump_dump_type_data(d, NULL, t, type_id, data, 0, 0);
>  }
>  
> +static int btf_dump_string_data(struct btf_dump *d,
> +				const struct btf_type *t,
> +				__u32 id,
> +				const void *data)
> +{
> +	const struct btf_array *array = btf_array(t);
> +	__u32 i;
> +
> +	if (!btf_is_int(skip_mods_and_typedefs(d->btf, array->type, NULL)) ||
> +	    btf__resolve_size(d->btf, array->type) != 1 ||
> +	    !d->typed_dump->print_strings) {
> +		pr_warn("unexpected %s() call for array type %u\n",
> +			__func__, array->type);
> +		return -EINVAL;
> +	}
> +
> +	btf_dump_data_pfx(d);
> +	btf_dump_printf(d, "\"");
> +
> +	for (i = 0; i < array->nelems; i++, data++) {
> +		char c;
> +
> +		if (data >= d->typed_dump->data_end)
> +			return -E2BIG;
> +
> +		c = *(char *)data;
> +		if (c == '\0') {
> +			/* When printing character arrays as strings, NUL bytes
> +			 * are always treated as string terminators; they are
> +			 * never printed.
> +			 */
> +			break;
> +		}
> +		if (isprint(c))
> +			btf_dump_printf(d, "%c", c);
> +		else
> +			btf_dump_printf(d, "\\x%02x", *(__u8 *)data);
> +	}
> +
> +	btf_dump_printf(d, "\"");
> +
> +	return 0;
> +}
> +
>  static int btf_dump_array_data(struct btf_dump *d,
>  			       const struct btf_type *t,
>  			       __u32 id,
> @@ -2055,8 +2100,11 @@ static int btf_dump_array_data(struct btf_dump *d,
>  		 * char arrays, so if size is 1 and element is
>  		 * printable as a char, we'll do that.
>  		 */
> -		if (elem_size == 1)
> +		if (elem_size == 1) {
> +			if (d->typed_dump->print_strings)
> +				return btf_dump_string_data(d, t, id, data);
>  			d->typed_dump->is_array_char = true;
> +		}
>  	}
>  
>  	/* note that we increment depth before calling btf_dump_print() below;
> @@ -2544,6 +2592,7 @@ int btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
>  	d->typed_dump->compact = OPTS_GET(opts, compact, false);
>  	d->typed_dump->skip_names = OPTS_GET(opts, skip_names, false);
>  	d->typed_dump->emit_zeroes = OPTS_GET(opts, emit_zeroes, false);
> +	d->typed_dump->print_strings = OPTS_GET(opts, print_strings, false);
>  
>  	ret = btf_dump_dump_type_data(d, NULL, t, id, data, 0, 0);
>  
> -- 
> 2.49.0.1143.g0be31eac6b-goog

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-05-22 17:42   ` Arnaldo Carvalho de Melo
@ 2025-05-22 18:19     ` Blake Jones
  2025-05-29  0:58       ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-05-22 18:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Ian Rogers, Adrian Hunter, Kan Liang,
	Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users

Hi Arnaldo,

On Thu, May 22, 2025 at 1:42 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
> I'll test this but unsure if this part should go thru the perf tool
> tree, perhaps should go, together with some test case, via the libbpf
> tree?

Thanks for taking a look at this. I'd appreciate your guidance here - I
sent it here because the other two patches in my patch set depend on this
one.

Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-05-22 18:19     ` Blake Jones
@ 2025-05-29  0:58       ` Blake Jones
  2025-05-30 17:40         ` Namhyung Kim
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-05-29  0:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Ian Rogers, Adrian Hunter, Kan Liang,
	Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users

Hi Arnaldo,

On Thu, May 22, 2025 at 11:19 AM Blake Jones <blakejones@google.com> wrote:
> On Thu, May 22, 2025 at 1:42 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> > I'll test this but unsure if this part should go thru the perf tool
> > tree, perhaps should go, together with some test case, via the libbpf
> > tree?
>
> Thanks for taking a look at this. I'd appreciate your guidance here - I
> sent it here because the other two patches in my patch set depend on this
> one.

Do you think this can be merged through the perf tree, or should I separate
this patch and send it though the BPF tree first?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-05-21 22:27 ` [PATCH 2/3] perf: collect BPF metadata from existing BPF programs Blake Jones
@ 2025-05-29 17:47   ` Ian Rogers
  2025-05-29 23:21     ` Blake Jones
  2025-06-03 20:15   ` Namhyung Kim
  1 sibling, 1 reply; 28+ messages in thread
From: Ian Rogers @ 2025-05-29 17:47 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Wed, May 21, 2025 at 3:27 PM Blake Jones <blakejones@google.com> wrote:
>
> Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract
> their values as strings, and create a new PERF_RECORD_BPF_METADATA
> synthetic event using that data. The code gets invoked from the existing
> routine perf_event__synthesize_one_bpf_prog().
>
> Signed-off-by: Blake Jones <blakejones@google.com>

Reviewed-by: Ian Rogers <irogers@google.com>

> ---
>  tools/lib/perf/include/perf/event.h |  18 ++
>  tools/perf/util/bpf-event.c         | 310 ++++++++++++++++++++++++++++
>  tools/perf/util/bpf-event.h         |  13 ++
>  3 files changed, 341 insertions(+)
>
> diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
> index 09b7c643ddac..6608f1e3701b 100644
> --- a/tools/lib/perf/include/perf/event.h
> +++ b/tools/lib/perf/include/perf/event.h
> @@ -467,6 +467,22 @@ struct perf_record_compressed2 {
>         char                     data[];
>  };
>
> +#define BPF_METADATA_KEY_LEN   64
> +#define BPF_METADATA_VALUE_LEN 256
> +#define BPF_PROG_NAME_LEN      KSYM_NAME_LEN
> +
> +struct perf_record_bpf_metadata_entry {
> +       char key[BPF_METADATA_KEY_LEN];
> +       char value[BPF_METADATA_VALUE_LEN];
> +};
> +
> +struct perf_record_bpf_metadata {
> +       struct perf_event_header              header;
> +       char                                  prog_name[BPF_PROG_NAME_LEN];
> +       __u64                                 nr_entries;
> +       struct perf_record_bpf_metadata_entry entries[];
> +};
> +
>  enum perf_user_event_type { /* above any possible kernel type */
>         PERF_RECORD_USER_TYPE_START             = 64,
>         PERF_RECORD_HEADER_ATTR                 = 64,
> @@ -489,6 +505,7 @@ enum perf_user_event_type { /* above any possible kernel type */
>         PERF_RECORD_COMPRESSED                  = 81,
>         PERF_RECORD_FINISHED_INIT               = 82,
>         PERF_RECORD_COMPRESSED2                 = 83,
> +       PERF_RECORD_BPF_METADATA                = 84,
>         PERF_RECORD_HEADER_MAX
>  };
>
> @@ -530,6 +547,7 @@ union perf_event {
>         struct perf_record_header_feature       feat;
>         struct perf_record_compressed           pack;
>         struct perf_record_compressed2          pack2;
> +       struct perf_record_bpf_metadata         bpf_metadata;
>  };
>
>  #endif /* __LIBPERF_EVENT_H */
> diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
> index c81444059ad0..36d5676f025e 100644
> --- a/tools/perf/util/bpf-event.c
> +++ b/tools/perf/util/bpf-event.c
> @@ -1,13 +1,20 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include <errno.h>
> +#include <stddef.h>
> +#include <stdint.h>
> +#include <stdio.h>
>  #include <stdlib.h>
> +#include <string.h>
>  #include <bpf/bpf.h>
>  #include <bpf/btf.h>
>  #include <bpf/libbpf.h>
> +#include <linux/bpf.h>
>  #include <linux/btf.h>
>  #include <linux/err.h>
> +#include <linux/perf_event.h>
>  #include <linux/string.h>
>  #include <internal/lib.h>
> +#include <perf/event.h>
>  #include <symbol/kallsyms.h>
>  #include "bpf-event.h"
>  #include "bpf-utils.h"
> @@ -151,6 +158,298 @@ static int synthesize_bpf_prog_name(char *buf, int size,
>         return name_len;
>  }
>
> +#define BPF_METADATA_PREFIX "bpf_metadata_"
> +#define BPF_METADATA_PREFIX_LEN (sizeof(BPF_METADATA_PREFIX) - 1)
> +
> +static bool name_has_bpf_metadata_prefix(const char **s)
> +{
> +       if (strncmp(*s, BPF_METADATA_PREFIX, BPF_METADATA_PREFIX_LEN) != 0)
> +               return false;
> +       *s += BPF_METADATA_PREFIX_LEN;
> +       return true;
> +}
> +
> +struct bpf_metadata_map {
> +       struct btf *btf;
> +       const struct btf_type *datasec;
> +       void *rodata;
> +       size_t rodata_size;
> +       unsigned int num_vars;
> +};
> +
> +static int bpf_metadata_read_map_data(__u32 map_id, struct bpf_metadata_map *map)
> +{
> +       int map_fd;
> +       struct bpf_map_info map_info;
> +       __u32 map_info_len;
> +       int key;
> +       struct btf *btf;
> +       const struct btf_type *datasec;
> +       struct btf_var_secinfo *vsi;
> +       unsigned int vlen, vars;
> +       void *rodata;
> +
> +       map_fd = bpf_map_get_fd_by_id(map_id);
> +       if (map_fd < 0)
> +               return -1;
> +
> +       memset(&map_info, 0, sizeof(map_info));
> +       map_info_len = sizeof(map_info);
> +       if (bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len) < 0)
> +               goto out_close;
> +
> +       /* If it's not an .rodata map, don't bother. */
> +       if (map_info.type != BPF_MAP_TYPE_ARRAY ||
> +           map_info.key_size != sizeof(int) ||
> +           map_info.max_entries != 1 ||
> +           !map_info.btf_value_type_id ||
> +           !strstr(map_info.name, ".rodata")) {
> +               goto out_close;
> +       }
> +
> +       btf = btf__load_from_kernel_by_id(map_info.btf_id);
> +       if (!btf)
> +               goto out_close;
> +       datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
> +       if (!btf_is_datasec(datasec))
> +               goto out_free_btf;
> +
> +       /* If there aren't any variables with the "bpf_metadata_" prefix,
> +        * don't bother.
> +        */
> +       vlen = btf_vlen(datasec);
> +       vsi = btf_var_secinfos(datasec);
> +       vars = 0;
> +       for (unsigned int i = 0; i < vlen; i++, vsi++) {
> +               const struct btf_type *t_var = btf__type_by_id(btf, vsi->type);
> +               const char *name = btf__name_by_offset(btf, t_var->name_off);
> +
> +               if (name_has_bpf_metadata_prefix(&name))
> +                       vars++;
> +       }
> +       if (vars == 0)
> +               goto out_free_btf;
> +
> +       rodata = calloc(1, map_info.value_size);
> +       if (!rodata)
> +               goto out_free_btf;
> +       key = 0;
> +       if (bpf_map_lookup_elem(map_fd, &key, rodata)) {
> +               free(rodata);
> +               goto out_free_btf;
> +       }
> +       close(map_fd);
> +
> +       map->btf = btf;
> +       map->datasec = datasec;
> +       map->rodata = rodata;
> +       map->rodata_size = map_info.value_size;
> +       map->num_vars = vars;
> +       return 0;
> +
> +out_free_btf:
> +       btf__free(btf);
> +out_close:
> +       close(map_fd);
> +       return -1;
> +}
> +
> +struct format_btf_ctx {
> +       char *buf;
> +       size_t buf_size;
> +       size_t buf_idx;
> +};
> +
> +static void format_btf_cb(void *arg, const char *fmt, va_list ap)
> +{
> +       int n;
> +       struct format_btf_ctx *ctx = (struct format_btf_ctx *)arg;
> +
> +       n = vsnprintf(ctx->buf + ctx->buf_idx, ctx->buf_size - ctx->buf_idx,
> +                     fmt, ap);
> +       ctx->buf_idx += n;
> +       if (ctx->buf_idx >= ctx->buf_size)
> +               ctx->buf_idx = ctx->buf_size;
> +}
> +
> +static void format_btf_variable(struct btf *btf, char *buf, size_t buf_size,
> +                               const struct btf_type *t, const void *btf_data)
> +{
> +       struct format_btf_ctx ctx = {
> +               .buf = buf,
> +               .buf_idx = 0,
> +               .buf_size = buf_size,
> +       };
> +       const struct btf_dump_type_data_opts opts = {
> +               .sz = sizeof(struct btf_dump_type_data_opts),
> +               .skip_names = 1,
> +               .compact = 1,
> +               .print_strings = 1,
> +       };
> +       struct btf_dump *d;
> +       size_t btf_size;
> +
> +       d = btf_dump__new(btf, format_btf_cb, &ctx, NULL);
> +       btf_size = btf__resolve_size(btf, t->type);
> +       btf_dump__dump_type_data(d, t->type, btf_data, btf_size, &opts);
> +       btf_dump__free(d);
> +}
> +
> +static void bpf_metadata_fill_event(struct bpf_metadata_map *map,
> +                                   struct perf_record_bpf_metadata *bpf_metadata_event)
> +{
> +       struct btf_var_secinfo *vsi;
> +       unsigned int i, vlen;
> +
> +       memset(bpf_metadata_event->prog_name, 0, BPF_PROG_NAME_LEN);
> +       vlen = btf_vlen(map->datasec);
> +       vsi = btf_var_secinfos(map->datasec);
> +
> +       for (i = 0; i < vlen; i++, vsi++) {
> +               const struct btf_type *t_var = btf__type_by_id(map->btf,
> +                                                              vsi->type);
> +               const char *name = btf__name_by_offset(map->btf,
> +                                                      t_var->name_off);
> +               const __u64 nr_entries = bpf_metadata_event->nr_entries;
> +               struct perf_record_bpf_metadata_entry *entry;
> +
> +               if (!name_has_bpf_metadata_prefix(&name))
> +                       continue;
> +
> +               if (nr_entries >= (__u64)map->num_vars)
> +                       break;
> +
> +               entry = &bpf_metadata_event->entries[nr_entries];
> +               memset(entry, 0, sizeof(*entry));
> +               snprintf(entry->key, BPF_METADATA_KEY_LEN, "%s", name);
> +               format_btf_variable(map->btf, entry->value,
> +                                   BPF_METADATA_VALUE_LEN, t_var,
> +                                   map->rodata + vsi->offset);
> +               bpf_metadata_event->nr_entries++;
> +       }
> +}
> +
> +static void bpf_metadata_free_map_data(struct bpf_metadata_map *map)
> +{
> +       btf__free(map->btf);
> +       free(map->rodata);
> +}
> +
> +void bpf_metadata_free(struct bpf_metadata *metadata)
> +{
> +       if (metadata == NULL)
> +               return;
> +       for (__u32 index = 0; index < metadata->nr_prog_names; index++)
> +               free(metadata->prog_names[index]);
> +       if (metadata->prog_names != NULL)
> +               free(metadata->prog_names);
> +       if (metadata->event != NULL)
> +               free(metadata->event);
> +       free(metadata);
> +}
> +
> +static struct bpf_metadata *bpf_metadata_alloc(__u32 nr_prog_tags,
> +                                              __u32 nr_variables)
> +{
> +       struct bpf_metadata *metadata;
> +
> +       metadata = calloc(1, sizeof(struct bpf_metadata));
> +       if (!metadata)
> +               return NULL;
> +
> +       metadata->prog_names = calloc(nr_prog_tags, sizeof(char *));
> +       if (!metadata->prog_names) {
> +               bpf_metadata_free(metadata);
> +               return NULL;
> +       }
> +       for (__u32 prog_index = 0; prog_index < nr_prog_tags; prog_index++) {
> +               metadata->prog_names[prog_index] = calloc(BPF_PROG_NAME_LEN,
> +                                                         sizeof(char));
> +               if (!metadata->prog_names[prog_index]) {
> +                       bpf_metadata_free(metadata);
> +                       return NULL;
> +               }
> +               metadata->nr_prog_names++;
> +       }
> +
> +       metadata->event_size = sizeof(metadata->event->bpf_metadata) +
> +           nr_variables * sizeof(metadata->event->bpf_metadata.entries[0]);
> +       metadata->event = calloc(1, metadata->event_size);
> +       if (!metadata->event) {
> +               bpf_metadata_free(metadata);
> +               return NULL;
> +       }
> +
> +       return metadata;
> +}
> +
> +static struct bpf_metadata *bpf_metadata_create(struct bpf_prog_info *info)
> +{
> +       struct bpf_metadata *metadata;
> +       const __u32 *map_ids = (__u32 *)(uintptr_t)info->map_ids;
> +
> +       for (__u32 map_index = 0; map_index < info->nr_map_ids; map_index++) {
> +               struct perf_record_bpf_metadata *bpf_metadata_event;
> +               struct bpf_metadata_map map;
> +
> +               if (bpf_metadata_read_map_data(map_ids[map_index], &map) != 0)
> +                       continue;
> +
> +               metadata = bpf_metadata_alloc(info->nr_prog_tags, map.num_vars);
> +               if (!metadata)
> +                       continue;
> +
> +               bpf_metadata_event = &metadata->event->bpf_metadata;
> +               *bpf_metadata_event = (struct perf_record_bpf_metadata) {
> +                       .header = {
> +                               .type = PERF_RECORD_BPF_METADATA,
> +                               .size = metadata->event_size,

nit: Could we set the header.size in bpf_metadata_alloc to remove
metadata->event_size. The code generally doesn't pass a pair of
perf_event + size around as the size should be in the header.

Thanks,
Ian

> +                       },
> +                       .nr_entries = 0,
> +               };
> +               bpf_metadata_fill_event(&map, bpf_metadata_event);
> +
> +               for (__u32 index = 0; index < info->nr_prog_tags; index++) {
> +                       synthesize_bpf_prog_name(metadata->prog_names[index],
> +                                                BPF_PROG_NAME_LEN, info,
> +                                                map.btf, index);
> +               }
> +
> +               bpf_metadata_free_map_data(&map);
> +
> +               return metadata;
> +       }
> +
> +       return NULL;
> +}
> +
> +static int synthesize_perf_record_bpf_metadata(const struct bpf_metadata *metadata,
> +                                              const struct perf_tool *tool,
> +                                              perf_event__handler_t process,
> +                                              struct machine *machine)
> +{
> +       union perf_event *event;
> +       int err = 0;
> +
> +       event = calloc(1, metadata->event_size + machine->id_hdr_size);
> +       if (!event)
> +               return -1;
> +       memcpy(event, metadata->event, metadata->event_size);
> +       memset((void *)event + event->header.size, 0, machine->id_hdr_size);
> +       event->header.size += machine->id_hdr_size;
> +       for (__u32 index = 0; index < metadata->nr_prog_names; index++) {
> +               memcpy(event->bpf_metadata.prog_name,
> +                      metadata->prog_names[index], BPF_PROG_NAME_LEN);
> +               err = perf_tool__process_synth_event(tool, event, machine,
> +                                                    process);
> +               if (err != 0)
> +                       break;
> +       }
> +
> +       free(event);
> +       return err;
> +}
> +
>  /*
>   * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
>   * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
> @@ -173,6 +472,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
>         const struct perf_tool *tool = session->tool;
>         struct bpf_prog_info_node *info_node;
>         struct perf_bpil *info_linear;
> +       struct bpf_metadata *metadata;
>         struct bpf_prog_info *info;
>         struct btf *btf = NULL;
>         struct perf_env *env;
> @@ -193,6 +493,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
>         arrays |= 1UL << PERF_BPIL_JITED_INSNS;
>         arrays |= 1UL << PERF_BPIL_LINE_INFO;
>         arrays |= 1UL << PERF_BPIL_JITED_LINE_INFO;
> +       arrays |= 1UL << PERF_BPIL_MAP_IDS;
>
>         info_linear = get_bpf_prog_info_linear(fd, arrays);
>         if (IS_ERR_OR_NULL(info_linear)) {
> @@ -301,6 +602,15 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
>                  */
>                 err = perf_tool__process_synth_event(tool, event,
>                                                      machine, process);
> +
> +               /* Synthesize PERF_RECORD_BPF_METADATA */
> +               metadata = bpf_metadata_create(info);
> +               if (metadata != NULL) {
> +                       err = synthesize_perf_record_bpf_metadata(metadata,
> +                                                                 tool, process,
> +                                                                 machine);
> +                       bpf_metadata_free(metadata);
> +               }
>         }
>
>  out:
> diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
> index e2f0420905f5..007f0b4d21cb 100644
> --- a/tools/perf/util/bpf-event.h
> +++ b/tools/perf/util/bpf-event.h
> @@ -17,6 +17,13 @@ struct record_opts;
>  struct evlist;
>  struct target;
>
> +struct bpf_metadata {
> +       union perf_event *event;
> +       size_t           event_size;
> +       char             **prog_names;
> +       __u64            nr_prog_names;
> +};
> +
>  struct bpf_prog_info_node {
>         struct perf_bpil                *info_linear;
>         struct rb_node                  rb_node;
> @@ -36,6 +43,7 @@ int evlist__add_bpf_sb_event(struct evlist *evlist, struct perf_env *env);
>  void __bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
>                                       struct perf_env *env,
>                                       FILE *fp);
> +void bpf_metadata_free(struct bpf_metadata *metadata);
>  #else
>  static inline int machine__process_bpf(struct machine *machine __maybe_unused,
>                                        union perf_event *event __maybe_unused,
> @@ -55,6 +63,11 @@ static inline void __bpf_event__print_bpf_prog_info(struct bpf_prog_info *info _
>                                                     FILE *fp __maybe_unused)
>  {
>
> +}
> +
> +static inline void bpf_metadata_free(struct bpf_metadata *metadata)
> +{
> +
>  }
>  #endif // HAVE_LIBBPF_SUPPORT
>  #endif
> --
> 2.49.0.1143.g0be31eac6b-goog
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event
  2025-05-21 22:27 ` [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event Blake Jones
@ 2025-05-29 18:12   ` Ian Rogers
  2025-05-29 23:09     ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Ian Rogers @ 2025-05-29 18:12 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Wed, May 21, 2025 at 3:27 PM Blake Jones <blakejones@google.com> wrote:
>
> This collects metadata for any BPF programs that were loaded during a
> "perf record" run, and emits it at the end of the run. It also adds support
> for displaying the new PERF_RECORD_BPF_METADATA type.
>
> Here's some example "perf script -D" output for the new event type. The
> ": unhandled!" message is from tool.c, analogous to other behavior there.
> I've elided some rows with all NUL characters for brevity, and I wrapped
> one of the >75-column lines to fit in the commit guidelines.
>
> 0x50fc8@perf.data [0x260]: event: 84
> .
> . ... raw event: size 608 bytes
> .  0000:  54 00 00 00 00 00 60 02 62 70 66 5f 70 72 6f 67  T.....`.bpf_prog
> .  0010:  5f 31 65 30 61 32 65 33 36 36 65 35 36 66 31 61  _1e0a2e366e56f1a
> .  0020:  32 5f 70 65 72 66 5f 73 61 6d 70 6c 65 5f 66 69  2_perf_sample_fi
> .  0030:  6c 74 65 72 00 00 00 00 00 00 00 00 00 00 00 00  lter............
> .  0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> [...]
> .  0110:  74 65 73 74 5f 76 61 6c 75 65 00 00 00 00 00 00  test_value......
> .  0120:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> [...]
> .  0150:  34 32 00 00 00 00 00 00 00 00 00 00 00 00 00 00  42..............
> .  0160:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> [...]
>
> 0 0x50fc8 [0x260]: PERF_RECORD_BPF_METADATA \
>       prog bpf_prog_1e0a2e366e56f1a2_perf_sample_filter
>   entry 0:           test_value = 42
> : unhandled!
>
> Signed-off-by: Blake Jones <blakejones@google.com>
> ---
>  tools/perf/builtin-inject.c                  |  1 +
>  tools/perf/builtin-record.c                  |  8 +++
>  tools/perf/builtin-script.c                  | 15 ++++-
>  tools/perf/tests/shell/test_bpf_metadata.sh  | 67 ++++++++++++++++++++
>  tools/perf/util/bpf-event.c                  | 46 ++++++++++++++
>  tools/perf/util/bpf-event.h                  |  1 +
>  tools/perf/util/bpf_skel/sample_filter.bpf.c |  4 ++
>  tools/perf/util/env.c                        | 19 +++++-
>  tools/perf/util/env.h                        |  4 ++
>  tools/perf/util/event.c                      | 21 ++++++
>  tools/perf/util/event.h                      |  1 +
>  tools/perf/util/header.c                     |  1 +
>  tools/perf/util/session.c                    |  4 ++
>  tools/perf/util/synthetic-events.h           |  2 +
>  tools/perf/util/tool.c                       | 14 ++++
>  tools/perf/util/tool.h                       |  3 +-
>  16 files changed, 207 insertions(+), 4 deletions(-)
>  create mode 100644 tools/perf/tests/shell/test_bpf_metadata.sh
>
> diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> index 11e49cafa3af..b15eac0716f7 100644
> --- a/tools/perf/builtin-inject.c
> +++ b/tools/perf/builtin-inject.c
> @@ -2530,6 +2530,7 @@ int cmd_inject(int argc, const char **argv)
>         inject.tool.finished_init       = perf_event__repipe_op2_synth;
>         inject.tool.compressed          = perf_event__repipe_op4_synth;
>         inject.tool.auxtrace            = perf_event__repipe_auxtrace;
> +       inject.tool.bpf_metadata        = perf_event__repipe_op2_synth;
>         inject.tool.dont_split_sample_group = true;
>         inject.session = __perf_session__new(&data, &inject.tool,
>                                              /*trace_event_repipe=*/inject.output.is_pipe);
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 136c0172799a..067e203f57c2 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -2161,6 +2161,12 @@ static int record__synthesize(struct record *rec, bool tail)
>         return err;
>  }
>
> +static void record__synthesize_final_bpf_metadata(struct record *rec)
> +{
> +       perf_event__synthesize_final_bpf_metadata(rec->session,
> +                                                 process_synthesized_event);
> +}
> +
>  static int record__process_signal_event(union perf_event *event __maybe_unused, void *data)
>  {
>         struct record *rec = data;
> @@ -2806,6 +2812,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>         trigger_off(&auxtrace_snapshot_trigger);
>         trigger_off(&switch_output_trigger);
>
> +       record__synthesize_final_bpf_metadata(rec);
> +
>         if (opts->auxtrace_snapshot_on_exit)
>                 record__auxtrace_snapshot_exit(rec);
>
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 6c3bf74dd78c..4001e621b6cb 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -38,6 +38,7 @@
>  #include "print_insn.h"
>  #include "archinsn.h"
>  #include <linux/bitmap.h>
> +#include <linux/compiler.h>
>  #include <linux/kernel.h>
>  #include <linux/stringify.h>
>  #include <linux/time64.h>
> @@ -50,6 +51,7 @@
>  #include <errno.h>
>  #include <inttypes.h>
>  #include <signal.h>
> +#include <stdio.h>
>  #include <sys/param.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
> @@ -2755,6 +2757,14 @@ process_bpf_events(const struct perf_tool *tool __maybe_unused,
>                            sample->tid);
>  }
>
> +static int
> +process_bpf_metadata_event(struct perf_session *session __maybe_unused,
> +                          union perf_event *event)
> +{
> +       perf_event__fprintf(event, NULL, stdout);
> +       return 0;
> +}
> +
>  static int process_text_poke_events(const struct perf_tool *tool,
>                                     union perf_event *event,
>                                     struct perf_sample *sample,
> @@ -2877,8 +2887,9 @@ static int __cmd_script(struct perf_script *script)
>                 script->tool.finished_round = process_finished_round_event;
>         }
>         if (script->show_bpf_events) {
> -               script->tool.ksymbol = process_bpf_events;
> -               script->tool.bpf     = process_bpf_events;
> +               script->tool.ksymbol      = process_bpf_events;
> +               script->tool.bpf          = process_bpf_events;
> +               script->tool.bpf_metadata = process_bpf_metadata_event;
>         }
>         if (script->show_text_poke_events) {
>                 script->tool.ksymbol   = process_bpf_events;
> diff --git a/tools/perf/tests/shell/test_bpf_metadata.sh b/tools/perf/tests/shell/test_bpf_metadata.sh
> new file mode 100644
> index 000000000000..ede31d5a3c36
> --- /dev/null
> +++ b/tools/perf/tests/shell/test_bpf_metadata.sh
> @@ -0,0 +1,67 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# BPF metadata collection test.
> +
> +set -e
> +
> +err=0
> +perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX)
> +
> +cleanup() {
> +       rm -f "${perfdata}"
> +       rm -f "${perfdata}".old
> +       trap - EXIT TERM INT
> +}
> +
> +trap_cleanup() {
> +       cleanup
> +       exit 1
> +}
> +trap trap_cleanup EXIT TERM INT
> +
> +test_bpf_metadata() {
> +       echo "Checking BPF metadata collection"
> +
> +       # This is a basic invocation of perf record
> +       # that invokes the perf_sample_filter BPF program.
> +       if ! perf record -e task-clock --filter 'ip > 0' \
> +                        -o "${perfdata}" sleep 1 2> /dev/null
> +       then
> +               echo "Basic BPF metadata test [Failed record]"
> +               err=1
> +               return
> +       fi
> +
> +       # The perf_sample_filter BPF program has the following variable in it:
> +       #
> +       #   volatile const int bpf_metadata_test_value SEC(".rodata") = 42;
> +       #
> +       # This invocation looks for a PERF_RECORD_BPF_METADATA event,
> +       # and checks that its content includes something for the above variable.
> +       if ! perf script --show-bpf-events -i "${perfdata}" | awk '
> +               /PERF_RECORD_BPF_METADATA.*perf_sample_filter/ {
> +                       header = 1;
> +               }
> +               /^ *entry/ {
> +                       if (header) { header = 0; entry = 1; }
> +               }
> +               $0 !~ /^ *entry/ {
> +                       entry = 0;
> +               }
> +               /test_value/ {
> +                       if (entry) print $NF;
> +               }
> +       ' | grep 42 > /dev/null
> +       then
> +               echo "Basic BPF metadata test [Failed invalid output]"
> +               err=1
> +               return
> +       fi
> +       echo "Basic BPF metadata test [Success]"
> +}
> +
> +test_bpf_metadata
> +
> +cleanup
> +exit $err
> diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
> index 36d5676f025e..019832ec4802 100644
> --- a/tools/perf/util/bpf-event.c
> +++ b/tools/perf/util/bpf-event.c
> @@ -450,6 +450,49 @@ static int synthesize_perf_record_bpf_metadata(const struct bpf_metadata *metada
>         return err;
>  }
>
> +struct bpf_metadata_final_ctx {
> +       const struct perf_tool *tool;
> +       perf_event__handler_t process;
> +       struct machine *machine;
> +};
> +
> +static void synthesize_final_bpf_metadata_cb(struct bpf_prog_info_node *node,
> +                                            void *data)
> +{
> +       struct bpf_metadata_final_ctx *ctx = (struct bpf_metadata_final_ctx *)data;
> +       struct bpf_metadata *metadata = node->metadata;
> +       int err;
> +
> +       if (metadata == NULL)
> +               return;
> +       err = synthesize_perf_record_bpf_metadata(metadata, ctx->tool,
> +                                                 ctx->process, ctx->machine);
> +       if (err != 0) {
> +               const char *prog_name = metadata->prog_names[0];
> +
> +               if (prog_name != NULL)
> +                       pr_warning("Couldn't synthesize final BPF metadata for %s.\n", prog_name);
> +               else
> +                       pr_warning("Couldn't synthesize final BPF metadata.\n");
> +       }
> +       bpf_metadata_free(metadata);
> +       node->metadata = NULL;
> +}
> +
> +void perf_event__synthesize_final_bpf_metadata(struct perf_session *session,
> +                                              perf_event__handler_t process)
> +{
> +       struct perf_env *env = &session->header.env;
> +       struct bpf_metadata_final_ctx ctx = {
> +               .tool = session->tool,
> +               .process = process,
> +               .machine = &session->machines.host,
> +       };
> +
> +       perf_env__iterate_bpf_prog_info(env, synthesize_final_bpf_metadata_cb,
> +                                       &ctx);
> +}
> +
>  /*
>   * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
>   * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
> @@ -590,6 +633,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
>                 }
>
>                 info_node->info_linear = info_linear;
> +               info_node->metadata = NULL;
>                 if (!perf_env__insert_bpf_prog_info(env, info_node)) {
>                         free(info_linear);
>                         free(info_node);
> @@ -781,6 +825,7 @@ static void perf_env__add_bpf_info(struct perf_env *env, u32 id)
>         arrays |= 1UL << PERF_BPIL_JITED_INSNS;
>         arrays |= 1UL << PERF_BPIL_LINE_INFO;
>         arrays |= 1UL << PERF_BPIL_JITED_LINE_INFO;
> +       arrays |= 1UL << PERF_BPIL_MAP_IDS;
>
>         info_linear = get_bpf_prog_info_linear(fd, arrays);
>         if (IS_ERR_OR_NULL(info_linear)) {
> @@ -793,6 +838,7 @@ static void perf_env__add_bpf_info(struct perf_env *env, u32 id)
>         info_node = malloc(sizeof(struct bpf_prog_info_node));
>         if (info_node) {
>                 info_node->info_linear = info_linear;
> +               info_node->metadata = bpf_metadata_create(&info_linear->info);
>                 if (!perf_env__insert_bpf_prog_info(env, info_node)) {
>                         free(info_linear);
>                         free(info_node);
> diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
> index 007f0b4d21cb..49a42c15bcc6 100644
> --- a/tools/perf/util/bpf-event.h
> +++ b/tools/perf/util/bpf-event.h
> @@ -26,6 +26,7 @@ struct bpf_metadata {
>
>  struct bpf_prog_info_node {
>         struct perf_bpil                *info_linear;
> +       struct bpf_metadata             *metadata;
>         struct rb_node                  rb_node;
>  };
>
> diff --git a/tools/perf/util/bpf_skel/sample_filter.bpf.c b/tools/perf/util/bpf_skel/sample_filter.bpf.c
> index b195e6efeb8b..b0265f000325 100644
> --- a/tools/perf/util/bpf_skel/sample_filter.bpf.c
> +++ b/tools/perf/util/bpf_skel/sample_filter.bpf.c
> @@ -43,6 +43,10 @@ struct lost_count {
>         __uint(max_entries, 1);
>  } dropped SEC(".maps");
>
> +// This is used by tests/shell/record_bpf_metadata.sh
> +// to verify that BPF metadata generation works.
> +const int bpf_metadata_test_value SEC(".rodata") = 42;

This is a bit random. For the non-BPF C code we have a build generated
PERF-VERSION-FILE that contains something like `#define PERF_VERSION
"6.15.rc7.ge450e74276d2"`. I wonder having something like (the section
is almost certainly wrong):
```
volatile const char * const bpf_metadata_perf_version
__attribute__((section(\".rodata\"), used, weak)) = PERF_VERSION;
```
with a build change something like:
```
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index d4c7031b01a7..f825d9195d77 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1250,9 +1250,9 @@ else
       $(Q)cp "$(VMLINUX_H)" $@
endif

-$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF)
$(SKEL_OUT)/vmlinux.h | $(SKEL_TMP_OUT)
+$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF)
$(OUTPUT)PERF-VERSION-FILE $(SKEL_OUT)/vml
inux.h | $(SKEL_TMP_OUT)
       $(QUIET_CLANG)$(CLANG) -g -O2 --target=bpf $(CLANG_OPTIONS)
$(BPF_INCLUDE) $(TOOLS_UAPI_INCL
UDE) \
-         -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@
+         -include $(OUTPUT)PERF-VERSION-FILE -c $(filter
util/bpf_skel/%.bpf.c,$^) -o $@

$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL)
       $(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@

```
would be more useful/meaningful. Perhaps the build could inject the
variable to avoid duplicating it all the BPF skeletons.

nit: I wonder for testing it would be interesting to have 0 and >1
metadata values tested too. We may want to have test programs
explicitly for that, in tools/perf/tests.

Thanks,
Ian


> +
>  volatile const int use_idx_hash;
>
>  void *bpf_cast_to_kern_ctx(void *) __ksym;
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 36411749e007..05a4f2657d72 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -3,8 +3,10 @@
>  #include "debug.h"
>  #include "env.h"
>  #include "util/header.h"
> -#include "linux/compiler.h"
> +#include "util/rwsem.h"
> +#include <linux/compiler.h>
>  #include <linux/ctype.h>
> +#include <linux/rbtree.h>
>  #include <linux/string.h>
>  #include <linux/zalloc.h>
>  #include "cgroup.h"
> @@ -89,6 +91,20 @@ struct bpf_prog_info_node *perf_env__find_bpf_prog_info(struct perf_env *env,
>         return node;
>  }
>
> +void perf_env__iterate_bpf_prog_info(struct perf_env *env,
> +                                    void (*cb)(struct bpf_prog_info_node *node,
> +                                               void *data),
> +                                    void *data)
> +{
> +       struct rb_node *first;
> +
> +       down_read(&env->bpf_progs.lock);
> +       first = rb_first(&env->bpf_progs.infos);
> +       for (struct rb_node *node = first; node != NULL; node = rb_next(node))
> +               (*cb)(rb_entry(node, struct bpf_prog_info_node, rb_node), data);
> +       up_read(&env->bpf_progs.lock);
> +}
> +
>  bool perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node)
>  {
>         bool ret;
> @@ -174,6 +190,7 @@ static void perf_env__purge_bpf(struct perf_env *env)
>                 next = rb_next(&node->rb_node);
>                 rb_erase(&node->rb_node, root);
>                 zfree(&node->info_linear);
> +               bpf_metadata_free(node->metadata);
>                 free(node);
>         }
>
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index d90e343cf1fa..6819cb9b99ff 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -180,6 +180,10 @@ bool perf_env__insert_bpf_prog_info(struct perf_env *env,
>                                     struct bpf_prog_info_node *info_node);
>  struct bpf_prog_info_node *perf_env__find_bpf_prog_info(struct perf_env *env,
>                                                         __u32 prog_id);
> +void perf_env__iterate_bpf_prog_info(struct perf_env *env,
> +                                    void (*cb)(struct bpf_prog_info_node *node,
> +                                               void *data),
> +                                    void *data);
>  bool perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
>  bool __perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
>  struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 80c9ea682413..e81c2d87d76a 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -1,9 +1,12 @@
>  #include <errno.h>
>  #include <fcntl.h>
>  #include <inttypes.h>
> +#include <linux/compiler.h>
>  #include <linux/kernel.h>
>  #include <linux/types.h>
>  #include <perf/cpumap.h>
> +#include <perf/event.h>
> +#include <stdio.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
>  #include <unistd.h>
> @@ -78,6 +81,7 @@ static const char *perf_event__names[] = {
>         [PERF_RECORD_COMPRESSED]                = "COMPRESSED",
>         [PERF_RECORD_FINISHED_INIT]             = "FINISHED_INIT",
>         [PERF_RECORD_COMPRESSED2]               = "COMPRESSED2",
> +       [PERF_RECORD_BPF_METADATA]              = "BPF_METADATA",
>  };
>
>  const char *perf_event__name(unsigned int id)
> @@ -504,6 +508,20 @@ size_t perf_event__fprintf_bpf(union perf_event *event, FILE *fp)
>                        event->bpf.type, event->bpf.flags, event->bpf.id);
>  }
>
> +size_t perf_event__fprintf_bpf_metadata(union perf_event *event, FILE *fp)
> +{
> +       struct perf_record_bpf_metadata *metadata = &event->bpf_metadata;
> +       size_t ret;
> +
> +       ret = fprintf(fp, " prog %s\n", metadata->prog_name);
> +       for (__u32 i = 0; i < metadata->nr_entries; i++) {
> +               ret += fprintf(fp, "  entry %d: %20s = %s\n", i,
> +                              metadata->entries[i].key,
> +                              metadata->entries[i].value);
> +       }
> +       return ret;
> +}
> +
>  static int text_poke_printer(enum binary_printer_ops op, unsigned int val,
>                              void *extra, FILE *fp)
>  {
> @@ -601,6 +619,9 @@ size_t perf_event__fprintf(union perf_event *event, struct machine *machine, FIL
>         case PERF_RECORD_AUX_OUTPUT_HW_ID:
>                 ret += perf_event__fprintf_aux_output_hw_id(event, fp);
>                 break;
> +       case PERF_RECORD_BPF_METADATA:
> +               ret += perf_event__fprintf_bpf_metadata(event, fp);
> +               break;
>         default:
>                 ret += fprintf(fp, "\n");
>         }
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 664bf39567ce..67ad4a2014bc 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -370,6 +370,7 @@ size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp);
>  size_t perf_event__fprintf_cgroup(union perf_event *event, FILE *fp);
>  size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp);
>  size_t perf_event__fprintf_bpf(union perf_event *event, FILE *fp);
> +size_t perf_event__fprintf_bpf_metadata(union perf_event *event, FILE *fp);
>  size_t perf_event__fprintf_text_poke(union perf_event *event, struct machine *machine,FILE *fp);
>  size_t perf_event__fprintf(union perf_event *event, struct machine *machine, FILE *fp);
>
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index e3cdc3b7b4ab..7c477e2a93b3 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -3161,6 +3161,7 @@ static int process_bpf_prog_info(struct feat_fd *ff, void *data __maybe_unused)
>                 /* after reading from file, translate offset to address */
>                 bpil_offs_to_addr(info_linear);
>                 info_node->info_linear = info_linear;
> +               info_node->metadata = NULL;
>                 if (!__perf_env__insert_bpf_prog_info(env, info_node)) {
>                         free(info_linear);
>                         free(info_node);
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index a320672c264e..38075059086c 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -12,6 +12,7 @@
>  #include <sys/types.h>
>  #include <sys/mman.h>
>  #include <perf/cpumap.h>
> +#include <perf/event.h>
>
>  #include "map_symbol.h"
>  #include "branch.h"
> @@ -1491,6 +1492,9 @@ static s64 perf_session__process_user_event(struct perf_session *session,
>         case PERF_RECORD_FINISHED_INIT:
>                 err = tool->finished_init(session, event);
>                 break;
> +       case PERF_RECORD_BPF_METADATA:
> +               err = tool->bpf_metadata(session, event);
> +               break;
>         default:
>                 err = -EINVAL;
>                 break;
> diff --git a/tools/perf/util/synthetic-events.h b/tools/perf/util/synthetic-events.h
> index b9c936b5cfeb..ee29615d68e5 100644
> --- a/tools/perf/util/synthetic-events.h
> +++ b/tools/perf/util/synthetic-events.h
> @@ -92,6 +92,8 @@ int perf_event__synthesize_threads(const struct perf_tool *tool, perf_event__han
>  int perf_event__synthesize_tracing_data(const struct perf_tool *tool, int fd, struct evlist *evlist, perf_event__handler_t process);
>  int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc, const struct perf_tool *tool, perf_event__handler_t process, struct machine *machine);
>  pid_t perf_event__synthesize_comm(const struct perf_tool *tool, union perf_event *event, pid_t pid, perf_event__handler_t process, struct machine *machine);
> +void perf_event__synthesize_final_bpf_metadata(struct perf_session *session,
> +                                              perf_event__handler_t process);
>
>  int perf_tool__process_synth_event(const struct perf_tool *tool, union perf_event *event, struct machine *machine, perf_event__handler_t process);
>
> diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
> index 37bd8ac63b01..204ec03071bc 100644
> --- a/tools/perf/util/tool.c
> +++ b/tools/perf/util/tool.c
> @@ -1,12 +1,15 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include "data.h"
>  #include "debug.h"
> +#include "event.h"
>  #include "header.h"
>  #include "session.h"
>  #include "stat.h"
>  #include "tool.h"
>  #include "tsc.h"
> +#include <linux/compiler.h>
>  #include <sys/mman.h>
> +#include <stddef.h>
>  #include <unistd.h>
>
>  #ifdef HAVE_ZSTD_SUPPORT
> @@ -237,6 +240,16 @@ static int perf_session__process_compressed_event_stub(struct perf_session *sess
>         return 0;
>  }
>
> +static int perf_event__process_bpf_metadata_stub(struct perf_session *perf_session __maybe_unused,
> +                                                union perf_event *event)
> +{
> +       if (dump_trace)
> +               perf_event__fprintf_bpf_metadata(event, stdout);
> +
> +       dump_printf(": unhandled!\n");
> +       return 0;
> +}
> +
>  void perf_tool__init(struct perf_tool *tool, bool ordered_events)
>  {
>         tool->ordered_events = ordered_events;
> @@ -293,6 +306,7 @@ void perf_tool__init(struct perf_tool *tool, bool ordered_events)
>         tool->compressed = perf_session__process_compressed_event_stub;
>  #endif
>         tool->finished_init = process_event_op2_stub;
> +       tool->bpf_metadata = perf_event__process_bpf_metadata_stub;
>  }
>
>  bool perf_tool__compressed_is_stub(const struct perf_tool *tool)
> diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
> index db1c7642b0d1..18b76ff0f26a 100644
> --- a/tools/perf/util/tool.h
> +++ b/tools/perf/util/tool.h
> @@ -77,7 +77,8 @@ struct perf_tool {
>                         stat,
>                         stat_round,
>                         feature,
> -                       finished_init;
> +                       finished_init,
> +                       bpf_metadata;
>         event_op4       compressed;
>         event_op3       auxtrace;
>         bool            ordered_events;
> --
> 2.49.0.1143.g0be31eac6b-goog
>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event
  2025-05-29 18:12   ` Ian Rogers
@ 2025-05-29 23:09     ` Blake Jones
  2025-05-29 23:27       ` Ian Rogers
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-05-29 23:09 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

Hi Ian,

Thanks for your comments!

On Thu, May 29, 2025 at 11:12 AM Ian Rogers <irogers@google.com> wrote:
> On Wed, May 21, 2025 at 3:27 PM Blake Jones <blakejones@google.com> wrote:
> > diff --git a/tools/perf/util/bpf_skel/sample_filter.bpf.c b/tools/perf/util/bpf_skel/sample_filter.bpf.c
> > [...]
> > +// This is used by tests/shell/record_bpf_metadata.sh
> > +// to verify that BPF metadata generation works.
> > +const int bpf_metadata_test_value SEC(".rodata") = 42;
>
> This is a bit random.

Yeah, that's fair. I added it because it was a straightforward way to reliably
get a known value that I could observe from tests/shell/test_bpf_metadata.sh.

> For the non-BPF C code we have a build generated
> PERF-VERSION-FILE that contains something like `#define PERF_VERSION
> "6.15.rc7.ge450e74276d2"`. I wonder having something like
]> [...]
> would be more useful/meaningful. Perhaps the build could inject the
> variable to avoid duplicating it all the BPF skeletons.

I could do this if you'd like. It would make it harder for my test to check
that it was reporting the right value, because the PERF-VERSION-FILE defines
PERF_VERSION in a way that's useful for C programs but not shell scripts.
I'd just have the test check that it was reporting some string (maybe one
with at least one dot in it, if I can stably make that assumption). WDYT?

> nit: I wonder for testing it would be interesting to have 0 and >1
> metadata values tested too. We may want to have test programs
> explicitly for that, in tools/perf/tests.

My testing right now depends pretty heavily on the fact that perf will load
sample_filter.bpf.o for me; this seems much better than trying to load a
BPF program manually from a test.

If I could find other perf invocations that would load other BPF programs
automatically, I could potentially test the "0 metadata" and ">1 metadata"
cases. That would involve more BPF-program-specific modifications, though,
which would be closer to the thing you objected to above. So to me this
doesn't seem worth the effort.

Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-05-29 17:47   ` Ian Rogers
@ 2025-05-29 23:21     ` Blake Jones
  2025-05-29 23:23       ` Ian Rogers
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-05-29 23:21 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

Hi Ian,

On Thu, May 29, 2025 at 10:47 AM Ian Rogers <irogers@google.com> wrote:
> > +               bpf_metadata_event = &metadata->event->bpf_metadata;
> > +               *bpf_metadata_event = (struct perf_record_bpf_metadata) {
> > +                       .header = {
> > +                               .type = PERF_RECORD_BPF_METADATA,
> > +                               .size = metadata->event_size,
>
> nit: Could we set the header.size in bpf_metadata_alloc to remove
> metadata->event_size. The code generally doesn't pass a pair of
> perf_event + size around as the size should be in the header.

I can do this initialization of metadata->event->bpf_metadata in
bpf_metadata_alloc(). I'll need to do a bit more work in
synthesize_perf_record_bpf_metadata() before I can get rid of
metadata->event_size, because it needs to allocate an additional
machine->id_hdr_size bytes (see below); I'd just have it get the
value out of metadata->event->header.size instead. Sound good?

Blake

> > +static int synthesize_perf_record_bpf_metadata(const struct bpf_metadata *metadata,
> > +                                              const struct perf_tool *tool,
> > +                                              perf_event__handler_t process,
> > +                                              struct machine *machine)
> > +{
> > +       union perf_event *event;
> > +       int err = 0;
> > +
> > +       event = calloc(1, metadata->event_size + machine->id_hdr_size);
> > +       if (!event)
> > +               return -1;
> > +       memcpy(event, metadata->event, metadata->event_size);
> > +       memset((void *)event + event->header.size, 0, machine->id_hdr_size);
> > +       event->header.size += machine->id_hdr_size;
> > +       for (__u32 index = 0; index < metadata->nr_prog_names; index++) {
> > +               memcpy(event->bpf_metadata.prog_name,
> > +                      metadata->prog_names[index], BPF_PROG_NAME_LEN);
> > +               err = perf_tool__process_synth_event(tool, event, machine,
> > +                                                    process);
> > +               if (err != 0)
> > +                       break;
> > +       }
> > +
> > +       free(event);
> > +       return err;
> > +}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-05-29 23:21     ` Blake Jones
@ 2025-05-29 23:23       ` Ian Rogers
  0 siblings, 0 replies; 28+ messages in thread
From: Ian Rogers @ 2025-05-29 23:23 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Thu, May 29, 2025 at 4:21 PM Blake Jones <blakejones@google.com> wrote:
>
> Hi Ian,
>
> On Thu, May 29, 2025 at 10:47 AM Ian Rogers <irogers@google.com> wrote:
> > > +               bpf_metadata_event = &metadata->event->bpf_metadata;
> > > +               *bpf_metadata_event = (struct perf_record_bpf_metadata) {
> > > +                       .header = {
> > > +                               .type = PERF_RECORD_BPF_METADATA,
> > > +                               .size = metadata->event_size,
> >
> > nit: Could we set the header.size in bpf_metadata_alloc to remove
> > metadata->event_size. The code generally doesn't pass a pair of
> > perf_event + size around as the size should be in the header.
>
> I can do this initialization of metadata->event->bpf_metadata in
> bpf_metadata_alloc(). I'll need to do a bit more work in
> synthesize_perf_record_bpf_metadata() before I can get rid of
> metadata->event_size, because it needs to allocate an additional
> machine->id_hdr_size bytes (see below); I'd just have it get the
> value out of metadata->event->header.size instead. Sound good?

Sounds good. Thanks,
Ian

> Blake
>
> > > +static int synthesize_perf_record_bpf_metadata(const struct bpf_metadata *metadata,
> > > +                                              const struct perf_tool *tool,
> > > +                                              perf_event__handler_t process,
> > > +                                              struct machine *machine)
> > > +{
> > > +       union perf_event *event;
> > > +       int err = 0;
> > > +
> > > +       event = calloc(1, metadata->event_size + machine->id_hdr_size);
> > > +       if (!event)
> > > +               return -1;
> > > +       memcpy(event, metadata->event, metadata->event_size);
> > > +       memset((void *)event + event->header.size, 0, machine->id_hdr_size);
> > > +       event->header.size += machine->id_hdr_size;
> > > +       for (__u32 index = 0; index < metadata->nr_prog_names; index++) {
> > > +               memcpy(event->bpf_metadata.prog_name,
> > > +                      metadata->prog_names[index], BPF_PROG_NAME_LEN);
> > > +               err = perf_tool__process_synth_event(tool, event, machine,
> > > +                                                    process);
> > > +               if (err != 0)
> > > +                       break;
> > > +       }
> > > +
> > > +       free(event);
> > > +       return err;
> > > +}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event
  2025-05-29 23:09     ` Blake Jones
@ 2025-05-29 23:27       ` Ian Rogers
  2025-05-29 23:49         ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Ian Rogers @ 2025-05-29 23:27 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Thu, May 29, 2025 at 4:09 PM Blake Jones <blakejones@google.com> wrote:
>
> Hi Ian,
>
> Thanks for your comments!
>
> On Thu, May 29, 2025 at 11:12 AM Ian Rogers <irogers@google.com> wrote:
> > On Wed, May 21, 2025 at 3:27 PM Blake Jones <blakejones@google.com> wrote:
> > > diff --git a/tools/perf/util/bpf_skel/sample_filter.bpf.c b/tools/perf/util/bpf_skel/sample_filter.bpf.c
> > > [...]
> > > +// This is used by tests/shell/record_bpf_metadata.sh
> > > +// to verify that BPF metadata generation works.
> > > +const int bpf_metadata_test_value SEC(".rodata") = 42;
> >
> > This is a bit random.
>
> Yeah, that's fair. I added it because it was a straightforward way to reliably
> get a known value that I could observe from tests/shell/test_bpf_metadata.sh.
>
> > For the non-BPF C code we have a build generated
> > PERF-VERSION-FILE that contains something like `#define PERF_VERSION
> > "6.15.rc7.ge450e74276d2"`. I wonder having something like
> ]> [...]
> > would be more useful/meaningful. Perhaps the build could inject the
> > variable to avoid duplicating it all the BPF skeletons.
>
> I could do this if you'd like. It would make it harder for my test to check
> that it was reporting the right value, because the PERF-VERSION-FILE defines
> PERF_VERSION in a way that's useful for C programs but not shell scripts.
> I'd just have the test check that it was reporting some string (maybe one
> with at least one dot in it, if I can stably make that assumption). WDYT?

It should be okay as you can compare the string against that reported
by `perf version`. On my build in `/tmp/perf`:
```
$ /tmp/perf/perf version
perf version 6.15.rc7.gb9ac06abfde9
$ cat /tmp/perf/PERF-VERSION-FILE
#define PERF_VERSION "6.15.rc7.gb9ac06abfde9"
```

> > nit: I wonder for testing it would be interesting to have 0 and >1
> > metadata values tested too. We may want to have test programs
> > explicitly for that, in tools/perf/tests.
>
> My testing right now depends pretty heavily on the fact that perf will load
> sample_filter.bpf.o for me; this seems much better than trying to load a
> BPF program manually from a test.
>
> If I could find other perf invocations that would load other BPF programs
> automatically, I could potentially test the "0 metadata" and ">1 metadata"
> cases. That would involve more BPF-program-specific modifications, though,
> which would be closer to the thing you objected to above. So to me this
> doesn't seem worth the effort.

Ok, and the main 1 metadata case is well covered by the test you added.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event
  2025-05-29 23:27       ` Ian Rogers
@ 2025-05-29 23:49         ` Blake Jones
  0 siblings, 0 replies; 28+ messages in thread
From: Blake Jones @ 2025-05-29 23:49 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Thu, May 29, 2025 at 4:27 PM Ian Rogers <irogers@google.com> wrote:
> It should be okay as you can compare the string against that reported
> by `perf version`. On my build in `/tmp/perf`:
> ```
> $ /tmp/perf/perf version
> perf version 6.15.rc7.gb9ac06abfde9
> $ cat /tmp/perf/PERF-VERSION-FILE
> #define PERF_VERSION "6.15.rc7.gb9ac06abfde9"
> ```

Oh, nice! I'll switch the test to use that instead.

Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-05-29  0:58       ` Blake Jones
@ 2025-05-30 17:40         ` Namhyung Kim
  2025-05-31  7:26           ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2025-05-30 17:40 UTC (permalink / raw)
  To: Blake Jones
  Cc: Arnaldo Carvalho de Melo, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Ian Rogers, Adrian Hunter, Kan Liang,
	Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users

Hi Blake,

On Wed, May 28, 2025 at 05:58:32PM -0700, Blake Jones wrote:
> Hi Arnaldo,
> 
> On Thu, May 22, 2025 at 11:19 AM Blake Jones <blakejones@google.com> wrote:
> > On Thu, May 22, 2025 at 1:42 PM Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > > I'll test this but unsure if this part should go thru the perf tool
> > > tree, perhaps should go, together with some test case, via the libbpf
> > > tree?
> >
> > Thanks for taking a look at this. I'd appreciate your guidance here - I
> > sent it here because the other two patches in my patch set depend on this
> > one.
> 
> Do you think this can be merged through the perf tree, or should I separate
> this patch and send it though the BPF tree first?

I think it's better to go to the bpf tree although it'd take longer to
get your perf patches.

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-05-30 17:40         ` Namhyung Kim
@ 2025-05-31  7:26           ` Blake Jones
  2025-06-03 18:23             ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-05-31  7:26 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Ian Rogers, Adrian Hunter, Kan Liang,
	Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users

Hi Namhyung,

On Fri, May 30, 2025 at 10:40 AM Namhyung Kim <namhyung@kernel.org> wrote:
> I think it's better to go to the bpf tree although it'd take longer to
> get your perf patches.

Thanks for the suggestion. I've sent this patch to the bpf tree, and I'll
resend the rest of this series once that change makes its way to this tree.

Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-05-31  7:26           ` Blake Jones
@ 2025-06-03 18:23             ` Blake Jones
  2025-06-03 18:43               ` Alexei Starovoitov
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-06-03 18:23 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Ian Rogers, Adrian Hunter, Kan Liang, Chun-Tse Shao, Zhongqiu Han,
	James Clark, Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan,
	Yujie Liu, Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel,
	bpf, linux-perf-users

The libbpf patch set is under discussion right now. Once it converges,
is there a way to include those patches in the perf tree without
waiting for them to go up to the main tree and then back down? Could I
resend them here, or include them as the first part of my next patch
series?

Thanks in advance for the guidance.

Blake

On Sat, May 31, 2025 at 12:26 AM Blake Jones <blakejones@google.com> wrote:
>
> Hi Namhyung,
>
> On Fri, May 30, 2025 at 10:40 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > I think it's better to go to the bpf tree although it'd take longer to
> > get your perf patches.
>
> Thanks for the suggestion. I've sent this patch to the bpf tree, and I'll
> resend the rest of this series once that change makes its way to this tree.
>
> Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-06-03 18:23             ` Blake Jones
@ 2025-06-03 18:43               ` Alexei Starovoitov
  2025-06-03 19:47                 ` Namhyung Kim
  0 siblings, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2025-06-03 18:43 UTC (permalink / raw)
  To: Blake Jones
  Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Ian Rogers,
	Adrian Hunter, Kan Liang, Chun-Tse Shao, Zhongqiu Han,
	James Clark, Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan,
	Yujie Liu, Graham Woodward, Yicong Yang, Ben Gainey, LKML, bpf,
	linux-perf-use.

On Tue, Jun 3, 2025 at 11:24 AM Blake Jones <blakejones@google.com> wrote:
>
> The libbpf patch set is under discussion right now. Once it converges,
> is there a way to include those patches in the perf tree without
> waiting for them to go up to the main tree and then back down? Could I
> resend them here, or include them as the first part of my next patch
> series?

Not really. libbpf is developed in kernel tree, sync to github
and released from github.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] perf: add support for printing BTF character arrays as strings
  2025-06-03 18:43               ` Alexei Starovoitov
@ 2025-06-03 19:47                 ` Namhyung Kim
  0 siblings, 0 replies; 28+ messages in thread
From: Namhyung Kim @ 2025-06-03 19:47 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Blake Jones, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Ian Rogers,
	Adrian Hunter, Kan Liang, Chun-Tse Shao, Zhongqiu Han,
	James Clark, Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan,
	Yujie Liu, Graham Woodward, Yicong Yang, Ben Gainey, LKML, bpf,
	linux-perf-use.

On Tue, Jun 03, 2025 at 11:43:29AM -0700, Alexei Starovoitov wrote:
> On Tue, Jun 3, 2025 at 11:24 AM Blake Jones <blakejones@google.com> wrote:
> >
> > The libbpf patch set is under discussion right now. Once it converges,
> > is there a way to include those patches in the perf tree without
> > waiting for them to go up to the main tree and then back down? Could I
> > resend them here, or include them as the first part of my next patch
> > series?
> 
> Not really. libbpf is developed in kernel tree, sync to github
> and released from github.

But you can continue to work on the perf side after adding a feature
test.  I think we can accept the change once the libbpf change is
finalized.

It won't actually work until the both changes are merged to the
mainline in the next cycle, but you can test it locally.

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-05-21 22:27 ` [PATCH 2/3] perf: collect BPF metadata from existing BPF programs Blake Jones
  2025-05-29 17:47   ` Ian Rogers
@ 2025-06-03 20:15   ` Namhyung Kim
  2025-06-03 21:27     ` Blake Jones
  1 sibling, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2025-06-03 20:15 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Wed, May 21, 2025 at 03:27:24PM -0700, Blake Jones wrote:
> Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract
> their values as strings, and create a new PERF_RECORD_BPF_METADATA
> synthetic event using that data. The code gets invoked from the existing
> routine perf_event__synthesize_one_bpf_prog().

It would be great if you can show an example how those metadata is
constructed and shared between BPF programs.

IIUC the metadata is collected for each BPF program which may have
multiple subprograms.  Then this patch creates multiple PERF_RECORD_
BPF_METADATA for each subprogram, right?

Can it be shared using the BPF program ID?

> 
> Signed-off-by: Blake Jones <blakejones@google.com>
> ---
>  tools/lib/perf/include/perf/event.h |  18 ++
>  tools/perf/util/bpf-event.c         | 310 ++++++++++++++++++++++++++++
>  tools/perf/util/bpf-event.h         |  13 ++
>  3 files changed, 341 insertions(+)
> 
> diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
> index 09b7c643ddac..6608f1e3701b 100644
> --- a/tools/lib/perf/include/perf/event.h
> +++ b/tools/lib/perf/include/perf/event.h
> @@ -467,6 +467,22 @@ struct perf_record_compressed2 {
>  	char			 data[];
>  };
>  
> +#define BPF_METADATA_KEY_LEN   64
> +#define BPF_METADATA_VALUE_LEN 256
> +#define BPF_PROG_NAME_LEN      KSYM_NAME_LEN
> +
> +struct perf_record_bpf_metadata_entry {
> +	char key[BPF_METADATA_KEY_LEN];
> +	char value[BPF_METADATA_VALUE_LEN];
> +};
> +
> +struct perf_record_bpf_metadata {
> +	struct perf_event_header	      header;
> +	char				      prog_name[BPF_PROG_NAME_LEN];
> +	__u64				      nr_entries;
> +	struct perf_record_bpf_metadata_entry entries[];
> +};
> +
>  enum perf_user_event_type { /* above any possible kernel type */
>  	PERF_RECORD_USER_TYPE_START		= 64,
>  	PERF_RECORD_HEADER_ATTR			= 64,
> @@ -489,6 +505,7 @@ enum perf_user_event_type { /* above any possible kernel type */
>  	PERF_RECORD_COMPRESSED			= 81,
>  	PERF_RECORD_FINISHED_INIT		= 82,
>  	PERF_RECORD_COMPRESSED2			= 83,
> +	PERF_RECORD_BPF_METADATA		= 84,
>  	PERF_RECORD_HEADER_MAX
>  };
>  
> @@ -530,6 +547,7 @@ union perf_event {
>  	struct perf_record_header_feature	feat;
>  	struct perf_record_compressed		pack;
>  	struct perf_record_compressed2		pack2;
> +	struct perf_record_bpf_metadata		bpf_metadata;
>  };
>  
>  #endif /* __LIBPERF_EVENT_H */
> diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
> index c81444059ad0..36d5676f025e 100644
> --- a/tools/perf/util/bpf-event.c
> +++ b/tools/perf/util/bpf-event.c
> @@ -1,13 +1,20 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include <errno.h>
> +#include <stddef.h>
> +#include <stdint.h>
> +#include <stdio.h>
>  #include <stdlib.h>
> +#include <string.h>
>  #include <bpf/bpf.h>
>  #include <bpf/btf.h>
>  #include <bpf/libbpf.h>
> +#include <linux/bpf.h>
>  #include <linux/btf.h>
>  #include <linux/err.h>
> +#include <linux/perf_event.h>
>  #include <linux/string.h>
>  #include <internal/lib.h>
> +#include <perf/event.h>
>  #include <symbol/kallsyms.h>
>  #include "bpf-event.h"
>  #include "bpf-utils.h"
> @@ -151,6 +158,298 @@ static int synthesize_bpf_prog_name(char *buf, int size,
>  	return name_len;
>  }
>  
> +#define BPF_METADATA_PREFIX "bpf_metadata_"
> +#define BPF_METADATA_PREFIX_LEN (sizeof(BPF_METADATA_PREFIX) - 1)
> +
> +static bool name_has_bpf_metadata_prefix(const char **s)
> +{
> +	if (strncmp(*s, BPF_METADATA_PREFIX, BPF_METADATA_PREFIX_LEN) != 0)
> +		return false;
> +	*s += BPF_METADATA_PREFIX_LEN;
> +	return true;
> +}
> +
> +struct bpf_metadata_map {
> +	struct btf *btf;
> +	const struct btf_type *datasec;
> +	void *rodata;
> +	size_t rodata_size;
> +	unsigned int num_vars;
> +};
> +
> +static int bpf_metadata_read_map_data(__u32 map_id, struct bpf_metadata_map *map)
> +{
> +	int map_fd;
> +	struct bpf_map_info map_info;
> +	__u32 map_info_len;
> +	int key;
> +	struct btf *btf;
> +	const struct btf_type *datasec;
> +	struct btf_var_secinfo *vsi;
> +	unsigned int vlen, vars;
> +	void *rodata;
> +
> +	map_fd = bpf_map_get_fd_by_id(map_id);
> +	if (map_fd < 0)
> +		return -1;
> +
> +	memset(&map_info, 0, sizeof(map_info));
> +	map_info_len = sizeof(map_info);
> +	if (bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len) < 0)
> +		goto out_close;
> +
> +	/* If it's not an .rodata map, don't bother. */
> +	if (map_info.type != BPF_MAP_TYPE_ARRAY ||
> +	    map_info.key_size != sizeof(int) ||
> +	    map_info.max_entries != 1 ||
> +	    !map_info.btf_value_type_id ||
> +	    !strstr(map_info.name, ".rodata")) {
> +		goto out_close;
> +	}
> +
> +	btf = btf__load_from_kernel_by_id(map_info.btf_id);
> +	if (!btf)
> +		goto out_close;
> +	datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
> +	if (!btf_is_datasec(datasec))
> +		goto out_free_btf;
> +
> +	/* If there aren't any variables with the "bpf_metadata_" prefix,
> +	 * don't bother.
> +	 */
> +	vlen = btf_vlen(datasec);
> +	vsi = btf_var_secinfos(datasec);
> +	vars = 0;
> +	for (unsigned int i = 0; i < vlen; i++, vsi++) {
> +		const struct btf_type *t_var = btf__type_by_id(btf, vsi->type);
> +		const char *name = btf__name_by_offset(btf, t_var->name_off);
> +
> +		if (name_has_bpf_metadata_prefix(&name))
> +			vars++;
> +	}
> +	if (vars == 0)
> +		goto out_free_btf;
> +
> +	rodata = calloc(1, map_info.value_size);

You can use 'zalloc()' instead, in other places too.


> +	if (!rodata)
> +		goto out_free_btf;
> +	key = 0;
> +	if (bpf_map_lookup_elem(map_fd, &key, rodata)) {
> +		free(rodata);
> +		goto out_free_btf;
> +	}
> +	close(map_fd);
> +
> +	map->btf = btf;
> +	map->datasec = datasec;
> +	map->rodata = rodata;
> +	map->rodata_size = map_info.value_size;
> +	map->num_vars = vars;
> +	return 0;
> +
> +out_free_btf:
> +	btf__free(btf);
> +out_close:
> +	close(map_fd);
> +	return -1;
> +}
> +
> +struct format_btf_ctx {
> +	char *buf;
> +	size_t buf_size;
> +	size_t buf_idx;
> +};
> +
> +static void format_btf_cb(void *arg, const char *fmt, va_list ap)
> +{
> +	int n;
> +	struct format_btf_ctx *ctx = (struct format_btf_ctx *)arg;
> +
> +	n = vsnprintf(ctx->buf + ctx->buf_idx, ctx->buf_size - ctx->buf_idx,
> +		      fmt, ap);
> +	ctx->buf_idx += n;
> +	if (ctx->buf_idx >= ctx->buf_size)
> +		ctx->buf_idx = ctx->buf_size;
> +}
> +
> +static void format_btf_variable(struct btf *btf, char *buf, size_t buf_size,
> +				const struct btf_type *t, const void *btf_data)
> +{
> +	struct format_btf_ctx ctx = {
> +		.buf = buf,
> +		.buf_idx = 0,
> +		.buf_size = buf_size,
> +	};
> +	const struct btf_dump_type_data_opts opts = {
> +		.sz = sizeof(struct btf_dump_type_data_opts),
> +		.skip_names = 1,
> +		.compact = 1,
> +		.print_strings = 1,
> +	};
> +	struct btf_dump *d;
> +	size_t btf_size;
> +
> +	d = btf_dump__new(btf, format_btf_cb, &ctx, NULL);
> +	btf_size = btf__resolve_size(btf, t->type);
> +	btf_dump__dump_type_data(d, t->type, btf_data, btf_size, &opts);
> +	btf_dump__free(d);
> +}
> +
> +static void bpf_metadata_fill_event(struct bpf_metadata_map *map,
> +				    struct perf_record_bpf_metadata *bpf_metadata_event)
> +{
> +	struct btf_var_secinfo *vsi;
> +	unsigned int i, vlen;
> +
> +	memset(bpf_metadata_event->prog_name, 0, BPF_PROG_NAME_LEN);
> +	vlen = btf_vlen(map->datasec);
> +	vsi = btf_var_secinfos(map->datasec);
> +
> +	for (i = 0; i < vlen; i++, vsi++) {
> +		const struct btf_type *t_var = btf__type_by_id(map->btf,
> +							       vsi->type);
> +		const char *name = btf__name_by_offset(map->btf,
> +						       t_var->name_off);
> +		const __u64 nr_entries = bpf_metadata_event->nr_entries;
> +		struct perf_record_bpf_metadata_entry *entry;
> +
> +		if (!name_has_bpf_metadata_prefix(&name))
> +			continue;
> +
> +		if (nr_entries >= (__u64)map->num_vars)
> +			break;
> +
> +		entry = &bpf_metadata_event->entries[nr_entries];
> +		memset(entry, 0, sizeof(*entry));
> +		snprintf(entry->key, BPF_METADATA_KEY_LEN, "%s", name);
> +		format_btf_variable(map->btf, entry->value,
> +				    BPF_METADATA_VALUE_LEN, t_var,
> +				    map->rodata + vsi->offset);
> +		bpf_metadata_event->nr_entries++;
> +	}
> +}
> +
> +static void bpf_metadata_free_map_data(struct bpf_metadata_map *map)
> +{
> +	btf__free(map->btf);
> +	free(map->rodata);
> +}
> +
> +void bpf_metadata_free(struct bpf_metadata *metadata)
> +{
> +	if (metadata == NULL)
> +		return;
> +	for (__u32 index = 0; index < metadata->nr_prog_names; index++)
> +		free(metadata->prog_names[index]);
> +	if (metadata->prog_names != NULL)
> +		free(metadata->prog_names);
> +	if (metadata->event != NULL)
> +		free(metadata->event);

No need to NULL change for free().


> +	free(metadata);
> +}
> +
> +static struct bpf_metadata *bpf_metadata_alloc(__u32 nr_prog_tags,
> +					       __u32 nr_variables)
> +{
> +	struct bpf_metadata *metadata;
> +
> +	metadata = calloc(1, sizeof(struct bpf_metadata));
> +	if (!metadata)
> +		return NULL;
> +
> +	metadata->prog_names = calloc(nr_prog_tags, sizeof(char *));
> +	if (!metadata->prog_names) {
> +		bpf_metadata_free(metadata);
> +		return NULL;
> +	}
> +	for (__u32 prog_index = 0; prog_index < nr_prog_tags; prog_index++) {
> +		metadata->prog_names[prog_index] = calloc(BPF_PROG_NAME_LEN,
> +							  sizeof(char));
> +		if (!metadata->prog_names[prog_index]) {
> +			bpf_metadata_free(metadata);
> +			return NULL;
> +		}
> +		metadata->nr_prog_names++;
> +	}
> +
> +	metadata->event_size = sizeof(metadata->event->bpf_metadata) +
> +	    nr_variables * sizeof(metadata->event->bpf_metadata.entries[0]);
> +	metadata->event = calloc(1, metadata->event_size);
> +	if (!metadata->event) {
> +		bpf_metadata_free(metadata);
> +		return NULL;
> +	}
> +
> +	return metadata;
> +}
> +
> +static struct bpf_metadata *bpf_metadata_create(struct bpf_prog_info *info)
> +{
> +	struct bpf_metadata *metadata;
> +	const __u32 *map_ids = (__u32 *)(uintptr_t)info->map_ids;
> +
> +	for (__u32 map_index = 0; map_index < info->nr_map_ids; map_index++) {
> +		struct perf_record_bpf_metadata *bpf_metadata_event;
> +		struct bpf_metadata_map map;
> +
> +		if (bpf_metadata_read_map_data(map_ids[map_index], &map) != 0)
> +			continue;
> +
> +		metadata = bpf_metadata_alloc(info->nr_prog_tags, map.num_vars);
> +		if (!metadata)
> +			continue;
> +
> +		bpf_metadata_event = &metadata->event->bpf_metadata;
> +		*bpf_metadata_event = (struct perf_record_bpf_metadata) {
> +			.header = {
> +				.type = PERF_RECORD_BPF_METADATA,
> +				.size = metadata->event_size,
> +			},
> +			.nr_entries = 0,
> +		};
> +		bpf_metadata_fill_event(&map, bpf_metadata_event);
> +
> +		for (__u32 index = 0; index < info->nr_prog_tags; index++) {
> +			synthesize_bpf_prog_name(metadata->prog_names[index],
> +						 BPF_PROG_NAME_LEN, info,
> +						 map.btf, index);
> +		}
> +
> +		bpf_metadata_free_map_data(&map);
> +
> +		return metadata;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int synthesize_perf_record_bpf_metadata(const struct bpf_metadata *metadata,
> +					       const struct perf_tool *tool,
> +					       perf_event__handler_t process,
> +					       struct machine *machine)
> +{
> +	union perf_event *event;
> +	int err = 0;
> +
> +	event = calloc(1, metadata->event_size + machine->id_hdr_size);
> +	if (!event)
> +		return -1;
> +	memcpy(event, metadata->event, metadata->event_size);
> +	memset((void *)event + event->header.size, 0, machine->id_hdr_size);
> +	event->header.size += machine->id_hdr_size;
> +	for (__u32 index = 0; index < metadata->nr_prog_names; index++) {
> +		memcpy(event->bpf_metadata.prog_name,
> +		       metadata->prog_names[index], BPF_PROG_NAME_LEN);

Is it possible to call synthesize_bpf_prog_name() directly to the
event->bpf_metadata.prog_name instead of saving it metadata->prog_names?

Thanks,
Namhyung


> +		err = perf_tool__process_synth_event(tool, event, machine,
> +						     process);
> +		if (err != 0)
> +			break;
> +	}
> +
> +	free(event);
> +	return err;
> +}
> +
>  /*
>   * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
>   * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
> @@ -173,6 +472,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
>  	const struct perf_tool *tool = session->tool;
>  	struct bpf_prog_info_node *info_node;
>  	struct perf_bpil *info_linear;
> +	struct bpf_metadata *metadata;
>  	struct bpf_prog_info *info;
>  	struct btf *btf = NULL;
>  	struct perf_env *env;
> @@ -193,6 +493,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
>  	arrays |= 1UL << PERF_BPIL_JITED_INSNS;
>  	arrays |= 1UL << PERF_BPIL_LINE_INFO;
>  	arrays |= 1UL << PERF_BPIL_JITED_LINE_INFO;
> +	arrays |= 1UL << PERF_BPIL_MAP_IDS;
>  
>  	info_linear = get_bpf_prog_info_linear(fd, arrays);
>  	if (IS_ERR_OR_NULL(info_linear)) {
> @@ -301,6 +602,15 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
>  		 */
>  		err = perf_tool__process_synth_event(tool, event,
>  						     machine, process);
> +
> +		/* Synthesize PERF_RECORD_BPF_METADATA */
> +		metadata = bpf_metadata_create(info);
> +		if (metadata != NULL) {
> +			err = synthesize_perf_record_bpf_metadata(metadata,
> +								  tool, process,
> +								  machine);
> +			bpf_metadata_free(metadata);
> +		}
>  	}
>  
>  out:
> diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
> index e2f0420905f5..007f0b4d21cb 100644
> --- a/tools/perf/util/bpf-event.h
> +++ b/tools/perf/util/bpf-event.h
> @@ -17,6 +17,13 @@ struct record_opts;
>  struct evlist;
>  struct target;
>  
> +struct bpf_metadata {
> +	union perf_event *event;
> +	size_t		 event_size;
> +	char		 **prog_names;
> +	__u64		 nr_prog_names;
> +};
> +
>  struct bpf_prog_info_node {
>  	struct perf_bpil		*info_linear;
>  	struct rb_node			rb_node;
> @@ -36,6 +43,7 @@ int evlist__add_bpf_sb_event(struct evlist *evlist, struct perf_env *env);
>  void __bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
>  				      struct perf_env *env,
>  				      FILE *fp);
> +void bpf_metadata_free(struct bpf_metadata *metadata);
>  #else
>  static inline int machine__process_bpf(struct machine *machine __maybe_unused,
>  				       union perf_event *event __maybe_unused,
> @@ -55,6 +63,11 @@ static inline void __bpf_event__print_bpf_prog_info(struct bpf_prog_info *info _
>  						    FILE *fp __maybe_unused)
>  {
>  
> +}
> +
> +static inline void bpf_metadata_free(struct bpf_metadata *metadata)
> +{
> +
>  }
>  #endif // HAVE_LIBBPF_SUPPORT
>  #endif
> -- 
> 2.49.0.1143.g0be31eac6b-goog
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-03 20:15   ` Namhyung Kim
@ 2025-06-03 21:27     ` Blake Jones
  2025-06-03 21:44       ` Namhyung Kim
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-06-03 21:27 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

Hi Namhyung,

On Tue, Jun 3, 2025 at 1:15 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, May 21, 2025 at 03:27:24PM -0700, Blake Jones wrote:
> > Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract
> > their values as strings, and create a new PERF_RECORD_BPF_METADATA
> > synthetic event using that data. The code gets invoked from the existing
> > routine perf_event__synthesize_one_bpf_prog().
>
> It would be great if you can show an example how those metadata is
> constructed and shared between BPF programs.

I've added the following to my commit message:

| For example, a BPF program with the following variables:
|
|     const char bpf_metadata_version[] SEC(".rodata") = "3.14159";
|     int bpf_metadata_value[] SEC(".rodata") = 42;
|
| would generate a PERF_RECORD_BPF_METADATA record with:
|
|     .prog_name        = <BPF program name, e.g. "bpf_prog_a1b2c3_foo">
|     .nr_entries       = 2
|     .entries[0].key   = "version"
|     .entries[0].value = "3.14159"
|     .entries[1].key   = "value"
|     .entries[1].value = "42"
|
| Each of the BPF programs and subprograms that share those variables would
| get a distinct PERF_RECORD_BPF_METADATA record, with the ".prog_name" showing
| the name of each program or subprogram. The prog_name is deliberately the
| same as the ".name" field in the corresponding PERF_RECORD_KSYMBOL record.

> IIUC the metadata is collected for each BPF program which may have
> multiple subprograms.  Then this patch creates multiple PERF_RECORD_
> BPF_METADATA for each subprogram, right?
>
> Can it be shared using the BPF program ID?

In theory, yes, it could be shared. But I want to be able to correlate them
with the corresponding PERF_RECORD_KSYMBOL events, and KSYMBOL events for
subprograms don't have the full-program ID, so I wouldn't be able to do that.

> > +     rodata = calloc(1, map_info.value_size);
>
> You can use 'zalloc()' instead, in other places too.

Fixed, thanks.

> > +void bpf_metadata_free(struct bpf_metadata *metadata)
> > +{
> > +     if (metadata == NULL)
> > +             return;
> > +     for (__u32 index = 0; index < metadata->nr_prog_names; index++)
> > +             free(metadata->prog_names[index]);
> > +     if (metadata->prog_names != NULL)
> > +             free(metadata->prog_names);
> > +     if (metadata->event != NULL)
> > +             free(metadata->event);
>
> No need to NULL change for free().

I've removed the NULL checks.

> > +static int synthesize_perf_record_bpf_metadata(
> > [...]
> > +     for (__u32 index = 0; index < metadata->nr_prog_names; index++) {
> > +             memcpy(event->bpf_metadata.prog_name,
> > +                    metadata->prog_names[index], BPF_PROG_NAME_LEN);
>
> Is it possible to call synthesize_bpf_prog_name() directly to the
> event->bpf_metadata.prog_name instead of saving it metadata->prog_names?

Not with the way the code is currently structured - we need the BTF data
to call synthesize_bpf_prog_name(), and that's allocated and freed inside
of bpf_metadata_create().

Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-03 21:27     ` Blake Jones
@ 2025-06-03 21:44       ` Namhyung Kim
  2025-06-03 21:54         ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2025-06-03 21:44 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

Hi Blake,

On Tue, Jun 03, 2025 at 02:27:53PM -0700, Blake Jones wrote:
> Hi Namhyung,
> 
> On Tue, Jun 3, 2025 at 1:15 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, May 21, 2025 at 03:27:24PM -0700, Blake Jones wrote:
> > > Look for .rodata maps, find ones with 'bpf_metadata_' variables, extract
> > > their values as strings, and create a new PERF_RECORD_BPF_METADATA
> > > synthetic event using that data. The code gets invoked from the existing
> > > routine perf_event__synthesize_one_bpf_prog().
> >
> > It would be great if you can show an example how those metadata is
> > constructed and shared between BPF programs.
> 
> I've added the following to my commit message:
> 
> | For example, a BPF program with the following variables:
> |
> |     const char bpf_metadata_version[] SEC(".rodata") = "3.14159";
> |     int bpf_metadata_value[] SEC(".rodata") = 42;
> |
> | would generate a PERF_RECORD_BPF_METADATA record with:
> |
> |     .prog_name        = <BPF program name, e.g. "bpf_prog_a1b2c3_foo">
> |     .nr_entries       = 2
> |     .entries[0].key   = "version"
> |     .entries[0].value = "3.14159"
> |     .entries[1].key   = "value"
> |     .entries[1].value = "42"
> |
> | Each of the BPF programs and subprograms that share those variables would
> | get a distinct PERF_RECORD_BPF_METADATA record, with the ".prog_name" showing
> | the name of each program or subprogram. The prog_name is deliberately the
> | same as the ".name" field in the corresponding PERF_RECORD_KSYMBOL record.

Thanks!

> 
> > IIUC the metadata is collected for each BPF program which may have
> > multiple subprograms.  Then this patch creates multiple PERF_RECORD_
> > BPF_METADATA for each subprogram, right?
> >
> > Can it be shared using the BPF program ID?
> 
> In theory, yes, it could be shared. But I want to be able to correlate them
> with the corresponding PERF_RECORD_KSYMBOL events, and KSYMBOL events for
> subprograms don't have the full-program ID, so I wouldn't be able to do that.

It's unfortunate that KSYMBOL doesn't have the program ID, but IIRC the
following BPF_EVENT should have it.  I think it's safe to think KSYMBOLs
belong to the BPF_EVENT when they are from the same thread.

> 
> > > +     rodata = calloc(1, map_info.value_size);
> >
> > You can use 'zalloc()' instead, in other places too.
> 
> Fixed, thanks.
> 
> > > +void bpf_metadata_free(struct bpf_metadata *metadata)
> > > +{
> > > +     if (metadata == NULL)
> > > +             return;
> > > +     for (__u32 index = 0; index < metadata->nr_prog_names; index++)
> > > +             free(metadata->prog_names[index]);
> > > +     if (metadata->prog_names != NULL)
> > > +             free(metadata->prog_names);
> > > +     if (metadata->event != NULL)
> > > +             free(metadata->event);
> >
> > No need to NULL change for free().
> 
> I've removed the NULL checks.
> 
> > > +static int synthesize_perf_record_bpf_metadata(
> > > [...]
> > > +     for (__u32 index = 0; index < metadata->nr_prog_names; index++) {
> > > +             memcpy(event->bpf_metadata.prog_name,
> > > +                    metadata->prog_names[index], BPF_PROG_NAME_LEN);
> >
> > Is it possible to call synthesize_bpf_prog_name() directly to the
> > event->bpf_metadata.prog_name instead of saving it metadata->prog_names?
> 
> Not with the way the code is currently structured - we need the BTF data
> to call synthesize_bpf_prog_name(), and that's allocated and freed inside
> of bpf_metadata_create().

I see.  You already freed the map data and BTF.  Ok, it's not a big deal
and probably not needed if we can switch to BPF ID.

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-03 21:44       ` Namhyung Kim
@ 2025-06-03 21:54         ` Blake Jones
  2025-06-03 22:09           ` Namhyung Kim
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-06-03 21:54 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

Hi Namhyung,

On Tue, Jun 3, 2025 at 2:44 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > IIUC the metadata is collected for each BPF program which may have
> > > multiple subprograms.  Then this patch creates multiple PERF_RECORD_
> > > BPF_METADATA for each subprogram, right?
> > >
> > > Can it be shared using the BPF program ID?
> >
> > In theory, yes, it could be shared. But I want to be able to correlate them
> > with the corresponding PERF_RECORD_KSYMBOL events, and KSYMBOL events for
> > subprograms don't have the full-program ID, so I wouldn't be able to do that.
>
> It's unfortunate that KSYMBOL doesn't have the program ID, but IIRC the
> following BPF_EVENT should have it.  I think it's safe to think KSYMBOLs
> belong to the BPF_EVENT when they are from the same thread.

Hmmm. Is that documented and tested anywhere? Offhand it sounds like an
implementation detail that I wouldn't feel great about depending on -
certainly not without a strong guarantee that it wouldn't change.

Can you say more about why the duplicated records concern you?

Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-03 21:54         ` Blake Jones
@ 2025-06-03 22:09           ` Namhyung Kim
  2025-06-03 22:29             ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2025-06-03 22:09 UTC (permalink / raw)
  To: Blake Jones
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Tue, Jun 03, 2025 at 02:54:50PM -0700, Blake Jones wrote:
> Hi Namhyung,
> 
> On Tue, Jun 3, 2025 at 2:44 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > IIUC the metadata is collected for each BPF program which may have
> > > > multiple subprograms.  Then this patch creates multiple PERF_RECORD_
> > > > BPF_METADATA for each subprogram, right?
> > > >
> > > > Can it be shared using the BPF program ID?
> > >
> > > In theory, yes, it could be shared. But I want to be able to correlate them
> > > with the corresponding PERF_RECORD_KSYMBOL events, and KSYMBOL events for
> > > subprograms don't have the full-program ID, so I wouldn't be able to do that.
> >
> > It's unfortunate that KSYMBOL doesn't have the program ID, but IIRC the
> > following BPF_EVENT should have it.  I think it's safe to think KSYMBOLs
> > belong to the BPF_EVENT when they are from the same thread.
> 
> Hmmm. Is that documented and tested anywhere? Offhand it sounds like an
> implementation detail that I wouldn't feel great about depending on -
> certainly not without a strong guarantee that it wouldn't change.

Good point.  Maybe BPF folks have some idea?

Anyway the current code generates them together in a function.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c?h=v6.15#n9825

> 
> Can you say more about why the duplicated records concern you?

More data means more chance to lost something.  I don't expect this is
gonna be a practical concern but in general we should pursue less data.

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-03 22:09           ` Namhyung Kim
@ 2025-06-03 22:29             ` Blake Jones
  2025-06-04 21:40               ` Namhyung Kim
  0 siblings, 1 reply; 28+ messages in thread
From: Blake Jones @ 2025-06-03 22:29 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Tue, Jun 3, 2025 at 3:10 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > Hmmm. Is that documented and tested anywhere? Offhand it sounds like an
> > implementation detail that I wouldn't feel great about depending on -
> > certainly not without a strong guarantee that it wouldn't change.
>
> Good point.  Maybe BPF folks have some idea?
>
> Anyway the current code generates them together in a function.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c?h=v6.15#n9825

It certainly does, yeah. But I don't want to have that become another
instance of https://www.hyrumslaw.com/.

> > Can you say more about why the duplicated records concern you?
>
> More data means more chance to lost something.  I don't expect this is
> gonna be a practical concern but in general we should pursue less data.

That makes sense. In this case, it will only show up for BPF programs that
define "bpf_metadata_" variables (which is already an opt-in action), and
the number of variables a given program defines is likely to be quite small.
So I think the cost of the marginal increase in data generated is outweighed
by the usability and reliability benefits of being able to match these events
1:1 with the KSYMBOL events. If this proves to be a problem in practice,
it can be revisited.

Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-03 22:29             ` Blake Jones
@ 2025-06-04 21:40               ` Namhyung Kim
  2025-06-04 22:12                 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2025-06-04 21:40 UTC (permalink / raw)
  To: Blake Jones, Song Liu, Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Ian Rogers, Adrian Hunter, Kan Liang,
	Chun-Tse Shao, Zhongqiu Han, James Clark, Charlie Jenkins,
	Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu, Graham Woodward,
	Yicong Yang, Ben Gainey, linux-kernel, bpf, linux-perf-users

On Tue, Jun 03, 2025 at 03:29:35PM -0700, Blake Jones wrote:
> On Tue, Jun 3, 2025 at 3:10 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > Hmmm. Is that documented and tested anywhere? Offhand it sounds like an
> > > implementation detail that I wouldn't feel great about depending on -
> > > certainly not without a strong guarantee that it wouldn't change.
> >
> > Good point.  Maybe BPF folks have some idea?
> >
> > Anyway the current code generates them together in a function.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c?h=v6.15#n9825
> 
> It certainly does, yeah. But I don't want to have that become another
> instance of https://www.hyrumslaw.com/.

Thanks for sharing this.

I'm curious about the semantics of the KSYMBOL and BPF_EVENT.  And I
feel like there should be a connection between them.

Song and Jiri, what do you think?

Thanks,
Namhyung

> 
> > > Can you say more about why the duplicated records concern you?
> >
> > More data means more chance to lost something.  I don't expect this is
> > gonna be a practical concern but in general we should pursue less data.
> 
> That makes sense. In this case, it will only show up for BPF programs that
> define "bpf_metadata_" variables (which is already an opt-in action), and
> the number of variables a given program defines is likely to be quite small.
> So I think the cost of the marginal increase in data generated is outweighed
> by the usability and reliability benefits of being able to match these events
> 1:1 with the KSYMBOL events. If this proves to be a problem in practice,
> it can be revisited.
> 
> Blake

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-04 21:40               ` Namhyung Kim
@ 2025-06-04 22:12                 ` Arnaldo Carvalho de Melo
  2025-06-04 23:04                   ` Blake Jones
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-06-04 22:12 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Blake Jones, Song Liu, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Wed, Jun 04, 2025 at 02:40:34PM -0700, Namhyung Kim wrote:
> On Tue, Jun 03, 2025 at 03:29:35PM -0700, Blake Jones wrote:
> > On Tue, Jun 3, 2025 at 3:10 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > Hmmm. Is that documented and tested anywhere? Offhand it sounds like an
> > > > implementation detail that I wouldn't feel great about depending on -
> > > > certainly not without a strong guarantee that it wouldn't change.

> > > Good point.  Maybe BPF folks have some idea?

> > > Anyway the current code generates them together in a function.

> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c?h=v6.15#n9825

> > It certainly does, yeah. But I don't want to have that become another
> > instance of https://www.hyrumslaw.com/.

> Thanks for sharing this.

> I'm curious about the semantics of the KSYMBOL and BPF_EVENT.  And I
> feel like there should be a connection between them.

So, the comment in:

tools/perf/util/bpf-event.c

Is:

 * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
 * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
 * one PERF_RECORD_KSYMBOL is generated for each sub program.

which is not so nicely worded tho :-\

"One KSYMBOL per program", followed by "one KSYMBOL per sub program".

But that matches the referenced:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c?h=v6.15#n9825

So, for these bpf_metadata_ variables, would that be strictly per
program or would it be perf 'sub program'?

Couldn't get an answer from looking at tools/bpf/bpftool/prog.c, but
seems to be with progs, not subprogs, i.e. just the PERF_RECORD_KSYMBOL
associated with progs (not subprogs) will have those variables.

But then it seems those variables _are_ associated with at least one
PERF_RECORD_KSYMBOL, right?

- Arnaldo

> Song and Jiri, what do you think?

> Thanks,
> Namhyung

> > > > Can you say more about why the duplicated records concern you?
> > >
> > > More data means more chance to lost something.  I don't expect this is
> > > gonna be a practical concern but in general we should pursue less data.
> > 
> > That makes sense. In this case, it will only show up for BPF programs that
> > define "bpf_metadata_" variables (which is already an opt-in action), and
> > the number of variables a given program defines is likely to be quite small.
> > So I think the cost of the marginal increase in data generated is outweighed
> > by the usability and reliability benefits of being able to match these events
> > 1:1 with the KSYMBOL events. If this proves to be a problem in practice,
> > it can be revisited.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] perf: collect BPF metadata from existing BPF programs
  2025-06-04 22:12                 ` Arnaldo Carvalho de Melo
@ 2025-06-04 23:04                   ` Blake Jones
  0 siblings, 0 replies; 28+ messages in thread
From: Blake Jones @ 2025-06-04 23:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Song Liu, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
	Kan Liang, Chun-Tse Shao, Zhongqiu Han, James Clark,
	Charlie Jenkins, Andi Kleen, Dmitry Vyukov, Leo Yan, Yujie Liu,
	Graham Woodward, Yicong Yang, Ben Gainey, linux-kernel, bpf,
	linux-perf-users

On Wed, Jun 4, 2025 at 3:12 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> So, the comment in:
>
> tools/perf/util/bpf-event.c
>
> Is:
>
>  * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
>  * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
>  * one PERF_RECORD_KSYMBOL is generated for each sub program.
>
> which is not so nicely worded tho :-\
>
> "One KSYMBOL per program", followed by "one KSYMBOL per sub program".
>
> But that matches the referenced:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c?h=v6.15#n9825

Indeed, we get one KSYMBOL per subprogram, but one BPF_EVENT per program.
(I found that the term "subprogram" is used inconsistently; sometimes it
includes the main program, while other times it doesn't. In the case of
that comment, and the function perf_event__synthesize_one_bpf_prog, it does
include the main program, which has sub_id = 0.)

> So, for these bpf_metadata_ variables, would that be strictly per
> program or would it be perf 'sub program'?

These BPF metadata records would be generated for each subprogram.

My goal here is to allow PERF_RECORD_SAMPLE events with IPs in BPF code
to be associated with any metadata that describes that code. As the
comment points out, each subprogram gets its own PERF_RECORD_KSYMBOL event,
and that event indicates where the subprogram's starting and ending
addresses are. With one PERF_RECORD_BPF_METADATA event per subprogram, it's
then straightforward to associate the metadata with a range of IPs.

If I only generated the metadata records per program rather than per
subprogram, I could only associate them with a PERF_RECORD_BPF_EVENT event.
That event has the full program's tag, but it doesn't have a list of the
subprogram tags for that program. So it doesn't have enough information on
its own to construct the relevant list of virtual address ranges. And I'd
be quite concerned about assuming that the BPF_EVENT events immediately
follow their related KSYMBOL events, especially for events generated while
the system was generating SAMPLE events.

Blake

> Couldn't get an answer from looking at tools/bpf/bpftool/prog.c, but
> seems to be with progs, not subprogs, i.e. just the PERF_RECORD_KSYMBOL
> associated with progs (not subprogs) will have those variables.
>
> But then it seems those variables _are_ associated with at least one
> PERF_RECORD_KSYMBOL, right?

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2025-06-04 23:04 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-21 22:27 [PATCH 0/3] perf: generate events for BPF metadata Blake Jones
2025-05-21 22:27 ` [PATCH 1/3] perf: add support for printing BTF character arrays as strings Blake Jones
2025-05-22 17:42   ` Arnaldo Carvalho de Melo
2025-05-22 18:19     ` Blake Jones
2025-05-29  0:58       ` Blake Jones
2025-05-30 17:40         ` Namhyung Kim
2025-05-31  7:26           ` Blake Jones
2025-06-03 18:23             ` Blake Jones
2025-06-03 18:43               ` Alexei Starovoitov
2025-06-03 19:47                 ` Namhyung Kim
2025-05-21 22:27 ` [PATCH 2/3] perf: collect BPF metadata from existing BPF programs Blake Jones
2025-05-29 17:47   ` Ian Rogers
2025-05-29 23:21     ` Blake Jones
2025-05-29 23:23       ` Ian Rogers
2025-06-03 20:15   ` Namhyung Kim
2025-06-03 21:27     ` Blake Jones
2025-06-03 21:44       ` Namhyung Kim
2025-06-03 21:54         ` Blake Jones
2025-06-03 22:09           ` Namhyung Kim
2025-06-03 22:29             ` Blake Jones
2025-06-04 21:40               ` Namhyung Kim
2025-06-04 22:12                 ` Arnaldo Carvalho de Melo
2025-06-04 23:04                   ` Blake Jones
2025-05-21 22:27 ` [PATCH 3/3] perf: collect BPF metadata from new programs, and display the new event Blake Jones
2025-05-29 18:12   ` Ian Rogers
2025-05-29 23:09     ` Blake Jones
2025-05-29 23:27       ` Ian Rogers
2025-05-29 23:49         ` Blake Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).