public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] libbpf/bpftool: support merging split BTFs
@ 2026-02-20 19:40 Josef Bacik
  2026-02-20 19:40 ` [PATCH v2 1/3] libbpf: support appending split BTF in btf__add_btf() Josef Bacik
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Josef Bacik @ 2026-02-20 19:40 UTC (permalink / raw)
  To: bpf

v1: https://lore.kernel.org/bpf/cover.1771605821.git.josef@toxicpanda.com/

v1->v2:
- Added a btf__dedup() call to btf__add_btf() to ensure that we don't have
  duplicate types in the merged BTF.

--- Original email ---

Hello,

I'm extending systing to do introspection on vfio devices, which requires having
the structs I need from the kernel available in userspace. Normally these are
loadable modules, but in the case of vfio there's multiple structs across
multiple modules. Normally you'd do the following to generate your vmlinux.h
with a module

bpftool btf dump file /sys/kernel/btf/<module> format c \
	--base /sys/kernel/btf/vmlinux > vmlinux.h

but if you need multiple modules you have to hack together multiple dumps and
merge them together. This patch series adds support for merging multiple BTF
sources together, so you can do

bpftool btf dump file /sys/kernel/btf/<module1> \
	file /sys/kernel/btf/<module2> format c \
	--base /sys/kernel/btf/vmlinux > vmlinux.h

I tested this with my usecase and it works. Thanks,

Josef

Josef Bacik (3):
  libbpf: support appending split BTF in btf__add_btf()
  bpftool: support merging multiple module BTFs in btf dump
  selftests/bpf: add test for btf__add_btf() with split BTF sources

 tools/bpf/bpftool/btf.c                       | 135 ++++++++++++++++--
 tools/lib/bpf/btf.c                           |  38 +++--
 .../selftests/bpf/prog_tests/btf_write.c      |  89 ++++++++++++
 3 files changed, 241 insertions(+), 21 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/3] libbpf: support appending split BTF in btf__add_btf()
  2026-02-20 19:40 [PATCH v2 0/3] libbpf/bpftool: support merging split BTFs Josef Bacik
@ 2026-02-20 19:40 ` Josef Bacik
  2026-02-20 20:17   ` bot+bpf-ci
  2026-02-20 19:40 ` [PATCH v2 2/3] bpftool: support merging multiple module BTFs in btf dump Josef Bacik
  2026-02-20 19:40 ` [PATCH v2 3/3] selftests/bpf: add test for btf__add_btf() with split BTF sources Josef Bacik
  2 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2026-02-20 19:40 UTC (permalink / raw)
  To: bpf; +Cc: Claude Opus 4.6

btf__add_btf() currently rejects split BTF sources with -ENOTSUP.
This prevents merging types from multiple kernel module BTFs that
are all split against the same vmlinux base.

Extend btf__add_btf() to handle split BTF sources by:

- Replacing the blanket -ENOTSUP with a validation that src and dst
  share the same base BTF pointer when both are split.

- Computing src_start_id from the source's base to distinguish base
  type ID references (which must remain unchanged) from split type
  IDs (which must be remapped to new positions in the destination).

- Using src_btf->nr_types instead of btf__type_cnt()-1 for the type
  count, which is correct for both split and non-split sources.

- Pre-emptively calling btf_ensure_modifiable() on the destination's
  base BTF to prevent a use-after-free: btf_rewrite_str() resolves
  strings via btf__str_by_offset(src) which may return pointers into
  the shared base's string data; btf__add_str(dst) then calls
  btf__find_str(base) which can trigger btf_ensure_modifiable(base),
  reallocating that string data and invalidating the pointer.

For non-split sources the behavior is identical: src_start_id is 1,
the type_id < 1 guard is never true (VOID is already skipped), and
the remapping formula reduces to the original.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
 tools/lib/bpf/btf.c | 38 ++++++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 83fe79ffcb8f..0b0bb5cba22b 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -2004,24 +2004,41 @@ int btf__add_btf(struct btf *btf, const struct btf *src_btf)
 {
 	struct btf_pipe p = { .src = src_btf, .dst = btf };
 	int data_sz, sz, cnt, i, err, old_strs_len;
+	__u32 src_start_id;
 	__u32 *off;
 	void *t;
 
-	/* appending split BTF isn't supported yet */
-	if (src_btf->base_btf)
-		return libbpf_err(-ENOTSUP);
+	/* When appending split BTF, the destination must share the same base
+	 * BTF so that base type ID references remain valid.
+	 */
+	if (src_btf->base_btf && btf->base_btf &&
+	    src_btf->base_btf != btf->base_btf)
+		return libbpf_err(-EINVAL);
+
+	src_start_id = src_btf->base_btf ? btf__type_cnt(src_btf->base_btf) : 1;
 
 	/* deconstruct BTF, if necessary, and invalidate raw_data */
 	if (btf_ensure_modifiable(btf))
 		return libbpf_err(-ENOMEM);
 
+	/* If dst has a base BTF, ensure it is modifiable now.
+	 * btf_rewrite_str() resolves strings via btf__str_by_offset(src),
+	 * which for split src may return pointers into dst's base string
+	 * data.  btf__add_str(dst) then calls btf__find_str(dst->base),
+	 * which triggers btf_ensure_modifiable(base) and may reallocate
+	 * the base string data, invalidating the pointer.  Pre-emptively
+	 * making the base modifiable avoids this use-after-free.
+	 */
+	if (btf->base_btf && btf_ensure_modifiable(btf->base_btf))
+		return libbpf_err(-ENOMEM);
+
 	/* remember original strings section size if we have to roll back
 	 * partial strings section changes
 	 */
 	old_strs_len = btf->hdr->str_len;
 
 	data_sz = src_btf->hdr->type_len;
-	cnt = btf__type_cnt(src_btf) - 1;
+	cnt = src_btf->nr_types;
 
 	/* pre-allocate enough memory for new types */
 	t = btf_add_type_mem(btf, data_sz);
@@ -2074,11 +2091,16 @@ int btf__add_btf(struct btf *btf, const struct btf *src_btf)
 			if (!*type_id) /* nothing to do for VOID references */
 				continue;
 
-			/* we haven't updated btf's type count yet, so
-			 * btf->start_id + btf->nr_types - 1 is the type ID offset we should
-			 * add to all newly added BTF types
+			/* For split BTF sources, type IDs referencing the
+			 * base BTF (< src_start_id) remain unchanged since
+			 * dst shares the same base.  Only remap IDs in the
+			 * split portion.  For non-split sources src_start_id
+			 * is 1 so all IDs are remapped as before.
 			 */
-			*type_id += btf->start_id + btf->nr_types - 1;
+			if (*type_id < src_start_id)
+				continue;
+
+			*type_id += btf->start_id + btf->nr_types - src_start_id;
 		}
 
 		/* go to next type data and type offset index entry */
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/3] bpftool: support merging multiple module BTFs in btf dump
  2026-02-20 19:40 [PATCH v2 0/3] libbpf/bpftool: support merging split BTFs Josef Bacik
  2026-02-20 19:40 ` [PATCH v2 1/3] libbpf: support appending split BTF in btf__add_btf() Josef Bacik
@ 2026-02-20 19:40 ` Josef Bacik
  2026-02-20 20:17   ` bot+bpf-ci
  2026-02-20 19:40 ` [PATCH v2 3/3] selftests/bpf: add test for btf__add_btf() with split BTF sources Josef Bacik
  2 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2026-02-20 19:40 UTC (permalink / raw)
  To: bpf; +Cc: Claude Opus 4.6

Add support for specifying multiple file sources in 'bpftool btf dump'
to generate a single C header containing types from vmlinux plus
multiple kernel modules:

  bpftool btf dump file /sys/kernel/btf/mod1 file /sys/kernel/btf/mod2 format c

This is useful for BPF programs that need to access types defined in
kernel modules. Previously this required a separate bpftool invocation
for each module, producing separate headers that could not be combined
due to overlapping vmlinux type definitions.

The implementation collects all file paths, then for the multi-file
case creates an empty split BTF on the vmlinux base and iteratively
merges each module's types into it via btf__add_btf(). The single-file
code path is preserved exactly to avoid any regression risk.

Auto-detection of vmlinux as the base BTF from sysfs paths works as
before. If vmlinux itself appears in the file list it is skipped with
a warning since its types are already provided by the base.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
 tools/bpf/bpftool/btf.c | 135 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 122 insertions(+), 13 deletions(-)

diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index 946612029dee..b925923c6fb6 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -28,6 +28,7 @@
 #define FASTCALL_DECL_TAG	"bpf_fastcall"
 
 #define MAX_ROOT_IDS		16
+#define MAX_BTF_FILES		64
 
 static const char * const btf_kind_str[NR_BTF_KINDS] = {
 	[BTF_KIND_UNKN]		= "UNKNOWN",
@@ -958,20 +959,127 @@ static int do_dump(int argc, char **argv)
 		NEXT_ARG();
 	} else if (is_prefix(src, "file")) {
 		const char sysfs_prefix[] = "/sys/kernel/btf/";
+		const char *files[MAX_BTF_FILES];
+		int nr_files = 0;
 
-		if (!base_btf &&
-		    strncmp(*argv, sysfs_prefix, sizeof(sysfs_prefix) - 1) == 0 &&
-		    strcmp(*argv, sysfs_vmlinux) != 0)
-			base = get_vmlinux_btf_from_sysfs();
-
-		btf = btf__parse_split(*argv, base ?: base_btf);
-		if (!btf) {
-			err = -errno;
-			p_err("failed to load BTF from %s: %s",
-			      *argv, strerror(errno));
-			goto done;
-		}
+		files[nr_files++] = *argv;
 		NEXT_ARG();
+
+		/* Collect additional file arguments. Use strcmp (exact match)
+		 * to avoid ambiguity with file paths that are prefixes of
+		 * "file" (e.g., "./f").
+		 */
+		while (argc && strcmp(*argv, "file") == 0) {
+			NEXT_ARG();
+			if (!REQ_ARGS(1)) {
+				err = -EINVAL;
+				goto done;
+			}
+			if (nr_files >= MAX_BTF_FILES) {
+				p_err("too many BTF files (max %d)",
+				      MAX_BTF_FILES);
+				err = -E2BIG;
+				goto done;
+			}
+			files[nr_files++] = *argv;
+			NEXT_ARG();
+		}
+
+		if (nr_files == 1) {
+			/* Single file — preserve existing behavior exactly */
+			if (!base_btf &&
+			    strncmp(files[0], sysfs_prefix,
+				    sizeof(sysfs_prefix) - 1) == 0 &&
+			    strcmp(files[0], sysfs_vmlinux) != 0)
+				base = get_vmlinux_btf_from_sysfs();
+
+			btf = btf__parse_split(files[0], base ?: base_btf);
+			if (!btf) {
+				err = -errno;
+				p_err("failed to load BTF from %s: %s",
+				      files[0], strerror(errno));
+				goto done;
+			}
+		} else {
+			struct btf *vmlinux_base = base_btf;
+			struct btf *combined, *mod;
+			int j, ret;
+
+			/* Auto-detect vmlinux base from sysfs if needed */
+			if (!vmlinux_base) {
+				for (j = 0; j < nr_files; j++) {
+					if (strncmp(files[j], sysfs_prefix,
+						    sizeof(sysfs_prefix) - 1) == 0 &&
+					    strcmp(files[j], sysfs_vmlinux) != 0) {
+						base = get_vmlinux_btf_from_sysfs();
+						vmlinux_base = base;
+						break;
+					}
+				}
+			}
+			if (!vmlinux_base) {
+				p_err("base BTF is required when merging multiple BTF files; use -B/--base-btf or use sysfs paths");
+				err = -EINVAL;
+				goto done;
+			}
+
+			/* Filter out vmlinux from the file list */
+			for (j = 0; j < nr_files; j++) {
+				if (strcmp(files[j], sysfs_vmlinux) == 0) {
+					p_info("skipping %s (already loaded as base)",
+					       sysfs_vmlinux);
+					memmove(&files[j], &files[j + 1],
+						(nr_files - j - 1) * sizeof(files[0]));
+					nr_files--;
+					j--;
+				}
+			}
+			if (nr_files == 0) {
+				p_err("no module BTF files to merge (all paths were vmlinux)");
+				err = -EINVAL;
+				goto done;
+			}
+
+			combined = btf__new_empty_split(vmlinux_base);
+			if (!combined) {
+				p_err("failed to create combined BTF: %s",
+				      strerror(errno));
+				err = -errno;
+				goto done;
+			}
+
+			for (j = 0; j < nr_files; j++) {
+				mod = btf__parse_split(files[j], vmlinux_base);
+				if (!mod) {
+					p_err("failed to load BTF from %s: %s",
+					      files[j], strerror(errno));
+					btf__free(combined);
+					err = -errno;
+					goto done;
+				}
+
+				ret = btf__add_btf(combined, mod);
+				btf__free(mod);
+				if (ret < 0) {
+					p_err("failed to merge BTF from %s: %s",
+					      files[j], strerror(-ret));
+					btf__free(combined);
+					err = ret;
+					goto done;
+				}
+			}
+
+			ret = btf__dedup(combined, NULL);
+			if (ret) {
+				p_err("failed to dedup combined BTF: %s",
+				      strerror(-ret));
+				btf__free(combined);
+				err = ret;
+				goto done;
+			}
+
+			btf = combined;
+		}
 	} else {
 		err = -1;
 		p_err("unrecognized BTF source specifier: '%s'", src);
@@ -1445,7 +1553,8 @@ static int do_help(int argc, char **argv)
 		"       %1$s %2$s dump BTF_SRC [format FORMAT] [root_id ROOT_ID]\n"
 		"       %1$s %2$s help\n"
 		"\n"
-		"       BTF_SRC := { id BTF_ID | prog PROG | map MAP [{key | value | kv | all}] | file FILE }\n"
+		"       BTF_SRC := { id BTF_ID | prog PROG | map MAP [{key | value | kv | all}] |\n"
+		"                    file FILE [file FILE]... }\n"
 		"       FORMAT  := { raw | c [unsorted] }\n"
 		"       " HELP_SPEC_MAP "\n"
 		"       " HELP_SPEC_PROGRAM "\n"
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/3] selftests/bpf: add test for btf__add_btf() with split BTF sources
  2026-02-20 19:40 [PATCH v2 0/3] libbpf/bpftool: support merging split BTFs Josef Bacik
  2026-02-20 19:40 ` [PATCH v2 1/3] libbpf: support appending split BTF in btf__add_btf() Josef Bacik
  2026-02-20 19:40 ` [PATCH v2 2/3] bpftool: support merging multiple module BTFs in btf dump Josef Bacik
@ 2026-02-20 19:40 ` Josef Bacik
  2 siblings, 0 replies; 6+ messages in thread
From: Josef Bacik @ 2026-02-20 19:40 UTC (permalink / raw)
  To: bpf; +Cc: Claude Opus 4.6

Add a test that verifies btf__add_btf() correctly handles merging
multiple split BTF objects that share the same base BTF. The test
creates two sibling split BTFs on a common base, merges them into
a combined split BTF, and validates that base type references are
preserved while split type references are properly remapped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
 .../selftests/bpf/prog_tests/btf_write.c      | 89 +++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_write.c b/tools/testing/selftests/bpf/prog_tests/btf_write.c
index 6e36de1302fc..80353a545cdd 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_write.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_write.c
@@ -497,10 +497,99 @@ static void test_btf_add_btf()
 	btf__free(btf2);
 }
 
+static void test_btf_add_btf_split()
+{
+	struct btf *base = NULL, *split1 = NULL, *split2 = NULL;
+	struct btf *combined = NULL;
+	int id;
+
+	/* Create a base BTF with an INT and a PTR to it */
+	base = btf__new_empty();
+	if (!ASSERT_OK_PTR(base, "base"))
+		return;
+
+	id = btf__add_int(base, "int", 4, BTF_INT_SIGNED);
+	ASSERT_EQ(id, 1, "base_int_id");
+	id = btf__add_ptr(base, 1);
+	ASSERT_EQ(id, 2, "base_ptr_id");
+
+	/* base has 2 types, type IDs 1..2 */
+	ASSERT_EQ(btf__type_cnt(base), 3, "base_type_cnt");
+
+	/* Create split1 on base: a STRUCT referencing base's int (ID 1) */
+	split1 = btf__new_empty_split(base);
+	if (!ASSERT_OK_PTR(split1, "split1"))
+		goto cleanup;
+
+	id = btf__add_struct(split1, "s1", 4);
+	/* split types start at base_type_cnt = 3 */
+	ASSERT_EQ(id, 3, "split1_struct_id");
+	btf__add_field(split1, "x", 1, 0, 0); /* refers to base int */
+
+	id = btf__add_ptr(split1, 3);
+	ASSERT_EQ(id, 4, "split1_ptr_id"); /* ptr to the struct (split self-ref) */
+
+	/* Create split2 on base: a TYPEDEF referencing base's ptr (ID 2) */
+	split2 = btf__new_empty_split(base);
+	if (!ASSERT_OK_PTR(split2, "split2"))
+		goto cleanup;
+
+	id = btf__add_typedef(split2, "int_ptr", 2); /* refers to base ptr */
+	ASSERT_EQ(id, 3, "split2_typedef_id");
+
+	id = btf__add_struct(split2, "s2", 8);
+	ASSERT_EQ(id, 4, "split2_struct_id");
+	btf__add_field(split2, "p", 3, 0, 0); /* refers to split2's own typedef */
+
+	/* Create combined split BTF on same base and merge both */
+	combined = btf__new_empty_split(base);
+	if (!ASSERT_OK_PTR(combined, "combined"))
+		goto cleanup;
+
+	/* Merge split1: its types (3,4) should land at IDs 3,4 in combined */
+	id = btf__add_btf(combined, split1);
+	if (!ASSERT_GE(id, 0, "add_split1"))
+		goto cleanup;
+	ASSERT_EQ(id, 3, "split1_first_id");
+
+	/* Merge split2: its types (3,4) should be remapped to IDs 5,6 */
+	id = btf__add_btf(combined, split2);
+	if (!ASSERT_GE(id, 0, "add_split2"))
+		goto cleanup;
+	ASSERT_EQ(id, 5, "split2_first_id");
+
+	/* combined should have: base (2 types) + split1 (2) + split2 (2) = 6 types + void */
+	ASSERT_EQ(btf__type_cnt(combined), 7, "combined_type_cnt");
+
+	VALIDATE_RAW_BTF(
+		combined,
+		/* base types (IDs 1-2) */
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=1",
+
+		/* split1 types (IDs 3-4): base refs unchanged */
+		"[3] STRUCT 's1' size=4 vlen=1\n"
+		"\t'x' type_id=1 bits_offset=0",      /* refers to base int=1 */
+		"[4] PTR '(anon)' type_id=3",          /* refers to split1's struct=3 */
+
+		/* split2 types (IDs 5-6): remapped from 3,4 to 5,6 */
+		"[5] TYPEDEF 'int_ptr' type_id=2",     /* base ptr=2, unchanged */
+		"[6] STRUCT 's2' size=8 vlen=1\n"
+		"\t'p' type_id=5 bits_offset=0");      /* split2 typedef: 3->5 */
+
+cleanup:
+	btf__free(combined);
+	btf__free(split2);
+	btf__free(split1);
+	btf__free(base);
+}
+
 void test_btf_write()
 {
 	if (test__start_subtest("btf_add"))
 		test_btf_add();
 	if (test__start_subtest("btf_add_btf"))
 		test_btf_add_btf();
+	if (test__start_subtest("btf_add_btf_split"))
+		test_btf_add_btf_split();
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/3] bpftool: support merging multiple module BTFs in btf dump
  2026-02-20 19:40 ` [PATCH v2 2/3] bpftool: support merging multiple module BTFs in btf dump Josef Bacik
@ 2026-02-20 20:17   ` bot+bpf-ci
  0 siblings, 0 replies; 6+ messages in thread
From: bot+bpf-ci @ 2026-02-20 20:17 UTC (permalink / raw)
  To: josef, bpf
  Cc: noreply, ast, andrii, daniel, martin.lau, eddyz87, yonghong.song,
	clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 2156 bytes --]

> diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
> --- a/tools/bpf/bpftool/btf.c
> +++ b/tools/bpf/bpftool/btf.c

[ ... ]

> +		if (nr_files == 1) {
> +			/* Single file — preserve existing behavior exactly */

[ ... ]

> +			btf = btf__parse_split(files[0], base ?: base_btf);
> +			if (!btf) {
> +				err = -errno;
> +				p_err("failed to load BTF from %s: %s",
> +				      files[0], strerror(errno));
> +				goto done;
> +			}
> +		} else {

[ ... ]

> +			combined = btf__new_empty_split(vmlinux_base);
> +			if (!combined) {
> +				p_err("failed to create combined BTF: %s",
> +				      strerror(errno));
> +				err = -errno;
> +				goto done;
> +			}

The single-file path above captures err = -errno before calling
p_err().  Here the order is reversed -- p_err() calls fprintf which
can modify errno before it is captured.  Could err end up with the
wrong value here?

> +			for (j = 0; j < nr_files; j++) {
> +				mod = btf__parse_split(files[j], vmlinux_base);
> +				if (!mod) {
> +					p_err("failed to load BTF from %s: %s",
> +					      files[j], strerror(errno));
> +					btf__free(combined);
> +					err = -errno;
> +					goto done;
> +				}

Same ordering concern here, and btf__free(combined) sits between the
failing call and the errno capture as well.  btf__free() calls close()
on the BTF fd which can also clobber errno.  If a file fails to parse,
err might end up as 0 or some unrelated value instead of the actual
error, giving callers and scripts a wrong exit code.

[ ... ]


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22238863764

AI-authorship-score: high
AI-authorship-explanation: Commit explicitly includes Co-Authored-By: Claude Opus 4.6 tag indicating direct AI involvement in writing the code.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: errno is captured after p_err() and btf__free() calls in the multi-file error paths, which can clobber errno and produce a wrong exit code from bpftool.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/3] libbpf: support appending split BTF in btf__add_btf()
  2026-02-20 19:40 ` [PATCH v2 1/3] libbpf: support appending split BTF in btf__add_btf() Josef Bacik
@ 2026-02-20 20:17   ` bot+bpf-ci
  0 siblings, 0 replies; 6+ messages in thread
From: bot+bpf-ci @ 2026-02-20 20:17 UTC (permalink / raw)
  To: josef, bpf
  Cc: noreply, ast, andrii, daniel, martin.lau, eddyz87, yonghong.song,
	clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 3590 bytes --]

> Pre-emptively calling btf_ensure_modifiable() on the destination's
> base BTF to prevent a use-after-free: btf_rewrite_str() resolves
> strings via btf__str_by_offset(src) which may return pointers into
> the shared base's string data; btf__add_str(dst) then calls
> btf__find_str(base) which can trigger btf_ensure_modifiable(base),
> reallocating that string data and invalidating the pointer.

Is this use-after-free scenario accurate?

When btf_ensure_modifiable() is called on a BTF that is not yet
modifiable, it takes the full transition path (btf_is_modifiable()
returns false):

btf_ensure_modifiable() {
    if (btf_is_modifiable(btf)) {
        btf_invalidate_raw_data(btf);   /* frees raw_data */
        return 0;
    }
    /* full transition -- does NOT call btf_invalidate_raw_data */
    hdr = malloc(...);
    types = malloc(...);
    set = strset__new(..., btf->strs_data, ...);
    ...
    btf->strs_data = NULL;
    btf->strs_set = set;
}

strset__new() copies the string data into a new allocation:

strset__new() {
    set->strs_data = malloc(init_data_sz);
    memcpy(set->strs_data, init_data, init_data_sz);
}

Since btf_invalidate_raw_data() is only called when btf_is_modifiable()
returns true, the first call to btf_ensure_modifiable(base) preserves
raw_data.  The pointer returned by btf__str_by_offset(src) still
points into valid memory in base->raw_data.

On subsequent iterations, btf__str_by_offset() returns from
strset__data(base->strs_set) rather than raw_data, so even when
btf_invalidate_raw_data() is eventually called through the
already-modifiable path, the string pointer is not affected.

Could the comment be updated to reflect the actual reason for the
pre-emptive call?

> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 83fe79ffcb8f..0b0bb5cba22b 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c

[ ... ]

> @@ -2004,24 +2004,41 @@ int btf__add_btf(struct btf *btf, const struct btf *src_btf)

[ ... ]

> +	/* If dst has a base BTF, ensure it is modifiable now.
> +	 * btf_rewrite_str() resolves strings via btf__str_by_offset(src),
> +	 * which for split src may return pointers into dst's base string
> +	 * data.  btf__add_str(dst) then calls btf__find_str(dst->base),
> +	 * which triggers btf_ensure_modifiable(base) and may reallocate
> +	 * the base string data, invalidating the pointer.  Pre-emptively
> +	 * making the base modifiable avoids this use-after-free.
> +	 */
> +	if (btf->base_btf && btf_ensure_modifiable(btf->base_btf))
> +		return libbpf_err(-ENOMEM);

Same question as above for this comment.  The first call to
btf_ensure_modifiable() on a not-yet-modifiable base does not
free or reallocate the existing string data -- it copies it via
strset__new() and leaves raw_data intact.  The described pointer
invalidation does not appear to occur.


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22238863764

AI-authorship-score: high
AI-authorship-explanation: The commit has an explicit Co-Authored-By: Claude Opus 4.6 tag, and the commit message style with structured bullet points and precise technical descriptions is characteristic of AI-generated content.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: The commit message and code comment describe a use-after-free that does not actually occur; the code change itself is functionally correct but the justification is inaccurate.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-02-20 20:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-20 19:40 [PATCH v2 0/3] libbpf/bpftool: support merging split BTFs Josef Bacik
2026-02-20 19:40 ` [PATCH v2 1/3] libbpf: support appending split BTF in btf__add_btf() Josef Bacik
2026-02-20 20:17   ` bot+bpf-ci
2026-02-20 19:40 ` [PATCH v2 2/3] bpftool: support merging multiple module BTFs in btf dump Josef Bacik
2026-02-20 20:17   ` bot+bpf-ci
2026-02-20 19:40 ` [PATCH v2 3/3] selftests/bpf: add test for btf__add_btf() with split BTF sources Josef Bacik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox