* [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search
@ 2025-11-04 13:40 Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function Donglin Peng
` (6 more replies)
0 siblings, 7 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast; +Cc: linux-kernel, bpf, Donglin Peng
This patch series introduces significant performance improvements for BTF
type lookups by implementing type permutation and binary search optimizations.
## Overview
The series addresses the performance limitations of linear search in large
BTF instances by:
1. Adding BTF permutation support - Allows rearranging BTF types
2. Implementing binary search optimization - Dramatically improves lookup
performance for sorted BTF types
## Key Changes
### Patch 1: libbpf: Extract BTF type remapping logic into helper function
- Refactors existing code to eliminate duplication
- Improves modularity and maintainability
- Prepares foundation for permutation functionality
### Patch 2: libbpf: Add BTF permutation support for type reordering
- Introduces `btf__permute()` API for in-place type rearrangement
- Handles both main BTF and extension data
- Maintains type reference consistency after permutation
### Patch 3: libbpf: Optimize type lookup with binary search for sorted BTF
- Implements binary search algorithm for sorted BTF instances
- Maintains linear search fallback for compatibility
- Significant performance improvement for large BTF with sorted types
### Patch 4: libbpf: Implement lazy sorting validation for binary search optimization
- Adds on-demand sorting verification
- Caches results for efficient repeated lookups
### Patch 5: btf: Optimize type lookup with binary search
- Ports binary search optimization to kernel-side BTF implementation
- Maintains full backward compatibility
### Patch 6: btf: Add lazy sorting validation for binary search
- Implements kernel-side lazy sorting detection
- Mirrors user-space implementation for consistency
### Patch 7: selftests/bpf: Add test cases for btf__permute functionality
- Validates both base BTF and split BTF scenarios
## Performance Impact Analysis
Repo: https://github.com/pengdonglin137/btf_sort_test
### 1. Sorting Validation Overhead
Test Environment: Local KVM virtual machine
Results:
- Total BTF types: 143,467
- Sorting validation time: 1.451 ms
*Note: This represents the maximum observed overhead during initial BTF loading.*
### 2. Lookup Performance Comparison
Test Case: Locate all 58,718 named types in vmlinux BTF
Methodology:
./vmtest.sh -- ./test_progs -t btf_permute/perf -v
Results:
| Condition | Lookup Time | Improvement |
|--------------------|-------------|-------------|
| Unsorted (Linear) | 17,282 ms | Baseline |
| Sorted (Binary) | 19 ms | 909x faster |
Analysis:
The binary search implementation reduces lookup time from 17.3 seconds to 19 milliseconds,
achieving a **909x** speedup for large-scale type queries.
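As a rough back-of-envelope check (mine, not part of the series), the gap in the table matches the expected probe counts: a binary search over n sorted names needs at most ceil(log2(n)) string comparisons per lookup, versus on the order of n for a linear scan.

```c
#include <assert.h>

/* Smallest k with 2^k >= n, i.e. the worst-case number of probes for a
 * binary search over n sorted entries. Avoids libm on purpose. */
static int worst_case_probes(unsigned int n)
{
	unsigned int span = 1;
	int k = 0;

	while (span < n) {
		span <<= 1;
		k++;
	}
	return k;
}
```

For the 143,467 types above this gives 18 probes per lookup, consistent with the roughly 900x gap measured against the linear scan.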
## Changelog
v4:
- Abstracted btf_dedup_remap_types logic into a helper function (suggested by Eduard).
- Removed btf_sort.c and implemented sorting separately for libbpf and kernel (suggested by Andrii).
- Added test cases for both base BTF and split BTF scenarios (suggested by Eduard).
- Added validation for name-only sorting of types (suggested by Andrii).
- Refactored btf__permute implementation to reduce complexity (suggested by Andrii).
- Added doc comments for btf__permute (suggested by Andrii).
v3:
https://lore.kernel.org/all/20251027135423.3098490-1-dolinux.peng@gmail.com/
- Removed sorting logic from libbpf and provided a generic btf__permute() interface
(suggested by Andrii).
- Omitted the search direction patch to avoid conflicts with base BTF (suggested by Eduard).
- Included btf_sort.c directly in btf.c to reduce function call overhead.
v2:
https://lore.kernel.org/all/20251020093941.548058-1-dolinux.peng@gmail.com/
- Moved sorting to the build phase to reduce overhead (suggested by Alexei).
- Integrated sorting into btf_dedup_compact_and_sort_types (suggested by Eduard).
- Added sorting checks during BTF parsing.
- Consolidated common logic into btf_sort.c for sharing (suggested by Alan).
v1:
https://lore.kernel.org/all/20251013131537.1927035-1-dolinux.peng@gmail.com/
Donglin Peng (7):
libbpf: Extract BTF type remapping logic into helper function
libbpf: Add BTF permutation support for type reordering
libbpf: Optimize type lookup with binary search for sorted BTF
libbpf: Implement lazy sorting validation for binary search
optimization
btf: Optimize type lookup with binary search
btf: Add lazy sorting validation for binary search
selftests/bpf: Add test cases for btf__permute functionality
kernel/bpf/btf.c | 177 ++++++-
tools/lib/bpf/btf.c | 436 +++++++++++++++---
tools/lib/bpf/btf.h | 34 ++
tools/lib/bpf/libbpf.map | 1 +
tools/lib/bpf/libbpf_internal.h | 1 +
.../selftests/bpf/prog_tests/btf_permute.c | 142 ++++++
6 files changed, 728 insertions(+), 63 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_permute.c
--
2.34.1
^ permalink raw reply [flat|nested] 53+ messages in thread
* [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
@ 2025-11-04 13:40 ` Donglin Peng
2025-11-04 23:16 ` Eduard Zingerman
2025-11-05 0:11 ` Andrii Nakryiko
2025-11-04 13:40 ` [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering Donglin Peng
` (5 subsequent siblings)
6 siblings, 2 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast
Cc: linux-kernel, bpf, Donglin Peng, Eduard Zingerman,
Andrii Nakryiko, Alan Maguire, Song Liu, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
Refactor btf_dedup_remap_types() by extracting its core logic into a new
btf_remap_types() helper function. This eliminates code duplication
and improves modularity while maintaining the same functionality.
The new function encapsulates iteration over BTF types and BTF ext
sections, accepting a callback for flexible type ID remapping. This
makes the type remapping logic more maintainable and reusable.
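The callback-driven shape described above can be sketched with a toy visitor (hypothetical names; the real helper walks struct btf via btf_field_iter, as the diff in this patch shows):

```c
#include <assert.h>

/* Toy analogue of btf_remap_type_fn: visit (and possibly rewrite) one
 * type-ID reference; non-zero return aborts the walk. */
typedef int (*remap_fn)(unsigned int *type_id, void *ctx);

/* Iterate over all type-ID references, handing each one to the callback. */
static int visit_all(unsigned int *ids, int n, remap_fn visit, void *ctx)
{
	int i, err;

	for (i = 0; i < n; i++) {
		err = visit(&ids[i], ctx);
		if (err)
			return err;
	}
	return 0;
}

/* Example callback: shift every reference by a fixed offset. */
static int add_offset(unsigned int *type_id, void *ctx)
{
	*type_id += *(unsigned int *)ctx;
	return 0;
}
```

The point of the refactor is that dedup and (in the next patch) permutation can each plug their own callback into the same iteration loop.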
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
tools/lib/bpf/btf.c | 63 +++++++++++++++++----------------
tools/lib/bpf/libbpf_internal.h | 1 +
2 files changed, 33 insertions(+), 31 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 18907f0fcf9f..5e1c09b5dce8 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3400,6 +3400,37 @@ int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
return 0;
}
+static int btf_remap_types(struct btf *btf, struct btf_ext *btf_ext,
+ btf_remap_type_fn visit, void *ctx)
+{
+ int i, r;
+
+ for (i = 0; i < btf->nr_types; i++) {
+ struct btf_type *t = btf_type_by_id(btf, btf->start_id + i);
+ struct btf_field_iter it;
+ __u32 *type_id;
+
+ r = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
+ if (r)
+ return r;
+
+ while ((type_id = btf_field_iter_next(&it))) {
+ r = visit(type_id, ctx);
+ if (r)
+ return r;
+ }
+ }
+
+ if (!btf_ext)
+ return 0;
+
+ r = btf_ext_visit_type_ids(btf_ext, visit, ctx);
+ if (r)
+ return r;
+
+ return 0;
+}
+
struct btf_dedup;
static struct btf_dedup *btf_dedup_new(struct btf *btf, const struct btf_dedup_opts *opts);
@@ -5320,37 +5351,7 @@ static int btf_dedup_remap_type_id(__u32 *type_id, void *ctx)
*/
static int btf_dedup_remap_types(struct btf_dedup *d)
{
- int i, r;
-
- for (i = 0; i < d->btf->nr_types; i++) {
- struct btf_type *t = btf_type_by_id(d->btf, d->btf->start_id + i);
- struct btf_field_iter it;
- __u32 *type_id;
-
- r = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
- if (r)
- return r;
-
- while ((type_id = btf_field_iter_next(&it))) {
- __u32 resolved_id, new_id;
-
- resolved_id = resolve_type_id(d, *type_id);
- new_id = d->hypot_map[resolved_id];
- if (new_id > BTF_MAX_NR_TYPES)
- return -EINVAL;
-
- *type_id = new_id;
- }
- }
-
- if (!d->btf_ext)
- return 0;
-
- r = btf_ext_visit_type_ids(d->btf_ext, btf_dedup_remap_type_id, d);
- if (r)
- return r;
-
- return 0;
+ return btf_remap_types(d->btf, d->btf_ext, btf_dedup_remap_type_id, d);
}
/*
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 35b2527bedec..b09d6163f5c3 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -582,6 +582,7 @@ int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void
int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx);
__s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
__u32 kind);
+typedef int (*btf_remap_type_fn)(__u32 *type_id, void *ctx);
/* handle direct returned errors */
static inline int libbpf_err(int ret)
--
2.34.1
* [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function Donglin Peng
@ 2025-11-04 13:40 ` Donglin Peng
2025-11-04 23:45 ` Eduard Zingerman
2025-11-05 0:11 ` Andrii Nakryiko
2025-11-04 13:40 ` [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF Donglin Peng
` (4 subsequent siblings)
6 siblings, 2 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast
Cc: linux-kernel, bpf, Donglin Peng, Eduard Zingerman,
Andrii Nakryiko, Alan Maguire, Song Liu, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
Introduce btf__permute() API to allow in-place rearrangement of BTF types.
This function reorganizes BTF type order according to a provided array of
type IDs, updating all type references to maintain consistency.
The permutation process involves:
1. Shuffling types into new order based on the provided ID mapping
2. Remapping all type ID references to point to new locations
3. Handling BTF extension data if provided via options
This is particularly useful for optimizing type locality after BTF
deduplication or for meeting specific layout requirements in specialized
use cases.
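The two-phase shuffle-then-remap scheme can be illustrated on a toy type table (toy types and fixed capacity are mine; the real code operates on serialized struct btf_type records):

```c
#include <assert.h>
#include <string.h>

/* Toy "type": a name plus one type-ID reference (like a pointer's pointee). */
struct toy_type {
	const char *name;
	int ref;
};

/* Permute types[1..n-1] (slot 0 is "void") per ids[], then remap refs.
 * ids has n-1 entries; capacity is capped at 16 for the sketch. */
static void toy_permute(struct toy_type *types, int n, const int *ids)
{
	struct toy_type out[16];
	int map[16] = {0}; /* old ID -> new ID; void stays at 0 */
	int i;

	/* phase 1: shuffle into new order, recording old->new mapping */
	for (i = 1; i < n; i++) {
		out[i] = types[ids[i - 1]];
		map[ids[i - 1]] = i;
	}
	/* phase 2: rewrite every reference through the mapping */
	for (i = 1; i < n; i++)
		out[i].ref = map[out[i].ref];
	memcpy(types + 1, out + 1, (n - 1) * sizeof(*types));
}
```

After permuting by name order, a reference that pointed at "zoo" still resolves to "zoo" at its new ID, which is the consistency property btf__permute() maintains.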
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
tools/lib/bpf/btf.c | 161 +++++++++++++++++++++++++++++++++++++++
tools/lib/bpf/btf.h | 34 +++++++++
tools/lib/bpf/libbpf.map | 1 +
3 files changed, 196 insertions(+)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 5e1c09b5dce8..3bc03f7fe31f 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -5830,3 +5830,164 @@ int btf__relocate(struct btf *btf, const struct btf *base_btf)
btf->owns_base = false;
return libbpf_err(err);
}
+
+struct btf_permute {
+ /* .BTF section to be permuted in-place */
+ struct btf *btf;
+ struct btf_ext *btf_ext;
+ /* Array of type IDs used for permutation. The array length must equal
+ * the number of types in the BTF being permuted, excluding the special
+ * void type at ID 0. For split BTF, the length corresponds to the
+ * number of types added on top of the base BTF.
+ */
+ __u32 *ids;
+ /* Array of type IDs mapping each original type ID to its new permuted
+ * type ID; its length equals that of the ids array above. */
+ __u32 *map;
+};
+
+static int btf_permute_shuffle_types(struct btf_permute *p);
+static int btf_permute_remap_types(struct btf_permute *p);
+static int btf_permute_remap_type_id(__u32 *type_id, void *ctx);
+
+int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts)
+{
+ struct btf_permute p;
+ int i, err = 0;
+ __u32 *map = NULL;
+
+ if (!OPTS_VALID(opts, btf_permute_opts) || !ids)
+ return libbpf_err(-EINVAL);
+
+ map = calloc(btf->nr_types, sizeof(*map));
+ if (!map) {
+ err = -ENOMEM;
+ goto done;
+ }
+
+ for (i = 0; i < btf->nr_types; i++)
+ map[i] = BTF_UNPROCESSED_ID;
+
+ p.btf = btf;
+ p.btf_ext = OPTS_GET(opts, btf_ext, NULL);
+ p.ids = ids;
+ p.map = map;
+
+ if (btf_ensure_modifiable(btf)) {
+ err = -ENOMEM;
+ goto done;
+ }
+ err = btf_permute_shuffle_types(&p);
+ if (err < 0) {
+ pr_debug("btf_permute_shuffle_types failed: %s\n", errstr(err));
+ goto done;
+ }
+ err = btf_permute_remap_types(&p);
+ if (err < 0) {
+ pr_debug("btf_permute_remap_types failed: %s\n", errstr(err));
+ goto done;
+ }
+
+done:
+ free(map);
+ return libbpf_err(err);
+}
+
+/* Shuffle BTF types.
+ *
+ * Rearranges types according to the permutation map in p->ids. The p->map
+ * array stores the mapping from original type IDs to new shuffled IDs,
+ * which is used in the next phase to update type references.
+ *
+ * Validates that all IDs in the permutation array are valid and unique.
+ */
+static int btf_permute_shuffle_types(struct btf_permute *p)
+{
+ struct btf *btf = p->btf;
+ const struct btf_type *t;
+ __u32 *new_offs = NULL, *map;
+ void *nt, *new_types = NULL;
+ int i, id, len, err;
+
+ new_offs = calloc(btf->nr_types, sizeof(*new_offs));
+ new_types = calloc(btf->hdr->type_len, 1);
+ if (!new_offs || !new_types) {
+ err = -ENOMEM;
+ goto out_err;
+ }
+
+ nt = new_types;
+ for (i = 0; i < btf->nr_types; i++) {
+ id = p->ids[i];
+ /* type IDs from base_btf and the VOID type are not allowed */
+ if (id < btf->start_id) {
+ err = -EINVAL;
+ goto out_err;
+ }
+ /* must be a valid type ID */
+ t = btf__type_by_id(btf, id);
+ if (!t) {
+ err = -EINVAL;
+ goto out_err;
+ }
+ map = &p->map[id - btf->start_id];
+ /* duplicate type IDs are not allowed */
+ if (*map != BTF_UNPROCESSED_ID) {
+ err = -EINVAL;
+ goto out_err;
+ }
+ len = btf_type_size(t);
+ memcpy(nt, t, len);
+ new_offs[i] = nt - new_types;
+ *map = btf->start_id + i;
+ nt += len;
+ }
+
+ free(btf->types_data);
+ free(btf->type_offs);
+ btf->types_data = new_types;
+ btf->type_offs = new_offs;
+ return 0;
+
+out_err:
+ free(new_offs);
+ free(new_types);
+ return err;
+}
+
+/* Callback function to remap individual type ID references
+ *
+ * This callback is invoked by btf_remap_types() for each type ID reference
+ * found in the BTF data. It updates the reference to point to the new
+ * permuted type ID using the mapping table.
+ */
+static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
+{
+ struct btf_permute *p = ctx;
+ __u32 new_type_id = *type_id;
+
+ /* skip references that point into the base BTF */
+ if (new_type_id < p->btf->start_id)
+ return 0;
+
+ new_type_id = p->map[*type_id - p->btf->start_id];
+ if (new_type_id > BTF_MAX_NR_TYPES)
+ return -EINVAL;
+
+ *type_id = new_type_id;
+ return 0;
+}
+
+/* Remap referenced type IDs into permuted type IDs.
+ *
+ * After BTF types are permuted, their final type IDs may differ from original
+ * ones. The mapping from each original type ID to its permuted type ID is
+ * stored in btf_permute->map and is populated during the shuffle phase. The
+ * remapping phase rewrites every type ID referenced from any BTF type (e.g.,
+ * struct fields, func proto args, etc.) to its final permuted type ID.
+ */
+static int btf_permute_remap_types(struct btf_permute *p)
+{
+ return btf_remap_types(p->btf, p->btf_ext, btf_permute_remap_type_id, p);
+}
+
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index ccfd905f03df..441f6445d762 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -273,6 +273,40 @@ LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);
*/
LIBBPF_API int btf__relocate(struct btf *btf, const struct btf *base_btf);
+struct btf_permute_opts {
+ size_t sz;
+ /* optional .BTF.ext info along the main BTF info */
+ struct btf_ext *btf_ext;
+ size_t :0;
+};
+#define btf_permute_opts__last_field btf_ext
+
+/**
+ * @brief **btf__permute()** rearranges BTF types in-place according to specified mapping
+ * @param btf BTF object to permute
+ * @param ids Array defining new type order. Must contain exactly btf->nr_types elements,
+ * each being a valid type ID in range [btf->start_id, btf->start_id + btf->nr_types - 1]
+ * @param opts Optional parameters, including BTF extension data for reference updates
+ * @return 0 on success, negative error code on failure
+ *
+ * **btf__permute()** performs an in-place permutation of BTF types, rearranging them
+ * according to the order specified in @p ids array. After reordering, all type references
+ * within the BTF data and optional BTF extension are updated to maintain consistency.
+ *
+ * The permutation process consists of two phases:
+ * 1. Type shuffling: Physical reordering of type data in memory
+ * 2. Reference remapping: Updating all type ID references to new locations
+ *
+ * This is particularly useful for optimizing type locality after BTF deduplication
+ * or for meeting specific layout requirements in specialized use cases.
+ *
+ * On error, negative error code is returned and errno is set appropriately.
+ * Common error codes include:
+ * - -EINVAL: Invalid parameters or invalid ID mapping (e.g., duplicate IDs, out-of-range IDs)
+ * - -ENOMEM: Memory allocation failure during permutation process
+ */
+LIBBPF_API int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts);
+
struct btf_dump;
struct btf_dump_opts {
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 8ed8749907d4..b778e5a5d0a8 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -451,4 +451,5 @@ LIBBPF_1.7.0 {
global:
bpf_map__set_exclusive_program;
bpf_map__exclusive_program;
+ btf__permute;
} LIBBPF_1.6.0;
--
2.34.1
* [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering Donglin Peng
@ 2025-11-04 13:40 ` Donglin Peng
2025-11-04 14:15 ` bot+bpf-ci
` (2 more replies)
2025-11-04 13:40 ` [RFC PATCH v4 4/7] libbpf: Implement lazy sorting validation for binary search optimization Donglin Peng
` (3 subsequent siblings)
6 siblings, 3 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast
Cc: linux-kernel, bpf, Donglin Peng, Eduard Zingerman,
Andrii Nakryiko, Alan Maguire, Song Liu, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
Introduce binary search for BTF type lookups when the BTF instance
contains sorted types.
The optimization significantly improves performance when searching for
types in large BTF instances with sorted type names. For unsorted BTF
or when nr_sorted_types is zero, the implementation falls back to
the original linear search algorithm.
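Because names can repeat across kinds, the lookup first narrows to the range of equal names and only then filters by kind. The leftmost-boundary half of that range search can be sketched on a plain string array (toy harness; the real code compares btf__str_by_offset() results over type IDs):

```c
#include <assert.h>
#include <string.h>

/* Return the leftmost index in sorted names[l..r] equal to key, or -1.
 * On a match, keep searching left rather than stopping, so duplicates
 * are handled; the rightmost boundary is found symmetrically. */
static int bsearch_leftmost(const char **names, int l, int r, const char *key)
{
	int m, c, lmost = -1;

	while (l <= r) {
		m = l + (r - l) / 2;
		c = strcmp(names[m], key);
		if (c < 0) {
			l = m + 1;
		} else {
			if (c == 0)
				lmost = m; /* remember match, keep going left */
			r = m - 1;
		}
	}
	return lmost;
}
```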
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
tools/lib/bpf/btf.c | 142 +++++++++++++++++++++++++++++++++++++-------
1 file changed, 119 insertions(+), 23 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 3bc03f7fe31f..5af14304409c 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -92,6 +92,12 @@ struct btf {
* - for split BTF counts number of types added on top of base BTF.
*/
__u32 nr_types;
+ /* number of sorted and named types in this BTF instance:
+ * - doesn't include special [0] void type;
+ * - for split BTF counts number of sorted and named types added on
+ * top of base BTF.
+ */
+ __u32 nr_sorted_types;
/* if not NULL, points to the base BTF on top of which the current
* split BTF is based
*/
@@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
return type_id;
}
-__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
+/*
+ * Find BTF types with matching names within the [left, right] index range.
+ * On success, updates *left and *right to the boundaries of the matching range
+ * and returns the leftmost matching index.
+ */
+static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
+ __s32 *left, __s32 *right)
{
- __u32 i, nr_types = btf__type_cnt(btf);
+ const struct btf_type *t;
+ const char *tname;
+ __s32 l, r, m, lmost, rmost;
+ int ret;
+
+ /* find the leftmost btf_type that matches */
+ l = *left;
+ r = *right;
+ lmost = -1;
+ while (l <= r) {
+ m = l + (r - l) / 2;
+ t = btf_type_by_id(btf, m);
+ tname = btf__str_by_offset(btf, t->name_off);
+ ret = strcmp(tname, name);
+ if (ret < 0) {
+ l = m + 1;
+ } else {
+ if (ret == 0)
+ lmost = m;
+ r = m - 1;
+ }
+ }
- if (!strcmp(type_name, "void"))
- return 0;
+ if (lmost == -1)
+ return -ENOENT;
+
+ /* find the rightmost btf_type that matches */
+ l = lmost;
+ r = *right;
+ rmost = -1;
+ while (l <= r) {
+ m = l + (r - l) / 2;
+ t = btf_type_by_id(btf, m);
+ tname = btf__str_by_offset(btf, t->name_off);
+ ret = strcmp(tname, name);
+ if (ret <= 0) {
+ if (ret == 0)
+ rmost = m;
+ l = m + 1;
+ } else {
+ r = m - 1;
+ }
+ }
- for (i = 1; i < nr_types; i++) {
- const struct btf_type *t = btf__type_by_id(btf, i);
- const char *name = btf__name_by_offset(btf, t->name_off);
+ *left = lmost;
+ *right = rmost;
+ return lmost;
+}
+
+static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
+ const char *type_name, __u32 kind)
+{
+ const struct btf_type *t;
+ const char *tname;
+ int err = -ENOENT;
- if (name && !strcmp(type_name, name))
- return i;
+ if (!btf)
+ goto out;
+
+ if (start_id < btf->start_id) {
+ err = btf_find_type_by_name_kind(btf->base_btf, start_id,
+ type_name, kind);
+ if (err == -ENOENT)
+ start_id = btf->start_id;
+ }
+
+ if (err == -ENOENT) {
+ if (btf->nr_sorted_types) {
+ /* binary search */
+ __s32 l, r;
+ int ret;
+
+ l = start_id;
+ r = start_id + btf->nr_sorted_types - 1;
+ ret = btf_find_type_by_name_bsearch(btf, type_name, &l, &r);
+ if (ret < 0)
+ goto out;
+ /* return the leftmost match and skip kind checking */
+ if (kind == -1)
+ return ret;
+ /* scan the matching range for the requested kind */
+ while (l <= r) {
+ t = btf_type_by_id(btf, l);
+ if (BTF_INFO_KIND(t->info) == kind)
+ return l;
+ l++;
+ }
+ } else {
+ /* linear search */
+ __u32 i, total;
+
+ total = btf__type_cnt(btf);
+ for (i = start_id; i < total; i++) {
+ t = btf_type_by_id(btf, i);
+ if (kind != -1 && btf_kind(t) != kind)
+ continue;
+ tname = btf__str_by_offset(btf, t->name_off);
+ if (tname && !strcmp(tname, type_name))
+ return i;
+ }
+ }
}
- return libbpf_err(-ENOENT);
+out:
+ return err;
}
static __s32 btf_find_by_name_kind(const struct btf *btf, int start_id,
const char *type_name, __u32 kind)
{
- __u32 i, nr_types = btf__type_cnt(btf);
-
if (kind == BTF_KIND_UNKN || !strcmp(type_name, "void"))
return 0;
- for (i = start_id; i < nr_types; i++) {
- const struct btf_type *t = btf__type_by_id(btf, i);
- const char *name;
-
- if (btf_kind(t) != kind)
- continue;
- name = btf__name_by_offset(btf, t->name_off);
- if (name && !strcmp(type_name, name))
- return i;
- }
+ return libbpf_err(btf_find_type_by_name_kind(btf, start_id, type_name, kind));
+}
- return libbpf_err(-ENOENT);
+/* the kind value of -1 indicates that kind matching should be skipped */
+__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
+{
+ return btf_find_by_name_kind(btf, btf->start_id, type_name, -1);
}
__s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
--
2.34.1
* [RFC PATCH v4 4/7] libbpf: Implement lazy sorting validation for binary search optimization
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
` (2 preceding siblings ...)
2025-11-04 13:40 ` [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF Donglin Peng
@ 2025-11-04 13:40 ` Donglin Peng
2025-11-05 0:29 ` Eduard Zingerman
2025-11-04 13:40 ` [RFC PATCH v4 5/7] btf: Optimize type lookup with binary search Donglin Peng
` (2 subsequent siblings)
6 siblings, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast
Cc: linux-kernel, bpf, Donglin Peng, Eduard Zingerman,
Andrii Nakryiko, Alan Maguire, Song Liu, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
Add lazy validation of BTF type ordering to determine whether types are
sorted by name. The check is performed on first access and its result is
cached, enabling efficient binary search for sorted BTF while keeping the
linear search fallback for unsorted cases.
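The check-once-and-cache pattern can be sketched in miniature (toy structure and names are mine; the real code uses BTF_NEED_SORT_CHECK as the sentinel in btf->nr_sorted_types and counts named types rather than all entries):

```c
#include <assert.h>
#include <string.h>

#define TOY_NEED_SORT_CHECK -1

struct toy_btf {
	const char **names;
	int n;
	int nr_sorted; /* TOY_NEED_SORT_CHECK until the first lookup validates */
};

/* Return the number of sorted entries, validating order at most once.
 * A cached 0 means "checked, unsorted" and forces the linear fallback. */
static int toy_check_sorted(struct toy_btf *b)
{
	int i;

	if (b->nr_sorted != TOY_NEED_SORT_CHECK)
		return b->nr_sorted; /* cached result */
	b->nr_sorted = 0;
	for (i = 1; i < b->n; i++)
		if (strcmp(b->names[i - 1], b->names[i]) > 0)
			return 0; /* unsorted: 0 stays cached */
	b->nr_sorted = b->n;
	return b->nr_sorted;
}
```

Subsequent lookups pay only the cached-field load, which is why the 1.451 ms validation cost from the cover letter is a one-time charge.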
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
tools/lib/bpf/btf.c | 76 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 74 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 5af14304409c..0ee00cec5c05 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -26,6 +26,10 @@
#define BTF_MAX_NR_TYPES 0x7fffffffU
#define BTF_MAX_STR_OFFSET 0x7fffffffU
+/* sort verification occurs lazily upon first btf_find_type_by_name_kind()
+ * call
+ */
+#define BTF_NEED_SORT_CHECK ((__u32)-1)
static struct btf_type btf_void;
@@ -96,6 +100,10 @@ struct btf {
* - doesn't include special [0] void type;
* - for split BTF counts number of sorted and named types added on
* top of base BTF.
+ * - BTF_NEED_SORT_CHECK value indicates sort validation will be performed
+ * on first call to btf_find_type_by_name_kind.
+ * - zero value means the sorting check already ran and found the BTF
+ * unsorted or containing no named types.
*/
__u32 nr_sorted_types;
/* if not NULL, points to the base BTF on top of which the current
@@ -903,8 +911,67 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
return type_id;
}
-/*
- * Find BTF types with matching names within the [left, right] index range.
+static int btf_compare_type_names(const void *a, const void *b, void *priv)
+{
+ struct btf *btf = (struct btf *)priv;
+ struct btf_type *ta = btf_type_by_id(btf, *(__u32 *)a);
+ struct btf_type *tb = btf_type_by_id(btf, *(__u32 *)b);
+ const char *na, *nb;
+ bool anon_a, anon_b;
+
+ na = btf__str_by_offset(btf, ta->name_off);
+ nb = btf__str_by_offset(btf, tb->name_off);
+ anon_a = str_is_empty(na);
+ anon_b = str_is_empty(nb);
+
+ if (anon_a && !anon_b)
+ return 1;
+ if (!anon_a && anon_b)
+ return -1;
+ if (anon_a && anon_b)
+ return 0;
+
+ return strcmp(na, nb);
+}
+
+/* Verifies BTF type ordering by name and counts named types.
+ *
+ * Checks that types are sorted in ascending order with named types
+ * before anonymous ones. If verified, sets nr_sorted_types to the
+ * number of named types.
+ */
+static void btf_check_sorted(struct btf *btf, int start_id)
+{
+ const struct btf_type *t;
+ int i, n, nr_sorted_types;
+
+ if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
+ return;
+ btf->nr_sorted_types = 0;
+
+ if (btf->nr_types < 2)
+ return;
+
+ nr_sorted_types = 0;
+ n = btf__type_cnt(btf);
+ for (n--, i = start_id; i < n; i++) {
+ int k = i + 1;
+
+ if (btf_compare_type_names(&i, &k, btf) > 0)
+ return;
+ t = btf_type_by_id(btf, k);
+ if (!str_is_empty(btf__str_by_offset(btf, t->name_off)))
+ nr_sorted_types++;
+ }
+
+ t = btf_type_by_id(btf, start_id);
+ if (!str_is_empty(btf__str_by_offset(btf, t->name_off)))
+ nr_sorted_types++;
+ if (nr_sorted_types)
+ btf->nr_sorted_types = nr_sorted_types;
+}
+
+/* Find BTF types with matching names within the [left, right] index range.
* On success, updates *left and *right to the boundaries of the matching range
* and returns the leftmost matching index.
*/
@@ -978,6 +1045,8 @@ static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
}
if (err == -ENOENT) {
+ btf_check_sorted((struct btf *)btf, btf->start_id);
+
if (btf->nr_sorted_types) {
/* binary search */
__s32 l, r;
@@ -1102,6 +1171,7 @@ static struct btf *btf_new_empty(struct btf *base_btf)
btf->fd = -1;
btf->ptr_sz = sizeof(void *);
btf->swapped_endian = false;
+ btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
if (base_btf) {
btf->base_btf = base_btf;
@@ -1153,6 +1223,7 @@ static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf, b
btf->start_id = 1;
btf->start_str_off = 0;
btf->fd = -1;
+ btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
if (base_btf) {
btf->base_btf = base_btf;
@@ -1811,6 +1882,7 @@ static void btf_invalidate_raw_data(struct btf *btf)
free(btf->raw_data_swapped);
btf->raw_data_swapped = NULL;
}
+ btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
}
/* Ensure BTF is ready to be modified (by splitting into a three memory
--
2.34.1
* [RFC PATCH v4 5/7] btf: Optimize type lookup with binary search
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
` (3 preceding siblings ...)
2025-11-04 13:40 ` [RFC PATCH v4 4/7] libbpf: Implement lazy sorting validation for binary search optimization Donglin Peng
@ 2025-11-04 13:40 ` Donglin Peng
2025-11-04 17:14 ` Alexei Starovoitov
2025-11-04 13:40 ` [RFC PATCH v4 6/7] btf: Add lazy sorting validation for " Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 7/7] selftests/bpf: Add test cases for btf__permute functionality Donglin Peng
6 siblings, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast
Cc: linux-kernel, bpf, Donglin Peng, Eduard Zingerman,
Andrii Nakryiko, Alan Maguire, Song Liu, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
Improve btf_find_by_name_kind() performance by adding binary search
support for sorted types. Falls back to linear search for compatibility.
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
kernel/bpf/btf.c | 111 ++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 101 insertions(+), 10 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 0de8fc8a0e0b..da35d8636b9b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -259,6 +259,7 @@ struct btf {
void *nohdr_data;
struct btf_header hdr;
u32 nr_types; /* includes VOID for base BTF */
+ u32 nr_sorted_types; /* excludes VOID for base BTF */
u32 types_size;
u32 data_size;
refcount_t refcnt;
@@ -494,6 +495,11 @@ static bool btf_type_is_modifier(const struct btf_type *t)
return false;
}
+static int btf_start_id(const struct btf *btf)
+{
+ return btf->start_id + (btf->base_btf ? 0 : 1);
+}
+
bool btf_type_is_void(const struct btf_type *t)
{
return t == &btf_void;
@@ -544,24 +550,109 @@ u32 btf_nr_types(const struct btf *btf)
return total;
}
-s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
+/* Find BTF types with matching names within the [left, right] index range.
+ * On success, updates *left and *right to the boundaries of the matching range
+ * and returns the leftmost matching index.
+ */
+static s32 btf_find_by_name_kind_bsearch(const struct btf *btf, const char *name,
+ s32 *left, s32 *right)
{
const struct btf_type *t;
const char *tname;
- u32 i, total;
+ s32 l, r, m, lmost, rmost;
+ int ret;
- total = btf_nr_types(btf);
- for (i = 1; i < total; i++) {
- t = btf_type_by_id(btf, i);
- if (BTF_INFO_KIND(t->info) != kind)
- continue;
+ /* find the leftmost btf_type that matches */
+ l = *left;
+ r = *right;
+ lmost = -1;
+ while (l <= r) {
+ m = l + (r - l) / 2;
+ t = btf_type_by_id(btf, m);
+ tname = btf_name_by_offset(btf, t->name_off);
+ ret = strcmp(tname, name);
+ if (ret < 0) {
+ l = m + 1;
+ } else {
+ if (ret == 0)
+ lmost = m;
+ r = m - 1;
+ }
+ }
+ if (lmost == -1)
+ return -ENOENT;
+
+ /* find the rightmost btf_type that matches */
+ l = lmost;
+ r = *right;
+ rmost = -1;
+ while (l <= r) {
+ m = l + (r - l) / 2;
+ t = btf_type_by_id(btf, m);
tname = btf_name_by_offset(btf, t->name_off);
- if (!strcmp(tname, name))
- return i;
+ ret = strcmp(tname, name);
+ if (ret <= 0) {
+ if (ret == 0)
+ rmost = m;
+ l = m + 1;
+ } else {
+ r = m - 1;
+ }
}
- return -ENOENT;
+ *left = lmost;
+ *right = rmost;
+ return lmost;
+}
+
+s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
+{
+ const struct btf *base_btf = btf_base_btf(btf);
+ const struct btf_type *t;
+ const char *tname;
+ int err = -ENOENT;
+
+ if (base_btf)
+ err = btf_find_by_name_kind(base_btf, name, kind);
+
+ if (err == -ENOENT) {
+ if (btf->nr_sorted_types) {
+ /* binary search */
+ s32 l, r;
+ int ret;
+
+ l = btf_start_id(btf);
+ r = l + btf->nr_sorted_types - 1;
+ ret = btf_find_by_name_kind_bsearch(btf, name, &l, &r);
+ if (ret < 0)
+ goto out;
+ /* scan the matching name range for the requested kind */
+ while (l <= r) {
+ t = btf_type_by_id(btf, l);
+ if (BTF_INFO_KIND(t->info) == kind)
+ return l;
+ l++;
+ }
+ } else {
+ /* linear search */
+ u32 i, total;
+
+ total = btf_nr_types(btf);
+ for (i = btf_start_id(btf); i < total; i++) {
+ t = btf_type_by_id(btf, i);
+ if (BTF_INFO_KIND(t->info) != kind)
+ continue;
+
+ tname = btf_name_by_offset(btf, t->name_off);
+ if (!strcmp(tname, name))
+ return i;
+ }
+ }
+ }
+
+out:
+ return err;
}
s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [RFC PATCH v4 6/7] btf: Add lazy sorting validation for binary search
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
` (4 preceding siblings ...)
2025-11-04 13:40 ` [RFC PATCH v4 5/7] btf: Optimize type lookup with binary search Donglin Peng
@ 2025-11-04 13:40 ` Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 7/7] selftests/bpf: Add test cases for btf__permute functionality Donglin Peng
6 siblings, 0 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast
Cc: linux-kernel, bpf, Donglin Peng, Eduard Zingerman,
Andrii Nakryiko, Alan Maguire, Song Liu, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
Implement lazy validation of BTF type ordering to enable efficient
binary search for sorted BTF while maintaining linear search fallback
for unsorted cases.
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
kernel/bpf/btf.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index da35d8636b9b..c76d77fd30a7 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -192,6 +192,8 @@
*/
#define BTF_MAX_SIZE (16 * 1024 * 1024)
+#define BTF_NEED_SORT_CHECK ((u32)-1)
+
#define for_each_member_from(i, from, struct_type, member) \
for (i = from, member = btf_type_member(struct_type) + from; \
i < btf_type_vlen(struct_type); \
@@ -550,6 +552,65 @@ u32 btf_nr_types(const struct btf *btf)
return total;
}
+static int btf_compare_type_names(const void *a, const void *b, void *priv)
+{
+ struct btf *btf = (struct btf *)priv;
+ const struct btf_type *ta = btf_type_by_id(btf, *(__u32 *)a);
+ const struct btf_type *tb = btf_type_by_id(btf, *(__u32 *)b);
+ const char *na, *nb;
+
+ if (!ta->name_off && tb->name_off)
+ return 1;
+ if (ta->name_off && !tb->name_off)
+ return -1;
+ if (!ta->name_off && !tb->name_off)
+ return 0;
+
+ na = btf_name_by_offset(btf, ta->name_off);
+ nb = btf_name_by_offset(btf, tb->name_off);
+ return strcmp(na, nb);
+}
+
+/* Verifies BTF type ordering by name and counts named types.
+ *
+ * Checks that types are sorted in ascending order with named types
+ * before anonymous ones. If verified, sets nr_sorted_types to the
+ * number of named types.
+ */
+static void btf_check_sorted(struct btf *btf, int start_id)
+{
+ const struct btf_type *t;
+ int i, n, nr_sorted_types;
+
+ if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
+ return;
+
+ btf->nr_sorted_types = 0;
+
+ if (btf->nr_types < 2)
+ return;
+
+ nr_sorted_types = 0;
+ n = btf_nr_types(btf);
+ for (n--, i = start_id; i < n; i++) {
+ int k = i + 1;
+
+ if (btf_compare_type_names(&i, &k, btf) > 0)
+ return;
+
+ t = btf_type_by_id(btf, k);
+ if (t->name_off)
+ nr_sorted_types++;
+ }
+
+ t = btf_type_by_id(btf, start_id);
+ if (t->name_off)
+ nr_sorted_types++;
+
+ if (nr_sorted_types)
+ btf->nr_sorted_types = nr_sorted_types;
+}
+
/* Find BTF types with matching names within the [left, right] index range.
* On success, updates *left and *right to the boundaries of the matching range
* and returns the leftmost matching index.
@@ -617,6 +678,8 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
err = btf_find_by_name_kind(base_btf, name, kind);
if (err == -ENOENT) {
+ btf_check_sorted((struct btf *)btf, btf_start_id(btf));
+
if (btf->nr_sorted_types) {
/* binary search */
s32 l, r;
@@ -5882,6 +5945,7 @@ static struct btf *btf_parse(const union bpf_attr *attr, bpfptr_t uattr, u32 uat
goto errout;
}
env->btf = btf;
+ btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
data = kvmalloc(attr->btf_size, GFP_KERNEL | __GFP_NOWARN);
if (!data) {
@@ -6301,6 +6365,7 @@ static struct btf *btf_parse_base(struct btf_verifier_env *env, const char *name
btf->data = data;
btf->data_size = data_size;
btf->kernel_btf = true;
+ btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
snprintf(btf->name, sizeof(btf->name), "%s", name);
err = btf_parse_hdr(env);
@@ -6418,6 +6483,7 @@ static struct btf *btf_parse_module(const char *module_name, const void *data,
btf->start_id = base_btf->nr_types;
btf->start_str_off = base_btf->hdr.str_len;
btf->kernel_btf = true;
+ btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
snprintf(btf->name, sizeof(btf->name), "%s", module_name);
btf->data = kvmemdup(data, data_size, GFP_KERNEL | __GFP_NOWARN);
--
2.34.1
* [RFC PATCH v4 7/7] selftests/bpf: Add test cases for btf__permute functionality
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
` (5 preceding siblings ...)
2025-11-04 13:40 ` [RFC PATCH v4 6/7] btf: Add lazy sorting validation for " Donglin Peng
@ 2025-11-04 13:40 ` Donglin Peng
2025-11-05 0:41 ` Eduard Zingerman
6 siblings, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-04 13:40 UTC (permalink / raw)
To: ast
Cc: linux-kernel, bpf, Donglin Peng, Eduard Zingerman,
Andrii Nakryiko, Alan Maguire, Song Liu, pengdonglin
From: pengdonglin <pengdonglin@xiaomi.com>
This patch introduces test cases for the btf__permute function to ensure
it works correctly with both base BTF and split BTF scenarios.
The test suite includes:
- test_permute_base: Validates permutation on standalone BTF
- test_permute_split: Tests permutation on split BTF with base dependencies
Each test verifies that type IDs are correctly rearranged and type
references are properly updated after permutation operations.
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
.../selftests/bpf/prog_tests/btf_permute.c | 142 ++++++++++++++++++
1 file changed, 142 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_permute.c
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_permute.c b/tools/testing/selftests/bpf/prog_tests/btf_permute.c
new file mode 100644
index 000000000000..2692cef627ab
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/btf_permute.c
@@ -0,0 +1,142 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Xiaomi */
+
+#include <test_progs.h>
+#include <bpf/btf.h>
+#include "btf_helpers.h"
+
+/* ensure btf__permute works as expected with base_btf */
+static void test_permute_base(void)
+{
+ struct btf *btf;
+ __u32 permute_ids[6];
+ int err;
+
+ btf = btf__new_empty();
+ if (!ASSERT_OK_PTR(btf, "empty_main_btf"))
+ return;
+
+ btf__add_int(btf, "int", 4, BTF_INT_SIGNED); /* [1] int */
+ btf__add_ptr(btf, 1); /* [2] ptr to int */
+ btf__add_struct(btf, "s1", 4); /* [3] struct s1 { */
+ btf__add_field(btf, "m", 1, 0, 0); /* int m; */
+ /* } */
+ btf__add_struct(btf, "s2", 4); /* [4] struct s2 { */
+ btf__add_field(btf, "m", 1, 0, 0); /* int m; */
+ /* } */
+ btf__add_func_proto(btf, 1); /* [5] int (*)(int *p); */
+ btf__add_func_param(btf, "p", 2);
+ btf__add_func(btf, "f", BTF_FUNC_STATIC, 5); /* [6] int f(int *p); */
+
+ VALIDATE_RAW_BTF(
+ btf,
+ "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+ "[2] PTR '(anon)' type_id=1",
+ "[3] STRUCT 's1' size=4 vlen=1\n"
+ "\t'm' type_id=1 bits_offset=0",
+ "[4] STRUCT 's2' size=4 vlen=1\n"
+ "\t'm' type_id=1 bits_offset=0",
+ "[5] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
+ "\t'p' type_id=2",
+ "[6] FUNC 'f' type_id=5 linkage=static");
+
+ permute_ids[0] = 4; /* struct s2 */
+ permute_ids[1] = 3; /* struct s1 */
+ permute_ids[2] = 5; /* int (*)(int *p) */
+ permute_ids[3] = 1; /* int */
+ permute_ids[4] = 6; /* int f(int *p) */
+ permute_ids[5] = 2; /* ptr to int */
+ err = btf__permute(btf, permute_ids, NULL);
+ if (!ASSERT_OK(err, "btf__permute"))
+ goto done;
+
+ VALIDATE_RAW_BTF(
+ btf,
+ "[1] STRUCT 's2' size=4 vlen=1\n"
+ "\t'm' type_id=4 bits_offset=0",
+ "[2] STRUCT 's1' size=4 vlen=1\n"
+ "\t'm' type_id=4 bits_offset=0",
+ "[3] FUNC_PROTO '(anon)' ret_type_id=4 vlen=1\n"
+ "\t'p' type_id=6",
+ "[4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+ "[5] FUNC 'f' type_id=3 linkage=static",
+ "[6] PTR '(anon)' type_id=4");
+
+done:
+ btf__free(btf);
+}
+
+/* ensure btf__permute works as expected with split_btf */
+static void test_permute_split(void)
+{
+ struct btf *split_btf = NULL, *base_btf = NULL;
+ __u32 permute_ids[4];
+ int err;
+
+ base_btf = btf__new_empty();
+ if (!ASSERT_OK_PTR(base_btf, "empty_main_btf"))
+ return;
+
+ btf__add_int(base_btf, "int", 4, BTF_INT_SIGNED); /* [1] int */
+ btf__add_ptr(base_btf, 1); /* [2] ptr to int */
+ VALIDATE_RAW_BTF(
+ base_btf,
+ "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+ "[2] PTR '(anon)' type_id=1");
+ split_btf = btf__new_empty_split(base_btf);
+ if (!ASSERT_OK_PTR(split_btf, "empty_split_btf"))
+ goto cleanup;
+ btf__add_struct(split_btf, "s1", 4); /* [3] struct s1 { */
+ btf__add_field(split_btf, "m", 1, 0, 0); /* int m; */
+ /* } */
+ btf__add_struct(split_btf, "s2", 4); /* [4] struct s2 { */
+ btf__add_field(split_btf, "m", 1, 0, 0); /* int m; */
+ /* } */
+ btf__add_func_proto(split_btf, 1); /* [5] int (*)(int *p); */
+ btf__add_func_param(split_btf, "p", 2);
+ btf__add_func(split_btf, "f", BTF_FUNC_STATIC, 5); /* [6] int f(int *p); */
+
+ VALIDATE_RAW_BTF(
+ split_btf,
+ "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+ "[2] PTR '(anon)' type_id=1",
+ "[3] STRUCT 's1' size=4 vlen=1\n"
+ "\t'm' type_id=1 bits_offset=0",
+ "[4] STRUCT 's2' size=4 vlen=1\n"
+ "\t'm' type_id=1 bits_offset=0",
+ "[5] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
+ "\t'p' type_id=2",
+ "[6] FUNC 'f' type_id=5 linkage=static");
+
+ permute_ids[0] = 6; /* int f(int *p) */
+ permute_ids[1] = 3; /* struct s1 */
+ permute_ids[2] = 5; /* int (*)(int *p) */
+ permute_ids[3] = 4; /* struct s2 */
+ err = btf__permute(split_btf, permute_ids, NULL);
+ if (!ASSERT_OK(err, "btf__permute"))
+ goto cleanup;
+
+ VALIDATE_RAW_BTF(
+ split_btf,
+ "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+ "[2] PTR '(anon)' type_id=1",
+ "[3] FUNC 'f' type_id=5 linkage=static",
+ "[4] STRUCT 's1' size=4 vlen=1\n"
+ "\t'm' type_id=1 bits_offset=0",
+ "[5] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
+ "\t'p' type_id=2",
+ "[6] STRUCT 's2' size=4 vlen=1\n"
+ "\t'm' type_id=1 bits_offset=0");
+
+cleanup:
+ btf__free(split_btf);
+ btf__free(base_btf);
+}
+
+void test_btf_permute(void)
+{
+ if (test__start_subtest("permute_base"))
+ test_permute_base();
+ if (test__start_subtest("permute_split"))
+ test_permute_split();
+}
--
2.34.1
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-04 13:40 ` [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF Donglin Peng
@ 2025-11-04 14:15 ` bot+bpf-ci
2025-11-05 0:06 ` Eduard Zingerman
2025-11-05 0:11 ` Andrii Nakryiko
2 siblings, 0 replies; 53+ messages in thread
From: bot+bpf-ci @ 2025-11-04 14:15 UTC (permalink / raw)
To: dolinux.peng, ast
Cc: linux-kernel, bpf, dolinux.peng, eddyz87, andrii.nakryiko,
alan.maguire, song, pengdonglin, ast, andrii, daniel, martin.lau,
eddyz87, yonghong.song, clm, ihor.solodrai
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 3bc03f7fe..5af143044 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -92,6 +92,12 @@ struct btf {
> * - for split BTF counts number of types added on top of base BTF.
> */
> __u32 nr_types;
> + /* number of sorted and named types in this BTF instance:
> + * - doesn't include special [0] void type;
> + * - for split BTF counts number of sorted and named types added on
> + * top of base BTF.
> + */
> + __u32 nr_sorted_types;
Where is this field initialized? I searched the entire codebase and
found no assignments to nr_sorted_types. Since struct btf is allocated
with calloc() in btf__new() and other allocation functions, this field
will always be zero.
This means the binary search optimization added below will never execute
because the check at btf_find_type_by_name_kind() tests:
if (btf->nr_sorted_types) {
/* binary search */
...
}
With nr_sorted_types always zero, the optimization appears to be
completely non-functional in this commit.
> /* if not NULL, points to the base BTF on top of which the current
> * split BTF is based
> */
[ ... ]
> @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> return type_id;
> }
>
> -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> +/*
> + * Find BTF types with matching names within the [left, right] index range.
> + * On success, updates *left and *right to the boundaries of the matching range
> + * and returns the leftmost matching index.
> + */
> +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> + __s32 *left, __s32 *right)
> {
> - __u32 i, nr_types = btf__type_cnt(btf);
> + const struct btf_type *t;
> + const char *tname;
> + __s32 l, r, m, lmost, rmost;
> + int ret;
> +
> + /* found the leftmost btf_type that matches */
> + l = *left;
> + r = *right;
> + lmost = -1;
> + while (l <= r) {
> + m = l + (r - l) / 2;
> + t = btf_type_by_id(btf, m);
> + tname = btf__str_by_offset(btf, t->name_off);
> + ret = strcmp(tname, name);
Can strcmp() crash with a NULL tname here? btf__str_by_offset() can
return NULL when offset is invalid (see line 1713 in btf.c). The old
code had a NULL check: "if (name && !strcmp(type_name, name))" and the
new linear search path also checks: "if (tname && !strcmp(tname,
type_name))", but this binary search path calls strcmp() directly.
Though currently unreachable due to nr_sorted_types being zero, if that
initialization issue is fixed, this could dereference NULL.
> + if (ret < 0) {
> + l = m + 1;
> + } else {
> + if (ret == 0)
> + lmost = m;
> + r = m - 1;
> + }
> + }
> +
> + if (lmost == -1)
> + return -ENOENT;
> +
> + /* found the rightmost btf_type that matches */
> + l = lmost;
> + r = *right;
> + rmost = -1;
> + while (l <= r) {
> + m = l + (r - l) / 2;
> + t = btf_type_by_id(btf, m);
> + tname = btf__str_by_offset(btf, t->name_off);
> + ret = strcmp(tname, name);
Same potential NULL dereference issue here.
> + if (ret <= 0) {
> + if (ret == 0)
> + rmost = m;
> + l = m + 1;
> + } else {
> + r = m - 1;
> + }
> + }
> +
> + *left = lmost;
> + *right = rmost;
> + return lmost;
> +}
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19070905166
* Re: [RFC PATCH v4 5/7] btf: Optimize type lookup with binary search
2025-11-04 13:40 ` [RFC PATCH v4 5/7] btf: Optimize type lookup with binary search Donglin Peng
@ 2025-11-04 17:14 ` Alexei Starovoitov
2025-11-05 13:22 ` Donglin Peng
0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2025-11-04 17:14 UTC (permalink / raw)
To: Donglin Peng
Cc: Alexei Starovoitov, LKML, bpf, Eduard Zingerman, Andrii Nakryiko,
Alan Maguire, Song Liu, pengdonglin
On Tue, Nov 4, 2025 at 5:41 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> Improve btf_find_by_name_kind() performance by adding binary search
> support for sorted types. Falls back to linear search for compatibility.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
> kernel/bpf/btf.c | 111 ++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 101 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 0de8fc8a0e0b..da35d8636b9b 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -259,6 +259,7 @@ struct btf {
> void *nohdr_data;
> struct btf_header hdr;
> u32 nr_types; /* includes VOID for base BTF */
> + u32 nr_sorted_types; /* exclude VOID for base BTF */
> u32 types_size;
> u32 data_size;
> refcount_t refcnt;
> @@ -494,6 +495,11 @@ static bool btf_type_is_modifier(const struct btf_type *t)
> return false;
> }
>
> +static int btf_start_id(const struct btf *btf)
> +{
> + return btf->start_id + (btf->base_btf ? 0 : 1);
> +}
> +
> bool btf_type_is_void(const struct btf_type *t)
> {
> return t == &btf_void;
> @@ -544,24 +550,109 @@ u32 btf_nr_types(const struct btf *btf)
> return total;
> }
>
> -s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> +/* Find BTF types with matching names within the [left, right] index range.
> + * On success, updates *left and *right to the boundaries of the matching range
> + * and returns the leftmost matching index.
> + */
> +static s32 btf_find_by_name_kind_bsearch(const struct btf *btf, const char *name,
> + s32 *left, s32 *right)
> {
> const struct btf_type *t;
> const char *tname;
> - u32 i, total;
> + s32 l, r, m, lmost, rmost;
> + int ret;
>
> - total = btf_nr_types(btf);
> - for (i = 1; i < total; i++) {
> - t = btf_type_by_id(btf, i);
> - if (BTF_INFO_KIND(t->info) != kind)
> - continue;
> + /* found the leftmost btf_type that matches */
> + l = *left;
> + r = *right;
> + lmost = -1;
> + while (l <= r) {
> + m = l + (r - l) / 2;
> + t = btf_type_by_id(btf, m);
> + tname = btf_name_by_offset(btf, t->name_off);
> + ret = strcmp(tname, name);
> + if (ret < 0) {
> + l = m + 1;
> + } else {
> + if (ret == 0)
> + lmost = m;
> + r = m - 1;
> + }
> + }
>
> + if (lmost == -1)
> + return -ENOENT;
> +
> + /* found the rightmost btf_type that matches */
> + l = lmost;
> + r = *right;
> + rmost = -1;
> + while (l <= r) {
> + m = l + (r - l) / 2;
> + t = btf_type_by_id(btf, m);
> tname = btf_name_by_offset(btf, t->name_off);
> - if (!strcmp(tname, name))
> - return i;
> + ret = strcmp(tname, name);
> + if (ret <= 0) {
> + if (ret == 0)
> + rmost = m;
> + l = m + 1;
> + } else {
> + r = m - 1;
> + }
> }
>
> - return -ENOENT;
> + *left = lmost;
> + *right = rmost;
> + return lmost;
> +}
> +
> +s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> +{
> + const struct btf *base_btf = btf_base_btf(btf);
> + const struct btf_type *t;
> + const char *tname;
> + int err = -ENOENT;
> +
> + if (base_btf)
> + err = btf_find_by_name_kind(base_btf, name, kind);
> +
> + if (err == -ENOENT) {
Please avoid the needless indent.
> + if (btf->nr_sorted_types) {
looks buggy,
since you init it to btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
Also AI is right. Init the field in the same patch.
pw-bot: cr
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-04 13:40 ` [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function Donglin Peng
@ 2025-11-04 23:16 ` Eduard Zingerman
2025-11-05 0:11 ` Andrii Nakryiko
1 sibling, 0 replies; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-04 23:16 UTC (permalink / raw)
To: Donglin Peng, ast
Cc: linux-kernel, bpf, Andrii Nakryiko, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 21:40 +0800, Donglin Peng wrote:
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> Refactor btf_dedup_remap_types() by extracting its core logic into a new
> btf_remap_types() helper function. This eliminates code duplication
> and improves modularity while maintaining the same functionality.
>
> The new function encapsulates iteration over BTF types and BTF ext
> sections, accepting a callback for flexible type ID remapping. This
> makes the type remapping logic more maintainable and reusable.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
> tools/lib/bpf/btf.c | 63 +++++++++++++++++----------------
> tools/lib/bpf/libbpf_internal.h | 1 +
> 2 files changed, 33 insertions(+), 31 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 18907f0fcf9f..5e1c09b5dce8 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -3400,6 +3400,37 @@ int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
> return 0;
> }
>
> +static int btf_remap_types(struct btf *btf, struct btf_ext *btf_ext,
> + btf_remap_type_fn visit, void *ctx)
^^^^^^^^^^^^^^^^^
Nit: there is already 'type_id_visit_fn', no need to add new type.
> +{
> + int i, r;
> +
> + for (i = 0; i < btf->nr_types; i++) {
> + struct btf_type *t = btf_type_by_id(btf, btf->start_id + i);
> + struct btf_field_iter it;
> + __u32 *type_id;
> +
> + r = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
> + if (r)
> + return r;
> +
> + while ((type_id = btf_field_iter_next(&it))) {
> + r = visit(type_id, ctx);
> + if (r)
> + return r;
> + }
> + }
> +
> + if (!btf_ext)
> + return 0;
> +
> + r = btf_ext_visit_type_ids(btf_ext, visit, ctx);
> + if (r)
> + return r;
> +
> + return 0;
> +}
> +
> struct btf_dedup;
>
> static struct btf_dedup *btf_dedup_new(struct btf *btf, const struct btf_dedup_opts *opts);
[...]
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-04 13:40 ` [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering Donglin Peng
@ 2025-11-04 23:45 ` Eduard Zingerman
2025-11-05 11:31 ` Donglin Peng
2025-11-05 0:11 ` Andrii Nakryiko
1 sibling, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-04 23:45 UTC (permalink / raw)
To: Donglin Peng, ast
Cc: linux-kernel, bpf, Andrii Nakryiko, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 21:40 +0800, Donglin Peng wrote:
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> This function reorganizes BTF type order according to a provided array of
> type IDs, updating all type references to maintain consistency.
>
> The permutation process involves:
> 1. Shuffling types into new order based on the provided ID mapping
> 2. Remapping all type ID references to point to new locations
> 3. Handling BTF extension data if provided via options
>
> This is particularly useful for optimizing type locality after BTF
> deduplication or for meeting specific layout requirements in specialized
> use cases.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
[...]
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -273,6 +273,40 @@ LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);
> */
> LIBBPF_API int btf__relocate(struct btf *btf, const struct btf *base_btf);
>
> +struct btf_permute_opts {
> + size_t sz;
> + /* optional .BTF.ext info along the main BTF info */
> + struct btf_ext *btf_ext;
> + size_t :0;
> +};
> +#define btf_permute_opts__last_field btf_ext
> +
> +/**
> + * @brief **btf__permute()** rearranges BTF types in-place according to specified mapping
> + * @param btf BTF object to permute
> + * @param ids Array defining new type order. Must contain exactly btf->nr_types elements,
> + * each being a valid type ID in range [btf->start_id, btf->start_id + btf->nr_types - 1]
> + * @param opts Optional parameters, including BTF extension data for reference updates
> + * @return 0 on success, negative error code on failure
> + *
> + * **btf__permute()** performs an in-place permutation of BTF types, rearranging them
> + * according to the order specified in @p ids array. After reordering, all type references
> + * within the BTF data and optional BTF extension are updated to maintain consistency.
> + *
> + * The permutation process consists of two phases:
> + * 1. Type shuffling: Physical reordering of type data in memory
> + * 2. Reference remapping: Updating all type ID references to new locations
Nit: Please drop this paragraph: it is an implementation detail, not
user-facing behavior, and it is obvious from the function code.
> + *
> + * This is particularly useful for optimizing type locality after BTF deduplication
> + * or for meeting specific layout requirements in specialized use cases.
Nit: Please drop this paragraph as well.
> + *
> + * On error, negative error code is returned and errno is set appropriately.
> + * Common error codes include:
> + * - -EINVAL: Invalid parameters or invalid ID mapping (e.g., duplicate IDs, out-of-range IDs)
> + * - -ENOMEM: Memory allocation failure during permutation process
> + */
> +LIBBPF_API int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts);
> +
> struct btf_dump;
>
> struct btf_dump_opts {
[...]
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-04 13:40 ` [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF Donglin Peng
2025-11-04 14:15 ` bot+bpf-ci
@ 2025-11-05 0:06 ` Eduard Zingerman
2025-11-05 0:11 ` Andrii Nakryiko
2 siblings, 0 replies; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 0:06 UTC (permalink / raw)
To: Donglin Peng, ast
Cc: linux-kernel, bpf, Andrii Nakryiko, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 21:40 +0800, Donglin Peng wrote:
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> This patch introduces binary search optimization for BTF type lookups
> when the BTF instance contains sorted types.
>
> The optimization significantly improves performance when searching for
> types in large BTF instances with sorted type names. For unsorted BTF
> or when nr_sorted_types is zero, the implementation falls back to
> the original linear search algorithm.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
lgtm, have two nits.
> tools/lib/bpf/btf.c | 142 +++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 119 insertions(+), 23 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 3bc03f7fe31f..5af14304409c 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -92,6 +92,12 @@ struct btf {
> * - for split BTF counts number of types added on top of base BTF.
> */
> __u32 nr_types;
> + /* number of sorted and named types in this BTF instance:
> + * - doesn't include special [0] void type;
> + * - for split BTF counts number of sorted and named types added on
> + * top of base BTF.
> + */
> + __u32 nr_sorted_types;
Silly question: why is this __u32 and not just a flag?
> /* if not NULL, points to the base BTF on top of which the current
> * split BTF is based
> */
> @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> return type_id;
> }
>
> -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> +/*
> + * Find BTF types with matching names within the [left, right] index range.
> + * On success, updates *left and *right to the boundaries of the matching range
> + * and returns the leftmost matching index.
> + */
> +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> + __s32 *left, __s32 *right)
> {
> - __u32 i, nr_types = btf__type_cnt(btf);
> + const struct btf_type *t;
> + const char *tname;
> + __s32 l, r, m, lmost, rmost;
> + int ret;
> +
> + /* found the leftmost btf_type that matches */
> + l = *left;
> + r = *right;
> + lmost = -1;
> + while (l <= r) {
> + m = l + (r - l) / 2;
> + t = btf_type_by_id(btf, m);
> + tname = btf__str_by_offset(btf, t->name_off);
> + ret = strcmp(tname, name);
> + if (ret < 0) {
> + l = m + 1;
> + } else {
> + if (ret == 0)
> + lmost = m;
> + r = m - 1;
> + }
> + }
Nit: I think Andrii's point was that this can be written a tad shorter, e.g.:
https://elixir.bootlin.com/linux/v6.18-rc4/source/kernel/bpf/verifier.c#L2952
>
> - if (!strcmp(type_name, "void"))
> - return 0;
> + if (lmost == -1)
> + return -ENOENT;
> +
> + /* found the rightmost btf_type that matches */
> + l = lmost;
> + r = *right;
> + rmost = -1;
> + while (l <= r) {
> + m = l + (r - l) / 2;
> + t = btf_type_by_id(btf, m);
> + tname = btf__str_by_offset(btf, t->name_off);
> + ret = strcmp(tname, name);
> + if (ret <= 0) {
> + if (ret == 0)
> + rmost = m;
> + l = m + 1;
> + } else {
> + r = m - 1;
> + }
> + }
>
> - for (i = 1; i < nr_types; i++) {
> - const struct btf_type *t = btf__type_by_id(btf, i);
> - const char *name = btf__name_by_offset(btf, t->name_off);
> + *left = lmost;
> + *right = rmost;
> + return lmost;
> +}
[...]
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-04 13:40 ` [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function Donglin Peng
2025-11-04 23:16 ` Eduard Zingerman
@ 2025-11-05 0:11 ` Andrii Nakryiko
2025-11-05 0:36 ` Eduard Zingerman
1 sibling, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 0:11 UTC (permalink / raw)
To: Donglin Peng
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> Refactor btf_dedup_remap_types() by extracting its core logic into a new
> btf_remap_types() helper function. This eliminates code duplication
> and improves modularity while maintaining the same functionality.
>
> The new function encapsulates iteration over BTF types and BTF ext
> sections, accepting a callback for flexible type ID remapping. This
> makes the type remapping logic more maintainable and reusable.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
Signed-off-by is supposed to have the properly spelled and capitalized
real name of the contributor.
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
> tools/lib/bpf/btf.c | 63 +++++++++++++++++----------------
> tools/lib/bpf/libbpf_internal.h | 1 +
> 2 files changed, 33 insertions(+), 31 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 18907f0fcf9f..5e1c09b5dce8 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -3400,6 +3400,37 @@ int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
> return 0;
> }
>
> +static int btf_remap_types(struct btf *btf, struct btf_ext *btf_ext,
> + btf_remap_type_fn visit, void *ctx)
tbh, my goal is to reduce the amount of callback usage within libbpf,
not add more of it...
I don't like this refactoring. We should convert
btf_ext_visit_type_ids() into iterators, have btf_field_iter_init +
btf_field_iter_next usable in for_each() form, and not try to reuse 5
lines of code. See my comments in the next patch.
> +{
> + int i, r;
> +
> + for (i = 0; i < btf->nr_types; i++) {
> + struct btf_type *t = btf_type_by_id(btf, btf->start_id + i);
> + struct btf_field_iter it;
> + __u32 *type_id;
> +
> + r = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
> + if (r)
> + return r;
> +
> + while ((type_id = btf_field_iter_next(&it))) {
> + r = visit(type_id, ctx);
> + if (r)
> + return r;
> + }
> + }
> +
> + if (!btf_ext)
> + return 0;
> +
> + r = btf_ext_visit_type_ids(btf_ext, visit, ctx);
> + if (r)
> + return r;
> +
> + return 0;
> +}
> +
> struct btf_dedup;
>
> static struct btf_dedup *btf_dedup_new(struct btf *btf, const struct btf_dedup_opts *opts);
> @@ -5320,37 +5351,7 @@ static int btf_dedup_remap_type_id(__u32 *type_id, void *ctx)
> */
> static int btf_dedup_remap_types(struct btf_dedup *d)
> {
> - int i, r;
> -
> - for (i = 0; i < d->btf->nr_types; i++) {
> - struct btf_type *t = btf_type_by_id(d->btf, d->btf->start_id + i);
> - struct btf_field_iter it;
> - __u32 *type_id;
> -
> - r = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
> - if (r)
> - return r;
> -
> - while ((type_id = btf_field_iter_next(&it))) {
> - __u32 resolved_id, new_id;
> -
> - resolved_id = resolve_type_id(d, *type_id);
> - new_id = d->hypot_map[resolved_id];
> - if (new_id > BTF_MAX_NR_TYPES)
> - return -EINVAL;
> -
> - *type_id = new_id;
> - }
> - }
> -
> - if (!d->btf_ext)
> - return 0;
> -
> - r = btf_ext_visit_type_ids(d->btf_ext, btf_dedup_remap_type_id, d);
> - if (r)
> - return r;
> -
> - return 0;
> + return btf_remap_types(d->btf, d->btf_ext, btf_dedup_remap_type_id, d);
> }
>
> /*
> diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> index 35b2527bedec..b09d6163f5c3 100644
> --- a/tools/lib/bpf/libbpf_internal.h
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -582,6 +582,7 @@ int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void
> int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx);
> __s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
> __u32 kind);
> +typedef int (*btf_remap_type_fn)(__u32 *type_id, void *ctx);
>
> /* handle direct returned errors */
> static inline int libbpf_err(int ret)
> --
> 2.34.1
>
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-04 13:40 ` [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering Donglin Peng
2025-11-04 23:45 ` Eduard Zingerman
@ 2025-11-05 0:11 ` Andrii Nakryiko
2025-11-05 0:16 ` Eduard Zingerman
2025-11-05 12:52 ` Donglin Peng
1 sibling, 2 replies; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 0:11 UTC (permalink / raw)
To: Donglin Peng
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> This function reorganizes BTF type order according to a provided array of
> type IDs, updating all type references to maintain consistency.
>
> The permutation process involves:
> 1. Shuffling types into new order based on the provided ID mapping
> 2. Remapping all type ID references to point to new locations
> 3. Handling BTF extension data if provided via options
>
> This is particularly useful for optimizing type locality after BTF
> deduplication or for meeting specific layout requirements in specialized
> use cases.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
> tools/lib/bpf/btf.c | 161 +++++++++++++++++++++++++++++++++++++++
> tools/lib/bpf/btf.h | 34 +++++++++
> tools/lib/bpf/libbpf.map | 1 +
> 3 files changed, 196 insertions(+)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 5e1c09b5dce8..3bc03f7fe31f 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -5830,3 +5830,164 @@ int btf__relocate(struct btf *btf, const struct btf *base_btf)
> btf->owns_base = false;
> return libbpf_err(err);
> }
> +
> +struct btf_permute {
> + /* .BTF section to be permuted in-place */
> + struct btf *btf;
> + struct btf_ext *btf_ext;
> + /* Array of type IDs used for permutation. The array length must equal
/*
* Use this comment style
*/
> + * the number of types in the BTF being permuted, excluding the special
> + * void type at ID 0. For split BTF, the length corresponds to the
> + * number of types added on top of the base BTF.
many words, but what exactly ids[i] means is still not clear, actually...
> + */
> + __u32 *ids;
> + /* Array of type IDs used to map from original type ID to a new permuted
> + * type ID, its length equals to the above ids */
wrong comment style
> + __u32 *map;
"map" is a bit generic. What if we use s/ids/id_map/ and
s/map/id_map_rev/ (for reverse)? I'd use "id_map" naming in the public
API to make it clear that it's a mapping of IDs, not just some array
of IDs.
> +};
> +
> +static int btf_permute_shuffle_types(struct btf_permute *p);
> +static int btf_permute_remap_types(struct btf_permute *p);
> +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx);
> +
> +int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts)
Let's require the user to pass id_map_cnt in addition to id_map itself.
It's easy to get this wrong (especially with that special VOID 0 type
that has to be excluded; I can't even make up my mind whether that's
a good idea or not), so having the user explicitly say what they think
is necessary for permutation is good.
> +{
> + struct btf_permute p;
> + int i, err = 0;
> + __u32 *map = NULL;
> +
> + if (!OPTS_VALID(opts, btf_permute_opts) || !ids)
libbpf doesn't protect against NULL passed for mandatory parameters;
please drop the !ids check.
> + return libbpf_err(-EINVAL);
> +
> + map = calloc(btf->nr_types, sizeof(*map));
> + if (!map) {
> + err = -ENOMEM;
> + goto done;
> + }
> +
> + for (i = 0; i < btf->nr_types; i++)
> + map[i] = BTF_UNPROCESSED_ID;
> +
> + p.btf = btf;
> + p.btf_ext = OPTS_GET(opts, btf_ext, NULL);
> + p.ids = ids;
> + p.map = map;
> +
> + if (btf_ensure_modifiable(btf)) {
> + err = -ENOMEM;
> + goto done;
> + }
> + err = btf_permute_shuffle_types(&p);
> + if (err < 0) {
> + pr_debug("btf_permute_shuffle_types failed: %s\n", errstr(err));
let's drop these pr_debug()s; I don't think this is something we expect to ever see
> + goto done;
> + }
> + err = btf_permute_remap_types(&p);
> + if (err < 0) {
> + pr_debug("btf_permute_remap_types failed: %s\n", errstr(err));
ditto
> + goto done;
> + }
> +
> +done:
> + free(map);
> + return libbpf_err(err);
> +}
> +
> +/* Shuffle BTF types.
> + *
> + * Rearranges types according to the permutation map in p->ids. The p->map
> + * array stores the mapping from original type IDs to new shuffled IDs,
> + * which is used in the next phase to update type references.
> + *
> + * Validates that all IDs in the permutation array are valid and unique.
> + */
> +static int btf_permute_shuffle_types(struct btf_permute *p)
> +{
> + struct btf *btf = p->btf;
> + const struct btf_type *t;
> + __u32 *new_offs = NULL, *map;
> + void *nt, *new_types = NULL;
> + int i, id, len, err;
> +
> + new_offs = calloc(btf->nr_types, sizeof(*new_offs));
we don't really need to allocate memory and maintain this, we can just
shift types around and then do what btf_parse_type_sec() does -- just
go over types one by one and calculate offsets, and update them
in-place inside btf->type_offs
> + new_types = calloc(btf->hdr->type_len, 1);
> + if (!new_offs || !new_types) {
> + err = -ENOMEM;
> + goto out_err;
> + }
> +
> + nt = new_types;
> + for (i = 0; i < btf->nr_types; i++) {
> + id = p->ids[i];
> + /* type IDs from base_btf and the VOID type are not allowed */
> + if (id < btf->start_id) {
> + err = -EINVAL;
> + goto out_err;
> + }
> + /* must be a valid type ID */
> + t = btf__type_by_id(btf, id);
> + if (!t) {
> + err = -EINVAL;
> + goto out_err;
> + }
> + map = &p->map[id - btf->start_id];
> + /* duplicate type IDs are not allowed */
> + if (*map != BTF_UNPROCESSED_ID) {
there is no need for BTF_UNPROCESSED_ID; zero is a perfectly valid
"not yet set" value, as we don't allow remapping VOID 0 to anything
anyway.
> + err = -EINVAL;
> + goto out_err;
> + }
> + len = btf_type_size(t);
> + memcpy(nt, t, len);
once you memcpy() data, you can use that btf_field_iter_init +
btf_field_iter_next to *trivially* remap all IDs, no need for patch 1
refactoring, IMO. And no need for two-phase approach either.
> + new_offs[i] = nt - new_types;
> + *map = btf->start_id + i;
> + nt += len;
> + }
> +
> + free(btf->types_data);
> + free(btf->type_offs);
> + btf->types_data = new_types;
> + btf->type_offs = new_offs;
> + return 0;
> +
> +out_err:
> + free(new_offs);
> + free(new_types);
> + return err;
> +}
> +
> +/* Callback function to remap individual type ID references
> + *
> + * This callback is invoked by btf_remap_types() for each type ID reference
> + * found in the BTF data. It updates the reference to point to the new
> + * permuted type ID using the mapping table.
> + */
> +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> +{
> + struct btf_permute *p = ctx;
> + __u32 new_type_id = *type_id;
> +
> + /* skip references that point into the base BTF */
> + if (new_type_id < p->btf->start_id)
> + return 0;
> +
> + new_type_id = p->map[*type_id - p->btf->start_id];
I'm actually confused, I thought p->ids would be the mapping from
original type ID (minus start_id, of course) to a new desired ID, but
it looks to be the other way? ids is a desired resulting *sequence* of
types identified by their original ID. I find it quite confusing. I
think about permutation as a mapping from original type ID to a new
type ID, am I confused?
> + if (new_type_id > BTF_MAX_NR_TYPES)
> + return -EINVAL;
> +
> + *type_id = new_type_id;
> + return 0;
> +}
[...]
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-04 13:40 ` [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF Donglin Peng
2025-11-04 14:15 ` bot+bpf-ci
2025-11-05 0:06 ` Eduard Zingerman
@ 2025-11-05 0:11 ` Andrii Nakryiko
2025-11-05 0:19 ` Eduard Zingerman
2 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 0:11 UTC (permalink / raw)
To: Donglin Peng
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> This patch introduces binary search optimization for BTF type lookups
> when the BTF instance contains sorted types.
>
> The optimization significantly improves performance when searching for
> types in large BTF instances with sorted type names. For unsorted BTF
> or when nr_sorted_types is zero, the implementation falls back to
> the original linear search algorithm.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
> tools/lib/bpf/btf.c | 142 +++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 119 insertions(+), 23 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 3bc03f7fe31f..5af14304409c 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -92,6 +92,12 @@ struct btf {
> * - for split BTF counts number of types added on top of base BTF.
> */
> __u32 nr_types;
> + /* number of sorted and named types in this BTF instance:
> + * - doesn't include special [0] void type;
> + * - for split BTF counts number of sorted and named types added on
> + * top of base BTF.
> + */
> + __u32 nr_sorted_types;
we don't need to know the count of sorted types; all we need is a
tristate value: a) data is sorted, b) data is not sorted, c) we don't
know yet. Zero should be treated as "we don't know yet". This is
trivial to do with an enum.
> /* if not NULL, points to the base BTF on top of which the current
> * split BTF is based
> */
> @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> return type_id;
> }
>
> -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> +/*
> + * Find BTF types with matching names within the [left, right] index range.
> + * On success, updates *left and *right to the boundaries of the matching range
> + * and returns the leftmost matching index.
> + */
> +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> + __s32 *left, __s32 *right)
I thought we discussed this: why do you need "right"? Two binary
searches where one would do just fine.
Also, this isn't quite the same approach as find_linfo() in
kernel/bpf/log.c; that one doesn't have the extra ret == 0 condition.
pw-bot: cr
> {
> - __u32 i, nr_types = btf__type_cnt(btf);
> + const struct btf_type *t;
> + const char *tname;
> + __s32 l, r, m, lmost, rmost;
> + int ret;
> +
[...]
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 0:11 ` Andrii Nakryiko
@ 2025-11-05 0:16 ` Eduard Zingerman
2025-11-05 1:04 ` Andrii Nakryiko
2025-11-05 12:52 ` Donglin Peng
1 sibling, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 0:16 UTC (permalink / raw)
To: Andrii Nakryiko, Donglin Peng
Cc: ast, linux-kernel, bpf, Alan Maguire, Song Liu, pengdonglin
On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
[...]
> > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > +{
> > + struct btf_permute *p = ctx;
> > + __u32 new_type_id = *type_id;
> > +
> > + /* skip references that point into the base BTF */
> > + if (new_type_id < p->btf->start_id)
> > + return 0;
> > +
> > + new_type_id = p->map[*type_id - p->btf->start_id];
>
> I'm actually confused, I thought p->ids would be the mapping from
> original type ID (minus start_id, of course) to a new desired ID, but
> it looks to be the other way? ids is a desired resulting *sequence* of
> types identified by their original ID. I find it quite confusing. I
> think about permutation as a mapping from original type ID to a new
> type ID, am I confused?
Yes, it is a desired sequence, not a mapping.
I guess it's a bit simpler to use for the sorting use-case, as you can
just swap ids while sorting.
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 0:11 ` Andrii Nakryiko
@ 2025-11-05 0:19 ` Eduard Zingerman
2025-11-05 0:54 ` Andrii Nakryiko
0 siblings, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 0:19 UTC (permalink / raw)
To: Andrii Nakryiko, Donglin Peng
Cc: ast, linux-kernel, bpf, Alan Maguire, Song Liu, pengdonglin
On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
[...]
> > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > return type_id;
> > }
> >
> > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > +/*
> > + * Find BTF types with matching names within the [left, right] index range.
> > + * On success, updates *left and *right to the boundaries of the matching range
> > + * and returns the leftmost matching index.
> > + */
> > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > + __s32 *left, __s32 *right)
>
> I thought we discussed this, why do you need "right"? Two binary
> searches where one would do just fine.
I think the idea is that there would be fewer strcmp()s if there is a
long run of items with identical names.
* Re: [RFC PATCH v4 4/7] libbpf: Implement lazy sorting validation for binary search optimization
2025-11-04 13:40 ` [RFC PATCH v4 4/7] libbpf: Implement lazy sorting validation for binary search optimization Donglin Peng
@ 2025-11-05 0:29 ` Eduard Zingerman
0 siblings, 0 replies; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 0:29 UTC (permalink / raw)
To: Donglin Peng, ast
Cc: linux-kernel, bpf, Andrii Nakryiko, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 21:40 +0800, Donglin Peng wrote:
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> This patch adds lazy validation of BTF type ordering to determine if types
> are sorted by name. The check is performed on first access and cached,
> enabling efficient binary search for sorted BTF while maintaining linear
> search fallback for unsorted cases.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
> tools/lib/bpf/btf.c | 76 +++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 74 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 5af14304409c..0ee00cec5c05 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -26,6 +26,10 @@
>
> #define BTF_MAX_NR_TYPES 0x7fffffffU
> #define BTF_MAX_STR_OFFSET 0x7fffffffU
> +/* sort verification occurs lazily upon first btf_find_type_by_name_kind()
> + * call
> + */
> +#define BTF_NEED_SORT_CHECK ((__u32)-1)
>
> static struct btf_type btf_void;
>
> @@ -96,6 +100,10 @@ struct btf {
> * - doesn't include special [0] void type;
> * - for split BTF counts number of sorted and named types added on
> * top of base BTF.
> + * - BTF_NEED_SORT_CHECK value indicates sort validation will be performed
> + * on first call to btf_find_type_by_name_kind.
> + * - zero value indicates applied sorting check with unsorted BTF or no
> + * named types.
And this can be another flag.
> */
> __u32 nr_sorted_types;
> /* if not NULL, points to the base BTF on top of which the current
> @@ -903,8 +911,67 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> return type_id;
> }
>
> -/*
> - * Find BTF types with matching names within the [left, right] index range.
> +static int btf_compare_type_names(const void *a, const void *b, void *priv)
> +{
> + struct btf *btf = (struct btf *)priv;
> + struct btf_type *ta = btf_type_by_id(btf, *(__u32 *)a);
> + struct btf_type *tb = btf_type_by_id(btf, *(__u32 *)b);
> + const char *na, *nb;
> + bool anon_a, anon_b;
> +
> + na = btf__str_by_offset(btf, ta->name_off);
> + nb = btf__str_by_offset(btf, tb->name_off);
> + anon_a = str_is_empty(na);
> + anon_b = str_is_empty(nb);
> +
> + if (anon_a && !anon_b)
> + return 1;
> + if (!anon_a && anon_b)
> + return -1;
> + if (anon_a && anon_b)
> + return 0;
> +
> + return strcmp(na, nb);
> +}
> +
> +/* Verifies BTF type ordering by name and counts named types.
> + *
> + * Checks that types are sorted in ascending order with named types
> + * before anonymous ones. If verified, sets nr_sorted_types to the
> + * number of named types.
> + */
> +static void btf_check_sorted(struct btf *btf, int start_id)
> +{
> + const struct btf_type *t;
> + int i, n, nr_sorted_types;
> +
> + if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
> + return;
> + btf->nr_sorted_types = 0;
> +
> + if (btf->nr_types < 2)
> + return;
> +
> + nr_sorted_types = 0;
> + n = btf__type_cnt(btf);
> + for (n--, i = start_id; i < n; i++) {
^^^
why not -1 one line before?
> + int k = i + 1;
> +
> + if (btf_compare_type_names(&i, &k, btf) > 0)
> + return;
> + t = btf_type_by_id(btf, k);
> + if (!str_is_empty(btf__str_by_offset(btf, t->name_off)))
> + nr_sorted_types++;
> + }
> +
> + t = btf_type_by_id(btf, start_id);
> + if (!str_is_empty(btf__str_by_offset(btf, t->name_off)))
> + nr_sorted_types++;
> + if (nr_sorted_types)
> + btf->nr_sorted_types = nr_sorted_types;
I think that maintaining nr_sorted_types only for named types is an
unnecessary complication. Binary search will skip those anyway,
probably in one iteration.
> +}
> +
> +/* Find BTF types with matching names within the [left, right] index range.
> * On success, updates *left and *right to the boundaries of the matching range
> * and returns the leftmost matching index.
> */
> @@ -978,6 +1045,8 @@ static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
> }
>
> if (err == -ENOENT) {
> + btf_check_sorted((struct btf *)btf, btf->start_id);
> +
> if (btf->nr_sorted_types) {
> /* binary search */
> __s32 l, r;
> @@ -1102,6 +1171,7 @@ static struct btf *btf_new_empty(struct btf *base_btf)
> btf->fd = -1;
> btf->ptr_sz = sizeof(void *);
> btf->swapped_endian = false;
> + btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
>
> if (base_btf) {
> btf->base_btf = base_btf;
> @@ -1153,6 +1223,7 @@ static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf, b
> btf->start_id = 1;
> btf->start_str_off = 0;
> btf->fd = -1;
> + btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
>
> if (base_btf) {
> btf->base_btf = base_btf;
> @@ -1811,6 +1882,7 @@ static void btf_invalidate_raw_data(struct btf *btf)
> free(btf->raw_data_swapped);
> btf->raw_data_swapped = NULL;
> }
> + btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
> }
>
> /* Ensure BTF is ready to be modified (by splitting into a three memory
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-05 0:11 ` Andrii Nakryiko
@ 2025-11-05 0:36 ` Eduard Zingerman
2025-11-05 0:57 ` Andrii Nakryiko
0 siblings, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 0:36 UTC (permalink / raw)
To: Andrii Nakryiko, Donglin Peng
Cc: ast, linux-kernel, bpf, Alan Maguire, Song Liu, pengdonglin
On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
[...]
> > @@ -3400,6 +3400,37 @@ int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
> > return 0;
> > }
> >
> > +static int btf_remap_types(struct btf *btf, struct btf_ext *btf_ext,
> > + btf_remap_type_fn visit, void *ctx)
>
> tbh, my goal is to reduce the amount of callback usage within libbpf,
> not add more of it...
>
> I don't like this refactoring. We should convert
> btf_ext_visit_type_ids() into iterators, have btf_field_iter_init +
> btf_field_iter_next usable in for_each() form, and not try to reuse 5
> lines of code. See my comments in the next patch.
Remapping types is a concept.
I hate duplicating code for concepts.
Similarly, having patch #3 == patch #5 and patch #4 == patch #6 is
plain ugly. We're just waiting for a bug because we changed one but
forgot to change the other in a year or two.
[...]
* Re: [RFC PATCH v4 7/7] selftests/bpf: Add test cases for btf__permute functionality
2025-11-04 13:40 ` [RFC PATCH v4 7/7] selftests/bpf: Add test cases for btf__permute functionality Donglin Peng
@ 2025-11-05 0:41 ` Eduard Zingerman
0 siblings, 0 replies; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 0:41 UTC (permalink / raw)
To: Donglin Peng, ast
Cc: linux-kernel, bpf, Andrii Nakryiko, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 21:40 +0800, Donglin Peng wrote:
> From: pengdonglin <pengdonglin@xiaomi.com>
>
> This patch introduces test cases for the btf__permute function to ensure
> it works correctly with both base BTF and split BTF scenarios.
>
> The test suite includes:
> - test_permute_base: Validates permutation on standalone BTF
> - test_permute_split: Tests permutation on split BTF with base dependencies
>
> Each test verifies that type IDs are correctly rearranged and type
> references are properly updated after permutation operations.
>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Song Liu <song@kernel.org>
> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> ---
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
[...]
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 0:19 ` Eduard Zingerman
@ 2025-11-05 0:54 ` Andrii Nakryiko
2025-11-05 1:17 ` Eduard Zingerman
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 0:54 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > return type_id;
> > > }
> > >
> > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > +/*
> > > + * Find BTF types with matching names within the [left, right] index range.
> > > + * On success, updates *left and *right to the boundaries of the matching range
> > > + * and returns the leftmost matching index.
> > > + */
> > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > + __s32 *left, __s32 *right)
> >
> > I thought we discussed this, why do you need "right"? Two binary
> > searches where one would do just fine.
>
> I think the idea is that there would be less strcmp's if there is a
> long sequence of items with identical names.
Sure, it's a tradeoff. But how long is the run of duplicate-name
entries we expect in kernel BTF? An additional O(log N) search over
70K+ types will, with high likelihood, take more comparisons.
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-05 0:36 ` Eduard Zingerman
@ 2025-11-05 0:57 ` Andrii Nakryiko
2025-11-05 1:23 ` Eduard Zingerman
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 0:57 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 4:36 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > > @@ -3400,6 +3400,37 @@ int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
> > > return 0;
> > > }
> > >
> > > +static int btf_remap_types(struct btf *btf, struct btf_ext *btf_ext,
> > > + btf_remap_type_fn visit, void *ctx)
> >
> > tbh, my goal is to reduce the amount of callback usage within libbpf,
> > not add more of it...
> >
> > I don't like this refactoring. We should convert
> > btf_ext_visit_type_ids() into iterators, have btf_field_iter_init +
> > btf_field_iter_next usable in for_each() form, and not try to reuse 5
> > lines of code. See my comments in the next patch.
>
> Remapping types is a concept.
> I hate duplicating code for concepts.
> Similarly, having patch #3 == patch #5 and patch #4 == patch #6 is
> plain ugly. Just waiting for a bug because we changed the one but
> forgot to change another in a year or two.
The tricky and evolving part (iterating all type ID fields) is abstracted
behind the iterator (and we should do the same for btf_ext). Iterating
types is not tricky and doesn't require constant maintenance.
Same for binary search; I don't see why we'd need to adjust it. So no,
I don't want to share code between kernel and libbpf just to reuse a
binary search implementation, sorry.
>
> [...]
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 0:16 ` Eduard Zingerman
@ 2025-11-05 1:04 ` Andrii Nakryiko
2025-11-05 1:20 ` Eduard Zingerman
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 1:04 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > +{
> > > + struct btf_permute *p = ctx;
> > > + __u32 new_type_id = *type_id;
> > > +
> > > + /* skip references that point into the base BTF */
> > > + if (new_type_id < p->btf->start_id)
> > > + return 0;
> > > +
> > > + new_type_id = p->map[*type_id - p->btf->start_id];
> >
> > I'm actually confused, I thought p->ids would be the mapping from
> > original type ID (minus start_id, of course) to a new desired ID, but
> > it looks to be the other way? ids is a desired resulting *sequence* of
> > types identified by their original ID. I find it quite confusing. I
> > think about permutation as a mapping from original type ID to a new
> > type ID, am I confused?
>
> Yes, it is a desired sequence, not a mapping.
> I guess it's a bit simpler to use for the sorting use-case, as you can just
> swap ids while sorting.
The question is really what makes most sense as an interface. Because
for sorting cases it's just the matter of a two-line for() loop to
create ID mapping once types are sorted.
I have slight preference for id_map approach because it is easy to
extend to the case of selectively dropping some types. We can just
define that such IDs should be mapped to zero. This will work as a
natural extension. With the desired end sequence of IDs, it's less
natural and will require more work to determine which IDs are missing
from the sequence.
So unless there is some really good and strong reason, shall we go
with the ID mapping approach?
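[Editor's note: the "two-line for() loop" mentioned above, converting a desired sequence into an ID map, could be sketched roughly as follows. The names `seq_to_id_map`, `ids`, and `id_map` are illustrative, not libbpf API.]

```c
#include <assert.h>

/* Sketch: convert the "desired sequence" form (ids[new_pos - start_id] = old_id)
 * into the id_map form (id_map[old_id - start_id] = new_id) discussed above.
 */
static void seq_to_id_map(const unsigned *ids, unsigned *id_map,
			  unsigned nr_types, unsigned start_id)
{
	for (unsigned i = 0; i < nr_types; i++)
		id_map[ids[i] - start_id] = start_id + i;
}
```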
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 0:54 ` Andrii Nakryiko
@ 2025-11-05 1:17 ` Eduard Zingerman
2025-11-05 13:48 ` Donglin Peng
0 siblings, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 1:17 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> >
> > [...]
> >
> > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > return type_id;
> > > > }
> > > >
> > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > +/*
> > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > + * and returns the leftmost matching index.
> > > > + */
> > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > + __s32 *left, __s32 *right)
> > >
> > > I thought we discussed this, why do you need "right"? Two binary
> > > searches where one would do just fine.
> >
> > I think the idea is that there would be less strcmp's if there is a
> > long sequence of items with identical names.
>
> Sure, it's a tradeoff. But how long is the set of duplicate name
> entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> with high likelihood will take more comparisons.
$ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
51737 '(anon)'
277 'bpf_kfunc'
4 'long
3 'perf_aux_event'
3 'workspace'
2 'ata_acpi_gtm'
2 'avc_cache_stats'
2 'bh_accounting'
2 'bp_cpuinfo'
2 'bpf_fastcall'
'bpf_kfunc' is probably for decl_tags.
So I agree with you regarding the second binary search, it is not
necessary. But skipping all anonymous types (and thus having to
maintain nr_sorted_types) might be useful, on each search two
iterations would be wasted to skip those.
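[Editor's note: the "skip all anonymous types" idea above amounts to sorting named types into a contiguous prefix so binary search only covers `nr_sorted_types` entries. A minimal comparator sketch, assuming an illustrative `struct type_ent` rather than libbpf's internal layout:]

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

struct type_ent {
	unsigned id;
	const char *name;	/* NULL or "" means anonymous */
};

/* Group all named types before anonymous ones, named ones sorted by name,
 * so lookups can be confined to the named prefix. */
static int cmp_named_first(const void *a, const void *b)
{
	const struct type_ent *x = a, *y = b;
	int x_anon = !x->name || !x->name[0];
	int y_anon = !y->name || !y->name[0];

	if (x_anon != y_anon)
		return x_anon - y_anon;	/* named sorts before anonymous */
	if (x_anon)
		return 0;		/* anonymous relative order is irrelevant */
	return strcmp(x->name, y->name);
}
```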
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 1:04 ` Andrii Nakryiko
@ 2025-11-05 1:20 ` Eduard Zingerman
2025-11-05 13:19 ` Donglin Peng
2025-11-05 18:23 ` Andrii Nakryiko
0 siblings, 2 replies; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 1:20 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> >
> > [...]
> >
> > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > +{
> > > > + struct btf_permute *p = ctx;
> > > > + __u32 new_type_id = *type_id;
> > > > +
> > > > + /* skip references that point into the base BTF */
> > > > + if (new_type_id < p->btf->start_id)
> > > > + return 0;
> > > > +
> > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > >
> > > I'm actually confused, I thought p->ids would be the mapping from
> > > original type ID (minus start_id, of course) to a new desired ID, but
> > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > types identified by their original ID. I find it quite confusing. I
> > > think about permutation as a mapping from original type ID to a new
> > > type ID, am I confused?
> >
> > Yes, it is a desired sequence, not a mapping.
> > I guess it's a bit simpler to use for the sorting use-case, as you can just
> > swap ids while sorting.
>
> The question is really what makes most sense as an interface. Because
> for sorting cases it's just the matter of a two-line for() loop to
> create ID mapping once types are sorted.
>
> I have slight preference for id_map approach because it is easy to
> extend to the case of selectively dropping some types. We can just
> define that such IDs should be mapped to zero. This will work as a
> natural extension. With the desired end sequence of IDs, it's less
> natural and will require more work to determine which IDs are missing
> from the sequence.
>
> So unless there is some really good and strong reason, shall we go
> with the ID mapping approach?
If the interface is extended with types_cnt, as you suggest, deleting
types is trivial with the sequence interface as well. At least the way it
is implemented by this patch, you just copy elements from 'ids' one by
one.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-05 0:57 ` Andrii Nakryiko
@ 2025-11-05 1:23 ` Eduard Zingerman
2025-11-05 18:20 ` Andrii Nakryiko
0 siblings, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 1:23 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, 2025-11-04 at 16:57 -0800, Andrii Nakryiko wrote:
> On Tue, Nov 4, 2025 at 4:36 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> >
> > [...]
> >
> > > > @@ -3400,6 +3400,37 @@ int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
> > > > return 0;
> > > > }
> > > >
> > > > +static int btf_remap_types(struct btf *btf, struct btf_ext *btf_ext,
> > > > + btf_remap_type_fn visit, void *ctx)
> > >
> > > tbh, my goal is to reduce the amount of callback usage within libbpf,
> > > not add more of it...
> > >
> > > I don't like this refactoring. We should convert
> > > btf_ext_visit_type_ids() into iterators, have btf_field_iter_init +
> > > btf_field_iter_next usable in for_each() form, and not try to reuse 5
> > > lines of code. See my comments in the next patch.
> >
> > Remapping types is a concept.
> > I hate duplicating code for concepts.
> > Similarly, having patch #3 == patch #5 and patch #4 == patch #6 is
> > plain ugly. Just waiting for a bug because we changed the one but
> > forgot to change another in a year or two.
>
> Tricky and evolving part (iterating all type ID fields) is abstracted
> behind the iterator (and we should do the same for btf_ext). Iterating
> types is not something tricky or requiring constant maintenance.
>
> Same for binary search, I don't see why we'd need to adjust it. So no,
> I don't want to share code between kernel and libbpf just to reuse
> binary search implementation, sorry.
<rant>
Sure binary search is trivial, but did you count how many times you
asked people to re-implement binary search as in [1]?
[1] https://elixir.bootlin.com/linux/v6.18-rc4/source/kernel/bpf/verifier.c#L2952
</rant>
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-04 23:45 ` Eduard Zingerman
@ 2025-11-05 11:31 ` Donglin Peng
0 siblings, 0 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-05 11:31 UTC (permalink / raw)
To: Eduard Zingerman
Cc: ast, linux-kernel, bpf, Andrii Nakryiko, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 7:45 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 21:40 +0800, Donglin Peng wrote:
> > From: pengdonglin <pengdonglin@xiaomi.com>
> >
> > Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> > This function reorganizes BTF type order according to a provided array of
> > type IDs, updating all type references to maintain consistency.
> >
> > The permutation process involves:
> > 1. Shuffling types into new order based on the provided ID mapping
> > 2. Remapping all type ID references to point to new locations
> > 3. Handling BTF extension data if provided via options
> >
> > This is particularly useful for optimizing type locality after BTF
> > deduplication or for meeting specific layout requirements in specialized
> > use cases.
> >
> > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Cc: Alan Maguire <alan.maguire@oracle.com>
> > Cc: Song Liu <song@kernel.org>
> > Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> > Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
>
> Acked-by: Eduard Zingerman <eddyz87@gmail.com>
>
> [...]
>
> > --- a/tools/lib/bpf/btf.h
> > +++ b/tools/lib/bpf/btf.h
> > @@ -273,6 +273,40 @@ LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);
> > */
> > LIBBPF_API int btf__relocate(struct btf *btf, const struct btf *base_btf);
> >
> > +struct btf_permute_opts {
> > + size_t sz;
> > + /* optional .BTF.ext info along the main BTF info */
> > + struct btf_ext *btf_ext;
> > + size_t :0;
> > +};
> > +#define btf_permute_opts__last_field btf_ext
> > +
> > +/**
> > + * @brief **btf__permute()** rearranges BTF types in-place according to specified mapping
> > + * @param btf BTF object to permute
> > + * @param ids Array defining new type order. Must contain exactly btf->nr_types elements,
> > + * each being a valid type ID in range [btf->start_id, btf->start_id + btf->nr_types - 1]
> > + * @param opts Optional parameters, including BTF extension data for reference updates
> > + * @return 0 on success, negative error code on failure
> > + *
> > + * **btf__permute()** performs an in-place permutation of BTF types, rearranging them
> > + * according to the order specified in @p ids array. After reordering, all type references
> > + * within the BTF data and optional BTF extension are updated to maintain consistency.
> > + *
> > + * The permutation process consists of two phases:
> > + * 1. Type shuffling: Physical reordering of type data in memory
> > + * 2. Reference remapping: Updating all type ID references to new locations
>
> Nit: Please drop this paragraph: it is an implementation detail, not
> user-facing behavior, and it is obvious from the function code.
Thanks, I will fix it in the next version.
>
> > + *
> > + * This is particularly useful for optimizing type locality after BTF deduplication
> > + * or for meeting specific layout requirements in specialized use cases.
>
> Nit: Please drop this paragraph as well.
Thanks, I will fix it in the next version.
>
> > + *
> > + * On error, negative error code is returned and errno is set appropriately.
> > + * Common error codes include:
> > + * - -EINVAL: Invalid parameters or invalid ID mapping (e.g., duplicate IDs, out-of-range IDs)
> > + * - -ENOMEM: Memory allocation failure during permutation process
> > + */
> > +LIBBPF_API int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts);
> > +
> > struct btf_dump;
> >
> > struct btf_dump_opts {
>
> [...]
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 0:11 ` Andrii Nakryiko
2025-11-05 0:16 ` Eduard Zingerman
@ 2025-11-05 12:52 ` Donglin Peng
2025-11-05 18:29 ` Andrii Nakryiko
1 sibling, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-05 12:52 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 8:11 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> >
> > From: pengdonglin <pengdonglin@xiaomi.com>
> >
> > Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> > This function reorganizes BTF type order according to a provided array of
> > type IDs, updating all type references to maintain consistency.
> >
> > The permutation process involves:
> > 1. Shuffling types into new order based on the provided ID mapping
> > 2. Remapping all type ID references to point to new locations
> > 3. Handling BTF extension data if provided via options
> >
> > This is particularly useful for optimizing type locality after BTF
> > deduplication or for meeting specific layout requirements in specialized
> > use cases.
> >
> > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Cc: Alan Maguire <alan.maguire@oracle.com>
> > Cc: Song Liu <song@kernel.org>
> > Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> > Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> > ---
> > tools/lib/bpf/btf.c | 161 +++++++++++++++++++++++++++++++++++++++
> > tools/lib/bpf/btf.h | 34 +++++++++
> > tools/lib/bpf/libbpf.map | 1 +
> > 3 files changed, 196 insertions(+)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 5e1c09b5dce8..3bc03f7fe31f 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -5830,3 +5830,164 @@ int btf__relocate(struct btf *btf, const struct btf *base_btf)
> > btf->owns_base = false;
> > return libbpf_err(err);
> > }
> > +
> > +struct btf_permute {
> > + /* .BTF section to be permuted in-place */
> > + struct btf *btf;
> > + struct btf_ext *btf_ext;
> > + /* Array of type IDs used for permutation. The array length must equal
>
> /*
> * Use this comment style
> */
Thanks.
>
> > + * the number of types in the BTF being permuted, excluding the special
> > + * void type at ID 0. For split BTF, the length corresponds to the
> > + * number of types added on top of the base BTF.
>
> many words, but what exactly ids[i] means is still not clear, actually...
Thanks. I'll clarify the description. Is the following parameter
explanation acceptable?
@param ids Array containing original type IDs (excluding VOID type ID 0)
in user-defined order. The array size must match btf->nr_types, which
also excludes VOID type ID 0.
>
> > + */
> > + __u32 *ids;
> > + /* Array of type IDs used to map from original type ID to a new permuted
> > + * type ID, its length equals to the above ids */
>
> wrong comment style
Thanks, I will fix it in the next version.
>
> > + __u32 *map;
>
> "map" is a bit generic. What if we use s/ids/id_map/ and
> s/map/id_map_rev/ (for reverse)? I'd use "id_map" naming in the public
> API to make it clear that it's a mapping of IDs, not just some array
> of IDs.
Thank you for the suggestion. I agree that renaming 'map' to 'id_map'
makes sense for clarity, but 'ids' seems correct as it denotes a collection of
IDs, not a mapping structure.
>
> > +};
> > +
> > +static int btf_permute_shuffle_types(struct btf_permute *p);
> > +static int btf_permute_remap_types(struct btf_permute *p);
> > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx);
> > +
> > +int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts)
>
> Let's require user to pass id_map_cnt in addition to id_map itself.
> It's easy to get this wrong (especially with that special VOID 0 type
> that has to be excluded, which I can't even make up my mind if that's
> a good idea or not), so having user explicitly say what they think is
> necessary for permutation is good.
Thank you for your suggestion. However, I am concerned that introducing
an additional `id_map_cnt` parameter could increase complexity. Specifically,
if `id_map_cnt` is less than `btf->nr_types`, we might need to consider whether
to resize the BTF. This could lead to missing types, potential ID remapping
failures, or even require BTF re-deduplication if certain name strings are no
longer referenced by any types.
>
> > +{
> > + struct btf_permute p;
> > + int i, err = 0;
> > + __u32 *map = NULL;
> > +
> > + if (!OPTS_VALID(opts, btf_permute_opts) || !ids)
>
> libbpf doesn't protect against NULL passed for mandatory parameters,
> please drop !ids check
Thanks, I will fix it.
>
> > + return libbpf_err(-EINVAL);
> > +
> > + map = calloc(btf->nr_types, sizeof(*map));
> > + if (!map) {
> > + err = -ENOMEM;
> > + goto done;
> > + }
> > +
> > + for (i = 0; i < btf->nr_types; i++)
> > + map[i] = BTF_UNPROCESSED_ID;
> > +
> > + p.btf = btf;
> > + p.btf_ext = OPTS_GET(opts, btf_ext, NULL);
> > + p.ids = ids;
> > + p.map = map;
> > +
> > + if (btf_ensure_modifiable(btf)) {
> > + err = -ENOMEM;
> > + goto done;
> > + }
> > + err = btf_permute_shuffle_types(&p);
> > + if (err < 0) {
> > + pr_debug("btf_permute_shuffle_types failed: %s\n", errstr(err));
>
> let's drop these pr_debug(), I don't think it's something we expect to ever see
Thanks, I will remove it.
>
> > + goto done;
> > + }
> > + err = btf_permute_remap_types(&p);
> > + if (err < 0) {
> > + pr_debug("btf_permute_remap_types failed: %s\n", errstr(err));
>
> ditto
Thanks, I will remove it.
>
> > + goto done;
> > + }
> > +
> > +done:
> > + free(map);
> > + return libbpf_err(err);
> > +}
> > +
> > +/* Shuffle BTF types.
> > + *
> > + * Rearranges types according to the permutation map in p->ids. The p->map
> > + * array stores the mapping from original type IDs to new shuffled IDs,
> > + * which is used in the next phase to update type references.
> > + *
> > + * Validates that all IDs in the permutation array are valid and unique.
> > + */
> > +static int btf_permute_shuffle_types(struct btf_permute *p)
> > +{
> > + struct btf *btf = p->btf;
> > + const struct btf_type *t;
> > + __u32 *new_offs = NULL, *map;
> > + void *nt, *new_types = NULL;
> > + int i, id, len, err;
> > +
> > + new_offs = calloc(btf->nr_types, sizeof(*new_offs));
>
> we don't really need to allocate memory and maintain this, we can just
> shift types around and then do what btf_parse_type_sec() does -- just
> go over types one by one and calculate offsets, and update them
> in-place inside btf->type_offs
Thank you for the suggestion. However, this approach is not viable because
the `btf__type_by_id()` function relies critically on the integrity of the
`btf->type_offs` data structure. Attempting to modify `type_offs` through
in-place operations could corrupt memory and lead to segmentation faults
due to invalid pointer dereferencing.
>
> > + new_types = calloc(btf->hdr->type_len, 1);
> > + if (!new_offs || !new_types) {
> > + err = -ENOMEM;
> > + goto out_err;
> > + }
> > +
> > + nt = new_types;
> > + for (i = 0; i < btf->nr_types; i++) {
> > + id = p->ids[i];
> > + /* type IDs from base_btf and the VOID type are not allowed */
> > + if (id < btf->start_id) {
> > + err = -EINVAL;
> > + goto out_err;
> > + }
> > + /* must be a valid type ID */
> > + t = btf__type_by_id(btf, id);
> > + if (!t) {
> > + err = -EINVAL;
> > + goto out_err;
> > + }
> > + map = &p->map[id - btf->start_id];
> > + /* duplicate type IDs are not allowed */
> > + if (*map != BTF_UNPROCESSED_ID) {
>
> there is no need for BTF_UNPROCESSED_ID, zero is a perfectly valid
> value to use as "not yet set" value, as we don't allow remapping VOID
> 0 to anything anyways.
Thanks, I will fix it.
>
> > + err = -EINVAL;
> > + goto out_err;
> > + }
> > + len = btf_type_size(t);
> > + memcpy(nt, t, len);
>
> once you memcpy() data, you can use that btf_field_iter_init +
> btf_field_iter_next to *trivially* remap all IDs, no need for patch 1
> refactoring, IMO. And no need for two-phase approach either.
>
> > + new_offs[i] = nt - new_types;
> > + *map = btf->start_id + i;
> > + nt += len;
> > + }
> > +
> > + free(btf->types_data);
> > + free(btf->type_offs);
> > + btf->types_data = new_types;
> > + btf->type_offs = new_offs;
> > + return 0;
> > +
> > +out_err:
> > + free(new_offs);
> > + free(new_types);
> > + return err;
> > +}
> > +
> > +/* Callback function to remap individual type ID references
> > + *
> > + * This callback is invoked by btf_remap_types() for each type ID reference
> > + * found in the BTF data. It updates the reference to point to the new
> > + * permuted type ID using the mapping table.
> > + */
> > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > +{
> > + struct btf_permute *p = ctx;
> > + __u32 new_type_id = *type_id;
> > +
> > + /* skip references that point into the base BTF */
> > + if (new_type_id < p->btf->start_id)
> > + return 0;
> > +
> > + new_type_id = p->map[*type_id - p->btf->start_id];
>
> I'm actually confused, I thought p->ids would be the mapping from
> original type ID (minus start_id, of course) to a new desired ID, but
> it looks to be the other way? ids is a desired resulting *sequence* of
> types identified by their original ID. I find it quite confusing. I
> think about permutation as a mapping from original type ID to a new
> type ID, am I confused?
>
>
> > + if (new_type_id > BTF_MAX_NR_TYPES)
> > + return -EINVAL;
> > +
> > + *type_id = new_type_id;
> > + return 0;
> > +}
>
> [...]
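[Editor's note: a standalone sketch of the remap step from the quoted btf_permute_remap_type_id() above — references below start_id point into the base BTF and are left alone, everything else goes through the map. Illustrative only, not the libbpf implementation:]

```c
#include <assert.h>

/* Remap one type-ID reference through an id_map. IDs below start_id
 * (including VOID 0) refer to base BTF and are not touched. */
static void remap_id(unsigned *type_id, const unsigned *id_map,
		     unsigned start_id)
{
	if (*type_id < start_id)
		return;
	*type_id = id_map[*type_id - start_id];
}
```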
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 1:20 ` Eduard Zingerman
@ 2025-11-05 13:19 ` Donglin Peng
2025-11-05 18:32 ` Andrii Nakryiko
2025-11-05 18:23 ` Andrii Nakryiko
1 sibling, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-05 13:19 UTC (permalink / raw)
To: Eduard Zingerman, Andrii Nakryiko
Cc: ast, linux-kernel, bpf, Alan Maguire, Song Liu, pengdonglin
On Wed, Nov 5, 2025 at 9:20 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> > On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > >
> > > [...]
> > >
> > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > > +{
> > > > > + struct btf_permute *p = ctx;
> > > > > + __u32 new_type_id = *type_id;
> > > > > +
> > > > > + /* skip references that point into the base BTF */
> > > > > + if (new_type_id < p->btf->start_id)
> > > > > + return 0;
> > > > > +
> > > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > > >
> > > > I'm actually confused, I thought p->ids would be the mapping from
> > > > original type ID (minus start_id, of course) to a new desired ID, but
> > > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > > types identified by their original ID. I find it quite confusing. I
> > > > think about permutation as a mapping from original type ID to a new
> > > > type ID, am I confused?
> > >
> > > Yes, it is a desired sequence, not a mapping.
> > > I guess it's a bit simpler to use for the sorting use-case, as you can just
> > > swap ids while sorting.
> >
> > The question is really what makes most sense as an interface. Because
> > for sorting cases it's just the matter of a two-line for() loop to
> > create ID mapping once types are sorted.
> >
> > I have slight preference for id_map approach because it is easy to
> > extend to the case of selectively dropping some types. We can just
> > define that such IDs should be mapped to zero. This will work as a
> > natural extension. With the desired end sequence of IDs, it's less
> > natural and will require more work to determine which IDs are missing
> > from the sequence.
> >
> > So unless there is some really good and strong reason, shall we go
> > with the ID mapping approach?
>
> If the interface is extended with types_cnt, as you suggest, deleting
> types is trivial with the sequence interface as well. At least the way it
> is implemented by this patch, you just copy elements from 'ids' one by
> one.
Thank you. I also favor the sequence interface approach.
If I understand correctly, using the ID mapping method would require
creating an additional ID array to cache the ordering for each type,
which appears more complex. Furthermore, generating an ID map might
not be straightforward for end users in the sorting scenario, IMO.
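[Editor's note: for the sorting scenario mentioned above, the desired-sequence form falls out of qsort directly — sort (id, name) pairs by name and read the IDs off in order. A sketch with illustrative names, not libbpf code:]

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

struct ent { unsigned id; const char *name; };

static int by_name(const void *a, const void *b)
{
	return strcmp(((const struct ent *)a)->name,
		      ((const struct ent *)b)->name);
}

/* Produce the desired sequence (ids_out[new_pos] = old_id) by sorting
 * the (id, name) pairs by name. */
static void make_sorted_seq(struct ent *ents, unsigned n, unsigned *ids_out)
{
	qsort(ents, n, sizeof(*ents), by_name);
	for (unsigned i = 0; i < n; i++)
		ids_out[i] = ents[i].id;
}
```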
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 5/7] btf: Optimize type lookup with binary search
2025-11-04 17:14 ` Alexei Starovoitov
@ 2025-11-05 13:22 ` Donglin Peng
0 siblings, 0 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-05 13:22 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Alexei Starovoitov, LKML, bpf, Eduard Zingerman, Andrii Nakryiko,
Alan Maguire, Song Liu, pengdonglin
On Wed, Nov 5, 2025 at 1:15 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 5:41 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> >
> > From: pengdonglin <pengdonglin@xiaomi.com>
> >
> > Improve btf_find_by_name_kind() performance by adding binary search
> > support for sorted types. Falls back to linear search for compatibility.
> >
> > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Cc: Alan Maguire <alan.maguire@oracle.com>
> > Cc: Song Liu <song@kernel.org>
> > Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> > Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> > ---
> > kernel/bpf/btf.c | 111 ++++++++++++++++++++++++++++++++++++++++++-----
> > 1 file changed, 101 insertions(+), 10 deletions(-)
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 0de8fc8a0e0b..da35d8636b9b 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -259,6 +259,7 @@ struct btf {
> > void *nohdr_data;
> > struct btf_header hdr;
> > u32 nr_types; /* includes VOID for base BTF */
> > + u32 nr_sorted_types; /* exclude VOID for base BTF */
> > u32 types_size;
> > u32 data_size;
> > refcount_t refcnt;
> > @@ -494,6 +495,11 @@ static bool btf_type_is_modifier(const struct btf_type *t)
> > return false;
> > }
> >
> > +static int btf_start_id(const struct btf *btf)
> > +{
> > + return btf->start_id + (btf->base_btf ? 0 : 1);
> > +}
> > +
> > bool btf_type_is_void(const struct btf_type *t)
> > {
> > return t == &btf_void;
> > @@ -544,24 +550,109 @@ u32 btf_nr_types(const struct btf *btf)
> > return total;
> > }
> >
> > -s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> > +/* Find BTF types with matching names within the [left, right] index range.
> > + * On success, updates *left and *right to the boundaries of the matching range
> > + * and returns the leftmost matching index.
> > + */
> > +static s32 btf_find_by_name_kind_bsearch(const struct btf *btf, const char *name,
> > + s32 *left, s32 *right)
> > {
> > const struct btf_type *t;
> > const char *tname;
> > - u32 i, total;
> > + s32 l, r, m, lmost, rmost;
> > + int ret;
> >
> > - total = btf_nr_types(btf);
> > - for (i = 1; i < total; i++) {
> > - t = btf_type_by_id(btf, i);
> > - if (BTF_INFO_KIND(t->info) != kind)
> > - continue;
> > + /* found the leftmost btf_type that matches */
> > + l = *left;
> > + r = *right;
> > + lmost = -1;
> > + while (l <= r) {
> > + m = l + (r - l) / 2;
> > + t = btf_type_by_id(btf, m);
> > + tname = btf_name_by_offset(btf, t->name_off);
> > + ret = strcmp(tname, name);
> > + if (ret < 0) {
> > + l = m + 1;
> > + } else {
> > + if (ret == 0)
> > + lmost = m;
> > + r = m - 1;
> > + }
> > + }
> >
> > + if (lmost == -1)
> > + return -ENOENT;
> > +
> > + /* found the rightmost btf_type that matches */
> > + l = lmost;
> > + r = *right;
> > + rmost = -1;
> > + while (l <= r) {
> > + m = l + (r - l) / 2;
> > + t = btf_type_by_id(btf, m);
> > tname = btf_name_by_offset(btf, t->name_off);
> > - if (!strcmp(tname, name))
> > - return i;
> > + ret = strcmp(tname, name);
> > + if (ret <= 0) {
> > + if (ret == 0)
> > + rmost = m;
> > + l = m + 1;
> > + } else {
> > + r = m - 1;
> > + }
> > }
> >
> > - return -ENOENT;
> > + *left = lmost;
> > + *right = rmost;
> > + return lmost;
> > +}
> > +
> > +s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> > +{
> > + const struct btf *base_btf = btf_base_btf(btf);;
> > + const struct btf_type *t;
> > + const char *tname;
> > + int err = -ENOENT;
> > +
> > + if (base_btf)
> > + err = btf_find_by_name_kind(base_btf, name, kind);
> > +
> > + if (err == -ENOENT) {
>
> Please avoid the needless indent.
Thanks. I will fix it.
>
> > + if (btf->nr_sorted_types) {
>
> looks buggy,
> since you init it to btf->nr_sorted_types = BTF_NEED_SORT_CHECK;
>
> Also AI is right. Init the field in the same patch.
Thanks. I will fix it.
>
> pw-bot: cr
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 1:17 ` Eduard Zingerman
@ 2025-11-05 13:48 ` Donglin Peng
2025-11-05 16:52 ` Eduard Zingerman
2025-11-05 18:11 ` Andrii Nakryiko
0 siblings, 2 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-05 13:48 UTC (permalink / raw)
To: Eduard Zingerman, Andrii Nakryiko
Cc: ast, linux-kernel, bpf, Alan Maguire, Song Liu, pengdonglin
On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > >
> > > [...]
> > >
> > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > return type_id;
> > > > > }
> > > > >
> > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > +/*
> > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > + * and returns the leftmost matching index.
> > > > > + */
> > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > + __s32 *left, __s32 *right)
> > > >
> > > > I thought we discussed this, why do you need "right"? Two binary
> > > > searches where one would do just fine.
> > >
> > > I think the idea is that there would be less strcmp's if there is a
> > > long sequence of items with identical names.
> >
> > Sure, it's a tradeoff. But how long is the set of duplicate name
> > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > with high likelihood will take more comparisons.
>
> $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> 51737 '(anon)'
> 277 'bpf_kfunc'
> 4 'long
> 3 'perf_aux_event'
> 3 'workspace'
> 2 'ata_acpi_gtm'
> 2 'avc_cache_stats'
> 2 'bh_accounting'
> 2 'bp_cpuinfo'
> 2 'bpf_fastcall'
>
> 'bpf_kfunc' is probably for decl_tags.
> So I agree with you regarding the second binary search, it is not
> necessary. But skipping all anonymous types (and thus having to
> maintain nr_sorted_types) might be useful, on each search two
> iterations would be wasted to skip those.
Thank you. After removing the redundant second binary search, lookup
performance improved significantly compared with the two-iteration version.
Test Case: Locate all 58,719 named types in vmlinux BTF
Methodology:
./vmtest.sh -- ./test_progs -t btf_permute/perf -v
Two iterations:
| Condition | Lookup Time | Improvement |
|--------------------|-------------|-------------|
| Unsorted (Linear) | 17,282 ms | Baseline |
| Sorted (Binary) | 19 ms | 909x faster |
One iteration:
Results:
| Condition | Lookup Time | Improvement |
|--------------------|-------------|-------------|
| Unsorted (Linear) | 17,619 ms | Baseline |
| Sorted (Binary) | 10 ms | 1762x faster |
Here is the code implementation with a single iteration approach.
I believe this scenario differs from find_linfo because we cannot
determine in advance whether the specified type name will be found.
Please correct me if I've misunderstood anything, and I welcome any
guidance on this matter.
static __s32 btf_find_type_by_name_bsearch(const struct btf *btf,
					   const char *name,
					   __s32 start_id)
{
	const struct btf_type *t;
	const char *tname;
	__s32 l, r, m, lmost = -ENOENT;
	int ret;

	/* find the leftmost btf_type that matches */
	l = start_id;
	r = btf__type_cnt(btf) - 1;
	while (l <= r) {
		m = l + (r - l) / 2;
		t = btf_type_by_id(btf, m);
		if (!t->name_off) {
			ret = 1;
		} else {
			tname = btf__str_by_offset(btf, t->name_off);
			ret = !tname ? 1 : strcmp(tname, name);
		}
		if (ret < 0) {
			l = m + 1;
		} else {
			if (ret == 0)
				lmost = m;
			r = m - 1;
		}
	}

	return lmost;
}

static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
					const char *type_name, __u32 kind)
{
	const struct btf_type *t;
	const char *tname;
	int err = -ENOENT;
	__u32 total;

	if (!btf)
		goto out;

	if (start_id < btf->start_id) {
		err = btf_find_type_by_name_kind(btf->base_btf, start_id,
						 type_name, kind);
		if (err == -ENOENT)
			start_id = btf->start_id;
	}

	if (err == -ENOENT) {
		if (btf_check_sorted((struct btf *)btf)) {
			/* binary search */
			bool skip_first;
			int ret;

			/* return the leftmost with matching names */
			ret = btf_find_type_by_name_bsearch(btf,
							    type_name, start_id);
			if (ret < 0)
				goto out;
			/* skip kind checking */
			if (kind == -1)
				return ret;
			total = btf__type_cnt(btf);
			skip_first = true;
			do {
				t = btf_type_by_id(btf, ret);
				if (!skip_first) {
					/* stop once names no longer match */
					if (!t->name_off)
						break;
					tname = btf__str_by_offset(btf, t->name_off);
					if (!tname || strcmp(tname, type_name))
						break;
				}
				skip_first = false;
				if (btf_kind(t) == kind)
					return ret;
			} while (++ret < total);
		} else {
			/* linear search */
			...
		}
	}

out:
	return err;
}
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 13:48 ` Donglin Peng
@ 2025-11-05 16:52 ` Eduard Zingerman
2025-11-06 6:10 ` Donglin Peng
2025-11-05 18:11 ` Andrii Nakryiko
1 sibling, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 16:52 UTC (permalink / raw)
To: Donglin Peng, Andrii Nakryiko
Cc: ast, linux-kernel, bpf, Alan Maguire, Song Liu, pengdonglin
On Wed, 2025-11-05 at 21:48 +0800, Donglin Peng wrote:
> On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > >
> > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > >
> > > > [...]
> > > >
> > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > return type_id;
> > > > > > }
> > > > > >
> > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > +/*
> > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > + * and returns the leftmost matching index.
> > > > > > + */
> > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > + __s32 *left, __s32 *right)
> > > > >
> > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > searches where one would do just fine.
> > > >
> > > > I think the idea is that there would be fewer strcmp's if there is a
> > > > long sequence of items with identical names.
> > >
> > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > > with high likelihood will take more comparisons.
> >
> > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > 51737 '(anon)'
> > 277 'bpf_kfunc'
> > 4 'long
> > 3 'perf_aux_event'
> > 3 'workspace'
> > 2 'ata_acpi_gtm'
> > 2 'avc_cache_stats'
> > 2 'bh_accounting'
> > 2 'bp_cpuinfo'
> > 2 'bpf_fastcall'
> >
> > 'bpf_kfunc' is probably for decl_tags.
> > So I agree with you regarding the second binary search, it is not
> > necessary. But skipping all anonymous types (and thus having to
> > maintain nr_sorted_types) might be useful; otherwise, on each search
> > two iterations would be wasted to skip those.
>
> Thank you. After removing the redundant iterations, performance
> improved significantly compared with the two-iteration version.
>
> Test Case: Locate all 58,719 named types in vmlinux BTF
> Methodology:
> ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
>
> Two iterations:
> | Condition | Lookup Time | Improvement |
> |--------------------|-------------|-------------|
> | Unsorted (Linear) | 17,282 ms | Baseline |
> | Sorted (Binary) | 19 ms | 909x faster |
>
> One iteration:
> Results:
> | Condition | Lookup Time | Improvement |
> |--------------------|-------------|-------------|
> | Unsorted (Linear) | 17,619 ms | Baseline |
> | Sorted (Binary) | 10 ms | 1762x faster |
>
> Here is the code implementation with a single iteration approach.
Could you please also check if there is a difference between having
nr_sorted_types as is and having it equal to nr_types?
I want to understand whether this optimization is necessary.
[...]
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 13:48 ` Donglin Peng
2025-11-05 16:52 ` Eduard Zingerman
@ 2025-11-05 18:11 ` Andrii Nakryiko
2025-11-06 7:49 ` Donglin Peng
1 sibling, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 18:11 UTC (permalink / raw)
To: Donglin Peng
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 5:48 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > >
> > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > >
> > > > [...]
> > > >
> > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > return type_id;
> > > > > > }
> > > > > >
> > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > +/*
> > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > + * and returns the leftmost matching index.
> > > > > > + */
> > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > + __s32 *left, __s32 *right)
> > > > >
> > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > searches where one would do just fine.
> > > >
> > > > I think the idea is that there would be fewer strcmp's if there is a
> > > > long sequence of items with identical names.
> > >
> > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > > with high likelihood will take more comparisons.
> >
> > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > 51737 '(anon)'
> > 277 'bpf_kfunc'
> > 4 'long
> > 3 'perf_aux_event'
> > 3 'workspace'
> > 2 'ata_acpi_gtm'
> > 2 'avc_cache_stats'
> > 2 'bh_accounting'
> > 2 'bp_cpuinfo'
> > 2 'bpf_fastcall'
> >
> > 'bpf_kfunc' is probably for decl_tags.
> > So I agree with you regarding the second binary search, it is not
> > necessary. But skipping all anonymous types (and thus having to
> > maintain nr_sorted_types) might be useful; otherwise, on each search
> > two iterations would be wasted to skip those.
fair enough, eliminating a big chunk of anonymous types is useful, let's do this
>
> Thank you. After removing the redundant iterations, performance
> improved significantly compared with the two-iteration version.
>
> Test Case: Locate all 58,719 named types in vmlinux BTF
> Methodology:
> ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
>
> Two iterations:
> | Condition | Lookup Time | Improvement |
> |--------------------|-------------|-------------|
> | Unsorted (Linear) | 17,282 ms | Baseline |
> | Sorted (Binary) | 19 ms | 909x faster |
>
> One iteration:
> Results:
> | Condition | Lookup Time | Improvement |
> |--------------------|-------------|-------------|
> | Unsorted (Linear) | 17,619 ms | Baseline |
> | Sorted (Binary) | 10 ms | 1762x faster |
>
> Here is the code implementation with a single iteration approach.
> I believe this scenario differs from find_linfo because we cannot
> determine in advance whether the specified type name will be found.
> Please correct me if I've misunderstood anything, and I welcome any
> guidance on this matter.
>
> static __s32 btf_find_type_by_name_bsearch(const struct btf *btf,
> 					   const char *name,
> 					   __s32 start_id)
> {
> 	const struct btf_type *t;
> 	const char *tname;
> 	__s32 l, r, m, lmost = -ENOENT;
> 	int ret;
>
> 	/* find the leftmost btf_type that matches */
> 	l = start_id;
> 	r = btf__type_cnt(btf) - 1;
> 	while (l <= r) {
> 		m = l + (r - l) / 2;
> 		t = btf_type_by_id(btf, m);
> 		if (!t->name_off) {
> 			ret = 1;
> 		} else {
> 			tname = btf__str_by_offset(btf, t->name_off);
> 			ret = !tname ? 1 : strcmp(tname, name);
> 		}
> 		if (ret < 0) {
> 			l = m + 1;
> 		} else {
> 			if (ret == 0)
> 				lmost = m;
> 			r = m - 1;
> 		}
> 	}
>
> 	return lmost;
> }
There are different ways to implement this. At the highest level, the
implementation below just searches for the leftmost element whose name
is >= the one we are searching for. One complication is that such an
element might not even exist. We can solve that by checking ahead of
time whether the rightmost type satisfies the condition, or we could do
something similar to what I do in the loop below, where I allow l == r:
if that element has a name >= what we search for, we exit because we
found it. If not, l will become larger than r, we'll break out of the
loop, and we'll know that we couldn't find the element. I haven't
tested it, but please take a look, and if you decide to go with such an
approach, do test it for edge cases, of course.
/*
 * We are searching for the smallest r such that type #r's name is >= name.
 * It might not exist, in which case we'll have l == r + 1.
 */
l = start_id;
r = btf__type_cnt(btf) - 1;
while (l <= r) {
	m = l + (r - l) / 2;
	t = btf_type_by_id(btf, m);
	tname = btf__str_by_offset(btf, t->name_off);
	if (strcmp(tname, name) >= 0) {
		if (l == r)
			return r; /* found it! */
		r = m;
	} else {
		l = m + 1;
	}
}
/* here we know the given element doesn't exist, return index beyond end of types */
return btf__type_cnt(btf);
We could have checked instead whether strcmp(btf__str_by_offset(btf,
btf__type_by_id(btf, btf__type_cnt(btf) - 1)->name_off), name) < 0 and
exited early. That's just a bit more duplication of essentially what we
do inside the loop, so the if (l == r) seems fine to me, but I'm not
married to it.
>
> static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
> 					const char *type_name, __u32 kind)
> {
> 	const struct btf_type *t;
> 	const char *tname;
> 	int err = -ENOENT;
> 	__u32 total;
>
> 	if (!btf)
> 		goto out;
>
> 	if (start_id < btf->start_id) {
> 		err = btf_find_type_by_name_kind(btf->base_btf, start_id,
> 						 type_name, kind);
> 		if (err == -ENOENT)
> 			start_id = btf->start_id;
> 	}
>
> 	if (err == -ENOENT) {
> 		if (btf_check_sorted((struct btf *)btf)) {
> 			/* binary search */
> 			bool skip_first;
> 			int ret;
>
> 			/* return the leftmost with matching names */
> 			ret = btf_find_type_by_name_bsearch(btf,
> 							    type_name, start_id);
> 			if (ret < 0)
> 				goto out;
> 			/* skip kind checking */
> 			if (kind == -1)
> 				return ret;
> 			total = btf__type_cnt(btf);
> 			skip_first = true;
> 			do {
> 				t = btf_type_by_id(btf, ret);
> 				if (!skip_first) {
> 					/* stop once names no longer match */
> 					if (!t->name_off)
> 						break;
> 					tname = btf__str_by_offset(btf, t->name_off);
> 					if (!tname || strcmp(tname, type_name))
> 						break;
> 				}
> 				skip_first = false;
> 				if (btf_kind(t) == kind)
> 					return ret;
> 			} while (++ret < total);
> 		} else {
> 			/* linear search */
> 			...
> 		}
> 	}
>
> out:
> 	return err;
> }
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-05 1:23 ` Eduard Zingerman
@ 2025-11-05 18:20 ` Andrii Nakryiko
2025-11-05 19:41 ` Eduard Zingerman
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 18:20 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 5:23 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 16:57 -0800, Andrii Nakryiko wrote:
> > On Tue, Nov 4, 2025 at 4:36 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > >
> > > [...]
> > >
> > > > > @@ -3400,6 +3400,37 @@ int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
> > > > > return 0;
> > > > > }
> > > > >
> > > > > +static int btf_remap_types(struct btf *btf, struct btf_ext *btf_ext,
> > > > > + btf_remap_type_fn visit, void *ctx)
> > > >
> > > > tbh, my goal is to reduce the amount of callback usage within libbpf,
> > > > not add more of it...
> > > >
> > > > I don't like this refactoring. We should convert
> > > > btf_ext_visit_type_ids() into iterators, have btf_field_iter_init +
> > > > btf_field_iter_next usable in for_each() form, and not try to reuse 5
> > > > lines of code. See my comments in the next patch.
> > >
> > > Remapping types is a concept.
> > > I hate duplicating code for concepts.
> > > Similarly, having patch #3 == patch #5 and patch #4 == patch #6 is
> > > plain ugly. Just waiting for a bug because we changed the one but
> > > forgot to change another in a year or two.
> >
> > Tricky and evolving part (iterating all type ID fields) is abstracted
> > behind the iterator (and we should do the same for btf_ext). Iterating
> > types is not something tricky or requiring constant maintenance.
> >
> > Same for binary search, I don't see why we'd need to adjust it. So no,
> > I don't want to share code between kernel and libbpf just to reuse
> > binary search implementation, sorry.
>
> <rant>
>
> Sure binary search is trivial, but did you count how many times you
> asked people to re-implement binary search as in [1]?
Exact match binary search can be called trivial, yes. Lower/upper
bound binary search looks deceivingly simple, but it requires
attention to every single line of code. But the end result is simple
and straightforward, yes.
I'm not sure what point you are trying to make, though. Yes, I've
asked people many times to implement upper/lower bound binary search
similarly to the one in find_linfo(), because usually people have
various unnecessary checks, keeping track not just of bounds, but also
remembering some element that we know satisfied the condition at some
point before, etc. It's not elegant, harder to reason about, and can
be done more succinctly.
You don't like that I ask people to improve implementation? You don't
like the implementation itself? Or are you suggesting that we should
add a "generic" C implementation of lower_bound/upper_bound and use
callbacks for comparison logic? What are you ranting about, exactly?
As I said, once binary search (of whatever kind, bounds or exact) is
written for something like this, it doesn't have to ever be modified.
I don't see this as a maintainability hurdle at all. But sharing code
between libbpf and kernel is something to be avoided. Look at #ifdef
__KERNEL__ sections of relo_core.c as one reason why.
>
> [1] https://elixir.bootlin.com/linux/v6.18-rc4/source/kernel/bpf/verifier.c#L2952
>
> </rant>
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 1:20 ` Eduard Zingerman
2025-11-05 13:19 ` Donglin Peng
@ 2025-11-05 18:23 ` Andrii Nakryiko
2025-11-05 19:23 ` Eduard Zingerman
2025-11-07 2:36 ` Donglin Peng
1 sibling, 2 replies; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 18:23 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Tue, Nov 4, 2025 at 5:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> > On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > >
> > > [...]
> > >
> > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > > +{
> > > > > + struct btf_permute *p = ctx;
> > > > > + __u32 new_type_id = *type_id;
> > > > > +
> > > > > + /* skip references that point into the base BTF */
> > > > > + if (new_type_id < p->btf->start_id)
> > > > > + return 0;
> > > > > +
> > > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > > >
> > > > I'm actually confused, I thought p->ids would be the mapping from
> > > > original type ID (minus start_id, of course) to a new desired ID, but
> > > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > > types identified by their original ID. I find it quite confusing. I
> > > > think about permutation as a mapping from original type ID to a new
> > > > type ID, am I confused?
> > >
> > > Yes, it is a desired sequence, not mapping.
> > > > I guess it's a bit simpler to use for the sorting use-case, as you can just
> > > swap ids while sorting.
> >
> > The question is really what makes most sense as an interface. Because
> > for sorting cases it's just the matter of a two-line for() loop to
> > create ID mapping once types are sorted.
> >
> > I have slight preference for id_map approach because it is easy to
> > extend to the case of selectively dropping some types. We can just
> > define that such IDs should be mapped to zero. This will work as a
> > natural extension. With the desired end sequence of IDs, it's less
> > natural and will require more work to determine which IDs are missing
> > from the sequence.
> >
> > So unless there is some really good and strong reason, shall we go
> > with the ID mapping approach?
>
> If the interface is extended with types_cnt, as you suggest, deleting
> types is trivial with the sequence interface as well. At least the way it
> is implemented by this patch, you just copy elements from 'ids' one by
> one.
But it is way less explicit and obvious way to delete element. With ID
map it is obvious, that type will be mapped to zero. With list of IDs,
you effectively search for elements that are missing, which IMO is way
less optimal an interface.
So I still favor the ID map approach.
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 12:52 ` Donglin Peng
@ 2025-11-05 18:29 ` Andrii Nakryiko
2025-11-06 7:31 ` Donglin Peng
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 18:29 UTC (permalink / raw)
To: Donglin Peng
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 4:53 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> On Wed, Nov 5, 2025 at 8:11 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > >
> > > From: pengdonglin <pengdonglin@xiaomi.com>
> > >
> > > Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> > > This function reorganizes BTF type order according to a provided array of
> > > type IDs, updating all type references to maintain consistency.
> > >
> > > The permutation process involves:
> > > 1. Shuffling types into new order based on the provided ID mapping
> > > 2. Remapping all type ID references to point to new locations
> > > 3. Handling BTF extension data if provided via options
> > >
> > > This is particularly useful for optimizing type locality after BTF
> > > deduplication or for meeting specific layout requirements in specialized
> > > use cases.
> > >
> > > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > Cc: Alan Maguire <alan.maguire@oracle.com>
> > > Cc: Song Liu <song@kernel.org>
> > > Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> > > Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> > > ---
> > > tools/lib/bpf/btf.c | 161 +++++++++++++++++++++++++++++++++++++++
> > > tools/lib/bpf/btf.h | 34 +++++++++
> > > tools/lib/bpf/libbpf.map | 1 +
> > > 3 files changed, 196 insertions(+)
> > >
> > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > index 5e1c09b5dce8..3bc03f7fe31f 100644
> > > --- a/tools/lib/bpf/btf.c
> > > +++ b/tools/lib/bpf/btf.c
> > > @@ -5830,3 +5830,164 @@ int btf__relocate(struct btf *btf, const struct btf *base_btf)
> > > btf->owns_base = false;
> > > return libbpf_err(err);
> > > }
> > > +
> > > +struct btf_permute {
> > > + /* .BTF section to be permuted in-place */
> > > + struct btf *btf;
> > > + struct btf_ext *btf_ext;
> > > + /* Array of type IDs used for permutation. The array length must equal
> >
> > /*
> > * Use this comment style
> > */
>
> Thanks.
>
> >
> > > + * the number of types in the BTF being permuted, excluding the special
> > > + * void type at ID 0. For split BTF, the length corresponds to the
> > > + * number of types added on top of the base BTF.
> >
> > many words, but what exactly ids[i] means is still not clear, actually...
>
> Thanks. I'll clarify the description. Is the following parameter
> explanation acceptable?
>
> @param ids Array containing original type IDs (excluding VOID type ID
> 0) in user-defined order.
> The array size must match btf->nr_types, which
Users don't have access to btf->nr_types, so referring to it in API
description seems wrong.
But also, this all will change if we allow removing types, because
then the array size might be smaller. But is it intentionally smaller,
or did the user make a mistake? Let's go with the ID map approach, please.
> also excludes VOID type ID 0.
>
>
> >
> > > + */
> > > + __u32 *ids;
> > > + /* Array of type IDs used to map from original type ID to a new permuted
> > > + * type ID, its length equals to the above ids */
> >
> > wrong comment style
>
> Thanks, I will fix it in the next version.
>
> >
> > > + __u32 *map;
> >
> > "map" is a bit generic. What if we use s/ids/id_map/ and
> > s/map/id_map_rev/ (for reverse)? I'd use "id_map" naming in the public
> > API to make it clear that it's a mapping of IDs, not just some array
> > of IDs.
>
> Thank you for the suggestion. While I agree that renaming 'map' to 'id_map'
> makes sense for clarity, 'ids' seems correct as it denotes a collection of
> IDs, not a mapping structure.
>
> >
> > > +};
> > > +
> > > +static int btf_permute_shuffle_types(struct btf_permute *p);
> > > +static int btf_permute_remap_types(struct btf_permute *p);
> > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx);
> > > +
> > > +int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts)
> >
> > Let's require user to pass id_map_cnt in addition to id_map itself.
> > It's easy to get this wrong (especially with that special VOID 0 type
> > that has to be excluded, which I can't even make up my mind if that's
> > a good idea or not), so having user explicitly say what they think is
> > necessary for permutation is good.
>
> Thank you for your suggestion. However, I am concerned that introducing
> an additional `id_map_cnt` parameter could increase complexity. Specifically,
> if `id_map_cnt` is less than `btf->nr_types`, we might need to consider whether
> to resize the BTF. This could lead to missing types, potential ID remapping
> failures, or even require BTF re-deduplication if certain name strings are no
> longer referenced by any types.
>
No, if the user provided a wrong id_map_cnt, it's an error and we
return -EINVAL. No resizing.
> >
> > > +{
> > > + struct btf_permute p;
> > > + int i, err = 0;
> > > + __u32 *map = NULL;
> > > +
> > > + if (!OPTS_VALID(opts, btf_permute_opts) || !ids)
> >
[...]
> > > + goto done;
> > > + }
> > > +
> > > +done:
> > > + free(map);
> > > + return libbpf_err(err);
> > > +}
> > > +
> > > +/* Shuffle BTF types.
> > > + *
> > > + * Rearranges types according to the permutation map in p->ids. The p->map
> > > + * array stores the mapping from original type IDs to new shuffled IDs,
> > > + * which is used in the next phase to update type references.
> > > + *
> > > + * Validates that all IDs in the permutation array are valid and unique.
> > > + */
> > > +static int btf_permute_shuffle_types(struct btf_permute *p)
> > > +{
> > > + struct btf *btf = p->btf;
> > > + const struct btf_type *t;
> > > + __u32 *new_offs = NULL, *map;
> > > + void *nt, *new_types = NULL;
> > > + int i, id, len, err;
> > > +
> > > + new_offs = calloc(btf->nr_types, sizeof(*new_offs));
> >
> > we don't really need to allocate memory and maintain this, we can just
> > shift types around and then do what btf_parse_type_sec() does -- just
> > go over types one by one and calculate offsets, and update them
> > in-place inside btf->type_offs
>
> Thank you for the suggestion. However, this approach is not viable because
> the `btf__type_by_id()` function relies critically on the integrity of the
> `btf->type_offs` data structure. Attempting to modify `type_offs` through
> in-place operations could corrupt memory and lead to segmentation faults
> due to invalid pointer dereferencing.
Huh? By the time this API returns, we'll fix up type_offs, users will
never notice. And to recalculate new type_offs we don't need
type_offs. One of us is missing something important, what is it?
[...]
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 13:19 ` Donglin Peng
@ 2025-11-05 18:32 ` Andrii Nakryiko
0 siblings, 0 replies; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 18:32 UTC (permalink / raw)
To: Donglin Peng
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 5:19 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> On Wed, Nov 5, 2025 at 9:20 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> > > On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > >
> > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > >
> > > > [...]
> > > >
> > > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > > > +{
> > > > > > + struct btf_permute *p = ctx;
> > > > > > + __u32 new_type_id = *type_id;
> > > > > > +
> > > > > > + /* skip references that point into the base BTF */
> > > > > > + if (new_type_id < p->btf->start_id)
> > > > > > + return 0;
> > > > > > +
> > > > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > > > >
> > > > > I'm actually confused, I thought p->ids would be the mapping from
> > > > > original type ID (minus start_id, of course) to a new desired ID, but
> > > > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > > > types identified by their original ID. I find it quite confusing. I
> > > > > think about permutation as a mapping from original type ID to a new
> > > > > type ID, am I confused?
> > > >
> > > > Yes, it is a desired sequence, not mapping.
> > > > I guess it's a bit simpler to use for the sorting use-case, as you can just
> > > > swap ids while sorting.
> > >
> > > The question is really what makes most sense as an interface. Because
> > > for sorting cases it's just the matter of a two-line for() loop to
> > > create ID mapping once types are sorted.
> > >
> > > I have slight preference for id_map approach because it is easy to
> > > extend to the case of selectively dropping some types. We can just
> > > define that such IDs should be mapped to zero. This will work as a
> > > natural extension. With the desired end sequence of IDs, it's less
> > > natural and will require more work to determine which IDs are missing
> > > from the sequence.
> > >
> > > So unless there is some really good and strong reason, shall we go
> > > with the ID mapping approach?
> >
> > If the interface is extended with types_cnt, as you suggest, deleting
> > types is trivial with the sequence interface as well. At least the way it
> > is implemented by this patch, you just copy elements from 'ids' one by
> > one.
>
> Thank you. I also favor the sequence interface approach.
> if I understand correctly, using the ID mapping method would require
> creating an additional ID array to cache the ordering for each type,
> which appears more complex. Furthermore, generating an ID map might
> not be straightforward for end users in the sorting scenario, IMO.
Additional array on user side or inside libbpf's implementation? But
even if on the user side, a few temporary extra kilobytes to sort BTF
doesn't seem like a big limitation (definitely not for pahole, for
example).
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 18:23 ` Andrii Nakryiko
@ 2025-11-05 19:23 ` Eduard Zingerman
2025-11-06 17:21 ` Andrii Nakryiko
2025-11-07 2:36 ` Donglin Peng
1 sibling, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 19:23 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Wed, 2025-11-05 at 10:23 -0800, Andrii Nakryiko wrote:
> On Tue, Nov 4, 2025 at 5:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> > > On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > >
> > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > >
> > > > [...]
> > > >
> > > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > > > +{
> > > > > > + struct btf_permute *p = ctx;
> > > > > > + __u32 new_type_id = *type_id;
> > > > > > +
> > > > > > + /* skip references that point into the base BTF */
> > > > > > + if (new_type_id < p->btf->start_id)
> > > > > > + return 0;
> > > > > > +
> > > > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > > > >
> > > > > I'm actually confused, I thought p->ids would be the mapping from
> > > > > original type ID (minus start_id, of course) to a new desired ID, but
> > > > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > > > types identified by their original ID. I find it quite confusing. I
> > > > > think about permutation as a mapping from original type ID to a new
> > > > > type ID, am I confused?
> > > >
> > > > Yes, it is a desired sequence, not mapping.
> > > > > I guess it's a bit simpler to use for the sorting use-case, as you can just
> > > > swap ids while sorting.
> > >
> > > The question is really what makes most sense as an interface. Because
> > > for sorting cases it's just the matter of a two-line for() loop to
> > > create ID mapping once types are sorted.
> > >
> > > I have slight preference for id_map approach because it is easy to
> > > extend to the case of selectively dropping some types. We can just
> > > define that such IDs should be mapped to zero. This will work as a
> > > natural extension. With the desired end sequence of IDs, it's less
> > > natural and will require more work to determine which IDs are missing
> > > from the sequence.
> > >
> > > So unless there is some really good and strong reason, shall we go
> > > with the ID mapping approach?
> >
> > If the interface is extended with types_cnt, as you suggest, deleting
> > types is trivial with sequence interface as well. At-least the way it
> > is implemented by this patch, you just copy elements from 'ids' one by
> > one.
>
> But it is way less explicit and obvious way to delete element. With ID
> map it is obvious, that type will be mapped to zero. With list of IDs,
> you effectively search for elements that are missing, which IMO is way
> less optimal an interface.
>
> So I still favor the ID map approach.
You don't need to search for deleted elements with the current
implementation (assuming the ids_cnt parameter is added).
Suppose there are 4 types + void in BTF and the 'ids' sequence looks
as follows: {1, 3, 4}. The current implementation will:
- iterate over 'ids':
- copy 1 to new_types, remember to remap 1 to 1
- copy 3 to new_types, remember to remap 3 to 2
- copy 4 to new_types, remember to remap 4 to 3
- do the remapping.
Consider the sorting use-case:
- If 'ids' is the desired final order of types, libbpf needs to
allocate the mapping from old id to new id, as described above.
- If 'ids' is a map from old id to new id:
- libbpf will have to allocate a temporary array to hold the desired
id sequence, to know in which order to copy the types;
- user will have to allocate the array for mapping.
So, for id map approach it is one more allocation for no benefit.
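The walk-through above can be condensed into a small standalone sketch (hypothetical names, not the actual libbpf implementation): 'ids' lists the surviving types in their desired new order, and the remap table falls out of the same copy loop, with deleted types left mapped to 0 (VOID) without any extra search.

```c
#include <string.h>

/*
 * Hypothetical sketch of the sequence-based interface discussed above.
 * 'ids' lists the surviving old type IDs in their desired new order;
 * types absent from 'ids' are implicitly deleted and their remap entry
 * stays 0 (the VOID type). Returns 0 on success, -1 if an ID is out of
 * range or duplicated.
 */
static int build_remap(const unsigned *ids, unsigned ids_cnt,
		       unsigned nr_types, unsigned *remap /* [nr_types + 1] */)
{
	unsigned i;

	memset(remap, 0, (nr_types + 1) * sizeof(*remap));
	for (i = 0; i < ids_cnt; i++) {
		unsigned old_id = ids[i];

		if (old_id < 1 || old_id > nr_types || remap[old_id])
			return -1;
		/* old type 'old_id' is copied to slot i, i.e. new ID i + 1 */
		remap[old_id] = i + 1;
	}
	return 0;
}
```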
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-05 18:20 ` Andrii Nakryiko
@ 2025-11-05 19:41 ` Eduard Zingerman
2025-11-06 17:09 ` Andrii Nakryiko
0 siblings, 1 reply; 53+ messages in thread
From: Eduard Zingerman @ 2025-11-05 19:41 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Wed, 2025-11-05 at 10:20 -0800, Andrii Nakryiko wrote:
[...]
> You don't like that I ask people to improve implementation?
Not at all.
> You don't like the implementation itself? Or are you suggesting that
> we should add a "generic" C implementation of
> lower_bound/upper_bound and use callbacks for comparison logic? What
> are you ranting about, exactly?
Actually, having it as a static inline function in a header would be
nice. I just tried that, and gcc is perfectly capable of inlining the
comparison function in -O2 mode.
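A minimal sketch of what such a header-level primitive could look like (hypothetical, not an existing libbpf or kernel helper): a generic lower_bound taking a comparator callback, which gcc at -O2 can inline when the comparator is visible in the same translation unit.

```c
#include <stddef.h>

/*
 * Sketch of a generic lower_bound as a static inline header helper.
 * Returns the index of the first element for which cmp(elem, key) >= 0,
 * or nmemb if no such element exists.
 */
static inline size_t lower_bound(const void *base, size_t nmemb, size_t size,
				 const void *key,
				 int (*cmp)(const void *elem, const void *key))
{
	size_t l = 0, r = nmemb;

	while (l < r) {
		size_t m = l + (r - l) / 2;
		const char *elem = (const char *)base + m * size;

		if (cmp(elem, key) < 0)
			l = m + 1;
		else
			r = m;
	}
	return l;
}

/* example comparator; a BTF variant would strcmp type names instead */
static int cmp_int(const void *elem, const void *key)
{
	int a = *(const int *)elem, b = *(const int *)key;

	return a < b ? -1 : a > b;
}
```

With both copies reduced to a comparator plus a call to this helper, a future change such as adding 'kind' as a secondary key would only touch the comparators.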
I'm ranting about patch #5 being 101 insertions(+), 10 deletions(-)
and patch #4 being 119 insertions(+), 23 deletions(-),
while doing exactly the same thing.
And yes, this copy of binary search routine probably won't ever
change. But changes to the comparator logic are pretty much possible,
if we decide to include 'kind' as a secondary key one day.
And that change will have to happen twice.
> As I said, once binary search (of whatever kind, bounds or exact) is
> written for something like this, it doesn't have to ever be modified.
> I don't see this as a maintainability hurdle at all. But sharing code
> between libbpf and kernel is something to be avoided. Look at #ifdef
> __KERNEL__ sections of relo_core.c as one reason why.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 16:52 ` Eduard Zingerman
@ 2025-11-06 6:10 ` Donglin Peng
0 siblings, 0 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-06 6:10 UTC (permalink / raw)
To: Eduard Zingerman, Andrii Nakryiko
Cc: ast, linux-kernel, bpf, Alan Maguire, Song Liu, pengdonglin
On Thu, Nov 6, 2025 at 12:52 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2025-11-05 at 21:48 +0800, Donglin Peng wrote:
> > On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > >
> > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > >
> > > > > [...]
> > > > >
> > > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > > return type_id;
> > > > > > > }
> > > > > > >
> > > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > > +/*
> > > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > > + * and returns the leftmost matching index.
> > > > > > > + */
> > > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > > + __s32 *left, __s32 *right)
> > > > > >
> > > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > > searches where one would do just fine.
> > > > >
> > > > > I think the idea is that there would be less strcmp's if there is a
> > > > > long sequence of items with identical names.
> > > >
> > > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > > > with high likelihood will take more comparisons.
> > >
> > > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > > 51737 '(anon)'
> > > 277 'bpf_kfunc'
> > > 4 'long
> > > 3 'perf_aux_event'
> > > 3 'workspace'
> > > 2 'ata_acpi_gtm'
> > > 2 'avc_cache_stats'
> > > 2 'bh_accounting'
> > > 2 'bp_cpuinfo'
> > > 2 'bpf_fastcall'
> > >
> > > 'bpf_kfunc' is probably for decl_tags.
> > > So I agree with you regarding the second binary search, it is not
> > > necessary. But skipping all anonymous types (and thus having to
> > > maintain nr_sorted_types) might be useful, on each search two
> > > iterations would be wasted to skip those.
> >
> > Thank you. After removing the redundant iterations, performance increased
> > significantly compared with two iterations.
> >
> > Test Case: Locate all 58,719 named types in vmlinux BTF
> > Methodology:
> > ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
> >
> > Two iterations:
> > > Condition | Lookup Time | Improvement |
> > > --------------------|-------------|-------------|
> > > Unsorted (Linear) | 17,282 ms | Baseline |
> > > Sorted (Binary) | 19 ms | 909x faster |
> >
> > One iteration:
> > Results:
> > > Condition | Lookup Time | Improvement |
> > > --------------------|-------------|-------------|
> > > Unsorted (Linear) | 17,619 ms | Baseline |
> > > Sorted (Binary) | 10 ms | 1762x faster |
> >
> > Here is the code implementation with a single iteration approach.
>
> Could you please also check if there is a difference between having
> nr_sorted_types as is and having it equal to nr_types?
> Want to understand if this optimization is necessary.
Yes, here is the result:
| Condition                        | Lookup Time | Improvement  |
|----------------------------------|-------------|--------------|
| Unsorted (Linear)                | 16666461 us | Baseline     |
| Sorted (Binary) nr_types         | 9957 us     | 1673x faster |
| Sorted (Binary) nr_sorted_types  | 9337 us     | 1785x faster |
Using nr_sorted_types provides an additional 6% performance improvement
over nr_types.
>
> [...]
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 18:29 ` Andrii Nakryiko
@ 2025-11-06 7:31 ` Donglin Peng
2025-11-06 17:12 ` Andrii Nakryiko
0 siblings, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-06 7:31 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Thu, Nov 6, 2025 at 2:29 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Nov 5, 2025 at 4:53 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> >
> > On Wed, Nov 5, 2025 at 8:11 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > >
> > > > From: pengdonglin <pengdonglin@xiaomi.com>
> > > >
> > > > Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> > > > This function reorganizes BTF type order according to a provided array of
> > > > type IDs, updating all type references to maintain consistency.
> > > >
> > > > The permutation process involves:
> > > > 1. Shuffling types into new order based on the provided ID mapping
> > > > 2. Remapping all type ID references to point to new locations
> > > > 3. Handling BTF extension data if provided via options
> > > >
> > > > This is particularly useful for optimizing type locality after BTF
> > > > deduplication or for meeting specific layout requirements in specialized
> > > > use cases.
> > > >
> > > > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > > Cc: Alan Maguire <alan.maguire@oracle.com>
> > > > Cc: Song Liu <song@kernel.org>
> > > > Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> > > > Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> > > > ---
> > > > tools/lib/bpf/btf.c | 161 +++++++++++++++++++++++++++++++++++++++
> > > > tools/lib/bpf/btf.h | 34 +++++++++
> > > > tools/lib/bpf/libbpf.map | 1 +
> > > > 3 files changed, 196 insertions(+)
> > > >
> > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > index 5e1c09b5dce8..3bc03f7fe31f 100644
> > > > --- a/tools/lib/bpf/btf.c
> > > > +++ b/tools/lib/bpf/btf.c
> > > > @@ -5830,3 +5830,164 @@ int btf__relocate(struct btf *btf, const struct btf *base_btf)
> > > > btf->owns_base = false;
> > > > return libbpf_err(err);
> > > > }
> > > > +
> > > > +struct btf_permute {
> > > > + /* .BTF section to be permuted in-place */
> > > > + struct btf *btf;
> > > > + struct btf_ext *btf_ext;
> > > > + /* Array of type IDs used for permutation. The array length must equal
> > >
> > > /*
> > > * Use this comment style
> > > */
> >
> > Thanks.
> >
> > >
> > > > + * the number of types in the BTF being permuted, excluding the special
> > > > + * void type at ID 0. For split BTF, the length corresponds to the
> > > > + * number of types added on top of the base BTF.
> > >
> > > many words, but what exactly ids[i] means is still not clear, actually...
> >
> > Thanks. I'll clarify the description. Is the following parameter
> > explanation acceptable?
> >
> > @param ids Array containing original type IDs (excluding VOID type ID
> > 0) in user-defined order.
> > The array size must match btf->nr_types, which
>
> Users don't have access to btf->nr_types, so referring to it in API
> description seems wrong.
>
> But also, this all will change if we allow removing types, because
> then array size might be smaller. But is it intentionally smaller or
> user made a mistake? Let's go with the ID map approach, please.
Thanks. I can implement both approaches, and then we can assess their
pros and cons.
>
> > also excludes VOID type ID 0.
> >
> >
> > >
> > > > + */
> > > > + __u32 *ids;
> > > > + /* Array of type IDs used to map from original type ID to a new permuted
> > > > + * type ID, its length equals to the above ids */
> > >
> > > wrong comment style
> >
> > Thanks, I will fix it in the next version.
> >
> > >
> > > > + __u32 *map;
> > >
> > > "map" is a bit generic. What if we use s/ids/id_map/ and
> > > s/map/id_map_rev/ (for reverse)? I'd use "id_map" naming in the public
> > > API to make it clear that it's a mapping of IDs, not just some array
> > > of IDs.
> >
> > Thank you for the suggestion. While I agree that renaming 'map' to 'id_map'
> > makes sense for clarity, but 'ids' seems correct as it denotes a collection of
> > IDs, not a mapping structure.
> >
> > >
> > > > +};
> > > > +
> > > > +static int btf_permute_shuffle_types(struct btf_permute *p);
> > > > +static int btf_permute_remap_types(struct btf_permute *p);
> > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx);
> > > > +
> > > > +int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts)
> > >
> > > Let's require user to pass id_map_cnt in addition to id_map itself.
> > > It's easy to get this wrong (especially with that special VOID 0 type
> > > that has to be excluded, which I can't even make up my mind if that's
> > > a good idea or not), so having user explicitly say what they think is
> > > necessary for permutation is good.
> >
> > Thank you for your suggestion. However, I am concerned that introducing
> > an additional `id_map_cnt` parameter could increase complexity. Specifically,
> > if `id_map_cnt` is less than `btf->nr_types`, we might need to consider whether
> > to resize the BTF. This could lead to missing types, potential ID remapping
> > failures, or even require BTF re-deduplication if certain name strings are no
> > longer referenced by any types.
> >
>
> No, if the user provided a wrong id_map_cnt, it's an error and we
> return -EINVAL. No resizing.
>
> > >
> > > > +{
> > > > + struct btf_permute p;
> > > > + int i, err = 0;
> > > > + __u32 *map = NULL;
> > > > +
> > > > + if (!OPTS_VALID(opts, btf_permute_opts) || !ids)
> > >
>
> [...]
>
> > > > + goto done;
> > > > + }
> > > > +
> > > > +done:
> > > > + free(map);
> > > > + return libbpf_err(err);
> > > > +}
> > > > +
> > > > +/* Shuffle BTF types.
> > > > + *
> > > > + * Rearranges types according to the permutation map in p->ids. The p->map
> > > > + * array stores the mapping from original type IDs to new shuffled IDs,
> > > > + * which is used in the next phase to update type references.
> > > > + *
> > > > + * Validates that all IDs in the permutation array are valid and unique.
> > > > + */
> > > > +static int btf_permute_shuffle_types(struct btf_permute *p)
> > > > +{
> > > > + struct btf *btf = p->btf;
> > > > + const struct btf_type *t;
> > > > + __u32 *new_offs = NULL, *map;
> > > > + void *nt, *new_types = NULL;
> > > > + int i, id, len, err;
> > > > +
> > > > + new_offs = calloc(btf->nr_types, sizeof(*new_offs));
> > >
> > > we don't really need to allocate memory and maintain this, we can just
> > > shift types around and then do what btf_parse_type_sec() does -- just
> > > go over types one by one and calculate offsets, and update them
> > > in-place inside btf->type_offs
> >
> > Thank you for the suggestion. However, this approach is not viable because
> > the `btf__type_by_id()` function relies critically on the integrity of the
> > `btf->type_offs` data structure. Attempting to modify `type_offs` through
> > in-place operations could corrupt memory and lead to segmentation faults
> > due to invalid pointer dereferencing.
>
> Huh? By the time this API returns, we'll fix up type_offs, users will
> never notice. And to recalculate new type_offs we don't need
> type_offs. One of us is missing something important, what is it?
Thanks, however the bad news is that btf__type_by_id() is indeed called
within the API:
static int btf_permute_shuffle_types(struct btf_permute *p)
{
struct btf *btf = p->btf;
const struct btf_type *t;
__u32 *new_offs = NULL, *ids_map;
void *nt, *new_types = NULL;
int i, id, len, err;
new_offs = calloc(btf->nr_types, sizeof(*new_offs));
new_types = calloc(btf->hdr->type_len, 1);
......
nt = new_types;
for (i = 0; i < btf->nr_types; i++) {
id = p->ids[i];
......
/* must be a valid type ID */
t = btf__type_by_id(btf, id); <<<<<<<<<<<<<
......
len = btf_type_size(t);
memcpy(nt, t, len);
new_offs[i] = nt - new_types;
*ids_map = btf->start_id + i;
nt += len;
}
......
}
>
> [...]
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-05 18:11 ` Andrii Nakryiko
@ 2025-11-06 7:49 ` Donglin Peng
2025-11-06 17:31 ` Andrii Nakryiko
0 siblings, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-06 7:49 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Thu, Nov 6, 2025 at 2:11 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Nov 5, 2025 at 5:48 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> >
> > On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > >
> > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > >
> > > > > [...]
> > > > >
> > > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > > return type_id;
> > > > > > > }
> > > > > > >
> > > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > > +/*
> > > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > > + * and returns the leftmost matching index.
> > > > > > > + */
> > > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > > + __s32 *left, __s32 *right)
> > > > > >
> > > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > > searches where one would do just fine.
> > > > >
> > > > > I think the idea is that there would be less strcmp's if there is a
> > > > > long sequence of items with identical names.
> > > >
> > > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > > > with high likelihood will take more comparisons.
> > >
> > > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > > 51737 '(anon)'
> > > 277 'bpf_kfunc'
> > > 4 'long
> > > 3 'perf_aux_event'
> > > 3 'workspace'
> > > 2 'ata_acpi_gtm'
> > > 2 'avc_cache_stats'
> > > 2 'bh_accounting'
> > > 2 'bp_cpuinfo'
> > > 2 'bpf_fastcall'
> > >
> > > 'bpf_kfunc' is probably for decl_tags.
> > > So I agree with you regarding the second binary search, it is not
> > > necessary. But skipping all anonymous types (and thus having to
> > > maintain nr_sorted_types) might be useful, on each search two
> > > iterations would be wasted to skip those.
>
> fair enough, eliminating a big chunk of anonymous types is useful, let's do this
>
> >
> > Thank you. After removing the redundant iterations, performance increased
> > significantly compared with two iterations.
> >
> > Test Case: Locate all 58,719 named types in vmlinux BTF
> > Methodology:
> > ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
> >
> > Two iterations:
> > | Condition | Lookup Time | Improvement |
> > |--------------------|-------------|-------------|
> > | Unsorted (Linear) | 17,282 ms | Baseline |
> > | Sorted (Binary) | 19 ms | 909x faster |
> >
> > One iteration:
> > Results:
> > | Condition | Lookup Time | Improvement |
> > |--------------------|-------------|-------------|
> > | Unsorted (Linear) | 17,619 ms | Baseline |
> > | Sorted (Binary) | 10 ms | 1762x faster |
> >
> > Here is the code implementation with a single iteration approach.
> > I believe this scenario differs from find_linfo because we cannot
> > determine in advance whether the specified type name will be found.
> > Please correct me if I've misunderstood anything, and I welcome any
> > guidance on this matter.
> >
> > static __s32 btf_find_type_by_name_bsearch(const struct btf *btf,
> > const char *name,
> > __s32 start_id)
> > {
> > const struct btf_type *t;
> > const char *tname;
> > __s32 l, r, m, lmost = -ENOENT;
> > int ret;
> >
> > /* found the leftmost btf_type that matches */
> > l = start_id;
> > r = btf__type_cnt(btf) - 1;
> > while (l <= r) {
> > m = l + (r - l) / 2;
> > t = btf_type_by_id(btf, m);
> > if (!t->name_off) {
> > ret = 1;
> > } else {
> > tname = btf__str_by_offset(btf, t->name_off);
> > ret = !tname ? 1 : strcmp(tname, name);
> > }
> > if (ret < 0) {
> > l = m + 1;
> > } else {
> > if (ret == 0)
> > lmost = m;
> > r = m - 1;
> > }
> > }
> >
> > return lmost;
> > }
>
> There are different ways to implement this. At the highest level,
> implementation below just searches for leftmost element that has name
> >= the one we are searching for. One complication is that such element
> might not event exists. We can solve that checking ahead of time
> whether the rightmost type satisfied the condition, or we could do
> something similar to what I do in the loop below, where I allow l == r
> and then if that element has name >= to what we search, we exit
> because we found it. And if not, l will become larger than r, we'll
> break out of the loop and we'll know that we couldn't find the
> element. I haven't tested it, but please take a look and if you decide
> to go with such approach, do test it for edge cases, of course.
>
> /*
> * We are searching for the smallest r such that type #r's name is >= name.
> * It might not exist, in which case we'll have l == r + 1.
> */
> l = start_id;
> r = btf__type_cnt(btf) - 1;
> while (l < r) {
> m = l + (r - l) / 2;
> t = btf_type_by_id(btf, m);
> tname = btf__str_by_offset(btf, t->name_off);
>
> if (strcmp(tname, name) >= 0) {
> if (l == r)
> return r; /* found it! */
It seems that this if condition will never hold, because a while(l < r) loop
is used. Moreover, even if the condition were to hold, it wouldn't guarantee
a successful search.
> r = m;
> } else {
> l = m + 1;
> }
> }
> /* here we know given element doesn't exist, return index beyond end of types */
> return btf__type_cnt(btf);
I think that returning -ENOENT seems more reasonable.
>
>
> We could have checked instead whether strcmp(btf__str_by_offset(btf,
> btf__type_by_id(btf, btf__type_cnt() - 1)->name_off), name) < 0 and
> exit early. That's just a bit more code duplication of essentially
> what we do inside the loop, so that if (l == r) seems fine to me, but
> I'm not married to this.
Sorry, I believe that even if
strcmp(btf__str_by_offset(btf, btf__type_by_id(btf, btf__type_cnt() - 1)->name_off), name) >= 0,
it still doesn't seem to guarantee that the search will definitely succeed.
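For reference, the leftmost-match variant Donglin sketches above (which returns -ENOENT-style failure instead of an out-of-range index, and so sidesteps the l == r question) can be exercised standalone on a plain sorted string array, detached from the btf accessors:

```c
#include <string.h>

/*
 * Standalone sketch of the leftmost-match binary search discussed
 * above, operating on a sorted string array instead of BTF types.
 * Returns the smallest index whose entry equals 'name', or -1 if the
 * name is absent (standing in for -ENOENT).
 */
static int find_leftmost(const char * const *names, int cnt, const char *name)
{
	int l = 0, r = cnt - 1, lmost = -1;

	while (l <= r) {
		int m = l + (r - l) / 2;
		int ret = strcmp(names[m], name);

		if (ret < 0) {
			l = m + 1;
		} else {
			if (ret == 0)
				lmost = m; /* keep searching to the left */
			r = m - 1;
		}
	}
	return lmost;
}
```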
>
> >
> > static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
> > const char *type_name, __u32 kind)
> > {
> > const struct btf_type *t;
> > const char *tname;
> > int err = -ENOENT;
> > __u32 total;
> >
> > if (!btf)
> > goto out;
> >
> > if (start_id < btf->start_id) {
> > err = btf_find_type_by_name_kind(btf->base_btf, start_id,
> > type_name, kind);
> > if (err == -ENOENT)
> > start_id = btf->start_id;
> > }
> >
> > if (err == -ENOENT) {
> > if (btf_check_sorted((struct btf *)btf)) {
> > /* binary search */
> > bool skip_first;
> > int ret;
> >
> > /* return the leftmost with maching names */
> > ret = btf_find_type_by_name_bsearch(btf,
> > type_name, start_id);
> > if (ret < 0)
> > goto out;
> > /* skip kind checking */
> > if (kind == -1)
> > return ret;
> > total = btf__type_cnt(btf);
> > skip_first = true;
> > do {
> > t = btf_type_by_id(btf, ret);
> > if (btf_kind(t) != kind) {
> > if (skip_first) {
> > skip_first = false;
> > continue;
> > }
> > } else if (skip_first) {
> > return ret;
> > }
> > if (!t->name_off)
> > break;
> > tname = btf__str_by_offset(btf, t->name_off);
> > if (tname && !strcmp(tname, type_name))
> > return ret;
> > else
> > break;
> > } while (++ret < total);
> > } else {
> > /* linear search */
> > ...
> > }
> > }
> >
> > out:
> > return err;
> > }
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function
2025-11-05 19:41 ` Eduard Zingerman
@ 2025-11-06 17:09 ` Andrii Nakryiko
0 siblings, 0 replies; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-06 17:09 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 11:41 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2025-11-05 at 10:20 -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > You don't like that I ask people to improve implementation?
>
> Not at all.
>
> > You don't like the implementation itself? Or are you suggesting that
> > we should add a "generic" C implementation of
> > lower_bound/upper_bound and use callbacks for comparison logic? What
> > are you ranting about, exactly?
>
> Actually, having it as a static inline function in a header would be
> nice. I just tried that, and gcc is perfectly capable of inlining the
> comparison function in -O2 mode.
I dislike callbacks in principle, but I don't mind having such a
reusable primitive, if it's reasonably abstracted. Do it.
>
> I'm ranting about patch #5 being 101 insertions(+), 10 deletions(-)
> and patch #4 being 119 insertions(+), 23 deletions(-),
> while doing exactly the same thing.
Understandable, but code reuse is not (at least it should not be) the
goal for its own sake. It should help manage complexity and improve
maintainability. Code sharing is not all pros, it creates unnecessary
entanglement and dependencies.
I don't think sharing this code between libbpf and kernel is justified
here. 100 lines of code is not a big deal, IMO.
>
> And yes, this copy of binary search routine probably won't ever
> change. But changes to the comparator logic are pretty much possible,
> if we decide to include 'kind' as a secondary key one day.
> And that change will have to happen twice.
>
> > As I said, once binary search (of whatever kind, bounds or exact) is
> > written for something like this, it doesn't have to ever be modified.
> > I don't see this as a maintainability hurdle at all. But sharing code
> > between libbpf and kernel is something to be avoided. Look at #ifdef
> > __KERNEL__ sections of relo_core.c as one reason why.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-06 7:31 ` Donglin Peng
@ 2025-11-06 17:12 ` Andrii Nakryiko
2025-11-07 1:39 ` Donglin Peng
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-06 17:12 UTC (permalink / raw)
To: Donglin Peng
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 11:31 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> On Thu, Nov 6, 2025 at 2:29 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Nov 5, 2025 at 4:53 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > >
> > > On Wed, Nov 5, 2025 at 8:11 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > > >
> > > > > From: pengdonglin <pengdonglin@xiaomi.com>
> > > > >
> > > > > Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> > > > > This function reorganizes BTF type order according to a provided array of
> > > > > type IDs, updating all type references to maintain consistency.
> > > > >
> > > > > The permutation process involves:
> > > > > 1. Shuffling types into new order based on the provided ID mapping
> > > > > 2. Remapping all type ID references to point to new locations
> > > > > 3. Handling BTF extension data if provided via options
> > > > >
> > > > > This is particularly useful for optimizing type locality after BTF
> > > > > deduplication or for meeting specific layout requirements in specialized
> > > > > use cases.
> > > > >
> > > > > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > > > Cc: Alan Maguire <alan.maguire@oracle.com>
> > > > > Cc: Song Liu <song@kernel.org>
> > > > > Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> > > > > Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> > > > > ---
> > > > > tools/lib/bpf/btf.c | 161 +++++++++++++++++++++++++++++++++++++++
> > > > > tools/lib/bpf/btf.h | 34 +++++++++
> > > > > tools/lib/bpf/libbpf.map | 1 +
> > > > > 3 files changed, 196 insertions(+)
> > > > >
> > > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > > index 5e1c09b5dce8..3bc03f7fe31f 100644
> > > > > --- a/tools/lib/bpf/btf.c
> > > > > +++ b/tools/lib/bpf/btf.c
> > > > > @@ -5830,3 +5830,164 @@ int btf__relocate(struct btf *btf, const struct btf *base_btf)
> > > > > btf->owns_base = false;
> > > > > return libbpf_err(err);
> > > > > }
> > > > > +
> > > > > +struct btf_permute {
> > > > > + /* .BTF section to be permuted in-place */
> > > > > + struct btf *btf;
> > > > > + struct btf_ext *btf_ext;
> > > > > + /* Array of type IDs used for permutation. The array length must equal
> > > >
> > > > /*
> > > > * Use this comment style
> > > > */
> > >
> > > Thanks.
> > >
> > > >
> > > > > + * the number of types in the BTF being permuted, excluding the special
> > > > > + * void type at ID 0. For split BTF, the length corresponds to the
> > > > > + * number of types added on top of the base BTF.
> > > >
> > > > many words, but what exactly ids[i] means is still not clear, actually...
> > >
> > > Thanks. I'll clarify the description. Is the following parameter
> > > explanation acceptable?
> > >
> > > @param ids Array containing original type IDs (excluding VOID type ID
> > > 0) in user-defined order.
> > > The array size must match btf->nr_types, which
> >
> > Users don't have access to btf->nr_types, so referring to it in API
> > description seems wrong.
> >
> > But also, this all will change if we allow removing types, because
> > then array size might be smaller. But is it intentionally smaller or
> > user made a mistake? Let's go with the ID map approach, please.
>
> Thanks. I can implement both approaches, then we can assess their
> pros and cons.
>
> >
> > > also excludes VOID type ID 0.
> > >
> > >
> > > >
> > > > > + */
> > > > > + __u32 *ids;
> > > > > + /* Array of type IDs used to map from original type ID to a new permuted
> > > > > + * type ID, its length equals to the above ids */
> > > >
> > > > wrong comment style
> > >
> > > Thanks, I will fix it in the next version.
> > >
> > > >
> > > > > + __u32 *map;
> > > >
> > > > "map" is a bit generic. What if we use s/ids/id_map/ and
> > > > s/map/id_map_rev/ (for reverse)? I'd use "id_map" naming in the public
> > > > API to make it clear that it's a mapping of IDs, not just some array
> > > > of IDs.
> > >
> > > Thank you for the suggestion. While I agree that renaming 'map' to 'id_map'
> > > makes sense for clarity, but 'ids' seems correct as it denotes a collection of
> > > IDs, not a mapping structure.
> > >
> > > >
> > > > > +};
> > > > > +
> > > > > +static int btf_permute_shuffle_types(struct btf_permute *p);
> > > > > +static int btf_permute_remap_types(struct btf_permute *p);
> > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx);
> > > > > +
> > > > > +int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts)
> > > >
> > > > Let's require user to pass id_map_cnt in addition to id_map itself.
> > > > It's easy to get this wrong (especially with that special VOID 0 type
> > > > that has to be excluded, which I can't even make up my mind if that's
> > > > a good idea or not), so having user explicitly say what they think is
> > > > necessary for permutation is good.
> > >
> > > Thank you for your suggestion. However, I am concerned that introducing
> > > an additional `id_map_cnt` parameter could increase complexity. Specifically,
> > > if `id_map_cnt` is less than `btf->nr_types`, we might need to consider whether
> > > to resize the BTF. This could lead to missing types, potential ID remapping
> > > failures, or even require BTF re-deduplication if certain name strings are no
> > > longer referenced by any types.
> > >
> >
> > No, if the user provided a wrong id_map_cnt, it's an error and we
> > return -EINVAL. No resizing.
> >
> > > >
> > > > > +{
> > > > > + struct btf_permute p;
> > > > > + int i, err = 0;
> > > > > + __u32 *map = NULL;
> > > > > +
> > > > > + if (!OPTS_VALID(opts, btf_permute_opts) || !ids)
> > > >
> >
> > [...]
> >
> > > > > + goto done;
> > > > > + }
> > > > > +
> > > > > +done:
> > > > > + free(map);
> > > > > + return libbpf_err(err);
> > > > > +}
> > > > > +
> > > > > +/* Shuffle BTF types.
> > > > > + *
> > > > > + * Rearranges types according to the permutation map in p->ids. The p->map
> > > > > + * array stores the mapping from original type IDs to new shuffled IDs,
> > > > > + * which is used in the next phase to update type references.
> > > > > + *
> > > > > + * Validates that all IDs in the permutation array are valid and unique.
> > > > > + */
> > > > > +static int btf_permute_shuffle_types(struct btf_permute *p)
> > > > > +{
> > > > > + struct btf *btf = p->btf;
> > > > > + const struct btf_type *t;
> > > > > + __u32 *new_offs = NULL, *map;
> > > > > + void *nt, *new_types = NULL;
> > > > > + int i, id, len, err;
> > > > > +
> > > > > + new_offs = calloc(btf->nr_types, sizeof(*new_offs));
> > > >
> > > > we don't really need to allocate memory and maintain this, we can just
> > > > shift types around and then do what btf_parse_type_sec() does -- just
> > > > go over types one by one and calculate offsets, and update them
> > > > in-place inside btf->type_offs
> > >
> > > Thank you for the suggestion. However, this approach is not viable because
> > > the `btf__type_by_id()` function relies critically on the integrity of the
> > > `btf->type_offs` data structure. Attempting to modify `type_offs` through
> > > in-place operations could corrupt memory and lead to segmentation faults
> > > due to invalid pointer dereferencing.
> >
> > Huh? By the time this API returns, we'll fix up type_offs, users will
> > never notice. And to recalculate new type_offs we don't need
> > type_offs. One of us is missing something important, what is it?
>
> Thanks; however, the bad news is that btf__type_by_id() is indeed called
> within the API.
>
> static int btf_permute_shuffle_types(struct btf_permute *p)
> {
> struct btf *btf = p->btf;
> const struct btf_type *t;
> __u32 *new_offs = NULL, *ids_map;
> void *nt, *new_types = NULL;
> int i, id, len, err;
>
> new_offs = calloc(btf->nr_types, sizeof(*new_offs));
> new_types = calloc(btf->hdr->type_len, 1);
> ......
> nt = new_types;
> for (i = 0; i < btf->nr_types; i++) {
> id = p->ids[i];
> ......
> /* must be a valid type ID */
> t = btf__type_by_id(btf, id); <<<<<<<<<<<<<
You are still on the old types layout and old type_offs at that point.
You are not using your new_offs here *anyways*
> ......
> len = btf_type_size(t);
> memcpy(nt, t, len);
> new_offs[i] = nt - new_types;
> *ids_map = btf->start_id + i;
> nt += len;
> }
> ......
... you will recalculate and update type_offs here ... well past
btf__type_by_id() usage ...
> }
>
> >
> > [...]
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 19:23 ` Eduard Zingerman
@ 2025-11-06 17:21 ` Andrii Nakryiko
0 siblings, 0 replies; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-06 17:21 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Donglin Peng, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 11:23 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2025-11-05 at 10:23 -0800, Andrii Nakryiko wrote:
> > On Tue, Nov 4, 2025 at 5:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> > > > On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > >
> > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > >
> > > > > [...]
> > > > >
> > > > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > > > > +{
> > > > > > > + struct btf_permute *p = ctx;
> > > > > > > + __u32 new_type_id = *type_id;
> > > > > > > +
> > > > > > > + /* skip references that point into the base BTF */
> > > > > > > + if (new_type_id < p->btf->start_id)
> > > > > > > + return 0;
> > > > > > > +
> > > > > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > > > > >
> > > > > > I'm actually confused, I thought p->ids would be the mapping from
> > > > > > original type ID (minus start_id, of course) to a new desired ID, but
> > > > > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > > > > types identified by their original ID. I find it quite confusing. I
> > > > > > think about permutation as a mapping from original type ID to a new
> > > > > > type ID, am I confused?
> > > > >
> > > > > Yes, it is a desired sequence, not a mapping.
> > > > > I guess it's a bit simpler to use for the sorting use-case, as you can just
> > > > > swap ids while sorting.
> > > >
> > > > The question is really what makes most sense as an interface. Because
> > > > for sorting cases it's just the matter of a two-line for() loop to
> > > > create ID mapping once types are sorted.
> > > >
> > > > I have slight preference for id_map approach because it is easy to
> > > > extend to the case of selectively dropping some types. We can just
> > > > define that such IDs should be mapped to zero. This will work as a
> > > > natural extension. With the desired end sequence of IDs, it's less
> > > > natural and will require more work to determine which IDs are missing
> > > > from the sequence.
> > > >
> > > > So unless there is some really good and strong reason, shall we go
> > > > with the ID mapping approach?
> > >
> > > If the interface is extended with types_cnt, as you suggest, deleting
> > > types is trivial with the sequence interface as well. At least the way it
> > > is implemented by this patch, you just copy elements from 'ids' one by
> > > one.
> >
> > But it is way less explicit and obvious way to delete element. With ID
> > map it is obvious, that type will be mapped to zero. With list of IDs,
> > you effectively search for elements that are missing, which IMO is a
> > far less optimal interface.
> >
> > So I still favor the ID map approach.
>
> You don't need to search for deleted elements with current
> implementation (assuming the ids_cnt parameter is added).
> Suppose there are 4 types + void in BTF and the 'ids' sequence looks
> as follows: {1, 3, 4}, current implementation will:
> - iterate over 'ids':
> - copy 1 to new_types, remember to remap 1 to 1
> - copy 3 to new_types, remember to remap 3 to 2
> - copy 4 to new_types, remember to remap 4 to 3
> - do the remapping.
Eduard, from API perspective I very much do not like saying that "if
type ID is missing from the list -- drop it". I very much prefer "map
type you want to delete to zero". How can I be more clear about this?
I didn't even talk about implementation, I was talking about API.
>
> Consider the sorting use-case:
> - If 'ids' is the desired final order of types, libbpf needs to
> allocate the mapping from old id to new id, as described above.
> - If 'ids' is a map from old id to new id:
> - libbpf will have to allocate a temporary array to hold the desired
> id sequence, to know in which order to copy the types;
> - user will have to allocate the array for mapping.
>
> So, for id map approach it is one more allocation for no benefit.
On the libbpf side - no difference in terms of memory use. On the user
side, worst case, N * sizeof(int) temporary allocation for ID mapping.
400KB at most to resort 100K of BTF types, which takes megabytes
anyways. I don't even want to talk about the amount of memory pahole
will waste on DWARF information processing. And depending on what data
structures user code keeps for sorting indexing, this allocation might
be necessary anyways with either approach.
But this is all irrelevant. I care about the interface way more than
temporary 400KB of memory usage.
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-06 7:49 ` Donglin Peng
@ 2025-11-06 17:31 ` Andrii Nakryiko
2025-11-07 4:57 ` Donglin Peng
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-06 17:31 UTC (permalink / raw)
To: Donglin Peng
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Wed, Nov 5, 2025 at 11:49 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> On Thu, Nov 6, 2025 at 2:11 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Nov 5, 2025 at 5:48 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > >
> > > On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > >
> > > > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > > > return type_id;
> > > > > > > > }
> > > > > > > >
> > > > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > > > +/*
> > > > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > > > + * and returns the leftmost matching index.
> > > > > > > > + */
> > > > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > > > + __s32 *left, __s32 *right)
> > > > > > >
> > > > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > > > searches where one would do just fine.
> > > > > >
> > > > > > I think the idea is that there would be fewer strcmp's if there is a
> > > > > > long sequence of items with identical names.
> > > > >
> > > > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > > > entries we expect in kernel BTF? An additional O(log N) search over 70K+
> > > > > types will, with high likelihood, take more comparisons.
> > > >
> > > > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > > > 51737 '(anon)'
> > > > 277 'bpf_kfunc'
> > > > 4 'long
> > > > 3 'perf_aux_event'
> > > > 3 'workspace'
> > > > 2 'ata_acpi_gtm'
> > > > 2 'avc_cache_stats'
> > > > 2 'bh_accounting'
> > > > 2 'bp_cpuinfo'
> > > > 2 'bpf_fastcall'
> > > >
> > > > 'bpf_kfunc' is probably for decl_tags.
> > > > So I agree with you regarding the second binary search, it is not
> > > > necessary. But skipping all anonymous types (and thus having to
> > > > maintain nr_sorted_types) might be useful, on each search two
> > > > iterations would be wasted to skip those.
> >
> > fair enough, eliminating a big chunk of anonymous types is useful, let's do this
> >
> > >
> > > Thank you. After removing the redundant iterations, performance increased
> > > significantly compared with two iterations.
> > >
> > > Test Case: Locate all 58,719 named types in vmlinux BTF
> > > Methodology:
> > > ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
> > >
> > > Two iterations:
> > > | Condition | Lookup Time | Improvement |
> > > |--------------------|-------------|-------------|
> > > | Unsorted (Linear) | 17,282 ms | Baseline |
> > > | Sorted (Binary) | 19 ms | 909x faster |
> > >
> > > One iteration:
> > > Results:
> > > | Condition | Lookup Time | Improvement |
> > > |--------------------|-------------|-------------|
> > > | Unsorted (Linear) | 17,619 ms | Baseline |
> > > | Sorted (Binary) | 10 ms | 1762x faster |
> > >
> > > Here is the code implementation with a single iteration approach.
> > > I believe this scenario differs from find_linfo because we cannot
> > > determine in advance whether the specified type name will be found.
> > > Please correct me if I've misunderstood anything, and I welcome any
> > > guidance on this matter.
> > >
> > > static __s32 btf_find_type_by_name_bsearch(const struct btf *btf,
> > > const char *name,
> > > __s32 start_id)
> > > {
> > > const struct btf_type *t;
> > > const char *tname;
> > > __s32 l, r, m, lmost = -ENOENT;
> > > int ret;
> > >
> > > /* found the leftmost btf_type that matches */
> > > l = start_id;
> > > r = btf__type_cnt(btf) - 1;
> > > while (l <= r) {
> > > m = l + (r - l) / 2;
> > > t = btf_type_by_id(btf, m);
> > > if (!t->name_off) {
> > > ret = 1;
> > > } else {
> > > tname = btf__str_by_offset(btf, t->name_off);
> > > ret = !tname ? 1 : strcmp(tname, name);
> > > }
> > > if (ret < 0) {
> > > l = m + 1;
> > > } else {
> > > if (ret == 0)
> > > lmost = m;
> > > r = m - 1;
> > > }
> > > }
> > >
> > > return lmost;
> > > }
> >
> > There are different ways to implement this. At the highest level,
> > implementation below just searches for leftmost element that has name
> > >= the one we are searching for. One complication is that such element
> > might not event exists. We can solve that checking ahead of time
> > whether the rightmost type satisfied the condition, or we could do
> > something similar to what I do in the loop below, where I allow l == r
> > and then if that element has name >= to what we search, we exit
> > because we found it. And if not, l will become larger than r, we'll
> > break out of the loop and we'll know that we couldn't find the
> > element. I haven't tested it, but please take a look and if you decide
> > to go with such approach, do test it for edge cases, of course.
> >
> > /*
> > * We are searching for the smallest r such that type #r's name is >= name.
> > * It might not exist, in which case we'll have l == r + 1.
> > */
> > l = start_id;
> > r = btf__type_cnt(btf) - 1;
> > while (l < r) {
> > m = l + (r - l) / 2;
> > t = btf_type_by_id(btf, m);
> > tname = btf__str_by_offset(btf, t->name_off);
> >
> > if (strcmp(tname, name) >= 0) {
> > if (l == r)
> > return r; /* found it! */
>
> It seems that this if condition will never hold, because a while(l < r) loop
It should be `while (l <= r)`, I forgot to update it, but I mentioned
that I do want to allow the l == r case.
> is used. Moreover, even if the condition were to hold, it wouldn't guarantee
> a successful search.
Please elaborate on "wouldn't guarantee a successful search".
>
> > r = m;
> > } else {
> > l = m + 1;
> > }
> > }
> > /* here we know given element doesn't exist, return index beyond end of types */
> > return btf__type_cnt(btf);
>
> I think that return -ENOENT seems more reasonable.
Think how you will be using this inside btf_find_type_by_name_kind():
int idx = btf_find_by_name_bsearch(btf, name);
for (int n = btf__type_cnt(btf); idx < n; idx++) {
struct btf_type *t = btf__type_by_id(btf, idx);
const char *tname = btf__str_by_offset(btf, t->name_off);
if (strcmp(tname, name) != 0)
return -ENOENT;
if (btf_kind(t) == kind)
return idx;
}
return -ENOENT;
Having btf_find_by_name_bsearch() return -ENOENT instead of
btf__type_cnt() would just require extra explicit -ENOENT handling. And
given the function could then return an "error", we'd need to either handle
other non-ENOENT errors or at least leave a comment that they should
never happen, even though the interface itself looks like they could.
This is relatively minor and it's all internal implementation, so we
can change that later. But I'm explaining my reasons for why I'd
return the index of a non-existing type past the end, just like you'd do
with pointer-based interfaces that return a pointer past the last
element.
>
> >
> >
> > We could have checked instead whether strcmp(btf__str_by_offset(btf,
> > btf__type_by_id(btf, btf__type_cnt() - 1)->name_off), name) < 0 and
> > exit early. That's just a bit more code duplication of essentially
> > what we do inside the loop, so that if (l == r) seems fine to me, but
> > I'm not married to this.
>
> Sorry, I believe that even if strcmp(btf__str_by_offset(btf,
> btf__type_by_id(btf,
> btf__type_cnt() - 1)->name_off), name) >= 0, it still doesn't seem to
> guarantee that the search will definitely succeed.
If the last element's name is >= name, the search will definitely find at least
that element. What do you mean by "succeed"? All I care about here is
that binary search loop doesn't loop forever and it returns correct
index (or detects that no element can be found).
>
> >
> > >
> > > static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
> > > const char *type_name, __u32 kind)
> > > {
> > > const struct btf_type *t;
> > > const char *tname;
> > > int err = -ENOENT;
> > > __u32 total;
> > >
> > > if (!btf)
> > > goto out;
> > >
> > > if (start_id < btf->start_id) {
> > > err = btf_find_type_by_name_kind(btf->base_btf, start_id,
> > > type_name, kind);
> > > if (err == -ENOENT)
> > > start_id = btf->start_id;
> > > }
> > >
> > > if (err == -ENOENT) {
> > > if (btf_check_sorted((struct btf *)btf)) {
> > > /* binary search */
> > > bool skip_first;
> > > int ret;
> > >
> > > /* return the leftmost with matching names */
> > > ret = btf_find_type_by_name_bsearch(btf,
> > > type_name, start_id);
> > > if (ret < 0)
> > > goto out;
> > > /* skip kind checking */
> > > if (kind == -1)
> > > return ret;
> > > total = btf__type_cnt(btf);
> > > skip_first = true;
> > > do {
> > > t = btf_type_by_id(btf, ret);
> > > if (btf_kind(t) != kind) {
> > > if (skip_first) {
> > > skip_first = false;
> > > continue;
> > > }
> > > } else if (skip_first) {
> > > return ret;
> > > }
> > > if (!t->name_off)
> > > break;
> > > tname = btf__str_by_offset(btf, t->name_off);
> > > if (tname && !strcmp(tname, type_name))
> > > return ret;
> > > else
> > > break;
> > > } while (++ret < total);
> > > } else {
> > > /* linear search */
> > > ...
> > > }
> > > }
> > >
> > > out:
> > > return err;
> > > }
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-06 17:12 ` Andrii Nakryiko
@ 2025-11-07 1:39 ` Donglin Peng
0 siblings, 0 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-07 1:39 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: ast, linux-kernel, bpf, Eduard Zingerman, Alan Maguire, Song Liu,
pengdonglin
On Fri, Nov 7, 2025 at 1:13 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Nov 5, 2025 at 11:31 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
> >
> > On Thu, Nov 6, 2025 at 2:29 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Wed, Nov 5, 2025 at 4:53 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 5, 2025 at 8:11 AM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Tue, Nov 4, 2025 at 5:40 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > > > >
> > > > > > From: pengdonglin <pengdonglin@xiaomi.com>
> > > > > >
> > > > > > Introduce btf__permute() API to allow in-place rearrangement of BTF types.
> > > > > > This function reorganizes BTF type order according to a provided array of
> > > > > > type IDs, updating all type references to maintain consistency.
> > > > > >
> > > > > > The permutation process involves:
> > > > > > 1. Shuffling types into new order based on the provided ID mapping
> > > > > > 2. Remapping all type ID references to point to new locations
> > > > > > 3. Handling BTF extension data if provided via options
> > > > > >
> > > > > > This is particularly useful for optimizing type locality after BTF
> > > > > > deduplication or for meeting specific layout requirements in specialized
> > > > > > use cases.
> > > > > >
> > > > > > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > > > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > > > > Cc: Alan Maguire <alan.maguire@oracle.com>
> > > > > > Cc: Song Liu <song@kernel.org>
> > > > > > Signed-off-by: pengdonglin <pengdonglin@xiaomi.com>
> > > > > > Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
> > > > > > ---
> > > > > > tools/lib/bpf/btf.c | 161 +++++++++++++++++++++++++++++++++++++++
> > > > > > tools/lib/bpf/btf.h | 34 +++++++++
> > > > > > tools/lib/bpf/libbpf.map | 1 +
> > > > > > 3 files changed, 196 insertions(+)
> > > > > >
> > > > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > > > index 5e1c09b5dce8..3bc03f7fe31f 100644
> > > > > > --- a/tools/lib/bpf/btf.c
> > > > > > +++ b/tools/lib/bpf/btf.c
> > > > > > @@ -5830,3 +5830,164 @@ int btf__relocate(struct btf *btf, const struct btf *base_btf)
> > > > > > btf->owns_base = false;
> > > > > > return libbpf_err(err);
> > > > > > }
> > > > > > +
> > > > > > +struct btf_permute {
> > > > > > + /* .BTF section to be permuted in-place */
> > > > > > + struct btf *btf;
> > > > > > + struct btf_ext *btf_ext;
> > > > > > + /* Array of type IDs used for permutation. The array length must equal
> > > > >
> > > > > /*
> > > > > * Use this comment style
> > > > > */
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > > + * the number of types in the BTF being permuted, excluding the special
> > > > > > + * void type at ID 0. For split BTF, the length corresponds to the
> > > > > > + * number of types added on top of the base BTF.
> > > > >
> > > > > many words, but what exactly ids[i] means is still not clear, actually...
> > > >
> > > > Thanks. I'll clarify the description. Is the following parameter
> > > > explanation acceptable?
> > > >
> > > > @param ids Array containing original type IDs (excluding VOID type ID
> > > > 0) in user-defined order.
> > > > The array size must match btf->nr_types, which
> > >
> > > Users don't have access to btf->nr_types, so referring to it in API
> > > description seems wrong.
> > >
> > > But also, this all will change if we allow removing types, because
> > > then array size might be smaller. But is it intentionally smaller or
> > > user made a mistake? Let's go with the ID map approach, please.
> >
> > Thanks. I can implement both approaches, then we can assess their
> > pros and cons.
> >
> > >
> > > > also excludes VOID type ID 0.
> > > >
> > > >
> > > > >
> > > > > > + */
> > > > > > + __u32 *ids;
> > > > > > + /* Array of type IDs used to map from original type ID to a new permuted
> > > > > > + * type ID, its length equals to the above ids */
> > > > >
> > > > > wrong comment style
> > > >
> > > > Thanks, I will fix it in the next version.
> > > >
> > > > >
> > > > > > + __u32 *map;
> > > > >
> > > > > "map" is a bit generic. What if we use s/ids/id_map/ and
> > > > > s/map/id_map_rev/ (for reverse)? I'd use "id_map" naming in the public
> > > > > API to make it clear that it's a mapping of IDs, not just some array
> > > > > of IDs.
> > > >
> > > > Thank you for the suggestion. While I agree that renaming 'map' to 'id_map'
> > > > makes sense for clarity, 'ids' seems correct as it denotes a collection of
> > > > IDs, not a mapping structure.
> > > >
> > > > >
> > > > > > +};
> > > > > > +
> > > > > > +static int btf_permute_shuffle_types(struct btf_permute *p);
> > > > > > +static int btf_permute_remap_types(struct btf_permute *p);
> > > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx);
> > > > > > +
> > > > > > +int btf__permute(struct btf *btf, __u32 *ids, const struct btf_permute_opts *opts)
> > > > >
> > > > > Let's require user to pass id_map_cnt in addition to id_map itself.
> > > > > It's easy to get this wrong (especially with that special VOID 0 type
> > > > > that has to be excluded, which I can't even make up my mind if that's
> > > > > a good idea or not), so having user explicitly say what they think is
> > > > > necessary for permutation is good.
> > > >
> > > > Thank you for your suggestion. However, I am concerned that introducing
> > > > an additional `id_map_cnt` parameter could increase complexity. Specifically,
> > > > if `id_map_cnt` is less than `btf->nr_types`, we might need to consider whether
> > > > to resize the BTF. This could lead to missing types, potential ID remapping
> > > > failures, or even require BTF re-deduplication if certain name strings are no
> > > > longer referenced by any types.
> > > >
> > >
> > > No, if the user provided a wrong id_map_cnt, it's an error and we
> > > return -EINVAL. No resizing.
> > >
> > > > >
> > > > > > +{
> > > > > > + struct btf_permute p;
> > > > > > + int i, err = 0;
> > > > > > + __u32 *map = NULL;
> > > > > > +
> > > > > > + if (!OPTS_VALID(opts, btf_permute_opts) || !ids)
> > > > >
> > >
> > > [...]
> > >
> > > > > > + goto done;
> > > > > > + }
> > > > > > +
> > > > > > +done:
> > > > > > + free(map);
> > > > > > + return libbpf_err(err);
> > > > > > +}
> > > > > > +
> > > > > > +/* Shuffle BTF types.
> > > > > > + *
> > > > > > + * Rearranges types according to the permutation map in p->ids. The p->map
> > > > > > + * array stores the mapping from original type IDs to new shuffled IDs,
> > > > > > + * which is used in the next phase to update type references.
> > > > > > + *
> > > > > > + * Validates that all IDs in the permutation array are valid and unique.
> > > > > > + */
> > > > > > +static int btf_permute_shuffle_types(struct btf_permute *p)
> > > > > > +{
> > > > > > + struct btf *btf = p->btf;
> > > > > > + const struct btf_type *t;
> > > > > > + __u32 *new_offs = NULL, *map;
> > > > > > + void *nt, *new_types = NULL;
> > > > > > + int i, id, len, err;
> > > > > > +
> > > > > > + new_offs = calloc(btf->nr_types, sizeof(*new_offs));
> > > > >
> > > > > we don't really need to allocate memory and maintain this, we can just
> > > > > shift types around and then do what btf_parse_type_sec() does -- just
> > > > > go over types one by one and calculate offsets, and update them
> > > > > in-place inside btf->type_offs
> > > >
> > > > Thank you for the suggestion. However, this approach is not viable because
> > > > the `btf__type_by_id()` function relies critically on the integrity of the
> > > > `btf->type_offs` data structure. Attempting to modify `type_offs` through
> > > > in-place operations could corrupt memory and lead to segmentation faults
> > > > due to invalid pointer dereferencing.
> > >
> > > Huh? By the time this API returns, we'll fix up type_offs, users will
> > > never notice. And to recalculate new type_offs we don't need
> > > type_offs. One of us is missing something important, what is it?
> >
> > Thanks; however, the bad news is that btf__type_by_id() is indeed called
> > within the API.
> >
> > static int btf_permute_shuffle_types(struct btf_permute *p)
> > {
> > struct btf *btf = p->btf;
> > const struct btf_type *t;
> > __u32 *new_offs = NULL, *ids_map;
> > void *nt, *new_types = NULL;
> > int i, id, len, err;
> >
> > new_offs = calloc(btf->nr_types, sizeof(*new_offs));
> > new_types = calloc(btf->hdr->type_len, 1);
> > ......
> > nt = new_types;
> > for (i = 0; i < btf->nr_types; i++) {
> > id = p->ids[i];
> > ......
> > /* must be a valid type ID */
> > t = btf__type_by_id(btf, id); <<<<<<<<<<<<<
>
> You are still on the old types layout and old type_offs at that point.
> You are not using your new_offs here *anyways*
>
> > ......
> > len = btf_type_size(t);
> > memcpy(nt, t, len);
> > new_offs[i] = nt - new_types;
> > *ids_map = btf->start_id + i;
> > nt += len;
> > }
> > ......
>
> ... you will recalculate and update type_offs here ... well past
> btf__type_by_id() usage ...
Thanks, I see. We need to add another for loop for this update, but
the cost is minimal compared to the memory savings. I will fix it in
v6.
>
> > }
> >
> > >
> > > [...]
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-05 18:23 ` Andrii Nakryiko
2025-11-05 19:23 ` Eduard Zingerman
@ 2025-11-07 2:36 ` Donglin Peng
2025-11-07 17:43 ` Andrii Nakryiko
1 sibling, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-07 2:36 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Thu, Nov 6, 2025 at 2:23 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 5:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> > > On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > >
> > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > >
> > > > [...]
> > > >
> > > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > > > +{
> > > > > > + struct btf_permute *p = ctx;
> > > > > > + __u32 new_type_id = *type_id;
> > > > > > +
> > > > > > + /* skip references that point into the base BTF */
> > > > > > + if (new_type_id < p->btf->start_id)
> > > > > > + return 0;
> > > > > > +
> > > > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > > > >
> > > > > I'm actually confused, I thought p->ids would be the mapping from
> > > > > original type ID (minus start_id, of course) to a new desired ID, but
> > > > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > > > types identified by their original ID. I find it quite confusing. I
> > > > > think about permutation as a mapping from original type ID to a new
> > > > > type ID, am I confused?
> > > >
> > > > Yes, it is a desired sequence, not a mapping.
> > > > I guess it's a bit simpler to use for the sorting use-case, as you can just
> > > > swap ids while sorting.
> > >
> > > The question is really what makes most sense as an interface. Because
> > > for sorting cases it's just the matter of a two-line for() loop to
> > > create ID mapping once types are sorted.
> > >
> > > I have slight preference for id_map approach because it is easy to
> > > extend to the case of selectively dropping some types. We can just
> > > define that such IDs should be mapped to zero. This will work as a
> > > natural extension. With the desired end sequence of IDs, it's less
> > > natural and will require more work to determine which IDs are missing
> > > from the sequence.
> > >
> > > So unless there is some really good and strong reason, shall we go
> > > with the ID mapping approach?
> >
> > If the interface is extended with types_cnt, as you suggest, deleting
> > types is trivial with the sequence interface as well. At least the way it
> > is implemented by this patch, you just copy elements from 'ids' one by
> > one.
>
> But it is way less explicit and obvious way to delete element. With ID
> map it is obvious, that type will be mapped to zero. With list of IDs,
> you effectively search for elements that are missing, which IMO is a
> far less optimal interface.
>
> So I still favor the ID map approach.
Hi Andrii,
I've submitted v5 implementing the sequence-based approach, and I plan
to introduce the ID map approach in v6. However, I have a few remaining
questions that need clarification:
1. ID Map Array Semantics:
- When the ID map array specifies `[2] = 4`, does this indicate that the
original type at `start_id + 2` should be remapped to position
`start_id + 4`? Should the following mapping attempts be rejected:
a) If the target index `4` exceeds the total number of types (`nr_types`)?
b) If multiple source types map to the same target location (e.g., both
`[1] = 3` and `[2] = 3`)?
- If `[3] = 0`, does this indicate that the type at `start_id + 3` should
be dropped?
- Does this also imply that the VOID type (ID 0) cannot be remapped and
must always remain unchanged?
2. ID Map Array Size:
- Must the ID map array size be <= the number of BTF types? If the array
is smaller, should any missing types be automatically dropped?
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-06 17:31 ` Andrii Nakryiko
@ 2025-11-07 4:57 ` Donglin Peng
2025-11-07 17:01 ` Andrii Nakryiko
0 siblings, 1 reply; 53+ messages in thread
From: Donglin Peng @ 2025-11-07 4:57 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin, zhangxiaoqin
On Fri, Nov 7, 2025 at 1:31 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Nov 5, 2025 at 11:49 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
> >
> > On Thu, Nov 6, 2025 at 2:11 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Wed, Nov 5, 2025 at 5:48 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > >
> > > > > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > > > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > > > >
> > > > > > > [...]
> > > > > > >
> > > > > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > > > > return type_id;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > > > > +/*
> > > > > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > > > > + * and returns the leftmost matching index.
> > > > > > > > > + */
> > > > > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > > > > + __s32 *left, __s32 *right)
> > > > > > > >
> > > > > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > > > > searches where one would do just fine.
> > > > > > >
> > > > > > > I think the idea is that there would be less strcmp's if there is a
> > > > > > > long sequence of items with identical names.
> > > > > >
> > > > > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > > > > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > > > > > with high likelihood will take more comparisons.
> > > > >
> > > > > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > > > > 51737 '(anon)'
> > > > > 277 'bpf_kfunc'
> > > > > 4 'long
> > > > > 3 'perf_aux_event'
> > > > > 3 'workspace'
> > > > > 2 'ata_acpi_gtm'
> > > > > 2 'avc_cache_stats'
> > > > > 2 'bh_accounting'
> > > > > 2 'bp_cpuinfo'
> > > > > 2 'bpf_fastcall'
> > > > >
> > > > > 'bpf_kfunc' is probably for decl_tags.
> > > > > So I agree with you regarding the second binary search, it is not
> > > > > necessary. But skipping all anonymous types (and thus having to
> > > > > maintain nr_sorted_types) might be useful, on each search two
> > > > > iterations would be wasted to skip those.
> > >
> > > fair enough, eliminating a big chunk of anonymous types is useful, let's do this
> > >
> > > >
> > > > Thank you. After removing the redundant iterations, performance increased
> > > > significantly compared with two iterations.
> > > >
> > > > Test Case: Locate all 58,719 named types in vmlinux BTF
> > > > Methodology:
> > > > ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
> > > >
> > > > Two iterations:
> > > > | Condition | Lookup Time | Improvement |
> > > > |--------------------|-------------|-------------|
> > > > | Unsorted (Linear) | 17,282 ms | Baseline |
> > > > | Sorted (Binary) | 19 ms | 909x faster |
> > > >
> > > > One iteration:
> > > > Results:
> > > > | Condition | Lookup Time | Improvement |
> > > > |--------------------|-------------|-------------|
> > > > | Unsorted (Linear) | 17,619 ms | Baseline |
> > > > | Sorted (Binary) | 10 ms | 1762x faster |
> > > >
> > > > Here is the code implementation with a single iteration approach.
> > > > I believe this scenario differs from find_linfo because we cannot
> > > > determine in advance whether the specified type name will be found.
> > > > Please correct me if I've misunderstood anything, and I welcome any
> > > > guidance on this matter.
> > > >
> > > > static __s32 btf_find_type_by_name_bsearch(const struct btf *btf,
> > > > const char *name,
> > > > __s32 start_id)
> > > > {
> > > > const struct btf_type *t;
> > > > const char *tname;
> > > > __s32 l, r, m, lmost = -ENOENT;
> > > > int ret;
> > > >
> > > > /* found the leftmost btf_type that matches */
> > > > l = start_id;
> > > > r = btf__type_cnt(btf) - 1;
> > > > while (l <= r) {
> > > > m = l + (r - l) / 2;
> > > > t = btf_type_by_id(btf, m);
> > > > if (!t->name_off) {
> > > > ret = 1;
> > > > } else {
> > > > tname = btf__str_by_offset(btf, t->name_off);
> > > > ret = !tname ? 1 : strcmp(tname, name);
> > > > }
> > > > if (ret < 0) {
> > > > l = m + 1;
> > > > } else {
> > > > if (ret == 0)
> > > > lmost = m;
> > > > r = m - 1;
> > > > }
> > > > }
> > > >
> > > > return lmost;
> > > > }
> > >
> > > There are different ways to implement this. At the highest level,
> > > implementation below just searches for leftmost element that has name
> > > >= the one we are searching for. One complication is that such element
> > > might not even exist. We can solve that by checking ahead of time
> > > whether the rightmost type satisfies the condition, or we could do
> > > something similar to what I do in the loop below, where I allow l == r
> > > and then if that element has name >= to what we search, we exit
> > > because we found it. And if not, l will become larger than r, we'll
> > > break out of the loop and we'll know that we couldn't find the
> > > element. I haven't tested it, but please take a look and if you decide
> > > to go with such approach, do test it for edge cases, of course.
> > >
> > > /*
> > > * We are searching for the smallest r such that type #r's name is >= name.
> > > * It might not exist, in which case we'll have l == r + 1.
> > > */
> > > l = start_id;
> > > r = btf__type_cnt(btf) - 1;
> > > while (l < r) {
> > > m = l + (r - l) / 2;
> > > t = btf_type_by_id(btf, m);
> > > tname = btf__str_by_offset(btf, t->name_off);
> > >
> > > if (strcmp(tname, name) >= 0) {
> > > if (l == r)
> > > return r; /* found it! */
> >
> > It seems that this if condition will never hold, because a while(l < r) loop
>
> It should be `while (l <= r)`, I forgot to update it, but I mentioned
> that I do want to allow l == r condition.
>
> > is used. Moreover, even if the condition were to hold, it wouldn't guarantee
> > a successful search.
>
> Elaborate please on "wouldn't guarantee a successful search".
By a "successful search" I mean one where we actually find the element
we want.
>
> >
> > > r = m;
> > > } else {
> > > l = m + 1;
> > > }
> > > }
> > > /* here we know given element doesn't exist, return index beyond end of types */
> > > return btf__type_cnt(btf);
> >
> > I think that return -ENOENT seems more reasonable.
>
> Think how you will be using this inside btf_find_type_by_name_kind():
>
>
> int idx = btf_find_by_name_bsearch(btf, name);
>
> for (int n = btf__type_cnt(btf); idx < n; idx++) {
> struct btf_type *t = btf__type_by_id(btf, idx);
> const char *tname = btf__str_by_offset(btf, t->name_off);
> if (strcmp(tname, name) != 0)
> return -ENOENT;
> if (btf_kind(t) == kind)
> return idx;
> }
> return -ENOENT;
Thanks, it seems cleaner.
>
>
> Having btf_find_by_name_bsearch() return -ENOENT instead of
> btf__type_cnt() will just require extra explicit -ENOENT handling. And
> given that the function could then return an "error", we'd need to either
> handle other non-ENOENT errors or at least leave a comment that this
> should never happen, even though the interface itself looks like it could.
>
> This is relatively minor and it's all internal implementation, so we
> can change that later. But I'm explaining my reasons for why I'd
> return the index of a non-existing type past the end, just like you'd do
> with pointer-based interfaces that return a pointer past the last
> element.
Thanks, I see.
>
>
> >
> > >
> > >
> > > We could have checked instead whether strcmp(btf__str_by_offset(btf,
> > > btf__type_by_id(btf, btf__type_cnt() - 1)->name_off), name) < 0 and
> > > exit early. That's just a bit more code duplication of essentially
> > > what we do inside the loop, so that if (l == r) seems fine to me, but
> > > I'm not married to this.
> >
> > Sorry, I believe that even if strcmp(btf__str_by_offset(btf,
> > btf__type_by_id(btf,
> > btf__type_cnt() - 1)->name_off), name) >= 0, it still doesn't seem to
> > guarantee that the search will definitely succeed.
>
> If the last element has >= name, search will definitely find at least
> that element. What do you mean by "succeed"? All I care about here is
Thank you. By "successful search," I mean finding the exact matching
element we're looking for, not just the first element that meets the
"≥" condition.

Here's a concrete example to illustrate the issue:

Base BTF contains: {"A", "C", "E", "F"}
Split BTF contains: {"B", "D"}
Target search: "D" in split BTF

The current implementation recursively searches from the base BTF first.
While "D" is lexicographically ≤ "F" (the last element in base BTF),
"D" doesn't actually exist in the base BTF. When the binary search
reaches the l == r condition, it returns the index of "E" instead.

This requires an extra name comparison check after
btf_find_by_name_bsearch returns, which could be avoided in the first
loop iteration if the search directly identified exact matches.
int idx = btf_find_by_name_bsearch(btf, name);

for (int n = btf__type_cnt(btf); idx < n; idx++) {
        struct btf_type *t = btf__type_by_id(btf, idx);
        const char *tname = btf__str_by_offset(btf, t->name_off);
        if (strcmp(tname, name) != 0) /* <<< redundant on the first loop
                                       * iteration when a matching index
                                       * is found */
                return -ENOENT;
        if (btf_kind(t) == kind)
                return idx;
}
return -ENOENT;
I tested this with a simple program searching for 3 in {0, 1, 2, 4, 5}:
#include <stdio.h>
#include <stdlib.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

int main(int argc, char *argv[])
{
        int values[] = {0, 1, 2, 4, 5};
        int to_find;
        int i;

        to_find = atoi(argv[1]);

        for (i = 0; i < ARRAY_SIZE(values); i++)
                printf("[%d] = %d\n", i, values[i]);

        printf("To Find %d\n", to_find);

        {
                int l, m, r;

                l = 0;
                r = ARRAY_SIZE(values) - 1;

                while (l <= r) {
                        m = l + (r - l) / 2;
                        if (values[m] >= to_find) {
                                if (l == r) {
                                        printf("!!!! Found: [%d] ==> %d\n", r, values[r]);
                                        break;
                                }
                                r = m;
                        } else {
                                l = m + 1;
                        }
                }

                printf("END: l: %d, r: %d\n", l, r);
        }

        return 0;
}
Output:
[0] = 0
[1] = 1
[2] = 2
[3] = 4
[4] = 5
To Find 3
!!!! Found: [3] ==> 4
END: l: 3, r: 3
The search returns index 3 (value 4), which is the first value ≥ 3,
but since 4 ≠ 3, it's not an exact match. Thus, the algorithm cannot
guarantee a successful search for the exact element without additional
checks.
> that binary search loop doesn't loop forever and it returns correct
> index (or detects that no element can be found).
>
> >
> > >
> > > >
> > > > static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
> > > > const char *type_name, __u32 kind)
> > > > {
> > > > const struct btf_type *t;
> > > > const char *tname;
> > > > int err = -ENOENT;
> > > > __u32 total;
> > > >
> > > > if (!btf)
> > > > goto out;
> > > >
> > > > if (start_id < btf->start_id) {
> > > > err = btf_find_type_by_name_kind(btf->base_btf, start_id,
> > > > type_name, kind);
> > > > if (err == -ENOENT)
> > > > start_id = btf->start_id;
> > > > }
> > > >
> > > > if (err == -ENOENT) {
> > > > if (btf_check_sorted((struct btf *)btf)) {
> > > > /* binary search */
> > > > bool skip_first;
> > > > int ret;
> > > >
> > > > /* return the leftmost with matching names */
> > > > ret = btf_find_type_by_name_bsearch(btf,
> > > > type_name, start_id);
> > > > if (ret < 0)
> > > > goto out;
> > > > /* skip kind checking */
> > > > if (kind == -1)
> > > > return ret;
> > > > total = btf__type_cnt(btf);
> > > > skip_first = true;
> > > > do {
> > > > t = btf_type_by_id(btf, ret);
> > > > if (btf_kind(t) != kind) {
> > > > if (skip_first) {
> > > > skip_first = false;
> > > > continue;
> > > > }
> > > > } else if (skip_first) {
> > > > return ret;
> > > > }
> > > > if (!t->name_off)
> > > > break;
> > > > tname = btf__str_by_offset(btf, t->name_off);
> > > > if (tname && !strcmp(tname, type_name))
> > > > return ret;
> > > > else
> > > > break;
> > > > } while (++ret < total);
> > > > } else {
> > > > /* linear search */
> > > > ...
> > > > }
> > > > }
> > > >
> > > > out:
> > > > return err;
> > > > }
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-07 4:57 ` Donglin Peng
@ 2025-11-07 17:01 ` Andrii Nakryiko
2025-11-10 2:04 ` Donglin Peng
0 siblings, 1 reply; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-07 17:01 UTC (permalink / raw)
To: Donglin Peng
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin, zhangxiaoqin
On Thu, Nov 6, 2025 at 8:57 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> On Fri, Nov 7, 2025 at 1:31 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Nov 5, 2025 at 11:49 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > >
> > > On Thu, Nov 6, 2025 at 2:11 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 5, 2025 at 5:48 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > > > > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > > > > >
> > > > > > > > [...]
> > > > > > > >
> > > > > > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > > > > > return type_id;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > > > > > +/*
> > > > > > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > > > > > + * and returns the leftmost matching index.
> > > > > > > > > > + */
> > > > > > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > > > > > + __s32 *left, __s32 *right)
> > > > > > > > >
> > > > > > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > > > > > searches where one would do just fine.
> > > > > > > >
> > > > > > > > I think the idea is that there would be less strcmp's if there is a
> > > > > > > > long sequence of items with identical names.
> > > > > > >
> > > > > > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > > > > > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > > > > > > with high likelihood will take more comparisons.
> > > > > >
> > > > > > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > > > > > 51737 '(anon)'
> > > > > > 277 'bpf_kfunc'
> > > > > > 4 'long
> > > > > > 3 'perf_aux_event'
> > > > > > 3 'workspace'
> > > > > > 2 'ata_acpi_gtm'
> > > > > > 2 'avc_cache_stats'
> > > > > > 2 'bh_accounting'
> > > > > > 2 'bp_cpuinfo'
> > > > > > 2 'bpf_fastcall'
> > > > > >
> > > > > > 'bpf_kfunc' is probably for decl_tags.
> > > > > > So I agree with you regarding the second binary search, it is not
> > > > > > necessary. But skipping all anonymous types (and thus having to
> > > > > > maintain nr_sorted_types) might be useful, on each search two
> > > > > > iterations would be wasted to skip those.
> > > >
> > > > fair enough, eliminating a big chunk of anonymous types is useful, let's do this
> > > >
> > > > >
> > > > > Thank you. After removing the redundant iterations, performance increased
> > > > > significantly compared with two iterations.
> > > > >
> > > > > Test Case: Locate all 58,719 named types in vmlinux BTF
> > > > > Methodology:
> > > > > ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
> > > > >
> > > > > Two iterations:
> > > > > | Condition | Lookup Time | Improvement |
> > > > > |--------------------|-------------|-------------|
> > > > > | Unsorted (Linear) | 17,282 ms | Baseline |
> > > > > | Sorted (Binary) | 19 ms | 909x faster |
> > > > >
> > > > > One iteration:
> > > > > Results:
> > > > > | Condition | Lookup Time | Improvement |
> > > > > |--------------------|-------------|-------------|
> > > > > | Unsorted (Linear) | 17,619 ms | Baseline |
> > > > > | Sorted (Binary) | 10 ms | 1762x faster |
> > > > >
> > > > > Here is the code implementation with a single iteration approach.
> > > > > I believe this scenario differs from find_linfo because we cannot
> > > > > determine in advance whether the specified type name will be found.
> > > > > Please correct me if I've misunderstood anything, and I welcome any
> > > > > guidance on this matter.
> > > > >
> > > > > static __s32 btf_find_type_by_name_bsearch(const struct btf *btf,
> > > > > const char *name,
> > > > > __s32 start_id)
> > > > > {
> > > > > const struct btf_type *t;
> > > > > const char *tname;
> > > > > __s32 l, r, m, lmost = -ENOENT;
> > > > > int ret;
> > > > >
> > > > > /* found the leftmost btf_type that matches */
> > > > > l = start_id;
> > > > > r = btf__type_cnt(btf) - 1;
> > > > > while (l <= r) {
> > > > > m = l + (r - l) / 2;
> > > > > t = btf_type_by_id(btf, m);
> > > > > if (!t->name_off) {
> > > > > ret = 1;
> > > > > } else {
> > > > > tname = btf__str_by_offset(btf, t->name_off);
> > > > > ret = !tname ? 1 : strcmp(tname, name);
> > > > > }
> > > > > if (ret < 0) {
> > > > > l = m + 1;
> > > > > } else {
> > > > > if (ret == 0)
> > > > > lmost = m;
> > > > > r = m - 1;
> > > > > }
> > > > > }
> > > > >
> > > > > return lmost;
> > > > > }
> > > >
> > > > There are different ways to implement this. At the highest level,
> > > > implementation below just searches for leftmost element that has name
> > > > >= the one we are searching for. One complication is that such element
> > > > might not even exist. We can solve that by checking ahead of time
> > > > whether the rightmost type satisfies the condition, or we could do
> > > > something similar to what I do in the loop below, where I allow l == r
> > > > and then if that element has name >= to what we search, we exit
> > > > because we found it. And if not, l will become larger than r, we'll
> > > > break out of the loop and we'll know that we couldn't find the
> > > > element. I haven't tested it, but please take a look and if you decide
> > > > to go with such approach, do test it for edge cases, of course.
> > > >
> > > > /*
> > > > * We are searching for the smallest r such that type #r's name is >= name.
> > > > * It might not exist, in which case we'll have l == r + 1.
> > > > */
> > > > l = start_id;
> > > > r = btf__type_cnt(btf) - 1;
> > > > while (l < r) {
> > > > m = l + (r - l) / 2;
> > > > t = btf_type_by_id(btf, m);
> > > > tname = btf__str_by_offset(btf, t->name_off);
> > > >
> > > > if (strcmp(tname, name) >= 0) {
> > > > if (l == r)
> > > > return r; /* found it! */
> > >
> > > It seems that this if condition will never hold, because a while(l < r) loop
> >
> > It should be `while (l <= r)`, I forgot to update it, but I mentioned
> > that I do want to allow l == r condition.
> >
> > > is used. Moreover, even if the condition were to hold, it wouldn't guarantee
> > > a successful search.
> >
> > Elaborate please on "wouldn't guarantee a successful search".
>
> I think a successful search is that we can successfully find the element that
> we want.
>
Ok, I never intended to find an exact match with that leftmost >=
element as a primitive.
> >
> > >
> > > > r = m;
> > > > } else {
> > > > l = m + 1;
> > > > }
> > > > }
> > > > /* here we know given element doesn't exist, return index beyond end of types */
> > > > return btf__type_cnt(btf);
> > >
> > > I think that return -ENOENT seems more reasonable.
> >
> > Think how you will be using this inside btf_find_type_by_name_kind():
> >
> >
> > int idx = btf_find_by_name_bsearch(btf, name);
> >
> > for (int n = btf__type_cnt(btf); idx < n; idx++) {
> > struct btf_type *t = btf__type_by_id(btf, idx);
> > const char *tname = btf__str_by_offset(btf, t->name_off);
> > if (strcmp(tname, name) != 0)
> > return -ENOENT;
> > if (btf_kind(t) == kind)
> > return idx;
> > }
> > return -ENOENT;
>
> Thanks, it seems cleaner.
ok, great
>
> >
> >
> > Having btf_find_by_name_bsearch() return -ENOENT instead of
> > btf__type_cnt() just will require extra explicit -ENOENT handling. And
> > given the function now can return "error", we'd need to either handle
> > other non-ENOENT errors, to at least leave comment that this should
> > never happen, though interface itself looks like it could.
> >
> > This is relatively minor and its all internal implementation, so we
> > can change that later. But I'm explaining my reasons for why I'd
> > return index of non-existing type after the end, just like you'd do
> > with pointer-based interfaces that return pointer after the last
> > element.
>
> Thanks, I see.
>
> >
> >
> > >
> > > >
> > > >
> > > > We could have checked instead whether strcmp(btf__str_by_offset(btf,
> > > > btf__type_by_id(btf, btf__type_cnt() - 1)->name_off), name) < 0 and
> > > > exit early. That's just a bit more code duplication of essentially
> > > > what we do inside the loop, so that if (l == r) seems fine to me, but
> > > > I'm not married to this.
> > >
> > > Sorry, I believe that even if strcmp(btf__str_by_offset(btf,
> > > btf__type_by_id(btf,
> > > btf__type_cnt() - 1)->name_off), name) >= 0, it still doesn't seem to
> > > guarantee that the search will definitely succeed.
> >
> > If the last element has >= name, search will definitely find at least
> > that element. What do you mean by "succeed"? All I care about here is
>
> Thank you. By "successful search," I mean finding the exact matching
> element we're looking for—not just the first element that meets the "≥"
> condition.
We don't have to find the exact match, just the leftmost >= element.
For a search by name+kind you will have to do a linear search *anyways*
and compare the name for every single potential candidate (except maybe
the very first one, as a micro-optimization and complication, if we had
an exact-matching leftmost element; but I don't care about that
complication). So the leftmost >= element is a universal "primitive" that
allows you to implement an exact by-name or exact by-name+kind search in
exactly the same fashion.
>
> Here's a concrete example to illustrate the issue:
>
> Base BTF contains: {"A", "C", "E", "F"}
> Split BTF contains: {"B", "D"}
> Target search: "D" in split BTF
>
> The current implementation recursively searches from the base BTF first.
> While "D" is lexicographically ≤ "F" (the last element in base BTF),
> "D" doesn't actually exist in the base BTF. When the binary search
> reaches the l == r condition, it returns the index of "E" instead.
>
> This requires an extra name comparison check after
> btf_find_by_name_bsearch returns, which could be avoided in the first
> loop iteration if the search directly identified exact matches.
See above, I think this is misguided. There is nothing wrong with
checking after bsearch returns a *candidate* index, and you cannot avoid
that for a name+kind search.
>
> int idx = btf_find_by_name_bsearch(btf, name);
>
> for (int n = btf__type_cnt(btf); idx < n; idx++) {
> struct btf_type *t = btf__type_by_id(btf, idx);
> const char *tname = btf__str_by_offset(btf, t->name_off);
> if (strcmp(tname, name) != 0) <<< This check is redundant on the first loop iteration
Yes, I think this is absolutely OK and acceptable. Are you worried
about the overhead of a single strcmp()? See below for notes on having a
single overall name and name+kind implementation using this approach.
> when a matching index is found
> return -ENOENT;
> if (btf_kind(t) == kind)
> return idx;
> }
> return -ENOENT;
>
> I tested this with a simple program searching for 3 in {0, 1, 2, 4, 5}:
>
> int main(int argc, char *argv[])
> {
> int values[] = {0, 1, 2, 4, 5};
> int to_find;
> int i;
>
> to_find = atoi(argv[1]);
>
> for (i = 0; i < ARRAY_SIZE(values); i++)
> printf("[%d] = %d\n", i , values[i]);
>
> printf("To Find %d\n", to_find);
>
> {
> int l, m, r;
>
> l = 0;
> r = ARRAY_SIZE(values) - 1;
>
> while (l <= r) {
> m = l + (r- l) / 2;
> if (values[m] >= to_find) {
> if (l == r) {
> printf("!!!! Found: [%d] ==> %d\n", r, values[r]);
> break;
> }
> r = m;
> } else {
> l = m + 1;
> }
> }
>
> printf("END: l: %d, r: %d\n", l, r);
> }
>
> return 0;
> }
>
> Output:
> [0] = 0
> [1] = 1
> [2] = 2
> [3] = 4
> [4] = 5
> To Find 3
> !!!! Found: [3] ==> 4
> END: l: 3, r: 3
>
> The search returns index 3 (value 4), which is the first value ≥ 3,
> but since 4 ≠ 3, it's not an exact match. Thus, the algorithm cannot
> guarantee a successful search for the exact element without additional
> checks.
It was never a goal to find an exact match. Yes, additional checks
after the search are necessary to confirm a name or name+kind match
(and the latter will have to check the name for every single item,
except maybe the first one if we had an exact-match "guarantee", but I
think this is absolutely unnecessary). And this is unavoidable for a
name+kind search. So instead of optimizing away one extra strcmp()
let's have a uniform implementation for both name and name+kind
searches. In fact, you can even have the same universal implementation
of both if you treat kind == 0 as "don't care about kind".
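To make the shape of that uniform implementation concrete, here is a
minimal standalone sketch using a plain name-sorted string array instead
of real BTF data; `find_leftmost_ge` and `find_by_name_kind` are
illustrative names, not libbpf API:

```c
#include <assert.h>
#include <string.h>

/* Name-sorted (name, kind) pairs standing in for sorted BTF types. */
static const char *names[] = { "cdev", "dentry", "dentry", "inode" };
static const int kinds[] = { 4, 4, 6, 4 }; /* made-up kind codes */
static const int cnt = 4;

/* Leftmost index whose name is >= target; returns cnt if none exists. */
static int find_leftmost_ge(const char *name)
{
	int l = 0, r = cnt - 1;

	while (l <= r) {
		int m = l + (r - l) / 2;

		if (strcmp(names[m], name) >= 0) {
			if (l == r)
				return r; /* smallest index with name >= target */
			r = m;
		} else {
			l = m + 1;
		}
	}
	return cnt; /* every name compares < target */
}

/* Unified lookup: kind == 0 means "don't care about kind". */
static int find_by_name_kind(const char *name, int kind)
{
	for (int idx = find_leftmost_ge(name); idx < cnt; idx++) {
		if (strcmp(names[idx], name) != 0)
			return -1; /* walked past the run of equal names */
		if (kind == 0 || kinds[idx] == kind)
			return idx;
	}
	return -1;
}
```

The by-name and by-name+kind cases share the same post-bsearch linear
scan; the single extra strcmp() on the first iteration is the cost of
keeping one implementation.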
>
> > that binary search loop doesn't loop forever and it returns correct
> > index (or detects that no element can be found).
> >
> > >
> > > >
> > > > >
> > > > > static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
> > > > > const char *type_name, __u32 kind)
> > > > > {
> > > > > const struct btf_type *t;
> > > > > const char *tname;
> > > > > int err = -ENOENT;
> > > > > __u32 total;
> > > > >
> > > > > if (!btf)
> > > > > goto out;
> > > > >
> > > > > if (start_id < btf->start_id) {
> > > > > err = btf_find_type_by_name_kind(btf->base_btf, start_id,
> > > > > type_name, kind);
> > > > > if (err == -ENOENT)
> > > > > start_id = btf->start_id;
> > > > > }
> > > > >
> > > > > if (err == -ENOENT) {
> > > > > if (btf_check_sorted((struct btf *)btf)) {
> > > > > /* binary search */
> > > > > bool skip_first;
> > > > > int ret;
> > > > >
> > > > > /* return the leftmost with matching names */
> > > > > ret = btf_find_type_by_name_bsearch(btf,
> > > > > type_name, start_id);
> > > > > if (ret < 0)
> > > > > goto out;
> > > > > /* skip kind checking */
> > > > > if (kind == -1)
> > > > > return ret;
> > > > > total = btf__type_cnt(btf);
> > > > > skip_first = true;
> > > > > do {
> > > > > t = btf_type_by_id(btf, ret);
> > > > > if (btf_kind(t) != kind) {
> > > > > if (skip_first) {
> > > > > skip_first = false;
> > > > > continue;
> > > > > }
> > > > > } else if (skip_first) {
> > > > > return ret;
> > > > > }
> > > > > if (!t->name_off)
> > > > > break;
> > > > > tname = btf__str_by_offset(btf, t->name_off);
> > > > > if (tname && !strcmp(tname, type_name))
> > > > > return ret;
> > > > > else
> > > > > break;
> > > > > } while (++ret < total);
> > > > > } else {
> > > > > /* linear search */
> > > > > ...
> > > > > }
> > > > > }
> > > > >
> > > > > out:
> > > > > return err;
> > > > > }
* Re: [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering
2025-11-07 2:36 ` Donglin Peng
@ 2025-11-07 17:43 ` Andrii Nakryiko
0 siblings, 0 replies; 53+ messages in thread
From: Andrii Nakryiko @ 2025-11-07 17:43 UTC (permalink / raw)
To: Donglin Peng
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin
On Thu, Nov 6, 2025 at 6:36 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
>
> On Thu, Nov 6, 2025 at 2:23 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Nov 4, 2025 at 5:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2025-11-04 at 17:04 -0800, Andrii Nakryiko wrote:
> > > > On Tue, Nov 4, 2025 at 4:16 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > >
> > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > >
> > > > > [...]
> > > > >
> > > > > > > +static int btf_permute_remap_type_id(__u32 *type_id, void *ctx)
> > > > > > > +{
> > > > > > > + struct btf_permute *p = ctx;
> > > > > > > + __u32 new_type_id = *type_id;
> > > > > > > +
> > > > > > > + /* skip references that point into the base BTF */
> > > > > > > + if (new_type_id < p->btf->start_id)
> > > > > > > + return 0;
> > > > > > > +
> > > > > > > + new_type_id = p->map[*type_id - p->btf->start_id];
> > > > > >
> > > > > > I'm actually confused, I thought p->ids would be the mapping from
> > > > > > original type ID (minus start_id, of course) to a new desired ID, but
> > > > > > it looks to be the other way? ids is a desired resulting *sequence* of
> > > > > > types identified by their original ID. I find it quite confusing. I
> > > > > > think about permutation as a mapping from original type ID to a new
> > > > > > type ID, am I confused?
> > > > >
> > > > > Yes, it is a desired sequence, not mapping.
> > > > > I guess its a bit simpler to use for sorting use-case, as you can just
> > > > > swap ids while sorting.
> > > >
> > > > The question is really what makes most sense as an interface. Because
> > > > for sorting cases it's just the matter of a two-line for() loop to
> > > > create ID mapping once types are sorted.
> > > >
> > > > I have slight preference for id_map approach because it is easy to
> > > > extend to the case of selectively dropping some types. We can just
> > > > define that such IDs should be mapped to zero. This will work as a
> > > > natural extension. With the desired end sequence of IDs, it's less
> > > > natural and will require more work to determine which IDs are missing
> > > > from the sequence.
> > > >
> > > > So unless there is some really good and strong reason, shall we go
> > > > with the ID mapping approach?
> > >
> > > If the interface is extended with types_cnt, as you suggest, deleting
> > > types is trivial with sequence interface as well. At-least the way it
> > > is implemented by this patch, you just copy elements from 'ids' one by
> > > one.
> >
> > But it is way less explicit and obvious way to delete element. With ID
> > map it is obvious, that type will be mapped to zero. With list of IDs,
> > you effectively search for elements that are missing, which IMO is way
> > less optimal an interface.
> >
> > So I still favor the ID map approach.
>
> Hi Andrii,
>
> I've submitted v5 implementing the sequence-based approach, and I plan
> to introduce the ID map approach in v6. However, I have a few remaining
> questions that need clarification:
>
> 1. ID Map Array Semantics:
>
> - When the ID map array specifies `[2] = 4`, does this indicate that
> the original type at `start_id + 2` should be remapped to position
> `start_id + 4`?
I'd say that 4 should be an "absolute type ID", for simplicity,
because that's what users work with. I'd say the position ([2]) should
also map to a type ID for the non-split case. So for base BTF I'd
require [0] = 0, i.e., the id_map count should be btf__type_cnt()
sized. (I can be convinced that's wrong and inconvenient.) For split
BTF the situation is of course more complicated, because requiring a
btf__type_cnt()-sized array for just the split BTF would be super
wasteful. So for split BTF, [2] would be, as you say, the 3rd type
within the split BTF, that is type #(btf__start_id() + 2), yes.
> Should the following mapping attempts be rejected:
> a) If the target index `4` exceeds the total number of types (`nr_types`)?
yes
> b) If multiple source types map to the same target location
> (e.g., both `[1] = 3`
> and `[2] = 3`)?
yes (at least for now, we can lift this if we ever have a good reason
by adding some option)
>
> - If [3] = 0, does this indicate that the type at start_id + 3 should
> be dropped?
yes, but let's not worry about deletion right now and just reject
this. I'd like to keep this option for the future, but right now we
should reject such case.
>
> - Does this also imply that the VOID type (ID 0) cannot be remapped
> and must always remain unchanged?
yes, it must always be zero; it's baked into BTF
>
>
> 2. ID Map Array Size:
>
> - Must the ID map array size <= the number of BTF types? If the array
> is smaller, should any missing types be automatically dropped?
no, it's an error, id_map size should match the number of types. For
base it should be btf__type_cnt(), for split BTF it should be
`btf__type_cnt() - btf__type_cnt(btf__base_btf(split_btf))`. (That's
one of the reasons I think we should have [0] = 0 for base, to keep
this consistent).
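Pulling the answers above together, the agreed-upon id_map rules can be sketched as a small validation helper. This is a hypothetical illustration, not the actual libbpf API: the name `id_map_is_valid`, its signature, and the assumption that `start_id == 0` covers VOID for base BTF are all mine; it checks that every target is in range, unique, non-zero (deletion rejected for now), and that VOID maps to itself.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical sketch of the id_map semantics discussed above:
 * id_map[i] is the new absolute type ID for type #(start_id + i),
 * and the array must cover all types from start_id up to type_cnt - 1.
 */
static bool id_map_is_valid(const unsigned int *id_map, unsigned int start_id,
			    unsigned int type_cnt)
{
	unsigned int n = type_cnt - start_id; /* number of id_map entries */
	bool *seen = calloc(type_cnt, sizeof(bool));
	bool ok = true;
	unsigned int i = 0;

	if (!seen)
		return false;

	if (start_id == 0) {
		/* VOID (ID 0) is baked into BTF and must map to itself */
		ok = (id_map[0] == 0);
		seen[0] = true;
		i = 1;
	}
	for (; ok && i < n; i++) {
		unsigned int tgt = id_map[i];

		/* reject out-of-range targets, mapping to VOID (deletion,
		 * rejected for now), and duplicate targets
		 */
		if (tgt == 0 || tgt < start_id || tgt >= type_cnt || seen[tgt])
			ok = false;
		else
			seen[tgt] = true;
	}
	free(seen);
	return ok;
}
```

For split BTF the same helper applies with `start_id = btf__start_id(split_btf)`, so targets can never point back into the base BTF.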
* Re: [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF
2025-11-07 17:01 ` Andrii Nakryiko
@ 2025-11-10 2:04 ` Donglin Peng
0 siblings, 0 replies; 53+ messages in thread
From: Donglin Peng @ 2025-11-10 2:04 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Eduard Zingerman, ast, linux-kernel, bpf, Alan Maguire, Song Liu,
pengdonglin, zhangxiaoqin
On Sat, Nov 8, 2025 at 1:01 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Nov 6, 2025 at 8:57 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
> >
> > On Fri, Nov 7, 2025 at 1:31 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Wed, Nov 5, 2025 at 11:49 PM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 6, 2025 at 2:11 AM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 5, 2025 at 5:48 AM Donglin Peng <dolinux.peng@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 5, 2025 at 9:17 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, 2025-11-04 at 16:54 -0800, Andrii Nakryiko wrote:
> > > > > > > > On Tue, Nov 4, 2025 at 4:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, 2025-11-04 at 16:11 -0800, Andrii Nakryiko wrote:
> > > > > > > > >
> > > > > > > > > [...]
> > > > > > > > >
> > > > > > > > > > > @@ -897,44 +903,134 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > > > > > > > > > > return type_id;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > -__s32 btf__find_by_name(const struct btf *btf, const char *type_name)
> > > > > > > > > > > +/*
> > > > > > > > > > > + * Find BTF types with matching names within the [left, right] index range.
> > > > > > > > > > > + * On success, updates *left and *right to the boundaries of the matching range
> > > > > > > > > > > + * and returns the leftmost matching index.
> > > > > > > > > > > + */
> > > > > > > > > > > +static __s32 btf_find_type_by_name_bsearch(const struct btf *btf, const char *name,
> > > > > > > > > > > + __s32 *left, __s32 *right)
> > > > > > > > > >
> > > > > > > > > > I thought we discussed this, why do you need "right"? Two binary
> > > > > > > > > > searches where one would do just fine.
> > > > > > > > >
> > > > > > > > > I think the idea is that there would be less strcmp's if there is a
> > > > > > > > > long sequence of items with identical names.
> > > > > > > >
> > > > > > > > Sure, it's a tradeoff. But how long is the set of duplicate name
> > > > > > > > entries we expect in kernel BTF? Additional O(logN) over 70K+ types
> > > > > > > > with high likelihood will take more comparisons.
> > > > > > >
> > > > > > > $ bpftool btf dump file vmlinux | grep '^\[' | awk '{print $3}' | sort | uniq -c | sort -k1nr | head
> > > > > > > 51737 '(anon)'
> > > > > > > 277 'bpf_kfunc'
> > > > > > > 4 'long
> > > > > > > 3 'perf_aux_event'
> > > > > > > 3 'workspace'
> > > > > > > 2 'ata_acpi_gtm'
> > > > > > > 2 'avc_cache_stats'
> > > > > > > 2 'bh_accounting'
> > > > > > > 2 'bp_cpuinfo'
> > > > > > > 2 'bpf_fastcall'
> > > > > > >
> > > > > > > 'bpf_kfunc' is probably for decl_tags.
> > > > > > > So I agree with you regarding the second binary search, it is not
> > > > > > > necessary. But skipping all anonymous types (and thus having to
> > > > > > > maintain nr_sorted_types) might be useful; otherwise, on each search two
> > > > > > > iterations would be wasted skipping those.
> > > > >
> > > > > fair enough, eliminating a big chunk of anonymous types is useful, let's do this
> > > > >
> > > > > >
> > > > > > Thank you. After removing the redundant iterations, performance increased
> > > > > > significantly compared with two iterations.
> > > > > >
> > > > > > Test Case: Locate all 58,719 named types in vmlinux BTF
> > > > > > Methodology:
> > > > > > ./vmtest.sh -- ./test_progs -t btf_permute/perf -v
> > > > > >
> > > > > > Two iterations:
> > > > > > | Condition | Lookup Time | Improvement |
> > > > > > |--------------------|-------------|-------------|
> > > > > > | Unsorted (Linear) | 17,282 ms | Baseline |
> > > > > > | Sorted (Binary) | 19 ms | 909x faster |
> > > > > >
> > > > > > One iteration:
> > > > > > | Condition | Lookup Time | Improvement |
> > > > > > |--------------------|-------------|-------------|
> > > > > > | Unsorted (Linear) | 17,619 ms | Baseline |
> > > > > > | Sorted (Binary) | 10 ms | 1762x faster |
> > > > > >
> > > > > > Here is the code implementation with a single iteration approach.
> > > > > > I believe this scenario differs from find_linfo because we cannot
> > > > > > determine in advance whether the specified type name will be found.
> > > > > > Please correct me if I've misunderstood anything, and I welcome any
> > > > > > guidance on this matter.
> > > > > >
> > > > > > static __s32 btf_find_type_by_name_bsearch(const struct btf *btf,
> > > > > > const char *name,
> > > > > > __s32 start_id)
> > > > > > {
> > > > > > const struct btf_type *t;
> > > > > > const char *tname;
> > > > > > __s32 l, r, m, lmost = -ENOENT;
> > > > > > int ret;
> > > > > >
> > > > > > /* find the leftmost btf_type that matches */
> > > > > > l = start_id;
> > > > > > r = btf__type_cnt(btf) - 1;
> > > > > > while (l <= r) {
> > > > > > m = l + (r - l) / 2;
> > > > > > t = btf_type_by_id(btf, m);
> > > > > > if (!t->name_off) {
> > > > > > ret = 1;
> > > > > > } else {
> > > > > > tname = btf__str_by_offset(btf, t->name_off);
> > > > > > ret = !tname ? 1 : strcmp(tname, name);
> > > > > > }
> > > > > > if (ret < 0) {
> > > > > > l = m + 1;
> > > > > > } else {
> > > > > > if (ret == 0)
> > > > > > lmost = m;
> > > > > > r = m - 1;
> > > > > > }
> > > > > > }
> > > > > >
> > > > > > return lmost;
> > > > > > }
> > > > >
> > > > > There are different ways to implement this. At the highest level,
> > > > > implementation below just searches for the leftmost element whose name is
> > > > > >= the one we are searching for. One complication is that such an element
> > > > > might not even exist. We can solve that by checking ahead of time
> > > > > whether the rightmost type satisfies the condition, or we could do
> > > > > something similar to what I do in the loop below, where I allow l == r
> > > > > and then if that element has a name >= what we search for, we exit
> > > > > because we found it. And if not, l will become larger than r, we'll
> > > > > break out of the loop and we'll know that we couldn't find the
> > > > > element. I haven't tested it, but please take a look and if you decide
> > > > > to go with such approach, do test it for edge cases, of course.
> > > > >
> > > > > /*
> > > > > * We are searching for the smallest r such that type #r's name is >= name.
> > > > > * It might not exist, in which case we'll have l == r + 1.
> > > > > */
> > > > > l = start_id;
> > > > > r = btf__type_cnt(btf) - 1;
> > > > > while (l < r) {
> > > > > m = l + (r - l) / 2;
> > > > > t = btf_type_by_id(btf, m);
> > > > > tname = btf__str_by_offset(btf, t->name_off);
> > > > >
> > > > > if (strcmp(tname, name) >= 0) {
> > > > > if (l == r)
> > > > > return r; /* found it! */
> > > >
> > > > It seems that this if condition will never hold, because a while(l < r) loop
> > >
> > > It should be `while (l <= r)`, I forgot to update it, but I mentioned
> > > that I do want to allow l == r condition.
> > >
> > > > is used. Moreover, even if the condition were to hold, it wouldn't guarantee
> > > > a successful search.
> > >
> > > Elaborate please on "wouldn't guarantee a successful search".
> >
> > I think a successful search is that we can successfully find the element that
> > we want.
> >
>
> Ok, I never intended to find exact match with that leftmost >= element
> as a primitive.
>
> > >
> > > >
> > > > > r = m;
> > > > > } else {
> > > > > l = m + 1;
> > > > > }
> > > > > }
> > > > > /* here we know given element doesn't exist, return index beyond end of types */
> > > > > return btf__type_cnt(btf);
> > > >
> > > > I think that return -ENOENT seems more reasonable.
> > >
> > > Think how you will be using this inside btf_find_type_by_name_kind():
> > >
> > >
> > > int idx = btf_find_by_name_bsearch(btf, name);
> > >
> > > for (int n = btf__type_cnt(btf); idx < n; idx++) {
> > > struct btf_type *t = btf__type_by_id(btf, idx);
> > > const char *tname = btf__str_by_offset(btf, t->name_off);
> > > if (strcmp(tname, name) != 0)
> > > return -ENOENT;
> > > if (btf_kind(t) == kind)
> > > return idx;
> > > }
> > > return -ENOENT;
> >
> > Thanks, it seems cleaner.
>
> ok, great
>
> >
> > >
> > >
> > > Having btf_find_by_name_bsearch() return -ENOENT instead of
> > > btf__type_cnt() just will require extra explicit -ENOENT handling. And
> > > given the function can now return an "error", we'd need to either handle
> > > other non-ENOENT errors, or at least leave a comment that this should
> > > never happen, even though the interface itself looks like it could.
> > >
> > > This is relatively minor and its all internal implementation, so we
> > > can change that later. But I'm explaining my reasons for why I'd
> > > return index of non-existing type after the end, just like you'd do
> > > with pointer-based interfaces that return pointer after the last
> > > element.
> >
> > Thanks, I see.
> >
> > >
> > >
> > > >
> > > > >
> > > > >
> > > > > We could have checked instead whether strcmp(btf__str_by_offset(btf,
> > > > > btf__type_by_id(btf, btf__type_cnt() - 1)->name_off), name) < 0 and
> > > > > exit early. That's just a bit more code duplication of essentially
> > > > > what we do inside the loop, so that if (l == r) seems fine to me, but
> > > > > I'm not married to this.
> > > >
> > > > Sorry, I believe that even if strcmp(btf__str_by_offset(btf,
> > > > btf__type_by_id(btf,
> > > > btf__type_cnt() - 1)->name_off), name) >= 0, it still doesn't seem to
> > > > guarantee that the search will definitely succeed.
> > >
> > > If the last element has >= name, search will definitely find at least
> > > that element. What do you mean by "succeed"? All I care about here is
> >
> > Thank you. By "successful search," I mean finding the exact matching
> > element we're looking for—not just the first element that meets the "≥"
> > condition.
>
> We don't have to find the exact match, just the leftmost >= element.
> For search by name+kind you will have to do linear search *anyways*
> and compare name for every single potential candidate (Except maybe
> the very first one as micro-optimization and complication, if we had
> exact matching leftmost element; but I don't care about that
> complication). So leftmost >= element is a universal "primitive" that
> allows you to implement exact by name or exact by name+kind search in
> exactly the same fashion.
>
> >
> > Here's a concrete example to illustrate the issue:
> >
> > Base BTF contains: {"A", "C", "E", "F"}
> > Split BTF contains: {"B", "D"}
> > Target search: "D" in split BTF
> >
> > The current implementation recursively searches from the base BTF first.
> > While "D" is lexicographically ≤ "F" (the last element in base BTF), "D" doesn't
> > actually exist in the base BTF. When the binary search reaches the l
> > == r condition,
> > it returns the index of "E" instead.
> >
> > This requires an extra name comparison check after btf_find_by_name_bsearch
> > returns, which could be avoided in the first loop iteration if the
> > search directly
> > identified exact matches.
>
> See above, I think this is misguided. There is nothing wrong with
> checking after bsearch returns *candidate* index, and you cannot avoid
> that for name+kind search.
>
> >
> > int idx = btf_find_by_name_bsearch(btf, name);
> >
> > for (int n = btf__type_cnt(btf); idx < n; idx++) {
> > struct btf_type *t = btf__type_by_id(btf, idx);
> > const char *tname = btf__str_by_offset(btf, t->name_off);
> > if (strcmp(tname, name) != 0) <<< This check is redundant on the first loop
> > iteration
>
> Yes, I think this is absolutely OK and acceptable. Are you worried
> about the overhead of a single strcmp()? See below for notes on having
> single overall name and name+kind implementation using this approach.
>
> > return -ENOENT;
> > if (btf_kind(t) == kind)
> > return idx;
> > }
> > return -ENOENT;
> >
> > I tested this with a simple program searching for 3 in {0, 1, 2, 4, 5}:
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
> >
> > int main(int argc, char *argv[])
> > {
> > int values[] = {0, 1, 2, 4, 5};
> > int to_find;
> > int i;
> >
> > to_find = atoi(argv[1]);
> >
> > for (i = 0; i < ARRAY_SIZE(values); i++)
> > printf("[%d] = %d\n", i , values[i]);
> >
> > printf("To Find %d\n", to_find);
> >
> > {
> > int l, m, r;
> >
> > l = 0;
> > r = ARRAY_SIZE(values) - 1;
> >
> > while (l <= r) {
> > m = l + (r - l) / 2;
> > if (values[m] >= to_find) {
> > if (l == r) {
> > printf("!!!! Found: [%d] ==>
> > %d\n", r, values[r]);
> > break;
> > }
> > r = m;
> > } else {
> > l = m + 1;
> > }
> > }
> >
> > printf("END: l: %d, r: %d\n", l, r);
> > }
> >
> > return 0;
> > }
> >
> > Output:
> > [0] = 0
> > [1] = 1
> > [2] = 2
> > [3] = 4
> > [4] = 5
> > To Find 3
> > !!!! Found: [3] ==> 4
> > END: l: 3, r: 3
> >
> > The search returns index 3 (value 4), which is the first value ≥ 3,
> > but since 4 ≠ 3,
> > it's not an exact match. Thus, the algorithm cannot guarantee a
> > successful search
> > for the exact element without additional checks.
>
> It was never a goal to find an exact match, yes, additional checks
> after the search is necessary to confirm name or name+kind match (and
> the latter will have to check name for every single item, except maybe
> the first one if we had exact match "guarantee", but I think this is
> absolutely unnecessary). And this is unavoidable for name+kind search.
> So instead of optimizing one extra strcmp() let's have uniform
> implementation for both name and name+kind searches. In fact, you can
> even have the same universal implementation of both if you treat kind
> == 0 as "don't care about kind".
Thanks, I'll apply this suggestion in the next version.
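The approach agreed on above (a leftmost ">=" primitive returning an index past the end on miss, followed by a linear scan that treats kind == 0 as "don't care") can be sketched over a plain sorted string array. This is a stand-in illustration, not the libbpf implementation; the helper names `find_leftmost_ge` and `find_by_name_kind` and the parallel `names`/`kinds` arrays are mine.

```c
#include <string.h>

/* Leftmost element whose name compares >= the target; returns cnt when no
 * such element exists, mirroring the "index beyond end of types" convention
 * suggested above.
 */
static int find_leftmost_ge(const char **names, int cnt, const char *name)
{
	int l = 0, r = cnt - 1;

	while (l <= r) {
		int m = l + (r - l) / 2;

		if (strcmp(names[m], name) >= 0) {
			if (l == r)
				return r; /* found the leftmost candidate */
			r = m;
		} else {
			l = m + 1;
		}
	}
	return cnt; /* every element compares < name */
}

/* Uniform exact-match search: kind == 0 means "don't care about kind", so
 * the same loop serves both name-only and name+kind lookups.
 */
static int find_by_name_kind(const char **names, const int *kinds, int cnt,
			     const char *name, int kind)
{
	for (int idx = find_leftmost_ge(names, cnt, name); idx < cnt; idx++) {
		if (strcmp(names[idx], name) != 0)
			return -1; /* walked past the run of equal names */
		if (kind == 0 || kinds[idx] == kind)
			return idx;
	}
	return -1;
}
```

With `names = {"A", "C", "E", "F"}` and the target "D" from the split-BTF example earlier in the thread, `find_leftmost_ge()` returns the index of "E", and the `strcmp()` at the top of the scan rejects it; that post-bsearch check is exactly the one being discussed, and it is unavoidable anyway for name+kind search.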
>
>
> >
> > > that binary search loop doesn't loop forever and it returns correct
> > > index (or detects that no element can be found).
> > >
> > > >
> > > > >
> > > > > >
> > > > > > static __s32 btf_find_type_by_name_kind(const struct btf *btf, int start_id,
> > > > > > const char *type_name, __u32 kind)
> > > > > > {
> > > > > > const struct btf_type *t;
> > > > > > const char *tname;
> > > > > > int err = -ENOENT;
> > > > > > __u32 total;
> > > > > >
> > > > > > if (!btf)
> > > > > > goto out;
> > > > > >
> > > > > > if (start_id < btf->start_id) {
> > > > > > err = btf_find_type_by_name_kind(btf->base_btf, start_id,
> > > > > > type_name, kind);
> > > > > > if (err == -ENOENT)
> > > > > > start_id = btf->start_id;
> > > > > > }
> > > > > >
> > > > > > if (err == -ENOENT) {
> > > > > > if (btf_check_sorted((struct btf *)btf)) {
> > > > > > /* binary search */
> > > > > > bool skip_first;
> > > > > > int ret;
> > > > > >
> > > > > > /* return the leftmost type with a matching name */
> > > > > > ret = btf_find_type_by_name_bsearch(btf,
> > > > > > type_name, start_id);
> > > > > > if (ret < 0)
> > > > > > goto out;
> > > > > > /* skip kind checking */
> > > > > > if (kind == -1)
> > > > > > return ret;
> > > > > > total = btf__type_cnt(btf);
> > > > > > skip_first = true;
> > > > > > do {
> > > > > > t = btf_type_by_id(btf, ret);
> > > > > > if (btf_kind(t) != kind) {
> > > > > > if (skip_first) {
> > > > > > skip_first = false;
> > > > > > continue;
> > > > > > }
> > > > > > } else if (skip_first) {
> > > > > > return ret;
> > > > > > }
> > > > > > if (!t->name_off)
> > > > > > break;
> > > > > > tname = btf__str_by_offset(btf, t->name_off);
> > > > > > if (tname && !strcmp(tname, type_name))
> > > > > > return ret;
> > > > > > else
> > > > > > break;
> > > > > > } while (++ret < total);
> > > > > > } else {
> > > > > > /* linear search */
> > > > > > ...
> > > > > > }
> > > > > > }
> > > > > >
> > > > > > out:
> > > > > > return err;
> > > > > > }
Thread overview: 53+ messages
2025-11-04 13:40 [RFC PATCH v4 0/7] libbpf: BTF performance optimizations with permutation and binary search Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 1/7] libbpf: Extract BTF type remapping logic into helper function Donglin Peng
2025-11-04 23:16 ` Eduard Zingerman
2025-11-05 0:11 ` Andrii Nakryiko
2025-11-05 0:36 ` Eduard Zingerman
2025-11-05 0:57 ` Andrii Nakryiko
2025-11-05 1:23 ` Eduard Zingerman
2025-11-05 18:20 ` Andrii Nakryiko
2025-11-05 19:41 ` Eduard Zingerman
2025-11-06 17:09 ` Andrii Nakryiko
2025-11-04 13:40 ` [RFC PATCH v4 2/7] libbpf: Add BTF permutation support for type reordering Donglin Peng
2025-11-04 23:45 ` Eduard Zingerman
2025-11-05 11:31 ` Donglin Peng
2025-11-05 0:11 ` Andrii Nakryiko
2025-11-05 0:16 ` Eduard Zingerman
2025-11-05 1:04 ` Andrii Nakryiko
2025-11-05 1:20 ` Eduard Zingerman
2025-11-05 13:19 ` Donglin Peng
2025-11-05 18:32 ` Andrii Nakryiko
2025-11-05 18:23 ` Andrii Nakryiko
2025-11-05 19:23 ` Eduard Zingerman
2025-11-06 17:21 ` Andrii Nakryiko
2025-11-07 2:36 ` Donglin Peng
2025-11-07 17:43 ` Andrii Nakryiko
2025-11-05 12:52 ` Donglin Peng
2025-11-05 18:29 ` Andrii Nakryiko
2025-11-06 7:31 ` Donglin Peng
2025-11-06 17:12 ` Andrii Nakryiko
2025-11-07 1:39 ` Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 3/7] libbpf: Optimize type lookup with binary search for sorted BTF Donglin Peng
2025-11-04 14:15 ` bot+bpf-ci
2025-11-05 0:06 ` Eduard Zingerman
2025-11-05 0:11 ` Andrii Nakryiko
2025-11-05 0:19 ` Eduard Zingerman
2025-11-05 0:54 ` Andrii Nakryiko
2025-11-05 1:17 ` Eduard Zingerman
2025-11-05 13:48 ` Donglin Peng
2025-11-05 16:52 ` Eduard Zingerman
2025-11-06 6:10 ` Donglin Peng
2025-11-05 18:11 ` Andrii Nakryiko
2025-11-06 7:49 ` Donglin Peng
2025-11-06 17:31 ` Andrii Nakryiko
2025-11-07 4:57 ` Donglin Peng
2025-11-07 17:01 ` Andrii Nakryiko
2025-11-10 2:04 ` Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 4/7] libbpf: Implement lazy sorting validation for binary search optimization Donglin Peng
2025-11-05 0:29 ` Eduard Zingerman
2025-11-04 13:40 ` [RFC PATCH v4 5/7] btf: Optimize type lookup with binary search Donglin Peng
2025-11-04 17:14 ` Alexei Starovoitov
2025-11-05 13:22 ` Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 6/7] btf: Add lazy sorting validation for " Donglin Peng
2025-11-04 13:40 ` [RFC PATCH v4 7/7] selftests/bpf: Add test cases for btf__permute functionality Donglin Peng
2025-11-05 0:41 ` Eduard Zingerman