* [PATCH v7 1/5] bpf: allow calling bpf_kptr_xchg while holding a lock
2026-02-14 4:36 [PATCH v7 0/5] bpf: Expand the usage scenarios of bpf_kptr_xchg Chengkaitao
@ 2026-02-14 4:36 ` Chengkaitao
2026-02-14 4:36 ` [PATCH v7 2/5] bpf: allow using bpf_kptr_xchg even if the NON_OWN_REF flag is set Chengkaitao
2026-02-14 4:36 ` [PATCH v7 3/5] selftests/bpf: Add supplementary tests for bpf_kptr_xchg Chengkaitao
2 siblings, 0 replies; 4+ messages in thread
From: Chengkaitao @ 2026-02-14 4:36 UTC (permalink / raw)
To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah, yangfeng,
alexei.starovoitov
Cc: linux-kernel, bpf, linux-kselftest, Kaitao Cheng
From: Kaitao Cheng <chengkaitao@kylinos.cn>
For the following scenario:
struct tree_node {
struct bpf_rb_node node;
struct request __kptr *req;
u64 key;
};
struct bpf_rb_root tree_root __contains(tree_node, node);
struct bpf_spin_lock tree_lock;
If we need to traverse all nodes in the rbtree, retrieve the __kptr
pointer from each node, and read kernel data from the referenced
object, using bpf_kptr_xchg appears unavoidable.
This patch skips the BPF verifier checks for bpf_kptr_xchg when
called while holding a lock.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
kernel/bpf/verifier.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index edf5342b982f..2927a4622ff8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21044,7 +21044,8 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
if (env->cur_state->active_locks) {
if ((insn->src_reg == BPF_REG_0 &&
- insn->imm != BPF_FUNC_spin_unlock) ||
+ insn->imm != BPF_FUNC_spin_unlock &&
+ insn->imm != BPF_FUNC_kptr_xchg) ||
(insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
(insn->off != 0 || !kfunc_spin_allowed(insn->imm)))) {
verbose(env,
--
2.50.1 (Apple Git-155)
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH v7 2/5] bpf: allow using bpf_kptr_xchg even if the NON_OWN_REF flag is set
2026-02-14 4:36 [PATCH v7 0/5] bpf: Expand the usage scenarios of bpf_kptr_xchg Chengkaitao
2026-02-14 4:36 ` [PATCH v7 1/5] bpf: allow calling bpf_kptr_xchg while holding a lock Chengkaitao
@ 2026-02-14 4:36 ` Chengkaitao
2026-02-14 4:36 ` [PATCH v7 3/5] selftests/bpf: Add supplementary tests for bpf_kptr_xchg Chengkaitao
2 siblings, 0 replies; 4+ messages in thread
From: Chengkaitao @ 2026-02-14 4:36 UTC (permalink / raw)
To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah, yangfeng,
alexei.starovoitov
Cc: linux-kernel, bpf, linux-kselftest, Kaitao Cheng
From: Kaitao Cheng <chengkaitao@kylinos.cn>
When traversing an rbtree using bpf_rbtree_left/right, if bpf_kptr_xchg
is used to access the __kptr pointer contained in a node, it currently
requires first removing the node with bpf_rbtree_remove and clearing the
NON_OWN_REF flag, then re-adding the node to the original rbtree with
bpf_rbtree_add after usage. This process significantly degrades rbtree
traversal performance. The patch enables accessing __kptr pointers with
the NON_OWN_REF flag set while holding the lock, eliminating the need
for this remove-read-add sequence.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
---
kernel/bpf/verifier.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2927a4622ff8..3536a91ff8c7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9329,7 +9329,8 @@ static const struct bpf_reg_types timer_types = { .types = { PTR_TO_MAP_VALUE }
static const struct bpf_reg_types kptr_xchg_dest_types = {
.types = {
PTR_TO_MAP_VALUE,
- PTR_TO_BTF_ID | MEM_ALLOC
+ PTR_TO_BTF_ID | MEM_ALLOC,
+ PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF,
}
};
static const struct bpf_reg_types dynptr_types = {
@@ -9489,6 +9490,7 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
}
case PTR_TO_BTF_ID | MEM_ALLOC:
case PTR_TO_BTF_ID | MEM_PERCPU | MEM_ALLOC:
+ case PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF:
if (meta->func_id != BPF_FUNC_spin_lock && meta->func_id != BPF_FUNC_spin_unlock &&
meta->func_id != BPF_FUNC_kptr_xchg) {
verifier_bug(env, "unimplemented handling of MEM_ALLOC");
--
2.50.1 (Apple Git-155)
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH v7 3/5] selftests/bpf: Add supplementary tests for bpf_kptr_xchg
2026-02-14 4:36 [PATCH v7 0/5] bpf: Expand the usage scenarios of bpf_kptr_xchg Chengkaitao
2026-02-14 4:36 ` [PATCH v7 1/5] bpf: allow calling bpf_kptr_xchg while holding a lock Chengkaitao
2026-02-14 4:36 ` [PATCH v7 2/5] bpf: allow using bpf_kptr_xchg even if the NON_OWN_REF flag is set Chengkaitao
@ 2026-02-14 4:36 ` Chengkaitao
2 siblings, 0 replies; 4+ messages in thread
From: Chengkaitao @ 2026-02-14 4:36 UTC (permalink / raw)
To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah, yangfeng,
alexei.starovoitov
Cc: linux-kernel, bpf, linux-kselftest, Kaitao Cheng
From: Kaitao Cheng <chengkaitao@kylinos.cn>
1. Allow using bpf_kptr_xchg while holding a lock.
2. When the rb_node contains a __kptr pointer, we do not need to
perform a remove-read-add operation.
This patch implements the following workflow:
1. Construct a rbtree with 16 elements.
2. Traverse the rbtree, locate the kptr pointer in the target node,
and read the content pointed to by the pointer.
3. Remove all nodes from the rbtree.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
---
.../testing/selftests/bpf/prog_tests/rbtree.c | 6 +
tools/testing/selftests/bpf/progs/bpf_misc.h | 4 +
.../selftests/bpf/progs/rbtree_search_kptr.c | 167 ++++++++++++++++++
3 files changed, 177 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/rbtree_search_kptr.c
diff --git a/tools/testing/selftests/bpf/prog_tests/rbtree.c b/tools/testing/selftests/bpf/prog_tests/rbtree.c
index d8f3d7a45fe9..a854fb38e418 100644
--- a/tools/testing/selftests/bpf/prog_tests/rbtree.c
+++ b/tools/testing/selftests/bpf/prog_tests/rbtree.c
@@ -9,6 +9,7 @@
#include "rbtree_btf_fail__wrong_node_type.skel.h"
#include "rbtree_btf_fail__add_wrong_type.skel.h"
#include "rbtree_search.skel.h"
+#include "rbtree_search_kptr.skel.h"
static void test_rbtree_add_nodes(void)
{
@@ -193,3 +194,8 @@ void test_rbtree_search(void)
{
RUN_TESTS(rbtree_search);
}
+
+void test_rbtree_search_kptr(void)
+{
+ RUN_TESTS(rbtree_search_kptr);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index c9bfbe1bafc1..0904fe14ad1d 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -188,6 +188,10 @@
#define POINTER_VALUE 0xbadcafe
#define TEST_DATA_LEN 64
+#ifndef __aligned
+#define __aligned(x) __attribute__((aligned(x)))
+#endif
+
#ifndef __used
#define __used __attribute__((used))
#endif
diff --git a/tools/testing/selftests/bpf/progs/rbtree_search_kptr.c b/tools/testing/selftests/bpf/progs/rbtree_search_kptr.c
new file mode 100644
index 000000000000..069fc64b0167
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/rbtree_search_kptr.c
@@ -0,0 +1,167 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 KylinSoft Corporation. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+#define NR_NODES 16
+
+struct node_data {
+ int data;
+};
+
+struct tree_node {
+ struct bpf_rb_node node;
+ u64 key;
+ struct node_data __kptr * node_data;
+};
+
+#define private(name) SEC(".data." #name) __hidden __aligned(8)
+
+private(A) struct bpf_rb_root root __contains(tree_node, node);
+private(A) struct bpf_spin_lock lock;
+
+static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+ struct tree_node *node_a, *node_b;
+
+ node_a = container_of(a, struct tree_node, node);
+ node_b = container_of(b, struct tree_node, node);
+
+ return node_a->key < node_b->key;
+}
+
+SEC("syscall")
+__retval(0)
+long rbtree_search_kptr(void *ctx)
+{
+ struct tree_node *tnode;
+ struct bpf_rb_node *rb_n;
+ struct node_data __kptr * node_data;
+ int lookup_key = NR_NODES / 2;
+ int lookup_data = NR_NODES / 2;
+ int i, data, ret = 0;
+
+ for (i = 0; i < NR_NODES && can_loop; i++) {
+ tnode = bpf_obj_new(typeof(*tnode));
+ if (!tnode)
+ return __LINE__;
+
+ node_data = bpf_obj_new(typeof(*node_data));
+ if (!node_data) {
+ bpf_obj_drop(tnode);
+ return __LINE__;
+ }
+
+ tnode->key = i;
+ node_data->data = i;
+
+ node_data = bpf_kptr_xchg(&tnode->node_data, node_data);
+ if (node_data)
+ bpf_obj_drop(node_data);
+
+ bpf_spin_lock(&lock);
+ bpf_rbtree_add(&root, &tnode->node, less);
+ bpf_spin_unlock(&lock);
+ }
+
+ bpf_spin_lock(&lock);
+ rb_n = bpf_rbtree_root(&root);
+ while (rb_n && can_loop) {
+ tnode = container_of(rb_n, struct tree_node, node);
+ node_data = bpf_kptr_xchg(&tnode->node_data, NULL);
+ if (!node_data) {
+ ret = __LINE__;
+ goto fail;
+ }
+
+ data = node_data->data;
+ node_data = bpf_kptr_xchg(&tnode->node_data, node_data);
+ if (node_data) {
+ bpf_spin_unlock(&lock);
+ bpf_obj_drop(node_data);
+ return __LINE__;
+ }
+
+ if (lookup_key == tnode->key) {
+ if (data == lookup_data)
+ break;
+
+ ret = __LINE__;
+ goto fail;
+ }
+
+ if (lookup_key < tnode->key)
+ rb_n = bpf_rbtree_left(&root, rb_n);
+ else
+ rb_n = bpf_rbtree_right(&root, rb_n);
+ }
+ bpf_spin_unlock(&lock);
+
+ while (can_loop) {
+ bpf_spin_lock(&lock);
+ rb_n = bpf_rbtree_first(&root);
+ if (!rb_n) {
+ bpf_spin_unlock(&lock);
+ return 0;
+ }
+
+ rb_n = bpf_rbtree_remove(&root, rb_n);
+ if (!rb_n) {
+ ret = __LINE__;
+ goto fail;
+ }
+ bpf_spin_unlock(&lock);
+
+ tnode = container_of(rb_n, struct tree_node, node);
+
+ node_data = bpf_kptr_xchg(&tnode->node_data, NULL);
+ if (node_data)
+ bpf_obj_drop(node_data);
+
+ bpf_obj_drop(tnode);
+ }
+
+ return 0;
+fail:
+ bpf_spin_unlock(&lock);
+ return ret;
+}
+
+
+SEC("syscall")
+__failure __msg("R1 type=scalar expected=map_value, ptr_, ptr_")
+long non_own_ref_kptr_xchg_no_lock(void *ctx)
+{
+ struct tree_node *tnode;
+ struct bpf_rb_node *rb_n;
+ struct node_data __kptr * node_data;
+ int data;
+
+ bpf_spin_lock(&lock);
+ rb_n = bpf_rbtree_first(&root);
+ if (!rb_n) {
+ bpf_spin_unlock(&lock);
+ return __LINE__;
+ }
+ bpf_spin_unlock(&lock);
+
+ tnode = container_of(rb_n, struct tree_node, node);
+ node_data = bpf_kptr_xchg(&tnode->node_data, NULL);
+ if (!node_data)
+ return __LINE__;
+
+ data = node_data->data;
+ if (data < 0)
+ return __LINE__;
+
+ node_data = bpf_kptr_xchg(&tnode->node_data, node_data);
+ if (node_data)
+ return __LINE__;
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
--
2.50.1 (Apple Git-155)
^ permalink raw reply related [flat|nested] 4+ messages in thread