* [PATCH v2 bpf-next 1/8] bpf: Check KF_bpf_rbtree_add_impl for the "case KF_ARG_PTR_TO_RB_NODE"
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
In a later patch, two new kfuncs will take the bpf_rb_node pointer arg.
	struct bpf_rb_node *bpf_rbtree_left(struct bpf_rb_root *root,
					    struct bpf_rb_node *node);
	struct bpf_rb_node *bpf_rbtree_right(struct bpf_rb_root *root,
					     struct bpf_rb_node *node);
In check_kfunc_call(), there is a "case KF_ARG_PTR_TO_RB_NODE"
that checks whether the reg->type should be an allocated pointer or
a non_owning_ref.

A later patch will need to ensure that the bpf_rb_node pointer passed
to the new bpf_rbtree_{left,right} is a non_owning_ref. This
is the same requirement as for the existing bpf_rbtree_remove.
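
For illustration, a minimal sketch of the two argument states these
checks distinguish (the rbtree setup and the less() callback are
assumed; only the kfunc names come from the existing API):

	n = bpf_obj_new(typeof(*n));	/* owning, allocated pointer */
	if (!n)
		return 0;

	bpf_spin_lock(&glock);
	/* bpf_rbtree_add needs the allocated (owning) pointer.
	 * After the call, n becomes a non_owning_ref.
	 */
	bpf_rbtree_add(&groot, &n->node, less);
	/* bpf_rbtree_remove, and later bpf_rbtree_{left,right},
	 * take the non_owning_ref.
	 */
	res = bpf_rbtree_remove(&groot, &n->node);
	bpf_spin_unlock(&glock);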
This patch swaps the branches of the current "if else" statement.
Instead of checking for bpf_rbtree_remove, it checks for bpf_rbtree_add.
The new bpf_rbtree_{left,right} will then fall into the "else" case,
which makes the later patch simpler. bpf_rbtree_add should be the
only kfunc that needs an allocated pointer.

This is a no-op change, considering there are only two kfuncs
taking a bpf_rb_node pointer arg: rbtree_add and rbtree_remove.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
kernel/bpf/verifier.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 54c6953a8b84..2e1ce7debc16 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13200,22 +13200,22 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
return ret;
break;
case KF_ARG_PTR_TO_RB_NODE:
- if (meta->func_id == special_kfunc_list[KF_bpf_rbtree_remove]) {
- if (!type_is_non_owning_ref(reg->type) || reg->ref_obj_id) {
- verbose(env, "rbtree_remove node input must be non-owning ref\n");
+ if (meta->func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
+ if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
+ verbose(env, "arg#%d expected pointer to allocated object\n", i);
return -EINVAL;
}
- if (in_rbtree_lock_required_cb(env)) {
- verbose(env, "rbtree_remove not allowed in rbtree cb\n");
+ if (!reg->ref_obj_id) {
+ verbose(env, "allocated object must be referenced\n");
return -EINVAL;
}
} else {
- if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
- verbose(env, "arg#%d expected pointer to allocated object\n", i);
+ if (!type_is_non_owning_ref(reg->type) || reg->ref_obj_id) {
+ verbose(env, "rbtree_remove node input must be non-owning ref\n");
return -EINVAL;
}
- if (!reg->ref_obj_id) {
- verbose(env, "allocated object must be referenced\n");
+ if (in_rbtree_lock_required_cb(env)) {
+ verbose(env, "rbtree_remove not allowed in rbtree cb\n");
return -EINVAL;
}
}
--
2.47.1
* [PATCH v2 bpf-next 2/8] bpf: Simplify reg0 marking for the rbtree kfuncs that return a bpf_rb_node pointer
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
The current rbtree kfuncs, bpf_rbtree_{first,remove}, return a
bpf_rb_node pointer. check_kfunc_call currently checks the
kfunc btf_id instead of the return pointer type to decide
whether it needs to do mark_reg_graph_node(reg0) and ref_set_non_owning(reg0).

A later patch will add bpf_rbtree_{root,left,right}, which also
return a bpf_rb_node pointer. Instead of adding more kfunc btf_id
checks to the "if" case, this patch changes the test to check the
kfunc's return type. A new is_rbtree_node_type() function tests
whether a pointer type is a bpf_rb_node. The callers have already
skipped the modifiers of the pointer type.

A note on ref_set_non_owning(): although bpf_rbtree_remove()
also returns a bpf_rb_node pointer, bpf_rbtree_remove()
has the KF_ACQUIRE flag, so its reg0 will not become non-owning.
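
For illustration, a minimal sketch of the two return semantics under
a held spinlock (the rbtree setup is assumed; only the kfunc names
come from the existing API):

	struct bpf_rb_node *first, *removed;

	bpf_spin_lock(&glock);
	/* Non-owning: valid only while the lock is held and must
	 * not be bpf_obj_drop()ed.
	 */
	first = bpf_rbtree_first(&groot);
	if (first)
		/* Owning (KF_ACQUIRE): the caller takes ownership and
		 * must eventually bpf_obj_drop() the containing object.
		 */
		removed = bpf_rbtree_remove(&groot, first);
	bpf_spin_unlock(&glock);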
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
kernel/bpf/verifier.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2e1ce7debc16..bf14da00f09a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11987,6 +11987,11 @@ static bool is_kfunc_arg_res_spin_lock(const struct btf *btf, const struct btf_p
return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RES_SPIN_LOCK_ID);
}
+static bool is_rbtree_node_type(const struct btf_type *t)
+{
+ return t == btf_type_by_id(btf_vmlinux, kf_arg_btf_ids[KF_ARG_RB_NODE_ID]);
+}
+
static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf,
const struct btf_param *arg)
{
@@ -13750,8 +13755,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
struct btf_field *field = meta.arg_list_head.field;
mark_reg_graph_node(regs, BPF_REG_0, &field->graph_root);
- } else if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_remove] ||
- meta.func_id == special_kfunc_list[KF_bpf_rbtree_first]) {
+ } else if (is_rbtree_node_type(ptr_type)) {
struct btf_field *field = meta.arg_rbtree_root.field;
mark_reg_graph_node(regs, BPF_REG_0, &field->graph_root);
@@ -13881,7 +13885,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
if (is_kfunc_ret_null(&meta))
regs[BPF_REG_0].id = id;
regs[BPF_REG_0].ref_obj_id = id;
- } else if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_first]) {
+ } else if (is_rbtree_node_type(ptr_type)) {
ref_set_non_owning(env, &regs[BPF_REG_0]);
}
--
2.47.1
* [PATCH v2 bpf-next 3/8] bpf: Add bpf_rbtree_{root,left,right} kfunc
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
A bpf fq implementation that is much closer to the kernel fq
needs to traverse the rbtree:

https://lore.kernel.org/bpf/20250418224652.105998-13-martin.lau@linux.dev/

The much simplified logic that uses bpf_rbtree_{root,left,right}
to traverse the rbtree looks like:
	struct fq_flow {
		struct bpf_rb_node fq_node;
		struct bpf_rb_node rate_node;
		struct bpf_refcount refcount;
		unsigned long sk_long;
	};

	struct fq_flow_root {
		struct bpf_spin_lock lock;
		struct bpf_rb_root root __contains(fq_flow, fq_node);
	};

	struct fq_flow *fq_classify(...)
	{
		struct bpf_rb_node *tofree[FQ_GC_MAX];
		struct fq_flow_root *root;
		struct fq_flow *gc_f, *f;
		struct bpf_rb_node *p;
		int i, fcnt = 0;

		/* ... */

		f = NULL;
		bpf_spin_lock(&root->lock);
		p = bpf_rbtree_root(&root->root);
		while (can_loop) {
			if (!p)
				break;

			gc_f = bpf_rb_entry(p, struct fq_flow, fq_node);
			if (gc_f->sk_long == sk_long) {
				f = bpf_refcount_acquire(gc_f);
				break;
			}

			/* To be removed from the rbtree */
			if (fcnt < FQ_GC_MAX && fq_gc_candidate(gc_f, jiffies_now))
				tofree[fcnt++] = p;

			if (gc_f->sk_long > sk_long)
				p = bpf_rbtree_left(&root->root, p);
			else
				p = bpf_rbtree_right(&root->root, p);
		}

		/* remove from the rbtree */
		for (i = 0; i < fcnt; i++) {
			p = tofree[i];
			tofree[i] = bpf_rbtree_remove(&root->root, p);
		}

		bpf_spin_unlock(&root->lock);

		/* bpf_obj_drop the fq_flow(s) that have just been removed
		 * from the rbtree.
		 */
		for (i = 0; i < fcnt; i++) {
			p = tofree[i];
			if (p) {
				gc_f = bpf_rb_entry(p, struct fq_flow, fq_node);
				bpf_obj_drop(gc_f);
			}
		}

		return f;
	}
The above simplified code needs to traverse the rbtree for two purposes:
1) find the flow with the desired sk_long value, and
2) while searching for the sk_long, collect flows that are
   fq_gc_candidates, so they can be removed from the rbtree.

This patch adds the bpf_rbtree_{root,left,right} kfuncs to enable
rbtree traversal. The returned bpf_rb_node pointer will be a
non-owning reference, the same as the returned pointer
of the existing bpf_rbtree_first kfunc.
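
For illustration, a minimal sketch of the NULL-return semantics a
caller has to handle (variable names are hypothetical; the kfunc
names come from this patch):

	/* Under bpf_spin_lock(&root->lock): */
	p = bpf_rbtree_root(&root->root);	/* NULL if the tree is empty */
	if (p)
		/* NULL if p has no left child, or if p is no longer
		 * owned by this root.
		 */
		p = bpf_rbtree_left(&root->root, p);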
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
kernel/bpf/helpers.c | 30 ++++++++++++++++++++++++++++++
kernel/bpf/verifier.c | 22 ++++++++++++++++++----
2 files changed, 48 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index e3a2662f4e33..36150d340c16 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2366,6 +2366,33 @@ __bpf_kfunc struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root)
return (struct bpf_rb_node *)rb_first_cached(r);
}
+__bpf_kfunc struct bpf_rb_node *bpf_rbtree_root(struct bpf_rb_root *root)
+{
+ struct rb_root_cached *r = (struct rb_root_cached *)root;
+
+ return (struct bpf_rb_node *)r->rb_root.rb_node;
+}
+
+__bpf_kfunc struct bpf_rb_node *bpf_rbtree_left(struct bpf_rb_root *root, struct bpf_rb_node *node)
+{
+ struct bpf_rb_node_kern *node_internal = (struct bpf_rb_node_kern *)node;
+
+ if (READ_ONCE(node_internal->owner) != root)
+ return NULL;
+
+ return (struct bpf_rb_node *)node_internal->rb_node.rb_left;
+}
+
+__bpf_kfunc struct bpf_rb_node *bpf_rbtree_right(struct bpf_rb_root *root, struct bpf_rb_node *node)
+{
+ struct bpf_rb_node_kern *node_internal = (struct bpf_rb_node_kern *)node;
+
+ if (READ_ONCE(node_internal->owner) != root)
+ return NULL;
+
+ return (struct bpf_rb_node *)node_internal->rb_node.rb_right;
+}
+
/**
* bpf_task_acquire - Acquire a reference to a task. A task acquired by this
* kfunc which is not stored in a map as a kptr, must be released by calling
@@ -3214,6 +3241,9 @@ BTF_ID_FLAGS(func, bpf_task_release, KF_RELEASE)
BTF_ID_FLAGS(func, bpf_rbtree_remove, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_rbtree_add_impl)
BTF_ID_FLAGS(func, bpf_rbtree_first, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_rbtree_root, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_rbtree_left, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_rbtree_right, KF_RET_NULL)
#ifdef CONFIG_CGROUPS
BTF_ID_FLAGS(func, bpf_cgroup_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bf14da00f09a..51a17e64a0a9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12081,6 +12081,9 @@ enum special_kfunc_type {
KF_bpf_rbtree_remove,
KF_bpf_rbtree_add_impl,
KF_bpf_rbtree_first,
+ KF_bpf_rbtree_root,
+ KF_bpf_rbtree_left,
+ KF_bpf_rbtree_right,
KF_bpf_dynptr_from_skb,
KF_bpf_dynptr_from_xdp,
KF_bpf_dynptr_slice,
@@ -12121,6 +12124,9 @@ BTF_ID(func, bpf_rdonly_cast)
BTF_ID(func, bpf_rbtree_remove)
BTF_ID(func, bpf_rbtree_add_impl)
BTF_ID(func, bpf_rbtree_first)
+BTF_ID(func, bpf_rbtree_root)
+BTF_ID(func, bpf_rbtree_left)
+BTF_ID(func, bpf_rbtree_right)
#ifdef CONFIG_NET
BTF_ID(func, bpf_dynptr_from_skb)
BTF_ID(func, bpf_dynptr_from_xdp)
@@ -12156,6 +12162,9 @@ BTF_ID(func, bpf_rcu_read_unlock)
BTF_ID(func, bpf_rbtree_remove)
BTF_ID(func, bpf_rbtree_add_impl)
BTF_ID(func, bpf_rbtree_first)
+BTF_ID(func, bpf_rbtree_root)
+BTF_ID(func, bpf_rbtree_left)
+BTF_ID(func, bpf_rbtree_right)
#ifdef CONFIG_NET
BTF_ID(func, bpf_dynptr_from_skb)
BTF_ID(func, bpf_dynptr_from_xdp)
@@ -12591,7 +12600,10 @@ static bool is_bpf_rbtree_api_kfunc(u32 btf_id)
{
return btf_id == special_kfunc_list[KF_bpf_rbtree_add_impl] ||
btf_id == special_kfunc_list[KF_bpf_rbtree_remove] ||
- btf_id == special_kfunc_list[KF_bpf_rbtree_first];
+ btf_id == special_kfunc_list[KF_bpf_rbtree_first] ||
+ btf_id == special_kfunc_list[KF_bpf_rbtree_root] ||
+ btf_id == special_kfunc_list[KF_bpf_rbtree_left] ||
+ btf_id == special_kfunc_list[KF_bpf_rbtree_right];
}
static bool is_bpf_iter_num_api_kfunc(u32 btf_id)
@@ -12691,7 +12703,9 @@ static bool check_kfunc_is_graph_node_api(struct bpf_verifier_env *env,
break;
case BPF_RB_NODE:
ret = (kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_remove] ||
- kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_add_impl]);
+ kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_add_impl] ||
+ kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_left] ||
+ kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_right]);
break;
default:
verbose(env, "verifier internal error: unexpected graph node argument type %s\n",
@@ -13216,11 +13230,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
}
} else {
if (!type_is_non_owning_ref(reg->type) || reg->ref_obj_id) {
- verbose(env, "rbtree_remove node input must be non-owning ref\n");
+ verbose(env, "%s node input must be non-owning ref\n", func_name);
return -EINVAL;
}
if (in_rbtree_lock_required_cb(env)) {
- verbose(env, "rbtree_remove not allowed in rbtree cb\n");
+ verbose(env, "%s not allowed in rbtree cb\n", func_name);
return -EINVAL;
}
}
--
2.47.1
* [PATCH v2 bpf-next 4/8] bpf: Allow refcounted bpf_rb_node used in bpf_rbtree_{remove,left,right}
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
bpf_rbtree_{remove,left,right} require the root's lock to be held.
They also check that the node is still owned by that root
(node_internal->owner) before proceeding, so it is safe to allow
a refcounted bpf_rb_node pointer to be used in these kfuncs.

In a bpf fq implementation which is much closer to the kernel fq,
https://lore.kernel.org/bpf/20250418224652.105998-13-martin.lau@linux.dev/,
a networking flow (allocated by bpf_obj_new) can be added to two different
rbtrees. There are cases where the flow is searched in one rbtree,
a refcount on the flow is acquired, and the flow is then removed
from another rbtree:
	struct fq_flow {
		struct bpf_rb_node fq_node;
		struct bpf_rb_node rate_node;
		struct bpf_refcount refcount;
		unsigned long sk_long;
	};

	int bpf_fq_enqueue(...)
	{
		/* ... */

		bpf_spin_lock(&root->lock);
		while (can_loop) {
			/* ... */
			if (!p)
				break;

			gc_f = bpf_rb_entry(p, struct fq_flow, fq_node);
			if (gc_f->sk_long == sk_long) {
				f = bpf_refcount_acquire(gc_f);
				break;
			}

			/* ... */
		}
		bpf_spin_unlock(&root->lock);

		if (f) {
			bpf_spin_lock(&q->lock);
			bpf_rbtree_remove(&q->delayed, &f->rate_node);
			bpf_spin_unlock(&q->lock);
		}
	}
bpf_rbtree_{left,right} do not need this change, but they are relaxed
together with bpf_rbtree_remove instead of adding extra verifier logic
to exclude these kfuncs.
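
For illustration, a minimal sketch of passing a refcounted node
pointer to bpf_rbtree_left (the setup follows the fq_flow example
above; the variable names are hypothetical):

	/* f was bpf_refcount_acquire()d earlier, possibly under a
	 * different rbtree's lock, so &f->fq_node is a refcounted
	 * pointer rather than a non-owning ref.
	 */
	bpf_spin_lock(&root->lock);
	/* The kfunc itself re-checks that f's fq_node is still
	 * owned by root->root and returns NULL otherwise.
	 */
	p = bpf_rbtree_left(&root->root, &f->fq_node);
	bpf_spin_unlock(&root->lock);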
To avoid bisect failure, this patch also changes the selftests.
The "rbtree_api_remove_unadded_node" test no longer expects a verifier
error; it now expects bpf_rbtree_remove(&groot, &m->node) to return
NULL and uses __retval(0) to ensure this NULL return value.
Some of the "only take non-owning..." failure messages are also changed.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
kernel/bpf/verifier.c | 4 +--
.../testing/selftests/bpf/progs/rbtree_fail.c | 29 ++++++++++---------
2 files changed, 17 insertions(+), 16 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 51a17e64a0a9..9093a351b0b3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13229,8 +13229,8 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
return -EINVAL;
}
} else {
- if (!type_is_non_owning_ref(reg->type) || reg->ref_obj_id) {
- verbose(env, "%s node input must be non-owning ref\n", func_name);
+ if (!type_is_non_owning_ref(reg->type) && !reg->ref_obj_id) {
+ verbose(env, "%s can only take non-owning or refcounted bpf_rb_node pointer\n", func_name);
return -EINVAL;
}
if (in_rbtree_lock_required_cb(env)) {
diff --git a/tools/testing/selftests/bpf/progs/rbtree_fail.c b/tools/testing/selftests/bpf/progs/rbtree_fail.c
index dbd5eee8e25e..4acb6af2dfe3 100644
--- a/tools/testing/selftests/bpf/progs/rbtree_fail.c
+++ b/tools/testing/selftests/bpf/progs/rbtree_fail.c
@@ -69,11 +69,11 @@ long rbtree_api_nolock_first(void *ctx)
}
SEC("?tc")
-__failure __msg("rbtree_remove node input must be non-owning ref")
+__retval(0)
long rbtree_api_remove_unadded_node(void *ctx)
{
struct node_data *n, *m;
- struct bpf_rb_node *res;
+ struct bpf_rb_node *res_n, *res_m;
n = bpf_obj_new(typeof(*n));
if (!n)
@@ -88,19 +88,20 @@ long rbtree_api_remove_unadded_node(void *ctx)
bpf_spin_lock(&glock);
bpf_rbtree_add(&groot, &n->node, less);
- /* This remove should pass verifier */
- res = bpf_rbtree_remove(&groot, &n->node);
- n = container_of(res, struct node_data, node);
+ res_n = bpf_rbtree_remove(&groot, &n->node);
- /* This remove shouldn't, m isn't in an rbtree */
- res = bpf_rbtree_remove(&groot, &m->node);
- m = container_of(res, struct node_data, node);
+ res_m = bpf_rbtree_remove(&groot, &m->node);
bpf_spin_unlock(&glock);
- if (n)
- bpf_obj_drop(n);
- if (m)
- bpf_obj_drop(m);
+ bpf_obj_drop(m);
+ if (res_n)
+ bpf_obj_drop(container_of(res_n, struct node_data, node));
+ if (res_m) {
+ bpf_obj_drop(container_of(res_m, struct node_data, node));
+ /* m was not added to the rbtree */
+ return 2;
+ }
+
return 0;
}
@@ -178,7 +179,7 @@ long rbtree_api_use_unchecked_remove_retval(void *ctx)
}
SEC("?tc")
-__failure __msg("rbtree_remove node input must be non-owning ref")
+__failure __msg("bpf_rbtree_remove can only take non-owning or refcounted bpf_rb_node pointer")
long rbtree_api_add_release_unlock_escape(void *ctx)
{
struct node_data *n;
@@ -202,7 +203,7 @@ long rbtree_api_add_release_unlock_escape(void *ctx)
}
SEC("?tc")
-__failure __msg("rbtree_remove node input must be non-owning ref")
+__failure __msg("bpf_rbtree_remove can only take non-owning or refcounted bpf_rb_node pointer")
long rbtree_api_first_release_unlock_escape(void *ctx)
{
struct bpf_rb_node *res;
--
2.47.1
* [PATCH v2 bpf-next 5/8] selftests/bpf: Add tests for bpf_rbtree_{root,left,right}
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
This patch adds a much simplified rbtree usage, modeled after the
kernel sch_fq qdisc. It has a "struct node_data" that can be
added to two different rbtrees, each ordered by different keys.

The test first populates both rbtrees. It then searches for a lookup_key
in the "groot0" rbtree. Once the lookup_key is found, a refcount
on that node is taken. The node is then removed from the other "groot1"
rbtree.

While searching for the lookup_key, the test also tries to remove
all rbnodes in the path leading to the lookup_key.

The test_{root,left,right}_spinlock_true tests ensure that the
return value of the bpf_rbtree functions is a non_own_ref node pointer.
This is done by forcing a verifier error by calling the helper
bpf_jiffies64() while holding the spinlock. The tests then
check for the verifier message
"call bpf_rbtree...R0=rcu_ptr_or_null_node..."

The other test_{root,left,right}_spinlock_false tests ensure that
they must be called with the spinlock held.
Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> # Check non_own_ref marking
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
.../testing/selftests/bpf/prog_tests/rbtree.c | 6 +
.../selftests/bpf/progs/rbtree_search.c | 206 ++++++++++++++++++
2 files changed, 212 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/rbtree_search.c
diff --git a/tools/testing/selftests/bpf/prog_tests/rbtree.c b/tools/testing/selftests/bpf/prog_tests/rbtree.c
index 9818f06c97c5..d8f3d7a45fe9 100644
--- a/tools/testing/selftests/bpf/prog_tests/rbtree.c
+++ b/tools/testing/selftests/bpf/prog_tests/rbtree.c
@@ -8,6 +8,7 @@
#include "rbtree_fail.skel.h"
#include "rbtree_btf_fail__wrong_node_type.skel.h"
#include "rbtree_btf_fail__add_wrong_type.skel.h"
+#include "rbtree_search.skel.h"
static void test_rbtree_add_nodes(void)
{
@@ -187,3 +188,8 @@ void test_rbtree_fail(void)
{
RUN_TESTS(rbtree_fail);
}
+
+void test_rbtree_search(void)
+{
+ RUN_TESTS(rbtree_search);
+}
diff --git a/tools/testing/selftests/bpf/progs/rbtree_search.c b/tools/testing/selftests/bpf/progs/rbtree_search.c
new file mode 100644
index 000000000000..098ef970fac1
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/rbtree_search.c
@@ -0,0 +1,206 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+struct node_data {
+ struct bpf_refcount ref;
+ struct bpf_rb_node r0;
+ struct bpf_rb_node r1;
+ int key0;
+ int key1;
+};
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+private(A) struct bpf_spin_lock glock0;
+private(A) struct bpf_rb_root groot0 __contains(node_data, r0);
+
+private(B) struct bpf_spin_lock glock1;
+private(B) struct bpf_rb_root groot1 __contains(node_data, r1);
+
+#define rb_entry(ptr, type, member) container_of(ptr, type, member)
+#define NR_NODES 16
+
+int zero = 0;
+
+static bool less0(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+ struct node_data *node_a;
+ struct node_data *node_b;
+
+ node_a = rb_entry(a, struct node_data, r0);
+ node_b = rb_entry(b, struct node_data, r0);
+
+ return node_a->key0 < node_b->key0;
+}
+
+static bool less1(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+ struct node_data *node_a;
+ struct node_data *node_b;
+
+ node_a = rb_entry(a, struct node_data, r1);
+ node_b = rb_entry(b, struct node_data, r1);
+
+ return node_a->key1 < node_b->key1;
+}
+
+SEC("syscall")
+__retval(0)
+long rbtree_search(void *ctx)
+{
+ struct bpf_rb_node *rb_n, *rb_m, *gc_ns[NR_NODES];
+ long lookup_key = NR_NODES / 2;
+ struct node_data *n, *m;
+ int i, nr_gc = 0;
+
+ for (i = zero; i < NR_NODES && can_loop; i++) {
+ n = bpf_obj_new(typeof(*n));
+ if (!n)
+ return __LINE__;
+
+ m = bpf_refcount_acquire(n);
+
+ n->key0 = i;
+ m->key1 = i;
+
+ bpf_spin_lock(&glock0);
+ bpf_rbtree_add(&groot0, &n->r0, less0);
+ bpf_spin_unlock(&glock0);
+
+ bpf_spin_lock(&glock1);
+ bpf_rbtree_add(&groot1, &m->r1, less1);
+ bpf_spin_unlock(&glock1);
+ }
+
+ n = NULL;
+ bpf_spin_lock(&glock0);
+ rb_n = bpf_rbtree_root(&groot0);
+ while (can_loop) {
+ if (!rb_n) {
+ bpf_spin_unlock(&glock0);
+ return __LINE__;
+ }
+
+ n = rb_entry(rb_n, struct node_data, r0);
+ if (lookup_key == n->key0)
+ break;
+ if (nr_gc < NR_NODES)
+ gc_ns[nr_gc++] = rb_n;
+ if (lookup_key < n->key0)
+ rb_n = bpf_rbtree_left(&groot0, rb_n);
+ else
+ rb_n = bpf_rbtree_right(&groot0, rb_n);
+ }
+
+ if (!n || lookup_key != n->key0) {
+ bpf_spin_unlock(&glock0);
+ return __LINE__;
+ }
+
+ for (i = 0; i < nr_gc; i++) {
+ rb_n = gc_ns[i];
+ gc_ns[i] = bpf_rbtree_remove(&groot0, rb_n);
+ }
+
+ m = bpf_refcount_acquire(n);
+ bpf_spin_unlock(&glock0);
+
+ for (i = 0; i < nr_gc; i++) {
+ rb_n = gc_ns[i];
+ if (rb_n) {
+ n = rb_entry(rb_n, struct node_data, r0);
+ bpf_obj_drop(n);
+ }
+ }
+
+ if (!m)
+ return __LINE__;
+
+ bpf_spin_lock(&glock1);
+ rb_m = bpf_rbtree_remove(&groot1, &m->r1);
+ bpf_spin_unlock(&glock1);
+ bpf_obj_drop(m);
+ if (!rb_m)
+ return __LINE__;
+ bpf_obj_drop(rb_entry(rb_m, struct node_data, r1));
+
+ return 0;
+}
+
+#define TEST_ROOT(dolock) \
+SEC("syscall") \
+__failure __msg(MSG) \
+long test_root_spinlock_##dolock(void *ctx) \
+{ \
+ struct bpf_rb_node *rb_n; \
+ __u64 jiffies = 0; \
+ \
+ if (dolock) \
+ bpf_spin_lock(&glock0); \
+ rb_n = bpf_rbtree_root(&groot0); \
+ if (rb_n) \
+ jiffies = bpf_jiffies64(); \
+ if (dolock) \
+ bpf_spin_unlock(&glock0); \
+ \
+ return !!jiffies; \
+}
+
+#define TEST_LR(op, dolock) \
+SEC("syscall") \
+__failure __msg(MSG) \
+long test_##op##_spinlock_##dolock(void *ctx) \
+{ \
+ struct bpf_rb_node *rb_n; \
+ struct node_data *n; \
+ __u64 jiffies = 0; \
+ \
+ bpf_spin_lock(&glock0); \
+ rb_n = bpf_rbtree_root(&groot0); \
+ if (!rb_n) { \
+ bpf_spin_unlock(&glock0); \
+ return 1; \
+ } \
+ n = rb_entry(rb_n, struct node_data, r0); \
+ n = bpf_refcount_acquire(n); \
+ bpf_spin_unlock(&glock0); \
+ if (!n) \
+ return 1; \
+ \
+ if (dolock) \
+ bpf_spin_lock(&glock0); \
+ rb_n = bpf_rbtree_##op(&groot0, &n->r0); \
+ if (rb_n) \
+ jiffies = bpf_jiffies64(); \
+ if (dolock) \
+ bpf_spin_unlock(&glock0); \
+ \
+ return !!jiffies; \
+}
+
+/*
+ * Use a separate MSG macro instead of passing to TEST_XXX(..., MSG)
+ * to ensure the message itself is not in the bpf prog lineinfo
+ * which the verifier includes in its log.
+ * Otherwise, the test_loader will incorrectly match the prog lineinfo
+ * instead of the log generated by the verifier.
+ */
+#define MSG "call bpf_rbtree_root{{.+}}; R0{{(_w)?}}=rcu_ptr_or_null_node_data(id={{[0-9]+}},non_own_ref"
+TEST_ROOT(true)
+#undef MSG
+#define MSG "call bpf_rbtree_{{(left|right).+}}; R0{{(_w)?}}=rcu_ptr_or_null_node_data(id={{[0-9]+}},non_own_ref"
+TEST_LR(left, true)
+TEST_LR(right, true)
+#undef MSG
+
+#define MSG "bpf_spin_lock at off=0 must be held for bpf_rb_root"
+TEST_ROOT(false)
+TEST_LR(left, false)
+TEST_LR(right, false)
+#undef MSG
+
+char _license[] SEC("license") = "GPL";
--
2.47.1
* [PATCH v2 bpf-next 6/8] bpf: Simplify reg0 marking for the list kfuncs that return a bpf_list_node pointer
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
The next patch will add bpf_list_{front,back} kfuncs to peek at the head
and tail of a list. Both of them will return a 'struct bpf_list_node *'.

Following the earlier change for rbtree, this patch checks whether the
returned btf type is a 'struct bpf_list_node' pointer instead
of checking each kfunc individually to decide if
mark_reg_graph_node should be called. This will make
the bpf_list_{front,back} kfunc addition easier in
the later patch.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
kernel/bpf/verifier.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9093a351b0b3..acb2f44316cc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11992,6 +11992,11 @@ static bool is_rbtree_node_type(const struct btf_type *t)
return t == btf_type_by_id(btf_vmlinux, kf_arg_btf_ids[KF_ARG_RB_NODE_ID]);
}
+static bool is_list_node_type(const struct btf_type *t)
+{
+ return t == btf_type_by_id(btf_vmlinux, kf_arg_btf_ids[KF_ARG_LIST_NODE_ID]);
+}
+
static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf,
const struct btf_param *arg)
{
@@ -13764,8 +13769,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
insn_aux->kptr_struct_meta =
btf_find_struct_meta(meta.arg_btf,
meta.arg_btf_id);
- } else if (meta.func_id == special_kfunc_list[KF_bpf_list_pop_front] ||
- meta.func_id == special_kfunc_list[KF_bpf_list_pop_back]) {
+ } else if (is_list_node_type(ptr_type)) {
struct btf_field *field = meta.arg_list_head.field;
mark_reg_graph_node(regs, BPF_REG_0, &field->graph_root);
--
2.47.1
* [PATCH v2 bpf-next 7/8] bpf: Add bpf_list_{front,back} kfunc
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
The kernel fq qdisc implementation often only needs to look at
the fields of the first node in a list but does not always
need to remove it from the list. It is more convenient to have
a peek kfunc for the list, which works similarly to bpf_rbtree_first().

This patch adds the bpf_list_{front,back} kfuncs. The verifier is
changed such that a kfunc returning "struct bpf_list_node *" will be
marked as non-owning, with the exception of KF_ACQUIRE kfuncs. The
net effect is that only the new bpf_list_{front,back} kfuncs will
have their return pointers marked as non-owning.
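
For illustration, a minimal peek sketch under the usual spinlock
pattern (the list setup with __contains() is assumed; only the kfunc
names come from this patch):

	bpf_spin_lock(&glock);
	l_n = bpf_list_front(&ghead);	/* NULL if the list is empty */
	if (l_n) {
		n = container_of(l_n, struct node_data, l);
		/* n is non-owning: read its fields here, but do not
		 * bpf_obj_drop() it and do not use it after unlock.
		 */
	}
	bpf_spin_unlock(&glock);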
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
kernel/bpf/helpers.c | 22 ++++++++++++++++++++++
kernel/bpf/verifier.c | 12 ++++++++++--
2 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 36150d340c16..78cefb41266a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2293,6 +2293,26 @@ __bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
return __bpf_list_del(head, true);
}
+__bpf_kfunc struct bpf_list_node *bpf_list_front(struct bpf_list_head *head)
+{
+ struct list_head *h = (struct list_head *)head;
+
+ if (list_empty(h) || unlikely(!h->next))
+ return NULL;
+
+ return (struct bpf_list_node *)h->next;
+}
+
+__bpf_kfunc struct bpf_list_node *bpf_list_back(struct bpf_list_head *head)
+{
+ struct list_head *h = (struct list_head *)head;
+
+ if (list_empty(h) || unlikely(!h->next))
+ return NULL;
+
+ return (struct bpf_list_node *)h->prev;
+}
+
__bpf_kfunc struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root,
struct bpf_rb_node *node)
{
@@ -3236,6 +3256,8 @@ BTF_ID_FLAGS(func, bpf_list_push_front_impl)
BTF_ID_FLAGS(func, bpf_list_push_back_impl)
BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_list_front, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_list_back, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_task_release, KF_RELEASE)
BTF_ID_FLAGS(func, bpf_rbtree_remove, KF_ACQUIRE | KF_RET_NULL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index acb2f44316cc..99aa2c890e7b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12079,6 +12079,8 @@ enum special_kfunc_type {
KF_bpf_list_push_back_impl,
KF_bpf_list_pop_front,
KF_bpf_list_pop_back,
+ KF_bpf_list_front,
+ KF_bpf_list_back,
KF_bpf_cast_to_kern_ctx,
KF_bpf_rdonly_cast,
KF_bpf_rcu_read_lock,
@@ -12124,6 +12126,8 @@ BTF_ID(func, bpf_list_push_front_impl)
BTF_ID(func, bpf_list_push_back_impl)
BTF_ID(func, bpf_list_pop_front)
BTF_ID(func, bpf_list_pop_back)
+BTF_ID(func, bpf_list_front)
+BTF_ID(func, bpf_list_back)
BTF_ID(func, bpf_cast_to_kern_ctx)
BTF_ID(func, bpf_rdonly_cast)
BTF_ID(func, bpf_rbtree_remove)
@@ -12160,6 +12164,8 @@ BTF_ID(func, bpf_list_push_front_impl)
BTF_ID(func, bpf_list_push_back_impl)
BTF_ID(func, bpf_list_pop_front)
BTF_ID(func, bpf_list_pop_back)
+BTF_ID(func, bpf_list_front)
+BTF_ID(func, bpf_list_back)
BTF_ID(func, bpf_cast_to_kern_ctx)
BTF_ID(func, bpf_rdonly_cast)
BTF_ID(func, bpf_rcu_read_lock)
@@ -12598,7 +12604,9 @@ static bool is_bpf_list_api_kfunc(u32 btf_id)
return btf_id == special_kfunc_list[KF_bpf_list_push_front_impl] ||
btf_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
btf_id == special_kfunc_list[KF_bpf_list_pop_front] ||
- btf_id == special_kfunc_list[KF_bpf_list_pop_back];
+ btf_id == special_kfunc_list[KF_bpf_list_pop_back] ||
+ btf_id == special_kfunc_list[KF_bpf_list_front] ||
+ btf_id == special_kfunc_list[KF_bpf_list_back];
}
static bool is_bpf_rbtree_api_kfunc(u32 btf_id)
@@ -13903,7 +13911,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
if (is_kfunc_ret_null(&meta))
regs[BPF_REG_0].id = id;
regs[BPF_REG_0].ref_obj_id = id;
- } else if (is_rbtree_node_type(ptr_type)) {
+ } else if (is_rbtree_node_type(ptr_type) || is_list_node_type(ptr_type)) {
ref_set_non_owning(env, &regs[BPF_REG_0]);
}
--
2.47.1
* [PATCH v2 bpf-next 8/8] selftests/bpf: Add test for bpf_list_{front,back}
From: Martin KaFai Lau @ 2025-05-06 1:58 UTC
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Kumar Kartikeya Dwivedi, Amery Hung, netdev, kernel-team
From: Martin KaFai Lau <martin.lau@kernel.org>
This patch adds the "list_peek" test to use the new
bpf_list_{front,back} kfunc.
The test_{front,back}* tests ensure that the return value
is a non_own_ref node pointer and requires the spinlock to be held.
Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> # check non_own_ref marking
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
.../selftests/bpf/prog_tests/linked_list.c | 6 +
.../selftests/bpf/progs/linked_list_peek.c | 113 ++++++++++++++++++
2 files changed, 119 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/linked_list_peek.c
diff --git a/tools/testing/selftests/bpf/prog_tests/linked_list.c b/tools/testing/selftests/bpf/prog_tests/linked_list.c
index 77d07e0a4a55..5266c7022863 100644
--- a/tools/testing/selftests/bpf/prog_tests/linked_list.c
+++ b/tools/testing/selftests/bpf/prog_tests/linked_list.c
@@ -7,6 +7,7 @@
#include "linked_list.skel.h"
#include "linked_list_fail.skel.h"
+#include "linked_list_peek.skel.h"
static char log_buf[1024 * 1024];
@@ -805,3 +806,8 @@ void test_linked_list(void)
test_linked_list_success(LIST_IN_LIST, true);
test_linked_list_success(TEST_ALL, false);
}
+
+void test_linked_list_peek(void)
+{
+ RUN_TESTS(linked_list_peek);
+}
diff --git a/tools/testing/selftests/bpf/progs/linked_list_peek.c b/tools/testing/selftests/bpf/progs/linked_list_peek.c
new file mode 100644
index 000000000000..264e81bfb287
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/linked_list_peek.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+struct node_data {
+ struct bpf_list_node l;
+ int key;
+};
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+private(A) struct bpf_spin_lock glock;
+private(A) struct bpf_list_head ghead __contains(node_data, l);
+
+#define list_entry(ptr, type, member) container_of(ptr, type, member)
+#define NR_NODES 16
+
+int zero = 0;
+
+SEC("syscall")
+__retval(0)
+long list_peek(void *ctx)
+{
+ struct bpf_list_node *l_n;
+ struct node_data *n;
+ int i, err = 0;
+
+ bpf_spin_lock(&glock);
+ l_n = bpf_list_front(&ghead);
+ bpf_spin_unlock(&glock);
+ if (l_n)
+ return __LINE__;
+
+ bpf_spin_lock(&glock);
+ l_n = bpf_list_back(&ghead);
+ bpf_spin_unlock(&glock);
+ if (l_n)
+ return __LINE__;
+
+ for (i = zero; i < NR_NODES && can_loop; i++) {
+ n = bpf_obj_new(typeof(*n));
+ if (!n)
+ return __LINE__;
+ n->key = i;
+ bpf_spin_lock(&glock);
+ bpf_list_push_back(&ghead, &n->l);
+ bpf_spin_unlock(&glock);
+ }
+
+ bpf_spin_lock(&glock);
+
+ l_n = bpf_list_front(&ghead);
+ if (!l_n) {
+ err = __LINE__;
+ goto done;
+ }
+
+ n = list_entry(l_n, struct node_data, l);
+ if (n->key != 0) {
+ err = __LINE__;
+ goto done;
+ }
+
+ l_n = bpf_list_back(&ghead);
+ if (!l_n) {
+ err = __LINE__;
+ goto done;
+ }
+
+ n = list_entry(l_n, struct node_data, l);
+ if (n->key != NR_NODES - 1) {
+ err = __LINE__;
+ goto done;
+ }
+
+done:
+ bpf_spin_unlock(&glock);
+ return err;
+}
+
+#define TEST_FB(op, dolock) \
+SEC("syscall") \
+__failure __msg(MSG) \
+long test_##op##_spinlock_##dolock(void *ctx) \
+{ \
+ struct bpf_list_node *l_n; \
+ __u64 jiffies = 0; \
+ \
+ if (dolock) \
+ bpf_spin_lock(&glock); \
+ l_n = bpf_list_##op(&ghead); \
+ if (l_n) \
+ jiffies = bpf_jiffies64(); \
+ if (dolock) \
+ bpf_spin_unlock(&glock); \
+ \
+ return !!jiffies; \
+}
+
+#define MSG "call bpf_list_{{(front|back).+}}; R0{{(_w)?}}=ptr_or_null_node_data(id={{[0-9]+}},non_own_ref"
+TEST_FB(front, true)
+TEST_FB(back, true)
+#undef MSG
+
+#define MSG "bpf_spin_lock at off=0 must be held for bpf_list_head"
+TEST_FB(front, false)
+TEST_FB(back, false)
+#undef MSG
+
+char _license[] SEC("license") = "GPL";
--
2.47.1
* Re: [PATCH v2 bpf-next 0/8] bpf: Support bpf rbtree traversal and list peeking
From: patchwork-bot+netdevbpf @ 2025-05-06 17:30 UTC
To: Martin KaFai Lau
Cc: bpf, ast, andrii, daniel, memxor, ameryhung, netdev, kernel-team
Hello:
This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Mon, 5 May 2025 18:58:47 -0700 you wrote:
> From: Martin KaFai Lau <martin.lau@kernel.org>
>
> The RFC v1 [1] showed a fq qdisc implementation in bpf
> that is much closer to the kernel sch_fq.c.
>
> The fq example and bpf qdisc changes are separated out from this set.
> This set is to focus on the kfunc and verifier changes that
> enable the bpf rbtree traversal and list peeking.
>
> [...]
Here is the summary with links:
- [v2,bpf-next,1/8] bpf: Check KF_bpf_rbtree_add_impl for the "case KF_ARG_PTR_TO_RB_NODE"
https://git.kernel.org/bpf/bpf-next/c/b183c0123d9b
- [v2,bpf-next,2/8] bpf: Simplify reg0 marking for the rbtree kfuncs that return a bpf_rb_node pointer
https://git.kernel.org/bpf/bpf-next/c/7faccdf4b47d
- [v2,bpf-next,3/8] bpf: Add bpf_rbtree_{root,left,right} kfunc
https://git.kernel.org/bpf/bpf-next/c/9e3e66c553f7
- [v2,bpf-next,4/8] bpf: Allow refcounted bpf_rb_node used in bpf_rbtree_{remove,left,right}
https://git.kernel.org/bpf/bpf-next/c/2ddef1783c43
- [v2,bpf-next,5/8] selftests/bpf: Add tests for bpf_rbtree_{root,left,right}
https://git.kernel.org/bpf/bpf-next/c/47ada65c5cf9
- [v2,bpf-next,6/8] bpf: Simplify reg0 marking for the list kfuncs that return a bpf_list_node pointer
https://git.kernel.org/bpf/bpf-next/c/3fab84f00d32
- [v2,bpf-next,7/8] bpf: Add bpf_list_{front,back} kfunc
https://git.kernel.org/bpf/bpf-next/c/fb5b480205ba
- [v2,bpf-next,8/8] selftests/bpf: Add test for bpf_list_{front,back}
https://git.kernel.org/bpf/bpf-next/c/29318b4d5dc3
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html