* [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs
@ 2026-04-27 16:58 Kaitao cheng
2026-04-27 16:58 ` [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer Kaitao cheng
` (7 more replies)
0 siblings, 8 replies; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:58 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel, Kaitao cheng
In BPF, a linked list can currently only be used as a stack or a queue:
due to an incomplete API set, only FIFO and LIFO operations are
supported. This series extends the BPF list API, making it behave more
like a general linked list.
Five new kfuncs have been added:
bpf_list_del: remove a node from the list
bpf_list_add_impl: insert a node after a given list node
bpf_list_is_first: check if a node is the first in the list
bpf_list_is_last: check if a node is the last in the list
bpf_list_empty: check if the list is empty
Test cases are also added for the aforementioned kfuncs.
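As a rough userspace sketch of what the extended API permits (plain C
mirroring the kernel's circular struct list_head, not the in-kernel BPF
implementation; the helper names here are illustrative), a node in the
middle of the list can now be removed directly:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, mirroring struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_head(struct list_head *h) { h->next = h; h->prev = h; }

static void add_tail(struct list_head *n, struct list_head *h)
{
	n->next = h;           /* insert just before the head, i.e. push_back */
	n->prev = h->prev;
	h->prev->next = n;
	h->prev = n;
}

static void del_node(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n;           /* like list_del_init() */
	n->prev = n;
}

struct item { int val; struct list_head node; };

/* Build [1, 2, 3], remove the middle node (as bpf_list_del allows),
 * and fold the remaining values into one integer. */
int demo(void)
{
	struct list_head head;
	struct item a = { .val = 1 }, b = { .val = 2 }, c = { .val = 3 };
	struct list_head *pos;
	int sum = 0;

	init_head(&head);
	add_tail(&a.node, &head);
	add_tail(&b.node, &head);
	add_tail(&c.node, &head);

	del_node(&b.node);     /* previously only pop_front/pop_back existed */

	for (pos = head.next; pos != &head; pos = pos->next)
		sum = sum * 10 + ((struct item *)((char *)pos -
				  offsetof(struct item, node)))->val;
	return sum;            /* 13: nodes 1 and 3 remain */
}
```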
Changes in v10:
- Remove the table-driven approach (Ihor Solodrai)
- Use the __nonown_allowed suffix for bpf_list_del/front/back
- Add test cases for __nonown_allowed
Changes in v9:
- Expand table-driven approach coverage (Emil Tsalapatis)
- Clear list node owner and unlink before drop (Emil Tsalapatis)
- Remove warnings caused by WARN_ON_ONCE() (Emil Tsalapatis)
- Introduce the __nonown_allowed suffix (Alexei Starovoitov)
Changes in v8:
- Use [patch v7 5/5] as the start of the patch series (Leon Hwang)
- Introduce double pointer prev_ptr in __bpf_list_del
(Kumar Kartikeya Dwivedi)
- Extract refactored __bpf_list_del/add into separate patches (Leon Hwang)
- Allow bpf_list_front/back result as the prev argument of bpf_list_add
- Split test cases (Leon Hwang)
Changes in v7:
- Replace bpf_list_node_is_edge with bpf_list_is_first/is_last
- Reimplement __bpf_list_del and __bpf_list_add (Kumar Kartikeya Dwivedi)
- Simplify test cases (Mykyta Yatsenko)
Changes in v6:
- Merge [patch v5 (2,4,6)/6] into [patch v6 4/5] (Leon Hwang)
- If list_head was 0-initialized, init it
- refactor kfunc checks to table-driven approach (Leon Hwang)
Changes in v5:
- Fix bpf_obj leak on bpf_list_add_impl error
Changes in v4:
- [patch v3 1/6] Revert to version v1 (Alexei Starovoitov)
- Change the parameters of bpf_list_add_impl to (head, new, prev, ...)
Changes in v3:
- Add a new lock_rec member to struct bpf_reference_state for lock
holding detection.
- Add test cases to verify that the verifier correctly restricts calls
to bpf_list_del when the spin_lock is not held.
Changes in v2:
- Remove the head parameter from bpf_list_del (Alexei Starovoitov)
- Add bpf_list_add/is_first/is_last/empty to API and test cases
(Alexei Starovoitov)
Link to v9:
https://lore.kernel.org/all/20260329140506.9595-1-pilgrimtao@gmail.com/
Link to v8:
https://lore.kernel.org/all/20260316112843.78657-1-pilgrimtao@gmail.com/
Link to v7:
https://lore.kernel.org/all/20260308134614.29711-1-pilgrimtao@gmail.com/
Link to v6:
https://lore.kernel.org/all/20260304143459.78059-1-pilgrimtao@gmail.com/
Link to v5:
https://lore.kernel.org/all/20260304031606.43884-1-pilgrimtao@gmail.com/
Link to v4:
https://lore.kernel.org/all/20260303135219.33726-1-pilgrimtao@gmail.com/
Link to v3:
https://lore.kernel.org/all/20260302124028.82420-1-pilgrimtao@gmail.com/
Link to v2:
https://lore.kernel.org/all/20260225092651.94689-1-pilgrimtao@gmail.com/
Link to v1:
https://lore.kernel.org/all/20260209025250.55750-1-pilgrimtao@gmail.com/
Kaitao Cheng (8):
bpf: refactor __bpf_list_del to take list node pointer
bpf: clear list node owner and unlink before drop
bpf: Introduce the bpf_list_del kfunc.
bpf: refactor __bpf_list_add to take insertion point via **prev_ptr
bpf: Add bpf_list_add to insert node after a given list node
bpf: add bpf_list_is_first/last/empty kfuncs
bpf: allow non-owning list-node args via __nonown_allowed
selftests/bpf: Add test cases for
bpf_list_del/add/is_first/is_last/empty
Documentation/bpf/kfuncs.rst | 22 +-
kernel/bpf/helpers.c | 139 ++++--
kernel/bpf/verifier.c | 44 +-
.../selftests/bpf/progs/refcounted_kptr.c | 421 ++++++++++++++++++
4 files changed, 593 insertions(+), 33 deletions(-)
--
2.50.1 (Apple Git-155)
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
@ 2026-04-27 16:58 ` Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-27 16:59 ` [PATCH bpf-next v10 2/8] bpf: clear list node owner and unlink before drop Kaitao cheng
` (6 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:58 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel
From: Kaitao Cheng <chengkaitao@kylinos.cn>
Refactor __bpf_list_del to accept (head, struct list_head *n) instead of
(head, bool tail). The caller now passes the specific node to remove:
bpf_list_pop_front passes h->next, bpf_list_pop_back passes h->prev.
This prepares for introducing the bpf_list_del(head, node) kfunc, which
removes an arbitrary node when the user holds ownership.
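In userspace terms (an illustrative sketch, not the kernel code), both
pops become one delete helper parameterized by the node to remove:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };
struct item { int val; struct list_head node; };

static void init_head(struct list_head *h) { h->next = h; h->prev = h; }

static void add_tail(struct list_head *n, struct list_head *h)
{
	n->next = h;
	n->prev = h->prev;
	h->prev->next = n;
	h->prev = n;
}

/* Analogue of the refactored __bpf_list_del(head, n): the caller picks n. */
static int del_val(struct list_head *head, struct list_head *n)
{
	if (head->next == head)        /* list_empty() */
		return -1;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n;                   /* list_del_init() */
	n->prev = n;
	return ((struct item *)((char *)n - offsetof(struct item, node)))->val;
}

int demo(void)
{
	struct list_head head;
	struct item a = { .val = 1 }, b = { .val = 2 }, c = { .val = 3 };
	int front, back;

	init_head(&head);
	add_tail(&a.node, &head);
	add_tail(&b.node, &head);
	add_tail(&c.node, &head);

	front = del_val(&head, head.next); /* like bpf_list_pop_front */
	back  = del_val(&head, head.prev); /* like bpf_list_pop_back */
	return front * 10 + back;          /* 13 */
}
```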
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
kernel/bpf/helpers.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index baa12b24bb64..9cd7b028592c 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2550,22 +2550,24 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
return bpf_list_push_back(head, node, meta__ign, off);
}
-static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head, bool tail)
+static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head,
+ struct list_head *n)
{
- struct list_head *n, *h = (void *)head;
+ struct list_head *h = (void *)head;
struct bpf_list_node_kern *node;
/* If list_head was 0-initialized by map, bpf_obj_init_field wasn't
* called on its fields, so init here
*/
- if (unlikely(!h->next))
+ if (unlikely(!h->next)) {
INIT_LIST_HEAD(h);
+ return NULL;
+ }
if (list_empty(h))
return NULL;
- n = tail ? h->prev : h->next;
node = container_of(n, struct bpf_list_node_kern, list_head);
- if (WARN_ON_ONCE(READ_ONCE(node->owner) != head))
+ if (unlikely(READ_ONCE(node->owner) != head))
return NULL;
list_del_init(n);
@@ -2575,12 +2577,16 @@ static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head, bool tai
__bpf_kfunc struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head)
{
- return __bpf_list_del(head, false);
+ struct list_head *h = (void *)head;
+
+ return __bpf_list_del(head, h->next);
}
__bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
{
- return __bpf_list_del(head, true);
+ struct list_head *h = (void *)head;
+
+ return __bpf_list_del(head, h->prev);
}
__bpf_kfunc struct bpf_list_node *bpf_list_front(struct bpf_list_head *head)
--
2.50.1 (Apple Git-155)
* [PATCH bpf-next v10 2/8] bpf: clear list node owner and unlink before drop
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
2026-04-27 16:58 ` [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer Kaitao cheng
@ 2026-04-27 16:59 ` Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-27 16:59 ` [PATCH bpf-next v10 3/8] bpf: Introduce the bpf_list_del kfunc Kaitao cheng
` (5 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:59 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel
From: Kaitao Cheng <chengkaitao@kylinos.cn>
When draining a BPF list_head, clear each node's owner pointer while
still holding the spinlock, so concurrent readers always see a
consistent owner. Unlink each node with list_del_init() before calling
__bpf_obj_drop_impl(), so that a user who still holds a refcounted
pointer to the node cannot follow a stale next pointer.
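The drain pattern looks roughly like this in plain C (a simplified
sketch with my own helper names; the real code also holds the spin_lock
around the move phase and frees objects via __bpf_obj_drop_impl()):

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };
struct knode { void *owner; struct list_head list; };

static void init_head(struct list_head *h) { h->next = h; h->prev = h; }

static void move_tail(struct list_head *n, struct list_head *h)
{
	n->prev->next = n->next;      /* unlink from the old list */
	n->next->prev = n->prev;
	n->next = h;                  /* append before the drain head */
	n->prev = h->prev;
	h->prev->next = n;
	h->prev = n;
}

/* Phase 1 (under the lock in the kernel): clear owner and move every node
 * to a private drain list. Phase 2 (outside the lock): unlink each node
 * before dropping it, so a leftover refcounted pointer never sees a stale
 * next pointer. Returns the number of drained nodes. */
int drain(struct list_head *head)
{
	struct list_head drain_list, *pos, *n;
	int count = 0;

	init_head(&drain_list);
	for (pos = head->next, n = pos->next; pos != head;
	     pos = n, n = pos->next) {
		struct knode *kn = (struct knode *)((char *)pos -
				   offsetof(struct knode, list));
		kn->owner = NULL;             /* readers see a consistent owner */
		move_tail(pos, &drain_list);
	}
	init_head(head);

	while (drain_list.next != &drain_list) {
		pos = drain_list.next;
		pos->prev->next = pos->next;  /* list_del_init() before drop */
		pos->next->prev = pos->prev;
		pos->next = pos;
		pos->prev = pos;
		count++;                      /* the drop would happen here */
	}
	return count;
}

int demo(void)
{
	struct list_head head;
	struct knode a, b, c;

	init_head(&head);
	init_head(&a.list); a.owner = &head;
	init_head(&b.list); b.owner = &head;
	init_head(&c.list); c.owner = &head;
	move_tail(&a.list, &head);
	move_tail(&b.list, &head);
	move_tail(&c.list, &head);

	return drain(&head);   /* 3 nodes drained; owners are now NULL */
}
```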
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
kernel/bpf/helpers.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 9cd7b028592c..1e8754877dd1 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2247,10 +2247,11 @@ EXPORT_SYMBOL_GPL(bpf_base_func_proto);
void bpf_list_head_free(const struct btf_field *field, void *list_head,
struct bpf_spin_lock *spin_lock)
{
- struct list_head *head = list_head, *orig_head = list_head;
+ struct list_head *head = list_head, drain, *pos, *n;
BUILD_BUG_ON(sizeof(struct list_head) > sizeof(struct bpf_list_head));
BUILD_BUG_ON(__alignof__(struct list_head) > __alignof__(struct bpf_list_head));
+ INIT_LIST_HEAD(&drain);
/* Do the actual list draining outside the lock to not hold the lock for
* too long, and also prevent deadlocks if tracing programs end up
@@ -2261,20 +2262,23 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head,
__bpf_spin_lock_irqsave(spin_lock);
if (!head->next || list_empty(head))
goto unlock;
- head = head->next;
+ list_for_each_safe(pos, n, head) {
+ WRITE_ONCE(container_of(pos,
+ struct bpf_list_node_kern, list_head)->owner, NULL);
+ list_move_tail(pos, &drain);
+ }
unlock:
- INIT_LIST_HEAD(orig_head);
+ INIT_LIST_HEAD(head);
__bpf_spin_unlock_irqrestore(spin_lock);
- while (head != orig_head) {
- void *obj = head;
-
- obj -= field->graph_root.node_offset;
- head = head->next;
+ while (!list_empty(&drain)) {
+ pos = drain.next;
+ list_del_init(pos);
/* The contained type can also have resources, including a
* bpf_list_head which needs to be freed.
*/
- __bpf_obj_drop_impl(obj, field->graph_root.value_rec, false);
+ __bpf_obj_drop_impl((char *)pos - field->graph_root.node_offset,
+ field->graph_root.value_rec, false);
}
}
--
2.50.1 (Apple Git-155)
* [PATCH bpf-next v10 3/8] bpf: Introduce the bpf_list_del kfunc.
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
2026-04-27 16:58 ` [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 2/8] bpf: clear list node owner and unlink before drop Kaitao cheng
@ 2026-04-27 16:59 ` Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-27 16:59 ` [PATCH bpf-next v10 4/8] bpf: refactor __bpf_list_add to take insertion point via **prev_ptr Kaitao cheng
` (4 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:59 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel
From: Kaitao Cheng <chengkaitao@kylinos.cn>
Allow users to remove any node from a linked list.
bpf_list_del takes an additional bpf_list_head *head parameter because
the verifier needs the head to check that the corresponding spin_lock
is held.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
kernel/bpf/helpers.c | 10 ++++++++++
kernel/bpf/verifier.c | 6 +++++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 1e8754877dd1..51b6ea4bb8cb 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2593,6 +2593,15 @@ __bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
return __bpf_list_del(head, h->prev);
}
+__bpf_kfunc struct bpf_list_node *bpf_list_del(struct bpf_list_head *head,
+ struct bpf_list_node *node)
+{
+ struct bpf_list_node_kern *kn = (void *)node;
+
+ /* verifier guarantees node is a list node rather than list head */
+ return __bpf_list_del(head, &kn->list_head);
+}
+
__bpf_kfunc struct bpf_list_node *bpf_list_front(struct bpf_list_head *head)
{
struct list_head *h = (struct list_head *)head;
@@ -4725,6 +4734,7 @@ BTF_ID_FLAGS(func, bpf_list_push_back, KF_IMPLICIT_ARGS)
BTF_ID_FLAGS(func, bpf_list_push_back_impl)
BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_list_del, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_front, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_back, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 03f9e16c2abe..3c0e0076bd69 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10744,6 +10744,7 @@ enum special_kfunc_type {
KF_bpf_list_push_back,
KF_bpf_list_pop_front,
KF_bpf_list_pop_back,
+ KF_bpf_list_del,
KF_bpf_list_front,
KF_bpf_list_back,
KF_bpf_cast_to_kern_ctx,
@@ -10812,6 +10813,7 @@ BTF_ID(func, bpf_list_push_back_impl)
BTF_ID(func, bpf_list_push_back)
BTF_ID(func, bpf_list_pop_front)
BTF_ID(func, bpf_list_pop_back)
+BTF_ID(func, bpf_list_del)
BTF_ID(func, bpf_list_front)
BTF_ID(func, bpf_list_back)
BTF_ID(func, bpf_cast_to_kern_ctx)
@@ -11334,6 +11336,7 @@ static bool is_bpf_list_api_kfunc(u32 btf_id)
return is_bpf_list_push_kfunc(btf_id) ||
btf_id == special_kfunc_list[KF_bpf_list_pop_front] ||
btf_id == special_kfunc_list[KF_bpf_list_pop_back] ||
+ btf_id == special_kfunc_list[KF_bpf_list_del] ||
btf_id == special_kfunc_list[KF_bpf_list_front] ||
btf_id == special_kfunc_list[KF_bpf_list_back];
}
@@ -11456,7 +11459,8 @@ static bool check_kfunc_is_graph_node_api(struct bpf_verifier_env *env,
switch (node_field_type) {
case BPF_LIST_NODE:
- ret = is_bpf_list_push_kfunc(kfunc_btf_id);
+ ret = is_bpf_list_push_kfunc(kfunc_btf_id) ||
+ kfunc_btf_id == special_kfunc_list[KF_bpf_list_del];
break;
case BPF_RB_NODE:
ret = (is_bpf_rbtree_add_kfunc(kfunc_btf_id) ||
--
2.50.1 (Apple Git-155)
* [PATCH bpf-next v10 4/8] bpf: refactor __bpf_list_add to take insertion point via **prev_ptr
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
` (2 preceding siblings ...)
2026-04-27 16:59 ` [PATCH bpf-next v10 3/8] bpf: Introduce the bpf_list_del kfunc Kaitao cheng
@ 2026-04-27 16:59 ` Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node Kaitao cheng
` (3 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:59 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel
From: Kaitao Cheng <chengkaitao@kylinos.cn>
Refactor __bpf_list_add to accept (node, head, struct list_head **prev_ptr,
...) instead of (node, head, bool tail, ...). Load prev from *prev_ptr after
INIT_LIST_HEAD(h), so we never dereference an uninitialized h->prev when
head was 0-initialized (e.g. push_back passes &h->prev).
When prev is not the list head, validate that prev is in the list via
its owner.
This prepares for bpf_list_add(head, new, prev, ...), which inserts a
node after a given list node.
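Why the double pointer matters can be shown with a small userspace
sketch (my own helper names, not the kernel code): for push_back the
caller passes &h->prev, and *prev_ptr is only read after the head has
been initialized.

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

static void init_head(struct list_head *h) { h->next = h; h->prev = h; }

/* Analogue of the refactored __bpf_list_add: prev is loaded from *prev_ptr
 * only after init, so a zero-initialized head's prev is never read stale. */
static void add_after_prev(struct list_head *h, struct list_head *n,
			   struct list_head **prev_ptr)
{
	struct list_head *prev;

	if (!h->next)                  /* head was 0-initialized by the map */
		init_head(h);

	prev = *prev_ptr;              /* now h->prev is valid to read */
	n->next = prev->next;
	n->prev = prev;
	prev->next->prev = n;
	prev->next = n;
}

int demo(void)
{
	struct list_head head = {0};   /* simulates a 0-initialized map value */
	struct list_head n1, n2;
	struct list_head *hp = &head;

	add_after_prev(&head, &n1, &hp);        /* push_front: prev is the head */
	add_after_prev(&head, &n2, &head.prev); /* push_back: prev is the tail */

	return head.next == &n1 && n1.next == &n2 && head.prev == &n2;
}
```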
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
kernel/bpf/helpers.c | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 51b6ea4bb8cb..5388078f3171 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2471,9 +2471,11 @@ __bpf_kfunc void *bpf_refcount_acquire_impl(void *p__refcounted_kptr, void *meta
static int __bpf_list_add(struct bpf_list_node_kern *node,
struct bpf_list_head *head,
- bool tail, struct btf_record *rec, u64 off)
+ struct list_head **prev_ptr,
+ struct btf_record *rec, u64 off)
{
struct list_head *n = &node->list_head, *h = (void *)head;
+ struct list_head *prev;
/* If list_head was 0-initialized by map, bpf_obj_init_field wasn't
* called on its fields, so init here
@@ -2481,19 +2483,31 @@ static int __bpf_list_add(struct bpf_list_node_kern *node,
if (unlikely(!h->next))
INIT_LIST_HEAD(h);
+ prev = *prev_ptr;
+
+ /* When prev is not the list head, it must be a node in this list. */
+ if (prev != h) {
+ struct bpf_list_node_kern *prev_kn =
+ container_of(prev, struct bpf_list_node_kern, list_head);
+
+ if (unlikely(READ_ONCE(prev_kn->owner) != head))
+ goto fail;
+ }
+
/* node->owner != NULL implies !list_empty(n), no need to separately
* check the latter
*/
- if (cmpxchg(&node->owner, NULL, BPF_PTR_POISON)) {
- /* Only called from BPF prog, no need to migrate_disable */
- __bpf_obj_drop_impl((void *)n - off, rec, false);
- return -EINVAL;
- }
+ if (cmpxchg(&node->owner, NULL, BPF_PTR_POISON))
+ goto fail;
- tail ? list_add_tail(n, h) : list_add(n, h);
+ list_add(n, prev);
WRITE_ONCE(node->owner, head);
-
return 0;
+
+fail:
+ /* Only called from BPF prog, no need to migrate_disable */
+ __bpf_obj_drop_impl((void *)n - off, rec, false);
+ return -EINVAL;
}
/**
@@ -2514,8 +2528,9 @@ __bpf_kfunc int bpf_list_push_front(struct bpf_list_head *head,
u64 off)
{
struct bpf_list_node_kern *n = (void *)node;
+ struct list_head *h = (void *)head;
- return __bpf_list_add(n, head, false, meta ? meta->record : NULL, off);
+ return __bpf_list_add(n, head, &h, meta ? meta->record : NULL, off);
}
__bpf_kfunc int bpf_list_push_front_impl(struct bpf_list_head *head,
@@ -2543,8 +2558,9 @@ __bpf_kfunc int bpf_list_push_back(struct bpf_list_head *head,
u64 off)
{
struct bpf_list_node_kern *n = (void *)node;
+ struct list_head *h = (void *)head;
- return __bpf_list_add(n, head, true, meta ? meta->record : NULL, off);
+ return __bpf_list_add(n, head, &h->prev, meta ? meta->record : NULL, off);
}
__bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
--
2.50.1 (Apple Git-155)
* [PATCH bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
` (3 preceding siblings ...)
2026-04-27 16:59 ` [PATCH bpf-next v10 4/8] bpf: refactor __bpf_list_add to take insertion point via **prev_ptr Kaitao cheng
@ 2026-04-27 16:59 ` Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-27 16:59 ` [PATCH bpf-next v10 6/8] bpf: add bpf_list_is_first/last/empty kfuncs Kaitao cheng
` (2 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:59 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel
From: Kaitao Cheng <chengkaitao@kylinos.cn>
Add a new kfunc bpf_list_add(head, new, prev, meta, off) that
inserts 'new' after 'prev' in the BPF linked list. Both must be in
the same list; 'prev' must already be in the list. The new node must
be an owning reference (e.g. from bpf_obj_new); the kfunc consumes
that reference and the node becomes non-owning once inserted.
bpf_list_add takes an additional bpf_list_head *head parameter because
the verifier needs the head to check that the corresponding spin_lock
is held.
Returns 0 on success, -EINVAL if 'prev' is not in a list or 'new'
is already in a list (or duplicate insertion). On failure, the
kernel drops the passed-in node.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
kernel/bpf/helpers.c | 11 +++++++++++
kernel/bpf/verifier.c | 12 +++++++++---
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 5388078f3171..2b8e8d4284a5 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2570,6 +2570,16 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
return bpf_list_push_back(head, node, meta__ign, off);
}
+__bpf_kfunc int bpf_list_add(struct bpf_list_head *head, struct bpf_list_node *new,
+ struct bpf_list_node *prev, struct btf_struct_meta *meta,
+ u64 off)
+{
+ struct bpf_list_node_kern *n = (void *)new, *p = (void *)prev;
+ struct list_head *prev_ptr = &p->list_head;
+
+ return __bpf_list_add(n, head, &prev_ptr, meta ? meta->record : NULL, off);
+}
+
static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head,
struct list_head *n)
{
@@ -4748,6 +4758,7 @@ BTF_ID_FLAGS(func, bpf_list_push_front, KF_IMPLICIT_ARGS)
BTF_ID_FLAGS(func, bpf_list_push_front_impl)
BTF_ID_FLAGS(func, bpf_list_push_back, KF_IMPLICIT_ARGS)
BTF_ID_FLAGS(func, bpf_list_push_back_impl)
+BTF_ID_FLAGS(func, bpf_list_add, KF_IMPLICIT_ARGS)
BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_del, KF_ACQUIRE | KF_RET_NULL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3c0e0076bd69..50f8732aa065 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10742,6 +10742,7 @@ enum special_kfunc_type {
KF_bpf_list_push_front,
KF_bpf_list_push_back_impl,
KF_bpf_list_push_back,
+ KF_bpf_list_add,
KF_bpf_list_pop_front,
KF_bpf_list_pop_back,
KF_bpf_list_del,
@@ -10811,6 +10812,7 @@ BTF_ID(func, bpf_list_push_front_impl)
BTF_ID(func, bpf_list_push_front)
BTF_ID(func, bpf_list_push_back_impl)
BTF_ID(func, bpf_list_push_back)
+BTF_ID(func, bpf_list_add)
BTF_ID(func, bpf_list_pop_front)
BTF_ID(func, bpf_list_pop_back)
BTF_ID(func, bpf_list_del)
@@ -10923,7 +10925,8 @@ static bool is_bpf_list_push_kfunc(u32 func_id)
return func_id == special_kfunc_list[KF_bpf_list_push_front] ||
func_id == special_kfunc_list[KF_bpf_list_push_front_impl] ||
func_id == special_kfunc_list[KF_bpf_list_push_back] ||
- func_id == special_kfunc_list[KF_bpf_list_push_back_impl];
+ func_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
+ func_id == special_kfunc_list[KF_bpf_list_add];
}
static bool is_bpf_rbtree_add_kfunc(u32 func_id)
@@ -19228,8 +19231,11 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
int struct_meta_reg = BPF_REG_3;
int node_offset_reg = BPF_REG_4;
- /* rbtree_add has extra 'less' arg, so args-to-fixup are in diff regs */
- if (is_bpf_rbtree_add_kfunc(desc->func_id)) {
+ /* list_add/rbtree_add have an extra arg (prev/less),
+ * so args-to-fixup are in diff regs.
+ */
+ if (desc->func_id == special_kfunc_list[KF_bpf_list_add] ||
+ is_bpf_rbtree_add_kfunc(desc->func_id)) {
struct_meta_reg = BPF_REG_4;
node_offset_reg = BPF_REG_5;
}
--
2.50.1 (Apple Git-155)
* [PATCH bpf-next v10 6/8] bpf: add bpf_list_is_first/last/empty kfuncs
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
` (4 preceding siblings ...)
2026-04-27 16:59 ` [PATCH bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node Kaitao cheng
@ 2026-04-27 16:59 ` Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 7/8] bpf: allow non-owning list-node args via __nonown_allowed Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 8/8] selftests/bpf: Add test cases for bpf_list_del/add/is_first/is_last/empty Kaitao cheng
7 siblings, 0 replies; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:59 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel, Emil Tsalapatis
From: Kaitao Cheng <chengkaitao@kylinos.cn>
Add three kfuncs for BPF linked list queries:
- bpf_list_is_first(head, node): true if node is the first in the list.
- bpf_list_is_last(head, node): true if node is the last in the list.
- bpf_list_empty(head): true if the list has no entries.
Previously, implementing these checks required first calling
bpf_list_pop_front/back to retrieve the first or last node, comparing
it against the node in question, and then pushing it back with
bpf_list_push_front/back, which was inefficient. With the
bpf_list_is_first/last/empty kfuncs, a program can directly check
whether a node is first or last, or whether the list is empty, without
removing any node.
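The intended semantics of the three queries can be sketched in
userspace C (illustrative names, not the kernel code); note that a node
not owned by the list is never "first" or "last" in it:

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };
struct knode { void *owner; struct list_head list; };

static void init_head(struct list_head *h) { h->next = h; h->prev = h; }

static void add_tail(struct knode *kn, struct list_head *h)
{
	struct list_head *n = &kn->list;

	n->next = h;
	n->prev = h->prev;
	h->prev->next = n;
	h->prev = n;
	kn->owner = h;
}

/* Analogues of bpf_list_is_first/is_last/empty. */
static int is_first(struct list_head *h, struct knode *kn)
{ return kn->owner == h && h->next == &kn->list; }

static int is_last(struct list_head *h, struct knode *kn)
{ return kn->owner == h && h->prev == &kn->list; }

static int is_empty(struct list_head *h)
{ return h->next == h; }

int demo(void)
{
	struct list_head head;
	struct knode a, b;
	int code;

	init_head(&head);
	code = is_empty(&head);                  /* 1: nothing queued yet */

	add_tail(&a, &head);
	add_tail(&b, &head);

	code = code * 10 + is_first(&head, &a);  /* 1 */
	code = code * 10 + is_last(&head, &b);   /* 1 */
	code = code * 10 + is_first(&head, &b);  /* 0 */
	return code;                             /* 1110 */
}
```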
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
---
kernel/bpf/helpers.c | 38 ++++++++++++++++++++++++++++++++++++++
kernel/bpf/verifier.c | 15 +++++++++++++--
2 files changed, 51 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 2b8e8d4284a5..dfd465badd9d 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2648,6 +2648,41 @@ __bpf_kfunc struct bpf_list_node *bpf_list_back(struct bpf_list_head *head)
return (struct bpf_list_node *)h->prev;
}
+__bpf_kfunc bool bpf_list_is_first(struct bpf_list_head *head, struct bpf_list_node *node)
+{
+ struct list_head *h = (struct list_head *)head;
+ struct bpf_list_node_kern *kn = (struct bpf_list_node_kern *)node;
+
+ if (READ_ONCE(kn->owner) != head)
+ return false;
+
+ return list_is_first(&kn->list_head, h);
+}
+
+__bpf_kfunc bool bpf_list_is_last(struct bpf_list_head *head, struct bpf_list_node *node)
+{
+ struct list_head *h = (struct list_head *)head;
+ struct bpf_list_node_kern *kn = (struct bpf_list_node_kern *)node;
+
+ if (READ_ONCE(kn->owner) != head)
+ return false;
+
+ return list_is_last(&kn->list_head, h);
+}
+
+__bpf_kfunc bool bpf_list_empty(struct bpf_list_head *head)
+{
+ struct list_head *h = (struct list_head *)head;
+
+ /* If list_head was 0-initialized by map, bpf_obj_init_field wasn't
+ * called on its fields, so init here
+ */
+ if (unlikely(!h->next))
+ INIT_LIST_HEAD(h);
+
+ return list_empty(h);
+}
+
__bpf_kfunc struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root,
struct bpf_rb_node *node)
{
@@ -4764,6 +4799,9 @@ BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_del, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_front, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_list_back, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_list_is_first)
+BTF_ID_FLAGS(func, bpf_list_is_last)
+BTF_ID_FLAGS(func, bpf_list_empty)
BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_task_release, KF_RELEASE)
BTF_ID_FLAGS(func, bpf_rbtree_remove, KF_ACQUIRE | KF_RET_NULL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 50f8732aa065..ca33f35bc3eb 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10748,6 +10748,9 @@ enum special_kfunc_type {
KF_bpf_list_del,
KF_bpf_list_front,
KF_bpf_list_back,
+ KF_bpf_list_is_first,
+ KF_bpf_list_is_last,
+ KF_bpf_list_empty,
KF_bpf_cast_to_kern_ctx,
KF_bpf_rdonly_cast,
KF_bpf_rcu_read_lock,
@@ -10818,6 +10821,9 @@ BTF_ID(func, bpf_list_pop_back)
BTF_ID(func, bpf_list_del)
BTF_ID(func, bpf_list_front)
BTF_ID(func, bpf_list_back)
+BTF_ID(func, bpf_list_is_first)
+BTF_ID(func, bpf_list_is_last)
+BTF_ID(func, bpf_list_empty)
BTF_ID(func, bpf_cast_to_kern_ctx)
BTF_ID(func, bpf_rdonly_cast)
BTF_ID(func, bpf_rcu_read_lock)
@@ -11341,7 +11347,10 @@ static bool is_bpf_list_api_kfunc(u32 btf_id)
btf_id == special_kfunc_list[KF_bpf_list_pop_back] ||
btf_id == special_kfunc_list[KF_bpf_list_del] ||
btf_id == special_kfunc_list[KF_bpf_list_front] ||
- btf_id == special_kfunc_list[KF_bpf_list_back];
+ btf_id == special_kfunc_list[KF_bpf_list_back] ||
+ btf_id == special_kfunc_list[KF_bpf_list_is_first] ||
+ btf_id == special_kfunc_list[KF_bpf_list_is_last] ||
+ btf_id == special_kfunc_list[KF_bpf_list_empty];
}
static bool is_bpf_rbtree_api_kfunc(u32 btf_id)
@@ -11463,7 +11472,9 @@ static bool check_kfunc_is_graph_node_api(struct bpf_verifier_env *env,
switch (node_field_type) {
case BPF_LIST_NODE:
ret = is_bpf_list_push_kfunc(kfunc_btf_id) ||
- kfunc_btf_id == special_kfunc_list[KF_bpf_list_del];
+ kfunc_btf_id == special_kfunc_list[KF_bpf_list_del] ||
+ kfunc_btf_id == special_kfunc_list[KF_bpf_list_is_first] ||
+ kfunc_btf_id == special_kfunc_list[KF_bpf_list_is_last];
break;
case BPF_RB_NODE:
ret = (is_bpf_rbtree_add_kfunc(kfunc_btf_id) ||
--
2.50.1 (Apple Git-155)
* [PATCH bpf-next v10 7/8] bpf: allow non-owning list-node args via __nonown_allowed
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
` (5 preceding siblings ...)
2026-04-27 16:59 ` [PATCH bpf-next v10 6/8] bpf: add bpf_list_is_first/last/empty kfuncs Kaitao cheng
@ 2026-04-27 16:59 ` Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 8/8] selftests/bpf: Add test cases for bpf_list_del/add/is_first/is_last/empty Kaitao cheng
7 siblings, 0 replies; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:59 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel
From: Kaitao Cheng <chengkaitao@kylinos.cn>
KF_ARG_PTR_TO_LIST_NODE normally requires an owning reference
(PTR_TO_BTF_ID | MEM_ALLOC with ref_obj_id). Introduce and use
the __nonown_allowed annotation on selected list-node arguments
so non-owning references with ref_obj_id==0 are accepted as well.
This enables passing bpf_list_front() / bpf_list_back() results to:
bpf_list_add() as insertion point (prev)
bpf_list_del() as deletion target (node)
bpf_list_is_first/last() as query target (node)
The verifier keeps the existing owning-reference checks by default;
only arguments annotated with __nonown_allowed bypass the
MEM_ALLOC/ref_obj_id checks and then follow the same list-node
validation path.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
Documentation/bpf/kfuncs.rst | 22 ++++++++++++++++++++--
kernel/bpf/helpers.c | 20 +++++++++++---------
kernel/bpf/verifier.c | 13 +++++++++++++
3 files changed, 44 insertions(+), 11 deletions(-)
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index 75e6c078e0e7..3a9db1108b95 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -207,8 +207,26 @@ Here, the buffer may be NULL. If the buffer is not NULL, it must be at least
buffer__szk bytes in size. The kfunc is responsible for checking if the buffer
is NULL before using it.
-2.3.5 __str Annotation
-----------------------------
+2.3.5 __nonown_allowed Annotation
+---------------------------------
+
+This annotation is used to indicate that the parameter may be a non-owning reference.
+
+An example is given below::
+
+ __bpf_kfunc int bpf_list_add(..., struct bpf_list_node
+ *prev__nonown_allowed, ...)
+ {
+ ...
+ }
+
+For the ``prev__nonown_allowed`` parameter (resolved as ``KF_ARG_PTR_TO_LIST_NODE``),
+suffix ``__nonown_allowed`` retains the usual owning-pointer rules and also
+permits a non-owning reference with no ref_obj_id (e.g. the return value of
+bpf_list_front() / bpf_list_back()).
+
+2.3.6 __str Annotation
+----------------------
This annotation is used to indicate that the argument is a constant string.
An example is given below::
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index dfd465badd9d..f2f8705f0e9a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2571,10 +2571,10 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
}
__bpf_kfunc int bpf_list_add(struct bpf_list_head *head, struct bpf_list_node *new,
- struct bpf_list_node *prev, struct btf_struct_meta *meta,
- u64 off)
+ struct bpf_list_node *prev__nonown_allowed,
+ struct btf_struct_meta *meta, u64 off)
{
- struct bpf_list_node_kern *n = (void *)new, *p = (void *)prev;
+ struct bpf_list_node_kern *n = (void *)new, *p = (void *)prev__nonown_allowed;
struct list_head *prev_ptr = &p->list_head;
return __bpf_list_add(n, head, &prev_ptr, meta ? meta->record : NULL, off);
@@ -2620,9 +2620,9 @@ __bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
}
__bpf_kfunc struct bpf_list_node *bpf_list_del(struct bpf_list_head *head,
- struct bpf_list_node *node)
+ struct bpf_list_node *node__nonown_allowed)
{
- struct bpf_list_node_kern *kn = (void *)node;
+ struct bpf_list_node_kern *kn = (void *)node__nonown_allowed;
/* verifier guarantees node is a list node rather than list head */
return __bpf_list_del(head, &kn->list_head);
@@ -2648,10 +2648,11 @@ __bpf_kfunc struct bpf_list_node *bpf_list_back(struct bpf_list_head *head)
return (struct bpf_list_node *)h->prev;
}
-__bpf_kfunc bool bpf_list_is_first(struct bpf_list_head *head, struct bpf_list_node *node)
+__bpf_kfunc bool bpf_list_is_first(struct bpf_list_head *head,
+ struct bpf_list_node *node__nonown_allowed)
{
struct list_head *h = (struct list_head *)head;
- struct bpf_list_node_kern *kn = (struct bpf_list_node_kern *)node;
+ struct bpf_list_node_kern *kn = (struct bpf_list_node_kern *)node__nonown_allowed;
if (READ_ONCE(kn->owner) != head)
return false;
@@ -2659,10 +2660,11 @@ __bpf_kfunc bool bpf_list_is_first(struct bpf_list_head *head, struct bpf_list_n
return list_is_first(&kn->list_head, h);
}
-__bpf_kfunc bool bpf_list_is_last(struct bpf_list_head *head, struct bpf_list_node *node)
+__bpf_kfunc bool bpf_list_is_last(struct bpf_list_head *head,
+ struct bpf_list_node *node__nonown_allowed)
{
struct list_head *h = (struct list_head *)head;
- struct bpf_list_node_kern *kn = (struct bpf_list_node_kern *)node;
+ struct bpf_list_node_kern *kn = (struct bpf_list_node_kern *)node__nonown_allowed;
if (READ_ONCE(kn->owner) != head)
return false;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ca33f35bc3eb..08ab337866bf 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10502,6 +10502,11 @@ static bool is_kfunc_arg_nullable(const struct btf *btf, const struct btf_param
return btf_param_match_suffix(btf, arg, "__nullable");
}
+static bool is_kfunc_arg_nonown_allowed(const struct btf *btf, const struct btf_param *arg)
+{
+ return btf_param_match_suffix(btf, arg, "__nonown_allowed");
+}
+
static bool is_kfunc_arg_const_str(const struct btf *btf, const struct btf_param *arg)
{
return btf_param_match_suffix(btf, arg, "__str");
@@ -12017,6 +12022,13 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
return ret;
break;
case KF_ARG_PTR_TO_LIST_NODE:
+ if (is_kfunc_arg_nonown_allowed(btf, &args[i]) &&
+ type_is_non_owning_ref(reg->type) && !reg->ref_obj_id) {
+ /* Allow bpf_list_front/back return value for
+ * __nonown_allowed list-node arguments.
+ */
+ goto check_ok;
+ }
if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
verbose(env, "%s expected pointer to allocated object\n",
reg_arg_name(env, argno));
@@ -12026,6 +12038,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
verbose(env, "allocated object must be referenced\n");
return -EINVAL;
}
+check_ok:
ret = process_kf_arg_ptr_to_list_node(env, reg, argno, meta);
if (ret < 0)
return ret;
--
2.50.1 (Apple Git-155)
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH bpf-next v10 8/8] selftests/bpf: Add test cases for bpf_list_del/add/is_first/is_last/empty
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
` (6 preceding siblings ...)
2026-04-27 16:59 ` [PATCH bpf-next v10 7/8] bpf: allow non-owning list-node args via __nonown_allowed Kaitao cheng
@ 2026-04-27 16:59 ` Kaitao cheng
7 siblings, 0 replies; 17+ messages in thread
From: Kaitao cheng @ 2026-04-27 16:59 UTC (permalink / raw)
To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah, chengkaitao,
linux-kselftest
Cc: bpf, linux-kernel
From: Kaitao Cheng <chengkaitao@kylinos.cn>
Extend refcounted_kptr with tests for bpf_list_add (including prev from
bpf_list_front and bpf_refcount_acquire), bpf_list_del (including node
from bpf_list_front, bpf_rbtree_remove and bpf_refcount_acquire),
bpf_list_empty, bpf_list_is_first/last, and push_back on uninit head.
To verify the validity of bpf_list_del/add, the test also expects the
verifier to reject calls to bpf_list_del/add made without holding the
spin_lock.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
.../selftests/bpf/progs/refcounted_kptr.c | 421 ++++++++++++++++++
1 file changed, 421 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/refcounted_kptr.c b/tools/testing/selftests/bpf/progs/refcounted_kptr.c
index c847398837cc..21ae06797b18 100644
--- a/tools/testing/selftests/bpf/progs/refcounted_kptr.c
+++ b/tools/testing/selftests/bpf/progs/refcounted_kptr.c
@@ -367,6 +367,427 @@ long insert_rbtree_and_stash__del_tree_##rem_tree(void *ctx) \
INSERT_STASH_READ(true, "insert_stash_read: remove from tree");
INSERT_STASH_READ(false, "insert_stash_read: don't remove from tree");
+SEC("tc")
+__description("list_empty_test: list empty before add, non-empty after add")
+__success __retval(0)
+int list_empty_test(void *ctx)
+{
+ struct node_data *node_new;
+
+ bpf_spin_lock(&lock);
+ if (!bpf_list_empty(&head)) {
+ bpf_spin_unlock(&lock);
+ return -1;
+ }
+ bpf_spin_unlock(&lock);
+
+ node_new = bpf_obj_new(typeof(*node_new));
+ if (!node_new)
+ return -2;
+
+ bpf_spin_lock(&lock);
+ bpf_list_push_front(&head, &node_new->l);
+
+ if (bpf_list_empty(&head)) {
+ bpf_spin_unlock(&lock);
+ return -3;
+ }
+ bpf_spin_unlock(&lock);
+ return 0;
+}
+
+static struct node_data *__add_in_list(struct bpf_list_head *head,
+ struct bpf_spin_lock *lock)
+{
+ struct node_data *node_new, *node_ref;
+
+ node_new = bpf_obj_new(typeof(*node_new));
+ if (!node_new)
+ return NULL;
+
+ node_ref = bpf_refcount_acquire(node_new);
+
+ bpf_spin_lock(lock);
+ bpf_list_push_front(head, &node_new->l);
+ bpf_spin_unlock(lock);
+ return node_ref;
+}
+
+SEC("tc")
+__description("list_is_edge_test1: is_first on first node, is_last on last node")
+__success __retval(0)
+int list_is_edge_test1(void *ctx)
+{
+ struct node_data *node_first, *node_last;
+ int err = 0;
+
+ node_last = __add_in_list(&head, &lock);
+ if (!node_last)
+ return -1;
+
+ node_first = __add_in_list(&head, &lock);
+ if (!node_first) {
+ bpf_obj_drop(node_last);
+ return -2;
+ }
+
+ bpf_spin_lock(&lock);
+ if (!bpf_list_is_first(&head, &node_first->l)) {
+ err = -3;
+ goto fail;
+ }
+ if (!bpf_list_is_last(&head, &node_last->l))
+ err = -4;
+
+fail:
+ bpf_spin_unlock(&lock);
+ bpf_obj_drop(node_first);
+ bpf_obj_drop(node_last);
+ return err;
+}
+
+SEC("tc")
+__description("list_is_edge_test2: accept list_front/list_back return value")
+__success __retval(0)
+int list_is_edge_test2(void *ctx)
+{
+ struct bpf_list_node *front, *back;
+ struct node_data *a, *b;
+ long err = 0;
+
+ a = __add_in_list(&head, &lock);
+ if (!a)
+ return -1;
+
+ b = __add_in_list(&head, &lock);
+ if (!b) {
+ bpf_obj_drop(a);
+ return -2;
+ }
+
+ bpf_spin_lock(&lock);
+ front = bpf_list_front(&head);
+ back = bpf_list_back(&head);
+ if (!front || !back) {
+ err = -3;
+ goto out_unlock;
+ }
+
+ if (!bpf_list_is_first(&head, front) || bpf_list_is_last(&head, front)) {
+ err = -4;
+ goto out_unlock;
+ }
+
+ if (!bpf_list_is_last(&head, back) || bpf_list_is_first(&head, back)) {
+ err = -5;
+ goto out_unlock;
+ }
+
+out_unlock:
+ bpf_spin_unlock(&lock);
+ bpf_obj_drop(a);
+ bpf_obj_drop(b);
+ return err;
+}
+
+SEC("tc")
+__description("list_is_edge_test3: single node is both first and last")
+__success __retval(0)
+int list_is_edge_test3(void *ctx)
+{
+ struct node_data *tmp;
+ struct bpf_list_node *node;
+ long err = 0;
+
+ tmp = __add_in_list(&head, &lock);
+ if (!tmp)
+ return -1;
+
+ bpf_spin_lock(&lock);
+ node = bpf_list_front(&head);
+ if (!node) {
+ bpf_spin_unlock(&lock);
+ bpf_obj_drop(tmp);
+ return -2;
+ }
+
+ if (!bpf_list_is_first(&head, node) || !bpf_list_is_last(&head, node))
+ err = -3;
+ bpf_spin_unlock(&lock);
+
+ bpf_obj_drop(tmp);
+ return err;
+}
+
+SEC("tc")
+__description("list_del_test1: del returns removed nodes")
+__success __retval(0)
+int list_del_test1(void *ctx)
+{
+ struct node_data *node_first, *node_last;
+ struct bpf_list_node *bpf_node_first, *bpf_node_last;
+ int err = 0;
+
+ node_last = __add_in_list(&head, &lock);
+ if (!node_last)
+ return -1;
+
+ node_first = __add_in_list(&head, &lock);
+ if (!node_first) {
+ bpf_obj_drop(node_last);
+ return -2;
+ }
+
+ bpf_spin_lock(&lock);
+ bpf_node_last = bpf_list_del(&head, &node_last->l);
+ bpf_node_first = bpf_list_del(&head, &node_first->l);
+ bpf_spin_unlock(&lock);
+
+ if (bpf_node_first)
+ bpf_obj_drop(container_of(bpf_node_first, struct node_data, l));
+ else
+ err = -3;
+
+ if (bpf_node_last)
+ bpf_obj_drop(container_of(bpf_node_last, struct node_data, l));
+ else
+ err = -4;
+
+ bpf_obj_drop(node_first);
+ bpf_obj_drop(node_last);
+ return err;
+}
+
+SEC("tc")
+__description("list_del_test2: remove an arbitrary node from the list")
+__success __retval(0)
+int list_del_test2(void *ctx)
+{
+ struct bpf_rb_node *rb;
+ struct bpf_list_node *l;
+ struct node_data *n;
+ long err;
+
+ err = __insert_in_tree_and_list(&head, &root, &lock);
+ if (err)
+ return err;
+
+ bpf_spin_lock(&lock);
+ rb = bpf_rbtree_first(&root);
+ if (!rb) {
+ bpf_spin_unlock(&lock);
+ return -4;
+ }
+
+ rb = bpf_rbtree_remove(&root, rb);
+ if (!rb) {
+ bpf_spin_unlock(&lock);
+ return -5;
+ }
+
+ n = container_of(rb, struct node_data, r);
+ l = bpf_list_del(&head, &n->l);
+ bpf_spin_unlock(&lock);
+ bpf_obj_drop(n);
+ if (!l)
+ return -6;
+
+ bpf_obj_drop(container_of(l, struct node_data, l));
+ return 0;
+}
+
+SEC("tc")
+__description("list_del_test3: list_del accepts list_front return value as node")
+__success __retval(0)
+int list_del_test3(void *ctx)
+{
+ struct node_data *tmp;
+ struct bpf_list_node *bpf_node, *l;
+ long err = 0;
+
+ tmp = __add_in_list(&head, &lock);
+ if (!tmp)
+ return -1;
+
+ bpf_spin_lock(&lock);
+ bpf_node = bpf_list_front(&head);
+ if (!bpf_node) {
+ bpf_spin_unlock(&lock);
+ err = -2;
+ goto fail;
+ }
+
+ l = bpf_list_del(&head, bpf_node);
+ bpf_spin_unlock(&lock);
+ if (!l) {
+ err = -3;
+ goto fail;
+ }
+
+ bpf_obj_drop(container_of(l, struct node_data, l));
+ bpf_obj_drop(tmp);
+ return 0;
+
+fail:
+ bpf_obj_drop(tmp);
+ return err;
+}
+
+SEC("tc")
+__description("list_add_test1: insert new node after prev")
+__success __retval(0)
+int list_add_test1(void *ctx)
+{
+ struct node_data *node_first;
+ struct node_data *new_node;
+ long err = 0;
+
+ node_first = __add_in_list(&head, &lock);
+ if (!node_first)
+ return -1;
+
+ new_node = bpf_obj_new(typeof(*new_node));
+ if (!new_node) {
+ err = -2;
+ goto fail;
+ }
+
+ bpf_spin_lock(&lock);
+ err = bpf_list_add(&head, &new_node->l, &node_first->l);
+ bpf_spin_unlock(&lock);
+ if (err) {
+ err = -3;
+ goto fail;
+ }
+
+fail:
+ bpf_obj_drop(node_first);
+ return err;
+}
+
+SEC("tc")
+__description("list_add_test2: list_add accepts list_front return value as prev")
+__success __retval(0)
+int list_add_test2(void *ctx)
+{
+ struct node_data *new_node, *tmp;
+ struct bpf_list_node *bpf_node;
+ long err = 0;
+
+ tmp = __add_in_list(&head, &lock);
+ if (!tmp)
+ return -1;
+
+ new_node = bpf_obj_new(typeof(*new_node));
+ if (!new_node) {
+ err = -2;
+ goto fail;
+ }
+
+ bpf_spin_lock(&lock);
+ bpf_node = bpf_list_front(&head);
+ if (!bpf_node) {
+ bpf_spin_unlock(&lock);
+ bpf_obj_drop(new_node);
+ err = -3;
+ goto fail;
+ }
+
+ err = bpf_list_add(&head, &new_node->l, bpf_node);
+ bpf_spin_unlock(&lock);
+ if (err) {
+ err = -4;
+ goto fail;
+ }
+
+fail:
+ bpf_obj_drop(tmp);
+ return err;
+}
+
+struct uninit_head_val {
+ struct bpf_spin_lock lock;
+ struct bpf_list_head head __contains(node_data, l);
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct uninit_head_val);
+ __uint(max_entries, 1);
+} uninit_head_map SEC(".maps");
+
+SEC("tc")
+__description("list_push_back_uninit_head: push_back on 0-initialized list head")
+__success __retval(0)
+int list_push_back_uninit_head(void *ctx)
+{
+ struct uninit_head_val *st;
+ struct node_data *node;
+ int ret = -1, key = 0;
+
+ st = bpf_map_lookup_elem(&uninit_head_map, &key);
+ if (!st)
+ return -1;
+
+ node = bpf_obj_new(typeof(*node));
+ if (!node)
+ return -1;
+
+ bpf_spin_lock(&st->lock);
+ ret = bpf_list_push_back(&st->head, &node->l);
+ bpf_spin_unlock(&st->lock);
+
+ return ret;
+}
+
+SEC("?tc")
+__failure __msg("bpf_spin_lock at off=32 must be held for bpf_list_head")
+long list_del_without_lock_fail(void *ctx)
+{
+ struct node_data *n;
+ struct bpf_list_node *l;
+
+ n = bpf_obj_new(typeof(*n));
+ if (!n)
+ return -1;
+
+ /* Error case: delete list node without holding lock */
+ l = bpf_list_del(&head, &n->l);
+ bpf_obj_drop(n);
+ if (!l)
+ return -2;
+ bpf_obj_drop(container_of(l, struct node_data, l));
+
+ return 0;
+}
+
+SEC("?tc")
+__failure __msg("bpf_spin_lock at off=32 must be held for bpf_list_head")
+long list_add_without_lock_fail(void *ctx)
+{
+ struct node_data *n, *prev;
+
+ n = bpf_obj_new(typeof(*n));
+ if (!n)
+ return -1;
+
+ prev = bpf_obj_new(typeof(*prev));
+ if (!prev) {
+ bpf_obj_drop(n);
+ return -1;
+ }
+
+ /* Error case: add list node without holding lock */
+ if (bpf_list_add(&head, &n->l, &prev->l)) {
+ bpf_obj_drop(prev);
+ bpf_obj_drop(n);
+ return -2;
+ }
+
+ return 0;
+}
+
SEC("tc")
__success
long rbtree_refcounted_node_ref_escapes(void *ctx)
--
2.50.1 (Apple Git-155)
* Re: [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer
2026-04-27 16:58 ` [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer Kaitao cheng
@ 2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 13:52 ` Kaitao Cheng
0 siblings, 1 reply; 17+ messages in thread
From: bot+bpf-ci @ 2026-04-27 18:43 UTC (permalink / raw)
To: kaitao.cheng, martin.lau, ast, daniel, andrii, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah,
chengkaitao, linux-kselftest
Cc: bpf, linux-kernel, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
[ ... ]
> @@ -2575,12 +2577,16 @@ static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head,
>
> __bpf_kfunc struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head)
> {
> - return __bpf_list_del(head, false);
> + struct list_head *h = (void *)head;
> +
> + return __bpf_list_del(head, h->next);
> }
This relates to an unaddressed review comment from Emil Tsalapatis on v8.
The code reads h->next before __bpf_list_del checks if the list is
initialized. If the list is uninitialized (h->next is NULL), a NULL pointer
is passed to __bpf_list_del.
While the function will eventually initialize the list and return NULL without
dereferencing the NULL pointer:
kernel/bpf/helpers.c:__bpf_list_del() {
if (unlikely(!h->next)) {
INIT_LIST_HEAD(h);
return NULL;
}
...
}
this is counterintuitive - passing a potentially garbage value that is later
ignored.
Emil suggested moving the init check logic into the top-level kfuncs
(bpf_list_pop_front/bpf_list_pop_back) to ensure the list_head passed to
__bpf_list_del is always valid. You can also just init the list and return
NULL in that case - we know it's empty.
The same issue applies to bpf_list_pop_back() which reads h->prev before
the initialization check.
Original comment from Emil Tsalapatis (Thu, 19 Mar 2026 12:17:50 -0400):
https://lore.kernel.org/bpf/DH6W7K2SLPYB.PCO7FQ6CR1YZ@etsalapatis.com/
"The new code reads n _before_ we check if the list is initialized. So the n
we are passing from the caller may well be NULL. However, __bpf_list_del()
will in that case now a) call INIT_LIST_HEAD() to properly set up prev/next,
b) call list_empty() on the newly initialized list and exit without ever
reading the NULL passed by the caller.
This is kind of counterintuitive: We are passing essentially a garbage value
to __bpf_list_del that we thankfully end up ignoring. Can you move the init
check logic into the top-level kfuncs to make sure the list_head we're passing
to __bpf_list_del is always valid? You can also just init the list and return
NULL in that case - we know it's empty."
This issue remains unaddressed in v10.
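As a rough illustration of what Emil's suggestion would look like, here is a
minimal userspace sketch (plain C with illustrative names, not the kernel
implementation) that hoists the lazy-init check into the top-level pop helper
so the inner delete helper never receives a NULL node pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace sketch of the suggested refactor: the top-level pop helper
 * handles the "zeroed, not-yet-initialized head" case itself, so the
 * inner delete helper only ever sees a valid, initialized list. */

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
	h->next = h->prev = h;
}

static struct list_head *__list_del(struct list_head *h, struct list_head *n)
{
	if (n == h)		/* empty list: the "node" is the head itself */
		return NULL;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	return n;
}

static struct list_head *list_pop_front(struct list_head *h)
{
	if (!h->next) {		/* zeroed map value: lazily initialize */
		init_list_head(h);
		return NULL;	/* a freshly initialized list is empty */
	}
	return __list_del(h, h->next);
}
```

With this shape, the pop_back variant would do the same with h->prev, and the
inner delete helper would no longer need the !h->next branch at all.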
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25009536772
* Re: [PATCH bpf-next v10 2/8] bpf: clear list node owner and unlink before drop
2026-04-27 16:59 ` [PATCH bpf-next v10 2/8] bpf: clear list node owner and unlink before drop Kaitao cheng
@ 2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 14:08 ` Kaitao Cheng
0 siblings, 1 reply; 17+ messages in thread
From: bot+bpf-ci @ 2026-04-27 18:43 UTC (permalink / raw)
To: kaitao.cheng, martin.lau, ast, daniel, andrii, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah,
chengkaitao, linux-kselftest
Cc: bpf, linux-kernel, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
commit e4bda22d1a0e4a2ede811c0a776c6aba8b315b5a
Author: Kaitao Cheng <chengkaitao@kylinos.cn>
bpf: clear list node owner and unlink before drop
When draining a BPF list_head, this commit ensures each node's owner
pointer is cleared while holding the spinlock and unlinks each node before
calling __bpf_obj_drop_impl() to prevent invalid next-node references.
This commit appears to fix a bug but is missing a Fixes: tag. The issue
being addressed traces back to the introduction of bpf_list_head support in
map values. Should this commit include:
Fixes: f0c5941ff5b2 ("bpf: Support bpf_list_head in map values")
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25009536772
* Re: [PATCH bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node
2026-04-27 16:59 ` [PATCH bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node Kaitao cheng
@ 2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 14:18 ` Kaitao Cheng
0 siblings, 1 reply; 17+ messages in thread
From: bot+bpf-ci @ 2026-04-27 18:43 UTC (permalink / raw)
To: kaitao.cheng, martin.lau, ast, daniel, andrii, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah,
chengkaitao, linux-kselftest
Cc: bpf, linux-kernel, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2570,6 +2570,16 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
> return bpf_list_push_back(head, node, meta__ign, off);
> }
>
> +__bpf_kfunc int bpf_list_add(struct bpf_list_head *head, struct bpf_list_node *new,
> + struct bpf_list_node *prev, struct btf_struct_meta *meta,
> + u64 off)
> +{
> + struct bpf_list_node_kern *n = (void *)new, *p = (void *)prev;
> + struct list_head *prev_ptr = &p->list_head;
> +
> + return __bpf_list_add(n, head, &prev_ptr, meta ? meta->record : NULL, off);
> +}
> +
The 'prev' argument is described in the commit message as "must already be
in the list." How can a BPF program obtain such a reference and pass it to
this kfunc?
A node already in the list is tracked by the verifier as a non-owning
reference (PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF with ref_obj_id == 0).
The typical way to obtain this is via bpf_list_front() or bpf_list_back(),
which return non-owning references.
However, check_kfunc_args() in kernel/bpf/verifier.c handles
KF_ARG_PTR_TO_LIST_NODE by requiring:
if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
verbose(env, "%s expected pointer to allocated object\n", ...);
return -EINVAL;
}
if (!reg->ref_obj_id) {
verbose(env, "allocated object must be referenced\n");
return -EINVAL;
}
This check rejects non-owning references. The 'prev' argument in R3 is
typed as struct bpf_list_node *, so the verifier classifies it as
KF_ARG_PTR_TO_LIST_NODE and applies this validation.
Compare with KF_ARG_PTR_TO_RB_NODE, which distinguishes between insertion
(requires owning ref) and other operations (allows non-owning refs):
case KF_ARG_PTR_TO_RB_NODE:
if (meta->func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
/* owning ref required */
} else {
/* non-owning or refcounted allowed */
}
Without a similar distinction for KF_ARG_PTR_TO_LIST_NODE, is there any
valid BPF register state that satisfies both the verifier check and the
runtime semantics stated in the commit message?
It appears the API only works when 'prev' is an owning reference to a
refcounted object still in the list. This is the pattern used in the
selftests, but it limits the kfunc to a narrow use case that the commit
message does not describe.
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -10742,6 +10742,7 @@ enum special_kfunc_type {
> KF_bpf_list_push_front,
> KF_bpf_list_push_back_impl,
> KF_bpf_list_push_back,
> + KF_bpf_list_add,
> KF_bpf_list_pop_front,
> KF_bpf_list_pop_back,
> KF_bpf_list_del,
[ ... ]
> @@ -10923,7 +10925,8 @@ static bool is_bpf_list_push_kfunc(u32 func_id)
> return func_id == special_kfunc_list[KF_bpf_list_push_front] ||
> func_id == special_kfunc_list[KF_bpf_list_push_front_impl] ||
> func_id == special_kfunc_list[KF_bpf_list_push_back] ||
> - func_id == special_kfunc_list[KF_bpf_list_push_back_impl];
> + func_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
> + func_id == special_kfunc_list[KF_bpf_list_add];
> }
>
> static bool is_bpf_rbtree_add_kfunc(u32 func_id)
> @@ -19228,8 +19231,11 @@ struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) };
> int struct_meta_reg = BPF_REG_3;
> int node_offset_reg = BPF_REG_4;
>
> - /* rbtree_add has extra 'less' arg, so args-to-fixup are in diff regs */
> - if (is_bpf_rbtree_add_kfunc(desc->func_id)) {
> + /* list_add/rbtree_add have an extra arg (prev/less),
> + * so args-to-fixup are in diff regs.
> + */
> + if (desc->func_id == special_kfunc_list[KF_bpf_list_add] ||
> + is_bpf_rbtree_add_kfunc(desc->func_id)) {
> struct_meta_reg = BPF_REG_4;
> node_offset_reg = BPF_REG_5;
> }
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25009536772
* Re: [PATCH bpf-next v10 3/8] bpf: Introduce the bpf_list_del kfunc.
2026-04-27 16:59 ` [PATCH bpf-next v10 3/8] bpf: Introduce the bpf_list_del kfunc Kaitao cheng
@ 2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 14:32 ` Kaitao Cheng
0 siblings, 1 reply; 17+ messages in thread
From: bot+bpf-ci @ 2026-04-27 18:43 UTC (permalink / raw)
To: kaitao.cheng, martin.lau, ast, daniel, andrii, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah,
chengkaitao, linux-kselftest
Cc: bpf, linux-kernel, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 1e8754877dd1..51b6ea4bb8cb 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2593,6 +2593,15 @@ __bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
> return __bpf_list_del(head, h->prev);
> }
>
> +__bpf_kfunc struct bpf_list_node *bpf_list_del(struct bpf_list_head *head,
> + struct bpf_list_node *node)
> +{
> + struct bpf_list_node_kern *kn = (void *)node;
> +
> + /* verifier guarantees node is a list node rather than list head */
> + return __bpf_list_del(head, &kn->list_head);
> +}
Can this kfunc actually be called from any BPF program? Looking at the
verifier's handling of KF_ARG_PTR_TO_LIST_NODE arguments in
check_kfunc_args():
case KF_ARG_PTR_TO_LIST_NODE:
if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
verbose(env, "%s expected pointer to allocated object\n", ...);
return -EINVAL;
}
if (!reg->ref_obj_id) {
verbose(env, "allocated object must be referenced\n");
return -EINVAL;
}
This code path unconditionally requires an owning reference (ref_obj_id
must be non-zero).
However, once a node is added to the list via bpf_list_push_front/back,
the owning reference is converted to a non-owning reference
(ref_obj_id becomes 0) via ref_convert_owning_non_owning() and the
original owning reference is released. Any subsequent access to that
in-list node (including via bpf_list_front/back) only provides
non-owning references with ref_obj_id == 0.
This means:
- Passing a non-owning ref to bpf_list_del() fails the verifier's
"!reg->ref_obj_id" check with "allocated object must be referenced"
- Passing an owning ref to a node not yet in the list passes the
verifier check, but __bpf_list_del() will return NULL because
node_kern->owner != head
The sister kfunc bpf_rbtree_remove handles this by having a special
branch in check_kfunc_args() that accepts non-owning references for
that specific kfunc. Does bpf_list_del need similar handling?
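The mismatch between the two gates can be sketched in a few lines of
userspace C (hypothetical field names; this only mirrors the logic described
above, not the kernel structures):

```c
#include <assert.h>
#include <stddef.h>

struct model_node {
	void *owner;		/* runtime view: set to head on push */
	unsigned int ref_obj_id;/* verifier view: 0 means non-owning */
};

/* verifier gate, per the check_kfunc_args() excerpt above */
static int verifier_accepts(const struct model_node *n)
{
	return n->ref_obj_id != 0;
}

/* runtime gate, per __bpf_list_del()'s owner check */
static int runtime_removes(const struct model_node *n, void *head)
{
	return n->owner == head;
}
```

A fresh bpf_obj_new() node passes the verifier but fails at runtime
(owner != head); a plain in-list node passes at runtime but is rejected by
the verifier. Only the refcounted pattern, where an extra acquired reference
keeps ref_obj_id non-zero while the node sits in the list, satisfies both
gates, which matches what the selftests in patch 8/8 do.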
Looking at __bpf_list_del() in kernel/bpf/helpers.c:
node = container_of(n, struct bpf_list_node_kern, list_head);
if (unlikely(READ_ONCE(node->owner) != head))
return NULL;
Emil Tsalapatis and the Sashiko review bot raised concerns about
removing the WARN_ON_ONCE that was in v8 of this series. The original
WARN_ON_ONCE helped detect list corruption by triggering a kernel
warning when owner != head. The current unlikely() check silently
returns NULL instead.
Should this check trigger a warning for debugging list corruption issues?
Additionally, the Sashiko review noted a reuse bug: when
bpf_list_head_free() is called on a list that still contains elements,
those elements remain allocated but their owner field still points to
the now-freed head. Since the owner field is checked but never cleared,
could this lead to dangling pointer issues if nodes aren't freed
properly during list cleanup?
> @@ -4725,6 +4734,7 @@ BTF_ID_FLAGS(func, bpf_list_push_back, KF_IMPLICIT_ARGS)
> BTF_ID_FLAGS(func, bpf_list_push_back_impl)
> BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
> BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_list_del, KF_ACQUIRE | KF_RET_NULL)
> BTF_ID_FLAGS(func, bpf_list_front, KF_RET_NULL)
> BTF_ID_FLAGS(func, bpf_list_back, KF_RET_NULL)
> BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25009536772
* Re: [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer
2026-04-27 18:43 ` bot+bpf-ci
@ 2026-04-28 13:52 ` Kaitao Cheng
0 siblings, 0 replies; 17+ messages in thread
From: Kaitao Cheng @ 2026-04-28 13:52 UTC (permalink / raw)
To: bot+bpf-ci, emil
Cc: shuah, john.fastabend, yonghong.song, song, eddyz87, andrii, ast,
martin.lau, daniel, jolsa, haoluo, sdf, kpsingh, bpf,
linux-kernel, martin.lau, clm, ihor.solodrai, chengkaitao,
linux-kselftest
On 2026/4/28 02:43, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>> --- a/kernel/bpf/helpers.c
>> +++ b/kernel/bpf/helpers.c
>
> [ ... ]
>
>> @@ -2575,12 +2577,16 @@ static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head,
>>
>> __bpf_kfunc struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head)
>> {
>> - return __bpf_list_del(head, false);
>> + struct list_head *h = (void *)head;
>> +
>> + return __bpf_list_del(head, h->next);
>> }
>
> This addresses an unaddressed review comment from Emil Tsalapatis in v8.
> The code reads h->next before __bpf_list_del checks if the list is
> initialized. If the list is uninitialized (h->next is NULL), a NULL pointer
> is passed to __bpf_list_del.
>
> While the function will eventually initialize the list and return NULL without
> dereferencing the NULL pointer:
>
> kernel/bpf/helpers.c:__bpf_list_del() {
> if (unlikely(!h->next)) {
> INIT_LIST_HEAD(h);
> return NULL;
> }
> ...
> }
>
> this is counterintuitive - passing a potentially garbage value that is later
> ignored.
>
> Emil suggested moving the init check logic into the top-level kfuncs
> (bpf_list_pop_front/bpf_list_pop_back) to ensure the list_head passed to
> __bpf_list_del is always valid. You can also just init the list and return
> NULL in that case - we know it's empty.
>
> The same issue applies to bpf_list_pop_back() which reads h->prev before
> the initialization check.
>
> Original comment from Emil Tsalapatis (Thu, 19 Mar 2026 12:17:50 -0400):
> https://lore.kernel.org/bpf/DH6W7K2SLPYB.PCO7FQ6CR1YZ@etsalapatis.com/
>
> "The new code reads n _before_ we check if the list is initialized. So the n
> we are passing from the caller may well be NULL. However, __bpf_list_del()
> will in that case now a) call INIT_LIST_HEAD(() to properly set up prev/next,
> b) call list_empty() on the newly initialized list and exit without ever
> reading the NULL passed by the caller.
>
> This is kind of counterintuitive: We are passing essentially a garbage value
> to __bpf_list_del that we thankfully end upi ignoring. Can you move the init
> check logic into the top-level kfuncs to make sure the list_head we're passing
> to __bpf_list_del is always valid? You can also just init the list and return
> NULL in that case - we know it's empty."
>
> This issue remains unaddressed in v10.
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25009536772
The '__bpf_list_del' function is static, so its scope and the associated
risk are both very limited; its purpose is code reuse and avoiding
duplication.
If we moved the 'unlikely(!h->next)' check into the outer callers, I
believe it would bring no benefit beyond duplicating code. Also, validating
a function's parameters before acting on them is not counterintuitive; on
the contrary, for global functions it is almost a hard requirement.
--
Thanks
Kaitao Cheng
* Re: [PATCH bpf-next v10 2/8] bpf: clear list node owner and unlink before drop
2026-04-27 18:43 ` bot+bpf-ci
@ 2026-04-28 14:08 ` Kaitao Cheng
0 siblings, 0 replies; 17+ messages in thread
From: Kaitao Cheng @ 2026-04-28 14:08 UTC (permalink / raw)
To: bot+bpf-ci
Cc: martin.lau, john.fastabend, yonghong.song, song, eddyz87, daniel,
ast, linux-kselftest, chengkaitao, shuah, jolsa, haoluo, sdf,
kpsingh, bpf, linux-kernel, martin.lau, clm, ihor.solodrai,
andrii
On 2026/4/28 02:43, bot+bpf-ci@kernel.org wrote:
>
> commit e4bda22d1a0e4a2ede811c0a776c6aba8b315b5a
> Author: Kaitao Cheng <chengkaitao@kylinos.cn>
>
> bpf: clear list node owner and unlink before drop
>
> When draining a BPF list_head, this commit ensures each node's owner
> pointer is cleared while holding the spinlock and delinks each node before
> calling __bpf_obj_drop_impl() to prevent invalid next node references.
>
> This commit appears to fix a bug but is missing a Fixes: tag. The issue
> being addressed traces back to the introduction of bpf_list_head support in
> map values. Should this commit include:
>
> Fixes: f0c5941ff5b2 ("bpf: Support bpf_list_head in map values")
>
>
The issue only appears once the later patch "bpf: Introduce the
bpf_list_del kfunc" in this series is applied; kernels without it are
not affected. Therefore the "Fixes: f0c5941ff5b2" tag is not needed.
--
Thanks
Kaitao Cheng
* Re: [PATCH bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node
2026-04-27 18:43 ` bot+bpf-ci
@ 2026-04-28 14:18 ` Kaitao Cheng
0 siblings, 0 replies; 17+ messages in thread
From: Kaitao Cheng @ 2026-04-28 14:18 UTC (permalink / raw)
To: bot+bpf-ci
Cc: shuah, john.fastabend, yonghong.song, song, eddyz87, andrii,
daniel, ast, martin.lau, haoluo, sdf, kpsingh, bpf, linux-kernel,
martin.lau, clm, ihor.solodrai, jolsa, chengkaitao,
linux-kselftest
On 2026/4/28 02:43, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>> --- a/kernel/bpf/helpers.c
>> +++ b/kernel/bpf/helpers.c
>> @@ -2570,6 +2570,16 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
>> return bpf_list_push_back(head, node, meta__ign, off);
>> }
>>
>> +__bpf_kfunc int bpf_list_add(struct bpf_list_head *head, struct bpf_list_node *new,
>> + struct bpf_list_node *prev, struct btf_struct_meta *meta,
>> + u64 off)
>> +{
>> + struct bpf_list_node_kern *n = (void *)new, *p = (void *)prev;
>> + struct list_head *prev_ptr = &p->list_head;
>> +
>> + return __bpf_list_add(n, head, &prev_ptr, meta ? meta->record : NULL, off);
>> +}
>> +
>
> The 'prev' argument is described in the commit message as "must already be
> in the list." How can a BPF program obtain such a reference and pass it to
> this kfunc?
>
> A node already in the list is tracked by the verifier as a non-owning
> reference (PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF with ref_obj_id == 0).
> The typical way to obtain this is via bpf_list_front() or bpf_list_back(),
> which return non-owning references.
>
> However, check_kfunc_args() in kernel/bpf/verifier.c handles
> KF_ARG_PTR_TO_LIST_NODE by requiring:
>
> if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
> verbose(env, "%s expected pointer to allocated object\n", ...);
> return -EINVAL;
> }
> if (!reg->ref_obj_id) {
> verbose(env, "allocated object must be referenced\n");
> return -EINVAL;
> }
>
> This check rejects non-owning references. The 'prev' argument in R3 is
> typed as struct bpf_list_node *, so the verifier classifies it as
> KF_ARG_PTR_TO_LIST_NODE and applies this validation.
>
> Compare with KF_ARG_PTR_TO_RB_NODE, which distinguishes between insertion
> (requires owning ref) and other operations (allows non-owning refs):
>
> case KF_ARG_PTR_TO_RB_NODE:
> if (meta->func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
> /* owning ref required */
> } else {
> /* non-owning or refcounted allowed */
> }
>
> Without a similar distinction for KF_ARG_PTR_TO_LIST_NODE, is there any
> valid BPF register state that satisfies both the verifier check and the
> runtime semantics stated in the commit message?
>
> It appears the API only works when 'prev' is an owning reference to a
> refcounted object still in the list. This is the pattern used in the
> selftests, but it limits the kfunc to a narrow use case that the commit
> message does not describe.
I have already fixed this issue in [PATCH bpf-next v10 7/8].
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -10742,6 +10742,7 @@ enum special_kfunc_type {
>> KF_bpf_list_push_front,
>> KF_bpf_list_push_back_impl,
>> KF_bpf_list_push_back,
>> + KF_bpf_list_add,
>> KF_bpf_list_pop_front,
>> KF_bpf_list_pop_back,
>> KF_bpf_list_del,
>
> [ ... ]
>
>> @@ -10923,7 +10925,8 @@ static bool is_bpf_list_push_kfunc(u32 func_id)
>> return func_id == special_kfunc_list[KF_bpf_list_push_front] ||
>> func_id == special_kfunc_list[KF_bpf_list_push_front_impl] ||
>> func_id == special_kfunc_list[KF_bpf_list_push_back] ||
>> - func_id == special_kfunc_list[KF_bpf_list_push_back_impl];
>> + func_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
>> + func_id == special_kfunc_list[KF_bpf_list_add];
>> }
>>
>> static bool is_bpf_rbtree_add_kfunc(u32 func_id)
>> @@ -19228,8 +19231,11 @@ struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) };
>> int struct_meta_reg = BPF_REG_3;
>> int node_offset_reg = BPF_REG_4;
>>
>> - /* rbtree_add has extra 'less' arg, so args-to-fixup are in diff regs */
>> - if (is_bpf_rbtree_add_kfunc(desc->func_id)) {
>> + /* list_add/rbtree_add have an extra arg (prev/less),
>> + * so args-to-fixup are in diff regs.
>> + */
>> + if (desc->func_id == special_kfunc_list[KF_bpf_list_add] ||
>> + is_bpf_rbtree_add_kfunc(desc->func_id)) {
>> struct_meta_reg = BPF_REG_4;
>> node_offset_reg = BPF_REG_5;
>> }
>
>
--
Thanks
Kaitao Cheng
* Re: [PATCH bpf-next v10 3/8] bpf: Introduce the bpf_list_del kfunc.
2026-04-27 18:43 ` bot+bpf-ci
@ 2026-04-28 14:32 ` Kaitao Cheng
0 siblings, 0 replies; 17+ messages in thread
From: Kaitao Cheng @ 2026-04-28 14:32 UTC (permalink / raw)
To: bot+bpf-ci
Cc: martin.lau, john.fastabend, yonghong.song, song, eddyz87, andrii,
daniel, ast, linux-kselftest, chengkaitao, shuah, jolsa, haoluo,
sdf, kpsingh, bpf, linux-kernel, martin.lau, clm, ihor.solodrai
On 2026/4/28 02:43, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>> index 1e8754877dd1..51b6ea4bb8cb 100644
>> --- a/kernel/bpf/helpers.c
>> +++ b/kernel/bpf/helpers.c
>> @@ -2593,6 +2593,15 @@ __bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
>> return __bpf_list_del(head, h->prev);
>> }
>>
>> +__bpf_kfunc struct bpf_list_node *bpf_list_del(struct bpf_list_head *head,
>> + struct bpf_list_node *node)
>> +{
>> + struct bpf_list_node_kern *kn = (void *)node;
>> +
>> + /* verifier guarantees node is a list node rather than list head */
>> + return __bpf_list_del(head, &kn->list_head);
>> +}
>
> Can this kfunc actually be called from any BPF program? Looking at the
> verifier's handling of KF_ARG_PTR_TO_LIST_NODE arguments in
> check_kfunc_args():
>
> case KF_ARG_PTR_TO_LIST_NODE:
> if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
> verbose(env, "%s expected pointer to allocated object\n", ...);
> return -EINVAL;
> }
> if (!reg->ref_obj_id) {
> verbose(env, "allocated object must be referenced\n");
> return -EINVAL;
> }
>
> This code path unconditionally requires an owning reference (ref_obj_id
> must be non-zero).
>
> However, once a node is added to the list via bpf_list_push_front/back,
> the owning reference is converted to a non-owning reference
> (ref_obj_id becomes 0) via ref_convert_owning_non_owning() and the
> original owning reference is released. Any subsequent access to that
> in-list node (including via bpf_list_front/back) only provides
> non-owning references with ref_obj_id == 0.
>
> This means:
> - Passing a non-owning ref to bpf_list_del() fails the verifier's
> "!reg->ref_obj_id" check with "allocated object must be referenced"
> - Passing an owning ref to a node not yet in the list passes the
> verifier check, but __bpf_list_del() will return NULL because
> node_kern->owner != head
>
> The sister kfunc bpf_rbtree_remove handles this by having a special
> branch in check_kfunc_args() that accepts non-owning references for
> that specific kfunc. Does bpf_list_del need similar handling?
I have already fixed this issue in [PATCH bpf-next v10 7/8].
> Looking at __bpf_list_del() in kernel/bpf/helpers.c:
>
> node = container_of(n, struct bpf_list_node_kern, list_head);
> if (unlikely(READ_ONCE(node->owner) != head))
> return NULL;
>
> Emil Tsalapatis and the Sashiko review bot raised concerns about
> removing the WARN_ON_ONCE that was in v8 of this series. The original
> WARN_ON_ONCE helped detect list corruption by triggering a kernel
> warning when owner != head. The current unlikely() check silently
> returns NULL instead.
>
> Should this check trigger a warning for debugging list corruption issues?
In previous review rounds, the consensus was to avoid kernel warnings
on paths that a BPF program can trigger at will, so the owner mismatch
deliberately returns NULL without a WARN.
> Additionally, the Sashiko review noted a reuse bug: when
> bpf_list_head_free() is called on a list that still contains elements,
> those elements remain allocated but their owner field still points to
> the now-freed head. Since the owner field is checked but never cleared,
> could this lead to dangling pointer issues if nodes aren't freed
> properly during list cleanup?
I have already fixed this issue in [PATCH bpf-next v10 2/8].
>> @@ -4725,6 +4734,7 @@ BTF_ID_FLAGS(func, bpf_list_push_back, KF_IMPLICIT_ARGS)
>> BTF_ID_FLAGS(func, bpf_list_push_back_impl)
>> BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
>> BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
>> +BTF_ID_FLAGS(func, bpf_list_del, KF_ACQUIRE | KF_RET_NULL)
>> BTF_ID_FLAGS(func, bpf_list_front, KF_RET_NULL)
>> BTF_ID_FLAGS(func, bpf_list_back, KF_RET_NULL)
>> BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
>
> [ ... ]
>
>
--
Thanks
Kaitao Cheng
end of thread, other threads:[~2026-04-28 14:33 UTC | newest]
Thread overview: 17+ messages
-- links below jump to the message on this page --
2026-04-27 16:58 [PATCH bpf-next v10 0/8] bpf: Extend the bpf_list family of APIs Kaitao cheng
2026-04-27 16:58 ` [PATCH bpf-next v10 1/8] bpf: refactor __bpf_list_del to take list node pointer Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 13:52 ` Kaitao Cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 2/8] bpf: clear list node owner and unlink before drop Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 14:08 ` Kaitao Cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 3/8] bpf: Introduce the bpf_list_del kfunc Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 14:32 ` Kaitao Cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 4/8] bpf: refactor __bpf_list_add to take insertion point via **prev_ptr Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node Kaitao cheng
2026-04-27 18:43 ` bot+bpf-ci
2026-04-28 14:18 ` Kaitao Cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 6/8] bpf: add bpf_list_is_first/last/empty kfuncs Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 7/8] bpf: allow non-owning list-node args via __nonown_allowed Kaitao cheng
2026-04-27 16:59 ` [PATCH bpf-next v10 8/8] selftests/bpf: Add test cases for bpf_list_del/add/is_first/is_last/empty Kaitao cheng