From: Kaitao Cheng <kaitao.cheng@linux.dev>
To: ast@kernel.org, corbet@lwn.net, martin.lau@linux.dev,
	daniel@iogearbox.net, andrii@kernel.org, eddyz87@gmail.com,
	song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, shuah@kernel.org,
	chengkaitao@kylinos.cn, skhan@linuxfoundation.org,
	memxor@gmail.com
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, vmalik@redhat.com,
	linux-kselftest@vger.kernel.org
Subject: [PATCH bpf-next v11 2/8] bpf: clear list node owner and unlink before drop
Date: Thu, 21 May 2026 11:23:00 +0800
Message-ID: <20260521032306.97118-3-kaitao.cheng@linux.dev>
In-Reply-To: <20260521032306.97118-1-kaitao.cheng@linux.dev>

From: Kaitao Cheng <chengkaitao@kylinos.cn>

bpf_list_head_free() can leave a stale node->owner behind.  The problem
is only exposed once bpf_list_del() is available: callers can then pass
an arbitrary bpf_list_head and bpf_list_node pair, including a node that
is not actually linked to the supplied head, or a node that outlives its
original head because a refcount keeps it alive.  With only the
pop-style helpers this was not practically reachable; bpf_list_del()
widens the API surface.

A failure mode appears when bpf_list_head_free() runs while a program
still holds an independent refcount on a node (for example via
bpf_refcount_acquire()).  The list head value embedded in map memory can
go away while the node object survives.  If node->owner is left pointing
at the old head address until drop completes, that pointer becomes stale.
If a new bpf_list_head is later allocated at the same address and the
stale node is passed to bpf_list_del(), the owner comparison can succeed
even though the node is not really linked to the new head, and
list_del_init() will follow bogus next/prev pointers with the risk of
memory corruption.
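
The address-reuse hazard above can be modelled in plain userspace C.
This is only an illustrative sketch, not kernel code: struct model_node,
owner_matches(), head_free_keep_owner(), head_free_clear_owner() and
stale_owner_accepted() are hypothetical names, and a single stack slot
stands in for "a new bpf_list_head allocated at the same address".

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the names are hypothetical, not the kernel's. */
struct list_head { struct list_head *next, *prev; };
struct model_node { struct list_head link; void *owner; };

static void init_list(struct list_head *h) { h->next = h->prev = h; }

/* Models an owner comparison done before unlinking a node. */
static int owner_matches(const struct model_node *n,
			 const struct list_head *head)
{
	return n->owner == (const void *)head;
}

/* Modelled hazard: the head is reset, the node's owner is left alone. */
static void head_free_keep_owner(struct list_head *head)
{
	init_list(head);
}

/* Modelled fix: unlink the node and clear its owner as well. */
static void head_free_clear_owner(struct list_head *head,
				  struct model_node *n)
{
	init_list(&n->link);
	n->owner = NULL;
	init_list(head);
}

/* Returns 1 if a recycled head at the same address accepts the node. */
static int stale_owner_accepted(int clear_owner)
{
	struct list_head slot;	/* one address, reused by old and new head */
	struct model_node n;

	init_list(&slot);
	n.link.next = n.link.prev = &slot;	/* link n onto slot */
	slot.next = slot.prev = &n.link;
	n.owner = &slot;

	if (clear_owner)
		head_free_clear_owner(&slot, &n);
	else
		head_free_keep_owner(&slot);

	/* The "new" head living in slot is empty: n is not linked to it. */
	assert(slot.next == &slot);
	return owner_matches(&n, &slot);
}
```

In this model stale_owner_accepted(0) reports a match even though the
node is not on the new list, while stale_owner_accepted(1) does not.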

When draining a bpf_list_head, mark each node owner with BPF_PTR_POISON
under the map spinlock while moving it to a private drain list, then
list_del_init() the node and clear owner to NULL before calling
__bpf_obj_drop_impl().  Concurrent readers therefore never observe a
node that appears linked to a head while its list_head is inconsistent,
and surviving refcounted nodes never retain a stale non-NULL owner.
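
The two-phase drain described above can be sketched in userspace C as
follows.  This is a simplified model under stated assumptions, not the
patch itself: locking is reduced to comments, MODEL_POISON is a
hypothetical sentinel standing in for BPF_PTR_POISON, plain stores
replace WRITE_ONCE()/smp_store_release(), and model_push() and
model_drain() are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the names are hypothetical, not the kernel's. */
struct list_head { struct list_head *next, *prev; };
struct model_node { struct list_head link; void *owner; };

static char poison_obj;
#define MODEL_POISON ((void *)&poison_obj)	/* stand-in sentinel */

static void init_list(struct list_head *h) { h->next = h->prev = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_unlink(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list(e);
}

static void list_append(struct list_head *e, struct list_head *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

static struct model_node *node_of(struct list_head *pos)
{
	return (struct model_node *)((char *)pos -
				     offsetof(struct model_node, link));
}

static void model_push(struct list_head *head, struct model_node *n)
{
	list_append(&n->link, head);
	n->owner = head;
}

/* Drains head in two phases and returns the number of nodes drained. */
static int model_drain(struct list_head *head)
{
	struct list_head drain, *pos, *next;
	int drained = 0;

	init_list(&drain);

	/* Phase 1 (lock would be held): poison owner, move to drain list. */
	for (pos = head->next; pos != head; pos = next) {
		next = pos->next;
		node_of(pos)->owner = MODEL_POISON;
		list_unlink(pos);
		list_append(pos, &drain);
	}
	init_list(head);

	/* Phase 2 (lock released): unlink, clear owner, then drop. */
	while (!list_is_empty(&drain)) {
		struct model_node *n;

		pos = drain.next;
		n = node_of(pos);
		assert(n->owner == MODEL_POISON);
		list_unlink(pos);
		n->owner = NULL;
		drained++;
		/* The real code would drop the object reference here. */
	}
	return drained;
}
```

After model_drain() returns, every surviving node is self-linked with a
NULL owner, so a later owner comparison against any head fails.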

Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
---
 kernel/bpf/helpers.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 094457c3e6d3..59855b434f0b 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2247,10 +2247,11 @@ EXPORT_SYMBOL_GPL(bpf_base_func_proto);
 void bpf_list_head_free(const struct btf_field *field, void *list_head,
 			struct bpf_spin_lock *spin_lock)
 {
-	struct list_head *head = list_head, *orig_head = list_head;
+	struct list_head *head = list_head, drain, *pos, *n;
 
 	BUILD_BUG_ON(sizeof(struct list_head) > sizeof(struct bpf_list_head));
 	BUILD_BUG_ON(__alignof__(struct list_head) > __alignof__(struct bpf_list_head));
+	INIT_LIST_HEAD(&drain);
 
 	/* Do the actual list draining outside the lock to not hold the lock for
 	 * too long, and also prevent deadlocks if tracing programs end up
@@ -2261,20 +2262,30 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head,
 	__bpf_spin_lock_irqsave(spin_lock);
 	if (!head->next || list_empty(head))
 		goto unlock;
-	head = head->next;
+	list_for_each_safe(pos, n, head) {
+		struct bpf_list_node_kern *node;
+
+		node = container_of(pos, struct bpf_list_node_kern, list_head);
+		WRITE_ONCE(node->owner, BPF_PTR_POISON);
+		list_move_tail(pos, &drain);
+	}
 unlock:
-	INIT_LIST_HEAD(orig_head);
+	INIT_LIST_HEAD(head);
 	__bpf_spin_unlock_irqrestore(spin_lock);
 
-	while (head != orig_head) {
-		void *obj = head;
+	while (!list_empty(&drain)) {
+		struct bpf_list_node_kern *node;
 
-		obj -= field->graph_root.node_offset;
-		head = head->next;
+		pos = drain.next;
+		node = container_of(pos, struct bpf_list_node_kern, list_head);
+		list_del_init(pos);
+		/* Ensure __bpf_list_add() sees the node as unlinked. */
+		smp_store_release(&node->owner, NULL);
 		/* The contained type can also have resources, including a
 		 * bpf_list_head which needs to be freed.
 		 */
-		__bpf_obj_drop_impl(obj, field->graph_root.value_rec, false);
+		__bpf_obj_drop_impl((char *)pos - field->graph_root.node_offset,
+				    field->graph_root.value_rec, false);
 	}
 }
 
-- 
2.50.1 (Apple Git-155)

