From: Zijing Yin <yzjaurora@gmail.com>
To: bpf@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, ast@kernel.org,
daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, jolsa@kernel.org,
Zijing Yin <yzjaurora@gmail.com>
Subject: [PATCH bpf-next] bpf: reuseport: add cond_resched_rcu() in reuseport_array_free()
Date: Fri, 10 Apr 2026 07:07:25 -0700
Message-ID: <20260410140725.961623-1-yzjaurora@gmail.com>

reuseport_array_free() iterates over all map entries inside
rcu_read_lock() to detach sockets from the array. When max_entries is
very large (e.g., hundreds of millions), this loop runs for an extended
period without yielding the CPU, triggering RCU stall warnings in the
kworker thread that executes bpf_map_free_deferred().

The observed stall occurs because the loop has no scheduling point:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Workqueue: events_unbound bpf_map_free_deferred
Call Trace:
reuseport_array_free+0x1ec/0x470 kernel/bpf/reuseport_array.c:127
bpf_map_free_deferred+0x34a/0x7e0 kernel/bpf/syscall.c:893
process_one_work+0x952/0x1a80
worker_thread+0x87b/0x11f0

Add cond_resched_rcu() in the loop body to allow the scheduler to run
and RCU grace periods to complete. This is safe because each iteration
processes a single entry independently, sk->sk_callback_lock is not held
across the yield point, and the map is fully detached from userspace so
no concurrent insertions can occur.

This follows an established pattern for long-running kernel loops that
must run under rcu_read_lock(). The closest precedent is in another BPF
map free function, kernel/bpf/hashtab.c:1600
(htab_free_malloced_internal_structs()):

	rcu_read_lock();
	for (i = 0; i < htab->n_buckets; i++) {
		... walk bucket ...
		cond_resched_rcu();
	}
	rcu_read_unlock();

Fixes: 5dc4c4b7d4e8 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")
Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
---
Base: bpf-next.git master branch
(tip a0c584fc18056709c8e047a82a6045d6c209f4ce
"bpf: Fix use-after-free in offloaded map/prog info fill"
as of 2026-04-09).

Tested with CONFIG_PREEMPT_RCU=y, CONFIG_KASAN=y (inline),
CONFIG_SMP=n (single vCPU QEMU VM), gcc 13.3.0.

To reproduce: create a BPF_MAP_TYPE_REUSEPORT_SOCKARRAY with
max_entries >= 100M, set rcu_cpu_stall_timeout low, pin the CPU with a
SCHED_FIFO thread so the kworker stays in rcu_read_lock() long enough
to trip the stall timeout, then close the fd. Without the fix the
reuseport_array_free() kworker stalls RCU reliably; with the fix,
cond_resched_rcu() yields periodically and no stall is observed.
Reproducer (C source): repro_reuseport.c (https://pastebin.com/YjdwqdX1)

 kernel/bpf/reuseport_array.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index 49b8e5a0c6b4f..e3c789b80e2b8 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -5,6 +5,7 @@
#include <linux/bpf.h>
#include <linux/err.h>
#include <linux/sock_diag.h>
+#include <linux/rcupdate_wait.h>
#include <net/sock_reuseport.h>
#include <linux/btf_ids.h>
@@ -136,6 +137,7 @@ static void reuseport_array_free(struct bpf_map *map)
write_unlock_bh(&sk->sk_callback_lock);
RCU_INIT_POINTER(array->ptrs[i], NULL);
}
+ cond_resched_rcu();
}
rcu_read_unlock();
--
2.43.0

Thread overview: 2+ messages
2026-04-10 14:07 Zijing Yin [this message]
2026-04-10 19:53 ` [PATCH bpf-next] bpf: reuseport: add cond_resched_rcu() in reuseport_array_free() Alexei Starovoitov