* [PATCH v1 bpf] bpf: Free reuseport cBPF prog after RCU grace period.
@ 2026-04-24 23:52 Kuniyuki Iwashima
2026-04-25 0:26 ` bot+bpf-ci
0 siblings, 1 reply; 3+ messages in thread
From: Kuniyuki Iwashima @ 2026-04-24 23:52 UTC (permalink / raw)
To: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, bpf, netdev, Eulgyu Kim
Eulgyu Kim reported the splat below with a repro. [0]

The repro sets up a UDP reuseport group with a cBPF prog attached and
replaces it with a new one while another thread is sending a UDP
packet to the group.

A reuseport prog is freed by sk_reuseport_prog_free().  An eBPF prog
is destroyed in multiple stages via bpf_prog_put(), whereas a cBPF
prog is freed immediately by bpf_release_orig_filter() and
bpf_prog_free().

If a reuseport prog is detached from the setsockopt() path
(reuseport_attach_prog() or reuseport_detach_prog()),
sk_reuseport_prog_free() is called without waiting for RCU readers
to complete, resulting in use-after-free bugs like the one below.

Let's defer freeing the reuseport cBPF prog until one RCU grace
period has elapsed.

Note that an eBPF prog is safe as is, unless the fast path starts to
touch fields destroyed in bpf_prog_put_deferred() and
__bpf_prog_put_noref().
[0]:
BUG: KASAN: vmalloc-out-of-bounds in reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596
Read of size 4 at addr ffffc9000051e004 by task slowme/10208
CPU: 6 UID: 1000 PID: 10208 Comm: slowme Not tainted 7.0.0-geb7ac95ff75e #32 PREEMPT(full)
Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596
udp4_lib_lookup2+0x3bc/0x950 net/ipv4/udp.c:495
__udp4_lib_lookup+0x768/0xe20 net/ipv4/udp.c:723
__udp4_lib_lookup_skb+0x297/0x390 net/ipv4/udp.c:752
__udp4_lib_rcv+0x1312/0x2620 net/ipv4/udp.c:2752
ip_protocol_deliver_rcu+0x282/0x440 net/ipv4/ip_input.c:207
ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:241
NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
__netif_receive_skb_one_core net/core/dev.c:6181 [inline]
__netif_receive_skb net/core/dev.c:6294 [inline]
process_backlog+0xaa4/0x1960 net/core/dev.c:6645
__napi_poll+0xae/0x340 net/core/dev.c:7709
napi_poll net/core/dev.c:7772 [inline]
net_rx_action+0x5d7/0xf50 net/core/dev.c:7929
handle_softirqs+0x22b/0x870 kernel/softirq.c:622
do_softirq+0x76/0xd0 kernel/softirq.c:523
</IRQ>
<TASK>
__local_bh_enable_ip+0xf8/0x130 kernel/softirq.c:450
local_bh_enable include/linux/bottom_half.h:33 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:924 [inline]
__dev_queue_xmit+0x1dd7/0x3710 net/core/dev.c:4890
neigh_output include/net/neighbour.h:556 [inline]
ip_finish_output2+0xca9/0x1070 net/ipv4/ip_output.c:237
NF_HOOK_COND include/linux/netfilter.h:307 [inline]
ip_output+0x29f/0x450 net/ipv4/ip_output.c:438
ip_send_skb+0x45/0xc0 net/ipv4/ip_output.c:1508
udp_send_skb+0xb04/0x1510 net/ipv4/udp.c:1195
udp_sendmsg+0x1a71/0x2350 net/ipv4/udp.c:1485
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
__sys_sendto+0x554/0x680 net/socket.c:2206
__do_sys_sendto net/socket.c:2213 [inline]
__se_sys_sendto net/socket.c:2209 [inline]
__x64_sys_sendto+0xde/0x100 net/socket.c:2209
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x160/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x415a2d
Code: b3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6bc31e41e8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f6bc31e4cdc RCX: 0000000000415a2d
RDX: 0000000000000001 RSI: 00007f6bc31e421f RDI: 0000000000000003
RBP: 00007f6bc31e4240 R08: 00007f6bc31e4220 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000212 R12: 00007f6bc31e46c0
R13: ffffffffffffffb8 R14: 0000000000000000 R15: 00007ffc9b0d70b0
</TASK>
Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Reported-by: Eulgyu Kim <eulgyukim@snu.ac.kr>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/core/filter.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index bc96c18df4e0..dba4c9340bb7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1654,15 +1654,24 @@ int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
return err;
}
-void sk_reuseport_prog_free(struct bpf_prog *prog)
+static void sk_reuseport_prog_free_rcu(struct rcu_head *rcu)
+{
+ struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
+ struct bpf_prog *prog = aux->prog;
+
+ bpf_release_orig_filter(prog);
+ bpf_prog_free(prog);
+}
+
+void sk_reuseport_prog_free(struct bpf_prog *prog, bool wait_rcu)
{
if (!prog)
return;
- if (prog->type == BPF_PROG_TYPE_SK_REUSEPORT)
- bpf_prog_put(prog);
+ if (bpf_prog_was_classic(prog))
+ call_rcu(&prog->aux->rcu, sk_reuseport_prog_free_rcu);
else
- bpf_prog_destroy(prog);
+ bpf_prog_put(prog);
}
static inline int __bpf_try_make_writable(struct sk_buff *skb,
--
2.54.0.rc2.544.gc7ae2d5bb8-goog
* Re: [PATCH v1 bpf] bpf: Free reuseport cBPF prog after RCU grace period.
2026-04-24 23:52 [PATCH v1 bpf] bpf: Free reuseport cBPF prog after RCU grace period Kuniyuki Iwashima
@ 2026-04-25 0:26 ` bot+bpf-ci
2026-04-25 0:30 ` Kuniyuki Iwashima
0 siblings, 1 reply; 3+ messages in thread
From: bot+bpf-ci @ 2026-04-25 0:26 UTC (permalink / raw)
To: kuniyu, martin.lau, ast, daniel, andrii, eddyz87, memxor
Cc: kuniyu, kuni1840, bpf, netdev, eulgyukim, ast, andrii, daniel,
martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
> diff --git a/net/core/filter.c b/net/core/filter.c
> index bc96c18df4e0..dba4c9340bb7 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1654,15 +1654,24 @@ int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
> return err;
> }
>
> -void sk_reuseport_prog_free(struct bpf_prog *prog)
> +static void sk_reuseport_prog_free_rcu(struct rcu_head *rcu)
> +{
> + struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
> + struct bpf_prog *prog = aux->prog;
> +
> + bpf_release_orig_filter(prog);
> + bpf_prog_free(prog);
> +}
> +
> +void sk_reuseport_prog_free(struct bpf_prog *prog, bool wait_rcu)
> {
The function signature was changed to add a 'bool wait_rcu' parameter, but
the header file declaration in include/linux/filter.h and all three call
sites in net/core/sock_reuseport.c were not updated.
The three call sites are:
- reuseport_attach_prog() calling sk_reuseport_prog_free(old_prog)
- reuseport_free_rcu() calling sk_reuseport_prog_free(...)
- reuseport_detach_prog() calling sk_reuseport_prog_free(old_prog)
Will this compile? It looks like the callers will fail with 'too few
arguments to function sk_reuseport_prog_free'.
> if (!prog)
> return;
>
> - if (prog->type == BPF_PROG_TYPE_SK_REUSEPORT)
> - bpf_prog_put(prog);
> + if (bpf_prog_was_classic(prog))
> + call_rcu(&prog->aux->rcu, sk_reuseport_prog_free_rcu);
> else
> - bpf_prog_destroy(prog);
> + bpf_prog_put(prog);
> }
The 'wait_rcu' parameter is never used in the function body. The decision
to defer freeing via RCU is based entirely on bpf_prog_was_classic(prog).
What is the purpose of the wait_rcu parameter? The commit message explains
the goal is to defer freeing cBPF programs, but doesn't explain why a
per-call toggle would be needed. With the parameter ignored, callers
cannot control the behavior even if they wanted to.
>
> static inline int __bpf_try_make_writable(struct sk_buff *skb,
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24917377520
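For context, the updates the bot says are missing would presumably touch
the header declaration and the three call sites. A purely hypothetical
sketch (the author's staged change never appears in this thread, so these
lines are illustrative only, not the actual fix):

```c
/* include/linux/filter.h -- the declaration would need the new parameter */
void sk_reuseport_prog_free(struct bpf_prog *prog, bool wait_rcu);

/* net/core/sock_reuseport.c -- each caller would need an argument, e.g. */
sk_reuseport_prog_free(old_prog, true);
```

Whether wait_rcu should actually gate the call_rcu() in the function body
is the bot's second question, and it is left open in the thread.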
* Re: [PATCH v1 bpf] bpf: Free reuseport cBPF prog after RCU grace period.
2026-04-25 0:26 ` bot+bpf-ci
@ 2026-04-25 0:30 ` Kuniyuki Iwashima
0 siblings, 0 replies; 3+ messages in thread
From: Kuniyuki Iwashima @ 2026-04-25 0:30 UTC (permalink / raw)
To: bot+bpf-ci
Cc: martin.lau, ast, daniel, andrii, eddyz87, memxor, kuni1840, bpf,
netdev, eulgyukim, martin.lau, yonghong.song, clm, ihor.solodrai
On Fri, Apr 24, 2026 at 5:26 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index bc96c18df4e0..dba4c9340bb7 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -1654,15 +1654,24 @@ int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
> > return err;
> > }
> >
> > -void sk_reuseport_prog_free(struct bpf_prog *prog)
> > +static void sk_reuseport_prog_free_rcu(struct rcu_head *rcu)
> > +{
> > + struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
> > + struct bpf_prog *prog = aux->prog;
> > +
> > + bpf_release_orig_filter(prog);
> > + bpf_prog_free(prog);
> > +}
> > +
> > +void sk_reuseport_prog_free(struct bpf_prog *prog, bool wait_rcu)
> > {
>
> The function signature was changed to add a 'bool wait_rcu' parameter, but
Oops, sorry, I forgot to commit the staged change.
pw-bot: cr