* [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks
@ 2026-06-23 18:03 Michal Luczaj
2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj
0 siblings, 2 replies; 9+ messages in thread
From: Michal Luczaj @ 2026-06-23 18:03 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko,
Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau,
Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan
Cc: netdev, bpf, linux-kernel, linux-kselftest, Michal Luczaj
Fix for UDP sockets refcount asymmetry in sockmap lookup/release.
Accompanied by a selftest.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Michal Luczaj (2):
bpf, sockmap: Don't leak UDP socks on lookup-bind-release
selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release
net/ipv4/udp_bpf.c | 3 ++
.../selftests/bpf/prog_tests/sockmap_basic.c | 50 ++++++++++++++++++++++
.../bpf/progs/test_sockmap_lookup_bind_release.c | 37 ++++++++++++++++
3 files changed, 90 insertions(+)
---
base-commit: 12091470c6b4c1c14b2de12dcbae2ada6cb6d20b
change-id: 20260617-sockmap-lookup-udp-leak-bc4e5c5481d7
Best regards,
--
Michal Luczaj <mhal@rbox.co>
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release 2026-06-23 18:03 [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks Michal Luczaj @ 2026-06-23 18:03 ` Michal Luczaj 2026-06-23 21:19 ` Emil Tsalapatis ` (2 more replies) 2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj 1 sibling, 3 replies; 9+ messages in thread From: Michal Luczaj @ 2026-06-23 18:03 UTC (permalink / raw) To: John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan Cc: netdev, bpf, linux-kernel, linux-kselftest, Michal Luczaj UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false. Because sockmap accepts unbound UDP sockets, a BPF program can increment a socket's refcount via lookup. If the socket is subsequently bound, the transition from unbound to bound causes bpf_sk_release() to skip the decrement of the refcount, causing a memory leak. unreferenced object 0xffff88810bc2eb40 (size 1984): comm "test_progs", pid 2451, jiffies 4295320596 hex dump (first 32 bytes): 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................ 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace (crc bdee079d): kmem_cache_alloc_noprof+0x557/0x660 sk_prot_alloc+0x69/0x240 sk_alloc+0x30/0x460 inet_create+0x2ce/0xf80 __sock_create+0x25b/0x5c0 __sys_socket+0x119/0x1d0 __x64_sys_socket+0x72/0xd0 do_syscall_64+0xa1/0x5f0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Maintain balanced refcounts across sk lookup/release: (re-)set SOCK_RCU_FREE on proto update to treat the socket (whether bound or unbound) as not requiring a refcount increment on (a RCU protected) lookup. Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets") Signed-off-by: Michal Luczaj <mhal@rbox.co> --- Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed sockets in bpf_sk_assign"). --- net/ipv4/udp_bpf.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c index ad57c4c9eaab..970327b59582 100644 --- a/net/ipv4/udp_bpf.c +++ b/net/ipv4/udp_bpf.c @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore) if (sk->sk_family == AF_INET6) udp_bpf_check_v6_needs_rebuild(psock->sk_proto); + /* Treat all sockets as non-refcounted, regardless of binding state. */ + sock_set_flag(sk, SOCK_RCU_FREE); + sock_replace_proto(sk, &udp_bpf_prots[family]); return 0; } -- 2.54.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release 2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj @ 2026-06-23 21:19 ` Emil Tsalapatis 2026-06-24 1:36 ` Jiayuan Chen 2026-06-24 13:36 ` Jakub Sitnicki 2 siblings, 0 replies; 9+ messages in thread From: Emil Tsalapatis @ 2026-06-23 21:19 UTC (permalink / raw) To: Michal Luczaj, John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan Cc: netdev, bpf, linux-kernel, linux-kselftest On Tue Jun 23, 2026 at 2:03 PM EDT, Michal Luczaj wrote: > UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means > sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false. > > Because sockmap accepts unbound UDP sockets, a BPF program can increment a > socket's refcount via lookup. If the socket is subsequently bound, the > transition from unbound to bound causes bpf_sk_release() to skip the > decrement of the refcount, causing a memory leak. > > unreferenced object 0xffff88810bc2eb40 (size 1984): > comm "test_progs", pid 2451, jiffies 4295320596 > hex dump (first 32 bytes): > 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................ > 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ > backtrace (crc bdee079d): > kmem_cache_alloc_noprof+0x557/0x660 > sk_prot_alloc+0x69/0x240 > sk_alloc+0x30/0x460 > inet_create+0x2ce/0xf80 > __sock_create+0x25b/0x5c0 > __sys_socket+0x119/0x1d0 > __x64_sys_socket+0x72/0xd0 > do_syscall_64+0xa1/0x5f0 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Maintain balanced refcounts across sk lookup/release: (re-)set > SOCK_RCU_FREE on proto update to treat the socket (whether bound or > unbound) as not requiring a refcount increment on (a RCU protected) lookup. > > Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets") > Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> > --- > Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed > sockets in bpf_sk_assign"). > --- > net/ipv4/udp_bpf.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c > index ad57c4c9eaab..970327b59582 100644 > --- a/net/ipv4/udp_bpf.c > +++ b/net/ipv4/udp_bpf.c > @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore) > if (sk->sk_family == AF_INET6) > udp_bpf_check_v6_needs_rebuild(psock->sk_proto); > > + /* Treat all sockets as non-refcounted, regardless of binding state. */ > + sock_set_flag(sk, SOCK_RCU_FREE); > + > sock_replace_proto(sk, &udp_bpf_prots[family]); > return 0; > } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release 2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj 2026-06-23 21:19 ` Emil Tsalapatis @ 2026-06-24 1:36 ` Jiayuan Chen 2026-06-24 13:36 ` Jakub Sitnicki 2 siblings, 0 replies; 9+ messages in thread From: Jiayuan Chen @ 2026-06-24 1:36 UTC (permalink / raw) To: Michal Luczaj, John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan Cc: netdev, bpf, linux-kernel, linux-kselftest On 6/24/26 2:03 AM, Michal Luczaj wrote: > UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means > sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false. > > Because sockmap accepts unbound UDP sockets, a BPF program can increment a > socket's refcount via lookup. If the socket is subsequently bound, the > transition from unbound to bound causes bpf_sk_release() to skip the > decrement of the refcount, causing a memory leak. > > unreferenced object 0xffff88810bc2eb40 (size 1984): > comm "test_progs", pid 2451, jiffies 4295320596 > hex dump (first 32 bytes): > 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................ > 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ > backtrace (crc bdee079d): > kmem_cache_alloc_noprof+0x557/0x660 > sk_prot_alloc+0x69/0x240 > sk_alloc+0x30/0x460 > inet_create+0x2ce/0xf80 > __sock_create+0x25b/0x5c0 > __sys_socket+0x119/0x1d0 > __x64_sys_socket+0x72/0xd0 > do_syscall_64+0xa1/0x5f0 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Maintain balanced refcounts across sk lookup/release: (re-)set > SOCK_RCU_FREE on proto update to treat the socket (whether bound or > unbound) as not requiring a refcount increment on (a RCU protected) lookup. > > Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets") > Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release 2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj 2026-06-23 21:19 ` Emil Tsalapatis 2026-06-24 1:36 ` Jiayuan Chen @ 2026-06-24 13:36 ` Jakub Sitnicki 2 siblings, 0 replies; 9+ messages in thread From: Jakub Sitnicki @ 2026-06-24 13:36 UTC (permalink / raw) To: Michal Luczaj, Willem de Bruijn Cc: John Fastabend, Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan, netdev, bpf, linux-kernel, linux-kselftest On Tue, Jun 23, 2026 at 08:03 PM +02, Michal Luczaj wrote: > UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means > sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false. > > Because sockmap accepts unbound UDP sockets, a BPF program can increment a > socket's refcount via lookup. If the socket is subsequently bound, the > transition from unbound to bound causes bpf_sk_release() to skip the > decrement of the refcount, causing a memory leak. > > unreferenced object 0xffff88810bc2eb40 (size 1984): > comm "test_progs", pid 2451, jiffies 4295320596 > hex dump (first 32 bytes): > 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................ > 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ > backtrace (crc bdee079d): > kmem_cache_alloc_noprof+0x557/0x660 > sk_prot_alloc+0x69/0x240 > sk_alloc+0x30/0x460 > inet_create+0x2ce/0xf80 > __sock_create+0x25b/0x5c0 > __sys_socket+0x119/0x1d0 > __x64_sys_socket+0x72/0xd0 > do_syscall_64+0xa1/0x5f0 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Maintain balanced refcounts across sk lookup/release: (re-)set > SOCK_RCU_FREE on proto update to treat the socket (whether bound or > unbound) as not requiring a refcount increment on (a RCU protected) lookup. > > Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets") > Signed-off-by: Michal Luczaj <mhal@rbox.co> > --- > Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed > sockets in bpf_sk_assign"). > --- > net/ipv4/udp_bpf.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c > index ad57c4c9eaab..970327b59582 100644 > --- a/net/ipv4/udp_bpf.c > +++ b/net/ipv4/udp_bpf.c > @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore) > if (sk->sk_family == AF_INET6) > udp_bpf_check_v6_needs_rebuild(psock->sk_proto); > > + /* Treat all sockets as non-refcounted, regardless of binding state. */ > + sock_set_flag(sk, SOCK_RCU_FREE); > + > sock_replace_proto(sk, &udp_bpf_prots[family]); > return 0; > } There is a side effect that an unhashed (unbound) UDP socket can now be selected in sk_lookup with bpf_sk_assign. Though perhaps that's for the better because TC bpf_sk_assign doesn't reject non-refcounted UDP sockets either, so we would have both socket dispatch sites behave the same way. Also, with this patch, if we insert & remove an unhashed UDP socket into/from a sockmap, we end up with an unhashed non-refcounted UDP socket. Not entirely sure if that is actually a problem or not. Willem, what is your take on having unhashed non-refcoted UDP sockets? ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release 2026-06-23 18:03 [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks Michal Luczaj 2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj @ 2026-06-23 18:03 ` Michal Luczaj 2026-06-23 19:00 ` sashiko-bot 2026-06-23 19:32 ` bot+bpf-ci 1 sibling, 2 replies; 9+ messages in thread From: Michal Luczaj @ 2026-06-23 18:03 UTC (permalink / raw) To: John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan Cc: netdev, bpf, linux-kernel, linux-kselftest, Michal Luczaj Setup and join a cgroup, then attach a cgroup/connect4 program that runs sk = bpf_map_lookup_elem(sockmap, 0) bpf_bind(ctx, sa, sizeof(sa)) bpf_sk_release(sk) Unpatched kernel leaks the socket. Signed-off-by: Michal Luczaj <mhal@rbox.co> --- .../selftests/bpf/prog_tests/sockmap_basic.c | 50 ++++++++++++++++++++++ .../bpf/progs/test_sockmap_lookup_bind_release.c | 37 ++++++++++++++++ 2 files changed, 87 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index cb3229711f93..11972ffdb16e 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -7,6 +7,7 @@ #include "test_progs.h" #include "test_skmsg_load_helpers.skel.h" +#include "test_sockmap_lookup_bind_release.skel.h" #include "test_sockmap_update.skel.h" #include "test_sockmap_invalid_update.skel.h" #include "test_sockmap_skb_verdict_attach.skel.h" @@ -17,6 +18,7 @@ #include "test_sockmap_msg_pop_data.skel.h" #include "bpf_iter_sockmap.skel.h" +#include "cgroup_helpers.h" #include "sockmap_helpers.h" #define TCP_REPAIR 19 /* TCP sock is under repair right now */ @@ -1373,6 +1375,52 @@ static void test_sockmap_multi_channels(int sotype) test_sockmap_pass_prog__destroy(skel); } +#define LOOKUP_BIND_RELEASE_CG "/sockmap_lookup-bind-release" +#define LOOKUP_BIND_RELEASE_REP 64 + +static void test_sockmap_lookup_bind_release(void) +{ + struct test_sockmap_lookup_bind_release *skel; + struct sockaddr_in sa; + int cg, i; + + cg = cgroup_setup_and_join(LOOKUP_BIND_RELEASE_CG); + if (!ASSERT_OK_FD(cg, "cgroup_setup_and_join")) + return; + + skel = test_sockmap_lookup_bind_release__open_and_load(); + if (!ASSERT_OK_PTR(skel, "open_and_load")) + goto cleanup; + + skel->links.connect = bpf_program__attach_cgroup(skel->progs.connect, cg); + if (!ASSERT_OK_PTR(skel->links.connect, "attach_cgroup")) + goto destroy; + + sa.sin_family = AF_INET; + sa.sin_port = bpf_htons(1234); + sa.sin_addr.s_addr = bpf_htonl(INADDR_LOOPBACK); + + for (i = 0; i < LOOKUP_BIND_RELEASE_REP; ++i) { + __close_fd int sk; + + sk = xsocket(AF_INET, SOCK_DGRAM, 0); + if (sk < 0) + break; + + if (xbpf_map_update_elem(bpf_map__fd(skel->maps.sockmap), &u32(0), + &sk, BPF_ANY)) + break; + + if (xconnect(sk, (struct sockaddr *)&sa, sizeof(sa))) + break; + } + +destroy: + test_sockmap_lookup_bind_release__destroy(skel); +cleanup: + cleanup_cgroup_environment(); +} + void test_sockmap_basic(void) { if (test__start_subtest("sockmap create_update_free")) @@ -1451,4 +1499,6 @@ void test_sockmap_basic(void) test_sockmap_multi_channels(SOCK_STREAM); if (test__start_subtest("sockmap udp multi channels")) test_sockmap_multi_channels(SOCK_DGRAM); + if (test__start_subtest("sockmap lookup-bind-release")) + test_sockmap_lookup_bind_release(); } diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c new file mode 100644 index 000000000000..cc77b193893b --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c @@ -0,0 +1,37 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/bpf.h> +#include <linux/in.h> +#include <bpf/bpf_helpers.h> + +struct { + __uint(type, BPF_MAP_TYPE_SOCKMAP); + __uint(max_entries, 1); + __type(key, int); + __type(value, int); +} sockmap SEC(".maps"); + +SEC("cgroup/connect4") +int connect(struct bpf_sock_addr *ctx) +{ + struct bpf_sock *sk; + int ret = SK_DROP; + + sk = bpf_map_lookup_elem(&sockmap, &(int){0}); + if (sk) { + if (sk == ctx->sk) { + struct sockaddr_in sa = { + .sin_family = ctx->user_family, + .sin_port = ctx->user_port, + .sin_addr.s_addr = ctx->user_ip4 + }; + + ret = !bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)); + } + + bpf_sk_release(sk); + } + + return ret; +} + +char _license[] SEC("license") = "GPL"; -- 2.54.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release 2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj @ 2026-06-23 19:00 ` sashiko-bot 2026-06-23 21:18 ` Emil Tsalapatis 2026-06-23 19:32 ` bot+bpf-ci 1 sibling, 1 reply; 9+ messages in thread From: sashiko-bot @ 2026-06-23 19:00 UTC (permalink / raw) To: Michal Luczaj; +Cc: bpf Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider: New issues: - [Medium] The cgroup file descriptor `cg` is leaked because it is never closed in `test_sockmap_lookup_bind_release()`. - [Low] The `struct sockaddr_in sa` is passed to the kernel via `xconnect` with uninitialized bytes in its `sin_zero` padding. Pre-existing issues: - [High] Calling `bpf_bind()` between `bpf_map_lookup_elem()` and `bpf_sk_release()` on a UDP socket mutates the socket's `SOCK_RCU_FREE` state, causing `bpf_sk_release()` to silently skip decrementing the reference count. -- commit f32ef88a53a5c519bf4b22a337fa782c27f0ecda Author: Michal Luczaj <mhal@rbox.co> selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release This patch adds a selftest to reproduce a UDP socket leak. It sets up a cgroup, attaches a connect4 BPF program, and triggers a sequence of socket operations (lookup, bind, release) that leaks the socket on an unpatched kernel. > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c > index cb3229711f93a..11972ffdb16ed 100644 > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c [ ... ] > @@ -1373,6 +1375,52 @@ static void test_sockmap_multi_channels(int sotype) > test_sockmap_pass_prog__destroy(skel); > } > > +#define LOOKUP_BIND_RELEASE_CG "/sockmap_lookup-bind-release" > +#define LOOKUP_BIND_RELEASE_REP 64 > + > +static void test_sockmap_lookup_bind_release(void) > +{ > + struct test_sockmap_lookup_bind_release *skel; > + struct sockaddr_in sa; > + int cg, i; > + > + cg = cgroup_setup_and_join(LOOKUP_BIND_RELEASE_CG); [Severity: Medium] Does this leak the cgroup file descriptor? The fd is acquired here and stored in cg, but close(cg) doesn't appear to be called on either the success or error paths before the function returns. The cleanup_cgroup_environment() function cleans up directories but doesn't close the test's held file descriptors. > + if (!ASSERT_OK_FD(cg, "cgroup_setup_and_join")) > + return; > + > + skel = test_sockmap_lookup_bind_release__open_and_load(); > + if (!ASSERT_OK_PTR(skel, "open_and_load")) > + goto cleanup; > + > + skel->links.connect = bpf_program__attach_cgroup(skel->progs.connect, cg); > + if (!ASSERT_OK_PTR(skel->links.connect, "attach_cgroup")) > + goto destroy; > + > + sa.sin_family = AF_INET; > + sa.sin_port = bpf_htons(1234); > + sa.sin_addr.s_addr = bpf_htonl(INADDR_LOOPBACK); [Severity: Low] Does this leave the sin_zero padding of the sa structure uninitialized? It looks like this partially initializes the struct on the stack. [ ... ] > + if (xbpf_map_update_elem(bpf_map__fd(skel->maps.sockmap), &u32(0), > + &sk, BPF_ANY)) > + break; > + > + if (xconnect(sk, (struct sockaddr *)&sa, sizeof(sa))) When sa is passed to xconnect() here, can it leak the uninitialized padding bytes from the userspace stack to the kernel? Would it be better to fully initialize it using "struct sockaddr_in sa = {};" or explicitly zero it? > + break; > + } > + > +destroy: > + test_sockmap_lookup_bind_release__destroy(skel); > +cleanup: > + cleanup_cgroup_environment(); > +} [ ... ] > diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c > new file mode 100644 > index 0000000000000..cc77b193893bd > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c > @@ -0,0 +1,37 @@ [ ... ] > +SEC("cgroup/connect4") > +int connect(struct bpf_sock_addr *ctx) > +{ > + struct bpf_sock *sk; > + int ret = SK_DROP; > + > + sk = bpf_map_lookup_elem(&sockmap, &(int){0}); > + if (sk) { > + if (sk == ctx->sk) { > + struct sockaddr_in sa = { > + .sin_family = ctx->user_family, > + .sin_port = ctx->user_port, > + .sin_addr.s_addr = ctx->user_ip4 > + }; > + > + ret = !bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)); > + } > + > + bpf_sk_release(sk); [Severity: High] This is a pre-existing issue, but does calling bpf_bind() here mutate the SOCK_RCU_FREE state and cause bpf_sk_release() to skip decrementing the reference count? If bpf_map_lookup_elem() looks up a newly created UDP socket, SOCK_RCU_FREE is not yet set, so sk_is_refcounted() evaluates to true and the reference count is incremented. Then bpf_bind() resolves to udp_v4_get_port(), which dynamically sets the SOCK_RCU_FREE flag on the socket. When bpf_sk_release() is called, it evaluates sk_is_refcounted() again. Because SOCK_RCU_FREE is now set, it evaluates to false, causing bpf_sk_release() to skip the sock_gen_put() call and permanently leak the socket reference. > + } > + > + return ret; > +} -- Sashiko AI review · https://sashiko.dev/#/patchset/20260623-sockmap-lookup-udp-leak-v1-0-05804f9308e4@rbox.co?part=2 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release 2026-06-23 19:00 ` sashiko-bot @ 2026-06-23 21:18 ` Emil Tsalapatis 0 siblings, 0 replies; 9+ messages in thread From: Emil Tsalapatis @ 2026-06-23 21:18 UTC (permalink / raw) To: sashiko-reviews, Michal Luczaj; +Cc: bpf On Tue Jun 23, 2026 at 3:00 PM EDT, sashiko-bot wrote: > Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider: > > New issues: > - [Medium] The cgroup file descriptor `cg` is leaked because it is never closed in `test_sockmap_lookup_bind_release()`. > - [Low] The `struct sockaddr_in sa` is passed to the kernel via `xconnect` with uninitialized bytes in its `sin_zero` padding. > > Pre-existing issues: > - [High] Calling `bpf_bind()` between `bpf_map_lookup_elem()` and `bpf_sk_release()` on a UDP socket mutates the socket's `SOCK_RCU_FREE` state, causing `bpf_sk_release()` to silently skip decrementing the reference count. > -- > > commit f32ef88a53a5c519bf4b22a337fa782c27f0ecda > Author: Michal Luczaj <mhal@rbox.co> > > selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release > > This patch adds a selftest to reproduce a UDP socket leak. It sets up a cgroup, > attaches a connect4 BPF program, and triggers a sequence of socket operations > (lookup, bind, release) that leaks the socket on an unpatched kernel. > >> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c >> index cb3229711f93a..11972ffdb16ed 100644 >> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c >> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c > > [ ... ] > >> @@ -1373,6 +1375,52 @@ static void test_sockmap_multi_channels(int sotype) >> test_sockmap_pass_prog__destroy(skel); >> } >> >> +#define LOOKUP_BIND_RELEASE_CG "/sockmap_lookup-bind-release" >> +#define LOOKUP_BIND_RELEASE_REP 64 >> + >> +static void test_sockmap_lookup_bind_release(void) >> +{ >> + struct test_sockmap_lookup_bind_release *skel; >> + struct sockaddr_in sa; >> + int cg, i; >> + >> + cg = cgroup_setup_and_join(LOOKUP_BIND_RELEASE_CG); > > [Severity: Medium] > Does this leak the cgroup file descriptor? > > The fd is acquired here and stored in cg, but close(cg) doesn't appear to be > called on either the success or error paths before the function returns. The > cleanup_cgroup_environment() function cleans up directories but doesn't close > the test's held file descriptors. > >> + if (!ASSERT_OK_FD(cg, "cgroup_setup_and_join")) >> + return; >> + >> + skel = test_sockmap_lookup_bind_release__open_and_load(); >> + if (!ASSERT_OK_PTR(skel, "open_and_load")) >> + goto cleanup; >> + >> + skel->links.connect = bpf_program__attach_cgroup(skel->progs.connect, cg); >> + if (!ASSERT_OK_PTR(skel->links.connect, "attach_cgroup")) >> + goto destroy; >> + >> + sa.sin_family = AF_INET; >> + sa.sin_port = bpf_htons(1234); >> + sa.sin_addr.s_addr = bpf_htonl(INADDR_LOOPBACK); > > [Severity: Low] > Does this leave the sin_zero padding of the sa structure uninitialized? > > It looks like this partially initializes the struct on the stack. > > [ ... ] > >> + if (xbpf_map_update_elem(bpf_map__fd(skel->maps.sockmap), &u32(0), >> + &sk, BPF_ANY)) >> + break; >> + >> + if (xconnect(sk, (struct sockaddr *)&sa, sizeof(sa))) > > When sa is passed to xconnect() here, can it leak the uninitialized padding > bytes from the userspace stack to the kernel? > > Would it be better to fully initialize it using "struct sockaddr_in sa = {};" > or explicitly zero it? > >> + break; >> + } >> + >> +destroy: >> + test_sockmap_lookup_bind_release__destroy(skel); >> +cleanup: >> + cleanup_cgroup_environment(); >> +} > I think these issues are borderline cosmetic but valid. > [ ... ] > >> diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c >> new file mode 100644 >> index 0000000000000..cc77b193893bd >> --- /dev/null >> +++ b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c >> @@ -0,0 +1,37 @@ > > [ ... ] > >> +SEC("cgroup/connect4") >> +int connect(struct bpf_sock_addr *ctx) >> +{ >> + struct bpf_sock *sk; >> + int ret = SK_DROP; >> + >> + sk = bpf_map_lookup_elem(&sockmap, &(int){0}); >> + if (sk) { >> + if (sk == ctx->sk) { >> + struct sockaddr_in sa = { >> + .sin_family = ctx->user_family, >> + .sin_port = ctx->user_port, >> + .sin_addr.s_addr = ctx->user_ip4 >> + }; >> + >> + ret = !bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)); >> + } >> + >> + bpf_sk_release(sk); > > [Severity: High] > This is a pre-existing issue, but does calling bpf_bind() here mutate the > SOCK_RCU_FREE state and cause bpf_sk_release() to skip decrementing the > reference count? > > If bpf_map_lookup_elem() looks up a newly created UDP socket, SOCK_RCU_FREE > is not yet set, so sk_is_refcounted() evaluates to true and the reference > count is incremented. > > Then bpf_bind() resolves to udp_v4_get_port(), which dynamically sets the > SOCK_RCU_FREE flag on the socket. > > When bpf_sk_release() is called, it evaluates sk_is_refcounted() again. Because > SOCK_RCU_FREE is now set, it evaluates to false, causing bpf_sk_release() to > skip the sock_gen_put() call and permanently leak the socket reference. > AFAICT this is actually not a problem for this specific test, because we fix it in patch 1/2. We have already added the socket to the sockmap so udp_bpf_update_proto should have triggered. It looks like Sashiko missed this? Specifically for UDP, I think the code is fine. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> >> + } >> + >> + return ret; >> +} ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release 2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj 2026-06-23 19:00 ` sashiko-bot @ 2026-06-23 19:32 ` bot+bpf-ci 1 sibling, 0 replies; 9+ messages in thread From: bot+bpf-ci @ 2026-06-23 19:32 UTC (permalink / raw) To: mhal, john.fastabend, jakub, jiayuan.chen, davem, edumazet, kuba, pabeni, horms, ast, cong.wang, daniel, andrii, eddyz87, memxor, martin.lau, song, yonghong.song, jolsa, emil, shuah Cc: netdev, bpf, linux-kernel, linux-kselftest, mhal, ast, andrii, daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai [-- Attachment #1: Type: text/plain, Size: 1107 bytes --] > selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release > > Setup and join a cgroup, then attach a cgroup/connect4 program that runs > > sk = bpf_map_lookup_elem(sockmap, 0) > bpf_bind(ctx, sa, sizeof(sa)) > bpf_sk_release(sk) > > Unpatched kernel leaks the socket. > > Signed-off-by: Michal Luczaj <mhal@rbox.co> This test reproduces a UDP socket leak across sockmap lookup-bind-release, but there is no Fixes: tag. The sibling fix commit a2510dc351c5 ("bpf, sockmap: Don't leak UDP socks on lookup-bind-release") carries a Fixes: tag pointing at the commit that lifted the socket-state restriction allowing unbound UDP sockets into sockmap. Should this test carry the same tag so it stays backportable alongside the fix? Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets") --- AI reviewed your patch. Please fix the bug or email reply why it's not a bug. See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md CI run summary: https://github.com/kernel-patches/bpf/actions/runs/28049771327 ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-06-24 13:36 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-23 18:03 [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks Michal Luczaj 2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj 2026-06-23 21:19 ` Emil Tsalapatis 2026-06-24 1:36 ` Jiayuan Chen 2026-06-24 13:36 ` Jakub Sitnicki 2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj 2026-06-23 19:00 ` sashiko-bot 2026-06-23 21:18 ` Emil Tsalapatis 2026-06-23 19:32 ` bot+bpf-ci
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.