[PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks
@ 2026-06-23 18:03 Michal Luczaj
  2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
  2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj
  0 siblings, 2 replies; 9+ messages in thread
From: Michal Luczaj @ 2026-06-23 18:03 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau,
	Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan
  Cc: netdev, bpf, linux-kernel, linux-kselftest, Michal Luczaj

Fix for UDP sockets refcount asymmetry in sockmap lookup/release.
Accompanied by a selftest.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Michal Luczaj (2):
      bpf, sockmap: Don't leak UDP socks on lookup-bind-release
      selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release

 net/ipv4/udp_bpf.c                                 |  3 ++
 .../selftests/bpf/prog_tests/sockmap_basic.c       | 50 ++++++++++++++++++++++
 .../bpf/progs/test_sockmap_lookup_bind_release.c   | 37 ++++++++++++++++
 3 files changed, 90 insertions(+)
---
base-commit: 12091470c6b4c1c14b2de12dcbae2ada6cb6d20b
change-id: 20260617-sockmap-lookup-udp-leak-bc4e5c5481d7

Best regards,
--  
Michal Luczaj <mhal@rbox.co>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
  2026-06-23 18:03 [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks Michal Luczaj
@ 2026-06-23 18:03 ` Michal Luczaj
  2026-06-23 21:19   ` Emil Tsalapatis
                     ` (2 more replies)
  2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj
  1 sibling, 3 replies; 9+ messages in thread
From: Michal Luczaj @ 2026-06-23 18:03 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau,
	Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan
  Cc: netdev, bpf, linux-kernel, linux-kselftest, Michal Luczaj

UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.

Because sockmap accepts unbound UDP sockets, a BPF program can increment a
socket's refcount via lookup. If the socket is subsequently bound, the
transition from unbound to bound causes bpf_sk_release() to skip the
decrement of the refcount, causing a memory leak.

unreferenced object 0xffff88810bc2eb40 (size 1984):
  comm "test_progs", pid 2451, jiffies 4295320596
  hex dump (first 32 bytes):
    7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00  ................
    02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
  backtrace (crc bdee079d):
    kmem_cache_alloc_noprof+0x557/0x660
    sk_prot_alloc+0x69/0x240
    sk_alloc+0x30/0x460
    inet_create+0x2ce/0xf80
    __sock_create+0x25b/0x5c0
    __sys_socket+0x119/0x1d0
    __x64_sys_socket+0x72/0xd0
    do_syscall_64+0xa1/0x5f0
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

Maintain balanced refcounts across sk lookup/release: (re-)set
SOCK_RCU_FREE on proto update to treat the socket (whether bound or
unbound) as not requiring a refcount increment on (a RCU protected) lookup.

Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed
sockets in bpf_sk_assign").
---
 net/ipv4/udp_bpf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
index ad57c4c9eaab..970327b59582 100644
--- a/net/ipv4/udp_bpf.c
+++ b/net/ipv4/udp_bpf.c
@@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
 	if (sk->sk_family == AF_INET6)
 		udp_bpf_check_v6_needs_rebuild(psock->sk_proto);
 
+	/* Treat all sockets as non-refcounted, regardless of binding state. */
+	sock_set_flag(sk, SOCK_RCU_FREE);
+
 	sock_replace_proto(sk, &udp_bpf_prots[family]);
 	return 0;
 }

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release
  2026-06-23 18:03 [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks Michal Luczaj
  2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
@ 2026-06-23 18:03 ` Michal Luczaj
  2026-06-23 19:00   ` sashiko-bot
  2026-06-23 19:32   ` bot+bpf-ci
  1 sibling, 2 replies; 9+ messages in thread
From: Michal Luczaj @ 2026-06-23 18:03 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki, Jiayuan Chen, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Alexei Starovoitov, Cong Wang, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau,
	Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan
  Cc: netdev, bpf, linux-kernel, linux-kselftest, Michal Luczaj

Setup and join a cgroup, then attach a cgroup/connect4 program that runs

   sk = bpf_map_lookup_elem(sockmap, 0)
   bpf_bind(ctx, sa, sizeof(sa))
   bpf_sk_release(sk)

Unpatched kernel leaks the socket.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 .../selftests/bpf/prog_tests/sockmap_basic.c       | 50 ++++++++++++++++++++++
 .../bpf/progs/test_sockmap_lookup_bind_release.c   | 37 ++++++++++++++++
 2 files changed, 87 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index cb3229711f93..11972ffdb16e 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -7,6 +7,7 @@
 
 #include "test_progs.h"
 #include "test_skmsg_load_helpers.skel.h"
+#include "test_sockmap_lookup_bind_release.skel.h"
 #include "test_sockmap_update.skel.h"
 #include "test_sockmap_invalid_update.skel.h"
 #include "test_sockmap_skb_verdict_attach.skel.h"
@@ -17,6 +18,7 @@
 #include "test_sockmap_msg_pop_data.skel.h"
 #include "bpf_iter_sockmap.skel.h"
 
+#include "cgroup_helpers.h"
 #include "sockmap_helpers.h"
 
 #define TCP_REPAIR		19	/* TCP sock is under repair right now */
@@ -1373,6 +1375,52 @@ static void test_sockmap_multi_channels(int sotype)
 	test_sockmap_pass_prog__destroy(skel);
 }
 
+#define LOOKUP_BIND_RELEASE_CG	"/sockmap_lookup-bind-release"
+#define LOOKUP_BIND_RELEASE_REP	64
+
+static void test_sockmap_lookup_bind_release(void)
+{
+	struct test_sockmap_lookup_bind_release *skel;
+	struct sockaddr_in sa;
+	int cg, i;
+
+	cg = cgroup_setup_and_join(LOOKUP_BIND_RELEASE_CG);
+	if (!ASSERT_OK_FD(cg, "cgroup_setup_and_join"))
+		return;
+
+	skel = test_sockmap_lookup_bind_release__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		goto cleanup;
+
+	skel->links.connect = bpf_program__attach_cgroup(skel->progs.connect, cg);
+	if (!ASSERT_OK_PTR(skel->links.connect, "attach_cgroup"))
+		goto destroy;
+
+	sa.sin_family = AF_INET;
+	sa.sin_port = bpf_htons(1234);
+	sa.sin_addr.s_addr = bpf_htonl(INADDR_LOOPBACK);
+
+	for (i = 0; i < LOOKUP_BIND_RELEASE_REP; ++i) {
+		__close_fd int sk;
+
+		sk = xsocket(AF_INET, SOCK_DGRAM, 0);
+		if (sk < 0)
+			break;
+
+		if (xbpf_map_update_elem(bpf_map__fd(skel->maps.sockmap), &u32(0),
+					 &sk, BPF_ANY))
+			break;
+
+		if (xconnect(sk, (struct sockaddr *)&sa, sizeof(sa)))
+			break;
+	}
+
+destroy:
+	test_sockmap_lookup_bind_release__destroy(skel);
+cleanup:
+	cleanup_cgroup_environment();
+}
+
 void test_sockmap_basic(void)
 {
 	if (test__start_subtest("sockmap create_update_free"))
@@ -1451,4 +1499,6 @@ void test_sockmap_basic(void)
 		test_sockmap_multi_channels(SOCK_STREAM);
 	if (test__start_subtest("sockmap udp multi channels"))
 		test_sockmap_multi_channels(SOCK_DGRAM);
+	if (test__start_subtest("sockmap lookup-bind-release"))
+		test_sockmap_lookup_bind_release();
 }
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c
new file mode 100644
index 000000000000..cc77b193893b
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <bpf/bpf_helpers.h>
+
+struct {
+	__uint(type, BPF_MAP_TYPE_SOCKMAP);
+	__uint(max_entries, 1);
+	__type(key, int);
+	__type(value, int);
+} sockmap SEC(".maps");
+
+SEC("cgroup/connect4")
+int connect(struct bpf_sock_addr *ctx)
+{
+	struct bpf_sock *sk;
+	int ret = SK_DROP;
+
+	sk = bpf_map_lookup_elem(&sockmap, &(int){0});
+	if (sk) {
+		if (sk == ctx->sk) {
+			struct sockaddr_in sa = {
+				.sin_family = ctx->user_family,
+				.sin_port = ctx->user_port,
+				.sin_addr.s_addr = ctx->user_ip4
+			};
+
+			ret = !bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa));
+		}
+
+		bpf_sk_release(sk);
+	}
+
+	return ret;
+}
+
+char _license[] SEC("license") = "GPL";

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release
  2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj
@ 2026-06-23 19:00   ` sashiko-bot
  2026-06-23 21:18     ` Emil Tsalapatis
  2026-06-23 19:32   ` bot+bpf-ci
  1 sibling, 1 reply; 9+ messages in thread
From: sashiko-bot @ 2026-06-23 19:00 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [Medium] The cgroup file descriptor `cg` is leaked because it is never closed in `test_sockmap_lookup_bind_release()`.
- [Low] The `struct sockaddr_in sa` is passed to the kernel via `xconnect` with uninitialized bytes in its `sin_zero` padding.

Pre-existing issues:
- [High] Calling `bpf_bind()` between `bpf_map_lookup_elem()` and `bpf_sk_release()` on a UDP socket mutates the socket's `SOCK_RCU_FREE` state, causing `bpf_sk_release()` to silently skip decrementing the reference count.
--

commit f32ef88a53a5c519bf4b22a337fa782c27f0ecda
Author: Michal Luczaj <mhal@rbox.co>

selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release

This patch adds a selftest to reproduce a UDP socket leak. It sets up a cgroup,
attaches a connect4 BPF program, and triggers a sequence of socket operations
(lookup, bind, release) that leaks the socket on an unpatched kernel.

> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> index cb3229711f93a..11972ffdb16ed 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c

[ ... ]

> @@ -1373,6 +1375,52 @@ static void test_sockmap_multi_channels(int sotype)
>  	test_sockmap_pass_prog__destroy(skel);
>  }
>  
> +#define LOOKUP_BIND_RELEASE_CG	"/sockmap_lookup-bind-release"
> +#define LOOKUP_BIND_RELEASE_REP	64
> +
> +static void test_sockmap_lookup_bind_release(void)
> +{
> +	struct test_sockmap_lookup_bind_release *skel;
> +	struct sockaddr_in sa;
> +	int cg, i;
> +
> +	cg = cgroup_setup_and_join(LOOKUP_BIND_RELEASE_CG);

[Severity: Medium]
Does this leak the cgroup file descriptor? 

The fd is acquired here and stored in cg, but close(cg) doesn't appear to be
called on either the success or error paths before the function returns. The
cleanup_cgroup_environment() function cleans up directories but doesn't close
the test's held file descriptors.

> +	if (!ASSERT_OK_FD(cg, "cgroup_setup_and_join"))
> +		return;
> +
> +	skel = test_sockmap_lookup_bind_release__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "open_and_load"))
> +		goto cleanup;
> +
> +	skel->links.connect = bpf_program__attach_cgroup(skel->progs.connect, cg);
> +	if (!ASSERT_OK_PTR(skel->links.connect, "attach_cgroup"))
> +		goto destroy;
> +
> +	sa.sin_family = AF_INET;
> +	sa.sin_port = bpf_htons(1234);
> +	sa.sin_addr.s_addr = bpf_htonl(INADDR_LOOPBACK);

[Severity: Low]
Does this leave the sin_zero padding of the sa structure uninitialized?

It looks like this partially initializes the struct on the stack.

[ ... ]

> +		if (xbpf_map_update_elem(bpf_map__fd(skel->maps.sockmap), &u32(0),
> +					 &sk, BPF_ANY))
> +			break;
> +
> +		if (xconnect(sk, (struct sockaddr *)&sa, sizeof(sa)))

When sa is passed to xconnect() here, can it leak the uninitialized padding
bytes from the userspace stack to the kernel? 

Would it be better to fully initialize it using "struct sockaddr_in sa = {};"
or explicitly zero it?

> +			break;
> +	}
> +
> +destroy:
> +	test_sockmap_lookup_bind_release__destroy(skel);
> +cleanup:
> +	cleanup_cgroup_environment();
> +}

[ ... ]

> diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c
> new file mode 100644
> index 0000000000000..cc77b193893bd
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c
> @@ -0,0 +1,37 @@

[ ... ]

> +SEC("cgroup/connect4")
> +int connect(struct bpf_sock_addr *ctx)
> +{
> +	struct bpf_sock *sk;
> +	int ret = SK_DROP;
> +
> +	sk = bpf_map_lookup_elem(&sockmap, &(int){0});
> +	if (sk) {
> +		if (sk == ctx->sk) {
> +			struct sockaddr_in sa = {
> +				.sin_family = ctx->user_family,
> +				.sin_port = ctx->user_port,
> +				.sin_addr.s_addr = ctx->user_ip4
> +			};
> +
> +			ret = !bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa));
> +		}
> +
> +		bpf_sk_release(sk);

[Severity: High]
This is a pre-existing issue, but does calling bpf_bind() here mutate the
SOCK_RCU_FREE state and cause bpf_sk_release() to skip decrementing the
reference count?

If bpf_map_lookup_elem() looks up a newly created UDP socket, SOCK_RCU_FREE
is not yet set, so sk_is_refcounted() evaluates to true and the reference
count is incremented.

Then bpf_bind() resolves to udp_v4_get_port(), which dynamically sets the
SOCK_RCU_FREE flag on the socket.

When bpf_sk_release() is called, it evaluates sk_is_refcounted() again. Because
SOCK_RCU_FREE is now set, it evaluates to false, causing bpf_sk_release() to
skip the sock_gen_put() call and permanently leak the socket reference.

> +	}
> +
> +	return ret;
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623-sockmap-lookup-udp-leak-v1-0-05804f9308e4@rbox.co?part=2

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release
  2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj
  2026-06-23 19:00   ` sashiko-bot
@ 2026-06-23 19:32   ` bot+bpf-ci
  1 sibling, 0 replies; 9+ messages in thread
From: bot+bpf-ci @ 2026-06-23 19:32 UTC (permalink / raw)
  To: mhal, john.fastabend, jakub, jiayuan.chen, davem, edumazet, kuba,
	pabeni, horms, ast, cong.wang, daniel, andrii, eddyz87, memxor,
	martin.lau, song, yonghong.song, jolsa, emil, shuah
  Cc: netdev, bpf, linux-kernel, linux-kselftest, mhal, ast, andrii,
	daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 1107 bytes --]

> selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release
>
>     Setup and join a cgroup, then attach a cgroup/connect4 program that runs
>
>        sk = bpf_map_lookup_elem(sockmap, 0)
>        bpf_bind(ctx, sa, sizeof(sa))
>        bpf_sk_release(sk)
>
>     Unpatched kernel leaks the socket.
>
>     Signed-off-by: Michal Luczaj <mhal@rbox.co>

This test reproduces a UDP socket leak across sockmap lookup-bind-release,
but there is no Fixes: tag.

The sibling fix commit a2510dc351c5 ("bpf, sockmap: Don't leak UDP socks on
lookup-bind-release") carries a Fixes: tag pointing at the commit that lifted
the socket-state restriction allowing unbound UDP sockets into sockmap.

Should this test carry the same tag so it stays backportable alongside the
fix?

  Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/28049771327

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release
  2026-06-23 19:00   ` sashiko-bot
@ 2026-06-23 21:18     ` Emil Tsalapatis
  0 siblings, 0 replies; 9+ messages in thread
From: Emil Tsalapatis @ 2026-06-23 21:18 UTC (permalink / raw)
  To: sashiko-reviews, Michal Luczaj; +Cc: bpf

On Tue Jun 23, 2026 at 3:00 PM EDT, sashiko-bot wrote:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
>
> New issues:
> - [Medium] The cgroup file descriptor `cg` is leaked because it is never closed in `test_sockmap_lookup_bind_release()`.
> - [Low] The `struct sockaddr_in sa` is passed to the kernel via `xconnect` with uninitialized bytes in its `sin_zero` padding.
>
> Pre-existing issues:
> - [High] Calling `bpf_bind()` between `bpf_map_lookup_elem()` and `bpf_sk_release()` on a UDP socket mutates the socket's `SOCK_RCU_FREE` state, causing `bpf_sk_release()` to silently skip decrementing the reference count.
> --
>
> commit f32ef88a53a5c519bf4b22a337fa782c27f0ecda
> Author: Michal Luczaj <mhal@rbox.co>
>
> selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release
>
> This patch adds a selftest to reproduce a UDP socket leak. It sets up a cgroup,
> attaches a connect4 BPF program, and triggers a sequence of socket operations
> (lookup, bind, release) that leaks the socket on an unpatched kernel.
>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
>> index cb3229711f93a..11972ffdb16ed 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
>
> [ ... ]
>
>> @@ -1373,6 +1375,52 @@ static void test_sockmap_multi_channels(int sotype)
>>  	test_sockmap_pass_prog__destroy(skel);
>>  }
>>  
>> +#define LOOKUP_BIND_RELEASE_CG	"/sockmap_lookup-bind-release"
>> +#define LOOKUP_BIND_RELEASE_REP	64
>> +
>> +static void test_sockmap_lookup_bind_release(void)
>> +{
>> +	struct test_sockmap_lookup_bind_release *skel;
>> +	struct sockaddr_in sa;
>> +	int cg, i;
>> +
>> +	cg = cgroup_setup_and_join(LOOKUP_BIND_RELEASE_CG);
>
> [Severity: Medium]
> Does this leak the cgroup file descriptor? 
>
> The fd is acquired here and stored in cg, but close(cg) doesn't appear to be
> called on either the success or error paths before the function returns. The
> cleanup_cgroup_environment() function cleans up directories but doesn't close
> the test's held file descriptors.
>
>> +	if (!ASSERT_OK_FD(cg, "cgroup_setup_and_join"))
>> +		return;
>> +
>> +	skel = test_sockmap_lookup_bind_release__open_and_load();
>> +	if (!ASSERT_OK_PTR(skel, "open_and_load"))
>> +		goto cleanup;
>> +
>> +	skel->links.connect = bpf_program__attach_cgroup(skel->progs.connect, cg);
>> +	if (!ASSERT_OK_PTR(skel->links.connect, "attach_cgroup"))
>> +		goto destroy;
>> +
>> +	sa.sin_family = AF_INET;
>> +	sa.sin_port = bpf_htons(1234);
>> +	sa.sin_addr.s_addr = bpf_htonl(INADDR_LOOPBACK);
>
> [Severity: Low]
> Does this leave the sin_zero padding of the sa structure uninitialized?
>
> It looks like this partially initializes the struct on the stack.
>
> [ ... ]
>
>> +		if (xbpf_map_update_elem(bpf_map__fd(skel->maps.sockmap), &u32(0),
>> +					 &sk, BPF_ANY))
>> +			break;
>> +
>> +		if (xconnect(sk, (struct sockaddr *)&sa, sizeof(sa)))
>
> When sa is passed to xconnect() here, can it leak the uninitialized padding
> bytes from the userspace stack to the kernel? 
>
> Would it be better to fully initialize it using "struct sockaddr_in sa = {};"
> or explicitly zero it?
>
>> +			break;
>> +	}
>> +
>> +destroy:
>> +	test_sockmap_lookup_bind_release__destroy(skel);
>> +cleanup:
>> +	cleanup_cgroup_environment();
>> +}
>

I think these issues are borderline cosmetic but valid.

> [ ... ]
>
>> diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c
>> new file mode 100644
>> index 0000000000000..cc77b193893bd
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/test_sockmap_lookup_bind_release.c
>> @@ -0,0 +1,37 @@
>
> [ ... ]
>
>> +SEC("cgroup/connect4")
>> +int connect(struct bpf_sock_addr *ctx)
>> +{
>> +	struct bpf_sock *sk;
>> +	int ret = SK_DROP;
>> +
>> +	sk = bpf_map_lookup_elem(&sockmap, &(int){0});
>> +	if (sk) {
>> +		if (sk == ctx->sk) {
>> +			struct sockaddr_in sa = {
>> +				.sin_family = ctx->user_family,
>> +				.sin_port = ctx->user_port,
>> +				.sin_addr.s_addr = ctx->user_ip4
>> +			};
>> +
>> +			ret = !bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa));
>> +		}
>> +
>> +		bpf_sk_release(sk);
>
> [Severity: High]
> This is a pre-existing issue, but does calling bpf_bind() here mutate the
> SOCK_RCU_FREE state and cause bpf_sk_release() to skip decrementing the
> reference count?
>
> If bpf_map_lookup_elem() looks up a newly created UDP socket, SOCK_RCU_FREE
> is not yet set, so sk_is_refcounted() evaluates to true and the reference
> count is incremented.
>
> Then bpf_bind() resolves to udp_v4_get_port(), which dynamically sets the
> SOCK_RCU_FREE flag on the socket.
>
> When bpf_sk_release() is called, it evaluates sk_is_refcounted() again. Because
> SOCK_RCU_FREE is now set, it evaluates to false, causing bpf_sk_release() to
> skip the sock_gen_put() call and permanently leak the socket reference.
>

AFAICT this is actually not a problem for this specific test, because we
fix it in patch 1/2. We have already added the socket to the sockmap so
udp_bpf_update_proto should have triggered. It looks like Sashiko missed
this?

Specifically for UDP, I think the code is fine.

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

>> +	}
>> +
>> +	return ret;
>> +}


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
  2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
@ 2026-06-23 21:19   ` Emil Tsalapatis
  2026-06-24  1:36   ` Jiayuan Chen
  2026-06-24 13:36   ` Jakub Sitnicki
  2 siblings, 0 replies; 9+ messages in thread
From: Emil Tsalapatis @ 2026-06-23 21:19 UTC (permalink / raw)
  To: Michal Luczaj, John Fastabend, Jakub Sitnicki, Jiayuan Chen,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann,
	Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa,
	Emil Tsalapatis, Shuah Khan
  Cc: netdev, bpf, linux-kernel, linux-kselftest

On Tue Jun 23, 2026 at 2:03 PM EDT, Michal Luczaj wrote:
> UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
>
> Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> socket's refcount via lookup. If the socket is subsequently bound, the
> transition from unbound to bound causes bpf_sk_release() to skip the
> decrement of the refcount, causing a memory leak.
>
> unreferenced object 0xffff88810bc2eb40 (size 1984):
>   comm "test_progs", pid 2451, jiffies 4295320596
>   hex dump (first 32 bytes):
>     7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00  ................
>     02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
>   backtrace (crc bdee079d):
>     kmem_cache_alloc_noprof+0x557/0x660
>     sk_prot_alloc+0x69/0x240
>     sk_alloc+0x30/0x460
>     inet_create+0x2ce/0xf80
>     __sock_create+0x25b/0x5c0
>     __sys_socket+0x119/0x1d0
>     __x64_sys_socket+0x72/0xd0
>     do_syscall_64+0xa1/0x5f0
>     entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Maintain balanced refcounts across sk lookup/release: (re-)set
> SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> unbound) as not requiring a refcount increment on (a RCU protected) lookup.
>
> Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> ---
> Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed
> sockets in bpf_sk_assign").
> ---
>  net/ipv4/udp_bpf.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
> index ad57c4c9eaab..970327b59582 100644
> --- a/net/ipv4/udp_bpf.c
> +++ b/net/ipv4/udp_bpf.c
> @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
>  	if (sk->sk_family == AF_INET6)
>  		udp_bpf_check_v6_needs_rebuild(psock->sk_proto);
>  
> +	/* Treat all sockets as non-refcounted, regardless of binding state. */
> +	sock_set_flag(sk, SOCK_RCU_FREE);
> +
>  	sock_replace_proto(sk, &udp_bpf_prots[family]);
>  	return 0;
>  }


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
  2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
  2026-06-23 21:19   ` Emil Tsalapatis
@ 2026-06-24  1:36   ` Jiayuan Chen
  2026-06-24 13:36   ` Jakub Sitnicki
  2 siblings, 0 replies; 9+ messages in thread
From: Jiayuan Chen @ 2026-06-24  1:36 UTC (permalink / raw)
  To: Michal Luczaj, John Fastabend, Jakub Sitnicki, Jiayuan Chen,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann,
	Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa,
	Emil Tsalapatis, Shuah Khan
  Cc: netdev, bpf, linux-kernel, linux-kselftest


On 6/24/26 2:03 AM, Michal Luczaj wrote:
> UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
>
> Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> socket's refcount via lookup. If the socket is subsequently bound, the
> transition from unbound to bound causes bpf_sk_release() to skip the
> decrement of the refcount, causing a memory leak.
>
> unreferenced object 0xffff88810bc2eb40 (size 1984):
>    comm "test_progs", pid 2451, jiffies 4295320596
>    hex dump (first 32 bytes):
>      7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00  ................
>      02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
>    backtrace (crc bdee079d):
>      kmem_cache_alloc_noprof+0x557/0x660
>      sk_prot_alloc+0x69/0x240
>      sk_alloc+0x30/0x460
>      inet_create+0x2ce/0xf80
>      __sock_create+0x25b/0x5c0
>      __sys_socket+0x119/0x1d0
>      __x64_sys_socket+0x72/0xd0
>      do_syscall_64+0xa1/0x5f0
>      entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Maintain balanced refcounts across sk lookup/release: (re-)set
> SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> unbound) as not requiring a refcount increment on (a RCU protected) lookup.
>
> Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>

Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
  2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
  2026-06-23 21:19   ` Emil Tsalapatis
  2026-06-24  1:36   ` Jiayuan Chen
@ 2026-06-24 13:36   ` Jakub Sitnicki
  2 siblings, 0 replies; 9+ messages in thread
From: Jakub Sitnicki @ 2026-06-24 13:36 UTC (permalink / raw)
  To: Michal Luczaj, Willem de Bruijn
  Cc: John Fastabend, Jiayuan Chen, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov,
	Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu,
	Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan, netdev,
	bpf, linux-kernel, linux-kselftest

On Tue, Jun 23, 2026 at 08:03 PM +02, Michal Luczaj wrote:
> UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
>
> Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> socket's refcount via lookup. If the socket is subsequently bound, the
> transition from unbound to bound causes bpf_sk_release() to skip the
> decrement of the refcount, causing a memory leak.
>
> unreferenced object 0xffff88810bc2eb40 (size 1984):
>   comm "test_progs", pid 2451, jiffies 4295320596
>   hex dump (first 32 bytes):
>     7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00  ................
>     02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
>   backtrace (crc bdee079d):
>     kmem_cache_alloc_noprof+0x557/0x660
>     sk_prot_alloc+0x69/0x240
>     sk_alloc+0x30/0x460
>     inet_create+0x2ce/0xf80
>     __sock_create+0x25b/0x5c0
>     __sys_socket+0x119/0x1d0
>     __x64_sys_socket+0x72/0xd0
>     do_syscall_64+0xa1/0x5f0
>     entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Maintain balanced refcounts across sk lookup/release: (re-)set
> SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> unbound) as not requiring a refcount increment on (a RCU protected) lookup.
>
> Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> ---
> Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed
> sockets in bpf_sk_assign").
> ---
>  net/ipv4/udp_bpf.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
> index ad57c4c9eaab..970327b59582 100644
> --- a/net/ipv4/udp_bpf.c
> +++ b/net/ipv4/udp_bpf.c
> @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
>  	if (sk->sk_family == AF_INET6)
>  		udp_bpf_check_v6_needs_rebuild(psock->sk_proto);
>  
> +	/* Treat all sockets as non-refcounted, regardless of binding state. */
> +	sock_set_flag(sk, SOCK_RCU_FREE);
> +
>  	sock_replace_proto(sk, &udp_bpf_prots[family]);
>  	return 0;
>  }

There is a side effect that an unhashed (unbound) UDP socket can now be
selected in sk_lookup with bpf_sk_assign. Though perhaps that's for the
better because TC bpf_sk_assign doesn't reject non-refcounted UDP
sockets either, so we would have both socket dispatch sites behave the
same way.

Also, with this patch, if we insert & remove an unhashed UDP socket
into/from a sockmap, we end up with an unhashed non-refcounted UDP
socket. Not entirely sure if that is actually a problem or not.

Willem, what is your take on having unhashed non-refcoted UDP sockets?

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-06-24 13:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-23 18:03 [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks Michal Luczaj
2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
2026-06-23 21:19   ` Emil Tsalapatis
2026-06-24  1:36   ` Jiayuan Chen
2026-06-24 13:36   ` Jakub Sitnicki
2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj
2026-06-23 19:00   ` sashiko-bot
2026-06-23 21:18     ` Emil Tsalapatis
2026-06-23 19:32   ` bot+bpf-ci

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.