* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
@ 2026-06-23 21:19 ` Emil Tsalapatis
2026-06-24 1:36 ` Jiayuan Chen
2026-06-24 13:36 ` Jakub Sitnicki
2 siblings, 0 replies; 8+ messages in thread
From: Emil Tsalapatis @ 2026-06-23 21:19 UTC (permalink / raw)
To: Michal Luczaj, John Fastabend, Jakub Sitnicki, Jiayuan Chen,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa,
Emil Tsalapatis, Shuah Khan
Cc: netdev, bpf, linux-kernel, linux-kselftest
On Tue Jun 23, 2026 at 2:03 PM EDT, Michal Luczaj wrote:
> UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
>
> Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> socket's refcount via lookup. If the socket is subsequently bound, the
> transition from unbound to bound causes bpf_sk_release() to skip the
> decrement of the refcount, causing a memory leak.
>
> unreferenced object 0xffff88810bc2eb40 (size 1984):
> comm "test_progs", pid 2451, jiffies 4295320596
> hex dump (first 32 bytes):
> 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................
> 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............
> backtrace (crc bdee079d):
> kmem_cache_alloc_noprof+0x557/0x660
> sk_prot_alloc+0x69/0x240
> sk_alloc+0x30/0x460
> inet_create+0x2ce/0xf80
> __sock_create+0x25b/0x5c0
> __sys_socket+0x119/0x1d0
> __x64_sys_socket+0x72/0xd0
> do_syscall_64+0xa1/0x5f0
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Maintain balanced refcounts across sk lookup/release: (re-)set
> SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> unbound) as not requiring a refcount increment on (an RCU-protected) lookup.
>
> Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> ---
> Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed
> sockets in bpf_sk_assign").
> ---
> net/ipv4/udp_bpf.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
> index ad57c4c9eaab..970327b59582 100644
> --- a/net/ipv4/udp_bpf.c
> +++ b/net/ipv4/udp_bpf.c
> @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
> if (sk->sk_family == AF_INET6)
> udp_bpf_check_v6_needs_rebuild(psock->sk_proto);
>
> + /* Treat all sockets as non-refcounted, regardless of binding state. */
> + sock_set_flag(sk, SOCK_RCU_FREE);
> +
> sock_replace_proto(sk, &udp_bpf_prots[family]);
> return 0;
> }
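For illustration only, the lookup-bind-release sequence from the commit
message can be modelled in userspace. This is a minimal sketch, not kernel
code: struct toy_sock and every toy_* function are hypothetical stand-ins,
and toy_is_refcounted() assumes that "refcounted" simply means "the RCU-free
flag is not set", which is a simplification of sk_is_refcounted().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock: only the state that matters here. */
struct toy_sock {
	int refcnt;	/* models sk_refcnt */
	bool rcu_free;	/* models the SOCK_RCU_FREE flag */
};

/* Simplified model of sk_is_refcounted(): flagged sockets are not refcounted. */
static bool toy_is_refcounted(const struct toy_sock *sk)
{
	return !sk->rcu_free;
}

/* Models a BPF lookup: takes a reference only for refcounted sockets. */
static void toy_lookup(struct toy_sock *sk)
{
	if (toy_is_refcounted(sk))
		sk->refcnt++;
}

/* Models bpf_sk_release(): drops a reference only for refcounted sockets. */
static void toy_release(struct toy_sock *sk)
{
	if (toy_is_refcounted(sk)) {
		assert(sk->refcnt > 0);	/* guard against underflow in the model */
		sk->refcnt--;
	}
}

/* Models (auto-)bind: a bound UDP socket gets the RCU-free flag. */
static void toy_bind(struct toy_sock *sk)
{
	sk->rcu_free = true;
}

/*
 * Runs lookup -> bind -> release on a fresh unbound socket and returns the
 * number of references left over beyond the initial one.
 * flag_on_map_update models the patch: the flag is already set when the
 * socket enters the map, so lookup and release agree.
 */
static int toy_lookup_bind_release(bool flag_on_map_update)
{
	struct toy_sock sk = { .refcnt = 1, .rcu_free = false };

	if (flag_on_map_update)
		sk.rcu_free = true;

	toy_lookup(&sk);
	toy_bind(&sk);
	toy_release(&sk);

	return sk.refcnt - 1;
}
```

In this model toy_lookup_bind_release(false) leaves one reference behind,
matching the kmemleak report, while toy_lookup_bind_release(true) leaves
none, because lookup and release see the same flag state.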
* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
2026-06-23 21:19 ` Emil Tsalapatis
@ 2026-06-24 1:36 ` Jiayuan Chen
2026-06-24 13:36 ` Jakub Sitnicki
2 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2026-06-24 1:36 UTC (permalink / raw)
To: Michal Luczaj, John Fastabend, Jakub Sitnicki, Jiayuan Chen,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa,
Emil Tsalapatis, Shuah Khan
Cc: netdev, bpf, linux-kernel, linux-kselftest
On 6/24/26 2:03 AM, Michal Luczaj wrote:
> UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
>
> Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> socket's refcount via lookup. If the socket is subsequently bound, the
> transition from unbound to bound causes bpf_sk_release() to skip the
> decrement of the refcount, causing a memory leak.
>
> unreferenced object 0xffff88810bc2eb40 (size 1984):
> comm "test_progs", pid 2451, jiffies 4295320596
> hex dump (first 32 bytes):
> 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................
> 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............
> backtrace (crc bdee079d):
> kmem_cache_alloc_noprof+0x557/0x660
> sk_prot_alloc+0x69/0x240
> sk_alloc+0x30/0x460
> inet_create+0x2ce/0xf80
> __sock_create+0x25b/0x5c0
> __sys_socket+0x119/0x1d0
> __x64_sys_socket+0x72/0xd0
> do_syscall_64+0xa1/0x5f0
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Maintain balanced refcounts across sk lookup/release: (re-)set
> SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> unbound) as not requiring a refcount increment on (an RCU-protected) lookup.
>
> Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
2026-06-23 21:19 ` Emil Tsalapatis
2026-06-24 1:36 ` Jiayuan Chen
@ 2026-06-24 13:36 ` Jakub Sitnicki
2026-06-24 20:01 ` Willem de Bruijn
2 siblings, 1 reply; 8+ messages in thread
From: Jakub Sitnicki @ 2026-06-24 13:36 UTC (permalink / raw)
To: Michal Luczaj, Willem de Bruijn
Cc: John Fastabend, Jiayuan Chen, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov,
Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu,
Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan, netdev,
bpf, linux-kernel, linux-kselftest
On Tue, Jun 23, 2026 at 08:03 PM +02, Michal Luczaj wrote:
> UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
>
> Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> socket's refcount via lookup. If the socket is subsequently bound, the
> transition from unbound to bound causes bpf_sk_release() to skip the
> decrement of the refcount, causing a memory leak.
>
> unreferenced object 0xffff88810bc2eb40 (size 1984):
> comm "test_progs", pid 2451, jiffies 4295320596
> hex dump (first 32 bytes):
> 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................
> 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............
> backtrace (crc bdee079d):
> kmem_cache_alloc_noprof+0x557/0x660
> sk_prot_alloc+0x69/0x240
> sk_alloc+0x30/0x460
> inet_create+0x2ce/0xf80
> __sock_create+0x25b/0x5c0
> __sys_socket+0x119/0x1d0
> __x64_sys_socket+0x72/0xd0
> do_syscall_64+0xa1/0x5f0
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Maintain balanced refcounts across sk lookup/release: (re-)set
> SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> unbound) as not requiring a refcount increment on (an RCU-protected) lookup.
>
> Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> ---
> Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed
> sockets in bpf_sk_assign").
> ---
> net/ipv4/udp_bpf.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
> index ad57c4c9eaab..970327b59582 100644
> --- a/net/ipv4/udp_bpf.c
> +++ b/net/ipv4/udp_bpf.c
> @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
> if (sk->sk_family == AF_INET6)
> udp_bpf_check_v6_needs_rebuild(psock->sk_proto);
>
> + /* Treat all sockets as non-refcounted, regardless of binding state. */
> + sock_set_flag(sk, SOCK_RCU_FREE);
> +
> sock_replace_proto(sk, &udp_bpf_prots[family]);
> return 0;
> }
There is a side effect that an unhashed (unbound) UDP socket can now be
selected in sk_lookup with bpf_sk_assign. Though perhaps that's for the
better because TC bpf_sk_assign doesn't reject non-refcounted UDP
sockets either, so we would have both socket dispatch sites behave the
same way.
Also, with this patch, if we insert & remove an unhashed UDP socket
into/from a sockmap, we end up with an unhashed non-refcounted UDP
socket. Not entirely sure if that is actually a problem or not.
Willem, what is your take on having unhashed non-refcounted UDP sockets?
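For illustration only, the insert-and-remove concern can be modelled in
userspace. This is a sketch under stated assumptions, not kernel code:
struct toy_udp_sock and the toy_* functions are hypothetical, and the model
assumes that the restore path of udp_bpf_update_proto() returns without
touching the flag, as the remark about ending up with an unhashed
non-refcounted socket implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a UDP socket: only the two relevant states. */
struct toy_udp_sock {
	bool hashed;	/* false while the socket is unbound */
	bool rcu_free;	/* models the SOCK_RCU_FREE flag */
};

/*
 * Models the patched udp_bpf_update_proto(): the flag is set when the
 * socket enters the map. Assumption: the restore path (removal from the
 * map) leaves the flag as it is.
 */
static void toy_update_proto(struct toy_udp_sock *sk, bool restore)
{
	assert(sk);

	if (restore)
		return;

	sk->rcu_free = true;
}

/*
 * Inserts an unbound socket into a (toy) sockmap and removes it again.
 * Returns true if the socket ends up unhashed and non-refcounted.
 */
static bool toy_insert_remove_leaves_flag(void)
{
	struct toy_udp_sock sk = { .hashed = false, .rcu_free = false };

	toy_update_proto(&sk, false);	/* insert into the map */
	toy_update_proto(&sk, true);	/* remove from the map */

	return !sk.hashed && sk.rcu_free;
}
```

Under these assumptions toy_insert_remove_leaves_flag() returns true: the
flag outlives the socket's time in the map even though the socket was never
bound. Whether that state is harmful is exactly the open question above.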
* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
2026-06-24 13:36 ` Jakub Sitnicki
@ 2026-06-24 20:01 ` Willem de Bruijn
0 siblings, 0 replies; 8+ messages in thread
From: Willem de Bruijn @ 2026-06-24 20:01 UTC (permalink / raw)
To: Jakub Sitnicki, Michal Luczaj, Willem de Bruijn
Cc: John Fastabend, Jiayuan Chen, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Alexei Starovoitov,
Cong Wang, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu,
Yonghong Song, Jiri Olsa, Emil Tsalapatis, Shuah Khan, netdev,
bpf, linux-kernel, linux-kselftest, kuniyu
Jakub Sitnicki wrote:
> On Tue, Jun 23, 2026 at 08:03 PM +02, Michal Luczaj wrote:
> > UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> > sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
> >
> > Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> > socket's refcount via lookup. If the socket is subsequently bound, the
> > transition from unbound to bound causes bpf_sk_release() to skip the
> > decrement of the refcount, causing a memory leak.
> >
> > unreferenced object 0xffff88810bc2eb40 (size 1984):
> > comm "test_progs", pid 2451, jiffies 4295320596
> > hex dump (first 32 bytes):
> > 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................
> > 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............
> > backtrace (crc bdee079d):
> > kmem_cache_alloc_noprof+0x557/0x660
> > sk_prot_alloc+0x69/0x240
> > sk_alloc+0x30/0x460
> > inet_create+0x2ce/0xf80
> > __sock_create+0x25b/0x5c0
> > __sys_socket+0x119/0x1d0
> > __x64_sys_socket+0x72/0xd0
> > do_syscall_64+0xa1/0x5f0
> > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > Maintain balanced refcounts across sk lookup/release: (re-)set
> > SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> > unbound) as not requiring a refcount increment on (an RCU-protected) lookup.
> >
> > Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> > Signed-off-by: Michal Luczaj <mhal@rbox.co>
> > ---
> > Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed
> > sockets in bpf_sk_assign").
> > ---
> > net/ipv4/udp_bpf.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
> > index ad57c4c9eaab..970327b59582 100644
> > --- a/net/ipv4/udp_bpf.c
> > +++ b/net/ipv4/udp_bpf.c
> > @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
> > if (sk->sk_family == AF_INET6)
> > udp_bpf_check_v6_needs_rebuild(psock->sk_proto);
> >
> > + /* Treat all sockets as non-refcounted, regardless of binding state. */
> > + sock_set_flag(sk, SOCK_RCU_FREE);
> > +
> > sock_replace_proto(sk, &udp_bpf_prots[family]);
> > return 0;
> > }
>
> There is a side effect that an unhashed (unbound) UDP socket can now be
> selected in sk_lookup with bpf_sk_assign.
The commit does mention a related fix, beneath the ---, commit
67312adc96b5 ("bpf: reject unhashed sockets in bpf_sk_assign").
That fixes a similar issue by exactly disallowing this:
Fix the problem by rejecting unhashed sockets in bpf_sk_assign().
This matches the behaviour of __inet_lookup_skb which is ultimately
the goal of bpf_sk_assign().
So ..
> Though perhaps that's for the
> better because TC bpf_sk_assign doesn't reject non-refcounted UDP
> sockets either, so we would have both socket dispatch sites behave the
> same way.
.. there are two conflicting types of consistency here? Consistent with
__inet_lookup_skb or the TC bpf hook. Of those the first is the more
canonical.
> Also, with this patch, if we insert & remove an unhashed UDP socket
> into/from a sockmap, we end up with an unhashed non-refcounted UDP
> socket. Not entirely sure if that is actually a problem or not.
>
> Willem, what is your take on having unhashed non-refcounted UDP sockets?
I don't immediately see a problem, but I'm not an expert on SOCK_RCU_FREE.