All of lore.kernel.org
 help / color / mirror / Atom feed
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Jakub Sitnicki <jakub@cloudflare.com>,
	 Michal Luczaj <mhal@rbox.co>,
	 Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
	 Jiayuan Chen <jiayuan.chen@linux.dev>,
	 "David S. Miller" <davem@davemloft.net>,
	 Eric Dumazet <edumazet@google.com>,
	 Jakub Kicinski <kuba@kernel.org>,
	 Paolo Abeni <pabeni@redhat.com>,
	 Simon Horman <horms@kernel.org>,
	 Alexei Starovoitov <ast@kernel.org>,
	 Cong Wang <cong.wang@bytedance.com>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	 Andrii Nakryiko <andrii@kernel.org>,
	 Eduard Zingerman <eddyz87@gmail.com>,
	 Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	 Martin KaFai Lau <martin.lau@linux.dev>,
	 Song Liu <song@kernel.org>,
	 Yonghong Song <yonghong.song@linux.dev>,
	 Jiri Olsa <jolsa@kernel.org>,
	 Emil Tsalapatis <emil@etsalapatis.com>,
	 Shuah Khan <shuah@kernel.org>,
	 netdev@vger.kernel.org,  bpf@vger.kernel.org,
	 linux-kernel@vger.kernel.org,  linux-kselftest@vger.kernel.org,
	 kuniyu@google.com
Subject: Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
Date: Wed, 24 Jun 2026 16:01:09 -0400	[thread overview]
Message-ID: <willemdebruijn.kernel.24d11e11d5dc0@gmail.com> (raw)
In-Reply-To: <87wlvoxdq1.fsf@cloudflare.com>

Jakub Sitnicki wrote:
> On Tue, Jun 23, 2026 at 08:03 PM +02, Michal Luczaj wrote:
> > UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> > sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
> >
> > Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> > socket's refcount via lookup. If the socket is subsequently bound, the
> > transition from unbound to bound causes bpf_sk_release() to skip the
> > decrement of the refcount, causing a memory leak.
> >
> > unreferenced object 0xffff88810bc2eb40 (size 1984):
> >   comm "test_progs", pid 2451, jiffies 4295320596
> >   hex dump (first 32 bytes):
> >     7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00  ................
> >     02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
> >   backtrace (crc bdee079d):
> >     kmem_cache_alloc_noprof+0x557/0x660
> >     sk_prot_alloc+0x69/0x240
> >     sk_alloc+0x30/0x460
> >     inet_create+0x2ce/0xf80
> >     __sock_create+0x25b/0x5c0
> >     __sys_socket+0x119/0x1d0
> >     __x64_sys_socket+0x72/0xd0
> >     do_syscall_64+0xa1/0x5f0
> >     entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > Maintain balanced refcounts across sk lookup/release: (re-)set
> > SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> > unbound) as not requiring a refcount increment on (a RCU protected) lookup.
> >
> > Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> > Signed-off-by: Michal Luczaj <mhal@rbox.co>
> > ---
> > Note: this issue is related to commit 67312adc96b5 ("bpf: reject unhashed
> > sockets in bpf_sk_assign").
> > ---
> >  net/ipv4/udp_bpf.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
> > index ad57c4c9eaab..970327b59582 100644
> > --- a/net/ipv4/udp_bpf.c
> > +++ b/net/ipv4/udp_bpf.c
> > @@ -173,6 +173,9 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
> >  	if (sk->sk_family == AF_INET6)
> >  		udp_bpf_check_v6_needs_rebuild(psock->sk_proto);
> >  
> > +	/* Treat all sockets as non-refcounted, regardless of binding state. */
> > +	sock_set_flag(sk, SOCK_RCU_FREE);
> > +
> >  	sock_replace_proto(sk, &udp_bpf_prots[family]);
> >  	return 0;
> >  }
> 
> There is a side effect that an unhashed (unbound) UDP socket can now be
> selected in sk_lookup with bpf_sk_assign.

The commit does mention a related fix, beneath the ---, commit
67312adc96b5 ("bpf: reject unhashed sockets in bpf_sk_assign").
That fixes a similar issue by exactly disallowing this:

    Fix the problem by rejecting unhashed sockets in bpf_sk_assign().
    This matches the behaviour of __inet_lookup_skb which is ultimately
    the goal of bpf_sk_assign().

So ..

> Though perhaps that's for the
> better because TC bpf_sk_assign doesn't reject non-refcounted UDP
> sockets either, so we would have both socket dispatch sites behave the
> same way.

.. there are two conflicting types of consistency here? Consistent with
__inet_lookup_skb or the TC bpf hook. Of those the first is the more
canonical.

> Also, with this patch, if we insert & remove an unhashed UDP socket
> into/from a sockmap, we end up with an unhashed non-refcounted UDP
> socket. Not entirely sure if that is actually a problem or not.
> 
> Willem, what is your take on having unhashed non-refcoted UDP sockets?

I don't immediately see a problem, but I'm not an expert on SOCK_RCU_FREE.


  reply	other threads:[~2026-06-24 20:01 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23 18:03 [PATCH bpf 0/2] bpf, sockmap: Fix sockmap leaking UDP socks Michal Luczaj
2026-06-23 18:03 ` [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release Michal Luczaj
2026-06-23 21:19   ` Emil Tsalapatis
2026-06-24  1:36   ` Jiayuan Chen
2026-06-24 13:36   ` Jakub Sitnicki
2026-06-24 20:01     ` Willem de Bruijn [this message]
2026-06-23 18:03 ` [PATCH bpf 2/2] selftests/bpf: Add test for UDP sock leak on sockmap lookup-bind-release Michal Luczaj
2026-06-23 19:00   ` sashiko-bot
2026-06-23 21:18     ` Emil Tsalapatis
2026-06-23 19:32   ` bot+bpf-ci

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=willemdebruijn.kernel.24d11e11d5dc0@gmail.com \
    --to=willemdebruijn.kernel@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=cong.wang@bytedance.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=emil@etsalapatis.com \
    --cc=horms@kernel.org \
    --cc=jakub@cloudflare.com \
    --cc=jiayuan.chen@linux.dev \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=memxor@gmail.com \
    --cc=mhal@rbox.co \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.