From: "Jiayuan Chen" <jiayuan.chen@linux.dev>
To: "Kuniyuki Iwashima" <kuniyu@google.com>
Cc: "John Fastabend" <john.fastabend@gmail.com>,
"Jakub Sitnicki" <jakub@cloudflare.com>,
"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
"Kuniyuki Iwashima" <kuni1840@gmail.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org,
syzbot+5b3b7e51dda1be027b7a@syzkaller.appspotmail.com
Subject: Re: [PATCH v4 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
Date: Thu, 05 Mar 2026 08:30:15 +0000 [thread overview]
Message-ID: <389162b2e117d8a5675b02286c7bafcc342086b6@linux.dev> (raw)
In-Reply-To: <CAAVpQUAGSzypVhT8V-qjhUfZeEQhXQtS1z+iQC7vrzZCHEr9=A@mail.gmail.com>
March 5, 2026 at 15:48, "Kuniyuki Iwashima" <kuniyu@google.com mailto:kuniyu@google.com?to=%22Kuniyuki%20Iwashima%22%20%3Ckuniyu%40google.com%3E > wrote:
>
> On Wed, Mar 4, 2026 at 10:37 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> >
> > On Sat, Feb 21, 2026 at 11:30:53PM +0800, Kuniyuki Iwashima wrote:
> > [...]
> > </TASK>
> >
> > Fixes: d7f571188ecf ("udp: Implement ->read_sock() for sockmap")
> > Reported-by: syzbot+5b3b7e51dda1be027b7a@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/netdev/698f84c6.a70a0220.2c38d7.00cb.GAE@google.com/
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> > ---
> > v4: Fix deadlock when requeued
> > v2: Fix build failure when CONFIG_INET=n
> > ---
> > include/net/udp.h | 9 +++++++++
> > net/core/skmsg.c | 29 +++++++++++++++++++++++++----
> > net/ipv4/udp.c | 9 +++++++++
> > 3 files changed, 43 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/net/udp.h b/include/net/udp.h
> > index 700dbedcb15f..ae38a4da9388 100644
> > --- a/include/net/udp.h
> > +++ b/include/net/udp.h
> > @@ -455,6 +455,15 @@ struct sock *__udp6_lib_lookup(const struct net *net,
> > struct sk_buff *skb);
> > struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
> > __be16 sport, __be16 dport);
> > +
> > +#ifdef CONFIG_INET
> > +void udp_sock_rfree(struct sk_buff *skb);
> > +#else
> > +static inline void udp_sock_rfree(struct sk_buff *skb)
> > +{
> > +}
> > +#endif
> > +
> > int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
> >
> > /* UDP uses skb->dev_scratch to cache as much information as possible and avoid
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > index 96f43e0dbb17..dd9134a45663 100644
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
> > @@ -7,6 +7,7 @@
> >
> > #include <net/sock.h>
> > #include <net/tcp.h>
> > +#include <net/udp.h>
> > #include <net/tls.h>
> > #include <trace/events/sock.h>
> >
> > @@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> > u32 off, u32 len, bool take_ref)
> > {
> > struct sock *sk = psock->sk;
> > + bool is_udp = sk_is_udp(sk);
> > struct sk_msg *msg;
> > int err = -EAGAIN;
> >
> > @@ -583,13 +585,20 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> > if (!msg)
> > goto out;
> >
> > - if (skb->sk != sk && take_ref) {
> > + if (is_udp) {
> > + if (unlikely(skb->destructor == udp_sock_rfree))
> > + goto enqueue;
> > +
> > + spin_lock_bh(&sk->sk_receive_queue.lock);
> > + }
> > +
> > + if (is_udp || (skb->sk != sk && take_ref)) {
> > if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
> > - goto free;
> > + goto unlock;
> >
> > if (!sk_rmem_schedule(sk, skb, skb->truesize))
> > - goto free;
> > - } else {
> > + goto unlock;
> > + } else if (skb->sk == sk || !take_ref) {
> > /* This is used in tcp_bpf_recvmsg_parser() to determine whether the
> > * data originates from the socket's own protocol stack. No need to
> > * refcount sk because msg's lifetime is bound to sk via the ingress_msg.
> > @@ -604,11 +613,23 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> > * into user buffers.
> > */
> > skb_set_owner_r(skb, sk);
> > +
> > + if (is_udp) {
> > + spin_unlock_bh(&sk->sk_receive_queue.lock);
> > +
> > + skb->destructor = udp_sock_rfree;
> > + }
> > +
> > +enqueue:
> > err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
> > if (err < 0)
> > goto free;
> > out:
> > return err;
> > +
> > +unlock:
> > + if (is_udp)
> > + spin_unlock_bh(&sk->sk_receive_queue.lock);
> > free:
> > kfree(msg);
> > goto out;
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 422c96fea249..831d26748a90 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -2039,6 +2039,15 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags,
> > }
> > EXPORT_SYMBOL(__skb_recv_udp);
> >
> > +void udp_sock_rfree(struct sk_buff *skb)
> > +{
> > + struct sock *sk = skb->sk;
> > +
> > + spin_lock_bh(&sk->sk_receive_queue.lock);
> > + sock_rfree(skb);
> > + spin_unlock_bh(&sk->sk_receive_queue.lock);
> > +}
> > +
> > int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
> > {
> > struct sk_buff *skb;
> > --
> > 2.53.0.371.g1d285c8824-goog
> >
> > There are too many protocol-specific checks in the generic skmsg code
> > here. This should be abstracted out.
> >
> > Also TCP also has a similar issue with sk_forward_alloc concurrency
> > between the backlog charge and recvmsg uncharge paths so the abstraction
> > is necessary.
> >
> > I've put together a simple diff based on your patch for reference.
> > I haven't tested it thoroughly, but it at least handles TCP and UDP
> > separately through a callback, keeping the generic code clean.
> >
> I can follow up on TCP, but TCP needs another fix, which
> cannot be factored out.
>
> Some wrong assumptions in your patch:
>
> 1) tcp_psock_ingress_charge() uses psock->ingress_lock,
> but it does not protect any tcp_sock fields, especially
> sk->sk_forwad_alloc.
>
> 2) TCP memory accounting happens under bh_lock_sock()
> _or_ lock_sock(). sendmsg()/recvmsg() works under
> lock_sock() and TCP fast path puts skb to backlog if
> lock_sock() is held by userspace. This means even
> a simple bh_lock_sock() in tcp_psock_ingress_charge()
> race with memory accounting happening under lock_sock().
>
> Since sk_psock_skb_ingress() is called from both GFP_KERNEL
> and GFP_ATOMIC context, the simplest fix would be to put
> lock_sock() in sk_psock_backlog() for TCP, which is ugly though.
>
OK, thanks.
Now my only concern is keeping UDP-specific logic
out of skmsg.c. We could use a function pointer so that UDP
implements its own ingress charge in udp_bpf.c, while other
protocols just use a default no-op or even NULL.
next prev parent reply other threads:[~2026-03-05 8:30 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-21 23:30 [PATCH v4 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
2026-02-21 23:30 ` [PATCH v4 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
2026-03-05 11:05 ` Jakub Sitnicki
2026-03-05 11:27 ` Jiayuan Chen
2026-02-21 23:30 ` [PATCH v4 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() " Kuniyuki Iwashima
2026-03-05 1:48 ` Jiayuan Chen
2026-03-05 3:43 ` Kuniyuki Iwashima
2026-03-07 0:03 ` Martin KaFai Lau
2026-03-07 2:51 ` Kuniyuki Iwashima
2026-03-05 11:35 ` Jiayuan Chen
2026-03-05 11:51 ` Jakub Sitnicki
2026-02-21 23:30 ` [PATCH v4 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg() Kuniyuki Iwashima
2026-03-05 2:30 ` Jiayuan Chen
2026-03-05 3:41 ` Kuniyuki Iwashima
2026-03-05 11:36 ` Jiayuan Chen
2026-03-05 11:39 ` Jakub Sitnicki
2026-03-05 17:46 ` Kuniyuki Iwashima
2026-02-21 23:30 ` [PATCH v4 bpf/net 4/6] sockmap: Inline sk_psock_create_ingress_msg() Kuniyuki Iwashima
2026-03-05 11:44 ` Jakub Sitnicki
2026-02-21 23:30 ` [PATCH v4 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self() Kuniyuki Iwashima
2026-02-21 23:30 ` [PATCH v4 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
2026-03-04 20:04 ` Martin KaFai Lau
2026-03-04 20:14 ` Kuniyuki Iwashima
2026-03-05 6:37 ` Jiayuan Chen
2026-03-05 7:48 ` Kuniyuki Iwashima
2026-03-05 8:30 ` Jiayuan Chen [this message]
2026-03-05 9:27 ` Kuniyuki Iwashima
2026-03-05 10:45 ` Jiayuan Chen
2026-03-05 11:04 ` Jiayuan Chen
2026-03-05 17:42 ` Kuniyuki Iwashima
2026-03-06 7:44 ` Jiayuan Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=389162b2e117d8a5675b02286c7bafcc342086b6@linux.dev \
--to=jiayuan.chen@linux.dev \
--cc=bpf@vger.kernel.org \
--cc=jakub@cloudflare.com \
--cc=john.fastabend@gmail.com \
--cc=kuni1840@gmail.com \
--cc=kuniyu@google.com \
--cc=netdev@vger.kernel.org \
--cc=syzbot+5b3b7e51dda1be027b7a@syzkaller.appspotmail.com \
--cc=willemdebruijn.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox