public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kuniyuki Iwashima <kuniyu@google.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Jakub Sitnicki <jakub@cloudflare.com>,
	Jiayuan Chen <jiayuan.chen@linux.dev>,
	Cong Wang <xiyou.wangcong@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Kuniyuki Iwashima <kuni1840@gmail.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	syzbot+5b3b7e51dda1be027b7a@syzkaller.appspotmail.com
Subject: Re: [PATCH v4 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
Date: Wed, 4 Mar 2026 12:04:50 -0800	[thread overview]
Message-ID: <88077a77-8b9e-4372-bb39-c7a638cfa3d4@linux.dev> (raw)
In-Reply-To: <20260221233234.3814768-7-kuniyu@google.com>

On 2/21/26 3:30 PM, Kuniyuki Iwashima wrote:
> syzbot reported imbalanced sk->sk_forward_alloc [0] and
> demonstrated that UDP memory accounting by SOCKMAP is broken.
> 
> The repro put a UDP sk into SOCKMAP and redirected skb to itself,
> where skb->truesize was 4240.
> 
> First, udp_rmem_schedule() set sk->sk_forward_alloc to 8192
> (2 * PAGE_SIZE), and skb->truesize was charged:
> 
>    sk->sk_forward_alloc = 0 + 8192 - 4240;  // => 3952
> 
> Then, udp_read_skb() dequeued the skb by skb_recv_udp(), which finally
> calls udp_rmem_release() and _partially_ reclaims sk->sk_forward_alloc
> because skb->truesize was larger than PAGE_SIZE:
> 
>    sk->sk_forward_alloc += 4240;  // => 8192 (PAGE_SIZE is reclaimable)
>    sk->sk_forward_alloc -= 4096;  // => 4096
> 
> Later, sk_psock_skb_ingress_self() called skb_set_owner_r() to
> charge the skb again, triggering an sk->sk_forward_alloc underflow:
> 
>    sk->sk_forward_alloc -= 4240   // => -144
> 
> Another problem is that UDP memory accounting is not performed
> under spin_lock_bh(&sk->sk_receive_queue.lock).
> 
> skb_set_owner_r() and sock_rfree() are called locklessly and
> corrupt sk->sk_forward_alloc, leading to the splat.
> 
> Let's not skip memory accounting for UDP and ensure the proper
> lock is held.
> 
> Note that UDP does not need msg->sk assignment, which is for TCP.
> 
> [0]:
> WARNING: net/ipv4/af_inet.c:157 at inet_sock_destruct+0x62d/0x740 net/ipv4/af_inet.c:157, CPU#0: ksoftirqd/0/15
> Modules linked in:
> CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
> RIP: 0010:inet_sock_destruct+0x62d/0x740 net/ipv4/af_inet.c:157
> Code: 0f 0b 90 e9 58 fe ff ff e8 40 55 b3 f7 90 0f 0b 90 e9 8b fe ff ff e8 32 55 b3 f7 90 0f 0b 90 e9 b1 fe ff ff e8 24 55 b3 f7 90 <0f> 0b 90 e9 d7 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 95 fc
> RSP: 0018:ffffc90000147a48 EFLAGS: 00010246
> RAX: ffffffff8a1121dc RBX: dffffc0000000000 RCX: ffff88801d2c3d00
> RDX: 0000000000000100 RSI: 0000000000000f70 RDI: 0000000000000000
> RBP: 0000000000000f70 R08: ffff888030ce1327 R09: 1ffff1100619c264
> R10: dffffc0000000000 R11: ffffed100619c265 R12: ffff888030ce1080
> R13: dffffc0000000000 R14: ffff888030ce130c R15: ffffffff8fa87e00
> FS:  0000000000000000(0000) GS:ffff8881256f8000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000200000000700 CR3: 000000007200c000 CR4: 00000000003526f0
> Call Trace:
>   <TASK>
>   __sk_destruct+0x85/0x880 net/core/sock.c:2350
>   rcu_do_batch kernel/rcu/tree.c:2605 [inline]
>   rcu_core+0xc9e/0x1750 kernel/rcu/tree.c:2857
>   handle_softirqs+0x22a/0x7c0 kernel/softirq.c:622
>   run_ksoftirqd+0x36/0x60 kernel/softirq.c:1063
>   smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
>   kthread+0x726/0x8b0 kernel/kthread.c:463
>   ret_from_fork+0x51b/0xa40 arch/x86/kernel/process.c:158
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
>   </TASK>
> 
> Fixes: d7f571188ecf ("udp: Implement ->read_sock() for sockmap")

Cc: Cong Wang, who is the author of the quoted commit and also the 
author of the full UDP support in sockmap.

JakubS and JohnF who are the maintainers of sockmap and skmsg. Please 
take a look also.

Jiayuan Chen, you had submitted patches(/fixes) for sockmap/skmsg. It is 
a good chance to gain review credit.

> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index 96f43e0dbb17..dd9134a45663 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -7,6 +7,7 @@
>   
>   #include <net/sock.h>
>   #include <net/tcp.h>
> +#include <net/udp.h>
>   #include <net/tls.h>
>   #include <trace/events/sock.h>
>   
> @@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
>   				u32 off, u32 len, bool take_ref)
>   {
>   	struct sock *sk = psock->sk;
> +	bool is_udp = sk_is_udp(sk);
>   	struct sk_msg *msg;
>   	int err = -EAGAIN;
>   
> @@ -583,13 +585,20 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
>   	if (!msg)
>   		goto out;
>   
> -	if (skb->sk != sk && take_ref) {
> +	if (is_udp) {
> +		if (unlikely(skb->destructor == udp_sock_rfree))

A quick question. This case happens when the earlier 
sk_psock_skb_ingress_enqueue() failed?

> +			goto enqueue;
> +
> +		spin_lock_bh(&sk->sk_receive_queue.lock);
> +	}
> +
> +	if (is_udp || (skb->sk != sk && take_ref)) {
>   		if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
> -			goto free;
> +			goto unlock;
>   
>   		if (!sk_rmem_schedule(sk, skb, skb->truesize))
> -			goto free;
> -	} else {
> +			goto unlock;
> +	} else if (skb->sk == sk || !take_ref) {
>   		/* This is used in tcp_bpf_recvmsg_parser() to determine whether the
>   		 * data originates from the socket's own protocol stack. No need to
>   		 * refcount sk because msg's lifetime is bound to sk via the ingress_msg.
> @@ -604,11 +613,23 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
>   	 * into user buffers.
>   	 */
>   	skb_set_owner_r(skb, sk);
> +
> +	if (is_udp) {
> +		spin_unlock_bh(&sk->sk_receive_queue.lock);
> +
> +		skb->destructor = udp_sock_rfree;
> +	}
> +
> +enqueue:
>   	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
>   	if (err < 0)
>   		goto free;
>   out:
>   	return err;
> +
> +unlock:
> +	if (is_udp)
> +		spin_unlock_bh(&sk->sk_receive_queue.lock);
>   free:
>   	kfree(msg);
>   	goto out;

  reply	other threads:[~2026-03-04 20:05 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-21 23:30 [PATCH v4 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
2026-02-21 23:30 ` [PATCH v4 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
2026-03-05 11:05   ` Jakub Sitnicki
2026-03-05 11:27   ` Jiayuan Chen
2026-02-21 23:30 ` [PATCH v4 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() " Kuniyuki Iwashima
2026-03-05  1:48   ` Jiayuan Chen
2026-03-05  3:43     ` Kuniyuki Iwashima
2026-03-07  0:03       ` Martin KaFai Lau
2026-03-07  2:51         ` Kuniyuki Iwashima
2026-03-05 11:35   ` Jiayuan Chen
2026-03-05 11:51   ` Jakub Sitnicki
2026-02-21 23:30 ` [PATCH v4 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg() Kuniyuki Iwashima
2026-03-05  2:30   ` Jiayuan Chen
2026-03-05  3:41     ` Kuniyuki Iwashima
2026-03-05 11:36   ` Jiayuan Chen
2026-03-05 11:39   ` Jakub Sitnicki
2026-03-05 17:46     ` Kuniyuki Iwashima
2026-02-21 23:30 ` [PATCH v4 bpf/net 4/6] sockmap: Inline sk_psock_create_ingress_msg() Kuniyuki Iwashima
2026-03-05 11:44   ` Jakub Sitnicki
2026-02-21 23:30 ` [PATCH v4 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self() Kuniyuki Iwashima
2026-02-21 23:30 ` [PATCH v4 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
2026-03-04 20:04   ` Martin KaFai Lau [this message]
2026-03-04 20:14     ` Kuniyuki Iwashima
2026-03-05  6:37   ` Jiayuan Chen
2026-03-05  7:48     ` Kuniyuki Iwashima
2026-03-05  8:30       ` Jiayuan Chen
2026-03-05  9:27         ` Kuniyuki Iwashima
2026-03-05 10:45           ` Jiayuan Chen
2026-03-05 11:04             ` Jiayuan Chen
2026-03-05 17:42               ` Kuniyuki Iwashima
2026-03-06  7:44                 ` Jiayuan Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=88077a77-8b9e-4372-bb39-c7a638cfa3d4@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=bpf@vger.kernel.org \
    --cc=jakub@cloudflare.com \
    --cc=jiayuan.chen@linux.dev \
    --cc=john.fastabend@gmail.com \
    --cc=kuni1840@gmail.com \
    --cc=kuniyu@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=syzbot+5b3b7e51dda1be027b7a@syzkaller.appspotmail.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox