public inbox for netdev@vger.kernel.org
* [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP.
@ 2026-02-19 17:37 Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki
  Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
	netdev

syzbot reported 3 issues in SOCKMAP for UDP.

Patches 1-2 fix lockless accesses to sk->sk_data_ready()
and sk->sk_write_space().

Patch 3 fixes UAF in sk_msg_recvmsg().

Patches 4-5 consolidate sk_psock_skb_ingress_self() into
sk_psock_skb_ingress() as preparation.

Patch 6 fixes broken memory accounting.


Changes:
  v3:
    Patch 2: Use WRITE_ONCE() in udp_bpf_update_proto()

  v2: https://lore.kernel.org/netdev/20260217000701.791189-1-kuniyu@google.com/
    Patch 2: Cache sk->sk_write_space in sock_wfree()
    Patch 5: Keep msg->sk assignment
    Patch 6: Fix build failure when CONFIG_INET=n

  v1: https://lore.kernel.org/netdev/20260215204353.3645744-1-kuniyu@google.com/


Kuniyuki Iwashima (6):
  sockmap: Annotate sk->sk_data_ready() for UDP.
  sockmap: Annotate sk->sk_write_space() for UDP.
  sockmap: Fix use-after-free in udp_bpf_recvmsg().
  sockmap: Pass gfp_t flag to sk_psock_skb_ingress().
  sockmap: Consolidate sk_psock_skb_ingress_self().
  sockmap: Fix broken memory accounting for UDP.

 include/net/udp.h  |  9 +++++
 net/core/skmsg.c   | 97 ++++++++++++++++++++--------------------------
 net/core/sock.c    |  8 +++-
 net/ipv4/udp.c     | 11 +++++-
 net/ipv4/udp_bpf.c | 11 +++++-
 5 files changed, 77 insertions(+), 59 deletions(-)

-- 
2.53.0.345.g96ddfc5eaa-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() for UDP.
  2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() " Kuniyuki Iwashima
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki
  Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
	netdev, syzbot+113cea56c13a8a1e95ab

syzbot reported a data race on sk->sk_data_ready(). [0]

The UDP fast path does not hold bh_lock_sock(); instead,
spin_lock_bh(&sk->sk_receive_queue.lock) is used.

Let's use WRITE_ONCE() and READ_ONCE() for sk->sk_data_ready().

Another option would be to hold sk->sk_receive_queue.lock in
sock_map_sk_acquire() when sk_is_udp() is true, but that is
overkill and also does not work for sk->sk_write_space().

[0]:
BUG: KCSAN: data-race in __udp_enqueue_schedule_skb / sk_psock_drop

write to 0xffff88811d063048 of 8 bytes by task 23114 on cpu 0:
 sk_psock_stop_verdict net/core/skmsg.c:1287 [inline]
 sk_psock_drop+0x12f/0x270 net/core/skmsg.c:873
 sk_psock_put include/linux/skmsg.h:473 [inline]
 sock_map_unref+0x2a5/0x300 net/core/sock_map.c:185
 __sock_map_delete net/core/sock_map.c:426 [inline]
 sock_map_delete_from_link net/core/sock_map.c:439 [inline]
 sock_map_unlink net/core/sock_map.c:1608 [inline]
 sock_map_remove_links+0x228/0x340 net/core/sock_map.c:1623
 sock_map_close+0xa1/0x340 net/core/sock_map.c:1684
 inet_release+0xcd/0xf0 net/ipv4/af_inet.c:437
 __sock_release net/socket.c:662 [inline]
 sock_close+0x6b/0x150 net/socket.c:1455
 __fput+0x29b/0x650 fs/file_table.c:468
 ____fput+0x1c/0x30 fs/file_table.c:496
 task_work_run+0x130/0x1a0 kernel/task_work.c:233
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 __exit_to_user_mode_loop kernel/entry/common.c:44 [inline]
 exit_to_user_mode_loop+0x1f7/0x6f0 kernel/entry/common.c:75
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
 syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
 do_syscall_64+0x1d3/0x2a0 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff88811d063048 of 8 bytes by task 23117 on cpu 1:
 __udp_enqueue_schedule_skb+0x6c1/0x840 net/ipv4/udp.c:1789
 __udp_queue_rcv_skb net/ipv4/udp.c:2346 [inline]
 udp_queue_rcv_one_skb+0x709/0xc20 net/ipv4/udp.c:2475
 udp_queue_rcv_skb+0x20e/0x2b0 net/ipv4/udp.c:2493
 __udp4_lib_mcast_deliver+0x6e8/0x790 net/ipv4/udp.c:2585
 __udp4_lib_rcv+0x96f/0x1260 net/ipv4/udp.c:2724
 udp_rcv+0x4f/0x60 net/ipv4/udp.c:2911
 ip_protocol_deliver_rcu+0x3f9/0x780 net/ipv4/ip_input.c:207
 ip_local_deliver_finish+0x1fc/0x2f0 net/ipv4/ip_input.c:241
 NF_HOOK include/linux/netfilter.h:318 [inline]
 ip_local_deliver+0xe8/0x1e0 net/ipv4/ip_input.c:262
 dst_input include/net/dst.h:474 [inline]
 ip_sublist_rcv_finish net/ipv4/ip_input.c:584 [inline]
 ip_list_rcv_finish net/ipv4/ip_input.c:628 [inline]
 ip_sublist_rcv+0x42b/0x6d0 net/ipv4/ip_input.c:644
 ip_list_rcv+0x261/0x290 net/ipv4/ip_input.c:678
 __netif_receive_skb_list_ptype net/core/dev.c:6195 [inline]
 __netif_receive_skb_list_core+0x4dc/0x500 net/core/dev.c:6242
 __netif_receive_skb_list net/core/dev.c:6294 [inline]
 netif_receive_skb_list_internal+0x47d/0x5f0 net/core/dev.c:6385
 netif_receive_skb_list+0x31/0x1f0 net/core/dev.c:6437
 xdp_recv_frames net/bpf/test_run.c:269 [inline]
 xdp_test_run_batch net/bpf/test_run.c:350 [inline]
 bpf_test_run_xdp_live+0x104c/0x1360 net/bpf/test_run.c:379
 bpf_prog_test_run_xdp+0x57b/0xa10 net/bpf/test_run.c:1396
 bpf_prog_test_run+0x204/0x340 kernel/bpf/syscall.c:4703
 __sys_bpf+0x4c0/0x7b0 kernel/bpf/syscall.c:6182
 __do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
 __x64_sys_bpf+0x41/0x50 kernel/bpf/syscall.c:6272
 x64_sys_call+0x28e1/0x3000 arch/x86/include/generated/asm/syscalls_64.h:322
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0xffffffff847b24d0 -> 0xffffffff84673410

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 23117 Comm: syz.8.5085 Tainted: G        W           syzkaller #0 PREEMPT(voluntary)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025

Fixes: 7b98cd42b049 ("bpf: sockmap: Add UDP support")
Reported-by: syzbot+113cea56c13a8a1e95ab@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69922ac9.a70a0220.2c38d7.00e1.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/core/skmsg.c | 4 ++--
 net/ipv4/udp.c   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index ddde93dd8bc6..75fa94217e1e 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1296,7 +1296,7 @@ void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
 		return;
 
 	psock->saved_data_ready = sk->sk_data_ready;
-	sk->sk_data_ready = sk_psock_verdict_data_ready;
+	WRITE_ONCE(sk->sk_data_ready, sk_psock_verdict_data_ready);
 	sk->sk_write_space = sk_psock_write_space;
 }
 
@@ -1308,6 +1308,6 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
 	if (!psock->saved_data_ready)
 		return;
 
-	sk->sk_data_ready = psock->saved_data_ready;
+	WRITE_ONCE(sk->sk_data_ready, psock->saved_data_ready);
 	psock->saved_data_ready = NULL;
 }
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b96e47f1c8a2..422c96fea249 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1787,7 +1787,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 		 * using prepare_to_wait_exclusive().
 		 */
 		while (nb) {
-			INDIRECT_CALL_1(sk->sk_data_ready,
+			INDIRECT_CALL_1(READ_ONCE(sk->sk_data_ready),
 					sock_def_readable, sk);
 			nb--;
 		}
-- 
2.53.0.345.g96ddfc5eaa-goog



* [PATCH v3 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() for UDP.
  2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg() Kuniyuki Iwashima
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki
  Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
	netdev

The UDP TX skb->destructor() is sock_wfree(), and UDP holds
lock_sock() only for UDP_CORK / MSG_MORE sendmsg().

Otherwise, sk->sk_write_space() is read locklessly.

Let's use WRITE_ONCE() and READ_ONCE() for sk->sk_write_space().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v3: Use WRITE_ONCE() in udp_bpf_update_proto()
v2: Cache sk->sk_write_space in sock_wfree()
---
 net/core/skmsg.c   | 2 +-
 net/core/sock.c    | 8 ++++++--
 net/ipv4/udp_bpf.c | 2 +-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 75fa94217e1e..3d7eb2f4ac98 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1297,7 +1297,7 @@ void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
 
 	psock->saved_data_ready = sk->sk_data_ready;
 	WRITE_ONCE(sk->sk_data_ready, sk_psock_verdict_data_ready);
-	sk->sk_write_space = sk_psock_write_space;
+	WRITE_ONCE(sk->sk_write_space, sk_psock_write_space);
 }
 
 void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
diff --git a/net/core/sock.c b/net/core/sock.c
index 693e6d80f501..710f57ff3768 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2673,8 +2673,12 @@ void sock_wfree(struct sk_buff *skb)
 	int old;
 
 	if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE)) {
+		void (*sk_write_space)(struct sock *sk);
+
+		sk_write_space = READ_ONCE(sk->sk_write_space);
+
 		if (sock_flag(sk, SOCK_RCU_FREE) &&
-		    sk->sk_write_space == sock_def_write_space) {
+		    sk_write_space == sock_def_write_space) {
 			rcu_read_lock();
 			free = __refcount_sub_and_test(len, &sk->sk_wmem_alloc,
 						       &old);
@@ -2690,7 +2694,7 @@ void sock_wfree(struct sk_buff *skb)
 		 * after sk_write_space() call
 		 */
 		WARN_ON(refcount_sub_and_test(len - 1, &sk->sk_wmem_alloc));
-		sk->sk_write_space(sk);
+		sk_write_space(sk);
 		len = 1;
 	}
 	/*
diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
index 91233e37cd97..779a3a03762f 100644
--- a/net/ipv4/udp_bpf.c
+++ b/net/ipv4/udp_bpf.c
@@ -158,7 +158,7 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
 	int family = sk->sk_family == AF_INET ? UDP_BPF_IPV4 : UDP_BPF_IPV6;
 
 	if (restore) {
-		sk->sk_write_space = psock->saved_write_space;
+		WRITE_ONCE(sk->sk_write_space, psock->saved_write_space);
 		sock_replace_proto(sk, psock->sk_proto);
 		return 0;
 	}
-- 
2.53.0.345.g96ddfc5eaa-goog



* [PATCH v3 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg().
  2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() " Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 4/6] sockmap: Pass gfp_t flag to sk_psock_skb_ingress() Kuniyuki Iwashima
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki
  Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
	netdev, syzbot+9307c991a6d07ce6e6d8

syzbot reported use-after-free of struct sk_msg in sk_msg_recvmsg(). [0]

sk_msg_recvmsg() peeks an sk_msg from psock->ingress_msg under a lock,
but the subsequent processing is lockless.

Thus, sk_msg_recvmsg() must be serialised by its callers; otherwise,
multiple threads could touch the same sk_msg.

For example, TCP uses lock_sock(), and AF_UNIX uses unix_sk(sk)->iolock.

Initially, udp_bpf_recvmsg() used lock_sock(), but the cited
commit accidentally removed it.

Let's serialise sk_msg_recvmsg() with lock_sock() in udp_bpf_recvmsg().

Note that holding spin_lock_bh(&sk->sk_receive_queue.lock) is not
an option due to copy_page_to_iter() in sk_msg_recvmsg().

[0]:
BUG: KASAN: slab-use-after-free in sk_msg_recvmsg+0xb54/0xc30 net/core/skmsg.c:428
Read of size 4 at addr ffff88814cdcf000 by task syz.0.24/6020

CPU: 1 UID: 0 PID: 6020 Comm: syz.0.24 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xba/0x230 mm/kasan/report.c:482
 kasan_report+0x117/0x150 mm/kasan/report.c:595
 sk_msg_recvmsg+0xb54/0xc30 net/core/skmsg.c:428
 udp_bpf_recvmsg+0x4bd/0xe00 net/ipv4/udp_bpf.c:84
 inet_recvmsg+0x260/0x270 net/ipv4/af_inet.c:891
 sock_recvmsg_nosec net/socket.c:1078 [inline]
 sock_recvmsg+0x1a8/0x270 net/socket.c:1100
 ____sys_recvmsg+0x1e6/0x4a0 net/socket.c:2812
 ___sys_recvmsg+0x215/0x590 net/socket.c:2854
 do_recvmmsg+0x334/0x800 net/socket.c:2949
 __sys_recvmmsg net/socket.c:3023 [inline]
 __do_sys_recvmmsg net/socket.c:3046 [inline]
 __se_sys_recvmmsg net/socket.c:3039 [inline]
 __x64_sys_recvmmsg+0x198/0x250 net/socket.c:3039
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb319f9aeb9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb31ad97028 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00007fb31a216090 RCX: 00007fb319f9aeb9
RDX: 0000000000000001 RSI: 0000200000000400 RDI: 0000000000000004
RBP: 00007fb31a008c1f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000040000021 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb31a216128 R14: 00007fb31a216090 R15: 00007ffe21dd0a98
 </TASK>

Allocated by task 6019:
 kasan_save_stack mm/kasan/common.c:57 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
 poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
 __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
 kasan_kmalloc include/linux/kasan.h:263 [inline]
 __kmalloc_cache_noprof+0x3d1/0x6e0 mm/slub.c:5780
 kmalloc_noprof include/linux/slab.h:957 [inline]
 kzalloc_noprof include/linux/slab.h:1094 [inline]
 alloc_sk_msg net/core/skmsg.c:510 [inline]
 sk_psock_skb_ingress_self+0x60/0x350 net/core/skmsg.c:612
 sk_psock_verdict_apply net/core/skmsg.c:1038 [inline]
 sk_psock_verdict_recv+0x7d9/0x8d0 net/core/skmsg.c:1236
 udp_read_skb+0x73e/0x7e0 net/ipv4/udp.c:2045
 sk_psock_verdict_data_ready+0x12d/0x550 net/core/skmsg.c:1257
 __udp_enqueue_schedule_skb+0xc54/0x10b0 net/ipv4/udp.c:1789
 __udp_queue_rcv_skb net/ipv4/udp.c:2346 [inline]
 udp_queue_rcv_one_skb+0xac5/0x19c0 net/ipv4/udp.c:2475
 __udp4_lib_mcast_deliver+0xc06/0xcf0 net/ipv4/udp.c:2585
 __udp4_lib_rcv+0x10f6/0x2620 net/ipv4/udp.c:2724
 ip_protocol_deliver_rcu+0x282/0x440 net/ipv4/ip_input.c:207
 ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:241
 NF_HOOK+0x336/0x3c0 include/linux/netfilter.h:318
 dst_input include/net/dst.h:474 [inline]
 ip_sublist_rcv_finish+0x221/0x2a0 net/ipv4/ip_input.c:584
 ip_list_rcv_finish net/ipv4/ip_input.c:628 [inline]
 ip_sublist_rcv+0x5c6/0xa70 net/ipv4/ip_input.c:644
 ip_list_rcv+0x3f1/0x450 net/ipv4/ip_input.c:678
 __netif_receive_skb_list_ptype net/core/dev.c:6195 [inline]
 __netif_receive_skb_list_core+0x7e5/0x810 net/core/dev.c:6242
 __netif_receive_skb_list net/core/dev.c:6294 [inline]
 netif_receive_skb_list_internal+0x995/0xcf0 net/core/dev.c:6385
 netif_receive_skb_list+0x54/0x410 net/core/dev.c:6437
 xdp_recv_frames net/bpf/test_run.c:269 [inline]
 xdp_test_run_batch net/bpf/test_run.c:350 [inline]
 bpf_test_run_xdp_live+0x1946/0x1cf0 net/bpf/test_run.c:379
 bpf_prog_test_run_xdp+0x81c/0x1160 net/bpf/test_run.c:1396
 bpf_prog_test_run+0x2c7/0x340 kernel/bpf/syscall.c:4703
 __sys_bpf+0x5cb/0x920 kernel/bpf/syscall.c:6182
 __do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
 __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6272
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 6021:
 kasan_save_stack mm/kasan/common.c:57 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
 poison_slab_object mm/kasan/common.c:253 [inline]
 __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
 kasan_slab_free include/linux/kasan.h:235 [inline]
 slab_free_hook mm/slub.c:2540 [inline]
 slab_free mm/slub.c:6674 [inline]
 kfree+0x1be/0x650 mm/slub.c:6882
 kfree_sk_msg include/linux/skmsg.h:385 [inline]
 sk_msg_recvmsg+0xaa8/0xc30 net/core/skmsg.c:483
 udp_bpf_recvmsg+0x4bd/0xe00 net/ipv4/udp_bpf.c:84
 inet_recvmsg+0x260/0x270 net/ipv4/af_inet.c:891
 sock_recvmsg_nosec net/socket.c:1078 [inline]
 sock_recvmsg+0x1a8/0x270 net/socket.c:1100
 ____sys_recvmsg+0x1e6/0x4a0 net/socket.c:2812
 ___sys_recvmsg+0x215/0x590 net/socket.c:2854
 do_recvmmsg+0x334/0x800 net/socket.c:2949
 __sys_recvmmsg net/socket.c:3023 [inline]
 __do_sys_recvmmsg net/socket.c:3046 [inline]
 __se_sys_recvmmsg net/socket.c:3039 [inline]
 __x64_sys_recvmmsg+0x198/0x250 net/socket.c:3039
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 9f2470fbc4cb ("skmsg: Improve udp_bpf_recvmsg() accuracy")
Reported-by: syzbot+9307c991a6d07ce6e6d8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69922ac9.a70a0220.2c38d7.00e0.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv4/udp_bpf.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
index 779a3a03762f..b07774034a91 100644
--- a/net/ipv4/udp_bpf.c
+++ b/net/ipv4/udp_bpf.c
@@ -52,7 +52,9 @@ static int udp_msg_wait_data(struct sock *sk, struct sk_psock *psock,
 	sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk);
 	ret = udp_msg_has_data(sk, psock);
 	if (!ret) {
+		release_sock(sk);
 		wait_woken(&wait, TASK_INTERRUPTIBLE, timeo);
+		lock_sock(sk);
 		ret = udp_msg_has_data(sk, psock);
 	}
 	sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk);
@@ -81,6 +83,7 @@ static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		goto out;
 	}
 
+	lock_sock(sk);
 msg_bytes_ready:
 	copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
 	if (!copied) {
@@ -92,11 +95,17 @@ static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		if (data) {
 			if (psock_has_data(psock))
 				goto msg_bytes_ready;
+
+			release_sock(sk);
+
 			ret = sk_udp_recvmsg(sk, msg, len, flags, addr_len);
 			goto out;
 		}
 		copied = -EAGAIN;
 	}
+
+	release_sock(sk);
+
 	ret = copied;
 out:
 	sk_psock_put(sk, psock);
-- 
2.53.0.345.g96ddfc5eaa-goog



* [PATCH v3 bpf/net 4/6] sockmap: Pass gfp_t flag to sk_psock_skb_ingress().
  2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
                   ` (2 preceding siblings ...)
  2026-02-19 17:37 ` [PATCH v3 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg() Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self() Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
  5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki
  Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
	netdev

SOCKMAP memory accounting for UDP is broken for two reasons:

  1. sk->sk_forward_alloc must be changed under
     spin_lock_bh(&sk->sk_receive_queue.lock)

  2. sk_psock_skb_ingress_self() should not be used for UDP
     since UDP may reclaim sk->sk_forward_alloc partially
     before passing skb to sockmap, resulting in a negative
     sk->sk_forward_alloc

This is a prep commit to consolidate sk_psock_skb_ingress_self()
and centralise the fix to sk_psock_skb_ingress().

Let's pass a gfp_t flag to sk_psock_skb_ingress() and inline
sk_psock_create_ingress_msg().

Note that now alloc_sk_msg() is called first.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/core/skmsg.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 3d7eb2f4ac98..57845b0d8a71 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -529,18 +529,6 @@ static struct sk_msg *alloc_sk_msg(gfp_t gfp)
 	return msg;
 }
 
-static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk,
-						  struct sk_buff *skb)
-{
-	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
-		return NULL;
-
-	if (!sk_rmem_schedule(sk, skb, skb->truesize))
-		return NULL;
-
-	return alloc_sk_msg(GFP_KERNEL);
-}
-
 static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
 					u32 off, u32 len,
 					struct sk_psock *psock,
@@ -588,11 +576,11 @@ static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb
 				     u32 off, u32 len, bool take_ref);
 
 static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
-				u32 off, u32 len)
+				u32 off, u32 len, gfp_t gfp_flags)
 {
 	struct sock *sk = psock->sk;
 	struct sk_msg *msg;
-	int err;
+	int err = -EAGAIN;
 
 	/* If we are receiving on the same sock skb->sk is already assigned,
 	 * skip memory accounting and owner transition seeing it already set
@@ -600,9 +588,16 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
 	 */
 	if (unlikely(skb->sk == sk))
 		return sk_psock_skb_ingress_self(psock, skb, off, len, true);
-	msg = sk_psock_create_ingress_msg(sk, skb);
+
+	msg = alloc_sk_msg(gfp_flags);
 	if (!msg)
-		return -EAGAIN;
+		goto out;
+
+	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+		goto free;
+
+	if (!sk_rmem_schedule(sk, skb, skb->truesize))
+		goto free;
 
 	/* This will transition ownership of the data from the socket where
 	 * the BPF program was run initiating the redirect to the socket
@@ -613,8 +608,12 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
 	skb_set_owner_r(skb, sk);
 	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, true);
 	if (err < 0)
-		kfree(msg);
+		goto free;
+out:
 	return err;
+free:
+	kfree(msg);
+	goto out;
 }
 
 /* Puts an skb on the ingress queue of the socket already assigned to the
@@ -652,7 +651,7 @@ static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
 		return skb_send_sock(psock->sk, skb, off, len);
 	}
 
-	return sk_psock_skb_ingress(psock, skb, off, len);
+	return sk_psock_skb_ingress(psock, skb, off, len, GFP_KERNEL);
 }
 
 static void sk_psock_skb_state(struct sk_psock *psock,
-- 
2.53.0.345.g96ddfc5eaa-goog



* [PATCH v3 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self().
  2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
                   ` (3 preceding siblings ...)
  2026-02-19 17:37 ` [PATCH v3 bpf/net 4/6] sockmap: Pass gfp_t flag to sk_psock_skb_ingress() Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
  2026-02-19 17:37 ` [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
  5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki
  Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
	netdev

SOCKMAP memory accounting for UDP is broken, and
sk_psock_skb_ingress_self() should not be used for UDP.

Let's consolidate sk_psock_skb_ingress_self() into
sk_psock_skb_ingress() so we can centralise the fix.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: Keep msg->sk assignment
---
 net/core/skmsg.c | 62 ++++++++++++++----------------------------------
 1 file changed, 18 insertions(+), 44 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 57845b0d8a71..6bf3c517dbd2 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -572,32 +572,31 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
 	return copied;
 }
 
-static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb,
-				     u32 off, u32 len, bool take_ref);
-
 static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
-				u32 off, u32 len, gfp_t gfp_flags)
+				u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
 {
 	struct sock *sk = psock->sk;
 	struct sk_msg *msg;
 	int err = -EAGAIN;
 
-	/* If we are receiving on the same sock skb->sk is already assigned,
-	 * skip memory accounting and owner transition seeing it already set
-	 * correctly.
-	 */
-	if (unlikely(skb->sk == sk))
-		return sk_psock_skb_ingress_self(psock, skb, off, len, true);
-
 	msg = alloc_sk_msg(gfp_flags);
 	if (!msg)
 		goto out;
 
-	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
-		goto free;
+	if (skb->sk != sk) {
+		if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+			goto free;
 
-	if (!sk_rmem_schedule(sk, skb, skb->truesize))
-		goto free;
+		if (!sk_rmem_schedule(sk, skb, skb->truesize))
+			goto free;
+	}
+
+	/* This is used in tcp_bpf_recvmsg_parser() to determine whether the
+	 * data originates from the socket's own protocol stack. No need to
+	 * refcount sk because msg's lifetime is bound to sk via the ingress_msg.
+	 */
+	if (skb->sk == sk || !take_ref)
+		msg->sk = sk;
 
 	/* This will transition ownership of the data from the socket where
 	 * the BPF program was run initiating the redirect to the socket
@@ -606,7 +605,8 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
 	 * into user buffers.
 	 */
 	skb_set_owner_r(skb, sk);
-	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, true);
+
+	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
 	if (err < 0)
 		goto free;
 out:
@@ -616,32 +616,6 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
 	goto out;
 }
 
-/* Puts an skb on the ingress queue of the socket already assigned to the
- * skb. In this case we do not need to check memory limits or skb_set_owner_r
- * because the skb is already accounted for here.
- */
-static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb,
-				     u32 off, u32 len, bool take_ref)
-{
-	struct sk_msg *msg = alloc_sk_msg(GFP_ATOMIC);
-	struct sock *sk = psock->sk;
-	int err;
-
-	if (unlikely(!msg))
-		return -EAGAIN;
-	skb_set_owner_r(skb, sk);
-
-	/* This is used in tcp_bpf_recvmsg_parser() to determine whether the
-	 * data originates from the socket's own protocol stack. No need to
-	 * refcount sk because msg's lifetime is bound to sk via the ingress_msg.
-	 */
-	msg->sk = sk;
-	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
-	if (err < 0)
-		kfree(msg);
-	return err;
-}
-
 static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
 			       u32 off, u32 len, bool ingress)
 {
@@ -651,7 +625,7 @@ static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
 		return skb_send_sock(psock->sk, skb, off, len);
 	}
 
-	return sk_psock_skb_ingress(psock, skb, off, len, GFP_KERNEL);
+	return sk_psock_skb_ingress(psock, skb, off, len, GFP_KERNEL, true);
 }
 
 static void sk_psock_skb_state(struct sk_psock *psock,
@@ -1058,7 +1032,7 @@ static int sk_psock_verdict_apply(struct sk_psock *psock, struct sk_buff *skb,
 				off = stm->offset;
 				len = stm->full_len;
 			}
-			err = sk_psock_skb_ingress_self(psock, skb, off, len, false);
+			err = sk_psock_skb_ingress(psock, skb, off, len, GFP_ATOMIC, false);
 		}
 		if (err < 0) {
 			spin_lock_bh(&psock->ingress_lock);
-- 
2.53.0.345.g96ddfc5eaa-goog



* [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
  2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
                   ` (4 preceding siblings ...)
  2026-02-19 17:37 ` [PATCH v3 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self() Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
  2026-02-19 18:10   ` bot+bpf-ci
  5 siblings, 1 reply; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki
  Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
	netdev, syzbot+5b3b7e51dda1be027b7a

syzbot reported an imbalanced sk->sk_forward_alloc [0] and
demonstrated that SOCKMAP's memory accounting for UDP is broken.

The repro put a UDP sk into SOCKMAP and redirected skb to itself,
where skb->truesize was 4240.

First, udp_rmem_schedule() set sk->sk_forward_alloc to 8192
(2 * PAGE_SIZE), and skb->truesize was charged:

  sk->sk_forward_alloc = 0 + 8192 - 4240;  // => 3952

Then, udp_read_skb() dequeued the skb by skb_recv_udp(), which finally
calls udp_rmem_release() and _partially_ reclaims sk->sk_forward_alloc
because skb->truesize was larger than PAGE_SIZE:

  sk->sk_forward_alloc += 4240;  // => 8192 (PAGE_SIZE is reclaimable)
  sk->sk_forward_alloc -= 4096;  // => 4096

Later, sk_psock_skb_ingress_self() called skb_set_owner_r() to
charge the skb again, triggering an sk->sk_forward_alloc underflow:

  sk->sk_forward_alloc -= 4240   // => -144

Another problem is that UDP memory accounting is not performed
under spin_lock_bh(&sk->sk_receive_queue.lock).

skb_set_owner_r() and sock_rfree() are called locklessly and
corrupt sk->sk_forward_alloc, leading to the splat.

Let's not skip memory accounting for UDP and ensure the proper
lock is held.

[0]:
WARNING: net/ipv4/af_inet.c:157 at inet_sock_destruct+0x62d/0x740 net/ipv4/af_inet.c:157, CPU#0: ksoftirqd/0/15
Modules linked in:
CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
RIP: 0010:inet_sock_destruct+0x62d/0x740 net/ipv4/af_inet.c:157
Code: 0f 0b 90 e9 58 fe ff ff e8 40 55 b3 f7 90 0f 0b 90 e9 8b fe ff ff e8 32 55 b3 f7 90 0f 0b 90 e9 b1 fe ff ff e8 24 55 b3 f7 90 <0f> 0b 90 e9 d7 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 95 fc
RSP: 0018:ffffc90000147a48 EFLAGS: 00010246
RAX: ffffffff8a1121dc RBX: dffffc0000000000 RCX: ffff88801d2c3d00
RDX: 0000000000000100 RSI: 0000000000000f70 RDI: 0000000000000000
RBP: 0000000000000f70 R08: ffff888030ce1327 R09: 1ffff1100619c264
R10: dffffc0000000000 R11: ffffed100619c265 R12: ffff888030ce1080
R13: dffffc0000000000 R14: ffff888030ce130c R15: ffffffff8fa87e00
FS:  0000000000000000(0000) GS:ffff8881256f8000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000000700 CR3: 000000007200c000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 __sk_destruct+0x85/0x880 net/core/sock.c:2350
 rcu_do_batch kernel/rcu/tree.c:2605 [inline]
 rcu_core+0xc9e/0x1750 kernel/rcu/tree.c:2857
 handle_softirqs+0x22a/0x7c0 kernel/softirq.c:622
 run_ksoftirqd+0x36/0x60 kernel/softirq.c:1063
 smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
 kthread+0x726/0x8b0 kernel/kthread.c:463
 ret_from_fork+0x51b/0xa40 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
 </TASK>

Fixes: d7f571188ecf ("udp: Implement ->read_sock() for sockmap")
Reported-by: syzbot+5b3b7e51dda1be027b7a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/698f84c6.a70a0220.2c38d7.00cb.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: fix build failure when CONFIG_INET=n
---
 include/net/udp.h |  9 +++++++++
 net/core/skmsg.c  | 20 +++++++++++++++++---
 net/ipv4/udp.c    |  9 +++++++++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index 700dbedcb15f..ae38a4da9388 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -455,6 +455,15 @@ struct sock *__udp6_lib_lookup(const struct net *net,
 			       struct sk_buff *skb);
 struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport);
+
+#ifdef CONFIG_INET
+void udp_sock_rfree(struct sk_buff *skb);
+#else
+static inline void udp_sock_rfree(struct sk_buff *skb)
+{
+}
+#endif
+
 int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 
 /* UDP uses skb->dev_scratch to cache as much information as possible and avoid
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 6bf3c517dbd2..c5fdb2827422 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -7,6 +7,7 @@
 
 #include <net/sock.h>
 #include <net/tcp.h>
+#include <net/udp.h>
 #include <net/tls.h>
 #include <trace/events/sock.h>
 
@@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
 				u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
 {
 	struct sock *sk = psock->sk;
+	bool is_udp = sk_is_udp(sk);
 	struct sk_msg *msg;
 	int err = -EAGAIN;
 
@@ -583,12 +585,15 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
 	if (!msg)
 		goto out;
 
-	if (skb->sk != sk) {
+	if (is_udp)
+		spin_lock_bh(&sk->sk_receive_queue.lock);
+
+	if (skb->sk != sk || is_udp) {
 		if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
-			goto free;
+			goto unlock;
 
 		if (!sk_rmem_schedule(sk, skb, skb->truesize))
-			goto free;
+			goto unlock;
 	}
 
 	/* This is used in tcp_bpf_recvmsg_parser() to determine whether the
@@ -606,11 +611,20 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
 	 */
 	skb_set_owner_r(skb, sk);
 
+	if (is_udp) {
+		skb->destructor = udp_sock_rfree;
+		spin_unlock_bh(&sk->sk_receive_queue.lock);
+	}
+
 	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
 	if (err < 0)
 		goto free;
 out:
 	return err;
+
+unlock:
+	if (is_udp)
+		spin_unlock_bh(&sk->sk_receive_queue.lock);
 free:
 	kfree(msg);
 	goto out;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 422c96fea249..831d26748a90 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2039,6 +2039,15 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags,
 }
 EXPORT_SYMBOL(__skb_recv_udp);
 
+void udp_sock_rfree(struct sk_buff *skb)
+{
+	struct sock *sk = skb->sk;
+
+	spin_lock_bh(&sk->sk_receive_queue.lock);
+	sock_rfree(skb);
+	spin_unlock_bh(&sk->sk_receive_queue.lock);
+}
+
 int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
 {
 	struct sk_buff *skb;
-- 
2.53.0.345.g96ddfc5eaa-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
  2026-02-19 17:37 ` [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
@ 2026-02-19 18:10   ` bot+bpf-ci
  2026-02-19 18:29     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 9+ messages in thread
From: bot+bpf-ci @ 2026-02-19 18:10 UTC (permalink / raw)
  To: kuniyu, john.fastabend, jakub
  Cc: willemdebruijn.kernel, kuniyu, kuni1840, bpf, netdev,
	syzbot+5b3b7e51dda1be027b7a, ast, andrii, daniel, martin.lau,
	eddyz87, yonghong.song, clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 3213 bytes --]

> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c

[ ... ]

> @@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
>  				u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
>  {
>  	struct sock *sk = psock->sk;
> +	bool is_udp = sk_is_udp(sk);
>  	struct sk_msg *msg;
>  	int err = -EAGAIN;
>
> @@ -583,12 +585,15 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
>  	if (!msg)
>  		goto out;
>
> -	if (skb->sk != sk) {
> +	if (is_udp)
> +		spin_lock_bh(&sk->sk_receive_queue.lock);
> +
> +	if (skb->sk != sk || is_udp) {
>  		if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
> -			goto free;
> +			goto unlock;
>
>  		if (!sk_rmem_schedule(sk, skb, skb->truesize))
> -			goto free;
> +			goto unlock;
>  	}

[ ... ]

>  	skb_set_owner_r(skb, sk);
                        ^^^

If sk_psock_skb_ingress_enqueue() fails after this point (for
example, skb_to_sgvec() fails and skb_linearize() also fails under
GFP_ATOMIC memory pressure), the error path frees msg and returns
-EAGAIN, but the skb is kept with skb->destructor set to
udp_sock_rfree.

The caller sk_psock_verdict_apply() then queues this skb onto
psock->ingress_skb for retry via sk_psock_backlog().

On retry, sk_psock_handle_skb() calls sk_psock_skb_ingress()
again. The function acquires sk->sk_receive_queue.lock, then calls
skb_set_owner_r() which calls skb_orphan(). skb_orphan() invokes
the current skb->destructor, which is now udp_sock_rfree():

  sk_psock_skb_ingress()
    spin_lock_bh(&sk->sk_receive_queue.lock)   <-- held
    skb_set_owner_r(skb, sk)
      skb_orphan(skb)
        skb->destructor(skb)           [= udp_sock_rfree]
          spin_lock_bh(&sk->sk_receive_queue.lock)  <-- deadlock

Could this self-deadlock on the receive queue lock?

> +	if (is_udp) {
> +		skb->destructor = udp_sock_rfree;
> +		spin_unlock_bh(&sk->sk_receive_queue.lock);
> +	}
> +
>  	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
>  	if (err < 0)
>  		goto free;
>  out:
>  	return err;
> +
> +unlock:
> +	if (is_udp)
> +		spin_unlock_bh(&sk->sk_receive_queue.lock);
>  free:
>  	kfree(msg);
>  	goto out;

> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c

[ ... ]

> +void udp_sock_rfree(struct sk_buff *skb)
> +{
> +	struct sock *sk = skb->sk;
> +
> +	spin_lock_bh(&sk->sk_receive_queue.lock);
> +	sock_rfree(skb);
> +	spin_unlock_bh(&sk->sk_receive_queue.lock);
> +}


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22193327667

AI-authorship-score: low
AI-authorship-explanation: The commit demonstrates deep understanding of UDP memory accounting internals and references a specific syzbot report, consistent with an experienced kernel networking developer.
issues-found: 1
issue-severity-score: high
issue-severity-explanation: Self-deadlock on sk->sk_receive_queue.lock when sk_psock_skb_ingress is retried after enqueue failure for UDP, causing a system hang.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
  2026-02-19 18:10   ` bot+bpf-ci
@ 2026-02-19 18:29     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 18:29 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: john.fastabend, jakub, willemdebruijn.kernel, kuni1840, bpf,
	netdev, syzbot+5b3b7e51dda1be027b7a, ast, andrii, daniel,
	martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai

On Thu, Feb 19, 2026 at 10:11 AM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
>
> [ ... ]
>
> > @@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> >                               u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
> >  {
> >       struct sock *sk = psock->sk;
> > +     bool is_udp = sk_is_udp(sk);
> >       struct sk_msg *msg;
> >       int err = -EAGAIN;
> >
> > @@ -583,12 +585,15 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> >       if (!msg)
> >               goto out;
> >
> > -     if (skb->sk != sk) {
> > +     if (is_udp)
> > +             spin_lock_bh(&sk->sk_receive_queue.lock);
> > +
> > +     if (skb->sk != sk || is_udp) {
> >               if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
> > -                     goto free;
> > +                     goto unlock;
> >
> >               if (!sk_rmem_schedule(sk, skb, skb->truesize))
> > -                     goto free;
> > +                     goto unlock;
> >       }
>
> [ ... ]
>
> >       skb_set_owner_r(skb, sk);
>                         ^^^
>
> If sk_psock_skb_ingress_enqueue() fails after this point (for
> example, skb_to_sgvec() fails and skb_linearize() also fails under
> GFP_ATOMIC memory pressure), the error path frees msg and returns
> -EAGAIN, but the skb is kept with skb->destructor set to
> udp_sock_rfree.
>
> The caller sk_psock_verdict_apply() then queues this skb onto
> psock->ingress_skb for retry via sk_psock_backlog().
>
> On retry, sk_psock_handle_skb() calls sk_psock_skb_ingress()
> again. The function acquires sk->sk_receive_queue.lock, then calls
> skb_set_owner_r() which calls skb_orphan(). skb_orphan() invokes
> the current skb->destructor, which is now udp_sock_rfree():
>
>   sk_psock_skb_ingress()
>     spin_lock_bh(&sk->sk_receive_queue.lock)   <-- held
>     skb_set_owner_r(skb, sk)
>       skb_orphan(skb)
>         skb->destructor(skb)           [= udp_sock_rfree]
>           spin_lock_bh(&sk->sk_receive_queue.lock)  <-- deadlock
>
> Could this self-deadlock on the receive queue lock?

Oh exactly...
Will short-circuit to sk_psock_skb_ingress_enqueue()
in that case.


>
> > +     if (is_udp) {
> > +             skb->destructor = udp_sock_rfree;
> > +             spin_unlock_bh(&sk->sk_receive_queue.lock);
> > +     }
> > +
> >       err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
> >       if (err < 0)
> >               goto free;
> >  out:
> >       return err;
> > +
> > +unlock:
> > +     if (is_udp)
> > +             spin_unlock_bh(&sk->sk_receive_queue.lock);
> >  free:
> >       kfree(msg);
> >       goto out;
>
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
>
> [ ... ]
>
> > +void udp_sock_rfree(struct sk_buff *skb)
> > +{
> > +     struct sock *sk = skb->sk;
> > +
> > +     spin_lock_bh(&sk->sk_receive_queue.lock);
> > +     sock_rfree(skb);
> > +     spin_unlock_bh(&sk->sk_receive_queue.lock);
> > +}
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22193327667
>
> AI-authorship-score: low
> AI-authorship-explanation: The commit demonstrates deep understanding of UDP memory accounting internals and references a specific syzbot report, consistent with an experienced kernel networking developer.
> issues-found: 1
> issue-severity-score: high
> issue-severity-explanation: Self-deadlock on sk->sk_receive_queue.lock when sk_psock_skb_ingress is retried after enqueue failure for UDP, causing a system hang.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-02-19 18:29 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() " Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg() Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 4/6] sockmap: Pass gfp_t flag to sk_psock_skb_ingress() Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self() Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
2026-02-19 18:10   ` bot+bpf-ci
2026-02-19 18:29     ` Kuniyuki Iwashima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox