* [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP.
@ 2026-02-19 17:37 Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
` (5 more replies)
0 siblings, 6 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki
Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
netdev
syzbot reported 3 issues in SOCKMAP for UDP.
Patches 1 and 2 fix lockless accesses to sk->sk_data_ready()
and sk->sk_write_space().
Patch 3 fixes a use-after-free in sk_msg_recvmsg().
Patches 4 and 5 consolidate sk_psock_skb_ingress_self() into
sk_psock_skb_ingress() as preparation.
Patch 6 fixes broken memory accounting.
Changes:
v3:
Patch 2: Use WRITE_ONCE() in udp_bpf_update_proto()
v2: https://lore.kernel.org/netdev/20260217000701.791189-1-kuniyu@google.com/
Patch 2: Cache sk->sk_write_space in sock_wfree()
Patch 5: Keep msg->sk assignment
Patch 6: Fix build failure when CONFIG_INET=n
v1: https://lore.kernel.org/netdev/20260215204353.3645744-1-kuniyu@google.com/
Kuniyuki Iwashima (6):
sockmap: Annotate sk->sk_data_ready() for UDP.
sockmap: Annotate sk->sk_write_space() for UDP.
sockmap: Fix use-after-free in udp_bpf_recvmsg().
sockmap: Pass gfp_t flag to sk_psock_skb_ingress().
sockmap: Consolidate sk_psock_skb_ingress_self().
sockmap: Fix broken memory accounting for UDP.
include/net/udp.h | 9 +++++
net/core/skmsg.c | 97 ++++++++++++++++++++--------------------------
net/core/sock.c | 8 +++-
net/ipv4/udp.c | 11 +++++-
net/ipv4/udp_bpf.c | 11 +++++-
5 files changed, 77 insertions(+), 59 deletions(-)
--
2.53.0.345.g96ddfc5eaa-goog
* [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() for UDP.
2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() " Kuniyuki Iwashima
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki
Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
netdev, syzbot+113cea56c13a8a1e95ab
syzbot reported a data race on sk->sk_data_ready(). [0]
The UDP fast path does not hold bh_lock_sock(); instead,
spin_lock_bh(&sk->sk_receive_queue.lock) is used.
Let's use WRITE_ONCE() and READ_ONCE() for sk->sk_data_ready().
Another option would be to hold sk->sk_receive_queue.lock in
sock_map_sk_acquire() when sk_is_udp() is true, but that is
overkill and also does not work for sk->sk_write_space().
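The annotation pattern can be sketched in plain userspace C. This is
illustrative only: READ_ONCE()/WRITE_ONCE() are approximated with
volatile casts (roughly what the kernel macros do for pointer-sized
accesses), and the struct and function names are hypothetical stand-ins
for the sock fields involved:

```c
#include <stddef.h>

#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

struct sock_stub {
	void (*sk_data_ready)(struct sock_stub *sk);
	int woken;
};

static void def_readable(struct sock_stub *sk) { sk->woken += 1; }
static void verdict_ready(struct sock_stub *sk) { sk->woken += 100; }

/* Writer side: publish the replacement callback with a single store,
 * as sk_psock_start_verdict() now does with WRITE_ONCE().
 */
static void install_verdict(struct sock_stub *sk)
{
	WRITE_ONCE(sk->sk_data_ready, verdict_ready);
}

/* Reader side (fast path): load the pointer exactly once into a local
 * and call through it, so the compiler cannot tear or re-read it while
 * a concurrent writer swaps the callback.
 */
static int deliver(struct sock_stub *sk)
{
	void (*ready)(struct sock_stub *) = READ_ONCE(sk->sk_data_ready);

	ready(sk);
	return sk->woken;
}
```

Without the volatile access, the compiler is free to reload
sk->sk_data_ready between the load and the call, which is exactly the
window KCSAN flags.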
[0]:
BUG: KCSAN: data-race in __udp_enqueue_schedule_skb / sk_psock_drop
write to 0xffff88811d063048 of 8 bytes by task 23114 on cpu 0:
sk_psock_stop_verdict net/core/skmsg.c:1287 [inline]
sk_psock_drop+0x12f/0x270 net/core/skmsg.c:873
sk_psock_put include/linux/skmsg.h:473 [inline]
sock_map_unref+0x2a5/0x300 net/core/sock_map.c:185
__sock_map_delete net/core/sock_map.c:426 [inline]
sock_map_delete_from_link net/core/sock_map.c:439 [inline]
sock_map_unlink net/core/sock_map.c:1608 [inline]
sock_map_remove_links+0x228/0x340 net/core/sock_map.c:1623
sock_map_close+0xa1/0x340 net/core/sock_map.c:1684
inet_release+0xcd/0xf0 net/ipv4/af_inet.c:437
__sock_release net/socket.c:662 [inline]
sock_close+0x6b/0x150 net/socket.c:1455
__fput+0x29b/0x650 fs/file_table.c:468
____fput+0x1c/0x30 fs/file_table.c:496
task_work_run+0x130/0x1a0 kernel/task_work.c:233
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
__exit_to_user_mode_loop kernel/entry/common.c:44 [inline]
exit_to_user_mode_loop+0x1f7/0x6f0 kernel/entry/common.c:75
__exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
do_syscall_64+0x1d3/0x2a0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffff88811d063048 of 8 bytes by task 23117 on cpu 1:
__udp_enqueue_schedule_skb+0x6c1/0x840 net/ipv4/udp.c:1789
__udp_queue_rcv_skb net/ipv4/udp.c:2346 [inline]
udp_queue_rcv_one_skb+0x709/0xc20 net/ipv4/udp.c:2475
udp_queue_rcv_skb+0x20e/0x2b0 net/ipv4/udp.c:2493
__udp4_lib_mcast_deliver+0x6e8/0x790 net/ipv4/udp.c:2585
__udp4_lib_rcv+0x96f/0x1260 net/ipv4/udp.c:2724
udp_rcv+0x4f/0x60 net/ipv4/udp.c:2911
ip_protocol_deliver_rcu+0x3f9/0x780 net/ipv4/ip_input.c:207
ip_local_deliver_finish+0x1fc/0x2f0 net/ipv4/ip_input.c:241
NF_HOOK include/linux/netfilter.h:318 [inline]
ip_local_deliver+0xe8/0x1e0 net/ipv4/ip_input.c:262
dst_input include/net/dst.h:474 [inline]
ip_sublist_rcv_finish net/ipv4/ip_input.c:584 [inline]
ip_list_rcv_finish net/ipv4/ip_input.c:628 [inline]
ip_sublist_rcv+0x42b/0x6d0 net/ipv4/ip_input.c:644
ip_list_rcv+0x261/0x290 net/ipv4/ip_input.c:678
__netif_receive_skb_list_ptype net/core/dev.c:6195 [inline]
__netif_receive_skb_list_core+0x4dc/0x500 net/core/dev.c:6242
__netif_receive_skb_list net/core/dev.c:6294 [inline]
netif_receive_skb_list_internal+0x47d/0x5f0 net/core/dev.c:6385
netif_receive_skb_list+0x31/0x1f0 net/core/dev.c:6437
xdp_recv_frames net/bpf/test_run.c:269 [inline]
xdp_test_run_batch net/bpf/test_run.c:350 [inline]
bpf_test_run_xdp_live+0x104c/0x1360 net/bpf/test_run.c:379
bpf_prog_test_run_xdp+0x57b/0xa10 net/bpf/test_run.c:1396
bpf_prog_test_run+0x204/0x340 kernel/bpf/syscall.c:4703
__sys_bpf+0x4c0/0x7b0 kernel/bpf/syscall.c:6182
__do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
__se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
__x64_sys_bpf+0x41/0x50 kernel/bpf/syscall.c:6272
x64_sys_call+0x28e1/0x3000 arch/x86/include/generated/asm/syscalls_64.h:322
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
value changed: 0xffffffff847b24d0 -> 0xffffffff84673410
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 23117 Comm: syz.8.5085 Tainted: G W syzkaller #0 PREEMPT(voluntary)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Fixes: 7b98cd42b049 ("bpf: sockmap: Add UDP support")
Reported-by: syzbot+113cea56c13a8a1e95ab@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69922ac9.a70a0220.2c38d7.00e1.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/core/skmsg.c | 4 ++--
net/ipv4/udp.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index ddde93dd8bc6..75fa94217e1e 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1296,7 +1296,7 @@ void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
return;
psock->saved_data_ready = sk->sk_data_ready;
- sk->sk_data_ready = sk_psock_verdict_data_ready;
+ WRITE_ONCE(sk->sk_data_ready, sk_psock_verdict_data_ready);
sk->sk_write_space = sk_psock_write_space;
}
@@ -1308,6 +1308,6 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
if (!psock->saved_data_ready)
return;
- sk->sk_data_ready = psock->saved_data_ready;
+ WRITE_ONCE(sk->sk_data_ready, psock->saved_data_ready);
psock->saved_data_ready = NULL;
}
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b96e47f1c8a2..422c96fea249 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1787,7 +1787,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
* using prepare_to_wait_exclusive().
*/
while (nb) {
- INDIRECT_CALL_1(sk->sk_data_ready,
+ INDIRECT_CALL_1(READ_ONCE(sk->sk_data_ready),
sock_def_readable, sk);
nb--;
}
--
2.53.0.345.g96ddfc5eaa-goog
* [PATCH v3 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() for UDP.
2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg() Kuniyuki Iwashima
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki
Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
netdev
The UDP TX skb->destructor() is sock_wfree(), and UDP holds
lock_sock() only for UDP_CORK / MSG_MORE sendmsg();
otherwise, sk->sk_write_space() is read locklessly.
Let's use WRITE_ONCE() and READ_ONCE() for sk->sk_write_space().
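The sock_wfree() side of the change can be sketched in userspace C
(volatile-cast READ_ONCE() stand-in; struct and function names are
hypothetical). The point is that the pointer compared against
sock_def_write_space and the pointer actually called are the same
loaded value:

```c
#include <stddef.h>

#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct wsock {
	void (*sk_write_space)(struct wsock *sk);
	int path;	/* records which branch ran */
};

static void sock_def_write_space(struct wsock *sk) { sk->path = 1; }
static void psock_write_space(struct wsock *sk)    { sk->path = 2; }

/* Shape of the sock_wfree() change: read sk_write_space once into a
 * local, so the comparison and the indirect call cannot observe two
 * different values if a concurrent writer swaps the callback.
 */
static int wfree(struct wsock *sk)
{
	void (*sk_write_space)(struct wsock *sk);

	sk_write_space = READ_ONCE(sk->sk_write_space);

	if (sk_write_space == sock_def_write_space)
		sk->path = 1;	/* fast path, callback known */
	else
		sk_write_space(sk);
	return sk->path;
}
```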
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v3: Use WRITE_ONCE() in udp_bpf_update_proto()
v2: Cache sk->sk_write_space in sock_wfree()
---
net/core/skmsg.c | 2 +-
net/core/sock.c | 8 ++++++--
net/ipv4/udp_bpf.c | 2 +-
3 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 75fa94217e1e..3d7eb2f4ac98 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1297,7 +1297,7 @@ void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
psock->saved_data_ready = sk->sk_data_ready;
WRITE_ONCE(sk->sk_data_ready, sk_psock_verdict_data_ready);
- sk->sk_write_space = sk_psock_write_space;
+ WRITE_ONCE(sk->sk_write_space, sk_psock_write_space);
}
void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
diff --git a/net/core/sock.c b/net/core/sock.c
index 693e6d80f501..710f57ff3768 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2673,8 +2673,12 @@ void sock_wfree(struct sk_buff *skb)
int old;
if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE)) {
+ void (*sk_write_space)(struct sock *sk);
+
+ sk_write_space = READ_ONCE(sk->sk_write_space);
+
if (sock_flag(sk, SOCK_RCU_FREE) &&
- sk->sk_write_space == sock_def_write_space) {
+ sk_write_space == sock_def_write_space) {
rcu_read_lock();
free = __refcount_sub_and_test(len, &sk->sk_wmem_alloc,
&old);
@@ -2690,7 +2694,7 @@ void sock_wfree(struct sk_buff *skb)
* after sk_write_space() call
*/
WARN_ON(refcount_sub_and_test(len - 1, &sk->sk_wmem_alloc));
- sk->sk_write_space(sk);
+ sk_write_space(sk);
len = 1;
}
/*
diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
index 91233e37cd97..779a3a03762f 100644
--- a/net/ipv4/udp_bpf.c
+++ b/net/ipv4/udp_bpf.c
@@ -158,7 +158,7 @@ int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
int family = sk->sk_family == AF_INET ? UDP_BPF_IPV4 : UDP_BPF_IPV6;
if (restore) {
- sk->sk_write_space = psock->saved_write_space;
+ WRITE_ONCE(sk->sk_write_space, psock->saved_write_space);
sock_replace_proto(sk, psock->sk_proto);
return 0;
}
--
2.53.0.345.g96ddfc5eaa-goog
* [PATCH v3 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg().
2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 1/6] sockmap: Annotate sk->sk_data_ready() " Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 2/6] sockmap: Annotate sk->sk_write_space() " Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 4/6] sockmap: Pass gfp_t flag to sk_psock_skb_ingress() Kuniyuki Iwashima
` (2 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki
Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
netdev, syzbot+9307c991a6d07ce6e6d8
syzbot reported a use-after-free of struct sk_msg in sk_msg_recvmsg(). [0]
sk_msg_recvmsg() peeks sk_msg from psock->ingress_msg under a lock,
but the subsequent processing is lockless.
Thus, sk_msg_recvmsg() must be serialised by its callers; otherwise,
multiple threads could touch the same sk_msg.
For example, TCP uses lock_sock(), and AF_UNIX uses unix_sk(sk)->iolock.
udp_bpf_recvmsg() originally used lock_sock(), but the cited commit
accidentally removed it.
Let's serialise sk_msg_recvmsg() with lock_sock() in udp_bpf_recvmsg().
Note that holding spin_lock_bh(&sk->sk_receive_queue.lock) is not an
option, because sk_msg_recvmsg() calls copy_page_to_iter(), which may
sleep while copying to user memory.
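The locking shape of the fix can be sketched in userspace C, with a
pthread mutex standing in for lock_sock() and a counter standing in for
psock->ingress_msg (all names hypothetical): each receiver drains
messages only while holding the lock, so no two threads ever process
the same message:

```c
#include <pthread.h>
#include <stddef.h>

#define NMSG 1000

static pthread_mutex_t sock_lock = PTHREAD_MUTEX_INITIALIZER;
static int ingress_msgs = NMSG;	/* models psock->ingress_msg */
static int copied;		/* messages consumed in total */

static void *recv_worker(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&sock_lock);	/* lock_sock(sk) */
		if (ingress_msgs == 0) {
			pthread_mutex_unlock(&sock_lock);
			return NULL;
		}
		ingress_msgs--;	/* peek + free one sk_msg, serialised */
		copied++;
		pthread_mutex_unlock(&sock_lock); /* release_sock(sk) */
	}
}

/* Run two concurrent receivers; with caller-level serialisation each
 * message is consumed exactly once and the counters stay consistent.
 */
static int run_two_receivers(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, recv_worker, NULL);
	pthread_create(&t2, NULL, recv_worker, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return copied;
}
```

Dropping the lock/unlock pair reproduces the bug class: both workers
can claim the same message, which in the kernel case manifests as the
KASAN use-after-free on a freed sk_msg.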
[0]:
BUG: KASAN: slab-use-after-free in sk_msg_recvmsg+0xb54/0xc30 net/core/skmsg.c:428
Read of size 4 at addr ffff88814cdcf000 by task syz.0.24/6020
CPU: 1 UID: 0 PID: 6020 Comm: syz.0.24 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xba/0x230 mm/kasan/report.c:482
kasan_report+0x117/0x150 mm/kasan/report.c:595
sk_msg_recvmsg+0xb54/0xc30 net/core/skmsg.c:428
udp_bpf_recvmsg+0x4bd/0xe00 net/ipv4/udp_bpf.c:84
inet_recvmsg+0x260/0x270 net/ipv4/af_inet.c:891
sock_recvmsg_nosec net/socket.c:1078 [inline]
sock_recvmsg+0x1a8/0x270 net/socket.c:1100
____sys_recvmsg+0x1e6/0x4a0 net/socket.c:2812
___sys_recvmsg+0x215/0x590 net/socket.c:2854
do_recvmmsg+0x334/0x800 net/socket.c:2949
__sys_recvmmsg net/socket.c:3023 [inline]
__do_sys_recvmmsg net/socket.c:3046 [inline]
__se_sys_recvmmsg net/socket.c:3039 [inline]
__x64_sys_recvmmsg+0x198/0x250 net/socket.c:3039
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb319f9aeb9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb31ad97028 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00007fb31a216090 RCX: 00007fb319f9aeb9
RDX: 0000000000000001 RSI: 0000200000000400 RDI: 0000000000000004
RBP: 00007fb31a008c1f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000040000021 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb31a216128 R14: 00007fb31a216090 R15: 00007ffe21dd0a98
</TASK>
Allocated by task 6019:
kasan_save_stack mm/kasan/common.c:57 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
kasan_kmalloc include/linux/kasan.h:263 [inline]
__kmalloc_cache_noprof+0x3d1/0x6e0 mm/slub.c:5780
kmalloc_noprof include/linux/slab.h:957 [inline]
kzalloc_noprof include/linux/slab.h:1094 [inline]
alloc_sk_msg net/core/skmsg.c:510 [inline]
sk_psock_skb_ingress_self+0x60/0x350 net/core/skmsg.c:612
sk_psock_verdict_apply net/core/skmsg.c:1038 [inline]
sk_psock_verdict_recv+0x7d9/0x8d0 net/core/skmsg.c:1236
udp_read_skb+0x73e/0x7e0 net/ipv4/udp.c:2045
sk_psock_verdict_data_ready+0x12d/0x550 net/core/skmsg.c:1257
__udp_enqueue_schedule_skb+0xc54/0x10b0 net/ipv4/udp.c:1789
__udp_queue_rcv_skb net/ipv4/udp.c:2346 [inline]
udp_queue_rcv_one_skb+0xac5/0x19c0 net/ipv4/udp.c:2475
__udp4_lib_mcast_deliver+0xc06/0xcf0 net/ipv4/udp.c:2585
__udp4_lib_rcv+0x10f6/0x2620 net/ipv4/udp.c:2724
ip_protocol_deliver_rcu+0x282/0x440 net/ipv4/ip_input.c:207
ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:241
NF_HOOK+0x336/0x3c0 include/linux/netfilter.h:318
dst_input include/net/dst.h:474 [inline]
ip_sublist_rcv_finish+0x221/0x2a0 net/ipv4/ip_input.c:584
ip_list_rcv_finish net/ipv4/ip_input.c:628 [inline]
ip_sublist_rcv+0x5c6/0xa70 net/ipv4/ip_input.c:644
ip_list_rcv+0x3f1/0x450 net/ipv4/ip_input.c:678
__netif_receive_skb_list_ptype net/core/dev.c:6195 [inline]
__netif_receive_skb_list_core+0x7e5/0x810 net/core/dev.c:6242
__netif_receive_skb_list net/core/dev.c:6294 [inline]
netif_receive_skb_list_internal+0x995/0xcf0 net/core/dev.c:6385
netif_receive_skb_list+0x54/0x410 net/core/dev.c:6437
xdp_recv_frames net/bpf/test_run.c:269 [inline]
xdp_test_run_batch net/bpf/test_run.c:350 [inline]
bpf_test_run_xdp_live+0x1946/0x1cf0 net/bpf/test_run.c:379
bpf_prog_test_run_xdp+0x81c/0x1160 net/bpf/test_run.c:1396
bpf_prog_test_run+0x2c7/0x340 kernel/bpf/syscall.c:4703
__sys_bpf+0x5cb/0x920 kernel/bpf/syscall.c:6182
__do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
__se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6272
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 6021:
kasan_save_stack mm/kasan/common.c:57 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
poison_slab_object mm/kasan/common.c:253 [inline]
__kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
kasan_slab_free include/linux/kasan.h:235 [inline]
slab_free_hook mm/slub.c:2540 [inline]
slab_free mm/slub.c:6674 [inline]
kfree+0x1be/0x650 mm/slub.c:6882
kfree_sk_msg include/linux/skmsg.h:385 [inline]
sk_msg_recvmsg+0xaa8/0xc30 net/core/skmsg.c:483
udp_bpf_recvmsg+0x4bd/0xe00 net/ipv4/udp_bpf.c:84
inet_recvmsg+0x260/0x270 net/ipv4/af_inet.c:891
sock_recvmsg_nosec net/socket.c:1078 [inline]
sock_recvmsg+0x1a8/0x270 net/socket.c:1100
____sys_recvmsg+0x1e6/0x4a0 net/socket.c:2812
___sys_recvmsg+0x215/0x590 net/socket.c:2854
do_recvmmsg+0x334/0x800 net/socket.c:2949
__sys_recvmmsg net/socket.c:3023 [inline]
__do_sys_recvmmsg net/socket.c:3046 [inline]
__se_sys_recvmmsg net/socket.c:3039 [inline]
__x64_sys_recvmmsg+0x198/0x250 net/socket.c:3039
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 9f2470fbc4cb ("skmsg: Improve udp_bpf_recvmsg() accuracy")
Reported-by: syzbot+9307c991a6d07ce6e6d8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69922ac9.a70a0220.2c38d7.00e0.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/ipv4/udp_bpf.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
index 779a3a03762f..b07774034a91 100644
--- a/net/ipv4/udp_bpf.c
+++ b/net/ipv4/udp_bpf.c
@@ -52,7 +52,9 @@ static int udp_msg_wait_data(struct sock *sk, struct sk_psock *psock,
sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk);
ret = udp_msg_has_data(sk, psock);
if (!ret) {
+ release_sock(sk);
wait_woken(&wait, TASK_INTERRUPTIBLE, timeo);
+ lock_sock(sk);
ret = udp_msg_has_data(sk, psock);
}
sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk);
@@ -81,6 +83,7 @@ static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
goto out;
}
+ lock_sock(sk);
msg_bytes_ready:
copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
if (!copied) {
@@ -92,11 +95,17 @@ static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
if (data) {
if (psock_has_data(psock))
goto msg_bytes_ready;
+
+ release_sock(sk);
+
ret = sk_udp_recvmsg(sk, msg, len, flags, addr_len);
goto out;
}
copied = -EAGAIN;
}
+
+ release_sock(sk);
+
ret = copied;
out:
sk_psock_put(sk, psock);
--
2.53.0.345.g96ddfc5eaa-goog
* [PATCH v3 bpf/net 4/6] sockmap: Pass gfp_t flag to sk_psock_skb_ingress().
2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
` (2 preceding siblings ...)
2026-02-19 17:37 ` [PATCH v3 bpf/net 3/6] sockmap: Fix use-after-free in udp_bpf_recvmsg() Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self() Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki
Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
netdev
SOCKMAP memory accounting for UDP is broken for two reasons:
1. sk->sk_forward_alloc must be changed under
spin_lock_bh(&sk->sk_receive_queue.lock).
2. sk_psock_skb_ingress_self() should not be used for UDP,
because UDP may partially reclaim sk->sk_forward_alloc
before passing the skb to sockmap, resulting in a negative
sk->sk_forward_alloc.
This is a preparatory commit to consolidate sk_psock_skb_ingress_self()
into sk_psock_skb_ingress() and centralise the fix there.
Let's pass a gfp_t flag to sk_psock_skb_ingress() and inline
sk_psock_create_ingress_msg().
Note that alloc_sk_msg() is now called first.
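The reworked control flow can be sketched as a small userspace C
function (simplified stand-ins, not the kernel functions): allocate the
message first, then run the admission checks, unwinding through a
single `free` label on failure, as the diff below does:

```c
#include <errno.h>
#include <stdlib.h>

struct msg_stub { int len; };

/* Allocate-first, check-second shape of sk_psock_skb_ingress() after
 * this patch. On success, ownership of msg moves to the caller (in the
 * kernel, to the ingress queue); on failure, msg is freed here.
 */
static int ingress(int rmem_alloc, int rcvbuf, struct msg_stub **msgp)
{
	struct msg_stub *msg;
	int err = -EAGAIN;

	msg = malloc(sizeof(*msg));	/* models alloc_sk_msg(gfp_flags) */
	if (!msg)
		goto out;

	if (rmem_alloc > rcvbuf)	/* receive buffer full */
		goto free;

	*msgp = msg;			/* models ..._ingress_enqueue() */
	err = 0;
out:
	return err;
free:
	free(msg);
	goto out;
}
```

Moving the allocation ahead of the checks is what later allows the
checks (and their locking) to be adjusted in one place.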
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/core/skmsg.c | 35 +++++++++++++++++------------------
1 file changed, 17 insertions(+), 18 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 3d7eb2f4ac98..57845b0d8a71 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -529,18 +529,6 @@ static struct sk_msg *alloc_sk_msg(gfp_t gfp)
return msg;
}
-static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk,
- struct sk_buff *skb)
-{
- if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
- return NULL;
-
- if (!sk_rmem_schedule(sk, skb, skb->truesize))
- return NULL;
-
- return alloc_sk_msg(GFP_KERNEL);
-}
-
static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
u32 off, u32 len,
struct sk_psock *psock,
@@ -588,11 +576,11 @@ static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb
u32 off, u32 len, bool take_ref);
static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
- u32 off, u32 len)
+ u32 off, u32 len, gfp_t gfp_flags)
{
struct sock *sk = psock->sk;
struct sk_msg *msg;
- int err;
+ int err = -EAGAIN;
/* If we are receiving on the same sock skb->sk is already assigned,
* skip memory accounting and owner transition seeing it already set
@@ -600,9 +588,16 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
*/
if (unlikely(skb->sk == sk))
return sk_psock_skb_ingress_self(psock, skb, off, len, true);
- msg = sk_psock_create_ingress_msg(sk, skb);
+
+ msg = alloc_sk_msg(gfp_flags);
if (!msg)
- return -EAGAIN;
+ goto out;
+
+ if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+ goto free;
+
+ if (!sk_rmem_schedule(sk, skb, skb->truesize))
+ goto free;
/* This will transition ownership of the data from the socket where
* the BPF program was run initiating the redirect to the socket
@@ -613,8 +608,12 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
skb_set_owner_r(skb, sk);
err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, true);
if (err < 0)
- kfree(msg);
+ goto free;
+out:
return err;
+free:
+ kfree(msg);
+ goto out;
}
/* Puts an skb on the ingress queue of the socket already assigned to the
@@ -652,7 +651,7 @@ static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
return skb_send_sock(psock->sk, skb, off, len);
}
- return sk_psock_skb_ingress(psock, skb, off, len);
+ return sk_psock_skb_ingress(psock, skb, off, len, GFP_KERNEL);
}
static void sk_psock_skb_state(struct sk_psock *psock,
--
2.53.0.345.g96ddfc5eaa-goog
* [PATCH v3 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self().
2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
` (3 preceding siblings ...)
2026-02-19 17:37 ` [PATCH v3 bpf/net 4/6] sockmap: Pass gfp_t flag to sk_psock_skb_ingress() Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
2026-02-19 17:37 ` [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
5 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki
Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
netdev
SOCKMAP memory accounting for UDP is broken, and
sk_psock_skb_ingress_self() should not be used for UDP.
Let's consolidate sk_psock_skb_ingress_self() into
sk_psock_skb_ingress() so we can centralise the fix.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: Keep msg->sk assignment
---
net/core/skmsg.c | 62 ++++++++++++++----------------------------------
1 file changed, 18 insertions(+), 44 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 57845b0d8a71..6bf3c517dbd2 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -572,32 +572,31 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
return copied;
}
-static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb,
- u32 off, u32 len, bool take_ref);
-
static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
- u32 off, u32 len, gfp_t gfp_flags)
+ u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
{
struct sock *sk = psock->sk;
struct sk_msg *msg;
int err = -EAGAIN;
- /* If we are receiving on the same sock skb->sk is already assigned,
- * skip memory accounting and owner transition seeing it already set
- * correctly.
- */
- if (unlikely(skb->sk == sk))
- return sk_psock_skb_ingress_self(psock, skb, off, len, true);
-
msg = alloc_sk_msg(gfp_flags);
if (!msg)
goto out;
- if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
- goto free;
+ if (skb->sk != sk) {
+ if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+ goto free;
- if (!sk_rmem_schedule(sk, skb, skb->truesize))
- goto free;
+ if (!sk_rmem_schedule(sk, skb, skb->truesize))
+ goto free;
+ }
+
+ /* This is used in tcp_bpf_recvmsg_parser() to determine whether the
+ * data originates from the socket's own protocol stack. No need to
+ * refcount sk because msg's lifetime is bound to sk via the ingress_msg.
+ */
+ if (skb->sk == sk || !take_ref)
+ msg->sk = sk;
/* This will transition ownership of the data from the socket where
* the BPF program was run initiating the redirect to the socket
@@ -606,7 +605,8 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
* into user buffers.
*/
skb_set_owner_r(skb, sk);
- err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, true);
+
+ err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
if (err < 0)
goto free;
out:
@@ -616,32 +616,6 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
goto out;
}
-/* Puts an skb on the ingress queue of the socket already assigned to the
- * skb. In this case we do not need to check memory limits or skb_set_owner_r
- * because the skb is already accounted for here.
- */
-static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb,
- u32 off, u32 len, bool take_ref)
-{
- struct sk_msg *msg = alloc_sk_msg(GFP_ATOMIC);
- struct sock *sk = psock->sk;
- int err;
-
- if (unlikely(!msg))
- return -EAGAIN;
- skb_set_owner_r(skb, sk);
-
- /* This is used in tcp_bpf_recvmsg_parser() to determine whether the
- * data originates from the socket's own protocol stack. No need to
- * refcount sk because msg's lifetime is bound to sk via the ingress_msg.
- */
- msg->sk = sk;
- err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
- if (err < 0)
- kfree(msg);
- return err;
-}
-
static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
u32 off, u32 len, bool ingress)
{
@@ -651,7 +625,7 @@ static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
return skb_send_sock(psock->sk, skb, off, len);
}
- return sk_psock_skb_ingress(psock, skb, off, len, GFP_KERNEL);
+ return sk_psock_skb_ingress(psock, skb, off, len, GFP_KERNEL, true);
}
static void sk_psock_skb_state(struct sk_psock *psock,
@@ -1058,7 +1032,7 @@ static int sk_psock_verdict_apply(struct sk_psock *psock, struct sk_buff *skb,
off = stm->offset;
len = stm->full_len;
}
- err = sk_psock_skb_ingress_self(psock, skb, off, len, false);
+ err = sk_psock_skb_ingress(psock, skb, off, len, GFP_ATOMIC, false);
}
if (err < 0) {
spin_lock_bh(&psock->ingress_lock);
--
2.53.0.345.g96ddfc5eaa-goog
* [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
2026-02-19 17:37 [PATCH v3 bpf/net 0/6] sockmap: Fix UAF and broken memory accounting for UDP Kuniyuki Iwashima
` (4 preceding siblings ...)
2026-02-19 17:37 ` [PATCH v3 bpf/net 5/6] sockmap: Consolidate sk_psock_skb_ingress_self() Kuniyuki Iwashima
@ 2026-02-19 17:37 ` Kuniyuki Iwashima
2026-02-19 18:10 ` bot+bpf-ci
5 siblings, 1 reply; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 17:37 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki
Cc: Willem de Bruijn, Kuniyuki Iwashima, Kuniyuki Iwashima, bpf,
netdev, syzbot+5b3b7e51dda1be027b7a
syzbot reported an imbalanced sk->sk_forward_alloc [0],
demonstrating that SOCKMAP's UDP memory accounting is broken.
The repro put a UDP sk into a SOCKMAP and redirected an skb
to itself, where skb->truesize was 4240.
First, udp_rmem_schedule() set sk->sk_forward_alloc to 8192
(2 * PAGE_SIZE), and skb->truesize was charged:
sk->sk_forward_alloc = 0 + 8192 - 4240; // => 3952
Then, udp_read_skb() dequeued the skb via skb_recv_udp(), which
eventually called udp_rmem_release() and only _partially_ reclaimed
sk->sk_forward_alloc, because skb->truesize was larger than PAGE_SIZE:
sk->sk_forward_alloc += 4240; // => 8192 (PAGE_SIZE is reclaimable)
sk->sk_forward_alloc -= 4096; // => 4096
Later, sk_psock_skb_ingress_self() called skb_set_owner_r() to
charge the skb again, triggering an sk->sk_forward_alloc underflow:
sk->sk_forward_alloc -= 4240 // => -144
Another problem is that UDP memory accounting is not performed
under spin_lock_bh(&sk->sk_receive_queue.lock).
skb_set_owner_r() and sock_rfree() are called locklessly and
corrupt sk->sk_forward_alloc, leading to the splat.
Let's not skip memory accounting for UDP and ensure the proper
lock is held.
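The arithmetic above can be replayed as a standalone C sketch
(truesize == 4240 and PAGE_SIZE == 4096, as in the report; the helpers
are simplified models of the kernel functions, not the real
implementations):

```c
#include <assert.h>

#define PAGE_SIZE 4096

/* models udp_rmem_schedule(): reserve two pages, charge the skb */
static int rmem_schedule(int fwd_alloc, int truesize)
{
	return fwd_alloc + 2 * PAGE_SIZE - truesize;
}

/* models udp_rmem_release(): uncharge, then reclaim whole pages only */
static int rmem_release(int fwd_alloc, int truesize)
{
	fwd_alloc += truesize;
	return fwd_alloc - (truesize / PAGE_SIZE) * PAGE_SIZE;
}

/* models skb_set_owner_r() charging the same skb a second time */
static int charge_again(int fwd_alloc, int truesize)
{
	return fwd_alloc - truesize;
}

static int replay(int truesize)
{
	int fwd_alloc = 0;

	fwd_alloc = rmem_schedule(fwd_alloc, truesize);	/* => 3952 */
	assert(fwd_alloc == 3952);
	fwd_alloc = rmem_release(fwd_alloc, truesize);	/* => 4096 */
	assert(fwd_alloc == 4096);
	return charge_again(fwd_alloc, truesize);	/* underflow */
}
```

The second charge drives the model's sk_forward_alloc to -144, matching
the imbalance in the splat.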
[0]:
WARNING: net/ipv4/af_inet.c:157 at inet_sock_destruct+0x62d/0x740 net/ipv4/af_inet.c:157, CPU#0: ksoftirqd/0/15
Modules linked in:
CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
RIP: 0010:inet_sock_destruct+0x62d/0x740 net/ipv4/af_inet.c:157
Code: 0f 0b 90 e9 58 fe ff ff e8 40 55 b3 f7 90 0f 0b 90 e9 8b fe ff ff e8 32 55 b3 f7 90 0f 0b 90 e9 b1 fe ff ff e8 24 55 b3 f7 90 <0f> 0b 90 e9 d7 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 95 fc
RSP: 0018:ffffc90000147a48 EFLAGS: 00010246
RAX: ffffffff8a1121dc RBX: dffffc0000000000 RCX: ffff88801d2c3d00
RDX: 0000000000000100 RSI: 0000000000000f70 RDI: 0000000000000000
RBP: 0000000000000f70 R08: ffff888030ce1327 R09: 1ffff1100619c264
R10: dffffc0000000000 R11: ffffed100619c265 R12: ffff888030ce1080
R13: dffffc0000000000 R14: ffff888030ce130c R15: ffffffff8fa87e00
FS: 0000000000000000(0000) GS:ffff8881256f8000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000000700 CR3: 000000007200c000 CR4: 00000000003526f0
Call Trace:
<TASK>
__sk_destruct+0x85/0x880 net/core/sock.c:2350
rcu_do_batch kernel/rcu/tree.c:2605 [inline]
rcu_core+0xc9e/0x1750 kernel/rcu/tree.c:2857
handle_softirqs+0x22a/0x7c0 kernel/softirq.c:622
run_ksoftirqd+0x36/0x60 kernel/softirq.c:1063
smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
kthread+0x726/0x8b0 kernel/kthread.c:463
ret_from_fork+0x51b/0xa40 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
</TASK>
Fixes: d7f571188ecf ("udp: Implement ->read_sock() for sockmap")
Reported-by: syzbot+5b3b7e51dda1be027b7a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/698f84c6.a70a0220.2c38d7.00cb.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: fix build failure when CONFIG_INET=n
---
include/net/udp.h | 9 +++++++++
net/core/skmsg.c | 20 +++++++++++++++++---
net/ipv4/udp.c | 9 +++++++++
3 files changed, 35 insertions(+), 3 deletions(-)
diff --git a/include/net/udp.h b/include/net/udp.h
index 700dbedcb15f..ae38a4da9388 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -455,6 +455,15 @@ struct sock *__udp6_lib_lookup(const struct net *net,
struct sk_buff *skb);
struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
__be16 sport, __be16 dport);
+
+#ifdef CONFIG_INET
+void udp_sock_rfree(struct sk_buff *skb);
+#else
+static inline void udp_sock_rfree(struct sk_buff *skb)
+{
+}
+#endif
+
int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
/* UDP uses skb->dev_scratch to cache as much information as possible and avoid
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 6bf3c517dbd2..c5fdb2827422 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -7,6 +7,7 @@
#include <net/sock.h>
#include <net/tcp.h>
+#include <net/udp.h>
#include <net/tls.h>
#include <trace/events/sock.h>
@@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
{
struct sock *sk = psock->sk;
+ bool is_udp = sk_is_udp(sk);
struct sk_msg *msg;
int err = -EAGAIN;
@@ -583,12 +585,15 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
if (!msg)
goto out;
- if (skb->sk != sk) {
+ if (is_udp)
+ spin_lock_bh(&sk->sk_receive_queue.lock);
+
+ if (skb->sk != sk || is_udp) {
if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
- goto free;
+ goto unlock;
if (!sk_rmem_schedule(sk, skb, skb->truesize))
- goto free;
+ goto unlock;
}
/* This is used in tcp_bpf_recvmsg_parser() to determine whether the
@@ -606,11 +611,20 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
*/
skb_set_owner_r(skb, sk);
+ if (is_udp) {
+ skb->destructor = udp_sock_rfree;
+ spin_unlock_bh(&sk->sk_receive_queue.lock);
+ }
+
err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
if (err < 0)
goto free;
out:
return err;
+
+unlock:
+ if (is_udp)
+ spin_unlock_bh(&sk->sk_receive_queue.lock);
free:
kfree(msg);
goto out;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 422c96fea249..831d26748a90 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2039,6 +2039,15 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags,
}
EXPORT_SYMBOL(__skb_recv_udp);
+void udp_sock_rfree(struct sk_buff *skb)
+{
+ struct sock *sk = skb->sk;
+
+ spin_lock_bh(&sk->sk_receive_queue.lock);
+ sock_rfree(skb);
+ spin_unlock_bh(&sk->sk_receive_queue.lock);
+}
+
int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
{
struct sk_buff *skb;
--
2.53.0.345.g96ddfc5eaa-goog
* Re: [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
2026-02-19 17:37 ` [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP Kuniyuki Iwashima
@ 2026-02-19 18:10 ` bot+bpf-ci
2026-02-19 18:29 ` Kuniyuki Iwashima
0 siblings, 1 reply; 9+ messages in thread
From: bot+bpf-ci @ 2026-02-19 18:10 UTC (permalink / raw)
To: kuniyu, john.fastabend, jakub
Cc: willemdebruijn.kernel, kuniyu, kuni1840, bpf, netdev,
syzbot+5b3b7e51dda1be027b7a, ast, andrii, daniel, martin.lau,
eddyz87, yonghong.song, clm, ihor.solodrai
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
[ ... ]
> @@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
> {
> struct sock *sk = psock->sk;
> + bool is_udp = sk_is_udp(sk);
> struct sk_msg *msg;
> int err = -EAGAIN;
>
> @@ -583,12 +585,15 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> if (!msg)
> goto out;
>
> - if (skb->sk != sk) {
> + if (is_udp)
> + spin_lock_bh(&sk->sk_receive_queue.lock);
> +
> + if (skb->sk != sk || is_udp) {
> if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
> - goto free;
> + goto unlock;
>
> if (!sk_rmem_schedule(sk, skb, skb->truesize))
> - goto free;
> + goto unlock;
> }
[ ... ]
> skb_set_owner_r(skb, sk);
^^^
If sk_psock_skb_ingress_enqueue() fails after this point (for
example, skb_to_sgvec() fails and skb_linearize() also fails under
GFP_ATOMIC memory pressure), the error path frees msg and returns
-EAGAIN, but the skb is kept with skb->destructor set to
udp_sock_rfree.
The caller sk_psock_verdict_apply() then queues this skb onto
psock->ingress_skb for retry via sk_psock_backlog().
On retry, sk_psock_handle_skb() calls sk_psock_skb_ingress()
again. The function acquires sk->sk_receive_queue.lock, then calls
skb_set_owner_r() which calls skb_orphan(). skb_orphan() invokes
the current skb->destructor, which is now udp_sock_rfree():
sk_psock_skb_ingress()
spin_lock_bh(&sk->sk_receive_queue.lock) <-- held
skb_set_owner_r(skb, sk)
skb_orphan(skb)
skb->destructor(skb) [= udp_sock_rfree]
spin_lock_bh(&sk->sk_receive_queue.lock) <-- deadlock
Could this self-deadlock on the receive queue lock?
> + if (is_udp) {
> + skb->destructor = udp_sock_rfree;
> + spin_unlock_bh(&sk->sk_receive_queue.lock);
> + }
> +
> err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
> if (err < 0)
> goto free;
> out:
> return err;
> +
> +unlock:
> + if (is_udp)
> + spin_unlock_bh(&sk->sk_receive_queue.lock);
> free:
> kfree(msg);
> goto out;
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
[ ... ]
> +void udp_sock_rfree(struct sk_buff *skb)
> +{
> + struct sock *sk = skb->sk;
> +
> + spin_lock_bh(&sk->sk_receive_queue.lock);
> + sock_rfree(skb);
> + spin_unlock_bh(&sk->sk_receive_queue.lock);
> +}
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22193327667
AI-authorship-score: low
AI-authorship-explanation: The commit demonstrates deep understanding of UDP memory accounting internals and references a specific syzbot report, consistent with an experienced kernel networking developer.
issues-found: 1
issue-severity-score: high
issue-severity-explanation: Self-deadlock on sk->sk_receive_queue.lock when sk_psock_skb_ingress is retried after enqueue failure for UDP, causing a system hang.
* Re: [PATCH v3 bpf/net 6/6] sockmap: Fix broken memory accounting for UDP.
2026-02-19 18:10 ` bot+bpf-ci
@ 2026-02-19 18:29 ` Kuniyuki Iwashima
0 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-19 18:29 UTC (permalink / raw)
To: bot+bpf-ci
Cc: john.fastabend, jakub, willemdebruijn.kernel, kuni1840, bpf,
netdev, syzbot+5b3b7e51dda1be027b7a, ast, andrii, daniel,
martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
On Thu, Feb 19, 2026 at 10:11 AM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
>
> [ ... ]
>
> > @@ -576,6 +577,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> > u32 off, u32 len, gfp_t gfp_flags, bool take_ref)
> > {
> > struct sock *sk = psock->sk;
> > + bool is_udp = sk_is_udp(sk);
> > struct sk_msg *msg;
> > int err = -EAGAIN;
> >
> > @@ -583,12 +585,15 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb,
> > if (!msg)
> > goto out;
> >
> > - if (skb->sk != sk) {
> > + if (is_udp)
> > + spin_lock_bh(&sk->sk_receive_queue.lock);
> > +
> > + if (skb->sk != sk || is_udp) {
> > if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
> > - goto free;
> > + goto unlock;
> >
> > if (!sk_rmem_schedule(sk, skb, skb->truesize))
> > - goto free;
> > + goto unlock;
> > }
>
> [ ... ]
>
> > skb_set_owner_r(skb, sk);
> ^^^
>
> If sk_psock_skb_ingress_enqueue() fails after this point (for
> example, skb_to_sgvec() fails and skb_linearize() also fails under
> GFP_ATOMIC memory pressure), the error path frees msg and returns
> -EAGAIN, but the skb is kept with skb->destructor set to
> udp_sock_rfree.
>
> The caller sk_psock_verdict_apply() then queues this skb onto
> psock->ingress_skb for retry via sk_psock_backlog().
>
> On retry, sk_psock_handle_skb() calls sk_psock_skb_ingress()
> again. The function acquires sk->sk_receive_queue.lock, then calls
> skb_set_owner_r() which calls skb_orphan(). skb_orphan() invokes
> the current skb->destructor, which is now udp_sock_rfree():
>
> sk_psock_skb_ingress()
> spin_lock_bh(&sk->sk_receive_queue.lock) <-- held
> skb_set_owner_r(skb, sk)
> skb_orphan(skb)
> skb->destructor(skb) [= udp_sock_rfree]
> spin_lock_bh(&sk->sk_receive_queue.lock) <-- deadlock
>
> Could this self-deadlock on the receive queue lock?
Oh exactly...
Will short-circuit to sk_psock_skb_ingress_enqueue()
in that case.
>
> > + if (is_udp) {
> > + skb->destructor = udp_sock_rfree;
> > + spin_unlock_bh(&sk->sk_receive_queue.lock);
> > + }
> > +
> > err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, take_ref);
> > if (err < 0)
> > goto free;
> > out:
> > return err;
> > +
> > +unlock:
> > + if (is_udp)
> > + spin_unlock_bh(&sk->sk_receive_queue.lock);
> > free:
> > kfree(msg);
> > goto out;
>
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
>
> [ ... ]
>
> > +void udp_sock_rfree(struct sk_buff *skb)
> > +{
> > + struct sock *sk = skb->sk;
> > +
> > + spin_lock_bh(&sk->sk_receive_queue.lock);
> > + sock_rfree(skb);
> > + spin_unlock_bh(&sk->sk_receive_queue.lock);
> > +}
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22193327667
>
> AI-authorship-score: low
> AI-authorship-explanation: The commit demonstrates deep understanding of UDP memory accounting internals and references a specific syzbot report, consistent with an experienced kernel networking developer.
> issues-found: 1
> issue-severity-score: high
> issue-severity-explanation: Self-deadlock on sk->sk_receive_queue.lock when sk_psock_skb_ingress is retried after enqueue failure for UDP, causing a system hang.