All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap
@ 2025-02-28  5:51 Jiayuan Chen
  2025-02-28  5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28  5:51 UTC (permalink / raw)
  To: xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
	linux-kselftest, mrpre, cong.wang

1. Issue
Syzkaller reported this issue [1].

2. Reproduce

We can reproduce this issue by using the test_sockmap_with_close_on_write()
test I provided in selftest, also you need to apply the following patch to
ensure 100% reproducibility (sleep after checking sock):

'''
static void sk_psock_verdict_data_ready(struct sock *sk)
{
        .......
        if (unlikely(!sock))
                return;
+       if (!strcmp("test_progs", current->comm)) {
+               printk("sleep 2s to wait socket freed\n");
+               mdelay(2000);
+               printk("sleep end\n");
+       }
        ops = READ_ONCE(sock->ops);
        if (!ops || !ops->read_skb)
                return;
}
'''

Then running './test_progs -v sockmap_basic', and if the kernel has KASAN
enabled [2], you will see the following warning:

'''
BUG: KASAN: slab-use-after-free in sk_psock_verdict_data_ready+0x29b/0x2d0
Read of size 8 at addr ffff88813a777020 by task test_progs/47055

Tainted: [O]=OOT_MODULE
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 print_address_description.constprop.0+0x30/0x420
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 print_report+0xb7/0x270
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 ? kasan_addr_to_slab+0xd/0xa0
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 kasan_report+0xca/0x100
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 sk_psock_verdict_data_ready+0x29b/0x2d0
 unix_stream_sendmsg+0x4a6/0xa40
 ? __pfx_unix_stream_sendmsg+0x10/0x10
 ? fdget+0x2c1/0x3a0
 __sys_sendto+0x39c/0x410
'''

3. Reason
'''
CPU0                                             CPU1
unix_stream_sendmsg(sk):
  other = unix_peer(sk)
  other->sk_data_ready(other):
    socket *sock = sk->sk_socket
    if (unlikely(!sock))
        return;
                                                 close(other):
                                                   ...
                                                   other->close()
                                                   free(socket)
    READ_ONCE(sock->ops)
    ^
    use 'sock' after free
'''

For TCP, UDP, or other protocols, we have already performed
rcu_read_lock() when the network stack receives packets in ip_input.c:
'''
ip_local_deliver_finish():
    rcu_read_lock()
    ip_protocol_deliver_rcu()
        xxx_rcv
    rcu_read_unlock()
'''

However, for Unix sockets, sk_data_ready is called directly from the
process context without rcu_read_lock() protection.

4. Solution
Based on the fact that the 'struct socket' is released using call_rcu(),
We add rcu_read_{un}lock() at the entrance and exit of our sk_data_ready.
It will not increase performance overhead, at least for TCP and UDP, they
are already in a relatively large critical section.

Of course, we can also add a custom callback for Unix sockets and call
rcu_read_lock() before calling _verdict_data_ready like this:
'''
if (sk_is_unix(sk))
    sk->sk_data_ready = sk_psock_verdict_data_ready_rcu;
else
    sk->sk_data_ready = sk_psock_verdict_data_ready;

sk_psock_verdict_data_ready_rcu():
    rcu_read_lock()
    sk_psock_verdict_data_ready()
    rcu_read_unlock()
'''
However, this will cause too many branches, and it's not suitable to
distinguish network protocols in skmsg.c.

[1] https://syzkaller.appspot.com/bug?extid=dd90a702f518e0eac072
[2] https://syzkaller.appspot.com/text?tag=KernelConfig&x=1362a5aee630ff34

---
v1 -> v2:
1. Add Fixes tag.
2. Extend selftest of edge case for TCP/UDP sockets.
3. Add Reviewed-by and Acked-by tag.
https://lore.kernel.org/bpf/20250226132242.52663-1-jiayuan.chen@linux.dev/T/#t

Jiayuan Chen (3):
  bpf, sockmap: avoid using sk_socket after free
  selftests/bpf: Add socketpair to create_pair to support unix socket
  selftests/bpf: Add edge case tests for sockmap

 net/core/skmsg.c                              | 18 ++++--
 .../selftests/bpf/prog_tests/socket_helpers.h | 13 +++-
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 59 +++++++++++++++++++
 3 files changed, 84 insertions(+), 6 deletions(-)

-- 
2.47.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-03-10 14:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-28  5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28  5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
2025-03-07  9:45   ` Michal Luczaj
2025-03-10 11:36     ` Jiayuan Chen
2025-03-10 13:08       ` Michal Luczaj
2025-03-10 14:13         ` Jiayuan Chen
2025-02-28  5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
2025-02-28  5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.