* [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap
@ 2025-02-28 5:51 Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang
1. Issue
Syzkaller reported this issue [1].
2. Reproduce
We can reproduce this issue by using the test_sockmap_with_close_on_write()
test I provided in selftest, also you need to apply the following patch to
ensure 100% reproducibility (sleep after checking sock):
'''
static void sk_psock_verdict_data_ready(struct sock *sk)
{
.......
if (unlikely(!sock))
return;
+ if (!strcmp("test_progs", current->comm)) {
+ printk("sleep 2s to wait socket freed\n");
+ mdelay(2000);
+ printk("sleep end\n");
+ }
ops = READ_ONCE(sock->ops);
if (!ops || !ops->read_skb)
return;
}
'''
Then running './test_progs -v sockmap_basic', and if the kernel has KASAN
enabled [2], you will see the following warning:
'''
BUG: KASAN: slab-use-after-free in sk_psock_verdict_data_ready+0x29b/0x2d0
Read of size 8 at addr ffff88813a777020 by task test_progs/47055
Tainted: [O]=OOT_MODULE
Call Trace:
<TASK>
dump_stack_lvl+0x53/0x70
print_address_description.constprop.0+0x30/0x420
? sk_psock_verdict_data_ready+0x29b/0x2d0
print_report+0xb7/0x270
? sk_psock_verdict_data_ready+0x29b/0x2d0
? kasan_addr_to_slab+0xd/0xa0
? sk_psock_verdict_data_ready+0x29b/0x2d0
kasan_report+0xca/0x100
? sk_psock_verdict_data_ready+0x29b/0x2d0
sk_psock_verdict_data_ready+0x29b/0x2d0
unix_stream_sendmsg+0x4a6/0xa40
? __pfx_unix_stream_sendmsg+0x10/0x10
? fdget+0x2c1/0x3a0
__sys_sendto+0x39c/0x410
'''
3. Reason
'''
CPU0 CPU1
unix_stream_sendmsg(sk):
other = unix_peer(sk)
other->sk_data_ready(other):
socket *sock = sk->sk_socket
if (unlikely(!sock))
return;
close(other):
...
other->close()
free(socket)
READ_ONCE(sock->ops)
^
use 'sock' after free
'''
For TCP, UDP, or other protocols, we have already performed
rcu_read_lock() when the network stack receives packets in ip_input.c:
'''
ip_local_deliver_finish():
rcu_read_lock()
ip_protocol_deliver_rcu()
xxx_rcv
rcu_read_unlock()
'''
However, for Unix sockets, sk_data_ready is called directly from the
process context without rcu_read_lock() protection.
4. Solution
Based on the fact that the 'struct socket' is released using call_rcu(),
We add rcu_read_{un}lock() at the entrance and exit of our sk_data_ready.
It will not increase performance overhead, at least for TCP and UDP, they
are already in a relatively large critical section.
Of course, we can also add a custom callback for Unix sockets and call
rcu_read_lock() before calling _verdict_data_ready like this:
'''
if (sk_is_unix(sk))
sk->sk_data_ready = sk_psock_verdict_data_ready_rcu;
else
sk->sk_data_ready = sk_psock_verdict_data_ready;
sk_psock_verdict_data_ready_rcu():
rcu_read_lock()
sk_psock_verdict_data_ready()
rcu_read_unlock()
'''
However, this will cause too many branches, and it's not suitable to
distinguish network protocols in skmsg.c.
[1] https://syzkaller.appspot.com/bug?extid=dd90a702f518e0eac072
[2] https://syzkaller.appspot.com/text?tag=KernelConfig&x=1362a5aee630ff34
---
v1 -> v2:
1. Add Fixes tag.
2. Extend selftest of edge case for TCP/UDP sockets.
3. Add Reviewed-by and Acked-by tag.
https://lore.kernel.org/bpf/20250226132242.52663-1-jiayuan.chen@linux.dev/T/#t
Jiayuan Chen (3):
bpf, sockmap: avoid using sk_socket after free
selftests/bpf: Add socketpair to create_pair to support unix socket
selftests/bpf: Add edge case tests for sockmap
net/core/skmsg.c | 18 ++++--
.../selftests/bpf/prog_tests/socket_helpers.h | 13 +++-
.../selftests/bpf/prog_tests/sockmap_basic.c | 59 +++++++++++++++++++
3 files changed, 84 insertions(+), 6 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
@ 2025-02-28 5:51 ` Jiayuan Chen
2025-03-07 9:45 ` Michal Luczaj
2025-02-28 5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
2 siblings, 1 reply; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang, syzbot+dd90a702f518e0eac072
Use RCU lock to protect sk_socket, preventing concurrent close and release
by another thread.
Because TCP/UDP are already within a relatively large critical section:
'''
ip_local_deliver_finish
rcu_read_lock
ip_protocol_deliver_rcu
tcp_rcv/udp_rcv
rcu_read_unlock
'''
Adding rcu_read_{un}lock() at the entrance and exit of sk_data_ready
will not increase performance overhead.
Fixes: c63829182c37 ("af_unix: Implement ->psock_update_sk_prot()")
Reported-by: syzbot+dd90a702f518e0eac072@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/6734c033.050a0220.2a2fcc.0015.GAE@google.com/
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
net/core/skmsg.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 0ddc4c718833..1b71ae1d1bf5 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1222,27 +1222,35 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
static void sk_psock_verdict_data_ready(struct sock *sk)
{
- struct socket *sock = sk->sk_socket;
+ struct socket *sock;
const struct proto_ops *ops;
int copied;
trace_sk_data_ready(sk);
+ /* We need RCU to prevent the sk_socket from being released.
+ * Especially for Unix sockets, we are currently in the process
+ * context and do not have RCU protection.
+ */
+ rcu_read_lock();
+ sock = sk->sk_socket;
if (unlikely(!sock))
- return;
+ goto unlock;
+
ops = READ_ONCE(sock->ops);
if (!ops || !ops->read_skb)
- return;
+ goto unlock;
+
copied = ops->read_skb(sk, sk_psock_verdict_recv);
if (copied >= 0) {
struct sk_psock *psock;
- rcu_read_lock();
psock = sk_psock(sk);
if (psock)
sk_psock_data_ready(sk, psock);
- rcu_read_unlock();
}
+unlock:
+ rcu_read_unlock();
}
void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
--
2.47.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
@ 2025-02-28 5:51 ` Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
2 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang
Current wrapper function create_pair() is used to create a pair of
connected links and returns two fds, but it does not support unix sockets.
Here we introduce socketpair() into create_pair(), which supports creating
a pair of unix sockets, since the semantics of the two are the same.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
.../selftests/bpf/prog_tests/socket_helpers.h | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/socket_helpers.h b/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
index 1bdfb79ef009..a805143dd84f 100644
--- a/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
+++ b/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
@@ -313,11 +313,22 @@ static inline int recv_timeout(int fd, void *buf, size_t len, int flags,
static inline int create_pair(int family, int sotype, int *p0, int *p1)
{
- __close_fd int s, c = -1, p = -1;
+ __close_fd int s = -1, c = -1, p = -1;
struct sockaddr_storage addr;
socklen_t len = sizeof(addr);
int err;
+ if (family == AF_UNIX) {
+ int fds[2];
+
+ err = socketpair(family, sotype, 0, fds);
+ if (!err) {
+ *p0 = fds[0];
+ *p1 = fds[1];
+ }
+ return err;
+ }
+
s = socket_loopback(family, sotype);
if (s < 0)
return s;
--
2.47.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
@ 2025-02-28 5:51 ` Jiayuan Chen
2 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang
Add edge case tests for sockmap.
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
.../selftests/bpf/prog_tests/sockmap_basic.c | 59 +++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index 1e3e4392dcca..ad8bb085baf2 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -1042,6 +1042,59 @@ static void test_sockmap_vsock_unconnected(void)
xclose(map);
}
+void *close_thread(void *arg)
+{
+ int *fd = (int *)arg;
+
+ sleep(1);
+ close(*fd);
+ *fd = -1;
+ return NULL;
+}
+
+void test_sockmap_with_close_on_write(int family, int sotype)
+{
+ struct test_sockmap_pass_prog *skel;
+ int err, map, verdict;
+ pthread_t tid;
+ int zero = 0;
+ int c = -1, p = -1;
+
+ skel = test_sockmap_pass_prog__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load"))
+ return;
+
+ verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+ map = bpf_map__fd(skel->maps.sock_map_rx);
+
+ err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach"))
+ goto out;
+
+ err = create_pair(family, sotype, &c, &p);
+ if (!ASSERT_OK(err, "create_pair"))
+ goto out;
+
+ err = bpf_map_update_elem(map, &zero, &p, BPF_ANY);
+ if (!ASSERT_OK(err, "bpf_map_update_elem"))
+ goto out;
+
+ err = pthread_create(&tid, 0, close_thread, &p);
+ if (!ASSERT_OK(err, "pthread_create"))
+ goto out;
+
+ while (p > 0)
+ send(c, "a", 1, MSG_NOSIGNAL);
+
+ pthread_join(tid, NULL);
+out:
+ if (c > 0)
+ close(c);
+ if (p > 0)
+ close(p);
+ test_sockmap_pass_prog__destroy(skel);
+}
+
void test_sockmap_basic(void)
{
if (test__start_subtest("sockmap create_update_free"))
@@ -1108,4 +1161,10 @@ void test_sockmap_basic(void)
test_sockmap_skb_verdict_vsock_poll();
if (test__start_subtest("sockmap vsock unconnected"))
test_sockmap_vsock_unconnected();
+ if (test__start_subtest("sockmap with write on close")) {
+ test_sockmap_with_close_on_write(AF_UNIX, SOCK_STREAM);
+ test_sockmap_with_close_on_write(AF_UNIX, SOCK_DGRAM);
+ test_sockmap_with_close_on_write(AF_INET, SOCK_STREAM);
+ test_sockmap_with_close_on_write(AF_INET, SOCK_DGRAM);
+ }
}
--
2.47.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
@ 2025-03-07 9:45 ` Michal Luczaj
2025-03-10 11:36 ` Jiayuan Chen
0 siblings, 1 reply; 8+ messages in thread
From: Michal Luczaj @ 2025-03-07 9:45 UTC (permalink / raw)
To: Jiayuan Chen, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
On 2/28/25 06:51, Jiayuan Chen wrote:
> ...
> static void sk_psock_verdict_data_ready(struct sock *sk)
> {
> - struct socket *sock = sk->sk_socket;
> + struct socket *sock;
> const struct proto_ops *ops;
> int copied;
>
> trace_sk_data_ready(sk);
>
> + /* We need RCU to prevent the sk_socket from being released.
> + * Especially for Unix sockets, we are currently in the process
> + * context and do not have RCU protection.
> + */
> + rcu_read_lock();
> + sock = sk->sk_socket;
> if (unlikely(!sock))
> - return;
> + goto unlock;
> +
> ops = READ_ONCE(sock->ops);
> if (!ops || !ops->read_skb)
> - return;
> + goto unlock;
> +
> copied = ops->read_skb(sk, sk_psock_verdict_recv);
> if (copied >= 0) {
> struct sk_psock *psock;
>
> - rcu_read_lock();
> psock = sk_psock(sk);
> if (psock)
> sk_psock_data_ready(sk, psock);
> - rcu_read_unlock();
> }
> +unlock:
> + rcu_read_unlock();
> }
Hi,
Doesn't sk_psock_handle_skb() (!ingress path) have the same `struct socket`
release race issue? Any plans on fixing that one, too?
BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
read_skb() under RCU read lock.
Thanks,
Michal
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-03-07 9:45 ` Michal Luczaj
@ 2025-03-10 11:36 ` Jiayuan Chen
2025-03-10 13:08 ` Michal Luczaj
0 siblings, 1 reply; 8+ messages in thread
From: Jiayuan Chen @ 2025-03-10 11:36 UTC (permalink / raw)
To: Michal Luczaj, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
>
> On 2/28/25 06:51, Jiayuan Chen wrote:
>
> >
> > ...
> >
> > static void sk_psock_verdict_data_ready(struct sock *sk)
> >
> > {
> >
> > - struct socket *sock = sk->sk_socket;
> >
> > + struct socket *sock;
> >
> > const struct proto_ops *ops;
> >
> > int copied;
> >
> >
> >
> > trace_sk_data_ready(sk);
> >
> >
> >
> > + /* We need RCU to prevent the sk_socket from being released.
> >
> > + * Especially for Unix sockets, we are currently in the process
> >
> > + * context and do not have RCU protection.
> >
> > + */
> >
> > + rcu_read_lock();
> >
> > + sock = sk->sk_socket;
> >
> > if (unlikely(!sock))
> >
> > - return;
> >
> > + goto unlock;
> >
> > +
> >
> > ops = READ_ONCE(sock->ops);
> >
> > if (!ops || !ops->read_skb)
> >
> > - return;
> >
> > + goto unlock;
> >
> > +
> >
> > copied = ops->read_skb(sk, sk_psock_verdict_recv);
> >
> > if (copied >= 0) {
> >
> > struct sk_psock *psock;
> >
> >
> >
> > - rcu_read_lock();
> >
> > psock = sk_psock(sk);
> >
> > if (psock)
> >
> > sk_psock_data_ready(sk, psock);
> >
> > - rcu_read_unlock();
> >
> > }
> >
> > +unlock:
> >
> > + rcu_read_unlock();
> >
> > }
> >
>
> Hi,
>
> Doesn't sk_psock_handle_skb() (!ingress path) have the same `struct socket`
>
> release race issue? Any plans on fixing that one, too?
Yes, the send path logic also has similar issues, and after some hacking,
I was able to reproduce it. Thanks for providing this information.
I can fix these together in the next revision of this patchset, anyway,
this patchset still needs further confirmation from the maintainer.
>
> BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
>
> read_skb() under RCU read lock.
>
> Thanks,
>
> Michal
>
My environment also has LOCKDEP enabled, but I didn't see similar
warnings.
Moreover, RCU assertions are typically written as:
WARN_ON_ONCE(!rcu_read_lock_held())
And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
returning 1. So, it's unlikely to trigger a warning due to an RCU lock
being held.
Could you provide more of the call stack?
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-03-10 11:36 ` Jiayuan Chen
@ 2025-03-10 13:08 ` Michal Luczaj
2025-03-10 14:13 ` Jiayuan Chen
0 siblings, 1 reply; 8+ messages in thread
From: Michal Luczaj @ 2025-03-10 13:08 UTC (permalink / raw)
To: Jiayuan Chen, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
On 3/10/25 12:36, Jiayuan Chen wrote:
> March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
> ...
>> BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
>> read_skb() under RCU read lock.
>>
> My environment also has LOCKDEP enabled, but I didn't see similar
> warnings.
> Moreover, RCU assertions are typically written as:
>
> WARN_ON_ONCE(!rcu_read_lock_held())
>
> And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
> returning 1. So, it's unlikely to trigger a warning due to an RCU lock
> being held.
>
> Could you provide more of the call stack?
Sure, bpf-next with this series applied, test_progs -t sockmap_basic:
=============================
[ BUG: Invalid wait context ]
6.14.0-rc3+ #111 Tainted: G OE
-----------------------------
test_progs/37755 is trying to lock:
ffff88810d9bc3c0 (&u->iolock){+.+.}-{4:4}, at: unix_stream_read_skb+0x30/0x120
other info that might help us debug this:
context-{5:5}
1 lock held by test_progs/37755:
#0: ffffffff833700e0 (rcu_read_lock){....}-{1:3}, at: sk_psock_verdict_data_ready+0x3e/0x2a0
stack backtrace:
CPU: 13 UID: 0 PID: 37755 Comm: test_progs Tainted: G OE 6.14.0-rc3+ #111
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
Call Trace:
dump_stack_lvl+0x68/0x90
lock_acquire+0xcf/0x2e0
__mutex_lock+0x9c/0xcc0
unix_stream_read_skb+0x30/0x120
sk_psock_verdict_data_ready+0x8d/0x2a0
unix_stream_sendmsg+0x232/0x640
__sys_sendto+0x1cd/0x1e0
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x93/0x180
entry_SYSCALL_64_after_hwframe+0x76/0x7e
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-03-10 13:08 ` Michal Luczaj
@ 2025-03-10 14:13 ` Jiayuan Chen
0 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-03-10 14:13 UTC (permalink / raw)
To: Michal Luczaj, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
March 10, 2025 at 9:08 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
>
> On 3/10/25 12:36, Jiayuan Chen wrote:
>
> >
> > March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
> >
> > ...
> >
> > >
> > > BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
> > >
> > > read_skb() under RCU read lock.
> > >
> >
> > My environment also has LOCKDEP enabled, but I didn't see similar
> >
> > warnings.
> >
> > Moreover, RCU assertions are typically written as:
> >
> >
> >
> > WARN_ON_ONCE(!rcu_read_lock_held())
> >
> >
> >
> > And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
> >
> > returning 1. So, it's unlikely to trigger a warning due to an RCU lock
> >
> > being held.
> >
> >
> >
> > Could you provide more of the call stack?
> >
>
> Sure, bpf-next with this series applied, test_progs -t sockmap_basic:
>
> =============================
>
> [ BUG: Invalid wait context ]
>
> 6.14.0-rc3+ #111 Tainted: G OE
>
> -----------------------------
>
> test_progs/37755 is trying to lock:
>
> ffff88810d9bc3c0 (&u->iolock){+.+.}-{4:4}, at: unix_stream_read_skb+0x30/0x120
>
> other info that might help us debug this:
>
> context-{5:5}
>
> 1 lock held by test_progs/37755:
>
> #0: ffffffff833700e0 (rcu_read_lock){....}-{1:3}, at: sk_psock_verdict_data_ready+0x3e/0x2a0
>
> stack backtrace:
>
> CPU: 13 UID: 0 PID: 37755 Comm: test_progs Tainted: G OE 6.14.0-rc3+ #111
>
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
>
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>
> Call Trace:
>
> dump_stack_lvl+0x68/0x90
>
> lock_acquire+0xcf/0x2e0
>
> __mutex_lock+0x9c/0xcc0
>
> unix_stream_read_skb+0x30/0x120
>
> sk_psock_verdict_data_ready+0x8d/0x2a0
>
> unix_stream_sendmsg+0x232/0x640
>
> __sys_sendto+0x1cd/0x1e0
>
> __x64_sys_sendto+0x20/0x30
>
> do_syscall_64+0x93/0x180
>
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
Thanks, I got this stack too after enabling CONFIG_PROVE_LOCKING.
It seems that we can't call sleepable lock such as mutex_lock under rcu-locked context.
I'm working on it.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-03-10 14:13 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
2025-03-07 9:45 ` Michal Luczaj
2025-03-10 11:36 ` Jiayuan Chen
2025-03-10 13:08 ` Michal Luczaj
2025-03-10 14:13 ` Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).