* [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap
@ 2025-02-28 5:51 Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang
1. Issue
Syzkaller reported this issue [1].
2. Reproduce
We can reproduce this issue by using the test_sockmap_with_close_on_write()
test I provided in selftest, also you need to apply the following patch to
ensure 100% reproducibility (sleep after checking sock):
'''
static void sk_psock_verdict_data_ready(struct sock *sk)
{
.......
if (unlikely(!sock))
return;
+ if (!strcmp("test_progs", current->comm)) {
+ printk("sleep 2s to wait socket freed\n");
+ mdelay(2000);
+ printk("sleep end\n");
+ }
ops = READ_ONCE(sock->ops);
if (!ops || !ops->read_skb)
return;
}
'''
Then running './test_progs -v sockmap_basic', and if the kernel has KASAN
enabled [2], you will see the following warning:
'''
BUG: KASAN: slab-use-after-free in sk_psock_verdict_data_ready+0x29b/0x2d0
Read of size 8 at addr ffff88813a777020 by task test_progs/47055
Tainted: [O]=OOT_MODULE
Call Trace:
<TASK>
dump_stack_lvl+0x53/0x70
print_address_description.constprop.0+0x30/0x420
? sk_psock_verdict_data_ready+0x29b/0x2d0
print_report+0xb7/0x270
? sk_psock_verdict_data_ready+0x29b/0x2d0
? kasan_addr_to_slab+0xd/0xa0
? sk_psock_verdict_data_ready+0x29b/0x2d0
kasan_report+0xca/0x100
? sk_psock_verdict_data_ready+0x29b/0x2d0
sk_psock_verdict_data_ready+0x29b/0x2d0
unix_stream_sendmsg+0x4a6/0xa40
? __pfx_unix_stream_sendmsg+0x10/0x10
? fdget+0x2c1/0x3a0
__sys_sendto+0x39c/0x410
'''
3. Reason
'''
CPU0 CPU1
unix_stream_sendmsg(sk):
other = unix_peer(sk)
other->sk_data_ready(other):
socket *sock = sk->sk_socket
if (unlikely(!sock))
return;
close(other):
...
other->close()
free(socket)
READ_ONCE(sock->ops)
^
use 'sock' after free
'''
For TCP, UDP, or other protocols, we have already performed
rcu_read_lock() when the network stack receives packets in ip_input.c:
'''
ip_local_deliver_finish():
rcu_read_lock()
ip_protocol_deliver_rcu()
xxx_rcv
rcu_read_unlock()
'''
However, for Unix sockets, sk_data_ready is called directly from the
process context without rcu_read_lock() protection.
4. Solution
Based on the fact that the 'struct socket' is released using call_rcu(),
We add rcu_read_{un}lock() at the entrance and exit of our sk_data_ready.
It will not increase performance overhead, at least for TCP and UDP, they
are already in a relatively large critical section.
Of course, we can also add a custom callback for Unix sockets and call
rcu_read_lock() before calling _verdict_data_ready like this:
'''
if (sk_is_unix(sk))
sk->sk_data_ready = sk_psock_verdict_data_ready_rcu;
else
sk->sk_data_ready = sk_psock_verdict_data_ready;
sk_psock_verdict_data_ready_rcu():
rcu_read_lock()
sk_psock_verdict_data_ready()
rcu_read_unlock()
'''
However, this will cause too many branches, and it's not suitable to
distinguish network protocols in skmsg.c.
[1] https://syzkaller.appspot.com/bug?extid=dd90a702f518e0eac072
[2] https://syzkaller.appspot.com/text?tag=KernelConfig&x=1362a5aee630ff34
---
v1 -> v2:
1. Add Fixes tag.
2. Extend selftest of edge case for TCP/UDP sockets.
3. Add Reviewed-by and Acked-by tag.
https://lore.kernel.org/bpf/20250226132242.52663-1-jiayuan.chen@linux.dev/T/#t
Jiayuan Chen (3):
bpf, sockmap: avoid using sk_socket after free
selftests/bpf: Add socketpair to create_pair to support unix socket
selftests/bpf: Add edge case tests for sockmap
net/core/skmsg.c | 18 ++++--
.../selftests/bpf/prog_tests/socket_helpers.h | 13 +++-
.../selftests/bpf/prog_tests/sockmap_basic.c | 59 +++++++++++++++++++
3 files changed, 84 insertions(+), 6 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
@ 2025-02-28 5:51 ` Jiayuan Chen
2025-03-07 9:45 ` Michal Luczaj
2025-02-28 5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
2 siblings, 1 reply; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang, syzbot+dd90a702f518e0eac072
Use RCU lock to protect sk_socket, preventing concurrent close and release
by another thread.
Because TCP/UDP are already within a relatively large critical section:
'''
ip_local_deliver_finish
rcu_read_lock
ip_protocol_deliver_rcu
tcp_rcv/udp_rcv
rcu_read_unlock
'''
Adding rcu_read_{un}lock() at the entrance and exit of sk_data_ready
will not increase performance overhead.
Fixes: c63829182c37 ("af_unix: Implement ->psock_update_sk_prot()")
Reported-by: syzbot+dd90a702f518e0eac072@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/6734c033.050a0220.2a2fcc.0015.GAE@google.com/
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
net/core/skmsg.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 0ddc4c718833..1b71ae1d1bf5 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1222,27 +1222,35 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
static void sk_psock_verdict_data_ready(struct sock *sk)
{
- struct socket *sock = sk->sk_socket;
+ struct socket *sock;
const struct proto_ops *ops;
int copied;
trace_sk_data_ready(sk);
+ /* We need RCU to prevent the sk_socket from being released.
+ * Especially for Unix sockets, we are currently in the process
+ * context and do not have RCU protection.
+ */
+ rcu_read_lock();
+ sock = sk->sk_socket;
if (unlikely(!sock))
- return;
+ goto unlock;
+
ops = READ_ONCE(sock->ops);
if (!ops || !ops->read_skb)
- return;
+ goto unlock;
+
copied = ops->read_skb(sk, sk_psock_verdict_recv);
if (copied >= 0) {
struct sk_psock *psock;
- rcu_read_lock();
psock = sk_psock(sk);
if (psock)
sk_psock_data_ready(sk, psock);
- rcu_read_unlock();
}
+unlock:
+ rcu_read_unlock();
}
void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
--
2.47.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
@ 2025-02-28 5:51 ` Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
2 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang
Current wrapper function create_pair() is used to create a pair of
connected links and returns two fds, but it does not support unix sockets.
Here we introduce socketpair() into create_pair(), which supports creating
a pair of unix sockets, since the semantics of the two are the same.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
.../selftests/bpf/prog_tests/socket_helpers.h | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/socket_helpers.h b/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
index 1bdfb79ef009..a805143dd84f 100644
--- a/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
+++ b/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
@@ -313,11 +313,22 @@ static inline int recv_timeout(int fd, void *buf, size_t len, int flags,
static inline int create_pair(int family, int sotype, int *p0, int *p1)
{
- __close_fd int s, c = -1, p = -1;
+ __close_fd int s = -1, c = -1, p = -1;
struct sockaddr_storage addr;
socklen_t len = sizeof(addr);
int err;
+ if (family == AF_UNIX) {
+ int fds[2];
+
+ err = socketpair(family, sotype, 0, fds);
+ if (!err) {
+ *p0 = fds[0];
+ *p1 = fds[1];
+ }
+ return err;
+ }
+
s = socket_loopback(family, sotype);
if (s < 0)
return s;
--
2.47.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
@ 2025-02-28 5:51 ` Jiayuan Chen
2 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28 5:51 UTC (permalink / raw)
To: xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
linux-kselftest, mrpre, cong.wang
Add edge case tests for sockmap.
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
.../selftests/bpf/prog_tests/sockmap_basic.c | 59 +++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index 1e3e4392dcca..ad8bb085baf2 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -1042,6 +1042,59 @@ static void test_sockmap_vsock_unconnected(void)
xclose(map);
}
+void *close_thread(void *arg)
+{
+ int *fd = (int *)arg;
+
+ sleep(1);
+ close(*fd);
+ *fd = -1;
+ return NULL;
+}
+
+void test_sockmap_with_close_on_write(int family, int sotype)
+{
+ struct test_sockmap_pass_prog *skel;
+ int err, map, verdict;
+ pthread_t tid;
+ int zero = 0;
+ int c = -1, p = -1;
+
+ skel = test_sockmap_pass_prog__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load"))
+ return;
+
+ verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+ map = bpf_map__fd(skel->maps.sock_map_rx);
+
+ err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach"))
+ goto out;
+
+ err = create_pair(family, sotype, &c, &p);
+ if (!ASSERT_OK(err, "create_pair"))
+ goto out;
+
+ err = bpf_map_update_elem(map, &zero, &p, BPF_ANY);
+ if (!ASSERT_OK(err, "bpf_map_update_elem"))
+ goto out;
+
+ err = pthread_create(&tid, 0, close_thread, &p);
+ if (!ASSERT_OK(err, "pthread_create"))
+ goto out;
+
+ while (p > 0)
+ send(c, "a", 1, MSG_NOSIGNAL);
+
+ pthread_join(tid, NULL);
+out:
+ if (c > 0)
+ close(c);
+ if (p > 0)
+ close(p);
+ test_sockmap_pass_prog__destroy(skel);
+}
+
void test_sockmap_basic(void)
{
if (test__start_subtest("sockmap create_update_free"))
@@ -1108,4 +1161,10 @@ void test_sockmap_basic(void)
test_sockmap_skb_verdict_vsock_poll();
if (test__start_subtest("sockmap vsock unconnected"))
test_sockmap_vsock_unconnected();
+ if (test__start_subtest("sockmap with write on close")) {
+ test_sockmap_with_close_on_write(AF_UNIX, SOCK_STREAM);
+ test_sockmap_with_close_on_write(AF_UNIX, SOCK_DGRAM);
+ test_sockmap_with_close_on_write(AF_INET, SOCK_STREAM);
+ test_sockmap_with_close_on_write(AF_INET, SOCK_DGRAM);
+ }
}
--
2.47.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
@ 2025-03-07 9:45 ` Michal Luczaj
2025-03-10 11:36 ` Jiayuan Chen
0 siblings, 1 reply; 8+ messages in thread
From: Michal Luczaj @ 2025-03-07 9:45 UTC (permalink / raw)
To: Jiayuan Chen, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
On 2/28/25 06:51, Jiayuan Chen wrote:
> ...
> static void sk_psock_verdict_data_ready(struct sock *sk)
> {
> - struct socket *sock = sk->sk_socket;
> + struct socket *sock;
> const struct proto_ops *ops;
> int copied;
>
> trace_sk_data_ready(sk);
>
> + /* We need RCU to prevent the sk_socket from being released.
> + * Especially for Unix sockets, we are currently in the process
> + * context and do not have RCU protection.
> + */
> + rcu_read_lock();
> + sock = sk->sk_socket;
> if (unlikely(!sock))
> - return;
> + goto unlock;
> +
> ops = READ_ONCE(sock->ops);
> if (!ops || !ops->read_skb)
> - return;
> + goto unlock;
> +
> copied = ops->read_skb(sk, sk_psock_verdict_recv);
> if (copied >= 0) {
> struct sk_psock *psock;
>
> - rcu_read_lock();
> psock = sk_psock(sk);
> if (psock)
> sk_psock_data_ready(sk, psock);
> - rcu_read_unlock();
> }
> +unlock:
> + rcu_read_unlock();
> }
Hi,
Doesn't sk_psock_handle_skb() (!ingress path) have the same `struct socket`
release race issue? Any plans on fixing that one, too?
BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
read_skb() under RCU read lock.
Thanks,
Michal
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-03-07 9:45 ` Michal Luczaj
@ 2025-03-10 11:36 ` Jiayuan Chen
2025-03-10 13:08 ` Michal Luczaj
0 siblings, 1 reply; 8+ messages in thread
From: Jiayuan Chen @ 2025-03-10 11:36 UTC (permalink / raw)
To: Michal Luczaj, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
>
> On 2/28/25 06:51, Jiayuan Chen wrote:
>
> >
> > ...
> >
> > static void sk_psock_verdict_data_ready(struct sock *sk)
> >
> > {
> >
> > - struct socket *sock = sk->sk_socket;
> >
> > + struct socket *sock;
> >
> > const struct proto_ops *ops;
> >
> > int copied;
> >
> >
> >
> > trace_sk_data_ready(sk);
> >
> >
> >
> > + /* We need RCU to prevent the sk_socket from being released.
> >
> > + * Especially for Unix sockets, we are currently in the process
> >
> > + * context and do not have RCU protection.
> >
> > + */
> >
> > + rcu_read_lock();
> >
> > + sock = sk->sk_socket;
> >
> > if (unlikely(!sock))
> >
> > - return;
> >
> > + goto unlock;
> >
> > +
> >
> > ops = READ_ONCE(sock->ops);
> >
> > if (!ops || !ops->read_skb)
> >
> > - return;
> >
> > + goto unlock;
> >
> > +
> >
> > copied = ops->read_skb(sk, sk_psock_verdict_recv);
> >
> > if (copied >= 0) {
> >
> > struct sk_psock *psock;
> >
> >
> >
> > - rcu_read_lock();
> >
> > psock = sk_psock(sk);
> >
> > if (psock)
> >
> > sk_psock_data_ready(sk, psock);
> >
> > - rcu_read_unlock();
> >
> > }
> >
> > +unlock:
> >
> > + rcu_read_unlock();
> >
> > }
> >
>
> Hi,
>
> Doesn't sk_psock_handle_skb() (!ingress path) have the same `struct socket`
>
> release race issue? Any plans on fixing that one, too?
Yes, the send path logic also has similar issues, and after some hacking,
I was able to reproduce it. Thanks for providing this information.
I can fix these together in the next revision of this patchset, anyway,
this patchset still needs further confirmation from the maintainer.
>
> BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
>
> read_skb() under RCU read lock.
>
> Thanks,
>
> Michal
>
My environment also has LOCKDEP enabled, but I didn't see similar
warnings.
Moreover, RCU assertions are typically written as:
WARN_ON_ONCE(!rcu_read_lock_held())
And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
returning 1. So, it's unlikely to trigger a warning due to an RCU lock
being held.
Could you provide more of the call stack?
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-03-10 11:36 ` Jiayuan Chen
@ 2025-03-10 13:08 ` Michal Luczaj
2025-03-10 14:13 ` Jiayuan Chen
0 siblings, 1 reply; 8+ messages in thread
From: Michal Luczaj @ 2025-03-10 13:08 UTC (permalink / raw)
To: Jiayuan Chen, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
On 3/10/25 12:36, Jiayuan Chen wrote:
> March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
> ...
>> BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
>> read_skb() under RCU read lock.
>>
> My environment also has LOCKDEP enabled, but I didn't see similar
> warnings.
> Moreover, RCU assertions are typically written as:
>
> WARN_ON_ONCE(!rcu_read_lock_held())
>
> And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
> returning 1. So, it's unlikely to trigger a warning due to an RCU lock
> being held.
>
> Could you provide more of the call stack?
Sure, bpf-next with this series applied, test_progs -t sockmap_basic:
=============================
[ BUG: Invalid wait context ]
6.14.0-rc3+ #111 Tainted: G OE
-----------------------------
test_progs/37755 is trying to lock:
ffff88810d9bc3c0 (&u->iolock){+.+.}-{4:4}, at: unix_stream_read_skb+0x30/0x120
other info that might help us debug this:
context-{5:5}
1 lock held by test_progs/37755:
#0: ffffffff833700e0 (rcu_read_lock){....}-{1:3}, at: sk_psock_verdict_data_ready+0x3e/0x2a0
stack backtrace:
CPU: 13 UID: 0 PID: 37755 Comm: test_progs Tainted: G OE 6.14.0-rc3+ #111
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
Call Trace:
dump_stack_lvl+0x68/0x90
lock_acquire+0xcf/0x2e0
__mutex_lock+0x9c/0xcc0
unix_stream_read_skb+0x30/0x120
sk_psock_verdict_data_ready+0x8d/0x2a0
unix_stream_sendmsg+0x232/0x640
__sys_sendto+0x1cd/0x1e0
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x93/0x180
entry_SYSCALL_64_after_hwframe+0x76/0x7e
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
2025-03-10 13:08 ` Michal Luczaj
@ 2025-03-10 14:13 ` Jiayuan Chen
0 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-03-10 14:13 UTC (permalink / raw)
To: Michal Luczaj, xiyou.wangcong, john.fastabend, jakub, martin.lau
Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
mrpre, cong.wang, syzbot+dd90a702f518e0eac072
March 10, 2025 at 9:08 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
>
> On 3/10/25 12:36, Jiayuan Chen wrote:
>
> >
> > March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
> >
> > ...
> >
> > >
> > > BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
> > >
> > > read_skb() under RCU read lock.
> > >
> >
> > My environment also has LOCKDEP enabled, but I didn't see similar
> >
> > warnings.
> >
> > Moreover, RCU assertions are typically written as:
> >
> >
> >
> > WARN_ON_ONCE(!rcu_read_lock_held())
> >
> >
> >
> > And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
> >
> > returning 1. So, it's unlikely to trigger a warning due to an RCU lock
> >
> > being held.
> >
> >
> >
> > Could you provide more of the call stack?
> >
>
> Sure, bpf-next with this series applied, test_progs -t sockmap_basic:
>
> =============================
>
> [ BUG: Invalid wait context ]
>
> 6.14.0-rc3+ #111 Tainted: G OE
>
> -----------------------------
>
> test_progs/37755 is trying to lock:
>
> ffff88810d9bc3c0 (&u->iolock){+.+.}-{4:4}, at: unix_stream_read_skb+0x30/0x120
>
> other info that might help us debug this:
>
> context-{5:5}
>
> 1 lock held by test_progs/37755:
>
> #0: ffffffff833700e0 (rcu_read_lock){....}-{1:3}, at: sk_psock_verdict_data_ready+0x3e/0x2a0
>
> stack backtrace:
>
> CPU: 13 UID: 0 PID: 37755 Comm: test_progs Tainted: G OE 6.14.0-rc3+ #111
>
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
>
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>
> Call Trace:
>
> dump_stack_lvl+0x68/0x90
>
> lock_acquire+0xcf/0x2e0
>
> __mutex_lock+0x9c/0xcc0
>
> unix_stream_read_skb+0x30/0x120
>
> sk_psock_verdict_data_ready+0x8d/0x2a0
>
> unix_stream_sendmsg+0x232/0x640
>
> __sys_sendto+0x1cd/0x1e0
>
> __x64_sys_sendto+0x20/0x30
>
> do_syscall_64+0x93/0x180
>
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
Thanks, I got this stack too after enabling CONFIG_PROVE_LOCKING.
It seems that we can't call sleepable lock such as mutex_lock under rcu-locked context.
I'm working on it.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-03-10 14:13 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-28 5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
2025-03-07 9:45 ` Michal Luczaj
2025-03-10 11:36 ` Jiayuan Chen
2025-03-10 13:08 ` Michal Luczaj
2025-03-10 14:13 ` Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
2025-02-28 5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.