All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap
@ 2025-02-28  5:51 Jiayuan Chen
  2025-02-28  5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28  5:51 UTC (permalink / raw)
  To: xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
	linux-kselftest, mrpre, cong.wang

1. Issue
Syzkaller reported this issue [1].

2. Reproduce

We can reproduce this issue by using the test_sockmap_with_close_on_write()
test I provided in selftest, also you need to apply the following patch to
ensure 100% reproducibility (sleep after checking sock):

'''
static void sk_psock_verdict_data_ready(struct sock *sk)
{
        .......
        if (unlikely(!sock))
                return;
+       if (!strcmp("test_progs", current->comm)) {
+               printk("sleep 2s to wait socket freed\n");
+               mdelay(2000);
+               printk("sleep end\n");
+       }
        ops = READ_ONCE(sock->ops);
        if (!ops || !ops->read_skb)
                return;
}
'''

Then running './test_progs -v sockmap_basic', and if the kernel has KASAN
enabled [2], you will see the following warning:

'''
BUG: KASAN: slab-use-after-free in sk_psock_verdict_data_ready+0x29b/0x2d0
Read of size 8 at addr ffff88813a777020 by task test_progs/47055

Tainted: [O]=OOT_MODULE
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 print_address_description.constprop.0+0x30/0x420
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 print_report+0xb7/0x270
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 ? kasan_addr_to_slab+0xd/0xa0
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 kasan_report+0xca/0x100
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 sk_psock_verdict_data_ready+0x29b/0x2d0
 unix_stream_sendmsg+0x4a6/0xa40
 ? __pfx_unix_stream_sendmsg+0x10/0x10
 ? fdget+0x2c1/0x3a0
 __sys_sendto+0x39c/0x410
'''

3. Reason
'''
CPU0                                             CPU1
unix_stream_sendmsg(sk):
  other = unix_peer(sk)
  other->sk_data_ready(other):
    socket *sock = sk->sk_socket
    if (unlikely(!sock))
        return;
                                                 close(other):
                                                   ...
                                                   other->close()
                                                   free(socket)
    READ_ONCE(sock->ops)
    ^
    use 'sock' after free
'''

For TCP, UDP, or other protocols, we have already performed
rcu_read_lock() when the network stack receives packets in ip_input.c:
'''
ip_local_deliver_finish():
    rcu_read_lock()
    ip_protocol_deliver_rcu()
        xxx_rcv
    rcu_read_unlock()
'''

However, for Unix sockets, sk_data_ready is called directly from the
process context without rcu_read_lock() protection.

4. Solution
Based on the fact that the 'struct socket' is released using call_rcu(),
We add rcu_read_{un}lock() at the entrance and exit of our sk_data_ready.
It will not increase performance overhead, at least for TCP and UDP, they
are already in a relatively large critical section.

Of course, we can also add a custom callback for Unix sockets and call
rcu_read_lock() before calling _verdict_data_ready like this:
'''
if (sk_is_unix(sk))
    sk->sk_data_ready = sk_psock_verdict_data_ready_rcu;
else
    sk->sk_data_ready = sk_psock_verdict_data_ready;

sk_psock_verdict_data_ready_rcu():
    rcu_read_lock()
    sk_psock_verdict_data_ready()
    rcu_read_unlock()
'''
However, this will cause too many branches, and it's not suitable to
distinguish network protocols in skmsg.c.

[1] https://syzkaller.appspot.com/bug?extid=dd90a702f518e0eac072
[2] https://syzkaller.appspot.com/text?tag=KernelConfig&x=1362a5aee630ff34

---
v1 -> v2:
1. Add Fixes tag.
2. Extend selftest of edge case for TCP/UDP sockets.
3. Add Reviewed-by and Acked-by tag.
https://lore.kernel.org/bpf/20250226132242.52663-1-jiayuan.chen@linux.dev/T/#t

Jiayuan Chen (3):
  bpf, sockmap: avoid using sk_socket after free
  selftests/bpf: Add socketpair to create_pair to support unix socket
  selftests/bpf: Add edge case tests for sockmap

 net/core/skmsg.c                              | 18 ++++--
 .../selftests/bpf/prog_tests/socket_helpers.h | 13 +++-
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 59 +++++++++++++++++++
 3 files changed, 84 insertions(+), 6 deletions(-)

-- 
2.47.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
  2025-02-28  5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
@ 2025-02-28  5:51 ` Jiayuan Chen
  2025-03-07  9:45   ` Michal Luczaj
  2025-02-28  5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
  2025-02-28  5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
  2 siblings, 1 reply; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28  5:51 UTC (permalink / raw)
  To: xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
	linux-kselftest, mrpre, cong.wang, syzbot+dd90a702f518e0eac072

Use RCU lock to protect sk_socket, preventing concurrent close and release
by another thread.

Because TCP/UDP are already within a relatively large critical section:
'''
ip_local_deliver_finish
  rcu_read_lock
  ip_protocol_deliver_rcu
      tcp_rcv/udp_rcv
  rcu_read_unlock
'''

Adding rcu_read_{un}lock() at the entrance and exit of sk_data_ready
will not increase performance overhead.

Fixes: c63829182c37 ("af_unix: Implement ->psock_update_sk_prot()")
Reported-by: syzbot+dd90a702f518e0eac072@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/6734c033.050a0220.2a2fcc.0015.GAE@google.com/
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 net/core/skmsg.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 0ddc4c718833..1b71ae1d1bf5 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1222,27 +1222,35 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
 
 static void sk_psock_verdict_data_ready(struct sock *sk)
 {
-	struct socket *sock = sk->sk_socket;
+	struct socket *sock;
 	const struct proto_ops *ops;
 	int copied;
 
 	trace_sk_data_ready(sk);
 
+	/* We need RCU to prevent the sk_socket from being released.
+	 * Especially for Unix sockets, we are currently in the process
+	 * context and do not have RCU protection.
+	 */
+	rcu_read_lock();
+	sock = sk->sk_socket;
 	if (unlikely(!sock))
-		return;
+		goto unlock;
+
 	ops = READ_ONCE(sock->ops);
 	if (!ops || !ops->read_skb)
-		return;
+		goto unlock;
+
 	copied = ops->read_skb(sk, sk_psock_verdict_recv);
 	if (copied >= 0) {
 		struct sk_psock *psock;
 
-		rcu_read_lock();
 		psock = sk_psock(sk);
 		if (psock)
 			sk_psock_data_ready(sk, psock);
-		rcu_read_unlock();
 	}
+unlock:
+	rcu_read_unlock();
 }
 
 void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket
  2025-02-28  5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
  2025-02-28  5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
@ 2025-02-28  5:51 ` Jiayuan Chen
  2025-02-28  5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen
  2 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28  5:51 UTC (permalink / raw)
  To: xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
	linux-kselftest, mrpre, cong.wang

Current wrapper function create_pair() is used to create a pair of
connected links and returns two fds, but it does not support unix sockets.

Here we introduce socketpair() into create_pair(), which supports creating
a pair of unix sockets, since the semantics of the two are the same.

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 .../selftests/bpf/prog_tests/socket_helpers.h       | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/socket_helpers.h b/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
index 1bdfb79ef009..a805143dd84f 100644
--- a/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
+++ b/tools/testing/selftests/bpf/prog_tests/socket_helpers.h
@@ -313,11 +313,22 @@ static inline int recv_timeout(int fd, void *buf, size_t len, int flags,
 
 static inline int create_pair(int family, int sotype, int *p0, int *p1)
 {
-	__close_fd int s, c = -1, p = -1;
+	__close_fd int s = -1, c = -1, p = -1;
 	struct sockaddr_storage addr;
 	socklen_t len = sizeof(addr);
 	int err;
 
+	if (family == AF_UNIX) {
+		int fds[2];
+
+		err = socketpair(family, sotype, 0, fds);
+		if (!err) {
+			*p0 = fds[0];
+			*p1 = fds[1];
+		}
+		return err;
+	}
+
 	s = socket_loopback(family, sotype);
 	if (s < 0)
 		return s;
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap
  2025-02-28  5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
  2025-02-28  5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
  2025-02-28  5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
@ 2025-02-28  5:51 ` Jiayuan Chen
  2 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-02-28  5:51 UTC (permalink / raw)
  To: xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, mhal, jiayuan.chen, sgarzare, netdev, bpf, linux-kernel,
	linux-kselftest, mrpre, cong.wang

Add edge case tests for sockmap.

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 59 +++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index 1e3e4392dcca..ad8bb085baf2 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -1042,6 +1042,59 @@ static void test_sockmap_vsock_unconnected(void)
 	xclose(map);
 }
 
+void *close_thread(void *arg)
+{
+	int *fd = (int *)arg;
+
+	sleep(1);
+	close(*fd);
+	*fd = -1;
+	return NULL;
+}
+
+void test_sockmap_with_close_on_write(int family, int sotype)
+{
+	struct test_sockmap_pass_prog *skel;
+	int err, map, verdict;
+	pthread_t tid;
+	int zero = 0;
+	int c = -1, p = -1;
+
+	skel = test_sockmap_pass_prog__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		return;
+
+	verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+	map = bpf_map__fd(skel->maps.sock_map_rx);
+
+	err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
+	if (!ASSERT_OK(err, "bpf_prog_attach"))
+		goto out;
+
+	err = create_pair(family, sotype, &c, &p);
+	if (!ASSERT_OK(err, "create_pair"))
+		goto out;
+
+	err = bpf_map_update_elem(map, &zero, &p, BPF_ANY);
+	if (!ASSERT_OK(err, "bpf_map_update_elem"))
+		goto out;
+
+	err = pthread_create(&tid, 0, close_thread, &p);
+	if (!ASSERT_OK(err, "pthread_create"))
+		goto out;
+
+	while (p > 0)
+		send(c, "a", 1, MSG_NOSIGNAL);
+
+	pthread_join(tid, NULL);
+out:
+	if (c > 0)
+		close(c);
+	if (p > 0)
+		close(p);
+	test_sockmap_pass_prog__destroy(skel);
+}
+
 void test_sockmap_basic(void)
 {
 	if (test__start_subtest("sockmap create_update_free"))
@@ -1108,4 +1161,10 @@ void test_sockmap_basic(void)
 		test_sockmap_skb_verdict_vsock_poll();
 	if (test__start_subtest("sockmap vsock unconnected"))
 		test_sockmap_vsock_unconnected();
+	if (test__start_subtest("sockmap with write on close")) {
+		test_sockmap_with_close_on_write(AF_UNIX, SOCK_STREAM);
+		test_sockmap_with_close_on_write(AF_UNIX, SOCK_DGRAM);
+		test_sockmap_with_close_on_write(AF_INET, SOCK_STREAM);
+		test_sockmap_with_close_on_write(AF_INET, SOCK_DGRAM);
+	}
 }
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
  2025-02-28  5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
@ 2025-03-07  9:45   ` Michal Luczaj
  2025-03-10 11:36     ` Jiayuan Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Luczaj @ 2025-03-07  9:45 UTC (permalink / raw)
  To: Jiayuan Chen, xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
	mrpre, cong.wang, syzbot+dd90a702f518e0eac072

On 2/28/25 06:51, Jiayuan Chen wrote:
> ...
>  static void sk_psock_verdict_data_ready(struct sock *sk)
>  {
> -	struct socket *sock = sk->sk_socket;
> +	struct socket *sock;
>  	const struct proto_ops *ops;
>  	int copied;
>  
>  	trace_sk_data_ready(sk);
>  
> +	/* We need RCU to prevent the sk_socket from being released.
> +	 * Especially for Unix sockets, we are currently in the process
> +	 * context and do not have RCU protection.
> +	 */
> +	rcu_read_lock();
> +	sock = sk->sk_socket;
>  	if (unlikely(!sock))
> -		return;
> +		goto unlock;
> +
>  	ops = READ_ONCE(sock->ops);
>  	if (!ops || !ops->read_skb)
> -		return;
> +		goto unlock;
> +
>  	copied = ops->read_skb(sk, sk_psock_verdict_recv);
>  	if (copied >= 0) {
>  		struct sk_psock *psock;
>  
> -		rcu_read_lock();
>  		psock = sk_psock(sk);
>  		if (psock)
>  			sk_psock_data_ready(sk, psock);
> -		rcu_read_unlock();
>  	}
> +unlock:
> +	rcu_read_unlock();
>  }

Hi,

Doesn't sk_psock_handle_skb() (!ingress path) have the same `struct socket`
release race issue? Any plans on fixing that one, too?

BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
read_skb() under RCU read lock.

Thanks,
Michal

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
  2025-03-07  9:45   ` Michal Luczaj
@ 2025-03-10 11:36     ` Jiayuan Chen
  2025-03-10 13:08       ` Michal Luczaj
  0 siblings, 1 reply; 8+ messages in thread
From: Jiayuan Chen @ 2025-03-10 11:36 UTC (permalink / raw)
  To: Michal Luczaj, xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
	mrpre, cong.wang, syzbot+dd90a702f518e0eac072

March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:

> 
> On 2/28/25 06:51, Jiayuan Chen wrote:
> 
> > 
> > ...
> > 
> >  static void sk_psock_verdict_data_ready(struct sock *sk)
> > 
> >  {
> > 
> >  - struct socket *sock = sk->sk_socket;
> > 
> >  + struct socket *sock;
> > 
> >  const struct proto_ops *ops;
> > 
> >  int copied;
> > 
> >  
> > 
> >  trace_sk_data_ready(sk);
> > 
> >  
> > 
> >  + /* We need RCU to prevent the sk_socket from being released.
> > 
> >  + * Especially for Unix sockets, we are currently in the process
> > 
> >  + * context and do not have RCU protection.
> > 
> >  + */
> > 
> >  + rcu_read_lock();
> > 
> >  + sock = sk->sk_socket;
> > 
> >  if (unlikely(!sock))
> > 
> >  - return;
> > 
> >  + goto unlock;
> > 
> >  +
> > 
> >  ops = READ_ONCE(sock->ops);
> > 
> >  if (!ops || !ops->read_skb)
> > 
> >  - return;
> > 
> >  + goto unlock;
> > 
> >  +
> > 
> >  copied = ops->read_skb(sk, sk_psock_verdict_recv);
> > 
> >  if (copied >= 0) {
> > 
> >  struct sk_psock *psock;
> > 
> >  
> > 
> >  - rcu_read_lock();
> > 
> >  psock = sk_psock(sk);
> > 
> >  if (psock)
> > 
> >  sk_psock_data_ready(sk, psock);
> > 
> >  - rcu_read_unlock();
> > 
> >  }
> > 
> >  +unlock:
> > 
> >  + rcu_read_unlock();
> > 
> >  }
> > 
> 
> Hi,
> 
> Doesn't sk_psock_handle_skb() (!ingress path) have the same `struct socket`
> 
> release race issue? Any plans on fixing that one, too?

Yes, the send path logic also has similar issues, and after some hacking,
I was able to reproduce it. Thanks for providing this information.
I can fix these together in the next revision of this patchset, anyway,
this patchset still needs further confirmation from the maintainer.

> 
> BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
> 
> read_skb() under RCU read lock.
> 
> Thanks,
> 
> Michal
>
My environment also has LOCKDEP enabled, but I didn't see similar
warnings.
Moreover, RCU assertions are typically written as:

WARN_ON_ONCE(!rcu_read_lock_held())

And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
returning 1. So, it's unlikely to trigger a warning due to an RCU lock
being held.

Could you provide more of the call stack?

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
  2025-03-10 11:36     ` Jiayuan Chen
@ 2025-03-10 13:08       ` Michal Luczaj
  2025-03-10 14:13         ` Jiayuan Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Luczaj @ 2025-03-10 13:08 UTC (permalink / raw)
  To: Jiayuan Chen, xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
	mrpre, cong.wang, syzbot+dd90a702f518e0eac072

On 3/10/25 12:36, Jiayuan Chen wrote:
> March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
> ...
>> BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
>> read_skb() under RCU read lock.
>>
> My environment also has LOCKDEP enabled, but I didn't see similar
> warnings.
> Moreover, RCU assertions are typically written as:
> 
> WARN_ON_ONCE(!rcu_read_lock_held())
> 
> And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
> returning 1. So, it's unlikely to trigger a warning due to an RCU lock
> being held.
> 
> Could you provide more of the call stack?

Sure, bpf-next with this series applied, test_progs -t sockmap_basic:

=============================
[ BUG: Invalid wait context ]
6.14.0-rc3+ #111 Tainted: G           OE
-----------------------------
test_progs/37755 is trying to lock:
ffff88810d9bc3c0 (&u->iolock){+.+.}-{4:4}, at: unix_stream_read_skb+0x30/0x120
other info that might help us debug this:
context-{5:5}
1 lock held by test_progs/37755:
 #0: ffffffff833700e0 (rcu_read_lock){....}-{1:3}, at: sk_psock_verdict_data_ready+0x3e/0x2a0
stack backtrace:
CPU: 13 UID: 0 PID: 37755 Comm: test_progs Tainted: G           OE      6.14.0-rc3+ #111
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
Call Trace:
 dump_stack_lvl+0x68/0x90
 lock_acquire+0xcf/0x2e0
 __mutex_lock+0x9c/0xcc0
 unix_stream_read_skb+0x30/0x120
 sk_psock_verdict_data_ready+0x8d/0x2a0
 unix_stream_sendmsg+0x232/0x640
 __sys_sendto+0x1cd/0x1e0
 __x64_sys_sendto+0x20/0x30
 do_syscall_64+0x93/0x180
 entry_SYSCALL_64_after_hwframe+0x76/0x7e


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free
  2025-03-10 13:08       ` Michal Luczaj
@ 2025-03-10 14:13         ` Jiayuan Chen
  0 siblings, 0 replies; 8+ messages in thread
From: Jiayuan Chen @ 2025-03-10 14:13 UTC (permalink / raw)
  To: Michal Luczaj, xiyou.wangcong, john.fastabend, jakub, martin.lau
  Cc: davem, edumazet, kuba, pabeni, horms, andrii, eddyz87, mykolal,
	ast, daniel, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	shuah, sgarzare, netdev, bpf, linux-kernel, linux-kselftest,
	mrpre, cong.wang, syzbot+dd90a702f518e0eac072

March 10, 2025 at 9:08 PM, "Michal Luczaj" <mhal@rbox.co> wrote:



> 
> On 3/10/25 12:36, Jiayuan Chen wrote:
> 
> > 
> > March 7, 2025 at 5:45 PM, "Michal Luczaj" <mhal@rbox.co> wrote:
> > 
> >  ...
> > 
> > > 
> > > BTW, lockdep (CONFIG_LOCKDEP=y) complains about calling AF_UNIX's
> > > 
> > >  read_skb() under RCU read lock.
> > > 
> > 
> >  My environment also has LOCKDEP enabled, but I didn't see similar
> > 
> >  warnings.
> > 
> >  Moreover, RCU assertions are typically written as:
> > 
> >  
> > 
> >  WARN_ON_ONCE(!rcu_read_lock_held())
> > 
> >  
> > 
> >  And when LOCKDEP is not enabled, rcu_read_lock_held() defaults to
> > 
> >  returning 1. So, it's unlikely to trigger a warning due to an RCU lock
> > 
> >  being held.
> > 
> >  
> > 
> >  Could you provide more of the call stack?
> > 
> 
> Sure, bpf-next with this series applied, test_progs -t sockmap_basic:
> 
> =============================
> 
> [ BUG: Invalid wait context ]
> 
> 6.14.0-rc3+ #111 Tainted: G OE
> 
> -----------------------------
> 
> test_progs/37755 is trying to lock:
> 
> ffff88810d9bc3c0 (&u->iolock){+.+.}-{4:4}, at: unix_stream_read_skb+0x30/0x120
> 
> other info that might help us debug this:
> 
> context-{5:5}
> 
> 1 lock held by test_progs/37755:
> 
>  #0: ffffffff833700e0 (rcu_read_lock){....}-{1:3}, at: sk_psock_verdict_data_ready+0x3e/0x2a0
> 
> stack backtrace:
> 
> CPU: 13 UID: 0 PID: 37755 Comm: test_progs Tainted: G OE 6.14.0-rc3+ #111
> 
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> 
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
> 
> Call Trace:
> 
>  dump_stack_lvl+0x68/0x90
> 
>  lock_acquire+0xcf/0x2e0
> 
>  __mutex_lock+0x9c/0xcc0
> 
>  unix_stream_read_skb+0x30/0x120
> 
>  sk_psock_verdict_data_ready+0x8d/0x2a0
> 
>  unix_stream_sendmsg+0x232/0x640
> 
>  __sys_sendto+0x1cd/0x1e0
> 
>  __x64_sys_sendto+0x20/0x30
> 
>  do_syscall_64+0x93/0x180
> 
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
Thanks, I got this stack too after enabling CONFIG_PROVE_LOCKING.
It seems that we can't call sleepable lock such as mutex_lock under rcu-locked context.
I'm working on it.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-03-10 14:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-28  5:51 [PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap Jiayuan Chen
2025-02-28  5:51 ` [PATCH bpf-next v2 1/3] bpf, sockmap: avoid using sk_socket after free Jiayuan Chen
2025-03-07  9:45   ` Michal Luczaj
2025-03-10 11:36     ` Jiayuan Chen
2025-03-10 13:08       ` Michal Luczaj
2025-03-10 14:13         ` Jiayuan Chen
2025-02-28  5:51 ` [PATCH bpf-next v2 2/3] selftests/bpf: Add socketpair to create_pair to support unix socket Jiayuan Chen
2025-02-28  5:51 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add edge case tests for sockmap Jiayuan Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.