netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Jason Xing <kerneljasonxing@gmail.com>,
	 john.fastabend@gmail.com,  edumazet@google.com,
	 jakub@cloudflare.com,  davem@davemloft.net,  kuba@kernel.org,
	 pabeni@redhat.com,  daniel@iogearbox.net,  ast@kernel.org
Cc: netdev@vger.kernel.org,  bpf@vger.kernel.org,
	 Jason Xing <kernelxing@tencent.com>,
	 syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com
Subject: RE: [PATCH net] bpf, skmsg: fix NULL pointer dereference in sk_psock_skb_ingress_enqueue
Date: Wed, 03 Apr 2024 18:00:59 -0700	[thread overview]
Message-ID: <660dfbcb45cfc_23f4720810@john.notmuch> (raw)
In-Reply-To: <20240329134037.92124-1-kerneljasonxing@gmail.com>

Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which
> syzbot reported [1].
> 
> [1]
> BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue
> 
> write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1:
>  sk_psock_stop_verdict net/core/skmsg.c:1257 [inline]
>  sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843
>  sk_psock_put include/linux/skmsg.h:459 [inline]
>  sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648
>  unix_release+0x4b/0x80 net/unix/af_unix.c:1048
>  __sock_release net/socket.c:659 [inline]
>  sock_close+0x68/0x150 net/socket.c:1421
>  __fput+0x2c1/0x660 fs/file_table.c:422
>  __fput_sync+0x44/0x60 fs/file_table.c:507
>  __do_sys_close fs/open.c:1556 [inline]
>  __se_sys_close+0x101/0x1b0 fs/open.c:1541
>  __x64_sys_close+0x1f/0x30 fs/open.c:1541
>  do_syscall_64+0xd3/0x1d0
>  entry_SYSCALL_64_after_hwframe+0x6d/0x75
> 
> read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0:
>  sk_psock_data_ready include/linux/skmsg.h:464 [inline]
>  sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555
>  sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606
>  sk_psock_verdict_apply net/core/skmsg.c:1008 [inline]
>  sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202
>  unix_read_skb net/unix/af_unix.c:2546 [inline]
>  unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682
>  sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223
>  unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339
>  sock_sendmsg_nosec net/socket.c:730 [inline]
>  __sock_sendmsg+0x140/0x180 net/socket.c:745
>  ____sys_sendmsg+0x312/0x410 net/socket.c:2584
>  ___sys_sendmsg net/socket.c:2638 [inline]
>  __sys_sendmsg+0x1e9/0x280 net/socket.c:2667
>  __do_sys_sendmsg net/socket.c:2676 [inline]
>  __se_sys_sendmsg net/socket.c:2674 [inline]
>  __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674
>  do_syscall_64+0xd3/0x1d0
>  entry_SYSCALL_64_after_hwframe+0x6d/0x75
> 
> value changed: 0xffffffff83d7feb0 -> 0x0000000000000000
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G        W          6.8.0-syzkaller-08951-gfe46a7dd189e #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
> 
> Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer
> dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer
> similarly due to no protection of saved_data_ready. Here is another
> different caller causing the same issue because of the same reason. So
> we should protect it with sk_callback_lock read lock because the writer
> side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);".
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
>  net/core/skmsg.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index 4d75ef9d24bf..67c4c01c5235 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
>  	msg->skb = skb;
>  
>  	sk_psock_queue_msg(psock, msg);
> +	read_lock_bh(&sk->sk_callback_lock);
>  	sk_psock_data_ready(sk, psock);
> +	read_unlock_bh(&sk->sk_callback_lock);
>  	return copied;
>  }

The problem is the check and then usage presumably it is already set
to NULL:

 static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
 {
	if (psock->saved_data_ready)
		psock->saved_data_ready(sk);


I'm thinking we might be able to get away with just a READ_ONCE here with
similar WRITE_ONCE on other side. Something like this,

  sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
  {
       saved_data_ready = READ_ONCE(psock->saved_data_ready)

       if (saved_data_ready)
             saved_data_ready(sk)
       ....

And then in sk_psock_stop_verdict,

	WRITE_ONCE(sk->sk_data_ready, psock->saved_data_ready);
	WRITE_ONCE(psock->saved_data_ready, NULL);

And because we don't actually release the sock until a RCU grace period we
should be OK. The TCP stack manages to work correctly without wrapping
tcp_data_ready in locks like this. But nice thing there is you don't change
this callback on live sockets.

I think at least to keep backport simply above patch is ok, but lets move
the read_lock_bh()/unlock_bh() into the sk_psock_data_ready() call and then
we don't duplicate this error again. Does that make sense?

Thanks,
John


  parent reply	other threads:[~2024-04-04  1:01 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-29 13:40 [PATCH net] bpf, skmsg: fix NULL pointer dereference in sk_psock_skb_ingress_enqueue Jason Xing
2024-03-31  9:44 ` Jason Xing
2024-04-04  0:44 ` Jason Xing
2024-04-04  1:00 ` John Fastabend [this message]
2024-04-04  1:25   ` Jason Xing
2024-04-05  4:45     ` John Fastabend
2024-04-05  5:12       ` Jason Xing
2024-04-05 14:58         ` John Fastabend
2024-04-05 15:00           ` Jason Xing
2024-04-05 15:28             ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=660dfbcb45cfc_23f4720810@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jakub@cloudflare.com \
    --cc=kerneljasonxing@gmail.com \
    --cc=kernelxing@tencent.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).