From: John Fastabend <john.fastabend@gmail.com>
To: Jason Xing <kerneljasonxing@gmail.com>,
john.fastabend@gmail.com, edumazet@google.com,
jakub@cloudflare.com, davem@davemloft.net, kuba@kernel.org,
pabeni@redhat.com, daniel@iogearbox.net, ast@kernel.org
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
Jason Xing <kernelxing@tencent.com>,
syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com
Subject: RE: [PATCH net] bpf, skmsg: fix NULL pointer dereference in sk_psock_skb_ingress_enqueue
Date: Wed, 03 Apr 2024 18:00:59 -0700
Message-ID: <660dfbcb45cfc_23f4720810@john.notmuch>
In-Reply-To: <20240329134037.92124-1-kerneljasonxing@gmail.com>
Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which
> syzbot reported [1].
>
> [1]
> BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue
>
> write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1:
> sk_psock_stop_verdict net/core/skmsg.c:1257 [inline]
> sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843
> sk_psock_put include/linux/skmsg.h:459 [inline]
> sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648
> unix_release+0x4b/0x80 net/unix/af_unix.c:1048
> __sock_release net/socket.c:659 [inline]
> sock_close+0x68/0x150 net/socket.c:1421
> __fput+0x2c1/0x660 fs/file_table.c:422
> __fput_sync+0x44/0x60 fs/file_table.c:507
> __do_sys_close fs/open.c:1556 [inline]
> __se_sys_close+0x101/0x1b0 fs/open.c:1541
> __x64_sys_close+0x1f/0x30 fs/open.c:1541
> do_syscall_64+0xd3/0x1d0
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
>
> read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0:
> sk_psock_data_ready include/linux/skmsg.h:464 [inline]
> sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555
> sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606
> sk_psock_verdict_apply net/core/skmsg.c:1008 [inline]
> sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202
> unix_read_skb net/unix/af_unix.c:2546 [inline]
> unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682
> sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223
> unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x140/0x180 net/socket.c:745
> ____sys_sendmsg+0x312/0x410 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x1e9/0x280 net/socket.c:2667
> __do_sys_sendmsg net/socket.c:2676 [inline]
> __se_sys_sendmsg net/socket.c:2674 [inline]
> __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674
> do_syscall_64+0xd3/0x1d0
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
>
> value changed: 0xffffffff83d7feb0 -> 0x0000000000000000
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
>
> Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer
> dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer
> similarly due to no protection of saved_data_ready. Here is another
> different caller causing the same issue because of the same reason. So
> we should protect it with sk_callback_lock read lock because the writer
> side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);".
>
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> net/core/skmsg.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index 4d75ef9d24bf..67c4c01c5235 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
> msg->skb = skb;
>
> sk_psock_queue_msg(psock, msg);
> + read_lock_bh(&sk->sk_callback_lock);
> sk_psock_data_ready(sk, psock);
> + read_unlock_bh(&sk->sk_callback_lock);
> return copied;
> }

The problem is the check followed by the use: presumably the pointer has
already been set to NULL by the time it is called:

  static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
  {
          if (psock->saved_data_ready)
                  psock->saved_data_ready(sk);

I'm thinking we might be able to get away with just a READ_ONCE() here,
with a matching WRITE_ONCE() on the other side. Something like this:

  sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
  {
          void (*saved_data_ready)(struct sock *sk);

          saved_data_ready = READ_ONCE(psock->saved_data_ready);
          if (saved_data_ready)
                  saved_data_ready(sk);
          ....

And then in sk_psock_stop_verdict():

          WRITE_ONCE(sk->sk_data_ready, psock->saved_data_ready);
          WRITE_ONCE(psock->saved_data_ready, NULL);

And because we don't actually release the sock until an RCU grace period
has elapsed, we should be OK. The TCP stack manages to work correctly
without wrapping tcp_data_ready in locks like this. But the nice thing
there is that you don't change this callback on live sockets.
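
To make the READ_ONCE()/WRITE_ONCE() idea concrete, here is a minimal
userspace sketch, not kernel code. The macros are simplified stand-ins for
the kernel ones, the two structs are cut down to the fields discussed
above, and psock_data_ready(), psock_stop_verdict(), demo_data_ready(),
run_demo() and the wakeups counter are hypothetical names used only for
illustration. It shows the shape of the pattern (load the pointer once,
test and call the local copy); it does not model the RCU lifetime argument.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros (assumption):
 * the volatile cast forces exactly one load or store of the variable. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct sock;
typedef void (*data_ready_fn)(struct sock *sk);

/* Cut-down versions of the real structures, only the fields used here. */
struct sock {
	data_ready_fn sk_data_ready;
	int wakeups;			/* hypothetical counter for the demo */
};

struct sk_psock {
	data_ready_fn saved_data_ready;
};

/* Hypothetical callback standing in for the socket's data_ready handler. */
static void demo_data_ready(struct sock *sk)
{
	sk->wakeups++;
}

/* Reader side: load the pointer once, then test and call the local copy,
 * so a concurrent store of NULL cannot land between check and call. */
static void psock_data_ready(struct sock *sk, struct sk_psock *psock)
{
	data_ready_fn fn = READ_ONCE(psock->saved_data_ready);

	if (fn)
		fn(sk);
}

/* Writer side: hand the callback back to the socket and clear the saved copy. */
static void psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
{
	WRITE_ONCE(sk->sk_data_ready, psock->saved_data_ready);
	WRITE_ONCE(psock->saved_data_ready, NULL);
}

/* Single-threaded walk through the two states; returns the wakeup count. */
int run_demo(void)
{
	struct sock sk = { .sk_data_ready = NULL, .wakeups = 0 };
	struct sk_psock psock = { .saved_data_ready = demo_data_ready };

	psock_data_ready(&sk, &psock);		/* callback still installed */
	psock_stop_verdict(&sk, &psock);	/* saved copy is now NULL */
	assert(sk.sk_data_ready == demo_data_ready);
	psock_data_ready(&sk, &psock);		/* sees NULL and does nothing */
	return sk.wakeups;
}
```

Calling run_demo() exercises both cases: one call while the callback is
installed, one after it has been cleared.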

I think, at least to keep the backport simple, the above patch is OK, but
let's move the read_lock_bh()/read_unlock_bh() into the
sk_psock_data_ready() call so that we don't duplicate this error again.
Does that make sense?

Thanks,
John