From: Michal Luczaj <mhal@rbox.co>
To: John Fastabend <john.fastabend@gmail.com>,
 Jakub Sitnicki <jakub@cloudflare.com>,
 Eric Dumazet <edumazet@google.com>,
 Kuniyuki Iwashima <kuniyu@amazon.com>,
 Paolo Abeni <pabeni@redhat.com>,
 Willem de Bruijn <willemb@google.com>,
 "David S. Miller" <davem@davemloft.net>,
 Jakub Kicinski <kuba@kernel.org>, Simon Horman <horms@kernel.org>,
 Stefano Garzarella <sgarzare@redhat.com>,
 "Michael S. Tsirkin" <mst@redhat.com>,
 Bobby Eshleman <bobby.eshleman@bytedance.com>,
 Alexei Starovoitov <ast@kernel.org>,
 Daniel Borkmann <daniel@iogearbox.net>,
 Andrii Nakryiko <andrii@kernel.org>,
 Martin KaFai Lau <martin.lau@linux.dev>,
 Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
 Yonghong Song <yonghong.song@linux.dev>,
 KP Singh <kpsingh@kernel.org>,
 Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
 Jiri Olsa <jolsa@kernel.org>, Mykola Lysenko <mykolal@fb.com>,
 Shuah Khan <shuah@kernel.org>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH net 1/4] sockmap, vsock: For connectible sockets allow only connected
Date: Fri, 14 Feb 2025 14:11:48 +0100
Message-ID: <251be392-7cd5-4c69-bc02-12c794ea18a1@rbox.co>
In-Reply-To: <20250213-vsock-listen-sockmap-nullptr-v1-1-994b7cd2f16b@rbox.co>
> ...
> Another design detail is that listening vsocks are not supposed to have any
> transport assigned at all. Which implies they are not supported by the
> sockmap. But this is complicated by the fact that a socket, before
> switching to TCP_LISTEN, may have had some transport assigned during a
> failed connect() attempt. Hence, we may end up with a listening vsock in a
> sockmap, which blows up quickly:
>
> KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127]
> CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+
> Workqueue: vsock-loopback vsock_loopback_work
> RIP: 0010:vsock_read_skb+0x4b/0x90
> Call Trace:
>  sk_psock_verdict_data_ready+0xa4/0x2e0
>  virtio_transport_recv_pkt+0x1ca8/0x2acc
>  vsock_loopback_work+0x27d/0x3f0
>  process_one_work+0x846/0x1420
>  worker_thread+0x5b3/0xf80
>  kthread+0x35a/0x700
>  ret_from_fork+0x2d/0x70
>  ret_from_fork_asm+0x1a/0x30
Perhaps I should have expanded more on the null-ptr-deref itself.
The idea is: force a vsock into assigning a transport and add it to the
sockmap (with a verdict program), but keep it unconnected. Then, drop
the transport and set the vsock to TCP_LISTEN. The moment a new
connection is established:

virtio_transport_recv_pkt()
  virtio_transport_recv_listen()
    sk->sk_data_ready(sk)            i.e. sk_psock_verdict_data_ready()
      ops->read_skb()                i.e. vsock_read_skb()
        vsk->transport->read_skb()   vsk->transport is NULL, boom

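To make the failure mode concrete, here is a small user-space model of the last two steps of that chain. It is only a sketch: struct toy_transport, struct toy_vsock and toy_read_skb() are hypothetical stand-ins rather than kernel definitions, and the NULL check returning -ENODEV is an assumed guard chosen for illustration, not a quote from any patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the transport ops; not the kernel's struct. */
struct toy_transport {
	int (*read_skb)(void *sk);
};

/* Hypothetical stand-in for a vsock; transport is NULL once dropped. */
struct toy_vsock {
	const struct toy_transport *transport;
};

/* Pretends that exactly one skb was read. */
static int toy_transport_read_skb(void *sk)
{
	(void)sk;
	return 1;
}

static const struct toy_transport toy_loopback = {
	.read_skb = toy_transport_read_skb,
};

/*
 * Models the step that dereferences vsk->transport. Without the NULL
 * check, a listening socket whose transport was dropped would crash here.
 */
static int toy_read_skb(struct toy_vsock *vsk)
{
	assert(vsk != NULL);	/* the caller must pass a socket */

	if (!vsk->transport)
		return -ENODEV;	/* assumed error code, for illustration */

	return vsk->transport->read_skb(vsk);
}
```

Calling toy_read_skb() on a toy_vsock with .transport = &toy_loopback goes through the transport callback, while one with .transport = NULL takes the guarded path instead of dereferencing a NULL pointer.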
Here's a stand-alone repro:

/*
 * # modprobe -a vsock_loopback vhost_vsock
 * # gcc test.c && ./a.out
 */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
#include <linux/vm_sockets.h>

static void die(const char *msg)
{
	perror(msg);
	exit(-1);
}

static int sockmap_create(void)
{
	union bpf_attr attr = {
		.map_type = BPF_MAP_TYPE_SOCKMAP,
		.key_size = sizeof(int),
		.value_size = sizeof(int),
		.max_entries = 1
	};
	int map;

	map = syscall(SYS_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
	if (map < 0)
		die("map_create");

	return map;
}

static void map_update_elem(int fd, int key, int value)
{
	union bpf_attr attr = {
		.map_fd = fd,
		.key = (uint64_t)&key,
		.value = (uint64_t)&value,
		.flags = BPF_ANY
	};

	if (syscall(SYS_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr)))
		die("map_update_elem");
}

static int prog_load(void)
{
	/* mov %r0, 1; exit */
	struct bpf_insn insns[] = {
		{ .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = 0, .imm = 1 },
		{ .code = BPF_JMP | BPF_EXIT }
	};
	union bpf_attr attr = {
		.prog_type = BPF_PROG_TYPE_SK_SKB,
		.insn_cnt = sizeof(insns)/sizeof(insns[0]),
		.insns = (long)insns,
		.license = (long)"",
	};

	int prog = syscall(SYS_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
	if (prog < 0)
		die("prog_load");

	return prog;
}

static void link_create(int prog_fd, int target_fd)
{
	union bpf_attr attr = {
		.link_create = {
			.prog_fd = prog_fd,
			.target_fd = target_fd,
			.attach_type = BPF_SK_SKB_VERDICT
		}
	};

	if (syscall(SYS_bpf, BPF_LINK_CREATE, &attr, sizeof(attr)) < 0)
		die("link_create");
}

int main(void)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid = VMADDR_CID_LOCAL,
		.svm_port = VMADDR_PORT_ANY
	};
	socklen_t alen = sizeof(addr);
	int s, map, prog, c;

	s = socket(AF_VSOCK, SOCK_SEQPACKET, 0);
	if (s < 0)
		die("socket");

	if (bind(s, (struct sockaddr *)&addr, alen))
		die("bind");

	/* Force a transport assignment, but keep the vsock unconnected. */
	if (!connect(s, (struct sockaddr *)&addr, alen) || errno != ECONNRESET)
		die("connect #1");

	/* Add the unconnected vsock to a sockmap with a verdict program. */
	map = sockmap_create();
	prog = prog_load();
	link_create(prog, map);
	map_update_elem(map, 0, s);

	/* Drop the transport via another failed connect(). */
	addr.svm_cid = 0x42424242; /* non-existing */
	if (!connect(s, (struct sockaddr *)&addr, alen) || errno != ESOCKTNOSUPPORT)
		die("connect #2");

	/* Set the vsock, still in the sockmap, to TCP_LISTEN. */
	if (listen(s, 1))
		die("listen");

	if (getsockname(s, (struct sockaddr *)&addr, &alen))
		die("getsockname");

	c = socket(AF_VSOCK, SOCK_SEQPACKET, 0);
	if (c < 0)
		die("socket c");

	/* A new connection hits vsock_read_skb() with vsk->transport == NULL. */
	if (connect(c, (struct sockaddr *)&addr, alen))
		die("connect #3");

	return 0;
}
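
For reference, the rule named in the subject line (for connectible sockets allow only connected) can be modelled as a simple predicate. This is a user-space sketch under stated assumptions, not the code in the patch: enum toy_state, toy_is_connectible() and toy_sockmap_allowed() are hypothetical names, and the states are toy values rather than the kernel's TCP_* constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical socket states for this model; not the kernel's values. */
enum toy_state { TOY_CLOSE, TOY_ESTABLISHED, TOY_LISTEN };

/* Connectible socket types, i.e. those that use connect()/listen(). */
static bool toy_is_connectible(int type)
{
	return type == SOCK_STREAM || type == SOCK_SEQPACKET;
}

/*
 * Returns true if a socket of the given type and state may be added to
 * the (modelled) sockmap: connectible sockets must be connected, other
 * types are not restricted by this rule.
 */
static bool toy_sockmap_allowed(int type, enum toy_state state)
{
	/* This model only covers stream, seqpacket and datagram sockets. */
	assert(type == SOCK_STREAM || type == SOCK_SEQPACKET ||
	       type == SOCK_DGRAM);

	if (toy_is_connectible(type))
		return state == TOY_ESTABLISHED;

	return true;
}
```

Under such a rule, the map_update_elem() call in the repro would be expected to fail, since s is a SOCK_SEQPACKET socket that is not connected at that point, so it could never end up in the sockmap as a listener.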