From: Jakub Sitnicki <jakub@cloudflare.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, John Fastabend <john.fastabend@gmail.com>,
Eric Dumazet <edumazet@google.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
kernel-team@cloudflare.com,
syzbot+04c21ed96d861dccc5cd@syzkaller.appspotmail.com
Subject: [PATCH bpf v2 2/4] bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener
Date: Sat, 21 Jan 2023 13:41:44 +0100 [thread overview]
Message-ID: <20230113-sockmap-fix-v2-2-1e0ee7ac2f90@cloudflare.com> (raw)
In-Reply-To: <20230113-sockmap-fix-v2-0-1e0ee7ac2f90@cloudflare.com>
A listening socket linked to a sockmap has its sk_prot overridden. It
points to one of the struct proto variants in tcp_bpf_prots. The variant
depends on the socket's family and which sockmap programs are attached.
A child socket cloned from a TCP listener initially inherits their sk_prot.
But before cloning is finished, we restore the child's proto to the
listener's original non-tcp_bpf_prots one. This happens in
tcp_create_openreq_child -> tcp_bpf_clone.
Today, in tcp_bpf_clone we detect if the child's proto should be restored
by checking only for the TCP_BPF_BASE proto variant. This is not
correct. The sk_prot of listening socket linked to a sockmap can point to
to any variant in tcp_bpf_prots.
If the listeners sk_prot happens to be not the TCP_BPF_BASE variant, then
the child socket unintentionally is left if the inherited sk_prot by
tcp_bpf_clone.
This leads to issues like infinite recursion on close [1], because the
child state is otherwise not set up for use with tcp_bpf_prot operations.
Adjust the check in tcp_bpf_clone to detect all of tcp_bpf_prots variants.
Note that it wouldn't be sufficient to check the socket state when
overriding the sk_prot in tcp_bpf_update_proto in order to always use the
TCP_BPF_BASE variant for listening sockets. Since commit
b8b8315e39ff ("bpf, sockmap: Remove unhash handler for BPF sockmap usage")
it is possible for a socket to transition to TCP_LISTEN state while already
linked to a sockmap, e.g. connect() -> insert into map ->
connect(AF_UNSPEC) -> listen().
[1]: https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/
Fixes: e80251555f0b ("tcp_bpf: Don't let child socket inherit parent protocol ops on copy")
Reported-by: syzbot+04c21ed96d861dccc5cd@syzkaller.appspotmail.com
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
include/linux/util_macros.h | 12 ++++++++++++
net/ipv4/tcp_bpf.c | 4 ++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/include/linux/util_macros.h b/include/linux/util_macros.h
index 72299f261b25..43db6e47503c 100644
--- a/include/linux/util_macros.h
+++ b/include/linux/util_macros.h
@@ -38,4 +38,16 @@
*/
#define find_closest_descending(x, a, as) __find_closest(x, a, as, >=)
+/**
+ * is_insidevar - check if the @ptr points inside the @var memory range.
+ * @ptr: the pointer to a memory address.
+ * @var: the variable which address and size identify the memory range.
+ *
+ * Evaluates to true if the address in @ptr lies within the memory
+ * range allocated to @var.
+ */
+#define is_insidevar(ptr, var) \
+ ((uintptr_t)(ptr) >= (uintptr_t)(var) && \
+ (uintptr_t)(ptr) < (uintptr_t)(var) + sizeof(var))
+
#endif
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 94aad3870c5f..cf26d65ca389 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -6,6 +6,7 @@
#include <linux/bpf.h>
#include <linux/init.h>
#include <linux/wait.h>
+#include <linux/util_macros.h>
#include <net/inet_common.h>
#include <net/tls.h>
@@ -639,10 +640,9 @@ EXPORT_SYMBOL_GPL(tcp_bpf_update_proto);
*/
void tcp_bpf_clone(const struct sock *sk, struct sock *newsk)
{
- int family = sk->sk_family == AF_INET6 ? TCP_BPF_IPV6 : TCP_BPF_IPV4;
struct proto *prot = newsk->sk_prot;
- if (prot == &tcp_bpf_prots[family][TCP_BPF_BASE])
+ if (is_insidevar(prot, tcp_bpf_prots))
newsk->sk_prot = sk->sk_prot_creator;
}
#endif /* CONFIG_BPF_SYSCALL */
--
2.39.0
next prev parent reply other threads:[~2023-01-21 12:42 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-21 12:41 [PATCH bpf v2 0/4] bpf, sockmap: Fix infinite recursion in sock_map_close Jakub Sitnicki
2023-01-21 12:41 ` [PATCH bpf v2 1/4] bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself Jakub Sitnicki
2023-01-25 5:20 ` John Fastabend
2023-01-21 12:41 ` Jakub Sitnicki [this message]
2023-01-25 5:25 ` [PATCH bpf v2 2/4] bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener John Fastabend
2023-01-21 12:41 ` [PATCH bpf v2 3/4] selftests/bpf: Pass BPF skeleton to sockmap_listen ops tests Jakub Sitnicki
2023-01-25 5:27 ` John Fastabend
2023-01-21 12:41 ` [PATCH bpf v2 4/4] selftests/bpf: Cover listener cloning with progs attached to sockmap Jakub Sitnicki
2023-01-25 5:28 ` John Fastabend
2023-01-25 6:00 ` [PATCH bpf v2 0/4] bpf, sockmap: Fix infinite recursion in sock_map_close patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230113-sockmap-fix-v2-2-1e0ee7ac2f90@cloudflare.com \
--to=jakub@cloudflare.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=edumazet@google.com \
--cc=john.fastabend@gmail.com \
--cc=kernel-team@cloudflare.com \
--cc=netdev@vger.kernel.org \
--cc=syzbot+04c21ed96d861dccc5cd@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox