All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Jakub Sitnicki <jakub@cloudflare.com>,  bpf@vger.kernel.org
Cc: netdev@vger.kernel.org,  kernel-team@cloudflare.com,
	 Alexei Starovoitov <ast@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	 Andrii Nakryiko <andrii@kernel.org>,
	 John Fastabend <john.fastabend@gmail.com>,
	 Cong Wang <cong.wang@bytedance.com>
Subject: RE: [PATCH bpf] bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets
Date: Mon, 25 Sep 2023 11:27:32 -0700	[thread overview]
Message-ID: <6511d1143dc59_110e52088a@john.notmuch> (raw)
In-Reply-To: <20230920102055.42662-1-jakub@cloudflare.com>

Jakub Sitnicki wrote:
> With a SOCKMAP/SOCKHASH map and an sk_msg program user can steer messages
> sent from one TCP socket (s1) to actually egress from another TCP
> socket (s2):
> 
> tcp_bpf_sendmsg(s1)		// = sk_prot->sendmsg
>   tcp_bpf_send_verdict(s1)	// __SK_REDIRECT case
>     tcp_bpf_sendmsg_redir(s2)
>       tcp_bpf_push_locked(s2)
> 	tcp_bpf_push(s2)
> 	  tcp_rate_check_app_limited(s2) // expects tcp_sock
> 	  tcp_sendmsg_locked(s2)	 // ditto
> 
> There is a hard-coded assumption in the call-chain, that the egress
> socket (s2) is a TCP socket.
> 
> However in commit 122e6c79efe1 ("sock_map: Update sock type checks for
> UDP") we have enabled redirects to non-TCP sockets. This was done for the
> sake of BPF sk_skb programs. There was no indention to support sk_msg
> send-to-egress use case.
> 
> As a result, attempts to send-to-egress through a non-TCP socket lead to a
> crash due to invalid downcast from sock to tcp_sock:
> 
>  BUG: kernel NULL pointer dereference, address: 000000000000002f
>  ...
>  Call Trace:
>   <TASK>
>   ? show_regs+0x60/0x70
>   ? __die+0x1f/0x70
>   ? page_fault_oops+0x80/0x160
>   ? do_user_addr_fault+0x2d7/0x800
>   ? rcu_is_watching+0x11/0x50
>   ? exc_page_fault+0x70/0x1c0
>   ? asm_exc_page_fault+0x27/0x30
>   ? tcp_tso_segs+0x14/0xa0
>   tcp_write_xmit+0x67/0xce0
>   __tcp_push_pending_frames+0x32/0xf0
>   tcp_push+0x107/0x140
>   tcp_sendmsg_locked+0x99f/0xbb0
>   tcp_bpf_push+0x19d/0x3a0
>   tcp_bpf_sendmsg_redir+0x55/0xd0
>   tcp_bpf_send_verdict+0x407/0x550
>   tcp_bpf_sendmsg+0x1a1/0x390
>   inet_sendmsg+0x6a/0x70
>   sock_sendmsg+0x9d/0xc0
>   ? sockfd_lookup_light+0x12/0x80
>   __sys_sendto+0x10e/0x160
>   ? syscall_enter_from_user_mode+0x20/0x60
>   ? __this_cpu_preempt_check+0x13/0x20
>   ? lockdep_hardirqs_on+0x82/0x110
>   __x64_sys_sendto+0x1f/0x30
>   do_syscall_64+0x38/0x90
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> Reject selecting a non-TCP sockets as redirect target from a BPF sk_msg
> program to prevent the crash. When attempted, user will receive an EACCES
> error from send/sendto/sendmsg() syscall.
> 
> Fixes: 122e6c79efe1 ("sock_map: Update sock type checks for UDP")
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> FYI, I'm working on revamping the sockmap_listen selftest, which exercises
> some of redirect combinations, to cover the whole combination matrix so
> that we can catch these kinds of problems early on.

Yes this would be appreciated.

> 
>  net/core/sock_map.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index cb11750b1df5..4292c2ed1828 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -668,6 +668,8 @@ BPF_CALL_4(bpf_msg_redirect_map, struct sk_msg *, msg,
>  	sk = __sock_map_lookup_elem(map, key);
>  	if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
>  		return SK_DROP;
> +	if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk))
> +		return SK_DROP;
>  
>  	msg->flags = flags;
>  	msg->sk_redir = sk;
> @@ -1267,6 +1269,8 @@ BPF_CALL_4(bpf_msg_redirect_hash, struct sk_msg *, msg,
>  	sk = __sock_hash_lookup_elem(map, key);
>  	if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
>  		return SK_DROP;
> +	if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk))
> +		return SK_DROP;

As a stop gap I think this is fine. If anyone wants to add support though
I do think as a use case it would make sense to redirect TCP into an
AF_UNIX socket and vice versa.

Acked-by: John Fastabend <john.fastabend@gmail.com>

  parent reply	other threads:[~2023-09-25 18:27 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-20 10:20 [PATCH bpf] bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets Jakub Sitnicki
2023-09-20 18:19 ` Kui-Feng Lee
2023-09-20 20:59   ` Jakub Sitnicki
2023-09-20 21:11     ` Kui-Feng Lee
2023-09-25 18:27 ` John Fastabend [this message]
2023-09-29 15:20 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6511d1143dc59_110e52088a@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=cong.wang@bytedance.com \
    --cc=daniel@iogearbox.net \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.