From: Cong Wang <xiyou.wangcong@gmail.com>
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
wangdongdong.6@bytedance.com, jiang.wang@bytedance.com,
Cong Wang <cong.wang@bytedance.com>
Subject: [Patch bpf-next 00/19] sock_map: add non-TCP and cross-protocol support
Date: Tue, 2 Feb 2021 20:16:17 -0800
Message-ID: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
From: Cong Wang <cong.wang@bytedance.com>
Currently sockmap fully supports only TCP. UDP is partially supported:
UDP sockets can be added to a sockmap, but nothing more. This patchset
extends sockmap with: 1) full UDP support; 2) full AF_UNIX dgram
support; 3) cross-protocol support. Our goal is to allow socket
splicing between AF_UNIX dgram and UDP.
At a high level, each protocol must implement ->sendmsg_locked() and
->read_sock() to support sockmap redirection. To update the sock proto,
a new op, ->update_proto(), is introduced, and each protocol must
implement it as well. This is slightly harder for AF_UNIX, as it does
not have a full struct proto implementation or redirection support.
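
As a rough illustration of the requirement above, here is a minimal
userspace sketch. It is not kernel code: struct model_sock, struct
model_ops, model_can_redirect() and the stub hooks are all hypothetical
names, and the hook signatures are simplified assumptions rather than
the ones used in the patches. It only shows the idea that a protocol
qualifies for redirection once all three hooks are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct proto/proto_ops. */
struct model_sock {
	int id;
};

struct model_ops {
	const char *name;
	int (*sendmsg_locked)(struct model_sock *sk, const void *buf, size_t len);
	int (*read_sock)(struct model_sock *sk);
	int (*update_proto)(struct model_sock *sk, bool restore);
};

/* Stub hooks: they only stand in for real per-protocol implementations. */
static int stub_sendmsg_locked(struct model_sock *sk, const void *buf, size_t len)
{
	(void)sk;
	(void)buf;
	return (int)len;
}

static int stub_read_sock(struct model_sock *sk)
{
	(void)sk;
	return 0;
}

static int stub_update_proto(struct model_sock *sk, bool restore)
{
	(void)sk;
	(void)restore;
	return 0;
}

/* A protocol can take part in redirection only if all three hooks exist. */
bool model_can_redirect(const struct model_ops *ops)
{
	assert(ops != NULL);
	return ops->sendmsg_locked && ops->read_sock && ops->update_proto;
}

/* Models a protocol with every hook implemented. */
const struct model_ops model_full = {
	.name = "full",
	.sendmsg_locked = stub_sendmsg_locked,
	.read_sock = stub_read_sock,
	.update_proto = stub_update_proto,
};

/* Models a protocol that can be added to the map but has no hooks yet. */
const struct model_ops model_add_only = {
	.name = "add-only",
};
```

With these definitions, model_can_redirect(&model_full) is true and
model_can_redirect(&model_add_only) is false.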
To support cross-protocol redirection, we have to make skb independent
of protocols, which is extremely hard given how creatively UDP uses
dev_scratch. Fortunately, we can pass an skmsg instead of an skb when
redirecting to ingress; the only thing that needs to be added is a new
->recvmsg() to retrieve the skmsg. On the egress side, a new skb is
allocated behind skb_send_sock_locked(), so it comes for free.
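
The ingress idea above can be sketched in userspace as follows. This is
only a toy model under stated assumptions: struct model_skmsg, struct
model_psock, model_redirect_ingress() and model_recvmsg() are
hypothetical names, and the fixed-size ring is an assumption made for
brevity. It shows a protocol-neutral message being queued on the
receiving side and later retrieved by a dedicated receive routine, with
no protocol-specific buffer state carried across.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_QUEUE_LEN 4
#define MODEL_MSG_MAX   64

/* Hypothetical protocol-neutral message: payload bytes only. */
struct model_skmsg {
	size_t len;
	unsigned char data[MODEL_MSG_MAX];
};

/* Hypothetical per-socket ingress queue (a small ring buffer). */
struct model_psock {
	struct model_skmsg queue[MODEL_QUEUE_LEN];
	size_t head;
	size_t count;
};

/* Queue a copy of the payload; returns 0, or -1 if full or too large. */
int model_redirect_ingress(struct model_psock *to, const void *payload, size_t len)
{
	struct model_skmsg *msg;

	assert(to != NULL && payload != NULL);
	if (to->count == MODEL_QUEUE_LEN || len > MODEL_MSG_MAX)
		return -1;
	msg = &to->queue[(to->head + to->count) % MODEL_QUEUE_LEN];
	memcpy(msg->data, payload, len);
	msg->len = len;
	to->count++;
	return 0;
}

/*
 * Dequeue one message into buf, truncating to buflen as a datagram
 * receive would. Returns the number of bytes copied, or -1 if empty.
 */
long model_recvmsg(struct model_psock *from, void *buf, size_t buflen)
{
	struct model_skmsg *msg;
	size_t n;

	assert(from != NULL && buf != NULL);
	if (from->count == 0)
		return -1;
	msg = &from->queue[from->head];
	n = msg->len < buflen ? msg->len : buflen;
	memcpy(buf, msg->data, n);
	from->head = (from->head + 1) % MODEL_QUEUE_LEN;
	from->count--;
	return (long)n;
}
```

The sender's protocol never appears in the queued message, which is the
property that makes the ingress path independent of the source socket
type in this model.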
Another big barrier is the skb CB, which was hard-coded as
TCP_SKB_CB(). I switched it to skb ext to solve this problem. Please
see patch 3 for more details.
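
To illustrate why moving state out of the control block helps, here is
a small userspace sketch. It is not the code from patch 3: struct
model_skb, struct model_redir_state, model_ext_add() and
model_ext_free() are hypothetical names. The only kernel fact assumed
is that the skb control block is a fixed 48-byte scratch area shared by
whichever layer currently owns the buffer. The sketch keeps redirect
state in a separately allocated extension, so the scratch area is left
untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_CB_SIZE 48 /* size of the skb control block in the kernel */

/* Hypothetical redirect state that sockmap-like code needs per buffer. */
struct model_redir_state {
	int target_id;
	unsigned int flags;
};

/* Hypothetical buffer: shared scratch area plus an optional extension. */
struct model_skb {
	unsigned char cb[MODEL_CB_SIZE];
	struct model_redir_state *ext; /* NULL when no extension is attached */
};

/* Attach (or return the existing) extension without touching cb[]. */
struct model_redir_state *model_ext_add(struct model_skb *skb)
{
	assert(skb != NULL);
	if (!skb->ext)
		skb->ext = calloc(1, sizeof(*skb->ext));
	return skb->ext; /* NULL if the allocation failed */
}

/* Release the extension, if any. */
void model_ext_free(struct model_skb *skb)
{
	assert(skb != NULL);
	free(skb->ext);
	skb->ext = NULL;
}
```

Because the state lives outside cb[], any protocol can keep its own
layout in the scratch area while the redirect state travels with the
buffer.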
This patchset passes all tests, both the existing ones and the new
ones added in this patchset.
---
Cong Wang (19):
  bpf: rename BPF_STREAM_PARSER to BPF_SOCK_MAP
  skmsg: get rid of struct sk_psock_parser
  skmsg: use skb ext instead of TCP_SKB_CB
  sock_map: rename skb_parser and skb_verdict
  sock_map: introduce BPF_SK_SKB_VERDICT
  sock: introduce sk_prot->update_proto()
  udp: implement ->sendmsg_locked()
  udp: implement ->read_sock() for sockmap
  udp: add ->read_sock() and ->sendmsg_locked() to ipv6
  af_unix: implement ->sendmsg_locked for dgram socket
  af_unix: implement ->read_sock() for sockmap
  af_unix: implement ->update_proto()
  af_unix: set TCP_ESTABLISHED for datagram sockets too
  skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data()
  udp: implement udp_bpf_recvmsg() for sockmap
  af_unix: implement unix_dgram_bpf_recvmsg()
  sock_map: update sock type checks
  selftests/bpf: add test cases for unix and udp sockmap
  selftests/bpf: add test case for redirection between udp and unix
 MAINTAINERS                                   |   1 +
 include/linux/bpf.h                           |   4 +-
 include/linux/bpf_types.h                     |   2 +-
 include/linux/skbuff.h                        |   4 +
 include/linux/skmsg.h                         |  90 +++-
 include/net/af_unix.h                         |  13 +
 include/net/ipv6.h                            |   1 +
 include/net/sock.h                            |   3 +
 include/net/tcp.h                             |  33 +-
 include/net/udp.h                             |   9 +-
 include/uapi/linux/bpf.h                      |   1 +
 kernel/bpf/syscall.c                          |   1 +
 net/Kconfig                                   |  14 +-
 net/core/Makefile                             |   2 +-
 net/core/filter.c                             |   3 +-
 net/core/skbuff.c                             |   7 +
 net/core/skmsg.c                              | 223 +++++---
 net/core/sock_map.c                           | 128 ++---
 net/ipv4/Makefile                             |   2 +-
 net/ipv4/af_inet.c                            |   2 +
 net/ipv4/tcp_bpf.c                            | 130 +----
 net/ipv4/tcp_ipv4.c                           |   3 +
 net/ipv4/udp.c                                |  68 ++-
 net/ipv4/udp_bpf.c                            |  78 ++-
 net/ipv6/af_inet6.c                           |   2 +
 net/ipv6/tcp_ipv6.c                           |   3 +
 net/ipv6/udp.c                                |  30 +-
 net/tls/tls_sw.c                              |   4 +-
 net/unix/Makefile                             |   1 +
 net/unix/af_unix.c                            | 105 +++-
 net/unix/unix_bpf.c                           |  99 ++++
 tools/bpf/bpftool/common.c                    |   1 +
 tools/bpf/bpftool/prog.c                      |   1 +
 tools/include/uapi/linux/bpf.h                |   1 +
 .../selftests/bpf/prog_tests/sockmap_listen.c | 475 +++++++++++++++++-
 .../selftests/bpf/progs/test_sockmap_listen.c |  24 +-
 36 files changed, 1233 insertions(+), 335 deletions(-)
 create mode 100644 net/unix/unix_bpf.c
--
2.25.1