* [PATCH v2 bpf-next 0/2] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt()
From: Alan Maguire @ 2024-08-08 15:05 UTC
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire

As previously discussed here [1], long-lived sockets can miss
a chance to set additional callbacks if a sock ops program
was not attached early in their lifetime.  Adding support
to bpf_setsockopt() to set callback flags (and bpf_getsockopt()
to retrieve them) provides other opportunities to enable callbacks,
either directly via a cgroup/setsockopt intercepted setsockopt()
or via a socket iterator.

Patch 1 adds bpf_[get|set]sockopt() support; patch 2 adds testing
for it via a sockops program, along with verification via a
cgroup/getsockopt program.

Changes since v1 [2]:

- Removed unneeded READ_ONCE() (Martin, patch 1)
- Reworked sockopt test to leave existing tests undisturbed while adding
  test_nonstandard_opt() test to cover the TCP_BPF_SOCK_OPS_CB_FLAGS
  case; test verifies that value set via bpf_setsockopt() is what we
  expect via a call to getsockopt() which is caught by a
  cgroup/getsockopt program to provide the flags value (Martin, patch 2)
- Removed unneeded iterator test (Martin)

[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/
[2] https://lore.kernel.org/bpf/20240802152929.2695863-1-alan.maguire@oracle.com/

Alan Maguire (2):
  bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
  selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS

 include/uapi/linux/bpf.h                      |  3 +-
 net/core/filter.c                             | 15 ++++++
 tools/include/uapi/linux/bpf.h                |  3 +-
 .../selftests/bpf/prog_tests/setget_sockopt.c | 47 +++++++++++++++++++
 .../selftests/bpf/progs/setget_sockopt.c      | 24 ++++++++--
 5 files changed, 87 insertions(+), 5 deletions(-)

-- 
2.31.1

* [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
From: Alan Maguire @ 2024-08-08 15:05 UTC
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire

Currently the only opportunity to set sock ops flags dictating
which callbacks fire for a socket is from within a TCP-BPF sockops
program.  This is problematic if the connection is already set up
as there is no further chance to specify callbacks for that socket.
Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt()
to allow users to specify callbacks later, either via an iterator
over sockets or via a socket-specific program triggered by a
setsockopt() on the socket.

Previous discussion on this here [1].

[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 include/uapi/linux/bpf.h       |  3 ++-
 net/core/filter.c              | 15 +++++++++++++++
 tools/include/uapi/linux/bpf.h |  3 ++-
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
  * 		  **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
  * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
  * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * 		  **TCP_BPF_RTO_MIN**.
+ * 		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
  * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
  * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
  * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
 	TCP_BPF_SYN		= 1005, /* Copy the TCP header */
 	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
 	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
+	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
 };
 
 enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 78a6f746ea0b..67114e2fb52d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
 			return -EINVAL;
 		inet_csk(sk)->icsk_rto_min = timeout;
 		break;
+	case TCP_BPF_SOCK_OPS_CB_FLAGS:
+		if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
+			return -EINVAL;
+		tp->bpf_sock_ops_cb_flags = val;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -5366,6 +5371,16 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
 		if (*optlen < 1)
 			return -EINVAL;
 		break;
+	case TCP_BPF_SOCK_OPS_CB_FLAGS:
+		if (*optlen != sizeof(int))
+			return -EINVAL;
+		if (getopt) {
+			struct tcp_sock *tp = tcp_sk(sk);
+
+			memcpy(optval, &tp->bpf_sock_ops_cb_flags, *optlen);
+			return 0;
+		}
+		return bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen);
 	default:
 		if (getopt)
 			return -EINVAL;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
  * 		  **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
  * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
  * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * 		  **TCP_BPF_RTO_MIN**.
+ * 		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
  * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
  * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
  * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
 	TCP_BPF_SYN		= 1005, /* Copy the TCP header */
 	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
 	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
+	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
 };
 
 enum {
-- 
2.31.1

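The set path in the net/core/filter.c hunk of patch 1 rejects any value
carrying bits outside BPF_SOCK_OPS_ALL_CB_FLAGS before storing it. A minimal
userspace sketch of that check follows. It is an illustration only, not the
kernel code: demo_flag_sock, demo_set_cb_flags() and the DEMO_* macros are
hypothetical names, and the mask value 0x7F is an assumption about the UAPI
header rather than something stated in this thread.

```c
#include <assert.h>	/* only needed by callers that want to check results */
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the UAPI flag bits; the values are assumptions. */
#define DEMO_STATE_CB_FLAG	(1U << 2)
#define DEMO_ALL_CB_FLAGS	0x7FU

/* Hypothetical stand-in for the one field of struct tcp_sock used here. */
struct demo_flag_sock {
	uint8_t bpf_sock_ops_cb_flags;
};

/*
 * Same shape as the new setsockopt case: refuse values with unknown bits,
 * otherwise store the value in the u8 field.
 */
int demo_set_cb_flags(struct demo_flag_sock *tp, int val)
{
	if (val & ~DEMO_ALL_CB_FLAGS)
		return -EINVAL;
	tp->bpf_sock_ops_cb_flags = (uint8_t)val;
	return 0;
}
```

Because the mask check runs before the assignment, a rejected value leaves
the previously stored flags untouched, and an accepted value always fits in
the u8 field as long as the mask itself does.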
* Re: [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
From: Martin KaFai Lau @ 2024-08-09  3:32 UTC
To: Alan Maguire
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, davem, edumazet, bpf

On 8/8/24 8:05 AM, Alan Maguire wrote:
> Currently the only opportunity to set sock ops flags dictating
> which callbacks fire for a socket is from within a TCP-BPF sockops
> program.  This is problematic if the connection is already set up
> as there is no further chance to specify callbacks for that socket.
> Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt()
> to allow users to specify callbacks later, either via an iterator
> over sockets or via a socket-specific program triggered by a
> setsockopt() on the socket.
> 
> Previous discussion on this here [1].
> 
> [1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  include/uapi/linux/bpf.h       |  3 ++-
>  net/core/filter.c              | 15 +++++++++++++++
>  tools/include/uapi/linux/bpf.h |  3 ++-
>  3 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 35bcf52dbc65..d4d7efc34e67 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2851,7 +2851,7 @@ union bpf_attr {
>   * 		  **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
>   * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
>   * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
> - * 		  **TCP_BPF_RTO_MIN**.
> + * 		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
>   * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
>   * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
>   * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
> @@ -7080,6 +7080,7 @@ enum {
>  	TCP_BPF_SYN		= 1005, /* Copy the TCP header */
>  	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
>  	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
> +	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
>  };
>  
>  enum {
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 78a6f746ea0b..67114e2fb52d 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
>  			return -EINVAL;
>  		inet_csk(sk)->icsk_rto_min = timeout;
>  		break;
> +	case TCP_BPF_SOCK_OPS_CB_FLAGS:
> +		if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
> +			return -EINVAL;
> +		tp->bpf_sock_ops_cb_flags = val;
> +		break;
>  	default:
>  		return -EINVAL;
>  	}
> @@ -5366,6 +5371,16 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
>  		if (*optlen < 1)
>  			return -EINVAL;
>  		break;
> +	case TCP_BPF_SOCK_OPS_CB_FLAGS:
> +		if (*optlen != sizeof(int))
> +			return -EINVAL;
> +		if (getopt) {
> +			struct tcp_sock *tp = tcp_sk(sk);
> +
> +			memcpy(optval, &tp->bpf_sock_ops_cb_flags, *optlen);

bpf_sock_ops_cb_flags is a u8. memcpy with "*optlen == sizeof(int)" is an issue.
I fixed it up by assigning to a local int first.

Applied. Thanks.

> +			return 0;
> +		}
> +		return bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen);
>  	default:
>  		if (getopt)
>  			return -EINVAL;
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 35bcf52dbc65..d4d7efc34e67 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2851,7 +2851,7 @@ union bpf_attr {
>   * 		  **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
>   * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
>   * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
> - * 		  **TCP_BPF_RTO_MIN**.
> + * 		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
>   * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
>   * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
>   * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
> @@ -7080,6 +7080,7 @@ enum {
>  	TCP_BPF_SYN		= 1005, /* Copy the TCP header */
>  	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
>  	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
> +	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
>  };
>  
>  enum {

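The review comment on the memcpy() can be illustrated outside the kernel.
Copying sizeof(int) bytes starting at a u8 field would also pick up whatever
bytes sit next to it, so the value is widened into a local int first and
that int is copied out. The sketch below is a userspace approximation of
the "assign to a local int first" idea only; the applied kernel fix is not
shown in this thread, and demo_u8_sock, its neighbour[] padding and
demo_get_cb_flags() are hypothetical names.

```c
#include <assert.h>	/* only needed by callers that want to check results */
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: a u8 flags field followed by unrelated bytes. */
struct demo_u8_sock {
	uint8_t bpf_sock_ops_cb_flags;
	uint8_t neighbour[3];
};

/*
 * Copy the flags out as an int. The u8 is widened into a local int so the
 * memcpy() source really is sizeof(int) bytes long and the neighbouring
 * bytes never reach the caller's buffer.
 */
int demo_get_cb_flags(const struct demo_u8_sock *tp, void *optval, int optlen)
{
	int cb_flags;

	if (optlen != (int)sizeof(int))
		return -EINVAL;

	cb_flags = tp->bpf_sock_ops_cb_flags;
	memcpy(optval, &cb_flags, sizeof(cb_flags));
	return 0;
}
```

With the flags field set to 0x04 and the neighbouring bytes set to arbitrary
non-zero values, the caller's int comes back as exactly 0x04, which would not
be guaranteed if the copy started at the u8 field itself.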
* [PATCH v2 bpf-next 2/2] selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS
From: Alan Maguire @ 2024-08-08 15:05 UTC
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire

Add tests to set TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
bpf_setsockopt() and use a cgroup/getsockopt program to retrieve
the value to verify it was set.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/bpf/prog_tests/setget_sockopt.c | 47 +++++++++++++++++++
 .../selftests/bpf/progs/setget_sockopt.c      | 24 ++++++++--
 2 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
index 7d4a9b3d3722..89a0b899441c 100644
--- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
@@ -154,6 +154,51 @@ static void test_ktls(int family)
 	close(sfd);
 }
 
+static void test_nonstandard_opt(int family)
+{
+	struct setget_sockopt__bss *bss = skel->bss;
+	struct bpf_link *getsockopt_link = NULL;
+	int sfd = -1, fd = -1, cfd = -1, flags;
+	socklen_t flagslen = sizeof(flags);
+
+	memset(bss, 0, sizeof(*bss));
+
+	sfd = start_server(family, SOCK_STREAM,
+			   family == AF_INET6 ? addr6_str : addr4_str, 0, 0);
+	if (!ASSERT_GE(sfd, 0, "start_server"))
+		return;
+
+	fd = connect_to_fd(sfd, 0);
+	if (!ASSERT_GE(fd, 0, "connect_to_fd_server")) {
+		close(sfd);
+		return;
+	}
+
+	/* cgroup/getsockopt prog will intercept getsockopt() below and
+	 * retrieve the tcp socket bpf_sock_ops_cb_flags value for the
+	 * accept()ed socket; this was set earlier in the passive established
+	 * callback for the accept()ed socket via bpf_setsockopt().
+	 */
+	getsockopt_link = bpf_program__attach_cgroup(skel->progs._getsockopt, cg_fd);
+	if (!ASSERT_OK_PTR(getsockopt_link, "getsockopt prog"))
+		goto err_out;
+
+	cfd = accept(sfd, NULL, 0);
+	if (!ASSERT_GE(cfd, 0, "accept"))
+		goto err_out;
+
+	if (!ASSERT_OK(getsockopt(cfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, &flags, &flagslen),
+		       "getsockopt_flags"))
+		goto err_out;
+	ASSERT_EQ(flags & BPF_SOCK_OPS_STATE_CB_FLAG, BPF_SOCK_OPS_STATE_CB_FLAG,
+		  "cb_flags_set");
+err_out:
+	close(sfd);
+	close(fd);
+	close(cfd);
+	bpf_link__destroy(getsockopt_link);
+}
+
 void test_setget_sockopt(void)
 {
 	cg_fd = test__join_cgroup(CG_NAME);
@@ -191,6 +236,8 @@ void test_setget_sockopt(void)
 	test_udp(AF_INET);
 	test_ktls(AF_INET6);
 	test_ktls(AF_INET);
+	test_nonstandard_opt(AF_INET);
+	test_nonstandard_opt(AF_INET6);
 
 done:
 	setget_sockopt__destroy(skel);
diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c
index 60518aed1ffc..0b59be6502cc 100644
--- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
@@ -353,11 +353,30 @@ int BPF_PROG(socket_post_create, struct socket *sock, int family,
 	return 1;
 }
 
+SEC("cgroup/getsockopt")
+int _getsockopt(struct bpf_sockopt *ctx)
+{
+	struct bpf_sock *sk = ctx->sk;
+	int *optval = ctx->optval;
+	struct tcp_sock *tp;
+
+	if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS)
+		return 1;
+
+	tp = bpf_core_cast(sk, struct tcp_sock);
+	if (ctx->optval + sizeof(int) <= ctx->optval_end) {
+		*optval = tp->bpf_sock_ops_cb_flags;
+		ctx->retval = 0;
+	}
+	return 1;
+}
+
 SEC("sockops")
 int skops_sockopt(struct bpf_sock_ops *skops)
 {
 	struct bpf_sock *bpf_sk = skops->sk;
 	struct sock *sk;
+	int flags;
 
 	if (!bpf_sk)
 		return 1;
@@ -384,9 +403,8 @@ int skops_sockopt(struct bpf_sock_ops *skops)
 		nr_passive += !(bpf_test_sockopt(skops, sk) ||
 				test_tcp_maxseg(skops, sk) ||
 				test_tcp_saved_syn(skops, sk));
-		bpf_sock_ops_cb_flags_set(skops,
-					  skops->bpf_sock_ops_cb_flags |
-					  BPF_SOCK_OPS_STATE_CB_FLAG);
+		flags = skops->bpf_sock_ops_cb_flags | BPF_SOCK_OPS_STATE_CB_FLAG;
+		bpf_setsockopt(skops, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, &flags, sizeof(flags));
 		break;
 	case BPF_SOCK_OPS_STATE_CB:
 		if (skops->args[1] == BPF_TCP_CLOSE_WAIT)
-- 
2.31.1

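The cgroup/getsockopt program in patch 2 only writes the flags when the
buffer between optval and optval_end can hold an int, and then clears
retval so the getsockopt() call succeeds. A userspace sketch of that
pattern is given below under stated assumptions: demo_sockopt_ctx and
demo_fill_flags() are hypothetical stand-ins for struct bpf_sockopt and
the BPF program, and the size check is written as a length comparison
instead of the pointer comparison used in the test program.

```c
#include <assert.h>	/* only needed by callers that want to check results */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fields of struct bpf_sockopt used here. */
struct demo_sockopt_ctx {
	void *optval;		/* start of the caller's buffer */
	void *optval_end;	/* one past the end of the buffer */
	int retval;		/* result handed back to getsockopt() */
};

/*
 * Write cb_flags into the buffer as an int when it fits and mark the call
 * as successful. Returns 1 if the value was written, 0 if the buffer was
 * too small and the context was left untouched.
 */
int demo_fill_flags(struct demo_sockopt_ctx *ctx, uint8_t cb_flags)
{
	char *start = ctx->optval;
	char *end = ctx->optval_end;
	int value = cb_flags;

	if ((size_t)(end - start) < sizeof(value))
		return 0;

	memcpy(start, &value, sizeof(value));
	ctx->retval = 0;
	return 1;
}
```

In the BPF program the equivalent check is also what lets the verifier
accept the store through optval; the userspace version only models the
runtime behaviour.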