* [PATCH v2 bpf-next 0/2] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt()
@ 2024-08-08 15:05 Alan Maguire
2024-08-08 15:05 ` [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
2024-08-08 15:05 ` [PATCH v2 bpf-next 2/2] selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
0 siblings, 2 replies; 4+ messages in thread
From: Alan Maguire @ 2024-08-08 15:05 UTC (permalink / raw)
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire
As previously discussed here [1], long-lived sockets can miss
a chance to set additional callbacks if a sock ops program
was not attached early in their lifetime. Adding support
to bpf_setsockopt() to set callback flags (and bpf_getsockopt()
to retrieve them) provides other opportunities to enable callbacks,
either directly via a cgroup/setsockopt intercepted setsockopt()
or via a socket iterator.
Patch 1 adds bpf_[get|set]sockopt() support; patch 2 adds testing
for it via a sockops programs, along with verification via a
cgroup/getsockopt program.
Changes since v1 [2]:
- Removed unneeded READ_ONCE() (Martin, patch 1)
- Reworked sockopt test to leave existing tests undisturbed while adding
test_nonstandard_opt() test to cover the TCP_BPF_SOCK_OPS_CB_FLAGS
case; test verifies that value set via bpf_setsockopt() is what we
expect via a call to getsockopt() which is caught by a
cgroup/getsockopt program to provide the flags value (Martin, patch 2)
- Removed unneeded iterator test (Martin)
[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/
[2] https://lore.kernel.org/bpf/20240802152929.2695863-1-alan.maguire@oracle.com/
Alan Maguire (2):
bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS
include/uapi/linux/bpf.h | 3 +-
net/core/filter.c | 15 ++++++
tools/include/uapi/linux/bpf.h | 3 +-
.../selftests/bpf/prog_tests/setget_sockopt.c | 47 +++++++++++++++++++
.../selftests/bpf/progs/setget_sockopt.c | 24 ++++++++--
5 files changed, 87 insertions(+), 5 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
2024-08-08 15:05 [PATCH v2 bpf-next 0/2] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
@ 2024-08-08 15:05 ` Alan Maguire
2024-08-09 3:32 ` Martin KaFai Lau
2024-08-08 15:05 ` [PATCH v2 bpf-next 2/2] selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
1 sibling, 1 reply; 4+ messages in thread
From: Alan Maguire @ 2024-08-08 15:05 UTC (permalink / raw)
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire
Currently the only opportunity to set sock ops flags dictating
which callbacks fire for a socket is from within a TCP-BPF sockops
program. This is problematic if the connection is already set up
as there is no further chance to specify callbacks for that socket.
Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt()
to allow users to specify callbacks later, either via an iterator
over sockets or via a socket-specific program triggered by a
setsockopt() on the socket.
Previous discussion on this here [1].
[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
include/uapi/linux/bpf.h | 3 ++-
net/core/filter.c | 15 +++++++++++++++
tools/include/uapi/linux/bpf.h | 3 ++-
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
* **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
* **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
* **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * **TCP_BPF_RTO_MIN**.
+ * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
* * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
* * **IPPROTO_IPV6**, which supports the following *optname*\ s:
* **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
TCP_BPF_SYN = 1005, /* Copy the TCP header */
TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
+ TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
};
enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 78a6f746ea0b..67114e2fb52d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
return -EINVAL;
inet_csk(sk)->icsk_rto_min = timeout;
break;
+ case TCP_BPF_SOCK_OPS_CB_FLAGS:
+ if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
+ return -EINVAL;
+ tp->bpf_sock_ops_cb_flags = val;
+ break;
default:
return -EINVAL;
}
@@ -5366,6 +5371,16 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
if (*optlen < 1)
return -EINVAL;
break;
+ case TCP_BPF_SOCK_OPS_CB_FLAGS:
+ if (*optlen != sizeof(int))
+ return -EINVAL;
+ if (getopt) {
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ memcpy(optval, &tp->bpf_sock_ops_cb_flags, *optlen);
+ return 0;
+ }
+ return bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen);
default:
if (getopt)
return -EINVAL;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
* **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
* **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
* **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * **TCP_BPF_RTO_MIN**.
+ * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
* * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
* * **IPPROTO_IPV6**, which supports the following *optname*\ s:
* **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
TCP_BPF_SYN = 1005, /* Copy the TCP header */
TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
+ TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
};
enum {
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v2 bpf-next 2/2] selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS
2024-08-08 15:05 [PATCH v2 bpf-next 0/2] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
2024-08-08 15:05 ` [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
@ 2024-08-08 15:05 ` Alan Maguire
1 sibling, 0 replies; 4+ messages in thread
From: Alan Maguire @ 2024-08-08 15:05 UTC (permalink / raw)
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire
Add tests to set TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
bpf_setsockopt() and use a cgroup/getsockopt program to retrieve
the value to verify it was set.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
.../selftests/bpf/prog_tests/setget_sockopt.c | 47 +++++++++++++++++++
.../selftests/bpf/progs/setget_sockopt.c | 24 ++++++++--
2 files changed, 68 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
index 7d4a9b3d3722..89a0b899441c 100644
--- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
@@ -154,6 +154,51 @@ static void test_ktls(int family)
close(sfd);
}
+static void test_nonstandard_opt(int family)
+{
+ struct setget_sockopt__bss *bss = skel->bss;
+ struct bpf_link *getsockopt_link = NULL;
+ int sfd = -1, fd = -1, cfd = -1, flags;
+ socklen_t flagslen = sizeof(flags);
+
+ memset(bss, 0, sizeof(*bss));
+
+ sfd = start_server(family, SOCK_STREAM,
+ family == AF_INET6 ? addr6_str : addr4_str, 0, 0);
+ if (!ASSERT_GE(sfd, 0, "start_server"))
+ return;
+
+ fd = connect_to_fd(sfd, 0);
+ if (!ASSERT_GE(fd, 0, "connect_to_fd_server")) {
+ close(sfd);
+ return;
+ }
+
+ /* cgroup/getsockopt prog will intercept getsockopt() below and
+ * retrieve the tcp socket bpf_sock_ops_cb_flags value for the
+ * accept()ed socket; this was set earlier in the passive established
+ * callback for the accept()ed socket via bpf_setsockopt().
+ */
+ getsockopt_link = bpf_program__attach_cgroup(skel->progs._getsockopt, cg_fd);
+ if (!ASSERT_OK_PTR(getsockopt_link, "getsockopt prog"))
+ goto err_out;
+
+ cfd = accept(sfd, NULL, 0);
+ if (!ASSERT_GE(cfd, 0, "accept"))
+ goto err_out;
+
+ if (!ASSERT_OK(getsockopt(cfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, &flags, &flagslen),
+ "getsockopt_flags"))
+ goto err_out;
+ ASSERT_EQ(flags & BPF_SOCK_OPS_STATE_CB_FLAG, BPF_SOCK_OPS_STATE_CB_FLAG,
+ "cb_flags_set");
+err_out:
+ close(sfd);
+ close(fd);
+ close(cfd);
+ bpf_link__destroy(getsockopt_link);
+}
+
void test_setget_sockopt(void)
{
cg_fd = test__join_cgroup(CG_NAME);
@@ -191,6 +236,8 @@ void test_setget_sockopt(void)
test_udp(AF_INET);
test_ktls(AF_INET6);
test_ktls(AF_INET);
+ test_nonstandard_opt(AF_INET);
+ test_nonstandard_opt(AF_INET6);
done:
setget_sockopt__destroy(skel);
diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c
index 60518aed1ffc..0b59be6502cc 100644
--- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
@@ -353,11 +353,30 @@ int BPF_PROG(socket_post_create, struct socket *sock, int family,
return 1;
}
+SEC("cgroup/getsockopt")
+int _getsockopt(struct bpf_sockopt *ctx)
+{
+ struct bpf_sock *sk = ctx->sk;
+ int *optval = ctx->optval;
+ struct tcp_sock *tp;
+
+ if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS)
+ return 1;
+
+ tp = bpf_core_cast(sk, struct tcp_sock);
+ if (ctx->optval + sizeof(int) <= ctx->optval_end) {
+ *optval = tp->bpf_sock_ops_cb_flags;
+ ctx->retval = 0;
+ }
+ return 1;
+}
+
SEC("sockops")
int skops_sockopt(struct bpf_sock_ops *skops)
{
struct bpf_sock *bpf_sk = skops->sk;
struct sock *sk;
+ int flags;
if (!bpf_sk)
return 1;
@@ -384,9 +403,8 @@ int skops_sockopt(struct bpf_sock_ops *skops)
nr_passive += !(bpf_test_sockopt(skops, sk) ||
test_tcp_maxseg(skops, sk) ||
test_tcp_saved_syn(skops, sk));
- bpf_sock_ops_cb_flags_set(skops,
- skops->bpf_sock_ops_cb_flags |
- BPF_SOCK_OPS_STATE_CB_FLAG);
+ flags = skops->bpf_sock_ops_cb_flags | BPF_SOCK_OPS_STATE_CB_FLAG;
+ bpf_setsockopt(skops, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, &flags, sizeof(flags));
break;
case BPF_SOCK_OPS_STATE_CB:
if (skops->args[1] == BPF_TCP_CLOSE_WAIT)
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
2024-08-08 15:05 ` [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
@ 2024-08-09 3:32 ` Martin KaFai Lau
0 siblings, 0 replies; 4+ messages in thread
From: Martin KaFai Lau @ 2024-08-09 3:32 UTC (permalink / raw)
To: Alan Maguire
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf
On 8/8/24 8:05 AM, Alan Maguire wrote:
> Currently the only opportunity to set sock ops flags dictating
> which callbacks fire for a socket is from within a TCP-BPF sockops
> program. This is problematic if the connection is already set up
> as there is no further chance to specify callbacks for that socket.
> Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt()
> to allow users to specify callbacks later, either via an iterator
> over sockets or via a socket-specific program triggered by a
> setsockopt() on the socket.
>
> Previous discussion on this here [1].
>
> [1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
> include/uapi/linux/bpf.h | 3 ++-
> net/core/filter.c | 15 +++++++++++++++
> tools/include/uapi/linux/bpf.h | 3 ++-
> 3 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 35bcf52dbc65..d4d7efc34e67 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2851,7 +2851,7 @@ union bpf_attr {
> * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
> * **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
> * **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
> - * **TCP_BPF_RTO_MIN**.
> + * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
> * * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
> * * **IPPROTO_IPV6**, which supports the following *optname*\ s:
> * **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
> @@ -7080,6 +7080,7 @@ enum {
> TCP_BPF_SYN = 1005, /* Copy the TCP header */
> TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
> TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
> + TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
> };
>
> enum {
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 78a6f746ea0b..67114e2fb52d 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
> return -EINVAL;
> inet_csk(sk)->icsk_rto_min = timeout;
> break;
> + case TCP_BPF_SOCK_OPS_CB_FLAGS:
> + if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
> + return -EINVAL;
> + tp->bpf_sock_ops_cb_flags = val;
> + break;
> default:
> return -EINVAL;
> }
> @@ -5366,6 +5371,16 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
> if (*optlen < 1)
> return -EINVAL;
> break;
> + case TCP_BPF_SOCK_OPS_CB_FLAGS:
> + if (*optlen != sizeof(int))
> + return -EINVAL;
> + if (getopt) {
> + struct tcp_sock *tp = tcp_sk(sk);
> +
> + memcpy(optval, &tp->bpf_sock_ops_cb_flags, *optlen);
bpf_sock_ops_cb_flags is a u8. memcpy with "*optlen == sizeof(int)" is an issue.
I fixed it up by assigning to a local int first. Applied. Thanks.
> + return 0;
> + }
> + return bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen);
> default:
> if (getopt)
> return -EINVAL;
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 35bcf52dbc65..d4d7efc34e67 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2851,7 +2851,7 @@ union bpf_attr {
> * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
> * **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
> * **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
> - * **TCP_BPF_RTO_MIN**.
> + * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
> * * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
> * * **IPPROTO_IPV6**, which supports the following *optname*\ s:
> * **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
> @@ -7080,6 +7080,7 @@ enum {
> TCP_BPF_SYN = 1005, /* Copy the TCP header */
> TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
> TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
> + TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
> };
>
> enum {
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-08-09 3:32 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-08 15:05 [PATCH v2 bpf-next 0/2] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
2024-08-08 15:05 ` [PATCH v2 bpf-next 1/2] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
2024-08-09 3:32 ` Martin KaFai Lau
2024-08-08 15:05 ` [PATCH v2 bpf-next 2/2] selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox