* [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt()
@ 2024-08-02 15:29 Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire
As previously discussed here [1], long-lived sockets can miss
a chance to set additional callbacks if a sock ops program
was not attached early in their lifetime. Adding support
to bpf_setsockopt() to set callback flags (and bpf_getsockopt()
to retrieve them) provides other opportunities to enable callbacks,
either directly via a cgroup/setsockopt intercepted setsockopt()
or via a socket iterator.
Patch 1 adds bpf_[get|set]sockopt() support; patch 2 adds testing
for it via sockops and cgroup/setsockopt programs and patch 3
extends the existing bpf_iter_setsockopt test to cover setting
sock ops flags via bpf_setsockopt() in iterator context.
[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/
Alan Maguire (3):
bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
selftests/bpf: modify bpf_iter_setsockopt to test
TCP_BPF_SOCK_OPS_CB_FLAGS
include/uapi/linux/bpf.h | 3 +-
net/core/filter.c | 16 ++++
tools/include/uapi/linux/bpf.h | 3 +-
.../bpf/prog_tests/bpf_iter_setsockopt.c | 83 +++++++++++++------
.../selftests/bpf/prog_tests/setget_sockopt.c | 11 +++
.../selftests/bpf/progs/bpf_iter_setsockopt.c | 76 ++++++++++++++---
.../selftests/bpf/progs/setget_sockopt.c | 37 ++++++++-
7 files changed, 188 insertions(+), 41 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
@ 2024-08-02 15:29 ` Alan Maguire
2024-08-06 19:54 ` Martin KaFai Lau
2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
2 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire
Currently the only opportunity to set sock ops flags dictating
which callbacks fire for a socket is from within a TCP-BPF sockops
program. This is problematic if the connection is already set up
as there is no further chance to specify callbacks for that socket.
Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt()
to allow users to specify callbacks later, either via an iterator
over sockets or via a socket-specific program triggered by a
setsockopt() on the socket.
Previous discussion on this here [1].
[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
include/uapi/linux/bpf.h | 3 ++-
net/core/filter.c | 16 ++++++++++++++++
tools/include/uapi/linux/bpf.h | 3 ++-
3 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
* **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
* **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
* **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * **TCP_BPF_RTO_MIN**.
+ * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
* * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
* * **IPPROTO_IPV6**, which supports the following *optname*\ s:
* **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
TCP_BPF_SYN = 1005, /* Copy the TCP header */
TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
+ TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
};
enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 78a6f746ea0b..570ca3f12175 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
return -EINVAL;
inet_csk(sk)->icsk_rto_min = timeout;
break;
+ case TCP_BPF_SOCK_OPS_CB_FLAGS:
+ if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
+ return -EINVAL;
+ tp->bpf_sock_ops_cb_flags = val;
+ break;
default:
return -EINVAL;
}
@@ -5366,6 +5371,17 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
if (*optlen < 1)
return -EINVAL;
break;
+ case TCP_BPF_SOCK_OPS_CB_FLAGS:
+ if (*optlen != sizeof(int))
+ return -EINVAL;
+ if (getopt) {
+ struct tcp_sock *tp = tcp_sk(sk);
+ int val = READ_ONCE(tp->bpf_sock_ops_cb_flags);
+
+ memcpy(optval, &val, *optlen);
+ return 0;
+ }
+ return bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen);
default:
if (getopt)
return -EINVAL;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
* **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
* **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
* **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * **TCP_BPF_RTO_MIN**.
+ * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
* * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
* * **IPPROTO_IPV6**, which supports the following *optname*\ s:
* **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
TCP_BPF_SYN = 1005, /* Copy the TCP header */
TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
+ TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
};
enum {
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
@ 2024-08-02 15:29 ` Alan Maguire
2024-08-06 21:27 ` Martin KaFai Lau
2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
2 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire
Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
bpf_setsockopt() and also add a cgroup/setsockopt program that
catches setsockopt() for this option and uses bpf_setsockopt()
to set it. The latter allows us to support modifying sockops
cb flags on a per-socket basis via setsockopt() without adding
support into core setsockopt() itself.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
.../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
.../selftests/bpf/progs/setget_sockopt.c | 37 +++++++++++++++++--
2 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
index 7d4a9b3d3722..b9c54217a489 100644
--- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
@@ -42,6 +42,7 @@ static int create_netns(void)
static void test_tcp(int family)
{
struct setget_sockopt__bss *bss = skel->bss;
+ int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG | BPF_SOCK_OPS_RTO_CB_FLAG;
int sfd, cfd;
memset(bss, 0, sizeof(*bss));
@@ -56,6 +57,9 @@ static void test_tcp(int family)
close(sfd);
return;
}
+ ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+ &cb_flags, sizeof(cb_flags)),
+ 0, "setsockopt cb_flags");
close(sfd);
close(cfd);
@@ -65,6 +69,8 @@ static void test_tcp(int family)
ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
+ ASSERT_GE(bss->nr_state, 1, "nr_state");
+ ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
}
static void test_udp(int family)
@@ -185,6 +191,11 @@ void test_setget_sockopt(void)
if (!ASSERT_OK_PTR(skel->links.socket_post_create, "attach_cgroup"))
goto done;
+ skel->links.tcp_setsockopt =
+ bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
+ if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
+ goto done;
+
test_tcp(AF_INET6);
test_tcp(AF_INET);
test_udp(AF_INET6);
diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c
index 60518aed1ffc..920af9e21e84 100644
--- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
@@ -20,6 +20,8 @@ int nr_connect;
int nr_binddev;
int nr_socket_post_create;
int nr_fin_wait1;
+int nr_state;
+int nr_setsockopt;
struct sockopt_test {
int opt;
@@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
{ .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
{ .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
{ .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
+ { .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new = BPF_SOCK_OPS_ALL_CB_FLAGS,
+ .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore = BPF_SOCK_OPS_STATE_CB_FLAG, },
{ .opt = 0, },
};
@@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx, struct sock *sk,
if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
return 1;
+
if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
tmp != expected)
return 1;
@@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
nr_passive += !(bpf_test_sockopt(skops, sk) ||
test_tcp_maxseg(skops, sk) ||
test_tcp_saved_syn(skops, sk));
- bpf_sock_ops_cb_flags_set(skops,
- skops->bpf_sock_ops_cb_flags |
- BPF_SOCK_OPS_STATE_CB_FLAG);
+
+ /* no need to set sockops cb flags here as sockopt
+ * tests and user-space originated setsockopt() will
+ * set flags to include BPF_SOCK_OPS_STATE_CB.
+ */
break;
case BPF_SOCK_OPS_STATE_CB:
+ nr_state++;
if (skops->args[1] == BPF_TCP_CLOSE_WAIT)
nr_fin_wait1 += !bpf_test_sockopt(skops, sk);
break;
@@ -397,4 +405,27 @@ int skops_sockopt(struct bpf_sock_ops *skops)
return 1;
}
+SEC("cgroup/setsockopt")
+int tcp_setsockopt(struct bpf_sockopt *ctx)
+{
+ struct bpf_sock *sk = ctx->sk;
+ __u8 *optval_end = ctx->optval_end;
+ __u8 *optval = ctx->optval;
+ int val = 0;
+
+ if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS)
+ return 1;
+ if (optval + sizeof(int) > optval_end)
+ return 0;
+ if (ctx->optlen != sizeof(int))
+ return 0;
+ val = *(int *)optval;
+ if (bpf_setsockopt(sk, ctx->level, ctx->optname, &val, sizeof(val)))
+ return 0;
+ nr_setsockopt++;
+ /* BPF has handled this no need to call "real" setsockopt() */
+ ctx->optlen = -1;
+ return 1;
+}
+
char _license[] SEC("license") = "GPL";
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS
2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
@ 2024-08-02 15:29 ` Alan Maguire
2024-08-06 21:42 ` Martin KaFai Lau
2 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
To: martin.lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire
Add support to test bpf_setsockopt(.., TCP_BPF_SOCK_OPS_CB_FLAGS, ...)
in BPF iterator context; use per-socket storage to store the new
value and retrieve it in a cgroup/getsockopt program we attach to
allow us to query TCP_BPF_SOCK_OPS_CB_FLAGS via getsockopt.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
.../bpf/prog_tests/bpf_iter_setsockopt.c | 83 +++++++++++++------
.../selftests/bpf/progs/bpf_iter_setsockopt.c | 76 ++++++++++++++---
2 files changed, 123 insertions(+), 36 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c
index 16bed9dd8e6a..42effafe8efe 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c
@@ -4,10 +4,13 @@
#include <sched.h>
#include <test_progs.h>
#include "network_helpers.h"
+#include "cgroup_helpers.h"
#include "bpf_dctcp.skel.h"
#include "bpf_cubic.skel.h"
#include "bpf_iter_setsockopt.skel.h"
+#define TEST_CGROUP "/test-iter-setsockopt"
+
static int create_netns(void)
{
if (!ASSERT_OK(unshare(CLONE_NEWNET), "create netns"))
@@ -32,17 +35,26 @@ static unsigned int set_bpf_cubic(int *fds, unsigned int nr_fds)
return nr_fds;
}
-static unsigned int check_bpf_dctcp(int *fds, unsigned int nr_fds)
+static unsigned int check_bpf_val(int *fds, unsigned int nr_fds, bool cong)
{
char tcp_cc[16];
- socklen_t optlen = sizeof(tcp_cc);
+ socklen_t cc_optlen = sizeof(tcp_cc);
+ int flags;
+ socklen_t flags_optlen = sizeof(flags);
unsigned int i;
for (i = 0; i < nr_fds; i++) {
- if (getsockopt(fds[i], SOL_TCP, TCP_CONGESTION,
- tcp_cc, &optlen) ||
- strcmp(tcp_cc, "bpf_dctcp"))
- return i;
+ if (cong) {
+ if (getsockopt(fds[i], SOL_TCP, TCP_CONGESTION,
+ tcp_cc, &cc_optlen) ||
+ strcmp(tcp_cc, "bpf_dctcp"))
+ return i;
+ } else {
+ if (getsockopt(fds[i], SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+ &flags, &flags_optlen) ||
+ flags != BPF_SOCK_OPS_ALL_CB_FLAGS)
+ return i;
+ }
}
return nr_fds;
@@ -102,7 +114,7 @@ static unsigned short get_local_port(int fd)
}
static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel,
- bool random_retry)
+ bool random_retry, bool cong)
{
int *reuse_listen_fds = NULL, *accepted_fds = NULL, *est_fds = NULL;
unsigned int nr_reuse_listens = 256, nr_est = 256;
@@ -140,9 +152,16 @@ static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel,
"get_local_port(reuse_listen_fds[0])"))
goto done;
- /* Run bpf tcp iter to switch from bpf_cubic to bpf_dctcp */
+ /* Run bpf tcp iter to change tcp value:
+ *
+ * - If cong is true, switch from bpf_cubic to bpf_dctcp;
+ * - If cong is false, use bpf_setsockopt() to set TCP sockops flags.
+ */
+
iter_skel->bss->random_retry = random_retry;
- iter_fd = bpf_iter_create(bpf_link__fd(iter_skel->links.change_tcp_cc));
+ iter_skel->bss->cong = cong;
+
+ iter_fd = bpf_iter_create(bpf_link__fd(iter_skel->links.change_tcp_val));
if (!ASSERT_GE(iter_fd, 0, "create iter_fd"))
goto done;
@@ -152,22 +171,21 @@ static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel,
if (!ASSERT_OK(err, "read iter error"))
goto done;
- /* Check reuseport listen fds for dctcp */
- ASSERT_EQ(check_bpf_dctcp(reuse_listen_fds, nr_reuse_listens),
+ /* Check reuseport listen fds */
+ ASSERT_EQ(check_bpf_val(reuse_listen_fds, nr_reuse_listens, cong),
nr_reuse_listens,
- "check reuse_listen_fds dctcp");
-
- /* Check non reuseport listen fd for dctcp */
- ASSERT_EQ(check_bpf_dctcp(&listen_fd, 1), 1,
- "check listen_fd dctcp");
+ "check reuse_listen_fds");
+ /* Check non reuseport listen fd */
+ ASSERT_EQ(check_bpf_val(&listen_fd, 1, cong), 1,
+ "check listen_fd");
- /* Check established fds for dctcp */
- ASSERT_EQ(check_bpf_dctcp(est_fds, nr_est), nr_est,
- "check est_fds dctcp");
+ /* Check established fds */
+ ASSERT_EQ(check_bpf_val(est_fds, nr_est, cong), nr_est,
+ "check est_fds");
- /* Check accepted fds for dctcp */
- ASSERT_EQ(check_bpf_dctcp(accepted_fds, nr_est), nr_est,
- "check accepted_fds dctcp");
+ /* Check accepted fds */
+ ASSERT_EQ(check_bpf_val(accepted_fds, nr_est, cong), nr_est,
+ "check accepted_fds");
done:
if (iter_fd != -1)
@@ -186,6 +204,8 @@ void serial_test_bpf_iter_setsockopt(void)
struct bpf_dctcp *dctcp_skel = NULL;
struct bpf_link *cubic_link = NULL;
struct bpf_link *dctcp_link = NULL;
+ struct bpf_link *getsockopt_link = NULL;
+ int cgroup_fd;
if (create_netns())
return;
@@ -194,8 +214,9 @@ void serial_test_bpf_iter_setsockopt(void)
iter_skel = bpf_iter_setsockopt__open_and_load();
if (!ASSERT_OK_PTR(iter_skel, "iter_skel"))
return;
- iter_skel->links.change_tcp_cc = bpf_program__attach_iter(iter_skel->progs.change_tcp_cc, NULL);
- if (!ASSERT_OK_PTR(iter_skel->links.change_tcp_cc, "attach iter"))
+ iter_skel->links.change_tcp_val = bpf_program__attach_iter(iter_skel->progs.change_tcp_val,
+ NULL);
+ if (!ASSERT_OK_PTR(iter_skel->links.change_tcp_val, "attach iter"))
goto done;
/* Load bpf_cubic */
@@ -214,13 +235,23 @@ void serial_test_bpf_iter_setsockopt(void)
if (!ASSERT_OK_PTR(dctcp_link, "dctcp_link"))
goto done;
- do_bpf_iter_setsockopt(iter_skel, true);
- do_bpf_iter_setsockopt(iter_skel, false);
+ cgroup_fd = cgroup_setup_and_join(TEST_CGROUP);
+ if (!ASSERT_OK_FD(cgroup_fd, "cgroup switch"))
+ goto done;
+ getsockopt_link = bpf_program__attach_cgroup(iter_skel->progs._getsockopt, cgroup_fd);
+ if (!ASSERT_OK_PTR(getsockopt_link, "getsockopt prog"))
+ goto done;
+ do_bpf_iter_setsockopt(iter_skel, true, true);
+ do_bpf_iter_setsockopt(iter_skel, false, true);
+ do_bpf_iter_setsockopt(iter_skel, true, false);
+ do_bpf_iter_setsockopt(iter_skel, false, false);
done:
bpf_link__destroy(cubic_link);
bpf_link__destroy(dctcp_link);
+ bpf_link__destroy(getsockopt_link);
bpf_cubic__destroy(cubic_skel);
bpf_dctcp__destroy(dctcp_skel);
bpf_iter_setsockopt__destroy(iter_skel);
+ cleanup_cgroup_environment();
}
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c b/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c
index ec7f91850dec..60752a7ebdf8 100644
--- a/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c
+++ b/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c
@@ -5,6 +5,13 @@
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
+struct {
+ __uint(type, BPF_MAP_TYPE_SK_STORAGE);
+ __uint(map_flags, BPF_F_NO_PREALLOC);
+ __type(key, int);
+ __type(value, int);
+} sk_map SEC(".maps");
+
#define bpf_tcp_sk(skc) ({ \
struct sock_common *_skc = skc; \
sk = NULL; \
@@ -21,6 +28,7 @@ unsigned short listen_hport = 0;
char cubic_cc[TCP_CA_NAME_MAX] = "bpf_cubic";
char dctcp_cc[TCP_CA_NAME_MAX] = "bpf_dctcp";
bool random_retry = false;
+bool cong = false;
static bool tcp_cc_eq(const char *a, const char *b)
{
@@ -36,10 +44,32 @@ static bool tcp_cc_eq(const char *a, const char *b)
return true;
}
+/* This program is used to intercept getsockopt() calls, providing
+ * the value of bpf_sock_ops_cb_flags for the socket; this value
+ * has been saved in per-socket storage earlier via the iterator
+ * program.
+ */
+SEC("cgroup/getsockopt")
+int _getsockopt(struct bpf_sockopt *ctx)
+{
+ struct bpf_sock *sk = ctx->sk;
+ int *optval = ctx->optval;
+ int *sk_storage = 0;
+
+ if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS)
+ return 1;
+ sk_storage = bpf_sk_storage_get(&sk_map, sk, 0, 0);
+ if (sk_storage) {
+ if (ctx->optval + sizeof(int) <= ctx->optval_end)
+ *optval = *sk_storage;
+ ctx->retval = 0;
+ }
+ return 1;
+}
+
SEC("iter/tcp")
-int change_tcp_cc(struct bpf_iter__tcp *ctx)
+int change_tcp_val(struct bpf_iter__tcp *ctx)
{
- char cur_cc[TCP_CA_NAME_MAX];
struct tcp_sock *tp;
struct sock *sk;
@@ -54,17 +84,43 @@ int change_tcp_cc(struct bpf_iter__tcp *ctx)
bpf_ntohs(sk->sk_dport) != listen_hport))
return 0;
- if (bpf_getsockopt(tp, SOL_TCP, TCP_CONGESTION,
- cur_cc, sizeof(cur_cc)))
- return 0;
+ if (cong) {
+ char cur_cc[TCP_CA_NAME_MAX];
- if (!tcp_cc_eq(cur_cc, cubic_cc))
- return 0;
+ if (bpf_getsockopt(tp, SOL_TCP, TCP_CONGESTION,
+ cur_cc, sizeof(cur_cc)))
+ return 0;
- if (random_retry && bpf_get_prandom_u32() % 4 == 1)
- return 1;
+ if (!tcp_cc_eq(cur_cc, cubic_cc))
+ return 0;
+
+ if (random_retry && bpf_get_prandom_u32() % 4 == 1)
+ return 1;
+
+ bpf_setsockopt(tp, SOL_TCP, TCP_CONGESTION, dctcp_cc, sizeof(dctcp_cc));
+ } else {
+ int val, newval = BPF_SOCK_OPS_ALL_CB_FLAGS;
+ int *sk_storage;
- bpf_setsockopt(tp, SOL_TCP, TCP_CONGESTION, dctcp_cc, sizeof(dctcp_cc));
+ if (bpf_getsockopt(tp, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+ &val, sizeof(val)))
+ return 0;
+
+ if (val == newval)
+ return 0;
+
+ if (random_retry && bpf_get_prandom_u32() % 4 == 1)
+ return 1;
+
+ if (bpf_setsockopt(tp, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+ &newval, sizeof(newval)))
+ return 0;
+ /* store flags value for retrieval in cgroup/getsockopt prog */
+ sk_storage = bpf_sk_storage_get(&sk_map, sk, 0,
+ BPF_SK_STORAGE_GET_F_CREATE);
+ if (sk_storage)
+ *sk_storage = newval;
+ }
return 0;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
@ 2024-08-06 19:54 ` Martin KaFai Lau
2024-08-07 17:53 ` Alan Maguire
0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-06 19:54 UTC (permalink / raw)
To: Alan Maguire
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf
On 8/2/24 8:29 AM, Alan Maguire wrote:
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 78a6f746ea0b..570ca3f12175 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
> return -EINVAL;
> inet_csk(sk)->icsk_rto_min = timeout;
> break;
> + case TCP_BPF_SOCK_OPS_CB_FLAGS:
> + if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
> + return -EINVAL;
> + tp->bpf_sock_ops_cb_flags = val;
> + break;
> default:
> return -EINVAL;
> }
> @@ -5366,6 +5371,17 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
> if (*optlen < 1)
> return -EINVAL;
> break;
> + case TCP_BPF_SOCK_OPS_CB_FLAGS:
> + if (*optlen != sizeof(int))
> + return -EINVAL;
> + if (getopt) {
> + struct tcp_sock *tp = tcp_sk(sk);
> + int val = READ_ONCE(tp->bpf_sock_ops_cb_flags);
READ_ONCE() here looks suspicious. There is no existing WRITE_ONCE.
Is it needed? The read here should have already passed the sock_owned_by_me
test. The existing write side should also have the sock_owned_by_me.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
@ 2024-08-06 21:27 ` Martin KaFai Lau
2024-08-07 17:58 ` Alan Maguire
0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-06 21:27 UTC (permalink / raw)
To: Alan Maguire
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf
On 8/2/24 8:29 AM, Alan Maguire wrote:
> Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
> bpf_setsockopt() and also add a cgroup/setsockopt program that
> catches setsockopt() for this option and uses bpf_setsockopt()
> to set it. The latter allows us to support modifying sockops
> cb flags on a per-socket basis via setsockopt() without adding
> support into core setsockopt() itself.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
> .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
> .../selftests/bpf/progs/setget_sockopt.c | 37 +++++++++++++++++--
> 2 files changed, 45 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
> index 7d4a9b3d3722..b9c54217a489 100644
> --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
> +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
> @@ -42,6 +42,7 @@ static int create_netns(void)
> static void test_tcp(int family)
> {
> struct setget_sockopt__bss *bss = skel->bss;
> + int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG | BPF_SOCK_OPS_RTO_CB_FLAG;
> int sfd, cfd;
>
> memset(bss, 0, sizeof(*bss));
> @@ -56,6 +57,9 @@ static void test_tcp(int family)
> close(sfd);
> return;
> }
> + ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
> + &cb_flags, sizeof(cb_flags)),
> + 0, "setsockopt cb_flags");
The sfd here is the listen fd. The setsockopt here is after the connection is
established and the connection won't be affected by this setsockopt...
I don't think this test belongs to test_tcp() here, more on this below...
> close(sfd);
> close(cfd);
>
> @@ -65,6 +69,8 @@ static void test_tcp(int family)
> ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
> ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
> ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
> + ASSERT_GE(bss->nr_state, 1, "nr_state");
> + ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
> }
>
> static void test_udp(int family)
> @@ -185,6 +191,11 @@ void test_setget_sockopt(void)
> if (!ASSERT_OK_PTR(skel->links.socket_post_create, "attach_cgroup"))
> goto done;
>
> + skel->links.tcp_setsockopt =
> + bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
> + if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
> + goto done;
> +
> test_tcp(AF_INET6);
> test_tcp(AF_INET);
> test_udp(AF_INET6);
> diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c
> index 60518aed1ffc..920af9e21e84 100644
> --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
> +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
> @@ -20,6 +20,8 @@ int nr_connect;
> int nr_binddev;
> int nr_socket_post_create;
> int nr_fin_wait1;
> +int nr_state;
> +int nr_setsockopt;
>
> struct sockopt_test {
> int opt;
> @@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
> { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
> { .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
> { .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
> + { .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new = BPF_SOCK_OPS_ALL_CB_FLAGS,
> + .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore = BPF_SOCK_OPS_STATE_CB_FLAG, },
> { .opt = 0, },
> };
>
> @@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx, struct sock *sk,
>
> if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
> return 1;
> +
> if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
> tmp != expected)
> return 1;
> @@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
> nr_passive += !(bpf_test_sockopt(skops, sk) ||
> test_tcp_maxseg(skops, sk) ||
> test_tcp_saved_syn(skops, sk));
> - bpf_sock_ops_cb_flags_set(skops,
> - skops->bpf_sock_ops_cb_flags |
> - BPF_SOCK_OPS_STATE_CB_FLAG);
> +
> + /* no need to set sockops cb flags here as sockopt
> + * tests and user-space originated setsockopt() will
> + * set flags to include BPF_SOCK_OPS_STATE_CB.
> + */
I don't think the "user-space originated..." comment is accruate here. The
user-space originated setsockopt() from the above test_tcp() won't affect the
passively established sk here. This took me a while to digest...
iiuc, the removed bpf_sock_ops_cb_flags_set() for the passive connection is now
implicitly done (and hidden) in the newly added sol_tcp_tests[].restore.
How about keeping the explicit bpf_sock_ops_cb_flags_set() and removing the
".restore".
The existing bpf_sock_ops_cb_flags_set() can be changed to
bpf_setsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS) if it helps to test if it is effective.
> break;
> case BPF_SOCK_OPS_STATE_CB:
> + nr_state++;
How about removing the nr_state addition and adding a SEC("cgroup/getsockopt")
bpf prog to test the getsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS).
Create another test_nonstandard_opt() in prog_tests/setget_sockopt.c and
separate it from the existing test_{tcp, udp...} which is mainly exercising
set/getsockopt(sol_*_tests[]) on different hooks (right now it has
lsm_cgroup/socket_post_create and sockops). It doesn't fit testing the user
syscall set/getsockopt on the nonstandard_opt.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS
2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
@ 2024-08-06 21:42 ` Martin KaFai Lau
0 siblings, 0 replies; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-06 21:42 UTC (permalink / raw)
To: Alan Maguire
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf
On 8/2/24 8:29 AM, Alan Maguire wrote:
> Add support to test bpf_setsockopt(.., TCP_BPF_SOCK_OPS_CB_FLAGS, ...)
> in BPF iterator context; use per-socket storage to store the new
> value and retrieve it in a cgroup/getsockopt program we attach to
> allow us to query TCP_BPF_SOCK_OPS_CB_FLAGS via getsockopt.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
> .../bpf/prog_tests/bpf_iter_setsockopt.c | 83 +++++++++++++------
> .../selftests/bpf/progs/bpf_iter_setsockopt.c | 76 ++++++++++++++---
There are too many code churns to reuse this test to test a new
TCP_BPF_SOCK_OPS_CB_FLAGS sockopt. This is not the right test to reuse. It was
created mainly to test if the tcp batching logic can survive the bpf-iter's
seq_stop.
I don't think it needs a separate bpf_set/getsockopt test specifically for the
bpf iter prog. The test in patch 2 should be enough.
pw-bot: cr
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
2024-08-06 19:54 ` Martin KaFai Lau
@ 2024-08-07 17:53 ` Alan Maguire
0 siblings, 0 replies; 10+ messages in thread
From: Alan Maguire @ 2024-08-07 17:53 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf
On 06/08/2024 20:54, Martin KaFai Lau wrote:
> On 8/2/24 8:29 AM, Alan Maguire wrote:
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 78a6f746ea0b..570ca3f12175 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock
>> *sk, int optname,
>> return -EINVAL;
>> inet_csk(sk)->icsk_rto_min = timeout;
>> break;
>> + case TCP_BPF_SOCK_OPS_CB_FLAGS:
>> + if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
>> + return -EINVAL;
>> + tp->bpf_sock_ops_cb_flags = val;
>> + break;
>> default:
>> return -EINVAL;
>> }
>> @@ -5366,6 +5371,17 @@ static int sol_tcp_sockopt(struct sock *sk, int
>> optname,
>> if (*optlen < 1)
>> return -EINVAL;
>> break;
>> + case TCP_BPF_SOCK_OPS_CB_FLAGS:
>> + if (*optlen != sizeof(int))
>> + return -EINVAL;
>> + if (getopt) {
>> + struct tcp_sock *tp = tcp_sk(sk);
>> + int val = READ_ONCE(tp->bpf_sock_ops_cb_flags);
>
> READ_ONCE() here looks suspicious. There is no existing WRITE_ONCE.
>
> Is it needed? The read here should have already passed the
> sock_owned_by_me test. The existing write side should also have the
> sock_owned_by_me.
>
>
Yep, you're right. Will remove for v2. Thanks for taking a look!
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
2024-08-06 21:27 ` Martin KaFai Lau
@ 2024-08-07 17:58 ` Alan Maguire
2024-08-07 21:14 ` Martin KaFai Lau
0 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-07 17:58 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf
On 06/08/2024 22:27, Martin KaFai Lau wrote:
> On 8/2/24 8:29 AM, Alan Maguire wrote:
>> Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
>> bpf_setsockopt() and also add a cgroup/setsockopt program that
>> catches setsockopt() for this option and uses bpf_setsockopt()
>> to set it. The latter allows us to support modifying sockops
>> cb flags on a per-socket basis via setsockopt() without adding
>> support into core setsockopt() itself.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>> .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
>> .../selftests/bpf/progs/setget_sockopt.c | 37 +++++++++++++++++--
>> 2 files changed, 45 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> index 7d4a9b3d3722..b9c54217a489 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> @@ -42,6 +42,7 @@ static int create_netns(void)
>> static void test_tcp(int family)
>> {
>> struct setget_sockopt__bss *bss = skel->bss;
>> + int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG |
>> BPF_SOCK_OPS_RTO_CB_FLAG;
>> int sfd, cfd;
>> memset(bss, 0, sizeof(*bss));
>> @@ -56,6 +57,9 @@ static void test_tcp(int family)
>> close(sfd);
>> return;
>> }
>> + ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
>> + &cb_flags, sizeof(cb_flags)),
>> + 0, "setsockopt cb_flags");
>
> The sfd here is the listen fd. The setsockopt here is after the
> connection is established and the connection won't be affected by this
> setsockopt...
>
> I don't think this test belongs to test_tcp() here, more on this below...
>
>> close(sfd);
>> close(cfd);
>> @@ -65,6 +69,8 @@ static void test_tcp(int family)
>> ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
>> ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
>> ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
>> + ASSERT_GE(bss->nr_state, 1, "nr_state");
>> + ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
>> }
>> static void test_udp(int family)
>> @@ -185,6 +191,11 @@ void test_setget_sockopt(void)
>> if (!ASSERT_OK_PTR(skel->links.socket_post_create,
>> "attach_cgroup"))
>> goto done;
>> + skel->links.tcp_setsockopt =
>> + bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
>> + if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
>> + goto done;
>> +
>> test_tcp(AF_INET6);
>> test_tcp(AF_INET);
>> test_udp(AF_INET6);
>> diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> index 60518aed1ffc..920af9e21e84 100644
>> --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> @@ -20,6 +20,8 @@ int nr_connect;
>> int nr_binddev;
>> int nr_socket_post_create;
>> int nr_fin_wait1;
>> +int nr_state;
>> +int nr_setsockopt;
>> struct sockopt_test {
>> int opt;
>> @@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
>> { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
>> { .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
>> { .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
>> + { .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new =
>> BPF_SOCK_OPS_ALL_CB_FLAGS,
>> + .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore =
>> BPF_SOCK_OPS_STATE_CB_FLAG, },
>> { .opt = 0, },
>> };
>> @@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx,
>> struct sock *sk,
>> if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
>> return 1;
>> +
>> if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
>> tmp != expected)
>> return 1;
>> @@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
>> nr_passive += !(bpf_test_sockopt(skops, sk) ||
>> test_tcp_maxseg(skops, sk) ||
>> test_tcp_saved_syn(skops, sk));
>> - bpf_sock_ops_cb_flags_set(skops,
>> - skops->bpf_sock_ops_cb_flags |
>> - BPF_SOCK_OPS_STATE_CB_FLAG);
>> +
>> + /* no need to set sockops cb flags here as sockopt
>> + * tests and user-space originated setsockopt() will
>> + * set flags to include BPF_SOCK_OPS_STATE_CB.
>> + */
>
> I don't think the "user-space originated..." comment is accruate here.
> The user-space originated setsockopt() from the above test_tcp() won't
> affect the passively established sk here. This took me a while to digest...
>
> iiuc, the removed bpf_sock_ops_cb_flags_set() for the passive connection
> is now implicitly done (and hidden) in the newly added
> sol_tcp_tests[].restore.
>
> How about keeping the explicit bpf_sock_ops_cb_flags_set() and removing
> the ".restore".
>
> The existing bpf_sock_ops_cb_flags_set() can be changed to
> bpf_setsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS) if it helps to test if it is
> effective.
Sounds good; I'll make that change and avoid changing test_tcp() etc.
>
>> break;
>> case BPF_SOCK_OPS_STATE_CB:
>> + nr_state++;
>
> How about removing the nr_state addition and adding a
> SEC("cgroup/getsockopt") bpf prog to test the
> getsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS).
>
Will do. Given that currently we cannot call bpf_getsockopt() from
cgroup/getsockopt progs it might make sense to use per-socket storage to
store the cb flags value that we set via bpf_setsockopt() in the sock
ops program, and retrieve it in the cgroup/getsockopt prog. Does that
sound ok?
> Create another test_nonstandard_opt() in prog_tests/setget_sockopt.c and
> separate it from the existing test_{tcp, udp...} which is mainly
> exercising set/getsockopt(sol_*_tests[]) on different hooks (right now
> it has lsm_cgroup/socket_post_create and sockops). It doesn't fit
> testing the user syscall set/getsockopt on the nonstandard_opt.
Sounds good. I'll also drop the test in patch 3 as you suggest for v2.
Thanks again!
Alan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
2024-08-07 17:58 ` Alan Maguire
@ 2024-08-07 21:14 ` Martin KaFai Lau
0 siblings, 0 replies; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-07 21:14 UTC (permalink / raw)
To: Alan Maguire
Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf
On 8/7/24 10:58 AM, Alan Maguire wrote:
> On 06/08/2024 22:27, Martin KaFai Lau wrote:
>> On 8/2/24 8:29 AM, Alan Maguire wrote:
>>> Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
>>> bpf_setsockopt() and also add a cgroup/setsockopt program that
>>> catches setsockopt() for this option and uses bpf_setsockopt()
>>> to set it. The latter allows us to support modifying sockops
>>> cb flags on a per-socket basis via setsockopt() without adding
>>> support into core setsockopt() itself.
>>>
>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>> ---
>>> .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
>>> .../selftests/bpf/progs/setget_sockopt.c | 37 +++++++++++++++++--
>>> 2 files changed, 45 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> index 7d4a9b3d3722..b9c54217a489 100644
>>> --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> @@ -42,6 +42,7 @@ static int create_netns(void)
>>> static void test_tcp(int family)
>>> {
>>> struct setget_sockopt__bss *bss = skel->bss;
>>> + int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG |
>>> BPF_SOCK_OPS_RTO_CB_FLAG;
>>> int sfd, cfd;
>>> memset(bss, 0, sizeof(*bss));
>>> @@ -56,6 +57,9 @@ static void test_tcp(int family)
>>> close(sfd);
>>> return;
>>> }
>>> + ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
>>> + &cb_flags, sizeof(cb_flags)),
>>> + 0, "setsockopt cb_flags");
>>
>> The sfd here is the listen fd. The setsockopt here is after the
>> connection is established and the connection won't be affected by this
>> setsockopt...
>>
>> I don't think this test belongs to test_tcp() here, more on this below...
>>
>>> close(sfd);
>>> close(cfd);
>>> @@ -65,6 +69,8 @@ static void test_tcp(int family)
>>> ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
>>> ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
>>> ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
>>> + ASSERT_GE(bss->nr_state, 1, "nr_state");
>>> + ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
>>> }
>>> static void test_udp(int family)
>>> @@ -185,6 +191,11 @@ void test_setget_sockopt(void)
>>> if (!ASSERT_OK_PTR(skel->links.socket_post_create,
>>> "attach_cgroup"))
>>> goto done;
>>> + skel->links.tcp_setsockopt =
>>> + bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
>>> + if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
>>> + goto done;
>>> +
>>> test_tcp(AF_INET6);
>>> test_tcp(AF_INET);
>>> test_udp(AF_INET6);
>>> diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> index 60518aed1ffc..920af9e21e84 100644
>>> --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> @@ -20,6 +20,8 @@ int nr_connect;
>>> int nr_binddev;
>>> int nr_socket_post_create;
>>> int nr_fin_wait1;
>>> +int nr_state;
>>> +int nr_setsockopt;
>>> struct sockopt_test {
>>> int opt;
>>> @@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
>>> { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
>>> { .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
>>> { .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
>>> + { .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new =
>>> BPF_SOCK_OPS_ALL_CB_FLAGS,
>>> + .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore =
>>> BPF_SOCK_OPS_STATE_CB_FLAG, },
>>> { .opt = 0, },
>>> };
>>> @@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx,
>>> struct sock *sk,
>>> if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
>>> return 1;
>>> +
>>> if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
>>> tmp != expected)
>>> return 1;
>>> @@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
>>> nr_passive += !(bpf_test_sockopt(skops, sk) ||
>>> test_tcp_maxseg(skops, sk) ||
>>> test_tcp_saved_syn(skops, sk));
>>> - bpf_sock_ops_cb_flags_set(skops,
>>> - skops->bpf_sock_ops_cb_flags |
>>> - BPF_SOCK_OPS_STATE_CB_FLAG);
>>> +
>>> + /* no need to set sockops cb flags here as sockopt
>>> + * tests and user-space originated setsockopt() will
>>> + * set flags to include BPF_SOCK_OPS_STATE_CB.
>>> + */
>>
>> I don't think the "user-space originated..." comment is accruate here.
>> The user-space originated setsockopt() from the above test_tcp() won't
>> affect the passively established sk here. This took me a while to digest...
>>
>> iiuc, the removed bpf_sock_ops_cb_flags_set() for the passive connection
>> is now implicitly done (and hidden) in the newly added
>> sol_tcp_tests[].restore.
>>
>> How about keeping the explicit bpf_sock_ops_cb_flags_set() and removing
>> the ".restore".
>>
>> The existing bpf_sock_ops_cb_flags_set() can be changed to
>> bpf_setsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS) if it helps to test if it is
>> effective.
>
> Sounds good; I'll make that change and avoid changing test_tcp() etc.
>>
>>> break;
>>> case BPF_SOCK_OPS_STATE_CB:
>>> + nr_state++;
>>
>> How about removing the nr_state addition and adding a
>> SEC("cgroup/getsockopt") bpf prog to test the
>> getsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS).
>>
>
> Will do. Given that currently we cannot call bpf_getsockopt() from
> cgroup/getsockopt progs it might make sense to use per-socket storage to
> store the cb flags value that we set via bpf_setsockopt() in the sock
> ops program, and retrieve it in the cgroup/getsockopt prog. Does that
> sound ok?
The bpf_sock_ops_cb_flags can be directly inspected in cgroup/getsockopt
with the help of bpf_core_cast(). The following is based on the patch3's
_getsockopt (untested):
#include <vmlinux.h>
#include <bpf/bpf_core_read.h>
SEC("cgroup/getsockopt")
int _getsockopt(struct bpf_sockopt *ctx)
{
struct bpf_sock *sk = ctx->sk;
int *optval = ctx->optval;
struct tcp_sock *tp;
if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS || sk->protocol != IPPROTO_TCP ||
ctx->optval + sizeof(int) > ctx->optval_end)
return 1;
tp = bpf_core_cast(sk, struct tcp_sock);
*optval = tp->bpf_sock_ops_cb_flags;
ctx->retval = 0;
return 1;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-08-07 21:14 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
2024-08-06 19:54 ` Martin KaFai Lau
2024-08-07 17:53 ` Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
2024-08-06 21:27 ` Martin KaFai Lau
2024-08-07 17:58 ` Alan Maguire
2024-08-07 21:14 ` Martin KaFai Lau
2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
2024-08-06 21:42 ` Martin KaFai Lau
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox