public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt()
@ 2024-08-02 15:29 Alan Maguire
  2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
  To: martin.lau
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire

As previously discussed here [1], long-lived sockets can miss
a chance to set additional callbacks if a sock ops program
was not attached early in their lifetime.  Adding support
to bpf_setsockopt() to set callback flags (and bpf_getsockopt()
to retrieve them) provides other opportunities to enable callbacks,
either directly via a cgroup/setsockopt intercepted setsockopt()
or via a socket iterator.

Patch 1 adds bpf_[get|set]sockopt() support; patch 2 adds testing
for it via sockops and cgroup/setsockopt programs and patch 3
extends the existing bpf_iter_setsockopt test to cover setting
sock ops flags via bpf_setsockopt() in iterator context.

[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/

Alan Maguire (3):
  bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
  selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
  selftests/bpf: modify bpf_iter_setsockopt to test
    TCP_BPF_SOCK_OPS_CB_FLAGS

 include/uapi/linux/bpf.h                      |  3 +-
 net/core/filter.c                             | 16 ++++
 tools/include/uapi/linux/bpf.h                |  3 +-
 .../bpf/prog_tests/bpf_iter_setsockopt.c      | 83 +++++++++++++------
 .../selftests/bpf/prog_tests/setget_sockopt.c | 11 +++
 .../selftests/bpf/progs/bpf_iter_setsockopt.c | 76 ++++++++++++++---
 .../selftests/bpf/progs/setget_sockopt.c      | 37 ++++++++-
 7 files changed, 188 insertions(+), 41 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
  2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
@ 2024-08-02 15:29 ` Alan Maguire
  2024-08-06 19:54   ` Martin KaFai Lau
  2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
  2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
  2 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
  To: martin.lau
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire

Currently the only opportunity to set sock ops flags dictating
which callbacks fire for a socket is from within a TCP-BPF sockops
program.  This is problematic if the connection is already set up
as there is no further chance to specify callbacks for that socket.
Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt()
to allow users to specify callbacks later, either via an iterator
over sockets or via a socket-specific program triggered by a
setsockopt() on the socket.

Previous discussion on this here [1].

[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 include/uapi/linux/bpf.h       |  3 ++-
 net/core/filter.c              | 16 ++++++++++++++++
 tools/include/uapi/linux/bpf.h |  3 ++-
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
  * 		  **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
  * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
  * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * 		  **TCP_BPF_RTO_MIN**.
+ *		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
  * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
  * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
  * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
 	TCP_BPF_SYN		= 1005, /* Copy the TCP header */
 	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
 	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
+	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
 };
 
 enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 78a6f746ea0b..570ca3f12175 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
 			return -EINVAL;
 		inet_csk(sk)->icsk_rto_min = timeout;
 		break;
+	case TCP_BPF_SOCK_OPS_CB_FLAGS:
+		if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
+			return -EINVAL;
+		tp->bpf_sock_ops_cb_flags = val;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -5366,6 +5371,17 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
 		if (*optlen < 1)
 			return -EINVAL;
 		break;
+	case TCP_BPF_SOCK_OPS_CB_FLAGS:
+		if (*optlen != sizeof(int))
+			return -EINVAL;
+		if (getopt) {
+			struct tcp_sock *tp = tcp_sk(sk);
+			int val = READ_ONCE(tp->bpf_sock_ops_cb_flags);
+
+			memcpy(optval, &val, *optlen);
+			return 0;
+		}
+		return bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen);
 	default:
 		if (getopt)
 			return -EINVAL;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 35bcf52dbc65..d4d7efc34e67 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2851,7 +2851,7 @@ union bpf_attr {
  * 		  **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
  * 		  **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
  * 		  **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
- * 		  **TCP_BPF_RTO_MIN**.
+ *		  **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
  * 		* **IPPROTO_IP**, which supports *optname* **IP_TOS**.
  * 		* **IPPROTO_IPV6**, which supports the following *optname*\ s:
  * 		  **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -7080,6 +7080,7 @@ enum {
 	TCP_BPF_SYN		= 1005, /* Copy the TCP header */
 	TCP_BPF_SYN_IP		= 1006, /* Copy the IP[46] and TCP header */
 	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
+	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */
 };
 
 enum {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
  2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
  2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
@ 2024-08-02 15:29 ` Alan Maguire
  2024-08-06 21:27   ` Martin KaFai Lau
  2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
  2 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
  To: martin.lau
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire

Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
bpf_setsockopt() and also add a cgroup/setsockopt program that
catches setsockopt() for this option and uses bpf_setsockopt()
to set it.  The latter allows us to support modifying sockops
cb flags on a per-socket basis via setsockopt() without adding
support into core setsockopt() itself.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
 .../selftests/bpf/progs/setget_sockopt.c      | 37 +++++++++++++++++--
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
index 7d4a9b3d3722..b9c54217a489 100644
--- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
@@ -42,6 +42,7 @@ static int create_netns(void)
 static void test_tcp(int family)
 {
 	struct setget_sockopt__bss *bss = skel->bss;
+	int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG | BPF_SOCK_OPS_RTO_CB_FLAG;
 	int sfd, cfd;
 
 	memset(bss, 0, sizeof(*bss));
@@ -56,6 +57,9 @@ static void test_tcp(int family)
 		close(sfd);
 		return;
 	}
+	ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+			     &cb_flags, sizeof(cb_flags)),
+		  0, "setsockopt cb_flags");
 	close(sfd);
 	close(cfd);
 
@@ -65,6 +69,8 @@ static void test_tcp(int family)
 	ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
 	ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
 	ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
+	ASSERT_GE(bss->nr_state, 1, "nr_state");
+	ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
 }
 
 static void test_udp(int family)
@@ -185,6 +191,11 @@ void test_setget_sockopt(void)
 	if (!ASSERT_OK_PTR(skel->links.socket_post_create, "attach_cgroup"))
 		goto done;
 
+	skel->links.tcp_setsockopt =
+		bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
+	if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
+		goto done;
+
 	test_tcp(AF_INET6);
 	test_tcp(AF_INET);
 	test_udp(AF_INET6);
diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c
index 60518aed1ffc..920af9e21e84 100644
--- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
+++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
@@ -20,6 +20,8 @@ int nr_connect;
 int nr_binddev;
 int nr_socket_post_create;
 int nr_fin_wait1;
+int nr_state;
+int nr_setsockopt;
 
 struct sockopt_test {
 	int opt;
@@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
 	{ .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
 	{ .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
 	{ .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
+	{ .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new = BPF_SOCK_OPS_ALL_CB_FLAGS,
+	  .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore = BPF_SOCK_OPS_STATE_CB_FLAG, },
 	{ .opt = 0, },
 };
 
@@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx, struct sock *sk,
 
 	if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
 		return 1;
+
 	if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
 	    tmp != expected)
 		return 1;
@@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
 		nr_passive += !(bpf_test_sockopt(skops, sk) ||
 				test_tcp_maxseg(skops, sk) ||
 				test_tcp_saved_syn(skops, sk));
-		bpf_sock_ops_cb_flags_set(skops,
-					  skops->bpf_sock_ops_cb_flags |
-					  BPF_SOCK_OPS_STATE_CB_FLAG);
+
+		/* no need to set sockops cb flags here as sockopt
+		 * tests and user-space originated setsockopt() will
+		 * set flags to include BPF_SOCK_OPS_STATE_CB.
+		 */
 		break;
 	case BPF_SOCK_OPS_STATE_CB:
+		nr_state++;
 		if (skops->args[1] == BPF_TCP_CLOSE_WAIT)
 			nr_fin_wait1 += !bpf_test_sockopt(skops, sk);
 		break;
@@ -397,4 +405,27 @@ int skops_sockopt(struct bpf_sock_ops *skops)
 	return 1;
 }
 
+SEC("cgroup/setsockopt")
+int tcp_setsockopt(struct bpf_sockopt *ctx)
+{
+	struct bpf_sock *sk = ctx->sk;
+	__u8 *optval_end = ctx->optval_end;
+	__u8 *optval = ctx->optval;
+	int val = 0;
+
+	if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS)
+		return 1;
+	if (optval + sizeof(int) > optval_end)
+		return 0;
+	if (ctx->optlen != sizeof(int))
+		return 0;
+	val = *(int *)optval;
+	if (bpf_setsockopt(sk, ctx->level, ctx->optname, &val, sizeof(val)))
+		return 0;
+	nr_setsockopt++;
+	/* BPF has handled this no need to call "real" setsockopt() */
+	ctx->optlen = -1;
+	return 1;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS
  2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
  2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
  2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
@ 2024-08-02 15:29 ` Alan Maguire
  2024-08-06 21:42   ` Martin KaFai Lau
  2 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-02 15:29 UTC (permalink / raw)
  To: martin.lau
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf, Alan Maguire

Add support to test bpf_setsockopt(.., TCP_BPF_SOCK_OPS_CB_FLAGS, ...)
in BPF iterator context; use per-socket storage to store the new
value and retrieve it in a cgroup/getsockopt program we attach to
allow us to query TCP_BPF_SOCK_OPS_CB_FLAGS via getsockopt.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../bpf/prog_tests/bpf_iter_setsockopt.c      | 83 +++++++++++++------
 .../selftests/bpf/progs/bpf_iter_setsockopt.c | 76 ++++++++++++++---
 2 files changed, 123 insertions(+), 36 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c
index 16bed9dd8e6a..42effafe8efe 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c
@@ -4,10 +4,13 @@
 #include <sched.h>
 #include <test_progs.h>
 #include "network_helpers.h"
+#include "cgroup_helpers.h"
 #include "bpf_dctcp.skel.h"
 #include "bpf_cubic.skel.h"
 #include "bpf_iter_setsockopt.skel.h"
 
+#define TEST_CGROUP "/test-iter-setsockopt"
+
 static int create_netns(void)
 {
 	if (!ASSERT_OK(unshare(CLONE_NEWNET), "create netns"))
@@ -32,17 +35,26 @@ static unsigned int set_bpf_cubic(int *fds, unsigned int nr_fds)
 	return nr_fds;
 }
 
-static unsigned int check_bpf_dctcp(int *fds, unsigned int nr_fds)
+static unsigned int check_bpf_val(int *fds, unsigned int nr_fds, bool cong)
 {
 	char tcp_cc[16];
-	socklen_t optlen = sizeof(tcp_cc);
+	socklen_t cc_optlen = sizeof(tcp_cc);
+	int flags;
+	socklen_t flags_optlen = sizeof(flags);
 	unsigned int i;
 
 	for (i = 0; i < nr_fds; i++) {
-		if (getsockopt(fds[i], SOL_TCP, TCP_CONGESTION,
-			       tcp_cc, &optlen) ||
-		    strcmp(tcp_cc, "bpf_dctcp"))
-			return i;
+		if (cong) {
+			if (getsockopt(fds[i], SOL_TCP, TCP_CONGESTION,
+				       tcp_cc, &cc_optlen) ||
+			    strcmp(tcp_cc, "bpf_dctcp"))
+				return i;
+		} else {
+			if (getsockopt(fds[i], SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+				       &flags, &flags_optlen) ||
+			    flags != BPF_SOCK_OPS_ALL_CB_FLAGS)
+				return i;
+		}
 	}
 
 	return nr_fds;
@@ -102,7 +114,7 @@ static unsigned short get_local_port(int fd)
 }
 
 static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel,
-				   bool random_retry)
+				   bool random_retry, bool cong)
 {
 	int *reuse_listen_fds = NULL, *accepted_fds = NULL, *est_fds = NULL;
 	unsigned int nr_reuse_listens = 256, nr_est = 256;
@@ -140,9 +152,16 @@ static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel,
 			"get_local_port(reuse_listen_fds[0])"))
 		goto done;
 
-	/* Run bpf tcp iter to switch from bpf_cubic to bpf_dctcp */
+	/* Run bpf tcp iter to change tcp value:
+	 *
+	 * - If cong is true, switch from bpf_cubic to bpf_dctcp;
+	 * - If cong is false, use bpf_setsockopt() to set TCP sockops flags.
+	 */
+
 	iter_skel->bss->random_retry = random_retry;
-	iter_fd = bpf_iter_create(bpf_link__fd(iter_skel->links.change_tcp_cc));
+	iter_skel->bss->cong = cong;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(iter_skel->links.change_tcp_val));
 	if (!ASSERT_GE(iter_fd, 0, "create iter_fd"))
 		goto done;
 
@@ -152,22 +171,21 @@ static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel,
 	if (!ASSERT_OK(err, "read iter error"))
 		goto done;
 
-	/* Check reuseport listen fds for dctcp */
-	ASSERT_EQ(check_bpf_dctcp(reuse_listen_fds, nr_reuse_listens),
+	/* Check reuseport listen fds */
+	ASSERT_EQ(check_bpf_val(reuse_listen_fds, nr_reuse_listens, cong),
 		  nr_reuse_listens,
-		  "check reuse_listen_fds dctcp");
-
-	/* Check non reuseport listen fd for dctcp */
-	ASSERT_EQ(check_bpf_dctcp(&listen_fd, 1), 1,
-		  "check listen_fd dctcp");
+		  "check reuse_listen_fds");
+	/* Check non reuseport listen fd */
+	ASSERT_EQ(check_bpf_val(&listen_fd, 1, cong), 1,
+		  "check listen_fd");
 
-	/* Check established fds for dctcp */
-	ASSERT_EQ(check_bpf_dctcp(est_fds, nr_est), nr_est,
-		  "check est_fds dctcp");
+	/* Check established fds */
+	ASSERT_EQ(check_bpf_val(est_fds, nr_est, cong), nr_est,
+		  "check est_fds");
 
-	/* Check accepted fds for dctcp */
-	ASSERT_EQ(check_bpf_dctcp(accepted_fds, nr_est), nr_est,
-		  "check accepted_fds dctcp");
+	/* Check accepted fds */
+	ASSERT_EQ(check_bpf_val(accepted_fds, nr_est, cong), nr_est,
+		  "check accepted_fds");
 
 done:
 	if (iter_fd != -1)
@@ -186,6 +204,8 @@ void serial_test_bpf_iter_setsockopt(void)
 	struct bpf_dctcp *dctcp_skel = NULL;
 	struct bpf_link *cubic_link = NULL;
 	struct bpf_link *dctcp_link = NULL;
+	struct bpf_link *getsockopt_link = NULL;
+	int cgroup_fd;
 
 	if (create_netns())
 		return;
@@ -194,8 +214,9 @@ void serial_test_bpf_iter_setsockopt(void)
 	iter_skel = bpf_iter_setsockopt__open_and_load();
 	if (!ASSERT_OK_PTR(iter_skel, "iter_skel"))
 		return;
-	iter_skel->links.change_tcp_cc = bpf_program__attach_iter(iter_skel->progs.change_tcp_cc, NULL);
-	if (!ASSERT_OK_PTR(iter_skel->links.change_tcp_cc, "attach iter"))
+	iter_skel->links.change_tcp_val = bpf_program__attach_iter(iter_skel->progs.change_tcp_val,
+								   NULL);
+	if (!ASSERT_OK_PTR(iter_skel->links.change_tcp_val, "attach iter"))
 		goto done;
 
 	/* Load bpf_cubic */
@@ -214,13 +235,23 @@ void serial_test_bpf_iter_setsockopt(void)
 	if (!ASSERT_OK_PTR(dctcp_link, "dctcp_link"))
 		goto done;
 
-	do_bpf_iter_setsockopt(iter_skel, true);
-	do_bpf_iter_setsockopt(iter_skel, false);
+	cgroup_fd = cgroup_setup_and_join(TEST_CGROUP);
+	if (!ASSERT_OK_FD(cgroup_fd, "cgroup switch"))
+		goto done;
+	getsockopt_link = bpf_program__attach_cgroup(iter_skel->progs._getsockopt, cgroup_fd);
+	if (!ASSERT_OK_PTR(getsockopt_link, "getsockopt prog"))
+		goto done;
 
+	do_bpf_iter_setsockopt(iter_skel, true, true);
+	do_bpf_iter_setsockopt(iter_skel, false, true);
+	do_bpf_iter_setsockopt(iter_skel, true, false);
+	do_bpf_iter_setsockopt(iter_skel, false, false);
 done:
 	bpf_link__destroy(cubic_link);
 	bpf_link__destroy(dctcp_link);
+	bpf_link__destroy(getsockopt_link);
 	bpf_cubic__destroy(cubic_skel);
 	bpf_dctcp__destroy(dctcp_skel);
 	bpf_iter_setsockopt__destroy(iter_skel);
+	cleanup_cgroup_environment();
 }
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c b/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c
index ec7f91850dec..60752a7ebdf8 100644
--- a/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c
+++ b/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c
@@ -5,6 +5,13 @@
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_endian.h>
 
+struct {
+	__uint(type, BPF_MAP_TYPE_SK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, int);
+} sk_map SEC(".maps");
+
 #define bpf_tcp_sk(skc)	({				\
 	struct sock_common *_skc = skc;			\
 	sk = NULL;					\
@@ -21,6 +28,7 @@ unsigned short listen_hport = 0;
 char cubic_cc[TCP_CA_NAME_MAX] = "bpf_cubic";
 char dctcp_cc[TCP_CA_NAME_MAX] = "bpf_dctcp";
 bool random_retry = false;
+bool cong = false;
 
 static bool tcp_cc_eq(const char *a, const char *b)
 {
@@ -36,10 +44,32 @@ static bool tcp_cc_eq(const char *a, const char *b)
 	return true;
 }
 
+/* This program is used to intercept getsockopt() calls, providing
+ * the value of bpf_sock_ops_cb_flags for the socket; this value
+ * has been saved in per-socket storage earlier via the iterator
+ * program.
+ */
+SEC("cgroup/getsockopt")
+int _getsockopt(struct bpf_sockopt *ctx)
+{
+	struct bpf_sock *sk = ctx->sk;
+	int *optval = ctx->optval;
+	int *sk_storage = 0;
+
+	if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS)
+		return 1;
+	sk_storage = bpf_sk_storage_get(&sk_map, sk, 0, 0);
+	if (sk_storage) {
+		if (ctx->optval + sizeof(int) <= ctx->optval_end)
+			*optval = *sk_storage;
+		ctx->retval = 0;
+	}
+	return 1;
+}
+
 SEC("iter/tcp")
-int change_tcp_cc(struct bpf_iter__tcp *ctx)
+int change_tcp_val(struct bpf_iter__tcp *ctx)
 {
-	char cur_cc[TCP_CA_NAME_MAX];
 	struct tcp_sock *tp;
 	struct sock *sk;
 
@@ -54,17 +84,43 @@ int change_tcp_cc(struct bpf_iter__tcp *ctx)
 	     bpf_ntohs(sk->sk_dport) != listen_hport))
 		return 0;
 
-	if (bpf_getsockopt(tp, SOL_TCP, TCP_CONGESTION,
-			   cur_cc, sizeof(cur_cc)))
-		return 0;
+	if (cong) {
+		char cur_cc[TCP_CA_NAME_MAX];
 
-	if (!tcp_cc_eq(cur_cc, cubic_cc))
-		return 0;
+		if (bpf_getsockopt(tp, SOL_TCP, TCP_CONGESTION,
+				   cur_cc, sizeof(cur_cc)))
+			return 0;
 
-	if (random_retry && bpf_get_prandom_u32() % 4 == 1)
-		return 1;
+		if (!tcp_cc_eq(cur_cc, cubic_cc))
+			return 0;
+
+		if (random_retry && bpf_get_prandom_u32() % 4 == 1)
+			return 1;
+
+		bpf_setsockopt(tp, SOL_TCP, TCP_CONGESTION, dctcp_cc, sizeof(dctcp_cc));
+	} else {
+		int val, newval = BPF_SOCK_OPS_ALL_CB_FLAGS;
+		int *sk_storage;
 
-	bpf_setsockopt(tp, SOL_TCP, TCP_CONGESTION, dctcp_cc, sizeof(dctcp_cc));
+		if (bpf_getsockopt(tp, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+				   &val, sizeof(val)))
+			return 0;
+
+		if (val == newval)
+			return 0;
+
+		if (random_retry && bpf_get_prandom_u32() % 4 == 1)
+			return 1;
+
+		if (bpf_setsockopt(tp, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
+				   &newval, sizeof(newval)))
+			return 0;
+		/* store flags value for retrieval in cgroup/getsockopt prog */
+		sk_storage = bpf_sk_storage_get(&sk_map, sk, 0,
+						BPF_SK_STORAGE_GET_F_CREATE);
+		if (sk_storage)
+			*sk_storage = newval;
+	}
 	return 0;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
  2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
@ 2024-08-06 19:54   ` Martin KaFai Lau
  2024-08-07 17:53     ` Alan Maguire
  0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-06 19:54 UTC (permalink / raw)
  To: Alan Maguire
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf

On 8/2/24 8:29 AM, Alan Maguire wrote:
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 78a6f746ea0b..570ca3f12175 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
>   			return -EINVAL;
>   		inet_csk(sk)->icsk_rto_min = timeout;
>   		break;
> +	case TCP_BPF_SOCK_OPS_CB_FLAGS:
> +		if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
> +			return -EINVAL;
> +		tp->bpf_sock_ops_cb_flags = val;
> +		break;
>   	default:
>   		return -EINVAL;
>   	}
> @@ -5366,6 +5371,17 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
>   		if (*optlen < 1)
>   			return -EINVAL;
>   		break;
> +	case TCP_BPF_SOCK_OPS_CB_FLAGS:
> +		if (*optlen != sizeof(int))
> +			return -EINVAL;
> +		if (getopt) {
> +			struct tcp_sock *tp = tcp_sk(sk);
> +			int val = READ_ONCE(tp->bpf_sock_ops_cb_flags);

READ_ONCE() here looks suspicious. There is no existing WRITE_ONCE.

Is it needed? The read here should have already passed the sock_owned_by_me 
test. The existing write side should also have the sock_owned_by_me.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
  2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
@ 2024-08-06 21:27   ` Martin KaFai Lau
  2024-08-07 17:58     ` Alan Maguire
  0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-06 21:27 UTC (permalink / raw)
  To: Alan Maguire
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf

On 8/2/24 8:29 AM, Alan Maguire wrote:
> Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
> bpf_setsockopt() and also add a cgroup/setsockopt program that
> catches setsockopt() for this option and uses bpf_setsockopt()
> to set it.  The latter allows us to support modifying sockops
> cb flags on a per-socket basis via setsockopt() without adding
> support into core setsockopt() itself.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>   .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
>   .../selftests/bpf/progs/setget_sockopt.c      | 37 +++++++++++++++++--
>   2 files changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
> index 7d4a9b3d3722..b9c54217a489 100644
> --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
> +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
> @@ -42,6 +42,7 @@ static int create_netns(void)
>   static void test_tcp(int family)
>   {
>   	struct setget_sockopt__bss *bss = skel->bss;
> +	int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG | BPF_SOCK_OPS_RTO_CB_FLAG;
>   	int sfd, cfd;
>   
>   	memset(bss, 0, sizeof(*bss));
> @@ -56,6 +57,9 @@ static void test_tcp(int family)
>   		close(sfd);
>   		return;
>   	}
> +	ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
> +			     &cb_flags, sizeof(cb_flags)),
> +		  0, "setsockopt cb_flags");

The sfd here is the listen fd. The setsockopt here is after the connection is 
established and the connection won't be affected by this setsockopt...

I don't think this test belongs to test_tcp() here, more on this below...

>   	close(sfd);
>   	close(cfd);
>   
> @@ -65,6 +69,8 @@ static void test_tcp(int family)
>   	ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
>   	ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
>   	ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
> +	ASSERT_GE(bss->nr_state, 1, "nr_state");
> +	ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
>   }
>   
>   static void test_udp(int family)
> @@ -185,6 +191,11 @@ void test_setget_sockopt(void)
>   	if (!ASSERT_OK_PTR(skel->links.socket_post_create, "attach_cgroup"))
>   		goto done;
>   
> +	skel->links.tcp_setsockopt =
> +		bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
> +	if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
> +		goto done;
> +
>   	test_tcp(AF_INET6);
>   	test_tcp(AF_INET);
>   	test_udp(AF_INET6);
> diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c
> index 60518aed1ffc..920af9e21e84 100644
> --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
> +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
> @@ -20,6 +20,8 @@ int nr_connect;
>   int nr_binddev;
>   int nr_socket_post_create;
>   int nr_fin_wait1;
> +int nr_state;
> +int nr_setsockopt;
>   
>   struct sockopt_test {
>   	int opt;
> @@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
>   	{ .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
>   	{ .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
>   	{ .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
> +	{ .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new = BPF_SOCK_OPS_ALL_CB_FLAGS,
> +	  .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore = BPF_SOCK_OPS_STATE_CB_FLAG, },
>   	{ .opt = 0, },
>   };
>   
> @@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx, struct sock *sk,
>   
>   	if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
>   		return 1;
> +
>   	if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
>   	    tmp != expected)
>   		return 1;
> @@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
>   		nr_passive += !(bpf_test_sockopt(skops, sk) ||
>   				test_tcp_maxseg(skops, sk) ||
>   				test_tcp_saved_syn(skops, sk));
> -		bpf_sock_ops_cb_flags_set(skops,
> -					  skops->bpf_sock_ops_cb_flags |
> -					  BPF_SOCK_OPS_STATE_CB_FLAG);
> +
> +		/* no need to set sockops cb flags here as sockopt
> +		 * tests and user-space originated setsockopt() will
> +		 * set flags to include BPF_SOCK_OPS_STATE_CB.
> +		 */

I don't think the "user-space originated..." comment is accruate here. The 
user-space originated setsockopt() from the above test_tcp() won't affect the 
passively established sk here. This took me a while to digest...

iiuc, the removed bpf_sock_ops_cb_flags_set() for the passive connection is now 
implicitly done (and hidden) in the newly added sol_tcp_tests[].restore.

How about keeping the explicit bpf_sock_ops_cb_flags_set() and removing the 
".restore".

The existing bpf_sock_ops_cb_flags_set() can be changed to 
bpf_setsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS) if it helps to test if it is effective.

>   		break;
>   	case BPF_SOCK_OPS_STATE_CB:
> +		nr_state++;

How about removing the nr_state addition and adding a SEC("cgroup/getsockopt") 
bpf prog to test the getsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS).

Create another test_nonstandard_opt() in prog_tests/setget_sockopt.c and 
separate it from the existing test_{tcp, udp...} which is mainly exercising 
set/getsockopt(sol_*_tests[]) on different hooks (right now it has 
lsm_cgroup/socket_post_create and sockops). It doesn't fit testing the user 
syscall set/getsockopt on the nonstandard_opt.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS
  2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
@ 2024-08-06 21:42   ` Martin KaFai Lau
  0 siblings, 0 replies; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-06 21:42 UTC (permalink / raw)
  To: Alan Maguire
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf

On 8/2/24 8:29 AM, Alan Maguire wrote:
> Add support to test bpf_setsockopt(.., TCP_BPF_SOCK_OPS_CB_FLAGS, ...)
> in BPF iterator context; use per-socket storage to store the new
> value and retrieve it in a cgroup/getsockopt program we attach to
> allow us to query TCP_BPF_SOCK_OPS_CB_FLAGS via getsockopt.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>   .../bpf/prog_tests/bpf_iter_setsockopt.c      | 83 +++++++++++++------
>   .../selftests/bpf/progs/bpf_iter_setsockopt.c | 76 ++++++++++++++---

There are too many code churns to reuse this test to test a new 
TCP_BPF_SOCK_OPS_CB_FLAGS sockopt. This is not the right test to reuse. It was 
created mainly to test if the tcp batching logic can survive the bpf-iter's 
seq_stop.

I don't think it needs a separate bpf_set/getsockopt test specifically for the 
bpf iter prog. The test in patch 2 should be enough.

pw-bot: cr

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
  2024-08-06 19:54   ` Martin KaFai Lau
@ 2024-08-07 17:53     ` Alan Maguire
  0 siblings, 0 replies; 10+ messages in thread
From: Alan Maguire @ 2024-08-07 17:53 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf

On 06/08/2024 20:54, Martin KaFai Lau wrote:
> On 8/2/24 8:29 AM, Alan Maguire wrote:
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 78a6f746ea0b..570ca3f12175 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock
>> *sk, int optname,
>>               return -EINVAL;
>>           inet_csk(sk)->icsk_rto_min = timeout;
>>           break;
>> +    case TCP_BPF_SOCK_OPS_CB_FLAGS:
>> +        if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS))
>> +            return -EINVAL;
>> +        tp->bpf_sock_ops_cb_flags = val;
>> +        break;
>>       default:
>>           return -EINVAL;
>>       }
>> @@ -5366,6 +5371,17 @@ static int sol_tcp_sockopt(struct sock *sk, int
>> optname,
>>           if (*optlen < 1)
>>               return -EINVAL;
>>           break;
>> +    case TCP_BPF_SOCK_OPS_CB_FLAGS:
>> +        if (*optlen != sizeof(int))
>> +            return -EINVAL;
>> +        if (getopt) {
>> +            struct tcp_sock *tp = tcp_sk(sk);
>> +            int val = READ_ONCE(tp->bpf_sock_ops_cb_flags);
> 
> READ_ONCE() here looks suspicious. There is no existing WRITE_ONCE.
> 
> Is it needed? The read here should have already passed the
> sock_owned_by_me test. The existing write side should also have the
> sock_owned_by_me.
> 
>

Yep, you're right. Will remove for v2. Thanks for taking a look!


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
  2024-08-06 21:27   ` Martin KaFai Lau
@ 2024-08-07 17:58     ` Alan Maguire
  2024-08-07 21:14       ` Martin KaFai Lau
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Maguire @ 2024-08-07 17:58 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf

On 06/08/2024 22:27, Martin KaFai Lau wrote:
> On 8/2/24 8:29 AM, Alan Maguire wrote:
>> Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
>> bpf_setsockopt() and also add a cgroup/setsockopt program that
>> catches setsockopt() for this option and uses bpf_setsockopt()
>> to set it.  The latter allows us to support modifying sockops
>> cb flags on a per-socket basis via setsockopt() without adding
>> support into core setsockopt() itself.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>>   .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
>>   .../selftests/bpf/progs/setget_sockopt.c      | 37 +++++++++++++++++--
>>   2 files changed, 45 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> index 7d4a9b3d3722..b9c54217a489 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>> @@ -42,6 +42,7 @@ static int create_netns(void)
>>   static void test_tcp(int family)
>>   {
>>       struct setget_sockopt__bss *bss = skel->bss;
>> +    int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG |
>> BPF_SOCK_OPS_RTO_CB_FLAG;
>>       int sfd, cfd;
>>         memset(bss, 0, sizeof(*bss));
>> @@ -56,6 +57,9 @@ static void test_tcp(int family)
>>           close(sfd);
>>           return;
>>       }
>> +    ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
>> +                 &cb_flags, sizeof(cb_flags)),
>> +          0, "setsockopt cb_flags");
> 
> The sfd here is the listen fd. The setsockopt here is after the
> connection is established and the connection won't be affected by this
> setsockopt...
> 
> I don't think this test belongs to test_tcp() here, more on this below...
> 
>>       close(sfd);
>>       close(cfd);
>>   @@ -65,6 +69,8 @@ static void test_tcp(int family)
>>       ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
>>       ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
>>       ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
>> +    ASSERT_GE(bss->nr_state, 1, "nr_state");
>> +    ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
>>   }
>>     static void test_udp(int family)
>> @@ -185,6 +191,11 @@ void test_setget_sockopt(void)
>>       if (!ASSERT_OK_PTR(skel->links.socket_post_create,
>> "attach_cgroup"))
>>           goto done;
>>   +    skel->links.tcp_setsockopt =
>> +        bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
>> +    if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
>> +        goto done;
>> +
>>       test_tcp(AF_INET6);
>>       test_tcp(AF_INET);
>>       test_udp(AF_INET6);
>> diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> index 60518aed1ffc..920af9e21e84 100644
>> --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>> @@ -20,6 +20,8 @@ int nr_connect;
>>   int nr_binddev;
>>   int nr_socket_post_create;
>>   int nr_fin_wait1;
>> +int nr_state;
>> +int nr_setsockopt;
>>     struct sockopt_test {
>>       int opt;
>> @@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
>>       { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
>>       { .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
>>       { .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
>> +    { .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new =
>> BPF_SOCK_OPS_ALL_CB_FLAGS,
>> +      .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore =
>> BPF_SOCK_OPS_STATE_CB_FLAG, },
>>       { .opt = 0, },
>>   };
>>   @@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx,
>> struct sock *sk,
>>         if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
>>           return 1;
>> +
>>       if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
>>           tmp != expected)
>>           return 1;
>> @@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
>>           nr_passive += !(bpf_test_sockopt(skops, sk) ||
>>                   test_tcp_maxseg(skops, sk) ||
>>                   test_tcp_saved_syn(skops, sk));
>> -        bpf_sock_ops_cb_flags_set(skops,
>> -                      skops->bpf_sock_ops_cb_flags |
>> -                      BPF_SOCK_OPS_STATE_CB_FLAG);
>> +
>> +        /* no need to set sockops cb flags here as sockopt
>> +         * tests and user-space originated setsockopt() will
>> +         * set flags to include BPF_SOCK_OPS_STATE_CB.
>> +         */
> 
> I don't think the "user-space originated..." comment is accruate here.
> The user-space originated setsockopt() from the above test_tcp() won't
> affect the passively established sk here. This took me a while to digest...
> 
> iiuc, the removed bpf_sock_ops_cb_flags_set() for the passive connection
> is now implicitly done (and hidden) in the newly added
> sol_tcp_tests[].restore.
> 
> How about keeping the explicit bpf_sock_ops_cb_flags_set() and removing
> the ".restore".
> 
> The existing bpf_sock_ops_cb_flags_set() can be changed to
> bpf_setsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS) if it helps to test if it is
> effective.

Sounds good; I'll make that change and avoid changing test_tcp() etc.
> 
>>           break;
>>       case BPF_SOCK_OPS_STATE_CB:
>> +        nr_state++;
> 
> How about removing the nr_state addition and adding a
> SEC("cgroup/getsockopt") bpf prog to test the
> getsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS).
> 

Will do. Given that currently we cannot call bpf_getsockopt() from
cgroup/getsockopt progs it might make sense to use per-socket storage to
store the cb flags value that we set via bpf_setsockopt() in the sock
ops program, and retrieve it in the cgroup/getsockopt prog. Does that
sound ok?

> Create another test_nonstandard_opt() in prog_tests/setget_sockopt.c and
> separate it from the existing test_{tcp, udp...} which is mainly
> exercising set/getsockopt(sol_*_tests[]) on different hooks (right now
> it has lsm_cgroup/socket_post_create and sockops). It doesn't fit
> testing the user syscall set/getsockopt on the nonstandard_opt.

Sounds good. I'll also drop the test in patch 3 as you suggest for v2.
Thanks again!

Alan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
  2024-08-07 17:58     ` Alan Maguire
@ 2024-08-07 21:14       ` Martin KaFai Lau
  0 siblings, 0 replies; 10+ messages in thread
From: Martin KaFai Lau @ 2024-08-07 21:14 UTC (permalink / raw)
  To: Alan Maguire
  Cc: ast, daniel, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, edumazet, bpf

On 8/7/24 10:58 AM, Alan Maguire wrote:
> On 06/08/2024 22:27, Martin KaFai Lau wrote:
>> On 8/2/24 8:29 AM, Alan Maguire wrote:
>>> Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via
>>> bpf_setsockopt() and also add a cgroup/setsockopt program that
>>> catches setsockopt() for this option and uses bpf_setsockopt()
>>> to set it.  The latter allows us to support modifying sockops
>>> cb flags on a per-socket basis via setsockopt() without adding
>>> support into core setsockopt() itself.
>>>
>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>> ---
>>>    .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++
>>>    .../selftests/bpf/progs/setget_sockopt.c      | 37 +++++++++++++++++--
>>>    2 files changed, 45 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> index 7d4a9b3d3722..b9c54217a489 100644
>>> --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
>>> @@ -42,6 +42,7 @@ static int create_netns(void)
>>>    static void test_tcp(int family)
>>>    {
>>>        struct setget_sockopt__bss *bss = skel->bss;
>>> +    int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG |
>>> BPF_SOCK_OPS_RTO_CB_FLAG;
>>>        int sfd, cfd;
>>>          memset(bss, 0, sizeof(*bss));
>>> @@ -56,6 +57,9 @@ static void test_tcp(int family)
>>>            close(sfd);
>>>            return;
>>>        }
>>> +    ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS,
>>> +                 &cb_flags, sizeof(cb_flags)),
>>> +          0, "setsockopt cb_flags");
>>
>> The sfd here is the listen fd. The setsockopt here is after the
>> connection is established and the connection won't be affected by this
>> setsockopt...
>>
>> I don't think this test belongs to test_tcp() here, more on this below...
>>
>>>        close(sfd);
>>>        close(cfd);
>>>    @@ -65,6 +69,8 @@ static void test_tcp(int family)
>>>        ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
>>>        ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
>>>        ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
>>> +    ASSERT_GE(bss->nr_state, 1, "nr_state");
>>> +    ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt");
>>>    }
>>>      static void test_udp(int family)
>>> @@ -185,6 +191,11 @@ void test_setget_sockopt(void)
>>>        if (!ASSERT_OK_PTR(skel->links.socket_post_create,
>>> "attach_cgroup"))
>>>            goto done;
>>>    +    skel->links.tcp_setsockopt =
>>> +        bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd);
>>> +    if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt"))
>>> +        goto done;
>>> +
>>>        test_tcp(AF_INET6);
>>>        test_tcp(AF_INET);
>>>        test_udp(AF_INET6);
>>> diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> index 60518aed1ffc..920af9e21e84 100644
>>> --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
>>> @@ -20,6 +20,8 @@ int nr_connect;
>>>    int nr_binddev;
>>>    int nr_socket_post_create;
>>>    int nr_fin_wait1;
>>> +int nr_state;
>>> +int nr_setsockopt;
>>>      struct sockopt_test {
>>>        int opt;
>>> @@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = {
>>>        { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
>>>        { .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
>>>        { .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
>>> +    { .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new =
>>> BPF_SOCK_OPS_ALL_CB_FLAGS,
>>> +      .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore =
>>> BPF_SOCK_OPS_STATE_CB_FLAG, },
>>>        { .opt = 0, },
>>>    };
>>>    @@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx,
>>> struct sock *sk,
>>>          if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
>>>            return 1;
>>> +
>>>        if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) ||
>>>            tmp != expected)
>>>            return 1;
>>> @@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops)
>>>            nr_passive += !(bpf_test_sockopt(skops, sk) ||
>>>                    test_tcp_maxseg(skops, sk) ||
>>>                    test_tcp_saved_syn(skops, sk));
>>> -        bpf_sock_ops_cb_flags_set(skops,
>>> -                      skops->bpf_sock_ops_cb_flags |
>>> -                      BPF_SOCK_OPS_STATE_CB_FLAG);
>>> +
>>> +        /* no need to set sockops cb flags here as sockopt
>>> +         * tests and user-space originated setsockopt() will
>>> +         * set flags to include BPF_SOCK_OPS_STATE_CB.
>>> +         */
>>
>> I don't think the "user-space originated..." comment is accruate here.
>> The user-space originated setsockopt() from the above test_tcp() won't
>> affect the passively established sk here. This took me a while to digest...
>>
>> iiuc, the removed bpf_sock_ops_cb_flags_set() for the passive connection
>> is now implicitly done (and hidden) in the newly added
>> sol_tcp_tests[].restore.
>>
>> How about keeping the explicit bpf_sock_ops_cb_flags_set() and removing
>> the ".restore".
>>
>> The existing bpf_sock_ops_cb_flags_set() can be changed to
>> bpf_setsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS) if it helps to test if it is
>> effective.
> 
> Sounds good; I'll make that change and avoid changing test_tcp() etc.
>>
>>>            break;
>>>        case BPF_SOCK_OPS_STATE_CB:
>>> +        nr_state++;
>>
>> How about removing the nr_state addition and adding a
>> SEC("cgroup/getsockopt") bpf prog to test the
>> getsockopt(TCP_BPF_SOCK_OPS_CB_FLAGS).
>>
> 
> Will do. Given that currently we cannot call bpf_getsockopt() from
> cgroup/getsockopt progs it might make sense to use per-socket storage to
> store the cb flags value that we set via bpf_setsockopt() in the sock
> ops program, and retrieve it in the cgroup/getsockopt prog. Does that
> sound ok?

The bpf_sock_ops_cb_flags can be directly inspected in cgroup/getsockopt
with the help of bpf_core_cast(). The following is based on the patch3's
_getsockopt (untested):

#include <vmlinux.h>
#include <bpf/bpf_core_read.h>

SEC("cgroup/getsockopt")
int _getsockopt(struct bpf_sockopt *ctx)
{
	struct bpf_sock *sk = ctx->sk;
	int *optval = ctx->optval;
	struct tcp_sock *tp;

	if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS || sk->protocol != IPPROTO_TCP ||
	    ctx->optval + sizeof(int) > ctx->optval_end)
		return 1;

	tp = bpf_core_cast(sk, struct tcp_sock);
	*optval = tp->bpf_sock_ops_cb_flags;
	ctx->retval = 0;
	return 1;
}


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-08-07 21:14 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-02 15:29 [PATCH bpf-next 0/3] add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Alan Maguire
2024-08-06 19:54   ` Martin KaFai Lau
2024-08-07 17:53     ` Alan Maguire
2024-08-02 15:29 ` [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
2024-08-06 21:27   ` Martin KaFai Lau
2024-08-07 17:58     ` Alan Maguire
2024-08-07 21:14       ` Martin KaFai Lau
2024-08-02 15:29 ` [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS Alan Maguire
2024-08-06 21:42   ` Martin KaFai Lau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox