linux-security-module.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v11 0/5] bpf: Force to MPTCP
@ 2023-08-04  5:07 Geliang Tang
  2023-08-04  5:07 ` [PATCH bpf-next v11 1/5] bpf: Add update_socket_protocol hook Geliang Tang
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Geliang Tang @ 2023-08-04  5:07 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman
  Cc: Geliang Tang, bpf, netdev, mptcp, apparmor, linux-security-module,
	selinux, linux-kselftest

As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:

"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."

But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:

 - it doesn't work if the application is not using libc (e.g. GoLang
apps)
 - in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
 - mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
 - it is not in BPF, we cannot talk about it at netdev conf.

So this patchset attempts to use BPF to implement functions similer to
mptcpize.

The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.

[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79

v11:
 - add comments about outputs of 'ss' and 'nstat'.
 - use "err = verify_mptcpify()" instead of using =+.

v10:
 - drop "#ifdef CONFIG_BPF_JIT".
 - include vmlinux.h and bpf_tracing_net.h to avoid defining some
macros.
 - drop unneeded checks for mptcp.

v9:
 - update comment for 'update_socket_protocol'.

v8:
 - drop the additional checks on the 'protocol' value after the
'update_socket_protocol()' call.

v7:
 - add __weak and __diag_* for update_socket_protocol.

v6:
 - add update_socket_protocol.

v5:
 - add bpf_mptcpify helper.

v4:
 - use lsm_cgroup/socket_create

v3:
 - patch 8: char cmd[128]; -> char cmd[256];

v2:
 - Fix build selftests errors reported by CI

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79

Geliang Tang (5):
  bpf: Add update_socket_protocol hook
  selftests/bpf: Use random netns name for mptcp
  selftests/bpf: Add two mptcp netns helpers
  selftests/bpf: Drop unneeded checks for mptcp
  selftests/bpf: Add mptcpify test

 net/mptcp/bpf.c                               |  15 ++
 net/socket.c                                  |  24 +++
 .../testing/selftests/bpf/prog_tests/mptcp.c  | 139 +++++++++++++++---
 tools/testing/selftests/bpf/progs/mptcpify.c  |  20 +++
 4 files changed, 179 insertions(+), 19 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c

-- 
2.35.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v11 1/5] bpf: Add update_socket_protocol hook
  2023-08-04  5:07 [PATCH bpf-next v11 0/5] bpf: Force to MPTCP Geliang Tang
@ 2023-08-04  5:07 ` Geliang Tang
  2023-08-05  0:00   ` Martin KaFai Lau
  2023-08-04  5:07 ` [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp Geliang Tang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Geliang Tang @ 2023-08-04  5:07 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman
  Cc: Geliang Tang, bpf, netdev, mptcp, apparmor, linux-security-module,
	selinux, linux-kselftest, Yonghong Song

Add a hook named update_socket_protocol in __sys_socket(), for bpf
progs to attach to and update socket protocol. One user case is to
force legacy TCP apps to create and use MPTCP sockets instead of
TCP ones.

Define a mod_ret set named bpf_mptcp_fmodret_ids, add the hook
update_socket_protocol into this set, and register it in
bpf_mptcp_kfunc_init().

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/bpf.c | 15 +++++++++++++++
 net/socket.c    | 24 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index 5a0a84ad94af..8a16672b94e2 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -19,3 +19,18 @@ struct mptcp_sock *bpf_mptcp_sock_from_subflow(struct sock *sk)
 
 	return NULL;
 }
+
+BTF_SET8_START(bpf_mptcp_fmodret_ids)
+BTF_ID_FLAGS(func, update_socket_protocol)
+BTF_SET8_END(bpf_mptcp_fmodret_ids)
+
+static const struct btf_kfunc_id_set bpf_mptcp_fmodret_set = {
+	.owner = THIS_MODULE,
+	.set   = &bpf_mptcp_fmodret_ids,
+};
+
+static int __init bpf_mptcp_kfunc_init(void)
+{
+	return register_btf_fmodret_id_set(&bpf_mptcp_fmodret_set);
+}
+late_initcall(bpf_mptcp_kfunc_init);
diff --git a/net/socket.c b/net/socket.c
index 2b0e54b2405c..9f98ced88ac5 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1644,11 +1644,35 @@ struct file *__sys_socket_file(int family, int type, int protocol)
 	return sock_alloc_file(sock, flags, NULL);
 }
 
+/*	A hook for bpf progs to attach to and update socket protocol.
+ *
+ *	A static noinline declaration here could cause the compiler to
+ *	optimize away the function. A global noinline declaration will
+ *	keep the definition, but may optimize away the callsite.
+ *	Therefore, __weak is needed to ensure that the call is still
+ *	emitted, by telling the compiler that we don't know what the
+ *	function might eventually be.
+ *
+ *	__diag_* below are needed to dismiss the missing prototype warning.
+ */
+
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+		  "kfuncs which will be used in BPF programs");
+
+__weak noinline int update_socket_protocol(int family, int type, int protocol)
+{
+	return protocol;
+}
+
+__diag_pop();
+
 int __sys_socket(int family, int type, int protocol)
 {
 	struct socket *sock;
 	int flags;
 
+	protocol = update_socket_protocol(family, type, protocol);
 	sock = __sys_socket_create(family, type, protocol);
 	if (IS_ERR(sock))
 		return PTR_ERR(sock);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-04  5:07 [PATCH bpf-next v11 0/5] bpf: Force to MPTCP Geliang Tang
  2023-08-04  5:07 ` [PATCH bpf-next v11 1/5] bpf: Add update_socket_protocol hook Geliang Tang
@ 2023-08-04  5:07 ` Geliang Tang
  2023-08-05  0:23   ` Martin KaFai Lau
  2023-08-04  5:07 ` [PATCH bpf-next v11 3/5] selftests/bpf: Add two mptcp netns helpers Geliang Tang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Geliang Tang @ 2023-08-04  5:07 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman
  Cc: Geliang Tang, bpf, netdev, mptcp, apparmor, linux-security-module,
	selinux, linux-kselftest, Yonghong Song

Use rand() to generate a random netns name instead of using the fixed
name "mptcp_ns" for every test.

By doing that, we can re-launch the test even if there was an issue
removing the previous netns or if by accident, a netns with this generic
name already existed on the system.

Note that using a different name each will also help adding more
subtests in future commits.

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 tools/testing/selftests/bpf/prog_tests/mptcp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index cd0c42fff7c0..4ccca3d39a8f 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -7,7 +7,7 @@
 #include "network_helpers.h"
 #include "mptcp_sock.skel.h"
 
-#define NS_TEST "mptcp_ns"
+char NS_TEST[32];
 
 #ifndef TCP_CA_NAME_MAX
 #define TCP_CA_NAME_MAX	16
@@ -147,6 +147,8 @@ static void test_base(void)
 	if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup"))
 		return;
 
+	srand(time(NULL));
+	snprintf(NS_TEST, sizeof(NS_TEST), "mptcp_ns_%d", rand());
 	SYS(fail, "ip netns add %s", NS_TEST);
 	SYS(fail, "ip -net %s link set dev lo up", NS_TEST);
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v11 3/5] selftests/bpf: Add two mptcp netns helpers
  2023-08-04  5:07 [PATCH bpf-next v11 0/5] bpf: Force to MPTCP Geliang Tang
  2023-08-04  5:07 ` [PATCH bpf-next v11 1/5] bpf: Add update_socket_protocol hook Geliang Tang
  2023-08-04  5:07 ` [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp Geliang Tang
@ 2023-08-04  5:07 ` Geliang Tang
  2023-08-04  5:07 ` [PATCH bpf-next v11 4/5] selftests/bpf: Drop unneeded checks for mptcp Geliang Tang
  2023-08-04  5:07 ` [PATCH bpf-next v11 5/5] selftests/bpf: Add mptcpify test Geliang Tang
  4 siblings, 0 replies; 15+ messages in thread
From: Geliang Tang @ 2023-08-04  5:07 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman
  Cc: Geliang Tang, bpf, netdev, mptcp, apparmor, linux-security-module,
	selinux, linux-kselftest, Yonghong Song

Add two netns helpers for mptcp tests: create_netns() and
cleanup_netns(). Use them in test_base().

These new helpers will be re-used in the following commits introducing
new tests.

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 .../testing/selftests/bpf/prog_tests/mptcp.c  | 35 ++++++++++++-------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index 4ccca3d39a8f..4407bd5c9e9a 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -22,6 +22,26 @@ struct mptcp_storage {
 	char ca_name[TCP_CA_NAME_MAX];
 };
 
+static struct nstoken *create_netns(void)
+{
+	srand(time(NULL));
+	snprintf(NS_TEST, sizeof(NS_TEST), "mptcp_ns_%d", rand());
+	SYS(fail, "ip netns add %s", NS_TEST);
+	SYS(fail, "ip -net %s link set dev lo up", NS_TEST);
+
+	return open_netns(NS_TEST);
+fail:
+	return NULL;
+}
+
+static void cleanup_netns(struct nstoken *nstoken)
+{
+	if (nstoken)
+		close_netns(nstoken);
+
+	SYS_NOFAIL("ip netns del %s &> /dev/null", NS_TEST);
+}
+
 static int verify_tsk(int map_fd, int client_fd)
 {
 	int err, cfd = client_fd;
@@ -147,13 +167,8 @@ static void test_base(void)
 	if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup"))
 		return;
 
-	srand(time(NULL));
-	snprintf(NS_TEST, sizeof(NS_TEST), "mptcp_ns_%d", rand());
-	SYS(fail, "ip netns add %s", NS_TEST);
-	SYS(fail, "ip -net %s link set dev lo up", NS_TEST);
-
-	nstoken = open_netns(NS_TEST);
-	if (!ASSERT_OK_PTR(nstoken, "open_netns"))
+	nstoken = create_netns();
+	if (!ASSERT_OK_PTR(nstoken, "create_netns"))
 		goto fail;
 
 	/* without MPTCP */
@@ -176,11 +191,7 @@ static void test_base(void)
 	close(server_fd);
 
 fail:
-	if (nstoken)
-		close_netns(nstoken);
-
-	SYS_NOFAIL("ip netns del " NS_TEST " &> /dev/null");
-
+	cleanup_netns(nstoken);
 	close(cgroup_fd);
 }
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v11 4/5] selftests/bpf: Drop unneeded checks for mptcp
  2023-08-04  5:07 [PATCH bpf-next v11 0/5] bpf: Force to MPTCP Geliang Tang
                   ` (2 preceding siblings ...)
  2023-08-04  5:07 ` [PATCH bpf-next v11 3/5] selftests/bpf: Add two mptcp netns helpers Geliang Tang
@ 2023-08-04  5:07 ` Geliang Tang
  2023-08-04  5:07 ` [PATCH bpf-next v11 5/5] selftests/bpf: Add mptcpify test Geliang Tang
  4 siblings, 0 replies; 15+ messages in thread
From: Geliang Tang @ 2023-08-04  5:07 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman
  Cc: Geliang Tang, bpf, netdev, mptcp, apparmor, linux-security-module,
	selinux, linux-kselftest, Yonghong Song

Load success means prog_fd and map_fd are always valid. So drop these
unneeded ASSERT_GE checks for them in mptcp run_test().

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 tools/testing/selftests/bpf/prog_tests/mptcp.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index 4407bd5c9e9a..3dc0ba2e7590 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -127,17 +127,7 @@ static int run_test(int cgroup_fd, int server_fd, bool is_mptcp)
 		goto out;
 
 	prog_fd = bpf_program__fd(sock_skel->progs._sockops);
-	if (!ASSERT_GE(prog_fd, 0, "bpf_program__fd")) {
-		err = -EIO;
-		goto out;
-	}
-
 	map_fd = bpf_map__fd(sock_skel->maps.socket_storage_map);
-	if (!ASSERT_GE(map_fd, 0, "bpf_map__fd")) {
-		err = -EIO;
-		goto out;
-	}
-
 	err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0);
 	if (!ASSERT_OK(err, "bpf_prog_attach"))
 		goto out;
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v11 5/5] selftests/bpf: Add mptcpify test
  2023-08-04  5:07 [PATCH bpf-next v11 0/5] bpf: Force to MPTCP Geliang Tang
                   ` (3 preceding siblings ...)
  2023-08-04  5:07 ` [PATCH bpf-next v11 4/5] selftests/bpf: Drop unneeded checks for mptcp Geliang Tang
@ 2023-08-04  5:07 ` Geliang Tang
  2023-08-05  0:40   ` Martin KaFai Lau
  4 siblings, 1 reply; 15+ messages in thread
From: Geliang Tang @ 2023-08-04  5:07 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman
  Cc: Geliang Tang, bpf, netdev, mptcp, apparmor, linux-security-module,
	selinux, linux-kselftest, Yonghong Song

Implement a new test program mptcpify: if the family is AF_INET or
AF_INET6, the type is SOCK_STREAM, and the protocol ID is 0 or
IPPROTO_TCP, set it to IPPROTO_MPTCP. It will be hooked in
update_socket_protocol().

Extend the MPTCP test base, add a selftest test_mptcpify() for the
mptcpify case. Open and load the mptcpify test prog to mptcpify the
TCP sockets dynamically, then use start_server() and connect_to_fd()
to create a TCP socket, but actually what's created is an MPTCP
socket, which can be verified through the outputs of 'ss' and 'nstat'
commands.

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 .../testing/selftests/bpf/prog_tests/mptcp.c  | 98 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/mptcpify.c  | 20 ++++
 2 files changed, 118 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c

diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index 3dc0ba2e7590..e6aafb4cfa8e 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -6,6 +6,7 @@
 #include "cgroup_helpers.h"
 #include "network_helpers.h"
 #include "mptcp_sock.skel.h"
+#include "mptcpify.skel.h"
 
 char NS_TEST[32];
 
@@ -185,8 +186,105 @@ static void test_base(void)
 	close(cgroup_fd);
 }
 
+static void send_byte(int fd)
+{
+	char b = 0x55;
+
+	ASSERT_EQ(write(fd, &b, sizeof(b)), 1, "send single byte");
+}
+
+static int verify_mptcpify(void)
+{
+	char cmd[256];
+	int err = 0;
+
+	/* Output of ss:
+	 *
+	 * ESTAB 0      0          127.0.0.1:44180    127.0.0.1:42225 cubic
+	 * ... tcp-ulp-mptcp flags:Mmec ...
+	 */
+	snprintf(cmd, sizeof(cmd),
+		 "ip netns exec %s ss -tOni | grep -q '%s'",
+		 NS_TEST, "tcp-ulp-mptcp");
+	if (!ASSERT_OK(system(cmd), "No tcp-ulp-mptcp found!"))
+		err++;
+
+	/* Output of nstat:
+	 *
+	 * #kernel
+	 * MPTcpExtMPCapableSYNACKRX       1                  0.0
+	 */
+	snprintf(cmd, sizeof(cmd),
+		 "ip netns exec %s nstat -asz %s | awk '%s' | grep -q '%s'",
+		 NS_TEST, "MPTcpExtMPCapableSYNACKRX",
+		 "NR==1 {next} {print $2}", "1");
+	if (!ASSERT_OK(system(cmd), "No MPTcpExtMPCapableSYNACKRX found!"))
+		err++;
+
+	return err;
+}
+
+static int run_mptcpify(int cgroup_fd)
+{
+	int server_fd, client_fd, err = 0;
+	struct mptcpify *mptcpify_skel;
+
+	mptcpify_skel = mptcpify__open_and_load();
+	if (!ASSERT_OK_PTR(mptcpify_skel, "skel_open_load"))
+		return -EIO;
+
+	err = mptcpify__attach(mptcpify_skel);
+	if (!ASSERT_OK(err, "skel_attach"))
+		goto out;
+
+	/* without MPTCP */
+	server_fd = start_server(AF_INET, SOCK_STREAM, NULL, 0, 0);
+	if (!ASSERT_GE(server_fd, 0, "start_server")) {
+		err = -EIO;
+		goto out;
+	}
+
+	client_fd = connect_to_fd(server_fd, 0);
+	if (!ASSERT_GE(client_fd, 0, "connect to fd")) {
+		err = -EIO;
+		goto close_server;
+	}
+
+	send_byte(client_fd);
+	err = verify_mptcpify();
+
+	close(client_fd);
+close_server:
+	close(server_fd);
+out:
+	mptcpify__destroy(mptcpify_skel);
+	return err;
+}
+
+static void test_mptcpify(void)
+{
+	struct nstoken *nstoken = NULL;
+	int cgroup_fd;
+
+	cgroup_fd = test__join_cgroup("/mptcpify");
+	if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup"))
+		return;
+
+	nstoken = create_netns();
+	if (!ASSERT_OK_PTR(nstoken, "create_netns"))
+		goto fail;
+
+	ASSERT_OK(run_mptcpify(cgroup_fd), "run_mptcpify");
+
+fail:
+	cleanup_netns(nstoken);
+	close(cgroup_fd);
+}
+
 void test_mptcp(void)
 {
 	if (test__start_subtest("base"))
 		test_base();
+	if (test__start_subtest("mptcpify"))
+		test_mptcpify();
 }
diff --git a/tools/testing/selftests/bpf/progs/mptcpify.c b/tools/testing/selftests/bpf/progs/mptcpify.c
new file mode 100644
index 000000000000..53301ae8a8f7
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcpify.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023, SUSE. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include "bpf_tracing_net.h"
+
+char _license[] SEC("license") = "GPL";
+
+SEC("fmod_ret/update_socket_protocol")
+int BPF_PROG(mptcpify, int family, int type, int protocol)
+{
+	if ((family == AF_INET || family == AF_INET6) &&
+	    type == SOCK_STREAM &&
+	    (!protocol || protocol == IPPROTO_TCP)) {
+		return IPPROTO_MPTCP;
+	}
+
+	return protocol;
+}
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 1/5] bpf: Add update_socket_protocol hook
  2023-08-04  5:07 ` [PATCH bpf-next v11 1/5] bpf: Add update_socket_protocol hook Geliang Tang
@ 2023-08-05  0:00   ` Martin KaFai Lau
  0 siblings, 0 replies; 15+ messages in thread
From: Martin KaFai Lau @ 2023-08-05  0:00 UTC (permalink / raw)
  To: Geliang Tang, Paolo Abeni
  Cc: bpf, netdev, mptcp, apparmor, linux-security-module, selinux,
	linux-kselftest, Yonghong Song, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Florent Revest, Brendan Jackman, Matthieu Baerts, Mat Martineau,
	David S. Miller, Eric Dumazet, Jakub Kicinski, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman

On 8/3/23 10:07 PM, Geliang Tang wrote:
> Add a hook named update_socket_protocol in __sys_socket(), for bpf
> progs to attach to and update socket protocol. One user case is to
> force legacy TCP apps to create and use MPTCP sockets instead of
> TCP ones.
> 
> Define a mod_ret set named bpf_mptcp_fmodret_ids, add the hook
> update_socket_protocol into this set, and register it in
> bpf_mptcp_kfunc_init().
> 
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Acked-by: Yonghong Song <yonghong.song@linux.dev>
> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> ---
>   net/mptcp/bpf.c | 15 +++++++++++++++
>   net/socket.c    | 24 ++++++++++++++++++++++++
>   2 files changed, 39 insertions(+)
> 
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index 5a0a84ad94af..8a16672b94e2 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -19,3 +19,18 @@ struct mptcp_sock *bpf_mptcp_sock_from_subflow(struct sock *sk)
>   
>   	return NULL;
>   }
> +
> +BTF_SET8_START(bpf_mptcp_fmodret_ids)
> +BTF_ID_FLAGS(func, update_socket_protocol)
> +BTF_SET8_END(bpf_mptcp_fmodret_ids)
> +
> +static const struct btf_kfunc_id_set bpf_mptcp_fmodret_set = {
> +	.owner = THIS_MODULE,
> +	.set   = &bpf_mptcp_fmodret_ids,
> +};
> +
> +static int __init bpf_mptcp_kfunc_init(void)
> +{
> +	return register_btf_fmodret_id_set(&bpf_mptcp_fmodret_set);
> +}
> +late_initcall(bpf_mptcp_kfunc_init);
> diff --git a/net/socket.c b/net/socket.c
> index 2b0e54b2405c..9f98ced88ac5 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1644,11 +1644,35 @@ struct file *__sys_socket_file(int family, int type, int protocol)
>   	return sock_alloc_file(sock, flags, NULL);
>   }
>   
> +/*	A hook for bpf progs to attach to and update socket protocol.
> + *
> + *	A static noinline declaration here could cause the compiler to
> + *	optimize away the function. A global noinline declaration will
> + *	keep the definition, but may optimize away the callsite.
> + *	Therefore, __weak is needed to ensure that the call is still
> + *	emitted, by telling the compiler that we don't know what the
> + *	function might eventually be.
> + *
> + *	__diag_* below are needed to dismiss the missing prototype warning.
> + */
> +
> +__diag_push();
> +__diag_ignore_all("-Wmissing-prototypes",
> +		  "kfuncs which will be used in BPF programs");
This "kfuns which will be used in BPF programs" piece is not accurate. It is a 
fmod_ret entry point for bpf prog.

> +
> +__weak noinline int update_socket_protocol(int family, int type, int protocol)
> +{
> +	return protocol;
> +}
> +
> +__diag_pop();
> +
>   int __sys_socket(int family, int type, int protocol)
>   {
>   	struct socket *sock;
>   	int flags;
>   
> +	protocol = update_socket_protocol(family, type, protocol);

Paolo, could you help to take another look and ack this patch if it has 
addressed your earlier comment ?

>   	sock = __sys_socket_create(family, type, protocol);
>   	if (IS_ERR(sock))
>   		return PTR_ERR(sock);


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-04  5:07 ` [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp Geliang Tang
@ 2023-08-05  0:23   ` Martin KaFai Lau
  2023-08-07  6:40     ` Geliang Tang
  0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2023-08-05  0:23 UTC (permalink / raw)
  To: Geliang Tang
  Cc: bpf, netdev, mptcp, apparmor, linux-security-module, selinux,
	linux-kselftest, Yonghong Song, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Florent Revest, Brendan Jackman, Matthieu Baerts, Mat Martineau,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	John Johansen, Paul Moore, James Morris, Serge E. Hallyn,
	Stephen Smalley, Eric Paris, Mykola Lysenko, Shuah Khan,
	Simon Horman

On 8/3/23 10:07 PM, Geliang Tang wrote:
> Use rand() to generate a random netns name instead of using the fixed
> name "mptcp_ns" for every test.
> 
> By doing that, we can re-launch the test even if there was an issue
> removing the previous netns or if by accident, a netns with this generic
> name already existed on the system.
> 
> Note that using a different name each will also help adding more
> subtests in future commits.

I run test_progs repeatedly without rebooting qemu to save time. If there is a 
test did not clean up its netns, I would rather uncover it earlier and fix it 
instead. Randomizing the name is hiding the issue and does not help to uncover 
the broken test sooner. Although this change is to mptcp test alone, this could 
be referred in other future tests.

afaik, I don't remember bpf CI ever run into a test failure because the picked 
name had already been used by the system. It seems you ran into this issue a lot 
with the mptcp test in your setup. Could you explain a little more?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 5/5] selftests/bpf: Add mptcpify test
  2023-08-04  5:07 ` [PATCH bpf-next v11 5/5] selftests/bpf: Add mptcpify test Geliang Tang
@ 2023-08-05  0:40   ` Martin KaFai Lau
  0 siblings, 0 replies; 15+ messages in thread
From: Martin KaFai Lau @ 2023-08-05  0:40 UTC (permalink / raw)
  To: Geliang Tang
  Cc: bpf, netdev, mptcp, apparmor, linux-security-module, selinux,
	linux-kselftest, Yonghong Song, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Florent Revest, Brendan Jackman, Matthieu Baerts, Mat Martineau,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	John Johansen, Paul Moore, James Morris, Serge E. Hallyn,
	Stephen Smalley, Eric Paris, Mykola Lysenko, Shuah Khan,
	Simon Horman

On 8/3/23 10:07 PM, Geliang Tang wrote:
> diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
> index 3dc0ba2e7590..e6aafb4cfa8e 100644
> --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
> +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
> @@ -6,6 +6,7 @@
>   #include "cgroup_helpers.h"
>   #include "network_helpers.h"
>   #include "mptcp_sock.skel.h"
> +#include "mptcpify.skel.h"
>   
>   char NS_TEST[32];
>   
> @@ -185,8 +186,105 @@ static void test_base(void)
>   	close(cgroup_fd);
>   }
>   
> +static void send_byte(int fd)
> +{
> +	char b = 0x55;
> +
> +	ASSERT_EQ(write(fd, &b, sizeof(b)), 1, "send single byte");
> +}
> +
> +static int verify_mptcpify(void)
> +{
> +	char cmd[256];
> +	int err = 0;
> +
> +	/* Output of ss:
> +	 *
> +	 * ESTAB 0      0          127.0.0.1:44180    127.0.0.1:42225 cubic
> +	 * ... tcp-ulp-mptcp flags:Mmec ...
> +	 */
> +	snprintf(cmd, sizeof(cmd),
> +		 "ip netns exec %s ss -tOni | grep -q '%s'",
> +		 NS_TEST, "tcp-ulp-mptcp");
> +	if (!ASSERT_OK(system(cmd), "No tcp-ulp-mptcp found!"))
> +		err++;
> +
> +	/* Output of nstat:
> +	 *
> +	 * #kernel
> +	 * MPTcpExtMPCapableSYNACKRX       1                  0.0
> +	 */
> +	snprintf(cmd, sizeof(cmd),
> +		 "ip netns exec %s nstat -asz %s | awk '%s' | grep -q '%s'",
> +		 NS_TEST, "MPTcpExtMPCapableSYNACKRX",
> +		 "NR==1 {next} {print $2}", "1");
> +	if (!ASSERT_OK(system(cmd), "No MPTcpExtMPCapableSYNACKRX found!"))
> +		err++;

The idea is to confirm the protocol has been changed. Is it more direct to use 
getsockopt(SO_PROTOCOL) on the created fd(s)?

> +
> +	return err;
> +}
> +
> +static int run_mptcpify(int cgroup_fd)
> +{
> +	int server_fd, client_fd, err = 0;
> +	struct mptcpify *mptcpify_skel;
> +
> +	mptcpify_skel = mptcpify__open_and_load();
> +	if (!ASSERT_OK_PTR(mptcpify_skel, "skel_open_load"))
> +		return -EIO;

Although the return value does not matter much, -EIO looks weird for the error 
from mptcpify__open_and_load(). May be 'return libbpf_get_error(mptcpify_skel);'

--
pw-bot: cr

> +
> +	err = mptcpify__attach(mptcpify_skel);
> +	if (!ASSERT_OK(err, "skel_attach"))
> +		goto out;
> +
> +	/* without MPTCP */
> +	server_fd = start_server(AF_INET, SOCK_STREAM, NULL, 0, 0);
> +	if (!ASSERT_GE(server_fd, 0, "start_server")) {
> +		err = -EIO;
> +		goto out;
> +	}
> +
> +	client_fd = connect_to_fd(server_fd, 0);
> +	if (!ASSERT_GE(client_fd, 0, "connect to fd")) {
> +		err = -EIO;
> +		goto close_server;
> +	}
> +
> +	send_byte(client_fd);
> +	err = verify_mptcpify();
> +
> +	close(client_fd);
> +close_server:
> +	close(server_fd);
> +out:
> +	mptcpify__destroy(mptcpify_skel);
> +	return err;
> +}
> +


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-05  0:23   ` Martin KaFai Lau
@ 2023-08-07  6:40     ` Geliang Tang
  2023-08-09  6:03       ` Martin KaFai Lau
  0 siblings, 1 reply; 15+ messages in thread
From: Geliang Tang @ 2023-08-07  6:40 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman, bpf, netdev,
	mptcp, apparmor, linux-security-module, selinux, linux-kselftest

On Fri, Aug 04, 2023 at 05:23:32PM -0700, Martin KaFai Lau wrote:
> On 8/3/23 10:07 PM, Geliang Tang wrote:
> > Use rand() to generate a random netns name instead of using the fixed
> > name "mptcp_ns" for every test.
> > 
> > By doing that, we can re-launch the test even if there was an issue
> > removing the previous netns or if by accident, a netns with this generic
> > name already existed on the system.
> > 
> > Note that using a different name each will also help adding more
> > subtests in future commits.

Hi Martin,

I tried to run mptcp tests simultaneously, and got "Cannot create
namespace file "/var/run/netns/mptcp_ns": File exists" errors sometimes.
So I add this patch to fix it.

It's easy to reproduce, just run this commands in multiple terminals:
 > for i in `seq 1 100`; do sudo ./test_progs -t mptcp; done

> 
> I run test_progs repeatedly without rebooting qemu to save time. If there is
> a test did not clean up its netns, I would rather uncover it earlier and fix
> it instead. Randomizing the name is hiding the issue and does not help to
> uncover the broken test sooner. Although this change is to mptcp test alone,
> this could be referred in other future tests.

I added "ip netns show" after "ip netns del" in v12 to check if there is
a test did not clean up its netns.

Thanks,
-Geliang

> 
> afaik, I don't remember bpf CI ever run into a test failure because the
> picked name had already been used by the system. It seems you ran into this
> issue a lot with the mptcp test in your setup. Could you explain a little
> more?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-07  6:40     ` Geliang Tang
@ 2023-08-09  6:03       ` Martin KaFai Lau
  2023-08-09  8:19         ` Geliang Tang
  0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2023-08-09  6:03 UTC (permalink / raw)
  To: Geliang Tang
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Florent Revest, Brendan Jackman,
	Matthieu Baerts, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, John Johansen, Paul Moore,
	James Morris, Serge E. Hallyn, Stephen Smalley, Eric Paris,
	Mykola Lysenko, Shuah Khan, Simon Horman, bpf, netdev, mptcp,
	apparmor, linux-security-module, selinux, linux-kselftest

On 8/6/23 11:40 PM, Geliang Tang wrote:
> On Fri, Aug 04, 2023 at 05:23:32PM -0700, Martin KaFai Lau wrote:
>> On 8/3/23 10:07 PM, Geliang Tang wrote:
>>> Use rand() to generate a random netns name instead of using the fixed
>>> name "mptcp_ns" for every test.
>>>
>>> By doing that, we can re-launch the test even if there was an issue
>>> removing the previous netns or if by accident, a netns with this generic
>>> name already existed on the system.
>>>
>>> Note that using a different name each will also help adding more
>>> subtests in future commits.
> 
> Hi Martin,
> 
> I tried to run mptcp tests simultaneously, and got "Cannot create
> namespace file "/var/run/netns/mptcp_ns": File exists" errors sometimes.
> So I add this patch to fix it.
> 
> It's easy to reproduce, just run this commands in multiple terminals:
>   > for i in `seq 1 100`; do sudo ./test_progs -t mptcp; done

Not only the "-t mptcp" test. Other tests in test_progs also don't support 
running parallel in multiple terminals. Does it really help to test the bpf part 
of the prog_tests/mptcp.c test by running like this? If it wants to exercise the 
other mptcp networking specific code like this, a separate mptcp test is needed 
outside of test_progs and it won't be run in the bpf CI.

If you agree, can you please avoid introducing unnecessary randomness to the 
test_progs where bpf CI and most users don't run in this way?

Also, please don't resend the patches too fast until the discussion is 
concluded. Please give reasonable time for others to reply.

I have a high level question. In LPC 2022 
(https://lpc.events/event/16/contributions/1354/), I recall there was idea in 
using bpf to make other mptcp decision/policy. Any thought and progress on this? 
This set which only uses bpf to change the protocol feels like an incomplete 
solution.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-09  6:03       ` Martin KaFai Lau
@ 2023-08-09  8:19         ` Geliang Tang
  2023-08-11  5:53           ` Martin KaFai Lau
  0 siblings, 1 reply; 15+ messages in thread
From: Geliang Tang @ 2023-08-09  8:19 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman, bpf, netdev,
	mptcp, apparmor, linux-security-module, selinux, linux-kselftest

On Tue, Aug 08, 2023 at 11:03:30PM -0700, Martin KaFai Lau wrote:
> On 8/6/23 11:40 PM, Geliang Tang wrote:
> > On Fri, Aug 04, 2023 at 05:23:32PM -0700, Martin KaFai Lau wrote:
> > > On 8/3/23 10:07 PM, Geliang Tang wrote:
> > > > Use rand() to generate a random netns name instead of using the fixed
> > > > name "mptcp_ns" for every test.
> > > > 
> > > > By doing that, we can re-launch the test even if there was an issue
> > > > removing the previous netns or if by accident, a netns with this generic
> > > > name already existed on the system.
> > > > 
> > > > Note that using a different name each will also help adding more
> > > > subtests in future commits.
> > 
> > Hi Martin,
> > 
> > I tried to run mptcp tests simultaneously, and got "Cannot create
> > namespace file "/var/run/netns/mptcp_ns": File exists" errors sometimes.
> > So I add this patch to fix it.
> > 
> > It's easy to reproduce, just run this commands in multiple terminals:
> >   > for i in `seq 1 100`; do sudo ./test_progs -t mptcp; done
> 
> Not only the "-t mptcp" test. Other tests in test_progs also don't support
> running parallel in multiple terminals. Does it really help to test the bpf
> part of the prog_tests/mptcp.c test by running like this? If it wants to
> exercise the other mptcp networking specific code like this, a separate
> mptcp test is needed outside of test_progs and it won't be run in the bpf
> CI.
> 
> If you agree, can you please avoid introducing unnecessary randomness to the
> test_progs where bpf CI and most users don't run in this way?

Thanks Martin. Sure, I agree. Let's drop this patch.

> 
> Also, please don't resend the patches too fast until the discussion is
> concluded. Please give reasonable time for others to reply.

Sure. Please give me a clear reminder next time that it's time to resend
a new version of the patches.

> 
> I have a high level question. In LPC 2022
> (https://lpc.events/event/16/contributions/1354/), I recall there was idea
> in using bpf to make other mptcp decision/policy. Any thought and progress
> on this? This set which only uses bpf to change the protocol feels like an
> incomplete solution.

We are implementing MPTCP packet scheduler using BPF. Patches aren't
sent to BPF mail list yet, only temporarily on our mptcp repo[1].

Here are the patches:

 selftests/bpf: Add bpf_burst test
 selftests/bpf: Add bpf_burst scheduler
 bpf: Export more bpf_burst related functions
 selftests/bpf: Add bpf_red test
 selftests/bpf: Add bpf_red scheduler
 selftests/bpf: Add bpf_rr test
 selftests/bpf: Add bpf_rr scheduler
 selftests/bpf: Add bpf_bkup test
 selftests/bpf: Add bpf_bkup scheduler
 selftests/bpf: Add bpf_first test
 selftests/bpf: Add bpf_first scheduler
 selftests/bpf: Add bpf scheduler test
 selftests/bpf: add two mptcp netns helpers
 selftests/bpf: use random netns name for mptcp
 selftests/bpf: Add mptcp sched structs
 bpf: Add bpf_mptcp_sched_kfunc_set
 bpf: Add bpf_mptcp_sched_ops

If you could take a look at these patches in advance, I would greatly
appreciate it. Any feedback is welcome.

[1]
https://github.com/multipath-tcp/mptcp_net-next.git

-Geliang

> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-09  8:19         ` Geliang Tang
@ 2023-08-11  5:53           ` Martin KaFai Lau
  2023-08-11  9:29             ` Geliang Tang
  0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2023-08-11  5:53 UTC (permalink / raw)
  To: Geliang Tang
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Florent Revest, Brendan Jackman,
	Matthieu Baerts, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, John Johansen, Paul Moore,
	James Morris, Serge E. Hallyn, Stephen Smalley, Eric Paris,
	Mykola Lysenko, Shuah Khan, Simon Horman, bpf, netdev, mptcp,
	apparmor, linux-security-module, selinux, linux-kselftest

On 8/9/23 1:19 AM, Geliang Tang wrote:
> On Tue, Aug 08, 2023 at 11:03:30PM -0700, Martin KaFai Lau wrote:
>> On 8/6/23 11:40 PM, Geliang Tang wrote:
>>> On Fri, Aug 04, 2023 at 05:23:32PM -0700, Martin KaFai Lau wrote:
>>>> On 8/3/23 10:07 PM, Geliang Tang wrote:
>>>>> Use rand() to generate a random netns name instead of using the fixed
>>>>> name "mptcp_ns" for every test.
>>>>>
>>>>> By doing that, we can re-launch the test even if there was an issue
>>>>> removing the previous netns or if by accident, a netns with this generic
>>>>> name already existed on the system.
>>>>>
>>>>> Note that using a different name each will also help adding more
>>>>> subtests in future commits.
>>>
>>> Hi Martin,
>>>
>>> I tried to run mptcp tests simultaneously, and got "Cannot create
>>> namespace file "/var/run/netns/mptcp_ns": File exists" errors sometimes.
>>> So I add this patch to fix it.
>>>
>>> It's easy to reproduce, just run this commands in multiple terminals:
>>>    > for i in `seq 1 100`; do sudo ./test_progs -t mptcp; done
>>
>> Not only the "-t mptcp" test. Other tests in test_progs also don't support
>> running parallel in multiple terminals. Does it really help to test the bpf
>> part of the prog_tests/mptcp.c test by running like this? If it wants to
>> exercise the other mptcp networking specific code like this, a separate
>> mptcp test is needed outside of test_progs and it won't be run in the bpf
>> CI.
>>
>> If you agree, can you please avoid introducing unnecessary randomness to the
>> test_progs where bpf CI and most users don't run in this way?
> 
> Thanks Martin. Sure, I agree. Let's drop this patch.

Thanks you.

>> I have a high level question. In LPC 2022
>> (https://lpc.events/event/16/contributions/1354/), I recall there was idea
>> in using bpf to make other mptcp decision/policy. Any thought and progress
>> on this? This set which only uses bpf to change the protocol feels like an
>> incomplete solution.
> 
> We are implementing MPTCP packet scheduler using BPF. Patches aren't
> sent to BPF mail list yet, only temporarily on our mptcp repo[1].
> 
> Here are the patches:
> 
>   selftests/bpf: Add bpf_burst test
>   selftests/bpf: Add bpf_burst scheduler
>   bpf: Export more bpf_burst related functions
>   selftests/bpf: Add bpf_red test
>   selftests/bpf: Add bpf_red scheduler
>   selftests/bpf: Add bpf_rr test
>   selftests/bpf: Add bpf_rr scheduler
>   selftests/bpf: Add bpf_bkup test
>   selftests/bpf: Add bpf_bkup scheduler
>   selftests/bpf: Add bpf_first test
>   selftests/bpf: Add bpf_first scheduler
>   selftests/bpf: Add bpf scheduler test
>   selftests/bpf: add two mptcp netns helpers
>   selftests/bpf: use random netns name for mptcp
>   selftests/bpf: Add mptcp sched structs
>   bpf: Add bpf_mptcp_sched_kfunc_set
>   bpf: Add bpf_mptcp_sched_ops
> 
> If you could take a look at these patches in advance, I would greatly
> appreciate it. Any feedback is welcome.
> 
> [1]
> https://github.com/multipath-tcp/mptcp_net-next.git

Thanks for sharing. I did not go into the details. iiuc, the scheduler is 
specific to a namespace. Do you see if it is useful to have more finer control 
like depending on what IP address it is connected to? BPF policy is usually 
found more useful to have finer policy control than global or per-netns.

The same question goes for the fmod_ret here in this patch. The progs/mptcpify.c 
selftest is as good as upgrading all TCP connections. Is it your only use case 
and no need for finer selection?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-11  5:53           ` Martin KaFai Lau
@ 2023-08-11  9:29             ` Geliang Tang
  2023-08-11 18:50               ` Martin KaFai Lau
  0 siblings, 1 reply; 15+ messages in thread
From: Geliang Tang @ 2023-08-11  9:29 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Florent Revest,
	Brendan Jackman, Matthieu Baerts, Mat Martineau, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Johansen,
	Paul Moore, James Morris, Serge E. Hallyn, Stephen Smalley,
	Eric Paris, Mykola Lysenko, Shuah Khan, Simon Horman, bpf, netdev,
	mptcp, apparmor, linux-security-module, selinux, linux-kselftest

On Thu, Aug 10, 2023 at 10:53:38PM -0700, Martin KaFai Lau wrote:
> On 8/9/23 1:19 AM, Geliang Tang wrote:
> > On Tue, Aug 08, 2023 at 11:03:30PM -0700, Martin KaFai Lau wrote:
> > > On 8/6/23 11:40 PM, Geliang Tang wrote:
> > > > On Fri, Aug 04, 2023 at 05:23:32PM -0700, Martin KaFai Lau wrote:
> > > > > On 8/3/23 10:07 PM, Geliang Tang wrote:
> > > > > > Use rand() to generate a random netns name instead of using the fixed
> > > > > > name "mptcp_ns" for every test.
> > > > > > 
> > > > > > By doing that, we can re-launch the test even if there was an issue
> > > > > > removing the previous netns or if by accident, a netns with this generic
> > > > > > name already existed on the system.
> > > > > > 
> > > > > > Note that using a different name each will also help adding more
> > > > > > subtests in future commits.
> > > > 
> > > > Hi Martin,
> > > > 
> > > > I tried to run mptcp tests simultaneously, and got "Cannot create
> > > > namespace file "/var/run/netns/mptcp_ns": File exists" errors sometimes.
> > > > So I add this patch to fix it.
> > > > 
> > > > It's easy to reproduce, just run this commands in multiple terminals:
> > > >    > for i in `seq 1 100`; do sudo ./test_progs -t mptcp; done
> > > 
> > > Not only the "-t mptcp" test. Other tests in test_progs also don't support
> > > running parallel in multiple terminals. Does it really help to test the bpf
> > > part of the prog_tests/mptcp.c test by running like this? If it wants to
> > > exercise the other mptcp networking specific code like this, a separate
> > > mptcp test is needed outside of test_progs and it won't be run in the bpf
> > > CI.
> > > 
> > > If you agree, can you please avoid introducing unnecessary randomness to the
> > > test_progs where bpf CI and most users don't run in this way?
> > 
> > Thanks Martin. Sure, I agree. Let's drop this patch.
> 
> Thanks you.
> 
> > > I have a high level question. In LPC 2022
> > > (https://lpc.events/event/16/contributions/1354/), I recall there was idea
> > > in using bpf to make other mptcp decision/policy. Any thought and progress
> > > on this? This set which only uses bpf to change the protocol feels like an
> > > incomplete solution.
> > 
> > We are implementing MPTCP packet scheduler using BPF. Patches aren't
> > sent to BPF mail list yet, only temporarily on our mptcp repo[1].
> > 
> > Here are the patches:
> > 
> >   selftests/bpf: Add bpf_burst test
> >   selftests/bpf: Add bpf_burst scheduler
> >   bpf: Export more bpf_burst related functions
> >   selftests/bpf: Add bpf_red test
> >   selftests/bpf: Add bpf_red scheduler
> >   selftests/bpf: Add bpf_rr test
> >   selftests/bpf: Add bpf_rr scheduler
> >   selftests/bpf: Add bpf_bkup test
> >   selftests/bpf: Add bpf_bkup scheduler
> >   selftests/bpf: Add bpf_first test
> >   selftests/bpf: Add bpf_first scheduler
> >   selftests/bpf: Add bpf scheduler test
> >   selftests/bpf: add two mptcp netns helpers
> >   selftests/bpf: use random netns name for mptcp
> >   selftests/bpf: Add mptcp sched structs
> >   bpf: Add bpf_mptcp_sched_kfunc_set
> >   bpf: Add bpf_mptcp_sched_ops
> > 
> > If you could take a look at these patches in advance, I would greatly
> > appreciate it. Any feedback is welcome.
> > 
> > [1]
> > https://github.com/multipath-tcp/mptcp_net-next.git
> 
> Thanks for sharing. I did not go into the details. iiuc, the scheduler is
> specific to a namespace. Do you see if it is useful to have more finer
> control like depending on what IP address it is connected to? BPF policy is
> usually found more useful to have finer policy control than global or
> per-netns.
> 
> The same question goes for the fmod_ret here in this patch. The
> progs/mptcpify.c selftest is as good as upgrading all TCP connections. Is it
> your only use case and no need for finer selection?

This per-netns control is just the first step. We do need finer selection. The
most ideal mode is to select one app to upgrade it's TCP connections only. So
per-cgroup control is much better than per-netns. But we haven't found a good
per-cgroup solution yet.

Thanks,
-Geliang

> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp
  2023-08-11  9:29             ` Geliang Tang
@ 2023-08-11 18:50               ` Martin KaFai Lau
  0 siblings, 0 replies; 15+ messages in thread
From: Martin KaFai Lau @ 2023-08-11 18:50 UTC (permalink / raw)
  To: Geliang Tang
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Florent Revest, Brendan Jackman,
	Matthieu Baerts, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, John Johansen, Paul Moore,
	James Morris, Serge E. Hallyn, Stephen Smalley, Eric Paris,
	Mykola Lysenko, Shuah Khan, Simon Horman, bpf, netdev, mptcp,
	apparmor, linux-security-module, selinux, linux-kselftest

On 8/11/23 2:29 AM, Geliang Tang wrote:
> On Thu, Aug 10, 2023 at 10:53:38PM -0700, Martin KaFai Lau wrote:
>> On 8/9/23 1:19 AM, Geliang Tang wrote:
>>> On Tue, Aug 08, 2023 at 11:03:30PM -0700, Martin KaFai Lau wrote:
>>>> On 8/6/23 11:40 PM, Geliang Tang wrote:
>>>>> On Fri, Aug 04, 2023 at 05:23:32PM -0700, Martin KaFai Lau wrote:
>>>>>> On 8/3/23 10:07 PM, Geliang Tang wrote:
>>>>>>> Use rand() to generate a random netns name instead of using the fixed
>>>>>>> name "mptcp_ns" for every test.
>>>>>>>
>>>>>>> By doing that, we can re-launch the test even if there was an issue
>>>>>>> removing the previous netns or if by accident, a netns with this generic
>>>>>>> name already existed on the system.
>>>>>>>
>>>>>>> Note that using a different name each will also help adding more
>>>>>>> subtests in future commits.
>>>>>
>>>>> Hi Martin,
>>>>>
>>>>> I tried to run mptcp tests simultaneously, and got "Cannot create
>>>>> namespace file "/var/run/netns/mptcp_ns": File exists" errors sometimes.
>>>>> So I add this patch to fix it.
>>>>>
>>>>> It's easy to reproduce, just run this commands in multiple terminals:
>>>>>     > for i in `seq 1 100`; do sudo ./test_progs -t mptcp; done
>>>>
>>>> Not only the "-t mptcp" test. Other tests in test_progs also don't support
>>>> running parallel in multiple terminals. Does it really help to test the bpf
>>>> part of the prog_tests/mptcp.c test by running like this? If it wants to
>>>> exercise the other mptcp networking specific code like this, a separate
>>>> mptcp test is needed outside of test_progs and it won't be run in the bpf
>>>> CI.
>>>>
>>>> If you agree, can you please avoid introducing unnecessary randomness to the
>>>> test_progs where bpf CI and most users don't run in this way?
>>>
>>> Thanks Martin. Sure, I agree. Let's drop this patch.
>>
>> Thanks you.
>>
>>>> I have a high level question. In LPC 2022
>>>> (https://lpc.events/event/16/contributions/1354/), I recall there was idea
>>>> in using bpf to make other mptcp decision/policy. Any thought and progress
>>>> on this? This set which only uses bpf to change the protocol feels like an
>>>> incomplete solution.
>>>
>>> We are implementing MPTCP packet scheduler using BPF. Patches aren't
>>> sent to BPF mail list yet, only temporarily on our mptcp repo[1].
>>>
>>> Here are the patches:
>>>
>>>    selftests/bpf: Add bpf_burst test
>>>    selftests/bpf: Add bpf_burst scheduler
>>>    bpf: Export more bpf_burst related functions
>>>    selftests/bpf: Add bpf_red test
>>>    selftests/bpf: Add bpf_red scheduler
>>>    selftests/bpf: Add bpf_rr test
>>>    selftests/bpf: Add bpf_rr scheduler
>>>    selftests/bpf: Add bpf_bkup test
>>>    selftests/bpf: Add bpf_bkup scheduler
>>>    selftests/bpf: Add bpf_first test
>>>    selftests/bpf: Add bpf_first scheduler
>>>    selftests/bpf: Add bpf scheduler test
>>>    selftests/bpf: add two mptcp netns helpers
>>>    selftests/bpf: use random netns name for mptcp
>>>    selftests/bpf: Add mptcp sched structs
>>>    bpf: Add bpf_mptcp_sched_kfunc_set
>>>    bpf: Add bpf_mptcp_sched_ops
>>>
>>> If you could take a look at these patches in advance, I would greatly
>>> appreciate it. Any feedback is welcome.
>>>
>>> [1]
>>> https://github.com/multipath-tcp/mptcp_net-next.git
>>
>> Thanks for sharing. I did not go into the details. iiuc, the scheduler is
>> specific to a namespace. Do you see if it is useful to have more finer
>> control like depending on what IP address it is connected to? BPF policy is
>> usually found more useful to have finer policy control than global or
>> per-netns.
>>
>> The same question goes for the fmod_ret here in this patch. The
>> progs/mptcpify.c selftest is as good as upgrading all TCP connections. Is it
>> your only use case and no need for finer selection?
> 
> This per-netns control is just the first step. We do need finer selection. The
> most ideal mode is to select one app to upgrade it's TCP connections only. So
> per-cgroup control is much better than per-netns. But we haven't found a good
> per-cgroup solution yet.

Selecting an app or cgroup can sort of be done by getting the current task or 
current cgroup (there is helper to do that). I am imagining eventually it will 
want to decide the protocol upgrade and/or the mptcp-scheduler when the 
destination IP is decided. This fmod_ret upgrade for all acts like a global knob 
(sysctl) and feels like a hack or at least incomplete. However, I also don't see 
a clean way to do that for now in the current shape.

Please respin another revision to address the earlier selftest comment on the 
netns name. Thanks.



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-08-11 18:50 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-04  5:07 [PATCH bpf-next v11 0/5] bpf: Force to MPTCP Geliang Tang
2023-08-04  5:07 ` [PATCH bpf-next v11 1/5] bpf: Add update_socket_protocol hook Geliang Tang
2023-08-05  0:00   ` Martin KaFai Lau
2023-08-04  5:07 ` [PATCH bpf-next v11 2/5] selftests/bpf: Use random netns name for mptcp Geliang Tang
2023-08-05  0:23   ` Martin KaFai Lau
2023-08-07  6:40     ` Geliang Tang
2023-08-09  6:03       ` Martin KaFai Lau
2023-08-09  8:19         ` Geliang Tang
2023-08-11  5:53           ` Martin KaFai Lau
2023-08-11  9:29             ` Geliang Tang
2023-08-11 18:50               ` Martin KaFai Lau
2023-08-04  5:07 ` [PATCH bpf-next v11 3/5] selftests/bpf: Add two mptcp netns helpers Geliang Tang
2023-08-04  5:07 ` [PATCH bpf-next v11 4/5] selftests/bpf: Drop unneeded checks for mptcp Geliang Tang
2023-08-04  5:07 ` [PATCH bpf-next v11 5/5] selftests/bpf: Add mptcpify test Geliang Tang
2023-08-05  0:40   ` Martin KaFai Lau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).