* [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support"
@ 2024-12-11 3:11 Geliang Tang
2024-12-11 4:17 ` MPTCP CI
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Geliang Tang @ 2024-12-11 3:11 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
v5:
- update subjects and commit logs as Matt suggested (thanks).
- avoid repeating checks in patch 1.
- do stricter checks on subflows in patch 3.
v4:
- CI reports the following BUILD_BUG_ON fails on i386:
BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) !=
sizeof(struct bpf_iter_mptcp_subflow))
Just like in bpf_iter_task_new(), change this "!=" to ">".
v3:
- check sock_owned_by_user_nocheck(sk)/spin_is_locked(&sk->sk_lock.slock),
instead of lockdep_sock_is_held(sk).
- add "sizeof" and "alignof" checks.
- drop bpf_mptcp_sk() and bpf_mptcp_subflow_tcp_sock() definitions. Use
bpf_skc_to_mptcp_sock() and mptcp_subflow_tcp_sock() in mptcp_subflow
bpf_iter selftests instead.
v2:
- add CONFIG_LOCKDEP check in patch 2 to fix the build error reported
by CI.
Address Martin's comments in v1.
Geliang Tang (5):
bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock
bpf: Allow use of skc_to_mptcp_sock in cg_sockopt
Squash to "bpf: Register mptcp common kfunc set"
Squash to "bpf: Add mptcp_subflow bpf_iter"
Squash to "selftests/bpf: Add mptcp_subflow bpf_iter subtest"
include/net/mptcp.h | 4 +-
kernel/bpf/cgroup.c | 2 +
net/core/filter.c | 2 +-
net/mptcp/bpf.c | 41 +++++++++++--------
tools/testing/selftests/bpf/progs/mptcp_bpf.h | 1 -
.../selftests/bpf/progs/mptcp_bpf_iters.c | 11 +++--
6 files changed, 33 insertions(+), 28 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support"
2024-12-11 3:11 Geliang Tang
@ 2024-12-11 4:17 ` MPTCP CI
2024-12-16 12:06 ` Matthieu Baerts
2024-12-16 12:27 ` Matthieu Baerts
2 siblings, 0 replies; 18+ messages in thread
From: MPTCP CI @ 2024-12-11 4:17 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp
Hi Geliang,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/12269151441
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/ac38a98bb667
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=916627
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support"
2024-12-11 3:11 Geliang Tang
2024-12-11 4:17 ` MPTCP CI
@ 2024-12-16 12:06 ` Matthieu Baerts
2024-12-16 12:27 ` Matthieu Baerts
2 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts @ 2024-12-16 12:06 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: Geliang Tang
Hi Geliang,
On 11/12/2024 04:11, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> v5:
> - update subjects and commit logs as Matt suggested (thanks).
> - avoid repeating checks in patch 1.
> - do stricter checks on subflows in patch 3.
Thank you for the new version, it looks good to me!
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support"
2024-12-11 3:11 Geliang Tang
2024-12-11 4:17 ` MPTCP CI
2024-12-16 12:06 ` Matthieu Baerts
@ 2024-12-16 12:27 ` Matthieu Baerts
2 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts @ 2024-12-16 12:27 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: Geliang Tang
Hi Geliang,
On 11/12/2024 04:11, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> v5:
> - update subjects and commit logs as Matt suggested (thanks).
> - avoid repeating checks in patch 1.
> - do stricter checks on subflows in patch 3.
(...)
> Geliang Tang (5):
> bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock
> bpf: Allow use of skc_to_mptcp_sock in cg_sockopt
> Squash to "bpf: Register mptcp common kfunc set"
> Squash to "bpf: Add mptcp_subflow bpf_iter"
> Squash to "selftests/bpf: Add mptcp_subflow bpf_iter subtest"
Now in our tree (feat. for others):
New patches for t/upstream:
- 3bf01409a687: bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock
- d2ffc05b82a9: bpf: Allow use of skc_to_mptcp_sock in cg_sockopt
- 9b47f4d07c92: conflict in t/mptcp-add-bpf_mptcp_sched_ops
- Results: 498a056df60d..dae99806910c (export)
- faea6b4bce3f: "squashed" (with conflicts) patch 3/5 in "bpf: Register
mptcp common kfunc set"
- 7d181ebbf2e9: conflict in t/bpf-Add-mptcp_subflow-bpf_iter
- 53df7927f9db: "squashed" patch 4/5 in "bpf: Add mptcp_subflow bpf_iter"
- 0c9027dda516: "squashed" patch 5/5 in "selftests/bpf: Add
mptcp_subflow bpf_iter subtest"
- Results: dae99806910c..e296bce75fd2 (export)
Tests are now in progress:
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/11567dcf5c1dc57b7447ad68efcdac83d90d28c2/checks
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support"
@ 2025-03-03 10:33 Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 1/5] mptcp: add bpf_iter_task for mptcp_sock Geliang Tang
` (6 more replies)
0 siblings, 7 replies; 18+ messages in thread
From: Geliang Tang @ 2025-03-03 10:33 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
v5:
- check bpf_iter_task in mptcp_subflow_new() as Mat suggested.
v4:
- drop sock_owned_by_user_nocheck and spin_is_locked. According to
comments from Mat and Martin, in this set mptcp_subflow
bpf_iter only used from a cg sockopt bpf prog, no need to add these
check at this moment.
v3:
- patch 3, continue to use sock_owned_by_user_nocheck() and spin_is_locked()
checks instead of using msk_owned_by_me().
- patch 5, drop declaration of bpf_mptcp_subflow_tcp_sock. It's no longer
used.
- patch 5, update the comment for mptcp_subflow_tcp_sock(), which is a BPF
helper, not a kfunc.
The commit log of "bpf: Register mptcp common kfunc set" doesn't match the
code, please update it as:
'''
bpf: Register mptcp common kfunc set
MPTCP helper mptcp_subflow_ctx() is used to convert struct sock to
struct mptcp_subflow_context. It will be used in MPTCP BPF programs.
This patch defines corresponding wrapper of this helper, and put it
into the newly defined mptcp common kfunc set and register this set
with the flag BPF_PROG_TYPE_CGROUP_SOCKOPT to let it accessible to
the 'cgroup/getsockopt' type of BPF programs.
'''
v2:
- Drop bpf_skc_to_mptcp_sock
- Check the owner before assigning the msk as Mat suggested.
- Use bpf_core_cast() in mptcp_subflow bpf_iter subtest instead of
using bpf_skc_to_mptcp_sock().
Address Martin's suggestions for "Add mptcp_subflow bpf_iter support" v2.
Geliang Tang (5):
mptcp: add bpf_iter_task for mptcp_sock
Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock"
Squash to "bpf: Add mptcp_subflow bpf_iter"
Revert "bpf: Acquire and release mptcp socket"
Squash to "selftests/bpf: Add mptcp_subflow bpf_iter subtest"
net/mptcp/bpf.c | 56 +++++++++----------
net/mptcp/protocol.c | 1 +
net/mptcp/protocol.h | 16 ++++++
.../testing/selftests/bpf/bpf_experimental.h | 2 +-
tools/testing/selftests/bpf/progs/mptcp_bpf.h | 5 --
.../selftests/bpf/progs/mptcp_bpf_iters.c | 8 +--
6 files changed, 47 insertions(+), 41 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH mptcp-next v5 1/5] mptcp: add bpf_iter_task for mptcp_sock
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
@ 2025-03-03 10:33 ` Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock" Geliang Tang
` (5 subsequent siblings)
6 siblings, 0 replies; 18+ messages in thread
From: Geliang Tang @ 2025-03-03 10:33 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang, Mat Martineau
From: Geliang Tang <tanggeliang@kylinos.cn>
To make sure the mptcp_subflow bpf_iter is running in the
MPTCP context. This patch adds a simplified version of tracking
for it:
1. Add a 'struct task_struct *bpf_iter_task' field to struct
mptcp_sock.
2. Do a WRITE_ONCE(msk->bpf_iter_task, current) before calling
a MPTCP BPF hook, and WRITE_ONCE(msk->bpf_iter_task, NULL) after
the hook returns.
3. In bpf_iter_mptcp_subflow_new(), check
"READ_ONCE(msk->bpf_scheduler_task) == current"
to confirm the correct task, return -EINVAL if it doesn't match.
Also creates helpers for setting, clearing and checking that value.
Suggested-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
net/mptcp/protocol.c | 1 +
net/mptcp/protocol.h | 16 ++++++++++++++++
2 files changed, 17 insertions(+)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2b48cf648346..fec776c23fcd 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2726,6 +2726,7 @@ static void __mptcp_init_sock(struct sock *sk)
msk->scaling_ratio = TCP_DEFAULT_SCALING_RATIO;
WRITE_ONCE(msk->first, NULL);
+ WRITE_ONCE(msk->bpf_iter_task, NULL);
inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss;
WRITE_ONCE(msk->csum_enabled, mptcp_is_checksum_enabled(sock_net(sk)));
WRITE_ONCE(msk->allow_infinite_fallback, true);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index ef1d43406f9b..836891bc28d5 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -327,6 +327,7 @@ struct mptcp_sock {
struct list_head conn_list;
struct list_head rtx_queue;
struct mptcp_data_frag *first_pending;
+ struct task_struct *bpf_iter_task;
struct list_head join_list;
struct sock *first; /* The mptcp ops can safely dereference, using suitable
* ONCE annotation, the subflow outside the socket
@@ -1291,4 +1292,19 @@ mptcp_token_join_cookie_init_state(struct mptcp_subflow_request_sock *subflow_re
static inline void mptcp_join_cookie_init(void) {}
#endif
+static inline void mptcp_set_bpf_iter_task(struct mptcp_sock *msk)
+{
+ WRITE_ONCE(msk->bpf_iter_task, current);
+}
+
+static inline void mptcp_clear_bpf_iter_task(struct mptcp_sock *msk)
+{
+ WRITE_ONCE(msk->bpf_iter_task, NULL);
+}
+
+static inline struct task_struct *mptcp_get_bpf_iter_task(struct mptcp_sock *msk)
+{
+ return READ_ONCE(msk->bpf_iter_task);
+}
+
#endif /* __MPTCP_PROTOCOL_H */
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock"
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 1/5] mptcp: add bpf_iter_task for mptcp_sock Geliang Tang
@ 2025-03-03 10:33 ` Geliang Tang
2025-03-04 1:37 ` Mat Martineau
2025-03-03 10:33 ` [PATCH mptcp-next v5 3/5] Squash to "bpf: Add mptcp_subflow bpf_iter" Geliang Tang
` (4 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Geliang Tang @ 2025-03-03 10:33 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
Set msk->bpf_iter_task in bpf_mptcp_sock_from_sock() to allow
mptcp_subflow bpt_iter can be used in cgroup/getsockopt,
otherwise, the selftest in this set fails.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
net/mptcp/bpf.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index be222fa5f308..aff92f458e7f 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -197,14 +197,22 @@ static struct bpf_struct_ops bpf_mptcp_sched_ops = {
struct mptcp_sock *bpf_mptcp_sock_from_sock(struct sock *sk)
{
+ struct mptcp_sock *msk;
+
if (unlikely(!sk || !sk_fullsock(sk)))
return NULL;
- if (sk->sk_protocol == IPPROTO_MPTCP)
- return mptcp_sk(sk);
+ if (sk->sk_protocol == IPPROTO_MPTCP) {
+ msk = mptcp_sk(sk);
+ mptcp_set_bpf_iter_task(msk);
+ return msk;
+ }
- if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
- return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
+ if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
+ msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
+ mptcp_set_bpf_iter_task(msk);
+ return msk;
+ }
return NULL;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH mptcp-next v5 3/5] Squash to "bpf: Add mptcp_subflow bpf_iter"
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 1/5] mptcp: add bpf_iter_task for mptcp_sock Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock" Geliang Tang
@ 2025-03-03 10:33 ` Geliang Tang
2025-03-04 1:39 ` Mat Martineau
2025-03-03 10:33 ` [PATCH mptcp-next v5 4/5] Revert "bpf: Acquire and release mptcp socket" Geliang Tang
` (3 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Geliang Tang @ 2025-03-03 10:33 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
Drop the NULL check for 'msk' as Martin suggested, add more checks
for 'sk'.
Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as the
argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
v5:
- check bpf_iter_task in mptcp_subflow_new()
v4:
- drop sock_owned_by_user_nocheck and spin_is_locked. According to
comments from Mat [2] and Martin [1], in this set mptcp_subflow
bpf_iter only used from a cg sockopt bpf prog, no need to add these
check at this moment.
[1]
https://lore.kernel.org/all/fdf0ddbe-e007-4a5f-bbdf-9a144e8fbe35@linux.dev/
[2]
https://patchwork.kernel.org/project/mptcp/patch/f6469225598beecbf0bda12a4c33fafa86c0ff15.1739787744.git.tanggeliang@kylinos.cn/
v3:
- continue to use sock_owned_by_user_nocheck and spin_is_locked
checks instead of using msk_owned_by_me().
v2:
- check the owner before assigning the msk as Mat suggested.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
net/mptcp/bpf.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index aff92f458e7f..a307490bb20e 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -249,24 +249,33 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
__bpf_kfunc static int
bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
- struct mptcp_sock *msk)
+ struct sock *sk)
{
struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
- struct sock *sk = (struct sock *)msk;
+ struct task_struct *task;
+ struct mptcp_sock *msk;
BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
sizeof(struct bpf_iter_mptcp_subflow));
BUILD_BUG_ON(__alignof__(struct bpf_iter_mptcp_subflow_kern) !=
__alignof__(struct bpf_iter_mptcp_subflow));
- kit->msk = msk;
- if (!msk)
+ if (unlikely(!sk || !sk_fullsock(sk)))
return -EINVAL;
- if (!sock_owned_by_user_nocheck(sk) &&
- !spin_is_locked(&sk->sk_lock.slock))
+ if (sk->sk_protocol != IPPROTO_MPTCP)
return -EINVAL;
+ msk = mptcp_sk(sk);
+ task = mptcp_get_bpf_iter_task(msk);
+ if (!task || task != current)
+ return -EINVAL;
+
+ mptcp_clear_bpf_iter_task(msk);
+
+ msk_owned_by_me(msk);
+
+ kit->msk = msk;
kit->pos = &msk->conn_list;
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH mptcp-next v5 4/5] Revert "bpf: Acquire and release mptcp socket"
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
` (2 preceding siblings ...)
2025-03-03 10:33 ` [PATCH mptcp-next v5 3/5] Squash to "bpf: Add mptcp_subflow bpf_iter" Geliang Tang
@ 2025-03-03 10:33 ` Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 5/5] Squash to "selftests/bpf: Add mptcp_subflow bpf_iter subtest" Geliang Tang
` (2 subsequent siblings)
6 siblings, 0 replies; 18+ messages in thread
From: Geliang Tang @ 2025-03-03 10:33 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
Drop this patch as Martin suggested.
From Martin's review [1], this mptcp_sock_acquire() helper was a workaround
only to please the verifier, but they were not needed.
[1]
https://lore.kernel.org/9b373a23-c093-42d8-b4ae-99f2e62e7681@linux.dev
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
net/mptcp/bpf.c | 19 -------------------
1 file changed, 19 deletions(-)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index a307490bb20e..9091da0a24b0 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -297,23 +297,6 @@ bpf_iter_mptcp_subflow_destroy(struct bpf_iter_mptcp_subflow *it)
{
}
-__bpf_kfunc static struct
-mptcp_sock *bpf_mptcp_sock_acquire(struct mptcp_sock *msk)
-{
- struct sock *sk = (struct sock *)msk;
-
- if (sk && refcount_inc_not_zero(&sk->sk_refcnt))
- return msk;
- return NULL;
-}
-
-__bpf_kfunc static void bpf_mptcp_sock_release(struct mptcp_sock *msk)
-{
- struct sock *sk = (struct sock *)msk;
-
- WARN_ON_ONCE(!sk || !refcount_dec_not_one(&sk->sk_refcnt));
-}
-
__bpf_kfunc struct mptcp_subflow_context *
bpf_mptcp_subflow_ctx_by_pos(const struct mptcp_sched_data *data, unsigned int pos)
{
@@ -334,8 +317,6 @@ BTF_ID_FLAGS(func, bpf_mptcp_subflow_ctx, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_new, KF_ITER_NEW | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_next, KF_ITER_NEXT | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_destroy, KF_ITER_DESTROY)
-BTF_ID_FLAGS(func, bpf_mptcp_sock_acquire, KF_ACQUIRE | KF_RET_NULL)
-BTF_ID_FLAGS(func, bpf_mptcp_sock_release, KF_RELEASE)
BTF_KFUNCS_END(bpf_mptcp_common_kfunc_ids)
static const struct btf_kfunc_id_set bpf_mptcp_common_kfunc_set = {
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH mptcp-next v5 5/5] Squash to "selftests/bpf: Add mptcp_subflow bpf_iter subtest"
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
` (3 preceding siblings ...)
2025-03-03 10:33 ` [PATCH mptcp-next v5 4/5] Revert "bpf: Acquire and release mptcp socket" Geliang Tang
@ 2025-03-03 10:33 ` Geliang Tang
2025-03-03 11:13 ` [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Matthieu Baerts
2025-03-03 11:42 ` MPTCP CI
6 siblings, 0 replies; 18+ messages in thread
From: Geliang Tang @ 2025-03-03 10:33 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
Use bpf_core_cast() instead of bpf_skc_to_mptcp_sock().
Change the 2nd parameter type of bpf_for_each() as 'struct sock'.
Drop use of bpf_mptcp_sock_acquire/release.
Drop declaration of bpf_mptcp_subflow_tcp_sock. It's no longer used.
Update the comment for mptcp_subflow_tcp_sock(), which is a BPF helper,
not a kfunc.
Please update the commit log as:
'''
This patch adds a "cgroup/getsockopt" program "iters_subflow" to test the
newly added mptcp_subflow bpf_iter.
Export mptcp_subflow helpers bpf_iter_mptcp_subflow_new/_next/_destroy
and other helpers into bpf_experimental.h.
Use bpf_for_each() to walk the subflow list of this msk. MPTCP-specific
packet scheduler kfunc can be called in the loop. In this test, just
add all subflow ids to local variable local_ids, then invoke the helper
mptcp_subflow_tcp_sock() in the loop to pick a subsocket.
Out of the loop, use bpf_mptcp_subflow_ctx() to get the subflow context
of the picked subsocket and do some verification. Finally, assign
local_ids to global variable ids so that the application can obtain this
value.
Add a subtest named test_iters_subflow to load and verify the newly added
mptcp_subflow type bpf_iter example in test_mptcp. Use the helper
endpoint_init() to add 3 new subflow endpoints. Send a byte of message
to start the mptcp connection, and wait for new subflows to be added.
getsockopt() is invoked to trigger the "cgroup/getsockopt" test program
"iters_subflow". Check if skel->bss->ids equals 10 to verify whether this
mptcp_subflow bpf_iter loops correctly as expected.
'''
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
tools/testing/selftests/bpf/bpf_experimental.h | 2 +-
tools/testing/selftests/bpf/progs/mptcp_bpf.h | 5 -----
tools/testing/selftests/bpf/progs/mptcp_bpf_iters.c | 8 ++------
3 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 2ab3f0063c0f..6a96c56f0725 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -577,7 +577,7 @@ extern void bpf_iter_css_destroy(struct bpf_iter_css *it) __weak __ksym;
struct bpf_iter_mptcp_subflow;
extern int bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
- struct mptcp_sock *msk) __weak __ksym;
+ struct sock *sk) __weak __ksym;
extern struct mptcp_subflow_context *
bpf_iter_mptcp_subflow_next(struct bpf_iter_mptcp_subflow *it) __weak __ksym;
extern void
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf.h b/tools/testing/selftests/bpf/progs/mptcp_bpf.h
index b1f6e1fb467e..72263672510a 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_bpf.h
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf.h
@@ -43,13 +43,8 @@ mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow)
}
/* ksym */
-extern struct mptcp_sock *bpf_mptcp_sock_acquire(struct mptcp_sock *msk) __ksym;
-extern void bpf_mptcp_sock_release(struct mptcp_sock *msk) __ksym;
-
extern struct mptcp_subflow_context *
bpf_mptcp_subflow_ctx(const struct sock *sk) __ksym;
-extern struct sock *
-bpf_mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow) __ksym;
extern void mptcp_subflow_set_scheduled(struct mptcp_subflow_context *subflow,
bool scheduled) __ksym;
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_iters.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_iters.c
index fd5691a4073b..a69e88cc191d 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_bpf_iters.c
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_iters.c
@@ -28,10 +28,7 @@ int iters_subflow(struct bpf_sockopt *ctx)
if (!msk || msk->pm.server_side || !msk->pm.subflows)
return 1;
- msk = bpf_mptcp_sock_acquire(msk);
- if (!msk)
- return 1;
- bpf_for_each(mptcp_subflow, subflow, msk) {
+ bpf_for_each(mptcp_subflow, subflow, (struct sock *)sk) {
/* Here MPTCP-specific packet scheduler kfunc can be called:
* this test is not doing anything really useful, only to
* verify the iteration works.
@@ -39,7 +36,7 @@ int iters_subflow(struct bpf_sockopt *ctx)
local_ids += subflow->subflow_id;
- /* only to check the following kfunc works */
+ /* only to check the following helper works */
ssk = mptcp_subflow_tcp_sock(subflow);
}
@@ -58,6 +55,5 @@ int iters_subflow(struct bpf_sockopt *ctx)
ids = local_ids;
out:
- bpf_mptcp_sock_release(msk);
return 1;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support"
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
` (4 preceding siblings ...)
2025-03-03 10:33 ` [PATCH mptcp-next v5 5/5] Squash to "selftests/bpf: Add mptcp_subflow bpf_iter subtest" Geliang Tang
@ 2025-03-03 11:13 ` Matthieu Baerts
2025-03-03 11:42 ` MPTCP CI
6 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts @ 2025-03-03 11:13 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: Geliang Tang
Hi Geliang, Mat,
On 03/03/2025 11:33, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> v5:
> - check bpf_iter_task in mptcp_subflow_new() as Mat suggested.
Thank you for the modification, but would it not be easier not to do
that in this series, not to increase the complexity?
This can be done in the "use bpf_iter in bpf schedulers" instead, no?
For me, the v4 (with my squash-to patch) is enough to be used with CG
[gs]etsocktopt.
Also, I think the idea is set this marker before calling the future
struct_ops, in places where we know the msk is owned, but not in
bpf_mptcp_sock_from_sock() which doesn't check if the msk is owned, no?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support"
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
` (5 preceding siblings ...)
2025-03-03 11:13 ` [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Matthieu Baerts
@ 2025-03-03 11:42 ` MPTCP CI
6 siblings, 0 replies; 18+ messages in thread
From: MPTCP CI @ 2025-03-03 11:42 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp
Hi Geliang,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/13629090242
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/db85e099ebc4
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=939536
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock"
2025-03-03 10:33 ` [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock" Geliang Tang
@ 2025-03-04 1:37 ` Mat Martineau
2025-03-07 3:51 ` Geliang Tang
2025-03-07 3:54 ` Geliang Tang
0 siblings, 2 replies; 18+ messages in thread
From: Mat Martineau @ 2025-03-04 1:37 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp, Geliang Tang
On Mon, 3 Mar 2025, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> Set msk->bpf_iter_task in bpf_mptcp_sock_from_sock() to allow
> mptcp_subflow bpt_iter can be used in cgroup/getsockopt,
> otherwise, the selftest in this set fails.
>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> net/mptcp/bpf.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index be222fa5f308..aff92f458e7f 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -197,14 +197,22 @@ static struct bpf_struct_ops bpf_mptcp_sched_ops = {
>
> struct mptcp_sock *bpf_mptcp_sock_from_sock(struct sock *sk)
> {
> + struct mptcp_sock *msk;
> +
> if (unlikely(!sk || !sk_fullsock(sk)))
> return NULL;
>
> - if (sk->sk_protocol == IPPROTO_MPTCP)
> - return mptcp_sk(sk);
> + if (sk->sk_protocol == IPPROTO_MPTCP) {
> + msk = mptcp_sk(sk);
> + mptcp_set_bpf_iter_task(msk);
Hi Geliang -
The important part of the bpf_iter_task approach is to only set the
task_struct pointer when the msk lock is already held, and to clear it
before the msk lock is released. When used this way, the check ensures
that the iterator code is both *running with the socket locked* and *in
the same context where the lock is held*.
The code above will set msk->bpf_iter_task even when the lock is not held,
and then it is never cleared.
To set/clear as expected for get/setsockopt I think the task pointer would
have to be set in places where the locks are acquired and released, like
these:
https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1955
https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1846
- Mat
> + return msk;
> + }
>
> - if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
> - return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> + if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
> + msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> + mptcp_set_bpf_iter_task(msk);
> + return msk;
> + }
>
> return NULL;
> }
> --
> 2.43.0
>
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 3/5] Squash to "bpf: Add mptcp_subflow bpf_iter"
2025-03-03 10:33 ` [PATCH mptcp-next v5 3/5] Squash to "bpf: Add mptcp_subflow bpf_iter" Geliang Tang
@ 2025-03-04 1:39 ` Mat Martineau
2025-03-07 3:54 ` Geliang Tang
0 siblings, 1 reply; 18+ messages in thread
From: Mat Martineau @ 2025-03-04 1:39 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp, Geliang Tang
On Mon, 3 Mar 2025, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> Drop the NULL check for 'msk' as Martin suggested, add more checks
> for 'sk'.
>
> Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as the
> argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
>
> v5:
> - check bpf_iter_task in mptcp_subflow_new()
>
> v4:
> - drop sock_owned_by_user_nocheck and spin_is_locked. According to
> comments from Mat [2] and Martin [1], in this set mptcp_subflow
> bpf_iter only used from a cg sockopt bpf prog, no need to add these
> check at this moment.
>
> [1]
> https://lore.kernel.org/all/fdf0ddbe-e007-4a5f-bbdf-9a144e8fbe35@linux.dev/
> [2]
> https://patchwork.kernel.org/project/mptcp/patch/f6469225598beecbf0bda12a4c33fafa86c0ff15.1739787744.git.tanggeliang@kylinos.cn/
>
> v3:
> - continue to use sock_owned_by_user_nocheck and spin_is_locked
> checks instead of using msk_owned_by_me().
>
> v2:
> - check the owner before assigning the msk as Mat suggested.
>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> net/mptcp/bpf.c | 21 +++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index aff92f458e7f..a307490bb20e 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -249,24 +249,33 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
>
> __bpf_kfunc static int
> bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
> - struct mptcp_sock *msk)
> + struct sock *sk)
> {
> struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
> - struct sock *sk = (struct sock *)msk;
> + struct task_struct *task;
> + struct mptcp_sock *msk;
>
> BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
> sizeof(struct bpf_iter_mptcp_subflow));
> BUILD_BUG_ON(__alignof__(struct bpf_iter_mptcp_subflow_kern) !=
> __alignof__(struct bpf_iter_mptcp_subflow));
>
> - kit->msk = msk;
> - if (!msk)
> + if (unlikely(!sk || !sk_fullsock(sk)))
> return -EINVAL;
>
> - if (!sock_owned_by_user_nocheck(sk) &&
> - !spin_is_locked(&sk->sk_lock.slock))
> + if (sk->sk_protocol != IPPROTO_MPTCP)
> return -EINVAL;
>
> + msk = mptcp_sk(sk);
> + task = mptcp_get_bpf_iter_task(msk);
> + if (!task || task != current)
> + return -EINVAL;
Hi Geliang -
The task checking logic would fit better inside the helper, since every
caller will be checking to see if it matches 'current'.
I also think it would read better as (task && task == current)
> +
> + mptcp_clear_bpf_iter_task(msk);
This needs to be cleared where the lock is released, not where the task is
checked.
- Mat
> +
> + msk_owned_by_me(msk);
> +
> + kit->msk = msk;
> kit->pos = &msk->conn_list;
> return 0;
> }
> --
> 2.43.0
>
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock"
2025-03-04 1:37 ` Mat Martineau
@ 2025-03-07 3:51 ` Geliang Tang
2025-03-07 3:54 ` Geliang Tang
1 sibling, 0 replies; 18+ messages in thread
From: Geliang Tang @ 2025-03-07 3:51 UTC (permalink / raw)
To: Mat Martineau; +Cc: mptcp, Geliang Tang
Hi Mat,
Thanks for the review.
On Mon, 2025-03-03 at 17:37 -0800, Mat Martineau wrote:
> On Mon, 3 Mar 2025, Geliang Tang wrote:
>
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> >
> > Set msk->bpf_iter_task in bpf_mptcp_sock_from_sock() to allow
> > mptcp_subflow bpt_iter can be used in cgroup/getsockopt,
> > otherwise, the selftest in this set fails.
> >
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> > net/mptcp/bpf.c | 16 ++++++++++++----
> > 1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> > index be222fa5f308..aff92f458e7f 100644
> > --- a/net/mptcp/bpf.c
> > +++ b/net/mptcp/bpf.c
> > @@ -197,14 +197,22 @@ static struct bpf_struct_ops
> > bpf_mptcp_sched_ops = {
> >
> > struct mptcp_sock *bpf_mptcp_sock_from_sock(struct sock *sk)
> > {
> > + struct mptcp_sock *msk;
> > +
> > if (unlikely(!sk || !sk_fullsock(sk)))
> > return NULL;
> >
> > - if (sk->sk_protocol == IPPROTO_MPTCP)
> > - return mptcp_sk(sk);
> > + if (sk->sk_protocol == IPPROTO_MPTCP) {
> > + msk = mptcp_sk(sk);
> > + mptcp_set_bpf_iter_task(msk);
>
> Hi Geliang -
>
> The important part of the bpf_iter_task approach is to only set the
> task_struct pointer when the msk lock is already held, and to clear
> it
> before the msk lock is released. When used this way, the check
> ensures
> that the iterator code is both *running with the socket locked* and
> *in
> the same context where the lock is held*.
Got it, thanks for your explanation.
>
> The code above will set msk->bpf_iter_task even when the lock is not
> held,
> and then it is never cleared.
>
> To set/clear as expected for get/setsockopt I think the task pointer
> would
> have to be set in places where the locks are acquired and released,
> like
> these:
>
> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1955
> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1846
>
The get/setsockopt scenario is a bit complicated. I think we can use
the bpf_iter_task check only for bpf schedulers, keep the
get/setsockopt scenario unchanged, and continue to use the
msk_owned_by_me(msk) check. What do you think?
In addition, I think it's better to add bpf_iter_task to struct
mptcp_sched_data instead in struct mptcp_sock, since it's more related
to mptcp scheduler. For specific usage, please see [1]. What do you
think of this?
Thanks,
-Geliang
[1]
https://patchwork.kernel.org/project/mptcp/cover/cover.1739788598.git.tanggeliang@kylinos.cn/
>
> - Mat
>
> > + return msk;
> > + }
> >
> > - if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
> > - return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> > + if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
> > + msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> > + mptcp_set_bpf_iter_task(msk);
> > + return msk;
> > + }
> >
> > return NULL;
> > }
> > --
> > 2.43.0
> >
> >
> >
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 3/5] Squash to "bpf: Add mptcp_subflow bpf_iter"
2025-03-04 1:39 ` Mat Martineau
@ 2025-03-07 3:54 ` Geliang Tang
0 siblings, 0 replies; 18+ messages in thread
From: Geliang Tang @ 2025-03-07 3:54 UTC (permalink / raw)
To: Mat Martineau; +Cc: mptcp, Geliang Tang
Hi Mat,
Thanks for the review.
On Mon, 2025-03-03 at 17:39 -0800, Mat Martineau wrote:
> On Mon, 3 Mar 2025, Geliang Tang wrote:
>
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> >
> > Drop the NULL check for 'msk' as Martin suggested, add more checks
> > for 'sk'.
> >
> > Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as
> > the
> > argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
> >
> > v5:
> > - check bpf_iter_task in mptcp_subflow_new()
> >
> > v4:
> > - drop sock_owned_by_user_nocheck and spin_is_locked. According to
> > comments from Mat [2] and Martin [1], in this set mptcp_subflow
> > bpf_iter only used from a cg sockopt bpf prog, no need to add
> > these
> > check at this moment.
> >
> > [1]
> > https://lore.kernel.org/all/fdf0ddbe-e007-4a5f-bbdf-9a144e8fbe35@linux.dev/
> > [2]
> > https://patchwork.kernel.org/project/mptcp/patch/f6469225598beecbf0bda12a4c33fafa86c0ff15.1739787744.git.tanggeliang@kylinos.cn/
> >
> > v3:
> > - continue to use sock_owned_by_user_nocheck and spin_is_locked
> > checks instead of using msk_owned_by_me().
> >
> > v2:
> > - check the owner before assigning the msk as Mat suggested.
> >
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> > net/mptcp/bpf.c | 21 +++++++++++++++------
> > 1 file changed, 15 insertions(+), 6 deletions(-)
> >
> > diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> > index aff92f458e7f..a307490bb20e 100644
> > --- a/net/mptcp/bpf.c
> > +++ b/net/mptcp/bpf.c
> > @@ -249,24 +249,33 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
> >
> > __bpf_kfunc static int
> > bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
> > - struct mptcp_sock *msk)
> > + struct sock *sk)
> > {
> > struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
> > - struct sock *sk = (struct sock *)msk;
> > + struct task_struct *task;
> > + struct mptcp_sock *msk;
> >
> > BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
> > sizeof(struct bpf_iter_mptcp_subflow));
> > BUILD_BUG_ON(__alignof__(struct
> > bpf_iter_mptcp_subflow_kern) !=
> > __alignof__(struct bpf_iter_mptcp_subflow));
> >
> > - kit->msk = msk;
> > - if (!msk)
> > + if (unlikely(!sk || !sk_fullsock(sk)))
> > return -EINVAL;
> >
> > - if (!sock_owned_by_user_nocheck(sk) &&
> > - !spin_is_locked(&sk->sk_lock.slock))
> > + if (sk->sk_protocol != IPPROTO_MPTCP)
> > return -EINVAL;
> >
> > + msk = mptcp_sk(sk);
> > + task = mptcp_get_bpf_iter_task(msk);
> > + if (!task || task != current)
> > + return -EINVAL;
>
> Hi Geliang -
>
> The task checking logic would fit better inside the helper, since
> every
> caller will be checking to see if it matches 'current'.
>
> I also think it would read better as (task && task == current)
Yes, indeed.
>
>
> > +
> > + mptcp_clear_bpf_iter_task(msk);
>
> This needs to be cleared where the lock is released, not where the
> task is
> checked.
Yes, indeed, I'll do that in the next version.
Thanks,
-Geliang
>
> - Mat
>
> > +
> > + msk_owned_by_me(msk);
> > +
> > + kit->msk = msk;
> > kit->pos = &msk->conn_list;
> > return 0;
> > }
> > --
> > 2.43.0
> >
> >
> >
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock"
2025-03-04 1:37 ` Mat Martineau
2025-03-07 3:51 ` Geliang Tang
@ 2025-03-07 3:54 ` Geliang Tang
2025-03-07 8:53 ` Matthieu Baerts
1 sibling, 1 reply; 18+ messages in thread
From: Geliang Tang @ 2025-03-07 3:54 UTC (permalink / raw)
To: Mat Martineau; +Cc: mptcp, Geliang Tang
Hi Mat,
Thanks for the review.
On Mon, 2025-03-03 at 17:37 -0800, Mat Martineau wrote:
> On Mon, 3 Mar 2025, Geliang Tang wrote:
>
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> >
> > Set msk->bpf_iter_task in bpf_mptcp_sock_from_sock() to allow
> > mptcp_subflow bpt_iter can be used in cgroup/getsockopt,
> > otherwise, the selftest in this set fails.
> >
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> > net/mptcp/bpf.c | 16 ++++++++++++----
> > 1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> > index be222fa5f308..aff92f458e7f 100644
> > --- a/net/mptcp/bpf.c
> > +++ b/net/mptcp/bpf.c
> > @@ -197,14 +197,22 @@ static struct bpf_struct_ops
> > bpf_mptcp_sched_ops = {
> >
> > struct mptcp_sock *bpf_mptcp_sock_from_sock(struct sock *sk)
> > {
> > + struct mptcp_sock *msk;
> > +
> > if (unlikely(!sk || !sk_fullsock(sk)))
> > return NULL;
> >
> > - if (sk->sk_protocol == IPPROTO_MPTCP)
> > - return mptcp_sk(sk);
> > + if (sk->sk_protocol == IPPROTO_MPTCP) {
> > + msk = mptcp_sk(sk);
> > + mptcp_set_bpf_iter_task(msk);
>
> Hi Geliang -
>
> The important part of the bpf_iter_task approach is to only set the
> task_struct pointer when the msk lock is already held, and to clear
> it
> before the msk lock is released. When used this way, the check
> ensures
> that the iterator code is both *running with the socket locked* and
> *in
> the same context where the lock is held*.
Got it, thanks for your explanation.
>
> The code above will set msk->bpf_iter_task even when the lock is not
> held,
> and then it is never cleared.
>
> To set/clear as expected for get/setsockopt I think the task pointer
> would
> have to be set in places where the locks are acquired and released,
> like
> these:
>
> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1955
> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1846
>
The get/setsockopt scenario is a bit complicated. I think we can use
the bpf_iter_task check only for bpf schedulers, keep the
get/setsockopt scenario unchanged, and continue to use the
msk_owned_by_me(msk) check. What do you think?
In addition, I think it's better to add bpf_iter_task to struct
mptcp_sched_data instead in struct mptcp_sock, since it's more related
to mptcp scheduler. For specific usage, please see [1]. What do you
think of this?
Thanks,
-Geliang
[1]
https://patchwork.kernel.org/project/mptcp/cover/cover.1739788598.git.tanggeliang@kylinos.cn/
>
> - Mat
>
> > + return msk;
> > + }
> >
> > - if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
> > - return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> > + if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
> > + msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> > + mptcp_set_bpf_iter_task(msk);
> > + return msk;
> > + }
> >
> > return NULL;
> > }
> > --
> > 2.43.0
> >
> >
> >
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock"
2025-03-07 3:54 ` Geliang Tang
@ 2025-03-07 8:53 ` Matthieu Baerts
0 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts @ 2025-03-07 8:53 UTC (permalink / raw)
To: Geliang Tang, Mat Martineau; +Cc: mptcp, Geliang Tang
Hi Geliang,
On 07/03/2025 04:54, Geliang Tang wrote:
> On Mon, 2025-03-03 at 17:37 -0800, Mat Martineau wrote:
(...)
>> The code above will set msk->bpf_iter_task even when the lock is not
>> held,
>> and then it is never cleared.
>>
>> To set/clear as expected for get/setsockopt I think the task pointer
>> would
>> have to be set in places where the locks are acquired and released,
>> like
>> these:
>>
>> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1955
>> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1846
>>
>
> The get/setsockopt scenario is a bit complicated. I think we can use
> the bpf_iter_task check only for bpf schedulers, keep the
> get/setsockopt scenario unchanged, and continue to use the
> msk_owned_by_me(msk) check. What do you think?
I think in any case, it doesn't hurt to keep msk_owned_by_me(msk) for
lockdep.
If we think it is difficult to have the bpf_iter for both CG
[gs]etsockopt and struct_ops, then we can also drop the patches for the
[gs]etsockopt.
Or, from a kfunc, is it easy to check if we are called from a struct_ops?
> In addition, I think it's better to add bpf_iter_task to struct
> mptcp_sched_data instead in struct mptcp_sock, since it's more related
> to mptcp scheduler. For specific usage, please see [1]. What do you
> think of this?
Please see my reply to your comment on this thread. In short, we can
always add new parameters later, when we need them, no?
https://lore.kernel.org/706a3051-8328-4321-8eea-71d4120ea996@kernel.org
>
> Thanks,
> -Geliang
>
> [1]
> https://patchwork.kernel.org/project/mptcp/cover/cover.1739788598.git.tanggeliang@kylinos.cn/
>
>>
>> - Mat
>>
>>> + return msk;
>>> + }
>>>
>>> - if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
>>> - return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
>>> + if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
>>> + msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
>>> + mptcp_set_bpf_iter_task(msk);
>>> + return msk;
>>> + }
>>>
>>> return NULL;
>>> }
>>> --
>>> 2.43.0
>>>
>>>
>>>
>
>
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-03-07 8:54 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-03 10:33 [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 1/5] mptcp: add bpf_iter_task for mptcp_sock Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 2/5] Squash to "bpf: Extend bpf_skc_to_mptcp_sock to MPTCP sock" Geliang Tang
2025-03-04 1:37 ` Mat Martineau
2025-03-07 3:51 ` Geliang Tang
2025-03-07 3:54 ` Geliang Tang
2025-03-07 8:53 ` Matthieu Baerts
2025-03-03 10:33 ` [PATCH mptcp-next v5 3/5] Squash to "bpf: Add mptcp_subflow bpf_iter" Geliang Tang
2025-03-04 1:39 ` Mat Martineau
2025-03-07 3:54 ` Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 4/5] Revert "bpf: Acquire and release mptcp socket" Geliang Tang
2025-03-03 10:33 ` [PATCH mptcp-next v5 5/5] Squash to "selftests/bpf: Add mptcp_subflow bpf_iter subtest" Geliang Tang
2025-03-03 11:13 ` [PATCH mptcp-next v5 0/5] Squash to "Add mptcp_subflow bpf_iter support" Matthieu Baerts
2025-03-03 11:42 ` MPTCP CI
-- strict thread matches above, loose matches on Subject: below --
2024-12-11 3:11 Geliang Tang
2024-12-11 4:17 ` MPTCP CI
2024-12-16 12:06 ` Matthieu Baerts
2024-12-16 12:27 ` Matthieu Baerts
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.