From: Martin KaFai Lau <martin.lau@linux.dev>
To: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Cc: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
Geliang Tang <geliang@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Mykola Lysenko <mykolal@fb.com>,
Shuah Khan <shuah@kernel.org>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next/net v3 2/5] bpf: Add mptcp_subflow bpf_iter
Date: Fri, 16 May 2025 15:34:26 -0700
Message-ID: <60092fac-2c8a-4076-9130-8c3e41cba040@linux.dev>
In-Reply-To: <20250320-bpf-next-net-mptcp-bpf_iter-subflows-v3-2-9abd22c2a7fd@kernel.org>
On 3/20/25 10:48 AM, Matthieu Baerts (NGI0) wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> It's necessary to traverse all subflows on the conn_list of an MPTCP
> socket and then call kfunc to modify the fields of each subflow. In
> kernel space, mptcp_for_each_subflow() helper is used for this:
>
>     mptcp_for_each_subflow(msk, subflow)
>         kfunc(subflow);
>
> But in the MPTCP BPF program, this has not yet been implemented. As
> Martin suggested recently, this conn_list walking + modify-by-kfunc
> usage fits the bpf_iter use case.
>
> So this patch adds a new bpf_iter type named "mptcp_subflow" to do
> this and implements its helpers bpf_iter_mptcp_subflow_new()/_next()/
> _destroy(). It also registers these mptcp_subflow bpf_iter helpers in
> the mptcp common kfunc set. Then bpf_for_each() for mptcp_subflow can
> be used in a BPF program like this:
>
>     bpf_for_each(mptcp_subflow, subflow, msk)
>         kfunc(subflow);
>
> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> Reviewed-by: Mat Martineau <martineau@kernel.org>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> Notes:
>  - v2:
>    - Add BUILD_BUG_ON() checks, similar to the ones done with other
>      bpf_iter_(...) helpers.
>    - Replace msk_owned_by_me() by sock_owned_by_user_nocheck() and
>      !spin_is_locked() (Martin).
>  - v3:
>    - Switch parameter from 'struct mptcp_sock' to 'struct sock' (Martin)
>    - Remove unneeded !msk check (Martin)
>    - Remove locks checks, add msk_owned_by_me for lockdep (Martin)
>    - The following note and 2 questions have been added below.
>
> This new bpf_iter will be used by our future BPF packet schedulers and
> path managers. To see how we are going to use them, please check our
> export branch [1], especially these two commits:
>
> - "bpf: Add mptcp packet scheduler struct_ops": introduce a new
> struct_ops.
> - "selftests/bpf: Add bpf_burst scheduler & test": new test showing
> how the new struct_ops and bpf_iter are being used.
>
> [1] https://github.com/multipath-tcp/mptcp_net-next/commits/export
>
> @BPF maintainers: we would like to allow this new mptcp_subflow bpf_iter
> to be used with struct_ops, but only with the two new ones we are going
> to introduce that are specific to MPTCP, and not with other struct_ops
> (TCP CC, sched_ext, etc.). We are not sure how to do that. By chance, do
> you have examples or docs you could point us to so that we can put this
> restriction in place, please?
bpf_qdisc.c already does this. Take a look at bpf_qdisc_kfunc_filter().
It is in net-next and bpf-next/net.
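
To make the idea concrete, here is a rough user-space model of a kfunc filter that only lets programs attached to one specific struct_ops call a restricted kfunc. It is not the bpf_qdisc.c code: every toy_* name below is made up for illustration, and the real filter works on struct bpf_prog and BTF kfunc ids rather than on these stand-ins.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct_ops type; not a kernel structure. */
struct toy_struct_ops {
	const char *name;
};

static const struct toy_struct_ops toy_mptcp_sched_ops = { "mptcp_sched" };
static const struct toy_struct_ops toy_tcp_cc_ops = { "tcp_congestion" };

/* Hypothetical stand-in for a BPF program and the struct_ops it attaches to. */
struct toy_prog {
	const struct toy_struct_ops *st_ops;
};

/* Hypothetical kfunc ids; the kernel uses BTF ids instead. */
enum toy_kfunc_id {
	TOY_KFUNC_SUBFLOW_ITER_NEW = 1,
	TOY_KFUNC_UNRESTRICTED = 2,
};

/* Only the iterator kfunc is restricted in this model. */
static bool toy_kfunc_is_restricted(enum toy_kfunc_id id)
{
	return id == TOY_KFUNC_SUBFLOW_ITER_NEW;
}

/* Return 0 to allow the call, -EACCES to reject it. */
static int toy_kfunc_filter(const struct toy_prog *prog, enum toy_kfunc_id id)
{
	/* Kfuncs outside the restricted set are not this filter's business. */
	if (!toy_kfunc_is_restricted(id))
		return 0;

	/* Restricted kfuncs are allowed only from the MPTCP struct_ops. */
	if (prog->st_ops == &toy_mptcp_sched_ops)
		return 0;

	return -EACCES;
}
```

With this model, a program attached to toy_tcp_cc_ops is rejected when it asks for TOY_KFUNC_SUBFLOW_ITER_NEW, while one attached to toy_mptcp_sched_ops is allowed.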
>
> Also, for one of the two future MPTCP struct_ops, not all callbacks
> should be allowed to use this new bpf_iter, because they are called from
> different contexts. How can we ensure such callbacks from a struct_ops
> cannot call mptcp_subflow bpf_iter without adding new dedicated checks
> looking if some locks are held for all callbacks? We understood that
> they wanted to have something similar with sched_ext, but we are not
> sure if this code is ready nor if it is going to be accepted.
Same here. Take a look at bpf_qdisc_kfunc_filter().
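
For the per-callback case, a filter can tell callbacks apart by which struct_ops member the program implements. The sketch below models that in user space with offsetof(). The struct layout, the callback names and the toy_cb_* helpers are assumptions made for this example, and which callback may use the iterator is picked arbitrarily here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical struct_ops with three callbacks; not the real MPTCP one. */
struct toy_sched_ops {
	int (*get_send)(void *msk);
	void (*init)(void *msk);
	void (*release)(void *msk);
};

/*
 * Hypothetical program descriptor: member_off is the offset, inside
 * struct toy_sched_ops, of the callback this program implements.
 */
struct toy_cb_prog {
	size_t member_off;
};

/* Return 0 if this callback may use the iterator, -EACCES otherwise. */
static int toy_cb_filter(const struct toy_cb_prog *prog)
{
	/* In this model, only get_send() is assumed to run in a safe context. */
	if (prog->member_off == offsetof(struct toy_sched_ops, get_send))
		return 0;

	return -EACCES;
}
```

A program implementing get_send() passes the filter, and programs implementing init() or release() get -EACCES, so no lock-state check is needed at run time.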