From: Matthieu Baerts <matttbe@kernel.org>
To: Martin KaFai Lau <martin.lau@linux.dev>,
Geliang Tang <geliang@kernel.org>
Cc: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Mykola Lysenko <mykolal@fb.com>,
Shuah Khan <shuah@kernel.org>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next/net v2 4/7] bpf: Add mptcp_subflow bpf_iter
Date: Thu, 30 Jan 2025 13:05:54 +0100
Message-ID: <3b5af48e-4155-4b98-b67b-b75d9fb6285e@kernel.org>
In-Reply-To: <fdf0ddbe-e007-4a5f-bbdf-9a144e8fbe35@linux.dev>

Hi Martin,

Thank you for your review!

Sorry for the delay here: Geliang has started working on a new version,
but it might take a bit of time as he is currently off for a few days.

On 24/01/2025 01:47, Martin KaFai Lau wrote:
> On 12/19/24 7:46 AM, Matthieu Baerts (NGI0) wrote:
>> From: Geliang Tang <tanggeliang@kylinos.cn>
>>
>> It's necessary to traverse all subflows on the conn_list of an MPTCP
>> socket and then call kfunc to modify the fields of each subflow. In
>> kernel space, mptcp_for_each_subflow() helper is used for this:
>>
>>     mptcp_for_each_subflow(msk, subflow)
>>         kfunc(subflow);
>>
>> But in the MPTCP BPF program, this has not yet been implemented. As
>> Martin suggested recently, this conn_list walking + modify-by-kfunc
>> usage fits the bpf_iter use case.
>>
>> So this patch adds a new bpf_iter type named "mptcp_subflow" to do
>> this and implements its helpers bpf_iter_mptcp_subflow_new()/_next()/
>> _destroy(). And register these bpf_iter mptcp_subflow into mptcp
>> common kfunc set. Then bpf_for_each() for mptcp_subflow can be used
>> in BPF program like this:
>>
>>     bpf_for_each(mptcp_subflow, subflow, msk)
>>         kfunc(subflow);
(...)
>> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
>> index c5bfd84c16c43230d9d8e1fd8ff781a767e647b5..e39f0e4fb683c1aa31ee075281daee218dac5878 100644
>> --- a/net/mptcp/bpf.c
>> +++ b/net/mptcp/bpf.c
(...)
>> @@ -47,10 +56,54 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
>>  	return NULL;
>>  }
>> +__bpf_kfunc static int
>> +bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
>> +			   struct mptcp_sock *msk)
>> +{
>> +	struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
>> +	struct sock *sk = (struct sock *)msk;
>> +
>> +	BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
>> +		     sizeof(struct bpf_iter_mptcp_subflow));
>> +	BUILD_BUG_ON(__alignof__(struct bpf_iter_mptcp_subflow_kern) !=
>> +		     __alignof__(struct bpf_iter_mptcp_subflow));
>> +
>> +	kit->msk = msk;
>> +	if (!msk)
>
> NULL check is not needed. verifier should have rejected it for
> KF_TRUSTED_ARGS.
>
>> +		return -EINVAL;
>> +
>> +	if (!sock_owned_by_user_nocheck(sk) &&
>> +	    !spin_is_locked(&sk->sk_lock.slock))
>
> I could have missed something. If it is to catch bug, should it be
> sock_owned_by_me() that has the lockdep splat? For the cg get/setsockopt
> hook here, the lock should have already been held earlier in the kernel.

Good point. Because in this series, the kfunc is currently restricted to
CG [gs]etsockopt hooks, we should use msk_owned_by_me(msk) here.

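
To illustrate the idea, here is a rough userspace sketch, not the actual
patch. It models the new()/next()/destroy() iterator shape with an
ownership assertion that only warns, instead of the current check that
returns an error. All names below (struct msk_model, struct subflow,
model_owned_by_me(), sum_subflow_ids(), ...) are hypothetical stand-ins
and not kernel types or helpers; the "owned" flag only stands for "the
caller already holds the socket lock".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct subflow {
	int id;
	struct subflow *next;
};

struct msk_model {
	int owned;			/* models "socket lock held by the caller" */
	struct subflow *conn_list;	/* singly linked list of subflows */
};

struct subflow_iter {
	struct msk_model *msk;
	struct subflow *pos;
};

/* Number of times the ownership assumption was found to be violated. */
static int ownership_warnings;

/* Assumption: like a debug assertion, this warns but returns no error. */
static void model_owned_by_me(const struct msk_model *msk)
{
	if (!msk->owned) {
		ownership_warnings++;
		fprintf(stderr, "warning: msk not owned by the caller\n");
	}
}

/* Start the walk; the caller is expected to already own the msk. */
static int subflow_iter_new(struct subflow_iter *it, struct msk_model *msk)
{
	it->msk = msk;
	model_owned_by_me(msk);
	it->pos = msk->conn_list;
	return 0;
}

/* Return the current subflow and advance, or NULL at the end of the list. */
static struct subflow *subflow_iter_next(struct subflow_iter *it)
{
	struct subflow *cur = it->pos;

	if (cur)
		it->pos = cur->next;
	return cur;
}

/* Nothing to free in this model; just reset the iterator state. */
static void subflow_iter_destroy(struct subflow_iter *it)
{
	it->msk = NULL;
	it->pos = NULL;
}

/* Example walk: visit every subflow and add up the ids. */
static int sum_subflow_ids(struct msk_model *msk)
{
	struct subflow_iter it;
	struct subflow *sf;
	int sum = 0;

	subflow_iter_new(&it, msk);
	while ((sf = subflow_iter_next(&it)))
		sum += sf->id;
	subflow_iter_destroy(&it);
	return sum;
}
```

In this model, a caller that does not own the msk still gets the walk,
but the violation is reported, which is the "catch the bug" behaviour
discussed above rather than a runtime error path.
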
> This set is only showing the cg sockopt bpf prog but missing the major
> struct_ops piece. It is hard to comment. I assumed the lock situation is
> the same for the struct_ops where the lock will be held before calling
> the struct_ops prog?

I understand it is hard to comment on that point. In the 'struct_ops' we
are designing, the lock will indeed be held before calling the struct_ops
program. So we will just need to make sure this assumption is correct
for all callbacks of our struct_ops.

Also, if I understood correctly, it is possible to restrict a kfunc to
some specific struct_ops, e.g. not to call this kfunc for the TCP CA
struct_ops. So these checks should indeed not be needed, but I will
double-check that with Geliang.

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.