From: Martin KaFai Lau <martin.lau@linux.dev>
To: Geliang Tang <geliang@kernel.org>, mptcp@lists.linux.dev
Cc: Geliang Tang <tanggeliang@kylinos.cn>,
Martin KaFai Lau <martin.lau@kernel.org>,
bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH mptcp-next 1/4] bpf: Add mptcp_subflow bpf_iter
Date: Thu, 5 Sep 2024 11:24:58 -0700 [thread overview]
Message-ID: <288ad1c2-501a-4319-bc1e-e7a7e276ff63@linux.dev> (raw)
In-Reply-To: <a75fc3e8df7141ce582448d3f092871a4943fbf4.1725544210.git.tanggeliang@kylinos.cn>
On 9/5/24 6:52 AM, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> It's necessary to traverse all subflows on the conn_list of an MPTCP
> socket and then call kfunc to modify the fields of each subflow. In
> kernel space, mptcp_for_each_subflow() helper is used for this:
>
> mptcp_for_each_subflow(msk, subflow)
> kfunc(subflow);
>
> But in the MPTCP BPF program, this has not yet been implemented. As
> Martin suggested recently, this conn_list walking + modify-by-kfunc
> usage fits the bpf_iter use case.
>
> This patch adds a new bpf_iter type named "mptcp_subflow" to do this.
>
> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> kernel/bpf/helpers.c | 3 +++
> net/mptcp/bpf.c | 57 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 60 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index b5f0adae8293..2340ba967444 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -3023,6 +3023,9 @@ BTF_ID_FLAGS(func, bpf_preempt_enable)
> BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW)
> BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
> BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
> +BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_new)
> +BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_next)
> +BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_destroy)
> BTF_KFUNCS_END(common_btf_ids)
>
> static const struct btf_kfunc_id_set common_kfunc_set = {
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index 9672a70c24b0..cda09bbfd617 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -204,6 +204,63 @@ static const struct btf_kfunc_id_set bpf_mptcp_fmodret_set = {
> .set = &bpf_mptcp_fmodret_ids,
> };
>
> +struct bpf_iter__mptcp_subflow {
> + __bpf_md_ptr(struct bpf_iter_meta *, meta);
> + __bpf_md_ptr(struct mptcp_sock *, msk);
> + __bpf_md_ptr(struct list_head *, pos);
> +};
> +
> +DEFINE_BPF_ITER_FUNC(mptcp_subflow, struct bpf_iter_meta *meta,
> + struct mptcp_sock *msk, struct list_head *pos)
> +
> +struct bpf_iter_mptcp_subflow {
> + __u64 __opaque[3];
> +} __attribute__((aligned(8)));
> +
> +struct bpf_iter_mptcp_subflow_kern {
> + struct mptcp_sock *msk;
> + struct list_head *pos;
> +} __attribute__((aligned(8)));
> +
> +__bpf_kfunc_start_defs();
> +
> +__bpf_kfunc int bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
> + struct mptcp_sock *msk)
> +{
> + struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
> +
> + kit->msk = msk;
> + kit->pos = &msk->conn_list;
> + spin_lock_bh(&msk->pm.lock);
I don't think spin_lock here without unlock can be used. e.g. What if
bpf_iter_mptcp_subflow_new() is called twice back-to-back.
I haven't looked at the mptcp details, some questions:
The list is protected by msk->pm.lock?
What happen to the sk_lock of the msk?
Can this be rcu-ify? or it needs some cares when walking the established TCP
subflow?
[ Please cc the bpf list. Helping to review patches is a good way to contribute
back to the mailing list. ]
> +
> + return 0;
> +}
> +
> +__bpf_kfunc struct mptcp_subflow_context *
> +bpf_iter_mptcp_subflow_next(struct bpf_iter_mptcp_subflow *it)
> +{
> + struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
> + struct mptcp_subflow_context *subflow;
> + struct mptcp_sock *msk = kit->msk;
> +
> + subflow = list_entry((kit->pos)->next, struct mptcp_subflow_context, node);
> + if (list_entry_is_head(subflow, &msk->conn_list, node))
> + return NULL;
> +
> + kit->pos = &subflow->node;
> + return subflow;
> +}
> +
> +__bpf_kfunc void bpf_iter_mptcp_subflow_destroy(struct bpf_iter_mptcp_subflow *it)
> +{
> + struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
> + struct mptcp_sock *msk = kit->msk;
> +
> + spin_unlock_bh(&msk->pm.lock);
> +}
> +
> +__bpf_kfunc_end_defs();
> +
> __diag_push();
> __diag_ignore_all("-Wmissing-prototypes",
> "kfuncs which will be used in BPF programs");
next prev parent reply other threads:[~2024-09-05 18:25 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-05 13:52 [PATCH mptcp-next 0/4] add mptcp_subflow bpf_iter Geliang Tang
2024-09-05 13:52 ` [PATCH mptcp-next 1/4] bpf: Add " Geliang Tang
2024-09-05 18:24 ` Martin KaFai Lau [this message]
2024-09-06 10:02 ` Geliang Tang
2024-09-06 21:29 ` Andrii Nakryiko
2024-09-09 1:12 ` Geliang Tang
2024-09-05 13:52 ` [PATCH mptcp-next 2/4] bpf: Register mptcp tracing kfunc set Geliang Tang
2024-09-05 13:52 ` [PATCH mptcp-next 3/4] selftests/bpf: Add mptcp_subflow bpf_iter test prog Geliang Tang
2024-09-05 13:52 ` [PATCH mptcp-next 4/4] selftests/bpf: Add mptcp_subflow bpf_iter subtest Geliang Tang
2024-09-05 14:14 ` [PATCH mptcp-next 0/4] add mptcp_subflow bpf_iter MPTCP CI
2024-09-05 14:45 ` MPTCP CI
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=288ad1c2-501a-4319-bc1e-e7a7e276ff63@linux.dev \
--to=martin.lau@linux.dev \
--cc=bpf@vger.kernel.org \
--cc=geliang@kernel.org \
--cc=martin.lau@kernel.org \
--cc=mptcp@lists.linux.dev \
--cc=tanggeliang@kylinos.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.