From: Yonghong Song <yonghong.song@linux.dev>
To: Yafang Shao <laoar.shao@gmail.com>,
ast@kernel.org, daniel@iogearbox.net, john.fastabend@gmail.com,
andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
kpsingh@kernel.org, sdf@google.com, haoluo@google.com,
jolsa@kernel.org, tj@kernel.org
Cc: bpf@vger.kernel.org, lkp@intel.com
Subject: Re: [PATCH v3 bpf-next 1/3] bpf: Add bpf_iter_cpumask kfuncs
Date: Thu, 18 Jan 2024 14:27:22 -0800
Message-ID: <a7699a08-827b-4433-99a8-bfbfda1d38af@linux.dev>
In-Reply-To: <20240117024823.4186-2-laoar.shao@gmail.com>
On 1/16/24 6:48 PM, Yafang Shao wrote:
> Add three new kfuncs for bpf_iter_cpumask.
> - bpf_iter_cpumask_new
> It is defined with KF_RCU_PROTECTED and KF_RCU.
> KF_RCU_PROTECTED is defined because we must use it under the
> protection of RCU.
> KF_RCU is defined because the cpumask must be a RCU trusted pointer
> such as task->cpus_ptr.
I am not sure we need both.
KF_RCU_PROTECTED means the function call itself must be within the rcu cs.
KF_RCU means the argument must be used within the rcu cs.
We only need one of them (preferably KF_RCU).
> - bpf_iter_cpumask_next
> - bpf_iter_cpumask_destroy
>
> These new kfuncs facilitate the iteration of percpu data, such as
> runqueues, psi_cgroup_cpu, and more.
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> kernel/bpf/cpumask.c | 69 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 69 insertions(+)
>
> diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
> index 2e73533a3811..1840e48e6142 100644
> --- a/kernel/bpf/cpumask.c
> +++ b/kernel/bpf/cpumask.c
> @@ -422,6 +422,72 @@ __bpf_kfunc u32 bpf_cpumask_weight(const struct cpumask *cpumask)
> return cpumask_weight(cpumask);
> }
>
> +struct bpf_iter_cpumask {
> + __u64 __opaque[2];
> +} __aligned(8);
> +
> +struct bpf_iter_cpumask_kern {
> + const struct cpumask *mask;
> + int cpu;
> +} __aligned(8);
> +
> +/**
> + * bpf_iter_cpumask_new() - Create a new bpf_iter_cpumask for a specified cpumask
> + * @it: The new bpf_iter_cpumask to be created.
> + * @mask: The cpumask to be iterated over.
> + *
> + * This function initializes a new bpf_iter_cpumask structure for iterating over
> + * the specified CPU mask. It assigns the provided cpumask to the newly created
> + * bpf_iter_cpumask @it for subsequent iteration operations.
> + *
> + * On success, 0 is returned. On failure, ERR is returned.
> + */
> +__bpf_kfunc int bpf_iter_cpumask_new(struct bpf_iter_cpumask *it, const struct cpumask *mask)
> +{
> + struct bpf_iter_cpumask_kern *kit = (void *)it;
> +
> + BUILD_BUG_ON(sizeof(struct bpf_iter_cpumask_kern) > sizeof(struct bpf_iter_cpumask));
> + BUILD_BUG_ON(__alignof__(struct bpf_iter_cpumask_kern) !=
> + __alignof__(struct bpf_iter_cpumask));
> +
> + kit->mask = mask;
> + kit->cpu = -1;
> + return 0;
> +}
We have a problem here. Say bpf_iter_cpumask_new() is called inside an rcu cs.
Once control leaves that rcu cs, 'mask' could be freed, right?
Otherwise you would have to require that bpf_iter_cpumask_next() runs in the
same rcu cs as bpf_iter_cpumask_new(), and such a requirement seems odd.
I think we can do something similar to bpf_iter_task_vma: allocate memory
with bpf_mem_alloc() in bpf_iter_cpumask_new() and keep a copy of the mask
there. That way you do not need to worry about a potential use-after-free.
The memory can then be freed in bpf_iter_cpumask_destroy().
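To make the copy-on-new idea concrete, here is a rough userspace model, not
kernel code. Everything in it is an assumption made for illustration: the
names cpumask_model, iter_cpumask_model and NR_CPU_IDS are hypothetical, a
single 64-bit word stands in for struct cpumask, and malloc()/free() stand in
for the BPF memory allocator. It only shows the ownership pattern: new() takes
a private copy, next() walks the copy, destroy() releases it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPU_IDS 64 /* hypothetical fixed CPU count for this model */

/* Stand-in for struct cpumask: one bit per CPU. */
struct cpumask_model {
	uint64_t bits;
};

/* Iterator state: owns a private copy of the mask. */
struct iter_cpumask_model {
	struct cpumask_model *mask; /* heap copy, released in destroy() */
	int cpu;                    /* last CPU returned, -1 before the first */
};

/* Copy the caller's mask so that later calls never touch the original. */
int iter_cpumask_new(struct iter_cpumask_model *it,
		     const struct cpumask_model *mask)
{
	assert(it && mask); /* model-only precondition check */

	it->cpu = -1;
	it->mask = malloc(sizeof(*it->mask)); /* stand-in for the BPF allocator */
	if (!it->mask)
		return -1; /* a kernel version would likely return -ENOMEM */

	memcpy(it->mask, mask, sizeof(*mask));
	return 0;
}

/* Return a pointer to the next set CPU number, or NULL when exhausted. */
int *iter_cpumask_next(struct iter_cpumask_model *it)
{
	if (!it->mask)
		return NULL;

	for (int cpu = it->cpu + 1; cpu < NR_CPU_IDS; cpu++) {
		if (it->mask->bits & (UINT64_C(1) << cpu)) {
			it->cpu = cpu;
			return &it->cpu;
		}
	}
	return NULL;
}

/* Release the private copy; safe to call after a failed new(). */
void iter_cpumask_destroy(struct iter_cpumask_model *it)
{
	free(it->mask);
	it->mask = NULL;
}
```

With this shape the source mask can go away right after new() returns and
next() is still safe, so next() no longer has to run in the same rcu cs as
new(). The cost is one allocation and one copy per iterator.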
> +
> +/**
> + * bpf_iter_cpumask_next() - Get the next CPU in a bpf_iter_cpumask
> + * @it: The bpf_iter_cpumask
> + *
> + * This function retrieves a pointer to the number of the next CPU within the
> + * specified bpf_iter_cpumask. It allows sequential access to CPUs within the
> + * cpumask. If there are no further CPUs available, it returns NULL.
> + *
> + * Returns a pointer to the number of the next CPU in the cpumask or NULL if no
> + * further CPUs.
> + */
> +__bpf_kfunc int *bpf_iter_cpumask_next(struct bpf_iter_cpumask *it)
> +{
> + struct bpf_iter_cpumask_kern *kit = (void *)it;
> + const struct cpumask *mask = kit->mask;
> + int cpu;
> +
> + cpu = cpumask_next(kit->cpu, mask);
> + if (cpu >= nr_cpu_ids)
> + return NULL;
> +
> + kit->cpu = cpu;
> + return &kit->cpu;
> +}
> +
> +/**
> + * bpf_iter_cpumask_destroy() - Destroy a bpf_iter_cpumask
> + * @it: The bpf_iter_cpumask to be destroyed.
> + */
> +__bpf_kfunc void bpf_iter_cpumask_destroy(struct bpf_iter_cpumask *it)
> +{
> +}
> +
> __bpf_kfunc_end_defs();
>
> BTF_SET8_START(cpumask_kfunc_btf_ids)
> @@ -450,6 +516,9 @@ BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_RCU)
> BTF_ID_FLAGS(func, bpf_cpumask_any_distribute, KF_RCU)
> BTF_ID_FLAGS(func, bpf_cpumask_any_and_distribute, KF_RCU)
> BTF_ID_FLAGS(func, bpf_cpumask_weight, KF_RCU)
> +BTF_ID_FLAGS(func, bpf_iter_cpumask_new, KF_ITER_NEW | KF_RCU_PROTECTED | KF_RCU)
> +BTF_ID_FLAGS(func, bpf_iter_cpumask_next, KF_ITER_NEXT | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_iter_cpumask_destroy, KF_ITER_DESTROY)
> BTF_SET8_END(cpumask_kfunc_btf_ids)
>
> static const struct btf_kfunc_id_set cpumask_kfunc_set = {