public inbox for bpf@vger.kernel.org
From: Yonghong Song <yonghong.song@linux.dev>
To: Alan Maguire <alan.maguire@oracle.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	David Vernet <void@manifault.com>
Cc: Yafang Shao <laoar.shao@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, bpf <bpf@vger.kernel.org>,
	Stephen Brennan <stephen.s.brennan@oracle.com>
Subject: Re: [RFC PATCH bpf-next 0/3] bpf: Add new bpf helper bpf_for_each_cpu
Date: Thu, 3 Aug 2023 08:22:13 -0700
Message-ID: <cddeb658-563e-9ff9-0ece-4509eabab663@linux.dev>
In-Reply-To: <998f8e89-fb00-820f-15d9-1d227cc09e54@oracle.com>



On 8/3/23 1:21 AM, Alan Maguire wrote:
> On 02/08/2023 17:33, Alexei Starovoitov wrote:
>> On Tue, Aug 1, 2023 at 8:30 PM David Vernet <void@manifault.com> wrote:
>>> I agree that this is the correct way to generalize this. The only thing
>>> that we'll have to figure out is how to generalize treating const struct
>>> cpumask * objects as kptrs. In sched_ext [0] we export
>>> scx_bpf_get_idle_cpumask() and scx_bpf_get_idle_smtmask() kfuncs to
>>> return trusted global cpumask kptrs that can then be "released" in
>>> scx_bpf_put_idle_cpumask(). scx_bpf_put_idle_cpumask() is empty and
>>> exists only to appease the verifier that the trusted cpumask kptrs
>>> aren't being leaked and are having their references "dropped".
>>
>> why is it KF_ACQUIRE ?
>> I think it can just return a trusted pointer without acquire.
>>
>>> [0]: https://lore.kernel.org/all/20230711011412.100319-13-tj@kernel.org/
>>>
>>> I'd imagine that we have 2 ways forward if we want to enable progs to
>>> fetch other global cpumasks with static lifetimes (e.g.
>>> __cpu_possible_mask or nohz.idle_cpus_mask):
>>>
>>> 1. The most straightforward thing to do would be to add a new kfunc in
>>>     kernel/bpf/cpumask.c that's a drop-in replacement for
>>>     scx_bpf_put_idle_cpumask():
>>>
>>> void bpf_global_cpumask_drop(const struct cpumask *cpumask)
>>> {}
>>>
>>> 2. Another would be to implement something resembling what Yonghong
>>>     suggested in [1], where progs can link against globally allocated kptrs
>>>     like:
>>>
>>> const struct cpumask *__cpu_possible_mask __ksym;
>>>
>>> [1]: https://lore.kernel.org/all/3f56b3b3-9b71-f0d3-ace1-406a8eeb64c0@linux.dev/#t
>>>
>>> In my opinion (1) is more straightforward, (2) is a better UX.
>>
>> 1 = adding few kfuncs.
>> 2 = teaching pahole to emit certain global vars.
>>
>> nm vmlinux|g -w D|g -v __SCK_|g -v __tracepoint_|wc -l
>> 1998
>>
>> imo BTF increase trade off is acceptable.
> 
> Agreed, Stephen's numbers on BTF size increase were pretty modest [1].
> 
> What was gating that work in my mind was previous discussion around
> splitting aspects of BTF into a "vmlinux-extra". Experiments with this
> seemed to show it's hard to support, and worse, tooling would have to
> learn about its existence. We have to come up with a CONFIG convention
> about specifying what ends up in -extra versus core vmlinux BTF, what do
> we do about modules, etc. All feels like over-complication.
> 
> I think a better path would be to support BTF in a vmlinux BTF module
> (controlled by making CONFIG_DEBUG_INFO_BTF tristate). The module is
> separately loadable, but puts vmlinux in the same place for tools -
> /sys/kernel/btf/vmlinux. That solves already-existing issues of BTF size
> for embedded use cases that have come up a few times, and lessens
> concerns about BTF size for other users, while it all works with
> existing tooling. I have a basic proof-of-concept but it will take time
> to hammer into shape.
> 
> Since variable-related size increases are pretty modest, should we
> proceed with the BTF variable support anyway? We can modularize BTF
> separately later on for those concerned about BTF size.

Alan, it seems consensus has been reached that we should include
global variables (excluding special kernel-generated ones like
__SCK_ and __tracepoint_) in vmlinux BTF.
Please go ahead and propose a patch. Thanks!

> 
> [1]
> https://lore.kernel.org/bpf/20221104231103.752040-1-stephen.s.brennan@oracle.com/
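
For readers following the thread, the symbol count quoted above (1998) comes from filtering `nm vmlinux` output: keep lines whose symbol type is the whole word `D`, then drop names containing `__SCK_` or `__tracepoint_`. The sketch below reproduces that filter in user-space C over nm-style text. It assumes that `g` in the quoted pipeline is a shell alias for `grep`; the function name `count_btf_candidate_vars` and the helpers are hypothetical, chosen only for illustration, and this is not how pahole selects variables.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Word characters as grep -w treats them: letters, digits, underscore. */
static bool is_word_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Roughly `grep -w D`: a 'D' not adjacent to any other word character. */
static bool has_word_D(const char *line, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (line[i] != 'D')
            continue;
        bool left_ok = (i == 0) || !is_word_char(line[i - 1]);
        bool right_ok = (i + 1 == len) || !is_word_char(line[i + 1]);
        if (left_ok && right_ok)
            return true;
    }
    return false;
}

/* Substring search bounded to one line (the line is not NUL-terminated). */
static bool contains(const char *line, size_t len, const char *needle)
{
    size_t n = strlen(needle);

    for (size_t i = 0; i + n <= len; i++)
        if (memcmp(line + i, needle, n) == 0)
            return true;
    return false;
}

/*
 * Count lines of nm-style output that pass the filter from the thread:
 * symbol type D, excluding __SCK_ and __tracepoint_ names.
 */
size_t count_btf_candidate_vars(const char *nm_output)
{
    size_t count = 0;
    const char *p = nm_output;

    while (*p) {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);

        if (has_word_D(p, len) &&
            !contains(p, len, "__SCK_") &&
            !contains(p, len, "__tracepoint_"))
            count++;

        p += len;
        if (*p == '\n')
            p++;
    }
    return count;
}
```

Feeding this function the text produced by `nm vmlinux` should give a count comparable to the pipeline's, under the alias assumption stated above. The exact number depends on the kernel configuration.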
