From: David Vernet <void@manifault.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
yhs@meta.com, john.fastabend@gmail.com, kpsingh@kernel.org,
sdf@google.com, haoluo@google.com, jolsa@kernel.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
tj@kernel.org, memxor@gmail.com
Subject: Re: [PATCH bpf-next v2 4/9] bpf: Enable cpumasks to be queried and used as kptrs
Date: Tue, 24 Jan 2023 23:36:45 -0600 [thread overview]
Message-ID: <Y9C/7fxsIPXiG6mr@maniforge> (raw)
In-Reply-To: <20230125043602.gmpi54ixerelmzzx@iphone-mikan.dhcp.thefacebook.com>
On Tue, Jan 24, 2023 at 08:36:02PM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 20, 2023 at 01:25:18PM -0600, David Vernet wrote:
> > +
> > +/**
> > + * struct bpf_cpumask - refcounted BPF cpumask wrapper structure
> > + * @cpumask: The actual cpumask embedded in the struct.
> > + * @usage: Object reference counter. When the refcount goes to 0, the
> > + * memory is released back to the BPF allocator, which provides
> > + * RCU safety.
> > + *
> > + * Note that we explicitly embed a cpumask_t rather than a cpumask_var_t. This
> > + * is done to avoid confusing the verifier due to the typedef of cpumask_var_t
> > + * changing depending on whether CONFIG_CPUMASK_OFFSTACK is defined or not. See
> > + * the details in <linux/cpumask.h>. The consequence is that this structure is
> > + * likely a bit larger than it needs to be when CONFIG_CPUMASK_OFFSTACK is
> > + * defined due to embedding the whole NR_CPUS-size bitmap, but the extra memory
> > + * overhead is minimal. For the more typical case of CONFIG_CPUMASK_OFFSTACK
> > + * not being defined, the structure is the same size regardless.
> > + */
> > +struct bpf_cpumask {
> > + cpumask_t cpumask;
> > + refcount_t usage;
> > +};
> > +
> > +static struct bpf_mem_alloc bpf_cpumask_ma;
> > +
> > +static bool cpu_valid(u32 cpu)
> > +{
> > + return cpu < nr_cpu_ids;
> > +}
> > +
> > +__diag_push();
> > +__diag_ignore_all("-Wmissing-prototypes",
> > + "Global kfuncs as their definitions will be in BTF");
> > +
> > +struct bpf_cpumask *bpf_cpumask_create(void)
> > +{
> > + struct bpf_cpumask *cpumask;
> > +
> > + cpumask = bpf_mem_alloc(&bpf_cpumask_ma, sizeof(*cpumask));
> > + if (!cpumask)
> > + return NULL;
> > +
> > + memset(cpumask, 0, sizeof(*cpumask));
> > + refcount_set(&cpumask->usage, 1);
> > +
> > + return cpumask;
> > +}
>
> Applied patches 1 and 2. Patch 3 doesn't apply anymore. Pls rebase.
Ack, will rebase for v3.
> I'm fine with existing bpf_cpumask proposal, but can we do better?
> This is so close to be a bitmap template.
Agreed that they're close, but I'm not a fan of the UX taxes for what we
get out of it. More below.
> Can we generalize it as
> struct bpf_bitmap {
> refcount_t refcnt;
> int num_bits;
> u64 bits[];
> };
>
> struct bpf_bitmap *bpf_bitmap_create(int bits)
> {
> bitmap = bpf_mem_alloc(&bpf_cpumask_ma, sizeof(*bitmap) + BITS_TO_LONGS(bits) * sizeof(u64));
> bitmap->num_bits = bits;
> }
+1 that having bitmap kfuncs would be nice to expose, and should be
pretty easy to add. Happy to do so in a follow-on patch set.
>
> and special case few custom kfuncs in the verifier that allow
> type cast from bpf_bitmap with to 'struct cpumask *' ? Like
> struct cpumask *bpf_bitmap_cast_to_cpumask(struct bpf_bitmap *bitmap)
> {
> if (bitmap->num_bits == nr_cpu_ids)
> return bitmap->bits;
> return NULL;
> }
> BTF_ID_FLAGS(func, bpf_bitmap_cast_to_cpumask, KF_TRUSTED_ARGS | KF_RET_NULL)
This I'm not a huge fan of though. It seems like we're removing a useful
abstraction and adding a UX tax just to avoid defining and exporting an
additional small set of kfuncs for allocating, and acquire/releasing a
struct bpf_cpumask. That logic is very minimal, just around 100 lines of
code including doxygen comments.
It's kind of unfortunate that cpumask is so close to bitmap, but that's
nothing new -- <linux/cpumask.h> in the kernel is little more than a
thin wrapper around a bitmap that simply provides some more ergonomic
APIs, along with some magic that makes it safe to access cpumask_var_t
on the stack regardless of NR_CPUS. The latter doesn't apply to BPF, but
the former does.
> The UX will be a bit worse, since bpf prog would need to do !=NULL check
> but with future bpf_assert() we may get rid of !=NULL check.
>
> We can keep direct cpumask accessors as kfuncs:
>
> u32 bpf_cpumask_first(const struct cpumask *cpumask);
> u32 bpf_cpumask_first_zero(const struct cpumask *cpumask);
>
> and add bpf_find_first_bit() and the rest of bit manipulations.
Worth noting as well is that I think struct bpf_bitmap is going to be
treated somewhat differently than struct bpf_cpumask and struct cpumask.
There is no type-safety for bitmaps in the kernel. They're just
represented as unsigned long *, so I don't we'll be able to allow
programs to pass bitmaps allocated elsewhere in the kernel to read-only
bitmap kfuncs like we do for struct cpumask *, as the verifier will just
interpret them as pointers to statically sized scalars.
> Since all of the bpf_cpumask do run-time cpu_valid() check we're not
> sacrificing performance.
>
> Feels more generic with wider applicability at the expense of little bit worse UX.
> I haven't thought about acq/rel consequences.
The TL;DR from me is that I agree that having bitmap kfuncs is a great
idea, but I don't see the need to tie the two at the hip at the cost of
a worse UX. I'd prefer to push the extra complexity into the BPF backend
in favor of a simpler programming front-end for users.
Thoughts?
next prev parent reply other threads:[~2023-01-25 5:36 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-20 19:25 [PATCH bpf-next v2 0/9] Enable cpumasks to be used as kptrs David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 1/9] bpf: Enable annotating trusted nested pointers David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 2/9] bpf: Allow trusted args to walk struct when checking BTF IDs David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 3/9] bpf: Disallow NULLable pointers for trusted kfuncs David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 4/9] bpf: Enable cpumasks to be queried and used as kptrs David Vernet
2023-01-25 4:36 ` Alexei Starovoitov
2023-01-25 5:36 ` David Vernet [this message]
2023-01-25 5:43 ` Alexei Starovoitov
2023-01-20 19:25 ` [PATCH bpf-next v2 5/9] selftests/bpf: Add nested trust selftests suite David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 6/9] selftests/bpf: Add selftest suite for cpumask kfuncs David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 7/9] bpf/docs: Document cpumask kfuncs in a new file David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 8/9] bpf/docs: Document how nested trusted fields may be defined David Vernet
2023-01-20 19:25 ` [PATCH bpf-next v2 9/9] bpf/docs: Document the nocast aliasing behavior of ___init David Vernet
2023-01-25 4:40 ` [PATCH bpf-next v2 0/9] Enable cpumasks to be used as kptrs patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y9C/7fxsIPXiG6mr@maniforge \
--to=void@manifault.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kernel-team@meta.com \
--cc=kpsingh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=martin.lau@linux.dev \
--cc=memxor@gmail.com \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=tj@kernel.org \
--cc=yhs@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.