From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Maxim Mikityanskiy <maxtram95@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
David Vernet <void@manifault.com>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Joanne Koong <joannelkoong@gmail.com>, bpf <bpf@vger.kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Kernel Team <kernel-team@meta.com>,
Alexei Starovoitov <ast@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, KP Singh <kpsingh@kernel.org>,
david.faust@oracle.com
Subject: Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
Date: Fri, 13 Jan 2023 10:48:53 +0100 [thread overview]
Message-ID: <87edryg3zy.fsf@oracle.com> (raw)
In-Reply-To: <CAADnVQ+b+XBukob0VAvxraUvXAf9zv8pa2R4QhRvjyULm9=zKA@mail.gmail.com> (Alexei Starovoitov's message of "Wed, 11 Jan 2023 20:48:09 -0800")
> On Wed, Jan 11, 2023 at 2:57 PM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
>>
>> On Tue, Jan 03, 2023 at 03:51:07PM -0800, Alexei Starovoitov wrote:
>> > On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
>> > > Discoverability plus being able to know semantics from a user PoV to figure out when
>> > > workarounds for older/newer kernels are required to be able to support both kernels.
>> >
>> > Sounds like your concern is that there could be a kfunc that changed it semantics,
>> > but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
>> > such patches from landing. It's up to us to define sane and user
>> > friendly deprecation of kfuncs.
>>
>> I would advocate for adding versioning to BPF API (be it helpers or
>> "stable" kfuncs). Right now we have two extremes: helpers that can't be
>> changed/fixed/deprecated ever, and kfuncs that can be changed at any
>> time, so the end users can't be sure new kernel won't break their stuff.
>> Detecting and fixing the breakage can also be tricky: end users have to
>> write different probes on a case-by-case basis, and sometimes it's not
>> just a matter of checking the number of function parameters or presence
>> of some definition (such difficulties happen when backporting drivers to
>> older kernels, so I assume it may be an issue for BPF programs as well).
>>
>> Let's say we add a version number to the kernel, and the BPF program
>> also has an API version number it's compiled for. Whenever something
>> changes in the stable API on the kernel side, the version number is
>> increased. At the same time, compatibility on the kernel side is
>> preserved for some reasonable period of time (2 years, 5 years,
>> whatever), which means that if the kernel loads a BPF program with an
>> older version number, and that version is within the supported period of
>> time, the kernel will behave in the old way, i.e. verify the old
>> signature of a function, preserve the old behavior, etc.
>
> Right. I think somebody proposed a version scheme for kfuncs already.
> There were so many replies I've lost track.
> But yes it's definitely on the table and
> we should consider it.
> Something like libbpf.map
> We can declare which stable features are supported in which "version".
>
>> This approach has the following upsides:
>>
>> 1. End users can stop worrying that some function changes unexpectedly,
>> and they can have a smoother migration plan.
>>
>> 2. Clear deprecation schedule.
>>
>> 3. Easy way to probe for needed functionality, it's just a matter of
>> comparing numbers: the BPF program loader checks that the kernel is new
>> enough, and the kernel checks that the BPF program's API is not too old.
>>
>> 4. Kernel maintainers will have a deprecation strategy.
>
> +1
>
>> Cons:
>>
>> 1. Arguably a maintainance burden to preserve compatibility on the
>> kernel side, but I would say it's a balance between helpers (which are
>> maintainance burden forever) and kfuncs (which can be changed in every
>> kernel version without keeping any compatibility). "Kfunc that changed
>> its semantics is bad, we should prevent such patches" are just words,
>> but if the developer needs to keep both versions for a while, it will
>> serve as a calm-down mechanism to prevent changes that aren't really
>> necessary. At the same time, the dead code will stop accumulating,
>> because it can be removed according to the schedule.
>
> That sounds like 'pro' instead of 'con' to me :)
>
>> 2. Having a single version number complicates backporting features to
>> older kernels, it would require backporting all previous features
>> chronologically, even if there is no direct dependency. Having multiple
>> version numbers (per feature) is cumbersome for the BPF program to
>> declare. However, this issue is not new, it's already the case for BPF
>> helpers (you can't backport new helpers skipping some other, because the
>> numbers in the list must match).
>
> yeah. I recall amazon linux or something else backported
> helpers out of order and that screwed up bpf progs.
> That was the reason we added numbers to the FN macro in uapi/bpf.h
> That will hopefully prevent such mistakes.
>
> But practically speaking...
> The distro that does out-of-order backporting and skips
> certain helpers is saying: I'm defining my own kABI equivalent
> for bpf progs.
> In that sense there is zero difference between helpers and kfuncs
> from distro point of view and from point of view of their customers.
> Both helpers and kfuncs are neither stable nor unstable.
>
> This discussion is only about pros and cons of the upstream kernel
> and bpf progs that consume upstream kernel.
>
> If we include hyperscalers in the discussion then all
> helpers and all kfuncs immediately become stable from
> point of view of their engineers.
> Big datacenters can maintain kernels with whatever helpers
> and kfuncs they need.
>
>>
>> The above description intentionally doesn't specify whether it should be
>> applied to helpers or kfuncs, because it's a universal concept, about
>> which I would like to hear opinions about versioning without bias to
>> helpers or kfuncs.
>>
>> Regarding freezing helpers, I think there should be a solution for
>> deprecating obsolete stuff. There are historical examples of removing
>> things from UAPI: removing i386 support, ipchains, devfs, IrDA
>> subsystem, even a few architectures [1]. If we apply the versioning
>> approach to helpers, we can make long-waiting incompatible changes in
>> v1, keeping the current set of helpers as v0, used for programs that
>> don't declare a version. Eventually (in 5 years, in 10 years, whatever
>> sounds reasonable) we can drop v0 and remove the support for unversioned
>> BPF programs altogether, similar to how other big things were removed
>> from the kernel. Does it sound feasible?
>
> Not to me. Breaking uapi in whichever way with whatever excuse
> is not on the table.
> We've documented our rules long ago:
>
> Q: Does BPF have a stable ABI?
> ------------------------------
> A: YES. BPF instructions, arguments to BPF programs, set of helper
> functions and their arguments, recognized return codes are all part
> of ABI.
>
>> > "Proper BPF helper" model is broken.
>> > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
>> >
>> > is a hack that works only when compiler optimizes the code.
>>
>> What if we replace codegen for helpers, so that it becomes something
>> like this?
>>
>> static inline void *bpf_map_lookup_elem(void *map, const void *key)
>> {
>> // pseudocode alert!
>> asm("call 1" : : "r1"(map), "r2"(key));
>> }
>>
>> I.e. can we just throw in some inline BPF assembly that prepares
>> registers and invokes a call instruction with the helper number? That
>> should be portable across clang and gcc, allowing to stop relying on
>> optimizations.
>
> Great idea!
+1
> It needs "=r" to capture R0 into the 'ret' variable and then it should work.
> clang may have issues with such asm, but should be fixable.
> gcc is less clear.
That inline assembly should work with GCC as it is now. Both compilers
use the same syntax for the `call' instruction.
> iirc they had their own incompatible inline asm :(
> It's a bigger issue.
We are taking care of that, by adding support to the GNU assembler to
also understand the pseudo-C syntax used by llvm. This covers both .s
files specified in the compilation line, and inline asm statements.
Should be ready soon.
next prev parent reply other threads:[~2023-01-13 9:52 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 1/6] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 2/6] bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 3/6] bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 4/6] bpf: Add bpf_dynptr_clone Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 5/6] bpf: Add bpf_dynptr_iterator Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 6/6] selftests/bpf: Tests for dynptr convenience helpers Joanne Koong
2022-12-08 1:54 ` [PATCH v2 bpf-next 0/6] Dynptr " Alexei Starovoitov
2022-12-09 0:42 ` Andrii Nakryiko
2022-12-09 1:30 ` Alexei Starovoitov
2022-12-09 22:24 ` Joanne Koong
2022-12-12 20:12 ` Andrii Nakryiko
2022-12-13 23:50 ` Joanne Koong
2022-12-14 0:57 ` Andrii Nakryiko
2022-12-14 21:25 ` Joanne Koong
2022-12-16 17:35 ` Alexei Starovoitov
2022-12-20 19:31 ` Andrii Nakryiko
2022-12-25 21:52 ` bpf helpers freeze. Was: " Alexei Starovoitov
2022-12-29 23:10 ` Andrii Nakryiko
2022-12-30 2:46 ` Alexei Starovoitov
2022-12-30 18:38 ` David Vernet
2022-12-30 19:31 ` Alexei Starovoitov
2022-12-30 21:00 ` David Vernet
2022-12-31 0:42 ` Alexei Starovoitov
2023-01-03 11:43 ` Daniel Borkmann
2023-01-03 23:51 ` Alexei Starovoitov
2023-01-04 14:25 ` Daniel Borkmann
2023-01-04 18:59 ` Andrii Nakryiko
2023-01-04 20:03 ` Alexei Starovoitov
2023-01-04 21:57 ` Andrii Nakryiko
2023-01-04 19:37 ` Alexei Starovoitov
2023-01-05 0:13 ` Martin KaFai Lau
2023-01-05 17:17 ` KP Singh
2023-01-05 21:03 ` Andrii Nakryiko
2023-01-06 1:32 ` KP Singh
2023-01-05 21:02 ` Andrii Nakryiko
2023-01-04 20:50 ` David Vernet
2023-01-11 22:56 ` Maxim Mikityanskiy
2023-01-12 4:48 ` Alexei Starovoitov
2023-01-13 9:48 ` Jose E. Marchesi [this message]
2023-01-13 16:35 ` Alexei Starovoitov
2023-01-04 0:55 ` Jakub Kicinski
2023-01-04 18:44 ` Andrii Nakryiko
2023-01-04 19:56 ` Alexei Starovoitov
2023-01-04 18:43 ` Andrii Nakryiko
2023-01-04 19:51 ` Alexei Starovoitov
2023-01-04 21:56 ` Andrii Nakryiko
2023-01-04 18:43 ` Andrii Nakryiko
2023-01-04 19:44 ` Alexei Starovoitov
2023-01-04 21:55 ` Andrii Nakryiko
2023-01-04 23:47 ` David Vernet
2023-01-05 21:01 ` Andrii Nakryiko
2023-01-06 2:54 ` Alexei Starovoitov
2023-01-09 17:46 ` Andrii Nakryiko
2023-01-11 21:29 ` Song Liu
2023-01-12 4:23 ` Alexei Starovoitov
2023-01-12 7:35 ` Song Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87edryg3zy.fsf@oracle.com \
--to=jose.marchesi@oracle.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=david.faust@oracle.com \
--cc=joannelkoong@gmail.com \
--cc=kernel-team@meta.com \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=maxtram95@gmail.com \
--cc=song@kernel.org \
--cc=void@manifault.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox