From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Maxim Mikityanskiy <maxtram95@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
David Vernet <void@manifault.com>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Joanne Koong <joannelkoong@gmail.com>, bpf <bpf@vger.kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Kernel Team <kernel-team@meta.com>,
Alexei Starovoitov <ast@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, KP Singh <kpsingh@kernel.org>,
david.faust@oracle.com
Subject: Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
Date: Fri, 13 Jan 2023 10:48:53 +0100 [thread overview]
Message-ID: <87edryg3zy.fsf@oracle.com> (raw)
In-Reply-To: <CAADnVQ+b+XBukob0VAvxraUvXAf9zv8pa2R4QhRvjyULm9=zKA@mail.gmail.com> (Alexei Starovoitov's message of "Wed, 11 Jan 2023 20:48:09 -0800")
> On Wed, Jan 11, 2023 at 2:57 PM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
>>
>> On Tue, Jan 03, 2023 at 03:51:07PM -0800, Alexei Starovoitov wrote:
>> > On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
>> > > Discoverability plus being able to know semantics from a user PoV to figure out when
>> > > workarounds for older/newer kernels are required to be able to support both kernels.
>> >
>> > Sounds like your concern is that there could be a kfunc that changed it semantics,
>> > but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
>> > such patches from landing. It's up to us to define sane and user
>> > friendly deprecation of kfuncs.
>>
>> I would advocate for adding versioning to BPF API (be it helpers or
>> "stable" kfuncs). Right now we have two extremes: helpers that can't be
>> changed/fixed/deprecated ever, and kfuncs that can be changed at any
>> time, so the end users can't be sure new kernel won't break their stuff.
>> Detecting and fixing the breakage can also be tricky: end users have to
>> write different probes on a case-by-case basis, and sometimes it's not
>> just a matter of checking the number of function parameters or presence
>> of some definition (such difficulties happen when backporting drivers to
>> older kernels, so I assume it may be an issue for BPF programs as well).
>>
>> Let's say we add a version number to the kernel, and the BPF program
>> also has an API version number it's compiled for. Whenever something
>> changes in the stable API on the kernel side, the version number is
>> increased. At the same time, compatibility on the kernel side is
>> preserved for some reasonable period of time (2 years, 5 years,
>> whatever), which means that if the kernel loads a BPF program with an
>> older version number, and that version is within the supported period of
>> time, the kernel will behave in the old way, i.e. verify the old
>> signature of a function, preserve the old behavior, etc.
>
> Right. I think somebody proposed a version scheme for kfuncs already.
> There were so many replies I've lost track.
> But yes it's definitely on the table and
> we should consider it.
> Something like libbpf.map
> We can declare which stable features are supported in which "version".
>
>> This approach has the following upsides:
>>
>> 1. End users can stop worrying that some function changes unexpectedly,
>> and they can have a smoother migration plan.
>>
>> 2. Clear deprecation schedule.
>>
>> 3. Easy way to probe for needed functionality, it's just a matter of
>> comparing numbers: the BPF program loader checks that the kernel is new
>> enough, and the kernel checks that the BPF program's API is not too old.
>>
>> 4. Kernel maintainers will have a deprecation strategy.
>
> +1
>
>> Cons:
>>
>> 1. Arguably a maintainance burden to preserve compatibility on the
>> kernel side, but I would say it's a balance between helpers (which are
>> maintainance burden forever) and kfuncs (which can be changed in every
>> kernel version without keeping any compatibility). "Kfunc that changed
>> its semantics is bad, we should prevent such patches" are just words,
>> but if the developer needs to keep both versions for a while, it will
>> serve as a calm-down mechanism to prevent changes that aren't really
>> necessary. At the same time, the dead code will stop accumulating,
>> because it can be removed according to the schedule.
>
> That sounds like 'pro' instead of 'con' to me :)
>
>> 2. Having a single version number complicates backporting features to
>> older kernels, it would require backporting all previous features
>> chronologically, even if there is no direct dependency. Having multiple
>> version numbers (per feature) is cumbersome for the BPF program to
>> declare. However, this issue is not new, it's already the case for BPF
>> helpers (you can't backport new helpers skipping some other, because the
>> numbers in the list must match).
>
> yeah. I recall amazon linux or something else backported
> helpers out of order and that screwed up bpf progs.
> That was the reason we added numbers to the FN macro in uapi/bpf.h
> That will hopefully prevent such mistakes.
>
> But practically speaking...
> The distro that does out-of-order backporting and skips
> certain helpers is saying: I'm defining my own kABI equivalent
> for bpf progs.
> In that sense there is zero difference between helpers and kfuncs
> from distro point of view and from point of view of their customers.
> Both helpers and kfuncs are neither stable nor unstable.
>
> This discussion is only about pros and cons of the upstream kernel
> and bpf progs that consume upstream kernel.
>
> If we include hyperscalers in the discussion then all
> helpers and all kfuncs immediately become stable from
> point of view of their engineers.
> Big datacenters can maintain kernels with whatever helpers
> and kfuncs they need.
>
>>
>> The above description intentionally doesn't specify whether it should be
>> applied to helpers or kfuncs, because it's a universal concept, about
>> which I would like to hear opinions about versioning without bias to
>> helpers or kfuncs.
>>
>> Regarding freezing helpers, I think there should be a solution for
>> deprecating obsolete stuff. There are historical examples of removing
>> things from UAPI: removing i386 support, ipchains, devfs, IrDA
>> subsystem, even a few architectures [1]. If we apply the versioning
>> approach to helpers, we can make long-waiting incompatible changes in
>> v1, keeping the current set of helpers as v0, used for programs that
>> don't declare a version. Eventually (in 5 years, in 10 years, whatever
>> sounds reasonable) we can drop v0 and remove the support for unversioned
>> BPF programs altogether, similar to how other big things were removed
>> from the kernel. Does it sound feasible?
>
> Not to me. Breaking uapi in whichever way with whatever excuse
> is not on the table.
> We've documented our rules long ago:
>
> Q: Does BPF have a stable ABI?
> ------------------------------
> A: YES. BPF instructions, arguments to BPF programs, set of helper
> functions and their arguments, recognized return codes are all part
> of ABI.
>
>> > "Proper BPF helper" model is broken.
>> > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
>> >
>> > is a hack that works only when compiler optimizes the code.
>>
>> What if we replace codegen for helpers, so that it becomes something
>> like this?
>>
>> static inline void *bpf_map_lookup_elem(void *map, const void *key)
>> {
>> // pseudocode alert!
>> asm("call 1" : : "r1"(map), "r2"(key));
>> }
>>
>> I.e. can we just throw in some inline BPF assembly that prepares
>> registers and invokes a call instruction with the helper number? That
>> should be portable across clang and gcc, allowing to stop relying on
>> optimizations.
>
> Great idea!
+1
> It needs "=r" to capture R0 into the 'ret' variable and then it should work.
> clang may have issues with such asm, but should be fixable.
> gcc is less clear.
That inline assembly should work with GCC as it is now. Both compilers
use the same syntax for the `call' instruction.
> iirc they had their own incompatible inline asm :(
> It's a bigger issue.
We are taking care of that, by adding support to the GNU assembler to
also understand the pseudo-C syntax used by llvm. This covers both .s
files specified in the compilation line, and inline asm statements.
Should be ready soon.
next prev parent reply other threads:[~2023-01-13 9:52 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 1/6] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 2/6] bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 3/6] bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 4/6] bpf: Add bpf_dynptr_clone Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 5/6] bpf: Add bpf_dynptr_iterator Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 6/6] selftests/bpf: Tests for dynptr convenience helpers Joanne Koong
2022-12-08 1:54 ` [PATCH v2 bpf-next 0/6] Dynptr " Alexei Starovoitov
2022-12-09 0:42 ` Andrii Nakryiko
2022-12-09 1:30 ` Alexei Starovoitov
2022-12-09 22:24 ` Joanne Koong
2022-12-12 20:12 ` Andrii Nakryiko
2022-12-13 23:50 ` Joanne Koong
2022-12-14 0:57 ` Andrii Nakryiko
2022-12-14 21:25 ` Joanne Koong
2022-12-16 17:35 ` Alexei Starovoitov
2022-12-20 19:31 ` Andrii Nakryiko
2022-12-25 21:52 ` bpf helpers freeze. Was: " Alexei Starovoitov
2022-12-29 23:10 ` Andrii Nakryiko
2022-12-30 2:46 ` Alexei Starovoitov
2022-12-30 18:38 ` David Vernet
2022-12-30 19:31 ` Alexei Starovoitov
2022-12-30 21:00 ` David Vernet
2022-12-31 0:42 ` Alexei Starovoitov
2023-01-03 11:43 ` Daniel Borkmann
2023-01-03 23:51 ` Alexei Starovoitov
2023-01-04 14:25 ` Daniel Borkmann
2023-01-04 18:59 ` Andrii Nakryiko
2023-01-04 20:03 ` Alexei Starovoitov
2023-01-04 21:57 ` Andrii Nakryiko
2023-01-04 19:37 ` Alexei Starovoitov
2023-01-05 0:13 ` Martin KaFai Lau
2023-01-05 17:17 ` KP Singh
2023-01-05 21:03 ` Andrii Nakryiko
2023-01-06 1:32 ` KP Singh
2023-01-05 21:02 ` Andrii Nakryiko
2023-01-04 20:50 ` David Vernet
2023-01-11 22:56 ` Maxim Mikityanskiy
2023-01-12 4:48 ` Alexei Starovoitov
2023-01-13 9:48 ` Jose E. Marchesi [this message]
2023-01-13 16:35 ` Alexei Starovoitov
2023-01-04 0:55 ` Jakub Kicinski
2023-01-04 18:44 ` Andrii Nakryiko
2023-01-04 19:56 ` Alexei Starovoitov
2023-01-04 18:43 ` Andrii Nakryiko
2023-01-04 19:51 ` Alexei Starovoitov
2023-01-04 21:56 ` Andrii Nakryiko
2023-01-04 18:43 ` Andrii Nakryiko
2023-01-04 19:44 ` Alexei Starovoitov
2023-01-04 21:55 ` Andrii Nakryiko
2023-01-04 23:47 ` David Vernet
2023-01-05 21:01 ` Andrii Nakryiko
2023-01-06 2:54 ` Alexei Starovoitov
2023-01-09 17:46 ` Andrii Nakryiko
2023-01-11 21:29 ` Song Liu
2023-01-12 4:23 ` Alexei Starovoitov
2023-01-12 7:35 ` Song Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87edryg3zy.fsf@oracle.com \
--to=jose.marchesi@oracle.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=david.faust@oracle.com \
--cc=joannelkoong@gmail.com \
--cc=kernel-team@meta.com \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=maxtram95@gmail.com \
--cc=song@kernel.org \
--cc=void@manifault.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.